Opened 5 years ago

Closed 4 years ago

Last modified 4 years ago

#9638 closed feature request (fixed)

Speed up Data.Char.isDigit

Reported by: dfeuer Owned by:
Priority: normal Milestone:
Component: Core Libraries Version: 7.9
Keywords: Cc: hvr, ekmett, core-libraries-committee@…
Operating System: Unknown/Multiple Architecture: Unknown/Multiple
Type of failure: Runtime performance bug Test Case:
Blocked By: Blocking:
Related Tickets: #11382 Differential Rev(s):
Wiki Page:

Description

isDigit is currently defined like this:

isDigit                 :: Char -> Bool
isDigit c               =  c >= '0' && c <= '9'

This will short-circuit the right way if you're looking for digits mixed with spaces, but the wrong way if you're looking for digits mixed with letters. It also requires a conditional jump to do that short-circuiting (confirmed by inspecting the assembly). It should be better to use an unsigned comparison instead:

isDigit                 :: Char -> Bool
isDigit c               = (fromIntegral (ord c) :: Word) - 48 <= 9

The interesting section looks like this

        movq 7(%rbx),%rax
        addq $-48,%rax
        cmpq $9,%rax
        setbe %al
        movzbl %al,%eax
        shlq $3,%rax
        movq ghczmprim_GHCziTypes_Bool_closure_tbl(%rax),%rbx
        addq $8,%rbp
        jmp *(%rbp)

or like this with -fllvm:

        movq    7(%rbx), %rax
        addq    $-48, %rax
        cmpq    $10, %rax
        sbbq    %rax, %rax
        andq    $8, %rax
        movq    ghczmprim_GHCziTypes_Bool_closure_tbl(%rax), %rbx
        movq    8(%rbp), %rax
        addq    $8, %rbp
        jmpq    *%rax  # TAILCALL

Change History (14)

comment:1 Changed 5 years ago by dfeuer

The same should probably apply to the (non-exported) between function in GHC.IO.Encoding.UTF8.

comment:2 Changed 5 years ago by dfeuer

Cc: hvr ekmett added
Component: Compilerlibraries/base

This approach should also combine well with ways digits are used. For instance, we could write (hopefully with better formatting)

digitToInt :: Char -> Int
digitToInt c = case ord c - ord '0' of
  d | (fromIntegral d::Word) <= 9 -> d
    | otherwise ->
        case ord c - ord 'a' of
          e | (fromIntegral e::Word) <= 5 -> e + 10
            | otherwise ->
                case ord c - ord 'A' of
                  f | (fromIntegral f::Word) <= 5 -> f + 10
                    | otherwise -> error ("Char.digitToInt: not a digit " ++ show c)
Last edited 5 years ago by dfeuer (previous) (diff)

comment:3 Changed 5 years ago by carter

These sound reasonable, do we have any (micro)benchmarks to validate that theres a meaningful improvement in performance to pay down the increased complexity of the implementation?

comment:4 in reply to:  3 Changed 5 years ago by dfeuer

Replying to carter:

These sound reasonable, do we have any (micro)benchmarks to validate that there's a meaningful improvement in performance to pay down the increased complexity of the implementation?

Yes. I tested evaluating nf (map isDigit) testcase with the following test cases:

toosmall = replicate 700 '-'
toobig = replicate 700 'a'
justright = replicate 700 '7'
arb = "12asaa489oneuhl98'c43rub93d'i'i98car278urnatehodth98228hteou84348hcae84---       - -   424  42 hteohu 2   4hto24hth    ehua942dp.dghckbaa---;    48923"
arbitrary = arb ++ arb ++ reverse arb ++ reverse arb

As expected, the fancy code seemed to be very slightly worse for toosmall, certainly not worse enough to really get a proper measurement (it's only one extra addition). The rest did better with the fancy code; the difference was consistently biggest for arbitrary. Obviously we're not talking huge changes, but I think this function and similar ones get a good bit of use, so it's probably worth bothering.

benchmarking toosmall/old
time                 12.31 μs   (12.27 μs .. 12.36 μs)
                     1.000 R²   (0.999 R² .. 1.000 R²)
mean                 12.37 μs   (12.34 μs .. 12.40 μs)
std dev              100.2 ns   (69.41 ns .. 138.7 ns)

benchmarking toosmall/new
time                 12.42 μs   (12.41 μs .. 12.43 μs)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 12.43 μs   (12.42 μs .. 12.44 μs)
std dev              36.07 ns   (24.75 ns .. 59.63 ns)

benchmarking toobig/old
time                 13.05 μs   (13.04 μs .. 13.06 μs)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 13.05 μs   (13.04 μs .. 13.06 μs)
std dev              26.99 ns   (19.95 ns .. 43.44 ns)

benchmarking toobig/new
time                 12.43 μs   (12.42 μs .. 12.45 μs)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 12.45 μs   (12.44 μs .. 12.48 μs)
std dev              65.84 ns   (32.99 ns .. 129.0 ns)

benchmarking justright/old
time                 13.07 μs   (13.06 μs .. 13.08 μs)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 13.08 μs   (13.06 μs .. 13.09 μs)
std dev              42.63 ns   (30.33 ns .. 56.90 ns)

benchmarking justright/new
time                 12.41 μs   (12.40 μs .. 12.42 μs)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 12.41 μs   (12.41 μs .. 12.42 μs)
std dev              25.29 ns   (21.00 ns .. 31.20 ns)

benchmarking arbitrary/old
time                 11.69 μs   (11.68 μs .. 11.71 μs)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 11.69 μs   (11.68 μs .. 11.71 μs)
std dev              50.41 ns   (21.45 ns .. 85.71 ns)

benchmarking arbitrary/new
time                 10.65 μs   (10.64 μs .. 10.66 μs)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 10.65 μs   (10.64 μs .. 10.66 μs)
std dev              21.31 ns   (14.90 ns .. 36.62 ns)

comment:5 Changed 5 years ago by schyler

+1 from me, cool idea.

comment:6 Changed 5 years ago by ekmett

Given the benchmarks it looks good to me.

comment:7 Changed 5 years ago by hvr

Well, fire up a code-revision over at Phab:differential :-)

comment:8 Changed 5 years ago by thoughtpolice

Component: libraries/baseCore Libraries
Owner: set to ekmett

Moving over to new owning component 'Core Libraries'.

comment:9 Changed 5 years ago by thoughtpolice

Milestone: 7.10.17.12.1

Moving to 7.12.1 milestone; if you feel this is an error and should be addressed sooner, please move it back to the 7.10.1 milestone.

comment:10 Changed 4 years ago by thomie

Cc: core-libraries-committee@… added

dfeuer, is 31571270625a690410b794b7cfe48d866c084e74/ghc sufficient to close this ticket or is there more to be done here?

Author: David Feuer <>
Date:   Tue Oct 21 15:01:14 2014 -0500

    Improve isDigit, isSpace, etc.
    
    Summary:
    Make things less branchy; use unsigned comparisons for range checking.
    Eliminate non-spaces more quickly in common cases in isSpace.
    
    Reviewers: ekmett, carter, austin
    
    Reviewed By: austin
    
    Subscribers: thomie, carter, ezyang, simonmar
    
    Differential Revision: https://phabricator.haskell.org/D340
    
    GHC Trac Issues: #1473

comment:11 Changed 4 years ago by thoughtpolice

Milestone: 7.12.18.0.1

Milestone renamed

comment:12 Changed 4 years ago by thomie

Owner: ekmett deleted

comment:13 Changed 4 years ago by bgamari

Resolution: fixed
Status: newclosed

It seems that David's commit has addressed the principle complaint of this ticket. We can track further improvements in #11382.

comment:14 Changed 4 years ago by thomie

Milestone: 8.0.1
Note: See TracTickets for help on using tickets.