Opened 9 years ago

Closed 9 years ago

Last modified 9 years ago

#4344 closed proposal (invalid)

Better toRational for Float and Double

Reported by: Owned by:
Priority: normal Milestone: Not GHC
Component: Compiler Version: 6.12.3
Keywords: toRational, performance Cc:
Operating System: Unknown/Multiple Architecture: Unknown/Multiple
Type of failure: Runtime performance bug Test Case:
Blocked By: Blocking:
Related Tickets: Differential Rev(s):
Wiki Page:


The implementation of toRational in the Real instances of Float and Double is less than ideal.

instance  Real Float  where
    toRational x        =  (m%1)*(b%1)^^n
                           where (m,n) = decodeFloat x
                                 b     = floatRadix  x

The propsed implementation of powers for Rationals (#4337) would (when (^^) is included) alone yield a great boost, but here we can do even better.

I have benchmarked three versions of toRational against the current implementation, first, an inlined version of the proposed power modification:

{-# SPECIALISE toRat :: Float -> Rational,
                        Double -> Rational #-}
toRat :: RealFloat a => a -> Rational
toRat x = case decodeFloat x of
            (m,e) -> case floatRadix x of
                        b -> if e < 0
                                then (m % (b^(negate e)))
                                else (m * b^e) :% 1

If the exponent is nonnegative, we need not reduce (even though that reduction would be comparatively cheap since the denominator is 1, it's not free).

Next, in GHC.Float there is the condition that the floatRadix be 2, hence we can eliminate the call to floatRadix and inline. That allows to skip the reduction also in some cases where the exponent is negative:

{-# SPECIALISE toRat2 :: Float -> Rational,
                         Double -> Rational #-}
toRat2 :: RealFloat a => a -> Rational
toRat2 x = case decodeFloat x of
              (m,e) | e < 0     -> if even m
                                    then m % (2 ^ (-e))
                                    else m :% (2 ^ (-e))
                    | otherwise -> (m * 2^e) :% 1

Finally, powers of 2 can be more efficiently calculated via bit-shifting and the test for evenness is usually faster as a bit-test:

{-# SPECIALISE toRat3 :: Float -> Rational,
                         Double -> Rational #-}
toRat3 :: RealFloat a => a -> Rational
toRat3 x = case decodeFloat x of
            (m,e) | e < 0     -> case 1 `shiftL` (-e) of
                                    !d -> if fromInteger m .&. (1 :: Int) == 0
                                            then m % d
                                            else m :% d
                  | otherwise -> (m `shiftL` e) :% 1

The results vary of course depending on the sample of numbers one converts, but the trend is clear:

Current:      100 ms - 123 ms
Inlined:       32 ms -  40 ms
Specialised:   25 ms -  31 ms
Shifting:      15 ms -  23 ms

Of course, the value of that is limited, the real bottleneck in realToFrac is fromRational, which raises the times for the benchmarks by about 450 ms when added instead of a dummy conversion Rational -> Float. And using realToFrac for the conversion Double -> Float, per the rewrite rule, the benchmarks are done in about 1 ms.

Attachments (7)

buildlog.tar.tar.gz (78.5 KB) - added by 9 years ago.
Log of failed build
criterionFromRational.hs (1.9 KB) - added by 9 years ago.
Criterion Benchmark for fromRational
simpleFromRational.hs (2.2 KB) - added by 9 years ago.
Benchmark for fromRational without criterion
FloatingRationalConversions.tar.gz (6.7 KB) - added by 9 years ago.
Benchmarks and QC tests for new fromRational and toRational implementations
logarithms-integer-gmp.dpatch (11.0 KB) - added by 9 years ago.
logarithms-integer-simple.dpatch (8.2 KB) - added by 9 years ago.
fix4344.dpatch (61.7 KB) - added by 9 years ago.

Download all attachments as: .zip

Change History (24)

comment:1 Changed 9 years ago by simonpj

Sounds great. Do you have a specific patch to propose?


comment:2 Changed 9 years ago by

Well, I would replace the body of toRational in the Real instances of Float and Double with more or less the body of toRat3.

The bang on d makes no difference with optimisations and little difference without, so that'll go. If testing also for e == 0 makes a difference at all, it's minuscule (less than 1%), so I'd leave that out.

I would like to test whether something can be gained by determining the number of trailing 0-bits of m to avoid all gcds before I propose a specific patch.

Since GHC.Float is GHC-internal and maintainer is cvs-ghc and not libraries, should the proposal go through the process nevertheless or would I just attach the patch here and you validate and push directly?

comment:3 Changed 9 years ago by

Okay, how ugly are you willing to accept?

Counting zeros and shifting instead of calling out to gcd in the case of negative exponent and even significand reduces the time needed for my benchmarking loops by a factor of a bit above 2. The code passed over 1.5 million QuickCheck tests, so I'm pretty confident I've got it right. So far, I've used GHC.Integer.GMP.Internals (for the ByteArray#), which makes it not portable (and it doesn't work for Ints shorter than 32 bits).

But the significand is at most 53 bits, so I could use Int64 for the work and stay portable. I'll try whether that is competitive. It would make the code simpler, so it could even be faster.

comment:4 Changed 9 years ago by

Very nice. The simpler code is indeed faster. With

floatRat :: Float -> Rational
floatRat x
    | x == 0    = 0 :% 1
    | otherwise =
      case decodeFloat x of
        (m, e) | e >= 0 -> (m `shiftL` e) :% 1
               | otherwise ->
                 case fromInteger m :: Int of
                    m' | m' .&. 1 == 0 ->
                            shiftOutZeros m' (-e)
                       | otherwise -> m :% (1 `shiftL` (-e))

doubRat :: Double -> Rational
doubRat x
    | x == 0    = 0 :% 1
    | otherwise =
      case decodeFloat x of
        (m,e) | e >= 0 -> (m `shiftL` e) :% 1
              | otherwise -> shiftOutZeros (fromInteger m :: Int64) (-e)

{-# SPECIALISE shiftOutZeros :: Int -> Int -> Rational,
                                Int64 -> Int -> Rational #-}
-- Precondition: first arg nonzero, second arg > 0
-- To be called only from 'toRational'
shiftOutZeros :: (Integral a, Bits a) => a -> Int -> Rational
shiftOutZeros m e
    | e <= t    = fromIntegral (m `shiftR` e) :% 1
    | t < 8     = fromIntegral (m `shiftR` t) :% (1 `shiftL` (e-t))
    | otherwise = shiftOutZeros (m `shiftR` 8) (e-8)
        !t  = zeros `unsafeAt` (fromIntegral m .&. 255)

-- Cache of number of trailing 0-bits
zeros :: Array Int Int
zeros = listArray (0,255)
            [ 8, 0, 1, 0, 2, 0, 1, 0

I'm even faster than with indexIntArray# (5-10%). With an unboxed array for the cache, it's a few percent faster again. Is any kind of unboxed array available in GHC.Float?

Note that on my 32-bit system, for Float it's faster to test for evenness before going to shiftOutZeros while for Double it's not (not if I (.&.) with an Int64 at least, must try whether (.&.)-ing with an Int is faster).

For 64-bit systems, Double should use the exact same code as Float, presumably.

Changed 9 years ago by

Attachment: buildlog.tar.tar.gz added

Log of failed build

comment:5 Changed 9 years ago by

Oops, that was for #4349. How did I get back here?

comment:6 Changed 9 years ago by

not if I (.&.) with an Int64 at least, must try whether (.&.)-ing with an Int is faster

Indeed it is. Not much, about 3%, but hey.

comment:7 Changed 9 years ago by simonpj

Thanks very much for your work on this numeric stuff. It's a real help. Plese sing out when you think you've converged, make a patch (or write out the specific final version), attach to this ticket, and submit to the libraries list.


comment:8 Changed 9 years ago by

Owner: set to

I think I have it (both, toRational and fromRational), currently compiling. When tests and benchmarks pass, I'll submit. Previous benchmarks promise a good speedup.

comment:9 Changed 9 years ago by

Above is an archive containing new implementations for toRational and fromRational from/to Double and Float and QuickCheck tests and benchmarks.

To build the modules, the integer-gmp package is required, the patch will include the logarithm functions for integer-simple too.

The QuickCheck programme accepts (optional) parameters

  1. number of tests to perform for each property
  2. size parameter for the generator
  3. maximal number of tests to discard (irrelevant)

If you use QuickCheck-2.3.*, don't set the number of test cases too high, it eats memory.

The simple* benchmarks require only the core libs shipped with GHC, the test size can be set with "-bd num" on the command line.
The criterion* benchmarks require criterion, you can set the test size with "-bd num" as the first two command line args, all subsequent args are passed to criterion's defaultMain.

All benchmarks measure Float -> Double resp. Double -> Float conversion, I've forgotten to write tests for Rational -> Float/Double conversions for Rational's whose denominator is not a power of 2. I intend to add that soon.

Please test extensively on 64-bits to make sure I haven't made a glitch there.

comment:10 Changed 9 years ago by

Type: bugproposal

Proposal: include faster implementations for

  1. toRational :: Float -> Rational
  2. toRational :: Double -> Rational
  3. fromRational :: Rational -> Float
  4. fromRational :: Rational -> Double

The semantics of these functions shall remain the same as it is now, only their speed will be affected.

For fromRational, a fast integer logarithm is essential. Without access to the internal representation of Integer, it would be significantly slower, so this proposal includes the addition of modules defining the needed integer logarithm functions to the packages integer-gmp and integer-simple.

Since Int is not available in the integer-* packages, those functions would return Int# values. I suggest adding wrappers integerLog2 :: Integer -> Int and integerLogBase :: Integer -> Integer -> Int to some module in base because those functions are far more generally useful. Currently integerLogBase is available from GHC.Float, so we might stick them there, although it's not very intuitive. Suggestions welcome.

Discussion period: one week, until 29th October 2010 (I'd like to see it in 7.0).

Changed 9 years ago by

Attachment: criterionFromRational.hs added

Criterion Benchmark for fromRational

Changed 9 years ago by

Attachment: simpleFromRational.hs added

Benchmark for fromRational without criterion

comment:11 Changed 9 years ago by

Start of thread on the mailing list.

Changed 9 years ago by

Benchmarks and QC tests for new fromRational and toRational implementations

comment:12 Changed 9 years ago by

Included the fromRational benchmarks in the bundle and fixed a glitch in the simple benchmarks (by which double2Float and float2Double were always reported as taking zero time).

Changed 9 years ago by

Changed 9 years ago by

Changed 9 years ago by

Attachment: fix4344.dpatch added

comment:13 Changed 9 years ago by igloo

Milestone: Not GHC

comment:14 Changed 9 years ago by gidyn

Cc: gideon@… added

comment:15 Changed 9 years ago by

Whoops. Out of sight, out of mind. Let me add comments and see what I can do to get the logarithm stuff fast for integer-simple, then I'll open a new ticket and re-propose. There should be enough time for 7.0.2 without a rush.

comment:16 Changed 9 years ago by igloo

Resolution: invalid
Status: newclosed

Proposal tickets are no longer needed as part of the library submissions process. Instead, a normal ticket should be created once consensus has been achieved. Please see the process description for details.

comment:17 Changed 9 years ago by gidyn

Cc: gideon@… removed
Note: See TracTickets for help on using tickets.