Opened 5 years ago

Last modified 4 years ago

#9786 new task

Make quot/rem/div/mod with known divisors fast

Reported by: dfeuer Owned by:
Priority: normal Milestone:
Component: Compiler Version: 7.9
Keywords: Cc:
Operating System: Unknown/Multiple Architecture: Unknown/Multiple
Type of failure: Runtime performance bug Test Case:
Blocked By: Blocking:
Related Tickets: Differential Rev(s):
Wiki Page:

Description

GHC (with NCG) currently optimizes Int division by powers of two, but not by other known divisors. The Intel Optimization Manual section 9.2.4 describes a general technique for replacing division by known, sufficiently small, divisors with multiplication. LLVM appears to go further in some fashion. There's no reason we can't do something similar.

Change History (4)

comment:1 Changed 5 years ago by dfeuer

It looks like https://gmplib.org/~tege/divcnst-pldi94.pdf is a good source for the algorithms involved. There are algorithms there both for what Haskell calls quot and what it calls div.

comment:2 Changed 5 years ago by dfeuer

The more I think about it, the more likely it seems that we should NOINLINE divMod and quotRem until very, very late, or maybe never inline them. There are two reasons for this:

  1. We would prefer to snag them unmodified when we recognize known divisors.
  1. As described in #9617, it would be very nice to use CSE to merge div with mod and quot with rem automatically.

The major complication, of course, is that in the case of an unknown divisor that is used multiple times, we'd like to avoid repeating the sign test. I still have no clear sense of how to resolve the tension between these desires.

comment:3 Changed 4 years ago by thoughtpolice

Milestone: 7.12.18.0.1

Milestone renamed

comment:4 Changed 4 years ago by thomie

Milestone: 8.0.1
Note: See TracTickets for help on using tickets.