Opened 2 years ago

Last modified 2 years ago

#14103 new bug

Retypechecking the loop in --make mode is super-linear when there are many .hs-boot modules

Reported by: duog Owned by:
Priority: normal Milestone:
Component: Compiler Version: 8.3
Keywords: hs-boot Cc: niteria
Operating System: Unknown/Multiple Architecture: Unknown/Multiple
Type of failure: None/Unknown Test Case:
Blocked By: Blocking:
Related Tickets: Differential Rev(s):
Wiki Page:

Description

At time of writing master is at commit 14457cf6a50f708eecece8f286f08687791d51f7.

In GhcMake.hs, both upsweep and parUpsweep call getModLoop for each module. getModLoop is (at least) linear when called on a .hs-boot ModSummary. See comments in Phab:3646.

This isn't a big deal in practice, since .hs-boot modules are uncommon, but it does irk me, and it will be easy to fix. Fixing this would also be an opportunity to de-duplicate some logic in upsweep and parUpsweep.

ccing niteria as he has worked with this code recently.

Change History (3)

comment:1 Changed 2 years ago by niteria

If there's a cleaner way to do it, I'm all for it. I was conservative with my fix, because there was some code I didn't understand and I didn't want to change the behavior.

For example: https://phabricator.haskell.org/diffusion/GHC/browse/master/compiler/main/GhcMake.hs;14457cf6a50f708eecece8f286f08687791d51f7$895-905

If there are loops with multiple (say A.hs-boot, B.hs-boot, C.hs-boot) hs-boot modules, it would extract a cycle with A.hs-boot, B.hs-boot, C.hs-boot, a cycle with B.hs-boot, C.hs-boot, and a cycle with C.hs-boot. Why? I don't know.

comment:2 Changed 2 years ago by ezyang

I would say, go for it. I suspect the weird behavior niteria describes was unintentional (or intentional, but only in the sense that it is easier to implement) and you can probably simplify it a lot. But do note that if the set of modules that gets retypechecked *changes*, that could introduce a bug. But there is some code in between getModLoop and the actual retypechecking...

comment:3 Changed 2 years ago by duog

I think the current behaviour is correct.

This is a complicated topic, so the explanation below is complicated, with lots of parenthesis!

For the purposes of retypechecking, the loops are only used for their module names. There is one loop per hs-boot file. When we go to compile a hs file that has an hs-boot file (and therefore finishes a loop), the home package table has a ModDetails for this module (built from the hs-boot file), and all modules that transitively import that hs-boot file have been typechecked against that ModDetails. We retypecheck (i.e. rebuild their ModDetails from their interface) all those modules (but NOT the hs-boot modules) before typechecking the loop closer, then replace it's hs-boot ModDetails in the home package tables with it's full ModDetails. We then retypecheck the loop again, so that Ids in modules that imported the hs-boot module have their unfoldings attached. (although maybe this is wrong, see trac:14092)

Note that it doesn't matter whether the modules we retypechecked were hs-boot interfaces or full hs interfaces. I think this is the key point.

Note: See TracTickets for help on using tickets.