Opened 2 years ago

Last modified 2 years ago

#14663 new bug

Deriving Typeable for enumerations seems expensive

Reported by: niteria Owned by:
Priority: low Milestone:
Component: Compiler Version:
Keywords: Typeable Cc:
Operating System: Unknown/Multiple Architecture: Unknown/Multiple
Type of failure: None/Unknown Test Case:
Blocked By: Blocking:
Related Tickets: Differential Rev(s):
Wiki Page:

Description

I have a module A10000 that looks like this:

module A10000 where

data A = A
  | A00001
  | A00002
...
  | A10000

Currently compiling it with ./inplace/bin/ghc-stage2 A10000.hs +RTS -s produces:

[1 of 1] Compiling A10000           ( A10000.hs, A10000.o )                                   
   4,133,470,392 bytes allocated in the heap   
   1,194,866,080 bytes copied during GC        
     141,604,816 bytes maximum residency (14 sample(s))                                       
         813,104 bytes maximum slop            
             341 MB total memory in use (0 MB lost due to fragmentation)                      

                                     Tot time (elapsed)  Avg pause  Max pause                 
  Gen  0       329 colls,     0 par    0.551s   0.551s     0.0017s    0.0246s                 
  Gen  1        14 colls,     0 par    0.453s   0.453s     0.0323s    0.1031s                 

  TASKS: 4 (1 bound, 3 peak workers (3 total), using -N1)                                     

  SPARKS: 0 (0 converted, 0 overflowed, 0 dud, 0 GC'd, 0 fizzled)                             

  INIT    time    0.000s  (  0.000s elapsed)   
  MUT     time    1.498s  (  1.730s elapsed)   
  GC      time    1.004s  (  1.004s elapsed)   
  EXIT    time    0.000s  (  0.006s elapsed)   
  Total   time    2.502s  (  2.740s elapsed)   

  Alloc rate    2,759,911,143 bytes per MUT second                                            

  Productivity  59.9% of total user, 63.4% of total elapsed                                   

I've noticed a lot of code getting generated (>500k lines of ASM), particularly interesting was code that supported TyCons.

I've tried again disabling the generation of TyCons by modifying:

mkTypeRepTodoBinds :: [TypeRepTodo] -> TcM TcGblEnv
mkTypeRepTodoBinds _ = getGblEnv

This is the result:

[1 of 1] Compiling A10000           ( A10000.hs, A10000.o )                                   
   1,731,693,280 bytes allocated in the heap   
     280,362,376 bytes copied during GC        
      41,423,608 bytes maximum residency (10 sample(s))                                       
         746,272 bytes maximum slop            
             102 MB total memory in use (0 MB lost due to fragmentation)                      

                                     Tot time (elapsed)  Avg pause  Max pause                 
  Gen  0       401 colls,     0 par    0.111s   0.111s     0.0003s    0.0065s                 
  Gen  1        10 colls,     0 par    0.124s   0.124s     0.0124s    0.0298s                 

  TASKS: 4 (1 bound, 3 peak workers (3 total), using -N1)                                     

  SPARKS: 0 (0 converted, 0 overflowed, 0 dud, 0 GC'd, 0 fizzled)                             

  INIT    time    0.000s  (  0.000s elapsed)   
  MUT     time    0.773s  (  0.889s elapsed)   
  GC      time    0.235s  (  0.235s elapsed)   
  EXIT    time    0.000s  (  0.007s elapsed)   
  Total   time    1.008s  (  1.130s elapsed)   

  Alloc rate    2,241,052,377 bytes per MUT second                                            

  Productivity  76.7% of total user, 79.2% of total elapsed

It appears that by default I pay >50% of compile time for a feature that I probably won't use.

I'm sorry if this is a duplicate, I've looked at https://ghc.haskell.org/trac/ghc/wiki/Typeable, but nothing seemed relevant.

Change History (2)

comment:1 Changed 2 years ago by simonpj

It appears that by default I pay >50% of compile time for a feature that I probably won't use.

Yes, we debated this quite a bit. For each data type we generate a bit of static information, just Haskell data structures, to describe the type. That way, anyone (in another module) who needs Typeable T or Typeable 'A0001 can have it.

But mostly they don't need that, so it's wasted bloat.

Another alternative is to generate those data structure on-the-fly in every client module. Before long we'd be saying "let's avoid doing that multiple times in the same module, or doing it in module M if it's already done in one of M's imports", and we'd add machinery to avoid duplication. That is all extra complexity. Maybe it's justified. It'd be interesting to know who else tripped over this.

comment:2 Changed 2 years ago by simonpj

Keywords: Typeable added
Note: See TracTickets for help on using tickets.