Opened 11 months ago

Last modified 10 months ago

#16014 new task

Avoid unnecessary code duplication from String Literals.

Reported by: AndreasK Owned by:
Priority: normal Milestone:
Component: Compiler Version: 8.6.2
Keywords: Cc: osa1
Operating System: Unknown/Multiple Architecture: Unknown/Multiple
Type of failure: None/Unknown Test Case:
Blocked By: Blocking:
Related Tickets: #15113 Differential Rev(s):
Wiki Page:


When we compile a function like this:

foo :: Int -> String
foo 1 = "1_String"
foo 2 = "2_String"
foo 3 = "3_String"
foo 4 = "4_String"
foo 5 = "5_String"
foo _ = "unknown_string"

We get a few things.

Obviously there are the actual strings:

-- RHS size: {terms: 1, types: 0, coercions: 0, joins: 0/0}
Foo.foo10 :: GHC.Prim.Addr#
Foo.foo10 = "1_String"#

There is no way around these so that's fine.

Then we end up with a function to unpack the actual string.

-- RHS size: {terms: 2, types: 0, coercions: 0, joins: 0/0}
Foo.foo9 :: [Char]
 Unf=Unf{Src=<vanilla>, TopLvl=True, Value=False, ConLike=True,
         WorkFree=False, Expandable=True, Guidance=IF_ARGS [] 20 0}]
Foo.foo9 = GHC.CString.unpackCString# Foo.foo10

We want to have a closure for each of these. Otherwise we end up constructing a new string each time, which would be a waste. So on the surface this is fine. However we end up with one of these for each string.

Looking at the Cmm/Asm the functions are not THAT small.

[Foo.foo9_entry() //  [R1]
         { info_tbls: [(c2hj,
                        label: Foo.foo9_info
                        rep: HeapRep static { Thunk }
                        srt: Nothing)]
           stack_info: arg_space: 8 updfr_space: Just 8
       c2hj: // global
           if ((Sp + -16) < SpLim) (likely: False) goto c2hk; else goto c2hl;
       c2hk: // global
           R1 = R1;
           call (stg_gc_enter_1)(R1) args: 8, res: 0, upd: 8;
       c2hl: // global
           (_c2hg::I64) = call "ccall" arg hints:  [PtrHint,
                                                    PtrHint]  result hints:  [PtrHint] newCAF(BaseReg, R1);
           if (_c2hg::I64 == 0) goto c2hi; else goto c2hh;
       c2hi: // global
           call (I64[R1])() args: 8, res: 0, upd: 8;
       c2hh: // global
           I64[Sp - 16] = stg_bh_upd_frame_info;
           I64[Sp - 8] = _c2hg::I64;
           R2 = Foo.foo10_bytes;
           Sp = Sp - 16;
           call GHC.CString.unpackCString#_info(R2) args: 24, res: 0, upd: 24;

In actual code size (HEAD/-O2) the code + info table is about 100Byte.

To give an idea in my current ghc-stage2 I have 12,409 calls to unpackCString, and I assume 99% of them are from actual strings. So that comes out to about one megabyte. Out of a binary size of 84MB, so surprisingly significant actually!

Now what can we do about this!

I think the right way to go here is to define a unpackString function which does the actual work. Then we jump into that from a small per-string wrapper.

So we would end up with code like:

[Foo.foo9_entry() //  [R1]
        // R1 = Closure pointer
        R2 = Foo.foo10_bytes;
        call (unpackString)(R1,R2) args: 8, res: 0, upd: 8;

Which should be well under 50 bytes. It does cause an extra indirection when unpacking the String, but if the performance of that matters using String is likely the wrong choice to begin with.

How to actually get GHC to generate code like that I'm not sure yet.

Change History (4)

comment:1 Changed 11 months ago by bgamari

#15113 is another take on this topic.

Last edited 11 months ago by bgamari (previous) (diff)

comment:2 Changed 11 months ago by AndreasK

comment:3 Changed 11 months ago by bgamari

Milestone: 8.6.3

Ticket retargeted after milestone closed

comment:4 Changed 10 months ago by osa1

Cc: osa1 added
Note: See TracTickets for help on using tickets.