3 | | worth looking at. |
4 | | |
5 | | == Infelicity 1: analysis does not take into account extra unboxing done by the CPR transformation == |
6 | | An example is: |
7 | | {{{ |
8 | | e :: (Int, Int) -> Int -> (Int, Int) |
9 | | e x y = x `seq` if y > 10 |
10 | | then x |
11 | | else e x (y + 1) |
12 | | }}} |
13 | | Because x is seqed, the occurrence in the "then" branch gets the CPR |
14 | | property so e has the CPR property overall. However, the overall |
15 | | demand on x is S(AA), i.e. the demand analyser believes the x box is |
16 | | used -- if the box were unused it would get the demand U(LL). So the |
17 | | overall demand type is S(AA)U(L)m, and the worker looks like: |
18 | | {{{ |
19 | | Rec { |
20 | | CPR2.$we [Occ=LoopBreaker] |
21 | | :: (GHC.Types.Int, GHC.Types.Int) |
22 | | -> GHC.Prim.Int# -> (# GHC.Types.Int, GHC.Types.Int #) |
23 | | [GblId, |
24 | | Arity=2, |
25 | | Caf=NoCafRefs, |
26 | | Str=DmdType S(AA)L, |
27 | | Unf=OtherCon []] |
28 | | CPR2.$we = |
29 | | \ (w_srv :: (GHC.Types.Int, GHC.Types.Int)) |
30 | | (ww_srt :: GHC.Prim.Int#) -> |
31 | | case GHC.Prim.># ww_srt 10 of _ { |
32 | | GHC.Types.False -> |
33 | | case GHC.Prim.+# ww_srt 1 of sat_ssS { __DEFAULT -> |
34 | | CPR2.$we w_srv sat_ssS |
35 | | }; |
36 | | GHC.Types.True -> |
37 | | case w_srv of _ { (ww2_srA, ww3_srB) -> (# ww2_srA, ww3_srB #) } |
38 | | } |
39 | | end Rec } |
40 | | }}} |
41 | | But if we demand-analysed $we then GHC would say that it had the |
42 | | demand type U(LL)L and unbox the pair argument! It seems silly that |
43 | | the demand analyser outputs code that can be improved further by just |
44 | | running demand analysis again. |
45 | | |
46 | | Somewhere where this really bit me in practice is in: |
47 | | {{{ |
48 | | d :: M.Map Int Int -> (Int, Int) |
49 | | d m = M.foldrWithKey' (\k v (a, b) -> if k + v > 2 then (a, b) else |
50 | | (b, a)) (0, 1) m |
51 | | }}} |
52 | | Which generates this inner loop: |
53 | | {{{ |
54 | | Rec { |
55 | | CPR2.$wgo2 [Occ=LoopBreaker] |
56 | | :: (GHC.Types.Int, GHC.Types.Int) |
57 | | -> Data.Map.Base.Map GHC.Types.Int GHC.Types.Int |
58 | | -> (# GHC.Types.Int, GHC.Types.Int #) |
59 | | [GblId, |
60 | | Arity=2, |
61 | | Caf=NoCafRefs, |
62 | | Str=DmdType S(AA)S, |
63 | | Unf=OtherCon []] |
64 | | CPR2.$wgo2 = |
65 | | \ (w_srS :: (GHC.Types.Int, GHC.Types.Int)) |
66 | | (w1_srQ :: Data.Map.Base.Map GHC.Types.Int GHC.Types.Int) -> |
67 | | case w1_srQ of _ { |
68 | | Data.Map.Base.Tip -> |
69 | | case w_srS of _ { (ww1_srW, ww2_srX) -> (# ww1_srW, ww2_srX #) }; |
70 | | Data.Map.Base.Bin rb_st1 kx_ss3 x_ssa l_ssk r_ss6 -> |
71 | | case kx_ss3 of _ { GHC.Types.I# x1_ssd -> |
72 | | case CPR2.$wgo2 w_srS r_ss6 of _ { (# ww1_ssi, ww2_ssh #) -> |
73 | | case x_ssa of _ { GHC.Types.I# y_sse -> |
74 | | case GHC.Prim.+# x1_ssd y_sse of sat_st0 { __DEFAULT -> |
75 | | case GHC.Prim.># sat_st0 2 of _ { |
76 | | GHC.Types.False -> |
77 | | let { |
78 | | sat_ssZ :: (GHC.Types.Int, GHC.Types.Int) |
79 | | [LclId] |
80 | | sat_ssZ = (ww2_ssh, ww1_ssi) } in |
81 | | CPR2.$wgo2 sat_ssZ l_ssk; |
82 | | GHC.Types.True -> |
83 | | let { |
84 | | sat_st6 :: (GHC.Types.Int, GHC.Types.Int) |
85 | | [LclId] |
86 | | sat_st6 = (ww1_ssi, ww2_ssh) } in |
87 | | CPR2.$wgo2 sat_st6 l_ssk |
88 | | } |
89 | | } |
90 | | } |
91 | | } |
92 | | } |
93 | | } |
94 | | end Rec } |
95 | | }}} |
96 | | We can save a number of allocations proportional to the size of the |
97 | | Map if the demand analyser didn't have this problem. |
98 | | |
99 | | I hacked up the analyser to "fix" this by changing the lines at line |
100 | | 473 onward to read: |
101 | | {{{ |
102 | | if isTopLevel top_lvl |
103 | | then fn_ty -- Don't record top level things |
104 | | else case dmd of |
105 | | Box (Eval (Poly Abs)) | DmdType _ _ res <- fn_ty, |
106 | | returnsCPR res -> addVarDmd fn_ty var (Eval (Poly lazyDmd)) |
107 | | _ |
108 | | -> addVarDmd fn_ty var dmd |
109 | | }}} |
110 | | So if we demand a CPRish variable (such as bound by a case scrutinee |
111 | | or a U(LL)-demanded lambda-binder) in the evalDmd then I throw away |
112 | | the Box part of the evalDmd on the basis that after CPRing we won't |
113 | | demand the box at all. |
114 | | |
115 | | I doubt this hack is the right solution (and I haven't tried it to see |
116 | | how it affects the libraries) but perhaps it gives you some ideas. |
117 | | |
| 3 | worth looking at. One is #5949. The other is: |