| 182 | Implementation is unfortunately tricky. Simply eliminating the boxing in Stg is |
| 183 | easy, and this by itself saves us two words per value + pointer dereferencing. |
| 184 | However, the generated code will be ugly, and if we could do this in Core |
| 185 | instead of Stg, the simplifier would be able to do some follow-up optimizations |
| 186 | and generate good code. |
| 187 | |
| 188 | To be more specific, we want to do these transformations: |
| 189 | |
| 190 | First: |
| 191 | {{{ |
| 192 | D arg1 arg2 ... argN |
| 193 | ==> |
| 194 | nv_arg (where nv_arg is the only non-void argument) |
| 195 | }}} |
| 196 | |
| 197 | (But we somehow need to bind the other args or do substitution. If we do this in |
| 198 | Stg, though, we don't need to bind those args: unarise doesn't care what a void |
| 199 | argument is. As long as it is void, unarise gets rid of it, and it can check |
| 200 | void-ness by looking at the Id's type.) |
| 201 | |
| 202 | Second: |
| 203 | |
| 204 | {{{ |
| 205 | case <exp1> of |
| 206 | D arg1 arg2 ... argN -> <exp2> |
| 207 | ==> |
| 208 | let arg1 = ... |
| 209 | arg2 = ... |
| 210 | arg3 = ... |
| 211 | in <exp2> |
| 212 | }}} |
| 213 | (we know only one of these args will be non-void, but all of them should be |
| 214 | bound as they can be referred to in <exp2>) |
| 215 | |
| 216 | If we do this in Stg we lose some optimization opportunities and generate ugly |
| 217 | code. For example, if the first transformation happens in a let-binding RHS, the |
| 218 | simplifier may decide to inline it, as inlining can't duplicate work after the |
| 219 | transformation. Similarly, it can decide to inline the non-void argument after the |
| 220 | second transformation, which may lead to further optimizations. |
| 221 | |
| 222 | For an example of ugly code, suppose we had this: |
| 223 | |
| 224 | {{{ |
| 225 | case <exp1> of |
| 226 | D (T x) -> <exp2> |
| 227 | }}} |
| 228 | |
| 229 | in Stg this looks like |
| 230 | |
| 231 | {{{ |
| 232 | case <exp1> of |
| 233 | D v -> case v of |
| 234 | T x -> <exp2> |
| 235 | }}} |
| 236 | |
| 237 | So now if we do the second transformation we get |
| 238 | |
| 239 | {{{ |
| 240 | let v = <exp1> in |
| 241 | case v of |
| 242 | T x -> <exp2> |
| 243 | }}} |
| 244 | |
| 245 | but ideally we'd get |
| 246 | |
| 247 | {{{ |
| 248 | case <exp1> of |
| 249 | T x -> <exp2> |
| 250 | }}} |
| 251 | |
| 252 | The simplifier would be able to do this after the second transformation, if the transformation were done in Core. |
| 253 | |
| 254 | So the problem is: |
| 255 | |
| 256 | - If we implement this in Stg we generate ugly code, and miss some optimization |
| 257 | opportunities (and arguably it doesn't buy us much, it saves 2 words per |
| 258 | allocation + pointer dereferencing) |
| 259 | |
| 260 | - Implementing this in Core is very hard, if not impossible, without losing |
| 261 | type safety. |
| 262 | |
| 263 | (copied from [https://mail.haskell.org/pipermail/ghc-devs/2016-September/012741.html the mailing list discussion]) |
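
The two transformations above, plus the follow-up cleanup the simplifier would do, can be sketched on a toy expression language. This is only an illustration under stated assumptions: `Expr`, `Pat`, `ConInfo`, `unbox` and `inlineScrut` are hypothetical names invented here and are not GHC's Core or Stg data types or passes. Void-ness is modelled by recording the index of the single non-void field, whereas GHC would look at the Id's type, and constructor applications are assumed to be saturated.

```haskell
module Main where

-- Toy expression language; NOT GHC's Core or Stg.
data Expr
  = Var String
  | Lit Int
  | Void                       -- stands for any void (zero-width) value
  | Con String [Expr]          -- saturated constructor application
  | Case Expr [(Pat, Expr)]
  | Let [(String, Expr)] Expr
  deriving (Eq, Show)

data Pat = PCon String [String]
  deriving (Eq, Show)

-- Hypothetical description of a single-constructor type whose constructor
-- has exactly one non-void field, at index 'nonVoidIx'.
data ConInfo = ConInfo { conName :: String, nonVoidIx :: Int }

-- Apply both transformations in one traversal.
unbox :: ConInfo -> Expr -> Expr
unbox ci = go
  where
    -- First transformation: D arg1 ... argN ==> nv_arg
    -- (void args are simply dropped, as in the Stg variant).
    go (Con c args)
      | c == conName ci = go (args !! nonVoidIx ci)
      | otherwise       = Con c (map go args)
    -- Second transformation: case e of D a1 ... aN -> body ==> let ... in body
    -- The non-void binder gets the scrutinee; the others are bound to Void.
    go (Case scrut [(PCon c bs, body)])
      | c == conName ci =
          Let [ (b, if i == nonVoidIx ci then go scrut else Void)
              | (i, b) <- zip [0 ..] bs ]
              (go body)
    go (Case scrut alts) = Case (go scrut) [ (p, go e) | (p, e) <- alts ]
    go (Let bs body)     = Let [ (v, go e) | (v, e) <- bs ] (go body)
    go e                 = e

-- Does variable v occur in e? Ignores shadowing, so it may over-approximate,
-- which only makes the cleanup below more conservative.
occurs :: String -> Expr -> Bool
occurs v e = case e of
  Var x       -> x == v
  Con _ as    -> any (occurs v) as
  Case s alts -> occurs v s || any (occurs v . snd) alts
  Let bs b    -> any (occurs v . snd) bs || occurs v b
  _           -> False

-- One cleanup a simplifier could do afterwards:
--   let v = e in case v of alts  ==>  case e of alts
-- provided v is not mentioned in the alternatives.
inlineScrut :: Expr -> Expr
inlineScrut (Let [(v, e)] (Case (Var v') alts))
  | v == v', not (any (occurs v . snd) alts) = Case e alts
inlineScrut e = e

main :: IO ()
main = do
  let d     = ConInfo "D" 0   -- D has a single, non-void field
      p     = ConInfo "P" 1   -- P's first field is void, second is not
      exp1  = Var "exp1"
      exp2  = Var "exp2"
      -- case exp1 of D v -> case v of T x -> exp2
      nested      = Case exp1 [(PCon "D" ["v"], Case (Var "v") [(PCon "T" ["x"], exp2)])]
      afterSecond = unbox d nested
  -- First transformation: P <void> 1 becomes just 1.
  print (unbox p (Con "P" [Void, Lit 1]) == Lit 1)
  -- Second transformation yields the "ugly" let-then-case form.
  print (afterSecond == Let [("v", exp1)] (Case (Var "v") [(PCon "T" ["x"], exp2)]))
  -- The cleanup recovers the ideal form: case exp1 of T x -> exp2
  print (inlineScrut afterSecond == Case exp1 [(PCon "T" ["x"], exp2)])
```

Running `main` should print `True` three times. The sketch deliberately leaves out the hard part discussed above: it has no types, so it says nothing about how to keep the result well-typed if the transformation were done in Core.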