182 | | I know few details about Core, but I would ''guess'' that the only Core feature we'd need to add is a flag for each data type indicating whether it is eligible for the newtype optimization. If so, then that optimization can be applied in code generation when types are dropped. There might be some future tuning to adjust how the inliner treats the constructor, but that doesn't seem at all urgent for now. |
183 | | |
184 | | That said, it would be nice also to try to avoid "double forcing" when digging through the constructors. |
| 182 | Implementation is unfortunately tricky. Simply eliminating the boxing in Stg is |
| 183 | easy, and by itself this saves two words per value plus a pointer dereference.
| 184 | However, the generated code will be ugly, and if we could do this in Core |
| 185 | instead of Stg, the simplifier would be able to do some follow-up optimizations |
| 186 | and generate good code. |
| 187 | |
| 188 | To be more specific, we want to do these transformations: |
| 189 | |
| 190 | First: |
| 191 | {{{ |
| 192 | D arg1 arg2 ... argN |
| 193 | ==> |
| 194 | nv_arg (where nv_arg is the only non-void argument) |
| 195 | }}} |
| 196 | |
| 197 | (But we somehow need to bind the other args or substitute for them. If we do
| 198 | this in Stg, though, we don't need to bind those args: unarise doesn't care
| 199 | what a void argument is. As long as it's void, unarise gets rid of it, and it
| 200 | can check void-ness by looking at the Id's type.)
| 201 | |
| 202 | Second: |
| 203 | |
| 204 | {{{ |
| 205 | case <exp1> of
| 206 |   D arg1 arg2 ... argN -> <exp2>
| 207 | ==>
| 208 | let arg1 = ...
| 209 |     arg2 = ...
| 210 |     arg3 = ...
| 211 | in <exp2>
| 212 | }}} |
| 213 | (We know only one of these args will be non-void, but all of them should be
| 214 | bound, as they can be referred to in <exp2>.)
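
The two transformations above can be sketched on a toy expression type. This is only an illustration: `Expr`, `VoidVal`, `unbox` and the explicit "is void" flags are hypothetical names invented for the sketch and are not GHC's Core or Stg types (in GHC, void-ness would be read off the Id's type). The sketch simply drops void constructor arguments, which mirrors what unarise is described as doing in Stg.

{{{
-- Toy expression language. All names are hypothetical, not GHC's.
data Expr
  = Var String
  | VoidVal                                 -- stands for any void value
  | Con String [(Expr, Bool)]               -- args paired with an "is void" flag
  | Case Expr String [(String, Bool)] Expr  -- case scrut of C b1..bN -> body
  | Let [(String, Expr)] Expr
  deriving (Eq, Show)

-- Apply both transformations for the constructor named d, assuming d has
-- exactly one non-void field.
unbox :: String -> Expr -> Expr
unbox d expr = case expr of
  -- First transformation: D arg1 .. argN ==> the only non-void argument.
  Con c args
    | c == d, [nv] <- [a | (a, False) <- args] -> unbox d nv
    | otherwise -> Con c [(unbox d a, v) | (a, v) <- args]
  -- Second transformation: the case becomes a let that binds every binder.
  -- The non-void binder gets the scrutinee, the void ones get VoidVal.
  Case scrut c bndrs body
    | c == d, [_] <- [b | (b, False) <- bndrs] ->
        let scrut' = unbox d scrut
        in Let [ (b, if isVoid then VoidVal else scrut') | (b, isVoid) <- bndrs ]
               (unbox d body)
    | otherwise -> Case (unbox d scrut) c bndrs (unbox d body)
  Let binds body -> Let [(b, unbox d rhs) | (b, rhs) <- binds] (unbox d body)
  _ -> expr

main :: IO ()
main = do
  print (unbox "D" (Con "D" [(VoidVal, True), (Var "y", False)]))
  print (unbox "D" (Case (Var "e1") "D" [("v", False)] (Var "v")))
}}}

Note that the let produced by the second transformation is left as is. Nothing in this sketch cleans it up afterwards.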
| 215 | |
| 216 | If we do this in Stg we lose some optimization opportunities and generate ugly
| 217 | code. For example, if the first transformation happened in Core in a
| 218 | let-binding RHS, the simplifier might decide to inline it, since inlining can't
| 219 | duplicate work after the transformation. Similarly, it could decide to inline
| 220 | the non-void argument after the second transformation, which may lead to further optimizations.
| 221 | |
| 222 | As an example of ugly code, suppose we had this:
| 223 | |
| 224 | {{{ |
| 225 | case <exp1> of
| 226 |   D (T x) -> <exp2>
| 227 | }}} |
| 228 | |
| 229 | in Stg this looks like |
| 230 | |
| 231 | {{{ |
| 232 | case <exp1> of
| 233 |   D v -> case v of
| 234 |            T x -> <exp2>
| 235 | }}} |
| 236 | |
| 237 | So now if we do the second transformation we get |
| 238 | |
| 239 | {{{ |
| 240 | let v = <exp1> in
| 241 | case v of
| 242 |   T x -> <exp2>
| 243 | }}} |
| 244 | |
| 245 | but ideally we'd get |
| 246 | |
| 247 | {{{ |
| 248 | case <exp1> of
| 249 |   T x -> <exp2>
| 250 | }}} |
| 251 | |
| 252 | If the second transformation were done in Core, the simplifier would be able to do this afterwards.
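
As a rough illustration of that clean-up step, here is a minimal sketch on a toy expression type. It is not GHC's simplifier: `Expr`, `occurs` and `inlineScrut` are hypothetical names, the let is assumed to be non-recursive, and the rewrite only fires when the let-bound variable is used solely as the case scrutinee.

{{{
-- Toy expression language. All names are hypothetical, not GHC's.
data Expr
  = Var String
  | Case Expr String [String] Expr  -- case scrut of C b1..bN -> body
  | Let String Expr Expr            -- non-recursive let
  deriving (Eq, Show)

-- Does variable v occur free in the expression?
occurs :: String -> Expr -> Bool
occurs v expr = case expr of
  Var x            -> x == v
  Case s _ bs body -> occurs v s || (v `notElem` bs && occurs v body)
  Let x rhs body   -> occurs v rhs || (x /= v && occurs v body)

-- let v = e in case v of alt  ==>  case e of alt
-- Only when v is not needed in the alternative's body.
inlineScrut :: Expr -> Expr
inlineScrut (Let v rhs (Case (Var v') c bs body))
  | v == v', v `elem` bs || not (occurs v body) = Case rhs c bs body
inlineScrut e = e

main :: IO ()
main =
  -- The "ugly" form from the example above, with <exp1> and <exp2>
  -- replaced by placeholder variables.
  print (inlineScrut (Let "v" (Var "e1") (Case (Var "v") "T" ["x"] (Var "x"))))
}}}

If `v` were still mentioned in the body of the alternative, the sketch leaves the expression unchanged rather than risk duplicating `<exp1>`.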
| 253 | |
| 254 | So the problem is |
| 255 | |
| 256 | - If we implement this in Stg, we generate ugly code and miss some optimization
| 257 | opportunities (and arguably it doesn't buy us much: it saves 2 words per
| 258 | allocation plus a pointer dereference).
| 259 | |
| 260 | - Implementing this in Core is very hard, if not impossible, without losing |
| 261 | type safety. |
| 262 | |
| 263 | (copied from [https://mail.haskell.org/pipermail/ghc-devs/2016-September/012741.html the mailing list discussion]) |