Opened 3 years ago

Closed 3 years ago

#12357 closed bug (fixed)

Increasing maximum constraint tuple size significantly blows up compiler allocations

Reported by: bgamari Owned by: bgamari
Priority: normal Milestone: 8.2.1
Component: Compiler Version: 7.10.3
Keywords: Cc: osa1
Operating System: Unknown/Multiple Architecture: Unknown/Multiple
Type of failure: Compile-time performance bug Test Case:
Blocked By: Blocking:
Related Tickets: Differential Rev(s): Phab:D2400, Phab:D2414
Wiki Page:

Description (last modified by bgamari)

In July 2015 (dd3080fe0263082f65bf2570f49189c277b12e28) the maximum constraint tuple size was raised from 16 to 62 to address #10451. It turns out that this change is apparently one of the larger compile-time regressions in recent GHC history. For instance, the nofib real/fulsom/Shapes.hs module regresses by around 15% in both compiler allocations and compile time.

Judging by ticky and the cost-center profiler the performance regression appears to simply be the result of the loading the new declarations from the interface file, as one might expect. Nevertheless, it's quite silly to pay such a large price for a change that only gets occasionally exercised.

Change History (39)

comment:1 Changed 3 years ago by bgamari

Description: modified (diff)

comment:2 Changed 3 years ago by bgamari

Description: modified (diff)

comment:3 Changed 3 years ago by bgamari

Here's a diff of ticky output from fulsom's Shapes module compiled with dd3080fe0263082f65bf2570f49189c277b12e28 and its parent. There's nothing particularly alarming here, as far as I can tell: most of the quantitative differences are in IfaceType, Unique, Binary, and BinIface, which all makes good sense given we now need to load more declarations from the GHC.Classes interface file. The original ticky output can be found here.

Last edited 3 years ago by bgamari (previous) (diff)

comment:4 Changed 3 years ago by bgamari

One thing to point out here is that a significant amount of cost from this patch is due to the construction of the superclass dictionary selectors (e.g. OccName.mkSuperDictSelOcc), since we now have to construct a few thousand more selectors than we did previously.

comment:5 Changed 3 years ago by bgamari

Here are the actual relative changes in allocations.

% change alloc A alloc B name
+79980.0% 320 256256 sat_sjxb (ghc-7.11.20150614:TcRnMonad)
+3860.0% 440 17424 sat_sjxt (ghc-7.11.20150614:TcRnMonad)
+3772.0% 400 15488 sat_sjxu (ghc-7.11.20150614:TcRnMonad)
+1516.7% 432 6984 sat_sjyB (ghc-7.11.20150614:TcHsSyn)
+1253.1% 22040 298224 (ghc-7.11.20150614:OccName.mkSuperDictSelOcc)
+417.8% 58488 302840 ds1 (ghc-7.11.20150614:LoadIface)
+307.9% 499856 2038832 $wa3 (ghc-7.11.20150614:Encoding.)
+287.5% 384 1488 (ghc-7.11.20150614:Unique.mkCTupleTyConUnique)
+287.5% 3072 11904 (ghc-7.11.20150614:TysWiredIn.cTupleTyConName)
+255.6% 936 3328 sat_sjz8 (ghc-7.11.20150614:TcRnMonad)
+165.7% 4477200 11897920 $wa5 (ghc-7.11.20150614:Encoding)
+139.4% 2376 5688 (ghc-7.11.20150614:OccName.mkClassDataConOcc)
+132.4% 17792 41344 (ghc-7.11.20150614:IfaceSyn.ifaceDeclImplicitBndrs)
+77.3% 39520 70064 (ghc-7.11.20150614:FastString.mkFastString)
+51.0% 337216 509256 (ghc-7.11.20150614:IfaceEnv.lookupOrig)
+50.7% 259992 391920 $wa86 (ghc-7.11.20150614:Binary)
+49.6% 63104 94384 updNameCacheTcRn (ghc-7.11.20150614:IfaceEnv)
+48.9% 3760 5600 $cget (ghc-7.11.20150614:CoAxiom)
+47.7% 224336 331240 (ghc-7.11.20150614:IfaceEnv.extendNameCache)
+45.4% 302448 439896 (ghc-7.11.20150614:UniqSupply.takeUniqFromSupply)
+44.6% 245336 354816 (ghc-7.11.20150614:Name.mkExternalName)
+43.6% 350440 503160 (ghc-7.11.20150614:FastString.unpackFS)
+42.7% 143104 204192 $wa2 (ghc-7.11.20150614:Encoding.)
+28.8% 5168 6656 sat_sjzc (ghc-7.11.20150614:TcHsSyn)
+28.1% 318552 407976 (ghc-7.11.20150614:Unique.mkVarOccUnique)
+27.7% 3984 5088 (ghc-7.11.20150614:OccName.mkDataConWorkerOcc)
+27.6% 283320 361520 (ghc-7.11.20150614:FastString.mkFastStringByteString3)
+27.6% 113328 144608 $wa (ghc-7.11.20150614:Encoding.)
+24.2% 4560 5664 (ghc-7.11.20150614:OccName.mkOccName)
+24.2% 905968 1124928 (ghc-7.11.20150614:TysWiredIn.isBuiltInOcc_maybe)
+23.5% 605144 747376 $wa (ghc-7.11.20150614:FastString.)
+23.3% 269056 331616 sat_s29W (ghc-7.11.20150614:UniqSupply)
+23.3% 269056 331616 sat_s29V (ghc-7.11.20150614:UniqSupply)
+23.3% 672640 829040 sat_s29U (ghc-7.11.20150614:UniqSupply)
+23.3% 134528 165808 a5 (ghc-7.11.20150614:UniqSupply)
+22.8% 3274992 4021296 $ccompare1 (ghc-7.11.20150614:Module)

comment:6 Changed 3 years ago by bgamari

It's a bit surprising how much changes in Encoding given that the new and old interface files are identical save the new classes, the names of which aren't terribly long.

comment:7 Changed 3 years ago by simonpj

It'd be great to investigate this Encoding things. There seems to be a LOT of allocation there... but all we are doing is loading an interface file, so what encoding is there to do?

comment:8 Changed 3 years ago by bgamari

Here are the same results from comment:5 but shown in terms of absolute change. All allocations given in these tables are in bytes.

Change alloc A alloc B name
+7420720.0 4477200 11897920 $wa5 (ghc-7.11.20150614:Encoding)
+1869440.0 10348960 12218400 $cgetUnique (ghc-7.11.20150614:Unique)
+1538976.0 499856 2038832 $wa3 (ghc-7.11.20150614:Encoding.)
+998016.0 6526672 7524688 (ghc-7.11.20150614:FastTypes.iBox)
+810336.0 4372400 5182736 (ghc-7.11.20150614:FastString.uniqueOfFS)
+746304.0 3274992 4021296 $ccompare1 (ghc-7.11.20150614:Module)
+458528.0 5368328 5826856 a11 (ghc-7.11.20150614:IOEnv)
+375360.0 2830992 3206352 $ccompare (ghc-7.11.20150614:Module)
+277104.0 1640736 1917840 (ghc-7.11.20150614:Unique.mkUnique)
+276184.0 22040 298224 (ghc-7.11.20150614:OccName.mkSuperDictSelOcc)
+255936.0 320 256256 sat_sjxb (ghc-7.11.20150614:TcRnMonad)
+244352.0 58488 302840 ds1 (ghc-7.11.20150614:LoadIface)
+218960.0 905968 1124928 (ghc-7.11.20150614:TysWiredIn.isBuiltInOcc_maybe)
+172040.0 337216 509256 (ghc-7.11.20150614:IfaceEnv.lookupOrig)
+156400.0 672640 829040 sat_s29U (ghc-7.11.20150614:UniqSupply)
+152720.0 350440 503160 (ghc-7.11.20150614:FastString.unpackFS)
+146832.0 2029744 2176576 (ghc-7.11.20150614:Binary.getByte1)
+145360.0 1439680 1585040 $wa1 (ghc-7.11.20150614:BinIface.)
+142232.0 605144 747376 $wa (ghc-7.11.20150614:FastString.)
+137448.0 302448 439896 (ghc-7.11.20150614:UniqSupply.takeUniqFromSupply)
+131928.0 259992 391920 $wa86 (ghc-7.11.20150614:Binary)
+125120.0 1819648 1944768 (ghc-7.11.20150614:UniqFM.lookupUFM)
+125120.0 640832 765952 a17 (ghc-7.11.20150614:UniqFM)
+112056.0 765296 877352 a10 (ghc-7.11.20150614:IOEnv)
+109480.0 245336 354816 (ghc-7.11.20150614:Name.mkExternalName)
+106904.0 224336 331240 (ghc-7.11.20150614:IfaceEnv.extendNameCache)
+89424.0 318552 407976 (ghc-7.11.20150614:Unique.mkVarOccUnique)
+87952.0 1069664 1157616 $wa3 (ghc-7.11.20150614:Binary.)
+78200.0 283320 361520 (ghc-7.11.20150614:FastString.mkFastStringByteString3)
+74520.0 2068360 2142880 $cget5 (ghc-7.11.20150614:IfaceType)
+72680.0 1294840 1367520 $cget4 (ghc-7.11.20150614:IfaceType)
+72680.0 748080 820760 $cget2 (ghc-7.11.20150614:IfaceType)
+72680.0 787960 860640 $cget1 (ghc-7.11.20150614:IfaceType)
+63296.0 827536 890832 (ghc-7.11.20150614:Unique.mkUniqueGrimily)
+62560.0 269056 331616 sat_s29W (ghc-7.11.20150614:UniqSupply)
+62560.0 269056 331616 sat_s29V (ghc-7.11.20150614:UniqSupply)
+61088.0 143104 204192 $wa2 (ghc-7.11.20150614:Encoding.)
+49128.0 352488 401616 (ghc-7.11.20150614:Name.nameUnique)
Last edited 3 years ago by bgamari (previous) (diff)

comment:9 Changed 3 years ago by bgamari

It looks like much of the trouble is that we are unpacking OccNames in isBuiltInOcc_maybe, which gets called in LoadIface.loadDecl by IfaceEnv.lookupOrig. This results in a great deal of allocations by utf8DecodeString, which eagerly decodes the entire string buffer into a [Char]. This eager behavior seems a bit silly. Unfortunately, I made a few quick attempts at making this decoding lazy but sadly it typically hurt allocations more than it helped due to thunk allocations

It also seems a bit silly that we inspect the name itself to determine whether it is built-in. Afterall, there is a finite universe of built-in OccNames and we already know the FastString hash of the OccName we are testing, so it seems like this really should just be a UniqFM lookup.

comment:10 Changed 3 years ago by bgamari

I've put up a few attempts at reducing some of the costs associated with FastString operations and in particular isBuiltInOcc_maybe. The most promising so far is Phab:D2385, which reworks isBuiltInOcc to avoid unpacking the name being checked. It does this by moving the implementation away from String pattern matching and towards a FastString lookup in a UniqFM. Initial indications suggest this reduces allocations by around 5% when compiling nofib/real/fulsom/Shape.hs.

Another thing I noticed while tracking this down is the behavior of concatFS, which unpacks the the FastStrings being concatenated and then builds a new FastString from the resulting list. It seems like there might be slightly better constant factors to be had by simply concatenating the ByteStrings themselves. I have tried reworking this in Phab:D2384, although I don't yet have solid numbers characterizing its effect.

comment:11 Changed 3 years ago by bgamari

The UniqFM approach appears to reduce allocations on Shape from 130 MByte to 112 MByte. Sadly this is still a fair bit above the allocations before the constraint tuple change, which was around 95 MByte.

comment:12 Changed 3 years ago by bgamari

It appears that much of the remaining difference is due to the construction of the superclass selector OccNames, which looks like this,

mkSuperDictSelOcc :: Int        -- ^ Index of superclass, e.g. 3
                  -> OccName    -- ^ Class, e.g. @Ord@
                  -> OccName    -- ^ Derived 'Occname', e.g. @$p3Ord@
mkSuperDictSelOcc index cls_tc_occ
  = mk_deriv varName "$p" (show index ++ occNameString cls_tc_occ)

This is rather inefficient as we first decode the class's OccName, then reencode it. After fixing concatFS with Phab:D2384 and redefining this helper as

mkSuperDictSelOcc index cls_tc_occ
  = mkOccNameFS varName $ concatFS [fsLit "$p", fsLit $ show index, occNameFS cls_tc_occ]

I find that allocations return to 96MBytes, which is very close to the allocations prior to the superclass constraint patch. I suspect that this rework should be performed to all of the helpers in OccName.

comment:13 Changed 3 years ago by bgamari

Another related efficiency issue can be found in isDerivedOccName, which decodes the entire OccName just to check the first two characters. One potential improvement here would be to use a bit in OccName (e.g. one of the many unused bits belonging to the Namespace field) to encode a "is derived" flag.

comment:14 in reply to:  13 Changed 3 years ago by bgamari

Replying to bgamari:

Another related efficiency issue can be found in isDerivedOccName ...

While this is technically true, isDerivedOccName is only used in a few non-essential places so it is likely fine that it isn't terribly efficient. It's likely not worth adding a flag as proposed.

comment:15 Changed 3 years ago by bgamari

For the record, here are the major allocations changes from dd3080fe0263082f65bf2570f49189c277b12e28,

Test name Absolute change relative change
compile-allocs/PTTrees 25393064 0.529524950607268
compile-allocs/Env 25411992 0.459749707346387
compile-allocs/Diff 25396112 0.454698441513462
compile-allocs/Attributes 25401672 0.443838549572384
compile-allocs/Rotate 25444576 0.425106327638958
compile-allocs/MyList 25454368 0.420214761736029
compile-allocs/QSort 25437696 0.410709377421855
compile-allocs/Defaults 18059728 0.398995946705711
compile-allocs/Result 18067832 0.376234141920803
compile-allocs/Queue 25430872 0.327207720756636
compile-allocs/Type_defs 25385544 0.31841141932715
compile-allocs/Getops 25411824 0.305931952326486
compile-allocs/Vtslib 25406880 0.297904017772972
compile-allocs/Digraph 25447904 0.290609017757
compile-allocs/MyRandom 18052240 0.280234183992874
compile-allocs/BinTest 18069304 0.266927204744095
compile-allocs/RoulSplit 18036648 0.266884025212266
compile-allocs/Prog 18153488 0.26291850611499
compile-allocs/Match 25406472 0.255481066153786
compile-allocs/GamtebMain 18060416 0.254664741449328
compile-allocs/PhotoElec 18076904 0.253060115990819
compile-allocs/Params 25428424 0.251588917607883
compile-allocs/Norm 18077768 0.245577563784965
compile-allocs/Checker 18028920 0.229655342982204
compile-allocs/FiniteMap 25446328 0.229357595736048
compile-allocs/Interact 18121304 0.229086930261854
compile-allocs/Dcore 25456736 0.225956351068074
compile-allocs/Vector 18077632 0.221214660856509
compile-allocs/Elefac 18106240 0.220899224830413
compile-allocs/Postscript 18088816 0.215961664665538
compile-allocs/Basics 18147128 0.213886880509137
compile-allocs/Engine 18064544 0.213754629948678
compile-allocs/Pair 18059280 0.212807877872538
compile-allocs/Subst 25469712 0.212141133436499
compile-allocs/Stdlib 25446584 0.211934106612719
compile-allocs/CharSeq 18020744 0.207692636071785
compile-allocs/Assemble_loadvec 18087200 0.204241417348172
compile-allocs/Tol_cal 18176248 0.201883838406277
compile-allocs/Parsers 25476944 0.199030526945435

As one might expect, the largest changes are in compilation of small modules, since the startup cost imposed by loading GHC.Classes is relatively large in these cases.

Last edited 3 years ago by bgamari (previous) (diff)

comment:16 Changed 3 years ago by Ben Gamari <ben@…>

In eb3d6595/ghc:

OccName: Avoid re-encoding derived OccNames

Previously we would form derived OccNames by first decoding the name
being derived from, manipulating it in [Char] form, and then
re-encoding. This is all very wasteful as we essentially always just
want to concatenate. Instead we now take care to form the final name
with only one concatFS.

Test Plan: Validate, examing compiler allocations

Reviewers: simonpj, austin

Reviewed By: simonpj

Subscribers: thomie

Differential Revision: https://phabricator.haskell.org/D2387

GHC Trac Issues: #12357

comment:17 Changed 3 years ago by Ben Gamari <ben@…>

In 6c7c193/ghc:

DsExpr: Remove usage of concatFS in fingerprintName

This was the only user of concatFS and really just wants the `String`
anyways.

Stumbled upon while looking at #12357.

Test Plan: Validate

Reviewers: austin

Reviewed By: austin

Subscribers: thomie

Differential Revision: https://phabricator.haskell.org/D2386

comment:18 Changed 3 years ago by Ben Gamari <ben@…>

In f53d761/ghc:

TysWiredIn: Use UniqFM lookup for built-in OccNames

Previously we would unpack the OccName into a String, then pattern match
against this string. Due to the implementation of `unpackFS`, this
actually unpacks the entire contents, even though we often only need to
look at the first few characters.

Here we take another approach: build a UniqFM with the known built-in
OccNames, allowing us to use `FastString`'s hash-based comparison
instead.

Reviewers: simonpj, austin, simonmar

Reviewed By: simonmar

Subscribers: thomie

Differential Revision: https://phabricator.haskell.org/D2385

GHC Trac Issues: #12357

comment:19 Changed 3 years ago by simonpj

See Omer's comments on the (now-closed) Phab patch. I'm responding here because Trac is a more durable medium than closed code reviews.

I have not been paying proper attention to this ticket. But the solution embodied here (and now in HEAD) really isn't right. The new situation in lookupOrigNameCache is this:

  1. Look up the OccName in fixed finite map, designed to catch tuples
  2. If that fails, look up the (Module, OccName) pair in the original name cache. This is just a two-level finite map: first look up the Module then the OccName.

So now, for every non-tuple, we do two lookups, in step 1 and step 2 resp. This is obviously bad. If we are going to have a finite map for tuples, just populate the original-name cache with tuples, and do step (2) alone.

Even if tuples are treated specially somehow, as Omer says, it would be be far better to first spot that the Module is GHC.Tuple and only then do tuple-processing.

But what we have now is (a) more complicated and (b) slower than this.

For tuples, maybe it's ok to populate the cache with 62 tycons and 62 datacons. For for unboxed sums we'll have 62*62 data cons, which we might consider too many. It may be that a little hand-written parser, working directly on ByteString would be better. But regardless that should be a decision that affects only the code path when we reallydo have a tuple, not everything.

Another option that we explored in the past was to have a special interface-file representation for these name-families (tuples and sums). Perhaps

data IfaceType = ...
  | IfaceTuple Boxity Arity

meaning just an occurence of the naked TyCon. Now the string GHC.Tuple.(,,,) would never occur, so this entire question would not be relevant for interface file serialisation and deserialisation. (We'd still need that parser for Template Haskell.) I'm not sure why we abandoned that approach.

comment:20 Changed 3 years ago by osa1

Cc: osa1 added

comment:21 Changed 3 years ago by bgamari

Differential Rev(s): Phab:D2400

Phab:D2400 is also relevant here as it reworks the commit mentioned in comment:18.

However, there is further change afoot: On Skype Simon and I discussed the current situation and agreed that it's still not quite right. He has summarized the conclusion on Phab:D2400,

On skype we agreed that

  • the the TcIface path doesn't need to even try to look for tuple names because they are encoded specially in interface files. We only need parse tuple names specially in the template haskell route.
  • We'll put cons and nil and other special cases into the OriginalNameCache so we don't need to parse them specially here; fewer special cases.

The point is that we needn't worry about the cost of including things like (:) and [] in the name cache, so we should just remove their cases from isBuiltInOcc_maybe and handle them in the usual way. The only cases which we should handle in isBuiltInOcc_maybe are those of tuples (of boxed, unboxed, and constraint varieties) and eventually unboxed sums.

Additionally Simon proposed that we break up lookupOrigNameCache into two variants (names up for debate),

  • lookupOrigNameCache', which will merely lookup an OccName in the name cache, without attempting to resolve built-in syntax (with an assert verifying that we aren't looking up something in GHC.Tuple, as definitions defined there aren't in the name cache)
  • lookupOrigNameCache, which will lookup an OccName as above, but also attempt to parse out names corresponding to built-in syntax.

Then we can use lookupOrigNameCache' in the ReadIface codepaths, where we know that we should never see built-in syntax (since these names are encoded specially in the interface file symbol table, see Note [Symbol table representation of names] in BinIface). In the Template Haskell codepaths, it will still be necessary to use lookupOrigNameCache, to ensure that things like mkName "(,)" work.

comment:22 Changed 3 years ago by bgamari

In comment:21 I described the plan. Now I'll describe the reality, which includes a few wrinkles.

As it turns out, we actually do include almost all of the tuple types in the original name cache. We can see this by noticing that newHscEnv initializes the namecache as (initNameCache us allKnownKeyNames), where allKnownKeyNames includes,

knownKeyNames = concat
  [ ...
  , concatMap tycon_kk_names typeNatTyCons

  , concatMap (tycon_kk_names . tupleTyCon Boxed) [2..mAX_TUPLE_SIZE]  -- Yuk
  , cTupleTyConNames
  , ...
  ]

allKnownKeyNames = knownKeyNames ++ ...

Consequently, if you remove the isBuiltInOcc_maybe checks entirely things almost continue to work; almost because you will note that unboxed tuples are missing. This had led me to assume that all tuples were missing from the name cache. However, if you add unboxed tuples to knownKeyNames things validate without any trouble.

So, the question now is whether we want to change this by removing tuples from the original name cache. To answer this let's consider what this would entail: I believe the only sensible way to implement comment:21 would be to filter out tuples from allKnownKeyNames when initializing the name cache. I specifically do not propose to remove tuples from allKnownKeysNames since the TcTypeable implementation relies on them being present so it knows to generate type representations for them. Is filtering them out worthwhile? I'll try to take some measurements to find out, but I'm not entirely convinced (at least for tuples; unboxed sums may be another story).

comment:23 Changed 3 years ago by simonpj

Why/how does the TcTypeable impl rely on them being there??? There jolly well ought to be a comment to say so.

All I see is

  • knownKeyNames used in building knownKeyNamesMap in BinIface. Waste of time having tuples in there because they are treated specially by BinIface.
  • Initialising the name cache in HscMain.

And that's really it.

comment:24 Changed 3 years ago by bgamari

Hmm, while purusing I stumbled upon this comment in TysWiredIn,

wiredInTyCons = [ unitTyCon     -- Not treated like other tuples, because
                                -- it's defined in GHC.Base, and there's only
                                -- one of it.  We put it in wiredInTyCons so
                                -- that it'll pre-populate the name cache, so
                                -- the special case in lookupOrigNameCache
                                -- doesn't need to look out for it
                , ...

Which could be interpreted to imply that wiredInTyCons, not allKnownKeyTyCons, should be used to initialize the original name cache. Was this intentional or just me reading too much into the language?

comment:25 Changed 3 years ago by simonpj

This all dates from when tuples were not in the original name cache (and I think they should indeed not be). But we made a special case for the unit tuple; I forget why, and it probably doesn't matter much.

No, we must initialise the cache will all known-key tycons (tuples aside) otherwise, well, we won't assign the right known key to one of them.

comment:26 in reply to:  23 Changed 3 years ago by bgamari

Replying to simonpj:

Why/how does the TcTypeable impl rely on them being there??? There jolly well ought to be a comment to say so.

It looks like my memory failed me here: TcTypeable relies on primTyCons, not allKnownKeyNames, to create the representations for primitive types.

That being said, excluding tuples from the name cache nevertheless poses a problem for Typeable: Currently type representations for tuples are known key. If we exclude tuple type representations from the known key list then they will be loaded from GHC/Tuple.hi with the wrong unique. I can think of a few ways of dealing with this,

  1. Add a special encoding to the symbol table format for type rep names. However, it seems odd to add special encodings in the interface file format for such a narrow case
  1. Add support to the isBuiltInOcc_maybe parser for identifying $tc names. This seems terribly brittle.
  1. Add only the tuple type representations to the known-key list. This would cut the number of tuple-related things in the name cache in half compared to the current state of affairs. That being said, this sort of solution will still be rather painful with unboxed sums.

All I see is

  • knownKeyNames used in building knownKeyNamesMap in BinIface. Waste of time having tuples in there because they are treated specially by BinIface.
  • Initialising the name cache in HscMain.

And that's really it.

Yes, I agree.

Last edited 3 years ago by bgamari (previous) (diff)

comment:27 Changed 3 years ago by simonpj

If we exclude tuple type representations from the known key list then they will be loaded from GHC/Tuple.hi with the wrong unique

I don't understand. Isn't that what -- Note [Symbol table representation of names]

is all about? What gets loaded with the wrong key?

Add a special encoding to the symbol table format for type rep names. However, it seems odd to add special encodings in the interface file format for such a narrow case

Not really. We already have

--   A tuple name:
--    x is the tuple sort (00b ==> boxed, 01b ==> unboxed, 10b ==> constraint)
--    y is the thing (00b ==> tycon, 01b ==> datacon, 10b ==> datacon worker)
--    z is the arity

Data con workers seems similarly specialised. It's just an encoding for a large family. Maybe 11b for rep-name?

comment:28 in reply to:  27 Changed 3 years ago by bgamari

Replying to simonpj:

If we exclude tuple type representations from the known key list then they will be loaded from GHC/Tuple.hi with the wrong unique

I don't understand. Isn't that what -- Note [Symbol table representation of names]

is all about? What gets loaded with the wrong key?

The TypeRepName (e.g. $tc(,,)).

Add a special encoding to the symbol table format for type rep names. However, it seems odd to add special encodings in the interface file format for such a narrow case

Not really. We already have

--   A tuple name:
--    x is the tuple sort (00b ==> boxed, 01b ==> unboxed, 10b ==> constraint)
--    y is the thing (00b ==> tycon, 01b ==> datacon, 10b ==> datacon worker)
--    z is the arity

Data con workers seems similarly specialised. It's just an encoding for a large family. Maybe 11b for rep-name?

Fair enough. Alright; I'll move ahead with this option then.

comment:29 Changed 3 years ago by bgamari

Data con workers seems similarly specialised. It's just an encoding for a large family. Maybe 11b for rep-name?

Fair enough. Alright; I'll move ahead with this option then.

On second thought, this is actually not entirely trivial; to see why, consider what happens when you are writing out the interface file of GHC.Tuple. Under the proposal, you would need to somehow recognize type representations (which are plain value-level bindings) which belong to a tuple type and encode them with the special symbol table encoding.

Indeed we already do this for encoding TyCon names and it's not hard: just look for TyCons with algTcRhs = TupleTyCon {} (as is done by tyConTuple_maybe). However, distinguishing a value level binding as a tuple's type rep is not as easy. I can think of two options neither being terribly nice,

  1. Build a Map TypeRepName TupleSort and check it for every symbol that we write to the symbol table (yuck)
  2. Add a constructor to IdDetails to encode the fact that the binding is the typerep of a tuple

comment:30 Changed 3 years ago by bgamari

I've moved ahead with option 2 of comment:29 for now.

comment:31 Changed 3 years ago by bgamari

Unfortunately it looks like option 2 is somewhat of a dead-end as it would require that we make all of the tuple type reps WiredIn (for both the tuple type constructors as well as their promoted variants), since we need to be able to look up the Id from the Name in order to inspect the IdDetails from inside BinIface.putName. I don't think we want to do that. However, option 1 seems about as bad (and would come at the cost of a map lookup for each and every symbol written to an interface file symbol table).

This is all very yucky.

Last edited 3 years ago by bgamari (previous) (diff)

comment:32 Changed 3 years ago by bgamari

And as it turns out wiring in the tuple type constructors' representations has another knock-on consequence, requiring that we yet again wire-in Data.Typeable.Internal.Module and Data.Typeable.Internal.TrName. Recall that we unwired these around six months ago (ticket:11120#comment:28) due to breakage in GHCJS due to the varying width of Fingerprint on 32- and 64-bit platforms. For what it's worth, this varying width issue would be fixed if #11953 were implemented.

Last edited 3 years ago by bgamari (previous) (diff)

comment:33 Changed 3 years ago by simonpj

OK, you are right. Let's NOT wire-in the rep-names. That would be bad.

I recant on comment:27. Lesser evil is to add the rep-names of boxed tuples (but still not the tuple tycons or data cons) to the known-key names, and otherwise not treat them specially.

Unboxed and constratint tuples (and uncoming unboxed sums) remain un-representable, but that's a problem for another day.

comment:34 Changed 3 years ago by bgamari

Differential Rev(s): Phab:D2400Phab:D2400, Phab:D2414
Status: newpatch

Here's a first cut at a fix. Currently have a validation running and I likely need to do another pass over the comments when I'm less tired.

comment:35 Changed 3 years ago by bgamari

Sadly I mentioned the wrong commit in the commit message. Here is the merged commit for Phab:D2414,

In 9513fe6/ghc:

Clean up interaction between name cache and built-in syntax

This cleans up various aspects of the handling of built-in syntax in the
original name cache (hopefully resulting in a nice reduction in compiler
allocations),

  * Remove tuple types from original name cache: There is really no
    reason for these to be in the name cache since we already handle
    them specially in interface files to ensure that we can resolve them
    directly to Names, avoiding extraneous name cache lookups.

  * Sadly it's not possible to remove all traces of tuples from the
    name cache, however. Namely we need to keep the tuple type
    representations in since otherwise they would need to be wired-in

  * Remove the special cases for (:), [], and (##) in isBuiltInOcc_maybe
    and rename it to isTupleOcc_maybe

  * Split lookupOrigNameCache into two variants,

     * lookupOrigNameCache': Merely looks up an OccName in the original
       name cache, making no attempt to resolve tuples

     * lookupOrigNameCache: Like the above but handles tuples as well.
       This is given the un-primed name since it does the "obvious"
       thing from the perspective of an API user, who knows nothing of
       our special treatment of tuples.

Arriving at this design took a significant amount of iteration. The
trail of debris leading here can be found in #11357.

Thanks to ezyang and Simon for all of their help in coming to this
solution.

Test Plan: Validate

Reviewers: goldfire, simonpj, austin

Reviewed By: simonpj

Subscribers: thomie, ezyang

Differential Revision: https://phabricator.haskell.org/D2414

GHC Trac Issues: #11357

comment:36 Changed 3 years ago by bgamari

Unfortunately I had to revert the above patch (in 83e4f49577665278fe08fbaafe2239553f3c448e) since it appears to break Haddock. This issue is now back on my queue

comment:37 Changed 3 years ago by bgamari

Owner: set to bgamari

comment:38 Changed 3 years ago by bgamari

comment:39 Changed 3 years ago by bgamari

Milestone: 8.2.1
Resolution: fixed
Status: patchclosed

I think it is safe to call this resolved.

Note: See TracTickets for help on using tickets.