#14742 closed bug (invalid)

Unboxed sums can treat Word#s as Int#s

Reported by: duog
Owned by: osa1
Priority: normal
Milestone:
Component: Compiler
Version: 8.2.2
Keywords: UnboxedSums
Cc: osa1
Operating System: Unknown/Multiple
Architecture: Unknown/Multiple
Type of failure: None/Unknown
Test Case:
Blocked By:
Blocking:
Related Tickets:
Differential Rev(s):
Wiki Page:

Description (last modified by duog)

Consider the following module:

{-# language MagicHash, UnboxedSums #-}

{-# options_ghc -ddump-stg -dppr-debug -fprint-explicit-kinds -ddump-to-file #-}

module Bug where
import GHC.Prim
import GHC.Types

mkUnboxedSum :: () -> (# Float# | Int# #)
mkUnboxedSum _ = (# | 9# #)
{-# noinline mkUnboxedSum #-}

foo :: Int
foo = case mkUnboxedSum () of
  (# | i# #) -> I# i#
  (# f# | #) -> 8

The full .dump-stg is attached. An abbreviation of the case statement in foo is:

        case (...)
        of
        (...)
        { ghc-prim:GHC.Prim.(#,,#){(w) d 89} ((us_g1h9{v} [lid] :: ghc-prim:GHC.Types.Any{(w) tc 35K}
                                                                     (ghc-prim:GHC.Prim.TYPE{(w) tc 32Q}
                                                                        'ghc-prim:GHC.Types.WordRep{(w) d 63J}))
                                                :: ghc-prim:GHC.Types.Any{(w) tc 35K}
                                                     (ghc-prim:GHC.Prim.TYPE{(w) tc 32Q}
                                                        'ghc-prim:GHC.Types.WordRep{(w) d 63J}))
                                             ((us_g1ha{v} [lid] :: ghc-prim:GHC.Types.Any{(w) tc 35K}
                                                                     (ghc-prim:GHC.Prim.TYPE{(w) tc 32Q}
                                                                        'ghc-prim:GHC.Types.WordRep{(w) d 63J}))
                                                :: ghc-prim:GHC.Types.Any{(w) tc 35K}
                                                     (ghc-prim:GHC.Prim.TYPE{(w) tc 32Q}
                                                        'ghc-prim:GHC.Types.WordRep{(w) d 63J}))
                                             ((us_g1hb{v} [lid] :: ghc-prim:GHC.Types.Any{(w) tc 35K}
                                                                     (ghc-prim:GHC.Prim.TYPE{(w) tc 32Q}
                                                                        'ghc-prim:GHC.Types.FloatRep{(w) d 63V}))
                                                :: ghc-prim:GHC.Types.Any{(w) tc 35K}
                                                     (ghc-prim:GHC.Prim.TYPE{(w) tc 32Q}
                                                        'ghc-prim:GHC.Types.FloatRep{(w) d 63V})) ->
              case
                  (us_g1h9{v} [lid] :: ghc-prim:GHC.Types.Any{(w) tc 35K}
                                         (ghc-prim:GHC.Prim.TYPE{(w) tc 32Q}
                                            'ghc-prim:GHC.Types.WordRep{(w) d 63J})) :: Prim IntRep
              of
              ((tag_g1hc{v} [lid] :: ghc-prim:GHC.Prim.Int#{(w) tc 3s})
                 :: ghc-prim:GHC.Prim.Int#{(w) tc 3s})
              { __DEFAULT -> ghc-prim:GHC.Types.I#{(w) d 6i} [8#];
                2# ->
                    ghc-prim:GHC.Types.I#{(w) d 6i} [(us_g1ha{v} [lid] :: ghc-prim:GHC.Types.Any{(w) tc 35K}
                                                                            (ghc-prim:GHC.Prim.TYPE{(w) tc 32Q}
                                                                               'ghc-prim:GHC.Types.WordRep{(w) d 63J}))];
              };

Note that:

  • us_g1h9 :: Any (TYPE WordRep);
  • us_g1ha :: Any (TYPE WordRep);
  • tag_g1hc :: Int#;
  • The 2# alternative passes us_g1ha to an I# constructor.

This seems wrong to me.

It comes about because slotPrimRep . primRepSlot (in RepType) is not the identity.
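
To make the round trip concrete, here is a minimal sketch of the two functions. It is a simplified model, not the RepType source: the constructor lists are cut down (no pointer or 64-bit slots), and roundTrip and changedByRoundTrip are hypothetical helper names introduced only for illustration.

```haskell
-- Simplified model of the mapping between primitive representations and
-- unboxed-sum slots. The real definitions live in GHC's RepType module and
-- cover more cases; this cut-down version is an assumption for illustration.
data PrimRep = IntRep | WordRep | AddrRep | FloatRep | DoubleRep
  deriving (Eq, Show, Enum, Bounded)

data SlotTy = WordSlot | FloatSlot | DoubleSlot
  deriving (Eq, Show)

-- Several representations share one slot type.
primRepSlot :: PrimRep -> SlotTy
primRepSlot IntRep    = WordSlot
primRepSlot WordRep   = WordSlot
primRepSlot AddrRep   = WordSlot
primRepSlot FloatRep  = FloatSlot
primRepSlot DoubleRep = DoubleSlot

-- Each slot type has exactly one canonical representation.
slotPrimRep :: SlotTy -> PrimRep
slotPrimRep WordSlot   = WordRep
slotPrimRep FloatSlot  = FloatRep
slotPrimRep DoubleSlot = DoubleRep

-- Hypothetical helper: the composition discussed above.
roundTrip :: PrimRep -> PrimRep
roundTrip = slotPrimRep . primRepSlot

-- Hypothetical helper: the representations for which the round trip
-- is not the identity.
changedByRoundTrip :: [PrimRep]
changedByRoundTrip = [r | r <- [minBound .. maxBound], roundTrip r /= r]
```

In this model roundTrip IntRep is WordRep, which is the relabelling visible in the dump: the Int# field comes back out of the sum as a WordRep binder.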

StgLint found this while I was working on ticket:14541

Attachments (2)

Bug.dump-stg (8.6 KB) - added by duog 20 months ago.
Bug.dump-cmm (11.0 KB) - added by duog 20 months ago.


Change History (10)

Changed 20 months ago by duog

Attachment: Bug.dump-stg added

comment:1 Changed 20 months ago by duog

Description: modified (diff)
Summary: Unboxed sums can treat Int#s as Word#s → Unboxed sums can treat Word#s as Int#s

comment:2 Changed 20 months ago by simonpj

Cc: osa1 added
Owner: set to osa1

Omer, would you like to look into this?

comment:3 Changed 20 months ago by osa1

I think this is by design and we should relax the STG lint check if it's becoming a problem (or maybe we can add a special case for unboxed sums).

The whole point of unboxed sums is to have a compact and unboxed layout. Compactness here means that the same memory slot (a register, stack location, or heap location) should be usable for values of different types. primRepSlot decides which slots a value can be put into in an unboxed sum, and mapping a larger number of prim reps onto a smaller number of slot types means we can share one slot between values of different types/prim reps.

This seemingly ill-typed STG arises because we have to give types to unboxed sum data cons, but our current type system cannot express "anything that fits into a word slot", so we give the slot the type Word.
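
The slot sharing described above can be sketched as follows. This is a simplified model under stated assumptions, not GHC's implementation: the types are cut down, and mergeSlots and sumLayout are hypothetical names. The layout is assumed to be one word-sized tag slot followed by the merged, sorted slots of all alternatives.

```haskell
import Data.List (sort)

-- Cut-down model types (assumptions; GHC has more constructors).
data PrimRep = IntRep | WordRep | AddrRep | FloatRep | DoubleRep
  deriving (Eq, Show)

-- Constructor order matters here: the derived Ord instance decides
-- the order of slots in the layout.
data SlotTy = WordSlot | FloatSlot | DoubleSlot
  deriving (Eq, Ord, Show)

primRepSlot :: PrimRep -> SlotTy
primRepSlot IntRep    = WordSlot
primRepSlot WordRep   = WordSlot
primRepSlot AddrRep   = WordSlot
primRepSlot FloatRep  = FloatSlot
primRepSlot DoubleRep = DoubleSlot

-- Merge two sorted slot lists. A slot of the same type at the head of
-- both lists is emitted once, so two alternatives can share it.
mergeSlots :: [SlotTy] -> [SlotTy] -> [SlotTy]
mergeSlots [] ys = ys
mergeSlots xs [] = xs
mergeSlots (x:xs) (y:ys)
  | x == y    = x : mergeSlots xs ys
  | x < y     = x : mergeSlots xs (y:ys)
  | otherwise = y : mergeSlots (x:xs) ys

-- Layout of a sum: a tag slot, then the merged slots of every alternative.
-- Each alternative is given as the list of prim reps of its fields.
sumLayout :: [[PrimRep]] -> [SlotTy]
sumLayout alts =
  WordSlot : foldr mergeSlots [] (map (sort . map primRepSlot) alts)
```

For the sum in the description, sumLayout [[FloatRep], [IntRep]] gives [WordSlot, WordSlot, FloatSlot], matching the three-field unboxed tuple in the dump (tag, word slot, float slot). A sum of Int# and Word# alternatives needs only [WordSlot, WordSlot] in this model, because both payloads share the single word slot.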

comment:4 Changed 20 months ago by simonpj

Then the right thing is to document the invariant, and make Lint check it. What is the invariant?
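
One possible reading of such an invariant, offered only as a sketch: a value may flow between two binders when their prim reps map to the same slot type, rather than requiring the prim reps to be equal. The ticket does not state this rule, and strictCheck and slotCheck are hypothetical names, not StgLint functions.

```haskell
-- Cut-down model types (assumptions; GHC has more constructors).
data PrimRep = IntRep | WordRep | AddrRep | FloatRep | DoubleRep
  deriving (Eq, Show)

data SlotTy = WordSlot | FloatSlot | DoubleSlot
  deriving (Eq, Show)

primRepSlot :: PrimRep -> SlotTy
primRepSlot IntRep    = WordSlot
primRepSlot WordRep   = WordSlot
primRepSlot AddrRep   = WordSlot
primRepSlot FloatRep  = FloatSlot
primRepSlot DoubleRep = DoubleSlot

-- Hypothetical strict check: representations must match exactly.
-- This rejects passing a WordRep binder where an IntRep is expected.
strictCheck :: PrimRep -> PrimRep -> Bool
strictCheck expected actual = expected == actual

-- Hypothetical relaxed check: representations only need to occupy
-- the same kind of slot.
slotCheck :: PrimRep -> PrimRep -> Bool
slotCheck expected actual = primRepSlot expected == primRepSlot actual
```

Under this reading, the I# application in the description fails strictCheck IntRep WordRep but passes slotCheck IntRep WordRep, while a genuine mismatch such as slotCheck FloatRep WordRep is still rejected.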

comment:5 Changed 20 months ago by duog

Why does RuntimeRep have both WordRep and IntRep constructors?

I had a grep and the only difference I could find is in foreign calls.

Modifying the program in the description to:

{-# language MagicHash, UnboxedSums, UnliftedFFITypes #-}
{-# options_ghc -ddump-stg -dppr-debug -fprint-explicit-kinds -ddump-to-file -ddump-cmm #-}

foreign import ccall "bar"
  bar :: Int# -> Int

mkUnboxedSum :: () -> (# Float# | Int# #)
mkUnboxedSum _ = (# | 9# #)
{-# noinline mkUnboxedSum #-}

foo :: Int
foo = case mkUnboxedSum () of
  (# | i# #) -> bar i#
  (# f# | #) -> bar 1#

The Cmm for the two calls to bar is:

           (_s1hz::I64) = call "ccall" arg hints:  [‘signed’]  result hints:  [‘signed’] (_c1j0::I64)(_c1j1::I64);

and

           (_s1hE::I64) = call "ccall" arg hints:  []  result hints:  [‘signed’] (_c1j9::I64)(_c1ja::I64);

Presumably these arg hints exist because we can get problems without them?

Changed 20 months ago by duog

Attachment: Bug.dump-cmm added

comment:6 Changed 20 months ago by osa1

RuntimeRep is much older than unboxed sums and completely unrelated. Apparently it's used to generate "hints" for ABI calls (see primRepForeignHint). It may be used in C codegen (maybe to give an argument type int32_t instead of uint32_t etc.), but other than that I'm not sure if any of the platforms GHC supports have different ABIs for passing ints vs. words.
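
As a rough illustration of how the hints in comment:5 could diverge, the sketch below models a hint function in the spirit of primRepForeignHint, restricted to two representations. The constructor names and the repAfterSumSlot helper are assumptions made for this example, not GHC's definitions.

```haskell
-- Cut-down model: only the two representations relevant to the Cmm above.
data PrimRep = IntRep | WordRep
  deriving (Eq, Show)

-- Assumed hint type: signed integers get a hint, words get none.
data ForeignHint = NoHint | SignedHint
  deriving (Eq, Show)

-- Modelled on the idea behind primRepForeignHint; the exact GHC
-- definition covers more representations.
primRepForeignHint :: PrimRep -> ForeignHint
primRepForeignHint IntRep  = SignedHint
primRepForeignHint WordRep = NoHint

-- Hypothetical helper: the representation a value carries after it has
-- passed through a word slot of an unboxed sum. Both reps become WordRep.
repAfterSumSlot :: PrimRep -> PrimRep
repAfterSumSlot _ = WordRep

-- Hint for an Int# argument passed directly to a foreign call.
hintDirect :: ForeignHint
hintDirect = primRepForeignHint IntRep

-- Hint for an Int# argument that was first stored in a sum's word slot.
hintViaSum :: ForeignHint
hintViaSum = primRepForeignHint (repAfterSumSlot IntRep)
```

In this model hintDirect is SignedHint and hintViaSum is NoHint, which is consistent with one call to bar carrying the ‘signed’ argument hint and the other carrying none.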

comment:7 Changed 19 months ago by andrewthad

comment:8 Changed 10 months ago by osa1

Resolution: invalid
Status: new → closed

As explained in comment:3, this is by design, and this currently does not cause linter errors (probably fixed with 7f389a580f42a105623853adad15ab3323b41ed5). Closing.
