Opened 12 years ago

Last modified 2 months ago

#1311 patch feature request

newtypes of unboxed types disallowed - documentation bug and/or feature request

Reported by: Isaac Dupree Owned by:
Priority: low Milestone:
Component: Compiler Version: 7.7
Keywords: Cc: carter.schonwald@…, michalt, RyanGlScott, int-index
Operating System: Unknown/Multiple Architecture: Unknown/Multiple
Type of failure: None/Unknown Test Case:
Blocked By: Blocking:
Related Tickets: #15219 Differential Rev(s): Phab:D4777, https://gitlab.haskell.org/ghc/ghc/merge_requests/364
Wiki Page:

Description

Tested in 6.6.1 and today's 6.7 (which claims to be version 6.7.20070418),

newtype FastBool = FastBool Int#

doesn't work. However this is not documented in the unboxed-types documentation in the User's Guide that lists similar restrictions http://www.haskell.org/ghc/docs/latest/html/users_guide/primitives.html

Similarly (maybe), in phantom type arguments, even in GADTs where kind inference could be at work

data Boo a where Boo :: Int# -> Boo Int#

doesn't work.

I tried newtype ( FastBool :: # ) = ... , and data Boo ( a :: # ) where... , which just confused GHC.

Is there a reason that coercions of unlifted types shouldn't work, or is it just unimplemented? (inspired by looking at GHC's compiler/utils/FastTypes.lhs and thinking that using newtypes instead of type synonyms in places like that would improve type-safety)

Change History (25)

comment:1 Changed 12 years ago by brian@…

I think newtype FastBool = FastBool Int# could potentially work. There are a bunch of places where the compiler assumes newtypes are lifted but that could be fixed. The newtype would have all the limitations of unlifted types though. You still couldn't use it in polymorphic functions or anything (which might be somewhat surprising to users of your code). Probably a better use of time would be to make SpecConstr turn enumeration types into Int#s, thus eliminating the need for FastBool altogether.

As a side note, isn't FastBool kind of broken by design? You can't really do anything useful with it other than turning it back into a plain old Bool. and#/or# don't make sense as they must be strict in both of their arguments (which probably isn't what you want).

The GADT example can't really work as you can't forall over unboxed tyvars. You can't write:

unBoo :: forall (a::#) . Boo a -> a

as a could have a completely different representation depending on how it was instantiated. You could probably make it so

unBoo :: Boo Int# -> Int#

would work but why not just write BooInt = BooInt Int# at that point.

-Brian

comment:2 Changed 12 years ago by Isaac Dupree

I'm not seeing a way it would actually be useful for GADTs either, since you can't do anything with a general value of kind '#'.

It would be nice to be able to do enumerations more cleanly than that numerical way, but... As far as I have been able to figure out, the advantage of using unboxed values is that the compiler enforces that you can't use them in a non-strict way, potentially making code more efficient (versus the compiler guessing what places to unbox data implicitly) (and having less strictness annotations in the source code, even where it is effectively strict [function arguments, data members, let-bound variables], which has its pros and cons). The trouble with enumeration types is that they are boxed normally: a function can take a Bool thunk that, iff the function decides to evaluate it, will raise an exception.

Being able to newtype for FastInt makes sense, though, for the same abstraction reason of all newtypes - of course with the odd property of it being unlifted and not type-equal to any of the standard unlifted types.

(what if we could declare data ( FastBool :: # ) = FastTrue | FastFalse, or even allow different data sizes and it takes up an amount of space equal to any necessary tag plus the maximum member: data ( FastEither a b :: # ) = FastLeft a | FastRight !b {-members may or may not be strict-} , data ( FastTuple a b :: # ) = FastTuple a b {- would this be equivalent to (# #) ? -} ? Probably at least some of that is a foolish idea... Let's performance-test GHC with/without using those unboxed, maybe)

comment:3 Changed 12 years ago by simonpj

Milestone: _|_
Priority: normallow
Type: bugfeature request

I can't see any reason this would be impossible in principle, but my brain is too small to figure out all the ramifications of dropping the "newtypes are always boxed" assumption that GHC currently makes.

So for now I have simply added the restriction to the user manual, and I'll leave this suggestion as a feature request.

Simon

comment:4 Changed 11 years ago by simonmar

Architecture: UnknownUnknown/Multiple

comment:5 Changed 11 years ago by simonmar

Operating System: UnknownUnknown/Multiple

comment:6 Changed 6 years ago by carter

Cc: carter.schonwald@… added
Type of failure: None/Unknown
Version: 6.6.17.7

bumping the version, this still holds in head

comment:7 Changed 6 years ago by carter

Would a first step towards something in this direction be a Strict Newtype?

comment:8 Changed 4 years ago by simonpj

I see no difficulty in principle, and I don’t think this would be too hard in practice. That is, you can give a new name (via newtype) to an unlifted type like Int#, Float#, Double# etc.

Consequences:

  • newtypes might have kind # (unlifted)
  • The Coercible class would have to work over unlifted types

See email thread

comment:9 Changed 4 years ago by osa1

Owner: set to osa1

comment:10 Changed 4 years ago by osa1

I'd like to give this a try, Richard, where should I start? I don't understand typechecking/ parts too well. As far as I can see, kcTyClGroup is kind checking type declarations, and getInitialKind is generating liftedTypeKind unless a kind signature is given (I don't understand how can kind signatures be used with newtypes, maybe that case is impossible?). Is it possible to do this by just generalizing the current code, instead of adding special cases for newtypes in getInitialKind or kcTyClGroup (or both) ?

comment:11 Changed 4 years ago by goldfire

I think this will be quite easy, actually.

You're hunting in all the right places. getInitialKind does indeed assume a result of kind *. (A kind sig can be provided via newtype Blah :: * -> * where ... as if you're defining a GADT. That is, GADT syntax works with newtypes just fine.) But I don't think anyone gets hurt if you assume that the return kind is TYPE v for a fresh levity metavariable v. You may need to unify that result kind with * in the datatype case somewhere. You'd need to check that the result kind of the newtype matches the kind of the representation type. I think this would happen in kcConDecl. One annoying bit is that we have no surface syntax (other than TYPE 'Unlifted) for #. I suppose you could add type UnliftedType = TYPE 'Unlifted in Data.Kind if you felt it worthwhile.

I really don't think it would be much harder than this. Go for it! :)

comment:12 Changed 4 years ago by osa1

So I think this turned out to be more complicated than we first thought (as usual). Richard, could you tell me if I'm going in the right direction? Here's what I did so far:

  • compiler/typecheck/TcTyClsDecls.hs

    a b tcGadtSigType doc name ty@(HsIB { hsib_vars = vars }) 
    14901490              tcHsTyVarBndrs gtvs $ \ _ ->
    14911491              do { ctxt <- tcHsContext cxt
    14921492                 ; btys <- tcConArgs DataType hs_details
    1493                  ; ty' <- tcHsLiftedType res_ty
     1493                 ; ty' <- tcHsOpenType res_ty
    14941494                 ; field_lbls <- lookupConstructorFields name
    14951495                 ; let (arg_tys, stricts) = unzip btys
    14961496                       bound_vars = allBoundVariabless ctxt `unionVarSet`

Second change:

  • compiler/typecheck/TcHsType.hs

    a b tcDataKindSig :: Kind -> TcM [TyVar] 
    18621862-- We use it also to make up argument type variables for for data instances.
    18631863-- Never emits constraints.
    18641864tcDataKindSig kind
    1865   = do  { checkTc (isLiftedTypeKind res_kind) (badKindSig kind)
    1866         ; span <- getSrcSpanM
     1865  = do  { span <- getSrcSpanM
    18671866        ; us   <- newUniqueSupply
    18681867        ; rdr_env <- getLocalRdrEnv
    18691868        ; let uniqs = uniqsFromSupply us
    tcDataKindSig kind 
    18801879    mk_tv loc uniq occ kind
    18811880      = mkTyVar (mkInternalName uniq occ loc) kind
    18821881
    1883 badKindSig :: Kind -> SDoc
    1884 badKindSig kind
    1885  = hang (ptext (sLit "Kind signature on data type declaration has non-* return kind"))
    1886         2 (ppr kind)
    1887 
    18881882{-
    18891883Note [Avoid name clashes for associated data types]
    18901884~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Those are all for accepting this program:

{-# LANGUAGE MagicHash, KindSignatures, GADTs #-}

module Main where

import GHC.Types
import GHC.Prim
import GHC.Exts

newtype Blah :: TYPE 'Unlifted where
  Blah :: Int# -> Blah

main = return ()

For now I'm not trying to infer unlifted types, I'm trying to make cases with explicit kind signatures working.

With these changes, GHC is failing with this panic:

ghc-stage1: panic! (the 'impossible' happened)
  (GHC version 7.11.20151222 for x86_64-unknown-linux):
        ASSERT failed! file compiler/typecheck/TcTyClsDecls.hs, line 1630

This is because we have some code that assume result types of ... I think GADTs? ... are lifted. (I think newtypes defined this way are type checked using GADT type checker?)

Am I doing this right? How should I proceed after this point? I feel like I shouldn't lift these restrictions and instead I need some special cases for newtypes. Am I right?

comment:13 Changed 4 years ago by michalt

Cc: michalt added

comment:14 Changed 4 years ago by goldfire

I don't see a problem thus far. But as I said in my previous comment, you may need to unify that result kind with * in the datatype case somewhere. Perhaps a _ <- unifyType blah blah (typeKind res_ty) liftedTypeKind would do the trick, but perhaps something more elaborate is necessary. I do think you're pushing in the right direction, though.

comment:15 Changed 4 years ago by simonpj

It'd be easier to help if you had a branch of the GHC repo, or a remote clone thereof, that Richard or I could use to reproduce what you have.

I would really like a wiki page to describe the design, please. Maybe it's a simple page -- if so, so much the better.

Simon

comment:16 Changed 3 years ago by RyanGlScott

Cc: RyanGlScott added

comment:17 Changed 3 years ago by int-index

Cc: int-index added

comment:18 Changed 3 years ago by alexbiehl

I have put together a proof of concept here. It supports newtypes over unlifted data *and* unlifted data types (with unpacking). This is done without much knowledge about what is a right or wrong thing to do. But I would be happy if we could discuss it and maybe make it a patch we all agree with.

comment:19 Changed 3 years ago by simonpj

I would really like a wiki page to describe the design, please. Maybe it's a simple page -- if so, so much the better.

Alex, it's great that you have made a patch but I can't even begin to look at it until I know precisely what it is intended to achieve. Specification before implementation.

You might also (perhaps in parallel) want to formulate a GHC Proposal.

comment:20 Changed 3 years ago by alexbiehl

I'm sorry, I falsely implied the context of this thread. I think the general idea is described in https://ghc.haskell.org/trac/ghc/wiki/UnliftedDataTypes. I noticed the conversation here got stalled so I picked up and experimented where osa1 left:

https://ghc.haskell.org/trac/ghc/ticket/1311#comment:12 describes a construct like the following:

type Unlifted = TYPE 'UnliftedRep

newtype Blah :: TYPE 'IntRep where
  Blah :: Int# -> Blah

data T :: Unlifted where
    T :: Int -> Int -> T

Which allows:

  1. newtype declaration over unlifted types
  2. data declarations which are represented through an unlifted pointer (do not contain bottom)

The approach mentioned on UnliftedDataTypes page introduces a new keyword unlifted. This approach differs in that we give the unlifted representation through an explicit kind signature on the newtype/data type instead of a new syntax.

And that is what the patch allows in https://ghc.haskell.org/trac/ghc/ticket/1311#comment:18 enables us to write.

I can formulate a proposal next week.

comment:21 Changed 3 years ago by simonpj

I'll look forward to the proposal. It's a remarkably small patch! Don't forget to add comprehensive tests etc. Thanks!

comment:22 Changed 14 months ago by osa1

Differential Rev(s): Phab:D4777
Status: newpatch

The proposal was accepted and this feature is currently being implemented.

comment:23 Changed 8 months ago by osa1

Owner: osa1 deleted

comment:24 Changed 6 months ago by RyanGlScott

Differential Rev(s): Phab:D4777Phab:D4777, https://gitlab.haskell.org/ghc/ghc/merge_requests/364

comment:25 Changed 2 months ago by Marge Bot <ben+marge-bot@…>

In effdd948/ghc:

Implement the -XUnliftedNewtypes extension.

GHC Proposal: 0013-unlifted-newtypes.rst
Discussion: https://github.com/ghc-proposals/ghc-proposals/pull/98
Issues: #15219, #1311, #13595, #15883
Implementation Details:
  Note [Implementation of UnliftedNewtypes]
  Note [Unifying data family kinds]
  Note [Compulsory newtype unfolding]

This patch introduces the -XUnliftedNewtypes extension. When this
extension is enabled, GHC drops the restriction that the field in
a newtype must be of kind (TYPE 'LiftedRep). This allows types
like Int# and ByteArray# to be used in a newtype. Additionally,
coerce is made levity-polymorphic so that it can be used with
newtypes over unlifted types.

The bulk of the changes are in TcTyClsDecls.hs. With -XUnliftedNewtypes,
getInitialKind is more liberal, introducing a unification variable to
return the kind (TYPE r0) rather than just returning (TYPE 'LiftedRep).
When kind-checking a data constructor with kcConDecl, we attempt to
unify the kind of a newtype with the kind of its field's type. When
typechecking a data declaration with tcTyClDecl, we again perform a
unification. See the implementation note for more on this.

Co-authored-by: Richard Eisenberg <rae@richarde.dev>
Note: See TracTickets for help on using tickets.