Opened 8 years ago

Closed 8 years ago

Last modified 3 years ago

#5529 closed bug (fixed)

Newtypes with hidden constructors cannot be passed as FFI arguments

Reported by: mikhail.vorozhtsov Owned by:
Priority: normal Milestone:
Component: Compiler Version: 7.3
Keywords: Cc:
Operating System: Unknown/Multiple Architecture: Unknown/Multiple
Type of failure: None/Unknown Test Case:
Blocked By: Blocking:
Related Tickets: Differential Rev(s):
Wiki Page:

Description

I think this limitation (introduced recently in 5b988961338f73af5790bfd365ca79c858249cea) is a bad idea. It is way too common for C libraries bindings to declare "safe" (either by allowing only predefined values or by providing custom constructing functions) newtype wrappers around typedefs. Such wrappers are then used by other FFI libraries (for example, Linux-specific functions on POSIX types), ending up in their foreign imports and triggering the "Unacceptable argument type" error. It can be worked around by exporting constructors from "Internal.*" modules, but I'd rather see the old behavior back.

Change History (16)

comment:1 Changed 8 years ago by simonpj

I can see merit in Mkhail's point of view.

Example

module Colours( Colour, red, draw ) where

newtype Colour = C Int

red :: Colour
red = 7

foreign import "draw" :: Colour -> IO ()

The question is whether this should be legal

module Foo where
import Colours
foriegn import "draw2" :: Colour -> IO ()

That is, a client of Colours can call a draw2 function that unpacks the Colour. Or should the abstract data type Colour be restricted to the functions (foreign or otherwise) that Colours exports?

I don't have a strong opinion. Did we make the change in response to a bug report? Ie was someone arguing the opposite?

Simon

comment:2 Changed 8 years ago by igloo

The Haskell 2010 report says:

the constructor N is visible where T is used
[...]
Consequently, in order for a type defined by newtype to be used in a
foreign declaration outside of the module that defines it, the type
must not be exported abstractly.

This was fixed in response to #3008.

Incidentally, the report also says:

The module Foreign.C.Types that defines the Haskell equivalents
for C types follows this convention

but I don't understand what that means, as those types are exported abstractly. Perhaps they shouldn't be?

comment:3 Changed 8 years ago by simonmar

I think GHC is correct now. This was discussed a long time ago on the FFI list (no link handy, sorry), and I think the principle is that in order to use a newtype in an FFI declaration you have to know its representation, because you're linking against some known C code that depends on the representation. So if we allowed that to happen with an abstract newtype, it would constitute leakage of the abstraction.

We ought to change Foreign.C.Types to export the types non-abstractly.

comment:4 in reply to:  3 ; Changed 8 years ago by mikhail.vorozhtsov

Replying to simonmar:

I think GHC is correct now. This was discussed a long time ago on the FFI list (no link handy, sorry), and I think the principle is that in order to use a newtype in an FFI declaration you have to know its representation, because you're linking against some known C code that depends on the representation.

That's not exactly correct. The C code depends on the particular way the argument is represented (e.g. passing Int64 instead of Int32 is a bad idea), but what GHC checks is a general "representability" (Int64/Int32/Float/..., it doesn't matter as long as it fits the calling convention) of the type. Constructor exporting is just an overkill for that task. What we need is a way to mark (with corresponding C types) appropriate newtypes as being representable (e.g. automatically by the compiler in module interfaces, or by requesting derivation of a special type class) without forcing users to spill the guts and when invent conventions for not hurting themselves ("yes, we export the internals, but never use them in your code, they are here only to pass FFI checks").

comment:5 in reply to:  4 Changed 8 years ago by simonmar

Replying to mikhail.vorozhtsov:

That's not exactly correct. The C code depends on the particular way the argument is represented (e.g. passing Int64 instead of Int32 is a bad idea), but what GHC checks is a general "representability" (Int64/Int32/Float/..., it doesn't matter as long as it fits the calling convention) of the type. Constructor exporting is just an overkill for that task. What we need is a way to mark (with corresponding C types) appropriate newtypes as being representable (e.g. automatically by the compiler in module interfaces, or by requesting derivation of a special type class) without forcing users to spill the guts and when invent conventions for not hurting themselves ("yes, we export the internals, but never use them in your code, they are here only to pass FFI checks").

I don't understand. Using a newtype in a foreign call exposes its representation, because it only works if you match the representation type with the type expected by the C function. So marking newtypes as "representable" doesn't help - you really have to know the representation.

comment:6 Changed 8 years ago by mikhail.vorozhtsov

Yes, that's why I said that types should be marked "with corresponding C types" (maybe "tagged" would be a better word). When a programmer "matches" newtypes from other Haskell libraries with the types of arguments of the C function he wants to import, he does it by name. I don't need to know how exactly, say, CTime is represented to "match" it with time_t, compiler does. My point is that while exporting constructors does expose representation to GHC's FFI checking code, it also exposes it to programmers that use the module. And I don't think that the former should require/force the latter. I want to have control over the export list, compiler can use other means of making representation accessible to it's own code.

comment:7 Changed 8 years ago by igloo

The point is, if you FFI import a C function that takes a time_t, then you can't pass it a MyType unless you know that MyType is represented by CTime (e.g. newtype MyType = MyType CTime). If MyType is abstract then you don't know that.

comment:8 Changed 8 years ago by mikhail.vorozhtsov

Once again, I, as a programmer, am completely fine if the module documentation says that MyType mirrors time_t, exported constructor doesn't help me a bit (one way to ensure that would be by writing something like let unused = MyType (undefined :: CTime), but what about newtype MyType = MyType Int64 (assuming time_t ~ int64_t), how would you check all possibilities?). GHC doesn't ensure that FFI import's types are right (with respect to the C types), programmer does. Well, exporting guts may be seen as a sort of documentation, but I'd rather not encourage/require that practice.

comment:9 Changed 8 years ago by mikhail.vorozhtsov

To sum things up.

  1. Exported constructors do not help programmers to get the FFI import's types right, as GHC provides no way to (automatically) assert that type X is a wrapper (using arbitrary number of layers) around "primitive" type which is equivalent to the desired C type y_t.
  1. GHC's FFI implementation does need to know the representations of the types used in an FFI import. But this information does not necessarily need to be passed around in the form of exported constructors. Store it somewhere else in the module interface and leave the export list to programmers.

comment:10 Changed 8 years ago by simonmar

Mikhail: this is not about whether GHC itself needs to know the representation internally, indeed typically GHC does have the full representation available even if the constructor is exported abstractly (at least when -O is on). The question here is whether GHC should require that the constructor is exported non-abstractly in order to be used in an FFI declaration. So I think point (2) above is not the issue.

I think your point (1) is saying that even if we know that CTime == Int64, what the programmer actually needs to know is that CTime == time_t, correct?

FYI, here's the previous discussion from the haskell-prime list:

http://www.haskell.org/pipermail/haskell-prime/2009-February/002726.html

it addresses most of the points we've covered here.

comment:11 in reply to:  10 ; Changed 8 years ago by mikhail.vorozhtsov

Replying to simonmar:

I think your point (1) is saying that even if we know that CTime == Int64, what the programmer actually needs to know is that CTime == time_t, correct?

Yes. And, more importantly, the programmer needs a way to ensure that CTime == time_t automatically at compile time, because we can't really expect people to recheck that equality by hand every time a new version of the library is released/used. This hypothetical machinery must be able to handle multiple levels of wrapping, like CTime -> X1 -> X2 -> Int64. My other (original) concern is that exporting a constructor does not only reveal the representation of the type, but also, obviously, gives third party non-FFI code a way to construct instances of the type. For CTime it may be ok, but think about flag types, which are allowed to have only limited number of bits set (or, even worse, only limited number of combinations of set bits).

comment:12 in reply to:  11 ; Changed 8 years ago by simonmar

Replying to mikhail.vorozhtsov:

Replying to simonmar:

I think your point (1) is saying that even if we know that CTime == Int64, what the programmer actually needs to know is that CTime == time_t, correct?

Yes. And, more importantly, the programmer needs a way to ensure that CTime == time_t automatically at compile time, because we can't really expect people to recheck that equality by hand every time a new version of the library is released/used. This hypothetical machinery must be able to handle multiple levels of wrapping, like CTime -> X1 -> X2 -> Int64.

This goes way beyond the original ticket, and I must admit I'm having trouble following your arguments.

The fact remains that allowing an abstract newtype to be used in an FFI declaration breaks the abstraction, so we can't allow that. If you want to propose an alternative mechanism that would fix this problem, please go head - but I suggest this ticket is not the right place (a wiki page would be better).

My other (original) concern is that exporting a constructor does not only reveal the representation of the type, but also, obviously, gives third party non-FFI code a way to construct instances of the type. For CTime it may be ok, but think about flag types, which are allowed to have only limited number of bits set (or, even worse, only limited number of combinations of set bits).

But you want to give clients the ability to call arbitrary foreign functions that return the type! Allowing them to construct instances in Haskell is no worse than that.

comment:13 in reply to:  12 Changed 8 years ago by mikhail.vorozhtsov

This goes way beyond the original ticket, and I must admit I'm having trouble following your arguments.

My point is that, given the current level of integration between GHC and C compilers, exporting constructors does not make programmers more confident in their FFI imports signatures.

My other (original) concern is that exporting a constructor does not only reveal the representation of the type, but also, obviously, gives third party non-FFI code a way to construct instances of the type. For CTime it may be ok, but think about flag types, which are allowed to have only limited number of bits set (or, even worse, only limited number of combinations of set bits).

But you want to give clients the ability to call arbitrary foreign functions that return the type! Allowing them to construct instances in Haskell is no worse than that.

I disagree. It's like saying that, since we provide a way to construct instances via unsafeCoerce, we should just export everything. To me FFI is like unsafeCoerce, you can break things with it, but it requires a certain amount of conscious effort to do so. Then the programmer sees unsafeCoerce or "foreign import", he naturally checks things twice. On the other hand, constructing "incoherent" instance using the newtype constructor is just way too easy.

comment:14 Changed 8 years ago by igloo@…

commit fb83cd0239e6d50b0ef0ad5cd9b641f0b4df032c

Author: Ian Lynagh <igloo@earth.li>
Date:   Fri Oct 21 00:47:15 2011 +0100

    Finish fixing #5529: Require that constructors are imported from all types
    
    We used to have a hack for Foreign.C.Types and System.Posix.Types,
    but I've removed that now. We also mark any constructors that we look
    through as "used", so that we don't get warnings about unused imports.

 compiler/typecheck/TcForeign.lhs |   28 ++++++++++++++++++++++------
 compiler/utils/Outputable.lhs    |    5 +++++
 2 files changed, 27 insertions(+), 6 deletions(-)

comment:15 Changed 8 years ago by igloo

Resolution: fixed
Status: newclosed

The constructors are now exported for the C* types.

GHC is now following the language definition. If you'd like to propose changing the definition, please discuss it on the Haskell Prime mailing list.

comment:16 Changed 3 years ago by Ben Gamari <ben@…>

In a28611b/ghc:

Export constructors for IntPtr and WordPtr

This finishes what #5529 started by exporting the constructors for
`IntPtr` and `WordPtr` from `Foreign.Ptr`, allowing them to be used in
`foreign` declarations.

Fixes #11983.

Test Plan: `make TEST=T11983`

Reviewers: simonpj, hvr, bgamari, austin

Reviewed By: simonpj

Subscribers: simonpj, thomie

Differential Revision: https://phabricator.haskell.org/D2142

GHC Trac Issues: #11983
Note: See TracTickets for help on using tickets.