Opened 4 years ago

Closed 4 years ago

Last modified 3 years ago

#11292 closed bug (fixed)

Generics unboxed types: TEST=GEq1 WAY=optasm is failing

Reported by: thomie Owned by:
Priority: normal Milestone: 8.0.1
Component: Compiler Version: 7.11
Keywords: Generics Cc: RyanGlScott
Operating System: Unknown/Multiple Architecture: Unknown/Multiple
Type of failure: None/Unknown Test Case:
Blocked By: Blocking:
Related Tickets: #11312 Differential Rev(s): Phab:D1714
Wiki Page:

Description

$ make TEST=GEq1 WAY=optasm TEST_HC=ghc-7.11.20151225
...
cd ./generics/GEq && ./GEq1    </dev/null > GEq1.run.stdout 2> GEq1.run.stderr
Actual stdout output differs from expected:
--- ./generics/GEq/GEq1.stdout.normalised	2015-12-26 15:32:22.984189653 +0100
+++ ./generics/GEq/GEq1.run.stdout.normalised	2015-12-26 15:32:22.984189653 +0100
@@ -3,5 +3,5 @@
 True
 True
 False
-True
-True
\ No newline at end of file
+False
+False
\ No newline at end of file
*** unexpected failure for GEq1(optasm)

The test was last touched by Ryan in 6cde981a8788b225819be28659caddc35b77972d (#10868)

Change History (16)

comment:1 Changed 4 years ago by RyanGlScott

This doesn't appear to be specific to generics. Rather, it's an issue with eqAddr# and how it's affected by optimization levels. Here's a simplified example that takes away the complexity of GEq1:

{-# LANGUAGE MagicHash #-}
module Main (main) where

import GHC.Exts (Addr#, isTrue#)
import GHC.Prim (eqAddr#)

data A = A { runA :: Addr# }

a :: A
a = A "a"#

main :: IO ()
main = print (isTrue# (eqAddr# (runA a) (runA a)))

Compiling this with -O0 makes the program return True, but compiling with -O1 or higher returns False. Is pointer equality unpredictable enough that it can fail the reflexive property if optimized enough? If so, I can just modify GEq1 so as not to include Addr#.

comment:2 Changed 4 years ago by rwbarton

It looks like GHC is happy to inline everything in your simplified example leaving

Main.main2 =
  case GHC.Prim.tagToEnum#
         @ GHC.Types.Bool (GHC.Prim.eqAddr# "a"# "a"#)
  of _ [Occ=Dead] {
    GHC.Types.False -> GHC.Show.shows26;
    GHC.Types.True -> GHC.Show.shows24
  }

The value "a"# has been duplicated, so there are now two copies of the string constant in the program, with different addresses.

For your test you can presumably work around this with NOINLINE annotations on u0 and uf0.

This duplication seems like generally undesirable behavior though; leaving aside the question of whether or not it is "semantically correct", it makes the resulting object file unnecessarily large (imagine the string was much longer than a single character). Probably worth opening a separate ticket.

comment:3 Changed 4 years ago by RyanGlScott

Differential Rev(s): Phab:D1714
Status: newpatch

Adding {-# NOINLINE #-} pragmas does indeed do the trick. I've opened Phab:D1714 for that workaround. I've also opened #11312 regarding the inlining issue.

comment:4 Changed 4 years ago by Ben Gamari <ben@…>

In 8e735fd0/ghc:

Fix GEq1 when optimizations are enabled

When optimizations are enabled, primitive string literals can be
inlined, which can create two copies of a string constant with different
addresses. We want to avoid this behavior at all costs in the `GEq1`
test, since the output depends on the result of `eqAddr#`. We prevent
such inlining through use of the `{-# NOINLINE #-}` pragma.

Fixes #11292.

Test Plan: Validate with T11292

Reviewers: thomie, austin, bgamari

Reviewed By: bgamari

Differential Revision: https://phabricator.haskell.org/D1714

GHC Trac Issues: #11292

comment:5 Changed 4 years ago by bgamari

Status: patchnew

RyanGlScott, did you mean to add a T11292 test in this diff?

comment:6 Changed 4 years ago by RyanGlScott

Not originally, but I suppose I did accidentally mention such a test in the Validation field. I should probably add a standalone test for this behavior with the proper flags always set (and possibly adjust it depending on the result of #11312).

comment:7 in reply to:  6 Changed 4 years ago by thomie

Resolution: fixed
Status: newclosed

Replying to RyanGlScott:

I should probably add a standalone test for this behavior

I don't know, but this issue is fixed. TEST=GEq1 WAY=optasm is passing again. Thanks.

comment:8 Changed 4 years ago by simonpj

Where is this eqAddr# test coming from?? It's utterly bogus to compare two strings by address. This is functional programming!

I'm very uncomfortable about "fixing" this by some new delicate stuff like NOINLINE pragmas, especially as they are entirely un-explained.

comment:9 in reply to:  8 ; Changed 4 years ago by RyanGlScott

Replying to simonpj:

Where is this eqAddr# test coming from?? It's utterly bogus to compare two strings by address. This is functional programming!

We have to test eqAddr# here because that's how deriving Eq deals with Addr#. Addr# is one of six privileged unlifted types which a datatype can have and still have a derived Eq/Ord instance. Internally, the deriving machinery compares Addr# fields in a datatype with eqAddr#/ltAddr#/etc., so this test just checks to see if GHC generics is equally as expressive.

I'm very uncomfortable about "fixing" this by some new delicate stuff like NOINLINE pragmas, especially as they are entirely un-explained.

I'm inclined to agree, but if we do fix this, we'd probably need to change the way Addr# is dealt with in datatypes as well for the sake of consistency.

comment:10 in reply to:  9 Changed 4 years ago by simonpj

I'm inclined to agree, but if we do fix this, we'd probably need to change the way Addr# is dealt with in datatypes as well for the sake of consistency.

No... an Addr# might be returned from C-land by malloc, and might be perfectly stable. It's just that primitive strings should not be compared with eqAddr#. The bug is that primitive strings have type Addr#; they should have type String#. And to compare String# you should use strcmp. I suppose that means a new primitive type and family of operations over it... but that looks to me like the Right Thing to do.

This NOINLINE stuff looks desperately fragile.... a bug waiting to happen.

comment:11 Changed 4 years ago by rwbarton

This test just needs a way to generate two different Addr#s so that it can test whether a derived Eq instance compares them correctly. Using string literals guarded by NOINLINE is one easy way to do that, and has the advantage (for the way the test is structured) of being pure.

comment:12 Changed 4 years ago by RyanGlScott

I suppose I could just replace the primitive string literals with nullAddr#, no?

comment:13 Changed 4 years ago by rwbarton

I don't know, do you really only need one Addr#?

comment:14 Changed 4 years ago by RyanGlScott

At the moment I'm only testing if something with unlifted types can be equal to itself, so one Addr# suffices. I could conceivably add a test for two things that aren't equal, in which the two unequal fields are something like nullAddr# and plusAddr# 1# nullAddr# (disclaimer: I don't know how safe that would be in general).

comment:15 Changed 4 years ago by thomie

Milestone: 8.0.1

comment:16 Changed 3 years ago by RyanGlScott

Keywords: Generics added
Note: See TracTickets for help on using tickets.