Opened 9 years ago

Closed 9 years ago

Last modified 9 years ago

#5250 closed bug (fixed)

SEGFAULT in FFI to C++ library

Reported by: acowley
Owned by:
Priority: high
Milestone: 7.2.1
Component: Compiler (FFI)
Version: 7.0.3
Keywords:
Cc:
Operating System: Unknown/Multiple
Architecture: x86
Type of failure: Runtime crash
Test Case:
Blocked By:
Blocking:
Related Tickets:
Differential Rev(s):
Wiki Page:

Description

A binding to OpenCV 2.2 I've been developing works fine on Mac OS X 10.5 and 10.6, but segfaults in Ubuntu 11.04 (32-bit) and Windows 7. On both Ubuntu and Windows, building a Debug or RelWithDebInfo build of OpenCV 2.2 seems to fix the problem. On Ubuntu, OpenCV 2.1 does not have this issue; I have not tried 2.1 on Windows.

To separate the bindings as a source of trouble, I boiled a test case down to a C function void myDilate(void) that allocates two images which are then passed to an OpenCV image processing routine, cvDilate. The myDilate function is defined in MyWrap.c. A shell C program, CTest2.c, defines a main function that calls this function. Building that program with,

gcc MyWrap.c CTest2.c -lopencv_core -lopencv_imgproc -o a.out

or

ghc -no-hs-main MyWrap.c CTest2.c -lopencv_core -lopencv_imgproc -o a.out

produces an executable that runs to completion.

Replacing the C main function with a Haskell main, Test2.hs, that calls the same function imported via the FFI results in an executable that segfaults. GDB + heavy printf'ing led to a first crash within OpenCV at an assignment to a local that invokes a copy constructor. Changing this line to some other kind of initialization moves the segfault to another seemingly innocuous location within OpenCV. On the Windows 7 test machine, the segfault occurs in a different place, but also in OpenCV.

I have tested this with GHC 7.0.3 and 7.0.3.20110531. The version of gcc on the Linux machine is 4.5.2.

The OpenCV libraries are .dylib on Mac, .so on Linux, and .dll on Windows. Of note is that OpenCV underwent a fairly large reorganization between 2.1 and 2.2, and is ever more heavily based around templatized C++.

I realize that OpenCV is a large dependency for a ticket, so I am willing to run any suggested tests.

Attachments (1)

SegFaulters.tgz (537 bytes) - added by acowley 9 years ago.
Test programs.


Change History (12)

Changed 9 years ago by acowley

Attachment: SegFaulters.tgz added

Test programs.

comment:1 Changed 9 years ago by acowley

To produce a segfaulting executable,

ghc MyWrap.c Test2.hs -lopencv_core -lopencv_imgproc

comment:2 Changed 9 years ago by acowley

Calling hs_init in the C program built by GHC does not trigger the segfault.

valgrind does not find any errors or leaks in the C program built by GCC.

comment:3 Changed 9 years ago by simonmar

Milestone: 7.2.1
Owner: set to simonmar
Priority: normal → high

comment:4 Changed 9 years ago by simonmar

The problem is that GHC doesn't maintain 16-byte alignment of the C stack pointer on Linux or Windows (it does on Mac OS X, where it is a requirement of the ABI). I'm testing a fix for this. In the meantime you might be able to work around the issue by compiling OpenCV with the gcc flag -mstack-realign or -mincoming-stack-boundary=2.

comment:5 Changed 9 years ago by simonmar

Correction - that should be -mstackrealign.
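For readers unfamiliar with the term, 16-byte alignment of the stack pointer means that its value is a multiple of 16 at the point where a C function is called. The sketch below only illustrates that arithmetic in C. The helper names is_aligned16 and align_down16 are hypothetical and are not taken from GHC, GCC, or OpenCV. Rounding down reflects what a realigning function prologue would be expected to do, since the x86 stack grows toward lower addresses.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true when the address p is a multiple of 16. */
static bool is_aligned16(const void *p)
{
    return ((uintptr_t)p & 15u) == 0;
}

/* Hypothetical helper: round an address value down to the nearest
   16-byte boundary by clearing its low four bits. */
static uintptr_t align_down16(uintptr_t addr)
{
    return addr & ~(uintptr_t)15;
}
```

A stack pointer value of 0x1007, for instance, is not aligned; rounding it down gives 0x1000, which is.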

comment:6 Changed 9 years ago by simonmar

Resolution: fixed
Status: new → closed

comment:7 Changed 9 years ago by simonmar

Owner: simonmar deleted
Resolution: fixed
Status: closed → new

The bug is not completely fixed, as we also need to maintain stack alignment for callbacks. I'm looking into it.

comment:8 Changed 9 years ago by marlowsd@…

commit 9f61598ce7b0cb3448e8f0c3d627c0ca47b7f55f
Author: Simon Marlow <marlowsd@…>
Date:   Wed Jun 29 11:49:57 2011 +0100

    Use the x86/Darwin implementation of Adjustors on all x86 platforms, as it
    maintains 16-byte alignment of the stack pointer (see #5250)
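As a rough illustration of what maintaining 16-byte alignment across a callback involves, the C sketch below computes how many padding bytes would have to be reserved before pushing a given number of argument bytes so that the stack pointer is a multiple of 16 again at the call. This is not the RTS adjustor code. The helper name call_padding16 is hypothetical, and the sketch assumes the stack pointer was already 16-byte aligned before any padding or arguments were pushed.

```c
#include <stddef.h>

/* Hypothetical helper, not taken from the GHC RTS.
   Returns the number of padding bytes to reserve before pushing
   arg_bytes of arguments so that the total pushed is a multiple of 16.
   Assumes the stack pointer was 16-byte aligned beforehand. */
static size_t call_padding16(size_t arg_bytes)
{
    return (16 - (arg_bytes % 16)) % 16;
}
```

With a single 4-byte argument the padding is 12 bytes; with 16 bytes of arguments no padding is needed.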

comment:9 Changed 9 years ago by simonmar

Status: new → merge

Now fixed, and I added a test (testsuite/tests/ghc-regress/rts/5250.hs).

I don't think I've broken Mac, but we'll find out soon enough if I have...

comment:10 Changed 9 years ago by igloo

Resolution: fixed
Status: merge → closed

comment:11 Changed 9 years ago by marlowsd@…

commit 81eddb4c58c6d4171a46c727574112e2083c4878
Author: Simon Marlow <marlowsd@gmail.com>
Date:   Mon Jul 18 13:48:53 2011 +0100

    Fix Windows breakage (#5322).  When I modified StgRun to use the pure
    assembly version as part of the fix for #5250, we inadvertently lost
    the Windows magic for extending the stack.  Win32 requires that the
    stack is extended a page at a time, otherwise you get a segfault.  The
    C compiler knows how to do this, so we now call a C stub to ensure
    there's enough stack space at each invocation of the scheduler.
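
The idea of extending the stack a page at a time can be sketched in C as a loop that writes one byte per page-sized stride, starting at the highest address and working downward. This is only an illustration of the access pattern, not the C stub added by the commit. The function name probe_pages and its parameters are hypothetical, and it operates on an ordinary buffer rather than on the real stack.

```c
#include <stddef.h>

/* Hypothetical illustration, not the RTS stub.
   Writes one byte every page_size bytes within [base, base + len),
   starting from the highest address and moving toward base, so that
   consecutive probes are never more than page_size bytes apart.
   Returns the number of probes performed. */
static size_t probe_pages(volatile char *base, size_t len, size_t page_size)
{
    size_t probes = 0;
    size_t offset = len;

    while (offset > 0) {
        base[offset - 1] = 0;   /* the write itself is the probe */
        probes++;
        offset = offset > page_size ? offset - page_size : 0;
    }
    return probes;
}
```

For a region of 8193 bytes and a 4096-byte page size, the loop performs three probes, at offsets 8192, 4096, and 0.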