Opened 9 years ago

Closed 9 years ago

#5005 closed bug (fixed)

epollCreate: unsupported operation (Function not implemented)

Reported by: nomeata
Owned by: igloo
Priority: highest
Milestone: 7.2.1
Component: Compiler
Version: 7.0.2
Keywords:
Cc: debian-haskell@…, johan.tibell@…
Operating System: Linux
Architecture: Unknown/Multiple
Type of failure: Compile-time crash
Test Case:
Blocked By:
Blocking:
Related Tickets:
Differential Rev(s):
Wiki Page:

Description

When we build ghc-7 for Debian and are not hit by a linker error related to pthread and the unix package, compiling anything fails with this error:

epollCreate: unsupported operation (Function not implemented)

(from https://buildd.debian.org/fetch.cgi?pkg=ghc&arch=i386&ver=7.0.2-2&stamp=1299583867&file=log&as=raw). It is not very reproducible, though – never happened on my laptop (amd64), and this build on i386, using 7.0.1, worked: https://buildd.debian.org/fetch.cgi?pkg=ghc&arch=i386&ver=7.0.2-1&stamp=1299522621&file=log&as=raw

Happens (at least) on amd64 and i386 on Linux.

Change History (16)

comment:1 Changed 9 years ago by tibbe

Cc: johan.tibell@… added

comment:2 Changed 9 years ago by tibbe

That's odd. Is it always reproducible within a single machine? Linux (and Debian) should support epoll just fine.

comment:3 Changed 9 years ago by tibbe

I'm also not aware of any I/O manager changes that could trigger this kind of bug.

comment:4 Changed 9 years ago by nomeata

Julien Cristau solved the riddle: epoll_create1() (note the 1) is only available on newer kernels, so if ghc is built on a new kernel it will use epoll_create1() instead of epoll_create() and it will not run on older kernels. See http://lists.debian.org/debian-devel/2011/03/msg00407.html.

How should we solve this? Unconditionally use epoll_create()? Or come up with a run-time check as to which one is available?
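One way to answer the "which one is available" question at run time is to call epoll_create1() once and inspect errno. The following is only a minimal sketch, not code from GHC or from the Debian thread; the helper name probe_epoll_create1 is made up for illustration.

```c
#include <errno.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Hypothetical helper (not part of GHC or glibc).
 * Returns 1 if the running kernel implements epoll_create1(),
 * 0 if the call fails with ENOSYS (kernel too old),
 * and -1 on any other error (errno is left set by the failed call). */
int probe_epoll_create1(void)
{
    int fd = epoll_create1(EPOLL_CLOEXEC);
    if (fd >= 0) {
        close(fd);          /* we only wanted to know whether the call works */
        return 1;
    }
    return (errno == ENOSYS) ? 0 : -1;
}
```

A binary compiled against a new glibc can then pick epoll_create1() or epoll_create() depending on the result, instead of fixing the choice at build time.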

comment:5 Changed 9 years ago by nomeata

This run-time code is suggested by Julien:

#if defined(HAVE_EPOLL_CREATE1)
fd = epoll_create1(EPOLL_CLOEXEC);
if (fd < 0 && errno == ENOSYS)
#endif
{
  fd = epoll_create(256);
  if (fd < 0 || fcntl(fd, F_SETFD, fcntl(fd, F_GETFD, 0) | FD_CLOEXEC) < 0)
    return -1;
}
return fd;

Is that worth the trouble or is the ability to specify FD_CLOEXEC not worth it?

comment:6 Changed 9 years ago by tibbe

So it's only possible to detect this at run-time? Here's what we currently do:

epollCreate :: IO EPollFd
epollCreate = do
  fd <- throwErrnoIfMinus1 "epollCreate" $
#if defined(HAVE_EPOLL_CREATE1)
        c_epoll_create1 (#const EPOLL_CLOEXEC)
#else
        c_epoll_create 256 -- argument is ignored
  setCloseOnExec fd
#endif
  let !epollFd' = EPollFd fd
  return epollFd'

I'd be fine to include a run-time check if necessary.

Note: I'll be out on vacation and only have very sporadic access to email between March 10-26. If you have more questions about the current design please ask Bryan O'Sullivan.

comment:7 Changed 9 years ago by tibbe

Perhaps we could change this to (untested):

import Foreign.C.Error (eNOSYS, getErrno, throwErrno)

epollCreate :: IO EPollFd
epollCreate = do
#if defined(HAVE_EPOLL_CREATE1)
  -- Try the newer call first; fall back if the kernel lacks it.
  fd1 <- c_epoll_create1 (#const EPOLL_CLOEXEC)
  fd <- if fd1 == -1
    then do
      errno <- getErrno
      if errno == eNOSYS
        then legacyCreate
        else throwErrno "epollCreate"
    else return fd1
#else
  fd <- legacyCreate
#endif
  let !epollFd' = EPollFd fd
  return epollFd'
 where
  -- Pre-epoll_create1 path: create, then set close-on-exec separately.
  legacyCreate = do
    fd <- throwErrnoIfMinus1 "epollCreate" $
          c_epoll_create 256 -- argument is ignored
    setCloseOnExec fd
    return fd

Could someone please test and see if this works?

comment:8 Changed 9 years ago by nomeata

Hi,

Replying to tibbe:

> So it's only possible to detect this at run-time? Here's what we currently do:
>
> epollCreate :: IO EPollFd
> epollCreate = do
>   fd <- throwErrnoIfMinus1 "epollCreate" $
> #if defined(HAVE_EPOLL_CREATE1)
>         c_epoll_create1 (#const EPOLL_CLOEXEC)
> #else
>         c_epoll_create 256 -- argument is ignored
>   setCloseOnExec fd
> #endif
>   let !epollFd' = EPollFd fd
>   return epollFd'
>
> I'd be fine to include a run-time check if necessary.

Right, I saw that. The problem is that we (at least in Debian) expect a ghc built on a newer kernel to also work on an older kernel.

The alternative to a run-time check is using epoll_create() only. The only difference, as far as I was told, arises when another thread forks between the call to epoll_create() and setCloseOnExec. Given that this is acceptable behaviour on pre-2.6.27 kernels, wouldn’t it be acceptable behaviour on later kernels as well?
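For reference, the epoll_create()-only path looks roughly like this in C. This is a sketch under assumptions, not GHC's actual setCloseOnExec implementation; the function name epoll_create_legacy_cloexec is hypothetical. The comment marks the window in which a concurrent fork and exec could still inherit the descriptor.

```c
#include <fcntl.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Hypothetical helper: create an epoll descriptor without epoll_create1()
 * and mark it close-on-exec in a second step. Returns the fd, or -1. */
int epoll_create_legacy_cloexec(void)
{
    /* The size hint is ignored by current kernels but must be positive. */
    int fd = epoll_create(256);
    if (fd < 0)
        return -1;

    /* Race window: a fork+exec in another thread at this point would
     * leak fd into the child, because FD_CLOEXEC is not set yet. */
    int flags = fcntl(fd, F_GETFD);
    if (flags < 0 || fcntl(fd, F_SETFD, flags | FD_CLOEXEC) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

epoll_create1(EPOLL_CLOEXEC) closes that window by creating the descriptor and setting the flag in one system call, which is the only functional difference discussed here.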

comment:9 in reply to:  8 Changed 9 years ago by tibbe

Replying to nomeata:

> Replying to tibbe:
>
> > I'd be fine to include a run-time check if necessary.
>
> Right, I saw that. The problem is that we (at least in Debian) expect a ghc built on a newer kernel to also work on an older kernel.
>
> The alternative to a run-time check is using epoll_create() only. The only difference, as far as I was told, arises when another thread forks between the call to epoll_create() and setCloseOnExec. Given that this is acceptable behaviour on pre-2.6.27 kernels, wouldn’t it be acceptable behaviour on later kernels as well?

Perhaps. I haven't considered the issue in depth so I don't know what the consequences of using epoll_create instead of epoll_create1 are. If the fix I posted fixes the problem we don't even have to make a trade-off.

comment:10 Changed 9 years ago by nomeata

I feel uncomfortable applying such a complicated(-looking) change to the Debian package without a thorough review and, ideally, having the patch applied upstream. OTOH, I also want to push forward with the ghc migration in Debian. Therefore, I’ll remove the use of epoll_create1 for now (assuming that the loss is marginal) and hope that the next ghc release will have a proper solution.

comment:11 Changed 9 years ago by tibbe

Do whatever you feel comfortable with. We might do the more complicated-looking change in GHC.

comment:12 Changed 9 years ago by igloo

Milestone: 7.2.1
Priority: normal → highest

One option is to do something different if BeConservative is YES.

So we could:

  • always use c_epoll_create if BeConservative is YES
  • do the test-and-fallback if BeConservative is YES
  • always test-and-fallback

I don't know what the advantages of c_epoll_create1 are, or if test-and-fallback is a significant performance penalty. Which should we do?

comment:13 Changed 9 years ago by tibbe

There shouldn't be a noticeable performance penalty as we only call this function once, when the I/O manager is started. Going with only epoll_create is probably fine but as Bryan originally added this code we should ask him first.

comment:14 Changed 9 years ago by bos

Couple of comments.

  1. Debian packagers shouldn't be building stuff against a new glibc/kernel and then expecting it to work against an older one. That's just sloppy practice. If they'd been building GHC against 2.6.26 or older and an appropriate glibc, this wouldn't occur.
  2. It is in fact fine to use epoll_create followed by ioctl, instead of epoll_create1.

comment:15 Changed 9 years ago by igloo

Owner: set to igloo

OK, let's just drop the epoll_create1 option then. Thanks for the help, guys!

comment:16 Changed 9 years ago by igloo

Resolution: fixed
Status: new → closed

Fixed:

Sat Mar 12 21:14:26 GMT 2011  Ian Lynagh <igloo@earth.li>
  * Never use epoll_create1; fixes trac #5005
  There is little benefit to using epoll_create1 (especially if we still
  have the epoll_create code too), and it causes problems if people build
  a GHC binary on one machine and try to use it on another.