Opened 2 years ago

Closed 2 years ago

Last modified 2 years ago

#13525 closed bug (fixed)

hWaitForInput with timeout causes program to abort

Reported by: bgamari
Owned by:
Priority: highest
Milestone: 8.2.1
Component: Compiler
Version: 8.0.2
Keywords:
Cc:
Operating System: Linux
Architecture: x86_64 (amd64)
Type of failure: Runtime crash
Test Case:
Blocked By:
Blocking: #8684
Related Tickets: #12912, #8684
Differential Rev(s): Phab:D3473
Wiki Page:

Description

This program,

import System.IO
import System.Timeout

main = hWaitForInput stdin (5 * 1000)

causes the program to abort (tested on Linux),

$ ghc hi.hs
[1 of 1] Compiling Main             ( hi.hs, hi.o )
Linking hi ...
$ ./hi
fdReady: msecs != 0, this shouldn't happenAborted

Change History (11)

comment:1 Changed 2 years ago by bgamari

This was originally reported as a result of #12912. #8684 is also relevant.

comment:2 Changed 2 years ago by Ben Gamari <ben@…>

In 3d523fd9/ghc:

base: Add test for #13525

Reviewers: austin, hvr

Subscribers: rwbarton, thomie

Differential Revision: https://phabricator.haskell.org/D3419

comment:3 Changed 2 years ago by bgamari

Priority: high → highest

comment:4 Changed 2 years ago by bgamari

One approach to fixing this for the threaded runtime would be to use threadWaitRead in conjunction with System.Timeout.timeout.
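
A minimal sketch of that idea, assuming the caller has a raw file descriptor rather than a Handle. `waitReadable` is a hypothetical helper name, not something in base, and the sketch ignores any data already sitting in a Handle's buffer, which a real hWaitForInput replacement would have to check first:

```haskell
import Control.Concurrent (threadWaitRead)
import Data.Maybe (isJust)
import System.Posix.Types (Fd (..))
import System.Timeout (timeout)

-- | Hypothetical helper (not part of base): block until @fd@ is readable
-- or @ms@ milliseconds have passed. 'timeout' takes microseconds, hence
-- the conversion. A negative @ms@ waits indefinitely, because 'timeout'
-- treats negative arguments that way.
waitReadable :: Fd -> Int -> IO Bool
waitReadable fd ms = isJust <$> timeout (ms * 1000) (threadWaitRead fd)

main :: IO ()
main = do
  ready <- waitReadable (Fd 0) (5 * 1000) -- stdin, 5 seconds
  putStrLn ("Ready: " ++ show ready)
```

Note that `timeout 0` returns `Nothing` without running the action at all, so this sketch never reports readiness for a zero timeout; a non-blocking readiness check would have to be handled separately.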

comment:5 Changed 2 years ago by bgamari

For the record, this was originally broken by f46369b8a1bf90a3bdc30f2b566c3a7e03672518.

comment:6 Changed 2 years ago by bgamari

Differential Rev(s): Phab:D3473
Status: new → patch

comment:7 Changed 2 years ago by bgamari

Merged to master as,

In e5732d2/ghc:

base: Fix hWaitForInput with timeout on POSIX

This was previously broken (#13525) by
f46369b8a1bf90a3bdc30f2b566c3a7e03672518, which ported the fdReady
function from `select` to `poll` and in so doing dropping support for
timeouts. Unfortunately, while `select` tells us the amount of time not
slept (on Linux anyways; it turns out this is implementation dependent),
`poll` does not give us this luxury. Consequently, we manually need to
track time slept in this case.

Unfortunately, portably measuring time is hard. Ideally we would use
`clock_gettime` with the monotonic clock here, but sadly this isn't
supported on most versions of Darwin. Consequently, we instead use
`gettimeofday`, running the risk of system time changes messing us up.

Test Plan: Validate

Reviewers: simonmar, austin, hvr

Reviewed By: simonmar

Subscribers: rwbarton, thomie

GHC Trac Issues: #13525

Differential Revision: https://phabricator.haskell.org/D3473
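
The manual time tracking described in the commit message above can be illustrated roughly as follows. This is a sketch under stated assumptions, not the code from the patch: the actual fix lives in the C `fdReady` function and uses `gettimeofday`, whereas this sketch uses the monotonic clock from `GHC.Clock` (base 4.11 or later). `remainingMs`, `waitWithBudget` and the `waitOnce` callback are hypothetical names; `waitOnce` stands in for a single `poll` call that may return early without reporting how long it slept. A non-negative budget is assumed.

```haskell
import Control.Concurrent (threadDelay)
import Data.Word (Word64)
import GHC.Clock (getMonotonicTimeNSec)

-- | Milliseconds left of a @totalMs@ budget, given the start and current
-- clock readings in nanoseconds. Never negative.
remainingMs :: Int -> Word64 -> Word64 -> Int
remainingMs totalMs startNs nowNs
  | nowNs <= startNs = totalMs
  | otherwise = max 0 (totalMs - elapsedMs)
  where
    elapsedMs = fromIntegral ((nowNs - startNs) `div` 1000000)

-- | Call @waitOnce@ with the time still available until it reports
-- readiness or the budget is used up. The elapsed time is measured by
-- us, since the wait itself does not say how long it slept.
waitWithBudget :: Int -> (Int -> IO Bool) -> IO Bool
waitWithBudget totalMs waitOnce = do
  start <- getMonotonicTimeNSec
  let loop left = do
        ready <- waitOnce left
        if ready
          then pure True
          else do
            now <- getMonotonicTimeNSec
            let left' = remainingMs totalMs start now
            if left' <= 0 then pure False else loop left'
  loop totalMs

main :: IO ()
main = do
  -- A fake wait that sleeps 10 ms and never reports readiness.
  ready <- waitWithBudget 50 (\_ -> threadDelay 10000 >> pure False)
  putStrLn ("Ready: " ++ show ready)
```

Recomputing the remaining time from a fixed start reading, rather than subtracting a per-call estimate, keeps rounding errors from accumulating across retries.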

comment:8 Changed 2 years ago by bgamari

Status: patch → merge

comment:9 Changed 2 years ago by bgamari

Resolution: fixed
Status: merge → closed

comment:10 Changed 2 years ago by nh2

Blocking: 8684 added

comment:11 Changed 2 years ago by Ben Gamari <ben@…>

In ba4dcc7c/ghc:

base: Make it less likely for fdReady() to fail on Windows sockets.

See the added comment for details.

It's "less likely" because it can still fail if the socket happens to
have an FD larger than 1023, which can happen if many files are opened.

Until now, basic socket programs that use `hWaitForInput` were broken on
Windows.

That is because on Windows `FD_SETSIZE` defaults to 64, but pretty much
all GHC programs seem to have > 64 FDs open, so you can't actually
create a socket on which you can `select()`.

It errors with `fdReady: fd is too big` even with an example as simple
as the following (in this case, on my machine the `fd` is `284`):

  {-# LANGUAGE OverloadedStrings #-}

  import Control.Monad (forever)
  import Network.Socket
  import System.IO

  -- Simple echo server: Reads up to 10 chars from network, echoes them back.
  -- Uses the Handle API so that `hWaitForInput` can be used.
  main :: IO ()
  main = do
    sock <- socket AF_INET Stream 0
    setSocketOption sock ReuseAddr 1
    bind sock (SockAddrInet 1234 0x0100007f)
      -- 0x0100007f == 127.0.0.1 localhost
    listen sock 2
    forever $ do
      (connSock, _connAddr) <- accept sock
      putStrLn "Got connection"

      h <- socketToHandle connSock ReadWriteMode
      hSetBuffering h NoBuffering

      ready <- hWaitForInput h (5 * 1000) -- 5 seconds
      putStrLn $ "Ready: " ++ show ready

      line <- hGetLine h
      putStrLn "Got line"
      hPutStrLn h ("Got: " ++ line)
      hClose h

I'm not sure how this was not discovered earlier; for #13525 (where
`fdReady()` breaking completely was also discovered late) at least it
failed only when the timeout was non-zero, which is not used in ghc
beyond in `hWaitForInput`, but in this Windows socket case it breaks
even on the 0-timeout.

Maybe there is not actually anybody who uses sockets as handles on
Windows?

The workaround for now is to increase `FD_SETSIZE` on Windows;
increasing it is possible on Windows and BSD, see

https://stackoverflow.com/questions/7976388/increasing-limit-of-fd-setsize-and-select

A real fix would be to move to IO Completion Ports on Windows, and thus
get rid of the last uses of `select()` (the other platforms already use
`poll()` but Windows doesn't have that).

Reviewers: bgamari, austin, hvr, erikd, simonmar

Reviewed By: bgamari

Subscribers: rwbarton, thomie

Differential Revision: https://phabricator.haskell.org/D3959