Opened 3 years ago

Last modified 13 months ago

#13296 new bug

stat() calls can block Haskell runtime

Reported by: nh2 Owned by:
Priority: normal Milestone:
Component: Runtime System Version: 8.0.2
Keywords: Cc: simonmar, slyfox, rwbarton, redneb
Operating System: Unknown/Multiple Architecture: Unknown/Multiple
Type of failure: Runtime performance bug Test Case:
Blocked By: Blocking:
Related Tickets: Differential Rev(s):
Wiki Page:

Description

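getFileStatus (the stat() syscall) is imported as an unsafe foreign call, which holds on to its capability until the call returns. With e.g. +RTS -N4, this means I can't stat more than 4 files at the same time, and while 4 such calls are in flight the Haskell world is completely stopped.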

This is an issue on network file systems, where a single stat() can easily take 2 milliseconds, so you typically want to do them in parallel (but due to the above you can't).
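A minimal sketch of the effect, not taken from the ticket: sleep(3) stands in for a slow stat(), and the names sleepUnsafe, sleepSafe and ticksDuring are made up for illustration. It assumes a POSIX system and a build with ghc -threaded, run with the default single capability.

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}
module Main (main) where

import Control.Concurrent (forkIO, killThread, threadDelay)
import Control.Monad (forever)
import Data.IORef (atomicModifyIORef', newIORef, readIORef)
import Foreign.C.Types (CUInt (..))

-- Assumption: sleep(3) plays the role of a stat() that blocks for a long time.
foreign import ccall unsafe "unistd.h sleep" sleepUnsafe :: CUInt -> IO CUInt
foreign import ccall safe   "unistd.h sleep" sleepSafe   :: CUInt -> IO CUInt

-- Count how often a background Haskell thread gets to run (one tick per
-- 10 ms at most) while the given blocking action is in progress.
ticksDuring :: IO a -> IO Int
ticksDuring blocking = do
  counter <- newIORef (0 :: Int)
  ticker <- forkIO . forever $ do
    threadDelay 10000
    atomicModifyIORef' counter (\n -> (n + 1, ()))
  _ <- blocking
  ticks <- readIORef counter
  killThread ticker
  return ticks

main :: IO ()
main = do
  unsafeTicks <- ticksDuring (sleepUnsafe 1)
  safeTicks   <- ticksDuring (sleepSafe 1)
  putStrLn ("ticks during unsafe call: " ++ show unsafeTicks)
  putStrLn ("ticks during safe call:   " ++ show safeTicks)
```

With a single capability, the unsafe call should leave the ticker thread with few or no ticks, while the safe call releases the capability and lets it keep running. Exact counts depend on the scheduler and the machine.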

The underlying problem is that there are some Linux syscalls you typically need on networked file systems that have no asynchronous equivalent; according to http://blog.libtorrent.org/2012/10/asynchronous-disk-io/ these are at least:

  • stat()
  • open()
  • fallocate()
  • rename()

A quick skim through libraries/base/System/Posix/Internals.hs reveals the situation:

  • stat() only exists as unsafe
  • open() has both safe and unsafe variants (I haven't checked which one is used in practice)

The remaining ones are in the unix package:

  • fallocate() is safe
  • rename() is unsafe

It seems to me that there are two issues here:

1) None of these calls should be unsafe because they may block for a very long time (e.g. > 0.5 ms even on the fastest LANs).

2) We need to answer the question: If we marked them as safe, how many of them would the RTS execute in parallel? To my current knowledge (thanks rwbarton and slyfox on #ghc), there's a pool of threads executing Haskell code (the usual +RTS -N), and a pool of FFI threads. Are there any restrictions on the size of that latter pool? The docs https://downloads.haskell.org/~ghc/latest/docs/html/users_guide/ffi-chap.html#foreign-imports-and-multi-threading are not specific on that topic, simply mentioning "but there may be an arbitrary number of foreign calls in progress at any one time, regardless of the +RTS -N value". Does that mean the number of FFI threads is truly unbounded?

Change History (5)

comment:1 Changed 3 years ago by nh2

slyfox on IRC answers question (2):

  • slyfox: yeah, ghc's RTS does not limit amount of outstanding safe FFIs. example: http://dpaste.com/2YT058D
  • slyfox: ghc --make a.hs -threaded -debug && time ./a +RTS -Ds this runs in 6 seconds and nicely prints full list of blocked threads on FFI

I've also put this into a gist and confirmed it on my GHC 8.0.2: https://gist.github.com/nh2/bcf583721213d34e9f464558a91a682e
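For readers who cannot open the pastes, here is a rough sketch of the same kind of experiment. It is not the code from the links above: c_sleep and runAll are illustrative names, sleep(3) stands in for a blocking syscall, and GHC.Clock requires base 4.11 or later (newer than the 8.0.2 this ticket was filed against). Build with ghc -threaded.

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}
module Main (main) where

import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Monad (forM)
import Foreign.C.Types (CUInt (..))
import GHC.Clock (getMonotonicTime)

-- Assumption: sleep(3) stands in for any blocking call imported as safe.
foreign import ccall safe "unistd.h sleep" c_sleep :: CUInt -> IO CUInt

-- Run every action in its own Haskell thread and wait for all results,
-- returned in the same order as the input actions.
runAll :: [IO a] -> IO [a]
runAll actions = do
  vars <- forM actions $ \act -> do
    var <- newEmptyMVar
    _ <- forkIO (act >>= putMVar var)
    return var
  mapM takeMVar vars

main :: IO ()
main = do
  start <- getMonotonicTime
  _ <- runAll (replicate 100 (c_sleep 1))
  end <- getMonotonicTime
  putStrLn ("100 one-second safe calls took " ++ show (end - start) ++ " s")
```

If the RTS really does not cap outstanding safe calls, the elapsed time should stay close to one second rather than growing with the number of calls, even with the default single capability.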

Further discussion:

  • rwbarton: stat() in a loop in C costs about 300ns per stat call here
  • nh2: much easier to program than building HTTP wrappers around all software, and basic useful features
  • rwbarton: so 100ns for safe call overhead is kind of significant
  • nh2: slyfox: I guess asking for a better API is thinking ahead a bit too much when there isn't even an async syscall interface for it in Linux yet
  • rwbarton: I think that is the API slyfox means
  • rwbarton: getFileStatus costs another 300 ns in userspace
  • rwbarton: so it would be about 100 ns more on top of 600 ns
  • nh2: rwbarton: that is a fair point, but it is hard to get completely right: on a local FS with a fast backend (e.g. ramdisk, SSD or FS metadata in the buffer cache), stat will be very fast, but if you are on a spinning disk on a part of FS metadata that's not currently cached, an NFS, or a network-mounted VM image (e.g. Amazon EBS and the various equivalents), it'll be 1000x slower. And there's no way to tell on which you are
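To put the numbers from this exchange side by side, the following sketch computes the relative cost of the quoted 100 ns safe-call overhead for the fast case (300 ns syscall plus 300 ns userspace) and for the "1000x slower" case. The constant and function names are illustrative; the figures are the ones quoted above, not new measurements.

```haskell
module Main (main) where

-- Figures quoted in the IRC discussion, in nanoseconds.
statSyscallNs, statUserspaceNs, safeCallOverheadNs :: Double
statSyscallNs      = 300
statUserspaceNs    = 300
safeCallOverheadNs = 100

-- Fraction of extra time a safe call adds on top of a call costing baseNs.
relativeOverhead :: Double -> Double
relativeOverhead baseNs = safeCallOverheadNs / baseNs

main :: IO ()
main = do
  let fastNs = statSyscallNs + statUserspaceNs  -- cached, local file system
      slowNs = 1000 * fastNs                    -- the "1000x slower" case
  putStrLn ("fast stat: +" ++ show (100 * relativeOverhead fastNs) ++ " %")
  putStrLn ("slow stat: +" ++ show (100 * relativeOverhead slowNs) ++ " %")
```

On these figures the safe call costs roughly 17% extra on a fast local stat and roughly 0.017% extra on a slow one, which is the trade-off both sides are arguing about.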

comment:2 Changed 3 years ago by simonmar

The default implementation should be safe, and we could provide an internal unsafe version for people who know what they're doing and want that extra 100ns.
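One possible shape for that split, sketched with rename() because its C signature is simple. The names renamePath and unsafeRenamePath are hypothetical and not part of base or unix, and withCString is used for brevity where real library code would apply the file system encoding.

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}
module Main (main) where

import Foreign.C.Error (throwErrnoIfMinus1_)
import Foreign.C.String (CString, withCString)
import Foreign.C.Types (CInt (..))

foreign import ccall safe   "stdio.h rename" c_rename_safe   :: CString -> CString -> IO CInt
foreign import ccall unsafe "stdio.h rename" c_rename_unsafe :: CString -> CString -> IO CInt
foreign import ccall unsafe "unistd.h unlink" c_unlink :: CString -> IO CInt

-- Shared wrapper: marshal both paths and turn a -1 result into an IOError.
renameWith :: (CString -> CString -> IO CInt) -> FilePath -> FilePath -> IO ()
renameWith c_rename from to =
  withCString from $ \cFrom ->
    withCString to $ \cTo ->
      throwErrnoIfMinus1_ "rename" (c_rename cFrom cTo)

-- Default (hypothetical name): safe import, does not hold a capability
-- while the call blocks.
renamePath :: FilePath -> FilePath -> IO ()
renamePath = renameWith c_rename_safe

-- Opt-in variant (hypothetical name) for callers who know the call is fast
-- and want to avoid the safe-call overhead.
unsafeRenamePath :: FilePath -> FilePath -> IO ()
unsafeRenamePath = renameWith c_rename_unsafe

main :: IO ()
main = do
  let a = "ticket13296-a.tmp"
      b = "ticket13296-b.tmp"
  writeFile a "hello"
  renamePath a b          -- safe variant
  unsafeRenamePath b a    -- unsafe variant, back to the original name
  readFile a >>= putStrLn
  withCString a (throwErrnoIfMinus1_ "unlink" . c_unlink)
```

Keeping the marshalling in one helper means the two exported functions differ only in which foreign import they pass in, so the safe default and the unsafe variant cannot drift apart.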

Separately we should work on reducing the overhead of safe calls. I have an experimental patch here: https://phabricator.haskell.org/D1466

comment:3 Changed 22 months ago by Ben Gamari <ben@…>

In cafe9834/ghc:

Always use the safe open() call

open() can sometimes take a long time, for example on NFS or FUSE
filesystems.  We recently had a case where open() was taking multiple
seconds to return for a (presumably overloaded) FUSE filesystem, which
blocked GC and caused severe issues.

Test Plan: validate

Reviewers: niteria, bgamari, nh2, hvr, erikd

Reviewed By: bgamari

Subscribers: rwbarton, thomie, carter

GHC Trac Issues: #13296

Differential Revision: https://phabricator.haskell.org/D4239

comment:4 Changed 22 months ago by bgamari

We still need to do this for the remaining syscalls.

comment:5 Changed 13 months ago by redneb

Cc: redneb added