Ticket #8 (closed defect: fixed)

Opened 5 years ago

Last modified 5 years ago

weird issue with round-robin proxies

Reported by: duncan@… Owned by: somebody
Priority: major Milestone:
Component: component1 Version:
Keywords: Cc:

Description

We had this issue reported against the cabal-install program:

http://hackage.haskell.org/trac/hackage/ticket/562

The summary is that when using a round-robin proxy we get sporadic failures. The difference looks like:

In the failing case

X-Cache: HIT from wwwcache.lancs.ac.uk
X-Cache-Lookup: HIT from wwwcache.lancs.ac.uk:8080
X-Cache: MISS from wwwcache.lancs.ac.uk
X-Cache-Lookup: MISS from wwwcache.lancs.ac.uk:8080
Via: 1.0 wwwcache3.lancs.ac.uk:8080 (squid/2.6.STABLE22), 1.0
wwwcache1.lancs.ac.uk:8080 (squid/2.7.STABLE4)
Connection: close

In the working case:

X-Cache: HIT from wwwcache.lancs.ac.uk
X-Cache-Lookup: HIT from wwwcache.lancs.ac.uk:8080
Via: 1.0 wwwcache3.lancs.ac.uk:8080 (squid/2.6.STABLE22)
Proxy-Connection: close

In particular in the failing case we get a cache hit followed by a mystery miss and finally Connection: close. In the working case we get no miss and a Proxy-Connection: close.

Why Connection: vs Proxy-Connection: we wonder, and what is this cache miss about?

Change History

follow-up: ↓ 5   Changed 5 years ago by michaeldever

I took a look at this, while I don't have a solution. I think it has something to do with the recent addition of lazy stream reading. Is it possibly too lazy? :)

Again, even with the addition, it all seems to flow in a *similar* manner. diffing the two libraries (.6 and .7 doesn't show too many changes.) But one thing worth noting is that in socketConnection in .7, it is setting the connInput of the HandleStream? to be Nothing..

I tried calling socketConnection_ hst sock True, but that made no difference.

I can't see anywhere that it might be going wrong.

  Changed 5 years ago by michaeldever

Extracting the tarball of the FileManip?? package that was downloaded yeilds this:

/home/member/d/dever/.cabal/packages/hackage.haskell.org/00-index.tar.gz

gzip: stdin: unexpected end of file FileManip?-0.3.2/ FileManip?-0.3.2/README FileManip?-0.3.2/FileManip.cabal FileManip?-0.3.2/System/ FileManip?-0.3.2/System/FilePath/ FileManip?-0.3.2/System/FilePath/GlobPattern.hs tar: Unexpected EOF in archive tar: Unexpected EOF in archive tar: Error is not recoverable: exiting now

So it's definitely a problem with the HTTP package in my opinion. I'm not sure if it is a problem with the packages proxy handling, as it does download some of the package, but not all of it.

Seeing as both the Zlib library, and tar yield an end of stream error, its something that I'm reckoning is happening during transport.

  Changed 5 years ago by anonymous

Before trying to find the source of the bug, have you checked that the problem does not occur with some earlier version, like 4000.0.6 or 4000.0.5?

  Changed 5 years ago by joe

Another perspective here, with somewhat different symptoms. My laptop is temporarily behind a transparent proxy server and now I am unable to "cabal update".

The difference is [1] it's not intermittent (~10 tries, so far), [2] every user agent I try fails (HTTP-4000.0.7, firefox, wget, perl LWP), [3] my proxy server returns a 500 server error. Furthermore, if I request the same .tar.gz file from a different server, it works OK.

To me, this implies an interaction problem between the proxy server and hackage.haskell.org. Maybe my proxy server just fails faster than other posters' and returns a 500 when it receives a partial response. Or maybe I'm experiencing a completely different problem...

Do other posters see the problem with other user agents? How about with the other URL I created, below? If you see differences, you could try using tcpdump to watch the conversations.

$ ghc -e 'Network.HTTP.simpleHTTP (Network.HTTP.getRequest "http://hackage.haskell.org/packages/archive/HTTP/4000.0.4/HTTP-4000.0.4.tar.gz") >>= Network.HTTP.getResponseBody' | wc -c
2065

$ ghc -e 'Network.HTTP.simpleHTTP (Network.HTTP.getRequest "http://heme.elem.com/tmp/li4Ewiak/HTTP-4000.0.4.tar.gz") >>= Network.HTTP.getResponseBody' | wc -c
126197

$ GET -Se http://hackage.haskell.org/packages/archive/HTTP/4000.0.4/HTTP-4000.0.4.tar.gz
GET http://hackage.haskell.org/packages/archive/HTTP/4000.0.4/HTTP-4000.0.4.tar.gz --> 500 ( Not implemented  )
Via: <proxy server name deleted>
Title: The page cannot be displayed
X-Meta-GENERATOR: MSHTML 5.50.4522.1800
X-Meta-ROBOTS: NOINDEX
...

$ GET -Se http://heme.elem.com/tmp/li4Ewiak/HTTP-4000.0.4.tar.gz
GET http://heme.elem.com/tmp/li4Ewiak/HTTP-4000.0.4.tar.gz --> 200 OK
Via: <proxy server name deleted>
...

$ wget --verbose http://hackage.haskell.org/packages/archive/HTTP/4000.0.4/HTTP-4000.0.4.tar.gz
--2009-07-10 00:29:15--  http://hackage.haskell.org/packages/archive/HTTP/4000.0.4/HTTP-4000.0.4.tar.gz
Resolving hackage.haskell.org... 69.30.63.197
Connecting to hackage.haskell.org|69.30.63.197|:80... connected.
HTTP request sent, awaiting response... 500 ( Not implemented  )
2009-07-10 00:29:16 ERROR 500: ( Not implemented  ).

$ wget --verbose http://heme.elem.com/tmp/li4Ewiak/HTTP-4000.0.4.tar.gz
--2009-07-10 00:29:37--  http://heme.elem.com/tmp/li4Ewiak/HTTP-4000.0.4.tar.gz
Resolving heme.elem.com... 69.13.230.185
Connecting to heme.elem.com|69.13.230.185|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 44063 (43K) [application/x-gzip]
Saving to: `HTTP-4000.0.4.tar.gz.2'

100%[================================================================================================================================>] 44,063      --.-K/s   in 0.09s   

2009-07-10 00:29:38 (463 KB/s) - `HTTP-4000.0.4.tar.gz.2' saved [44063/44063]

in reply to: ↑ 1 ; follow-up: ↓ 6   Changed 5 years ago by valery.vv@…

Replying to michaeldever:

I took a look at this, while I don't have a solution. I think it has something to do with the recent addition of lazy stream reading. Is it possibly too lazy? :) Again, even with the addition, it all seems to flow in a *similar* manner. diffing the two libraries (.6 and .7 doesn't show too many changes.) But one thing worth noting is that in socketConnection in .7, it is setting the connInput of the HandleStream? to be Nothing.. I tried calling socketConnection_ hst sock True, but that made no difference. I can't see anywhere that it might be going wrong.

I do experience this problem (in both versions I tried: 4000.0.6 and 4000.0.8) and I've prepared a tiny POC module:

http-proxy-fail-poc.hs

in reply to: ↑ 5   Changed 5 years ago by valery.vv@…

Replying to vvv:

Replying to michaeldever:

I took a look at this, while I don't have a solution. I think it has something to do with the recent addition of lazy stream reading. Is it possibly too lazy? :)

I do experience this problem (in both versions I tried: 4000.0.6 and 4000.0.8) and I've prepared a tiny POC module: http-proxy-fail-poc.hs

The proof of concept module begins downloading complete files once strict ByteStrings are used:

$ time ./get_strict http://grml.deb.at/grml-small_2009.05.iso
Content-Length:   95746048
bytes downloaded: 95746048

real    2m58.560s
user    0m0.136s
sys     0m1.124s

$ time ./get_lazy http://grml.deb.at/grml-small_2009.05.iso
Content-Length:   95746048
bytes downloaded: 43140
get_lazy: user error (sizes differ)

real    0m0.313s
user    0m0.000s
sys     0m0.004s

See http-proxy-strict-poc.hs.

AFAICS, this is cabal-install's bug: HTTP has general API (rspBody :: Response a -> a), and this is cabal-install narrowing the type (replacing a with Response {-Lazy.-}ByteString). See getHTTP function in Distribution.Client.HttpUtils.

  Changed 5 years ago by vvv

Just as Duncan suspected, this is a bug of HTTP library.

sendHTTP_notify function closes connection when it sees "Connection: close" header in HTTP response. This is just the case with proxied HTTP session.

Part of response body gets buffered (written to HTTP lib's internal buffer). HTTP server keeps sending TCP packets with remaining data... But HTTP lib closes connection: HTTP client sends FIN,ACK. HTTP server replies with ACK, FIN,ACK, abandoning data transmission.

When cabal-install is about to read response body, it can have as much of it as there were buffered prior to disconnection.


I have disabled connection closing code in sendHTTP_notify and now cabal-install gets data over proxy with no problem. Here is the change:

sendHTTP_notify :: HStream ty
                => HandleStream ty
		-> Request ty
		-> IO ()
		-> IO (Result (Response ty))
sendHTTP_notify conn rq onSendComplete = do
  rsp <- catchIO (sendMain conn rq onSendComplete)
                 (\e -> do { close conn; ioError e })
--   let fn list = when (or $ map findConnClose list)
--                      (close conn)
--   either (\_ -> fn [rqHeaders rq])
--          (\r -> fn [rqHeaders rq,rspHeaders r])
--          rsp
  return rsp

Why sendHTTP_notify?

cabal-install's getHTTP calls browse function of HTTP library. And the argument has got request "action" within:

getHTTP :: Verbosity -> URI -> IO (Result (Response ByteString))
getHTTP verbosity uri = do
                 p   <- proxy verbosity
                 let req = mkRequest uri
                 (_, resp) <- browse $ do
                                setErrHandler (warn verbosity . ("http error: "++))
                                setOutHandler (debug verbosity)
                                setProxy p
                                request req
                 return (Right resp)

The latter "calls" request'... and so on. See the backtrace:

  Changed 5 years ago by sof

  • status changed from new to closed
  • resolution set to fixed

This long-standing bug is allegedly solved by HTTP-4000.0.9 changes. Closing this one as a result...if it re-surfaces or hasn't been fully addressed, let's open a new ticket. (assuming we can purge the spam first...)

Note: See TracTickets for help on using tickets.