How to implement data streaming with the Snap framework?


I'd like to implement streaming of large data (in both directions) with the Snap server. To explore the possibilities, I created a sample program with two endpoints, one for reading and one for writing. A very simple internal buffer holds a single ByteString, and whatever is written to the writing endpoint appears at the reading one. (Currently there is no way to terminate the stream, but that's fine for this purpose.)

{-# LANGUAGE OverloadedStrings #-}
import Control.Applicative
import Control.Concurrent.MVar.Lifted
import Control.Monad
import Data.ByteString (ByteString)
import Blaze.ByteString.Builder (Builder, fromByteString)
import Data.Enumerator
import qualified Data.Enumerator.List as E
import Data.Enumerator.Binary (enumFile, iterHandle)
import Snap.Core
import Snap.Http.Server

main :: IO ()
main = do
  buf <- newEmptyMVar
  quickHttpServe (site buf)

site :: MVar ByteString -> Snap ()
site buf =
    route [ ("read", modifyResponse (setBufferingMode False
                                     . setResponseBody (fromBuf buf)))
          , ("write", runRequestBody (toBuf buf))
          ]

fromBuf :: MVar ByteString -> Enumerator Builder IO a
fromBuf buf = E.repeatM (liftM fromByteString $ takeMVar buf)

toBuf :: MVar ByteString -> Iteratee ByteString IO ()
toBuf buf = E.mapM_ (putMVar buf)
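
The one-slot buffer used above can be illustrated in isolation. The following is only a sketch using base's Control.Concurrent.MVar, without Snap or enumerators; the helper names writeChunks and readChunks are hypothetical and exist just for this illustration. It shows that putMVar blocks while the slot is full and takeMVar blocks while it is empty, so a single writer is throttled to the reader's pace and chunks arrive in order.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (MVar, newEmptyMVar, putMVar, takeMVar)
import Control.Monad (forM_, replicateM)

-- Hypothetical helper: push every chunk through the one-slot buffer.
-- putMVar blocks while the slot is still full.
writeChunks :: MVar a -> [a] -> IO ()
writeChunks buf chunks = forM_ chunks (putMVar buf)

-- Hypothetical helper: take exactly n chunks from the buffer.
-- takeMVar blocks while the slot is empty.
readChunks :: MVar a -> Int -> IO [a]
readChunks buf n = replicateM n (takeMVar buf)

main :: IO ()
main = do
  buf <- newEmptyMVar
  -- The writer runs in its own thread, like the /write handler would.
  _ <- forkIO (writeChunks buf ["chunk1", "chunk2", "chunk3" :: String])
  -- The reader plays the role of the /read handler.
  received <- readChunks buf 3
  print received  -- prints ["chunk1","chunk2","chunk3"]
```

With several concurrent writers the interleaving of chunks would no longer be deterministic, which is worth keeping in mind if more than one client posts to the writing endpoint.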

Then I run in different terminals

curl http://localhost:8000/read >/dev/null

and

dd if=/dev/zero bs=1M count=100 | \
  curl --data-binary @- http://localhost:8000/write

But the writing part fails with "an exception escaped to toplevel: Too many bytes read". This is obviously an instance of TooManyBytesReadException, but I couldn't find where it's thrown. Writing a smaller amount of data, such as 1 MB, works as expected.

My questions are:

  1. Where is the read limit imposed, and how can I change it?
  2. Will this stream the data without loading the whole POST request into memory? If not, how can I fix it?

There are 2 answers

nh2 (best answer)

It will work if you send any content type other than "application/x-www-form-urlencoded" with your request to /write, e.g.:

dd if=/dev/zero bs=1M count=100 | \
  curl -H "Content-Type: application/json" --data-binary @- http://localhost:8000/write

The relevant code in Snap does something like this (paraphrased, not the literal source):

if contentType == Just "application/x-www-form-urlencoded" then readData maximumPOSTBodySize
  where
    maximumPOSTBodySize = 10*1024*1024

and application/x-www-form-urlencoded is the Content-Type that curl sends by default with --data-binary.
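
To make the effect of that check concrete, here is a small self-contained model of it. This is a sketch under stated assumptions, not Snap's actual code: the function bodyAccepted and the exact string comparison are hypothetical, and only the 10 MiB figure is taken from the snippet above. It shows why a 1 MB upload succeeds, a 100 MB form-encoded upload is rejected, and the same 100 MB upload passes once another content type is set.

```haskell
-- Cap taken from the paraphrased snippet above: 10 MiB.
maximumPOSTBodySize :: Int
maximumPOSTBodySize = 10 * 1024 * 1024

-- Hypothetical model: form-encoded bodies are read eagerly and therefore
-- size-capped; every other content type is left for the handler to stream.
bodyAccepted :: Maybe String -> Int -> Bool
bodyAccepted (Just "application/x-www-form-urlencoded") size =
  size <= maximumPOSTBodySize
bodyAccepted _ _ = True

main :: IO ()
main = do
  let mib = 1024 * 1024
  -- 1 MiB with curl's default content type: within the cap.
  print (bodyAccepted (Just "application/x-www-form-urlencoded") (1 * mib))    -- True
  -- 100 MiB with curl's default content type: over the cap.
  print (bodyAccepted (Just "application/x-www-form-urlencoded") (100 * mib))  -- False
  -- 100 MiB with a different content type: not subject to the cap.
  print (bodyAccepted (Just "application/json") (100 * mib))                   -- True
```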

gregorycollins

To follow up on the previous answer: because forms of type application/x-www-form-urlencoded are so common, as a convenience Snap auto-decodes them for you and puts them into a parameters map in the request. The idea is similar in spirit to e.g. $_POST from PHP.

However, since these maps are read into RAM, naively decoding unbounded amounts of this data would allow an attacker to trivially DoS a server by sending it arbitrary amounts of this input until heap exhaustion. For this reason snap-server limits the amount of data it is willing to read in this way.
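
The idea of a bounded read can be sketched as follows. This is an illustration only, not snap-server's implementation: readBounded is a hypothetical helper working on a plain list of chunks, and the error string simply mirrors the message from the question. The point is that the reader gives up as soon as the running total exceeds the limit, so even an endless input cannot exhaust the heap.

```haskell
import qualified Data.ByteString.Char8 as B

-- Hypothetical helper: accumulate chunks into one ByteString, failing as
-- soon as the running total exceeds the limit. Because the chunk list is
-- consumed lazily, an infinite input is rejected after a bounded amount
-- of work.
readBounded :: Int -> [B.ByteString] -> Either String B.ByteString
readBounded limit = go 0 []
  where
    go _ acc [] = Right (B.concat (reverse acc))
    go n acc (c : cs)
      | n' > limit = Left "Too many bytes read"
      | otherwise  = go n' (c : acc) cs
      where
        n' = n + B.length c

main :: IO ()
main = do
  -- Six bytes against a ten-byte limit: accepted.
  print (readBounded 10 [B.pack "abc", B.pack "def"])  -- Right "abcdef"
  -- Endless input: rejected once more than ten bytes have been seen.
  print (readBounded 10 (repeat (B.pack "abcd")))      -- Left "Too many bytes read"
```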