The streaming-bytestring library gives an error after printing about 512 bytes.
Error:
openBinaryFile: resource exhausted (Too many open files)
Code:
import Control.Monad.Trans (lift, MonadIO)
import Control.Monad.Trans.Resource (runResourceT, MonadResource, MonadUnliftIO, ResourceT, liftResourceT)
import qualified Data.ByteString.Streaming as BSS
import qualified Data.ByteString.Streaming.Char8 as BSSC
import System.TimeIt
main :: IO ()
main = timeIt $ runResourceT $ dump $ BSS.drop 24 $ BSS.readFile "filename"
dump :: MonadIO m => BSS.ByteString m r -> m ()
dump bs = do
    isEmpty <- BSS.null_ bs
    if isEmpty then return ()
    else do
        BSSC.putStr $ BSS.take 1 bs
        dump $ BSS.drop 1 bs
When working with streaming libraries, it's usually a bad idea to reuse an effectful stream. That is, you can apply a function like `drop` or `splitAt` to a stream and then continue working with the resulting stream, or you can consume the stream as a whole with a function like `fold`, which leaves you in the base monad. But you should never pass the same stream value to two different functions.

Sadly, the Haskell type system as it stands cannot enforce that restriction at compile time; doing so would require some form of linear types. Instead, it becomes the responsibility of the user.
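As a minimal sketch of that single-use discipline applied to the question's `dump`, the version below asks the stream for one byte and recurses only on the remainder it gets back, so `bs` is never touched twice. It assumes `nextByte` has the type `ByteString m r -> m (Either r (Word8, ByteString m r))` in the installed version of `Data.ByteString.Streaming`; check the signature before relying on it.

```haskell
import Control.Monad.IO.Class (MonadIO, liftIO)
import Control.Monad.Trans.Resource (runResourceT)
import qualified Data.ByteString as B
import qualified Data.ByteString.Streaming as BSS

main :: IO ()
main = runResourceT $ dump $ BSS.drop 24 $ BSS.readFile "filename"

-- Every stream value is consumed exactly once: nextByte takes bs and
-- returns either the final result or one byte plus the rest of the stream.
-- Assumption: nextByte returns Either r (Word8, ByteString m r).
dump :: MonadIO m => BSS.ByteString m r -> m r
dump bs = do
    step <- BSS.nextByte bs
    case step of
        Left r -> return r                      -- end of stream
        Right (w, rest) -> do
            liftIO $ B.putStr (B.singleton w)   -- print the single byte
            dump rest                           -- continue with the remainder only
```

Writing one byte at a time is slow and is kept here only to mirror the original code; the point is that `rest`, not `bs`, is what flows into the next iteration.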
The `null_` function is perhaps a wart in the streaming-bytestring API, because it doesn't return a new stream along with the result, which gives the impression that stream reuse is normal throughout the API. It would be better if it had a signature like `null_ :: ByteString m r -> m (Bool, ByteString m r)`.

Similarly, don't use `drop` and `take` with the same stream value. Instead, use `splitAt` or `uncons` and work with the divided result.

So, about the error. As @BobDalgleish mentions in the comments, the file is opened when `null_` is invoked (it is the first time we "demand" something from the stream). In the recursive call we pass the original `bs` value again, so the file is opened again, once per iteration, until we hit the file handle limit.

Personally, I'm not a fan of using `ResourceT` with streaming libraries. I prefer opening the file with `withFile` and then creating and consuming the stream within the callback, if possible. But some things are more difficult that way.
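A rough sketch of the `withFile` approach, assuming `fromHandle` and `stdout` from `Data.ByteString.Streaming` do what their names suggest (build a stream from an already open handle, and write a stream to standard output). Since nothing in the question inspects individual bytes, the recursive `dump` is replaced here by a single pass over the stream.

```haskell
import System.IO (IOMode (ReadMode), withFile)
import qualified Data.ByteString.Streaming as BSS

main :: IO ()
main =
    -- The handle is opened once and closed when the callback returns,
    -- so the stream must be fully consumed inside the callback.
    withFile "filename" ReadMode $ \h ->
        BSS.stdout $ BSS.drop 24 $ BSS.fromHandle h
```

The stream is created, transformed, and consumed in one expression, so there is no opportunity to reuse it, and only one file handle is ever open.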