I have a sample download code that works fine if the file is not zipped because I know the length and when I provide, it I think while streaming play does not have to bring the whole file in memory and it works. The below code works
def downloadLocalBackup() = Action {
var pathOfFile = "/opt/mydir/backups/big/backup"
val file = new java.io.File(pathOfFile)
val path: java.nio.file.Path = file.toPath
val source: Source[ByteString, _] = FileIO.fromPath(path)
logger.info("from local backup set the length in header as "+file.length())
Ok.sendEntity(HttpEntity.Streamed(source, Some(file.length()), Some("application/zip"))).withHeaders("Content-Disposition" -> s"attachment; filename=backup")
}
I don't know how the streaming in above case takes care of the difference in speed between disk reads(Which are faster than network). This never runs out of memory even for large files. But when I use the below code, which has zipOutput stream I am not sure of the reason to run out of memory. Somehow the same 3GB file when I try to use with zip stream, is not working.
def downloadLocalBackup2() = Action {
var pathOfFile = "/opt/mydir/backups/big/backup"
val file = new java.io.File(pathOfFile)
val path: java.nio.file.Path = file.toPath
val enumerator = Enumerator.outputStream { os =>
val zipStream = new ZipOutputStream(os)
zipStream.putNextEntry(new ZipEntry("backup2"))
val is = new BufferedInputStream(new FileInputStream(pathOfFile))
val buf = new Array[Byte](1024)
var len = is.read(buf)
var totalLength = 0L;
var logged = false;
while (len >= 0) {
zipStream.write(buf, 0, len)
len = is.read(buf)
if (!logged) {
logged = true;
logger.info("logging the while loop just one time")
}
}
is.close
zipStream.close()
}
logger.info("log right before sendEntity")
val kk = Ok.sendEntity(HttpEntity.Streamed(Source.fromPublisher(Streams.enumeratorToPublisher(enumerator)).map(x => {
val kk = Writeable.wByteArray.transform(x); kk
}),
None, Some("application/zip"))
).withHeaders("Content-Disposition" -> s"attachment; filename=backupfile.zip")
kk
}
In the first example, Akka Streams handles all details for you. It knows how to read the input stream without loading the complete file in memory. That is the advantage of using Akka Streams as explained in the docs:
At the second example, you are handling the input/output streams by yourself, using the standard blocking API. I'm not 100% sure about how writing to a
ZipOutputStream
works here, but it is possible that it is not flushing the writes and accumulating everything beforeclose
.Good thing is that you don't need to handle this manually since Akka Streams provides a way to gzip a
Source
ofByteString
s: