Large file download with Play framework


I have sample download code that works fine when the file is not zipped. Because I know the file length and provide it, I think Play does not have to load the whole file into memory while streaming. The code below works:

def downloadLocalBackup() = Action {
  var pathOfFile = "/opt/mydir/backups/big/backup"
  val file = new java.io.File(pathOfFile)
  val path: java.nio.file.Path = file.toPath
  val source: Source[ByteString, _] = FileIO.fromPath(path)
  logger.info("from local backup set the length in header as "+file.length())
  Ok.sendEntity(HttpEntity.Streamed(source, Some(file.length()), Some("application/zip"))).withHeaders("Content-Disposition" -> s"attachment; filename=backup")
}

I don't understand how streaming in the case above handles the difference in speed between disk reads (which are faster) and the network, but it never runs out of memory, even for large files. The code below, which uses a ZipOutputStream, does run out of memory on the same 3 GB file, and I am not sure why.

def downloadLocalBackup2() = Action {
  var pathOfFile = "/opt/mydir/backups/big/backup"
  val file = new java.io.File(pathOfFile)
  val path: java.nio.file.Path = file.toPath
  val enumerator = Enumerator.outputStream { os =>
    val zipStream = new ZipOutputStream(os)
    zipStream.putNextEntry(new ZipEntry("backup2"))
    val is = new BufferedInputStream(new FileInputStream(pathOfFile))
    val buf = new Array[Byte](1024)
    var len = is.read(buf)
    var totalLength = 0L;
    var logged = false;
    while (len >= 0) {
      zipStream.write(buf, 0, len)
      len = is.read(buf)
      if (!logged) {
        logged = true;
        logger.info("logging the while loop just one time")
      }
    }
    is.close

    zipStream.close()
  }
  logger.info("log right before sendEntity")
  val kk = Ok.sendEntity(HttpEntity.Streamed(Source.fromPublisher(Streams.enumeratorToPublisher(enumerator)).map(x => {
    val kk = Writeable.wByteArray.transform(x); kk
  }),
    None, Some("application/zip"))
  ).withHeaders("Content-Disposition" -> s"attachment; filename=backupfile.zip")
  kk
}

Answer by marcospereira:

In the first example, Akka Streams handles all details for you. It knows how to read the input stream without loading the complete file in memory. That is the advantage of using Akka Streams as explained in the docs:

The way we consume services from the Internet today includes many instances of streaming data, both downloading from a service as well as uploading to it or peer-to-peer data transfers. Regarding data as a stream of elements instead of in its entirety is very useful because it matches the way computers send and receive them (for example via TCP), but it is often also a necessity because data sets frequently become too large to be handled as a whole. We spread computations or analyses over large clusters and call it “big data”, where the whole principle of processing them is by feeding those data sequentially—as a stream—through some CPUs.

...

The purpose [of Akka Streams] is to offer an intuitive and safe way to formulate stream processing setups such that we can then execute them efficiently and with bounded resource usage—no more OutOfMemoryErrors. In order to achieve this our streams need to be able to limit the buffering that they employ, they need to be able to slow down producers if the consumers cannot keep up. This feature is called back-pressure and is at the core of the Reactive Streams initiative of which Akka is a founding member.

In the second example, you are handling the input and output streams yourself, using the standard blocking API. I'm not 100% sure how writing to a ZipOutputStream behaves here, but it is possible that the writes are not flushed and everything accumulates in memory until close.
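One way to check that guess outside Play is to wrap a ZipOutputStream around a stream that only counts bytes, and look at how much has reached it before close. This is a minimal sketch using only the JDK; `CountingOutputStream` and `ZipFlushCheck` are hypothetical names made up for the illustration, and the random input is an assumption chosen so the data does not compress away.

```scala
import java.io.OutputStream
import java.util.zip.{ZipEntry, ZipOutputStream}
import scala.util.Random

// Hypothetical helper: counts the bytes that reach the underlying stream and discards them.
final class CountingOutputStream extends OutputStream {
  var count: Long = 0L
  override def write(b: Int): Unit = count += 1
  override def write(b: Array[Byte], off: Int, len: Int): Unit = count += len
}

object ZipFlushCheck {
  // Writes `totalBytes` of random data through a ZipOutputStream in 1 KB chunks
  // and returns (bytes emitted before close, bytes emitted after close).
  def bytesEmitted(totalBytes: Int): (Long, Long) = {
    val sink = new CountingOutputStream
    val zip = new ZipOutputStream(sink)
    zip.putNextEntry(new ZipEntry("backup2"))
    val rnd = new Random(42) // fixed seed so runs are repeatable
    val buf = new Array[Byte](1024)
    var written = 0
    while (written < totalBytes) {
      rnd.nextBytes(buf)
      zip.write(buf, 0, buf.length)
      written += buf.length
    }
    val beforeClose = sink.count
    zip.close()
    (beforeClose, sink.count)
  }
}
```

If the count before close is already near the input size, ZipOutputStream is passing data through as it goes, and the memory growth would have to come from whatever sits behind the OutputStream. In that case it may be worth checking the documentation of Enumerator.outputStream, since I believe its writes do not block when the consumer is slow.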

The good news is that you don't need to handle this manually, since Akka Streams provides a way to gzip a Source of ByteStrings (note that this produces a gzip stream rather than a zip archive):

import javax.inject.Inject

import akka.util.ByteString
import akka.stream.scaladsl.{Compression, FileIO, Source}

import play.api.http.HttpEntity
import play.api.mvc.{BaseController, ControllerComponents}

class FooController @Inject()(val controllerComponents: ControllerComponents) extends BaseController {

  def download = Action {
    val pathOfFile = "/opt/mydir/backups/big/backup"
    val file = new java.io.File(pathOfFile)
    val path: java.nio.file.Path = file.toPath
    val source: Source[ByteString, _] = FileIO.fromPath(path)
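    // Compress on the fly; the file is still read chunk by chunk.
    val gzipped = source.via(Compression.gzip)
    // The compressed size is not known in advance, so no content length is declared (None).
    // The output is gzip rather than zip, so the content type and file name reflect that.
    Ok.sendEntity(HttpEntity.Streamed(gzipped, None, Some("application/gzip"))).withHeaders("Content-Disposition" -> "attachment; filename=backup.gz")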
  }

}
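
One thing to watch when compressing on the fly: the number of bytes actually sent is not known up front and differs from the size of the file on disk, so the original file length is not a valid content length for the gzipped body. The sketch below illustrates this with the JDK's GZIPOutputStream on a small in-memory array. `GzipLength` is a hypothetical name, and buffering everything in memory like this is only for the demonstration, not something to do with a 3 GB file.

```scala
import java.io.ByteArrayOutputStream
import java.util.zip.GZIPOutputStream

object GzipLength {
  // Gzips a byte array fully in memory (illustration only).
  def gzip(data: Array[Byte]): Array[Byte] = {
    val bos = new ByteArrayOutputStream()
    val gz = new GZIPOutputStream(bos)
    gz.write(data)
    gz.close() // close writes the gzip trailer
    bos.toByteArray
  }

  def main(args: Array[String]): Unit = {
    val input = Array.fill[Byte](100000)('a'.toByte) // highly repetitive data
    val output = gzip(input)
    println(s"original: ${input.length} bytes, gzipped: ${output.length} bytes")
  }
}
```

A client that is told the uncompressed length but receives fewer bytes will typically wait for the rest or report a truncated download.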