Stream content from SQL/DAO to Browser in a File using Spring


So I have this problem where I need to return huge SQL results (millions of rows) to the client as a file. To break the problem down, I split it into two parts:

  1. Streaming SQL results from the DAO to the Service layer without hitting an OOM error.
  2. Streaming the acquired data (essentially a Stream<?>) from the DAO to the client as a file.

Tech stack: Spring MVC, Java 11, PostgreSQL, JdbcTemplate, Hibernate.

To solve the first point, I used JdbcTemplate, because some resources mentioned that Hibernate's ScrollableResults is not memory-friendly with PostgreSQL. JdbcTemplate.queryForStream() seemed like the right option.

    String query = "select name from usersTable limit 1000000";
    jdbcTemplate.setFetchSize(10_000);
    
    Stream<String> revs = jdbcTemplate.queryForStream(query,
        (resultSet, rowNum) ->
            resultSet.getString("name"));


    return ResponseEntity
        .status(HttpStatus.OK)
        .contentType(MediaType.valueOf(MediaType.MULTIPART_FORM_DATA_VALUE))
        .body(new InputStreamResource(IOUtils.toInputStream(String.join("\n", revs.collect(Collectors.toList())))));

Resource: https://jvmaware.com/streaming-json-response/

To solve the second point, I am using ResponseEntity, which is what almost everyone recommends.

The API seems to be sending the file back as expected.

But when I look at jconsole, I see heap memory keep rising until the file is sent, and at the moment the file is sent to the browser there is an even bigger spike. I expected the stream to be flushed implicitly through the InputStream and not to cause any high memory usage.

Can someone please point out the proper way to do both of these things?

Any help is appreciated.

Thanks

I tried JdbcTemplate.queryForStream() but haven't yet had the opportunity to verify whether it works, as I'm focusing right now on streaming big results from my Service layer.

I tried writing to HttpServletResponse.getOutputStream(), but got the same memory spike.

I assumed I had to flush the outputStream once I wrote into it, but this error came up:

2024-02-29 20:14:23,775 ERROR [com.sample.controller.exception.ApplicationExceptionHandler] (default task-2) Exception Occurred : org.springframework.http.converter.HttpMessageNotWritableException: No converter for [class com.sample.service.impl.UserServiceImpl$$Lambda$858/0x0000000801578840] with preset Content-Type 'multipart/form-data'

Code for the same:

    String query = "select name from usersTable limit 1000000";
    jdbcTemplate.setFetchSize(10_000);
    
    Stream<String> revs = jdbcTemplate.queryForStream(query,
        (resultSet, rowNum) ->
            resultSet.getString("name"));


    StreamingResponseBody responseBody = httpResponseOutputStream ->
        revs.forEachOrdered(row -> {
          try {
            IOUtils.copy(IOUtils.toInputStream(row), httpResponseOutputStream);
            httpResponseOutputStream.flush();
          } catch (IOException e) {
            throw new RuntimeException(e);
          }
        });

    return ResponseEntity
        .status(HttpStatus.OK)
        .cacheControl(CacheControl.noCache())
        .contentType(MediaType.valueOf(MediaType.MULTIPART_FORM_DATA_VALUE))
        .body(responseBody);

Not sure if I am missing any configuration.


1 Answer

Answer from StormySalad:

The solution to this was going back to old-school JDBC.

  1. First, to solve the DAO data-collection problem, get a Connection from your DataSource.
  2. Set autoCommit to false before executing the query. (Probably a hack, but from what I understood, if autoCommit is true the driver loads the entire ResultSet and disregards the fetch size.)
  3. Create your query as a Statement/PreparedStatement/...
  4. Set the fetch size on your statement to whatever your application performs best with (50,000 worked for me).
  5. Execute the query, which fetches a ResultSet.
  6. Loop over the ResultSet and keep writing each row (after any transformation if needed; transform it to a String or any byte[]-able object).
  7. Write each transformed object to the ServletOutputStream of the HttpServletResponse.

That's it!!
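
Putting those steps together, here is a minimal sketch (not the exact code from the answer). The controller class, endpoint path, CSV content type, table, and column names are illustrative assumptions; the autoCommit handling and fetch size follow the steps above.

    import java.io.IOException;
    import java.nio.charset.StandardCharsets;
    import java.sql.Connection;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;
    import java.sql.SQLException;

    import javax.servlet.ServletOutputStream;
    import javax.servlet.http.HttpServletResponse;
    import javax.sql.DataSource;

    import org.springframework.web.bind.annotation.GetMapping;
    import org.springframework.web.bind.annotation.RestController;

    @RestController
    public class UserExportController {

      private final DataSource dataSource;

      public UserExportController(DataSource dataSource) {
        this.dataSource = dataSource;
      }

      @GetMapping("/users/export")
      public void exportUsers(HttpServletResponse response) throws IOException, SQLException {
        response.setContentType("text/csv");
        response.setHeader("Content-Disposition", "attachment; filename=\"users.csv\"");

        // Steps 1-5: connection, autoCommit off, statement, fetch size, execute.
        // try-with-resources closes the connection, statement, and result set.
        try (Connection conn = dataSource.getConnection()) {
          conn.setAutoCommit(false); // needed for the fetch size to take effect on PostgreSQL
          try (PreparedStatement ps = conn.prepareStatement("select name from usersTable")) {
            ps.setFetchSize(50_000);
            ServletOutputStream out = response.getOutputStream();
            try (ResultSet rs = ps.executeQuery()) {
              // Steps 6-7: transform each row and write it straight to the response.
              while (rs.next()) {
                out.write(rs.getString("name").getBytes(StandardCharsets.UTF_8));
                out.write('\n');
              }
            }
            out.flush();
          }
        }
      }
    }

Since each row is written straight to the response as it is fetched, the heap only ever holds roughly one fetch batch at a time.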

NOTE:

  • Since you write the DAO result into the outputStream, you will have to bring the outputStream object into the DAO. Check whether that is feasible for your use case (a sketch follows these notes).
  • In my testing, 2 million rows were streamed perfectly fine without OOM errors. Heap memory fluctuates as each batch is flushed implicitly by JDBC.
  • CPU usage shows small spikes. I guess that's the trade-off, and I'm fine taking it.
  • This way of streaming should not depend on your DB/driver version.
  • Use try-with-resources for easy resource management.
  • And most importantly, you don't have to write onto any secondary storage device.
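
Regarding the first note, one hypothetical shape for such a DAO (the UserExportDao name and method are assumptions for illustration, not from the original answer):

    import java.io.IOException;
    import java.io.OutputStream;

    // Hypothetical contract: the caller passes the response's OutputStream
    // down to the DAO, so rows are written as they are fetched and never
    // accumulated in memory.
    public interface UserExportDao {
      void streamAllNames(OutputStream out) throws IOException;
    }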

Cheers!