I have a @RestController web service method that might block the response thread with a long-running service call, as follows:

@RestController
public class MyRestController {
    //could be another webservice api call, a long running database query, whatever
    private SomeSlowService service;

    public Response get() {
        // blocks the request thread until the slow call returns
        return service.slow();
    }
}

Problem: what if X users call my service at the same time? The executing threads will all block until the response is returned, eating up "max-connections", max threads, etc.

I remember reading an article some time ago on how to solve this issue by somehow parking threads until the slow service response is received, so that those threads don't exhaust e.g. the Tomcat max connections or thread pool.

But I cannot find it anymore. Maybe somebody knows how to solve this?

2 Answers

g00glen00b On

There are a few solutions, such as working with asynchronous requests. In those cases, the request thread becomes free again as soon as the CompletableFuture, DeferredResult, Callable, ... is returned (not necessarily completed).

For example, let's say we configure Tomcat like this:

# Default = 200
server.tomcat.max-threads=5

And we have the following controller:

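public CompletableFuture<String> getSlowBar() {
    return CompletableFuture.supplyAsync(() -> {
        // slow work taking about 10 seconds runs here, off the request thread
        return "Bar";
    });
}

public String getSlowBaz() {
    // the same slow work runs here, on the Tomcat request thread
    return "Baz";
}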

If we fired 100 requests at once, and each call takes about 10 seconds, you would have to wait at least 200 seconds before all the blocking getSlowBaz() calls are handled, since only 5 can be handled at a given time (100 / 5 × 10 seconds). With the asynchronous getSlowBar() on the other hand, you would have to wait at least 10 seconds, because the request thread is released as soon as the future is returned, so all requests will likely be accepted at once, provided the pool running the futures has enough threads.
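To make the arithmetic above concrete, here is a rough, JDK-only sketch that simulates blocking calls on a small fixed pool. It does not involve Tomcat or Spring at all; the class name BlockingPoolDemo, the method runAll and the scaled-down timings (milliseconds instead of seconds) are my own assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class BlockingPoolDemo {
    // Runs `tasks` blocking calls of `millisPerTask` each on a pool of `threads`
    // threads and returns the elapsed wall-clock time in milliseconds.
    static long runAll(int threads, int tasks, long millisPerTask) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        List<Callable<Void>> work = new ArrayList<>();
        for (int i = 0; i < tasks; i++) {
            work.add(() -> {
                Thread.sleep(millisPerTask); // stands in for the slow service call
                return null;
            });
        }
        long start = System.nanoTime();
        try {
            pool.invokeAll(work); // returns once every task has finished
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            pool.shutdown();
        }
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) {
        // Scaled down from the numbers above: 100 calls, 5 threads, 10 ms instead of 10 s.
        // With 5 threads the calls run in 20 rounds, so this should take about 200 ms or more.
        System.out.println(runAll(5, 100, 10) + " ms");
    }
}
```

Raising the thread count to 100 in the same call should bring the elapsed time down to roughly one round, which is the effect the asynchronous variant aims for on the request threads.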

Is there a difference between CompletableFuture, Callable and DeferredResult? Result-wise there isn't any difference; they all behave similarly.

The way you have to handle threading is a bit different though:

  • With Callable, you rely on Spring executing the Callable using a TaskExecutor
  • With DeferredResult you have to do the thread handling yourself, for example by executing the logic within the ForkJoinPool.commonPool().
  • With CompletableFuture, you can either rely on the default thread pool (ForkJoinPool.commonPool()) or you can specify your own thread pool.
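As a small illustration of the last bullet, this is roughly what the two CompletableFuture options look like with plain JDK calls. The class name FuturePools, the pool name "slow-pool" and the pool size of 20 are hypothetical choices of mine, not something prescribed by Spring or Tomcat.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class FuturePools {
    // Hypothetical dedicated pool for slow calls; the size 20 is an arbitrary assumption.
    static final ExecutorService SLOW_POOL = Executors.newFixedThreadPool(20, r -> {
        Thread t = new Thread(r, "slow-pool");
        t.setDaemon(true); // do not keep the JVM alive just for this pool
        return t;
    });

    // No executor given: the JDK normally runs this on ForkJoinPool.commonPool().
    static CompletableFuture<String> onDefaultPool() {
        return CompletableFuture.supplyAsync(() -> Thread.currentThread().getName());
    }

    // Executor given: the supplier runs on our own pool instead.
    static CompletableFuture<String> onOwnPool() {
        return CompletableFuture.supplyAsync(() -> Thread.currentThread().getName(), SLOW_POOL);
    }

    public static void main(String[] args) {
        System.out.println(onDefaultPool().join());
        System.out.println(onOwnPool().join());
    }
}
```

A dedicated pool is probably the safer choice for blocking work, since the common pool is shared across the whole JVM and is sized by the number of cores.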

Other than that, CompletableFuture and Callable are part of the standard Java library (java.util.concurrent), while DeferredResult is part of the Spring framework.

Be aware though: even though threads are released, connections to the client are still kept open. This means that with both approaches, the maximum number of requests that can be handled at once is limited by max-connections, which is 10000 by default and can be configured with:

# Default = 10000
server.tomcat.max-connections=100

jin On

In my opinion, async may be better for the server in general, but for this particular API it does not work well: the clients still hold their connections, so eventually it will eat up "max-connections". Instead, you can send the request to a message queue (Kafka) and return success to the client right away. A consumer then picks up the request and passes it to the slow service.
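A minimal sketch of that hand-off idea, using an in-memory queue from the JDK as a stand-in for Kafka so that it runs without any broker. Everything here (the class QueueHandOff, the methods accept and consumeOne, and the placeholder slowService) is hypothetical and only meant to show the shape of the flow, not a real Kafka integration.

```java
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.LinkedBlockingQueue;

public class QueueHandOff {
    // In-memory stand-in for the message queue (Kafka in the answer above).
    private final BlockingQueue<String> queue = new LinkedBlockingQueue<>();
    // Results of the slow service, recorded so they can be inspected later.
    final List<String> processed = new CopyOnWriteArrayList<>();

    // What the request handler would do: enqueue and return right away,
    // so the client connection can be closed immediately.
    String accept(String request) {
        queue.add(request);
        return "accepted";
    }

    // What a separate consumer would do: take one pending request, if any,
    // and run the slow call on it. Returns false when the queue is empty.
    boolean consumeOne() {
        String request = queue.poll();
        if (request == null) {
            return false;
        }
        processed.add(slowService(request));
        return true;
    }

    // Placeholder for the slow service; a real one would block for seconds.
    private String slowService(String request) {
        return "done:" + request;
    }

    public static void main(String[] args) {
        QueueHandOff handOff = new QueueHandOff();
        System.out.println(handOff.accept("r1")); // the client gets its answer here
        handOff.consumeOne();                     // the slow work happens later
        System.out.println(handOff.processed);
    }
}
```

Note that with this design the client only learns that the request was accepted; it needs some other way (polling, a callback, a notification) to get the actual result of the slow call.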