I have a RESTful-styled RPC (remote procedure call) API running on a tomcat server that processes data of N users with M tasks on K threads. Mostly one user has around 20 to 500 tasks (but M could be between 1 to 5000). One task needs around 10 to 20 seconds to complete, but can be between 1 second and 20 minutes. Currently, mostly the system has one user, sometimes up to three, but it increases to around 10 users at the same time in the near future. Our server has 10 cores, therefore I'd like to use 10 threads. At the moment every user gets 5 threads for processing, which works fine. But a) most of the time the machine is only utilized 50% (which results in needles waiting in the "30-minute" range), sometimes the server load is up to 150%.
Requirements to solution:
- at all times the server is utilized to 100% (if there are tasks)
- that all users are treated the same regarding thread execution (same amount of threads finished as every other user)
- a new user does not have to wait until all tasks of a earlier user are done (especially in the case where user1 has 5000 tasks and user2 has 1 this is important)
Solutions that come to mind:
just use a FixedThreadPoolExecutor with 10 threads, violates condition 3
use the PriorityBlockingQueue and implement the compareTo method in my task -> can not use the threadpoolExecutors submit method (and therefore I do not know when a submitted task is over)
implement a "round robin" like blocking queue, where the K threads (in our case 10) take new tasks from the N internal queues in a round robin way -> to be able to put a task into the right queue, I need a "submit"-method that takes more than one parameter (I need to implement a ThreadPoolExecutor, too)
I tried to make an illustration of what I mean by round robin like blocking queue (if not helpful feel free to edit it out):
-- -- -- -- -- -- queue task load, -- -- -- -- -- -- -- one task denoted by -- -- -- -- -- -- -- -- -- | Q1 | Q2 | Q3 | Q4 | Q5 | Q6 | Q7 | QN | | * ^ | | last| |next | | ------------- \ / \ | | | | | | T1 | T2 | T3 | T4 | TK |
Is there an elegant solution to use mostly Java standard APIs (or any other widespread Java API) for achieving this kind of processing behavior (might it be one of my proposed solutions or any another solution)? Or do you have any other hints on how to tackle this issue?
If you agree that minimizing the overall task latency is a good replacement for requirements 2 and 3, and you have good-enough task runtime estimates, then I may have an answer.
You store the task submit time with each task, so that later you can always compute its estimated latency. You can then build a PriorityBlockingQueue that, when inserting a new task, always inserts it at a queue position that provides some fairness and attempts to minimize overall latency. This will, at first, put long-running tasks at a disadvantage. I have not tried it myself, but I would assign a task priority based on your estimated run time, and use estimatedRuntime-waitingTime as priority (taking the lowest-priotity job first). This will give heavy tasks a chance after they waited enough to have negative priority. Until then, light tasks will have a better chance to be first, even if they have just been submitted. This scheduling will only be so fair as your estimates allow, though.
As for the round-robin requirement: If that is really important, you can handle this in the queue as well. Basically, when you use a thread pool, you can implement yur scheduling strategy in terms of where you insert new jobs in the queue. If you can estimate job latency, you can balance that across your users, too, by inserting at the right position.