This is basically a probability theory question, but I am so rusty that I can't seem to wrap my head around where to start.
I have a pool of Z workers. Each worker has a limit, X, for how many tasks it must perform before it is recycled. Each new task is grabbed by a worker at random (1/Z
chance of a worker grabbing it).
After Y tasks has been given, what is the odds that one worker has reached the X threshold?
I wish to calculate it because I need to perform a "cleanup" periodically, and instead of picking some number at random, I would like it to be related to the number of workers and their task limit (which I may tweak at a later point), so I'll do the clean-up once there's a 25-30-35% chance that one worker is done for.
Consider worker
W
. Each time a task enters the system, there is a1/Z
chance thatW
grabs the task. A total ofY
tasks enter the system.Y
tasks, each grabbed with probability1/Z
: the number of grabbed tasks,G_W
, is a random variable with a binomial distribution.The probability worker
W
has reached theX
threshold afterY
tasks isP(G_W >= X)
.You can easily approximate the distribution and this probability by using a normal distribution with mean
Y/Z
and varianceY(Z-1)/Z^2
. Then you can use a function that evaluates a normal distribution's cumulative distribution, which is available in mostly every programming language.