Locust 'Request time' vs 'Response time' for single request


I have been working on an update of the YottaDB web framework benchmarks. I am having trouble working out the difference between Locust request times and response times. This answer implies that the response time covers all requests to that endpoint.

But these benchmarks were created by sending >10,000 requests to just one endpoint. Does that mean response time should be identical to the request time? Or should the response time cover all 10,000 requests? It seems to be neither.

My results, summarised here, show for the Lua web stack an average request time of 1422ms but a 50%ile response time of 250ms. If you want more detail, here's the benchmarking setup.

How are these two related?


1 Answer

Solowalker (Best Answer)

"Response time" is the amount of time between your user making the initial request to when a response from the server is received. Locust's http client inherits from Requests so you can research more of what that response time (sometimes called elapsed in code) means technically, but the gist is that is how long the use is waiting for a response. That includes network time but is primarily determined by the amount of time it takes your server to accept the request, do whatever work it needs to, and then send the response.

EDIT:

Your specific concern about why there's a difference between the average and the 50%ile response times comes down to what averages and percentiles are. They are not the same thing, so the numbers are expected to almost always be different (unless you have a very responsive system that you're not able to stress sufficiently and all your response times come back roughly the same).

Here's some simplified math, using your specific data, to help you understand. In a comment you mentioned yottalua's data, so we'll use that. yottalua had 21781 requests with an average response time of 1217ms. We get that by adding up the response time of every single request and dividing it by the number of requests, meaning the sum total of all response times across all requests must be around 21781*1217=26,507,477ms.
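
Restating that arithmetic in plain Python (just the numbers quoted above, nothing measured):

    # Total response time across all requests is roughly the request count
    # multiplied by the average response time.
    num_requests = 21781
    avg_response_ms = 1217
    total_response_ms = num_requests * avg_response_ms
    print(total_response_ms)  # 26507477 ms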

The %iles, though, give you more detailed information. They are saying that a large majority of your requests (somewhere between 80% and 90%) are fine, with low response times, but things start to break down from there: 10%-20% of the requests have really high response times. With this information we can approximate an average and compare it with the average Locust has already reported. Let's pick some numbers: say 15% of your requests (21781*.15=3267.15 requests, splitting the difference between 10% and 20%) have high response times of around 6000ms (roughly midway between your 80%ile of 2400ms and your 100%ile of 13000ms), and the other 85% of your requests (21781*.85=18513.85 requests) come in at 240ms (your 60%ile).

((21781*.15*6000)+(21781*.85*240))/21781 = 1104
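
The same back-of-the-envelope estimate in Python; the 15%/85% split and the 6000ms/240ms buckets are assumptions chosen to make the arithmetic easy, not measured values:

    # Weighted-average approximation mirroring the formula above.
    num_requests = 21781
    slow_share, slow_ms = 0.15, 6000   # assumed slow bucket
    fast_share, fast_ms = 0.85, 240    # assumed fast bucket

    approx_avg_ms = (num_requests * slow_share * slow_ms
                     + num_requests * fast_share * fast_ms) / num_requests
    print(approx_avg_ms)  # ~1104 ms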

Again, this is simplified and was arrived at by picking round numbers to make the math easier, but it gets us in the ballpark of the reported average (1104ms vs 1217ms).

You may have a majority of requests with a low response time (e.g. 80% of 21781, or 17,424.8 requests at 290ms), but your minority of requests with a really high response time (e.g. 20% of 21781, or 4,356 requests at 2400ms-13000ms) will skew the average higher than you might expect. Both your percentiles and your average are correct even though they are different; they're just different ways of looking at the body of data that is your test results.
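
Here's a toy illustration with synthetic numbers in roughly that 80/20 shape (not the real benchmark data) showing how a slow minority drags the mean up while the median stays where the bulk of the requests sit:

    import statistics

    # Synthetic 80/20 distribution: a fast majority and a slow minority.
    fast = [290] * 80    # 80% of requests at ~290 ms
    slow = [6000] * 20   # 20% of requests at ~6000 ms
    times = fast + slow

    print(statistics.mean(times))    # 1432 -- the mean is pulled up by the slow tail
    print(statistics.median(times))  # 290 -- the 50%ile sits in the fast bulk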