What exactly is HTTP latency in Pivotal Cloud Foundry?


I am implementing autoscaling policies for my Spring Boot app deployed in PCF. I have read that using memory is not a good idea for a Java app, because the JVM does not release memory back to the OS very often. Secondly, CPU utilisation metrics are not recommended by PCF.

So I am implementing policies using latency metrics. Now my doubt is: what exactly is HTTP latency in PCF? Is it the absolute time from when a request comes in until the response is sent? Or the time from when the request was acknowledged? Does it include the queue time before the request gets acknowledged? There is a lot of confusion; if anyone can clear it up, I can implement the autoscaling policies the right way.

PS: any other suggestions for autoscaling are welcome.

Answer from Daniel Mikusa:

Now my doubt is: what exactly is HTTP latency in PCF? Is it the absolute time from when a request comes in until the response is sent? Or the time from when the request was acknowledged? Does it include the queue time before the request gets acknowledged? There is a lot of confusion; if anyone can clear it up, I can implement the autoscaling policies the right way.

It is the full response time for a request as visible from Gorouter's point of view.

An example to explain better:

  1. A request departs your browser
  2. It typically hits one or more load balancers, which send it on to the Gorouter.
  3. Gorouter will then route the request to your application.
  4. Your app will handle the request.
  5. Gorouter will proxy the response back from the application.

The latency value used by the autoscaler is a metric exposed by the Gorouter called a TIMER metric. The timer measures the time from the point where the Gorouter received the request to the time the response was completely delivered back to the client (i.e. steps 3-5 in the example above).
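To put that to use, a latency-based rule in an App Autoscaler manifest might look roughly like the sketch below. This is only illustrative: the exact schema (field names such as rule_type and rule_sub_type, and the threshold units) depends on the autoscaler version you are running, so check the App Autoscaler docs for your platform before using it.

```yaml
---
instance_limits:
  min: 2               # never scale below 2 instances
  max: 6               # never scale above 6 instances
rules:
- rule_type: http_latency
  rule_sub_type: avg_99th   # scale on 99th-percentile latency
  threshold:
    min: 20            # scale down when p99 latency drops below this (ms)
    max: 200           # scale up when p99 latency rises above this (ms)
scheduled_limit_changes: []
```

If you use the App Autoscaler CLI plugin, a manifest like this is applied with something along the lines of `cf configure-autoscaling my-app autoscale-manifest.yml` (the app name and file name here are placeholders).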

If you want to see the actual value for each request, you can run cf logs and look at the [RTR] entries; the gorouter_time field will tell you the latency. You can also use the cf tail command to look directly at the TIMER metrics (this requires an additional cf CLI plugin to be installed); it will show you the same number.
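For illustration, a trimmed [RTR] access log entry looks something like the line below. The hostname, path and values are made up, and the exact set of fields varies with the Gorouter/PCF version, but the timing fields near the end are the ones to look at:

```
2023-05-01T10:15:30.00+0000 [RTR/0] OUT my-app.apps.example.com - [2023-05-01T10:15:30.1234Z]
  "GET /api/orders HTTP/1.1" 200 0 1024 "-" "curl/7.79.1" ...
  response_time:0.052310 gorouter_time:0.000843 app_id:"..." ...
```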

PS: any other suggestions for autoscaling are welcome.

Latency is a good metric to use, so long as your response time does not depend on too many other services (or you have good circuit breakers implemented). Latency can be a problem when slowness in those other services shows up in your latency and causes the autoscaler to incorrectly scale your application (when in fact the upstream service is what should be scaled).

Other options:

  • HTTP throughput, so long as you're on a recent PCF version. See here.
  • Custom metrics emitted from your app. This is super easy with Spring Boot, and it lets you export better metrics about the JVM's memory usage, such as heap usage or the number of threads, which allow you to make more intelligent autoscaling decisions. You can also implement totally custom metrics based on business-related factors within your app; see the sketch after this list.
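As a sketch of that last option: with Spring Boot Actuator, Micrometer is already on the classpath, so registering a custom metric takes only a few lines of Java. The metric name and the queue-depth idea below are hypothetical examples; to have the platform (and therefore the autoscaler) pick up such metrics you typically also need to register the app's metrics endpoint with PCF's Metric Registrar, so check the docs for your PCF version.

```java
import io.micrometer.core.instrument.MeterRegistry;
import org.springframework.stereotype.Component;

import java.util.concurrent.atomic.AtomicInteger;

@Component
public class QueueDepthMetrics {

    // Backing value for a custom gauge; "orders.queue.depth" is a made-up metric name.
    private final AtomicInteger queueDepth = new AtomicInteger(0);

    public QueueDepthMetrics(MeterRegistry registry) {
        // Micrometer reads this AtomicInteger whenever the gauge is scraped or exported.
        registry.gauge("orders.queue.depth", queueDepth);
    }

    // Call this from wherever your app knows the current backlog size.
    public void recordDepth(int depth) {
        queueDepth.set(depth);
    }
}
```

An autoscaling rule could then key off orders.queue.depth, or off a JVM metric such as jvm.memory.used that Spring Boot exposes out of the box, instead of HTTP latency.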