IBM Cloud Code Engine: Different response times for app revisions when traffic is split

I am running different versions / revisions of an app on IBM Cloud Code Engine. I split traffic 80/20 between them. I noticed that sometimes the app is responsive as expected, sometimes not.

What could be the reason? How can I investigate?

There are 3 answers

0
data_henrik On BEST ANSWER

I used the IBM Cloud Code Engine CLI and the Knative CLI (kn) to investigate.

First, I retrieved information about the app:

ibmcloud ce app get --name myapp

The output showed that two revisions were active (because of the traffic split), but without any details. The YAML output had more information: one revision was configured to scale to zero, while the other kept a minimum of one instance active.

ibmcloud ce app get --name myapp --output yaml

I also checked whether Knative could provide more information. To use the Knative CLI, first fetch the Kubernetes configuration for the project:

ibmcloud ce project select --name myproject --kubecfg

Then, list the revisions, again as YAML output:

kn revision list --output yaml

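This part of the revision metadata shows minScale set to zero. That revision is scaled down to zero instances when it is idle, so the next request routed to it triggers a cold start, which explains the delay.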

  metadata:
    annotations:
      autoscaling.knative.dev/maxScale: "2"
      autoscaling.knative.dev/minScale: "0"
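
To compare the scaling settings of all revisions at a glance, the revision list can also be requested as JSON and summarized with a small script. This is only a minimal sketch: it assumes that `kn revision list --output json` returns a list object with an `items` array, and it also accepts the `min-scale` spelling of the annotation, which is an assumption about newer Knative releases. The helper names and the sample revisions (`myapp-00001`, `myapp-00002`) are made up for illustration.

```python
import json

# Annotation keys that may carry the minimum scale. The first one is the key
# shown in the metadata excerpt; the second spelling is an assumption.
MIN_SCALE_KEYS = (
    "autoscaling.knative.dev/minScale",
    "autoscaling.knative.dev/min-scale",
)


def min_scale_by_revision(revision_list_json):
    """Map each revision name to its configured minimum scale (None if unset)."""
    doc = json.loads(revision_list_json)
    result = {}
    for item in doc.get("items", []):
        meta = item.get("metadata", {})
        annotations = meta.get("annotations") or {}
        value = next((annotations[k] for k in MIN_SCALE_KEYS if k in annotations), None)
        result[meta.get("name")] = int(value) if value is not None else None
    return result


def describe(min_scale):
    """Turn a minimum scale value into a short human-readable note."""
    if min_scale is None:
        return "not set, platform default applies"
    if min_scale == 0:
        return "scales to zero, cold starts possible"
    return "keeps at least %d instance(s) running" % min_scale


# Hypothetical sample data, shaped like the metadata excerpt above.
SAMPLE = json.dumps({
    "items": [
        {"metadata": {"name": "myapp-00001", "annotations": {
            "autoscaling.knative.dev/maxScale": "2",
            "autoscaling.knative.dev/minScale": "0"}}},
        {"metadata": {"name": "myapp-00002", "annotations": {
            "autoscaling.knative.dev/maxScale": "2",
            "autoscaling.knative.dev/minScale": "1"}}},
    ]
})

if __name__ == "__main__":
    for name, min_scale in min_scale_by_revision(SAMPLE).items():
        print("%s: minScale=%s (%s)" % (name, min_scale, describe(min_scale)))
```

In practice, the string passed to `min_scale_by_revision` would be the captured output of the `kn` command rather than `SAMPLE`.
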
1
chughts On

If you allow the minimum number of instances to be 0, then under zero load the app instances will be shut down. When the next request comes in, there will be startup / instantiation time before it is served.
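
One way to observe this from the outside is to time a series of requests and flag the outliers. The following is only a sketch under assumptions: the URL is supplied by you on the command line, the threshold of five times the median is an arbitrary choice, and `measure` and `flag_slow` are made-up helper names.

```python
import statistics
import sys
import time
import urllib.request


def measure(url, count=20, pause=0.5):
    """Return response times in seconds for `count` sequential GET requests."""
    timings = []
    for _ in range(count):
        start = time.perf_counter()
        with urllib.request.urlopen(url, timeout=60) as response:
            response.read()
        timings.append(time.perf_counter() - start)
        time.sleep(pause)
    return timings


def flag_slow(timings, factor=5.0):
    """Return (index, seconds) pairs for requests slower than factor x median."""
    median = statistics.median(timings)
    return [(i, t) for i, t in enumerate(timings) if t > factor * median]


# Only runs when a URL is passed, e.g.: python probe.py https://<your-app-url>/
if __name__ == "__main__" and len(sys.argv) > 1:
    timings = measure(sys.argv[1])
    print("median: %.3fs" % statistics.median(timings))
    for index, seconds in flag_slow(timings):
        print("request %d took %.3fs" % (index, seconds))
```

Cold starts only show up after the scale-to-zero revision has been idle long enough to be scaled down, so a longer `pause` between requests may be needed to reproduce them.
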

0
E. Anderson On

If you have access to application latency logs, they might include which revision returned the slow result, which could help explain why you sometimes get different latency results.

There may also be a Knative header in the response which indicates which revision served a request, but there doesn't seem to be anything consistently documented on that front.

If you're willing to push new code to your application, you could implement your own application-level logging of latency and instance ID, which might help pin down host-level or startup behaviors.
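
As a rough illustration of such logging, and assuming the app happens to be a Python WSGI application, a small middleware could record the handling time together with the revision and instance. The environment variables are assumptions: `K_REVISION` is described in the Knative runtime contract, and `HOSTNAME` is usually the pod name on Kubernetes, but neither is confirmed here for Code Engine. `LatencyLogger` and `format_entry` are made-up names.

```python
import logging
import os
import time

logger = logging.getLogger("latency")

# Assumed environment variables; the fallbacks are used when they are missing.
REVISION = os.environ.get("K_REVISION", "unknown-revision")
INSTANCE = os.environ.get("HOSTNAME", "unknown-instance")
STARTED = time.monotonic()  # taken when this module is first loaded


def format_entry(path, elapsed_ms, age_s):
    """Build one log line with path, revision, instance, and timings."""
    return "path=%s revision=%s instance=%s handler_ms=%.1f instance_age_s=%.1f" % (
        path, REVISION, INSTANCE, elapsed_ms, age_s)


class LatencyLogger:
    """WSGI middleware that logs one line per request.

    The measured time covers the call into the wrapped app until it returns
    its response iterable; it does not include time spent before the request
    reached this process.
    """

    def __init__(self, app):
        self.app = app

    def __call__(self, environ, start_response):
        begin = time.perf_counter()
        try:
            return self.app(environ, start_response)
        finally:
            elapsed_ms = (time.perf_counter() - begin) * 1000
            age_s = time.monotonic() - STARTED
            # Needs logging to be configured at INFO level to be visible,
            # for example via logging.basicConfig(level=logging.INFO).
            logger.info(format_entry(environ.get("PATH_INFO", ""), elapsed_ms, age_s))
```

The wrapper would be applied once at startup, for example `application = LatencyLogger(application)`. Because a cold start happens before the request reaches the app, the handler time alone will not show it; a very small `instance_age_s` on a slow request is the hint that a freshly started instance served it.
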