I am running different versions / revisions of an app on IBM Cloud Code Engine. I split traffic 80/20 between them. I noticed that sometimes the app is responsive as expected, sometimes not.
What could be the reason? How can I investigate?
If you have access to application latency logs, they might include which revision returned the slow result, which could help explain why you sometimes get different latency results.
There may also be a Knative header in the response which indicates which revision served a request, but there doesn't seem to be anything consistently documented on that front.
If you're willing to push new code to your application, you could implement your own application-level logging of latency and instance ID, which might help pin down host-level or startup behaviors.
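Before changing any code, a quick client-side check might already show a pattern. The sketch below sends a series of requests and prints the response headers and total time for each one; `probe` is a hypothetical helper name and the URL is a placeholder. It does not assume any particular revision header exists, it just dumps whatever the app returns so you can look for one.

```shell
# probe URL [COUNT]: send COUNT requests (default 10) and print,
# for each one, the response headers followed by the total request time.
probe() {
  url="$1"
  count="${2:-10}"
  i=1
  while [ "$i" -le "$count" ]; do
    echo "--- request $i ---"
    # -o /dev/null discards the body, -D - writes the headers to stdout,
    # -w appends curl's measured total time after the transfer.
    curl -s -o /dev/null -D - -w 'time_total=%{time_total}s\n' "$url"
    i=$((i + 1))
  done
}

# usage (placeholder URL): probe "https://myapp.example.com" 20
```

If the slow requests share a header value or come in after idle periods, that points toward one revision or toward startup behavior.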
I worked with the IBM Cloud Code Engine CLI and Knative CLI to investigate it.
First, I retrieved information about the app:
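For example, assuming an app named `myapp` (a placeholder), the default view and the YAML view can be fetched along these lines. `app_details` is only a hypothetical wrapper to keep the two calls together.

```shell
# app_details APP_NAME: show the app summary, then the full YAML definition.
app_details() {
  ibmcloud ce app get --name "$1"
  ibmcloud ce app get --name "$1" --output yaml
}

# usage (placeholder app name): app_details myapp
```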
The default output showed that two revisions were active (because of the traffic split), but without details. The YAML output had more information: one revision was configured to scale to zero, while the other kept a minimum of one instance active.
I also checked if there was more information available using Knative. First, get the Kubernetes configuration for the project:
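One way to do this, as far as I know, is to select the project with the `--kubecfg` option, which adds the project to the local Kubernetes configuration so that `kubectl` and `kn` can reach it. The project name `myproject` and the wrapper `use_project` are placeholders.

```shell
# use_project PROJECT_NAME: select the Code Engine project and
# add its Kubernetes configuration for use by kubectl and kn.
use_project() {
  ibmcloud ce project select --name "$1" --kubecfg
}

# usage (placeholder project name): use_project myproject
```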
Then, list the revisions, again as YAML output:
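With the Knative CLI pointed at the project, the listing could look like this. `list_revisions` is a hypothetical wrapper; the underlying call is `kn revision list` with YAML output.

```shell
# list_revisions: print all revisions in the current project as YAML.
list_revisions() {
  kn revision list -o yaml
}

# usage: list_revisions
```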
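In the revision metadata, one revision has minScale set to zero. When that revision has scaled down to zero instances, the next request routed to it triggers a cold start, which explains the occasional delay.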
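To pull just that setting out of a long YAML listing, a small filter like the following can help. It is a sketch that assumes the value is stored in an annotation spelled either `minScale` or `min-scale`; `show_min_scale` is a hypothetical helper that reads YAML on standard input.

```shell
# show_min_scale: read revision YAML on stdin and keep only the
# lines that mention the minimum scale (minScale or min-scale).
show_min_scale() {
  grep -i -E 'min-?scale'
}

# usage: kn revision list -o yaml | show_min_scale
```

A revision that reports a minimum of zero is the one that can go cold between requests.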