Allocation of user calls to VPs on Informix RDBMS


A problem I'd like to discuss has been troubling me for some time now. There are a lot of articles addressing how to optimize Informix RDBMS in general, but very little (or nothing at all) about the specific problem I'd like to describe.

What follows is my best understanding of how Informix works (or seems to work), based on web articles, documentation and my own observations. If there are any errors, mistakes or misunderstandings in my reasoning, feel free to comment and point out the weak spots.

1. Introduction

In general, the Informix server uses the concept of virtual processors (VPs; I'll call them vCPUs below) to scale server performance and carry out user requests. Depending on the server license, the underlying hardware and the expected workload, a different number of vCPUs can be configured to handle the load. A user request is scheduled to a vCPU, the vCPU executes the request (e.g. a stored procedure call) and returns the results (usually a cursor to iterate through). More or less this is how it looks in my case, and I think in the general case too. All requests considered here are read-only - no locking or any other obstacles to parallelizing the work to the maximum possible extent.
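For reference, this is roughly what a 4-vCPU setup looks like in the onconfig file, together with the onstat commands that show how threads are spread over the VPs (the numbers are illustrative, not a recommendation):

    # onconfig (illustrative values)
    MULTIPROCESSOR  1            # allow more than one CPU VP
    VPCLASS cpu,num=4,noage      # 4 CPU VPs, as in the scenario below
    SINGLE_CPU_VP   0

    # runtime observation
    onstat -g glo     # VP list and per-VP CPU usage
    onstat -g ath     # all threads with the VP class they run on
    onstat -g rea     # threads waiting in the ready queue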

2. Assumptions

Now let's have a closer look at allocation and execution by the vCPUs. Let's assume we have 4 vCPUs in our server instance (the same problem eventually occurs for any number of vCPUs) and we have incoming user requests (all/most requests are stored procedure calls returning a cursor) of the form Tnn(id), where nn is the number of seconds the request takes to execute (this is an empirical value, not known or determinable a priori by the Informix server) and id is just a request id to identify them in this write-up.
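Just to make the request shape concrete, this is the kind of call I have in mind - a purely hypothetical, read-only SPL procedure that returns a cursor (table and column names are made up):

    -- hypothetical example of a Tnn(id)-style request:
    -- a read-only SPL procedure returning a cursor
    CREATE PROCEDURE get_customer_orders(p_cust_id INT)
        RETURNING INT, DECIMAL(12,2);

        DEFINE v_order_id INT;
        DEFINE v_amount   DECIMAL(12,2);

        FOREACH
            SELECT order_id, amount
              INTO v_order_id, v_amount
              FROM orders
             WHERE customer_id = p_cust_id
            RETURN v_order_id, v_amount WITH RESUME;
        END FOREACH;

    END PROCEDURE;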

3. Scenario

When the requests come in, the Informix server allocates them to vCPUs - or at least that's what I can observe. Assume that within a short period of time (less than a 2s interval) five requests are sent to our server - T50(1), T50(2), T32(3), T15(4), T10(5) - and are scheduled in the following way:

  • vCPU0 - T50(1)
  • vCPU1 - T50(2)
  • vCPU2 - T32(3)
  • vCPU3 - T15(4)

Now all vCPUs are busy and the next task apparently has to be scheduled to one of them. Any choice would be as good as another at this point, since the server cannot determine how long a request will take. Let's assume the following scheduling for the T10(5) task:

  • vCPU0 - T50(1)
  • vCPU1 - T50(2), T10(5)
  • vCPU2 - T32(3)
  • vCPU3 - T15(4)

I'm not sure how exactly the vCPUs execute tasks, but the observable fact is that we usually get results after a period of time which is the sum of the execution times of all requests actively executing on that vCPU (how many requests can be "actively" executed at once - I have no idea). But in simple scenarios, where we can assume that all of them are running, this is the case.
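This observation comes from crude client-side timing, roughly along these lines (a bash sketch; the database name and procedure names are hypothetical):

    # fire the five calls concurrently and report how long each one took
    run_one() {
        start=$(date +%s)
        echo "execute procedure $1;" | dbaccess mydb - > /dev/null 2>&1
        echo "$1 took $(( $(date +%s) - start ))s"
    }

    for p in t50_1 t50_2 t32_3 t15_4 t10_5; do
        run_one "$p" &      # each call in its own background client
    done
    wait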

4. Execution

A logical execution flow in our case could be as follows. After 15 seconds T15(4) finishes and T10(5) gets rescheduled/moved to the now-free vCPU3 for further execution:

  • vCPU0 - T50(1)
  • vCPU1 - T50(2)
  • vCPU2 - T32(3)
  • vCPU3 - T10(5) (it could be less than 10s left, since some of the work might have already been done on vCPU1)

After 25 seconds (or a little less) T10(5) completes:

  • vCPU0 - T50(1)
  • vCPU1 - T50(2)
  • vCPU2 - T32(3)
  • vCPU3 - idle

After 32 seconds T32(3) completes:

  • vCPU0 - T50(1)
  • vCPU1 - T50(2)
  • vCPU2 - idle
  • vCPU3 - idle

After 50 seconds T50(1), T50(2) complete and all work is done.

5. Results

In the end we would expect to see the following total execution times for our requests:

  • T50(1) - about 50s
  • T50(2) - about 60-70s (due to parallel execution with T10(5))
  • T32(3) - about 32s
  • T15(4) - about 15s
  • T10(5) - about 18-19s (due to parallel execution on vCPU1 and reallocation to vCPU3)

But that does not happen. What we are more likely to observe is the following:

  • T50(1) - about 50s
  • T50(2) - about 70s
  • T32(3) - about 32s
  • T15(4) - about 15s
  • T10(5) - way above 19s - something close to 50-70s

It looks like the request T10(5) is never rescheduled to vCPU3 when it becomes free; it stays on vCPU1 until completion. This is not good, since we have to wait for its results almost 3x longer than we could expect.

I have observed this lack of execution flexibility in Informix in many cases, and it always leads to significantly longer execution times for many tasks. In fact, I have never seen Informix reschedule a request to a free vCPU.

6. Questions

  • Why do we observe such behavior? Is it a flaw in Informix, or is it my misconception to expect it to dynamically distribute all the work it currently has among all vCPUs as they become free?
  • Is there a way to configure Informix so that it utilizes its resources better, i.e. spreads the pending workload across vCPUs as they free up?
  • What else can be done to improve the observed behavior? Can the client side provide some hints, or do anything else, to support better resource allocation by Informix?

This scheduling issue is also a serious problem when we mix long requests with short ones. If we are lucky we might get the short request's results fast, but if we are unlucky the short requests get stuck on vCPUs with long requests and they are not short anymore. Is there any way to define some kind of vCPU pools in Informix? For example, with 20 vCPUs of capacity on the server, we could dedicate 16 vCPUs to handle ordinary requests and 4 vCPUs to handle quick requests. Scheduling in the Quick Pool should be very aggressive, and no request taking longer than a configured threshold (e.g. 250ms) should be allowed to keep running there (it should be rescheduled to the Ordinary Pool for further execution).
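The closest built-in mechanism I'm aware of is user-defined VP classes in onconfig, but as far as I understand they only apply to external (C/Java) UDRs and not to plain SPL procedure calls, so they don't seem to solve this. A rough sketch of what I mean (the class name and counts are made up):

    # onconfig sketch - hypothetical class name "quickvp"
    VPCLASS cpu,num=16        # ordinary CPU VPs
    VPCLASS quickvp,num=4     # user-defined VP class, usable only by external UDRs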

Or perhaps there is a way to define vCPU pools and provide hints in stored procedures as to which pool they should be (re)allocated to for further execution? We usually can tell which requests are quick to execute and which probably are not. Is there any way to tell Informix what we already know and help it utilize its resources?
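For external routines there does seem to be such a hint - the CLASS routine modifier - but I'm not aware of an equivalent for SPL procedures. A hypothetical example tied to the "quickvp" class above (the function name, library path and class name are made up):

    -- hypothetical C UDR pinned to the user-defined "quickvp" class
    CREATE FUNCTION quick_lookup(p_id INT)
        RETURNING INT
        WITH (CLASS = "quickvp")
        EXTERNAL NAME "/opt/informix/extend/quick.so(quick_lookup)"
        LANGUAGE C;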
