I'm concerned with the design of an adaptive jitter buffer that grows and shrinks its capacity as the measured jitter rises and falls.
I see no reason to make any adjustments to latency or capacity unless there is a buffer underrun which might then be followed by a burst of incoming packets that exceeds capacity (assuming buffer capacity equals buffer depth/latency in the first place). As an example, if I'm receiving 20ms packets, I might well implement a buffer that is 100ms deep and therefore has capacity for 5 packets. If 160ms passes between packets, then I might be expecting to see as many as 8 packets come in nearly all at once. I have two choices at this point:
1. drop three of the packets according to the rules of overflow
2. drop no packets and increase buffer capacity as well as latency
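To make the arithmetic of that example concrete, here is a throwaway sketch; the constant names are mine and not taken from any real implementation:

```c
#include <stdio.h>

#define PACKET_MS 20    /* packet duration in milliseconds */
#define DEPTH_MS  100   /* configured buffer depth/latency */

int main(void)
{
    int capacity = DEPTH_MS / PACKET_MS;   /* 5 packet slots                 */
    int gap_ms   = 160;                    /* silence observed on the wire   */
    int burst    = gap_ms / PACKET_MS;     /* up to 8 packets arrive at once */

    if (burst > capacity) {
        printf("burst of %d exceeds capacity of %d\n", burst, capacity);
        printf("choice 1: drop %d packet(s) as overflow\n", burst - capacity);
        printf("choice 2: grow capacity/latency to %d ms\n", burst * PACKET_MS);
    }
    return 0;
}
```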
Assume I take choice 2, and that network conditions then improve and packet delivery becomes regular again (the jitter value drops). Now what? Again, I think I have two choices:
3. do nothing and live with the increased latency
4. reduce latency (and capacity)
With an adaptive buffer, I think I'm supposed to make choice 4, but that doesn't seem right: it means arbitrarily dropping packets of audio that were specifically saved when I took choice 2 in response to the increased jitter in the first place.
It seems to me that the correct course of action is to take choice 1 from the start: maintain latency and, if necessary, drop packets that are delivered late because of the increased jitter.
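Here is a minimal sketch of what that "choice 1" policy could look like: capacity (and therefore latency) never grows, anything that arrives after its playout deadline is discarded, and anything beyond capacity is treated as overflow. The struct layout, names, and constants are assumptions for illustration, not taken from any particular stack; the wraparound-safe signed comparison is the usual RTP idiom for deciding whether a timestamp is already in the past.

```c
#include <stdio.h>
#include <stdint.h>
#include <stdbool.h>

#define CAPACITY 5                 /* DEPTH_MS / PACKET_MS: latency stays fixed */

typedef struct {
    uint32_t ts;                   /* RTP-style timestamp of the frame */
    /* ... payload would live here ... */
} packet_t;

typedef struct {
    packet_t slots[CAPACITY];
    int      count;
    uint32_t next_play_ts;         /* timestamp the player will ask for next */
} jitter_buffer_t;

/* Returns false when the packet is discarded (arrived late, or overflow). */
static bool jb_put(jitter_buffer_t *jb, const packet_t *p)
{
    if ((int32_t)(p->ts - jb->next_play_ts) < 0)
        return false;              /* missed its playout deadline: drop it    */
    if (jb->count == CAPACITY)
        return false;              /* full: overflow is dropped, not buffered */
    jb->slots[jb->count++] = *p;   /* (re-ordering by timestamp omitted)      */
    return true;
}

int main(void)
{
    jitter_buffer_t jb = { .count = 0, .next_play_ts = 1600 };
    packet_t late  = { .ts = 1440 };   /* behind the playout point: dropped */
    packet_t fresh = { .ts = 1760 };   /* still ahead of playout: accepted  */

    printf("late packet kept?  %d\n", jb_put(&jb, &late));
    printf("fresh packet kept? %d\n", jb_put(&jb, &fresh));
    return 0;
}
```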
A similar scenario might be that instead of getting a burst of 8 packets after the 160ms gap, I only get 5 (perhaps 3 packets were simply lost). In that case, having increased the buffer capacity doesn't do much good, though it does reduce the potential for overflow later on. But if overflow is something to be avoided (from the network side), then I would simply make buffer capacity some fixed amount greater than the configured 'depth/latency' in the first place. In other words, if overflow is not caused by the local application failing to pull packets out of the buffer in a timely manner, then it can only happen for two reasons: either the sender misbehaves and sends packets at a faster rate than agreed upon (or sends packets from the future), or there is a gap between packet bursts that exceeds my buffer depth.
Clearly, the whole point of the 'adaptive' buffer would be to recognize the latter condition, increase buffer capacity, and avoid dropping any packets. But that brings me right to the stated problem: how do I 'adapt' back to the ideal settings when network jitter clears up while still enforcing the same 'drop no packets' philosophy?
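To make that "fixed headroom" idea concrete, here is a sketch in which capacity is a couple of slots larger than the nominal depth, so modest bursts are absorbed without adapting, and only a gap too long for the headroom forces the grow-or-drop decision. The constants and the callback are illustrative assumptions:

```c
#include <stdio.h>

#define PACKET_MS 20
#define DEPTH_MS  100
#define HEADROOM  2                                  /* extra slots beyond the depth */
#define CAPACITY  (DEPTH_MS / PACKET_MS + HEADROOM)  /* 7 packet slots               */

/* Called once per received packet with the time since the previous one. */
static void on_packet(int inter_arrival_ms)
{
    int burst = inter_arrival_ms / PACKET_MS;        /* packets likely to follow */

    if (burst > CAPACITY) {
        /* Even the headroom slots cannot absorb this burst: this is the
         * condition an adaptive buffer would answer by growing the depth
         * (and latency) instead of dropping the excess packets.          */
        printf("gap of %d ms implies a burst of %d > capacity %d\n",
               inter_arrival_ms, burst, CAPACITY);
    }
}

int main(void)
{
    on_packet(20);    /* normal cadence: nothing to do            */
    on_packet(140);   /* absorbed by the headroom, no adaptation  */
    on_packet(160);   /* the 160 ms gap from the example: adapt   */
    return 0;
}
```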
Thoughts?
With companding. When the jitter clears up, you merge packets and 'accelerate' the buffer. The merge of course needs appropriate handling, but the idea is to pop two 20ms packets from the adaptive jitter buffer and time-compress them into a single 30ms packet. You keep doing this until your buffer level is back to normal.
Similarly, for underruns, packets can be 'stretched' in addition to introducing latency.
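For what it's worth, here is a deliberately naive sketch of that merge: two 20ms frames are compressed into one 30ms frame by cross-fading the last 10ms of the first into the first 10ms of the second. A real implementation would align the overlap on a pitch period (WSOLA/PSOLA style) to avoid audible artifacts; the sample rate, frame sizes, and function names here are my own assumptions.

```c
#include <stdio.h>
#include <stdint.h>
#include <stddef.h>

#define SAMPLE_RATE    8000
#define FRAME_SAMPLES  (SAMPLE_RATE / 1000 * 20)               /* 20 ms = 160 samples */
#define XFADE_SAMPLES  (SAMPLE_RATE / 1000 * 10)               /* 10 ms =  80 samples */
#define MERGED_SAMPLES (2 * FRAME_SAMPLES - XFADE_SAMPLES)     /* 240 samples = 30 ms */

/* Compress two 20 ms frames (a, b) into one 30 ms frame (out). */
static void merge_frames(const int16_t *a, const int16_t *b,
                         int16_t out[MERGED_SAMPLES])
{
    size_t keep = FRAME_SAMPLES - XFADE_SAMPLES;    /* 80 samples */
    size_t i;

    /* Untouched head of the first frame. */
    for (i = 0; i < keep; i++)
        out[i] = a[i];

    /* Linear cross-fade: tail of a fades out while head of b fades in. */
    for (i = 0; i < XFADE_SAMPLES; i++) {
        float w = (float)i / XFADE_SAMPLES;
        out[keep + i] = (int16_t)((1.0f - w) * a[keep + i] + w * b[i]);
    }

    /* Untouched tail of the second frame. */
    for (i = 0; i < FRAME_SAMPLES - XFADE_SAMPLES; i++)
        out[keep + XFADE_SAMPLES + i] = b[XFADE_SAMPLES + i];
}

int main(void)
{
    int16_t a[FRAME_SAMPLES] = {0}, b[FRAME_SAMPLES] = {0};
    int16_t out[MERGED_SAMPLES];

    merge_frames(a, b, out);
    printf("merged %d + %d samples into %d (40 ms -> 30 ms)\n",
           FRAME_SAMPLES, FRAME_SAMPLES, MERGED_SAMPLES);
    return 0;
}
```

Repeating a merge like this whenever the buffer holds more audio than the target depth pulls latency back down without discarding any packets; stretching for an underrun is the mirror image, where a stretch of samples is repeated (again with a cross-fade) to lengthen a frame instead of shortening it.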