I am sending an audio stream over RTP, and at the same time need to send some DTMF events to control a switch on the other end.
First of all, do the RTP standards allow sending continuous, uninterrupted audio and events that overlap in time? I am reading RFC 3550, RFC 3551 and RFC 4733, and I cannot find anything that specifically says this is allowed, nor anything that expressly prohibits it.
The usage of the marker bit might be confusing. For non-frame audio payloads (I am using u-law PCM) it indicates the first packet after a discontinuity, while in RFC 4733 events the same marker bit marks the beginning of an event. I also cannot find any mention of stream multiplexing.
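To make the marker-bit question concrete, here is a minimal sketch of reading the fixed 12-byte RTP header described in RFC 3550. The marker bit and the payload type share the second byte, so a receiver can only interpret M after looking at the payload type. The names `RtpHeader` and `parse_rtp_header` are made up for illustration, and the payload type 101 used in the example is only an assumption; the real value for telephone-event is whatever was negotiated in SDP.

```python
import struct
from typing import NamedTuple


class RtpHeader(NamedTuple):
    version: int
    marker: bool
    payload_type: int
    sequence: int
    timestamp: int
    ssrc: int


def parse_rtp_header(packet: bytes) -> RtpHeader:
    """Parse the fixed RTP header (first 12 bytes, network byte order)."""
    if len(packet) < 12:
        raise ValueError("the fixed RTP header is 12 bytes long")
    b0, b1, seq, ts, ssrc = struct.unpack("!BBHII", packet[:12])
    return RtpHeader(
        version=b0 >> 6,          # top two bits of byte 0
        marker=bool(b1 & 0x80),   # top bit of byte 1
        payload_type=b1 & 0x7F,   # low seven bits of byte 1
        sequence=seq,
        timestamp=ts,
        ssrc=ssrc,
    )


# Example: a header with M = 1 and an assumed dynamic payload type of 101.
example = struct.pack("!BBHII", 0x80, 0x80 | 101, 11, 1000, 0x1234)
header = parse_rtp_header(example)
```

Whether `header.marker` means "first packet after a discontinuity" or "start of an event" then depends on what `header.payload_type` maps to in the session description.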
Next, a practical consideration. Even if the standard allows this, is it risky or uncommon in practice? I am controlling Asterisk features through its feature map (features.conf). All phones and the PJSIP library mute the audio stream during a DTMF event.
Finally, if the standard allows it and Asterisk is not known to misbehave when the two payloads are intermixed, what is the correct way to stream them? This is what I have in mind (assuming, for the sake of example only, that each PCM audio payload is 100 samples = 100 ticks long, and that DTMF events are 300 ticks long):
Seq = 10, Timestamp = 1000, M = 0, Payload = PCM
Seq = 11, Timestamp = 1000, M = 1, Payload = DTMF: '*'; duration = 100
Seq = 12, Timestamp = 1100, M = 0, Payload = PCM
Seq = 13, Timestamp = 1000, M = 0, Payload = DTMF: '*'; duration = 200
Seq = 14, Timestamp = 1200, M = 0, Payload = PCM
Seq = 15, Timestamp = 1000, M = 0, Payload = DTMF: '*'; duration = 300; E = 1
Would that be a correct stream?
RFC 4733 says that in the case of in-band DTMF you should send the DTMF events instead of the audio data. The sequence number keeps incrementing and the RTP clock keeps running at the same rate, but the payload is DTMF data.
If the remote UE supports playout of DTMF tones, it will play the tone out; otherwise it will simply discard the DTMF payloads as unknown or unsupported.
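For reference, the DTMF payload itself is small. Below is a minimal sketch of packing the 4-byte telephone-event payload defined in RFC 4733 (event code, E bit, reserved bit, 6-bit volume, 16-bit duration). The helper name `pack_telephone_event` and the default volume of 10 are assumptions made for this example, not values taken from the question.

```python
import struct

# DTMF event codes from RFC 4733: digits 0-9, then *, #, A-D.
DTMF_EVENTS = {str(d): d for d in range(10)}
DTMF_EVENTS.update({"*": 10, "#": 11, "A": 12, "B": 13, "C": 14, "D": 15})


def pack_telephone_event(digit: str, duration: int,
                         end: bool = False, volume: int = 10) -> bytes:
    """Build the 4-byte telephone-event payload for one DTMF digit.

    duration is in RTP timestamp units; volume is the 6-bit power level
    (0..63, expressed in -dBm0). The default of 10 is an assumption.
    """
    if digit not in DTMF_EVENTS:
        raise ValueError(f"not a DTMF digit: {digit!r}")
    if not 0 <= volume <= 63:
        raise ValueError("volume must fit in 6 bits")
    if not 0 <= duration <= 0xFFFF:
        raise ValueError("duration must fit in 16 bits")
    # Byte 1: E bit (0x80), reserved bit left at 0, then the volume.
    flags = (0x80 if end else 0x00) | volume
    return struct.pack("!BBH", DTMF_EVENTS[digit], flags, duration)


# Final packet of a '*' event lasting 300 ticks, with the E bit set.
payload = pack_telephone_event("*", 300, end=True)
```

This payload goes after the RTP header in place of the PCM samples, with the payload type set to whatever was negotiated for telephone-event.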