Buffer starvation inside MS Mpeg-2 Demultiplexer filter

My capture graph is dying. I have traced the problem to media-sample buffer starvation inside the Microsoft MPEG-2 Demultiplexer filter.

Processing stops inside CBaseAllocator::GetBuffer. The pool is exhausted and the thread sleeps indefinitely, waiting for a buffer to be recycled.

0:866> ~~[3038]s
ntdll!NtWaitForSingleObject+0x14:
00007ffe`49199f74 c3              ret
0:094> k
 # Child-SP          RetAddr           Call Site
00 00000035`807fede8 00007ffe`460b9252 ntdll!NtWaitForSingleObject+0x14
01 00000035`807fedf0 00007ffe`22a35f4e KERNELBASE!WaitForSingleObjectEx+0xa2
02 00000035`807fee90 00007ffe`35609460 QUARTZ!CBaseAllocator::GetBuffer+0x7e
03 00000035`807feec0 00007ffe`3560697a mpg2splt!CMediaSampleCopyBuffer::GetCopyBuffer+0x60
04 00000035`807fef60 00007ffe`35606cc9 mpg2splt!CBufferSourceManager::GetNewCopyBuffer+0x3a
05 00000035`807fefa0 00007ffe`356073de mpg2splt!CStreamParser::CopyStream+0x89
06 00000035`807feff0 00007ffe`35608325 mpg2splt!CMpeg2PESStreamParser::ProcessBuffer_+0x15a
07 00000035`807ff040 00007ffe`35610724 mpg2splt!CMpeg2PESStreamParser::ProcessSysBuffer+0x135
08 00000035`807ff090 00007ffe`3560fb2e mpg2splt!CStreamMapContext::Process+0xb4
09 00000035`807ff110 00007ffe`3560f621 mpg2splt!CTransportStreamMapper::ProcessTSPacket_+0x30e
0a 00000035`807ff2d0 00007ffe`355fd0c1 mpg2splt!CTransportStreamMapper::Process+0xf1
0b 00000035`807ff320 00007ffe`355f4eb8 mpg2splt!CMPEG2Controller::ProcessMediaSampleLocked+0x111
0c 00000035`807ff3a0 00007ffe`355f98a7 mpg2splt!CMPEG2Demultiplexer::ProcessMediaSampleLocked+0x7c
0d 00000035`807ff3f0 00007ffd`ba58cba3 mpg2splt!CMPEG2DemuxInputPin::Receive+0x87
0e 00000035`807ff480 00007ffd`ba58ca4d 0x00007ffd`ba58cba3
0f 00000035`807ff530 00007ffd`ba58c92e 0x00007ffd`ba58ca4d
10 00000035`807ff590 00007ffe`19b5222e 0x00007ffd`ba58c92e
11 00000035`807ff5d0 00007ffe`246e5402 clr!UMThunkStub+0x6e
12 00000035`807ff660 00007ffe`2472aa23 qedit!CSampleGrabber::Receive+0x1b2
13 00000035`807ff6d0 00007ffe`287ea6d6 qedit!CTransformInputPin::Receive+0x53
14 00000035`807ff700 00007ffe`287ea459 Obsidian_DSP_DirectShow!MulticastSourceFilter::UDP_consumerThreadProc+0x276 [s:\library\obsidian.dsp.directshow\multicastsourcefilter.cpp @ 475] 
15 00000035`807ff7f0 00007ffe`46f73034 Obsidian_DSP_DirectShow!MulticastSourceFilter::UDP_consumerThreadEntry+0x9 [s:\library\obsidian.dsp.directshow\multicastsourcefilter.cpp @ 445] 
16 00000035`807ff820 00007ffe`49171461 KERNEL32!BaseThreadInitThunk+0x14
17 00000035`807ff850 00000000`00000000 ntdll!RtlUserThreadStart+0x21

Here are a few facts about this particular graph:

  • The source media is in the form of a heavily multiplexed MPEG2-TS UDP stream.
  • This stream contains 14 SD TV programs, consuming 37.5Mbps of network bandwidth.
  • The problem occurs predictably during periods where the stream becomes heavily fragmented (the audio and video decoders emit a burst of samples with IsDiscontinuity() set to TRUE).
  • According to WinDbg (and SOS), there are no contended managed or unmanaged locks (no possibility of a deadlock).
  • There is no evidence of a "runaway" thread (not stuck on an infinite loop).
  • The graph's final filter is a GDCL bridge box, which then bridges the decoded samples to an MP4 muxer box.
  • The demuxer video output is connected to an instance of ffdshow decoder filter. The demuxer audio output is connected to an instance of lav audio decoder filter.

Am I right to suspect the problem could be inside either the ffdshow or the LAV filter? (Who else could be holding demuxer buffers?)

Any pointers or suggestions on how I can trace why the buffer pool inside the demuxer is exhausted?

Answer by Roman Ryltsov (best answer):

It looks like the memory allocator on a certain pin connection has all of its buffers in use, held by external references, so the thread went to sleep waiting for a buffer to be returned for recycling.

This is expected behavior; the problem is either too few buffers or excessive referencing.
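To make the mechanism concrete, here is a minimal, portable C++ model of a fixed-size pool. It is only a sketch and not DirectShow code: `FixedPool`, `try_get` and `starves_until_release` are hypothetical names, and where the real allocator in the stack trace waits for a buffer, the model simply returns an empty pointer so the starvation can be observed.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <mutex>
#include <vector>

// Hypothetical model of a fixed-size sample pool; not the DirectShow allocator.
class FixedPool {
public:
    using Storage = std::vector<unsigned char>;
    using Buffer = std::shared_ptr<Storage>;

    FixedPool(std::size_t count, std::size_t bytes)
        : state_(std::make_shared<State>()) {
        for (std::size_t i = 0; i < count; ++i)
            state_->idle.push_back(std::make_unique<Storage>(bytes));
    }

    // Returns an idle buffer, or an empty pointer when the pool is exhausted.
    // (A blocking allocator would wait at this point instead of returning.)
    Buffer try_get() {
        std::shared_ptr<State> state = state_;
        Storage* raw = nullptr;
        {
            std::lock_guard<std::mutex> guard(state->m);
            if (state->idle.empty()) return nullptr;
            raw = state->idle.back().release();
            state->idle.pop_back();
        }
        // The deleter recycles the buffer once the last reference is dropped.
        return Buffer(raw, [state](Storage* p) {
            std::lock_guard<std::mutex> guard(state->m);
            state->idle.emplace_back(p);
        });
    }

private:
    struct State {
        std::mutex m;
        std::vector<std::unique_ptr<Storage>> idle;
    };
    std::shared_ptr<State> state_;
};

// Usage sketch: a downstream consumer keeps both samples referenced.
bool starves_until_release() {
    FixedPool pool(2, 188);          // 188 bytes = one TS packet; arbitrary here
    auto a = pool.try_get();
    auto b = pool.try_get();
    assert(a && b);
    bool starved = !pool.try_get();  // every buffer is still referenced
    a.reset();                       // the consumer finally releases one sample
    bool recovered = static_cast<bool>(pool.try_get());
    return starved && recovered;
}
```

In this model the third request fails only because both buffers are still referenced elsewhere; as soon as one reference is dropped, the next request succeeds.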

You seem to be able to identify the pin connection from the call stack, so you could either increase the number of buffers or provide a custom memory allocator that expands on demand.
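The "expands on demand" idea can be sketched the same way. This is again a portable model under stated assumptions, not an `IMemAllocator` implementation: `GrowingPool`, its `limit` cap and the `total()` counter are hypothetical, and the cap is added here only so that a consumer which never releases samples cannot grow memory without bound.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <mutex>
#include <vector>

// Hypothetical model of a pool that grows on demand; not a DirectShow class.
class GrowingPool {
public:
    using Storage = std::vector<unsigned char>;
    using Buffer = std::shared_ptr<Storage>;

    GrowingPool(std::size_t initial, std::size_t limit, std::size_t bytes)
        : state_(std::make_shared<State>()) {
        state_->limit = limit;
        state_->bytes = bytes;
        state_->idle.reserve(limit);  // recycling never has to reallocate
        for (std::size_t i = 0; i < initial && i < limit; ++i) {
            state_->idle.push_back(std::make_unique<Storage>(bytes));
            ++state_->total;
        }
    }

    // Reuses an idle buffer if there is one, otherwise allocates a new one.
    // Returns an empty pointer only once the hard limit has been reached.
    Buffer get() {
        std::shared_ptr<State> state = state_;
        std::unique_ptr<Storage> buf;
        {
            std::lock_guard<std::mutex> guard(state->m);
            if (!state->idle.empty()) {
                buf = std::move(state->idle.back());
                state->idle.pop_back();
            } else if (state->total < state->limit) {
                buf = std::make_unique<Storage>(state->bytes);
                ++state->total;
            } else {
                return nullptr;
            }
        }
        // The deleter recycles the buffer once the last reference is dropped.
        return Buffer(buf.release(), [state](Storage* p) {
            std::lock_guard<std::mutex> guard(state->m);
            state->idle.emplace_back(p);
        });
    }

    // Number of buffers allocated so far (idle plus in use).
    std::size_t total() const {
        std::lock_guard<std::mutex> guard(state_->m);
        return state_->total;
    }

private:
    struct State {
        std::mutex m;
        std::vector<std::unique_ptr<Storage>> idle;
        std::size_t total = 0;
        std::size_t limit = 0;
        std::size_t bytes = 0;
    };
    std::shared_ptr<State> state_;
};

// Usage sketch: a burst needs three buffers while the pool started with two.
std::size_t buffers_after_burst() {
    GrowingPool pool(2, 8, 188);
    auto a = pool.get();
    auto b = pool.get();
    auto c = pool.get();  // no idle buffer left, so the pool grows
    assert(a && b && c);
    return pool.total();
}
```

Growing the pool only hides the symptom if some filter never releases its samples, so the cap also acts as a cheap detector: hitting it suggests excessive referencing rather than a burst.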

The easiest case is when your own filter is part of the connection: you can influence the allocator during the negotiation phase, either by supplying allocator requirements or by updating the allocator properties directly. In more complicated cases you could locate the existing connection and change its properties before the graph goes active. In the most complicated cases you could insert a no-op filter of your own into the processing chain, purely to sit in between and get direct access to the effective allocator.