Beta 5 H.264 Encoder Memory Leak

Does anybody see a memory leak when using the H.264 Encoder in DirectShow? I noticed it in my application: while the graph ran, memory increased by 4 to 8 K every second. For the past two days I've been trying to isolate the issue. I have encoding graphs for recording video as well as playback graphs, and the playback graphs work without any leaking. It's definitely not my code, because I just confirmed it in GraphEdit by connecting a simple graph:

video source - avi decompress - h264 encoder - dvd decoder - evr

I've tried more advanced graphs, such as file-writing graphs, and different decoders and capture sources. The leak continues to happen. However, if I then go into the encoder's property pages and set the preset to "Normal", the leak suddenly goes away: the process memory (in Task Manager) increases a bit, then decreases, and keeps doing this but never goes over a certain limit, which tells me it's releasing COM objects successfully.

Does anybody else see this? I just upgraded to beta 5 hoping it would fix it; before that I was using beta 4. Thoughts, suggestions, concerns, ideas?

Just so that we can establish environment baseline, could you please provide some more info:

- What is the default Encoder preset you use?
- Are you running the filters on a machine that supports encode HW acceleration?
- Have you made any changes to the encoder filter or are you using the default sample implementation?

Regards,
Petter

I did the isolated example using just GraphEdit: I pulled the Intel Media SDK H.264 encoder into the graph, and the default preset is User Defined. Target Usage is Balanced and bitrate is 6000. If I switch it to the Normal preset, the leak goes away. However, in the application the end user will be able to select from the presets and categories the filter provides, and the default settings are all set to automatic for simplicity on the user's end.

I am running the filters on a machine that supports HW acceleration and tried it on a second machine to confirm. Just to confirm, is this an automatic flag or do I have to set something to enable hardware acceleration? I do notice the CPU jumping when encoding... could be the issue?

I have not made any changes to the encoder. I am using the beta 5 h264_enc_filter.dll and ultimately pulling it into a DirectShow graph via C#.

Some more information, not sure if it's relevant: I have a dedicated NVIDIA 430. Would this interfere with Intel Media? If so, note that the second computer I tested on does not have a dedicated card and has just Intel graphics.

Ah, on my development machine with both the NVIDIA graphics and Intel graphics, HW is disabled. I just confirmed on a release machine with only Intel graphics that HW is enabled (encode partially accelerated is disabled). However, even though hardware is enabled there, I'm still seeing this leak in memory. Hmm?

I did look into the issue you reported and I have identified a memory leak in the Media SDK DirectShow encoder sample. The issue does not seem specifically tied to HW or SW encoding. We are looking into a solution and will provide more info to you as soon as possible.

This thread is a little bit outdated. However, after working with the gold release for a couple of months and getting ready to release a product, I'm still noticing a weird buildup in memory when using the encoder. This can be reproduced in GraphEdit alone: if you start the graph, stop it, and repeat the process several times, the memory footprint goes up. Eventually Windows will display a warning message about memory consumption.

Here's the current scenario: we use the encoder in the medical workspace, and our installer is currently onsite and leaves Saturday morning, which gives me today and tomorrow to apply a fix.

It is crucial that we get this working, because right now the memory usage, when it gets too high, starts to affect the overall performance of our application and breaks our MP4 muxer, thus ending a recording too soon: a 1-hour recording may only have 20 minutes. In a surgery this is bad!

Anything you can do to speed up a fix for the memory leak would be greatly appreciated. Thanks, Eric.

Petter and I have been trying to debug this since yesterday afternoon. Unfortunately, it's not really apparent where the leak is coming from. We continue to debug.

I certainly can provide the 2.0 DShow encode sample. I confirmed that this filter does not have the same memory leak. However, there are some limitations with this sample as well: the 2.0 sample does not have YUY2 input pin support. We added that with 3.0.

We are continuing to look at the problem today, and I hope to have something to say later today.

Hi Martin,

We are debugging the leak and have a question: are you using the MSDK HW or SW library in your experiment?

We have found a source of a small leak. It's the check in the CBaseEncoder::InternalClose() method:

    if (!jt->pmfxSurface->Data.Locked)
    {
        MSDK_SAFE_DELETE(jt->pmfxSurface);
        jt->pSample->Release();
    }

The HW encoder may not free all the surfaces on the Close call. This is a bug in the MSDK dll which we will investigate and schedule to get fixed. The check can be removed so that the delete/free happens in any case; it's just an additional sanity check of MSDK encoder behavior.

But it seems that it's not the main source of the leak, as we still see it if we remove the check. We continue working on that.

The line which you have asked about - "//mark etc." only means that this flag tells that there is no need to store the surface pointer and the Receive function must free the memory when it exits.

Regards,
Nina

Thank you for the update. I have been testing with hardware acceleration only, as all our machines are outfitted with Intel boards. If we can get rid of the majority of the leak, that should buy us more time before the final fix is implemented.

Hi Marty,

The summary is that the problem I mentioned yesterday is the actual and likely single reason for the leak. It just wasn't trivial to confirm that the fix helps, that's why I hesitated.

Can you please try the fix on your side? Comment out the assert and condition as below at line 183 of base_encoder.cpp:

    //assert(!jt->pmfxSurface->Data.Locked); // all surface should be unlocked
    // if (!jt->pmfxSurface->Data.Locked)
    //{
        MSDK_SAFE_DELETE(jt->pmfxSurface);
        jt->pSample->Release();
    // }

What I see in my experiments after the fix:

1) file - avi splitter - intel enc - dvd dec - evr
Private memory of gedit.exe fluctuates among the values 113, 116, 119, 122, going up or down with each play-stop. I think it is connected with the decoder and EVR somehow, as in 2) the situation is different.

2) file - avi splitter - intel enc - dump
Private memory stays almost constant at 63-something MB (without the fix it would grow).

Please let me know if this helps. We will continue more detailed verification.

Regards,
Nina
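To illustrate why the guarded cleanup can leak, here is a minimal, self-contained sketch. It does not use the Media SDK: FakeSurface, FakeSample, SurfaceEntry, CloseGuarded and CloseUnconditional are all hypothetical stand-ins for the sample's surface list, and the assumption that the list is cleared after the loop is mine, not taken from base_encoder.cpp.

```cpp
#include <cassert>
#include <cstddef>
#include <list>

// Hypothetical stand-in for the SDK surface; 'locked' mirrors
// pmfxSurface->Data.Locked in the sample filter.
struct FakeSurface {
    bool locked;
};

// Hypothetical stand-in for a reference-counted DirectShow media sample.
struct FakeSample {
    int refCount;
    void Release() { --refCount; }
};

struct SurfaceEntry {
    FakeSurface* surface;
    FakeSample*  sample;
};

// Guarded cleanup, modelled on the original check: entries whose surface is
// still flagged as locked are skipped, so neither the surface nor the sample
// reference is released. Returns the number of skipped (leaked) entries.
std::size_t CloseGuarded(std::list<SurfaceEntry>& entries)
{
    std::size_t skipped = 0;
    for (SurfaceEntry& e : entries) {
        if (!e.surface->locked) {
            delete e.surface;
            e.surface = nullptr;
            e.sample->Release();
        } else {
            ++skipped;
        }
    }
    entries.clear(); // assumption: pointers are dropped after the loop
    return skipped;
}

// Unconditional cleanup, modelled on the suggested fix: every entry is
// released regardless of the lock flag. Returns the number released.
std::size_t CloseUnconditional(std::list<SurfaceEntry>& entries)
{
    std::size_t released = 0;
    for (SurfaceEntry& e : entries) {
        assert(e.surface != nullptr);
        delete e.surface;
        e.surface = nullptr;
        e.sample->Release();
        ++released;
    }
    entries.clear();
    return released;
}
```

With one locked surface in the list, CloseGuarded leaves that surface and its sample reference behind on every stop, which matches the per-stop growth described in the thread; CloseUnconditional releases everything.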

I just tested the fix. I still see memory left over after stopping the graph; however, the footprint is smaller. I will have my tester run through some scenarios, such as 500 recordings at 5-second intervals. Previously it crashed somewhere between 220 and 240 recordings. I'll keep you posted as to how many recordings we achieve.

Hi Marty,

Thank you for such a quick feedback. It's good that there's at least a change. Can you tell what was the footprint before the fix and now - in size? And which chain do you test - cam-enc-mux-filewriter?

Also, what I saw after the fix was that memory size fluctuated - increasing or decreasing after a play-stop (I monitored over 10 iterations).

I'll keep looking for more issues, but the problem is that I don't see a leak any more - only fluctuation, which is normal I believe.

Regards,
Nina

While running the graph it would fluctuate, growing a bit in size and shrinking but never shrinks to its smallest size, and gradually grows... when stopping, starting, and stopping, etc... I see a 2000K footprint on every stop.

My first test was with the application itself, but after testing with graphedit, it looks a lot better. Maybe I didn't copy the right dll to our project... I'll have to take another look to confirm, keep you posted.

What I had noticed is that in graphedit, when the application does run, depending on when you stop (based on the high and low of the fluctuation), the footprint can be anywhere from 0 to 2Mb which might explain the results of graphedit on my development machine.

I used the same graph as described before. I believe the only difference between my graph and yours is the video source, mine being a DeckLink card, but we use the DeckLink in many applications and don't see any issues with memory consumption using this filter.

I will continue to see what I can find.

Cheers.

EDIT... the 500 recordings test got up to 251 recordings, new high score but still no cigar.

The leak is inside the MSDK library HW VPP component: m_mfxVPP.Init allocates more than m_mfxVPP.Close frees. A fix in the HW MSDK dll is required. A workaround (WA) at the filter level could be to create/delete the m_mfxVPP object in BaseEncoder.InternalReset/InternalClose instead of calling Init/Close. There are other possible WAs; I will come back with more details tomorrow.

Nina

Hmm, I'm a bit confused about this workaround as I don't actually create a video-preprocessing object. I'm working solely in C# and calling the Encode filter through its GUID on a COM import attribute. I do release the object after the graph is built to conserve memory but I don't see any known functions based on the interface I'm using through DirectShow to do an InternalReset/InternalClose.

Hi Marty,

I meant that there are mfxVPP and mfxEncode Media SDK core library components inside the Encoder Filter. mfxVPP has a leak. But we have already figured out that new/delete instead of Init/Close won't help, as the MFXVideoVPP class is simply a wrapper over C functions, with no actual destructor.

My suggestion for a workaround is the following:

1) I checked that AVI Decompressor can output YUY2 among other formats.
2) The Encoder Filter uses CSCPlugin (SW code) in the case of RGB32 top-bottom, and mfxVPP in the case of all other formats, to convert to NV12 for mfxEncode.
3) In your case mfxVPP is used and it causes a leak.
4) If you add YUY2->NV12 conversion to CSCPlugin and use the plugin instead of mfxVPP, you will avoid the leak.
5) This should not have a significant performance impact (SW conversion instead of HW), because as I have found out, the current filter code uses system memory surfaces between VPP and Encode, while HW components work on d3d surfaces natively. So extra copies occur in the current pipeline. If we remove the copies and use SW conversion instead of HW, there should even be a performance improvement.
6) I will send you an e-mail with suggestions on how to modify the filter source code.

Regards,
Nina
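For reference, the software YUY2->NV12 conversion mentioned in step 4 could look roughly like the sketch below. This is not the CSCPlugin code and does not use its interface (which is not shown in this thread); the function name Yuy2ToNv12, its parameters, and the choice to average the chroma of each pair of rows are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Convert a packed YUY2 (4:2:2) frame to a tightly packed NV12 (4:2:0) frame.
// YUY2 stores two pixels in four bytes: Y0 U Y1 V.
// NV12 stores a full-resolution Y plane followed by one interleaved U,V plane
// at half resolution in both directions.
// width and height must be even; srcPitch is the byte stride of one YUY2 row.
std::vector<std::uint8_t> Yuy2ToNv12(const std::uint8_t* src,
                                     std::size_t srcPitch,
                                     std::size_t width,
                                     std::size_t height)
{
    assert(src != nullptr);
    assert(width % 2 == 0 && height % 2 == 0);
    assert(srcPitch >= width * 2);

    std::vector<std::uint8_t> dst(width * height * 3 / 2);
    std::uint8_t* dstY  = dst.data();
    std::uint8_t* dstUV = dst.data() + width * height;

    // Process two source rows at a time, since they share one chroma row.
    for (std::size_t y = 0; y < height; y += 2) {
        const std::uint8_t* row0 = src + y * srcPitch;
        const std::uint8_t* row1 = row0 + srcPitch;
        std::uint8_t* outY0 = dstY + y * width;
        std::uint8_t* outY1 = outY0 + width;
        std::uint8_t* outUV = dstUV + (y / 2) * width;

        for (std::size_t x = 0; x < width; x += 2) {
            const std::uint8_t* p0 = row0 + x * 2; // Y0 U Y1 V, upper row
            const std::uint8_t* p1 = row1 + x * 2; // Y0 U Y1 V, lower row

            // Luma is copied unchanged.
            outY0[x]     = p0[0];
            outY0[x + 1] = p0[2];
            outY1[x]     = p1[0];
            outY1[x + 1] = p1[2];

            // Chroma is averaged vertically (with rounding) to go from
            // 4:2:2 to 4:2:0.
            outUV[x]     = static_cast<std::uint8_t>((p0[1] + p1[1] + 1) / 2);
            outUV[x + 1] = static_cast<std::uint8_t>((p0[3] + p1[3] + 1) / 2);
        }
    }
    return dst;
}
```

Averaging the two chroma rows is one simple choice; taking the chroma from every other row would be cheaper at some cost in quality. A real filter would also write into the encoder's surface with its own pitch rather than into a tightly packed vector.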

I compiled the base_encoder files into the solution and tested in software mode (hardware mode isn't even available now)... but I still see a leak... anywhere from 4 to 6 megs. That and the CPU throttles at about 98%.

I am already using MainConcept's MP4 muxer and have played with their software encoder a bit, which does not throttle that high, nor is there any leak whatsoever. So I have branched the project off into MainConcept's world until the hardware fix is in place without any leaks.

As soon as the fix is in place for hardware encoding I will switch back.

Hi Marty,

I think the problem with running on a HW system is that you haven't installed the SW MSDK dll (libmfxsw32/64.dll from the MSDK Gold package) there. The code I shared initializes an additional MSDK session with SW IMPL - it needs the SW dll to be in the dll search path. This session hosts SW VPP, which doesn't have a leak. The original session, initialized with IMPL_AUTO, will pick up the HW dll which was installed by the graphics driver. This HW session hosts HW Encode. If the SW dll is not in the system, the first session Init will fail.

Previously the HW session hosted both HW VPP and HW Encode, and HW VPP was causing the leak.

When I run on a SW system (no HD Graphics) and both sessions use the SW dll, I do see memory build up after the first several play-stops, but it quickly stabilizes, so it's not a leak.

Please try on a system with HD Graphics with the SW dll also installed (copied and added to either the PATH variable or the local app folder).

Regards,
Nina

Hi Marty,

The path should be C:\Program Files\Intel\Media SDK 2012\bin\Win32 for the 32-bit dll and C:\Program Files\Intel\Media SDK 2012\bin\x64 for the 64-bit dll.

You can run some simple sample from the package to check that MSDK 2012 is installed correctly and the SW dll can be found.

-Nina

What do you mean by HW or SW enabled? I assume you have a system with Intel HD Graphics and therefore MSDK HW dll installed by graphics driver. On the same system you install MSDK 2012 to make SW dll accessible. Is that correct?

Ok, I have another guess. Session join/disjoin may not be working because the 2 libraries have different API versions (HW - 1.1 and SW - 1.3). There are 2 options here:

1) Remove all the code about joining and disjoining and add a SyncOperation call after RunFrameVPPAsync. This way the sessions are independent and can be of any API version.
2) Install MSDK 2.0 instead of MSDK 2012, if you still have it.

Can you try?

To keep you posted, I have built a brand new release system that I can use for development and am just finishing installing some necessary SDKs. I will then be able to debug the constructor in HW mode to give you a better answer about what's happening.