Why would OpenGL ES 2.0 leak graphics memory on Android, but not on iOS, with the same code?


I have an Android NDK based game which we build partially off of parts of an open source engine (Cocos2d-x). Most of the engine is custom, but for things like OpenGL context setup, Java bridging, and events, we found it convenient to use an off the shelf solution.

Fast forward to today: cocos2d-x is no longer maintained, and we are having some serious stability issues. I have spent about a week debugging this and have not been able to track it down. Here is what I've tried:

  • Compiling with the old NDK it came with
  • Lowering the SDK version as far as it would allow
  • Using HWASan
  • Setting CheckJNI
  • Attempting to audit all of the JNI bridge code

The way in which it crashes varies greatly

  • When attached to the debugger, it crashes within the first second, almost always in the Android runtime, and often around garbage collection or in the JNI bridge
  • When not attached to the debugger, sometimes it would be stable.
  • When HWASan is attached, it would occasionally crash in the ANGLE library if OpenGL is set to use it on the phone
  • Sometimes the graphics memory will simply leak and eventually lead to a crash, and other times it will not leak at all

I would also like to say that the app is very stable on all other platforms besides Android, and 95% of our code is in the shared C++. I even commented out booting our own C++ code entirely, and the game still crashed on a black screen with just the out-of-the-box engine code.

I am considering taking the leap and porting to SDL so the engine will at least be supported, but if anyone has any tips on tracking this sort of thing down, I would be very happy to hear them.

UPDATE:

I believe the crash when attached to the debugger may be a separate issue with the engine setup. For production users, the issue seems to be that OpenGL graphics memory, from either framebuffers or textures, is not being released back to the system. I should note that the code is the same as on iOS, which does not have this problem, so it must have something to do with the shared context or how the drivers work. I will accept any answer that solves only this problem.

CONCLUSION: Android development is terrible. Some draw calls unrelated to textures would cause leaks simply because they were made on the background thread.


There are 3 answers

0
David On BEST ANSWER

OK, I continued to struggle with this. The solution is not at all satisfactory to me, but I thought I would come back and share it so that anyone who ends up in this situation has something to try. For me, putting a glFinish() before the usleep() call in the background thread made it actually give the memory back instead of leaking. This feels like a bad-driver sort of issue, but it definitely stopped the leak even in very complex cases, so I will take that as a solution.
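For anyone wanting to try the same thing, here is a rough sketch of the ordering described above: finish the queued GL work first, then sleep. The names `FinishThenSleep` and `BackgroundPass` are made up for illustration and are not from the engine. The GL call is passed in as a callable so the snippet compiles without GLES headers; on a device, with a context current on the background thread, you would pass `[] { glFinish(); }`. `std::this_thread::sleep_for` is used as a portable stand-in for `usleep()`.

```cpp
#include <cassert>
#include <chrono>
#include <functional>
#include <thread>

// Hypothetical helper, not from the original post.
// `finish` stands in for glFinish(); it must be a non-empty callable.
inline void FinishThenSleep(const std::function<void()>& finish,
                            std::chrono::microseconds nap)
{
    assert(finish);
    finish();                          // wait until queued GL commands have completed
    std::this_thread::sleep_for(nap);  // portable stand-in for usleep()
}

// Hypothetical single pass of a background loop: do the GL work, then idle.
inline void BackgroundPass(const std::function<void()>& glWork,
                           const std::function<void()>& finish)
{
    assert(glWork);
    glWork();  // e.g. texture uploads on the shared context
    FinishThenSleep(finish, std::chrono::microseconds(1000));
}
```

Whether this helps on a given device probably depends on the driver, as the answer itself suspects; glFinish() also stalls the calling thread, so it is best kept off the render thread.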

1
KiraHoneybee On

Sort of a long shot, but are you using multiple threads, e.g. a loading thread? There are various things in Android that HAVE to run on the main thread, and if you don't do that, you get exactly this sort of "sometimes it works, mostly it doesn't" situation.
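One common way to keep thread-bound work on the thread that owns it is to post it as a task and let the owning thread drain the queue once per frame. The sketch below is only an illustration in plain C++; `OwnerThreadQueue` is a hypothetical name and not part of cocos2d-x or the Android SDK. On the Java side, `GLSurfaceView.queueEvent()` does a similar job for the GL thread.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <mutex>
#include <utility>
#include <vector>

// Hypothetical queue: any thread may Post(); only the owning thread calls Drain().
class OwnerThreadQueue
{
public:
    void Post(std::function<void()> task)
    {
        assert(task);  // empty callables are a programming error
        std::lock_guard<std::mutex> guard(mMutex);
        mPending.push_back(std::move(task));
    }

    // Runs every pending task on the calling thread and returns how many ran.
    std::size_t Drain()
    {
        std::vector<std::function<void()>> batch;
        {
            std::lock_guard<std::mutex> guard(mMutex);
            batch.swap(mPending);  // run outside the lock so tasks may Post() again
        }
        for (auto& task : batch) task();
        return batch.size();
    }

private:
    std::mutex mMutex;
    std::vector<std::function<void()>> mPending;
};
```

A loading thread would call `Post()` with the work, and the owning thread would call `Drain()` at a known point in its loop, so the work always runs on the same thread.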

I feel for you. Mobile OSes break old code approximately every two years without a speck of guilt, and if you're using something that doesn't get updated, you basically have to get the source and maintain it yourself. We eventually decided to take two months and write our own framework.

1
KiraHoneybee On

Okay, so adding a new answer to be specific to OpenGL: We had a similar problem to you where we lost our core code framework to mobile updates and had to roll our own. Here's what I know about OpenGL with strange crashes:

On Android, you can make GL calls from different threads, but keeping them from interfering with each other turned out to be a HUGE deal. We ended up implementing these three functions only for our Android port:

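    // gGraphicLock (pthread_mutex_t), gGraphicLockInit (bool) and gOpenGLThreading (int)
    // are presumably globals declared elsewhere; requires <pthread.h>.

    // Take the recursive GL lock, creating it on first use.
    void Graphics_Core::ThreadLock() {if (!gGraphicLockInit) EnableThreadGraphics();pthread_mutex_lock(&gGraphicLock);}
    // Release the GL lock; does nothing if the lock was never created.
    void Graphics_Core::ThreadUnlock() {if (gGraphicLockInit) pthread_mutex_unlock(&gGraphicLock);}

    // Create the recursive mutex once. The flag check itself is not synchronized,
    // so call this from one thread before any worker thread starts.
    void Graphics_Core::EnableThreadGraphics()
    {
        if (!gGraphicLockInit)
        {
            pthread_mutexattr_t aMutexInfo;
            pthread_mutexattr_init(&aMutexInfo);
            pthread_mutexattr_settype(&aMutexInfo,PTHREAD_MUTEX_RECURSIVE);
            pthread_mutex_init(&gGraphicLock, &aMutexInfo);
            pthread_mutexattr_destroy(&aMutexInfo); // attributes are no longer needed after init
            gGraphicLockInit=true;
        }
        gOpenGLThreading++;
    }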

We put ThreadLock()/ThreadUnlock() around all our blocks of OpenGL code that deal with ANYTHING that initializes anything in OpenGL. That means any time you set anything up, any time you change a texture's contents, any time you read data back from a texture, any time you create a shader: ANYTHING that creates or deletes something in OpenGL seems prone to crashing from thread interference.

Stuff like setting a texture, setting a shader, etc, seem to be okay without it. It's just the creation/deletion stuff. Though what we did was enclose EVERYTHING in ThreadLock()/Unlock() and then just gradually remove them one by one to see which would blow it up and which would not.
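If you go the lock-everything route, a scoped guard makes it harder to forget the unlock on an early return. This is just a sketch of the same idea and not our actual code: `GraphicsMutex`, `ScopedGraphicsLock`, and `CreateTextureLocked` are hypothetical names, `std::recursive_mutex` stands in for the recursive pthread mutex, and the counter stands in for a real creation call such as glGenTextures().

```cpp
#include <cassert>
#include <mutex>

// Hypothetical stand-in for the recursive pthread mutex.
inline std::recursive_mutex& GraphicsMutex()
{
    static std::recursive_mutex mutex;  // initialization is thread-safe since C++11
    return mutex;
}

// Locks on construction and unlocks on destruction, including on early returns.
class ScopedGraphicsLock
{
public:
    ScopedGraphicsLock() : mLock(GraphicsMutex()) {}

private:
    std::lock_guard<std::recursive_mutex> mLock;
};

// Hypothetical creation routine; the counter stands in for a real GL creation call.
inline int CreateTextureLocked(int& liveTextures)
{
    assert(liveTextures >= 0);
    ScopedGraphicsLock lock;  // held for the whole creation block
    return ++liveTextures;
}
```

Because the mutex is recursive, a locked function can call another locked function on the same thread without deadlocking, which matters when you start by wrapping everything.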

OpenGL is very "undefined" when using multiple threads, so keeping them from interfering with each other is a BFD.

Hope this helps! Android is the worst system I've ever worked on for debugging.