How can I properly manage data in modern OpenGL while considering performance?


In modern OpenGL (3.x+), you create buffer objects which contain vertex attributes such as positions, colors, normals, and texture coordinates, as well as indices.

These buffers are then assigned to a corresponding vertex array object (VAO) which essentially contains pointers to all of the data as well as the data's format.

There are many tutorials out there on how to create a VAO and how to use it; unfortunately, it isn't clear how VAOs should be used in larger applications or games.

For example, a game might contain many 3D models, and it seems appropriate to separate each model by a different VAO.

On the other hand, a particle system contains many disconnected primitives traveling independently of one another. In this scenario, using a single VAO per system might improve performance in CPU-GPU transfers. However, each primitive needs its own translation, so it might also seem viable to give each particle its own very tiny VAO.

Question:

  • For a large quantity of small data sets (such as a particle system of quads), should all of the data be packed into 1 VAO or divided into many VAOs? What are the performance benefits/drawbacks of each method?

Assuming 1 VAO is used, the only apparent way to translate each independent sub-unit of data is to modify the actual position information and re-upload it to the GPU. Doing this many times is costly in terms of time.

Assuming many VAOs are used, the GPU must store duplicate formatting information for each VAO. This seems to be costly in terms of space (but I'm not sure if this is necessarily slow).

Side-Note:
Yes, I'm personally interested in managing a particle system. To keep this question more generic, and more useful for others, I am asking about VAO management as a whole. I am curious what management methods are more suitable vs others when considering the type of data being stored and when considering what type of performance is desired (time/space).


VAO creation is described well here:


There are 2 answers

Lloyd Crawley On BEST ANSWER

In the case of particles it would be best to use instanced rendering, where you render all the particles in a single draw call but assign a different position to each one as a per-instance attribute. You can update an existing buffer using glBufferSubData. That way you can compute the new positions on the CPU side between frames and then upload them into the buffer.

In more complex examples you can instance whichever attributes you want to.
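As a rough sketch of what instancing several attributes could look like, the per-instance data can live in one interleaved record, with one attribute pointer per field. The InstanceData struct, the AttribLayout table and the instanceLayout helper below are hypothetical names made up for illustration, and the attribute locations are assumed to match whatever the shader declares. The GL calls are shown only as comments because they need a live context.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical per-instance record; the fields are illustrative only.
struct InstanceData {
    float position[3];
    float color[4];
    float scale;
};

// One entry per vertex attribute that is read from the instance buffer.
struct AttribLayout {
    unsigned int index;   // attribute location (assumed to match the shader)
    int components;       // number of floats in the attribute
    std::size_t offset;   // byte offset of the field inside InstanceData
};

// Builds the layout table, using consecutive locations starting at firstIndex.
std::vector<AttribLayout> instanceLayout(unsigned int firstIndex) {
    std::vector<AttribLayout> layout = {
        {firstIndex + 0, 3, offsetof(InstanceData, position)},
        {firstIndex + 1, 4, offsetof(InstanceData, color)},
        {firstIndex + 2, 1, offsetof(InstanceData, scale)},
    };
    // Sanity check: the last field ends exactly at the end of the record.
    const AttribLayout& last = layout.back();
    assert(last.offset + last.components * sizeof(float) == sizeof(InstanceData));
    return layout;
}

// With the VAO and the instance VBO bound, each entry `a` would translate to:
//   glEnableVertexAttribArray(a.index);
//   glVertexAttribPointer(a.index, a.components, GL_FLOAT, GL_FALSE,
//                         sizeof(InstanceData),
//                         reinterpret_cast<const void*>(a.offset));
//   glVertexAttribDivisor(a.index, 1);  // advance once per instance
```

The stride is sizeof(InstanceData) for every attribute, so a single glBufferSubData call can refresh all per-instance fields at once.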

The way I call instanced rendering and set it up in my code is as follows:

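// Adds a per-instance vec3 attribute (here: a position) to an existing VAO.
void CreateInstancedAttrib(unsigned int attribNum, GLuint VAO, GLuint& posVBO, int numInstances) {
    glBindVertexArray(VAO);
    // CreateVertexArrayBuffer is my own helper; it allocates room for numInstances vec3s.
    posVBO = CreateVertexArrayBuffer(0, sizeof(vec3), numInstances, GL_DYNAMIC_DRAW);
    glEnableVertexAttribArray(attribNum);
    // glVertexAttribPointer captures whichever buffer is bound to GL_ARRAY_BUFFER,
    // so the helper above must leave posVBO bound.
    glVertexAttribPointer(attribNum, 3, GL_FLOAT, GL_FALSE, sizeof(vec3), 0);
    // A divisor of 1 advances the attribute once per instance instead of once per vertex.
    glVertexAttribDivisor(attribNum, 1);
    glBindVertexArray(0);
}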

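Here posVBO receives the buffer that holds the per-instance positions, and the calls after it describe that buffer's layout to the VAO; the divisor of 1 makes the attribute advance once per instance. When rendering: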

void RenderInstancedStaticMesh(const StaticMesh& mesh, MaterialUniforms& uniforms,const vec3* positions){

    for (unsigned int meshNum = 0; meshNum < mesh.m_numMeshes; meshNum++){

        if (mesh.m_meshData[meshNum]->m_hasTexture){
            glBindTexture(GL_TEXTURE_2D, mesh.m_meshData[meshNum]->m_texture);
        }

        glBindVertexArray(mesh.m_meshData[meshNum]->m_vertexBuffer);
        glBindBuffer(GL_ARRAY_BUFFER, mesh.m_meshData[meshNum]->m_instancedDataBuffer);
        glBufferSubData(GL_ARRAY_BUFFER,0, sizeof(vec3) * mesh.m_numInstances, positions);
        glUniform3fv(uniforms.diffuseUniform, 1, &mesh.m_meshData[meshNum]->m_material.diffuse[0]);
        glUniform3fv(uniforms.specularUniform, 1, &mesh.m_meshData[meshNum]->m_material.specular[0]);
        glUniform3fv(uniforms.ambientUniform, 1, &mesh.m_meshData[meshNum]->m_material.ambient[0]);
        glUniform1f(uniforms.shininessUniform, mesh.m_meshData[meshNum]->m_material.shininess);
        glDrawElementsInstanced(GL_TRIANGLES, mesh.m_meshData[meshNum]->m_numFaces * 3, 
                                GL_UNSIGNED_INT, 0,mesh.m_numInstances);
    }
    glBindBuffer(GL_ARRAY_BUFFER, 0);
    glBindVertexArray(0);
}

That's a lot to take in, but the important calls are glDrawElementsInstanced and glBufferSubData. If you search for both functions, I'm sure you will come to understand how instanced rendering works. If you have any more questions, please ask.
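For the CPU-side update between frames, a minimal sketch could look like the following. It assumes a simple Vec3 struct standing in for the vec3 type used above, and the advanceParticles and uploadSizeBytes helpers are hypothetical names, not part of the code shown earlier. The idea is only that all positions sit in one contiguous array, so one upload covers the whole system.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Vec3 { float x, y, z; };  // stand-in for the vec3 type (assumption)

// Moves every particle by velocity * dt, in place, in one contiguous array.
void advanceParticles(std::vector<Vec3>& positions,
                      const std::vector<Vec3>& velocities, float dt) {
    assert(positions.size() == velocities.size());
    for (std::size_t i = 0; i < positions.size(); ++i) {
        positions[i].x += velocities[i].x * dt;
        positions[i].y += velocities[i].y * dt;
        positions[i].z += velocities[i].z * dt;
    }
}

// Number of bytes to upload for the whole particle system in one call.
std::size_t uploadSizeBytes(const std::vector<Vec3>& positions) {
    return positions.size() * sizeof(Vec3);
}

// Per frame, with the instance VBO bound to GL_ARRAY_BUFFER:
//   advanceParticles(positions, velocities, dt);
//   glBufferSubData(GL_ARRAY_BUFFER, 0,
//                   uploadSizeBytes(positions), positions.data());
```

The result is one upload and one draw call per frame for the whole system, instead of one of each per particle.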

datenwolf On

The general rule is that you want to minimize the number of draw calls. If you put things into individual VAOs, you have to perform a draw call for each VAO. Switching between VAOs and VBOs also comes with a cost. Don't think of VAOs and VBOs as "model" containers, but as memory pools, where each VBO / VAO should be used to coalesce data with identical properties.
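One way to read the "memory pool" idea is sketched below, for meshes that share a vertex format: they are appended to one shared vertex array and one shared index array, and each mesh only remembers where its data starts. The Vertex layout, MeshRange and MeshPool are hypothetical names invented for this illustration. Drawing a sub-range this way relies on glDrawElementsBaseVertex, which is core from OpenGL 3.2, and is shown only in a comment.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Illustrative vertex format shared by every mesh in the pool (assumption).
struct Vertex { float position[3]; float normal[3]; float uv[2]; };

// Where one mesh lives inside the shared arrays.
struct MeshRange {
    std::int32_t baseVertex;  // added to every index when drawing
    std::size_t firstIndex;   // position of the first index in the pool
    std::int32_t indexCount;  // number of indices to draw
};

class MeshPool {
public:
    // Appends a mesh and returns the range needed to draw it later.
    MeshRange add(const std::vector<Vertex>& verts,
                  const std::vector<std::uint32_t>& idx) {
        for (std::uint32_t i : idx) assert(i < verts.size());
        MeshRange range{static_cast<std::int32_t>(vertices_.size()),
                        indices_.size(),
                        static_cast<std::int32_t>(idx.size())};
        vertices_.insert(vertices_.end(), verts.begin(), verts.end());
        indices_.insert(indices_.end(), idx.begin(), idx.end());
        return range;
    }
    const std::vector<Vertex>& vertices() const { return vertices_; }
    const std::vector<std::uint32_t>& indices() const { return indices_; }

private:
    std::vector<Vertex> vertices_;        // uploaded once into a single VBO
    std::vector<std::uint32_t> indices_;  // uploaded once into a single index buffer
};

// With the pool's single VAO bound, a mesh range `r` would be drawn as:
//   glDrawElementsBaseVertex(GL_TRIANGLES, r.indexCount, GL_UNSIGNED_INT,
//       reinterpret_cast<const void*>(r.firstIndex * sizeof(std::uint32_t)),
//       r.baseVertex);
```

Because the indices stay mesh-local and baseVertex shifts them at draw time, meshes can be appended without rewriting their index data, and the VAO only has to be bound once for everything in the pool.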

A particle system is the perfect candidate for putting everything into a single VBO/VAO. In the usual case you would use instanced rendering, with the VBO containing the information about where to place each particle.