How do PowerVR GPUs provide a depth buffer?

2.1k views Asked by At

iOS devices use a PowerVR graphics architecture. The PowerVR architecture is a tile-based deferred rendering model. The primary benefit of this model is that it does not use a depth buffer.

However, I can access the depth buffer on my iOS device. Specifically, I can use an offscreen Frame Buffer Object to turn the depth buffer into a color texture and render it.

If the PowerVR architecture doesn't use a depth buffer, how is it that I'm able to render a depth buffer?

3

There are 3 answers

0
jesusgumbau On BEST ANSWER

It is true that a tile-based renderer doesn't need a traditional depth buffer in order to work.

TBR split the screen in tiles and completely renders the contents of this tile using fast on-chip memory to store temporary colors and depths. Then, when the tile is finished, the final values are moved to the actual framebuffer. However, depth values in a depth buffer are traditionally temporary because they are just used as a hidden surface algorithm. Then depth values in this case can be completely discarded after the tile is rendered.

That means that effectively tile-based renderers don't really need a full screen depth buffer in slower video memory, saving both bandwidth and memory.

The Metal API easily exposes this functionality, allowing you to set the storeAction of the depth buffer to 'don't care' value, meaning that it will not back up the resulting depth values into main memory.

The exception to this case is that you may need the depth buffer contents after rendering (i.e. for a deferred renderer or as a source for some algorithm that operates with depth values). In that case the hardware must ensure that the depth values are stored in the framebuffer for you tu use.

2
Tommy On

Tile-based deferred rendering — as the name clearly says — works on a tile-by-tile basis. Each portion of the screen is loaded into internal caches, processed and written out again. The hardware takes the list of triangles overlapping the current tile and the current depth values and from those comes up with a new set of colours and depths and then writes those all out again. So whereas a completely dumb GPU might do one read and one write to every relevant depth buffer value per triangle, the PowerVR will do one read and one write per batch of geometry, with the ray casting-style algorithm doing the rest in between.

It isn't really possible to implement OpenGL on a 'pure' tile-based deferred renderer because the depth buffer needs to be readable. It also generally isn't efficient to make the depth buffer write only because the colour buffer is readable at any time (either explicitly via glReadPixels or implicitly according to whenever you present the frame buffer as per your OS's mechanisms), meaning that the hardware may have to draw a scene, then draw more onto it.

1
EugeneMi On

PowerVR does use a depth buffer, but in a different way than a regular(Immediate Mode Rendering) GPU

The differed part of Tile-based differed rendering means that triangles for a give scene are first processed (shaded, transformed clipped, etc. ) and saved into an intermediate buffer. Only after the entire scene is processed the tiles are rendered one by one.

Having all the processed triangles in one buffer allows the hardware to perform hidden surface removal - removing the triangles that will end up being hidden/overdrawn by other triangles. This significantly reduces the number of rendered triangles, resulting in improved performance and reduced power consumption.

Hidden surface removal typically uses something called a Tab Buffer as well as a depth buffer. (Both are small on-chip memories as they store a tile at a time)

Not sure why you're saying that PowerVR doesn't use a depth buffer. My guess is that it is just a "marketing" way of saying that there is not need to perform expensive writes and reads from system memory in order to perform depth test.

p.s

Just to add to Tommy's answer: the primary benefits of tile based differed rendering are:

  1. Since fragments are processed a tile at a time all color/depth/stencil buffer read and writes are performed from a fast on-chip memory. While the color buffer still has to be read/written to system memory ones per tile, in many cases the depth and stencil buffers need to be written to system memory only if it is required for later use(like your user case). System memory traffic is a significant source of power consumption... so you can see how it reduced power consumption.
  2. Differed rendering enables hidden surface removal. Less rendered triangles means less fragments processing, means less texture memory access.