The power of GPU acceleration for graphics isn’t just in getting images drawn really fast. If we use vertex shaders to move things around the world then we can offload a significant amount of per-frame calculation to the graphics card, which frees up the CPU for more interesting tasks like AI and physics.
As an example of the CPU time saved by using graphics hardware, here is virtually the same program embedded twice. The first version attempts to use your graphics hardware while the second is forced to use the fall-back software implementation. If you’re viewing this on a device that isn’t currently supported by Flash’s hardware acceleration then both versions will be using software rendering. You will of course need Flash Player 11 to see the demo.
If your results are anything like mine then the difference in performance, even with this relatively small-scale example, will be dramatic. That said, I’ve got to take a moment to say how impressed I am by the performance of the software implementation. It multi-threads and spreads itself across my CPU cores beautifully, and most importantly produces output which is, so far as I can tell, completely identical to that coming out of the graphics hardware version.
Perhaps more interesting than the raw increase in speed is the difference in CPU use. For me the hardware-accelerated version’s main loop measures at 0ms, while as you would expect the software version’s is much higher, at around 20ms. Not only is the hardware-accelerated version pumping out frames faster but it’s barely touching the CPU, which is left free for running the rest of the game.
Writing the shader for these particles was good fun. Stripped of comments, the final code is 54 lines (of a maximum of 200 allowed) of virtually indecipherable opcodes and registers, but the process it represents is simple enough.
The classic way to handle moving objects in game programming is to adjust their position each update cycle by adding on a value representing their velocity. This technique doesn’t work for objects that are to be handled entirely through AGAL shaders, as there is simply nowhere for the shader to store an updated position between frames.
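That classic stateful update might look something like this minimal Python sketch (the names and fixed timestep are illustrative, not from the original code); the key point is that position is mutable state carried over between frames, which a vertex shader has no way to keep:

```python
# Classic per-frame integration: position persists between updates.
# A vertex shader has no such persistent storage, so this approach
# can't be moved wholesale into an AGAL shader.
position = [0.0, 0.0, 0.0]
velocity = [1.0, 2.0, 0.0]
DT = 1.0 / 60.0  # illustrative fixed timestep

def update():
    # Each call depends on the position left behind by the last call.
    for i in range(3):
        position[i] += velocity[i] * DT
```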
Instead the shader needs to be able to produce the particle’s position when provided with the current time. So the process of writing the shader starts with finding an equation in terms of time that gives position (or rotation, colour, scale, or any other property of your particles that you want to have change). If the particles are moving at a constant velocity this is simple enough:
position = startPosition + startVelocity * time
Variables like startPosition and startVelocity will be passed into the shader through the vertex buffer (as they differ for each particle), while time will be passed in through the shader constants, as it is the same for all particles being drawn that frame. If you want the particles to undergo acceleration (such as due to gravity) then you can just add on a new term to account for that.
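The arithmetic the shader performs for the constant-velocity case can be sketched in a few lines of Python (the function and parameter names are mine, and plain tuples stand in for the vertex buffer and shader constant inputs):

```python
# Stateless evaluation: given only the shared time and the particle's
# starting values, produce its current position. No per-frame state
# is stored anywhere, which is what lets this run in a vertex shader.
def position_at(time, start_position, start_velocity):
    # position = startPosition + startVelocity * time, per component
    return tuple(p + v * time
                 for p, v in zip(start_position, start_velocity))
```

Because the result depends only on the inputs for the current frame, the GPU can evaluate it for every particle in parallel with nothing remembered between frames.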
position = startPosition + startVelocity * time + 0.5 * constantAcceleration * time^2
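Adding the acceleration term changes the sketch only slightly (again, the names are illustrative rather than taken from the shader):

```python
# Closed-form position under constant acceleration:
# p(t) = p0 + v0*t + 0.5*a*t^2, evaluated per component.
def position_at(time, start_position, start_velocity, acceleration):
    return tuple(p + v * time + 0.5 * a * time * time
                 for p, v, a in zip(start_position,
                                    start_velocity,
                                    acceleration))
```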
Effects like the particles bouncing once on reaching a certain y-value are a little more complex, but it still all comes down to producing an equation that gives the position of the particle in terms of time.
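The post doesn’t spell out the bounce equation, but one way to build such a function of time is to solve for the moment of impact up front and then switch to a reflected trajectory after it. This is a hypothetical Python sketch of that idea, assuming a single bounce off the plane y = 0, gravity as a negative constant, and a restitution factor that damps the rebound:

```python
import math

def bounce_y(t, y0, vy0, g=-9.8, restitution=0.5):
    """Closed-form y position with one bounce at y = 0.

    Assumes the particle starts above the plane (y0 > 0) and that
    gravity eventually brings it down (g < 0). All names and the
    restitution value are illustrative.
    """
    # Impact time: solve 0.5*g*t^2 + vy0*t + y0 = 0 for the root
    # where the particle is descending.
    disc = vy0 * vy0 - 2.0 * g * y0
    t_impact = (-vy0 - math.sqrt(disc)) / g
    if t < t_impact:
        # Before the bounce: ordinary constant-acceleration motion.
        return y0 + vy0 * t + 0.5 * g * t * t
    # After the bounce: restart from y = 0 with the impact velocity
    # reflected and scaled by the restitution factor.
    v_impact = vy0 + g * t_impact
    v_rebound = -restitution * v_impact
    dt = t - t_impact
    return v_rebound * dt + 0.5 * g * dt * dt
```

The branch would become a conditional-free select in AGAL, but the structure is the same: everything is derived from the per-particle starting values and the shared time, with no stored state.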
Not particularly related to particle systems, but I decided to reproduce an old normal-mapping example I made back when Pixel Bender was the closest thing to hardware acceleration that Flash had. This version is of course running entirely on the GPU.