A quick reference for working with AGAL, the new shader language for use with Stage3D introduced with Flash Player 11.
I’ve been working on a series of tutorials for Stage3D which aren’t yet ready, but with the recent release of Flash Player 11, now seems a good time to get at least some information out. If you’re new to Stage3D then this may not make much sense until you’ve read some other resources, but I hope it serves as a useful reference as you experiment and learn.
Each register consists of four components, which are floating point values. These components are accessed by registerName.x, registerName.y, registerName.z, registerName.w. They’re named for dealing with 3D positions, with ‘w’ as the fourth (homogeneous) coordinate, but they can just as well be used to hold a colour (in fact you can also access the components with .r .g .b .a) or any other values you want to use.
There are several registers of each type available. For instance you might have va0 giving the 3D position of a vertex in space, and va1 giving the uv mapping coordinate for that vertex. The nice thing about having registers made up of components is you can do things like perform a basic addition operation on va0 and vc0, and the addition will be performed correctly on each component.
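To make the component-wise behaviour concrete, here is a minimal sketch of a vertex shader that offsets every vertex by a constant. It assumes vc0 has been filled with an x, y, z offset and 0 in w, and that the shader is assembled with something like Adobe's AGALMiniAssembler, which as far as I know ignores anything after //.

```
add vt0, va0, vc0    // vt0.x = va0.x + vc0.x, and likewise for y, z and w
mov op, vt0          // output the offset position
```

A single add covers all four components, which is why no per-component bookkeeping is needed here.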
Registers for Vertex Shaders
- va[0 to 7] Vertex Attribute. The contents of the vertex buffer, as set with context3D.setVertexBufferAt. Each vertex has its own space in the vertex buffer which only it can access.
- vc[0 to 127] Vertex Constant. Passed into the shader with context3D.setProgramConstantsFromVector or context3D.setProgramConstantsFromMatrix. These registers can be read by all vertices, but cannot be written to by the shader.
- vt[0 to 7] Vertex Temporary. A handy temporary register where you can put values during a calculation.
- op Vertex Output or “Output Position”. The output: op.x and op.y determine where in the 2D space of the screen this vertex will be drawn. The op.z value is used for depth checking and writing to the depth buffer if you have those enabled. The op.w value is the homogeneous coordinate: the hardware divides x, y and z by it, which is what produces perspective when you use a projection matrix.
- v[0 to 7] Varying. A magical (not actually magical) register that allows you to pass values from the vertex shader to the fragment shader. The value that arrives in the fragment shader is interpolated between the values written to that v register by the three vertices which make up the triangle in which the fragment falls.
Note! The fragment shader cannot directly access the vertex buffer, so anything it needs from there has to be passed through the v registers. For instance if you’re using texture mapping this register will need to pass the uv coordinates to the fragment shader.
Registers for Fragment Shaders
- fc[0 to 27] Fragment Constant. Much like vc for the vertex shader, this register is set by context3D.setProgramConstantsFromVector or context3D.setProgramConstantsFromMatrix, can be read by each fragment, and cannot be written to by the shader.
- ft[0 to 7] Fragment Temporary. Again just like vt for the vertex shader, this is a temporary store useful for performing calculations.
- fs[0 to 7] Texture Sampler. This is where the fragment shader is able to access whatever texture(s) were bound using context3D.setTextureAt.
- oc Fragment Output or “Output Colour”. The output: oc.x oc.y oc.z oc.w are the red, green, blue and alpha values respectively for the fragment to be drawn.
Shaders are made up of a series of operations, with one operation on each line. First the operation to perform is identified by a three-letter opcode such as “add”, then the parameters for that operation are given. The parameters are (almost) always specified as registers. If you want to use a number in your shader, it should be supplied through the vc or fc registers.
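As a small illustration of supplying a number through a constant register, the fragment shader sketch below darkens a texture sample by half. It assumes fc0.x has been set to 0.5 from AS3 (for instance with context3D.setProgramConstantsFromVector), that UV coordinates arrive in v0, and that a texture is bound to fs0. The // comments assume an assembler such as AGALMiniAssembler that strips them.

```
tex ft0, v0, fs0 <2d,clamp,linear>   // sample the texture as usual
mul ft0.xyz, ft0.xyz, fc0.x          // the 0.5 is read from fc0.x rather than written as a literal
mov oc, ft0                          // output the darkened colour, alpha untouched
```

Changing the brightness then only needs a new value uploaded to fc0, not a new shader.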
In the parameters the target register is always specified first. The target register is where the result of the operation is placed. No change is made to a register other than the target register. Some operations require two further parameters, others just one. The tex operation used for texture sampling is a special case that has six parameters, three of which are given as strings rather than registers. tex is a pretty wild guy.
At first look the mess of opcodes and register names that make up the AGAL code for a shader can look intimidating, but they’re actually quite simple. Just remember that each operation does exactly one thing, and writes to exactly one register. AGAL doesn’t allow for conditional statements like if/then or any form of looping, so following along with what a shader is doing is extremely easy: it always just proceeds to the next operation.
With that said, AGAL code is not nearly as easy to glance at and understand as (well written) AS3 code. Taking a minute to type out some comments for the AGAL code you write is a very good idea.
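As a sketch of what that commenting might look like, here is a two-operation vertex shader with a note on each line. This assumes you assemble with Adobe's AGALMiniAssembler, which as far as I know discards everything after // on a line. The register assignments (position in va0, matrix in vc0 to vc3, UVs in va1) are assumptions for illustration.

```
m44 op, va0, vc0   // transform the vertex position by the matrix in vc0-vc3
mov v0, va1        // hand the UV coordinates on to the fragment shader
```

Noting which register holds what is usually the most valuable part, since the register names themselves carry no meaning.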
You are limited to 200 operations in a single AGAL shader.
- mov t a - Copy the contents of a into t.
- add t a b - Add a and b, put result in t.
- sub t a b – Subtract b from a, put result in t.
- mul t a b – Multiply a and b, put result in t.
When working component-wise this operation doesn’t always do as I’d expect. Specifically performing the operation:
mul vt0.xy va0.xy vc0.xy
Gives a different result from performing the two operations:
mul vt0.x va0.x vc0.x
mul vt0.y va0.y vc0.y
Whereas they would give the same result if it were an add operation in both instances instead of mul. I’ve yet to work out exactly what the mul operation does with multiple components.
- div t a b – Divide a by b, put result in t.
The same behaviour as outlined above for the mul operation applies to div too.
- rcp t a – Divide 1 by a, put result in t.
- min t a b – Copy whichever of a or b is smaller into t.
- max t a b – Copy whichever of a or b is larger into t.
- frc t a – Copy just the fractional part of a into t.
e.g. if a has the value 5.86 then 0.86 is placed in t.
- sqt t a - Find the square root of a, put result in t.
- rsq t a – Find 1 divided by the square root of a, put result in t.
- pow t a b – Raise a to the power of b, put result in t.
- log t a – Find the binary logarithm of a, put result in t.
- exp t a – Raise 2 to the power of a, put result in t.
- nrm t a – Normalise the vector given in a (keep same direction, but make it length 1), put result in t.
- sin t a – Find the sine of a, put result in t.
- cos t a – Find the cosine of a, put result in t.
- crs t a b – Find the cross product of the vectors a and b, put result in t.
- dp3 t a b – Find the dot product of the three-dimensional vectors a and b, put result in t.
- dp4 t a b – Find the dot product of the four-dimensional vectors a and b, put result in t.
- abs t a – Find the absolute value of a, put result in t.
- neg t a – Multiply a by -1, put result in t.
- sat t a – Clamp a between 0 and 1, put result in t.
e.g. if a is -4.6, 0 will be placed in t.
If a is 0.6, 0.6 will be placed in t.
If a is 8.2, 1 will be placed in t.
- m33 t a b – Perform a 3×3 matrix multiply on a and b, put result in t.
- m44 t a b – Perform a 4×4 matrix multiply on a and b, put result in t.
- m34 t a b – Perform a 3×4 matrix multiply on a and b, put result in t.
- m43 t a b – Perform a 4×3 matrix multiply on a and b, put result in t.
I need to write a decent explanation of what matrix operations do.
In all of these, b is the first register that makes up a matrix. For instance if you perform m44 with b as vc0 then the contents of registers vc0 vc1 vc2 vc3 will be used. a is a single register that gets multiplied through by the matrix specified in b.
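One way to picture these, offered as a sketch rather than a full explanation: m44 behaves like four dp4 operations, one per matrix register. Assuming vc0 to vc3 hold the matrix and va0 the vector, the two forms below should leave the same values in vt0.

```
// The single matrix operation...
m44 vt0, va0, vc0

// ...and the equivalent written out one component at a time.
dp4 vt0.x, va0, vc0
dp4 vt0.y, va0, vc1
dp4 vt0.z, va0, vc2
dp4 vt0.w, va0, vc3
```

By the same reasoning m33 corresponds to three dp3 operations and m34 to three dp4 operations, each filling x, y and z of the target.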
- sge t a b - If a is greater than or equal to b put 1 in t, otherwise put 0 in t.
- slt t a b – If a is less than b put 1 in t, otherwise put 0 in t.
sge and slt are the closest we have to conditional flow control in AGAL so look out for ways to use multiplication of their 1 or 0 result in place of traditional conditionals.
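Here is a minimal sketch of that multiply-by-1-or-0 trick: a fragment shader that picks one of two colours depending on a threshold. The setup is assumed for illustration: v0.x carries the value being tested, fc0.x holds the threshold, and fc1 and fc2 hold the two colours.

```
sge ft0.x, v0.x, fc0.x   // ft0.x = 1 if v0.x >= threshold, otherwise 0
slt ft0.y, v0.x, fc0.x   // ft0.y = 1 if v0.x <  threshold, otherwise 0
mul ft1, fc1, ft0.x      // first colour, or all zeros if the test failed
mul ft2, fc2, ft0.y      // second colour, or all zeros if the test passed
add oc, ft1, ft2         // exactly one of the two is non-zero, so the sum is the chosen colour
```

Both branches are always computed, so this suits cheap choices better than expensive ones.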
- kil a - a must be a single scalar value rather than a vector, for instance: ft0.x. If the value given is less than zero then execution on this fragment is halted and it is not drawn.
- tex t a b <type, wrap, filter> - Samples the texture in b (which should be one of the fs registers) at the coordinates in a, putting the resulting colour in t.
- type determines what kind of texture you’re sampling from, and should be either: “2d” for standard texturing or “cube” for using a cubemap.
- wrap determines how to deal with sampling beyond the bounds of the texture, either: “clamp” or “repeat”.
- filter determines how to interpolate between texels, either “nearest” or “linear”. Use nearest if you want your texture to have crisp pixel edges, use linear if you want them smoothed out. As far as I can tell “mipnearest” and “miplinear” are a separate mipmapping option that can be listed alongside the filter (both of those require mipmaps to have been uploaded for the texture).
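To show tex and kil working together, here is a sketch of a simple alpha cutout fragment shader. It assumes UV coordinates arrive in v0, a texture is bound to fs0, and fc0.x holds an alpha cutoff such as 0.5. The // comments assume an assembler like AGALMiniAssembler that strips them.

```
tex ft0, v0, fs0 <2d,clamp,linear>   // sample the texture at the interpolated UVs
sub ft1.x, ft0.w, fc0.x              // negative when the sampled alpha is below the cutoff
kil ft1.x                            // stop here and draw nothing if that value is less than zero
mov oc, ft0                          // otherwise output the sampled colour
```

Note that kil is given ft1.x, a single component, as required.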
First a simple vertex shader. This simply performs a 4×4 matrix multiply on the vertex position and a projection matrix. It assumes va0 holds the x, y, z position of the vertex, while vc0-vc3 store a projection matrix set up to transform that point from world space to display space. It also copies va1 into the shared v0 register so that it may be used by the following fragment shader as UV coordinates for texture mapping.
m44 op, va0, vc0
mov v0, va1
Now a single-line fragment shader. Taking the interpolated value it finds in v0 as UV coordinates this samples the texture loaded into fs0 and outputs that sample as the colour for this fragment.
tex oc, v0, fs0 <2d,clamp,linear>