## July 19, 2009

### Image Rotation with Pixel Bender

Unlike my previous experiments with Pixel Bender, today I set out to implement a basic image manipulation routine, image rotation. While Flash natively provides the ability to rotate a DisplayObject that doesn't help if you want to combine multiple Pixel Bender filters that should use the rotated image as the source image. Turns out rotating an image in Pixel Bender was a little harder than I thought, but I ended up with a nice solution.

Rotation is a simple image manipulation that can take advantage of transformation matrixes like you can create in Flash. The normal 2D rotation transformation matrix looks like this, given d is the angle of rotation, x and y are the original pixel location, and x' and y' are the new pixel location:

 cos(d) -sin(d) * x = x' sin(d) cos(d) y y'

The problem with this rotation (and the standard DisplayObject.rotation property) is that the rotation is performed relative to the origin. In this case it is (0, 0) or the upper left corner of the image. To address this problem we switch to 3x3 transformation matrixes and first translate the image to define a new origin (like changing the registration point on a MovieClip) before performing the rotation. By combining a translation, rotation, and reverse translation we can rotate the image around an arbitrary point but still have the result appear anchored in the upper left. The full matrix transform then looks like this, given a and b are the translation in x and y respectively:

 1 0 a * cos(d) -sin(d) 0 * 1 0 -a * x = x' 0 1 b sin(d) cos(d) 0 0 1 -b y y' 0 0 1 0 0 1 0 0 1 1 1

Now we are getting somewhere. The next problem is that we don't know what x and y are. We do know x' and y' as that is what outCoord() gives us. In order to use this transformation matrix in Pixel Bender we need to solve for x and y. It turns out the matrixes reduce to a linear system of equations. Thankfully I have an old symbolic solver kicking around that did the dirty work. First simplify the transform matrixes:

 (x - a)*cos(d) + (b - y)*sin(d) + a = x' (y - b)*cos(d) + (x - a)*sin(d) + b y' 1 1

That leaves us with two equations and two unknowns. Using a linear equation solver, we end up with:

x = (x' - a)*cos(d) + (y' - b)*sin(d) + a
y = (y' - b)*cos(d) + (a - x')*sin(d) + b

The translation into a Pixel Bender kernel is straight forward from here. Using a parameter called origin I've replaced a and b, outCoord() replaced x' and y', and a parameter called rotation (specified in degrees) is used for d.

```float2 dstCoord = outCoord();
float cosAngle = cos(angle);
float sinAngle = sin(angle);
float x = (dstCoord.x - origin.x) * cosAngle + (dstCoord.y - origin.y) * sinAngle + origin.x;
float y = (dstCoord.y - origin.y) * cosAngle + (origin.x - dstCoord.x) * sinAngle + origin.y;
dst = sampleNearest(src, float2(x, y));```

After the kernel was created and exported for use in Flash I slapped together a simple test application to play with it (view source enabled). Note: if you use the kernel in the Pixel Bender toolkit you will want to setup a minValue and maxValue for the origin parameter based on your test image size, otherwise you can't meaningfully adjust it.

If you played with the application you probably noticed the extension slider. By default a shader filter in Flash will map the source image to an equally sized output image. Increasing the various extension properties of ShaderFilter you can create an output image larger than the input which avoids any output image clipping.

I suspect if Pixel Bender graphs were supported in Flash you could easily setup a series of filters that do the image translation and rotation separately instead of combining them all into a single matrix. It might even be easy enough to do that with an array of filters but this isn't something I've played with much.

## November 17, 2008

### Sorting out Pixels or How I Didn't Plan to Spend My Sunday

I wasted, well maybe wasted is too strong of a word. I used way more time than I should have on Sunday playing with Pixel Bender trying to figure out how to make it sort values. In my trend to make Pixel Bender do stuff besides manipulate pixels I thought a fun mental exercise would be to implement a parallel sort. Now I've not come up with any good use for this yet, considered if this is the best performing parallel sort, yadda, yadda, but it works, which is what I really wanted to get out of this experiment.

While quicksort and mergesort are efficient given their O(n*log(n)) runtimes, they require iteration which isn't what Pixel Bender is good at. I ran across one paper that outlined a parallel sort on a linear array of cellular automata algorithm but its running time was O(2*n-3) cycles. For a reasonably large array I was concerned about exceeding the size of an image that I could feed into Pixel Bender.

As always, Knuth had an answer. On page 111 of his Sorting and Searching tome, he outlines Batcher's sorting scheme which is characterized as a "merge exchange sort". The key piece that makes the algorithm attractive for what I wanted to do is that step "M3. [Loop on i.]" can be done for all relevant i in any order, even simultaneously. [Knuth 111]. It also scales nicely, with worst case steps required to sort being O(1/2*ceil(log_2(n))*(ceil(log_2(n)+1). At 4096 elements 78 steps are needed while at 262,144 elements only 171 steps are needed. I'm going to exceed the width much more quickly than the height.

Doing a rough implementation of the algorithm in ActionScript was straight forward:

```/**
* Merge Exchange Sort using Batcher's Method
* Knuth, The Art of Computer Programming,
* Vol 3. Sorting and Searching, Page 111.
* @param data The data to be sorted, will be modified inplace
*/
private function mergeExchangeSort(data:Array):void {
// Sanity check
var n:int = data.length;
if (n < 2) {
return;
}
// M1. [Initialize p.]
var t:int = Math.ceil(Math.log(n)/Math.log(2));
var p:int = Math.pow(2, t - 1);
do {
// M2. [Initialize q, r, d.]
var q:int = Math.pow(2, t - 1);
var r:int = 0;
var d:int = p;
var loopOnQ:Boolean = true;

do {
// M3. [Loop on i.]
for (var i:int = 0; i < n - d; i++) {
if ((i & p) == r) {
// M4. [Compare/exchange R_(i+1):R_(i+d+1)]
if (data[i] > data[i + d]) {
var temp:Object = data[i];
data[i] = data[i + d];
data[i + d] = temp;
}
}
}
// M5. [Loop on q.]
if (q != p) {
d = q - p;
q = q / 2;
r = p;
} else {
loopOnQ = false;
}
}
while (loopOnQ);

// M6. [Loop on p.]
p = Math.floor(p / 2);
}
while (p > 0);
}```

The variables, p, q, r, and d are bookkeeping and my thoughts was that I would pass these in as values for each row in the pipeline. While the M3 and M4 piece would be the actual parallelized sorting piece. I quickly hit my first problem when I discovered that it doesn't look like I can do bitwise operators in Pixel Bender. The ((i & p) == r) line is vital to only testing possible exchanges at specific indexes for a particular step. It is in fact what gives the algorithm its complexity and beauty.

My next thought was to manually calculate the bitwise and using what functions I had available in Pixel Bender. In the back of my mind I think there is some eloquent way to do bitwise operations that I've forgotten since college (and some quick Google searches didn't refresh my memory). Just wanting to get something working I hacked together a quick little routine that used only operations available in Pixel Bender. Thankfully Pixel Bender does support modulus. This is the ActionScript version of the code:

```private function bitwiseAnd(a:int, b:int):int {
var result:int = 0;
var n:int = 1;
while ((a > 0) && (b > 0)) {
if (((a % 2) == 1) && ((b % 2) == 1)) {
result += n;
}
a = a / 2;
b = b / 2;
n = n * 2;
}
return result;
}```

I had forgotten that Pixel Bender, when exporting for Flash, doesn't support loops :( I unrolled the loop for the largest values that I thought I'd test with, yay cut and paste. Then I ran into another subtle aspect of the code that I'd skimmed over. Primarily the fact that when looping over i it is doing a swap of the elements. Given that I'd be running in parallel I really had to calculate not only if i was bigger than i+d but the opposite at the same time. That would mean another unrolled bitwise and operation. This was turning into more of a pain than I had hoped and I was far exceeding whatever playtime limit I'd set aside for this little project.

Alas, for better or worse as Brian has aptly pointed out, I have a tenacious personality. I did take a break to run some errands, eat some more of the yummy chicken stew I made last night, and catch up on some saved TV shows. I came back, didn't have any insight, and figured I'd just write up this blog post. Half way through writing it I had an insight. The p, r, q, d, and i nonsense is just bookkeeping overhead. For a fixed size n, the values of p, r, q, d, and what swaps need to be made can be calculated once and reused as part of the pipeline.

Pixel Bender supports multiple source images. One could be my input data and the other could be the bookkeeping junk. (Insert long pause while I try this out). Success! I've now used way too much time putting this together. I only hope I haven't forgotten something else that I was supposed to be doing...

The couple of changes are before M3 in the algorithm mentioned above, I store the value of d in my bookkeeping bitmap:

```// Pass along the value for d
step++;
bitmapData.setPixel(0, step + Y_OFFSET, d);```

Then I replace M4 with the following change:

```// M4. [Compare/exchange R_(i+1):R_(i+d+1)]
// Store the fact that we need to make this comparison
bitmapData.setPixel(X_OFFSET + i, step + Y_OFFSET, RIGHT);
bitmapData.setPixel(X_OFFSET + i + d, step + Y_OFFSET, LEFT);```

This stores either a compare to the right or compare to the left value in the bookkeeping bitmap.

Lastly to get the pipelining working, the Pixel Bender code operates offset by one row. That is when working on row 1 it is pulling source pixels from row 0. This way each time the code runs the source row can be changed but the previous rounds values will continue to work down the sorting steps. This way if you can feed new data into the system every step and then after the minimum number of steps needed start pulling data off the other end that has been sorted. The cheesy UI I put together to test the code demonstrates this pipelining effect.

The Pixel Bender code ended up being straight forward once I introduced the bookkeeping bitmap. The only funky thing is doing the comparisons. Since I'm setting values as pixels, when in Pixel Bender it is chopped up into RGB which means a little special handling. Also, I've had some issues with float versus integer comparisons so don't be surprised if there are some stray oddities in the code. This then is the core of the algorithm on the Pixel Bender side:

```// grab the data to figure out what which direction to compare with, if any
pixel4 bookdata = sampleNearest(book, outCoord());
// grab the d value to use, which is always stored at x = 0 for this row
pixel4 dData = sampleNearest(book, float2(0.0, outCoord().y));
// convert it into a real value, only need GB since we would create images too large otherwise
// although I don't know what the size limit is...
float d = dData.g * 65536.0 + dData.b * 256.0;
// grab the previous step's data, doing this makes for easier pipelining
pixel4 me = sampleNearest(src, outCoord() + float2(0.0, -1.0));
// default to doing nothing
dst = me;
// always skip x = 0 as that holds bookkeeping information
// and skip y = 0 as that is the start of the pipeline
// we read the previous row and write in this row
if ((int(outCoord().x) != 0) && (int(outCoord().y) != 0)) {
// need to compare to the value on our right at me.x + d
if (bookdata.r == 1.0) {
pixel4 right = sampleNearest(src, outCoord() + float2(d, -1.0));
// simplistic way to make the RGB compare like a number, can probably be optimized
if ((me.r > right.r) || ((me.r == right.r) && ((me.g > right.g) || ((me.g == right.g) && (me.b > right.b))))) {
dst = right;
}
// need compare to the value on our left at me.x - d
} else if (bookdata.g == 1.0) {
pixel4 left = sampleNearest(src, outCoord() + float2(-d, -1.0));
// see note above
if ((left.r > me.r) || ((left.r == me.r) && ((left.g > me.g) || ((left.g == me.g) && (left.b > me.b))))) {
dst = left;
}
}
}```

Give the finished application a try. Flash Player 10 required. It currently limits the values to 0xFFFF and the number of elements to 16 so they fit in the grid. View the full source if you want to play with it some more.

Tags: flex pixelbender sort

## September 22, 2008

### Pixel Bender and Conway's Game of Life

When I started reading about Pixel Bender and realized at its core it was a cellular automaton engine I immediately thought of Conway's Game of Life. I started looking into how to use Pixel Bender to render generations in the game of life.

Given that pixel bender is boundless (i.e. defined over an infinite plane of discrete pixel coordinates) I had to introduce the notion of a border pixel to make handling edge cases easier. Since most of the area in a generation is dead I decided to use white for a pixel that is dead (easier on the eyes) and black for a pixel that was alive. My initial attempt at the algorithm ran into a bug with the kernel export for Flash Player feature. After some rewriting I had a working pixel bender kernel that implemented the logic (albeit unoptimized):

```<languageVersion : 1.0;>
// \$Id: GameOfLife.pbk 57 2008-09-16 02:55:33Z danielr \$
kernel GameOfLife
<   namespace : "Daniel Rinehart";
vendor : "NeoPhi.com";
version : 1;
description : "Conway's Game of Life";
>
{
input image4 src;
output pixel4 dst;

void
evaluatePixel()
{
pixel4 border = pixel4(1.0, 0.0, 0.0, 1.0);
pixel4 dead = pixel4(1.0, 1.0, 1.0, 1.0);
pixel4 alive = pixel4(0.0, 0.0, 0.0, 1.0);

pixel4 me = sampleNearest(src, outCoord());

// default to no change in pixel
dst = me;

// I would like to write "all(equal(border, me))" but that's buggy in Flash
if (!((border.r == me.r) && (border.g == me.g) && (border.b == me.b)))
{
int aliveNeighborCount = 0;
float2 offset = float2(pixelSize(src).x, pixelSize(src).y);

// Find out how many of my neighbors are alive
// left
pixel4 test = sampleNearest(src, outCoord() + (offset * float2(-1.0, 0.0)));
if ((alive.r == test.r) && (alive.g == test.g) && (alive.b == test.b))
{
aliveNeighborCount++;
}
// upper left
test = sampleNearest(src, outCoord() + (offset * float2(-1.0, -1.0)));
if ((alive.r == test.r) && (alive.g == test.g) && (alive.b == test.b))
{
aliveNeighborCount++;
}
// up
test = sampleNearest(src, outCoord() + (offset * float2(0.0, -1.0)));
if ((alive.r == test.r) && (alive.g == test.g) && (alive.b == test.b))
{
aliveNeighborCount++;
}
// upper right
test = sampleNearest(src, outCoord() + (offset * float2(1.0, -1.0)));
if ((alive.r == test.r) && (alive.g == test.g) && (alive.b == test.b))
{
aliveNeighborCount++;
}
// right
test = sampleNearest(src, outCoord() + (offset * float2(1.0, 0.0)));
if ((alive.r == test.r) && (alive.g == test.g) && (alive.b == test.b))
{
aliveNeighborCount++;
}
// lower right
test = sampleNearest(src, outCoord() + (offset * float2(1.0, 1.0)));
if ((alive.r == test.r) && (alive.g == test.g) && (alive.b == test.b))
{
aliveNeighborCount++;
}
// down
test = sampleNearest(src, outCoord() + (offset * float2(0.0, 1.0)));
if ((alive.r == test.r) && (alive.g == test.g) && (alive.b == test.b))
{
aliveNeighborCount++;
}
// lower left
test = sampleNearest(src, outCoord() + (offset * float2(-1.0, 1.0)));
if ((alive.r == test.r) && (alive.g == test.g) && (alive.b == test.b))
{
aliveNeighborCount++;
}

// As per: http://en.wikipedia.org/wiki/Conway%27s_Game_of_Life
// 1. Any live cell with fewer than two live neighbours dies, as if by loneliness.
// 2. Any live cell with more than three live neighbours dies, as if by overcrowding.
// 3. Any live cell with two or three live neighbours lives, unchanged, to the next generation.
if (((alive.r == me.r) && (alive.g == me.g) && (alive.b == me.b)) && ((aliveNeighborCount < 2) || (aliveNeighborCount > 3)))
{
}
// 4. Any dead cell with exactly three live neighbours comes to life.
{
dst = alive;
}
}
}
}```

Now with a working filter I used some of the recent posts by Mike Chambers on how to embed a pixel bender kernel into MXML. However, since I wasn't using it as a filter and wanted more control over the flow I leveraged the ShaderJob class.

```var shader:Shader = new Shader(new gameOfLife() as ByteArray);