# Fun and canny optim for a Canny Edge Detector

2012-09-08

The canny edge detector is likely the most commonly implemented edge detection algorithm. It's generally a nice filter to write when starting with image processing as it is not much complicated and provide kind of useful results in some situations.

That's why I took a few days to implement it in FFmpeg. Following is a screenshot of a Lynx video by Stefan Sobotta filtered with the `edgedetect` filter: Command line used: `ffplay ~/samples/Lynx.mp4 -vf 'split[b], pad=iw:ih*2[src], [b]edgedetect, [src]overlay=0:h'`

For those not familiar with the algorithm, it is essentially a few successive filters applied on a gray scale image:

• apply a gaussian filter to reduce noise (it blurs the image)
• compute the gradients, and the directions for the next step (see Sobel)
• non maximum suppressions
• double threshold to keep only the interesting values

For more details, I'd recommend the Canny edge detector Wikipedia page and its external links.

This article will focus on the second step, and especially the directions computation, since it involves some little math and can be expensive. The other steps are either trivial or widely documented so they won't be the topic of this post.

## Direction computation

This step really is in my opinion the most interesting one. But first we need some little understanding of this Sobel operator. In input we have a blurred image, and for each pixel we get a gradient value `G = |Gx|+|Gy|`, where `Gx` and `Gy` are defined using 6 surrounding pixels (both use a `3x3` kernel with 3 zero values).

The kernel for `Gx` is:

 -1 0 +1 -2 0 +2 -1 0 +1

And the kernel for `Gy` is:

 -1 -2 -1 0 0 0 +1 +2 +1

The interesting thing about `Gx` and `Gy` is that they allow to define an edge direction angle `Θ = atan2(|Gx|,|Gy|)`. In the Canny Edge Detector algorithm, this angle must be rounded to one of the four directions:

• horizontal
• vertical,
• 45° going "up"
• 45° going "down"

These directions, defined for each pixel, will be used in the next step when looking for maximums.

## Naive implementation

Let's write a piece of code to test our direction rounder:

``````#include <stdio.h>
#include <stdint.h>
#include <stdlib.h>
#include <math.h>

enum {
DIRECTION_45UP,
DIRECTION_45DOWN,
DIRECTION_HORIZONTAL,
DIRECTION_VERTICAL,
DIRECTION_UNKNWON,
};

static const uint8_t colors[] = {
0xff, 0x00, 0x00, // red     -> 45 up
0x00, 0xff, 0x00, // green   -> 45 down
0x00, 0x00, 0xff, // blue    -> horizontal
0xff, 0xff, 0x00, // yellow  -> vertical
0x00, 0x00, 0x00, // black   -> unknown
};

static int get_direction(int gx, int gy)
{
/* TODO */
return DIRECTION_UNKNWON;
}

static void print_ppm(int m)
{
int y, x, sz = m*2 + 1;

printf("P3\n%d %d\n255\n", sz, sz);
for (y = -m; y <= m; y++) {
for (x = -m; x <= m; x++) {
const uint8_t *c = colors + 3*get_direction(x, y);
if (x != -m)
printf("  ");
printf("%3d %3d %3d", c, c, c);
}
printf("\n");
}
}

int main(int ac, char **av)
{
if (ac != 2) {
fprintf(stderr, "usage: %s limit\n", av);
return 0;
}
print_ppm(strtol(av, NULL, 10));
return 0;
}
``````

That code takes a `limit` parameter used to limit the range of the `x` and `y` values. Basically, `get_direction()` will be called for all the integer values for `x` in `[-limit;limit]` and for `y` in `[-limit;limit]`. Each pixel will be attributed a color describing the direction (blue for horizontal, yellow for vertical, etc).

The usage is pretty simple:

``````% gcc -Wall -lm a.c && ./a.out 100 > out.ppm && feh out.ppm
``````

The resulting output `out.ppm` is a picture of size `2xlimit` x `2xlimit` full of colors.

Here is a naive `get_direction()` implementation, where we compute `Θ` to compare it to reference angles (`π/8` and `3π/8`), and use the signs of `Gx` and `Gy`:

``````static int get_direction_naive(int gx, int gy)
{
const double theta = atan2(abs(gy), abs(gx));
if (gx) {
if (theta < 3*M_PI/8 && theta > M_PI/8) {
if ((gx > 0 && gy > 0) || (gx < 0 && gy < 0))
return DIRECTION_45DOWN;
else
return DIRECTION_45UP;
}
if (theta < M_PI/8)
return DIRECTION_HORIZONTAL;
}
return DIRECTION_VERTICAL;
}
``````

The resulting `out.ppm` with `limit=100`: So this is exactly what we want to achieve. Unfortunately, the call to `atan2()` for each pixel is quite troublesome.

Note: if `Gx = 0` and `Gy = 0`, we are at the center of the picture.

## Get rid of the tangent computation

If we want to get get rid of the `atan2()` we can just use `tan()`: basically the tangent of the angle `Θ` defined by `Gx` and `Gy` is `Gy/Gx`: `tan(Θ) = Gy/Gx`. So we just have to compare `Gy/Gx` against the tangent of the reference angle, which are the following constants:

• `tan( π/8)` = `√2 - 1`
• `tan(3π/8)` = `√2 + 1`

And to avoid the division of `Gy/Gx`, we can simply compare `Gy` against `Gx` multiplied by these constants. Here is a new proposition of the `get_direction()` function:

``````#define SQRT2 1.4142135623730951

static int get_direction_no_atan2(int gx, int gy)
{
double tanpi8gx, tan3pi8gx;

if (gx) {
if (gx < 0)
gx = -gx, gy = -gy;
tanpi8gx  = (SQRT2-1) * gx;
tan3pi8gx = (SQRT2+1) * gx;
if (gy > -tan3pi8gx && gy < -tanpi8gx)  return DIRECTION_45UP;
if (gy > -tanpi8gx  && gy <  tanpi8gx)  return DIRECTION_HORIZONTAL;
if (gy >  tanpi8gx  && gy <  tan3pi8gx) return DIRECTION_45DOWN;
}
return DIRECTION_VERTICAL;
}
``````

You'll notice I also took the liberty to also change the branching logic to make the code somehow more readable.

So now that we dropped the dependency to `libm` and the `atan2()` call, let's do something even better.

## Making the code bit exact

We are still using floating point operations here, which is kind of a problem (for performances, and for architecture independent testing at least). Let's try to make some integer arithmetic instead.

First, we need to analyze what will be the range of values `Gx` and `Gy` can take. Both are using the `3x3` kernels defined a bit earlier, and assuming the pixel is defined as a 8-bit gray-scale value (`0` to `0xFF`), we can conclude that `Gx` and `Gy` are both in the range: `[(-1)x255 + (-2)x255 + (-1)x255; 1x255 + 2x255 + 1x255]`, or `[-1020;1020]`.

This range is quite small, and multiplying these values by `1<<16` will largely fit in an integer, so it should be safe to use 16-bit arithmetic.

Instead of comparing `Gy` against:

• `Gx` x `√2-1`
• `Gx` x `√2+1`

...like we did previously, let's do the comparison between `Gy` x `(1<<16)` and:

• `Gx` x `√2-1` x `(1<<16)`
• `Gx` x `√2+1` x `(1<<16)`

The rounded values for these constants are:

• `round((sqrt(2)-1) * (1<<16)) = 27146`
• `round((sqrt(2)+1) * (1<<16)) = 158218`

These values will be multiplied to `Gx`, and `1020 x 158218` fits into a 32-bit integer so it's still OK. Following, the optimized version:

``````static int get_direction_integer(int gx, int gy)
{
if (gx) {
int tanpi8gx, tan3pi8gx;

if (gx < 0)
gx = -gx, gy = -gy;
gy <<= 16;
tanpi8gx  =  27146 * gx;
tan3pi8gx = 158218 * gx;
if (gy > -tan3pi8gx && gy < -tanpi8gx)  return DIRECTION_45UP;
if (gy > -tanpi8gx  && gy <  tanpi8gx)  return DIRECTION_HORIZONTAL;
if (gy >  tanpi8gx  && gy <  tan3pi8gx) return DIRECTION_45DOWN;
}
return DIRECTION_VERTICAL;
}
``````

We should now have a way faster version than our first naive implementation.

## Benchmark

Following is a naive `time` benchmark, on a i7 Intel CPU, and using `gcc -O3`:

Run naive no atan2 integer
1 0.257 0.032 0.025
2 0.251 0.032 0.029
3 0.255 0.029 0.027

As usual, performances depend on a lot of various other parameters, but it looks like it was worth the effort since it's almost 10 times faster (and now bit exact).