The *canny edge detector* is likely the most commonly implemented edge detection
algorithm. It's generally a nice filter to write when starting with image
processing as it is not much complicated and provide kind of useful results in
some situations.

That's why I took a few days to implement it in FFmpeg.
Following is a screenshot of a Lynx video by Stefan Sobotta
filtered with the `edgedetect`

filter:

Command line used: ```
ffplay ~/samples/Lynx.mp4 -vf 'split[b], pad=iw:ih*2[src],
[b]edgedetect, [src]overlay=0:h'
```

For those not familiar with the algorithm, it is essentially a few successive filters applied on a gray scale image:

- apply a gaussian filter to reduce noise (it blurs the image)
- compute the gradients, and the directions for the next step (see Sobel)
- non maximum suppressions
- double threshold to keep only the interesting values

For more details, I'd recommend the Canny edge detector Wikipedia page and its external links.

This article will focus on the second step, and especially the **directions
computation**, since it involves some little math and can be expensive. The
other steps are either trivial or widely documented so they won't be the topic
of this post.

## Direction computation

This step really is in my opinion the most interesting one. But first we need
some little understanding of this Sobel operator. In input we have a blurred
image, and for each pixel we get a gradient value `G = |Gx|+|Gy|`

, where `Gx`

and `Gy`

are defined using 6 surrounding pixels (both use a `3x3`

kernel with 3
zero values).

The kernel for `Gx`

is:

-1 | 0 | +1 |

-2 | 0 | +2 |

-1 | 0 | +1 |

And the kernel for `Gy`

is:

-1 | -2 | -1 |

0 | 0 | 0 |

+1 | +2 | +1 |

The interesting thing about `Gx`

and `Gy`

is that they allow to define an edge
direction angle `Θ = atan2(|Gx|,|Gy|)`

. In the Canny Edge Detector algorithm, this
angle must be rounded to one of the four directions:

- horizontal
- vertical,
- 45° going "up"
- 45° going "down"

These directions, defined for each pixel, will be used in the next step when looking for maximums.

## Naive implementation

Let's write a piece of code to test our direction rounder:

```
#include <stdio.h>
#include <stdint.h>
#include <stdlib.h>
#include <math.h>
enum {
DIRECTION_45UP,
DIRECTION_45DOWN,
DIRECTION_HORIZONTAL,
DIRECTION_VERTICAL,
DIRECTION_UNKNWON,
};
static const uint8_t colors[] = {
0xff, 0x00, 0x00, // red -> 45 up
0x00, 0xff, 0x00, // green -> 45 down
0x00, 0x00, 0xff, // blue -> horizontal
0xff, 0xff, 0x00, // yellow -> vertical
0x00, 0x00, 0x00, // black -> unknown
};
static int get_direction(int gx, int gy)
{
/* TODO */
return DIRECTION_UNKNWON;
}
static void print_ppm(int m)
{
int y, x, sz = m*2 + 1;
printf("P3\n%d %d\n255\n", sz, sz);
for (y = -m; y <= m; y++) {
for (x = -m; x <= m; x++) {
const uint8_t *c = colors + 3*get_direction(x, y);
if (x != -m)
printf(" ");
printf("%3d %3d %3d", c[0], c[1], c[2]);
}
printf("\n");
}
}
int main(int ac, char **av)
{
if (ac != 2) {
fprintf(stderr, "usage: %s limit\n", av[0]);
return 0;
}
print_ppm(strtol(av[1], NULL, 10));
return 0;
}
```

That code takes a `limit`

parameter used to limit the range of the `x`

and `y`

values. Basically, `get_direction()`

will be called for all the integer values
for `x`

in `[-limit;limit]`

and for `y`

in `[-limit;limit]`

. Each pixel will be
attributed a color describing the direction (blue for horizontal, yellow for
vertical, etc).

The usage is pretty simple:

```
% gcc -Wall -lm a.c && ./a.out 100 > out.ppm && feh out.ppm
```

The resulting output `out.ppm`

is a picture of size `2xlimit`

x `2xlimit`

full
of colors.

Here is a naive `get_direction()`

implementation, where we compute `Θ`

to
compare it to reference angles (`π/8`

and `3π/8`

), and use the signs of `Gx`

and `Gy`

:

```
static int get_direction_naive(int gx, int gy)
{
const double theta = atan2(abs(gy), abs(gx));
if (gx) {
if (theta < 3*M_PI/8 && theta > M_PI/8) {
if ((gx > 0 && gy > 0) || (gx < 0 && gy < 0))
return DIRECTION_45DOWN;
else
return DIRECTION_45UP;
}
if (theta < M_PI/8)
return DIRECTION_HORIZONTAL;
}
return DIRECTION_VERTICAL;
}
```

The resulting `out.ppm`

with `limit=100`

:

So this is exactly what we want to achieve. Unfortunately, the call to
`atan2()`

for each pixel is quite troublesome.

**Note**: if `Gx = 0`

and `Gy = 0`

, we are at the center of the picture.

## Get rid of the tangent computation

If we want to get get rid of the `atan2()`

we can just use `tan()`

: basically
the tangent of the angle `Θ`

defined by `Gx`

and `Gy`

is `Gy/Gx`

: ```
tan(Θ) =
Gy/Gx
```

. So we just have to compare `Gy/Gx`

against the tangent of the reference
angle, which are the following constants:

`tan( π/8)`

=`√2 - 1`

`tan(3π/8)`

=`√2 + 1`

And to avoid the division of `Gy/Gx`

, we can simply compare `Gy`

against `Gx`

multiplied by these constants. Here is a new proposition of the
`get_direction()`

function:

```
static int get_direction_no_atan2(int gx, int gy)
{
double tanpi8gx, tan3pi8gx;
#define SQRT2 1.4142135623730951
if (gx) {
if (gx < 0)
gx = -gx, gy = -gy;
tanpi8gx = (SQRT2-1) * gx;
tan3pi8gx = (SQRT2+1) * gx;
if (gy > -tan3pi8gx && gy < -tanpi8gx) return DIRECTION_45UP;
if (gy > -tanpi8gx && gy < tanpi8gx) return DIRECTION_HORIZONTAL;
if (gy > tanpi8gx && gy < tan3pi8gx) return DIRECTION_45DOWN;
}
return DIRECTION_VERTICAL;
}
```

You'll notice I also took the liberty to also change the branching logic to make the code somehow more readable.

So now that we dropped the dependency to `libm`

and the `atan2()`

call, let's
do something even better.

## Making the code bit exact

We are still using floating point operations here, which is kind of a problem (for performances, and for architecture independent testing at least). Let's try to make some integer arithmetic instead.

First, we need to analyze what will be the range of values `Gx`

and `Gy`

can
take. Both are using the `3x3`

kernels defined a bit earlier, and
assuming the pixel is defined as a 8-bit gray-scale value (`0`

to `0xFF`

), we
can conclude that `Gx`

and `Gy`

are both in the range: ```
[(-1)x255 + (-2)x255 +
(-1)x255; 1x255 + 2x255 + 1x255]
```

, or `[-1020;1020]`

.

This range is quite small, and multiplying these values by `1<<16`

will largely
fit in an integer, so it should be safe to use 16-bit arithmetic.

Instead of comparing `Gy`

against:

`Gx`

x`√2-1`

`Gx`

x`√2+1`

...like we did previously, let's do the comparison between `Gy`

x `(1<<16)`

and:

`Gx`

x`√2-1`

x`(1<<16)`

`Gx`

x`√2+1`

x`(1<<16)`

The rounded values for these constants are:

`round((sqrt(2)-1) * (1<<16)) = 27146`

`round((sqrt(2)+1) * (1<<16)) = 158218`

These values will be multiplied to `Gx`

, and `1020 x 158218`

fits into a 32-bit
integer so it's still OK. Following, the optimized version:

```
static int get_direction_integer(int gx, int gy)
{
if (gx) {
int tanpi8gx, tan3pi8gx;
if (gx < 0)
gx = -gx, gy = -gy;
gy <<= 16;
tanpi8gx = 27146 * gx;
tan3pi8gx = 158218 * gx;
if (gy > -tan3pi8gx && gy < -tanpi8gx) return DIRECTION_45UP;
if (gy > -tanpi8gx && gy < tanpi8gx) return DIRECTION_HORIZONTAL;
if (gy > tanpi8gx && gy < tan3pi8gx) return DIRECTION_45DOWN;
}
return DIRECTION_VERTICAL;
}
```

We should now have a way faster version than our first naive implementation.

## Benchmark

Following is a naive `time`

benchmark, on a i7 Intel CPU, and using `gcc -O3`

:

Run | naive | no atan2 | integer |
---|---|---|---|

1 | 0.257 | 0.032 | 0.025 |

2 | 0.251 | 0.032 | 0.029 |

3 | 0.255 | 0.029 | 0.027 |

As usual, performances depend on a lot of various other parameters, but it looks like it was worth the effort since it's almost 10 times faster (and now bit exact).