At the forefront of Artificial Intelligence
  Home Articles Reviews Interviews JDK Glossary Features Discussion Search
Home » Articles » Machine Vision » Beginner

Convolution and Correlation

Convolution is a very important operation in image processing. It basically involves calculating the weighted sum of a neighbourhood of pixels. The weights are taken from a convolution kernel. Each value from the neighbourhood of pixels is multiplied with its opposite on the matrix. For example, the top-left of the neighbour is multiplied by the bottom-right of the kernel. All these values are summed up, the this is the result of the convolution.

This operation can be mathematically represented as:

Note that this only applies to kernels that have dimensions 3x3. Since the operations revolve around a particular pixel, neighbourhoods are always of odd dimensions (3x3, 5x5, 7x7...). The neighbourhoods are nearly always square too.

A Stepped Example

Below is a portion of an image, with a fragment of it highlighted and the greyscale values shown. To the right of that is our convolution kernel (incidentally, one of the Sobel kernels).

So to calculate our convolution value:
N(x,y) = (-1 * 222)  +
         ( 0 * 170)  +
         ( 1 * 149)  +
         (-2 * 173)  +
         ( 0 * 147)  +
         ( 2 * 205)  +
         (-1 * 149)  +
         ( 0 * 198)  +
         ( 1 * 221)  = 63 

Correlation

Correlation is nearly identical to convolution bar one minor difference:

Spot the difference? Instead of multiplying the pixel by the opposite in the kernel, you multiply it by the equivalent (top-left multiplied by top-left). Using our example above, we can calculate that the result of a correlation is -63.

Computational Notes

When implementing a convolution or correlation algorithm, there are certain important things you should think about. Firstly, since this a neighbourhood operation, not all pixels in the original image are going to have a complete neighbour (border pixels) so you need a way to deal with these. The most common way is just to ignore them and have an output image slightly smaller than the input. Other techniques include truncating the kernel to process the edge pixels correctly, but this can be complex to implement.

The second, most important factor is efficiency. Any programmer will have noticed that the convolution algorithm is naturally costly since it requires retrieval of the neighbourhood, then numerous operations upon that neighbourhood. The most common way to cut down the operation is to take into account the shifting window of operation.

You can see that the neighbourhood in the first iteration (left) and the neighbourhood in the second iteration (right) share two-thirds of the same values. If the algorithm takes this into account, it is only necessary to make 3 retrievals instead of 9. Savings are even more significant when the kernels are bigger (i.e., for 11x11 kernels, 11 retrievals are made instead of 121!).

As always, see Image Analysis Explorer Pro for examples and code.

Submitted: 30/08/2002

Article content copyright © James Matthews, 2002.
 Article Toolbar
Print
BibTeX entry

Search

Latest News
- The Latest (03/04/2012)
- Generation5 10-year Anniversary (03/09/2008)
- New Generation5 Design! (09/04/2007)
- Happy New Year 2007 (02/01/2007)
- Where has Generation5 Gone?! (04/11/2005)

What's New?
- Back-propagation using the Generation5 JDK (07/04/2008)
- Hough Transforms (02/01/2008)
- Kohonen-based Image Analysis using the Generation5 JDK (11/12/2007)
- Modelling Bacterium using the JDK (19/03/2007)
- Modelling Bacterium using the JDK (19/03/2007)


All content copyright © 1998-2007, Generation5 unless otherwise noted.
- Privacy Policy - Legal - Terms of Use -