Kohonen-based Image Analysis using the Generation5 JDK
This article shows how to use the Generation5 JDK to build an image analyzer that detects similar images automatically. You will need a basic understanding of Kohonen networks, Hough transforms and histograms.
The Concept
I wanted to create an example of a Kohonen network that detected similar images, using a range of image analysis metrics to detect images that weren't just similar from a colour perspective but also in structure. In previous experiments, I used images of cars which often had strong colour connections. For this version, I looked at using images of fighter aircraft as they have a mix of backgrounds, strong lines and many aircraft look quite similar. Below is a shot of the network's output (bottom-right), with a few examples pulled out to show how the network groups photos. Note that many of the shots are actually different aircraft grouped together in similar orientations or backgrounds. Three groupings of the F-4 Phantom, Su-27 Flanker and F-117 stealth fighter are similar shots but taken from slightly different angles, and another grouping clearly shows a connection by silhouetting the foreground aircraft on a sunset.
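Theory
The program loads a directory of equal-sized JPEG images and generates a feature vector of nearly 300 values for each image (296 by default). As the code later in this article shows, each vector comprises the mean and standard deviation of the three colour channels for the whole image and for each subimage, together with the sine and cosine components of the strongest Hough line in each subimage and in the whole image.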
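The layout of the vector can be worked out from the indexing used in the generateVectors method shown later in this article. The following is only a sketch of that arithmetic; the class, constant and method names are illustrative and are not part of the Generation5 JDK.

```java
/** Index arithmetic for the per-image feature vector (illustrative names only). */
final class FeatureLayout {
    static final int GLOBAL_HISTOGRAM = 6; // mean + standard deviation of 3 channels
    static final int PER_SUBIMAGE = 8;     // 6 histogram values + sine/cosine of one line
    static final int GLOBAL_LINE = 2;      // sine/cosine of one whole-image line

    /** Total number of features for a given number of subimages. */
    static int totalFeatures(int subImages) {
        return GLOBAL_HISTOGRAM + subImages * PER_SUBIMAGE + GLOBAL_LINE;
    }

    /** Index of the first feature belonging to subimage j (0-based). */
    static int subImageOffset(int j) {
        return GLOBAL_HISTOGRAM + j * PER_SUBIMAGE;
    }

    /** Index of the first whole-image Hough feature. */
    static int globalLineOffset(int subImages) {
        return GLOBAL_HISTOGRAM + subImages * PER_SUBIMAGE;
    }

    public static void main(String[] args) {
        System.out.println(totalFeatures(36));    // 296, the same as subImages * 8 + 8
        System.out.println(globalLineOffset(36)); // 294, the value of "last" in generateVectors
    }
}
```

With the default of 36 subimages this gives 296 features per image, which is the "nearly 300" figure.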
Understanding the histogram portion is straightforward (understanding the colour profile of the image as a whole and of its subimages is important for grouping photos together), but we should look quickly at the subimages and Hough transforms. Visualizing what the program does is the easiest way to understand the process. The program splits the image into subimages, with a little padding on each side to ensure that no detail is lost at the edges. The Hough transform then finds any strong lines within the subimages, providing valuable data about the objects within the image itself. Below is a composite image of the Hough transforms on the subimages. You can see how the strong lines within the photo are highlighted, including the bomb.
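The subimage generation routine is omitted from the listing in this article, so here is a minimal sketch of how such a split could work. It is not the author's generateSubImages method: the class name, the method signature and the choice to clamp the padding at the image border are all assumptions made for illustration.

```java
import java.awt.image.BufferedImage;

/** Hypothetical helper: splits an image into a square grid of padded tiles. */
final class SubImageGrid {
    /**
     * Returns grid * grid tiles in row-major order. Each tile is grown by
     * `padding` pixels on every side (clamped to the image bounds), so that
     * neighbouring tiles overlap and lines on a tile boundary are not lost.
     */
    static BufferedImage[] split(BufferedImage img, int grid, int padding) {
        int w = img.getWidth();
        int h = img.getHeight();
        BufferedImage[] tiles = new BufferedImage[grid * grid];
        for (int row = 0; row < grid; row++) {
            for (int col = 0; col < grid; col++) {
                // Unpadded tile boundaries; together they cover the whole image
                int x0 = col * w / grid, x1 = (col + 1) * w / grid;
                int y0 = row * h / grid, y1 = (row + 1) * h / grid;
                // Grow the tile by the padding and clamp to the image
                int px0 = Math.max(0, x0 - padding);
                int py0 = Math.max(0, y0 - padding);
                int px1 = Math.min(w, x1 + padding);
                int py1 = Math.min(h, y1 + padding);
                // getSubimage returns a view that shares pixel data with img
                tiles[row * grid + col] =
                        img.getSubimage(px0, py0, px1 - px0, py1 - py0);
            }
        }
        return tiles;
    }
}
```

For the default of 36 subimages the grid size would be 6. Because getSubimage shares its pixel data with the source image, the tiles should be treated as read-only unless they are copied first.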
Code
Let's start looking at the code. For brevity, I've removed the basic utility routines (get/set) as well as code that is redundant or not pertinent here (generating/displaying subimages). A link to the full code is available below.

    package org.generation5.demos.nn;

    import java.io.*;
    import java.awt.*;
    import java.awt.image.*;
    import javax.imageio.*;
    import org.generation5.*;
    import org.generation5.nn.*;
    import org.generation5.util.*;
    import org.generation5.vision.*;

    public class KohonenImageClassifier implements Visualizable {
        public static int networkWidth = 7;
        public static int networkHeight = 7;
        public static int subImages = 36;
        public static int numberFeatures = subImages * 8 + 8;

These are the initial variables for the network. They were chosen after a fair amount of experimentation; the 7x7 size deliberately constrains the network so that it groups the images more tightly. I used 188 test images, which means the network has to group 3 or 4 images per neuron on average (188 / 49 is roughly 3.8). The 36 subimages allowed the network to gather enough information at a granular level without too much redundancy. These are great variables to play with to see how the network reacts.

        protected BufferedImage[] inputImages;
        protected int numberImages = 0;
        protected File[] imageFiles;
        protected double[][] featureVectors;
        KohonenNN kohonenNetwork;

        // Utility functions removed

        public void generateVectors() {
            if (inputImages == null)
                throw new NullPointerException("no input images loaded!");

            System.err.print("Generating vectors...");
            featureVectors = new double[numberImages][numberFeatures];

            double[] data;
            double[][] lines;
            int subWidth = (int)Math.sqrt(subImages);
            Histogram hist;
            BufferedImage[] subImage = new BufferedImage[subImages];
            HoughImageClassifier hic = new HoughImageClassifier();
            hic.setLocalPeakNeighbourhood(25);

There are a few things to note at this point. Firstly, the code is badly commented; I'll rectify this in the full source code!
Secondly, HoughImageClassifier is not a standard Generation5 JDK class, but a class I derived from LineHoughTransformOp (definition below). I did this because we needed to be a little more precise and efficient when retrieving strong lines within the image. Hough transforms often return as many lines as meet a certain criterion, whereas I just wanted a set number of lines. Therefore, HoughImageClassifier has a getLines method that stops processing once the required number of lines has been found. It returns the lines as a two-dimensional double array, using the sine and cosine components taken from the Hough space.

            int width = inputImages[0].getWidth();
            int height = inputImages[0].getHeight();

            for (int i = 0; i < numberImages; i++) {
                // Whole-image histogram: mean and standard deviation per channel
                hist = new Histogram(inputImages[i]);
                data = hist.getMean();
                featureVectors[i][0] = data[0] / 255.0;
                featureVectors[i][1] = data[1] / 255.0;
                featureVectors[i][2] = data[2] / 255.0;
                data = hist.getStandardDeviation();
                featureVectors[i][3] = data[0] / 255.0;
                featureVectors[i][4] = data[1] / 255.0;
                featureVectors[i][5] = data[2] / 255.0;

                // Generate subimages with padding
                // ...
                // (subimage generation method removed for brevity)
                subImage = generateSubImages(inputImages[i], 10);

                for (int j = 0; j < subImages; j++) {
                    // Per-subimage histogram: 6 values starting at index 6 + j * 8
                    hist = new Histogram(subImage[j]);
                    data = hist.getMean();
                    featureVectors[i][6 + (j * 8)] = data[0] / 255.0;
                    featureVectors[i][7 + (j * 8)] = data[1] / 255.0;
                    featureVectors[i][8 + (j * 8)] = data[2] / 255.0;
                    data = hist.getStandardDeviation();
                    featureVectors[i][9 + (j * 8)] = data[0] / 255.0;
                    featureVectors[i][10 + (j * 8)] = data[1] / 255.0;
                    featureVectors[i][11 + (j * 8)] = data[2] / 255.0;

                    // One Hough line per subimage
                    lines = hic.getLines(subImage[j], 1);
                    featureVectors[i][12 + (j * 8)] = lines[0][0];
                    featureVectors[i][13 + (j * 8)] = lines[0][1];
                }

                // One Hough line for the whole image fills the final two slots
                int last = 13 + ((subImages - 1) * 8) + 1;
                lines = hic.getLines(inputImages[i], 1);
                featureVectors[i][last] = lines[0][0];
                featureVectors[i][last+1] = lines[0][1];
            }

            System.err.println("done.");
        }

The remainder of the code in this method should be easy to understand. Again, notice how getLines is used to return just one line. You can experiment with increasing the number of lines, although I generally found that this confused the network.

        public void trainNetwork() {
            // ...
            kohonenNetwork = new KohonenNN(networkWidth, networkHeight, numberFeatures);
            KohonenImageTrainer train = new KohonenImageTrainer();
            train.featureVectors = featureVectors;
            train.setNetwork(kohonenNetwork);
            train.init();
            train.setPhases(inputImages.length * 15, inputImages.length * 5);

            kohonenNetwork.initialize(0.0, 1.0);

            do {
                train.doStep();
            } while (!train.isComplete());

            System.err.println("done.");
        }

The network training method is very simple too. After setting up the network, we initialize the trainer. A trainer in the Generation5 JDK allows you to be very flexible with the data you present to the network during training. Our trainer, KohonenImageTrainer, is defined below and simply cycles through the training images' feature vectors.
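For readers who want to see what a single training step does conceptually, here is a minimal sketch of the textbook Kohonen update: find the neuron closest to the input, then pull it and its grid neighbours towards that input. This is an assumption about the general technique, not the Generation5 KohonenNN or KohonenTrainer implementation; the class, its methods and the Gaussian neighbourhood function are all illustrative choices.

```java
import java.util.Random;

/** Minimal self-organising map sketch; not the Generation5 KohonenNN class. */
final class SomSketch {
    final int width, height;
    final double[][][] weights; // weights[x][y][feature]

    SomSketch(int width, int height, int features, Random rng) {
        this.width = width;
        this.height = height;
        weights = new double[width][height][features];
        for (double[][] column : weights)
            for (double[] w : column)
                for (int f = 0; f < features; f++)
                    w[f] = rng.nextDouble(); // random start in [0, 1)
    }

    /** Squared Euclidean distance between neuron (x, y) and the input. */
    double distance2(int x, int y, double[] input) {
        double sum = 0;
        for (int f = 0; f < input.length; f++) {
            double d = input[f] - weights[x][y][f];
            sum += d * d;
        }
        return sum;
    }

    /** Returns {x, y} of the neuron whose weights are closest to the input. */
    int[] closestNeuron(double[] input) {
        int[] best = {0, 0};
        double bestDist = Double.MAX_VALUE;
        for (int x = 0; x < width; x++)
            for (int y = 0; y < height; y++) {
                double d = distance2(x, y, input);
                if (d < bestDist) { bestDist = d; best = new int[] {x, y}; }
            }
        return best;
    }

    /** One training step: pull the winner and its grid neighbours towards the input. */
    void train(double[] input, double learningRate, double radius) {
        int[] bmu = closestNeuron(input);
        for (int x = 0; x < width; x++)
            for (int y = 0; y < height; y++) {
                double grid2 = (x - bmu[0]) * (x - bmu[0]) + (y - bmu[1]) * (y - bmu[1]);
                if (grid2 > radius * radius) continue; // outside the neighbourhood
                // Gaussian falloff; the winner itself always gets full influence
                double influence = grid2 == 0 ? 1.0 : Math.exp(-grid2 / (2 * radius * radius));
                for (int f = 0; f < input.length; f++)
                    weights[x][y][f] += learningRate * influence * (input[f] - weights[x][y][f]);
            }
    }
}
```

In a full training run the learning rate and radius would normally shrink over time. How the Generation5 trainer schedules this across its two phases is not shown in this article, so the sketch leaves both as plain parameters.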
The two training phases are set up to be 15 times the number of images for phase 1 and a relatively short phase 2 (5 times the number of images). This is quite a long training cycle for a Kohonen network, but it seemed to work best when deliberately constraining the size of the network.

        public void findSimilarImages(String s) {
            // Generate feature vector

            int[] neuron = kohonenNetwork.getClosestNeuron(features), in;
            for (int i = 0; i < inputImages.length; i++) {
                in = kohonenNetwork.getClosestNeuron(featureVectors[i]);
                // A training image matches if it maps to the same neuron
                if (in[0] == neuron[0] && in[1] == neuron[1]) {
                    System.out.println("Matching image with " + imageFiles[i].getName());
                }
            }
        }

The findSimilarImages method is a simple test function I wrote that loads an image not in the training set and returns any images that the network deems similar. You can see an example below:
The first image returned bears a very close structural resemblance to the query image despite being a different aircraft (F-22 vs. F-35). The other two images seem to be matched more on the background.

        public static void main(String[] args) {
            KohonenImageClassifier kic = new KohonenImageClassifier();
            kic.loadImages(args[0]);
            kic.generateVectors();
            kic.trainNetwork();
            kic.writeImage("kohonenOutput.jpg");
            kic.findSimilarImages(args[1]);
        }

        static public class KohonenImageTrainer extends KohonenTrainer {
            public double[][] featureVectors;
            private int currentStep = 0;

            // Cycle through the feature vectors in order, wrapping around
            public double[] getTrainingPoint() {
                return featureVectors[currentStep++ % featureVectors.length];
            }
        }

        static public class HoughImageClassifier extends LineHoughTransformOp {
            public double[][] getLines(BufferedImage img, int numberLines) {
                double[][] lines = new double[numberLines][2];
                int index = 0;

                GreyscaleFilter s1 = new GreyscaleFilter();
                SobelEdgeDetectorFilter s2 = new SobelEdgeDetectorFilter();
                ThresholdFilter s3 = new ThresholdFilter();

                BufferedImage in1 = s1.filter(img); // to greyscale
                BufferedImage in2 = s2.filter(in1); // edge detection
                BufferedImage in3 = s3.filter(in2); // threshold

                double accRatio = 1.0d;
                setLocalPeakNeighbourhood(25);
                run(in3);

                maxAccValue = getMaximum();
                long thresh = (int) (accRatio * maxAccValue);

                int tmp = Math.max(img.getWidth(), img.getHeight());
                int hh = (int) (Math.sqrt(2) * tmp);
                int hw = 180;

                // Scan the accumulator and stop once enough lines are found
                for (int i = 0; i < hw; i++) {
                    for (int j = 0; j < 2 * hh; j++) {
                        if (houghAccumulator[i][j] >= thresh) {
                            if (localPeak(i, j, hw, hh, localPeakNeighbourhood) == false)
                                continue;

                            lines[index][0] = Math.sin(i * thetaStep);
                            lines[index][1] = Math.cos(i * thetaStep);
                            index++;
                        }
                        if (index == numberLines) break;
                    }
                    if (index == numberLines) break;
                }

                return lines;
            }
        }
    }

The remainder of the code is the test method, which simply reads all the images from a directory, trains the network, renders the network to disk and finally runs a previously unseen image against the network.
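To make the Hough step in HoughImageClassifier a little more concrete, here is a self-contained sketch of a basic line Hough transform that returns only the strongest line. It does not use the Generation5 LineHoughTransformOp class and skips the local-peak test; the class name, the boolean edge-map input and the one-degree angle step are assumptions chosen to keep the example short.

```java
/** Plain-Java line Hough transform sketch; not the Generation5 implementation. */
final class HoughSketch {
    /**
     * Returns {thetaDegrees, rho} for the strongest line through the set pixels
     * of a binary edge map (indexed edges[y][x]), using the normal form
     * rho = x * cos(theta) + y * sin(theta).
     */
    static int[] strongestLine(boolean[][] edges) {
        int h = edges.length;
        int w = edges[0].length;
        int maxRho = (int) Math.ceil(Math.hypot(w, h));
        // rho can be negative, so shift it by maxRho when indexing
        int[][] acc = new int[180][2 * maxRho + 1];

        // Every edge pixel votes for all lines that could pass through it
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                if (!edges[y][x]) continue;
                for (int t = 0; t < 180; t++) {
                    double theta = Math.toRadians(t);
                    int rho = (int) Math.round(x * Math.cos(theta) + y * Math.sin(theta));
                    acc[t][rho + maxRho]++;
                }
            }
        }

        // The accumulator cell with the most votes is the strongest line;
        // on a tie the first cell in scan order wins
        int bestT = 0, bestR = 0, bestVotes = -1;
        for (int t = 0; t < 180; t++) {
            for (int r = 0; r < acc[t].length; r++) {
                if (acc[t][r] > bestVotes) {
                    bestVotes = acc[t][r];
                    bestT = t;
                    bestR = r;
                }
            }
        }
        return new int[] {bestT, bestR - maxRho};
    }
}
```

The article's getLines method stores the sine and cosine of the winning angle rather than the angle itself. With this sketch that would be Math.sin(Math.toRadians(theta)) and Math.cos(Math.toRadians(theta)), which keeps both features within the range -1 to 1.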
The merits of the network can be seen from this sample output. Notice how the picture we used above in the subimage example is grouped with other pictures of different aircraft dropping bombs:
The network isn't perfect, grouping seemingly disconnected images together (the SR-71 above, bottom centre, for example, seems a little out of place), but it is a good start and not bad for a couple of hours' work. I would be very interested to hear from anyone who has made improvements or experimented with the code.

Downloads
Until I figure out how on earth to run the Javadoc tool in NetBeans 6 (which, incidentally, must have some of the worst "support" I've ever seen), I'm afraid this is just the raw source file, with little in the way of documentation. Hopefully the article goes some way towards unravelling it.
Submitted: 11/12/2007 Article content copyright © James Matthews, 2007.
All content copyright © 1998-2007, Generation5 unless otherwise noted.