Archive vision (Archv) provides a model for the computational analysis of large sets of images. It is a tool for pattern recognition: it finds recurring visual elements within the set and quantitatively ranks their differences. The application uses SURF feature detection and extraction to find keypoints within the images to be compared. The keypoints are matched using their descriptors, and those matches are then filtered based on their homography. The application consists of four programs that can be used together or separately, depending on your needs:

  1. showKeypoints - a tool to find and display the keypoints of an image given certain SURF parameters
  2. processImages - a tool to process an entire directory of images and output files containing their keypoints and descriptors in YML format
  3. scanDatabase - a tool to find matches between a seed image and a directory of files containing keypoints and descriptors of an image set
  4. drawMatches - a tool that draws two images and their robust matches

Setting Up Archv

To use Archv, you need to have OpenCV installed.

Installing OpenCV

For OpenCV, you need several dependencies: gcc, g++, cmake and several video and image libraries specified on their site. For Linux, use these links to install OpenCV:

Provided is a simplified version of the process for building OpenCV on Unix based systems:

Compiling Archv

Once OpenCV is installed and the libraries are included, go to your Archv directory and run make all. You should be left with an executable (.exe) version of each program: processImages.exe, scanDatabase.exe, showKeypoints.exe and drawMatches.exe.

Extracting Keypoints

Archv uses SURF (SURF Website) for keypoint detection and neighbourhood description. SURF is fast, scale invariant and orientation invariant, making it well suited to finding matches across an image set. In particular, this project demonstrates the flexibility of SURF keypoint detection for Early Modern image sets, where there is wide variance in printing quality.

Archv builds upon tools provided by OpenCV, an open source and powerful library for image analysis. OpenCV's documentation can be found on their website:

The program showKeypoints allows the user to test the parameters that SURF uses for keypoint detection. This is useful because those parameters need to be tuned for each image set being tested. SURF parameters that work well for one image set might not work at all for another.

Using showKeypoints

showKeypoints reads in a SURF parameter file, an image, and an output file path. It then writes a copy of the original image, with circles drawn to mark the detected keypoints, to the output file. Below is an example of an image from the British Library's Flickr image set with the keypoints drawn on.

Usage for showKeypoints:

$ ./showKeypoints.exe -i <path to seed image> -o <path to output file.jpg> -p <path to parameter file>
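Because SURF parameters usually need tuning per image set, it can help to run showKeypoints once per candidate parameter file and compare the output images side by side. The following is a minimal Python sketch that only builds the command lines, using the -i, -o and -p flags from the usage above. The function name, the output naming scheme and the example file names are hypothetical and not part of Archv.

```python
from pathlib import Path


def show_keypoints_commands(image, param_files, out_dir, exe="./showKeypoints.exe"):
    """Build one showKeypoints command per candidate parameter file."""
    commands = []
    for param_file in param_files:
        # Name each output image after its parameter file (hypothetical scheme)
        # so the results are easy to tell apart.
        out_path = Path(out_dir) / f"{Path(param_file).stem}_keypoints.jpg"
        commands.append(
            [exe, "-i", str(image), "-o", str(out_path), "-p", str(param_file)]
        )
    return commands


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    for cmd in show_keypoints_commands("seed.jpg", ["params_a.txt", "params_b.txt"], "out"):
        print(" ".join(cmd))
```

Each returned list can be passed to subprocess.run once the executable has been built.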

showKeypoints only works with a single input image. To process an entire image set, use processImages.

Processing an Image Set

The program processImages allows the user to extract and store all the keypoints and their descriptors for an entire image set. For each image in the set, the SURF keypoints and their descriptors are extracted and recorded in a YAML file. When completed, the user will have a directory of YAML files corresponding to the images from the set. This is a required step for the next program, scanDatabase.

Using processImages

processImages reads in a parameter file (for SURF), an input directory that contains the images to be processed, and an output directory in which to put the YAML files containing the discovered keypoints and their descriptors. Make sure that the output directory exists and is empty before running this program. This step can take a long time, depending on the number of images and the number of keypoints.
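The requirement that the output directory exists and is empty can be checked before starting a long run. This is a small Python sketch of such a check; the function name is hypothetical and the check is not part of Archv itself.

```python
from pathlib import Path


def output_dir_ready(path):
    """Return True if `path` is an existing directory with no entries."""
    directory = Path(path)
    if not directory.is_dir():
        return False
    # any() stops at the first entry, so a large directory is not fully listed.
    return not any(directory.iterdir())
```

Calling output_dir_ready on the intended output path before launching processImages avoids discovering a problem only after the run has started.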

Usage for processImages:

$ ./processImages.exe -i <path to imageset> -o <path to output directory> -p <path to parameter file>

For an example of a .yml file containing keypoints and descriptors, see the Archv README. Once this step is completed, you can run scanDatabase to find matches within your image set.

Finding Matches

The program scanDatabase reads in your seed image, extracts the keypoints and descriptors for that image, and compares them with the keypoints and descriptors stored in every .yml file. Each comparison is done using a robust filter that checks the sensitivity, symmetry, and geometric proximity of the matches. Images are then ranked by the number of good matches (those that survive the filter) they have with the seed image.
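The first two stages of such a filter can be sketched in plain Python on toy descriptors. This sketch assumes that "sensitivity" refers to a nearest-neighbour ratio test (the best match must be clearly closer than the second best) and that "symmetry" means a match must hold in both directions. Both are assumptions about the filter, and the function names and the 0.8 ratio are illustrative, not taken from Archv. The geometric stage is not included here.

```python
import math


def two_nearest(query, candidates):
    """Return (index, distance) for the two candidates closest to `query`."""
    ranked = sorted((math.dist(query, c), i) for i, c in enumerate(candidates))
    return [(i, d) for d, i in ranked[:2]]


def ratio_matches(src, dst, ratio=0.8):
    """Map src index -> dst index for matches that pass the ratio test."""
    matches = {}
    for i, descriptor in enumerate(src):
        nearest = two_nearest(descriptor, dst)
        if len(nearest) < 2:
            continue  # the ratio test needs a second-best candidate
        (best, d1), (_, d2) = nearest
        # Keep the match only if the best distance is clearly smaller
        # than the second-best distance.
        if d1 < ratio * d2:
            matches[i] = best
    return matches


def symmetric_matches(a, b, ratio=0.8):
    """Keep (i, j) pairs where a[i] -> b[j] and b[j] -> a[i] both pass."""
    forward = ratio_matches(a, b, ratio)
    backward = ratio_matches(b, a, ratio)
    return sorted((i, j) for i, j in forward.items() if backward.get(j) == i)
```

Real SURF descriptors have 64 or 128 dimensions; math.dist works on sequences of any equal length, so the same functions apply unchanged, although a brute-force search like this one is slow for large sets.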

Using scanDatabase

scanDatabase reads in a seed image, an input directory, the directory of .yml keypoint files, a file path for an output JSON (text) file, and the path to the SURF parameter file. Below is an illustration of the top three matches next to the seed image as found by scanDatabase.

Usage for scanDatabase:

$ ./scanDatabase.exe -i <path to seed image> -d <path to input directory> -k <path to keypoints directory> -o <path to output file> -p <path to parameter file>

The output file contains a ranked list of each matching image and the number of robust matches.
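The ranking step itself is simple once each image has a match count. The Python sketch below orders images by their number of robust matches, best first. The input dictionary, the min_matches cutoff and the tie-breaking rule are assumptions made for illustration, since the exact layout of the output file is not shown here.

```python
def rank_by_matches(match_counts, min_matches=1):
    """Return (image name, match count) pairs, best match first."""
    kept = [(name, n) for name, n in match_counts.items() if n >= min_matches]
    # Sort by descending count; break ties by name so the order is stable.
    return sorted(kept, key=lambda item: (-item[1], item[0]))
```

Images with fewer than min_matches robust matches are dropped from the list rather than ranked last.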

Comparing Images

The program drawMatches allows the user to see the robust keypoint matches between two images. It is most useful for inspecting the matches that scanDatabase found against the seed image.

Using drawMatches

drawMatches takes as input two images, the path to an output image file, and the path to the parameter file. The code is self-contained, so you can input any two images and any SURF parameter file to find the keypoints that match and have passed the robust homography filter. Below is an example of two images and the robust matches they share.

Usage for drawMatches:

$ ./drawMatches.exe -i1 <path to seed image one> -i2 <path to image for comparison> -o <path to output file> -p <path to parameter file>
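To give a sense of what passing a homography filter means, the Python sketch below tests matched point pairs against a 3x3 homography that is assumed to be already known: a pair is kept when the projected source point lands within a few pixels of its matched destination point. How drawMatches estimates the homography is not described here; an OpenCV-based tool would commonly use a RANSAC estimate, but that is an assumption. The function names and the 3-pixel threshold are illustrative.

```python
import math


def project(h, point):
    """Apply a 3x3 homography (row-major nested lists) to an (x, y) point."""
    x, y = point
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    if w == 0:
        raise ValueError("point maps to infinity under this homography")
    return (
        (h[0][0] * x + h[0][1] * y + h[0][2]) / w,
        (h[1][0] * x + h[1][1] * y + h[1][2]) / w,
    )


def homography_inliers(h, matches, max_error=3.0):
    """Keep (src, dst) pairs whose reprojection error is at most max_error pixels."""
    return [
        (src, dst)
        for src, dst in matches
        if math.dist(project(h, src), dst) <= max_error
    ]
```

With a pure translation as the homography, for example, pairs that follow the translation are kept and a pair that points somewhere unrelated is discarded.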