PageRank for Images
Ranking images in image search would be much easier if computers could interpret exactly what an image contains. Currently, a “semantic gap” exists between the descriptions generated by computer vision algorithms and the actual content of an image.
Instead of trying to detect what an image contains, another approach to ranking is to compare two images and check whether they share similar “features”. This section discusses approaches presented in two Google papers: “PageRank for Product Image Search” by Y. Jing and S. Baluja, and “Canonical Image Selection from the Web” by Y. Jing, S. Baluja, and H. Rowley.
The idea behind the “PageRank for images”, or “Image Rank”, approach is that many images in the result set for a query are expected to share similar features. We can “link” these images together based on those shared features and compute a rank from the resulting visual link structure (an image similarity graph). This is analogous to PageRank: where PageRank has web pages and hyperlinks forming a link graph, here images and shared similarity features form an image similarity graph. An image derives its importance from the images that share similar features with it.
The Image Rank computation algorithm based on the Google paper is described in the following steps.
- Step 0: Crawl and collect images based on text content in web pages, as well as labeled or tagged images. We call this the “Image Collection”.
- Step 1: Retrieve “n” images for a query string “q” from the image collection. The query string chosen for this process could come from the set of top search query terms.
- Step 2: Identify interest points (features) and generate feature descriptors for each of these images. Popular feature detection methods such as Scale Invariant Feature Transform (SIFT), Harris corners, or spin images could be used; the Google paper uses SIFT.
- Step 3: Identify matching features among these images. Given two images “u” and “v” and their corresponding feature descriptor vectors, Du = (d1u, d2u, …, dju) and Dv = (d1v, d2v, …, dkv), the similarity between the two images is defined as the number of interest points shared between them divided by their average number of interest points. This is stored as the similarity measure score between the two images.
- Step 4: Construct a similarity graph, with images as vertices and edges weighted by the similarity measure score (see figure below).
- Step 5: Rank images in the similarity graph using eigenvector centrality (similar to PageRank). The idea here is to iteratively compute the importance of a vertex (image): with other factors being equal, a vertex closer to an “important” vertex should rank higher than others. Display the images for query “q” in order of their Image Rank (IR).
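Steps 3 and 4 above can be sketched in Python. This is a simplified illustration, not the paper's implementation: real SIFT descriptors are 128-dimensional vectors matched via nearest-neighbor search, while the `match_count` helper here is a hypothetical, naive stand-in that counts descriptor pairs within a distance threshold.

```python
import math

def match_count(desc_u, desc_v, threshold=0.6):
    """Count descriptor pairs closer than `threshold` (Euclidean distance).
    A naive stand-in for proper nearest-neighbor SIFT matching."""
    matches = 0
    for du in desc_u:
        for dv in desc_v:
            if math.dist(du, dv) < threshold:
                matches += 1
                break  # each descriptor in u matches at most one in v
    return matches

def similarity(desc_u, desc_v):
    """Shared interest points divided by the average number of interest points."""
    shared = match_count(desc_u, desc_v)
    avg = (len(desc_u) + len(desc_v)) / 2.0
    return shared / avg if avg else 0.0

def build_similarity_graph(descriptors):
    """descriptors: dict mapping image id -> list of descriptor vectors.
    Returns a weighted adjacency dict {u: {v: similarity(u, v)}}."""
    graph = {u: {} for u in descriptors}
    ids = list(descriptors)
    for i, u in enumerate(ids):
        for v in ids[i + 1:]:
            s = similarity(descriptors[u], descriptors[v])
            if s > 0:
                graph[u][v] = s
                graph[v][u] = s  # similarity is symmetric
    return graph
```

Only edges with a nonzero score are kept, so unrelated images stay disconnected in the graph.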
We can express the computation of “Image Rank” as follows:
- IR = Image Rank vector
- n = Number of images
- Su,v = Measure of visual similarity between images u and v
- S = Similarity matrix, with entries Su,v
- S* = Column-normalized, symmetric adjacency matrix derived from S
- d = Damping factor (d > 0.8, chosen empirically)

IR = d × S* × IR + (1 − d) × (1/n) × [1, 1, …, 1]T

That is, IR is the fixed point of repeatedly multiplying by the normalized similarity matrix, with a small uniform component mixed in via the damping factor, just as in PageRank.
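This fixed-point computation can be sketched as a short power iteration. The following is a minimal illustration of the idea (not the paper's production code), taking the raw similarity matrix S as nested lists and normalizing columns internally:

```python
def image_rank(S, d=0.85, iterations=50):
    """Power iteration for IR = d * S_norm * IR + (1 - d) / n.
    S: n x n symmetric similarity matrix, S[u][v] = similarity of u and v.
    d: damping factor (the paper suggests d > 0.8)."""
    n = len(S)
    # Column-normalize S so each column sums to 1 (the S* of the text).
    col_sums = [sum(S[u][v] for u in range(n)) for v in range(n)]
    ir = [1.0 / n] * n  # start from a uniform distribution
    for _ in range(iterations):
        ir = [
            (1 - d) / n
            + d * sum(S[u][v] * ir[v] / col_sums[v]
                      for v in range(n) if col_sums[v])
            for u in range(n)
        ]
    return ir
```

Because the normalized matrix is column-stochastic, the ranks stay a probability distribution, and images with many strong similarity edges accumulate the most rank.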
This PageRank-like approach for images works well for popular queries, which have a large number of similar images. The downside is that it requires a large amount of computation and a large volume of images in the image collection.
An interesting by-product of this approach is the option to search for “similar images”, since the similarity data is effectively gathered while computing the Image Rank.
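Once the similarity graph exists, a “similar images” lookup is just a sorted neighbor query. A small sketch, assuming the graph is a weighted adjacency dict of the form {image: {neighbor: score}}:

```python
def similar_images(graph, image_id, top_k=5):
    """Return up to `top_k` neighbors of `image_id`, most similar first."""
    neighbors = graph.get(image_id, {})
    ranked = sorted(neighbors.items(), key=lambda kv: kv[1], reverse=True)
    return [img for img, _ in ranked[:top_k]]
```

An image with no edges (or an unknown id) simply yields an empty result list.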