Non-Negative Matrix Factorization

I recently took the course “Multimedia and Web Databases” and we learnt to use feature reduction techniques to reduce the dimensionality of the multimedia data. This is done so that we can scan the entire database (relatively) fast for approximately similar data.

One of our project tasks involved using NMF to find top ‘k’ images closest to a given image in the Euclidean space. Our project can be found here.

The image feature we used for this task was Color Moments (More information about that here).

Color Moments contain [mean, std, skew] for every channel of the YUV color model of an image. So, each window of an image has 3 × 3 = 9 features. If a 1600 × 1200 image is divided into 100 × 100 windows, that gives 16 × 12 = 192 windows, i.e. 192 × 9 = 1728 features per image.
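A minimal sketch of that feature extraction, assuming non-overlapping 100 × 100 windows and using `skimage`/`scipy` for the colour conversion and the skew (the exact windowing and libraries in our project may have differed):

```python
import numpy as np
from scipy.stats import skew
from skimage.color import rgb2yuv

def color_moments(rgb_image, win=100):
    """Per-window [mean, std, skew] for each YUV channel, flattened."""
    yuv = rgb2yuv(rgb_image)                    # H x W x 3
    h, w, _ = yuv.shape
    feats = []
    for y in range(0, h - h % win, win):
        for x in range(0, w - w % win, win):
            window = yuv[y:y + win, x:x + win].reshape(-1, 3)
            for c in range(3):                  # Y, U, V
                ch = window[:, c]
                feats.extend([ch.mean(), ch.std(), skew(ch)])
    return np.asarray(feats)                    # 12 * 16 * 9 = 1728 for a 1200 x 1600 array
```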

Now all we had to do was apply feature reduction techniques to it.
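For context, the retrieval step looked roughly like this. It is only a sketch, assuming scikit-learn's `NMF`, a non-negative feature matrix `X` (one row per image), and an arbitrary choice of latent components:

```python
import numpy as np
from sklearn.decomposition import NMF

def top_k_similar(X, query_index, k=10, n_components=20):
    """Project the features with NMF, then rank images by Euclidean distance."""
    model = NMF(n_components=n_components, init='nndsvda', max_iter=500)
    W = model.fit_transform(X)                          # latent representation
    dists = np.linalg.norm(W - W[query_index], axis=1)  # Euclidean distances in latent space
    order = np.argsort(dists)
    return order[order != query_index][:k]              # k nearest, query excluded
```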

However, there was a small problem. Color Moments use skew as part of the feature set, and skew can be negative. NMF is Non-Negative Matrix Factorization. But that’s not a problem, right? Just normalize and we should be good, right?

I tried different ways (each is sketched in code after this list):

  1. Shift all the values in the feature vector by the least negative value in the database, so that all the values in the database are >= 0.
  2. Shift only the skew column by the least negative value in the database.
  3. Normalize only the skew column of the database.
  4. Normalize all the values.
  5. Ignore the skew values (missing out on a whole lot of discrimination power)
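Here are sketches of those five variants, assuming a raw feature matrix `X` (one row per image) and a hypothetical `skew_cols` index list marking the skew columns; “normalize” is read here as min-max scaling, which is my assumption:

```python
import numpy as np

def shift_all(X):
    """Approach 1: shift everything by the most negative value in X."""
    return X - X.min() if X.min() < 0 else X

def shift_skew_only(X, skew_cols):
    """Approach 2: shift only the skew columns by their most negative value."""
    X = X.copy()
    m = X[:, skew_cols].min()
    if m < 0:
        X[:, skew_cols] -= m
    return X

def minmax_skew_only(X, skew_cols):
    """Approach 3: min-max scale only the skew columns into [0, 1]."""
    X = X.copy()
    s = X[:, skew_cols]
    X[:, skew_cols] = (s - s.min(axis=0)) / (s.max(axis=0) - s.min(axis=0))
    return X

def minmax_all(X):
    """Approach 4: min-max scale every column into [0, 1]."""
    return (X - X.min(axis=0)) / (X.max(axis=0) - X.min(axis=0))

def drop_skew(X, skew_cols):
    """Approach 5: ignore the skew columns entirely."""
    return np.delete(X, skew_cols, axis=1)
```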

The approaches gave me completely different results for my task. Of course, one could argue that which images look similar is subjective. But I had hard evidence: the metadata of the image database. We were using the 11k Hand image dataset. Logically, given an image of a subject’s hand, the other hand images of the same subject should show up as the closest matches. By that measure, approach 1 matched the metadata most closely among all the approaches. Even so, it was far from ideal.

So even shifting the entire database linearly changes the results. I also noticed that the smaller the shift, the closer the results were to the ideal.

So my goal was to reduce the shift, which meant I somehow needed to reduce how negative the skew values were.

Why is there a negative skew?

In our case, that’s because there is a hand on a white surface. The RGB value for white is 255, so in a window that contains a small part of the hand and a very large amount of white background, most of the values pile up at the top of the range, leaving a long tail toward the darker hand pixels. That is a large negative skew.
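A toy numeric illustration (the pixel values are made up, but in the spirit of the 0–255 range above): a 100 × 100 window dominated by a bright background with a minority of darker hand pixels has a clearly negative skew.

```python
import numpy as np
from scipy.stats import skew

# 9,000 background pixels at 255 plus 1,000 hand pixels spread over 80-180.
window = np.concatenate([np.full(9000, 255.0),
                         np.linspace(80.0, 180.0, 1000)])
print(skew(window))   # roughly -2.9: a strong negative skew
```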

Can we linearly transform the features so that the distances between the data points are preserved and the negative values disappear?

As in approach 1, I tried to simply add the absolute value of the lowest negative value to all features, such that the lowest negative value in the data matrix becomes zero. The coordinates of the data points are shifted, but the relative distances are preserved. However, the NMF factorization is not invariant to this shift; it seems to amplify the linear transformation, and the larger the shift, the more the results change.
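One way to see this for yourself (a purely illustrative check on random non-negative data, not our hand features) is to fit NMF on the same matrix under two different shifts and compare the nearest-neighbour rankings; the two lists usually disagree:

```python
import numpy as np
from sklearn.decomposition import NMF

rng = np.random.default_rng(0)
X = rng.random((200, 50))                 # stand-in feature matrix, already >= 0

def nn_ranking(X, shift, query=0, n_components=10):
    """Top-10 neighbours of `query` in the NMF latent space of X + shift."""
    W = NMF(n_components=n_components, init='nndsvda',
            max_iter=500).fit_transform(X + shift)
    return np.argsort(np.linalg.norm(W - W[query], axis=1))[1:11]

print(nn_ranking(X, 0.0))                 # no shift
print(nn_ranking(X, 5.0))                 # large shift; the ordering typically changes
```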

Now the challenge is: how to reduce the shift needed to bring the negative values up to zero?

Inverting the image changes all the 255 values to 0, which should reduce the number of negative skew values. And it did: the results showed that both the number of negative skew values and the magnitude of the lowest negative value went down. So the shift needed to bring the lowest negative value up to zero also shrank.
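A minimal sketch of that step, assuming 8-bit RGB input and reusing the hypothetical `color_moments()` helper sketched earlier:

```python
import numpy as np

def inverted_color_moments(rgb_image):
    """Invert the image (255 -> 0) before extracting color moments,
    so the bright background no longer drags the skew negative."""
    inverted = 255 - rgb_image.astype(np.uint8)   # white background becomes black
    return color_moments(inverted)
```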

Minimal shift. Closer to the ideal result. But never close enough.

