Face Recognition: Eigenface and Fisherface Performance Across Pose

by Alan Brooks (in collaboration with Li Gao) 
ECE 432 Computer Vision with Professor Ying Wu
Final Project Report submitted on June 9, 2004

Computer Vision Research in General

The computer vision field has recently become a very interesting area to research. Technological improvements in computers and imaging hardware have enabled many novel computer vision applications.

Automated computer vision systems that capture and analyze images have found mostly military applications to date: in surveillance, targeting, and biometrics. In the near future, computer vision is likely to be used in consumer applications such as:

  • improved human-computer interfaces,
  • assisted automobile navigation & collision avoidance,
  • video summarization,
  • virtual simulations & games, and
  • home entertainment.

Also, some useful commercial applications that are in development now, or soon will be, include:

  • motion capture for movie special effects,
  • face recognition biometric systems, and
  • home robotics.

Computer vision makes use of multidisciplinary knowledge from many different fields, including optical physics, machine learning, pattern recognition, signal processing, and computer graphics. It can be useful to divide computer vision research into four levels:

  1. Image Formation (physics, optics, & cameras)
  2. Low-level Vision (derivatives, optical flow)
  3. Mid-level Vision (segmented objects, tracking, …)
  4. High-level Vision (understanding underlying semantics)

While theoretical knowledge is well-developed for image formation and low-level vision, many mid-level and high-level vision problems lack theoretical descriptions or solutions. Because so much remains to be learned, the next several years should be an exciting time to research computer vision.

As a researcher beginning study in the computer vision field, I wanted to start with an established research topic. The facial recognition problem has been studied extensively, with research beginning in the 1960s and 1970s that was mostly influenced by early works authored by Bledsoe [1], Kelly [2], and Kanade [3].

Problem Statement

Broad Problem Statement

Given a training database of human facial photographs annotated with identity, train an automated system to recognize the identity of a person from a new image of the person. The system should have minimal sensitivity to lighting, maximal probability of detecting the correct person, and minimal probability of false alarms.

Pictorially, given this database of photos

[Figure: facial photo database]

and this new picture,

[Figure: jessica face]

can an automatic algorithm be developed that can match the identity of the person from the new picture with a previously stored image in the database?

While the broad goal is appropriate for a research group, the scope must be narrowed a bit for a class project. By assuming that the face detection problem is already solved, we can expect to use pre-processed (in scale, rotation, and alignment) face images as inputs to our face recognition software.

An example of this face detection work can be seen at Carnegie Mellon University’s face detection online demo. Running some of our database through this demo gives the result below.

[Figure: CMU face detection result]

Our (Specific) Problem Statement

Given a training database of pre-processed face images, train an automated system to recognize the identity of a person from a new image of the person. Examine sensitivity to pose using the eigenface approach suggested in [4,5] and the fisherface approach developed in [6].

Why is Face Recognition Interesting?

Face recognition is interesting to study because it is an application area where computer vision research is being utilized in both military and commercial products. Much effort has been spent on this problem, yet there is still plenty of work to be done.

Basic research related to this field is currently active. For example, research searching for a fundamental theory describing how light and objects interact to produce images (via the plenoptic function) was published in April 2004 [7]. Practical applications often grow out of improvements in theoretical understanding, and it seems that this problem will continue to demonstrate this growth.

Personally, I’m interested in this project because it’s a high-level pattern recognition problem at which humans are very adept, whereas it can be quite challenging to teach a machine to do it. The intermediate and final visual results are interesting to observe in order to understand failures and successes of the various approaches.

Background

State-of-the-art face recognition research has progressed well beyond the eigenface and fisherface approaches we study in this project. The field has developed from the 1960s until now (2004), with many significant advances happening within the last eight years.

Influential contributions (our focus is the 1991, 1994 eigenface and 1997 fisherface work)

  • 1964, 1970, 1977: Facial feature-based recognition (Bledsoe, Kelly, Kanade)
  • 1984: WIZARD neural net approach (Stonham)
  • 1991, 1994: Eigenface PCA (Pentland & Turk)
  • 1997: Fisherface FLD + PCA (Belhumeur)
  • 2000: FERET standard testing method & database
  • 2002: Independent Component Analysis (ICA) captures higher-order statistics (Bartlett)
  • 2003: Kernel, SVM, RBF, combos (Liu, Er)
  • 2004: Plenoptic light-fields (Gross), 2D-PCA (Yang)

For more historical detail, refer to R. Chellappa’s survey [8] of this topic’s early history, covering the 1960s until 1995.

Approach

We have implemented the eigenface and fisherface algorithms and tested them against two face databases, observing results across pose (out-of-plane face rotation). We evaluated performance against databases with both densely-sampled and sparsely-sampled facial poses.

Our new ideas include:

  1. comparing results of eigenface & fisherface across pose,
  2. testing dense and sparse training databases, and
  3. a slight modification to fisherface imposing a max on the number of PCA components to be used for dimension reduction.

More detail describing the new ideas can be found in the new contributions section.

Mathematics Overview

This section reviews the basic mathematics for the eigenface and fisherface approaches. We use MATLAB-like pseudo-code notation.

Eigenface

Given:

M training images, sized N pixels wide by N pixels tall
c recognition images, also sized N by N pixels
Mp = desired number of principal components

Feature Extraction:

% stack each training face as a column vector (N^2 by M)
X = [x1 x2 ... xM]
% compute the average face
me = mean(X,2)
% subtract the average face from every column
A = X - repmat(me,1,M)

% avoids N^2 by N^2 matrix computation of [V,D]=eig(A*A')
% economy-size SVD only computes M columns of U: A=U*E*V'
[U,E,V] = svd(A,0)

% eigenvalues of A*A' are the squared singular values
eigVals = diag(E).^2
lmda = eigVals(1:Mp)
% pick face-space principal components (eigenfaces)
P = U(:,1:Mp)

% store weights of training data projected into eigenspace
train_wt = P'*A
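
The comment about avoiding the N^2 by N^2 eigen-decomposition can be illustrated without an SVD routine. The following is a minimal pure-Python sketch, not the project's MATLAB code: it works on the small M by M matrix A'*A and maps each of its eigenvectors back to pixel space. The function name eigenfaces_via_gram, the power-iteration solver, and the iters and seed parameters are assumptions made for illustration only.

```python
import math
import random


def dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


def eigenfaces_via_gram(images, mp, iters=500, seed=0):
    """Return (mean_face, eigenvalues, eigenfaces) for flattened images.

    images: list of M flattened images, each a list of D pixel values.
    mp:     number of principal components requested.
    """
    m = len(images)
    d = len(images[0])
    mean_face = [sum(img[k] for img in images) / m for k in range(d)]
    centered = [[img[k] - mean_face[k] for k in range(d)] for img in images]

    # M x M Gram matrix (A'*A) instead of the D x D matrix (A*A').
    gram = [[dot(ci, cj) for cj in centered] for ci in centered]

    rng = random.Random(seed)
    eigenvalues, faces = [], []
    for _ in range(mp):
        # Power iteration from a random unit vector.
        v = [rng.uniform(-1.0, 1.0) for _ in range(m)]
        norm = math.sqrt(dot(v, v))
        v = [x / norm for x in v]
        for _ in range(iters):
            w = [dot(row, v) for row in gram]
            norm = math.sqrt(dot(w, w))
            if norm < 1e-12:
                break
            v = [x / norm for x in w]
        lam = dot(v, [dot(row, v) for row in gram])
        if lam < 1e-12:
            break  # no significant components remain

        # Map the Gram eigenvector back to pixel space: u = A*v / sqrt(lam).
        scale = math.sqrt(lam)
        u = [sum(v[i] * centered[i][k] for i in range(m)) / scale
             for k in range(d)]
        eigenvalues.append(lam)
        faces.append(u)

        # Deflate so the next pass finds the next-largest eigenvalue.
        gram = [[gram[i][j] - lam * v[i] * v[j] for j in range(m)]
                for i in range(m)]
    return mean_face, eigenvalues, faces
```

Because the mean face is subtracted first, M training images yield at most M - 1 useful eigenfaces, so the loop stops early once the remaining eigenvalues are negligible.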

Nearest-Neighbor Classification:

% A2 created from the recog data (in similar manner to A)
recog_wt = P'*A2

% Euclidean distance for ith recog face, jth train face
euDis(i,j) = sqrt(sum((recog_wt(:,i)-train_wt(:,j)).^2))
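
The classification step can be sketched in a few lines. This is a hedged pure-Python illustration rather than the project's code; the names euclidean and nearest_neighbor, and the idea of passing identities as a parallel train_labels list, are assumptions.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two weight vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest_neighbor(recog_wt, train_wts, train_labels):
    """Return (label, distance, index) of the closest training face.

    recog_wt:     weights of one recognition face in face-space.
    train_wts:    list of weight vectors, one per training face.
    train_labels: identity of each training face, same order as train_wts.
    """
    distances = [euclidean(recog_wt, wt) for wt in train_wts]
    best = min(range(len(distances)), key=distances.__getitem__)
    return train_labels[best], distances[best], best
```

Sorting the full distances list instead of taking only the minimum would give the ranked matches that the result plots display.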

Fisherface

Given:

same training & recognition images, also sized N by N pixels
P1 = eigenface result

Feature Extraction:

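% same as eigenface
A = X - repmat(me,1,M)

% compute N^2 by N^2 between-class scatter matrix
% clsMeani = mean of the ci columns of X belonging to class i
% (weighted by class size ci, as defined in [6])
for i=1:c
   Sb = Sb + ci*(clsMeani-me)*(clsMeani-me)'
end

% compute N^2 by N^2 within-class scatter matrix
% Xi(:,j) = jth training face of class i
for i=1:c, for j=1:ci
   Sw = Sw + (Xi(:,j)-clsMeani)*(Xi(:,j)-clsMeani)'
end, end

% project into (M-c) by (M-c) subspace using PCA
% (P1 holds the first M-c eigenfaces)
Sbb = P1'*Sb*P1
Sww = P1'*Sw*P1

% generalized eigenvalue decomposition
% solves Sbb*V = Sww*V*D
[V,D] = eig(Sbb,Sww)

% sort by decreasing eigenvalue; at most c-1 are nonzero,
% so Mp here (number of fisherfaces kept) should not exceed c-1
[eigVals,idx] = sort(diag(D),'descend')
V = V(:,idx)
lmda = eigVals(1:Mp)
P = P1*V(:,1:Mp)

% store training weights
train_wt = P'*A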

Nearest-Neighbor Classification:

% same as eigenface
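
The two scatter matrices can be checked on toy data. The sketch below is a pure-Python illustration under stated assumptions: it uses the class-size-weighted between-class scatter from [6], and the helper names mean_vec, outer, add_into, and scatter_matrices are hypothetical, not taken from the project code.

```python
def mean_vec(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(v[k] for v in vectors) / n for k in range(len(vectors[0]))]


def outer(a, b):
    """Outer product a * b' as a list of rows."""
    return [[x * y for y in b] for x in a]


def add_into(acc, mat, scale=1.0):
    """Accumulate scale * mat into acc in place."""
    for i, row in enumerate(mat):
        for j, val in enumerate(row):
            acc[i][j] += scale * val


def scatter_matrices(samples, labels):
    """Return (Sb, Sw) for feature vectors and their class labels."""
    d = len(samples[0])
    overall = mean_vec(samples)
    sb = [[0.0] * d for _ in range(d)]
    sw = [[0.0] * d for _ in range(d)]
    for cls in sorted(set(labels)):
        members = [s for s, lab in zip(samples, labels) if lab == cls]
        mu = mean_vec(members)
        # Between-class: class mean relative to the overall mean.
        diff = [a - b for a, b in zip(mu, overall)]
        add_into(sb, outer(diff, diff), scale=len(members))
        # Within-class: each sample relative to its own class mean.
        for s in members:
            dev = [a - b for a, b in zip(s, mu)]
            add_into(sw, outer(dev, dev))
    return sb, sw
```

A useful sanity check with this weighting is that Sb + Sw equals the total scatter of the samples about the overall mean.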

Results and Comparisons

Facial recognition software was developed using the MATLAB programming language by the MathWorks. This environment was chosen because it easily supports image processing, image visualization, and linear algebra.

The software was tested against two databases: ALAN & UMIST.

  1. UMIST was created by Daniel B. Graham, with a purpose of collecting a controlled set of images that vary pose uniformly from frontal to side view. The UMIST database has 565 total images of 20 people.
  2. ALAN was created by the author, Alan C. Brooks, by collecting facial snapshots of people taken at different times. These snapshots were then pre-processed by hand using the GIMP to align, normalize lighting, and remove background. This database has 47 total images of 14 people.

The UMIST database images, displayed below, have uniform lighting and pose varying from side to frontal.

[Figure: UMIST database montage]

UMIST Results Using Eigenface in Densely Sampled Database

For these results, 20 recognition faces (one for each person) were randomly picked from the database, leaving 545 photos to use as training faces. Mp, the number of principal components to use, was chosen as 20.
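
The split described above (one randomly chosen recognition face per person, everything else used for training) can be sketched as follows. This is an illustrative pure-Python version, not the project's MATLAB code; the function name hold_out_one_per_person and its seed parameter are assumptions.

```python
import random


def hold_out_one_per_person(labels, seed=None):
    """Split image indices into (train, recog).

    labels: identity of each image, in database order.
    One index per identity is drawn at random for recognition;
    all remaining indices are returned as training indices.
    """
    rng = random.Random(seed)
    by_person = {}
    for idx, person in enumerate(labels):
        by_person.setdefault(person, []).append(idx)

    recog = sorted(rng.choice(indices) for indices in by_person.values())
    held_out = set(recog)
    train = [i for i in range(len(labels)) if i not in held_out]
    return train, recog
```

For a database of 565 images of 20 people, this yields 20 recognition indices and 545 training indices, matching the counts above.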

The average face, eigenvalue strengths, and the 20 eigenfaces (eigenvectors) corresponding to the 20 strongest eigenvalues are displayed.

[Figures: average face, eigenvalues, eigenfaces]

Once these eigenfaces have been found, they can be used as a basis for face-space. All of the training images are projected into this face-space and the projection weights are stored as training data.

To recognize a face, we project it into face-space. The resulting weights are then used as features in a nearest-neighbor classifier that simply finds the minimum Euclidean distance between the recognition weights and the training weights.

The resulting plots follow. Note that blue text over a face denotes a correct match while red text indicates an incorrect recognition.

[Figures: UMIST results 1, UMIST results 2, UMIST results 3]

All 20 of 20 images were correctly recognized, confirming the very good performance of eigenface with densely and uniformly sampled inputs. For this same database and setup, fisherface performs very similarly.

UMIST Results Using Fisherface in Sparsely Sampled Database

For these results, 20 recognition faces (one for each person) were randomly picked from the database, then 60 more photos were used as training faces. Three training faces were picked for each person: a frontal, side, and 45-degree view.

The resulting plots follow.

[Figures: UMIST results 1, UMIST results 2, UMIST results 3]

Out of the 20 faces, 16 were correctly classified in the first match. Also notice that this approach is rather pose invariant: it often (13 times) picks out all 3 training images from the database.

For comparison, the next plots show the same setup run using the eigenface algorithm. Note that 14 of the 20 faces are correctly classified, and all 3 correct images are never found.

[Figures: UMIST results 1, UMIST results 2, UMIST results 3]

Clearly, the fisherface algorithm performs better under pose variation when only a few samples across pose are available in the training set.
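
The two counts quoted above (correct first matches, and cases where all three training images of the person are found) can be tallied as sketched below. This is one hedged reading of how such counts might be computed, assuming "all 3 found" means the three nearest training faces all share the correct identity; the names rank_summary and tally are hypothetical.

```python
def rank_summary(ranked_ids, true_id, k=3):
    """Check one recognition face.

    ranked_ids: training identities sorted by increasing distance.
    Returns (first_match, all_top_k) as booleans.
    """
    first_match = ranked_ids[0] == true_id
    all_top_k = (len(ranked_ids) >= k
                 and all(r == true_id for r in ranked_ids[:k]))
    return first_match, all_top_k


def tally(trials, k=3):
    """Summarize an iterable of (ranked_ids, true_id) pairs."""
    first = top = total = 0
    for ranked_ids, true_id in trials:
        f, t = rank_summary(ranked_ids, true_id, k)
        first += f
        top += t
        total += 1
    rate = first / total if total else 0.0
    return {"first_match": first, "all_top_k": top,
            "total": total, "rate": rate}
```

Applied to the counts reported above, the first-match rates are 16/20 = 80% for fisherface and 14/20 = 70% for eigenface.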

ALAN Database Results Using Eigenface

The following images depict the original captured photos and the faces generated by manual pre-processing.

[Figures: original faces, preprocessed faces]

Pre-processing was used to attempt to remove differences among images in lighting (by normalizing skin tone), scale, alignment, and background.

For these results, 11 recognition faces (one for each person) were randomly picked from the database, then the remaining 26 photos were used as training faces. The results before and after pre-processing are displayed.

[Figures: performance improvement 1, performance improvement 2]

While performance was still not perfect, the pre-processing improved the number of correct classifications from 6 to 9 (of 13). Fisherface performance was similar to eigenface for this database.

Our New Contributions

Eigenface vs. Fisherface for Training Data that is Densely and Sparsely Sampled Across Pose

As displayed by the UMIST results, we found that when only a few training samples across pose are available, fisherface performs better than eigenface (16 versus 14 correct first matches out of 20). This is not unexpected because fisherface minimizes the within-class scatter relative to the between-class scatter. However, this particular detail has not been clearly stated in past literature.
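
For reference, the criterion that fisherface optimizes, as defined in [6], can be written as follows. The symbols are notation chosen here for illustration: \mu is the overall mean face, \mu_i and N_i are the mean and size of class i, and X_i is the set of training faces in class i.

```latex
S_B = \sum_{i=1}^{c} N_i (\mu_i - \mu)(\mu_i - \mu)^T, \qquad
S_W = \sum_{i=1}^{c} \sum_{x_k \in X_i} (x_k - \mu_i)(x_k - \mu_i)^T

W_{opt} = \arg\max_{W} \frac{\left| W^T S_B W \right|}{\left| W^T S_W W \right|}
```

Maximizing this ratio favors projections in which images of the same person cluster tightly while different people stay apart, which is consistent with the better result when only a few poses per person are available.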

Fisherface Algorithm Tweak

In experimenting with the fisherface approach, we also implemented a slightly modified version of the algorithm. Our change tweaks the principal component analysis dimension reduction step in the fisherface approach.

The original algorithm, detailed in [6], suggests picking the number of principal components to keep in the PCA dimension reduction step as Mp = N - c, where N is the number of training images (called M in the Mathematics Overview) and c is the number of unique identities. While this choice of Mp guarantees that the projected within-class scatter matrix will be non-singular, it is not the only possible choice.

By choosing Mp less than N - c, we can further reduce the dimensionality before employing the fisherface approach. Our rationale is that the “face-space” does not need the higher-order (weakest) eigenvectors to be well represented. This adds flexibility in trading off use of the strongest principal components against classification based on the within-class and between-class scatter.

This modification should improve upon both fisherface and eigenface by:

  1. improving computational speed over fisherface while still providing better classification than eigenface, and
  2. avoiding over-training on densely sampled databases.

More thorough testing on larger databases would be necessary to verify these improvements.
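
The choice of Mp described in this section can be sketched as a small helper. This is an illustrative pure-Python version under stated assumptions: the function name pca_components_for_fisherface and the max_components parameter are hypothetical, and the cap value itself is left to the user.

```python
def pca_components_for_fisherface(n_train, n_classes, max_components=None):
    """Number of PCA components to keep before the FLD step.

    n_train:        number of training images (N in this section).
    n_classes:      number of unique identities (c).
    max_components: optional cap on the PCA dimension (the tweak);
                    None reproduces the original choice N - c.
    """
    if n_classes < 2:
        raise ValueError("need at least two identities")
    if n_train <= n_classes:
        raise ValueError("need more training images than identities")

    upper = n_train - n_classes  # original choice from [6]
    if max_components is None:
        return upper
    if max_components < 1:
        raise ValueError("max_components must be positive")
    return min(upper, max_components)
```

With the sparse UMIST setup (60 training faces, 20 identities), the original choice gives 60 - 20 = 40 components; a hypothetical cap of 25 would keep 25.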

Summary

Table comparing eigenface to fisherface (* denotes new result)

 

                            Fisherface                       Eigenface
Computational Complexity    slightly more complex            simple
Effectiveness Across Pose   good, even with limited data *   some, with enough data
Sensitivity to Lighting     little                           very

We find that both the eigenface and fisherface techniques work very well for a uniformly and densely sampled data set varied over pose. When a more sparse data set across pose is available, the fisherface approach performs better than eigenface.

Further Work Ideas

Given more time to improve and expand our results, we would suggest getting better databases, trying other recognition techniques, and using the formal FERET test methodology. The eventual goal might be a submission to the Face Recognition Vendor Test.

We could augment our results by:

  1. Integrating with a face detector.
  2. Incorporating time information in the classifier.
  3. Trying SVM, kernel, ICA, wavelet, plenoptic (light-field) approaches.
  4. Acquiring & using CMU and FERET databases.

Appendix

Comments on ECE432

I enjoyed this class, ECE 432 Advanced Computer Vision. Adapting the lecture schedule to better fit the projects people chose was a good idea because it was very helpful to get a detailed explanation about the problem and solution from the professor.

One minor improvement I’d like to see made in the future would be a bit more intermediate feedback on the final project. Maybe some earlier demo days where we show running code to classmates and ask for input in class would be useful.

MATLAB Code and Tools Used

Functions:

  • RunFaceRecog
  • [P,train] = computeEigenfaces(train, Mp, plots)
  • [P,train] = computeFisherfaces(train, trainClass, plots, P1)
  • [recog] = classifyFaces(recog, train, P, threshFace, threshClass, plots)
  • subfigure(m,n,p)

Function Name: RunFaceRecog
Inputs: user-modified parameters
Outputs: n/a
Description: Main function that, once edited with user parameters, reads the input data, applies pre-processing, calls the desired face recognition algorithm, and then plots the results.

Function Name: computeEigenfaces
Inputs: train, Mp, plots
Outputs: P, train
Description: Performs feature extraction: computing Mp eigenfaces, denoted as P, according to the training data, train. Intermediate debugging graphs are enabled by plots.

Function Name: computeFisherfaces
Inputs: train, trainClass, plots, P1
Outputs: P, train
Description: Performs 2-step feature extraction by using the eigenfaces (P1) and trainClass to generate a FLD subspace of fisherfaces, P.

Function Name: classifyFaces
Inputs: recog, train, P, threshFace, threshClass, plots
Outputs: recog
Description: Applies nearest neighbor classification approach to find recognition face weights (recog.wt) that most nearly match training face weights (train.wt).

Function Name: subfigure
Inputs: m, n, p
Outputs: (none listed)
Description: Utility similar to subplot that allows easy positioning of figures in an m by n grid, with p denoting the current figure number, ranging from 1 to m*n.

MATLAB by the MathWorks was used for most of the software development. ImageMagick, the GIMP, and Markdown were used in preparing the data and documentation.

All of the functions required to run version V of the simulation (RunFaceRecog, computeEigenfaces, computeFisherfaces, and classifyFaces) are available in MATLAB m-file and HTML formats.

A supporting function called subfigure that nicely arranges the plots is also available in matlab m-file and html formats.


Cited References

[1] W. W. Bledsoe, “The model method in facial recognition,” Panoramic Research Inc., Tech. Rep. PRI:15, Palo Alto, CA, 1964.

[2] M. D. Kelly, “Visual identification of people by computer,” Tech. Rep. AI-130, Stanford AI Proj., Stanford, CA, 1970.

[3] T. Kanade, Computer Recognition of Human Faces. Basel and Stuttgart: Birkhäuser, 1977.

[4] M. Turk and A. Pentland, “Eigenfaces for recognition,” J. Cognitive Neuroscience, vol. 3, no. 1, 1991.

[5] M. Turk and A. Pentland, “Face recognition using eigenfaces,” Proc. IEEE Conf. on Computer Vision and Pattern Recognition, 1991, pp. 586-591.

[6] P. N. Belhumeur, J. P. Hespanha, and D. J. Kriegman, “Eigenfaces vs. fisherfaces: recognition using class specific linear projection,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 19, no. 7, pp. 711-720, July 1997.

[7] R. Gross, I. Matthews, and S. Baker, “Appearance-based face recognition and light-fields,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 26, no. 4, pp. 449-465, April 2004.

[8] R. Chellappa, C. L. Wilson, and S. Sirohey, “Human and machine recognition of faces: A survey,” Proc. IEEE, vol. 83, pp. 705-740, 1995.

Other References

[9] R. O. Duda, P. E. Hart, and D. G. Stork, Pattern Classification. New York: Wiley, 2001.

[10] D. A. Forsyth and J. Ponce, Computer Vision: A Modern Approach. New Jersey: Prentice Hall, 2003.