Facial Recognition System-Computer Intelligence-Project Report, Study Guides, Projects, Research of Machine Learning

This is project report related to Computer Intelligence course. It was submitted to Arvind Jayantilal at Aliah University. Its main points are: Expressions, Facial, Development, FER, System, Ensemble, Classifiers, Human, Computer, Interface, HCI

Typology: Study Guides, Projects, Research

2011/2012

Uploaded on 07/15/2012

sachii
sachii 🇮🇳

Chapter 1. Introduction

Facial expressions are a strong manifestation of a person's personality, cognitive activity, and intention, and they play an important role in interpersonal relations. They also help us identify a person's overall mood. According to Mehrabian [ 1 ], the verbal part of communication contributes 7% to the effect of a message and the vocal part contributes 38%, while facial expressions contribute 55%. This high contribution shows the importance of expressions for effective communication.

The development of facial expression systems is very important for the computer vision/machine vision area. Recent advances in image analysis have opened up the possibility of detecting and classifying emotional and conversational facial signals. In this regard, facial expression analysis could bring facial expressions into man-machine interaction, making that interaction tighter and more efficient. Such a system could also make classification of facial expressions widely accessible as a tool for research in behavioral science and medicine.

A Facial Expression Recognition (FER) system therefore enables computers to communicate with humans in a way similar to how humans interact with each other. Human Computer Interface (HCI) is important in many applications, such as pain detection, where a computer detects a person's expression and responds accordingly. FER can also be used to train computers to interact with humans through expressions. In developed countries robots already perform many everyday tasks, and a lot of work is being done to make them recognize emotions so that humans and robots can communicate more naturally.

1.1 Basic Terminologies

Some basic terms used throughout this document are defined here.

1.1.1 Facial Expression Recognition System

A system that detects expressions in facial images. It uses machine learning: it is first trained and then gives results on unknown samples.

1.1.2 Ensemble classifiers

These methods combine several different models to give performance that is better than that of a single model.

1.1.3 Human Computer Interface (HCI)

A field of research in which algorithms are developed for interaction between computers and humans, here so that machines can detect emotions using machine learning techniques.

1.2 Problem Statement

The three basic problems related to facial expression analysis are:

  • Face detection in a facial image.
  • Facial expression data extraction.
  • Facial expression classification using various machine-learning approaches.

An efficient facial expression recognition system requires more informative features. Previous works have shown that as the number of classes increases, the problem becomes more difficult.

1.3 Requirements of FER system

1.3.1 Hardware and Software Requirements

The optimal hardware requirement for this project is a Pentium IV 2.8 GHz with 512 MB RAM, while the software requirement is MATLAB version 7 or later.

1.3.2 Accuracy

Most of our data is critical, so we need a system that is accurate and reliable. The accuracy of the system depends directly on its robustness.

Chapter 2. Existing Methods

2.1 FER Based on Local Gabor Filter Bank and PCA plus LDA

Hong-Bo Deng et al. proposed this method [ 2 ]. A Gabor wavelet is used for feature extraction [ 3 ], which returns a large number of features and is time-consuming. The features are then reduced using Principal Components Analysis (PCA) and Linear Discriminant Analysis (LDA). The JAFFE dataset is used for training and testing, and a minimum distance classifier is adopted to recognize facial expressions.

2.2 FER Based on Fusion of Multiple Gabor Features

This method takes advantage of the different channel approximations of Gabor filters for different facial expressions, so that a reasonable combination of the features can improve the performance of a facial expression recognition system [ 4 ]. This method also uses the JAFFE dataset for training and testing samples.

2.3 FER Based on Wavelet Energy Distribution Feature and Neural Network Ensemble

The wavelet energy of an image is distributed across its components according to the pixel values in each region. Using this energy distribution as the feature provides a robust method [ 5 ]. The original image is decomposed into a 3-level wavelet, as shown in figure 1.

Figure 1: The original image and 3-level 2D wavelet [ 6 ]

This method can be summarized in figure 2.

Figure 2: The facial expression recognition flowchart [ 6 ]

2.4 FER based on weighted principal component analysis and support vector machines

This technique employs Weighted Principal Components Analysis (WPCA) [ 7 ] with multiple features to extract the facial expression features, and SVMs to classify human facial expressions. The Facial Action Coding System (FACS) is used to determine the weights for WPCA. The points for FACS are shown in figure 3.

Figure 3: Facial features localization used in face detection and facial [ 8 ]

3.1.2 Number of classes

Most previous work on facial expressions dealt with only two or three expressions. We chose to work on the 7 basic expressions, which are:

  • Anger
  • Disgust
  • Fear
  • Happy
  • Neutral
  • Sad
  • Surprise

Since 7 classes have been selected, this is a multiclass problem.

3.2 Image preprocessing

3.2.1 Image resize

All images have been resized to a resolution of 100×100 pixels for better feature extraction.

3.2.2 Histogram Equalization

The resized images are passed through another stage of preprocessing, i.e. histogram equalization. This stage also improves the chances of extracting good features.
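The two preprocessing steps can be sketched in a few lines. This is only an illustration in plain Python, not the MATLAB code used in the project: the function names `resize_nearest` and `equalize_histogram` are hypothetical, nearest-neighbor sampling is an assumed stand-in for whatever interpolation was actually used, and the image is assumed to be an 8-bit grayscale image stored as a list of rows.

```python
def resize_nearest(image, out_height, out_width):
    """Resize a 2D list of pixels by nearest-neighbor sampling (assumed method)."""
    in_height, in_width = len(image), len(image[0])
    return [
        [image[r * in_height // out_height][c * in_width // out_width]
         for c in range(out_width)]
        for r in range(out_height)
    ]


def equalize_histogram(image, levels=256):
    """Spread the gray levels of an image over the full range via its CDF."""
    pixels = [p for row in image for p in row]
    histogram = [0] * levels
    for p in pixels:
        histogram[p] += 1

    # Cumulative distribution of gray levels.
    cdf, running = [], 0
    for count in histogram:
        running += count
        cdf.append(running)

    cdf_min = next(c for c in cdf if c > 0)
    total = len(pixels)
    if total == cdf_min:
        # Constant image: nothing to equalize.
        return [row[:] for row in image]

    scale = (levels - 1) / (total - cdf_min)
    lookup = [max(0, round((c - cdf_min) * scale)) for c in cdf]
    return [[lookup[p] for p in row] for row in image]


small = [[10, 20, 30], [40, 40, 40]]
equalized = equalize_histogram(small)
resized = resize_nearest(equalized, 100, 100)
```

After equalization the darkest pixel maps to 0 and the brightest to 255, which is the contrast stretch that the feature extraction stage benefits from.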

3.3 Feature Extraction

In most facial expression techniques, feature extraction is the most crucial part. In the past, the wavelet transform has been the most successful technique for feature extraction in facial expression models, but it has several disadvantages. It works best for point singularities and fails when dealing with line singularities. The curvelet transform proposed by Candès and Donoho [ 10 ] is a descendant of the wavelet transform with properties that make it very useful for facial expressions: it handles line and curve singularities easily and can therefore deal with the curves on a face during expression detection.

3.3.1 Curvelet transform

The curvelet transform is a multiscale directional transform that allows an almost optimal nonadaptive sparse representation of objects with edges [ 11 ].

3.3.1.1 Discrete Curvelet Transform via Wrapping

There are two main algorithms for the discrete curvelet transform. The first is the Unequispaced FFT transform, in which the curvelet coefficients are found by irregularly sampling the Fourier coefficients of an image. The second is the Wrapping transform, which uses a series of translations and a wraparound technique. Both algorithms produce the same output, but the Wrapping algorithm is more intuitive and faster to compute. Because of this, we have used the Discrete Curvelet Transform (DCT) via wrapping.

3.3.1.2 Algorithm for DCT via wrapping

The algorithm for the Discrete Curvelet Transform via wrapping [ 12 ] can be summarized in the following points.

  • Take the FFT of the image.
  • Divide the FFT into a collection of digital corona tiles (figure 5).
  • For each corona tile:
      o Translate the tile to the origin (figure 6).
      o Wrap the parallelogram-shaped support of the tile around a rectangle centered at the origin (figure 7).
      o Take the inverse FFT of the wrapped support.
      o Add the curvelet array to the collection of curvelet coefficients.

Figure 5: Digital Corona of the Frequency Domain [ 12 ]

Figure 8: Discrete curvelet coefficients

3.4 Dimension Reduction

The curvelet transform is well suited to fast feature extraction and gives an optimally sparse representation. However, it produces a very large number of coefficients, which have to be reduced with a suitable technique. Previous works have used Principal Components Analysis (PCA) [ 13 ], Linear Discriminant Analysis (LDA) and Independent Components Analysis (ICA) for feature reduction. We selected PCA to reduce dimensions in our proposed method.

3.4.1 Principal Components Analysis

Principal Components Analysis (PCA) is a method for reducing the number of features or dimensions. It uses an orthogonal transformation to convert a set of correlated variables into a set of uncorrelated variables known as principal components. The number of principal components is less than or equal to the number of original variables. The first principal component has the maximum variance, and the variance decreases with each subsequent component.
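The idea can be illustrated with a small, dependency-free sketch. It is not the project's MATLAB implementation: the function names are hypothetical, and power iteration with deflation is an assumed way of obtaining the eigenvectors of the covariance matrix that is adequate only for small toy inputs. Power iteration can also stall if its starting vector happens to be orthogonal to the dominant direction.

```python
import math


def dominant_eigenpair(matrix, iterations=200):
    """Largest eigenvalue and eigenvector of a symmetric matrix (power iteration)."""
    d = len(matrix)
    v = [1.0 / (i + 1) for i in range(d)]
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    length = math.sqrt(sum(x * x for x in v))
    v = [x / length for x in v]
    # Rayleigh quotient gives the eigenvalue for the unit vector v.
    eigenvalue = sum(v[i] * sum(matrix[i][j] * v[j] for j in range(d)) for i in range(d))
    return eigenvalue, v


def pca(rows, n_components):
    """Project rows onto their first n_components principal components."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    cov = [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
           for i in range(d)]

    components, variances = [], []
    for _ in range(n_components):
        variance, direction = dominant_eigenpair(cov)
        components.append(direction)
        variances.append(variance)
        # Deflation: remove the component just found from the covariance.
        cov = [[cov[i][j] - variance * direction[i] * direction[j] for j in range(d)]
               for i in range(d)]

    projected = [[sum(r[j] * c[j] for j in range(d)) for c in components]
                 for r in centered]
    return projected, components, variances


# Toy data lying on the line y = 2x: one component carries all the variance.
points = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0]]
projected, components, variances = pca(points, 2)
```

In the toy data the first component points along the direction (1, 2) and the second variance is essentially zero, so keeping one feature instead of two loses nothing. Choosing how many components to keep for real curvelet coefficients is the step described in section 3.4.2.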

3.4.2 Proposed step of Dimension Reduction

Since this technique uses an ensemble of classifiers, the number of features retained depends on the classifier and is different for each one.

3.5 Training and testing data selection

Many techniques are available for selecting training and testing data. In the proposed method, we have used two main strategies.

3.5.1 Leave one out cross validation

As its name suggests, leave one out cross validation uses all observations but one as training data and keeps the remaining observation as testing or validation data. That observation is then returned to the training set, a different observation is excluded, and the model is trained on the new set and tested on the excluded observation. This process continues until every observation has been used for validation. Because the model is retrained once per observation, the process is computationally expensive.
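The procedure can be sketched as a short loop. The code below is an illustration only: `leave_one_out_accuracy` is a hypothetical helper, and the nearest-centroid rule is an assumed minimal stand-in for the classifiers actually used in the project (any function with the same signature can be plugged in).

```python
import math
from collections import defaultdict


def nearest_centroid(train_x, train_y, query):
    """Stand-in classifier: label of the class whose mean is closest to query."""
    groups = defaultdict(list)
    for x, label in zip(train_x, train_y):
        groups[label].append(x)
    centroids = {label: [sum(col) / len(xs) for col in zip(*xs)]
                 for label, xs in groups.items()}
    return min(centroids, key=lambda label: math.dist(centroids[label], query))


def leave_one_out_accuracy(samples, labels, classify):
    """Hold out each sample once, train on the rest, and count correct predictions."""
    correct = 0
    for i in range(len(samples)):
        train_x = samples[:i] + samples[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        if classify(train_x, train_y, samples[i]) == labels[i]:
            correct += 1
    return correct / len(samples)


samples = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
labels = ["happy", "happy", "happy", "sad", "sad", "sad"]
accuracy = leave_one_out_accuracy(samples, labels, nearest_centroid)
```

The loop body runs once per observation, which is why the cost grows with the dataset size: 213 samples mean 213 separate training runs.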

3.5.2 Manual training testing data division

Manual division into training and testing data is also done. Previous works have used many ratios for this division, particularly 60:49. We have selected 64:36 for better performance. This means that out of the 213 samples of the Japanese Female Facial Expression (JAFFE) dataset, which is the dataset selected in our case, 137 have been used for training and the remaining 76 for testing.
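The arithmetic behind the 137/76 split can be checked with a short sketch. How the samples were actually assigned to each set is not stated in the report, so the seeded shuffle below is an assumption, as is rounding the training count up; `split_indices` is a hypothetical helper name.

```python
import math
import random


def split_indices(n_samples, train_fraction, seed=0):
    """Return (train, test) index lists for a fixed train fraction."""
    # Rounding up is an assumption that reproduces the 137/76 split.
    n_train = math.ceil(n_samples * train_fraction)
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # assumed selection strategy
    return indices[:n_train], indices[n_train:]


train_idx, test_idx = split_indices(213, 0.64)
```

With 213 samples, 64% is 136.32, which rounds up to 137 training samples and leaves 76 for testing.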

3.6 Individual Classifier training and testing

The following classifiers have been used for our Facial Expression Recognition model.

  • Support Vector Machines (SVMs)
  • K-Nearest Neighbor (KNN)
  • Artificial Neural Network (ANN)

3.6.1 Support Vector Machine

A Support Vector Machine (SVM) is a supervised method for classifying unknown patterns. It is first trained on labeled data and then predicts the class of unseen data. It constructs a hyperplane, or a set of hyperplanes, in a high-dimensional space in which the classes can be separated. The chosen hyperplane is the one with the largest distance to the closest training point of any class. A simple hyperplane with data points from two classes is shown in figure 9.

Figure 9: Simple SVM hyperplane
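The maximum-margin idea of section 3.6.1 can be illustrated on a two-class toy problem. This is a sketch only, not the SVM implementation used in the project: it assumes a linear kernel, trains by plain subgradient descent on the regularized hinge loss, and the names `train_linear_svm` and `predict` as well as the values of `lam`, `lr` and `epochs` are illustrative choices.

```python
def train_linear_svm(points, labels, lam=0.01, lr=0.1, epochs=1000):
    """Fit w, b for labels in {-1, +1} by subgradient descent on the hinge loss."""
    dim, n = len(points[0]), len(points)
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        grad_w = [lam * wi for wi in w]  # gradient of the (lam / 2) * |w|^2 term
        grad_b = 0.0
        for x, y in zip(points, labels):
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            if margin < 1:  # point is inside the margin or misclassified
                for i in range(dim):
                    grad_w[i] -= y * x[i] / n
                grad_b -= y / n
        w = [wi - lr * gi for wi, gi in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def predict(w, b, x):
    """Side of the hyperplane w.x + b = 0 on which x lies."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b >= 0 else -1


points = [(2, 2), (3, 3), (2, 3), (0, 0), (1, 0), (0, 1)]
labels = [1, 1, 1, -1, -1, -1]
w, b = train_linear_svm(points, labels)
predictions = [predict(w, b, x) for x in points]
```

The regularization term keeps the weight vector small, which is what pushes the hyperplane away from the closest points of both classes. A multiclass problem such as the 7 expressions needs several such binary classifiers, for example one per class.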

Figure 10: ANN simple structure

3.6.4 Back Propagation Neural Network

A back propagation neural network is also a feed-forward network. The data is passed through the network in the forward direction and the error is calculated at the output layer. The error is then propagated backward through the layers, and the weights are adjusted to reduce it.

Figure 11: Simple ANN with hidden and output layer
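The forward pass and the backward error propagation can be sketched for a network with one hidden layer. This is an illustration under assumptions, not the project's network: the XOR data, the sigmoid activations, the three hidden neurons, the learning rate and the function names are all illustrative choices.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(x, w_hidden, w_out):
    """Forward pass; the last weight in every row is the bias."""
    hidden = [sigmoid(sum(w * v for w, v in zip(row, x + [1.0]))) for row in w_hidden]
    output = sigmoid(sum(w * v for w, v in zip(w_out, hidden + [1.0])))
    return hidden, output


def mean_squared_error(data, w_hidden, w_out):
    return sum((forward(x, w_hidden, w_out)[1] - t) ** 2 for x, t in data) / len(data)


def train(data, n_hidden=3, epochs=2000, lr=0.5, seed=0):
    """Full-batch gradient descent with errors propagated from output to hidden layer."""
    rng = random.Random(seed)
    n_in = len(data[0][0])
    w_hidden = [[rng.uniform(-1, 1) for _ in range(n_in + 1)] for _ in range(n_hidden)]
    w_out = [rng.uniform(-1, 1) for _ in range(n_hidden + 1)]
    for _ in range(epochs):
        grad_hidden = [[0.0] * (n_in + 1) for _ in range(n_hidden)]
        grad_out = [0.0] * (n_hidden + 1)
        for x, target in data:
            hidden, output = forward(x, w_hidden, w_out)
            # Error signal at the output layer.
            delta_out = (output - target) * output * (1 - output)
            for j, h in enumerate(hidden + [1.0]):
                grad_out[j] += delta_out * h
            # Error signal propagated back to each hidden neuron.
            for j, h in enumerate(hidden):
                delta_hidden = delta_out * w_out[j] * h * (1 - h)
                for i, v in enumerate(x + [1.0]):
                    grad_hidden[j][i] += delta_hidden * v
        w_out = [w - lr * g for w, g in zip(w_out, grad_out)]
        w_hidden = [[w - lr * g for w, g in zip(row, grow)]
                    for row, grow in zip(w_hidden, grad_hidden)]
    return w_hidden, w_out


xor = [([0.0, 0.0], 0.0), ([0.0, 1.0], 1.0), ([1.0, 0.0], 1.0), ([1.0, 1.0], 0.0)]
initial = train(xor, epochs=0)
trained = train(xor, epochs=2000)
```

Comparing `mean_squared_error(xor, *initial)` with `mean_squared_error(xor, *trained)` shows the error shrinking as the backward passes adjust the weights.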

3.7 Ensemble

Once all the above classifiers have been trained and tested, they are combined into an ensemble by majority voting. Other techniques for building ensembles, such as boosting and bagging, are available, but majority voting is the simplest and takes much less time than the others.
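Majority voting over the three classifiers can be sketched as follows. The function names and the example predictions are hypothetical, and the tie-breaking rule (the vote of the earliest listed classifier wins) is an assumption, since the report does not say how ties are resolved.

```python
from collections import Counter


def majority_vote(votes):
    """Most frequent label; on a tie, the label that appears first in votes."""
    return Counter(votes).most_common(1)[0][0]


def ensemble_predict(per_classifier_predictions):
    """Combine one prediction list per classifier into a single list."""
    return [majority_vote(votes) for votes in zip(*per_classifier_predictions)]


# Made-up predictions for three test images, for illustration only.
svm_out = ["happy", "sad", "fear"]
knn_out = ["happy", "anger", "fear"]
ann_out = ["sad", "sad", "surprise"]
combined = ensemble_predict([svm_out, knn_out, ann_out])
```

Listing the strongest individual classifier first means that its vote decides the cases where all three classifiers disagree.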

3.8 Summary of proposed methodology

The leave one out strategy is summarized in figure 12.

Figure 12: Summary for leave one out cross validation

Chapter 4. Results and Discussion

4.1 Leave one out cross validation

The results of each classifier for leave one out cross validation are given below.

4.1.1 K-Nearest Neighbor

KNN classifies a sample according to its nearest neighboring data points. The results (table 1) were obtained using K=1, K=3 and K=5. The number of features taken is also shown.

Table 1: Results of leave one out for different number of K

Experiment | Number of features | Number of nearest neighbors | Recognition rate
Curvelet + PCA features
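The role of K can be seen in a small sketch of the KNN rule. This is not the project's code: `knn_predict` is a hypothetical name, Euclidean distance is an assumed metric, and the points are toy data rather than curvelet features.

```python
import math
from collections import Counter


def knn_predict(train_x, train_y, query, k=1):
    """Label voted for by the k training points closest to query."""
    nearest = sorted((math.dist(x, query), label) for x, label in zip(train_x, train_y))
    votes = Counter(label for _, label in nearest[:k])
    return votes.most_common(1)[0][0]


train_x = [(0, 0), (1, 0), (0, 1), (5, 5), (6, 5), (5, 6)]
train_y = ["happy", "happy", "happy", "sad", "sad", "sad"]
results = {k: knn_predict(train_x, train_y, (0.5, 0.5), k) for k in (1, 3, 5)}
```

With K=5 the vote in this toy set already includes two points of the other class, which illustrates how a larger K can pull in neighbors from other expressions when each class has few samples.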

4.1.2 Support Vector Machines

The SVM offers four kernel functions, which allow it to act as a non-linear classifier. Three kernels are used, and the corresponding accuracy and number of features are given in table 2.

Table 2: Results of leave one out for different kernels of SVM

Experiment | Number of features | Kernel function | Recognition rate
Curvelet + PCA features | 79 | Linear | 92%
Curvelet + PCA features | 67 | Polynomial | 92%
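For reference, the three kernels named in this report can be written in their common textbook forms. The parameter names and default values (`degree`, `coef0`, `gamma`) are assumptions, since the report does not state the settings that were used.

```python
import math


def linear_kernel(x, y):
    """Plain dot product."""
    return sum(a * b for a, b in zip(x, y))


def polynomial_kernel(x, y, degree=3, coef0=1.0):
    """(x.y + coef0) raised to the given degree."""
    return (linear_kernel(x, y) + coef0) ** degree


def rbf_kernel(x, y, gamma=0.5):
    """Radial basis function: exp(-gamma * squared distance)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))


a, b = [1.0, 2.0], [3.0, 4.0]
values = (linear_kernel(a, b), polynomial_kernel(a, b, degree=2), rbf_kernel(a, a))
```

The RBF kernel of a point with itself is always 1, and it falls towards 0 as two points move apart, at a rate set by `gamma`.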

4.1.3 Artificial Neural Network

In our case the ANN is a back propagation neural network (BPNN). The network is trained and then tested, and the results are tabulated in table 3.

Table 3: Results of leave one out for BPNN

Experiment | Number of features | Number of hidden layers | Number of neurons | Recognition rate
Curvelet + PCA features | 70 | 1 | 40 | 82.63%

4.1.4 Individual class accuracy

Since we are working with 7 classes, the accuracy of each class is also measured separately and given in table 4.

Table 4: Individual class accuracies for leave one out method

Classifier | Anger | Disgust | Fear | Happy | Neutral | Sad | Surprise | Average
KNN-K=1 | 93.3% | 79.3% | 84.4% | 71% | 100% | 77.4% | 96.7% | 91.1%
KNN-K=3 | 86.7% | 72.4% | 65.6% | 58.1% | 86.7% | 42% | 66.7% | 78.4%
KNN-K=5 | 66.7% | 31.1% | 68.8% | 51.6% | 63.3% | 35.5% | 36.7% | 62.9%
SVM-Lin | 100% | 93.1% | 93.8% | 74.2% | 100% | 90.3% | 93.3% | 92%
SVM-Poly
BPNN | 85% | 91.2% | 70.6% | 78.9% | 75.1% | 81.2% | 86% | 82.6%
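Per-class accuracies such as those in table 4 can be computed from the true and predicted labels as sketched below. The helper name and the label lists are made up for illustration and are not results from the report.

```python
from collections import defaultdict


def per_class_accuracy(true_labels, predicted_labels):
    """Fraction of samples of each true class that were predicted correctly."""
    totals, hits = defaultdict(int), defaultdict(int)
    for truth, guess in zip(true_labels, predicted_labels):
        totals[truth] += 1
        if truth == guess:
            hits[truth] += 1
    return {label: hits[label] / totals[label] for label in totals}


# Illustrative labels only.
truth = ["anger", "anger", "happy", "happy", "happy", "sad"]
guess = ["anger", "fear", "happy", "happy", "sad", "sad"]
accuracies = per_class_accuracy(truth, guess)
```

Reporting accuracy per class shows which expressions a classifier confuses, which a single overall recognition rate hides.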

4.2 Manual Data Selection

Training and testing data are divided in the ratio 64:36. The following tables show the results for each classifier.

4.2.1 K-Nearest Neighbor

After training on the training images and testing with KNN, the following results (table 5) are obtained, along with the number of features selected.

  • Figure 15: Graph for k =
  • Figure 16: Graph for k =

4.2.2 Support Vector Machines

Three different kernels are used for manual selection of training and testing data and their results are compiled in table 6.

Table 6: Results of manual training/testing for different SVM kernels

Experiment | Number of features | Kernel function | Recognition rate
Curvelet + PCA features | 11 | RBF | 13.11%
Curvelet + PCA features | 64 | Linear | 92.11%
Curvelet + PCA features | 62 | Polynomial | 92.10%

4.2.2.1 Feature Selection

The following graphs (figures 17-18) show the recognition rate against the number of features retained after applying PCA. The red spot marks the maximum accuracy. For the linear kernel, maximum accuracy was achieved at 64 features (figure 17). Similarly, for the polynomial kernel, maximum accuracy was achieved at 67 features (figure 18).

Figure 17: Graph for Linear kernel
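The search for the feature count with maximum accuracy, as described in section 4.2.2.1, can be sketched as a simple sweep. The helper `best_feature_count` is hypothetical, and `toy_accuracy` is a synthetic curve used only to make the example runnable. It is not data from the report. In practice `evaluate` would train and test the classifier on the first n PCA features.

```python
def best_feature_count(candidate_counts, evaluate):
    """Evaluate every candidate feature count and return the best one with its score."""
    scores = {n: evaluate(n) for n in candidate_counts}
    best = max(scores, key=scores.get)
    return best, scores[best]


def toy_accuracy(n_features):
    # Synthetic curve peaking at 30 features; NOT a result from the report.
    return 1.0 - abs(n_features - 30) / 100.0


best_n, best_score = best_feature_count(range(10, 101), toy_accuracy)
```

When several feature counts reach the same accuracy, this sketch keeps the smallest one, which also gives the cheapest classifier.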