A MONOGENIC LOCAL GABOR BINARY PATTERN FOR FACIAL EXPRESSION RECOGNITION

This paper implements a monogenic Local Binary Pattern (mono-LBP) algorithm on the Local Gabor Pattern (LGP). The proposed approach first extracts features from the samples using LGP at different scales and orientations. The extracted LGP features are further enhanced by decomposing them into three monogenic LBP channels before being recombined to generate the final feature vector. Different normalization schemes are applied to the final feature vector, and the two best-performing normalization algorithms with mono-LBP are fused at score level to obtain improved performance using a K-Nearest Neighbor classifier with the L1-norm as a distance metric. Moreover, performance comparisons are made with other variants of the LGP algorithm, and the effects of various normalization techniques are investigated. Experimental results on the JAFFE and TFEID facial expression databases show that the new technique improves performance compared to its counterparts.


INTRODUCTION
Facial Expression Recognition (FER) has recently become one of the leading fields drawing the interest and attention of researchers in computer vision and pattern recognition. This may not be unconnected to the need for human-machine interaction (HMI), surveillance systems, robotics applications and many others (Chao et al., 2015). Quite a number of feature extraction and classifier algorithms have been proposed and implemented in FER. The Gabor kernel has been one of the most robust feature extraction algorithms, widely exploited in FER and face recognition due to its ability to approximate the receptive fields of simple cells in the primary visual cortex, its multi-resolution approach and its direction selectivity (Chao et al., 2015).
Following the successful implementation of Gabor kernels in iris recognition (Daugman, 2001), and coupled with the success of the local binary pattern (LBP) algorithm, several variants of Gabor algorithms have emerged over time. These Gabor variants are sometimes referred to as Local Gabor Patterns (LGP). LGP algorithms exploit various Gabor feature channels such as the magnitude, phase, imaginary and real channels. For instance, Yanxia and Bo (2010) proposed Local Gabor Binary Patterns (LGBP), which encode the Gabor magnitude with the LBP operator at different resolutions and orientations to form the feature vector. The proposed LGBP was reported to improve face recognition performance. In (Zhang et al., 2010), the authors proposed Local Gabor Phase Pattern (LGPP) variants and applied them to face recognition.
LGPP essentially encodes both the real and imaginary parts of the Gabor features using Daugman's method, and the result is further encoded using the so-called Local XOR Pattern (LXP). In search of robustness and improved performance, other LGPs were proposed, such as the Histogram of Gabor Phase Patterns (HGPP) and the Local Gabor Phase Difference Pattern (LGPDP), along with a host of others relevant to specific problems. In general, these LGP algorithms come with the additional cost of computation and extensive memory usage, and in most cases dimensionality reduction of the feature vector becomes necessary.
A rotation-invariant monogenic LBP, originally proposed for texture classification in (Zhang et al., 2010), is used in this work. Instead of encoding the Gabor magnitude channels with LBP, as is the case in LGBP, we encode these channels with monogenic LBP, which, within the context of this work, is referred to as mono-LGBP. Furthermore, the results are computed at different resolutions (scales) of the Gabor kernel under different normalization algorithms. At each scale, the results of the proposed method with the two best-performing normalization techniques are fused at score level to obtain the overall performance of the method.
The paper is divided into five sections. Section I covers the introduction, while Section II briefly discusses the Gabor kernel, LBP, M-LBP and the normalization schemes. Section III describes the proposed approach and Section IV presents the experimental results. Section V summarizes the findings.

FEATURE EXTRACTION AND NORMALIZATION SCHEMES
A brief literature background on the feature extraction operators and normalization schemes deployed in the course of this work is given below.

Gabor Wavelet Transform
A Gabor filter is basically a Gaussian function modulated by a sinusoidal plane wave. The result of convolving the Gabor kernel $\psi_{u,v}(z)$ with an image $I(z)$ is represented as $G_{u,v}(z)$ in Eqn. 1:

$$G_{u,v}(z) = I(z) * \psi_{u,v}(z) \qquad (1)$$

Here, $z = (x, y)$ is the 2D pixel index along the $x$ and $y$ planes and '$*$' is the 2D convolution operator; $u$ and $v$ are the orientation and the scale of the kernel, respectively. The kernel is defined as:

$$\psi_{u,v}(z) = \frac{\|k_{u,v}\|^2}{\sigma^2}\, e^{-\|k_{u,v}\|^2 \|z\|^2 / 2\sigma^2} \left[ e^{i k_{u,v} \cdot z} - e^{-\sigma^2/2} \right] \qquad (2)$$

where $\|\cdot\|$ is the norm operator and $\sigma$ is the standard deviation of the distribution. The wave vector $k_{u,v}$ is defined as:

$$k_{u,v} = k_v e^{i\phi_u} \qquad (3)$$

where $k_v = k_{\max}/f^v$ and $\phi_u = \pi u / 8$; $k_{\max}$ is the maximum frequency, $\phi_u$ is the kernel's orientation and $f$ is the spacing between the kernels in the frequency domain (Eleyan et al., 2008; Liu and Wechsler, 2003; Lyons et al., 1998; Cootes et al., 1995).
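As an illustration of Eqns. 2-3, the sketch below builds a single Gabor kernel in Python with NumPy. The parameter defaults (kernel size 31, $k_{\max} = \pi/2$, $f = \sqrt{2}$, $\sigma = 2\pi$) are common choices from the cited literature, not values specified in this paper.

```python
import numpy as np

def gabor_kernel(v, u, size=31, k_max=np.pi / 2, f=np.sqrt(2), sigma=2 * np.pi):
    """Gabor kernel psi_{u,v}(z) of Eqn. 2 at scale v and orientation u."""
    # Eqn. 3: k_{u,v} = k_v * exp(i * phi_u), with k_v = k_max / f^v, phi_u = pi*u/8
    k = (k_max / f ** v) * np.exp(1j * np.pi * u / 8)
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    z = x + 1j * y                           # pixel index z = (x, y) as a complex number
    k_dot_z = (k * np.conj(z)).real          # dot product k_{u,v} . z
    k2 = np.abs(k) ** 2
    envelope = (k2 / sigma ** 2) * np.exp(-k2 * np.abs(z) ** 2 / (2 * sigma ** 2))
    # the subtracted constant makes the kernel (approximately) DC-free
    return envelope * (np.exp(1j * k_dot_z) - np.exp(-sigma ** 2 / 2))
```

Convolving the image with a bank of such kernels (here, 3 scales by 8 orientations) then yields the $G_{u,v}(z)$ of Eqn. 1.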

Local Binary Pattern
Due to its relative simplicity, LBP has been applied successfully in many applications. The algorithm uses a 3 × 3 window of neighborhood pixels in the image to determine the new value of the pixel under consideration (Ahonen et al., 2006; Ojala et al., 2010; Tran et al., 2014). Consider Figure 1: the algorithm first probes the 8-neighborhood pixels around the center pixel $g_c$. Any pixel greater than $g_c$ is assigned the binary bit value 1; otherwise, it is assigned the bit value 0. The resulting 8-bit code is converted to decimal and recorded as the new value for $g_c$. The operation is applied to all the pixels in the image.

Figure 1. LBP operator
The LBP code for the pixel $g_c$ is computed by arranging the results of the operation clockwise, starting from the top-left corner; in the example this gives '01011000', which is equivalent to 88 in decimal.
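The 3 × 3 operation of Figure 1 can be sketched as follows (a minimal, unoptimized Python/NumPy version; the bit ordering, clockwise from the top-left neighbour with the first neighbour in the most significant bit, follows the example above):

```python
import numpy as np

def lbp_image(img):
    """Basic 3x3 LBP (Figure 1): threshold the 8 neighbours against the
    centre pixel and read the bits clockwise from the top-left corner."""
    img = np.asarray(img, dtype=float)
    out = np.zeros(img.shape, dtype=np.uint8)
    # neighbour offsets (dy, dx), clockwise from the top-left corner
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]
    for r in range(1, img.shape[0] - 1):
        for c in range(1, img.shape[1] - 1):
            code = 0
            for bit, (dy, dx) in enumerate(offsets):
                if img[r + dy, c + dx] > img[r, c]:   # "greater than" -> bit 1
                    code |= 1 << (7 - bit)            # first neighbour is the MSB
            out[r, c] = code
    return out
```

For the 3 × 3 patch [[6, 1, 9], [2, 5, 7], [3, 8, 4]], the centre pixel 5 yields the pattern '10110100', i.e. 180.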

Local Binary XOR Operator
LXP is very similar to LBP except that it applies the XOR operator to the 3 × 3 pixel neighborhood to decide the new value of a pixel. Because it applies the XOR operator, the pixel values must first be converted to zeros and ones (Zhang et al., 2007). For instance, the results of an image convolved with a Gabor kernel may be formatted to logical values by assigning any value greater than zero a logical zero, while values of zero and below are assigned a logical one. Figure 2 shows how LXP is applied to the logically formatted image. To determine the new value of $g_c$, all the 8-neighborhood pixels are XOR-ed with $g_c$ and the resulting 8-bit code is converted to decimal.
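The LXP operation of Figure 2 can be sketched in the same style as the LBP example (a minimal Python/NumPy version; the bit ordering, clockwise from the top-left with the first neighbour as the most significant bit, is an assumption carried over from the LBP description):

```python
import numpy as np

def lxp_image(binary_img):
    """LXP on a logically formatted (0/1) image: XOR each of the 8
    neighbours with the centre bit, reading clockwise from the top-left."""
    b = np.asarray(binary_img, dtype=np.uint8)
    out = np.zeros(b.shape, dtype=np.uint8)
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]
    for r in range(1, b.shape[0] - 1):
        for c in range(1, b.shape[1] - 1):
            code = 0
            for bit, (dy, dx) in enumerate(offsets):
                # XOR the neighbour bit with the centre bit
                code |= int(b[r + dy, c + dx] ^ b[r, c]) << (7 - bit)
            out[r, c] = code
    return out
```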

Monogenic Local Binary Pattern
The motivation for this algorithm comes from monogenic signal theory. It combines local phase information, local surface type information and the traditional LBP to improve the performance of LBP in texture classification (Zhang et al., 2010). Based on this theory, three features are combined to form a monogenic 3-D texton feature vector that determines the monogenic LBP. These features are the local phase $\theta$, the rotation-invariant uniform pattern LBP$^{riu2}$, and the monogenic curvature tensor $S_c$ based on higher-order Riesz transforms. Eqns. 4-6 describe these features; for more details, refer to (Zhang et al., 2010). The rotation-invariant uniform LBP is defined as:

$$LBP_{P,R}^{riu2} = \begin{cases} \sum_{p=0}^{P-1} s(g_p - g_c) & \text{if } U(LBP_{P,R}) \le 2 \\ P + 1 & \text{otherwise} \end{cases} \qquad (5)$$

The superscript "riu2" denotes the use of rotation-invariant "uniform" patterns with a $U$ value of at most 2; $s$ is the sign function; $g_c$ corresponds to the gray value of the center pixel of the local neighborhood; and $g_p$ ($p = 0, \ldots, P-1$) corresponds to the gray values of $P$ equally spaced pixels on a circle of radius $R$.
The last parameter is defined by Eqn. 7 in terms of $\det(S_c)$, the determinant of the monogenic curvature tensor.
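The riu2 code of Eqn. 5 can be sketched for a single pixel as follows (plain Python; the $P$ circular neighbour samples are assumed to be given, e.g. obtained by bilinear interpolation on the circle of radius $R$):

```python
def lbp_riu2(gc, neighbours):
    """Rotation-invariant uniform LBP code (Eqn. 5) for a single pixel:
    gc is the centre gray value, neighbours the P circularly sampled values."""
    s = [1 if gp >= gc else 0 for gp in neighbours]    # sign function s(gp - gc)
    P = len(s)
    # uniformity U: number of 0/1 transitions around the circular pattern
    U = sum(s[p] != s[(p + 1) % P] for p in range(P))
    return sum(s) if U <= 2 else P + 1                 # P + 1 for non-uniform patterns
```

A uniform pattern such as three bright neighbours followed by five dark ones has only two transitions and maps to the count of set bits; an alternating pattern is non-uniform and maps to $P + 1$.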

Normalization Operators
Normalization techniques are quite often used without much regard to the effect they can have on the overall statistical distribution of the vectors to be normalized (Ribaric and Fratric, 2006; Nandakumar et al., 2005). For instance, in score-level fusion of various classifiers, a normalization scheme may be deployed to bring the scores within the same range. In a vector sense, however, a normalization algorithm is better seen as a transform from one vector space to another. Hence the choice of a compatible normalizer becomes important, as it may distort the vectors, thereby improving or decreasing the separability between two distinct class vectors. For this reason, we investigated some of the most common normalization techniques to show how they affect the class vector distribution. Four normalization techniques are examined in this paper.

Z-Score Normalization
Z-score is one of the most common normalization schemes. It uses the arithmetic mean and standard deviation of the vector. Z-score has a record of good performance on data with a Gaussian distribution. However, it is not robust, because it depends on the mean and standard deviation of the data, both of which are sensitive to outliers (Ribaric and Fratric, 2006). For a data point $x_k$, Z-score computes the new normalized value $x_k'$ using Eqn. 8:

$$x_k' = \frac{x_k - \mu}{\sigma} \qquad (8)$$

where $\mu$ and $\sigma$ are the mean and standard deviation of the distribution, respectively.

Min-Max Normalization
Min-max is one of the simplest of all the normalization techniques. This operator shifts the data into the interval [0, 1]. It is easy to see that this technique is also not robust, because the presence of outliers in the distribution may diminish the contribution of the majority of the data. Eqn. 9 defines the min-max operator:

$$x_k' = \frac{x_k - \min}{\max - \min} \qquad (9)$$

where max is the maximum data value of the distribution and min is the minimum data value of the distribution.

Median-MAD Normalization
The median and the median absolute deviation (Median-MAD) are less sensitive to outliers and to points at the extreme ends of the distribution, so this technique is robust. However, for distributions other than Gaussian, the median and MAD are poor estimates of the location and scale parameters (Ribaric and Fratric, 2006; Sigdel et al., 2014; Nandakumar et al., 2005). The scheme therefore neither preserves the original distribution nor transforms the data into a common numerical range (Ribaric and Fratric, 2006). Eqn. 10 defines the Median-MAD operation:

$$x_k' = \frac{x_k - \text{median}}{\text{MAD}} \qquad (10)$$

where median is the median of the distribution and MAD is the median absolute deviation from the median, defined as $\text{MAD} = \text{median}(|x_k - \text{median}|)$.

Tangent-hyperbolic (Tanh) Normalization
Tanh normalization has been successfully used in many normalization schemes (Nandakumar et al., 2005). The tanh estimator is robust and very efficient. It is defined as:

$$x_k' = \frac{1}{2}\left\{ \tanh\!\left( 0.01\, \frac{x_k - \mu_{GH}}{\sigma_{GH}} \right) + 1 \right\} \qquad (11)$$

where $\mu_{GH}$ and $\sigma_{GH}$ are the mean and standard deviation estimates, respectively. Quite a number of other normalization schemes exist, for example decimal scaling normalization, which is useful for data on logarithmic scales, and Euclidean normalization. What makes a particular normalization algorithm worthwhile is its ability to capture the statistical distribution of the dataset.
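The four normalizers of Eqns. 8-11 can be sketched together as follows (NumPy; as a simplifying assumption, the Hampel estimates $\mu_{GH}$ and $\sigma_{GH}$ of the tanh estimator are replaced here by the plain sample mean and standard deviation):

```python
import numpy as np

def z_score(x):
    """Eqn. 8: subtract the mean, divide by the standard deviation."""
    x = np.asarray(x, dtype=float)
    return (x - x.mean()) / x.std()

def min_max(x):
    """Eqn. 9: shift the data into the interval [0, 1]."""
    x = np.asarray(x, dtype=float)
    return (x - x.min()) / (x.max() - x.min())

def median_mad(x):
    """Eqn. 10: centre on the median, scale by the median absolute deviation."""
    x = np.asarray(x, dtype=float)
    med = np.median(x)
    mad = np.median(np.abs(x - med))
    return (x - med) / mad

def tanh_norm(x):
    """Eqn. 11: tanh estimator, with mean/std as stand-ins for mu_GH/sigma_GH."""
    x = np.asarray(x, dtype=float)
    return 0.5 * (np.tanh(0.01 * (x - x.mean()) / x.std()) + 1.0)
```

Note how the outputs differ in range: min-max and tanh are bounded, while Z-score and Median-MAD are not, which matters when scores from different classifiers are fused.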

PROPOSED APPROACH
The proposed approach encompasses the critical stages of a facial expression recognition algorithm. During feature extraction, it extracts Gabor features from each sample at different orientations (u = 8) and scales (1 to 3) of the Gabor filter. For each local Gabor feature extracted at a specified orientation and scale, monogenic LBP is then applied to the extracted Gabor features using Eqns. 4, 6 and 7. The three monogenic LBP features (LBP$^{riu2}$, $\theta$, $u_c$) are combined to form a single monogenic 3-D texton feature vector, which is adopted as the final feature vector, known as the mono-LGBP algorithm in the context of this work. Figure 3 depicts the flowchart of the proposed approach. Moreover, as a way of exploiting performance, a normalization scheme is applied at the feature level: different normalization approaches are applied to the mono-LGBP feature vector before classification. The four normalization techniques investigated are those explained in the previous section.
In the classification stage, KNN is used with the l2-norm as the distance measure. Each normalized representation of the mono-LGBP feature is classified separately using KNN. Based on the performance of the normalized representations of the mono-LGBP feature, the two best normalization schemes are fused at the score level of the classifier using a simple sum rule to obtain better performance.
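The classification and fusion steps above can be sketched as follows (NumPy; `knn_predict` and `sum_rule_fusion` are illustrative names, and the `ord` parameter selects the distance metric: 1 for the L1 city-block norm mentioned in the abstract, 2 for the Euclidean norm used here):

```python
import numpy as np

def knn_predict(train_feats, train_labels, probe, k=1, ord=2):
    """k-nearest-neighbour prediction with an Lp distance:
    ord=1 gives the L1 (city-block) metric, ord=2 the Euclidean metric."""
    d = np.linalg.norm(np.asarray(train_feats) - np.asarray(probe), ord=ord, axis=1)
    nearest = np.argsort(d)[:k]
    labels, counts = np.unique(np.asarray(train_labels)[nearest], return_counts=True)
    return labels[np.argmax(counts)]              # majority vote among the k nearest

def sum_rule_fusion(dist_a, dist_b):
    """Simple sum rule on two sets of (already normalized) distance scores,
    one score per class; the class with the smallest combined distance wins."""
    fused = np.asarray(dist_a) + np.asarray(dist_b)
    return int(np.argmin(fused))
```

The sum rule assumes both score sets lie in a comparable range, which is exactly what the normalization stage provides.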

EXPERIMENTAL RESULTS
The proposed algorithm is evaluated on two different facial expression databases: the Japanese Female Facial Expression database, JAFFE (Lyons et al., 1998), and the Taiwanese Facial Expression Image Database, TFEID (Chen and Yen, 2007). The JAFFE database contains a total of 213 sample images of seven basic facial expressions (Neutral, Happy, Sad, Surprise, Anger, Disgust and Fear) collected from 10 different subjects, with 2 to 4 samples per expression per subject. The current public TFEID database consists of facial images from 20 male models, each acquired from a viewing angle between 0° and 45°; it comprises 8 expressions, with contempt as the eighth expression in addition to the seven basic expressions found in JAFFE. During training, all the sample images were grouped into emotion classes (7 for JAFFE and 8 for TFEID) irrespective of the subjects to which they belong (i.e. person-independent). Using the Leave-One-Pose-Out (LOPO) procedure, one sample is drawn from each class for testing while the remaining samples are used for training. This process is repeated and rotated until each sample has been used exactly once for testing. The overall performance is given as the average performance over the entire number of times the training is repeated.
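The LOPO rotation described above can be sketched as follows (plain Python; `lopo_splits` is an illustrative helper that holds out one sample per class per round until every sample has been held out exactly once; with unequal class sizes, later rounds draw only from the larger classes):

```python
from collections import defaultdict

def lopo_splits(labels):
    """Yield (train, test) index lists: each round holds out one sample per
    class for testing, rotating until every sample is held out exactly once."""
    by_class = defaultdict(list)
    for idx, lab in enumerate(labels):
        by_class[lab].append(idx)
    rounds = max(len(idxs) for idxs in by_class.values())
    for r in range(rounds):
        test = [idxs[r] for idxs in by_class.values() if r < len(idxs)]
        held_out = set(test)
        train = [i for i in range(len(labels)) if i not in held_out]
        yield train, test
```

The reported accuracy is then the average of the per-round accuracies over all rounds.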

Sum Rule Fusion
Tables 1 to 6 display the experimental results of the proposed approach in comparison with its counterparts. Results from the different normalization schemes and from three other variants of LGP algorithms (Gabor-magnitude features, LGBP and LGPP) were obtained for comparison with the proposed approach on the two databases used, with the same experimental procedure adopted throughout. Similarly, results from the fusion of the proposed mono-LGBP are also included to show the leverage of the fusion technique over the non-fusion approach. Figure 4 shows training samples from the JAFFE database of four different subjects with the 7 basic facial expressions (Neutral, Happy, Sad, Surprise, Anger, Disgust and Fear), from left to right.

Results Discussion
It is worth noting that the performance of the proposed mono-LGBP algorithm increases with the Gabor scale (see Tables 1-6). The fused results from the Z-score and Tanh normalization algorithms give the best performance, reaching 97.9% (Z-score + Tanh). This is because, with mono-LGBP, each of the two normalization schemes uniquely recognizes some poses that are not recognized by the other; hence, fusing their results improves performance. The same cannot be said for the other LGPs. For example, Gabor-magnitude (Gabor-mag) achieves its best results with the Z-score, Tanh and min-max normalizations, but they all point to the same recognition classes, so fusing their results does not improve performance. The same holds for LGPP and LGBP.

CONCLUSION
A new approach for facial expression recognition was proposed and implemented, and its performance was compared with existing LGP algorithms under different normalization schemes. The new approach achieved better performance of approximately 92.8% on the JAFFE database and 97.9% on the TFEID database. These results are comparable to the best-known facial expression recognition results on JAFFE and TFEID in the literature using KNN as a classifier. The normalization experiments further indicate that a great deal of performance can be gained by a proper application of normalization algorithms to the extracted feature vectors. The results also confirm the effectiveness of the fusion technique deployed in the proposed approach.

Figure 2. LXP operator

Figure 3. Proposed approach flowchart

Figure 4. Examples of images from the JAFFE database