
Research Article

A Revisit Histogram of Oriented Descriptor for Facial Color Image Classification Based on Fusion of Color Information

Huy Nguyen-Quoc and Vinh Truong Hoang

Faculty of Information Technology, Ho Chi Minh City Open University, 97 Vo Van Tan, District 3, Ho Chi Minh City, Vietnam

Correspondence should be addressed to Vinh Truong Hoang; [email protected]

Received 30 August 2021; Revised 27 October 2021; Accepted 18 November 2021; Published 30 November 2021

Journal of Sensors, Volume 2021, Article ID 6296505, 12 pages. https://doi.org/10.1155/2021/6296505

Academic Editor: Ying-Ren Chien

Copyright © 2021 Huy Nguyen-Quoc and Vinh Truong Hoang. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Histogram of Oriented Gradient (HOG) is a robust descriptor widely used in many real-life applications, including human detection, face recognition, object counting, and video surveillance. To extract the HOG descriptor from color images, which contain three times more information than grayscale images, researchers currently apply the maximum magnitude selection method. This method reduces the information of the resulting image by keeping only the maximum magnitudes. However, after extracting HOG using the unselected magnitudes of the maximum magnitude selection method, we observe that the performance is better than with the maximum magnitudes in several cases. Therefore, in this paper, we propose novel approaches for extracting HOG from color images: Color Component Selection and Color Component Fusion. We also propose extended kernels to improve the performance of HOG. With our new approaches to color component analysis, the experimental results on several facial benchmark datasets are improved by 3 to 10% of accuracy. Specifically, an accuracy of 95.92% is achieved on the AR face database and 75% on the Georgia face database. The results are more than 10 times better than those of the original HOG approach.

1. Introduction

Nowadays, image classification is one of the most extensive fields in computer vision, attracting the attention of many researchers because of its wide range of real-life applications such as human detection, facial recognition, object classification, and disease diagnosis in medicine. Many local and global image descriptors have been proposed to handle this task [1–3]. The key step is to find a robust descriptor which can discriminate between classes. A large number of descriptors have been proposed to extract features from images, including Color Local Binary Patterns (LBP) [4], Scale Invariant Feature Transform (SIFT) [5], Histograms of Oriented Gradients (HOG) [6], and GIST [7]. Among them, HOG is a successful descriptor with various real-life applications, including pedestrian detection, face recognition, object classification, security, and industrial inspection. For example, Ding et al. [8] fuse HOG features and a global normalized histogram for human detection with the AdaBoost classifier. Qi et al. [9] apply HOG to railway track detection using region-growing methods. Qingbo et al. [10] combine HOG features with the Discriminative Multimanifold Analysis Method (DMAM) for face recognition in a few-shot learning context; facial features are extracted from fused image patches and then transformed by DMAM into a lower-dimensional space. Chowdhury et al. [11] introduced an improved version of HOG for human detection. Nabila et al. [12] present an optimized version of HOG for on-road car detection, based on the concatenation of shape features and motion integration. Aytaç et al. [13] extract LBP and HOG features from stomach cancer images; multiple reduction techniques are then applied to select the most useful attributes. Hmood et al. [14] introduce an improved version of HOG, named Dynamic-HOG, which uses approximate windows that can cover the whole object; this method requires less processing time and achieves higher accuracy for coin classification.


Jebril et al. [15] apply HOG to handwritten character recognition on grayscale images transformed from the RGB color space; the authors used both R-HOG and C-HOG with different windows. Uddin et al. create another HOG version, named T20-HOG, with the ability to extract textural features from seed varieties for identification [16]. A two-stage classification process combined with HOG descriptors is proposed by Chandrakala and Devi [17]. Leveraging the robustness of deep neural networks and the advantages of HOG, Hung created a hybrid HOG-CNN approach with promising results [18].

Various advanced versions of HOG have been proposed in recent years by incorporating convolutional neural networks (CNNs). Zaffar et al. [19] present CoHOG, which is based on a CNN, to extract features from regions of interest for visual place recognition. Xiong et al. [20] introduce a method for handling the depth information of images based on HOG, namely, Histogram of Oriented Depth (HOD) features; the proposed approach is applied to pedestrian detection by combining color image edge information and HOD. Wang et al. [21] present an approach based on HOG and multiorientation computation, namely, MO-HOG.

Figure 1: An illustration of the maximum magnitude selection method. For each pixel of a three-component color image, the gradient magnitudes of the three components are compared (in the example, MC1 = 18.86 with αC1 = 32°, MC2 = 24.16 with αC2 = 24°, and MC3 = 21.95 with αC3 = 30°); the maximum magnitude, here M(x, y) = Mmax = MC2, and its direction α(x, y) = αC2 are selected.


This method combines two-part gradient information to build MO-HOG and is applied to facial expression recognition. Pan [22] combines HOG and a CNN for facial expression classification; in this approach, HOG features are extracted from the CNN's shallow features. Joshi et al. [23] present a method for selecting optimal parameters for the HOG descriptor; a multilevel HOG is applied to sign language images with window sizes of 30 × 30 and 20 × 20. Wang et al. [24] concatenate HOG and Zernike moment features of SAR images for terrain matching.

Furthermore, HOG has outperformed other descriptors in the classification of grayscale images.

Figure 2: An illustration of the Color Component Selection method. Three example cases are shown: gradient magnitude computed on component C1 with gradient direction on component C2 (M(x, y) = MC1 = 18.86, α(x, y) = αC2 = 24°); gradient magnitude on component C2 with gradient direction on component C1 (M(x, y) = MC2 = 24.16, α(x, y) = αC1 = 32°); and both magnitude and direction computed on component C3 (M(x, y) = MC3 = 21.95, α(x, y) = αC3 = 30°).


However, when we extract HOG from a color image, the performance usually degrades seriously. The reason is that a color image contains three times more information than a grayscale image, and most descriptors were first designed to operate on grayscale images only. Therefore, color has been well investigated in recent years for extracting HOG features. For example, Hoang et al. [25] extract local image descriptors, including LBP, HOG, and GIST, for rice seed image recognition; in this approach, features extracted from independent color components are fused to form the final feature vector. Aslan et al. [26] compare HOG-SVM and CNN for video-based human tracking in an occlusion context. Zhou et al. [27] introduce a method for extracting HOG features based on a color filter array.

Figure 3: An illustration of the original kernels (R = 1) and the extended kernels; for example, the extended kernel with R = 4 is [1, 1, 1, 1, 0, −1, −1, −1, −1].

Figure 4: The four datasets used to evaluate the proposed approach: (a) AR, (b) Georgia, (c) CLV, (d) MUCT.


In this method, each Bayer pattern image is converted into three component color images by a downsampling operation, and a new bin extension is added to HOG, extracted in the HSI color space, to remove shadow regions. Nhat and Hoang [28] enhance the accuracy of face recognition by fusing three popular features (LBP, HOG, and GIST). Duong and Hoang [29] extract rice seed image features coded in multiple color spaces using the HOG descriptor. Banerji et al. [30] improve the HOG descriptor for image classification by subjecting the image to the Haar wavelet transform. Recently, Z. Zhou and Y. Zhou [31] handle color space information for feature extraction with CCS-HOG, which applies a between-component color space similarity combined with HOG to extract facial features; this approach has shown its effectiveness in kinship verification. Van and Hoang extract HOG features coding facial images in different color spaces [32, 33], where the final features are obtained by concatenating all features extracted from each color component. Fekri-Ershad and Tajeripour analyzed the color information of color-texture images for classification using hybrid color LBP [34]. Color texture can also be analyzed for classification using a weighted color order of LBP [35].


Figure 5: Different values of extended kernels for HOG by CCS and CCF on the AR dataset. The chart plots accuracy (%) for the methods Max mag, C1+C1 through C3+C3, and Fusion in the RGB, HSV, ISH, YCbCr, and I1I2I3 color spaces with R = 1 to 5.

Table 1: Description of the experimental datasets.

Dataset   Image size   Classes   Training set   Testing set   Total
AR        165 × 120    100       1,300          1,300         2,600
CLV       256 × 256    111       333            444           777
Georgia   128 × 128    50        350            400           750
MUCT      256 × 256    199       199            398           597


Although current researchers have applied the maximum magnitude selection method to selectively reduce the information of a color image so that it meets the requirements of the subsequent stages of the HOG extraction process, the performance is not fully optimized. The impact of color components on feature extraction was first studied in [36] for LBP descriptors, and this issue has been extensively investigated in recent years through various proposed methods that incorporate color information, either by fusing features extracted from each color component independently or by processing the components jointly [4]. Specifically, we achieve better performance in several cases when we use the unselected magnitudes of the maximum magnitude selection method. We therefore propose two novel approaches for extracting HOG from color images: Color Component Selection and Color Component Fusion. Furthermore, we also upgrade the kernels in the gradient computation stage by extending them in the horizontal and vertical dimensions. The intention is to examine the connection between the surrounding pixels and the pixel being computed, and whether the surrounding pixels affect the output performance.

The rest of this paper is organized as follows. Sections 2 and 3 introduce the HOG descriptor with Color Component Selection, Color Component Fusion, and the extended kernels. Experimental results are then presented in Section 4. Finally, the conclusion and future work are discussed in Section 5.

2. HOG Descriptor with Color Component Selection and Color Component Fusion

This section briefly introduces HOG computation and the two proposed approaches with Color Component Selection (CCS) and Color Component Fusion (CCF) for the HOG descriptor.

2.1. HOG Descriptor. Before extracting the HOG feature, an image I is split into three subimages IC1, IC2, and IC3, the three color components of I. Next, several image processing algorithms are applied to these images to reduce noise and enhance performance. After this preprocessing step, the gradient magnitude and direction of each pixel of each image are computed from the horizontal and vertical gradients. The gradient computation for the pixel located at coordinate (x, y) is defined as follows [6]:

\Delta_x = \lvert G(x-1, y) - G(x+1, y) \rvert, \quad (1)

\Delta_y = \lvert G(x, y-1) - G(x, y+1) \rvert, \quad (2)

M(x, y) = \sqrt{\Delta_x^2 + \Delta_y^2}, \quad (3)

\alpha(x, y) = \arctan\left(\frac{\Delta_y}{\Delta_x}\right), \quad (4)

where G is the grayscale value of the pixel being computed, \Delta_x and \Delta_y represent the horizontal and vertical gradients, and M(x, y) and \alpha(x, y) denote the gradient magnitude and gradient direction, respectively.

Figure 6: Different values of extended kernels for HOG by CCS and CCF on the Georgia dataset. The chart plots accuracy (%) for the methods Max mag, C1+C1 through C3+C3, and Fusion in the RGB, HSV, ISH, YCbCr, and I1I2I3 color spaces with R = 1 to 5.
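As a concrete illustration of Equations (1)–(4), the per-pixel gradient computation can be sketched in a few lines of NumPy. This is a minimal sketch under our own assumptions (the function name and the replicate padding at the image border are ours; the authors' Matlab implementation is not shown in the paper):

```python
import numpy as np

def gradient_magnitude_direction(G):
    """Compute M(x, y) and alpha(x, y) of one color component
    following Equations (1)-(4)."""
    P = np.pad(np.asarray(G, dtype=np.float64), 1, mode="edge")
    # Equations (1) and (2): absolute horizontal/vertical differences.
    dx = np.abs(P[1:-1, :-2] - P[1:-1, 2:])
    dy = np.abs(P[:-2, 1:-1] - P[2:, 1:-1])
    # Equation (3): gradient magnitude.
    M = np.sqrt(dx ** 2 + dy ** 2)
    # Equation (4): gradient direction in degrees; arctan2 avoids a
    # division by zero when dx = 0.
    alpha = np.degrees(np.arctan2(dy, dx))
    return M, alpha
```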

2.2. Proposed Approaches. Each component image then yields a pair of matrices: one gradient magnitude matrix and one gradient direction matrix. In total, three pairs of matrices are obtained: MC1 and αC1, MC2 and αC2, and MC3 and αC3. As the following step requires only one pair of matrices, the task is to determine which pair should be selected to produce an optimal feature vector. The most popular solution is maximum magnitude selection. This method compares the three gradient magnitudes at each pixel and chooses the maximum as the final magnitude of that pixel; the final direction is taken from the same color component as the selected magnitude. The maximum magnitude selection method is illustrated in Figure 1.
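Per pixel, the selection itself is an argmax over the three component magnitudes, with the direction taken from the winning component. A minimal NumPy sketch of this step (our own illustration, reusing the hypothetical gradient_magnitude_direction helper above):

```python
import numpy as np

def max_magnitude_selection(M_list, alpha_list):
    """Maximum magnitude selection over three color components
    (see Figure 1). M_list and alpha_list each hold three 2D arrays."""
    M = np.stack(M_list)            # shape (3, H, W)
    alpha = np.stack(alpha_list)    # shape (3, H, W)
    winner = np.argmax(M, axis=0)   # winning component index per pixel
    M_final = np.take_along_axis(M, winner[None], axis=0)[0]
    alpha_final = np.take_along_axis(alpha, winner[None], axis=0)[0]
    return M_final, alpha_final
```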

After the selection step, two final matrices are obtained for the orientation binning step. In this step, the original image is divided into cells (8 × 8 pixels per cell), and a 9-bin histogram is built in each cell from the gradient features of its pixels. The bins range from 0 to 180 degrees for unsigned gradients (αunsigned) and from 0 to 360 degrees for signed gradients (αsigned). The gradient magnitude of each pixel is added to the corresponding bin of the histogram. The bin index Bidx(x, y) is computed by Equation (5) or (6), taking the ceiling value:

B_{idx}(x, y) = \left\lceil \frac{\alpha(x, y)}{\alpha_{unsigned} / B_{num}} \right\rceil, \quad \alpha_{unsigned} \in (0, 180], \; B_{num} = 9, \quad (5)

or

B_{idx}(x, y) = \left\lceil \frac{\alpha(x, y)}{\alpha_{signed} / B_{num}} \right\rceil, \quad \alpha_{signed} \in (0, 360], \; B_{num} = 9, \quad (6)

where Bnum stands for the number of orientation bins of the histogram, usually set to 9 by default, and α(x, y) is computed by Equation (4) above.
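For one 8 × 8 cell, the orientation binning of Equation (5) with the ceiling value can be sketched as follows (unsigned gradients; the hard assignment without interpolation is our assumption, since the paper does not mention interpolation):

```python
import numpy as np

def cell_histogram(M_cell, alpha_cell, B_num=9, alpha_unsigned=180.0):
    """Build the 9-bin histogram of one cell following Equation (5)."""
    width = alpha_unsigned / B_num                  # 20 degrees per bin
    idx = np.ceil(alpha_cell / width).astype(int)   # ceiling value of Eq. (5)
    idx = np.clip(idx, 1, B_num) - 1                # keep bins in 1..B_num
    hist = np.zeros(B_num)
    np.add.at(hist, idx, M_cell)                    # accumulate magnitudes
    return hist
```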

Figure 7: Different values of extended kernels for HOG by CCS and CCF on the CLV dataset; accuracy (%) for the methods Max mag, C1+C1 through C3+C3, and Fusion in the RGB, HSV, ISH, YCbCr, and I1I2I3 color spaces with R = 1 to 5.


For the normalization stage, the original image is divided into blocks, each containing 2 × 2 cells (16 × 16 = 256 pixels). An effective normalization is able to reduce noise and cancel ill effects. Blocks overlap their neighbors by 50%, so each cell is normalized more than once, except the cells located in the corners. The histograms of each block are concatenated and then normalized using the L1-norm, L2-norm, or L1-sqrt. Finally, all normalized histograms are combined into a feature vector.
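A sketch of this block normalization with 50% overlap, using the L2-norm (the small epsilon guarding against division by zero is our addition):

```python
import numpy as np

def normalize_blocks(cell_hists, eps=1e-6):
    """L2-normalize overlapping 2 x 2 cell blocks and concatenate them.
    cell_hists has shape (n_rows, n_cols, 9): one histogram per cell."""
    n_rows, n_cols, _ = cell_hists.shape
    feats = []
    for i in range(n_rows - 1):       # stride of one cell = 50% overlap
        for j in range(n_cols - 1):
            block = cell_hists[i:i + 2, j:j + 2].ravel()  # 4 cells, 36 values
            feats.append(block / (np.linalg.norm(block) + eps))
    return np.concatenate(feats)      # final HOG feature vector
```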

In practice, we observe that the maximum magnitude selection method is not optimal, as it may cause information loss. When we use the unselected values of maximum magnitude selection for HOG extraction, the achieved results are higher than when using the maximum magnitude in several experiments. Therefore, we apply the Color Component Selection and Color Component Fusion methods to improve performance. In the Color Component Selection method, instead of considering which magnitude or direction should be selected for the next stage, we choose the final pair of matrices based on the color components. Since there are three matrices of each kind, nine different pairs of matrices are obtained in total. The orientation binning stage is applied to each of these pairs in turn, so nine feature vectors are obtained, corresponding to the nine selected pairs. These vectors are then evaluated to find the best performance. The process of the Color Component Selection method is presented in Figure 2. The Color Component Fusion method simply takes the obtained vectors and concatenates all of them into a single fusion vector.
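In other words, CCS enumerates the nine (magnitude component, direction component) pairs and extracts one HOG vector per pair, while CCF concatenates all nine. A minimal sketch under these assumptions (hog_from is a hypothetical callable mapping a magnitude/direction pair to a HOG vector, for example built from the cell_histogram and normalize_blocks sketches above):

```python
import numpy as np

def ccs_ccf_features(components, hog_from):
    """Color Component Selection / Fusion over three color components.
    components: three 2D arrays (C1, C2, C3) of one color image."""
    grads = [gradient_magnitude_direction(C) for C in components]
    ccs = {}
    for i, (M, _) in enumerate(grads):           # magnitude from Ci
        for j, (_, alpha) in enumerate(grads):   # direction from Cj
            ccs[f"C{i + 1}+C{j + 1}"] = hog_from(M, alpha)
    ccf = np.concatenate(list(ccs.values()))     # Color Component Fusion
    return ccs, ccf
```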

3. Extended Kernels for HOG Extraction

To speed up the gradient computation process, researchers use the kernels illustrated in Figure 3. By filtering the image with these kernels, the horizontal gradient Δx and vertical gradient Δy are computed faster than by calculating Equations (1) and (2) directly. We define these original kernels as kernels with R = 1. As we sequentially extend the kernels in the horizontal and vertical dimensions, the parameter R increases. Figure 3 specifically describes the kernels with parameter R ranging from 1 to 4. The resulting horizontal and vertical gradients are then divided by the parameter R.
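Under our reading of Figure 3, the extended horizontal kernel with parameter R consists of R ones, a central zero, and R minus ones (e.g., [1, 1, 1, 1, 0, -1, -1, -1, -1] for R = 4), with the vertical kernel being its transpose. A sketch using scipy (the correlation routine and symmetric boundary handling are our choices):

```python
import numpy as np
from scipy.signal import correlate2d

def extended_gradients(G, R=1):
    """Horizontal and vertical gradients with the extended kernels of
    Section 3; R = 1 reproduces the original kernels."""
    kernel = np.concatenate([np.ones(R), [0.0], -np.ones(R)])
    dx = np.abs(correlate2d(G, kernel[None, :], mode="same", boundary="symm"))
    dy = np.abs(correlate2d(G, kernel[:, None], mode="same", boundary="symm"))
    return dx / R, dy / R   # divide the gradients by R, as described above
```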

4. Experiments

4.1. Dataset Description. The proposed approaches are evaluated on four common facial databases: AR, Georgia, CLV, and MUCT (see Figure 4).

Figure 8: Different values of extended kernels for HOG by CCS and CCF on the MUCT dataset; accuracy (%) for the methods Max mag, C1+C1 through C3+C3, and Fusion in the RGB, HSV, ISH, YCbCr, and I1I2I3 color spaces with R = 1 to 5.


To evaluate the proposed approaches, each database is split into 50% for training and 50% for testing. However, databases such as Georgia, CLV, and MUCT have an odd number of images per class. For instance, the MUCT dataset includes 3 images per class, so we randomly select one image for training and the other two for testing in order to make the training more challenging. The 1-NN classifier is employed to evaluate classification performance, and the accuracy metric is used to measure performance on the testing set. A summary of these databases is presented in Table 1.

4.2. Experimental Setup. As our proposed approaches are designed to fit any three-component color image, we also experiment with several well-known color spaces, including HSV, ISH, I1I2I3, and YCbCr. These spaces are frequently applied in pattern recognition [4]. According to Section 2, each image, after applying Color Component Selection, results in 10 different feature vectors: 9 individual vectors and one fusion vector extracted by the Color Component Fusion method. These vectors are then fed to a 1-NN classifier for evaluation. Moreover, we set the kernels with R from 1 to 5 for comparison. The experiments are implemented in Matlab 2017b and conducted on a PC with an Intel Core i3-8100 CPU at 3.60 GHz and 8 GB of RAM.
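As we understand the protocol, evaluation amounts to fitting a 1-NN classifier on the training feature vectors and scoring accuracy on the testing set. A minimal sketch with scikit-learn (our tooling choice; the original experiments used Matlab):

```python
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score

def evaluate_1nn(X_train, y_train, X_test, y_test):
    """Fit a 1-NN classifier and report accuracy (Section 4.2)."""
    clf = KNeighborsClassifier(n_neighbors=1)
    clf.fit(X_train, y_train)
    return accuracy_score(y_test, clf.predict(X_test))
```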

4.3. Results. The experimental results are presented in the charts of Figures 5–8. By converting to multiple color spaces and increasing the parameter R, the performance of the maximum magnitude selection method is improved. Furthermore, the Color Component Selection and Color Component Fusion methods also outperform the maximum magnitude selection method, as the charts show. For the AR database, the highest achieved accuracy is 95.92%, which is 0.54% higher than the best accuracy achievable with the maximum magnitude selection method; this result is obtained by Color Component Fusion with R = 5 kernels in the ISH color space. For the Georgia database, applying the Color Component Fusion method with R = 1 kernels in the YCbCr color space yields 75.00%, the highest accuracy across all experiments on that database. Better performance is also obtained on the MUCT database and even on the CLV database, the most challenging image set.

Table 2: Several experimental cases on the AR database (accuracy, %). Ci+Cj denotes the pair of color components used (following Figure 2, magnitude from Ci and direction from Cj); "Max magnitude" is the maximum magnitude selection method and "Fusion" is Color Component Fusion.

Color space   R   Max magnitude   C2+C1   C2+C2   C2+C3   C3+C1   Fusion
RGB           1   92.69           87.00   91.23   89.77   85.46   90.62
RGB           2   93.46           88.46   93.15   92.92   87.85   92.77
RGB           3   94.54           88.77   93.85   93.69   88.85   93.69
RGB           4   94.92           89.15   94.38   94.38   89.08   94.54
RGB           5   94.38           89.85   93.69   94.46   89.23   93.77
HSV           1   92.62           84.69   84.00   74.85   82.77   86.31
HSV           2   94.46           87.15   90.38   83.54   86.31   90.08
HSV           3   94.46           88.15   92.46   86.38   87.54   92.15
HSV           4   94.54           88.69   94.23   87.31   87.85   93.77
HSV           5   94.69           89.00   94.77   87.92   87.62   94.15
ISH           1   92.08           92.77   82.08   72.23   89.23   88.69
ISH           2   94.77           94.54   88.15   81.77   92.77   92.85
ISH           3   94.85           94.77   92.15   85.85   93.85   94.23
ISH           4   95.38           94.46   93.69   87.23   94.00   95.08
ISH           5   94.69           94.62   94.23   87.69   94.08   95.92
YCbCr         1   92.77           92.38   86.15   85.54   93.15   91.31
YCbCr         2   94.00           94.15   90.15   91.00   94.85   93.77
YCbCr         3   94.54           94.15   91.54   92.31   95.15   94.31
YCbCr         4   94.77           94.38   91.38   92.54   94.92   94.69
YCbCr         5   94.54           94.08   90.77   91.69   94.92   94.62
I1I2I3        1   93.08           93.15   89.62   78.23   93.31   92.00
I1I2I3        2   93.77           94.46   93.08   87.31   95.69   94.23
I1I2I3        3   94.46           94.69   93.31   89.38   95.08   95.38
I1I2I3        4   94.54           94.62   93.00   90.31   95.31   95.62
I1I2I3        5   94.62           94.85   93.00   91.54   95.31   95.69

Table 3: Several experimental cases on the Georgia database (accuracy, %).

Color space   R   Max magnitude   C1+C2   C1+C3   C2+C2   C3+C2   Fusion
RGB           1   65.75           67.00   65.75   66.50   66.25   66.50
RGB           2   66.00           67.50   66.50   68.25   68.25   67.75
RGB           3   66.75           68.75   69.25   68.75   68.50   69.00
RGB           4   67.00           68.75   69.25   68.75   69.75   69.00
RGB           5   67.00           68.50   67.50   68.25   69.25   68.00
HSV           1   69.25           67.00   63.50   64.00   47.00   70.00
HSV           2   65.00           67.75   59.00   64.75   52.50   69.50
HSV           3   64.50           65.75   55.00   64.75   56.75   69.00
HSV           4   65.00           67.25   56.25   65.25   58.25   71.25
HSV           5   64.25           68.25   58.75   68.00   62.50   72.75
ISH           1   69.00           68.75   63.25   66.50   47.25   70.50
ISH           2   66.25           66.75   58.00   64.75   49.75   70.25
ISH           3   64.75           66.00   58.75   65.25   54.00   69.50
ISH           4   65.00           65.75   55.75   66.50   56.25   70.50
ISH           5   65.75           67.75   56.75   65.50   59.25   72.50
YCbCr         1   66.25           65.75   71.50   66.50   68.50   75.00
YCbCr         2   68.25           68.00   68.75   65.75   65.75   72.75
YCbCr         3   68.25           65.50   65.50   63.75   65.00   72.50
YCbCr         4   69.00           65.00   64.50   64.25   63.50   71.00
YCbCr         5   69.00           65.75   63.50   63.75   64.75   70.00
I1I2I3        1   66.25           67.75   56.25   67.25   68.00   72.00
I1I2I3        2   67.75           67.50   61.50   66.00   65.00   70.25
I1I2I3        3   68.50           65.50   59.50   64.25   64.75   70.00
I1I2I3        4   68.50           65.00   59.75   63.75   64.00   70.50
I1I2I3        5   68.75           65.50   60.00   64.00   63.50   69.75


Generally, we observe that the best results are mostly achieved with the Fusion approach compared with the other approaches, which indicates the promising performance of our proposed methods.

Several of the best cases are reported in Tables 2–5. In these tables, "Max magnitude" stands for the maximum magnitude selection method, while "Fusion" stands for the Color Component Fusion method. We found that the best accuracy obtained on the AR dataset is 95.92%, using the Fusion approach. Similarly, this approach achieves 75.00%, 49.77%, and 93.47% on the Georgia, CLV, and MUCT datasets, respectively. Note that the performance of each color space differs according to the feature extraction method. Moreover, increasing the parameter R of the extended kernels also improves several experimental results. In most cases, the accuracy with R = 5 kernels is higher than with the others. However, there are also several cases where the highest accuracy is achieved with R = 1, so we cannot yet conclude that a higher parameter R yields higher accuracy. We believe these results demonstrate the effect of kernel size on HOG performance, which leads us to study it more deeply in the future.

5. Conclusion

In this paper, we propose novel approaches, the Color Component Selection and Color Component Fusion methods, to extract the HOG descriptor from color images and improve classification performance. In our observations, the proposed methods outperform the current maximum magnitude selection method in the face classification task, especially the Color Component Fusion method. The color space conversion and extended kernels also efficiently improve classification accuracy. However, in several cases, the extended kernels still cause the accuracy to decrease. The Color Component Selection method requires a long time to finish the extraction for every available case, and Color Component Fusion may cause memory issues due to its high dimensionality. Therefore, our future work is to determine the optimal pair of matrices without testing all cases, to improve inference speed. Feature selection methods are also recommended to reduce the vector dimension.

Table 4: Several experimental cases on the CLV database (accuracy, %).

Color space   R   Max magnitude   C1+C1   C1+C2   C1+C3   C3+C2   C3+C3   Fusion
RGB           1   47.75           47.52   49.77   48.65   48.65   49.10   49.55
RGB           2   47.52           45.95   47.75   47.07   47.52   48.20   47.75
RGB           3   47.97           46.85   48.42   48.65   47.97   48.42   48.42
RGB           4   47.52           47.75   48.20   47.30   48.42   46.85   48.42
RGB           5   47.07           47.07   48.65   47.97   47.75   47.30   48.87
HSV           1   46.85           47.75   46.40   45.50   42.12   44.82   48.42
HSV           2   45.50           46.40   46.17   45.05   43.92   45.50   48.20
HSV           3   45.50           46.85   46.17   45.05   44.14   44.82   48.42
HSV           4   45.27           47.52   46.62   44.82   45.05   43.47   49.32
HSV           5   45.72           47.30   47.30   45.27   45.95   43.02   49.10
ISH           1   46.40           48.65   47.30   45.95   41.89   44.59   49.32
ISH           2   45.50           48.42   45.95   44.82   42.79   45.50   48.65
ISH           3   45.50           48.42   46.62   43.92   42.34   44.14   48.42
ISH           4   45.05           47.30   45.72   44.37   43.47   42.79   48.42
ISH           5   46.17           48.87   45.72   45.95   43.47   42.34   48.20
YCbCr         1   49.10           48.87   43.92   45.05   42.12   46.62   47.97
YCbCr         2   47.75           48.65   43.69   47.30   44.37   46.62   47.97
YCbCr         3   47.75           48.65   44.82   47.97   45.95   46.62   48.87
YCbCr         4   47.97           47.97   45.72   49.10   45.05   46.85   48.87
YCbCr         5   49.32           48.87   45.27   49.10   44.82   46.40   48.20
I1I2I3        1   48.87           48.65   44.82   44.14   46.40   43.92   49.32
I1I2I3        2   48.42           48.42   46.85   44.59   47.52   43.47   48.65
I1I2I3        3   48.20           48.42   47.52   46.17   46.85   43.24   48.42
I1I2I3        4   48.20           47.30   47.75   45.72   46.62   41.89   49.10
I1I2I3        5   48.65           48.87   47.30   45.27   46.17   41.89   48.65

Table 5: Several experimental cases on the MUCT database (accuracy, %).

Color space   R   Max magnitude   C1+C1   C1+C2   C1+C3   C2+C1   C3+C2   Fusion
RGB           1   91.71           92.21   91.71   92.71   92.46   92.21   92.21
RGB           2   91.46           92.71   92.46   92.21   92.71   91.96   92.21
RGB           3   91.71           92.46   92.96   92.21   91.71   92.96   92.96
RGB           4   91.96           91.71   92.71   92.46   91.96   92.71   92.96
RGB           5   92.21           91.71   92.46   92.46   92.21   92.96   92.71
HSV           1   90.95           92.46   92.71   90.20   90.20   88.94   93.22
HSV           2   89.95           92.71   92.71   91.46   91.21   88.94   93.47
HSV           3   89.45           92.21   91.46   91.71   91.71   89.45   93.47
HSV           4   88.94           91.71   91.46   91.21   92.21   89.45   92.96
HSV           5   88.94           91.71   91.96   90.70   92.46   88.94   93.22
ISH           1   90.20           92.21   92.96   89.45   89.95   86.18   93.47
ISH           2   89.20           92.21   91.96   91.46   90.20   88.69   92.96
ISH           3   89.20           92.96   91.71   91.71   91.71   89.20   93.47
ISH           4   88.44           92.71   91.96   91.46   92.46   88.19   93.22
ISH           5   88.94           92.46   92.46   90.70   91.96   88.94   93.22
YCbCr         1   92.71           92.21   91.21   89.95   89.45   91.46   92.96
YCbCr         2   92.46           91.96   92.46   91.96   90.95   92.21   92.96
YCbCr         3   92.71           92.71   92.96   92.21   91.21   91.71   92.46
YCbCr         4   92.46           92.96   92.46   91.96   91.21   90.95   92.21
YCbCr         5   92.46           92.46   91.46   90.95   91.71   90.70   91.96
I1I2I3        1   92.71           92.21   90.95   88.19   89.70   90.20   92.46
I1I2I3        2   92.21           92.21   92.96   90.45   90.70   91.21   92.96
I1I2I3        3   92.71           92.96   92.46   90.95   91.46   91.71   93.22
I1I2I3        4   92.71           92.71   91.71   90.45   91.96   90.20   92.71
I1I2I3        5   92.46           92.46   90.95   90.20   91.96   90.45   92.71


Several CNN-related works will be considered for combination with the proposed approaches for further improvement.

Data Availability

The datasets generated and/or analyzed during the current study are available from the corresponding author on reasonable request.

Conflicts of Interest

We have no conflict of interest to declare.

References

[1] D. Huang, C. Shan, M. Ardabilian, Y. Wang, and L. Chen, "Local binary patterns and its application to facial image analysis: a survey," IEEE Transactions on Systems, Man, and Cybernetics, Part C, vol. 41, no. 6, pp. 765–781, 2011.

[2] A. Humeau-Heurtier, "Texture feature extraction methods: a survey," IEEE Access, vol. 7, pp. 8975–9000, 2019.

[3] S. Fekri-Ershad, "Gender classification in human face images for smart phone applications based on local texture information and evaluated Kullback-Leibler divergence," Traitement du Signal, vol. 36, no. 6, pp. 507–514, 2019.

[4] V. T. Hoang, Multi Color Space LBP-Based Feature Selection for Texture Classification, [Ph.D. thesis], Université du Littoral Côte d'Opale, 2018.

[5] L. Zhang, J. Chen, L. Yue, and P. Wang, "Face recognition using scale invariant feature transform and support vector machine," in 2008 The 9th International Conference for Young Computer Scientists, pp. 1766–1770, Hunan, China, 2008.

[6] N. Dalal and B. Triggs, "Histograms of oriented gradients for human detection," in 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05), vol. 1, pp. 886–893, San Diego, CA, USA, 2005.

[7] M. Douze, H. Jégou, H. Sandhawalia, L. Amsaleg, and C. Schmid, "Evaluation of GIST descriptors for web-scale image search," in Proceedings of the ACM International Conference on Image and Video Retrieval, pp. 1–8, New York, NY, USA, 2009.

[8] J. Ding, Y. Wang, and W. Geng, "An HOG-CT human detector with histogram-based search," Multimedia Tools and Applications, vol. 63, no. 3, pp. 791–807, 2013.

[9] Z. Qi, Y. Tian, and Y. Shi, "Efficient railway tracks detection and turnouts recognition method using HOG features," Neural Computing and Applications, vol. 23, no. 1, pp. 245–254, 2013.

[10] J. Qingbo, Z. Enze, Y. Xinqi, X. Yu, and L. Yun, "Face recognition method based on HOG and DMMA from single training sample," Multimedia Tools and Applications, vol. 75, no. 21, pp. 13163–13177, 2016.

[11] S. A. Chowdhury, M. Mir, S. Kowsar, and K. Deb, "Human detection utilizing adaptive background mixture models and improved histogram of oriented gradients," ICT Express, vol. 4, no. 4, pp. 216–220, 2018.

[12] M. Nabila, B. J. Yousra, and W. Eric, "Optimized HOG descriptor for on road cars detection," in Proceedings of the 10th International Conference on Distributed Smart Camera (ICDSC '16), pp. 166–171, Paris, France, 2016.

[13] S. Aytaç Korkmaz and H. Binol, "Classification of molecular structure images by using ANN, RF, LBP, HOG, and size reduction methods for early stomach cancer detection," Journal of Molecular Structure, vol. 1156, pp. 255–263, 2018.

[14] A. K. Hmood, C. Y. Suen, and L. Lam, "An enhanced histogram of oriented gradient descriptor for numismatic applications," Pattern Recognition and Image Analysis, vol. 28, no. 4, pp. 569–587, 2018.

[15] N. A. Jebril, H. R. Al-Zoubi, and Q. A. Al-Haija, "Recognition of handwritten Arabic characters using histograms of oriented gradient (HOG)," Pattern Recognition and Image Analysis, vol. 28, no. 2, pp. 321–345, 2018.

[16] M. Uddin, M. A. Islam, M. Shajalal, M. A. Hossain, M. Yousuf, and S. Iftekhar, "Paddy seed variety identification using T20-HOG and Haralick textural features," Complex & Intelligent Systems, pp. 1–15, Springer, 2021.

[17] M. Chandrakala and P. D. Devi, "Two-stage classifier for face recognition using HOG features," Materials Today: Proceedings, vol. 47, pp. 5771–5775, 2021.

[18] B. T. Hung, "Face recognition using hybrid HOG-CNN approach," in Research in Intelligent and Computing in Engineering: Select Proceedings of RICE 2020, pp. 715–723, Springer, 2021.

[19] M. Zaffar, S. Ehsan, M. Milford, and K. McDonald-Maier, "CoHOG: a light-weight, compute-efficient, and training-free visual place recognition technique for changing environments," IEEE Robotics and Automation Letters, vol. 5, no. 2, pp. 1835–1842, 2020.

[20] X. Zhang, H. Shangguan, A. Ning, A. Wang, J. Zhang, and S. Peng, "Pedestrian detection with EDGE features of color image and HOG on depth images," Automatic Control and Computer Sciences, vol. 54, no. 2, pp. 168–178, 2020.

[21] H. Wang, S. Wei, and B. Fang, "Facial expression recognition using iterative fusion of MO-HOG and deep features," The Journal of Supercomputing, vol. 76, no. 5, pp. 3211–3221, 2020.

[22] X. Pan, "Fusing HOG and convolutional neural network spatial–temporal features for video-based facial expression recognition," IET Image Processing, vol. 14, no. 1, pp. 176–182, 2020.

[23] G. Joshi, S. Singh, and R. Vig, "Taguchi-TOPSIS based HOG parameter selection for complex background sign language recognition," Journal of Visual Communication and Image Representation, vol. 71, 2020.

[24] K. Wang, H. Wang, and J. Wang, "Terrain matching by fusing HOG with Zernike moments," IEEE Transactions on Aerospace and Electronic Systems, vol. 56, no. 2, pp. 1290–1300, 2020.

[25] V. T. Hoang, D. P. van Hoai, T. Surinwarangkoon, H.-T. Duong, and K. Meethongjan, "A comparative study of rice variety classification based on deep learning and hand-crafted features," ECTI Transactions on Computer and Information Technology (ECTI-CIT), vol. 14, no. 1, pp. 1–10, 2020.

[26] M. F. Aslan, A. Durdu, K. Sabanci, and M. A. Mutluer, "CNN and HOG based comparison study for complete occlusion handling in human tracking," Measurement, vol. 158, 2020.

[27] W. Zhou, S. Gao, L. Zhang, and X. Lou, "Histogram of oriented gradients feature extraction from raw Bayer pattern images," IEEE Transactions on Circuits and Systems II: Express Briefs, vol. 67, no. 5, pp. 946–950, 2020.

[28] H. T. M. Nhat and V. T. Hoang, "Feature fusion by using LBP, HOG, GIST descriptors and Canonical Correlation Analysis for face recognition," in 2019 26th International Conference on Telecommunications (ICT), pp. 371–375, Hanoi, Vietnam, 2019.


[29] H.-T. Duong and V. T. Hoang, "Dimensionality reduction based on feature selection for rice varieties recognition," in 2019 4th International Conference on Information Technology (InCIT), pp. 199–202, Bangkok, Thailand, 2019.

[30] S. Banerji, A. Sinha, and C. Liu, "HaarHOG: improving the HOG descriptor for image classification," in 2013 IEEE International Conference on Systems, Man, and Cybernetics, pp. 4276–4281, Manchester, UK, 2013.

[31] Z. Zhou and Y. Zhou, "Cross-channel similarity based histograms of oriented gradients for color images," in 2019 IEEE International Conference on Systems, Man and Cybernetics (SMC), pp. 1621–1625, Bari, Italy, 2019.

[32] T. N. Van and V. T. Hoang, "Early and late features fusion for kinship verification based on constraint selection," in 2019 25th Asia-Pacific Conference on Communications (APCC), pp. 116–121, Ho Chi Minh City, Vietnam, 2019.

[33] T. N. Van and V. T. Hoang, "Kinship verification based on local binary pattern features coding in different color space," in 2019 26th International Conference on Telecommunications (ICT), pp. 376–380, Hanoi, Vietnam, 2019.

[34] S. Fekri-Ershad and F. Tajeripour, "Impulse-noise resistant color-texture classification approach using hybrid color local binary patterns and Kullback–Leibler divergence," The Computer Journal, vol. 60, no. 11, pp. 1633–1648, 2017.

[35] T. Song, J. Feng, S. Wang, and Y. Xie, "Spatially weighted order binary pattern for color texture classification," Expert Systems with Applications, vol. 147, 2020.

[36] T. Mäenpää and M. Pietikäinen, "Classification with color and texture: jointly or separately?," Pattern Recognition, vol. 37, no. 8, pp. 1629–1640, 2004.
