
  • 8/17/2019 A Fast and Robust Eye-event Recognition for Human-Smartphone Interaction


    A Fast and Robust Eye-Event Recognition (FRER)

    for Human-Smartphone Interaction

    Ardhiansyah Baskara

School of Electrical and Computer Engineering, Pusan National University

    Busan, 609-735 Republic of Korea

    [email protected]

    Han-You Jeong

School of Electrical and Computer Engineering, Pusan National University

    Busan, 609-735 Republic of Korea

    [email protected]

Abstract—Human-computer interaction using computer vision has been extensively studied to give people with severe physical challenges, such as Lou Gehrig's disease or stroke, a chance to access computers. Recently, the smartphone has become one of the most important gadgets in our daily life. However, computer vision for human-smartphone interaction (HSI) still faces many challenges, such as hardware limitations and the unstable distance and pose of the smartphone user. In this paper, we present the fast and robust eye-event recognition (FRER) scheme, which consists of the eye-area extraction, eye-tracking, and eye-event recognition blocks. We also propose the slope-based similarity checking (SSC) algorithm for eye-event recognition of a person with an arbitrary eye size. The experimental results show that the FRER scheme can successfully detect eye events with 99.3 % accuracy at a frame rate of 19 frames per second.

     Keywords: Human-smartphone interaction, computer vision, eye-event recognition.

    I. INTRODUCTION

The human-computer interaction (HCI) based on computer vision technology has been one of the hottest research topics, because eye-event recognition is one way to obtain input commands from users through a camera [1], [2]. Many researchers have extensively studied novel ways to establish the interaction between a user and a computer. The seminal paper in [1] presents a framework of eye-event recognition which detects the eye area through motion analysis, tracks the eye area using a similarity measure, and then recognizes the eye events based on threshold values of the similarity. The authors in [2] present a robust implementation of this framework which supports a frame rate of 30 frames per second (FPS) on a desktop PC equipped with a webcam.

Recently, the smartphone has become a ubiquitous mobile device for web browsing, instant messaging, and streaming services in our daily life. In this paper, we focus on the human-smartphone interaction (HSI) using computer vision technology. The goal of this research is to give people with severe physical challenges, such as Lou Gehrig's disease and stroke, a chance to access the smartphone. The HSI usually faces a couple of additional challenges compared to the HCI: 1) how to detect and track the eye events in a computationally efficient way; and 2) how to accurately recognize the eye events of a person with an arbitrary eye size.

Fig. 1. Overview of the FRER scheme.

The EyePhone in [3] is the first hands-free interaction for driving apps using the HSI. In [4], the EyeGuardian informs the user if his/her blink rate is exceptionally low. For the eye-tracking template, the EyePhone requires an additional step to collect open-eye templates in the initial phase, whereas the EyeGuardian uses the computationally intensive Haar cascade classifier. For eye-blink detection, both apps use threshold-based similarity checking (TSC), which is not robust to persons with different eye sizes.

In this paper, we propose the fast and robust eye-event recognition (FRER) scheme. The FRER scheme first extracts the eye area using face detection, then tracks the location of the eye area, and finally recognizes the eye events regardless of the eye size. The experimental results show that the FRER scheme can detect the eye events with a success probability of 99.3 % at a frame rate of 19 FPS.

II. THE FRER SCHEME

Fig. 1 shows an overview of the FRER scheme, which consists of three blocks: eye-area extraction (EE), eye tracking (ET), and eye-event recognition (ER). The EE block obtains the eye area through the following steps. It first converts the RGB frame from the smartphone camera in Fig. 1(a) into a grayscale frame, as shown in Fig. 1(b). Next, the EE block employs the Haar cascade classifier to extract the face area (see Fig. 1(c)), and then obtains the eye area by cropping it from the face area, as shown in Fig. 1(d).
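The cropping step in Fig. 1(d) can be sketched as follows. The (x, y, w, h) box format matches what a Haar cascade face detector returns; the crop fractions top_frac and height_frac are illustrative assumptions of ours, since the paper does not specify how the eye area is cut from the face area:

```python
def crop_eye_area(face_box, top_frac=0.22, height_frac=0.28):
    """Crop an eye-area box from a detected face box.

    face_box is (x, y, w, h), the format returned by a Haar cascade
    face detector.  The fractions are illustrative assumptions: eyes
    sit in the upper part of the face, so we keep a horizontal band
    starting at top_frac of the face height.
    """
    x, y, w, h = face_box
    eye_y = y + int(top_frac * h)   # top edge of the eye band
    eye_h = int(height_frac * h)    # height of the eye band
    return (x, eye_y, w, eye_h)

# Example: a 200x200 face box detected at (50, 40)
print(crop_eye_area((50, 40, 200, 200)))   # -> (50, 84, 200, 56)
```

The returned box then serves as the region of interest for the subsequent tracking step.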

Once the eye area is obtained from the EE block, the ET block tracks the movement of the eye area using the Haar cascade classifier, as shown in Fig. 1(e). To this end, the ET


    Fig. 2. The NCC of two smartphone users

block considers the extracted eye area as the region of interest (ROI), which includes the open-eye template in Fig. 1(f). This ROI is also used to skip the computationally intensive EE block, which will be explained at the end of this section.

As shown in Fig. 1(g), we consider two possible eye states: open and closed. To recognize an eye event, the ER block uses the open-eye template of the ET block as the reference. The ER block adopts the same similarity measure, the normalized cross-correlation (NCC), as the existing works in [1]-[4]. The key difference of the ER block is the use of the slope-based similarity checking (SSC) algorithm to detect the eye events. Denoting the NCC value at frame index t by η(t), the existing TSC algorithm uses a fixed threshold T on the NCC: an eye is open if η(t) ≥ T, and closed otherwise. Instead, the SSC algorithm toggles its state when the slope of the NCC, denoted by ς(t) = |η(t) − η(t−1)|, exceeds a fixed threshold S: it maintains the current state if ς(t) ≤ S, and switches to the other state otherwise. Fig. 2 plots the NCC of two smartphone users taken from our experiments. We can see that, at the frames of eye closing/opening, both users have a similar slope value, while the NCC value itself differs considerably depending on the eye size.
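The TSC and SSC decision rules above can be sketched in plain Python. The ncc helper computes the normalized cross-correlation over flat grayscale patches; the thresholds T = 0.865 and S = 0.08 are the values used in Section III, while the initial open state and the example NCC trace are assumptions for illustration:

```python
def ncc(template, patch):
    """Normalized cross-correlation between two equal-sized grayscale
    patches, each given as a flat list of pixel intensities."""
    n = len(template)
    mt = sum(template) / n
    mp = sum(patch) / n
    num = sum((a - mt) * (b - mp) for a, b in zip(template, patch))
    den = (sum((a - mt) ** 2 for a in template)
           * sum((b - mp) ** 2 for b in patch)) ** 0.5
    return num / den if den else 0.0

def tsc(eta, T=0.865):
    """Threshold-based checking: the state follows the raw NCC value."""
    return "open" if eta >= T else "closed"

class SSC:
    """Slope-based checking: toggle the eye state whenever the NCC
    changes by more than S between consecutive frames."""
    def __init__(self, S=0.08, state="open"):
        self.S, self.state, self.prev = S, state, None

    def update(self, eta):
        if self.prev is not None and abs(eta - self.prev) > self.S:
            self.state = "closed" if self.state == "open" else "open"
        self.prev = eta
        return self.state

# Example NCC trace around a blink: the drop 0.94 -> 0.70 and the
# rise 0.68 -> 0.93 both exceed S, toggling the state each time.
ssc = SSC()
print([ssc.update(e) for e in [0.95, 0.94, 0.70, 0.68, 0.93, 0.95]])
# -> ['open', 'open', 'closed', 'closed', 'open', 'open']
```

Note how SSC depends only on the frame-to-frame change, which is why it is insensitive to the absolute NCC level of a particular user's eye.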

Finally, if η(t) is less than 0.5, the FRER scheme interprets that the ET block has failed to track the eye area, and it returns to the EE block to extract the eye area again. Otherwise, it skips the EE block and directly executes the ET block with the new ROI.
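This per-frame dispatch can be written as a one-line decision; a minimal sketch, with the 0.5 tracking-failure threshold taken from the text:

```python
def next_block(eta, track_fail_thresh=0.5):
    """Decide which block processes the next frame.

    An NCC below 0.5 is interpreted as a tracking failure, so the
    pipeline restarts at eye-area extraction (EE); otherwise the EE
    block is skipped and eye tracking (ET) runs on the new ROI.
    """
    return "EE" if eta < track_fail_thresh else "ET"

print(next_block(0.3))   # -> EE  (tracking lost, re-extract the eye area)
print(next_block(0.9))   # -> ET  (reuse the ROI, skip face detection)
```

Skipping the EE block on most frames is what yields the frame-rate gain reported in Section III.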

III. NUMERICAL RESULTS AND DISCUSSION

In this section, we discuss the numerical results from our experiments with a test app running on the Samsung Galaxy Note 3 Neo. We develop this app using the Android APIs and the OpenCV library.

To demonstrate the computational efficiency, we compare the FRER scheme with a basic scheme that executes all three blocks of FRER at each frame. Table I shows the frame rate of both schemes. In each scenario, we run the test app for three minutes with/without head movements. We can see that the FRER scheme achieves a higher frame rate than the basic scheme: the frame rate of the former is around 19 FPS, while that of the latter is about 15 FPS. We infer that the ET block can reduce the computational load of the face detection.

Fig. 3 shows the accuracy of eye-blink recognition of three smartphone users with different eye sizes.

TABLE I
FRAME RATE OF EYE TRACKING

                                Frame rate (FPS)
    Scheme   Head movement    Mean     Max      Min
    Basic    No              15.04    15.74    13.89
             Yes             15.21    16.12    13.26
    FRER     No              18.57    28.39    14.64
             Yes             19.34    29.95    12.01

Fig. 3. The accuracy of eye-blink recognition of three users.

To maximize the accuracy of eye-blink recognition, we set the threshold of the TSC algorithm to T = 0.865 and the threshold of the SSC algorithm to S = 0.08. We observe that the accuracy of the SSC algorithm is much higher than that of the existing TSC algorithm: on average, the former achieves an accuracy of 99.3 %, while the latter attains an accuracy of 60.0 %. We can also see that the SSC algorithm achieves a high accuracy regardless of eye size, whereas the accuracy of the TSC algorithm depends on the eye size of the user: the maximum difference in accuracy among the three users is 2.0 % for the SSC algorithm, while it is 100 % for the TSC algorithm.

From the above results, we conclude that the FRER scheme is not only computationally efficient for a real-time HSI app, but also robust to users with different eye sizes.

    ACKNOWLEDGEMENT

This research was supported by a National Research Foundation of Korea (NRF) Grant (No. 2009-0083495) and by the Basic Science Research Program (No. 2013R1A1A1012290) through the NRF, which is funded by the Ministry of Science, ICT & Future Planning.

    REFERENCES

[1] K. Grauman, M. Betke, J. Gips, and G. Bradski, "Communication via eye blinks - detection and duration analysis in real time," in Proc. IEEE CVPR'01, Kauai, HI, USA, Dec. 2001, pp. 1010-1017.

[2] M. Chau and M. Betke, "Real time eye tracking and blink detection with USB cameras," Boston University Computer Science Technical Report No. 2005-12, 2005.

[3] E. Miluzzo, T. Wang, and A. T. Campbell, "EyePhone: activating mobile phones with your eyes," in Proc. ACM MobiHeld'10, New Delhi, India, Aug. 2010, pp. 15-20.

[4] S. Han, S. Yang, J. Kim, and M. Gerla, "EyeGuardian: a framework of eye tracking and blink detection for mobile device users," in Proc. ACM HotMobile'12, San Diego, CA, USA, Feb. 2012.