
Page 1: Hand gesture recognition Technology in Human …rjh/courses/ResearchTopicsInHCI/2015-16/... · Hand gesture recognition Technology in Human-computer interaction Xueyuan Li School

Hand Gesture Recognition Technology in Human-Computer Interaction

Xueyuan Li
School of Computer Science

University of Birmingham
Email: [email protected]

Abstract—Hand gesture recognition has gradually become a hot topic in Human-Computer Interaction (HCI) in recent years. It is a technology that mimics how human beings communicate with each other, allowing humans and computers to interact in natural and intuitive ways. Computer recognition of hand gestures is widely used in many applications, such as entertainment systems, medical systems, crisis management and human-robot interaction, since hand gesture recognition provides a natural and intuitive computer interface. Firstly, this paper illustrates some technical issues related to hand gesture recognition, such as hand gesture types and input devices. Subsequently, it focuses on various applications in the personal and industry domains respectively. For the personal domain, I focus only on entertainment systems. In the industry domain, medical systems and human-robot interaction are discussed. At the end of this article, some shortcomings of each area are revealed and the future trends of hand gesture recognition are summarized as three guidelines.

Index Terms—Human-computer interaction, Hand gesture recognition, Data glove, Depth-aware cameras, Stereo cameras, Computer games, Medical system, Human-robot interaction.

I. INTRODUCTION

Since its appearance, the computer has become an indispensable part of human society. The technology will increasingly influence our everyday life because of the popularity of personal computers and other devices, such as the iPhone, the Xbox 360 and the intelligent home. The efficient use of these intelligent devices means that more interaction between humans and computers is required. Therefore, HCI has become an extremely active branch in the field of computer science research[1]. It is obvious that future human-computer interaction modes will be more similar to natural, intuitive interaction. In other words, they will be closer to the way people communicate with each other. Gesture recognition is considered a critical technique in HCI. Compared to other parts of the body, the hand seems more flexible and natural[2]. Therefore, it is a suitable tool for controlling machines.

Hand gesture recognition has been applied in various fields. I divide these applications into two main classes: the personal domain and the industry domain. In the personal domain, entertainment definitely plays the most significant role[3]. In addition, hand gesture recognition has been widely developed for medical systems and human-robot interaction in the industry domain[6]. For each of these applications, I first describe the human factors that need to be considered. Usability is another critical factor, since it motivates users to interact with the devices.

This paper is organized in the following way: Section 2 introduces some technical issues in hand gesture recognition; Section 3 presents some applications where hand gesture recognition has been applied to entertainment systems; Section 4 illustrates some research on medical systems and human-robot interaction; Section 5 draws the conclusion and exposes the future trends of hand gesture recognition.

II. TECHNICAL ISSUES IN GESTURE RECOGNITION

A. Hand gesture type

Hand gesture types, from simple to complicated, can be roughly divided into two categories: static and dynamic hand gestures[5]. Based on that, hand gesture recognition technology also contains three levels: two-dimensional static hand gesture recognition, two-dimensional dynamic hand gesture recognition and 3D hand gesture recognition[6].

The first two kinds of hand gesture recognition technology are based on the two-dimensional level; they do not use depth data as an input source. The third technology, however, is three-dimensional. The most fundamental difference between 3D and two-dimensional hand gesture recognition is that 3D hand gesture recognition requires depth data. Consequently, 3D hand gesture recognition is more complex than two-dimensional hand gesture recognition in both hardware and software[7].

As noted above, special hardware is needed to obtain the depth data. At present, there are three main hardware approaches to realizing this goal: structured light, time of flight and multi-camera setups[8]. Combined with new, advanced computer vision algorithms, 3D gesture recognition can be realized. The next subsection discusses three main input devices that can track hand movements and recognize hand gestures.

B. Input device

Although there are a number of devices for hand gesture recognition, the most useful and commercially available input devices are as follows.


Fig. 1. The data glove with sensors[9].

Fig. 2. The Kinect attached to the Xbox 360[11].

1) Data gloves: In some gesture interaction scenarios, such as film shooting, accurate physical data, like the bending of the fingers, needs to be captured. The data glove is an essential input device for hand gesture recognition.

Normally, many small sensors are attached to a glove that can be worn on the hand to capture the position data of the hand[9] (as shown in Figure 1). Moreover, some kinds of data gloves can detect the movement of the finger joints, or even give the user feedback, such as vibration[10]. The movement information of the hand is collected and transmitted to the computer in a wired or wireless way. The computer then collates and analyses the data to identify the user's gestures, so as to achieve human-computer interaction. One of its main application scenarios is virtual reality.
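The paper does not give an algorithm for this analysis step; as a purely hypothetical sketch (the gesture names, thresholds and sensor layout are assumptions, not from any cited glove), a minimal classifier could map normalised finger-bend readings to a few static poses:

```python
# Hypothetical sketch (not from the paper): classify a static pose from
# normalised finger-bend readings reported by a data glove. Readings are
# ordered thumb..little finger; 0.0 = fully straight, 1.0 = fully bent.
def classify_glove_pose(bend):
    bent = [b > 0.5 for b in bend]          # threshold each bend sensor
    if all(bent):
        return "fist"
    if not any(bent):
        return "open hand"
    if not bent[1] and all(bent[i] for i in (0, 2, 3, 4)):
        return "pointing"                   # only the index finger extended
    return "unknown"

print(classify_glove_pose([0.9, 0.1, 0.8, 0.85, 0.9]))  # -> pointing
```

A real glove would smooth the sensor stream and calibrate per user before thresholding, but the mapping from raw bend values to discrete poses is the core of the recognition step.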

2) Depth-aware cameras: Instead of normal single cameras, depth-aware cameras establish a depth map through specialized hardware[11]. The Kinect, as an accessory to the Xbox 360, has a depth sensor with an infrared emitter to detect the depth data of the scene (Figure 2). The depth sensor works together with a normal camera; they are complementary and provide a better entertainment experience. Regarding the hardware principle, structured light and time of flight are two novel techniques for getting the depth data[12]. Structured light was applied in the first-generation Kinect. The new generation of Kinect, however, adopts time of flight, since structured light can only detect objects at 1 to 4 meters.
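As a minimal illustration of the time-of-flight idea (an assumption-based sketch, not Kinect internals), depth is simply half the distance light travels during the measured round trip:

```python
# Sketch of the time-of-flight principle: the sensor emits light and
# measures how long the reflection takes to return; the depth is half
# the round-trip distance travelled at the speed of light.
SPEED_OF_LIGHT_M_S = 299_792_458.0

def tof_depth_m(round_trip_s: float) -> float:
    """Depth in metres for a measured round-trip time in seconds."""
    return SPEED_OF_LIGHT_M_S * round_trip_s / 2.0

# A reflection returning after ~13.3 ns corresponds to roughly 2 m,
# which shows why nanosecond-scale timing hardware is required.
print(tof_depth_m(13.34e-9))
```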

Fig. 3. A kind of stereo cameras[8].

3) Stereo cameras: The basic principle is to use two or more cameras to take pictures at the same time[13], much like human eyes or the compound eyes of insects observe the world (Figure 3). By comparing the differences between the images obtained from the different cameras at the same moment and using an algorithm to calculate the depth information, a 3D image is obtained. This technique has the lowest hardware requirements. However, the computer vision algorithms are hard to implement.
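The geometry behind this comparison is the classic pinhole stereo relation Z = fB/d; a hedged sketch (the symbols and example numbers are assumptions for illustration, not the paper's notation):

```python
# For a rectified stereo pair, a point seen with a horizontal offset of
# d pixels (the disparity) between the two images lies at depth
# Z = f * B / d, where f is the focal length in pixels and B is the
# baseline (distance between the two cameras) in metres.
def depth_from_disparity(focal_px: float, baseline_m: float,
                         disparity_px: float) -> float:
    """Depth in metres for one matched pixel pair."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px

# Example: 700 px focal length, 6 cm baseline, 21 px disparity.
print(depth_from_disparity(700.0, 0.06, 21.0))  # roughly 2 m
```

The hard part in practice is not this formula but finding the matching pixel pairs reliably, which is where the heavy computer vision algorithms come in.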

In conclusion, the advantage of the data glove is that it recognizes gestures more accurately. On the other hand, the data glove is only suitable for specific areas, such as animation and surgical operations, because of its high cost. The depth-aware camera relies on advanced hardware techniques to capture the depth data. Unlike the depth-aware camera, which consumes much more power, the stereo camera relies completely on computer vision algorithms. Generally speaking, no single method is suitable for every application; the choice of input device depends on the circumstances. In my opinion, the two latter devices should be the future development trend.

III. HAND GESTURE RECOGNITION IN PERSONAL DOMAIN

With the development of human-computer interaction and virtual reality, personal computer games have gradually become a promising and commercially rewarding area. Because of the entertaining nature of the interaction, users have a great interest in trying new game interfaces (Figure 4), and they become immersed in the virtual environment of the game[13].

For computer vision-based systems, a hand-gesture-controlled game should respond to users' gestures as quickly as possible. This is the so-called fast-response requirement[14]. In games, recognition performance should have the highest priority. In addition, these applications have a real-time requirement, so the computer vision algorithms must be efficient and robust enough. Another technical challenge is to distinguish between effective gestures and invalid gestures[14]. One possible solution is to choose a specific gesture to mark the beginning of the recognition, as on the iPhone, where


Fig. 4. Virtual environment of the game. (Source: http://htcvivemanual-muscular.rhcloud.com)

Fig. 5. The statistical results for five subjects in puzzle solving mode[15].

users make a slide gesture to mark the start of operating the mobile phone. To finish the interaction, the user can set an end mark, for example, putting the hands on both sides of the body. Zhang et al.[15] have presented a novel hand gesture recognition system to test user-friendly interaction between humans and computers. The system is based on two specific sensors and uses Hidden Markov Models to recognize hand gestures. A novel game, called the virtual Rubik's Cube, is then controlled to estimate the performance of the system. Five subjects tested the system, each performing about 60 gestures. As the results show (Figure 5), the system accurately identified an average of 57 gestures, and the overall accuracy was 91.7%. That is to say, the system can effectively recognize human gestures, but it would take a lot of time for the two sides to communicate well.
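Zhang et al. do not publish their implementation, so as a hedged sketch of the general idea only (the toy models and gesture names below are assumptions): one discrete HMM is trained per gesture, an observation sequence is scored under each model with the forward algorithm, and the highest-scoring gesture wins.

```python
# Minimal discrete-HMM scoring via the forward algorithm. Illustrative
# only: Zhang et al.'s actual models, sensors and features are not
# reproduced here.
def sequence_probability(obs, start, trans, emit):
    """P(obs | model).

    obs:   list of observation-symbol indices
    start: start[i]    = P(first hidden state is i)
    trans: trans[i][j] = P(state j at t+1 | state i at t)
    emit:  emit[i][k]  = P(symbol k | state i)
    """
    n = len(start)
    alpha = [start[i] * emit[i][obs[0]] for i in range(n)]
    for symbol in obs[1:]:
        alpha = [
            sum(alpha[i] * trans[i][j] for i in range(n)) * emit[j][symbol]
            for j in range(n)
        ]
    return sum(alpha)

def classify_gesture(obs, models):
    """Return the gesture whose HMM assigns obs the highest probability."""
    return max(models, key=lambda g: sequence_probability(obs, *models[g]))

# Two toy one-state models over a binary observation alphabet.
models = {
    "swipe":  ([1.0], [[1.0]], [[0.9, 0.1]]),
    "circle": ([1.0], [[1.0]], [[0.1, 0.9]]),
}
print(classify_gesture([0, 0, 1, 0], models))  # -> swipe
```

Real gesture HMMs would use several hidden states and quantized sensor features, but the scoring and arg-max classification step has this shape.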

Intuitiveness is another important requirement in hand gesture recognition and entertainment systems[16]. In the commercial field, most Wii games are able to mimic the real trajectory of human motion in sports games, such as table tennis, tennis and golf, so they can easily meet the requirement of intuitiveness, although users need to hold the handle of the game machine rather than interacting with the computer directly by hand (Figure 6). The Kinect sensor, which

Fig. 6. The interaction with game handle. (Source: http://www.electric-design.co.uk)

Fig. 7. The interaction through natural gestures. (Source: http://www.ubergizmo.com)

is an attached sensor for Microsoft's Xbox 360, breaks through this limitation and implements interaction through natural gestures (Figure 7). These interfaces use hand and body gesture recognition to improve the user's gaming experience.

User acceptance is also a problem that needs to be considered in these applications[16]. Users need to remember a few commands that will be used in the process of interacting with the game. Before the start of a game, a tutorial is necessary to teach users how to use the gestures. For a beginner, it takes some time to get familiar with the process, while experienced users can quickly learn how to manipulate the game through gestures.


Fig. 8. The gesture pendant[18].

As the above analysis shows, hand gesture recognition faces some special challenges in the game industry. These issues need to be resolved before entertainment systems will eventually be accepted by the majority of users. Among the challenges mentioned above, intuitiveness should be given the most attention, and future research should focus on solving this problem.

IV. HAND GESTURE RECOGNITION IN INDUSTRY DOMAIN

Compared with the game industry, medical systems and human-robot interaction are more meaningful and have promising prospects. So let us get a deeper understanding of the development of hand gesture recognition in these two areas.

A. Medical systems and assistive technologies

As the technology has matured, hand gesture recognition has been developed to help improve medical systems. Hand gestures can be used to control medical instruments, interact with visualization displays, and help disabled people to recover more quickly[17]. Some of these scenarios have already become a reality.

Starner et al.[18] have presented a wearable device that can control medical monitoring and home automation systems by hand. More specifically, the wearable device is a gesture pendant (Figure 8) hanging around the neck. As users raise their hands or walk around, the system can also analyze their motion data. Once it gets this information, the system can help with medical diagnosis and emergency situations. Similarly, another study investigated a controller-free system for exploring medical image data via the Kinect[19]. In this system, the Microsoft Xbox Kinect is the only input device. Surgeons can interact with the device at a distance through hand gestures to navigate the patient's body (Figure 9). As we can see, the main challenge in this situation is that surgeons need to interact without any physical touch. Therefore, the controller-free system is a significant and necessary system

Fig. 9. Through hand gestures to navigate the patient’s body[19].

to be applied in the medical industry. Moreover, a doctor-computer interaction system has been developed by Graetzel et al.[20]. In this system, standard input devices, like the mouse and keyboard, are replaced by computer vision using hand gestures. This makes it more convenient for surgeons to use computers during surgery.

The systems mentioned above illustrate how hand gesture interaction techniques can play a critical role in the traditional medical industry. Nevertheless, there are still many problems in these systems. Health care is an industry that requires a high degree of accuracy, and in the process of gesture recognition any small error can lead to serious consequences. Additional research is therefore necessary to improve the accuracy and robustness of these systems.

B. Human-robot interaction

Whether for fixed or mobile robots, gesture recognition is one of the important techniques[21]. Greater attention also needs to be given to combining gestures with voice commands to improve the robustness and accuracy of a system. Rogalla et al.[22] have come up with a robot-assistant interaction system (Figure 10) in which gesture recognition and voice are combined. The system first recognizes a command through gesture recognition and voice, then executes it. The implementation trains only six gestures. As a result, the system reported a recognition rate of 95.9%.

Secondly, hand gestures carry many valuable geometric features for robot tasks that require navigation[24]; for instance, a pointing gesture can command a mobile robot to walk to a place. For a fixed robot, users may use a specific hand gesture to command it to pick up an item and place it somewhere (Figure 11). Hand movements can be used to control mechanical actions, as the human hand can imitate the form of the robot joint.

Third, for people with disabilities, when other traditional interactive devices, such as the keyboard and mouse, cannot provide good interaction, they can control a robot by hand gestures


Fig. 10. Interaction with Albert[22].

Fig. 11. The human robot interaction. (Source: http://www.idigitaltimes.com)

or voice[25]. For example, Moon et al.[23] have developed an intelligent robotic wheelchair that combines gestures and voice commands. The user can control the wheelchair by gesture and voice. When this information is input into the system, the wheelchair goes to the destination correctly. In addition, the wheelchair automatically avoids obstacles, since sonar sensors are used to improve the robustness of the system.
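Moon et al.'s fusion logic is not reproduced in the paper; one hypothetical way to combine the two modalities (an assumption for illustration, not their actual method) is to issue a command only when the gesture and voice recognisers agree:

```python
# Hypothetical multimodal fusion rule: a wheelchair command is executed
# only when the gesture recogniser and the voice recogniser report the
# same command, which reduces false activations from either modality.
from typing import Optional

def fuse_commands(gesture_cmd: Optional[str],
                  voice_cmd: Optional[str]) -> Optional[str]:
    """Return a command only if both modalities agree on it."""
    if gesture_cmd is not None and gesture_cmd == voice_cmd:
        return gesture_cmd
    return None  # disagreement or missing input: stay put

print(fuse_commands("forward", "forward"))  # forward
print(fuse_commands("forward", "left"))     # None
```

Requiring agreement trades some responsiveness for safety, which matters more for a wheelchair than for a game controller.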

Most of the methods mentioned above use stereo cameras to capture hand gestures. Moreover, some systems employ voice detection to improve recognition accuracy. The majority of these systems can only recognize static hand gestures and remember no more than about 10 gestures. In my view, the dynamic interaction of two hands is a promising field for future research.

V. CONCLUSION AND FUTURE TRENDS

The implementation of hand gesture recognition technology involves all kinds of challenges. It requires a fast response time in the entertainment industry; it requires high precision in the medical field; and it requires the user to quickly learn gesture operations in the field of robotics. At the same time, gesture recognition technology helps explain why some vision-based hand gesture systems can be more stable and mature for human-computer devices.

Compared with the entertainment industry, people have less interest in medical systems and human-robot interaction, but the significance of the latter two to human beings is enormous. They will change every aspect of human life and make our lives better and better. On the other hand, the market reaction shows that gesture recognition technology will continue to expand its influence in various fields. This technology can let humans communicate with computers in extremely natural and intuitive ways.

As for future hand gesture interfaces, there are three guidelines that help us evaluate the possibility of their acceptance:

The first one is validation. As noted above, different areas have different definitions for the same gesture. For example, we slide a finger to the right to unlock a phone, while we may use the OK gesture to indicate the beginning of an Xbox game. Going deeper, statistical methods can be used to assess the validation of a system.

Another guideline is user independence. The data glove is a vivid illustration of this concept. When you use a data glove, you need to carry it with you, and it sometimes drags a heavy cable. In addition, different people have different hand sizes. User independence is a critical guideline, since it can determine users' acceptance.

The third one is usability criteria. There are many evaluation criteria for the usability of a system. In an entertainment system, how long it takes a beginner to start the game is a good example. In a medical system, the question is whether a surgeon can examine the patient successfully using a hand gesture recognition system.

ACKNOWLEDGMENT

The author would like to thank Robert Hendley for providing useful advice in this research. Additionally, the author would also like to thank Tang Yajing, who offered encouragement and support.

REFERENCES

[1] Shah, Kinjal N. and Rathod, Kirit R. and Agravat, Shardul J., A survey on Human Computer Interaction Mechanism Using Finger Tracking, arXiv preprint arXiv:1402.0693, 2014.

[2] Rautaray, Siddharth S. and Agrawal, Anupam, Vision based hand gesture recognition for human computer interaction: a survey, Artificial Intelligence Review, 2015.

[3] Rautaray, Siddharth S. and Agrawal, Anupam, Real time multiple hand gesture recognition system for human computer interaction, International Journal of Intelligent Systems and Applications, 2012.

[4] Nagi, Jawad and Giusti, Alessandro and Di Caro, Gianni A. and Gambardella, Luca M., Human control of UAVs using face pose estimates and hand gestures, ACM, 2014.

[5] Ionescu, Bogdan and Coquin, Didier and Lambert, Patrick and Buzuloiu, Vasile, Dynamic hand gesture recognition using the skeleton of the hand, EURASIP Journal on Advances in Signal Processing, 2005.

[6] Nagi, Jawad and Giusti, Alessandro and Di Caro, Gianni A. and Gambardella, Luca M., Human control of UAVs using face pose estimates and hand gestures, ACM, 2014.

[7] Wu, Xiaoyu and Yang, Cheng and Wang, Youwen and Li, Hui and Xu, Shengmiao, An intelligent interactive system based on hand gesture recognition algorithm and Kinect, IEEE, 2012.

[8] Chen, Lulu and Wei, Hong and Ferryman, James, A survey of human motion analysis using depth imagery, Pattern Recognition Letters, 2013.


[9] Tarchanidis, Kostas N. and Lygouras, John N., Data glove with a force sensor, IEEE Transactions on Instrumentation and Measurement, 2003.

[10] Wilson, Andrew D., Using a depth camera as a touch sensor, ACM International Conference on Interactive Tabletops and Surfaces, 2010.

[11] Zhang, Zhengyou, Microsoft Kinect sensor and its effect, IEEE MultiMedia, 2012.

[12] Kolb, Andreas and Barth, Erhardt and Koch, Reinhard and Larsen, Rasmus, Time-of-Flight Cameras in Computer Graphics, Computer Graphics Forum, 2010.

[13] Mayer, Igor and Warmelink, Harald and Bekebrede, Geertje, Learning in a game-based virtual environment: a comparative evaluation in higher education, European Journal of Engineering Education, 2013.

[14] Wachs, Juan Pablo and Kolsch, Mathias and Stern, Helman and Edan, Yael, Vision-based hand-gesture applications, Communications of the ACM, 2011.

[15] Zhang, Xu and Chen, Xiang and Wang, Wen-hui and Yang, Ji-hai and Lantz, Vuokko and Wang, Kong-qiao, Hand gesture recognition and virtual game control based on 3D accelerometer and EMG sensors, Proceedings of the 14th International Conference on Intelligent User Interfaces, 2009.

[16] Chaudhary, Ankit and Raheja, Jagdish Lal and Das, Karen and Raheja, Sonia, Intelligent approaches to interact with machines using hand gesture recognition in natural way: a survey, arXiv preprint arXiv:1303.2292, 2013.

[17] Kumar, Sandeep and Raja, P., Study on Smart Wheelchair based on Microcontroller for Physically Disabled Persons, 2014.

[18] Starner, Thad and Auxier, Jake and Ashbrook, Daniel and Gandy, Maribeth, The gesture pendant: A self-illuminating, wearable, infrared computer vision system for home automation control and medical monitoring, Wearable Computers, the Fourth International Symposium on, 2000.

[19] Gallo, Luigi and Placitelli, Alessio Pierluigi and Ciampi, Mario, Controller-free exploration of medical image data: Experiencing the Kinect, Computer-Based Medical Systems (CBMS), 2011 24th International Symposium on, 2011.

[20] Graetzel, Chauncey and Fong, T. and Grange, Sebastien and Baur, Charles, A non-contact mouse for surgeon-computer interaction, Technology and Health Care, 2004.

[21] Kortenkamp, David and Huber, Eric and Bonasso, R. Peter and others, Recognizing and interpreting gestures on a mobile robot, Proceedings of the National Conference on Artificial Intelligence, 1996.

[22] Rogalla, O. and Ehrenmann, M. and Zollner, R. and Becher, R. and Dillmann, R., Using gesture and speech control for commanding a robot assistant, Robot and Human Interactive Communication, 2002. Proceedings. 11th IEEE International Workshop on, 2002.

[23] Moon, Inhyuk and Lee, Myungjoon and Ryu, Jeicheong and Mun, Museong, Intelligent robotic wheelchair with EMG-, gesture-, and voice-based interfaces, Intelligent Robots and Systems, 2003 (IROS 2003). Proceedings. 2003 IEEE/RSJ International Conference on, 2003.

[24] Kinjo, Ken and Uchibe, Eiji and Doya, Kenji, Evaluation of linearly solvable Markov decision process with dynamic model learning in a mobile robot navigation task, Value and Reward Based Learning in Neurobots, 2015.

[25] Vardhan, D. Vishnu and Prasad, P. Penchala, Hand gesture recognition application for physically disabled people, International Journal of Science and Research, 2014.