
A Method for Hand Gesture Recognition

Jaya Shukla, Department of Computer Science
Shiv Nadar University, Gautam Budh Nagar, India 203207

Ashutosh Dwivedi, Department of Electrical Engineering
Shiv Nadar University, Gautam Budh Nagar, India 203207

2014 Fourth International Conference on Communication Systems and Network Technologies

Outline

• Introduction
• The Kinect device
• Hand segmentation
• Object shape detection and feature extraction
• Experimental results
• Conclusions

Introduction

• Gestures can be defined as physical actions made by humans that convey meaningful information for interacting with the environment
• Gestures provide a non-haptic interface between our physical and cyber-physical worlds
• They are expressive and can be made with movements of the fingers, hands, arms, head, face, or body; a gesture recognition system provides a more natural way to interact

Introduction

• Based on the human body parts involved, we can categorize gestures as:

1. Hand and arm gestures: recognition of hand poses, sign languages, and entertainment applications (e.g., playing games)

2. Head and face gestures: a) nodding or shaking of the head, b) raising the eyebrows, c) looks of surprise, fear, or anger

Introduction

3. Body gestures: a) tracking the movement of people, b) analyzing the movements of a dancer, c) recognizing human gaits for medical rehabilitation and athletic training

• Among the gestures mentioned above, in verbal/nonverbal and non-haptic human interaction, hand gestures are the most expressive and the most frequently used

Introduction

• The first attempt to solve the gesture recognition problem in HCI used glove-based devices, but a glove-based interface requires the user to wear a cumbersome device
• Vision-based techniques can be used to overcome this restricted interaction; however, they face the problems of background subtraction, occlusion, lighting changes, rapid motion, and other skin-colored objects in the scene

Introduction

• These problems can be solved with the help of a depth camera
• In 2010, Microsoft launched a 3D depth-sensing camera known as Kinect
• In 2011, Ila et al. proposed an algorithm for a hand gesture recognition system, but their method requires the user to wear red gloves [5]
• In 2011, Meenakshi et al. converted the RGB information into YCbCr and segmented based on YCbCr [6]
• In 2010, Abhishek et al. used the HSV color space and the pixel values between two thresholds, hsvmax and hsvmin, for skin segmentation [7]

Introduction

• In this work, we captured images of different hand gestures, showing one, two, three, four, and five fingers, using the Microsoft Kinect
• Using a depth thresholding algorithm, we removed the background of the images, leaving only the hand images with the different gestures

Outline

• Introduction
• The Kinect device
• Hand segmentation
• Object shape detection and feature extraction
• Experimental results
• Conclusions

The Kinect device

• It consists of an infrared (IR) projector, two cameras (RGB and IR), and a multi-array microphone in a small base with a motorized pivot
• We use the OpenNI (Open Natural Interaction) library, which produces 640 × 480 RGB and depth images at 30 fps; a frame-grab sketch follows
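For illustration, a minimal depth frame grab using the primesense/openni2 Python bindings; the slides used OpenNI itself, so the package and call names here are assumptions rather than the authors' code:

```python
import numpy as np
from primesense import openni2

openni2.initialize()                      # load the OpenNI2 runtime
dev = openni2.Device.open_any()           # first attached depth sensor (Kinect)
depth_stream = dev.create_depth_stream()
depth_stream.start()

frame = depth_stream.read_frame()         # one 640 x 480 depth frame at 30 fps
depth = np.frombuffer(frame.get_buffer_as_uint16(), dtype=np.uint16)
depth = depth.reshape((frame.height, frame.width))  # raw depth in millimetres

depth_stream.stop()
openni2.unload()
```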

The Kinect device

• The Kinect depth information can be used to aid foreground/background segmentation, human face tracking, human pose tracking, skeleton tracking, etc.

Outline

• Introduction
• The Kinect device
• Hand segmentation
• Object shape detection and feature extraction
• Experimental results
• Conclusions

Hand segmentation

• The proposed algorithm works under the following assumptions:

1) The human hand is the closest object in the depth image.
2) The hand is positioned in front of the human body.
3) The distance of the hand from the camera is within some predefined range.
4) There is no occlusion between the hand and the camera.

Hand segmentation

• The depth image needs to be converted into a grayscale image so that the data can be visualized as a visible image, as follows:
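The slide's formula is missing here; the binary form below is a plausible reconstruction, consistent with the binary images mentioned in the Conclusions, with d(x, y) denoting the depth value at pixel (x, y):

\[
I(x, y) =
\begin{cases}
255, & T_1 \le d(x, y) \le T_2 \\
0, & \text{otherwise}
\end{cases}
\]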

• Here, T1 and T2 are thresholds defined by the user; in this paper we use T1 = 20 and T2 = 40

Hand segmentation

Preprocessing

• In the depth image obtained from the Kinect, there are some points for which the Kinect is unable to measure the depth value; it simply puts 0 at these points, which we treat as noise
• We assume that the space is continuous and that each pixel is highly correlated with its neighbouring pixels, so a missing depth value should match that of its nearest neighbours
• We use a nearest-neighbour interpolation algorithm to fill in these pixels, obtaining a depth array with meaningful values at every pixel

Preprocessing

• Then, we apply a median filter with a 5 × 5 window to the depth array to smooth the data; a sketch of both preprocessing steps follows
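A minimal sketch of the two preprocessing steps in Python with NumPy, SciPy, and OpenCV; the slides give no code, so the zero-as-missing convention and function choices are assumptions:

```python
import cv2
import numpy as np
from scipy import ndimage

def preprocess_depth(depth):
    """Fill unmeasured (zero) depth pixels, then smooth with a 5x5 median."""
    missing = depth == 0
    # Nearest-neighbour interpolation: for every missing pixel, look up the
    # index of the closest pixel where the Kinect did report a depth value.
    nearest = ndimage.distance_transform_edt(
        missing, return_distances=False, return_indices=True)
    filled = depth[tuple(nearest)]
    # 5x5 median filter to smooth the depth array, as in the slides.
    return cv2.medianBlur(filled.astype(np.uint16), 5)
```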

Outline

• Introduction
• The Kinect device
• Hand segmentation
• Object shape detection and feature extraction
• Experimental results
• Conclusions

Contour detection

• In contour detection, the first step is to produce a binary image showing where certain objects of interest could be located• To find contour we are using an algorithm given in OpenCV library This

algorithm retrieves the connected components from the binary image and labels them
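A sketch of the contour step with OpenCV's findContours; the toy mask and retrieval flags below are illustrative assumptions, not values from the slides:

```python
import cv2
import numpy as np

# Toy binary mask standing in for the thresholded depth image.
binary = np.zeros((480, 640), np.uint8)
cv2.circle(binary, (320, 240), 80, 255, -1)

# Retrieve and label the connected components as contours (OpenCV >= 4
# returns two values; OpenCV 3 returned three).
contours, _ = cv2.findContours(binary, cv2.RETR_EXTERNAL,
                               cv2.CHAIN_APPROX_SIMPLE)

# Under the slides' assumptions the hand is the dominant foreground object,
# so keep the largest contour.
hand = max(contours, key=cv2.contourArea)
```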

Convex hull

• The convex hull of a set of points is the smallest-area convex polygon enclosing those points. The OpenCV library has been used to calculate the convex hull, based on the algorithm proposed by Sklansky [16]

Convexity defect

• A convexity defect is a region where the contour deviates inward from its convex hull; the shapes of many complex objects are well characterized by such defects [16], as in the sketch below
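A combined sketch of the hull and defect steps, assuming `hand` is the hand contour found above; the 20-pixel defect-depth cutoff is an illustrative assumption, not a value from the slides:

```python
import cv2

# Compute the hull as point *indices*, which convexityDefects requires.
hull_idx = cv2.convexHull(hand, returnPoints=False)
defects = cv2.convexityDefects(hand, hull_idx)

# Each defect row is (start_idx, end_idx, farthest_idx, fixpt_depth), where
# fixpt_depth / 256.0 is the distance from the hull to the contour. Deep
# defects correspond to the valleys between extended fingers.
if defects is not None:
    deep = [d for d in defects[:, 0] if d[3] / 256.0 > 20.0]
    finger_estimate = len(deep) + 1
```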

Outline

• Introduction
• The Kinect device
• Hand segmentation
• Object shape detection and feature extraction
• Experimental results
• Conclusions

Experimental results

• We calculate the confusion matrix on a set of 75 images for each gesture, showing one, two, three, four, and five fingers
• For classifying the gestures we use a naive Bayes classifier, which requires only a small amount of training data to estimate the parameters necessary for classification. We classify with the machine learning toolkit Weka, open-source software written in Java; an illustrative sketch follows
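The slides classify with naive Bayes in Weka (Java). For illustration only, an equivalent naive Bayes run with a confusion matrix in Python using scikit-learn, a plain substitution for Weka, on placeholder features:

```python
import numpy as np
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import cross_val_predict
from sklearn.naive_bayes import GaussianNB

# Placeholder feature matrix: 75 images per gesture, 5 gestures. Real
# features would come from the contour/hull/defect stage (e.g. defect counts).
rng = np.random.default_rng(0)
X = rng.normal(size=(375, 3))
y = np.repeat(np.arange(1, 6), 75)      # label = number of extended fingers

pred = cross_val_predict(GaussianNB(), X, y, cv=10)
print(confusion_matrix(y, pred))
```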

Experimental results

[The original slide showed the resulting confusion matrix as a figure]

Outline

• Introduction
• The Kinect device
• Hand segmentation
• Object shape detection and feature extraction
• Experimental results
• Conclusions

Conclusions

• We obtain binary images after applying depth thresholding; the contour, convex hull, and convexity defects are obtained using image processing algorithms
• There are various potential improvements to this work as future work:

1) Ability to recognize gestures from two hands.
2) Ability to recognize gestures at different orientations and rotations.
3) Ability to recognize not only static gestures but dynamic gestures as well.