
Smart Hand Motion and Gesture Recognition

System for PC

Gayan Denaindra Perera

Submitted to the

School of Computing

In partial fulfilment of the requirements for the Degree of

Bachelor of Engineering (Hons) in

Software Engineering

Supervisor

Ms. Chathura Sooriyaarachchi

Staffordshire University

March 2016, Colombo


Approval of Project Manager

____________________________________

Project Manager (Mr. Javed, Ahsan)

I certify that this dissertation satisfies the requirements for the degree of

………………………………………………………………………………………

____________________________________

(Mrs. Malsha Fernando)

Advisor/Assessor

I certify that I have read this dissertation and that, in my opinion, it is fully

adequate in scope and quality for the degree of

………………………………………………………………………………………

____________________________________

(Ms. Chathura Sooriyaarachchi)

Supervisor


ABSTRACT

This dissertation focuses on hand gesture and motion recognition, proposing a new

architecture that lets a user's hand interact with multiple applications in real time via

hand gestures. In the first stage, the author's system detects the hand from a video

sequence. Many detection methods have been proposed for this stage, but each has

limitations such as detection speed and recognition accuracy; the author therefore

follows a vision-based technique for detecting the hand in order to overcome these

limitations. Vision-based hand detection is a challenging task that is still being actively

researched, since several environmental issues affect the detection process, such as

varying lighting conditions, backgrounds containing many objects similar to human

skin colour, and moving objects in the background.

In the preprocessing stage, the author achieves the detection step through background

subtraction, selection of the skin colour range within the colour space, thresholding,

and repeated morphological operations. During this stage, the author is concerned with

the accuracy of each function.

For gesture recognition, the author's own mathematical algorithm was used to extract

the hand features and to calculate unique feature values. Compared with other existing

similar systems, this methodology shows more than 90% accuracy.


ACKNOWLEDGMENT

I would like to thank Ms. Chathura Sooriyaarachchi and Ms. Malsha Fernando for

their continuous support and guidance in the preparation of this study. Without their

invaluable supervision, all my efforts could have been short-sighted.

Besides my parents and brother, a number of friends have always been around to

support me morally. I would like to thank them as well.

I owe quite a lot to my family, who provided me with the chance to fulfil my career

objectives and supported me throughout my studies. I would like to dedicate this

study to them as an indication of their significance in this study as well as in my

life.

Finally, I am responsible for any errors that remain in this dissertation.


Table of Contents

ABSTRACT ...................................................................................................................... iii

ACKNOWLEDGMENT ................................................................................................. iv

LIST OF FIGURES ........................................................................................................ xii

LIST OF TABLES .......................................................................................................... xvi

LIST OF ABBREVIATIONS ........................................................................................ xix

CHAPTER 1 ...................................................................................................................... 1

INTRODUCTION ............................................................................................................ 1

1.1 Problem overview…………………………………………………………………….1

1.2 Project objectives……………………………………………………………………..2

1.3 Project Scope………………………………………………………………………….3

1 Number of gesture patterns ................................................................................. 3

2 Number of motion patterns ................................................................................. 3

3 Supported applications ........................................................................................ 3

1.4 Proposed system features……………………………………………………………3

1.5 Project outline ............................................................................................................. 5

CHAPTER 2 ...................................................................................................................... 6

DOMAIN RESEARCH .................................................................................................... 6

2.1 Overview ...................................................................................................................... 6

2.2 Similar system Study .................................................................................................. 7

2.2.1 Flutter ................................................................................................. 7

2.2.2 PointGrab ........................................................................... 8

2.2.3 Wave Control ..................................................................................... 8

2.2.4 Control Air ......................................................................................... 9

2.2.5 AMD Gesture control ........................................................................ 9

2.2.6 Proposed approach. .......................................................................... 10

2.2.6.1 Overall comparison ....................................................................... 13

2.3 Human hand. ............................................................................................................. 14

2.4 Diversity of Hand ...................................................................................................... 16

2.5 Geometric features of human hand ........................................................... 16

2.6 Skin colour adaption. ................................................................................ 17


2.7 Hand gestures ............................................................................................ 18

2.8 Proposed hand gestures. ........................................................................... 19

2.9 Proposed motions ..................................................................................... 20

2.10 Supported different motion behaviors……………………………………………21

2.10.1 Vertical motion .................................................................................... 21

2.10.2 Horizontal motion ................................................................................ 22

2.11 Best Programming language ................................................................... 24

2.12 Selecting an image capture devices. ....................................................... 26

2.12.1 Sensor camera ...................................................................................... 26

2.12.2 RGB pixel camera ................................................................................ 27

CHAPTER 3 .................................................................................................................... 32

TECHNICAL RESEARCH ........................................................................................... 32

3.1 Hand detection. ......................................................................................................... 32

2.3.1 Background Subtraction......................................................................... 32

2.3.2 Image Smoothing ................................................................................... 34

2.3.3 RGB image convert to HSV .................................................................. 35

2.3.4 Image thresholding................................................................................. 37

2.3.5 Morphological Operation ....................................................................... 38

2.3.6 Arm remover .......................................................................................................... 41

2.3.7 Gesture recognition. ................................................................................... 42

Artificial Neural Network (ANN) .......................................................... 42

K-Nearest Neighbor (KNN) ................................................................... 43

Support Vector Machine (SVM) ............................................................ 43

Hidden Markov Model (HMM) ............................................................. 44

The Scale Invariant Feature Transform (SIFT) ..................................... 44

Author’s approaches. ............................................................................. 44

2.3.8 Motion detection and recognition………………………………………..49

CHAPTER 4 .................................................................................................................... 52

SYSTEM REQUIREMENT SPECIFICATION .......................................................... 52

4.1 Application Description……………………………………………………...52

4.2 Functional requirements……………………………………………………...52


4.2.1 Hand detection requirements. .......................................................... 52

4.2.2 Hand recognition requirements ........................................................ 53

4.2.3 Motion detection and recognizing requirements .............................. 53

4.2.4 Commands Executing requirements. ................................................ 54

4.3 Non- functional requirements………………………………………………………54

4.3.1 Environmental requirements ............................................................ 54

4.3.2 Performance requirements ............................................................... 54

4.3.3 Usability requirements…………………………………………......54

4.3.4 Scalability requirements ................................................................... 55

4.3.5 Maintainability requirements ........................................................... 55

4.3.6 Serviceable requirements ................................................................. 55

4.3.7 Reliability requirements ................................................................... 55

4.3.8 Hardware and Software requirements .............................................. 55

4.3.9 System requirements and specification………………………….....56

4.3.10 Application program interface requirements ................................. 56

CHAPTER 5 .................................................................................................................... 57

SYSTEM DEVELOPMENT PLAN .............................................................................. 57

5.1 System development methodologies……………………………………………….57

5.2 Gantt chart………………………………………………………………………….58

CHAPTER 6 .................................................................................................................... 59

DESIGN ........................................................................................................................... 59

6.1 Overall Block Diagram……………………………………………………………..61

6.2 UML Use-Case Diagram……………………………………………………………62

6.2.1 Main use-case diagram ........................................................................... 62

6.2.2 Use-case level 1 – Hand detection ......................................................... 63

6.2.3 Use case Specification – Level 1 ............................................................ 64

6.2.4 Use-case Level 2 – Recognizing hand posture ....................................... 65

6.2.5 Use case Specification – Level 2 ............................................................ 66

6.2.6 Use-case Level 3 – Detect and recognizing hand motion ...................... 67

6.2.7 Use case Specification – Level 3 ............................................................ 68

6.2 UML Activity diagram……………………………………………………………..69


6.2.6 Hand Detection. .............................................................................................. 69

6.2.7 Gesture Recognition………………………………………………..70

6.2.8 Detect and recognizing hand motion ..................................................... 72

6.3 UML Class diagram………………………………………………………………...74

6.3.6 Proposed software design pattern .......................................................... 74

6.4.2 Proposed design Architecture ................................................................ 74

6.4.3 Class Diagram ........................................................................................ 76

6.4.4 Class Diagram with Packages - UML Architectural diagram ............... 77

6.4.5 Facade Class........................................................................................... 77

6.4.6 HandDetection Class .............................................................................. 77

6.4.7 GestureRecognition class ....................................................................... 78

6.4.8 MotionDetection Class........................................................................... 78

6.4.9 Command Class ..................................................................................... 78

6.4.10 WebCamera Class ................................................................................ 78

6.4.11 Starter GUI ........................................................................................... 78

6.4.12 Main GUI ............................................................................................. 78

6.4.13 Feedback GUI ...................................................................................... 78

6.4.14 Camera GUI ......................................................................................... 78

6.5 Sequence Diagram .................................................................................................... 79

6.5.1 Hand Detection ...................................................................................... 79

6.5.2 Gesture recognition ................................................................................ 80

6.5.3 Motion Detection and Recognition ........................................................ 82

6.6 Wireframes and Graphical user interface……………………………………83

CHAPTER 7 ........................................................................................................... 88

IMPLEMENTATION ............................................................................................ 88

7.1 Overview ......................................................................................................... 88

7.2 Graphical user interface (GUI)……………………………………………….88

7.3 HandDetection ................................................................................................ 89

7.4 Image capturing and resizing the image ........................................................ 89

7.5 Image Smoothing ............................................................................................ 90

7.6 Background subtraction .................................................................................. 90


7.7 RGB to HSV colour conversion. .................................................................... 91

7.8 Image thresholding.......................................................................................... 92

7.9 Arm remover. .................................................................................................. 93

7.10 Detected thumb region .................................................................................. 94

7.11 Gesture recognition ....................................................................................... 95

7.12 Motion detection and recognition ................................................................. 98

7.13 Command Executing ................................................................................... 102

CHAPTER 8 ...................................................................................................... 103

TESTING ........................................................................................................... 103

8.1 Test Plan ...................................................................................................... 103

8.2 Unit Testing ................................................................................................. 103

8.2.1 Web Camera......................................................................................... 104

8.2.2 Data Read and Write ............................................................................ 104

8.2.3 Image Pre-processing ........................................................................... 104

8.2.4 Gesture Recognition............................................................................. 105

8.2.5 Motion Detection ................................................................................. 105

8.2.6 Command Execution ............................................................................ 105

8.3 Scenario Testing...................................................................................... 106

8.3.2 Morphological Operation Erosion ....................................................... 106

8.3.3 Morphological Operation Dilation ....................................................... 107

8.3.4 Image Grayscale................................................................................... 107

8.3.5 Image Thresholding ............................................................................. 108

8.3.6 Image Smooth ...................................................................................... 108

8.3.7 Converts RGB image into HSV colour space. ..................................... 109

8.3.8 Background Subtraction....................................................................... 109

8.3.9 Detect thumb region of a hand. ............................................................ 110

8.3.10 Extract hand features percentage. ...................................................... 110

8.3.11 Detect motion. .................................................................................... 111

8.3.12 Recognizing motion ........................................................................... 111

8.3 Scalability testing ......................................................................................... 112

8.3.1 Test 1. ................................................................................................... 112


8.3.2 Test 2. ................................................................................................... 113

8.3.3 Test 3 .................................................................................................... 113

8.3.4 Test 4 .................................................................................................... 114

8.3.5 Test 5 .................................................................................................... 114

8.3.6 Test 6. ................................................................................................... 115

8.4 Test Environment ............................................................................................ 116

8.4.1 Hand Detection from cluttered Background – Level 1 ........................ 116

8.4.2 Hand Detection from cluttered Background – Level 2 ........................ 117

8.4.3 Hand Detection from cluttered Background – Level 3 ........................ 117

8.4.4 Hand Detection from cluttered Background – Level 4 ........................ 118

8.4.5 Hand Detection from Static Background – Level 3 ............................. 118

8.4.6 Hand Detection from Static Background – Level 1 ............................. 119

8.4.7 Hand Detection from Static Background – Level 2 ............................. 119

8.4.8 Hand Detection from Similar skin colour background– Level 3 ......... 120

8.4.9 Hand Detection from Static Background – Level 4 ............................. 120

8.4.10 Hand Detection from Artificial lighting – White light ...................... 121

8.4.11 Hand Detection from Artificial lighting – Yellow light .................... 121

8.5 Accuracy testing ........................................................................................... 122

8.5.1 Hand detection while wearing rings ....................................................... 122

8.5.2 Hand detection while wearing a blazer .................................................... 123

8.5.3 Hand detection while wearing unusually glamorous accessories ............. 123

8.5.4 Hand detection while wearing a wrist watch ........................................... 124

8.5.5 Hand detection while wearing a glove ..................................................... 124

8.6 Performance testing ..................................................................................... 125

8.6.1 Performance of load background image. ............................................. 125

8.6.2 Performance of Hand Detection........................................................... 125

8.6.3 Performance of Gesture Recognition ................................................... 126

8.6.4 Performance of Motion Detection ....................................................... 126

8.6.5 Performance of Motion Recognition ................................................... 127

8.6.6 Performance of Overall. ....................................................................... 127

8.6.7 CPU usage and Memory usage. ........................................................... 128


8.7 Test evaluation ..................................................................................................... 128

8.7.1 Hand detection from different background .......................................... 128

8.7.2 Gesture and Motion recognition .......................................................... 129

8.7.3 Wearable conditions............................................................................. 130

CHAPTER 9 ...................................................................................................... 131

CRITICAL EVALUATION ............................................................................ 131

9.1 Domain Research .......................................................................................... 131

9.2 Technical research ........................................................................................ 131

9.3 System Design .............................................................................................. 134

9.4 System implementation ................................................................................. 135

9.5 System testing ............................................................................................... 135

CHAPTER 10 .................................................................................................... 137

CONCLUSION ................................................................................................. 137

10.1 Further Development .................................................................................. 137

10.2 Limitation and assumption .......................................................................... 138

REFERENCES .................................................................................................. 139

APPENDIX A .................................................................................................... 146

8.2.1 Web Camera............................................................................................ 146

8.2.2 Data Read and Write ............................................................................... 146

8.2.4 Image Pre-processing .............................................................................. 147

8.2.4 Gesture Recognition................................................................................ 149

8.2.5 Motion Detection .................................................................................... 151

8.2.6 Execute command ................................................................................... 152

APPENDIX B .................................................................................................... 153

APPENDIX C .................................................................................................... 154

User guide ......................................................................................................... 154

Application interfaces ....................................................................................... 157


LIST OF FIGURES

Figure 1: Flutter logo .............................................................................................. 7

Figure 2: PointGrab logo ........................................................................................ 8

Figure 3: Wave Control logo .................................................................................. 8

Figure 4: Control Air logo ...................................................................................... 9

Figure 5: AMD logo................................................................................................ 9

Figure 6: Skype logo ............................................................................................. 10

Figure 7: VLC and KMP logo .............................................................................. 11

Figure 8: Microsoft office PowerPoint ................................................................. 12

Figure 9: System volume ...................................................................................... 12

Figure 10: Image Viewer ...................................................................................... 13

Figure 11: Human hand ........................................................................................ 14

Figure 12: Hand diversities ................................................................................... 16

Figure 13: Geometrics of the hand........................................................................ 16

Figure 14: Skin Colors .......................................................................................... 17

Figure 15: Hand gestures ...................................................................................... 18

Figure 16: Proposed hand gestures ....................................................................... 19

Figure 17: Proposed motions ................................................................................ 20

Figure 18: Vertical Pattern1 .................................................................................. 21

Figure 19: Vertical Pattern 2 ................................................................................. 21

Figure 20: Vertical Pattern 3 ................................................................................. 22

Figure 21: Horizontal Pattern 1 ............................................................................. 22

Figure 22: Horizontal Pattern 2 ............................................................................. 23

Figure 23: Horizontal Pattern 3 ............................................................................. 23

Figure 24: Horizontal Pattern 4 ............................................................................. 24

Figure 25: Leap Motion ......................................................................................... 26

Figure 26: Microsoft Kinect.................................................................................. 27

Figure 27: Logitech C920 ..................................................................................... 28

Figure 28: Freetalk HD ......................................................................................... 29

Figure 29: Logitech C270 HD .............................................................................. 30


Figure 30: Background subtraction ....................................................................... 32

Figure 31: Image smoothing ................................................................................. 34

Figure 32: Erosion adapter .................................................................................... 39

Figure 33: Sample erosion .................................................................................... 39

Figure 34: Dilation adapter ................................................................................... 40

Figure 35: Sample Dilation ................................................................................... 40

Figure 36: Arm removal calculation ..................................................................... 41

Figure 37: Arm removed hand .............................................................................. 41

Figure 38: Neural network .................................................................................... 42

Figure 39: K-Nearest Neighbor ............................................................................ 43

Figure 40: Posture rotation.................................................................................... 45

Figure 41: Calculate height ................................................................................... 46

Figure 42: Percentage calculation ......................................................................... 47

Figure 43: Calculate vertical height ...................................................................... 47

Figure 44: Extract features .................................................................................... 48

Figure 45: Horizontal motion calculation ............................................................. 49

Figure 46: Vertical motion calculation ................................................................. 50

Figure 47: Spiral model ........................................................................................ 57

Figure 48: Block diagram ..................................................................................... 61

Figure 49: Main Use case diagram ....................................................................... 62

Figure 50: Hand detection - use case diagram ...................................................... 63

Figure 51: Gesture recognition - use case diagram ............................................... 65

Figure 52: motion detection and recognize -use cases .......................................... 67

Figure 53: Hand detection activity diagram.......................................................... 69

Figure 54: gesture recognition- activity diagram .................................................. 71

Figure 55: Motion detection and recognition - activity diagram .......................... 72

Figure 56: Architectural diagram .......................................................................... 77

Figure 57: Hand detection- sequence diagram ...................................................... 79

Figure 58: Gesture recognition - sequence diagram ............................................. 81

Figure 59: Motion detection and recognition- sequence diagram ........................ 82

Figure 60: Main Screen- Wireframe ..................................................................... 83


Figure 61: Main Screen – Screenshot ................................................................... 84

Figure 62: Capture background- wireframe .......................................................... 85

Figure 63: Capture background- Screenshot ......................................................... 85

Figure 64: Controller- wireframe .......................................................................... 86

Figure 65: Controller – Screenshot ....................................................................... 86

Figure 66: Feedback form – wireframe ................................................................ 87

Figure 67: Feedback form- Screenshot ................................................................. 87

Figure 68: Hand detection implementation........................................................... 89

Figure 69: Image capture ...................................................................................... 89

Figure 70: Image resize ......................................................................................... 89

Figure 71: Image smoothing .................................................................................. 90

Figure 72: Background subtraction ....................................................................... 90

Figure 73: Background pixel differences .............................................................. 90

Figure 74: Adapter-based image binarization ....................................................... 91

Figure 75: Constant colour values ........................................................................ 91

Figure 76: Colour space conversion ..................................................................... 91

Figure 77: Colour space conversion ..................................................................... 92

Figure 78: Colour space conversion ..................................................................... 92

Figure 79: Image thresholding .............................................................................. 92

Figure 80: Calculate threshold value .................................................................... 93

Figure 81: Thresholding initialization .................................................................. 93

Figure 82: Arm remover ....................................................................................... 93

Figure 83: Detect thumb region ............................................................................ 94

Figure 84: Gesture recognition implementation ................................................... 95

Figure 85: Calculate height ................................................................................... 95

Figure 86: Calculate feature level 1 ...................................................................... 95

Figure 87: Calculate feature level 2 ...................................................................... 96

Figure 88: Calculate feature level 3 ...................................................................... 96

Figure 89: recognize thumb region ....................................................................... 96

Figure 90: Check posture model one .................................................................... 97

Figure 91: Load posture model two ...................................................................... 97


Figure 92: Motion detection and recognition implementation ............................. 98

Figure 93: Add into motion list ............................................................................. 98

Figure 94: Detect horizontal motion ..................................................................... 99

Figure 95: Call commands 1 ............................................................................... 100

Figure 96: Call Commands 2 ............................................................................... 101

Figure 97: Hand detection performance ............................................................ 128

Figure 98: gesture recognition performance ....................................................... 129

Figure 99: Step 1 guide ....................................................................................... 154

Figure 100: Step 2 guide ..................................................................................... 155

Figure 101: Step3 guide ...................................................................................... 155

Figure 102: Step 4 guide ..................................................................................... 156

Figure 103: Step 5 guide ..................................................................................... 156

Figure 104: KMP player ..................................................................................... 157

Figure 105: VLC player ...................................................................................... 158

Figure 106: Power point presentation ................................................................. 158

Figure 107: Skype ............................................................................................... 159

Figure 108: Image viewer ................................................................................... 159


LIST OF TABLES

Table 1: VLC and KMP player comparison ......................................................... 11

Table 2: Overall system comparison..................................................................... 13

Table 3: Device comparison ................................................................................. 30

Table 4: Extracted features ................................................................................... 48

Table 5: Hand detection - use case specification ................................................... 64

Table 6: Gesture recognition - use case specification ........................................... 66

Table 7: Motion detection and recognition - use case specification ..................... 68

Table 8: Key event execution ............................................................................. 102

Table 9: Web camera unit tests ........................................................................... 104

Table 10: Data read and write unit tests.............................................................. 104

Table 11: Image processing unit tests ................................................................. 104

Table 12: Gesture recognition unit tests ............................................................. 105

Table 13: Motion detection unit tests .................................................................. 105

Table 14: Command unit tests ............................................................................ 105

Table 15: Erosion test ......................................................................................... 106

Table 16: Dilation test ......................................................................................... 107

Table 17: Image grayscale test ............................................................................ 107

Table 18: Image thresholding test ....................................................................... 108

Table 19: Image smoothing test .......................................................................... 108

Table 20: Colour conversion test ........................................................................ 109

Table 21: Background subtraction test ............................................................... 109

Table 22: Thumb region detection test ............................................................... 110

Table 23: Features extraction test ....................................................................... 110

Table 24: Motion detection test .......................................................................... 111

Table 25: Motion recognition test ....................................................................... 111

Table 26: Scalability test 1 .................................................................................. 112

Table 27: Scalability test 2 .................................................................................. 113

Table 28: Scalability test 3 .................................................................................. 113

Table 29: Scalability test 4 .................................................................................. 114

Table 30: Scalability test 5 .................................................................................. 114


Table 31: Scalability test 6 ................................................................................... 115

Table 32: Environment test 1 .............................................................................. 116

Table 33: Environment test 2 .............................................................................. 117

Table 34: Environment test 3 .............................................................................. 117

Table 35: Environment test 4 .............................................................................. 118

Table 36: Environment test 5 .............................................................................. 118

Table 37: Environment test 6 .............................................................................. 119

Table 38: Environment test 7 .............................................................................. 119

Table 39: Environment test 8 .............................................................................. 120

Table 40: Environment test 9 .............................................................................. 120

Table 41: Environment test 10 ............................................................................ 121

Table 42: Environment test 11 ............................................................................ 121

Table 43: Accuracy test 1 ................................................................................... 122

Table 44: Accuracy test 2 ................................................................................... 123

Table 45: Accuracy test 3 ................................................................................... 123

Table 46: Accuracy test 4 ................................................................................... 124

Table 47: Accuracy test 5 ................................................................................... 124

Table 48: Performance test 1 .............................................................................. 125

Table 49: Performance test 2 .............................................................................. 125

Table 50: Performance test 3 .............................................................................. 126

Table 51: Performance test 4 .............................................................................. 126

Table 52: Performance test 5 .............................................................................. 127

Table 53: Performance test 6 .............................................................................. 127

Table 54: Performance test 7 .............................................................................. 128

Table 55: Unit test 1 ............................................................................................ 146

Table 56: Unit test 2 ............................................................................................ 146

Table 57: Unit test 3 ............................................................................................ 146

Table 58: Unit test 4 ............................................................................................ 147

Table 59: Unit test 5 ............................................................................................ 147

Table 60: Unit test 6 ............................................................................................ 147

Table 61: Unit test 7 ............................................................................................ 148


Table 62: Unit test 8 ............................................................................................ 148

Table 63: Unit test 9 ............................................................................................ 148

Table 64: Unit test 10 .......................................................................................... 149

Table 65: Unit test 11 .......................................................................................... 149

Table 66: Unit test 12 .......................................................................................... 149

Table 67: Unit test 13 .......................................................................................... 150

Table 68: Unit test 14 .......................................................................................... 150

Table 69: Unit test 15 .......................................................................................... 150

Table 70: Unit test 16 .......................................................................................... 151

Table 71: Unit test 17 .......................................................................................... 151

Table 72: Unit test 18 .......................................................................................... 152


LIST OF ABBREVIATIONS

ANN Artificial Neural Network

HMM Hidden Markov Model

SIFT Scale Invariant Feature Transform

BOF Bag-of-Features

VCA Video Content Analysis

RGB Red, Green, Blue

GUI Graphical User Interface

MVC Model View Controller

HSV Hue, Saturation, Value

DOF Degrees of Freedom

CMYK Cyan, Magenta, Yellow, Key (black)

HSL Hue, Saturation, Lightness

SDK Software Development Kit

API Application Programming Interface

HD High Definition

JDK Java Development Kit

WPF Windows Presentation Foundation


CHAPTER 1

INTRODUCTION

1.1 Problem overview

Traditional human-computer interaction devices such as the keyboard and mouse

have become less effective in the recent past; they do not provide completely

successful interaction between humans and technology. Therefore, an effective

method is required to perform tasks in different applications in a virtual environment,

as people's lifestyles depend heavily on technology.

With the intention of bridging this gap, a step forward needs to be taken to improve

the interaction between humans and technology by using biological reaction methods

such as speech, gestures, facial expressions and body expressions. Using these cues,

scientists have created the newest directions in human-computer interaction, based on

advanced modern input devices such as motion detection sensors, touch screens,

wearable tools, microphones, micro cameras and actuators. The main drawback of

these devices is that they are expensive and do not provide a comfortable experience.

To overcome these issues, the author introduces a new way of interacting with a

computer with the aid of hand gesture reactions captured by a normal web camera.

The advantage of the approach the author introduces is that it improves interaction

with the computer, or any other device, as a non-contact, human-like input modality.


1.2 Project objectives.

The author defines the project objectives using the SMART criteria. SMART is an

acronym that stands for Specific, Measurable, Attainable, Realistic and Timely.

Specific – The application should be able to identify hand gesture patterns and

different motion directions within minimal time, and these hand gestures will be

able to control the most relevant Windows applications in use today.

Measurable – Six gestures can be recognized by the developed system, together

with the two common motions, vertical and horizontal. The author considered

Skype, KMP Player, VLC player, Windows image gallery, system volume,

Microsoft Office PowerPoint presentations and application shut-down, to be

controlled using 21 different commands.

Attainable – During the first three and a half months provided to complete the

project, the author will analyze the project domain areas such as image

processing techniques and methods, object detection approaches, object

recognition approaches and feature extraction methods. The author will then

complete a fully functional software implementation using the above-mentioned

techniques during the last three months.

Realistic – During the first three months, the research is conducted and

documented. The implementation of the introduced software, taking all of its

functions into account, will be completed within the last three months.

Timely – The introduced system contains four major areas: hand detection,

gesture recognition, motion detection and motion recognition. In order to

complete developing the software on time, the author has to manage time

efficiently and effectively according to a schedule. For instance, the domain and

technical research is fully completed during the first three and a half months;

during the first month of the final period the system is able to detect a hand,

during the second month it is able to recognize the hand posture, and at the final

stage the developer completes recognition of the hand posture together with its

motion.

1.3 Project Scope.

1. Number of gesture patterns.

The user is able to operate the application using 6 specific gestures. These

gestures are simple and memorable for the user. The author has decided not to

increase the number of gestures, as the user would not be able to memorize a

large number of gesture patterns, which would make the application less usable

and more difficult to understand.

2. Number of motion patterns.

Fourteen hand motions are available, based on horizontal and vertical directions.

In the horizontal direction the user can swipe from left to right or right to left;

in the vertical direction the user can swipe from top to bottom or bottom to top.

3. Supported applications

The developed application will allow the user to control popular applications

such as KMP player, VLC player, Skype, PowerPoint presentations, image

viewer and system volume. These applications are the most flexible when

considering their level of interaction with the user's hand, and they are widely

used by a large number of PC users worldwide.

1.4 Proposed system features.

The proposed system will be able to control 6 popular applications in common

use today: KMP player, VLC player, image viewer, PowerPoint presentations,

Skype and system volume (see the sketch after this list).

The proposed system will be able to recognize 6 hand gestures.

The proposed system will be able to recognize 14 motion behaviors based on

vertical and horizontal directions.

The proposed system will be able to respond within 2.99 seconds.

The proposed system will be able to operate under different lighting conditions,

such as natural lighting and artificial white and yellow lighting.

The proposed system will be able to handle different hand sizes, such as small,

medium and large.

The proposed system will be able to operate with wearable items such as rings,

hand bands and gloves.

The proposed system will be able to run on minimum system requirement

specifications.

The system will be able to run on Windows XP and higher operating systems;

using boot adapter software, it will work on Mac and Linux as well.
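Several of the features above amount to mapping a recognized gesture and motion onto a command in the target application (cf. Table 8: Key event execution). The sketch below shows one common way such commands could be issued on Windows, by synthesizing keyboard shortcuts with java.awt.Robot; the gesture names and the key mapping here are purely hypothetical illustrations and do not represent the author's actual 21-command set.

    import java.awt.AWTException;
    import java.awt.Robot;
    import java.awt.event.KeyEvent;

    // Illustrative only: maps a recognized gesture/motion pair to a keyboard
    // shortcut sent to the active window. The mapping is a hypothetical
    // example, not the author's actual command table.
    public class CommandExecutor {

        private final Robot robot;

        public CommandExecutor() throws AWTException {
            this.robot = new Robot();
        }

        public void execute(String gesture, String motion) {
            if ("open-palm".equals(gesture) && "swipe-left".equals(motion)) {
                pressKey(KeyEvent.VK_LEFT);   // e.g. previous slide in PowerPoint
            } else if ("open-palm".equals(gesture) && "swipe-right".equals(motion)) {
                pressKey(KeyEvent.VK_RIGHT);  // e.g. next slide
            } else if ("fist".equals(gesture)) {
                pressKey(KeyEvent.VK_SPACE);  // e.g. play/pause in a media player
            }
        }

        private void pressKey(int keyCode) {
            robot.keyPress(keyCode);
            robot.keyRelease(keyCode);
        }
    }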


1.5 Project outline

In Chapter 1, the author introduces the problem overview and the objectives of the

entire project, and sets out the project scope, including gesture patterns, motion

patterns, supported applications and system features.

In Chapter 2, the domain area of this project is covered: mainly similar systems, the

diversity and geometric features of the human hand, the proposed motions and

gestures, the programming language selected for the implementation, and the

suggested capture device.

In Chapter 3, the available algorithms are discussed, along with which of them are

most suitable for the author's approach.

In Chapter 4, the system requirement specification for developing the end-user

product is presented.

In Chapter 5, the system development plan for this project is introduced.

In Chapter 6, the high-priority functions are designed, showing how they flow during

the running of this system.

In Chapter 7, the author discusses the main highlights of the implementation.

In Chapter 8, the test plan and the various testing techniques for this system are covered.

In Chapter 9, the overall dissertation is critically evaluated.

In Chapter 10, system limitations and assumptions, and future improvements, are

mentioned.


CHAPTER 2

DOMAIN RESEARCH

2.1 Overview

At present there is a great deal of interest in hand tracking and detection approaches.

The human hand is the part of the body most used by individuals to interact with the

digital world, and it is a highly articulated structure with 27 degrees of freedom

(DOF). The high DOF of the hand indeed makes hand gesture recognition an

extremely challenging task.

The hand gesture recognition approach has three stages. The first stage is hand

posture detection from the captured image frame, and the second stage is recognition

of the detected hand posture and its motion, similar to natural human behaviour.

Segmentation is the first of all the subtasks and the most challenging step of this

entire project. Finally, the recognized hand moves on to motion recognition. This

level is the most difficult to implement, because motion recognition affects the time

and accuracy of the overall system, and since the author is not going to control this

system via a sensor camera it becomes the major challenge of this application.
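As a rough illustration of how these three stages could be chained together, the following is a minimal skeleton in Java; the class and method names here are hypothetical and are not the author's actual design, which is presented in Chapters 6 and 7.

    import java.awt.image.BufferedImage;

    // Hypothetical three-stage pipeline skeleton; the real classes are
    // described in Chapters 6 and 7.
    public class GesturePipeline {

        public void processFrame(BufferedImage frame) {
            // Stage 1: segment the hand posture from the captured frame.
            BufferedImage handMask = detectHand(frame);
            if (handMask == null) {
                return; // no hand present in this frame
            }
            // Stage 2: recognize which supported gesture the posture matches.
            String gesture = recognizeGesture(handMask);
            // Stage 3: track the hand over time, classify the motion
            // (e.g. a horizontal or vertical swipe) and issue a command.
            String motion = recognizeMotion(handMask);
            executeCommand(gesture, motion);
        }

        private BufferedImage detectHand(BufferedImage frame) { return frame; }      // segmentation stub
        private String recognizeGesture(BufferedImage mask) { return "open-palm"; }  // recognition stub
        private String recognizeMotion(BufferedImage mask) { return "swipe-left"; }  // motion stub
        private void executeCommand(String gesture, String motion) { }               // command stub
    }

Each stage simply hands its result to the next, mirroring the stage ordering described above.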

Past researchers have found many different ways to segment the hand from a given

image: wearing gloves or finger markers has been used to extract the posture, and

other approaches rely on controlled environments, such as a uniform background like

a black or white curtain. However, wearing gloves or using a uniform background

reduces the natural feeling of the interaction between the user and the system.

The author's approach must be able to detect a hand and recognize the gesture against

a cluttered background using skin detection and background subtraction, and finally

use feature extraction to recognize the hand posture easily and quickly in any

environment.
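To make this concrete, the following is a minimal sketch of such a preprocessing chain, assuming the OpenCV Java bindings purely for illustration; the threshold and HSV skin-range values are placeholder assumptions, and this is not the author's actual implementation, which is described in Chapters 3 and 7.

    import org.opencv.core.*;
    import org.opencv.imgproc.Imgproc;

    // Illustrative segmentation sketch (requires the OpenCV native library,
    // e.g. System.loadLibrary(Core.NATIVE_LIBRARY_NAME), before use).
    public class HandSegmentation {

        public Mat segment(Mat backgroundBgr, Mat frameBgr) {
            // 1. Smooth the frame to reduce camera noise.
            Mat smoothed = new Mat();
            Imgproc.GaussianBlur(frameBgr, smoothed, new Size(5, 5), 0);

            // 2. Background subtraction: keep only pixels that changed
            //    relative to the previously captured background image.
            Mat diff = new Mat();
            Core.absdiff(backgroundBgr, smoothed, diff);
            Mat diffGray = new Mat();
            Imgproc.cvtColor(diff, diffGray, Imgproc.COLOR_BGR2GRAY);
            Mat foreground = new Mat();
            Imgproc.threshold(diffGray, foreground, 40, 255, Imgproc.THRESH_BINARY);

            // 3. Skin-colour thresholding in HSV space (example range only).
            Mat hsv = new Mat();
            Imgproc.cvtColor(smoothed, hsv, Imgproc.COLOR_BGR2HSV);
            Mat skin = new Mat();
            Core.inRange(hsv, new Scalar(0, 30, 60), new Scalar(25, 180, 255), skin);

            // 4. Combine the masks, then clean them with morphological opening
            //    (erosion followed by dilation) to remove small speckles.
            Mat mask = new Mat();
            Core.bitwise_and(foreground, skin, mask);
            Mat kernel = Imgproc.getStructuringElement(Imgproc.MORPH_ELLIPSE, new Size(5, 5));
            Imgproc.erode(mask, mask, kernel);
            Imgproc.dilate(mask, mask, kernel);
            return mask; // binary mask of the candidate hand region
        }
    }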


Different researchers propose various techniques for hand detection, such as

glove-based methods, sensor cameras and vision-based methods. The data-glove-based

methods depend on a sensor device to digitize hand and finger movements into

multiple parametric data. However, such sensor devices are quite expensive, wearing

gloves or trackers is uncomfortable, and they lengthen the "time to interface", or setup

time.

For the current approach, vision-based methods were chosen over the other approaches

since they are more natural and better suited to a real-time application.

2.2 Similar system Study

Hand gesture applications are currently available on both mobile and desktop

platforms. On the mobile platform, there are many applications available from the

Google, Windows and iOS stores, but these apps have very limited features, such as

controlling the camera and photo slideshows. Desktop applications likewise contain

limited features, such as controlling a media player or image viewer with a small set

of gestures. Following a study of the current similar systems for both mobile and

desktop platforms, the author's approach is examined at the end of this section.

2.2.1 Flutter

Figure 1: Flutter logo

Source: Flutter

Flutter can play and pause a currently playing song or movie using gestures of the user's hand. It detects the user's gesture via the webcam and responds accordingly. Flutter can handle only media players such as Spotify, QuickTime, Winamp, and iTunes.


The drawbacks of Flutter are that it supports only one gesture, used for play/pause, and that it may be affected by different lighting conditions. Flutter is also a non-GUI application.

2.2.2 Point grab

Figure 2: PointGrab logo

Source: PointGrab

PointGrab's flexible technology is designed for rapid integration into new and existing products as a software-only solution that can be integrated into any consumer electronic device, including mobile devices, PCs, and home appliances. The drawbacks are that none of the platform versions performs well without a sensor webcam, and the application supports only a small collection of gestures (point-grab, 2015). It also controls only a limited set of applications and is a non-GUI application.

2.2.3 Wave Control

Figure 3: Wave Control logo

Source: Wave Control

Wave Control makes it possible to control music and video playback without touching the screen, simply by waving a hand over the phone. Its drawbacks are that it is still only available as a beta version, user reviews are not favourable, and it does not work when the phone screen is locked. It also controls only the media player, with a limited set of single gestures (GoogleStore, 2014).

2.2.4 Control Air

Figure 4: Control Air logo

Source: Control Air

ControlAir is an app that uses the Mac's camera to identify the user's finger movements, allowing the user to control the media player. Its drawbacks are that ControlAir supports only the media player and only one single gesture (ControlAir, 2015).

2.2.5 AMD Gesture control

Figure 5: AMD logo

Source: AMD

AMD Gesture Control enables controlling the media player, image gallery, and e-reader virtually hands-free. This application provides vertical/horizontal motions with one gesture. The major drawback is that the app cannot be used on every PC, because running it requires an AMD A-Series processor.


2.2.6 Proposed approach.

The application allows controlling the applications most valued in a user's daily life, such as Skype, KMP player, VLC player, PowerPoint presentations, the image viewer, and the system volume. Below, the author describes the reasons for choosing these applications over others and lists the available command gestures.

Skype

Figure 6: Skype logo

Source: Skype

Voice calling is today one of the most powerful communication methods for contacting each other around the world, and finding the most capable and popular software people use for it is a real challenge. There is a lot of software on the market, such as Viber, Skype, Google chat, and WhatsApp. According to Scene (2011), Skype received the highest reviews for video calling. Skype is easy to obtain and install, and its video quality is very accurate compared with other video calling software. According to Freemake (2015), Skype is multi-platform, so it can be used on Linux, iOS, Android, Windows, BlackBerry, Amazon Fire, Smart TVs, and the PlayStation Vita as well. Skype supports both online and local calling. This project supports answering an incoming call, ignoring a call, and muting the microphone: the proposed system offers the user three functions, namely mute the microphone, ignore the call, and hang up.


KMP player and VLC Player

Figure 7: VLC and KMP logo

Source: VLC and KMP

Controlling a media player is another feature of this approach: the proposed approach lets the user control pause/play and the forward and backward functions. The author had difficulty choosing between VLC and KMP player as the player most used today. According to Beebom (2015), the biggest difference is that KMP player supports 3D, 2K, and 4K UHD, whereas VLC supports only a limited set of video streams. According to Macxdvd (2015), VLC is fully portable and can therefore be used on most platforms, while KMP player does not support every platform. The comparison chart below helps to understand the feature differences between the KMP and VLC players.

Media Player: KMP
Developer: Pandora TV
Operating system: Windows 7, Windows Vista, Windows XP
Mobile OS: Windows OS
Features: Remote controllable, Equalizer, Media library, Pitch shifting, Streaming, Time stretching, Visualizer
Additional features: Altering of playback speed, highly customizable configuration

Media Player: VLC
Developer: The VideoLAN Organization
Operating system: Linux, Mac OS X, Windows 7, Windows Vista, Windows XP
Mobile OS: Android
Features: Equalizer, Media library, Pitch shifting, Streaming, Time stretching, Visualizer
Additional features: Support for DVDs of all regions, A/V sync adjustments

Table 1: VLC and KMP player comparison


CNET user reviews favour the VLC player, while SOFTONIC favours the KMP player, so the software selection alone cannot establish which is the most downloaded application. The author therefore decided to support controlling both. The proposed application handles play/pause and forward/backward for the user.

PowerPoint Presentation

Figure 8: Microsoft office PowerPoint

Source: Microsoft office

PowerPoint is offered as part of the Microsoft Office package and is the leading presentation software for academic and business presentation purposes. This project allows controlling a presentation by moving the slides forward and backward.

Operating system volume

Figure 9: System volume

Source: Author’s work

The operating system volume is an in-built facility, and this proposed approach is able to turn the volume up, turn it down, and mute the speakers.


Image Viewer

Figure 10: Image Viewer

Source: Iconachive

The image viewer is a default in-built application; the proposed approach is able to move forward and backward through images and to zoom in and zoom out.

Closing Application

If the user needs to close any application, the proposed approach is able to handle this command as well.

2.2.6.1 Overall comparison

Application | Available gestures | Available motions | Graphical user interface | Supported applications | Accuracy | User friendly
Flutter | 1 | 0 | Low | Low | Medium | Low
Point grab | 2 | 4 | Low | Medium | Medium | Low
Wave Control | 1 | 2 | Low | Low | Medium | Low
Control Air | 2 | 2 | Medium | Low | Medium | Medium
AMD gesture control | 1 | 4 | Medium | Medium | Medium | Medium
Proposed approach | 6 | 4 | High | High | High | High

Table 2: Overall system comparison


This proposed approach aims to control the most popular applications listed above. It offers six memorable, distinctive gestures and motion types, more than the other similar existing applications. Comparing accuracy, the proposed system aims to be at the highest level while supporting a larger set of gestures and motions. The approach provides a better graphical user interface based on user experience, and it supports more applications than the other systems. When the program runs, the application shows a virtual hand so the user can see how the camera is focusing on them. The application will be able to run on Windows XP or higher, and, using a boot adapter, the application can also be supported on Mac and Linux.

2.3 Human hand.

Figure 11: Human hand

Source: Healthline

Healthline (2015) explains that hands are capable of a wide variety of functions, involving both gross and fine motor movements. Gross motor movements allow grasping large objects or performing heavy labour, while fine motor movements allow delicate tasks such as holding small objects or performing detailed work. According to Taylor (2015), considering the physical structure of the human hand, each hand contains 27 individual bones, which give the hand an incredible range and accuracy of motion. The forearm's ulna and radius support the many muscles that manipulate the bones of the hand and wrist. There are eight small carpal bones in the wrist area, consistently bound in two rows of four bones each. The mass formed by these bones is called the carpus. The carpus is rounded on its proximal end, where it articulates with the ulna and radius at the wrist. The carpus is slightly concave on the palmar side, forming a canal known as the carpal tunnel through which tendons, ligaments, and nerves extend into the palm. Its distal surface articulates with the metacarpal bones, which are joined to the carpus by the palmar carpometacarpal ligaments.

Finger

Digits that extend from the palm of the hand.

Palm

This is the bottom of the body of the hand.

Back

The back of the hand shows the dorsal venous network.

Wrist

The linking point between the arm and the hand, the wrist allows improved hand motion. Excluding the wrist, each hand has 19 bones: the palm contains five metacarpals, and each finger except the thumb contains one proximal phalanx, one middle phalanx, and one distal phalanx. The thumb does not have a middle phalanx. Each bone is connected by a sequence of ligaments (Healthline, 2015).


2.4 Diversity of Hand

Medlej (2014) notes that human hands vary between individuals just as much as facial features do. Diversity occurs essentially between male and female hands, between young and old, and across country regions as well. This diversity can be measured by the width of the palm area, the nail shape, and the height of the hand.

Figure 12: Hand diversities

Source: Tutorplus

2.5 Geometric features of human hand

Figure 13: Geometrics of the hand

Source: Tutorplus


Medlej (2014) describes that one hand contains five fingers, all of which fold into the palm area. Geometrically, the thumb and forefinger give the largest opening bend, the little finger and ring finger give the second largest opening bend, and the forefinger, middle finger, and ring finger give almost the same size of opening bend in the hand. The maximum angle between the thumb and the little finger is practically nearly 90°, taken from the very base of the thumb articulation.

2.6 Skin colour adaption.

Dennis (1998) explains that human skin colour mainly depends on a pigment called melanin, which is controlled by six genes. Both light-skinned and dark-skinned people have melanin. Two forms are created: pheomelanin, "which is red to yellow in colour, and eumelanin, which is black brown to pure black". People with light skin mostly produce pheomelanin, while those with dark-coloured skin typically produce eumelanin. In addition, individuals differ in the number and size of melanin particles, and these two variables are even more significant in determining skin colour than the proportions of the different types of melanin. "Lighter skin colour is also affected by red cells in blood flowing closer to the skin".

Figure 14: Skin Colors

Source: Human Bio


2.7 Hand gestures

Pavlovic and Sharma (1997) describe hand gestures as a form of non-verbal communication among people. To make use of gestures in human-computer interaction, it is necessary to provide the means by which they can be interpreted by computers. The use of hand gestures provides an attractive alternative to cumbersome interface devices for human-computer interaction, and the visual interpretation of hand gestures can help in achieving the ease and naturalness desired for HCI. According to Chen and Kim (2014), keyboard and mouse devices play a significant role in human-computer interaction at present, but with the rapid development of the software and hardware industries, new types of HCI methods are required. To fill this gap, hand gestures have come into the industry as a way to interact with technology. Hand gestures consist of two types, known as static gestures and dynamic gestures: a static gesture is a posture without any motion, whereas a dynamic gesture has motion associated with the posture. In this section, the author introduces the different postures and the suitable postures with motion.

Figure 15: Hand gestures

Source: Healthline (2016)


2.8 Proposed hand gestures.

Figure 16: Proposed hand gestures

Source: Authors work based on google (2016)

These six postures are comfortable and memorable for the user. The application is able to support these postures with both the right and the left hand. Motions are added to these postures, as discussed in the next section.


2.9 Proposed motions

The motion depends on the x and y coordinates and supports both horizontal and vertical directions. These motions are easy for the user's hand to perform, and particular postures have their own unique motion directions.

Posture type Motion direction

Horizontal motion gesture

Horizontal motion gesture

Vertical motion gesture

Non-motion gesture

Non-motion gesture

Non-motion gesture

Figure 17: Proposed motions


2.10 Supported different motion behaviors.

2.10.1 Vertical motion

Pattern 1 – Normal motion from bottom to top and top to bottom.

Figure 18: Vertical Pattern1

Source: Author’s work based on google (2016)

Pattern 2 – Curve motion from top to bottom and bottom to top

Figure 19: Vertical Pattern 2

Source: Author’s work based on google (2016)


Pattern 3 – Zigzag motion from top to bottom and bottom to top

Figure 20: Vertical Pattern 3

Source: Author’s work based on google(2016)

2.10.2 Horizontal motion

Pattern 1 – normal motion from left to right and right to left.

Figure 21: Horizontal Pattern 1

Source: Author’s work based on google(2016)


Pattern 2 – Curve motion from left to right and right to left.

Figure 22: Horizontal Pattern 2

Source: Author’s work based on google (2016)

Pattern 3 – Zigzag motion from left to right and right to left.

Figure 23: Horizontal Pattern 3

Source: Author’s work based on google (2016)


Pattern 4 – Zigzag motion from left to right and right to left.

Figure 24: Horizontal Pattern 4

Source: Author’s work based on google (2016)

2.11 Best Programming language

For developing image processing projects, the most popular programming languages are C++, C#, and Java, and there are also frameworks such as MATLAB available for this kind of work. The author explains below the advantages and disadvantages of each language and finally settles on the best language for implementing this project.

C++

C++ is a comparatively low-level language and the first object-oriented language at that level. Most algorithm-heavy projects, such as image processing, video games, and operating systems, are written in C++. For a project like this one C++ would perform well, but there are some issues with UI development, and on the other hand a dangerous situation is that it can run into dependency errors. A developer who is already very comfortable with C++ can go ahead with it without worries.


JAVA

Java is a high-level, powerful, purely object-oriented language today. It depends on the JDK and runs on the Java Virtual Machine. Considering Java for this project, there are some benefits, such as image processing APIs and libraries and the huge community around the Java language. However, it is still time-consuming in terms of algorithm execution time. Therefore the author's view is that Java is the best language for beginners, but it is too slow for a real-time image processing project.

C#

C# is a language similar to Java, developed by Microsoft in 2000. C# takes concepts from C++ and C, is object oriented, and executes on the .NET framework. The above-mentioned Java and C++ are platform independent, whereas C# works only on the Windows platform. The author can justify C# as a good programming language for an approach like this one, and there is a strong community around the C# language. Comparing the JDK with .NET, C# on .NET provides better execution speed than Java. Beginners can easily understand the concepts, and Visual Studio provides developers with high-level WPF GUI design for C#.


2.12 Selecting an image capture devices.

Capturing the hand is one of the most important points of this project: before pre-processing the image, the application should be able to capture a clear image of the hand against the background. Capture devices can be divided into two technologies, known as sensor cameras and RGB pixel cameras. This section covers a device comparison for suitable hand capture, focusing on pixel quality, frames per second (fps), durability, and price.

2.12.1 Sensor camera

Leap motion

Leap Motion is breaking new ground in gesture control with high-precision 3D interaction on the desktop; even though specific applications for 3D software have not yet appeared, the capability is there, all packed into a device smaller than a mouse or mobile phone (Mings, 2012).

Figure 25: Leap Motion

Source: Leap Motion (2016)

According to leapmotion (2015), Leap Motion uses a combination of hardware and software to create a 3D model of the hands. It offers different platforms on which others can build many applications, resulting in gesture-controlled computing, games, art, and programming. The Leap Motion SDK provides an easy way to develop applications, with extensive API documentation and resources. As an advantage, the Leap Motion sensor does not depend on environmental conditions or lighting effects. The drawback is that it requires a Leap Motion device, and there would be an issue at the marketing level, since its current market price is $76.99.

Microsoft Kinect

The latest Kinect sensor and SDK provide companies and developers with everything they need to create interactive applications that respond to gesture, voice, and movement. The v2 sensor and SDK 2.0 allow creating Kinect v2 applications and reaching more potential customers than ever, from businesses to consumers to other developers (Microsoft, 2015).

Figure 26: Microsoft Kinect

Source: Kinect(2016)

Related features include giving voice and gesture commands to applications, playing games where the user is the controller, and calling friends and family with Skype in HD. It also introduces broadcasting gameplay live with picture-in-picture, automatic recognition and sign-in, and an included Dance Central Spotlight download token. Compared with the Leap Motion, the Microsoft Kinect is a good choice, but it is still much more expensive, with a price of $874.99 (Kinect, 2015).

2.12.2 RGB pixel camera


Logitech c920

Lendino (2012) describes the Logitech C920 as a good-looking camera. It is made completely of glossy and matte black plastic, with a clear plastic cover over the body. The design is not as classy as Microsoft's LifeCam Studio, but it is smoother and more practical, particularly with regard to microphone placement. Overall, the C920 gives better performance as a webcam, including 720p (1280-by-720 pixels) and 1080p (1920-by-1080 pixels) resolution. The autofocus, light balance, and framing are outstanding, and the C920 provides 29 frames per second at 1080p. Additionally, the C920 adds a stereo microphone and a Carl Zeiss lens with onboard H.264 video compression. Considering it as the core device for this application, the cost of $69.99 is its only drawback.

Figure 27: Logitech C920

Source: Logitech (2016)


Freetalk HD Camera

Martin (2010) notes that the Freetalk HD webcam is smartly designed with a double-hinged mount that allows it to sit on top of just about any monitor, and it provides a resolution of 1280x720 at a frame rate of 22 fps; if the user switches to VGA, the frame rate increases up to 30 fps. Additionally, the autofocus lens captures images that are always nice and sharp, and colours are surprisingly faithful, even in the presence of strong backlighting. The drawbacks are that during testing the frame rate stayed near 22 fps even when using the separate VGA mode, never reaching 30 fps directly, and that the webcam costs $60.99. This is also inconvenient as well as uncommon, since most webcams have microphones.

Figure 28: Freetalk HD

Source: Freetalk (2016)

Logitech C270HD (Proposed webcam)

Skinner (2010) reports that this device measures 70x18x30 mm and will easily fit in the palm of your hand. It is ideal for someone who does not want to ruin the aesthetics of their laptop or monitor but still wants a fully functional webcam. The Logitech C270 HD webcam captures video at a resolution of up to 1280x720 and 30 frames per second (fps), while still images are captured at 3 MP. The footage captured by the webcam was smooth with no pixelation, although colours looked slightly washed out (Skinner, 2010). On the other hand, the user does not need to pay much money: it costs just $30.99. Overall, comparing price, resolution, and fps, the C270 is a good choice.

Figure 29: Logitech C270 HD

Source: Logitech (2016)

Device | Vision type | Pixel value | FPS (frames per second) | Price | Image resolution | Dimensional level
Leap motion | Sensors | Not supported | 230 fps | $79.99 | 640x360 | 3D
Microsoft Kinect | Sensors | Not supported | 30 fps | $874.99 | 1920x1080 | 3D
Logitech c920 | RGB pixels | 15 MP | 29 fps | $69.99 | 1920x1080 | 2D
Freetalk HD Camera | RGB pixels | 2 MP | 22-30 fps | $60.99 | 1280x720 | 2D
Logitech C270HD | RGB pixels | 3 MP | 30 fps | $30.99 | 1280x720 | 2D

Table 3: Device comparison


According to the data in the above table, the Logitech C270HD provides the better set of features for $30.99. The proposed application's minimum requirements in terms of pixels, frames per second, and resolution are met effectively, and a cost of $30.99 is not too expensive for general users.


CHAPTER 3

TECHNICAL RESEARCH

3.1 Hand detection.

2.3.1 Background Subtraction

Background subtraction is a widely used method for segmenting moving and non-moving objects. According to Collins (2000), there are three main approaches to detecting objects: temporal differencing, background subtraction, and optical flow. Temporal differencing is very adaptive to dynamic environments, but generally provides poor performance when extracting the full relevant pixel area from the background. Optical flow can be used to detect objects even in the presence of camera movement; its disadvantage is that optical flow approaches are computationally complex and cannot be applied to full-frame video streams in real time. Most research papers report background subtraction as the most successful method. Background subtraction provides complete feature data, but it is extremely sensitive to dynamic scene changes due to lighting conditions and to moving objects appearing behind the scene. The image below shows how background subtraction compares with the other approaches.

Figure 30: Background subtraction

Source: Collins (2016)


(a) When the stationary car starts to move, background subtraction leaves 'holes' where the object used to be, whereas frame differencing or optical flow does not recover the entire moving object, as shown in image (b).

Haritaoglu et al. (1998) describe a frame-difference approach based on a model of the background obtained while the scene contains no objects. The background image is modelled by representing each pixel with three values: its minimum and maximum intensity values and the maximum intensity difference between consecutive frames. Foreground objects are then segmented from the background in each frame of the video sequence by a four-stage process of image thresholding, noise cleaning, morphological filtering, and object detection. Xinggui (1992) states that background subtraction and image differencing are the two main principles of object detection in image processing. In background subtraction, a unique background frame is captured in advance and every current image is subtracted from that background frame; the object is finally obtained by setting the threshold value to 40 and converting to a binary image. In image differencing, the previous image is subtracted from the current frame, which is then converted to a binary image to obtain the object. Comparing the final results of these two techniques, image differencing is not a good approach for detecting objects from the background: the result contains a high level of image noise, and extracting a clean detected object from the binary image involves many additional image processing techniques, which strongly affects the accuracy of a real-time application.

Background subtraction follows two major principles, known as adaptive and non-adaptive methods. The non-adaptive method, according to (2010), depends on a small number of video frames and does not maintain a unique background model, and it has many limitations and drawbacks. The adaptive background subtraction method, on the other hand, maintains a background model whose parameters evolve over time. The adaptive method is used for VCA applications because it is able to analyse video automatically to detect and determine temporal events that are not based on a single image.


fi : a pixel in the current frame, where i is the frame index.

μ : the corresponding pixel of the background model (fi and μ are located at the same position).

di : the absolute difference between fi and μ.

bi : the background/foreground mask (0: background, 0xFF: foreground).

T : the threshold.

α : the learning rate of the background.

i) di = |fi − μ|

ii) If di > T, fi belongs to the foreground; otherwise, it belongs to the background.
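As an illustration of the rule above, the following minimal C# sketch (not the author's implementation) classifies each pixel of a grayscale frame against the background model held in a plain array. The running-average update μ ← α·fi + (1 − α)·μ is assumed here as the way the learning rate α is applied, since the update equation itself is not written out in the text.

public static class BackgroundSubtractor
{
    public static byte[,] Classify(byte[,] frame, double[,] background,
                                   double threshold, double alpha)
    {
        int rows = frame.GetLength(0), cols = frame.GetLength(1);
        var mask = new byte[rows, cols];                  // b_i: 0 = background, 0xFF = foreground

        for (int r = 0; r < rows; r++)
        {
            for (int c = 0; c < cols; c++)
            {
                // d_i = |f_i - mu|, compared against the threshold T.
                double d = System.Math.Abs(frame[r, c] - background[r, c]);
                mask[r, c] = d > threshold ? (byte)0xFF : (byte)0x00;

                // Assumed adaptive update of the background model (learning rate alpha).
                background[r, c] = alpha * frame[r, c] + (1 - alpha) * background[r, c];
            }
        }
        return mask;
    }
}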

2.3.2 Image Smoothing

Figure 31: Image smoothing

Source: Author’s work (2016)

To improve the accuracy of background subtraction it is very important to reduce image noise, because noise can lead to poor subtraction. Matematikos (2015) divides image smoothing into two major classes, linear filters and non-linear filters: the uniform, triangular, and Gaussian filters are linear filters, while the non-linear filters include the median filter and the Kuwahara filter. Comparing performance, the non-linear median filter gives the better result for image smoothing. Madisetti and Williams (1999) describe the median filter as based on a window that moves over the input image and each time computes the output pixel as the median of the brightness values within the input window. However, this window-based algorithm takes considerable time to smooth the image, so it is not good practice to apply it in a real-time application. The Gaussian blur from the linear filters avoids much of the execution time of the median filter. According to Waltza and Millerb (1998), the Gaussian filter allows larger kernels to be decomposed into the sequential application of small kernels, working as separate row and column operations. Its smoothing accuracy is not as high as the median filter's, but this does not affect the author's approach. The Gaussian function in one dimension is given by

G(x) = \frac{1}{\sqrt{2\pi\sigma^2}} \, e^{-\frac{x^2}{2\sigma^2}}

In two dimensions it is the product of two such Gaussians, one in each dimension:

G(x, y) = \frac{1}{2\pi\sigma^2} \, e^{-\frac{x^2 + y^2}{2\sigma^2}}

where x is the distance from the origin along the horizontal axis, y is the distance from the origin along the vertical axis, and σ is the standard deviation of the Gaussian distribution (Nixon et al., 2008).
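The separable row/column evaluation described by Waltza and Millerb can be sketched as the following illustrative C# routine on plain arrays; the kernel radius and sigma are example parameters, not values taken from the dissertation.

public static class GaussianSmoother
{
    public static double[,] Smooth(double[,] img, double sigma, int radius)
    {
        // Build and normalise the 1-D Gaussian kernel G(x).
        var kernel = new double[2 * radius + 1];
        double sum = 0;
        for (int x = -radius; x <= radius; x++)
        {
            kernel[x + radius] = System.Math.Exp(-(x * x) / (2 * sigma * sigma));
            sum += kernel[x + radius];
        }
        for (int i = 0; i < kernel.Length; i++) kernel[i] /= sum;

        // Apply the kernel along rows, then along columns (separability).
        return Convolve(Convolve(img, kernel, horizontal: true), kernel, horizontal: false);
    }

    private static double[,] Convolve(double[,] img, double[] k, bool horizontal)
    {
        int rows = img.GetLength(0), cols = img.GetLength(1), radius = k.Length / 2;
        var result = new double[rows, cols];
        for (int r = 0; r < rows; r++)
            for (int c = 0; c < cols; c++)
            {
                double acc = 0;
                for (int i = -radius; i <= radius; i++)
                {
                    // Clamp at the image border.
                    int rr = horizontal ? r : System.Math.Min(System.Math.Max(r + i, 0), rows - 1);
                    int cc = horizontal ? System.Math.Min(System.Math.Max(c + i, 0), cols - 1) : c;
                    acc += k[i + radius] * img[rr, cc];
                }
                result[r, c] = acc;
            }
        return result;
    }
}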

2.3.3 RGB image convert to HSV

Background subtraction alone is not enough to detect the hand reliably, so the system needs further processing to extract the hand from the posture. To enhance detection, the author introduces a skin colour detection method to improve hand segmentation.

In computer vision, finding the colour space that best captures the human skin region is a significant task. According to Tebal and Pulau (2013), RGB, normalized RGB, YCbCr, YIQ, YUV, HSV, YDbDr, and CIE L*a*b all contain comparable skin colour regions. Of these, YIQ gives the highest separability between skin and non-skin pixels as measured by the F1-measure, but the overall results emphasize that pixel colour information alone cannot achieve accurate skin detection via the YIQ colour space, and it gives different results under different lighting conditions. Park (2013) observes that YCbCr is insensitive to colour variation under different lighting conditions.

The conversion equation is

\begin{bmatrix} Y \\ C_b \\ C_r \end{bmatrix} =
\begin{bmatrix}
 0.299000 &  0.587000 &  0.114000 \\
-0.168736 & -0.331264 &  0.500000 \\
 0.500000 & -0.418688 & -0.081312
\end{bmatrix}
\begin{bmatrix} R \\ G \\ B \end{bmatrix}

Y is the luminance, Cb the blue-difference component, and Cr the red-difference component. From the above formula the selected colour ranges are Y from 0 to 255, Cr from 77 to 127, and Cb from 133 to 173; according to Hong and Yang (2013), Cb from 96 to 143 and Cr from 132 to 164. The drawback of YCbCr is that the colour space covers only a limited skin colour region.

Nelsons et al. (2004) note that, compared with YCbCr, the HSV colour space is independent of intensity, and according to Suresh et al. (2014) HSV gives the best performance for skin pixel detection: it is the colour space best adapted to skin-colour detection, it is compatible with human skin colour perception, and it covers a large skin colour region. By definition, H is the hue component, which represents a pure colour such as pure red, brown, or green; S is the saturation, which provides a measure of the degree to which a pure colour is diluted by white light; and V is the value, which attempts to represent brightness along the grey axis from white to black. Jun and Hua (2008) give the conversion from the RGB to the HSV model as follows.

H = \begin{cases} \theta, & G \ge B \\ 2\pi - \theta, & G < B \end{cases}

S = \frac{\max(R, G, B) - \min(R, G, B)}{\max(R, G, B)}

V = \frac{\max(R, G, B)}{255}

\theta = \arccos\left( \frac{\left[(R - G) + (R - B)\right] / 2}{\left[(R - G)^2 + (R - B)(G - B)\right]^{1/2}} \right)

The skin model range is, according to Suresh et al. (2004), 0° < H < 20° and 75 < S < 190, and according to Jun and Hua (2008), 0° < H < 50°, 0.2 < S < 0.68, and 0.35 < V < 1.0. Under artificial lighting the model becomes 0 ≤ H ≤ 30 and 60 ≤ S ≤ 160, and under natural lighting the range is 0 ≤ H ≤ 15 and 20 ≤ S ≤ 120. After testing with different users and different lighting conditions, the author chose the range 0° < H < 38°, 17 < S < 60, 32 < V < 74 as the most appropriate.
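A minimal C# sketch of the conversion and of the author's chosen range is given below. It follows the θ-based formulas above, and it assumes that the quoted S and V limits (17-60 and 32-74) are percentages, since the text does not state their units explicitly.

public static class SkinDetector
{
    public static bool IsSkin(byte r, byte g, byte b)
    {
        double R = r, G = g, B = b;
        double max = System.Math.Max(R, System.Math.Max(G, B));
        double min = System.Math.Min(R, System.Math.Min(G, B));

        // theta from the arccos formula; clamp to avoid rounding outside [-1, 1].
        double denom = System.Math.Sqrt((R - G) * (R - G) + (R - B) * (G - B));
        double arg = denom == 0 ? 1.0 : ((R - G) + (R - B)) / (2 * denom);
        arg = System.Math.Max(-1.0, System.Math.Min(1.0, arg));
        double theta = System.Math.Acos(arg);

        double hRad = G >= B ? theta : 2 * System.Math.PI - theta;  // H in radians
        double h = hRad * 180.0 / System.Math.PI;                   // H in degrees
        double s = max == 0 ? 0 : (max - min) / max * 100.0;        // S, assumed as a percentage
        double v = max / 255.0 * 100.0;                             // V, assumed as a percentage

        // Author's chosen skin range from the paragraph above.
        return h > 0 && h < 38 && s > 17 && s < 60 && v > 32 && v < 74;
    }
}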

2.3.4 Image thresholding

Thresholding is a common object segmentation strategy in image processing, and finding a suitable thresholding algorithm carries considerable risk. According to DongiuLiu and Yu (2009), the grey level that characterizes an object in a grey image can be found with many thresholding methods; objects are extracted from their background based on either the statistics of a one-dimensional histogram of grey levels or a two-dimensional histogram of grey levels. These kinds of image processing issues are easily solved using the Otsu algorithm, which is one of the most successful methods for image thresholding. The K-means method is objectively equivalent to the Otsu algorithm, although the two algorithms differ in performance and in the accuracy of their objective. DongiuLiu and Yu (2009) state that the difference is that the Otsu method searches for a global threshold, while K-means is based on local optimal values. On the other hand, the Otsu method needs a grayscale image before thresholding, whereas K-means does not, so K-means can be more efficient than Otsu; comparing the final thresholding results, however, there is not much difference between them.

One disadvantage is that the K-means method needs the threshold value before thresholding the image and cannot calculate it from the histogram as Otsu does. Most of the time, image thresholding has to work in different locations and under different lighting conditions, so the threshold cannot be a fixed value; it may change. Such cases cannot be handled using the K-means method, so the author considers Otsu the most appropriate algorithm: Otsu automatically calculates the threshold value, and the overall algorithm does not depend on the lighting or environmental conditions.

Greensted (2010) gives the weighted sum of the within-class variances as

\sigma_\omega^2(t) = \omega_1(t)\,\sigma_1^2(t) + \omega_2(t)\,\sigma_2^2(t)

Otsu showed that minimizing the intra-class variance is the same as maximizing the inter-class variance:

\sigma_b^2(t) = \sigma^2 - \sigma_\omega^2(t) = \omega_1(t)\,\omega_2(t)\,\left[\mu_1(t) - \mu_2(t)\right]^2
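The search for the Otsu threshold can be sketched as the following minimal C# routine, which scans a 256-bin grayscale histogram and keeps the threshold t that maximizes the between-class variance.

public static class OtsuThreshold
{
    public static int Compute(int[] histogram)               // histogram.Length == 256
    {
        long total = 0;
        double weightedSum = 0;
        for (int i = 0; i < 256; i++) { total += histogram[i]; weightedSum += i * (double)histogram[i]; }

        double sumBackground = 0, maxBetweenVar = -1;
        long countBackground = 0;
        int bestThreshold = 0;

        for (int t = 0; t < 256; t++)
        {
            countBackground += histogram[t];                  // class 1 weight (unnormalised)
            if (countBackground == 0) continue;
            long countForeground = total - countBackground;   // class 2 weight (unnormalised)
            if (countForeground == 0) break;

            sumBackground += t * (double)histogram[t];
            double mean1 = sumBackground / countBackground;                  // mu_1(t)
            double mean2 = (weightedSum - sumBackground) / countForeground;  // mu_2(t)

            // Between-class variance: omega_1 * omega_2 * (mu_1 - mu_2)^2 (up to a constant factor).
            double betweenVar = (double)countBackground * countForeground
                                * (mean1 - mean2) * (mean1 - mean2);
            if (betweenVar > maxBetweenVar) { maxBetweenVar = betweenVar; bestThreshold = t; }
        }
        return bestThreshold;
    }
}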

2.3.5 Morphological Operation

After the hand-detected image is obtained, it may still contain noise issues such as scattered white pixels, and these issues cause problems when recognizing the hand gestures. To avoid them, the author applies morphological operations as a noise-removal filter. Efford (2000) describes morphological image processing as a collection of non-linear operations related to the shape of elements in an image; morphological operations rely on the relative ordering of pixel values, not on their numerical values. They are not restricted to binary images and can operate on grayscale images as well. For removing noise, morphology defines two operations, known as erosion and dilation. According to Peter (2007), a morphological operation probes an image with a small template called a structuring element (SE). This structuring element is compared against the whole image; it is a matrix of zeros and ones that defines its shape and size, and it is positioned at all possible locations and compared with the corresponding neighbourhood of pixels.


Erosion

Efford (2000): "The erosion of a binary image f by a structuring element s produces a new binary image g = f ⊖ s with ones in all locations (x, y) of the structuring element's origin at which that structuring element s fits the input image f, i.e. g(x, y) = 1 if s fits f and 0 otherwise, repeating for all pixel coordinates (x, y)."

Figure 32: Erosion adapter

Source: Efford (2015)

A- Input picture

B- Structuring element

C- Result

Figure 33: Sample erosion

Source: Efford (2015)


Dilation

Efford (2000): "The dilation of an image f by a structuring element s produces a new binary image g = f ⊕ s with ones in all locations (x, y) of the structuring element's origin at which that structuring element s hits the input image f, i.e. g(x, y) = 1 if s hits f and 0 otherwise, repeating for all pixel coordinates (x, y)." Dilation has the opposite effect of erosion: it adds a layer of pixels to both the inner and outer boundaries of regions.

Figure 34: Dilation adapter

Source: Efford (2015)

A- Input Image

B- Structure element

C- Output image

Figure 35: Sample Dilation

Source: Efford (2015)
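A minimal C# sketch of the two operations on a binary mask with a 3x3 square structuring element is shown below; the structuring element size is an illustrative choice, not a value stated in the dissertation.

public static class Morphology
{
    public static bool[,] Erode(bool[,] src) => Apply(src, requireAll: true);
    public static bool[,] Dilate(bool[,] src) => Apply(src, requireAll: false);

    private static bool[,] Apply(bool[,] src, bool requireAll)
    {
        int rows = src.GetLength(0), cols = src.GetLength(1);
        var dst = new bool[rows, cols];
        for (int r = 0; r < rows; r++)
            for (int c = 0; c < cols; c++)
            {
                bool all = true, any = false;
                for (int dr = -1; dr <= 1; dr++)
                    for (int dc = -1; dc <= 1; dc++)
                    {
                        int rr = r + dr, cc = c + dc;
                        bool v = rr >= 0 && rr < rows && cc >= 0 && cc < cols && src[rr, cc];
                        all &= v;
                        any |= v;
                    }
                // Erosion keeps a pixel only if the element fully fits;
                // dilation sets a pixel if the element hits any white pixel.
                dst[r, c] = requireAll ? all : any;
            }
        return dst;
    }
}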


2.3.6 Arm remover

The detected hand still contains features that are unnecessary for gesture recognition, such as the wrist and the arm, and these features may cause abnormal results during the recognition stage. It is therefore important to remove the arm before recognizing each posture. This arm-removal algorithm was implemented by the author based on a wrist-width calculation: when the system scans the image from bottom to top it obtains almost constant width values in the arm area, and when it finishes scanning the arm area the width increases suddenly at the wrist.

Figure 36: Arm removal calculation

Source: Author’s work (2015)

In the arm area the width is given by l1 ≈ l2 ≈ l3, and the width then increases suddenly at the wrist, so that l3 < l4 < l5. The increase that marks the wrist starts at l3, and the arm width was found to be around 29 pixels after testing. This constant value was identified by the author by testing 6,000 hand postures. The results for different users and hand sizes are given in the table below.

Figure 37: Arm removed hand

Source: Author’s work (2015)

User 1 User 2 User 3 User 4
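A minimal C# sketch of this wrist-based cut is shown below. It is not the author's code: it assumes the binary mask is stored with row 0 at the top and the arm extending to the bottom edge, and it treats a row width jumping above the roughly constant arm width (about 29 pixels, per the author's tests) as the wrist, which is how the description above is read here.

public static class ArmRemover
{
    public static bool[,] RemoveArm(bool[,] mask, int armWidth = 29)
    {
        int rows = mask.GetLength(0), cols = mask.GetLength(1);
        int wristRow = rows - 1;

        for (int r = rows - 1; r >= 0; r--)                  // scan bottom to top
        {
            int width = 0;
            for (int c = 0; c < cols; c++) if (mask[r, c]) width++;

            if (width > armWidth) { wristRow = r; break; }   // sudden increase: wrist found
        }

        // Clear everything below the wrist so only the hand remains.
        var result = (bool[,])mask.Clone();
        for (int r = wristRow + 1; r < rows; r++)
            for (int c = 0; c < cols; c++) result[r, c] = false;
        return result;
    }
}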


2.3.7 Gesture recognition.

After completing all image pre-processing stages, the next step is to recognize the hand gesture. Several machine learning algorithms are available for this task, such as ANN, HMM, SVM, SIFT, and KNN. The author describes the characteristics of each algorithm and finally puts forward the most suitable method.

Artificial Neural Network (ANN)

A neural network is defined by Dr. Robert Hecht-Nielsen as "a computer system made up of a number of simple, highly interconnected processing elements, which process information by their dynamic state response to external input."

Caudill (1989) explains that a neural network is typically organized in three layers. These layers are made up of a number of interconnected 'nodes' which contain an 'activation function'. Patterns are presented to the network via the 'input layer', which communicates with one or more 'hidden layers' where the actual processing is done through a system of weighted 'connections'. The hidden layers then link to an 'output layer' where the answer is produced, as shown in the image below. According to Doya and Wang (2015), this algorithm can be applied to technological applications such as pattern recognition, time series prediction, signal processing, control, and soft sensors, all of which make significant use of neural network concepts and techniques.

Figure 38: Neural network

Source: Caudill (2015)


K-Nearest Neighbor (KNN)

Sayad (2010) describes K-nearest neighbours as a machine learning algorithm that stores all available cases and predicts the numerical target based on a similarity measure. KNN has been used in statistical estimation and pattern recognition since the beginning of the 1970s as a non-parametric technique. According to Thrirumruganathan (2010), KNN has no explicit training phase, so training is fast: during training, KNN simply keeps all the training data, and the final decision is made based on the entire training data set. The image below gives an example of KNN. According to Coomans and Massart (1982), "the green circle should be classified either to the first class of blue squares or to the second class of red triangles. If k = 3 (solid line circle) it is assigned to the second class because there are 2 triangles and only 1 square inside the inner circle. If k = 5 (dashed line circle) it is assigned to the first class".

Figure 39: K-Nearest Neighbor

Source: Sayad (2015)

Support Vector Machine (SVM)

Lamp (2012) describes the SVM as a machine learning algorithm used for classification or regression problems. SVM is based on a technique called the kernel trick to transform the user's data, and based on these transformations it finds an optimal boundary between the possible outputs. Simply put, it performs some extremely complex data transformations and then works out how to separate the user's data set according to the labels or outputs the user has defined. According to Scikit-learn (2015), "SVM is effective in high dimensional spaces and still effective in cases where the number of dimensions is greater than the number of samples". However, if the number of features is much greater than the number of samples, the SVM gives poor results.

Hidden Markov Model (HMM)

Blunsom (2004) describes the Hidden Markov Model (HMM) as a powerful machine learning method for modelling sequential data that can be described by an underlying process generating an observable sequence. HMMs can be applied in many areas such as signal processing, and in particular speech processing, but they have also been applied with success to low-level NLP tasks such as part-of-speech tagging, phrase chunking, and extracting target information from documents.

The Scale Invariant Feature Transform (SIFT)

Lowe (2004) notes that an important aspect of this approach is that it generates large numbers of features that densely cover the image over the full range of scales and locations; a typical image of size 500x500 pixels gives rise to about 2,000 stable features. The quantity of features matters for object recognition, where the ability to detect small objects in cluttered backgrounds requires that at least three features be correctly matched from each object for reliable identification. SIFT computes its features through four stages, known as scale-space extrema detection, keypoint localization, orientation assignment, and the keypoint descriptor.

Author’s approaches.

The machine learning algorithms mentioned above all need large and expensive training data sets or templates. For example, the data set for each posture should cover different lighting conditions, different environments, and different users' hands, and to obtain a good result the system would require at least 400 templates for each hand gesture. Since this proposed approach supports six gestures, the system would need to be trained with 400 x 6 = 2,400 templates. The drawback is that, once the system detects the user's hand, it would have to go through all 2,400 templates and compare their features, which is too time-consuming: the response would arrive only after four or five seconds. Within the author's scope, however, applications such as media players and Skype need a response time of at most two seconds, otherwise the system's usability would be lost. For this reason the author introduces his own algorithm based on a mathematical framework, which completes recognition within about 80 milliseconds. The algorithm has two levels: level one recognizes the thumb in a posture, and level two recognizes the feature percentages of the hand.

To recognize the thumb, once the binary processing is completed the image is rotated 90° anticlockwise.

Figure 40: Posture rotation

Source: Author’s work (2016)


After the image rotation is completed, the next step is to identify the upper and lower x and y coordinates of the posture.

Figure 41: Calculate height

Source: Author’s work (2016)

The height of the thumb region (h) is given by

h \approx (X' - X)/3

Having obtained the height of the thumb region, the algorithm counts the total white pixels belonging to that region and then calculates the thumb region's overall percentage of the hand. Different users and different lighting conditions may give different thumb values, so the author fixes a maximum and a minimum thumb-region percentage as constant values.

8 8 9 9 9 10 11 11 12 12 12 13 14 15 16 16 17 17 17 18

Possible thumb region values under different conditions.

The test results above give the minimum and maximum values as 8% and 18%. They were obtained by testing over 7,500 posture templates with different users and different lighting conditions. After the thumb value is recognized, the system focuses on the features of the different hand postures. The concept of this algorithm is to calculate the percentage of each feature part of the hand.
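The level-1 thumb check could look like the following C# sketch. It assumes the thumb band is the top third of the rotated posture (height h = (X' - X)/3) and that the 8%-18% range is expressed relative to the total white pixels of the posture, which is how the text above is read here.

public static class ThumbDetector
{
    public static bool HasThumb(bool[,] mask, int top, int bottom)   // top = X, bottom = X'
    {
        int cols = mask.GetLength(1);
        int bandEnd = top + (bottom - top) / 3;                      // h = (X' - X) / 3

        int bandPixels = 0, totalPixels = 0;
        for (int r = top; r <= bottom; r++)
            for (int c = 0; c < cols; c++)
                if (mask[r, c])
                {
                    totalPixels++;
                    if (r < bandEnd) bandPixels++;                   // white pixels in the thumb band
                }

        double percentage = totalPixels == 0 ? 0 : 100.0 * bandPixels / totalPixels;
        return percentage >= 8 && percentage <= 18;                  // author's tested range
    }
}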


To understand the algorithm, consider a square with 25% of its area coloured, as in the image below. Mathematically, the 25% value does not depend on the size of the square; it gives the same value every time.

Template image

Coloured as 25%

from square

Coloured same scale

in a large image

Coloured same scale

in a small image

Figure 42: Percentage calculation

Source: Author’s work (2016)

To recognize the hand posture, the system identifies the upper X, Y coordinates and the lower X', Y' coordinates and calculates the height (h) of the posture. From the calculated height, the algorithm characterizes three features of the hand.

Figure 43: Calculate vertical height

Source: Author’s work (2016)


The height is given by

h = X' - X

All three features are given by,

Figure 44: Extract features

Source: Author’s work (2016)

f' = h/3, \quad f'' = f' + h/3, \quad f''' = f'' + h/3

After f', f'', and f''' are calculated, the next stage is to count the total pixels of each covered area separately. To finish, the algorithm needs the percentage of each section area. Example output is shown in the table below.

f' 8,8,9,9,9,9,9,12,11,11,11,13,13,13,11,14,14,11,11,15,15,14,12,12,12,12,9,9,9,8,8……..

f'' 28,28,19,19,19,19,19,22,21,21,21,23,23,23,21,24,14,21,21,25,25,24,22,14,21,21,……..

f''' 38,38,39,39,39,39,39,32,31,31,31,33,33,33,31,34,34,31,31,35,35,34,32,34,31,31,………

Table 4: Extracted features

Source: Author’s work (2016)

The five-finger (open hand) posture above gave these f', f'', and f''' values over 1,000 templates, but it does not give exactly the same values for every sample. The values may depend on the lighting and background conditions, so the author fixes the maximum and minimum values of each covered section, for example f' (8%-15%), f'' (14%-28%), f''' (31%-39%).
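The level-2 feature extraction can be sketched as the following C# routine. The bands are the three thirds of the posture height, and the percentages are computed here relative to the total white pixels of the posture; the dissertation does not state the base of the percentage explicitly, so this is an assumption, and the open-hand ranges from Table 4 are used only as an example.

public static class FeatureExtractor
{
    // Returns { f', f'', f''' } as percentages of all white pixels in the posture.
    public static double[] BandPercentages(bool[,] mask, int top, int bottom)
    {
        int cols = mask.GetLength(1);
        int h = bottom - top;                        // h = X' - X
        var bandCounts = new int[3];
        int total = 0;

        for (int r = top; r < bottom; r++)
            for (int c = 0; c < cols; c++)
                if (mask[r, c])
                {
                    total++;
                    int band = System.Math.Min((r - top) / System.Math.Max(h / 3, 1), 2);
                    bandCounts[band]++;
                }

        var percentages = new double[3];
        for (int i = 0; i < 3; i++)
            percentages[i] = total == 0 ? 0 : 100.0 * bandCounts[i] / total;
        return percentages;
    }

    // Example check against the open-hand ranges quoted above (Table 4).
    public static bool MatchesOpenHand(double[] p) =>
        p[0] >= 8 && p[0] <= 15 && p[1] >= 14 && p[1] <= 28 && p[2] >= 31 && p[2] <= 39;
}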


2.3.8 Motion detection and recognition

Webopedia (2015) defines motion detection as the capability of a surveillance system to detect motion in the monitored scene. There are many approaches to detecting motion in a continuous video stream, but all of them are based on comparing the current frame with previous frames or with a background frame. For calculating the direction of the motion, however, the system cannot use those approaches directly. For this reason the author changed the behaviour of the traditional frame-difference algorithm to improve the accuracy of the algorithm. The author's system is able to identify straight vertical and horizontal movements as well as the non-straight (curve and zigzag) movements described earlier. After the recognition process is completed, the algorithm takes the X and Y coordinates and collects 20 user postures with different movements. For each posture it takes the new X'' and Y'' coordinates, compares them with the previous coordinates, and finally identifies the overall direction in which the hand is moving. The figures below define how horizontal movement is identified.

Figure 45: Horizontal motion calculation

Source: Author’s work (2016)

X and Y coordinates

of recognition stage

X’ and Y’ coordinates

of motion stage

X’’ and Y’’

coordinates of

motion stage

X’’’ and Y’’’

coordinates of motion

stage


The final horizontal motion is calculated as follows. The motion direction is right to left if

X - X' > 0,

X - X'' > 0,

X - X''' > 0

The motion direction is left to right if

X - X' < 0,

X - X'' < 0,

X - X''' < 0

The vertical motion is identified in the same way.

Figure 46: Vertical motion calculation

Source: Author’s work (2016)

The final vertical motion is calculated as follows. The motion direction is bottom to top if

Y - Y' > 0,

Y - Y'' > 0,

Y - Y''' > 0

X and Y coordinates of

recognition stage

X’ and Y’ coordinates

of motion stage

X’’ and Y’’ coordinates

of motion stage

X’’’ and Y’’’ coordinates

of motion stage


The motion direction is top to bottom if

Y - Y' < 0,

Y - Y'' < 0,

Y - Y''' < 0
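The direction tests above translate directly into a small helper. The C# sketch below mirrors the sign conventions of the dissertation, with X, Y from the recognition stage and X', X'', X''' and Y', Y'', Y''' from the motion stage.

public enum MotionDirection { None, LeftToRight, RightToLeft, TopToBottom, BottomToTop }

public static class MotionRecognizer
{
    public static MotionDirection Horizontal(int x, int x1, int x2, int x3)
    {
        // All differences positive: hand moved right to left; all negative: left to right.
        if (x - x1 > 0 && x - x2 > 0 && x - x3 > 0) return MotionDirection.RightToLeft;
        if (x - x1 < 0 && x - x2 < 0 && x - x3 < 0) return MotionDirection.LeftToRight;
        return MotionDirection.None;
    }

    public static MotionDirection Vertical(int y, int y1, int y2, int y3)
    {
        // All differences positive: bottom to top; all negative: top to bottom.
        if (y - y1 > 0 && y - y2 > 0 && y - y3 > 0) return MotionDirection.BottomToTop;
        if (y - y1 < 0 && y - y2 < 0 && y - y3 < 0) return MotionDirection.TopToBottom;
        return MotionDirection.None;
    }
}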


CHAPTER 4

SYSTEM REQUIREMENT SPECIFICATION

4.1 Application Description

The system will be provided as a desktop application supporting six memorable gestures and motions and six very commonly used applications. This section describes the system's functional requirements, non-functional requirements, API usage, and hardware and software requirements.

4.2 Functional requirements

The functional requirements discuss the required behaviours and describe what functional behaviour the system should provide.

4.2.1 Hand detection requirements.

The application needs to capture a background image.

The image is resized to improve the performance of the pre-processing techniques.

Each captured posture (with the hand) needs to be subtracted from the background image.

The application converts the RGB image to the HSV colour space.

After the colour space conversion, the application needs a grayscale image for thresholding.

The application thresholds the image to obtain a binary image.

To remove noise, the application should apply morphological operations.

To improve the accuracy of gesture recognition, the application needs to remove the arm from the hand-detected posture.

A minimal code sketch of this pipeline is given after this list.
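The requirement list above maps naturally onto the Emgu CV Image<,> API used in this project. The following is only an illustrative sketch, not the author's implementation: the threshold value (40, taken from the earlier discussion of background subtraction), the smoothing kernel size, the erosion/dilation iteration counts, and the HSV limits are assumptions.

using Emgu.CV;
using Emgu.CV.Structure;

public static class HandDetectionSketch
{
    public static Image<Gray, byte> Detect(Image<Bgr, byte> background, Image<Bgr, byte> frame)
    {
        // Resize to the 150 x 150 working frame and smooth both images.
        var bg  = background.Resize(150, 150, Emgu.CV.CvEnum.Inter.Linear).SmoothGaussian(5);
        var cur = frame.Resize(150, 150, Emgu.CV.CvEnum.Inter.Linear).SmoothGaussian(5);

        // Background subtraction, grayscale conversion, and thresholding to a binary mask.
        Image<Gray, byte> moving = cur.AbsDiff(bg).Convert<Gray, byte>()
                                      .ThresholdBinary(new Gray(40), new Gray(255));

        // Skin-colour mask in HSV (ranges as reported in section 2.3.3).
        Image<Gray, byte> skin = cur.Convert<Hsv, byte>()
                                    .InRange(new Hsv(0, 17, 32), new Hsv(38, 60, 74));

        // Keep moving skin pixels, then clean up with erosion and dilation.
        return moving.And(skin).Erode(2).Dilate(2);
    }
}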


4.2.2 Hand recognition requirements

The application calculates the upper x, y coordinates of the detected posture.

Next, it needs to identify whether a thumb region is contained in the detected posture.

The application calculates the geometric features of the hand and obtains a percentage for each section.

After comparing the percentage values, the application assigns the posture to the relevant posture collection.

After 15 postures have been obtained, the application stops the pre-processing and recognition steps and calculates which posture collection has the highest count.

After counting the collections, the application finalizes the posture recognition.

4.2.3 Motion detection and recognizing requirements

After the posture is recognized, the application stops the recognition process and calculates each subsequent posture's new x and y coordinates.

Once the x and y values are calculated, they are compared with the previous values already calculated during the recognition process.

Based on the difference in values, the application adds the posture to the relevant motion collection.

After 15 postures are complete, the motion detection process stops and the total count of each collection is calculated.

After counting the collections, the application finalizes the detection and recognition of the motion.


4.2.4 Commands Executing requirements.

Once the application has recognized the gesture, it checks whether the gesture involves any motion. If no motion is associated with the gesture, the particular command is executed directly; otherwise, the command is called after the motion has been recognized.

4.3 Non-functional requirements

Non-functional requirements describe the qualities the system should provide, such as environment, performance, usability, scalability, maintainability, and reliability.

4.3.1 Environmental requirements

The application should be usable under different lighting conditions (bright and dim).

The application should be able to detect the hand against a cluttered background.

4.3.2 Performance requirements

The application should be able to complete hand detection within 100 milliseconds.

The application should be able to recognize the posture within 100 milliseconds.

Motion detection and recognition should be completed within 100 milliseconds.

4.3.3 Usability requirements.

The application needs to provide a user-friendly interface for guest users.

Icons and other tools should be meaningful and easy to understand.

The application shows the user's hand while the system is being used, so the user is able to understand how the system is focusing on their hand.


4.3.4 Scalability requirements

The application should be able to recognize the posture within 15 postures and recognize the motion within 20 postures. The application also supports both small and large hand scales.

4.3.5 Maintainability requirements

The application's implementation should be reusable (for updated versions…), therefore it requires the use of a design pattern and an architectural framework.

4.3.6 Serviceable requirements

The application needs to provide a contact form for collecting user feedback and other issues.

4.3.7 Reliability requirements

The application should be able to call the relevant command after recognizing the hand and the motion in the system environment without experiencing failures.

4.3.8 Hardware and Software requirements

Separate web camera - HD, 3 MP or higher, 30 fps.

Laptop or PC to run the application.

Before running the software, the user must install the core applications - KMP player, VLC player, Image Gallery, Skype, and Microsoft Office PowerPoint.


4.3.9 System requirements and specification

RAM - 512 MB or higher

VGA – 256 MB or higher, onboard or separated.

Supported .NET platform – .NET 4.5 or higher.

Operating system – Windows XP or higher.

4.3.10 Application program interface requirements

The application uses EMGU CV as the adapter to the camera hardware for capturing images; after capture, each image is resized to a 150 x 150 pixel frame. A minimal capture-and-resize sketch is given below.
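The following is an illustrative sketch only, assuming the Emgu CV VideoCapture wrapper around the default webcam (device index 0); it is not the author's implementation.

using Emgu.CV;
using Emgu.CV.CvEnum;

public static class FrameGrabber
{
    public static Mat GrabFrame()
    {
        using (var capture = new VideoCapture(0))      // default webcam
        {
            Mat frame = capture.QueryFrame();          // grab one frame
            var resized = new Mat();
            // Resize to the 150 x 150 working frame used by the pre-processing stages.
            CvInvoke.Resize(frame, resized, new System.Drawing.Size(150, 150), 0, 0, Inter.Linear);
            return resized;
        }
    }
}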


CHAPTER 5

SYSTEM DEVELOPMENT PLAN

5.1 System development methodologies

Software development methodologies have always been a main focus in the software development lifecycle of any project. Each evolutionary change introduces new ways of thinking about and viewing the problem, as well as introducing strengths and weaknesses into a system.

After researching methodologies suitable for an academic-level project, the author finally settled on the Spiral methodology. According to Kumar (2015), the Spiral model has four phases: planning, risk analysis, development, and planning of the next iteration. A software project moves through each of these phases repeatedly during the development period. In the baseline spiral, starting in the planning phase, requirements are gathered and risk is assessed.

Figure 47: Spiral model

Source: Author’s work based on google (2016)


Considering its improvements over the other models, the Spiral model's iterative steps help to solve complex problems more easily, and risk management is an in-built part of each iteration, which makes it more attractive than the other models mentioned; it also helps developers to identify which parts are challenging and how much work each function requires compared with others. At the end of an iteration the development team can start to investigate what has to be done in the next iteration, which is good practice for forming an idea before the next iteration starts (Sparrow, 2015). Because the Spiral methodology consists of several iterations, each loop and each phase can be monitored easily, which keeps the project effective and helps avoid serious bugs.

5.2 Gantt chart

Please refer Appendix B.


CHAPTER 6

DESIGN

Software design is a process that gives a clear picture of the user requirements and functional requirements and of how they are applied to the system. It helps the programmer keep the implementation simple and understandable. Basically, software design yields three levels: architectural design, high-level design, and detailed design.

Architectural design

Ratadz (2002): architectural design defines a collection of hardware and software components and their interfaces, forming the framework for the development of a computer application.

High-Level Design

(TheSoftwareExpert, 2010): high-level software design is the first design step after analysing all the requirements for the software. The goal is to define a software structure that is able to fulfil the functional and non-functional requirements.

Detail design

Ratadz (2002): detailed design expands the preliminary design of a system, subsystem, or component to the extent that the design is sufficiently complete to begin implementation.

This section covers UML diagrams such as the block diagram, use-case diagrams, class diagram, activity diagrams, and sequence diagrams to give a clear picture of the system's high-priority functions. AgileModeling (2014) notes that the UML use-case diagram gives an overview of functional and behavioural requirements; the activity diagram is typically used for modelling the logic captured by a single use case or usage scenario; class diagrams show the classes in the system, their relationships, and the operations and attributes of each class; and sequence diagrams show the flow of logic within the system in a visual manner, documenting and validating the logic for both analysis and design purposes. In this design chapter the author covers:

Block diagram.

Architectural diagram.

Use-case diagram and use-case specifications.


Activity diagram.

Class diagram.

Sequence diagram.

Wireframes and Screen shots.


6.1 Overall Block Diagram

The following block diagram shows the overall operations that need to be implemented. The application consists of four main steps: hand detection, posture recognition, motion detection, and motion recognition. Before these functions can execute, the user must capture a background image so that the pre-processing steps can be completed. In the next sections the author details these high-priority operations together with their sub-steps.

Figure 48: Block diagram

Source: Author’s work (2016)


6.2 UML Use-Case Diagram

6.2.1 Main use-case diagram

The main use-case diagram shows the major functional behaviours that need to be implemented.

Figure 49: Main Use case diagram

Source: Author’s work (2016)


6.2.2 Use-case Level 1 – Hand detection

This Level 1 use-case diagram defines the sub-behaviours of the Hand detection use-case.

Figure 50: Hand detection - use case diagram

Source: Author’s work (2016)


6.2.3 Use case Specification – Level 1

Table 5: Hand detection - usecase specification

Source: Author’s work (2016)

Use Case ID: 6.1
Use Case Name: Hand detection
Created By: Author
Priority: 1
Date Created: 2016/2/10
Actors: User
Description: The following sub use-cases are included:
Capture background image
Capture image with hand
Resize image
Smooth image
Background subtraction
Colour space conversion
Image grayscale
Thresholding image
Erode image
Dilate image
Pre-conditions: The user should have captured a background image.
Post-conditions: After all image pre-processing steps are completed, the posture is sent on to finalise the recognition stage.
Special Requirements: Prior to any operation, the user must connect a web camera to the system.
Assumptions: After the hand is detected, the detected posture is automatically sent to the recognition process.


6.2.4 Use-case Level 2 – Recognizing hand posture

This Level 2 use-case diagram expresses the sub-behaviours of the Recognizing hand posture use-case.

Figure 51: Gesture recognition - use case diagram

Source: Author’s work (2016)


6.2.5 Use case Specification – Level 2

Table 6: Gesture recognition - use case specification

Source: Author’s work (2016)

Use Case ID: 6.2
Use Case Name: Recognizing hand posture
Created By: Author
Priority: 2
Date Created: 2016/2/10
Actors: User
Description: The following sub use-cases are included:
Rotate image
Get upper pixels
Get lower pixels
Calculate thumb region
Extract posture feature level 1
Extract posture feature level 2
Extract posture feature level 3
Calculate feature percentage
Pre-conditions: The application should have finalised all pre-processing steps in hand detection.
Post-conditions: After the completion of these steps, the motion detection and recognition process is executed next.
Special Requirements: The system needs a clearly detected hand posture for recognition.
Assumptions: After all recognition steps are completed, the posture is automatically passed on to motion detection.


6.2.6 Use-case Level 3 – Detecting and recognizing hand motion

The following Level 3 use-case diagram defines the sub-behaviours of the Detect and recognize hand motion use-case.

Figure 52: Motion detection and recognition - use case diagram

Source: Author’s work (2016)


6.2.7 Use case Specification – Level 3

Table 7: Motion detection and recognition - use case specification

Source: Author’s work (2016)

Use Case ID: 6.3
Use Case Name: Detect and recognize hand motion
Created By: Author
Priority: 3
Date Created: 2016/2/10
Actors: User
Description: The following sub use-cases are included:
Check whether the posture expects a motion or not
Get new upper pixels
Calculate the difference between the new and old upper pixels
Re-mark the particular motion list
Recognize the motion
Pre-conditions: The application should have finalised all recognition stages.
Post-conditions: After all these steps are completed, the relevant command is executed.
Special Requirements: The system needs the upper pixels calculated at the point where the posture finalised the recognition step.
Assumptions: None


6.2 UML Activity diagram.

6.2.6 Hand Detection.

Figure 53: Hand detection activity diagram

Source: Author’s work (2016)


The hand detection activity diagram above defines two processes: getting the background image and detecting the hand. First, the user needs to capture a background image. Once the background has been captured, the system resizes both images to 150 x 150 pixels before hand detection starts. Before the two captured images are subtracted, the application applies a smoothing filter to each of them to improve the subtraction. As the first stage of hand detection, the background image is subtracted from the current image; however, the subtracted image may still contain non-skin areas. To discard those non-skin pixels, the colour space is converted from RGB to HSV. To obtain a binary image, the system then converts the HSV image to grayscale and thresholds it. The thresholded image may still contain noise and broken pixels, which reduce the accuracy of posture recognition, so it is important to apply the morphological operations of erosion and dilation to remove the noise from the binary image.

6.2.7 Gesture Recognition.

The following activity diagram shows the main sub-steps that need to be implemented to recognise a hand gesture. To improve recognition accuracy, the application first removes the arm from the posture, which reduces the complexity of the postures. The application then starts recognition by checking the thumb values of the hand. Once it has judged whether the posture contains a thumb region, the recognition process begins. The recognition process first calculates the upper and lower pixels of the hand to obtain its height. Next, the features of the hand are extracted and the percentage value of each fragment is obtained. To reduce recognition complexity, the application loads posture models based on the constant thumb values. After the relevant posture models are loaded, posture recognition starts from the extracted feature values; as a result, each posture is added to the list of its matching hand model. To finalise the process, the application calculates the highest value across the lists and completes the gesture recognition.


Figure 54: gesture recognition- activity diagram

Source: Author’s work (2016)


6.2.8 Detect and recognizing hand motion

Figure 55: Motion detection and recognition - activity diagram

Source: Author’s work (2016)


The motion detection and recognition activity diagram above shows all the core functionality that needs to be implemented in the application. For motion detection, the application gathers 20 postures and checks the overall motion flow. As step one, the application checks whether the count list is complete; if it is, the application directly calls the relevant control, and if not, it moves forward to detect the motion.

The application supports two types of gestures: gestures without motion and gestures with motion. The second decision node checks whether the gesture has a motion. If the gesture does not need motion, the application directly calls the command; if the gesture expects a motion, the application moves on to the next step and checks which kind of motion the gesture is expecting. Once the application has judged the motion behaviour, it first calculates the upper x and y coordinates. Motion detection is then performed by calculating the differences in x and y, and the result is added to the corresponding motion list. This process is repeated 20 times, after which the motion recognition is finalised and the relevant command is called.


6.3 UML Class diagram.

6.3.6 Proposed software design pattern

A software application must be reusable; otherwise, when the functional and non-functional requirements change, developers will find it difficult to modify the system to suit the new requirements. Selecting an appropriate software design pattern is therefore very important. According to SourceMaking (1999), using design patterns speeds up the development process because patterns allow developers to communicate using well-known, well-understood names for software interactions. There are three types of design patterns: creational, behavioural, and structural. For the implementation of this scenario the author selected the Facade design pattern, which falls under the structural category. According to DZone (2012), a Facade defines an entry point to each subsystem level and makes the subsystems communicate in sequence through the Facade class; it reduces the dependency of external code on the inner workings of the libraries and provides flexibility. In the author's scenario all core functions need to be accessed in sequential order: first detect the hand, then recognise the posture, then detect the motion, then recognise the motion, and finally call the relevant command. For this reason the author concluded that Facade is the most appropriate design pattern for implementing this system, as sketched below.
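The following minimal sketch illustrates the Facade idea described above. The subsystem classes mirror those named in the class diagram (HandDetection, GestureRecognition, MotionDetection, Command), but the stub class names, method names, and signatures here are illustrative assumptions rather than the author's actual implementation.

using System.Drawing;

// Subsystem stubs standing in for the classes named in the class diagram.
public class HandDetectionStub { public Bitmap DetectHand(Bitmap bg, Bitmap cur) { return cur; } }
public class GestureRecognitionStub { public int RecognizePosture(Bitmap hand) { return 0; } }
public class MotionDetectionStub { public int TrackMotion(int posture) { return posture; } }
public class CommandStub { public void Execute(int gesture) { /* SendKeys call would go here */ } }

public class ControllerFacade
{
    // The facade holds a reference to every subsystem...
    private readonly HandDetectionStub detection = new HandDetectionStub();
    private readonly GestureRecognitionStub recognition = new GestureRecognitionStub();
    private readonly MotionDetectionStub motion = new MotionDetectionStub();
    private readonly CommandStub command = new CommandStub();

    // ...and exposes one entry point that enforces the sequential order:
    // detect hand -> recognize posture -> track motion -> call command.
    public void ProcessFrame(Bitmap background, Bitmap current)
    {
        Bitmap binaryHand = detection.DetectHand(background, current);
        int posture = recognition.RecognizePosture(binaryHand);
        int gesture = motion.TrackMotion(posture);
        command.Execute(gesture);
    }
}

Because external code only calls ProcessFrame(), the internal processing order stays hidden behind the facade, which is exactly the dependency-reducing benefit described above.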

6.4.2 Proposed design Architecture

According to Oracle (2007), the MVC design architecture decouples data access and business logic from the manner in which they are displayed to the user. MVC consists of three components: Model, View, and Controller. The Model is responsible for maintaining the data; the View renders the contents of a model and specifies exactly how the model data should be represented; and the Controller is the application code that interacts with the Model and View. According to CareerRide (2011), because MVC enforces this separation, business logic is reusable across the application, and the logic can be built in its own classes while the UI developer and the database developer work on their parts separately. A minimal sketch of this separation follows.
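The sketch below shows the three MVC roles in the spirit of the paragraph above; the class and member names are illustrative, and, as noted in section 6.4.4, the author's real Model holds an audio resource rather than a database.

using System;

class GestureModel                      // Model: maintains the data
{
    public string LastGesture = "None";
}

class GestureView                       // View: renders the contents of the model
{
    public void Render(GestureModel model)
    {
        Console.WriteLine("Recognised gesture: " + model.LastGesture);
    }
}

class GestureController                 // Controller: interacts with Model and View
{
    private readonly GestureModel model = new GestureModel();
    private readonly GestureView view = new GestureView();

    public void OnGestureRecognized(string gesture)
    {
        model.LastGesture = gesture;    // business logic stays out of the view
        view.Render(model);
    }
}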


6.4.3 Class Diagram



6.4.4 Class Diagram with Packages - UML Architectural diagram

The following diagram shows how each component is located within the MVC architecture. It does not contain a database storage component; instead, an audio resource file is used inside the Model.

Figure 56: Architectural diagram

Source: Author’s work (2016)

6.4.5 Facade Class

The Facade is the main controller of all the subclasses; it therefore holds references to almost every class so that their methods can be called sequentially.

6.4.6 HandDetection Class

The HandDetection class contains all the image pre-processing steps, such as background subtraction, image smoothing, colour space conversion, image grayscale, image thresholding, and the morphological operations.


6.4.7 GestureRecognition class

The GestureRecognition class covers all the hand gesture recognition methods. Before recognition starts, the application removes the arm and then calls the recognition methods sequentially to recognise the posture.

6.4.8 MotionDetection Class

The MotionDetection class covers the motion detection and recognition steps. To detect motion, the application must collect 20 postures; for each posture it examines how the x and y coordinates have moved since the hand posture was detected and adds the result to the relevant motion list. Finally, for motion recognition, the application checks which list has the maximum count.

6.4.9 Command Class

The Command class contains all the application commands that correspond to the gestures and motions, such as play and pause, volume up and down, muting the microphone and the volume, hanging up and ignoring video calls, image zoom in and zoom out, image forward and backward, and video forward and backward.

6.4.10 WebCamera Class

The WebCamera class covers the entire background image capture process.

6.4.11 Starter GUI

The controller UI.

6.4.12 Main GUI

The application's main dashboard.

6.4.13 Feedback GUI

The UI for collecting user feedback.

6.4.14 Camera GUI

The background capturing window.


6.5 Sequence Diagram

6.5.1 Hand Detection

Figure 57: Hand detection- sequence diagram

Source: Author’s work (2016)

To detect the hand from the background, the Facade controller calls nine different algorithms in the HandDetection class. As the first step, the application sets the background image; then, each time, it sets the image containing the hand. After obtaining both images, the application sequentially calls image smoothing (to improve the accuracy of the background subtraction), background subtraction, skin colour detection, image grayscale, get threshold value, threshold image, and finally erode and dilate.

6.5.2 Gesture recognition

After the detected hand has been set, the application needs to determine whether it is a right or a left hand, which is handled in the first if condition. The second if condition remains false until GetListcount() reaches the value 15. The first step of recognition takes the detected hand, counts the total white blobs, and gets the upper and lower pixels of the detected hand posture. The application then checks whether the hand contains any thumb area; to do so it must rotate the image, count the upper and lower pixels, calculate the thumb region, and count the total white blobs. To recognise the overall hand posture, the application divides the hand into three main features and calculates the pixel percentage of each region.

After all the recognition calculations are complete, the application needs to select the relevant posture models, so the third if condition checks whether the posture contains a thumb. The application then checks whether the three features have already been calculated via Setvalue i, Setvalue ii and Setvalue iii. Once the recognition process is complete, the application immediately checks GetListCount(). This scenario is applied to 15 postures, after which the application calls the recognize gesture method to finalise the gesture recognition.


Figure 58: Gesture recognition - sequence diagram

Source: Author’s work (2016)


6.5.3 Motion Detection and Recognition

Figure 59: Motion detection and recognition- sequence diagram

Source: Author’s work (2016)

The motion is calculated over 20 images by tracking how the x and y coordinates move. The first if condition checks whether the application has finished collecting the 20 postures needed to start motion recognition. If it has not, the second if condition checks whether the posture involves a motion. If the posture does not involve any motion, the application sets CountList() to 20 and calls the command that belongs to the gesture; otherwise it passes the relevant parameters to SetMotionImageX() and SetMotionImageY() and then calculates the new posture's x and y coordinates. Finally, the application checks the variation of each coordinate and adds the result to the relevant motion list. This happens 20 times. To finalise the recognition step, the application calls the InvokeControls() method to identify the real direction of the coordinates.

6.6 Wireframes and Graphical user interface.

The GUI is the front-end view through which the user interacts with the application. In implementing it, the author focused on making the UI attractive, easy to use, and easy to understand. This section explains the different interfaces and the meaning of each UI component.

6.6.1 Main Screen

Figure 60: Main Screen- Wireframe

Source: Author’s work (2016)


Figure 61: Main Screen – Screenshot

Source: Author’s work (2016)

The main UI is the welcome interface for users. It contains several options to choose from, such as giving feedback, capturing a background image, going to the official web page, and starting the controller. Each icon conveys a particular meaning, and the icons are arranged based on user experience.


6.6.2 Capture background image

Figure 62: Capture background- wireframe

Source: Author’s work (2016)

Figure 63: Capture background- Screenshot

Source: Author’s work (2016)

The window shown above allows the background image to be captured. Once the user chooses a background location and clicks the save button, the image is saved.


6.6.3 Controller Screen

Figure 64: Controller- wireframe

Source: Author’s work (2016)

Figure 65: Controller – Screenshot

Source: Author’s work (2016)

This is the controller window. While controlling the application, the user can see in this window how the application reacts to their hand. There are two radio buttons for the user to indicate which hand is going to be used, and a dropdown box contains the list of lighting conditions, such as dim light and light, from which the user must select.

6.6.4 Feedback Screen

Figure 66: Feedback form – wireframe

Source: Author’s work (2016)

Figure 67: Feedback form- Screenshot

Source: Author’s work (2016)

This is the feedback form, through which the user can inform the developer about bugs or give feedback.


CHAPTER 7

IMPLEMENTATION

7.1 Overview

The implementation of the system is based on the Spiral methodology. The system is developed in C# on the .NET Framework 4.5 using Visual Studio, with Emgu CV as the API for capturing image frames and resizing images. During implementation, PascalCasing and camelCasing are used, as they are the traditional C# styles, so that future developers can understand the implementation; MVC is used as the design architecture to place the class files according to model, view, and controller; and the Facade design pattern is used to maintain a properly reusable code structure. This section covers the main implementation areas:

Graphical user interface

Hand detection

Gesture recognition

Motion detection

Motion recognition.

7.2 Graphical user interface (GUI)

According to NuGet Must Haves (2015), the GUI is implemented with MetroFramework, which brings the modern, Windows 8 style UI to .NET Windows Forms applications. MetroFramework supports Windows XP SP1/SP2/SP3, Vista, Windows 7, and Windows 8.


7.3 Hand Detection

Figure 68: Hand detection implementation

Source: Author’s work (2016)

All the image pre-processing steps are implemented for the hand detection scenario. These steps are image smoothing, background subtraction, skin colour detection, image grayscale, image thresholding, and noise removal.

7.4 Image capturing and resizing the image.

Capture.QueryFrame() captures the Bgr (Blue, Green, Red) image and returns it to the ImageFrame variable, as shown below.

Figure 69: Image capture

Source: Author's work (2016)

After the application has captured the image, it needs to be resized to 150 x 150 pixels. The image stored in the ImageFrame variable is therefore passed to the ImageFrame.Resize() function to scale it to the specific size.

Figure 70: Image resize

Source: Author's work (2016)

The resized image is then stored in resizedImage and is ready for the pre-processing to start.
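The following minimal sketch shows the capture-and-resize step described above, assuming the Emgu CV 2.x API (Capture, QueryFrame, and the INTER enum); exact namespace and enum names may differ across Emgu CV versions.

using Emgu.CV;
using Emgu.CV.Structure;

class FrameGrabber
{
    static void Main()
    {
        using (Capture capture = new Capture())                 // default web camera
        {
            Image<Bgr, byte> imageFrame = capture.QueryFrame(); // grab a BGR frame
            Image<Bgr, byte> resizedImage =
                imageFrame.Resize(150, 150, Emgu.CV.CvEnum.INTER.CV_INTER_LINEAR); // scale to 150 x 150
            resizedImage.Save("frame.png");                     // keep the resized frame for pre-processing
        }
    }
}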


7.5 Image Smoothing

For image smoothing the application uses the Gaussian blur algorithm, which improves the performance of the background subtraction. The image smoothing function contains three major steps: before smoothing starts, the image needs to be converted to a matrix, and the smoothing is then performed over a 3x3 matrix.

Figure 71: Image smoothing

Source: Author's work (2016)
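A sketch of 3x3 Gaussian smoothing on a single image channel is given below, in the spirit of the matrix-based smoothing described above. The 1-2-1 kernel weights are the usual Gaussian approximation and are an assumption; the author's exact kernel values are only shown in the figure.

static class Smoothing
{
    public static int[,] SmoothGaussian3x3(int[,] channel)
    {
        int[,] kernel = { { 1, 2, 1 }, { 2, 4, 2 }, { 1, 2, 1 } };  // weights sum to 16
        int h = channel.GetLength(0), w = channel.GetLength(1);
        int[,] result = (int[,])channel.Clone();

        for (int y = 1; y < h - 1; y++)
            for (int x = 1; x < w - 1; x++)
            {
                int sum = 0;
                for (int ky = -1; ky <= 1; ky++)
                    for (int kx = -1; kx <= 1; kx++)
                        sum += channel[y + ky, x + kx] * kernel[ky + 1, kx + 1];
                result[y, x] = sum / 16;   // normalise by the kernel sum
            }
        return result;
    }
}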

7.6 Background subtraction

After image smoothing is complete, the next step is background subtraction. Two images are involved in the background subtraction process: an image without the hand and an image with the hand. As the first step, the system subtracts the pixel values of the two images at each x and y coordinate.

Figure 72: Background subtraction

Source: Author’s work (2016)

The difference calculation is finalised by combining the diffR, diffG, and diffB values into a pixel distance.

Figure 73: Background pixel differences

Source: Author's work (2016)

This distance value is compared with an adaptive value for each pixel in order to detect the hand from the background. The adaptive value depends on the background lighting and environment conditions; the system supports both bright and dim lighting. In a bright lighting condition the adaptive value is 2200, and in a dim lighting condition it is 900. Every pixel whose distance value is greater than the adaptive value is treated as belonging to the hand region, and the non-hand pixels are set to black. It is given by,

Figure 74: Adaptive-value based image binarization

Source: Author's work (2016)
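The sketch below shows the per-pixel subtraction and adaptive binarisation described above: the channel differences diffR/diffG/diffB are combined into a distance and compared with 2200 (bright light) or 900 (dim light). Combining the differences as a sum of squares is an assumption; the text only states that the three channel differences yield a distance value.

using System.Drawing;

static class BackgroundSubtraction
{
    public static bool[,] Subtract(Color[,] background, Color[,] current, bool brightLight)
    {
        int adaptive = brightLight ? 2200 : 900;          // adaptive value from the text
        int h = current.GetLength(0), w = current.GetLength(1);
        bool[,] hand = new bool[h, w];

        for (int y = 0; y < h; y++)
            for (int x = 0; x < w; x++)
            {
                int diffR = current[y, x].R - background[y, x].R;
                int diffG = current[y, x].G - background[y, x].G;
                int diffB = current[y, x].B - background[y, x].B;
                int distance = diffR * diffR + diffG * diffG + diffB * diffB;

                // true = candidate hand pixel; false = background (rendered black)
                hand[y, x] = distance > adaptive;
            }
        return hand;
    }
}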

7.7 RGB to HSV colour conversion.

After background subtraction, the skin colour regions need to be detected to confirm a successful detection. The application therefore performs a colour space conversion from RGB to HSV. To improve the result, the maximum and minimum HSV values are defined as constants.

Figure 75: Constant colour values

Source: Author’s work (2016)

The R (Red) to H (Hue) conversion is given by,

Figure 76: Colour space conversion

Source: Author’s work (2016)


The G (Green) to S (Saturation) conversion is given by,

Figure 77: Colour space conversion

Source: Author’s work (2016)

The B (Blue) to V (Value) conversion is given by,

Figure 78: Colour space conversion

Source: Author’s work (2016)
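For reference, the standard RGB to HSV conversion is sketched below; the author's exact per-channel formulas appear only in Figures 76-78. Here H is in the range [0, 360) while S and V are in [0, 1].

using System;

static class ColourSpace
{
    public static void RgbToHsv(int r, int g, int b, out double h, out double s, out double v)
    {
        double rd = r / 255.0, gd = g / 255.0, bd = b / 255.0;
        double max = Math.Max(rd, Math.Max(gd, bd));
        double min = Math.Min(rd, Math.Min(gd, bd));
        double delta = max - min;

        v = max;                                  // Value
        s = max == 0 ? 0 : delta / max;           // Saturation
        if (delta == 0) h = 0;                    // Hue is undefined for greys
        else if (max == rd) h = 60 * (((gd - bd) / delta) % 6);
        else if (max == gd) h = 60 * (((bd - rd) / delta) + 2);
        else h = 60 * (((rd - gd) / delta) + 4);
        if (h < 0) h += 360;
    }
}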

7.8 Image thresholding

Image thresholding produces the binary image, with the hand region in white and the background in black. It consists of three steps: converting the image to grayscale, calculating the threshold value, and finalising the thresholding. The grayscale image from the first step is needed to calculate the threshold value; it is therefore converted into a histogram and the histogram values are collected in sequence.

Figure 79: Image thresholding

Source: Author’s work (2016)


The process is finalised by calculating the variance between the background and foreground classes, and the threshold value is given by,

Figure 80: Calculate threshold value

Source: Author's work (2016)

After the threshold value is obtained, the grayscale image is converted into a binary image. During the conversion, each pixel value is compared with the threshold value; it is given by,

Figure 81: Thresholding initialization

Source: Author's work (2016)
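The sketch below shows Otsu-style threshold selection of the kind described above: a 256-bin histogram of the grayscale image is built and the threshold that maximises the between-class variance of background and foreground is chosen; it is a generic illustration rather than the author's exact code.

static class Thresholding
{
    public static int OtsuThreshold(byte[,] gray)
    {
        int h = gray.GetLength(0), w = gray.GetLength(1);
        int[] hist = new int[256];
        foreach (byte g in gray) hist[g]++;                 // build the histogram

        long total = (long)h * w;
        double sumAll = 0;
        for (int i = 0; i < 256; i++) sumAll += i * (double)hist[i];

        double sumB = 0; long weightB = 0; double bestVar = -1; int best = 0;
        for (int t = 0; t < 256; t++)
        {
            weightB += hist[t];                              // background weight
            if (weightB == 0) continue;
            long weightF = total - weightB;                  // foreground weight
            if (weightF == 0) break;

            sumB += t * (double)hist[t];
            double meanB = sumB / weightB;
            double meanF = (sumAll - sumB) / weightF;
            double betweenVar = (double)weightB * weightF * (meanB - meanF) * (meanB - meanF);
            if (betweenVar > bestVar) { bestVar = betweenVar; best = t; }
        }
        return best;   // pixels above 'best' become white (hand), the rest black
    }
}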

7.9 Arm remover.

After all the image pre-processing steps are complete, the next step is to remove the arm from the detected posture, which improves the recognition accuracy. To do this, the application needs to recognise where the wrist starts in the hand posture. At runtime the application scans from bottom to top, counting the horizontal white pixels in each row; once the count reaches 29, the width has been detected and the application returns that row. The value 29 is the width at the starting point of the wrist, and it is given by,

Figure 82: Arm remover

Source: Author’s work (2016)
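The sketch below illustrates the bottom-to-top wrist scan described above and one way of using the detected wrist row to drop the arm from the posture; the figure only shows the scan itself, so the clearing of the rows below the wrist is an assumption.

static class ArmRemover
{
    public static bool[,] RemoveArm(bool[,] binary, int wristWidth = 29)
    {
        int h = binary.GetLength(0), w = binary.GetLength(1);
        for (int y = h - 1; y >= 0; y--)                    // scan from the bottom row upwards
        {
            int whiteCount = 0;
            for (int x = 0; x < w; x++)
                if (binary[y, x]) whiteCount++;             // count horizontal white pixels

            if (whiteCount == wristWidth)                   // wrist row found
            {
                for (int yy = y; yy < h; yy++)              // blank out the arm region below the wrist
                    for (int x = 0; x < w; x++)
                        binary[yy, x] = false;
                break;
            }
        }
        return binary;
    }
}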


7.10 Detected thumb region

Detecting the thumb region is the first step of the gesture recognition process. Before the thumb region is calculated, the image needs to be rotated anti-clockwise, and the total white pixels, the upper x and y pixels, and the lower x and y pixels are calculated. Thumb region detection consists of two steps: first the total height is calculated, and in the second step the thumb height is expressed as a percentage of the overall posture.

Figure 83: Detect thumb region

Source: Author’s work (2016)
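The sketch below follows the thumb-region check described above: after rotating the posture, the upper and lower white pixels give the hand height, and the share of white pixels falling in the presumed thumb band is measured. The band boundary and the use of the top quarter are illustrative assumptions; the author's exact fractions appear only in the figure.

static class ThumbRegion
{
    public static double ThumbRegionPercentage(bool[,] rotatedHand)
    {
        int h = rotatedHand.GetLength(0), w = rotatedHand.GetLength(1);
        int upper = -1, lower = -1, totalWhite = 0;

        for (int y = 0; y < h; y++)
            for (int x = 0; x < w; x++)
                if (rotatedHand[y, x])
                {
                    if (upper < 0) upper = y;   // first white row from the top
                    lower = y;                  // last white row seen so far
                    totalWhite++;
                }
        if (totalWhite == 0) return 0;

        int bandStart = upper;
        int bandEnd = upper + (lower - upper) / 4;          // top quarter treated as the thumb band (assumed)
        int bandWhite = 0;
        for (int y = bandStart; y <= bandEnd; y++)
            for (int x = 0; x < w; x++)
                if (rotatedHand[y, x]) bandWhite++;

        return 100.0 * bandWhite / totalWhite;              // percentage used to decide thumb / no thumb
    }
}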


7.11 Gesture recognition

Figure 84: Gesture recognition implementation

Source: Author’s work (2016)

Gesture recognition is done through a mathematical framework with three steps. The first step extracts the hand features, then the application calculates the percentage of each extracted feature, and finally it recognises the posture from those calculated feature percentages.

During feature extraction the system extracts three features of the detected posture, defined as Setleveli, Setlevelii, and Setleveliii. Before these values are calculated, the application needs to calculate the upper pixel coordinates, the lower pixel coordinates, and the hand range (height). The hand range is given by,

Figure 85: Calculate height

Source: Author’s work (2016)

Setleveli will be,

Figure 86: Calculate feature level 1

Source: Author’s work (2016)


Setlevelii will be,

Figure 87: Calculate feature level 2

Source: Author’s work (2016)

Setleveliii will be,

Figure 88: Calculate feature level 3

Source: Author’s work (2016)
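The sketch below illustrates the three-level feature extraction described above: the posture height from the upper to the lower pixel is split into three horizontal bands, and each band's white-pixel share of the whole posture is computed. Dividing the height into equal thirds is an assumption; the author's exact band formulas are only shown in Figures 85 to 88.

static class FeatureExtraction
{
    public static double[] ExtractFeatureLevels(bool[,] hand, int upperY, int lowerY)
    {
        int w = hand.GetLength(1);
        int height = lowerY - upperY + 1;
        int bandHeight = height / 3;
        double[] levels = new double[3];
        int totalWhite = 0;

        for (int band = 0; band < 3; band++)
        {
            int start = upperY + band * bandHeight;
            int end = band == 2 ? lowerY : start + bandHeight - 1;   // last band absorbs the remainder
            for (int y = start; y <= end; y++)
                for (int x = 0; x < w; x++)
                    if (hand[y, x]) { levels[band]++; totalWhite++; }
        }
        for (int band = 0; band < 3; band++)                         // Setleveli / Setlevelii / Setleveliii percentages
            levels[band] = totalWhite == 0 ? 0 : 100.0 * levels[band] / totalWhite;

        return levels;
    }
}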

After the extracted features Setleveli, Setlevelii, and Setleveliii have been calculated, the next step is recognising the posture. During posture recognition the application performs two steps: validating the thumb region and recognising the posture. The thumb region validation is given by,

Figure 89: recognize thumb region

Source: Author’s work (2016)

When a thumb region is detected, the application loads HandPosturemodelOne(), which consists of the postures that have a thumb region; when no thumb region is detected, the application loads HandPosturemodelTwo(), which consists of the postures without a thumb region. After the relevant posture model has been selected, posture recognition starts.


HandPosturemodelOne() has two postures, and it is given by,

Figure 90: Check posture model one

Source: Author’s work (2016)

HandPostureModelTwo() has four postures, and it is given by,

Figure 91: Load posture model two

Source: Author’s work (2016)


7.12 Motion detection and recognition

Figure 92: Motion detection and recognition implementation

Source: Author’s work (2016)

Motion detection is calculated from pixel differences. During the recognition process the application obtains the x and y coordinates, and during motion detection it collects each new x' and y' coordinate and calculates the differences. Depending on whether the difference is positive or negative, the application detects which way the motion flows; when the difference equals zero, the motion is treated as nil. Finally, the posture whose motion has been detected is added to the relevant motion list. For the y coordinates it is given by,

Figure 93: Add into motion list

Source: Author's work (2016)


Motion detection for the x coordinates is given by,

Figure 94: Detect horizontal motion

Source: Author’s work (2016)
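The sketch below shows the coordinate-difference test described above: the new upper x and y coordinates are compared with those stored at recognition time, and a zero difference means no motion for that frame. The direction names and the dominant-axis rule are illustrative assumptions.

using System;

enum MotionDirection { None, Left, Right, Up, Down }

static class MotionDetector
{
    public static MotionDirection Detect(int oldX, int oldY, int newX, int newY)
    {
        int dx = newX - oldX;
        int dy = newY - oldY;

        if (dx == 0 && dy == 0) return MotionDirection.None;        // no movement this frame
        if (Math.Abs(dx) >= Math.Abs(dy))                           // horizontal change dominates
            return dx > 0 ? MotionDirection.Right : MotionDirection.Left;
        return dy > 0 ? MotionDirection.Down : MotionDirection.Up;  // image y grows downwards
    }
}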


After the motion has been detected across 20 images, the application stops detection and starts motion recognition. Motion recognition is calculated by counting all the motion lists and taking the highest value; the counting is done separately for the x and y motion directions. Once the motion has been recognised, the relevant command is called as a response to the user.

Figure 95: Call commands 1

Source: Author's work (2016)


Figure 96: Call commands 2

Source: Author’s work (2016)
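The sketch below illustrates the final recognition step described above: after 20 frames, the per-direction counts (the "motion lists" in the text) are compared and the direction with the highest count wins, which the caller then maps to the relevant keyboard command. Counting string votes in a dictionary is an illustration, not the author's exact data structure.

using System.Collections.Generic;

static class MotionRecognizer
{
    public static string Recognize(IEnumerable<string> votes)   // e.g. "Left", "Right", "Up", "Down", "None"
    {
        var counts = new Dictionary<string, int>();
        foreach (string v in votes)
            counts[v] = counts.ContainsKey(v) ? counts[v] + 1 : 1;

        string best = "None";
        int bestCount = -1;
        foreach (var kv in counts)
            if (kv.Key != "None" && kv.Value > bestCount)       // pick the most frequent real direction
            {
                best = kv.Key;
                bestCount = kv.Value;
            }
        return best;
    }
}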


7.13 Command Executing

Command execution is controlled through keyboard SendKeys events. After the application has recognised the gesture and motion, it directly calls the relevant key command.

Send-Keys command | Keyboard key pattern
SendKeys.Send("^(i)"); | Ctrl + I
SendKeys.Send("^(m)"); | Ctrl + M
SendKeys.Send("%{PGDN}"); | Alt + Page Down
SendKeys.Send("^{ADD}"); | Ctrl + (+)
SendKeys.Send("^{SUBTRACT}"); | Ctrl + (-)
SendKeys.Send("%{F4}"); | Alt + F4

(In the SendKeys syntax, "^" denotes Ctrl, "%" denotes Alt, and "+" denotes Shift.)

Table 8: Key event execution
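The sketch below shows how a recognised gesture can be mapped to a SendKeys call in the manner of Table 8; the gesture names and the particular bindings in this switch are illustrative, not the author's exact mapping.

using System.Windows.Forms;

static class CommandExecutor
{
    public static void Execute(string gesture)
    {
        switch (gesture)
        {
            case "VolumeUp":    SendKeys.SendWait("^{ADD}");       break;  // Ctrl + (+)
            case "VolumeDown":  SendKeys.SendWait("^{SUBTRACT}");  break;  // Ctrl + (-)
            case "NextSlide":   SendKeys.SendWait("%{PGDN}");      break;  // Alt + Page Down
            case "CloseWindow": SendKeys.SendWait("%{F4}");        break;  // Alt + F4
        }
    }
}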


CHAPTER 8

TESTING

8.1 Test Plan

Software testing is the process of executing the application with the aim of checking whether the specific requirements are satisfied. The overall testing strategy covers both the quality of the source code and the accuracy of the performance. Six testing strategies are used for this application:

Unit testing.

Scenario testing.

Performance testing.

Scalability testing.

Environment testing.

Accuracy testing.

Because the proposed approach applies several image processing techniques, the testing has to examine how each processing step behaves under different lighting conditions, environment conditions, and hand sizes. Performance and accuracy testing cover hand detection, gesture recognition, motion detection, and motion recognition. Each testing stage used 100 hand postures performed by four people. After the testing process, the test evaluations are discussed in detail, including the most suitable solutions for the highest false positive rates encountered during testing.

8.2 Unit Testing

Unit testing is used to test individual modules and components to determine whether there are any errors or bugs in the source code of the system. The test cases below summarise each unit test; see Appendix A for the complete test cases.


8.2.1 Web Camera

Test Scenario | Test Case ID | Pass/Fail
Capture image frame via web camera. | 2.1 | Pass

Table 9: Web camera unit tests

8.2.2 Data Read and Write

Test Scenario | Test Case ID | Pass/Fail
Save background image. | 2.2 | Pass
Load background image. | 2.3 | Pass

Table 10: Data read and write unit tests

8.2.3 Image Pre-processing

Test Scenario | Test Case ID | Pass/Fail
Calculates the image histogram value. | 2.4 | Pass
Calculates the image thresholding value. | 2.5 | Pass
Converts Red pixels to Hue. | 2.6 | Pass
Converts Green pixels to Saturation. | 2.7 | Pass
Converts Blue pixels to Value. | 2.8 | Pass
Converts the image to a 3x3 matrix. | 2.9 | Pass
Calculates pixel differences. | 2.10 | Pass

Table 11: Image processing unit tests

8.2.4 Gesture Recognition

Test Scenario | Test Case ID | Pass/Fail
Arm removal. | 2.11 | Pass
Rotate image. | 2.12 | Pass
Calculate upper and lower pixels. | 2.13 | Pass
Total white pixels. | 2.14 | Pass
Extract hand feature. | 2.15 | Pass

Table 12: Gesture recognition unit tests

8.2.5 Motion Detection

Test Scenario | Test Case ID | Pass/Fail
Calculates differences of x and y coordinates. | 2.16 | Pass
Add motion value into relevant list. | 2.17 | Pass

Table 13: Motion detection unit tests

8.2.6 Command Execution

Test Scenario | Test Case ID | Pass/Fail
Execute command. | 2.18 | Pass

Table 14: Command unit tests

8.3 Scenario Testing

After each unit test is completed, the next step is to check the output of the overall functions. The scenario testing method was therefore selected, which checks the use-case scenarios of this application.

8.3.2 Morphological Operation Erosion

Test Id: 3.1
Test Scenario: Reduce the volume of the white blobs and the image noise.
Expected result: The image noise is removed.
Steps: Smooth image; background subtraction; skin colour conversion; image thresholding; apply the morphological operation to remove image noise from the binary image.
Comment: The image noise was completely removed and the volume of the white blobs was reduced.

Table 15: Erosion test

8.3.3 Morphological Operation Dilation

Test Id: 3.2
Test Scenario: Increase the volume of the white blobs.
Expected result: The volume of the white blobs is increased.
Steps: Smooth image; background subtraction; skin colour conversion; image thresholding; apply the morphological operation to remove image noise from the binary image; apply the morphological operation to increase the volume of the white blobs.
Comment: The volume of the white blobs was completely increased.

Table 16: Dilation test

8.3.4 Image Grayscale

Test Id: 3.3
Test Scenario: Image grayscale.
Expected result: The RGB image is converted into a grayscale image.
Steps: Smooth image; background subtraction; skin colour conversion; image grayscale.
Comment: The RGB image was successfully converted into a grayscale image.

Table 17: Image grayscale test

8.3.5 Image Thresholding

Test Id: 3.4
Test Scenario: Image thresholding.
Expected result: The HSV image is converted to a binary image.
Steps: Smooth image; background subtraction; skin colour conversion; image thresholding.
Comment: Image thresholding completed successfully.

Table 18: Image thresholding test

8.3.6 Image Smoothing

Test Id: 3.5
Test Scenario: Image smoothing.
Expected result: The output image is a smoothed image.
Steps: Capture image via web camera; apply the Gaussian blur smoothing filter.
Comment: The smoothing filter was applied successfully.

Table 19: Image smoothing test


8.3.7 Convert RGB image into the HSV colour space

Test Id: 3.6
Test Scenario: Convert to the HSV colour space.
Expected result: The RGB colour image is converted to HSV.
Steps: Smooth image; background subtraction; convert the RGB colour space to the HSV colour space.
Comment: The RGB image was successfully converted into the HSV colour space.

Table 20: Colour conversion test

8.3.8 Background Subtraction

Test Id: 3.7
Test Scenario: Background subtraction.
Input Data: The captured frame and the background image.
Expected result: The background image is removed from the current frame.
Steps: Capture the background image; capture the image with the hand; smooth both images; subtract the current frame from the background image.
Comment: Background subtraction completed successfully.

Table 21: Background subtraction test

8.3.9 Detect the thumb region of a hand

Test Id: 3.8
Test Scenario: Calculate the thumb region of the hand.
Input Data: The detected hand with the arm removed.
Expected result: The thumb region percentage is calculated.
Actual result: The thumb region percentage is calculated.
Steps: Detect the hand from the posture; remove the arm; calculate the thumb region pixel percentage.
Comment: The thumb region was calculated successfully.

Table 22: Thumb region detection test

8.3.10 Extract hand feature percentages

Test Id: 3.9
Test Scenario: Calculate the hand feature percentages.
Input Data: The detected hand with the arm removed.
Expected result: The three geometric hand features are detected and the area percentage of each is calculated.
Actual result: The three geometric hand features are detected and the area percentage of each is calculated.
Steps: Detect the hand from the posture; remove the arm; detect the three geometric hand features; calculate each area percentage.
Comment: The extracted hand feature percentages were calculated successfully.

Table 23: Features extraction test

8.3.11 Detect motion

Test Id: 3.10
Test Scenario: Detect motion.
Input Data: The old x and y coordinates obtained at the recognition stage and the new x and y coordinates in the current frame.
Expected result: The system identifies the difference between the old and new coordinates.
Actual result: The system identifies the difference between the old and new coordinates.
Steps: Recognise the posture; get the x and y coordinates; stop recognition and start the motion detection stage; get the new x and y coordinates; detect the motion from the coordinate differences.
Comment: The motion was detected successfully.

Table 24: Motion detection test

8.3.12 Recognizing motion

Test Id: 3.11
Test Scenario: Motion recognition.
Input Data: The counts of the motion direction lists.
Expected result: The motion list with the highest count is found.
Actual result: The motion list with the highest count is found.
Steps: Recognise the posture; detect the motion; add to the particular motion direction list; calculate the highest count among the lists.
Comment: The motion was recognised successfully.

Table 25: Motion recognition test


8.3 Scalability testing

Scalability testing is a non-functional testing method for a software application. According to SmartBear (2016), scalability testing is performed as a series of load tests with different hardware and software settings, CPU speeds, servers, and numbers of users involved. This section discusses how well the system can recognise gestures and motion for different users' hand sizes, using 100 postures.

8.3.1 Recognizing gestures with different hand sizes in a good hand detection environment

Test Id: 3.1
Test Scenario: Different hand sizes in a good environment.
Hand size | False negative rate (FNR) | False positive rate (FPR)
Small | 98% | 2%
Medium | 100% | 0%
Large | 98% | 2%
Steps: Involve different users; hand detection; gesture recognition.
Comment: When hand detection gives the best accuracy, the application provides a better recognition process for different hand sizes.

Table 26: Scalability test 1

8.3.2 Recognizing gestures with different hand sizes against a less cluttered background

Test Id: 3.2
Test Scenario: Different hand sizes against a less cluttered background.
Hand size | False negative rate (FNR) | False positive rate (FPR)
Small | 90% | 10%
Medium | 98% | 2%
Large | 90% | 10%
Steps: Involve different users; hand detection; gesture recognition.
Comment: When the background is less cluttered, posture recognition gives a good average result. See test evaluation section 8.7.1.

Table 27: Scalability test 2

8.3.3 Recognizing gestures with different hand sizes against a heavily cluttered background

Test Id: 3.3
Test Scenario: Different hand sizes against a heavily cluttered background.
Hand size | False negative rate (FNR) | False positive rate (FPR)
Small | 10% | 90%
Medium | 10% | 90%
Large | 10% | 90%
Steps: Involve different users; hand detection; gesture recognition.
Comment: When hand detection gives poor accuracy, the application provides a poor recognition process. See test evaluation section 8.7.1.

Table 28: Scalability test 3


8.3.4 Recognizing motion in a good hand detection environment

Test Id: 3.4
Test Scenario: Different hand sizes in a good environment.
Hand size | False negative rate (FNR) | False positive rate (FPR)
Small | 100% | 0%
Medium | 100% | 0%
Large | 100% | 0%
Steps: Involve different users; hand detection; gesture recognition; detect and recognise motion.
Comment: When hand detection gives high accuracy, the application provides a better recognition process.

Table 29: Scalability test 4

8.3.5 Recognizing motion against a less cluttered background

Test Id: 3.5
Test Scenario: Different hand sizes against a less cluttered background.
Hand size | False negative rate (FNR) | False positive rate (FPR)
Small | 98% | 2%
Medium | 98% | 2%
Large | 98% | 2%
Steps: Involve different users; hand detection; gesture recognition; detect and recognise motion.
Comment: When hand detection gives slightly lower accuracy, the application still provides a good average motion recognition process. See test evaluation section 8.7.1.

Table 30: Scalability test 5

8.3.6 Recognizing motion against a heavily cluttered background

Test Id: 3.6
Test Scenario: Different hand sizes against a heavily cluttered background.
Hand size | False negative rate (FNR) | False positive rate (FPR)
Small | 10% | 90%
Medium | 10% | 90%
Large | 10% | 90%
Steps: Involve different users; hand detection; gesture recognition; detect and recognise motion.
Comment: When hand detection gives low accuracy, the application provides a poor motion recognition process. See test evaluation section 8.7.2.

Table 31: Scalability test 6


8.4 Test Environment

According to Khanduja (2008), a testing environment is the setup of software and hardware on which the newly built software product is tested. Basically, it consists of the physical setup, which includes the hardware, and a logical setup that contains the server operating system, client operating system, and database servers. This section discusses how the application responds in different environmental conditions, such as cluttered backgrounds, static backgrounds, and different lighting conditions.

8.4.1 Hand Detection from a cluttered Background – Level 1

Test Id: 4.1
Test Scenario: Hand detection from a cluttered background.
Expected result: All unnecessary background pixels are removed and the hand is detected without the arm.
Steps: Capture the background image and the image with the hand; apply background subtraction; convert the colour space; image thresholding.
Accuracy rate: 98%.
Comment: The application successfully detects the hand from the background. This test used 100 postures, and 2 images did not give the expected result.

Table 32: Environment test 1

8.4.2 Hand Detection from a cluttered Background – Level 2

Test Id: 4.2
Test Scenario: Hand detection from a cluttered background.
Expected result: All unnecessary background pixels are removed and the hand is detected without the arm.
Steps: Capture the background image and the image with the hand; apply background subtraction; convert the colour space; image thresholding.
Accuracy rate: 100% for all 100 test postures.
Comment: The application successfully detects the hand from the background.

Table 33: Environment test 2

8.4.3 Hand Detection from a cluttered Background – Level 3

Test Id: 4.3
Test Scenario: Hand detection from a cluttered background.
Expected result: All unnecessary background pixels are removed and the hand is detected without the arm.
Steps: Capture the background image and the image with the hand; apply background subtraction; convert the colour space; image thresholding.
Accuracy rate: 100% for all 100 test postures.
Comment: The application successfully detects the hand from the background.

Table 34: Environment test 3


8.4.4 Hand Detection from a cluttered Background – Level 4

Test Id: 4.4
Test Scenario: Hand detection from a cluttered background.
Expected result: All unnecessary background pixels are removed and the hand is detected without the arm.
Steps: Apply background subtraction; convert the colour space; image thresholding.
Accuracy rate: 98%.
Comment: The application successfully detects the hand from the background. This test used 100 postures, and 2 images did not give the expected result.

Table 35: Environment test 4

8.4.5 Hand Detection from a Static Background – Level 3

Test Id: 4.5
Test Scenario: Hand detection from a static background.
Expected result: All unnecessary background pixels are removed and the hand is detected without the arm.
Steps: Capture the background image and the image with the hand; apply background subtraction; convert the colour space; image thresholding.
Accuracy rate: 100% for all 100 test postures.
Comment: The application successfully detects the hand from the background.

Table 36: Environment test 5

8.4.6 Hand Detection from a Static Background – Level 1

Test Id: 4.6
Test Scenario: Hand detection from a static background.
Expected result: All unnecessary background pixels are removed and the hand is detected without the arm.
Steps: Capture the background image and the image with the hand; apply background subtraction; convert the colour space; image thresholding.
Accuracy rate: 100% for all 100 test postures.
Comment: The application successfully detects the hand from the background.

Table 37: Environment test 6

8.4.7 Hand Detection from a Static Background – Level 2

Test Id: 4.7
Test Scenario: Hand detection from a static background.
Expected result: All unnecessary background pixels are removed and the hand is detected without the arm.
Steps: Capture the background image and the image with the hand; apply background subtraction; convert the colour space; image thresholding.
Accuracy rate: 100% for all 100 test postures.
Comment: The application successfully detects the hand from the background.

Table 38: Environment test 7


8.4.8 Hand Detection from a Similar skin colour background – Level 3

Test Id: 4.8
Test Scenario: Hand detection from a skin-coloured background.
Expected result: All unnecessary background pixels are removed and the hand is detected.
Steps: Apply background subtraction; convert the colour space; image thresholding.
Accuracy rate: 5%.
Comment: When the background contains a skin-coloured region, hand detection gives poor accuracy. This test used 100 postures, and 95 images did not give the expected result. See test evaluation section 8.7.1.

Table 39: Environment test 8

8.4.9 Hand Detection from a Static Background – Level 4

Test Id: 4.9
Test Scenario: Hand detection from a static background.
Expected result: All unnecessary background pixels are removed and the hand is detected without the arm.
Steps: Apply background subtraction; convert the colour space; image thresholding.
Accuracy rate: 100% for all 100 test postures.
Comment: The application successfully detects the hand from the background.

Table 40: Environment test 9

8.4.10 Hand Detection under Artificial lighting – White light

Test Id: 4.10
Test Scenario: Hand detection under a white light condition.
Expected result: All unnecessary background pixels are removed and the hand is detected without the arm.
Steps: Apply background subtraction; convert the colour space; image thresholding.
Accuracy rate: 100% for all 100 test postures.
Comment: The application successfully detects the hand from the background.

Table 41: Environment test 10

8.4.11 Hand Detection under Artificial lighting – Yellow light

Test Id: 4.11
Test Scenario: Hand detection under a yellow light condition.
Expected result: All unnecessary background pixels are removed and the hand is detected without the arm.
Steps: Apply background subtraction; convert the colour space; image thresholding.
Accuracy rate: 100% for all 100 test postures.
Comment: The application successfully detects the hand from the background.

Table 42: Environment test 11


8.5 Accuracy testing

Accuracy testing focuses on how well the application achieves a clear hand detection under different wearable conditions.

8.5.1 Hand detection while wearing a ring

Test Id: 5.1
Test Scenario: Hand detection while wearing a ring.
Expected result: All unnecessary background pixels are removed and the hand is detected without the arm.
Steps: Apply background subtraction; convert the colour space; image thresholding.
Accuracy rate: 100% for all 100 test postures.
Comment: The application successfully detects the hand from the background.

Table 43: Accuracy test 1

8.5.2 Hand detection while wearing a blazer

Test Id: 5.2
Test Scenario: Hand detection while wearing a long-sleeved shirt.
Expected result: All unnecessary background pixels are removed and the hand is detected without the arm.
Steps: Apply background subtraction; convert the colour space; image thresholding.
Accuracy rate: 100% for all 100 test postures.
Comment: The application successfully detects the hand from the background.

Table 44: Accuracy test 2

8.5.3 Hand detection while wearing uncommon accessories

Test Id: 5.3
Test Scenario: Hand detection with hand bands.
Expected result: All unnecessary background pixels are removed and the hand is detected without the arm.
Steps: Capture the background image; apply background subtraction; convert the colour space; image thresholding.
Accuracy rate: Did not reach the expected level.
Comment: The application successfully detects the hand from the background, but when recognising the posture the arm is not removed successfully. See test evaluation section 8.7.3.

Table 45: Accuracy test 3

8.5.4 Hand detection while wearing a wrist watch

Test Id: 5.4
Test Scenario: Hand detection with a watch.
Expected result: All unnecessary background pixels are removed and the hand is detected without the arm.
Steps: Capture the background image; apply background subtraction; convert the colour space; image thresholding.
Accuracy rate: 100% for all 100 test postures.
Comment: The application successfully detects the hand from the background.

Table 46: Accuracy test 4

8.5.5 Hand detection while wearing a glove

Test Id: 5.5
Test Scenario: Hand detection with a surgical glove.
Expected result: All unnecessary background pixels are removed and the hand is detected without the arm.
Steps: Capture the background image; apply background subtraction; convert the colour space; image thresholding.
Accuracy rate: 100% for all 100 test postures.
Comment: The application successfully detects the hand from the background.

Table 47: Accuracy test 5


8.6 Performance testing

Performance testing is a type of testing intended to determine the responsiveness, throughput, reliability, and/or scalability of a system under a given workload. This section discusses how much time the application consumes in each core area of the system.
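The timings reported below can be collected with a simple sketch like the one that follows, using System.Diagnostics.Stopwatch; the commented stage call is a placeholder rather than the author's actual method.

using System;
using System.Diagnostics;

class StageTimer
{
    static void Main()
    {
        Stopwatch watch = Stopwatch.StartNew();
        // HandDetection.Detect(background, currentFrame);   // stage under test (placeholder)
        watch.Stop();
        Console.WriteLine("Stage took " + watch.ElapsedMilliseconds + " ms");
    }
}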

8.6.1 Performance of loading the background image

Test Id: 7.1
Test Scenario: Load background image.
Expected time (ms) | Actual minimum time (ms) | Actual maximum time (ms)
50 | 11 | 32
Steps: Capture the background image; load the background image.
Comment: The average time to load the background image is 21.4 milliseconds.

Table 48: Performance test 1

8.6.2 Performance of Hand Detection

Test Id: 7.2
Test Scenario: Hand detection.
Expected time (ms) | Actual minimum time (ms) | Actual maximum time (ms)
700 | 340 | 620
Steps: Capture the background image; capture the background image with the hand; hand detection.
Comment: The average time for hand detection is 252.2 milliseconds.

Table 49: Performance test 2

8.6.3 Performance of Gesture Recognition

Test Id: 6.3
Test Scenario: Gesture recognition.
Expected time (ms) | Actual minimum time (ms) | Actual maximum time (ms)
1000 | 655 | 980
Steps: Capture the background image; capture the background image with the hand; hand detection; gesture recognition.
Comment: The average time for gesture recognition is 778 milliseconds.

Table 50: Performance test 3

8.6.4 Performance of Motion Detection

Test Id: 6.4
Test Scenario: Motion detection.
Expected time (ms) | Actual minimum time (ms) | Actual maximum time (ms)
1200 | 876 | 1166
Steps: Capture the background image; capture the background image with the hand; hand detection; gesture recognition; motion detection.
Comment: The average time for motion detection is 1118.4 milliseconds.

Table 51: Performance test 4


8.6.5 Performance of Motion Recognition

Test Id: 6.5
Test Scenario: Motion recognition.
Expected time (ms) | Actual minimum time (ms) | Actual maximum time (ms)
500 | 244 | 366
Steps: Capture the background image; capture the background image with the hand; hand detection; gesture recognition; motion detection; motion recognition.
Comment: The average time for motion recognition is 302 milliseconds.

Table 52: Performance test 5

8.6.6 Overall Performance

Test Id: 6.6
Test Scenario: Overall performance.
Expected time (ms) | Actual minimum time (ms) | Actual maximum time (ms)
3500 | 2136 | 3164
Steps: Capture the background image; capture the background image with the hand; hand detection; gesture recognition; motion detection; motion recognition; call command.
Comment: The average overall time is 2670.83 milliseconds.

Table 53: Performance test 6

8.6.7 CPU usage and Memory usage

Test Id: 6.5
Test Scenario: Overall resource usage.
Usage | CPU Usage | Memory Usage
Expected usage | 8.5% | 60 MB
Actual usage | 6.7% | 40.1 MB
Comment: CPU and memory usage change depending on the number of applications processing in the background.

Table 54: Performance test 7

8.7 Test evaluation

8.7.1 Hand detection from different background

Hand detection performs differently depending on how cluttered the background is. When the background is in good condition, the application produces a near-perfect binary image of the detected hand; on the other hand, when the background is heavily cluttered, hand detection does not perform very well.

Figure 97 : Hand detection performance

Source: Author’s work (2016)


The application achieves the major task of hand detection through two algorithms: background subtraction and skin colour detection. To obtain better detection accuracy, the application needs images whose colours are clearly separated. If the user captures a cluttered background, it may contain areas or objects with pixels similar to skin colour, and detecting a clear hand with a normal webcam in such an environment does not give a usable output. To avoid these kinds of environmental issues, the author's suggested solution for future work is to use a sensor camera or a depth camera.

8.7.2 Gesture and Motion recognition

The requirements for gesture recognition and motion recognition are almost the same: to achieve better recognition accuracy, the application expects a clearly detected hand. Hand detection and recognition performance therefore move almost in parallel, as the chart below shows.

Figure 98: gesture recognition performance

Source: Author’s work (2016)


The diagram above shows that the hand detection and recognition levels move together. When hand detection produces a poor detection rate, recognition performance drops accordingly. Hand detection is therefore the vital function behind both motion and gesture recognition in this system.

8.7.3 Wearable conditions.

Testing showed that hand detection does not depend on ordinary wearable conditions such as hand bands, wrist watches, or rings; it gave clear results in those cases. However, when the user wears unnecessary items, the application does not manage to remove the arm during hand detection, as shown in test case 5.3, and these conditions may affect the gesture recognition process. Therefore, while using the system, the user should wear as few accessories as possible, or none at all.


CHAPTER 9

CRITICAL EVALUATION

This dissertation provides a solution for interacting with the PC without using a mouse or any other conventional input device. The author has managed to build a product with efficient algorithms related to the theories grasped during the course, thereby successfully filling gaps in the current systems.

Within a limited period of time, the author studied the different theories, such as hand detection, gesture recognition, motion detection, and motion recognition, that are required to implement the proposed system. The scope of the proposed solution was defined by the issues raised by the current systems and by the literature survey, which made this a challenging and time-consuming task.

9.1 Domain Research

During the domain research, the author identified drawbacks of similar systems, such as accuracy and performance, the range of suitable applications, and the limited set of recognisable gestures and motions, that required attention, and the author managed to address them through the proposed system. The author investigated online articles and user reviews to decide which kinds of applications the proposed system should be able to control. As a result, the author suggested three media applications, one presentation application, and one communication application: PowerPoint presentations, KMPlayer, VLC player, the image viewer, the system volume, and Skype. The author also recommended the Logitech C270 web camera based on its price, frames per second (fps), and resolution.

9.2 Technical research

During the technical research, the author studied the three main phases of the proposed system: hand detection, gesture recognition, and motion detection and recognition.


In past research papers, various techniques for hand detection were proposed, for example wearing gloves such as surgical gloves, using sensor cameras, or standing in front of a static coloured background. These methods build a wide gap in human-computer interaction (HCI). To avoid these issues, the author proposed a more suitable, accurate, and unencumbered method for hand detection based on image pre-processing algorithms. Five core algorithms are required to detect the hand: background subtraction, image smoothing, image thresholding, noise removal, and arm removal.

While selecting an approach for segmenting the hand from the background, the author considered different methods: temporal differencing, optical flow, and background subtraction. For the author's purposes, the temporal and optical-flow methods gave insufficient performance and accuracy, because those algorithms are commonly used for detecting objects when the camera itself is moving. After a critical study of the related research papers, the author concluded that background subtraction is the most accurate real-time detection method for a static camera set-up and is more widely used than the other approaches.
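
As a minimal sketch of the background subtraction idea (not the author's exact implementation), each pixel of the current frame can be compared against a stored background frame and marked as foreground when the difference exceeds a threshold. The Bitmap-based representation and the threshold value of 40 are illustrative assumptions.

using System;
using System.Drawing;

static class BackgroundSubtraction
{
    // Returns a binary mask: white where the current frame differs noticeably
    // from the stored background frame, black elsewhere.
    public static Bitmap Subtract(Bitmap background, Bitmap current, int diffThreshold = 40)
    {
        var mask = new Bitmap(current.Width, current.Height);
        for (int y = 0; y < current.Height; y++)
        {
            for (int x = 0; x < current.Width; x++)
            {
                Color b = background.GetPixel(x, y);
                Color c = current.GetPixel(x, y);

                // Largest per-channel absolute difference for this pixel.
                int diff = Math.Max(Math.Abs(c.R - b.R),
                           Math.Max(Math.Abs(c.G - b.G), Math.Abs(c.B - b.B)));

                mask.SetPixel(x, y, diff > diffThreshold ? Color.White : Color.Black);
            }
        }
        return mask;
    }
}

GetPixel and SetPixel are used here only for clarity; a real-time implementation would normally lock the bitmap data and process the raw bytes instead.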

After finalizing the background subtraction, the system required an algorithm to improve the reliability of hand detection, so the author chose skin colour detection as the next stage. Skin detection is expensive when performed directly in the RGB colour space, so the author examined different colour spaces suitable for skin colour detection, such as normalized RGB, YCbCr, YIQ, YUV, HSV, and YDbDr. In the research papers, the most widely used colour spaces were YCbCr and HSV. Comparing the two, YCbCr is effective under different lighting conditions and user regions, while HSV depends even less on the lighting and on the melanin level of the skin. The author chose the HSV colour space for skin detection because it was widely used in past research.
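
As a rough illustration of skin colour detection in HSV (a sketch only, using commonly quoted starting ranges rather than the author's calibrated values), a pixel can be classified as skin when its hue, saturation, and value fall inside fixed bounds:

using System.Drawing;

static class SkinDetection
{
    // Classifies a pixel as skin using illustrative HSV ranges; the bounds
    // below are assumptions for the example, not the author's tuned values.
    public static bool IsSkin(Color pixel)
    {
        float h = pixel.GetHue();        // 0..360 degrees
        float s = pixel.GetSaturation(); // 0..1
        float v = pixel.GetBrightness(); // 0..1

        return h <= 50f                  // reddish hues typical of skin
            && s >= 0.15f && s <= 0.70f  // moderate saturation
            && v >= 0.25f;               // reject very dark pixels
    }
}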


After converting the colour space, the image needs to be converted into a binary image. Image binarization, or thresholding, reduces the amount of data in the image and improves accuracy during posture recognition. The leading thresholding approaches today are Otsu's method and the k-means method. Comparing the two, k-means runs in less time, but it does not give better results than Otsu's method and its threshold value does not adapt to the environmental conditions. After a careful comparison, the author settled on Otsu's method, which gives better results than the other methods and derives its threshold value from the environmental conditions captured in the image.
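
As a minimal sketch of Otsu's method (the standard formulation, independent of the author's implementation), the threshold is chosen so that the between-class variance of the grayscale histogram is maximized:

static class Otsu
{
    // Takes a 256-bin grayscale histogram and returns the threshold that
    // maximizes the variance between the background and foreground classes.
    public static int Threshold(int[] histogram)
    {
        long total = 0, sumAll = 0;
        for (int i = 0; i < 256; i++) { total += histogram[i]; sumAll += (long)i * histogram[i]; }

        long sumBack = 0, weightBack = 0;
        double bestVariance = -1.0;
        int bestThreshold = 0;

        for (int t = 0; t < 256; t++)
        {
            weightBack += histogram[t];
            if (weightBack == 0) continue;

            long weightFore = total - weightBack;
            if (weightFore == 0) break;

            sumBack += (long)t * histogram[t];
            double meanBack = (double)sumBack / weightBack;
            double meanFore = (double)(sumAll - sumBack) / weightFore;

            // Between-class variance for this candidate threshold.
            double variance = (double)weightBack * weightFore
                              * (meanBack - meanFore) * (meanBack - meanFore);

            if (variance > bestVariance) { bestVariance = variance; bestThreshold = t; }
        }
        return bestThreshold;
    }
}

Pixels with a grayscale value above the returned threshold are then set to white and the rest to black, which corresponds to the binarization step described above.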

After thresholding, the image needs further optimization before the hand can be detected reliably, such as removing image noise and isolated pixels. For this the author used morphological operations, the most common binary-image optimization technique described in past research papers.
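
As an illustrative sketch only (assuming the binary image is held as a bool[,] mask, which is not necessarily the author's representation), the simplest morphological operation is an erosion with a 3x3 structuring element; pairing it with a matching dilation gives an opening that removes small noise blobs:

static class Morphology
{
    // Binary erosion with a 3x3 structuring element: a pixel stays white
    // only if its entire 3x3 neighbourhood is white.
    public static bool[,] Erode(bool[,] mask)
    {
        int height = mask.GetLength(0), width = mask.GetLength(1);
        var result = new bool[height, width];

        for (int y = 1; y < height - 1; y++)
        {
            for (int x = 1; x < width - 1; x++)
            {
                bool keep = true;
                for (int dy = -1; dy <= 1 && keep; dy++)
                    for (int dx = -1; dx <= 1 && keep; dx++)
                        if (!mask[y + dy, x + dx]) keep = false;
                result[y, x] = keep;
            }
        }
        return result;
    }
}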

Once all the image pre-processing steps had been carried out and a clear hand was detected, it became apparent that the detected region could still contain part of the arm, which is not required for recognition and would lessen the performance of gesture recognition.

Finding a proper arm removal algorithm was another challenging task, because most of the algorithms mentioned in research papers lacked the accuracy the author required for the proposed solution. Therefore, the author created a novel algorithm based on the proposed system's requirements, one able to handle a wide range of hand sizes.

Traditional gesture recognition approaches are based on templates or on some kind of separate dataset. To achieve good recognition accuracy, these datasets must be built and tested under different environmental conditions, lighting levels, and hand scales. The major drawback identified was that when the dataset becomes too large, recognition becomes too time-consuming; such approaches therefore do not fit the real-time application the author had proposed.


Having considered these issues, the author developed his own algorithm based on a mathematical framework, which responds within 1200 milliseconds including all image pre-processing steps.

After the gesture recognition process completes, the system needs to check whether the posture involves a motion or not. Most research papers follow frame differencing, in which the current frame is compared with the previous frame. The common issue is that frame differencing fails to measure the amount of motion. Hence, the author transformed the traditional frame-differencing approach into a coordinate-based mathematical approach. The system takes two input parameters: the x and y coordinates of the posture at the moment its recognition completes, and the x and y coordinates of the posture during motion detection. This algorithm can measure the amount of pixel displacement as well as detect the difference between frames.
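
A minimal sketch of such a coordinate-based check is shown below, assuming the posture position is summarized as a single point (for example a centroid); the Point type, the direction labels, and the 20-pixel threshold are illustrative assumptions rather than the author's exact algorithm:

using System;
using System.Drawing;

static class MotionCheck
{
    // Compares the posture position recorded when recognition completed with
    // its position during motion detection and classifies the displacement.
    public static string Classify(Point atRecognition, Point duringMotion, int minShift = 20)
    {
        int dx = duringMotion.X - atRecognition.X;
        int dy = duringMotion.Y - atRecognition.Y;

        if (Math.Abs(dx) < minShift && Math.Abs(dy) < minShift)
            return "None";                      // displacement too small to count as motion

        if (Math.Abs(dx) >= Math.Abs(dy))
            return dx > 0 ? "Right" : "Left";   // dominant horizontal movement
        return dy > 0 ? "Down" : "Up";          // dominant vertical movement
    }
}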

9.3 System Design

After finalizing all the requirements and the technologies to be used by the system, the design process was carried out. The design was documented using Unified Modeling Language (UML) diagrams, which allowed the author to implement the system with a proper understanding of its depth. At the beginning of the design process, the author broke the system down into three major roles: hand detection, gesture recognition, and motion detection and recognition. The user requirements and user interactions of each role were then captured in UML diagrams such as use-case diagrams, activity diagrams, and sequence diagrams. To describe the overall system, the author designed a block diagram, a class diagram, and a system architecture diagram, which were evaluated against design quality criteria to ensure the quality of the final design. While designing the class diagram, the author managed to create a reusable class structure for future development.


9.4 System implementation

During the implementation, the author applied software architectural principles in the form of design patterns and an architectural style: the Facade design pattern and the MVC (Model-View-Controller) architecture. Furthermore, the author managed the memory utilization aspects of the system during development.
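
As a rough, hypothetical illustration of how a Facade can simplify the pipeline for the MVC controller (every class and method name below is invented for the example and not taken from the author's code), the controller calls a single entry point instead of orchestrating each processing stage itself:

using System.Drawing;

// Stub stages standing in for the real pipeline components.
class HandDetector { public Point Detect(Bitmap frame) { return new Point(0, 0); } }
class GestureRecognizer { public string Recognize(Point handPosition) { return "OpenPalm"; } }

// The Facade hides the detection and recognition steps behind one method.
class HandPipelineFacade
{
    private readonly HandDetector detector = new HandDetector();
    private readonly GestureRecognizer recognizer = new GestureRecognizer();

    public string ProcessFrame(Bitmap frame)
    {
        Point handPosition = detector.Detect(frame);  // detection stage
        return recognizer.Recognize(handPosition);    // recognition stage
    }
}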

9.5 System testing

Testing is a critical process that must be completed for the proposed system in order to determine whether the implemented system provides the expected results. To test the system, the author covered testing strategies such as unit testing, scenario testing, performance testing, accuracy testing, scalability testing, and environment testing. For the scalability testing, the author used 100 different hand postures performed by four users, covering hand diversities common in Asian countries; for the environment testing, three different lighting conditions and eight different locations were chosen; and for the accuracy testing, the author wore five different wearable accessories. In addition, 58 test cases were implemented to determine whether the system and its features work as originally specified. Finally, comparing overall response time and behaviour under different lighting conditions, the proposed system achieved more than 90% accuracy, better than other existing similar systems.

In the environment testing it was understood that hand detection is largely independent of general background clutter; however, when the background contains many cluttered, skin-coloured regions, hand detection does not reach the expected accuracy level. To avoid these issues the author proposed using sensor cameras as a solution in future work. During testing the author also verified that gesture recognition and hand detection work in parallel: the proposed system recognized gestures across a wide range of hand sizes (small, medium, and large) whenever the hand was clearly detected.


When the user wears rings, a wrist watch, gloves, hand bands, or a blazer, the system still provides the expected hand detection accuracy. However, when the user wears uncommon, bulky hand bands, the arm removal does not function efficiently. For example, when the user wore too many hand bands, an unusual increase in arm width was detected, and for such postures the proposed arm removal algorithm was not able to identify the actual natural arm.

The author gained a remarkable learning experience by successfully completing the project. The experience obtained by performing the different project tasks over six months added great value to the author's professional and academic expertise.


CHAPTER 10

CONCLUSION

The main objective of this project was to introduce and develop a new input modality for controlling real-time applications using hand motions and gestures, which reduces the distance between user and computer created by traditional input methods such as the keyboard, the mouse, and other external devices.

The proposed system was implemented based on computer vision technology and is able to detect the user's hand, recognize the posture, detect motions, and recognize them. According to the gestures and motions given as input, the system controls the relevant function.

The research guided the author to implement reliable detection and recognition of motions and gestures within a short time, and it was confirmed during testing and evaluation that the project successfully met the expected standards.

The author would like to conclude by appreciating the personal experience gained during the project; it provided a great deal of knowledge throughout the research and analysis. The system was implemented in C#.NET, and many technical aspects such as image processing and other programming techniques and manipulations were learned during this project.

10.1 Further Development

Introduce new, modern applications for the system to control.

Introduce a voice recognition system as an additional control method.

Enable features that allow disabled persons to control this application.

Train the system for all hand diversities universally.


10.2 Limitation and assumption

The main limitation of the final application is that it detects the hand via vision-based technology, so the background must not contain skin-coloured regions and must be static. Secondly, the user has to use a separate web camera to access the application.

The assumption is that this application cannot be used by polydactyl people (who have more than five fingers on one hand).


REFERENCES

Albiol, A., Torres, L., & Delp, E. J. (2001, October). Optimum color spaces for

skin detection. In ICIP (1) (pp. 122-124).

AL-mohair, H. K., Mohamed-Saleh, J., & Suandi, S. A. Impact of color space on human skin color detection using an intelligent system.

Babu, A. A., Varma, S., & Nikhare, R. Hand gesture recognition system for human computer interaction using contour analysis.

Blunsom, P. (2004). Hidden markov models. Lecture notes, 15, 18-19.

Chen, F. S., Fu, C. M., & Huang, C. L. (2003). Hand gesture recognition using a

real-time tracking method and hidden Markov models. Image and vision

computing, 21(8), 745-758.

Chernov, V., Alander, J., & Bochko, V. (2015). Integer-based accurate conversion

between RGB and HSV color spaces. Computers & Electrical Engineering.

Choi, J., Seo, B. K., & Park, J. I. (2009, December). Robust hand detection for

augmented reality interface. In Proceedings of the 8th International Conference on

Virtual Reality Continuum and its Applications in Industry (pp. 319-321). ACM.

Choras, R. S. (2007). Image feature extraction techniques and their applications for

CBIR and biometrics systems. International journal of biology and biomedical

engineering, 1(1), 6-16.

Dahiya, D. (2010, June). Enterprise systems development: impact of various

software development methodologies. In Software Engineering and Data Mining

(SEDM), 2010 2nd International Conference on (pp. 117-122). IEEE.

Dardas, N. (2012). Real-time Hand Gesture Detection and Recognition for Human

Computer Interaction (Doctoral dissertation, University of Ottawa).


Dardas, N. H., & Georganas, N. D. (2011). Real-time hand gesture detection and

recognition using bag-of-features and support vector machine techniques.

Instrumentation and Measurement, IEEE Transactions on, 60(11), 3592-3607.

Dhawan, A., & Honrao, V. (2013). Implementation of Hand Detection based

Techniques for Human Computer Interaction. arXiv preprint arXiv:1312.7560.

Doya and Wang. 2004. A Basic Introduction to Neural Networks. [ONLINE]

Available at: http://scikit-learn.org. [Accessed 20 October 15].

Doya and Wang. 2015. Neural Networks. [ONLINE] Available at:

http://www.journals.elsevier.com. [Accessed 20 October 15].

Duan, H., & Luo, Y. (2013, March). A Method of Gesture Segmentation Based on

Skin Color and Background Difference Method. In Proceedings of the 2nd

International Conference on Computer Science and Electronics Engineering.

Atlantis Press

El-gayar, M. M., & Soliman, H. (2013). A comparative study of image low level

feature extraction algorithms. Egyptian Informatics Journal, 14(2), 175-181.

Ford, A., & Roberts, A. (1998). Color space conversions. Westminster University,

London, 1998, 1-31.

Forsyth, D. (2014). Object Detection with Discriminatively Trained Part-Based

Models. Computer, (2), 6-7.

Gonzalez, R. C. (2009). Digital image processing. Pearson Education India.

Greensted, A. (2009).Blob Detection. The Lab Book Pages,Image Sciences

Institute UMC Utrecht.

Hand. 2015. Healthline. [ONLINE] Available at: http://www.healthline.com.

[Accessed 21 October 15].


Haritaoglu, I., Harwood, D., & Davis, L. S. (1998). W 4 S: A real-time system for

detecting and tracking people in 2 1/2D. In Computer Vision—ECCV'98 (pp. 877-

892). Springer Berlin Heidelberg.

Hasan, M. M., & Mishra, P. K. (2012). Novel algorithm for multi hand detection

and geometric features extraction and recognition. International Journal of

Scientific & Engineering Research, 3(5).

Istqbexamcertification. 2015. What is Spiral model- advantages, disadvantages and

when to us. [ONLINE] Available at: http://istqbexamcertification.com. [Accessed

07 October 15].

Jie, Y., Yang, Y., Weiyu, Y., & Jiuchao, F. (2013, November). K-means multi-

threshold image segmentation based on firefly algorithm. In 3rd International

Conference on Multimedia Technology (ICMT-13). Atlantis Press.

Jun, H., & Hua, Z. (2008, May). A real time face detection method in human-

machine interaction. In Bioinformatics and Biomedical Engineering, 2008. ICBBE

2008. The 2nd International Conference on (pp. 1975-1978). IEEE.

Lamp. 2012. Why use SVM?. [ONLINE] Available at: http://www.yaksis.com.

[Accessed 28 October 15].

Leap motion. 2015. LeapMotion. [ONLINE] Available at:

https://www.leapmotion.com. [Accessed 21 October 15].

Li, H., Wang, Y., Liu, W., & Wang, X. (2013, August). Detection of static salient

objects based on visual attention and edge features. In Proceedings of the Fifth

International Conference on Internet Multimedia Computing and Service (pp. 252-

255). ACM.

Lionnie, R., Timotius, I. K., & Setyawan, I. (2011, July). An analysis of edge

detection as a feature extractor in a hand gesture recognition system based on

nearest neighbor. In Electrical Engineering and Informatics (ICEEI), 2011

International Conference on (pp. 1-4). IEEE.


Lipton, A., Kanade, T., Fujiyoshi, H., Duggins, D., Tsin, Y., Tolliver, D., &

Wixson, L. (2000). A system for video surveillance and monitoring (Vol. 2).

Pittsburg: Carnegie Mellon University, the Robotics Institute.

Liu, D. J., & Yu, J. (2009, August). Otsu method and K-means. In Hybrid

Intelligent Systems, 2009. HIS'09. Ninth International Conference on (Vol. 1, pp.

344-349). IEEE.

Liu, H., Duan, X., Zou, Y., & GAO, D. (2009, December). Detection of hands-

raising gestures using shape and edge features. In Robotics and Biomimetics

(ROBIO), 2009 IEEE International Conference on (pp. 1480-1483). IEEE.

Liu, N., & Lovell, B. C. (2001, January). MMX-accelerated real-time hand tracking

system. In IVCNZ 2001 (pp. 381-385).

Lockton, R. (2002). Hand gesture recognition using computer vision. 4th Year

Project Report, 1-69.

Lowe, D. G. (2004). Distinctive image features from scale-invariant

keypoints.International journal of computer vision, 60(2), 91-110.

MarksThinkTank. 2014. ControlAir. [ONLINE] Available at:

https://itunes.apple.com. [Accessed 21 October 15].

MarksThinkTank. 2014. google store. [ONLINE] Available at:

https://play.google.com. [Accessed 21 October 15].

Microsoft Corporation. 2015. Meet Kinect for Windows. [ONLINE] Available at:

http://www.microsoft.com. [Accessed 13 October 15].

Mittal, A., Zisserman, A., & Torr, P. H. (2011, September). Hand detection using

multiple proposals. In BMVC (pp. 1-11).

Mutha, S., & Kinage, K. S. (2015). Study on Hand Gesture Recognition. University

of Science and Technology Beijing, 1(4), 51-57


Ng, C. W., & Ranganath, S. (2002). Real-time gesture recognition system and

application. Image and Vision computing, 20(13), 993-1007.

Otsu, N. (1975). A threshold selection method from gray-level histograms.

Automatica, 11(285-296), 23-27.

Panwar, M. (2012, February). Hand gesture recognition based on shape parameters.

In Computing, Communication and Applications (ICCCA), 2012 International

Conference on (pp. 1-6). IEEE.

Park, H. (2008). A method for controlling mouse movement using a real-time

camera. Brown University, Providence, RI, USA, Department of computer science.

Pavani, S. K., Delgado, D., & Frangi, A. F. (2010). Haar-like features with

optimally weighted rectangles for rapid object detection. Pattern Recognition,

43(1), 160-172.

Pcadvisor. 2015. Logitech C270 HD Webcam review. [ONLINE] Available at:

http://www.pcadvisor.co.uk. [Accessed 05 October 15].

Pointgrab. 2010. pointgrap. [ONLINE] Available at: http://www.pointgrab.com/.

[Accessed 21 October 15].

Rautaray, S. S., & Agrawal, A. (2012). Real time hand gesture recognition system

for dynamic applications. Int J UbiComp, 3(1), 21-31.

Rivera. 1996. Strengths and weaknesses of hidden Markov models. [ONLINE]

Available at: http://compbio.soe.ucsc.edu. [Accessed 28 October 15].

Roomi, S. M. M., Priya, R. J., & Jayalakshmi, H. (2010). Hand gesture recognition

for human-computer interaction. Journal of Computer Science, 6(9), 1002-1007.

Rupe, J. (2005). Vision-based hand shape identification for sign language recognition. Rochester Institute of Technology.

Sheen. 2010. The nature of code. [ONLINE] Available at: http://natureofcode.com.

[Accessed 28 October 15].


Sparrow. 2015. What is the Spiral Model? [ONLINE] Available at:

http://www.ianswer4u.com. [Accessed 15 October 15].

Stauffer, C., & Grimson, W. E. L. (1999). Adaptive background mixture models

for real-time tracking. In Computer Vision and Pattern Recognition, 1999. IEEE

Computer Society Conference on. (Vol. 2). IEEE.

Stergiopoulou, E., & Papamarkos, P. (2006, October). A new technique for hand

gesture recognition. In Image Processing, 2006 IEEE International Conference

on (pp. 2657-2660). IEEE.

Stergiopoulou, E., Sgouropoulos, K., Nikolaou, N., Papamarkos, N., &

Mitianoudis, N. (2014). Real time hand detection in a complex background.

Engineering Applications of Artificial Intelligence, 35, 54-70.

Suresh, A., Upendar, R, & Ramakrishna, P. (2014). Real-time hand gesture

detection and recognition robot using ARM 7.International society of thesis

publishes, IEEE Transactions on, 1(3), 2321-2667.

Thirumuruganathan. 2010. A Detailed Introduction to K-Nearest Neighbor (KNN)

Algorithm. [ONLINE] Available at:

https://saravananthirumuruganathan.wordpress.com. [Accessed 28 October 15].

Uhlig. 2015. Advantages and Disadvantages of the Scrum Project Management

Methodology. [ONLINE] Available at: http://smallbusiness.chron.com. [Accessed

07 October 15].

Varwani, H., Choithwani, H., Sahatiya, K., Gangan, S., Gyanchandani, T., & Mane,

D. (2013). Understanding various Techniques for Background Subtraction and

Implementation of Shadow Detection. International Journal of Computer

Technology and Applications, 4(5), 822.

Vezhnevets, V., Sazonov, V., & Andreeva, A. (2003, September). A survey on

pixel-based skin color detection techniques. In Proc. Graphicon (Vol. 3, pp. 85-

92).


Vishwanathan, S., & Murty, M. N. (2002). SSVM: a simple SVM algorithm. In

Neural Networks, 2002. IJCNN'02. Proceedings of the 2002 International Joint

Conference on (Vol. 3, pp. 2393-2398). IEEE.

Waltz, F. M., & Miller, J. W. V. (1998). Efficient algorithm for gaussian blur using finite-state machines. In Photonics East (ISAM, VVDC, IEMB). International Society for Optics and Photonics.

Wang, Y. Q. (2014). An Analysis of the Viola-Jones face detection algorithm.

Image processing On Line, 4, 128-148.

Wu, T. F., Lin, C. J., & Weng, R. C. (2004). Probability estimates for multi-class

classification by pairwise coupling. The Journal of Machine Learning Research, 5,

975-1005.

Yoruk, E., Konukoglu, E., Sankur, B., & Darbon, J. (2006). Shape-based hand

recognition. Image Processing, IEEE Transactions on, 15(7), 1803-1815.

Zhang, Q., Chen, F., & Liu, X. (2008, July). Hand gesture detection and

segmentation based on difference background image with complex background.

In Embedded Software and Systems, 2008. ICESS'08. International Conference

on (pp. 338-343). IEEE.

Zhu, Y., Yang, Z., & Yuan, B. (2013, April). Vision Based Hand Gesture

Recognition. In Service Sciences (ICSS), 2013 International Conference on (pp.

260-265). IEEE.


APPENDIX A

8.2.1 Web Camera

Test Id: 2.1
Test Scenario: Capture image frames via the web camera.
Expected result: Capture images from the webcam.
Actual result: Capture images from the webcam.
Steps: Set up the background image. Start capturing images via the web camera.
Comment: Images successfully captured by the web camera.
Table 55: Unit test 1

8.2.2 Data Read and Write

Test Id: 2.2
Test Scenario: Image saving
Input Data: Background image
Expected result: Save the background image.
Actual result: Save the background image.
Steps: Capture the background image. Save the background image.
Comment: Background image saved successfully.
Table 56: Unit test 2

Test Id: 2.3
Test Scenario: Image loading
Input Data: Background image
Expected result: Load the background image.
Actual result: Load the background image.
Steps: Capture the background image. Save the background image. Load the background image.
Comment: Background image loaded successfully.
Table 57: Unit test 3


8.2.3 Image Pre-processing

Test Id: 2.4
Test Scenario: Calculate histogram value
Input Data: Grayscale image
Expected result: Calculates the histogram value.
Actual result: Calculates the histogram value.
Steps: Capture the image. Background subtraction. Colour space conversion. Convert the image to grayscale. Calculate the histogram value.
Comment: Histogram value calculated successfully.
Table 58: Unit test 4

Test Id: 2.5
Test Scenario: Calculate thresholding value
Input Data: Grayscale image
Expected result: Calculates the thresholding value.
Actual result: Calculates the thresholding value.
Steps: Capture the image. Background subtraction. Colour space conversion. Convert the image to grayscale. Calculate the histogram value. Finalize the thresholding value.
Comment: Thresholding value calculated successfully.
Table 59: Unit test 5

Test Id: 2.6
Test Scenario: Convert red pixel values into the hue component.
Input Data: Smoothed image
Expected result: Converts all red pixel values into the hue component.
Actual result: Converts all red pixel values into the hue component.
Steps: Capture the image. Background subtraction. Colour space conversion.
Comment: Red values successfully converted to hue.
Table 60: Unit test 6


Test Id: 2.7
Test Scenario: Convert green pixel values into the saturation component.
Input Data: Smoothed image
Expected result: Converts all green pixel values into the saturation component.
Actual result: Converts all green pixel values into the saturation component.
Steps: Capture the image. Background subtraction. Colour space conversion.
Comment: Green values successfully converted to saturation.
Table 61: Unit test 7

Test Id: 2.8
Test Scenario: Convert blue pixel values into the value component.
Input Data: Smoothed image
Expected result: Converts all blue pixel values into the value component.
Actual result: Converts all blue pixel values into the value component.
Steps: Capture the image. Background subtraction. Colour space conversion.
Comment: Blue values successfully converted to value.
Table 62: Unit test 8

Test Id: 2.9
Test Scenario: Convert the image into a 3x3 matrix.
Input Data: Captured image
Expected result: 3x3 image matrix
Actual result: 3x3 image matrix
Steps: Capture the image. Smooth the image.
Comment: Image successfully converted into a 3x3 matrix.
Table 63: Unit test 9


Test Id: 2.10
Test Scenario: Image pixel differences
Input Data: Images with and without the hand
Expected result: Calculate the pixel difference between both images.
Actual result: Calculate the pixel difference between both images.
Steps: Capture the image. Smooth the image. Background subtraction.
Comment: Difference of each pixel calculated successfully.
Table 64: Unit test 10

8.2.4 Gesture Recognition

Test Id: 2.11
Test Scenario: Arm removal
Input Data: Hand-detected image
Expected result: Remove the arm.
Actual result: Remove the arm.
Steps: Hand detection. Arm removal.
Comment: Arm successfully removed from the detected posture.
Table 65: Unit test 11

Test Id: 2.12
Test Scenario: Image rotation
Input Data: Hand-detected image
Expected result: Rotate the detected image.
Actual result: Rotate the detected image.
Steps: Hand detection. Rotate the image.
Comment: Image rotated successfully.
Table 66: Unit test 12

Page 169: FYP - Gayan Denaindra (Cb005044)

150

Test Id: 2.13
Test Scenario: Calculate upper and lower pixels
Input Data: Hand-detected image
Expected result: Calculates the upper and lower pixels.
Actual result: Calculates the upper and lower pixels.
Steps: Hand detection. Rotate the image. Calculate the upper and lower pixels.
Comment: Upper and lower pixels calculated successfully.
Table 67: Unit test 13

Test Id: 2.14
Test Scenario: Calculate total white pixels
Input Data: Hand-detected image
Expected result: Calculates the total white pixels.
Actual result: Calculates the total white pixels.
Steps: Hand detection. Rotate the image. Count the white pixels in the image.
Comment: Total white pixels calculated successfully.
Table 68: Unit test 14

Test Id: 2.15
Test Scenario: Extract hand features
Input Data: Hand-detected image
Expected result: Extracts the hand features.
Actual result: Extracts the hand features.
Steps: Hand detection. Rotate the image. Calculate the upper and lower pixels. Extract the features of the detected hand.
Comment: Hand features extracted successfully.
Table 69: Unit test 15


8.2.5 Motion Detection

Test Id: 2.16
Test Scenario: Calculate the differences of the x and y coordinates.
Input Data: x, y coordinates during gesture recognition and after recognition has completed
Expected result: Calculates the difference between the two postures.
Actual result: Calculates the difference between the two postures.
Steps: Hand detection. Recognize the posture. Recognize the motion. Calculate the differences of the x and y coordinates.
Comment: Differences of the x and y coordinates calculated successfully.
Table 70: Unit test 16

Test Id: 2.17
Test Scenario: Add the motion value into the relevant list.
Input Data: x, y coordinates during gesture recognition and after recognition has completed
Expected result: Adds the motion value to the relevant list.
Actual result: Adds the motion value to the relevant list.
Steps: Hand detection. Recognize the posture. Recognize the motion. Calculate the differences of the x and y coordinates. Add the motion value to the relevant list.
Comment: Motion values successfully added to each relevant list.
Table 71: Unit test 17


8.2.6 Execute command

Test Id: 2.18
Test Scenario: Execute command
Input Data: 35 hand-detected postures
Expected result: Execute the relevant application command.
Actual result: Execute the relevant application command.
Steps: Hand detection. Recognize the gesture. Detect and recognize the motion. Execute the relevant command.
Comment: Command belonging to the relevant gesture and motion executed successfully.
Table 72: Unit test 18


APPENDIX B


APPENDIX C

User guide

Figure 99: Step 1 guide

Source: Author’s work (2016)


Figure 100: Step 2 guide

Source: Author’s work (2016)

Figure 101: Step 3 guide

Source: Author’s work (2016)


Figure 102: Step 4 guide

Source: Author’s work (2016)

Figure 103: Step 5 guide

Source: Author’s work (2016)


Application interfaces

Figure 104: KMP player

Source: Author’s work (2016)


Figure 105: VLC player

Source: Author’s work (2016)

Figure 106: Power point presentation

Source: Author’s work (2016)


Figure 107: Skype

Source: Author’s work (2016)

Figure 108: Image viewer

Source: Author’s work (2016)


APPENDIX D

Interview Questions

1. Name :-

2. Location :-

3. Do you have enough knowledge about real-time image processing applications?

4. What are the algorithms you developed yourselves?

5. What characteristics do you look for to recognize the user's hand?

6. Why do you say this application is better than other existing applications?

7. Why do you say your recognition algorithm is better than other approaches?