Gesture Interfaces Project



    Hi All

I reckon most of you are well on your way, having studied and collected enough resources and materials, or tweaked around with things, to propose a valid system. Others who have been hand-waving, or who plan to do some last-minute mash-up: best of luck, and I wish you immense concentration in the next 24 hours.

Below I have reiterated what the output should be. Submissions will be evaluated by myself and another faculty member (so objectivity in judgement is ensured).

A few people who are off to meet nomads in Bijapur have asked me for an extension of a few days, which is OK; it also means I expect something cooler from them, as they will have time on the train back and forth.

For the rest, I am fixing a deadline of Tuesday at 10 am. Please email me a PDF attachment with the subject "Multimodal Submission".

    GUIDELINES >>>>>

It is intended to reinforce, by experience, your understanding of the lecture topics. It has intentionally been made quite open-ended in terms of the choice of system and methods used. You have the following tasks to complete:

Choose a new technology or system to investigate. Justify your choice, citing some relevant previous research and commercial systems that have been developed.

You should write a maximum of 1500 words (minimum font size 12pt; minimum page borders 2cm) that describes the system you have chosen to investigate, why you chose that system, what issues in Multimodal Interaction it will let you explore, and what previous research has been done on these issues.

If you have a prototype/output, please include screenshots as work in progress.

    Choosing a system

You should pick an example of technology, preferably new. This could be something like a touch-screen mobile phone, a gesture recognition or voice-command-activated system, a new display or interaction device, a computer game, etc. Alternatively, it could be a computerised interface for an everyday task, such as a machine that sells train tickets. It could be a web-based interface such as a library catalogue. Your choice should consider the following factors:

Is this system new? Has it been introduced as an improvement that is intended to make interaction easier? Might you be able to compare the new and old system in your evaluation?

Is there a well-defined, but not completely trivial, task that this system is meant to be used for? Note, your system might have multiple applications, but you should focus on one specific task for the purpose of this assignment.

Will it be practical for you to use this system for an evaluation?

Note that difficulty with any of the above should lead you to reconsider and choose another system. You will not be given any special concessions on the remainder of the assignment for having made a poor choice.

    Previous research

You should look for previously published research (in the form of journal or conference articles) about the system you are investigating. For example, what previous investigations have been done comparing gesture recognition with other interaction methods? What criteria have been proposed for assessing the usability of websites?

    Hi All

To clear any confusion in my previous email (see below), I stand corrected!! (These are rough guidelines to aid you in the research process.)

Of course, your idea/concept is the focus. "New technology and investigation of a system" is not the main focus.

All I am saying is: please investigate and research the modules that your system will need by looking into previous work/research, and also talk about them. This will make sure that your idea is indeed feasible.

Regards,
Sharath

Gesture-based computing on the cheap

With a single piece of inexpensive hardware (a multicolored glove), MIT researchers are making Minority Report-style interfaces more accessible.

    Larry Hardesty, MIT News Office


The hardware for a new gesture-based computing system consists of nothing more than an ordinary webcam and a pair of brightly colored Lycra gloves.
Photo: Jason Dorfman/CSAIL
May 20, 2010


Ever since Steven Spielberg's 2002 sci-fi movie Minority Report, in which a black-clad Tom Cruise stands in front of a transparent screen manipulating a host of video images simply by waving his hands, the idea of gesture-based computer interfaces has captured the imagination of technophiles. Academic and industry labs have developed a host of prototype gesture interfaces, ranging from room-sized systems with multiple cameras to detectors built into laptop screens. But MIT researchers have developed a system that could make gestural interfaces much more practical. Aside from a standard webcam, like those found in many new computers, the system uses only a single piece of hardware: a multicolored Lycra glove that could be manufactured for about a dollar.

Other prototypes of low-cost gestural interfaces have used reflective or colored tape attached to the fingertips, "but that's 2-D information," says Robert Wang, a graduate student in the Computer Science and Artificial Intelligence Laboratory who developed the new system together with Jovan Popović, an associate professor of electrical engineering and computer science. "You're only getting the fingertips; you don't even know which fingertip [the tape] is corresponding to." Wang and Popović's system, by contrast, can translate gestures made with a gloved hand into the corresponding gestures of a 3-D model of the hand on screen, with almost no lag time. "This actually gets the 3-D configuration of your hand and your fingers," Wang says. "We get how your fingers are flexing."

The most obvious application of the technology, Wang says, would be in video games: gamers navigating a virtual world could pick up and wield objects simply by using hand gestures. But Wang also imagines that engineers and designers could use the system to more easily and intuitively manipulate 3-D models of commercial products or large civic structures.

Robert Wang demonstrates the speed and precision with which the system can gauge hand position in three dimensions (including the flexing of individual fingers) as well as a possible application in mechanical engineering.
Video: Robert Y. Wang / Jovan Popović

    Patchwork approach

The glove went through a series of designs, with dots and patches of different shapes and colors, but the current version is covered with 20 irregularly shaped patches that use 10 different colors. The number of colors had to be restricted so that the system could reliably distinguish the colors from each other, and from those of background objects, under a range of different lighting conditions. The arrangement and shapes of the patches were chosen so that the front and back of the hand would be distinct, but also so that collisions of similar-colored patches would be rare. For instance, Wang explains, the colors on the tips of the fingers could be repeated on the back of the hand, but not on the front, since the fingers would frequently be flexing and closing in front of the palm.
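As an aside, the color-patch idea lends itself to a simple segmentation step. The sketch below is not the authors' code; it is a minimal, hedged illustration of how a handful of distinct glove colors might be isolated from a webcam frame with HSV thresholding in OpenCV. The HSV ranges are placeholder values that would need tuning for a real glove and real lighting.

    # Illustrative sketch (not the authors' code): segment a few distinct
    # glove colors from a webcam frame using HSV thresholding with OpenCV.
    import cv2
    import numpy as np

    # Hypothetical HSV ranges for a handful of patch colors: (lower, upper)
    PATCH_COLORS = {
        "red":    ((0,   120, 80), (8,   255, 255)),
        "green":  ((45,  120, 80), (75,  255, 255)),
        "blue":   ((100, 120, 80), (130, 255, 255)),
        "yellow": ((22,  120, 80), (32,  255, 255)),
    }

    def segment_patches(frame_bgr):
        """Return a dict of binary masks, one per named patch color."""
        hsv = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2HSV)
        masks = {}
        for name, (lo, hi) in PATCH_COLORS.items():
            mask = cv2.inRange(hsv, np.array(lo, np.uint8), np.array(hi, np.uint8))
            # Remove speckle noise so only coherent patches survive.
            mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, np.ones((5, 5), np.uint8))
            masks[name] = mask
        return masks

    cap = cv2.VideoCapture(0)          # default webcam
    ok, frame = cap.read()
    if ok:
        masks = segment_patches(frame)
        print({name: int(cv2.countNonZero(m)) for name, m in masks.items()})
    cap.release()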

Technically, the other key to the system is a new algorithm for rapidly looking up visual data in a database, which Wang says was inspired by the recent work of Antonio Torralba, the Esther and Harold E. Edgerton Associate Professor of Electrical Engineering and Computer Science in MIT's Department of Electrical Engineering and Computer Science and a member of CSAIL. Once a webcam has captured an image of the glove, Wang's software crops out the background, so that the glove alone is superimposed upon a white background. Then the software drastically reduces the resolution of the cropped image, to only 40 pixels by 40 pixels. Finally, it searches through a database containing myriad 40-by-40 digital models of a hand, clad in the distinctive glove, in a range of different positions. Once it's found a match, it simply looks up the corresponding hand position. Since the system doesn't have to calculate the relative positions of the fingers, palm, and back of the hand on the fly, it's able to provide an answer in a fraction of a second.
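To make the lookup step concrete, here is a minimal sketch under stated assumptions: the cropped glove image is downsampled to 40x40 and matched against a database of (image, pose) pairs by brute-force nearest neighbour. The toy database, descriptor format and 26-angle pose vector are stand-ins; the real system searched a far larger database of glove images and used a faster lookup scheme than brute force.

    # Minimal sketch of the lookup step described above: downsample the cropped
    # glove image to 40x40 and find the nearest entry in a database of
    # (image, hand_pose) pairs. The database here is a random stand-in.
    import numpy as np
    import cv2

    def to_descriptor(cropped_bgr):
        """Downsample a cropped glove image to 40x40 and flatten it."""
        small = cv2.resize(cropped_bgr, (40, 40), interpolation=cv2.INTER_AREA)
        return small.astype(np.float32).ravel() / 255.0

    def nearest_pose(descriptor, db_descriptors, db_poses):
        """Brute-force nearest-neighbour search over the pose database."""
        dists = np.linalg.norm(db_descriptors - descriptor, axis=1)
        return db_poses[int(np.argmin(dists))]

    # Toy database: 1000 random "images" paired with random joint-angle vectors.
    rng = np.random.default_rng(0)
    db_descriptors = rng.random((1000, 40 * 40 * 3), dtype=np.float32)
    db_poses = rng.random((1000, 26))        # e.g. 26 joint angles per pose

    query = (rng.random((120, 120, 3)) * 255).astype(np.uint8)   # pretend cropped glove
    pose = nearest_pose(to_descriptor(query), db_descriptors, db_poses)
    print(pose.shape)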

Of course, a database of 40-by-40 color images takes up a large amount of memory (several hundred megabytes, Wang says). But today, a run-of-the-mill desktop computer has four gigabytes, or 4,000 megabytes, of high-speed RAM. "And that number is only going to increase," Wang says.

    Changing the game

"People have tried to do hand tracking in the past," says Paul Kry, an assistant professor at the McGill University School of Computer Science. "It's a horribly complex problem. I can't say that there's any work in purely vision-based hand tracking that stands out as being successful, although many people have tried. It's sort of changing the game a bit to say, 'Hey, okay, I'll just add a little bit of information (the color of the patches) and I can go a lot farther than these purely vision-based techniques.'" Kry particularly likes the ease with which Wang and Popović's system can be calibrated to new users. Since the glove is made from stretchy Lycra, it can change size significantly from one user to the next; but in order to gauge the glove's distance from the camera, the system has to have a good sense of its size. To calibrate the system, the user simply places an 8.5-by-11-inch piece of paper on a flat surface in front of the webcam, presses his or her hand against it, and in about three seconds, the system is calibrated.
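The article does not spell out the calibration math, but a common way to recover scale from an object of known physical size is the pinhole relation Z = f * W / w, where f is the focal length in pixels, W the real width and w the observed width in pixels. The sketch below is a hedged illustration of that relation only; the focal length and pixel width used are assumed values, not figures from the paper.

    # Hedged sketch of one way scale calibration can work: under a pinhole
    # camera model, an object of known physical width W appearing w pixels wide
    # at focal length f (in pixels) is at distance Z = f * W / w.
    def distance_from_known_width(focal_px, real_width_m, observed_width_px):
        """Estimate distance to an object of known width under a pinhole model."""
        return focal_px * real_width_m / observed_width_px

    # A US-letter sheet is 0.2794 m wide (11 in); suppose it spans 600 pixels
    # in a camera with an (assumed) focal length of 800 pixels.
    z = distance_from_known_width(focal_px=800, real_width_m=0.2794, observed_width_px=600)
    print(f"estimated distance: {z:.2f} m")   # ~0.37 m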

Wang initially presented the glove-tracking system at last year's Siggraph, the premier conference on computer graphics. But at the time, he says, the system took nearly a half-hour to calibrate, and it didn't work nearly as well in environments with a lot of light. Now that the glove tracking is working well, however, he's expanding on the idea, with the design of similarly patterned shirts that can be used to capture information about whole-body motion. Such systems are already commonly used to evaluate athletes' form or to convert actors' live performances into digital animations, but a system based on Wang and Popović's technique could prove dramatically cheaper and easier to use.


Gesture recognition is a topic in computer science and language technology with the goal of interpreting human gestures via mathematical algorithms. Gestures can originate from any bodily motion or state but commonly originate from the face or hand. Current focuses in the field include emotion recognition from the face and hand gesture recognition. Many approaches have been made using cameras and computer vision algorithms to interpret sign language. However, the identification and recognition of posture, gait, proxemics, and human behaviors is also the subject of gesture recognition techniques.[1]

Gesture recognition can be seen as a way for computers to begin to understand human body language, thus building a richer bridge between machines and humans than primitive text user interfaces or even GUIs (graphical user interfaces), which still limit the majority of input to keyboard and mouse.

Gesture recognition enables humans to interface with the machine (HMI) and interact naturally without any mechanical devices. Using the concept of gesture recognition, it is possible to point a finger at the computer screen so that the cursor will move accordingly. This could potentially make conventional input devices such as mice, keyboards and even touch-screens redundant.
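To illustrate the point-to-move idea, the sketch below maps a tracked fingertip position (in normalized camera coordinates) onto screen coordinates and moves the OS cursor there. It is a minimal sketch, not part of any system described here: get_fingertip() is a hypothetical stand-in for whatever tracker supplies the fingertip position, and pyautogui is just one convenient way to move the cursor.

    # Minimal sketch, not a complete system: map a tracked fingertip position
    # (normalized camera coordinates in [0, 1]) to screen coordinates and move
    # the OS cursor there. get_fingertip() is a hypothetical stand-in tracker.
    import pyautogui

    SCREEN_W, SCREEN_H = pyautogui.size()

    def get_fingertip():
        # Placeholder: a real implementation would return the tracked fingertip.
        return 0.5, 0.5

    def point_to_cursor():
        x_norm, y_norm = get_fingertip()
        # Mirror x so that moving the hand right moves the cursor right
        # when facing the camera.
        pyautogui.moveTo((1.0 - x_norm) * SCREEN_W, y_norm * SCREEN_H)

    point_to_cursor()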

Although this technology is still in its infancy, applications are beginning to appear. Flutter, a start-up out of Palo Alto, CA, lets anyone with a Mac/Windows computer and a webcam download an app that allows them to control music and video apps such as Spotify, iTunes, Windows Media Player, QuickTime, and VLC using gestures.

Gesture recognition can be conducted with techniques from computer vision and image processing. The literature includes ongoing work in the computer vision field on capturing gestures or, more generally, human pose and movements by cameras connected to a computer.[2][3][4][5]

    Gesture recognition and pen computing:


In some literature, the term gesture recognition has been used to refer more narrowly to non-text-input handwriting symbols, such as inking on a graphics tablet, multi-touch gestures, and mouse gesture recognition. This is computer interaction through the drawing of symbols with a pointing device cursor (see the discussion at Pen computing).


Gesture types

In computer interfaces, two types of gestures are distinguished:[6] we consider online gestures, which can also be regarded as direct manipulations like scaling and rotating. In contrast, offline gestures are usually processed after the interaction is finished; e.g. a circle is drawn to activate a context menu.

Offline gestures: Those gestures that are processed after the user's interaction with the object. An example is the gesture drawn to activate a menu (a minimal classification sketch follows this list).

Online gestures: Direct manipulation gestures. They are used to scale or rotate a tangible object.
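As a concrete, purely illustrative example of the offline case, the sketch below waits until a stroke is complete and then decides whether the recorded point trail is roughly a circle, by checking how much the point-to-centroid distances vary. The 0.2 tolerance is an arbitrary threshold, not taken from any cited system.

    # Minimal sketch of an *offline* gesture check: after the stroke ends,
    # decide whether the recorded (x, y) trail is roughly a circle. The
    # centroid serves as the circle centre; the spread of point-to-centre
    # distances is the test.
    import numpy as np

    def is_circle(points, tol=0.2):
        pts = np.asarray(points, dtype=float)
        centre = pts.mean(axis=0)
        radii = np.linalg.norm(pts - centre, axis=1)
        mean_r = radii.mean()
        if mean_r == 0:
            return False
        # A circular trail has a nearly constant radius around the centroid.
        return radii.std() / mean_r < tol

    theta = np.linspace(0, 2 * np.pi, 50)
    circle_trail = np.c_[np.cos(theta), np.sin(theta)] + 0.02 * np.random.randn(50, 2)
    line_trail = np.c_[np.linspace(0, 1, 50), np.linspace(0, 1, 50)]
    print(is_circle(circle_trail), is_circle(line_trail))   # True False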

Uses

Gesture recognition is useful for processing information from humans that is not conveyed through speech or typing. There are also various types of gestures which can be identified by computers.

Sign language recognition. Just as speech recognition can transcribe speech to text, certain types of gesture recognition software can transcribe the symbols represented through sign language into text.[7]

For socially assistive robotics. By using proper sensors (accelerometers and gyros) worn on the body of a patient and by reading the values from those sensors, robots can assist in patient rehabilitation. The best example is stroke rehabilitation.

Directional indication through pointing. Pointing has a very specific purpose in our society: to reference an object or location based on its position relative to ourselves. The use of gesture recognition to determine where a person is pointing is useful for identifying the context of statements or instructions. This application is of particular interest in the field of robotics,[8] and a minimal pointing-ray sketch appears after this list.

Control through facial gestures. Controlling a computer through facial gestures is a useful application of gesture recognition for users who may not physically be able to use a mouse or keyboard. Eye tracking in particular may be of use for controlling cursor motion or focusing on elements of a display.

Alternative computer interfaces. Foregoing the traditional keyboard and mouse setup to interact with a computer, strong gesture recognition could allow users to accomplish frequent or common tasks using hand or face gestures to a camera.[9][10][11][12][13]

Immersive game technology. Gestures can be used to control interactions within video games to try and make the game player's experience more interactive or immersive.

Virtual controllers. For systems where the act of finding or acquiring a physical controller could require too much time, gestures can be used as an alternative control mechanism. Controlling secondary devices in a car, or controlling a television set, are examples of such usage.[14]

Affective computing. In affective computing, gesture recognition is used in the process of identifying emotional expression through computer systems.

Remote control. Through the use of gesture recognition, "remote control with the wave of a hand" of various devices is possible. The signal must not only indicate the desired response, but also which device is to be controlled.[15][16][17]
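Relating to the directional-pointing item above, one common geometric treatment is to cast a ray from the wrist through the fingertip and intersect it with a known plane (a screen or a table top). The sketch below is a hedged illustration of that idea; all coordinates are arbitrary example values in metres.

    # Hedged sketch: turn a pointing gesture into a target by casting a ray
    # from the wrist through the fingertip and intersecting it with a plane.
    import numpy as np

    def pointing_target(wrist, fingertip, plane_point, plane_normal):
        """Intersect the wrist-to-fingertip ray with a plane; None if parallel."""
        wrist, fingertip = np.asarray(wrist, float), np.asarray(fingertip, float)
        p0, n = np.asarray(plane_point, float), np.asarray(plane_normal, float)
        direction = fingertip - wrist
        denom = direction.dot(n)
        if abs(denom) < 1e-9:
            return None                      # ray is parallel to the plane
        t = (p0 - wrist).dot(n) / denom
        return None if t < 0 else wrist + t * direction

    # Wrist at the origin, fingertip slightly forward; the screen is the plane z = 2.
    target = pointing_target([0, 0, 0], [0.05, 0.02, 0.2], [0, 0, 2], [0, 0, 1])
    print(target)    # point on the screen plane the user is pointing at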

Input devices

The ability to track a person's movements and determine what gestures they may be performing can be achieved through various tools. Although a large amount of research has been done on image/video-based gesture recognition, there is some variation in the tools and environments used between implementations.

Wired gloves. These can provide input to the computer about the position and rotation of the hands using magnetic or inertial tracking devices. Furthermore, some gloves can detect finger bending with a high degree of accuracy (5-10 degrees), or even provide haptic feedback to the user, which is a simulation of the sense of touch. The first commercially available hand-tracking glove-type device was the DataGlove,[18] which could detect hand position, movement and finger bending. It uses fiber-optic cables running down the back of the hand. Light pulses are created and, when the fingers are bent, light leaks through small cracks and the loss is registered, giving an approximation of the hand pose.

Depth-aware cameras. Using specialized cameras such as structured light or time-of-flight cameras, one can generate a depth map of what is being seen through the camera at short range, and use this data to approximate a 3D representation of what is being seen. These can be effective for detection of hand gestures due to their short-range capabilities.[19]

Stereo cameras. Using two cameras whose relations to one another are known, a 3D representation can be approximated from the output of the cameras. To get the cameras' relations, one can use a positioning reference such as a lexian-stripe or infrared emitters.[20] In combination with direct motion measurement (6D-Vision), gestures can be detected directly. A minimal depth-from-disparity sketch appears after this list.

Controller-based gestures. These controllers act as an extension of the body so that when gestures are performed, some of their motion can be conveniently captured by software. Mouse gestures are one such example, where the motion of the mouse is correlated to a symbol being drawn by a person's hand, as is the Wii Remote, which can study changes in acceleration over time to represent gestures.[21][22][23] Devices such as the LG Electronics Magic Wand, the Loop and the Scoop use Hillcrest Labs' Freespace technology, which uses MEMS accelerometers, gyroscopes and other sensors to translate gestures into cursor movement. The software also compensates for human tremor and inadvertent movement.[24][25][26]

Single camera. A normal camera can be used for gesture recognition where the resources/environment would not be convenient for other forms of image-based recognition. It was earlier thought that a single camera may not be as effective as stereo or depth-aware cameras, but a start-up based out of Palo Alto named Flutter is challenging this assumption. It has released an app that can be downloaded to any Windows/Mac computer with a built-in webcam, making the approach accessible to a wider audience.[27]
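As promised under the stereo-camera item above, the sketch below shows the basic geometry such setups rely on: for a rectified stereo pair with focal length f (in pixels) and baseline B (in metres), a feature seen at horizontal positions xL and xR has disparity d = xL - xR and depth Z = f * B / d. The focal length, baseline and pixel positions used here are illustrative assumptions.

    # Sketch of the geometry behind the stereo-camera item above: depth of a
    # matched feature from its disparity in a rectified stereo pair.
    def stereo_depth(focal_px, baseline_m, x_left_px, x_right_px):
        """Depth Z = f * B / d for disparity d = x_left - x_right."""
        disparity = x_left_px - x_right_px
        if disparity <= 0:
            raise ValueError("feature must have positive disparity")
        return focal_px * baseline_m / disparity

    # A fingertip seen 24 pixels further left in the left image, with an
    # (assumed) 700-pixel focal length and 6 cm baseline:
    print(f"{stereo_depth(700, 0.06, 320, 296):.2f} m")   # ~1.75 m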

Algorithms

Different ways of tracking and analyzing gestures exist, and a basic layout is given in the diagram above. For example, volumetric models convey the necessary information required for an elaborate analysis; however, they prove to be very intensive in terms of computational power and require further technological developments in order to be implemented for real-time analysis. On the other hand, appearance-based models are easier to process but usually lack the generality required for human-computer interaction.

Depending on the type of the input data, the approach for interpreting a gesture can be carried out in different ways. However, most of the techniques rely on key pointers represented in a 3D coordinate system. Based on the relative motion of these, the gesture can be detected with high accuracy, depending on the quality of the input and the algorithm's approach.
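As a minimal illustration of detection from the relative motion of 3D key points, the sketch below flags a "pinch" whenever the distance between the thumb tip and index tip falls below a threshold across a stream of frames. The key-point stream and the 3 cm threshold are made-up illustrative values.

    # Minimal sketch of detection from the relative motion of 3D key points:
    # flag a "pinch" when the thumb-tip / index-tip distance drops below a threshold.
    import numpy as np

    PINCH_THRESHOLD_M = 0.03

    def detect_pinch(frames):
        """frames: iterable of dicts with 3D positions for 'thumb_tip' and 'index_tip'."""
        events = []
        pinched = False
        for i, kp in enumerate(frames):
            d = np.linalg.norm(np.asarray(kp["thumb_tip"]) - np.asarray(kp["index_tip"]))
            if d < PINCH_THRESHOLD_M and not pinched:
                events.append(("pinch_start", i))
                pinched = True
            elif d >= PINCH_THRESHOLD_M and pinched:
                events.append(("pinch_end", i))
                pinched = False
        return events

    frames = [
        {"thumb_tip": (0.00, 0.00, 0.3), "index_tip": (0.08, 0.00, 0.3)},
        {"thumb_tip": (0.02, 0.00, 0.3), "index_tip": (0.04, 0.00, 0.3)},
        {"thumb_tip": (0.03, 0.00, 0.3), "index_tip": (0.035, 0.00, 0.3)},
        {"thumb_tip": (0.00, 0.00, 0.3), "index_tip": (0.08, 0.00, 0.3)},
    ]
    print(detect_pinch(frames))   # [('pinch_start', 1), ('pinch_end', 3)]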

In order to interpret movements of the body, one has to classify them according to common properties and the message the movements may express. For example, in sign language each gesture represents a word or phrase. A taxonomy that seems very appropriate for human-computer interaction has been proposed by Quek in "Toward a Vision-Based Hand Gesture Interface".[28] He presents several interactive gesture systems in order to capture the whole space of the gestures: 1. manipulative; 2. semaphoric; 3. conversational.

Some literature differentiates two approaches to gesture recognition: a 3D-model-based one and an appearance-based one.[29] The former makes use of 3D information about key elements of the body parts in order to obtain several important parameters, like palm position or joint angles. Appearance-based systems, on the other hand, use images or videos for direct interpretation.


A real hand (left) is interpreted as a collection of vertices and lines in the 3D mesh version (right), and the software uses their relative position and interaction in order to infer the gesture.

3D model-based algorithms

The 3D model approach can use volumetric or skeletal models, or even a combination of the two. Volumetric approaches have been used heavily in the computer animation industry and for computer vision purposes. The models are generally built from complicated 3D surfaces, like NURBS or polygon meshes.

The drawback of this method is that it is very computationally intensive, and systems for live analysis are still to be developed. For the moment, a more interesting approach is to map simple primitive objects to the person's most important body parts (for example, cylinders for the arms and neck, a sphere for the head) and analyse the way these interact with each other. Furthermore, some abstract structures like superquadrics and generalised cylinders may be even more suitable for approximating the body parts. What is very exciting about this approach is that the parameters for these objects are quite simple. In order to better model the relation between these, we make use of constraints and hierarchies between our objects.
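To make the primitive-plus-constraints idea a little more concrete, the sketch below represents an arm as two cylinder primitives in a parent/child hierarchy and checks a joint-angle constraint between them. The dimensions, directions and the elbow limit are arbitrary illustrative values, not parameters from any cited system.

    # Tiny illustrative sketch: an arm as two cylinder primitives in a
    # parent/child hierarchy, with a joint-angle constraint check.
    import numpy as np
    from dataclasses import dataclass, field

    @dataclass
    class Cylinder:
        name: str
        length: float                     # metres
        radius: float                     # metres
        direction: np.ndarray             # unit vector along the axis
        children: list = field(default_factory=list)

    def joint_angle(parent, child):
        """Angle (degrees) between a primitive and its child at their joint."""
        cosang = np.clip(np.dot(parent.direction, child.direction), -1.0, 1.0)
        return np.degrees(np.arccos(cosang))

    upper_arm = Cylinder("upper_arm", 0.30, 0.045, np.array([0.0, -1.0, 0.0]))
    forearm = Cylinder("forearm", 0.27, 0.035,
                       np.array([0.0, -0.4, 0.9]) / np.linalg.norm([0.0, -0.4, 0.9]))
    upper_arm.children.append(forearm)

    ELBOW_LIMIT_DEG = 150                 # arbitrary constraint on elbow flexion
    angle = joint_angle(upper_arm, forearm)
    print(f"elbow angle {angle:.1f} deg, within limit: {angle <= ELBOW_LIMIT_DEG}")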

The skeletal version (right) effectively models the hand (left). It has fewer parameters than the volumetric version and is easier to compute, making it suitable for real-time gesture analysis systems.

Skeletal-based algorithms

Instead of using intensive processing of the 3D models and dealing with a lot of parameters, one can just use a simplified version of joint angle parameters along with segment lengths. This is known as a skeletal representation of the body, where a virtual skeleton of the person is computed and parts of the body are mapped to certain segments. The analysis here is done using the position and orientation of these segments and the relation between each of them (for example, the angle between the joints and the relative position or orientation).

    Advantages of using skeletal models:

    Algorithms are faster because only key parameters are analyzed.

Pattern matching against a template database is possible (a minimal matching sketch follows this list)

    Using key points allows the detection program to focus on the significant parts of the body
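The sketch below illustrates the template-matching point above: a pose is reduced to a vector of joint angles and classified by picking the nearest entry in a small template database. The gesture templates and angle values are invented for illustration.

    # Minimal sketch of skeletal template matching: reduce a pose to joint
    # angles and pick the closest entry in a small template database.
    import numpy as np

    TEMPLATES = {
        # joint-angle vectors in degrees: [thumb, index, middle, ring, pinky] flexion
        "open_hand": np.array([10.0, 5.0, 5.0, 5.0, 10.0]),
        "fist":      np.array([80.0, 160.0, 160.0, 160.0, 150.0]),
        "point":     np.array([70.0, 10.0, 160.0, 160.0, 150.0]),
    }

    def classify_pose(joint_angles_deg):
        """Return the template name with the smallest Euclidean angle distance."""
        angles = np.asarray(joint_angles_deg, dtype=float)
        return min(TEMPLATES, key=lambda name: np.linalg.norm(TEMPLATES[name] - angles))

    print(classify_pose([65, 15, 150, 155, 140]))   # -> 'point'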


These binary silhouette (left) or contour (right) images represent typical input for appearance-based algorithms. They are compared with different hand templates and, if they match, the corresponding gesture is inferred.

Appearance-based models

These models no longer use a spatial representation of the body, because they derive the parameters directly from the images or videos using a template database. Some are based on deformable 2D templates of parts of the human body, particularly the hands. Deformable templates are sets of points on the outline of an object, used as interpolation nodes for approximating the object's outline. One of the simplest interpolation functions is linear, which performs an average shape from point sets, point variability parameters and external deformators. These template-based models are mostly used for hand tracking, but could also be of use for simple gesture classification.

A second approach to gesture detection using appearance-based models uses image sequences as gesture templates. Parameters for this method are either the images themselves, or certain features derived from them. Most of the time, only one (monoscopic) or two (stereoscopic) views are used.
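One standard, off-the-shelf way to compare a binary silhouette against stored hand templates is Hu-moment shape matching, for example via OpenCV's matchShapes. The sketch below uses synthetic shapes (a filled circle and square) as stand-ins for real silhouettes; the 0.1 acceptance threshold is illustrative, and OpenCV 4.x is assumed (findContours returning two values).

    # Sketch: compare a binary silhouette with stored templates using
    # Hu-moment shape matching (cv2.matchShapes). OpenCV 4.x assumed.
    import cv2
    import numpy as np

    def largest_contour(binary_img):
        # OpenCV 4.x: findContours returns (contours, hierarchy).
        contours, _ = cv2.findContours(binary_img, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
        return max(contours, key=cv2.contourArea)

    def best_template(query_mask, template_masks, threshold=0.1):
        """Return the best-matching template name, or None if nothing is close."""
        q = largest_contour(query_mask)
        scores = {name: cv2.matchShapes(q, largest_contour(m), cv2.CONTOURS_MATCH_I1, 0.0)
                  for name, m in template_masks.items()}
        name = min(scores, key=scores.get)
        return name if scores[name] < threshold else None

    # Synthetic stand-ins for silhouettes: a filled circle and a filled square.
    circle = np.zeros((200, 200), np.uint8); cv2.circle(circle, (100, 100), 60, 255, -1)
    square = np.zeros((200, 200), np.uint8); cv2.rectangle(square, (50, 50), (150, 150), 255, -1)
    query = np.zeros((200, 200), np.uint8); cv2.circle(query, (90, 110), 40, 255, -1)

    print(best_template(query, {"circle": circle, "square": square}))   # -> 'circle'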

Challenges

There are many challenges associated with the accuracy and usefulness of gesture recognition software. For image-based gesture recognition there are limitations on the equipment used and on image noise. Images or video may not be under consistent lighting, or in the same location. Items in the background or distinct features of the users may make recognition more difficult.

    The variety of implementations for image-based gesture recognition may also cause issue for viability

    of the technology to general usage. For example, an algorithm calibrated for one camera may not

    work for a different camera. The amount of background noise also causes tracking and recognition

    difficulties, especially when occlusions (partial and full) occur. Furthermore, the distance from the

    camera, and the camera's resolution and quality, also cause variations in recognition accuracy.

    In order to capture human gestures by visual sensors, robust computer vision methods are also

    required, for example for hand tracking and hand posture recognition[30][31][32][33][34][35][36][37][38]

    or for

    capturing movements of the head, facial expressions or gaze direction.

    [edit]"Gorilla arm"

    "Gorilla arm" was a side-effect of vertically-oriented touch-screen or light-pen use. In periods of

    prolonged use, users' arms began to feel fatigue and/or discomfort. This effect contributed to the

    decline of touch-screen input despite initial popularity in the 1980s.[39][40]

    Gorilla arm is not a problem for short-term use, since they only involve brief interactions which do not

    last long enough to cause gorilla arm.

See also

Pen computing - discussion of gesture recognition for tablet computers

Mouse gesture

Computer vision

Dialogue-Assisted Visual Environment for Geoinformation (DAVE_G)

    Gestures

    Hidden Markov model

    Language technology

    Omek Interactive


    SixthSense

    SoftKinetic

    Sketch recognition

    Multi-touch gestures

    AnyTouch by ayotle and digitas_fr

    Ayotle Home Page

Recognizing gestures: Interface design beyond point-and-click

By Robert Cravotta, Technical Editor - August 16, 2007

The simplest and most basic gesture is pointing, and it is an effective way for most people to communicate with each other, even in the presence of language barriers. However, pointing quickly fails as a way to communicate when the object or concept that a person is trying to convey is not in sight to point at. Taking gesture recognition beyond simple pointing greatly increases the type of information that two people can communicate with each other. Gesture communication is so natural and powerful that parents are increasingly using it to enable their babies to engage in direct, two-way communication with their caregivers, through baby sign language, long before the babies can clearly speak (Reference 1).

The level of communication between users and their electronic devices has been largely limited to a pointing interface. To date, only a few common extensions to the pointing interface exist. They include single- versus double-click or -tap devices and devices that allow users to hold down a button while moving the pointing focus, such as mice, trackballs, and touchscreens. A user's ability to communicate naturally with a computing device through a gesture interface and a speech-recognition interface, such as a multitouch display or an optical-input system, is still largely an emerging capability.

Consider the new and revolutionary mobile phone that relies on a touchscreen-driven user interface instead of physical buttons and uses a predictive engine that helps users with typing on the flat panel. This description could apply to Apple's iPhone, which the company launched in June, but it could also apply to the IBM Simon, which the company launched with Bell South in 1993, 14 years before the iPhone. Differences exist between the two touch interfaces. For example, the newer units support multitouch gestures, such as pinching an image to size it and flicking the display to scroll the content. This article touches on how gesture interfaces are evolving and what they mean for future interfaces.
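As an aside on the multitouch gestures mentioned above, the sketch below shows one common way a pinch gesture can be mapped to a zoom factor: track the distance between two touch points and use the ratio of the current to the initial distance as the scale. The class and method names are hypothetical and do not correspond to any particular phone's API.

```python
import math

class PinchTracker:
    """Minimal pinch-to-zoom sketch: two touch points map to a zoom factor."""

    def __init__(self):
        self.initial_distance = None

    @staticmethod
    def _distance(p1, p2):
        return math.hypot(p1[0] - p2[0], p1[1] - p2[1])

    def on_touch_update(self, p1, p2):
        """p1, p2 are (x, y) positions of the two fingers."""
        d = self._distance(p1, p2)
        if self.initial_distance is None:
            self.initial_distance = d
            return 1.0                        # gesture just started, no zoom yet
        return d / self.initial_distance      # >1 zooms in, <1 zooms out

    def on_touch_end(self):
        self.initial_distance = None

# Usage: the fingers move apart, so the content should scale up by 1.5x.
tracker = PinchTracker()
tracker.on_touch_update((100, 100), (200, 100))        # initial spread: 100 px
zoom = tracker.on_touch_update((80, 100), (230, 100))  # spread: 150 px -> 1.5
```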

Much of the technology driving many of today's latest and most innovative gesture-like interfaces is not exactly new: most of these interfaces can trace their heritage to products or projects from the past few decades. According to Reference 2, multitouch-panel interfaces have existed for at least 25 years, and that length of time is on a par with the 30 years that elapsed between the invention of the mouse in 1965 and the mouse's reaching its tipping point as a ubiquitous pointing device, which happened with the release of Microsoft Windows 95. Improvements in the hardware for these types of interfaces enable designers to shrink end systems and lower their cost. More important, however, these improved interfaces let designers leverage additional low-cost software-processing capacity to identify more contexts and thus better interpret what a user is trying to tell the system to do. In other words, most of the advances in emerging gesture interfaces will come not so much from new hardware as from more complex software algorithms that exploit the strengths and compensate for the weaknesses of each type of input interface. Reference 3 provides a work-in-progress directory of sources for input technologies.

In addition to the commercial launch of the iPhone, this year has seen the Korean and European launch of the LG Electronics-manufactured, Prada-designed LG Prada phone, the successful commercial launch of Nintendo's Wii gesture-interface console, and the pending launch of the multitouch Microsoft Surface platform (see sidebar, "Multitouch surfaces"). Are the lessons designers learned from previous iterations of gesture interfaces sufficient to give today's latest innovative products the legs they need to survive more than a year or two and finally usher in the promising age of more natural communication between humans and machines? These platforms have access to large amounts of memory and worldwide connectivity through the Internet for software updates. So, perhaps the more relevant question is: can the flexible, programmable nature of these platforms enable the gesture interfaces to adjust to the set of as-yet-unlearned lessons without going back to the drawing board?

Gesture-recognition interfaces are not limited to just gaming and infotainment products. Users of Segway's PTs (personal transporters) intuitively command their transporters by leaning in the appropriate direction to move forward, stop, and turn left or right (Figure 1). Some interfaces focus on capturing a rich range of subtle gestures to emulate the use of a real-world tool rather than issuing abstract commands to a computer. For example, Wacom's Intuos and Cintiq tablets, coupled with tablet-enhanced paint and graphics software, can faithfully capture an artist's hand and tool motions in six dimensions: up and down, left and right, downward pressure on the tablet surface, stylus-tilt angle, stylus-tilt direction, and stylus rotation. This feature enables the software to re-create not only the gross motions but also the fine motions, such as the twisting of a user's hand, to more realistically emulate the behavior of complex objects, such as paint and drawing tools.
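One way to picture those six dimensions is as a single record per stylus sample, which the paint software can then map onto brush behavior. The field names and the pressure-to-width mapping below are illustrative assumptions, not Wacom's actual driver interface.

```python
from dataclasses import dataclass

@dataclass
class StylusSample:
    """One sample of pen input, capturing the six dimensions described above."""
    x: float               # left/right position on the tablet
    y: float               # up/down position on the tablet
    pressure: float        # downward pressure on the tablet surface (0..1)
    tilt_angle: float      # how far the stylus leans from vertical, in degrees
    tilt_direction: float  # compass direction of the lean, in degrees
    rotation: float        # rotation of the barrel about its own axis, in degrees

def brush_width(sample: StylusSample, max_width: float = 20.0) -> float:
    """Example use: let pressure drive stroke width, as a paint program might."""
    return max_width * sample.pressure

stroke = StylusSample(x=120.0, y=80.0, pressure=0.6,
                      tilt_angle=30.0, tilt_direction=90.0, rotation=15.0)
width = brush_width(stroke)   # 12.0 units wide at 60% pressure
```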


Another example of capturing subtle motions to emulate the direct manipulation of real-world tools is Intuitive Surgical's da Vinci Surgical System. This system employs a proprietary 3-D-vision system and two sets of robots (the masters and the EndoWrist instruments) to faithfully translate the intent of a surgeon's hand and finger motions on the masters into control of the EndoWrist instruments during robotic laparoscopic surgery (Figure 2). Decoupling the surgeon's hand motions from the on-site surgical instruments through the masters not only allows the surgery to require only a few small cuts to insert the surgical tools into the patient, but also affords the surgeon a better posture, delaying the onset of fatigue during long procedures. It also enables greater surgical precision, an increased range of motion, and improved dexterity through digital filtering, compared with the surgeon directly manipulating the surgical tools, as in traditional laparoscopic surgery.
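The digital filtering mentioned above can be thought of as smoothing and scaling the surgeon's hand motion before it drives the instruments. The sketch below uses a simple exponential moving average plus a fixed motion-scaling factor as a generic illustration; it is not Intuitive Surgical's actual algorithm.

```python
class MotionFilter:
    """Smooth and scale an input motion stream (generic illustration only)."""

    def __init__(self, smoothing=0.2, scale=0.25):
        self.smoothing = smoothing   # 0..1; lower values smooth more heavily
        self.scale = scale           # e.g. 0.25 gives a 4:1 motion reduction
        self._state = None

    def update(self, position):
        """position: (x, y, z) of the master control; returns the filtered,
        scaled position to send to the instrument."""
        if self._state is None:
            self._state = position
        else:
            self._state = tuple(
                prev + self.smoothing * (new - prev)
                for prev, new in zip(self._state, position)
            )
        return tuple(self.scale * v for v in self._state)

f = MotionFilter()
command = f.update((10.0, 0.0, 5.0))   # smoothed, scaled instrument command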

The 3-D-vision system is a critical feedback interface that enables surgeons to use the da Vinci Surgical System effectively and avoid mistakes. Additionally, the system complements the visual-feedback interface with some simple haptics, or force feedback, such as detecting when internal and external collisions occur during a motion. Research organizations, such as Johns Hopkins University, are using the da Vinci Surgical System to study technologies that support a sense of touch. "The da Vinci is a perfect 'laboratory,' as it provides high-quality motion and video data of a focused and stylized set of goal-directed tasks," says Gregory D Hager, professor of computer science at Johns Hopkins. "We envision using the statistical models we develop as a way of making the device more 'intelligent' by allowing it to recognize what is happening in the surgical field."

Unseen potential

"Great experiences don't happen by accident," says Bill Buxton, principal researcher at Microsoft. "They are the result of deep thought and deliberation." His decidedly low-tech example involves two manual juicers that look similar and have the same user interface (Reference 4). If you can use one, you can use the other. The juice tastes the same from each, and each takes the same amount of time to make the juice. However, they differ in the method and the timing of the user's applying the maximum force. The juicer with the constant-gear-ratio effect requires the user to apply the maximum force at the end of the lever pull, whereas the other juicer delivers a variable-gear-ratio effect that reduces the pressure the user needs to apply at the end of the lever pull. In essence, the qualitative difference between the juicers is the result of nonobvious mechanisms hidden in the interface.

These examples of gesture-recognition interfaces are direct-control interfaces, in which users explicitly tell or direct the system to do what they want. However, the emerging trend toward embedded or invisible human-machine interfaces is an area of even greater potential. Embedded processing, which is usually invisible to the end user, continues to enable designers to make their products perform more functions at lower cost and with better energy efficiency. As the cost of sensors and processing capacity continues to drop and the processors are able to optimize the essential functions of the systems they control, an opportunity arises for the extra available processing to provide an implicit, or embedded, human-machine interface between the user and the system. In other words, users may imply their intent to the system without consciously being aware that they are doing so. This emerging capability is essential to enabling systems to use predictive compensation to better accommodate a user's inexperience or errors and still perform what the user intended.

The Simon's PredictaKey keyboard explicitly listed its top six predicted-letter candidates and allowed the user to select from that list. To take advantage of the prediction engine, the user had to explicitly engage with the engine's suggestions and choose from them. In contrast, the iPhone's typing interface manifests itself in several obvious and hidden ways to improve typing speed and accuracy. First, it presents a specialized key layout for each application, so that only keys relevant to that application are available for input. As the user types, the system may predict the word and present it to the user during typing; if the word is correct, the user can select it by pressing the space key on the display or simply continue typing. Likewise, the system tries to identify potentially misspelled words and presents the correct spelling in a similar fashion, allowing the user to accept or ignore the proposed correction.

However, the new and invisible magic in the iPhone typing interface is that it compensates for the possibility of the user's pressing the wrong letter on the display panel by dynamically resizing the target area, or tap zone, assigned to each letter without changing the displayed size of any of the letters, based on its typing engine's predictions of what letter the user will select next (Reference 5). The letters that the prediction engine believes the user may press next receive a larger tap zone that can overlap with the display area of nearby, lower-probability letters, which receive a smaller tap zone as a result. This feature increases the chances of selecting the predicted letter and decreases the chances of selecting an unpredicted letter that is adjacent to the predicted letter.
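The invisible tap-zone resizing can be sketched as a hit test that is biased by each letter's predicted probability: the drawn keys stay the same size, but a touch that lands between two keys resolves to the more likely one. The key geometry and probabilities below are invented for illustration; this is not Apple's implementation.

```python
import math

def resolve_key(touch, key_centers, next_letter_probs):
    """Pick the key for a touch point, biased toward predicted letters.

    touch: (x, y) of the tap.
    key_centers: dict letter -> (x, y) centre of the drawn key.
    next_letter_probs: dict letter -> probability from the prediction engine.
    A higher probability effectively enlarges that key's invisible tap zone.
    """
    best_letter, best_score = None, float("inf")
    for letter, (kx, ky) in key_centers.items():
        distance = math.hypot(touch[0] - kx, touch[1] - ky)
        prob = next_letter_probs.get(letter, 0.01)
        score = distance / prob          # likely letters tolerate larger misses
        if score < best_score:
            best_letter, best_score = letter, score
    return best_letter

# A tap lands midway between 'q' and 'w'; the prediction engine expects 'w'.
keys = {"q": (10, 10), "w": (30, 10)}
probs = {"q": 0.05, "w": 0.60}
resolve_key((20, 10), keys, probs)   # resolves to 'w'
```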

Although not considered a strict user interface, such as that between a user and a computer, some automobile-safety features implement an early form of implicit communication interface for predictive-safety features. For example, to determine whether to warn the driver of an imminent lane departure, the system can examine the turn signal to determine whether the impending lane departure is intentional or accidental. People unintentionally and implicitly communicate their presence to passenger-detection systems that may control whether safety systems should deploy in the event of an accident; for example, the automobile may adjust how the air bag deploys to avoid certain types of injuries for passengers of different sizes. Electronic stability-control systems can compare the driver's implied intention, by examining the steering and braking inputs, with the vehicle's actual motion; they can then appropriately apply the brakes at each wheel and reduce engine power to help correct understeer (plowing), oversteer (fishtailing), and drive-wheel slippage, helping the driver maintain some control of the vehicle.
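A toy version of the lane-departure decision described above: the warning is suppressed when the turn signal indicates that the departure is intentional. The function and argument names are hypothetical, not taken from any vehicle system.

```python
def should_warn_lane_departure(drifting_toward, turn_signal):
    """Warn only when the car drifts toward a lane edge the driver has not
    signalled for.

    drifting_toward: 'left', 'right', or None if the car is holding its lane.
    turn_signal: 'left', 'right', or None if no indicator is active.
    """
    if drifting_toward is None:
        return False                        # staying in lane, nothing to warn about
    return drifting_toward != turn_signal   # a signalled departure is intentional

should_warn_lane_departure("left", None)    # True: unintentional drift
should_warn_lane_departure("left", "left")  # False: driver signalled the change
```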

The control systems of the most highly maneuverable fighter aircraft offer some insight into the possible future of consumer-level control of complex systems. Because these aircraft rely on high levels of instability to achieve their maneuverability, the pilot can no longer explicitly and directly control the aircraft's subsystems; rather, the embedded-processing system handles those details and enables the pilot to focus on higher-level tasks. As automobile-control systems become better able to predict a driver's intentions and correlate those intentions with the state of the vehicle and the surrounding environment, they may be able to deliver even higher levels of energy efficiency by reducing energy loads in situations in which they are currently unnecessary, without sacrificing safety. In each case, the ability of the system to better understand the user's intention and act appropriately correlates with the system's ability to invisibly and accurately predict what the user can and might do next.

No matter how rich and intuitive an interface is, its ultimate success and adoption depend on how well the user and the system can signal each other and compensate for the possible range of misunderstandings. Uncertainty or unpredictability between how to command a system and its resulting behavior can kill the immediate usefulness and delay the adoption of a gesture interface. Merely informing the user repeatedly that there is an error is insufficient in modern electronic equipment; these devices often guide the user about the nature of the error or misunderstanding and how they might correct the condition. Modern interfaces employ a combination of sensors, improved processing algorithms, and user feedback. This combination provides a variety of mechanisms to reduce ambiguity and uncertainty between the user and the system so that each can more quickly and meaningfully compensate for unexpected behavior of the other (see sidebar, "I'll compensate for you").

One way to compensate for potential misunderstandings is for the system to control and reduce the set of possible inputs to only those with a valid context, as with the iPhone's specialized key layouts. Applications that can segment and isolate narrow contexts and apply strong goal-defined tasks within each one are good candidates for this type of compensation. Handwriting systems based on the Graffiti recognition system, such as Palm PDAs, improved the usability of a handwriting interface by narrowing the possibility of erroneous inputs, but doing so involved a significant learning curve before users could reliably use the system. Speech-recognition systems that require no training from a speaker increase their success rate by significantly limiting the number of words the systems can recognize, such as the 10 digits, or by presenting the user with a short menu of responses.
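The context-narrowing strategy can be sketched by keeping, for each application state, only the commands that are currently valid, so the recognizer never has to distinguish between words that cannot occur at that point. The command sets and confidence scores below are invented for illustration.

```python
# Hypothetical per-context vocabularies: the recognizer only ever has to
# choose among the words that are valid in the current state.
CONTEXT_VOCAB = {
    "main_menu": {"play", "settings", "quit"},
    "enter_pin": {"zero", "one", "two", "three", "four",
                  "five", "six", "seven", "eight", "nine"},
}

def recognize(utterance_scores, context):
    """utterance_scores: dict word -> recognizer confidence for one utterance.
    Restrict the choice to the words allowed in the current context."""
    allowed = CONTEXT_VOCAB[context]
    candidates = {w: s for w, s in utterance_scores.items() if w in allowed}
    return max(candidates, key=candidates.get) if candidates else None

# 'quick' scores slightly higher, but it is not a valid menu command,
# so the context-restricted recognizer still returns 'quit'.
recognize({"quit": 0.4, "quick": 0.5}, "main_menu")
```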

Another method of compensating for misunderstandings is to eliminate or move translations from the user to the system. HP Labs India is working with a pen-based device, the GKB (gesture keyboard), which allows users to enter phonetic scripts, such as Devanagari and Tamil, as text input without the benefit of a language-specific keyboard. Another example is the Segway PT, which once required a user to translate a forward or backward twist into a signal to turn left or right. Now, it instead allows the user to indicate left or right by leaning in the desired direction. The newer interface control removes the ambiguity of which twist direction aligns with which turn direction, and it aligns the control with the natural center-of-gravity use scenario for the system, which greatly increases its chances of being a useful and sustainable interface.

Another important way to compensate for potential errors or misunderstandings is to give users enough relevant feedback that they can appropriately change their expectations or behavior. Visual feedback is a commonly used mechanism. The mouse cursor on most systems performs more functions than just acting as a pointing focus; it also acts as primary feedback to the user about when the system is busy and why. The success of the gesture interface with the Wii remote hinges in part on how well the system software improves over time to provide better sensitivity to player gestures. It also depends on how well the system provides feedback, such as a visual cue on the display, that points out how users can make small adjustments to their motions so that the system properly interprets their intended gestures.


Haptic, or tactile, feedback engages the user's sense of touch; it is a growing area for feedback, especially as a component of multimodal feedback involving more than a single sense. Game consoles have employed rumble features for years in their handheld controllers. The Segway PT signals error conditions to the user through force feedback in the control stick. The da Vinci Surgical System uses force feedback to signal boundary collisions, such as when the EndoWrist instrument makes contact with the surface of the cutting target. Haptic feedback can compensate for the weaknesses of other feedback methods, such as audio cues in noisy environments.

Haptic feedback can also help offload visual sensory overload by freeing the user's eyes from seeking visual confirmation that the system has received an input, allowing the user to focus on other details. For example, the iPhone keypad does not implement haptic feedback to signal which key was pressed and when, so the user must visually confirm each key press that the system processes. One company, Immersion, offers a way to simulate a tactile sensation for mobile devices by issuing precise pulse control over a device's vibration actuator within a 5-msec window of the input event.

When all other compensation methods fail to eliminate a misunderstanding, designers can employ a context-relevant response to address the uncertainty of a given input. A common response is to issue a warning and ask the user to repeat the input, but this risks frustrating the user if the system repeatedly requests the input with no additional guidance about what it needs. The system can make a best guess as to what the input was and ask the user to confirm that the guess is correct; this scenario can also frustrate the user if no method is available to refine the guess on a second try or if the system must confirm an input too often. A possible strategy for minimizing the use of these types of responses is for the system to profile the user'