Gesture Interfaces Project



    Hi All

I reckon most of you are well on your way, having studied and collected enough resources and materials, or tweaked around with things, to propose a valid system. Others who have been hand-waving, or who plan to do some last-minute mash-up: best of luck, and I wish you immense concentration in the next 24 hours.

Below I have reiterated what the output should be. Submissions will be evaluated by myself and another faculty member (so objectivity in judgement is ensured).

A few people who are off to meet nomads in Bijapur have asked me for an extension of a few days, which is OK; it also means I expect something cooler from them, as they will have time on the train back and forth.

For the rest, I am fixing a deadline of Tuesday at 10 am. Please email me a PDF attachment with the subject "Multimodal Submission".

    GUIDELINES >>>>>

It is intended to reinforce, by experience, your understanding of the lecture topics. It has intentionally been made quite open-ended in terms of the choice of system and methods used. You have the following tasks to complete:

Choose a new technology or system to investigate. Justify your choice, citing some relevant previous research and commercial systems that have been developed.

You should write a maximum of 1500 words (minimum font size 12pt; minimum page borders 2cm) that describes the system you have chosen to investigate, why you chose that system, what issues in Multimodal Interaction it will let you explore, and what previous research has been done on these issues.

If you have a prototype/output, please include screenshots as work in progress.

    Choosing a system

You should pick an example of technology, preferably new. This could be something like a touch-screen mobile phone, a gesture recognition or voice-command-activated system, a new display or interaction device, a computer game, etc. Alternatively, it could be a computerised interface for an everyday task, such as a machine that sells train tickets. It could be a web-based interface such as a library catalogue. Your choice should consider the following factors:

Is this system new? Has it been introduced as an improvement that is intended to make interaction easier? Might you be able to compare the new and old system in your evaluation?

Is there a well-defined, but not completely trivial, task that this system is meant to be used for? Note, your system might have multiple applications, but you should focus on one specific task for the purpose of this assignment.

Will it be practical for you to use this system for an evaluation?

Note that difficulty with any of the above should lead you to reconsider and choose another system. You will not be given any special concessions on the remainder of the assignment for having made a poor choice.

    Previous research

You should look for previously published research (in the form of journal or conference articles) about the system you are investigating. For example, what previous investigations have been done comparing gesture recognition with other interaction methods? What criteria have been proposed for assessing the usability of websites?

    Hi All

To clear any confusion in my previous email (see below), I stand corrected!! (These are rough guidelines to aid you in the research process.)

Of course, your idea/concept is the focus. "New technology and investigation of a system" is not the main focus.

All I am saying is: please investigate and research the modules that your system will need by looking into previous work/research, and also talk about them. This will make sure that your idea is indeed feasible.

Regards,
Sharath

Gesture-based computing on the cheap

With a single piece of inexpensive hardware (a multicolored glove), MIT researchers are making Minority Report-style interfaces more accessible.

    Larry Hardesty, MIT News Office


The hardware for a new gesture-based computing system consists of nothing more than an ordinary webcam and a pair of brightly colored Lycra gloves.
Photo: Jason Dorfman/CSAIL
May 20, 2010


Ever since Steven Spielberg's 2002 sci-fi movie Minority Report, in which a black-clad Tom Cruise stands in front of a transparent screen manipulating a host of video images simply by waving his hands, the idea of gesture-based computer interfaces has captured the imagination of technophiles. Academic and industry labs have developed a host of prototype gesture interfaces, ranging from room-sized systems with multiple cameras to detectors built into laptop screens. But MIT researchers have developed a system that could make gestural interfaces much more practical. Aside from a standard webcam, like those found in many new computers, the system uses only a single piece of hardware: a multicolored Lycra glove that could be manufactured for about a dollar.

Other prototypes of low-cost gestural interfaces have used reflective or colored tape attached to the fingertips, "but that's 2-D information," says Robert Wang, a graduate student in the Computer Science and Artificial Intelligence Laboratory who developed the new system together with Jovan Popović, an associate professor of electrical engineering and computer science. "You're only getting the fingertips; you don't even know which fingertip [the tape] is corresponding to." Wang and Popović's system, by contrast, can translate gestures made with a gloved hand into the corresponding gestures of a 3-D model of the hand on screen, with almost no lag time. "This actually gets the 3-D configuration of your hand and your fingers," Wang says. "We get how your fingers are flexing."

The most obvious application of the technology, Wang says, would be in video games: gamers navigating a virtual world could pick up and wield objects simply by using hand gestures. But Wang also imagines that engineers and designers could use the system to more easily and intuitively manipulate 3-D models of commercial products or large civic structures.

Robert Wang demonstrates the speed and precision with which the system can gauge hand position in three dimensions (including the flexing of individual fingers) as well as a possible application in mechanical engineering.
Video: Robert Y. Wang / Jovan Popović

    Patchwork approach

The glove went through a series of designs, with dots and patches of different shapes and colors, but the current version is covered with 20 irregularly shaped patches that use 10 different colors. The number of colors had to be restricted so that the system could reliably distinguish the colors from each other, and from those of background objects, under a range of different lighting conditions. The arrangement and shapes of the patches were chosen so that the front and back of the hand would be distinct, but also so that collisions of similar-colored patches would be rare. For instance, Wang explains, the colors on the tips of the fingers could be repeated on the back of the hand, but not on the front, since the fingers would frequently be flexing and closing in front of the palm.
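As an aside, the color-patch idea lends itself to a simple segmentation step. The sketch below is not the authors' code; it is a minimal, hedged illustration of how a handful of distinct glove colors might be isolated from a webcam frame with HSV thresholding in OpenCV. The HSV ranges are placeholder values that would need tuning for a real glove and real lighting.

    # Illustrative sketch (not the authors' code): segment a few distinct
    # glove colors from a webcam frame using HSV thresholding with OpenCV.
    import cv2
    import numpy as np

    # Hypothetical HSV ranges for a handful of patch colors: (lower, upper)
    PATCH_COLORS = {
        "red":    ((0,   120, 80), (8,   255, 255)),
        "green":  ((45,  120, 80), (75,  255, 255)),
        "blue":   ((100, 120, 80), (130, 255, 255)),
        "yellow": ((22,  120, 80), (32,  255, 255)),
    }

    def segment_patches(frame_bgr):
        """Return a dict of binary masks, one per named patch color."""
        hsv = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2HSV)
        masks = {}
        for name, (lo, hi) in PATCH_COLORS.items():
            mask = cv2.inRange(hsv, np.array(lo, np.uint8), np.array(hi, np.uint8))
            # Remove speckle noise so only coherent patches survive.
            mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, np.ones((5, 5), np.uint8))
            masks[name] = mask
        return masks

    cap = cv2.VideoCapture(0)          # default webcam
    ok, frame = cap.read()
    if ok:
        masks = segment_patches(frame)
        print({name: int(cv2.countNonZero(m)) for name, m in masks.items()})
    cap.release()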

Technically, the other key to the system is a new algorithm for rapidly looking up visual data in a database, which Wang says was inspired by the recent work of Antonio Torralba, the Esther and Harold E. Edgerton Associate Professor of Electrical Engineering and Computer Science in MIT's Department of Electrical Engineering and Computer Science and a member of CSAIL. Once a webcam has captured an image of the glove, Wang's software crops out the background, so that the glove alone is superimposed upon a white background. Then the software drastically reduces the resolution of the cropped image, to only 40 pixels by 40 pixels. Finally, it searches through a database containing myriad 40-by-40 digital models of a hand, clad in the distinctive glove, in a range of different positions. Once it's found a match, it simply looks up the corresponding hand position. Since the system doesn't have to calculate the relative positions of the fingers, palm, and back of the hand on the fly, it's able to provide an answer in a fraction of a second.
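To make the lookup step concrete, here is a minimal sketch under stated assumptions: the cropped glove image is downsampled to 40x40 and matched against a database of (image, pose) pairs by brute-force nearest neighbour. The toy database, descriptor format and 26-angle pose vector are stand-ins; the real system searched a far larger database of glove images and used a faster lookup scheme than brute force.

    # Minimal sketch of the lookup step described above: downsample the cropped
    # glove image to 40x40 and find the nearest entry in a database of
    # (image, hand_pose) pairs. The database here is a random stand-in.
    import numpy as np
    import cv2

    def to_descriptor(cropped_bgr):
        """Downsample a cropped glove image to 40x40 and flatten it."""
        small = cv2.resize(cropped_bgr, (40, 40), interpolation=cv2.INTER_AREA)
        return small.astype(np.float32).ravel() / 255.0

    def nearest_pose(descriptor, db_descriptors, db_poses):
        """Brute-force nearest-neighbour search over the pose database."""
        dists = np.linalg.norm(db_descriptors - descriptor, axis=1)
        return db_poses[int(np.argmin(dists))]

    # Toy database: 1000 random "images" paired with random joint-angle vectors.
    rng = np.random.default_rng(0)
    db_descriptors = rng.random((1000, 40 * 40 * 3), dtype=np.float32)
    db_poses = rng.random((1000, 26))        # e.g. 26 joint angles per pose

    query = (rng.random((120, 120, 3)) * 255).astype(np.uint8)   # pretend cropped glove
    pose = nearest_pose(to_descriptor(query), db_descriptors, db_poses)
    print(pose.shape)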

Of course, a database of 40-by-40 color images takes up a large amount of memory (several hundred megabytes, Wang says). But today, a run-of-the-mill desktop computer has four gigabytes, or 4,000 megabytes, of high-speed RAM. "And that number is only going to increase," Wang says.

    Changing the game

"People have tried to do hand tracking in the past," says Paul Kry, an assistant professor at the McGill University School of Computer Science. "It's a horribly complex problem. I can't say that there's any work in purely vision-based hand tracking that stands out as being successful, although many people have tried. It's sort of changing the game a bit to say, 'Hey, okay, I'll just add a little bit of information (the color of the patches) and I can go a lot farther than these purely vision-based techniques.'" Kry particularly likes the ease with which Wang and Popović's system can be calibrated to new users. Since the glove is made from stretchy Lycra, it can change size significantly from one user to the next; but in order to gauge the glove's distance from the camera, the system has to have a good sense of its size. To calibrate the system, the user simply places an 8.5-by-11-inch piece of paper on a flat surface in front of the webcam, presses his or her hand against it, and in about three seconds, the system is calibrated.
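The article does not spell out the calibration math, but a common way to recover scale from an object of known physical size is the pinhole relation Z = f * W / w, where f is the focal length in pixels, W the real width and w the observed width in pixels. The sketch below is a hedged illustration of that relation only; the focal length and pixel width used are assumed values, not figures from the paper.

    # Hedged sketch of one way scale calibration can work: under a pinhole
    # camera model, an object of known physical width W appearing w pixels wide
    # at focal length f (in pixels) is at distance Z = f * W / w.
    def distance_from_known_width(focal_px, real_width_m, observed_width_px):
        """Estimate distance to an object of known width under a pinhole model."""
        return focal_px * real_width_m / observed_width_px

    # A US-letter sheet is 0.2794 m wide (11 in); suppose it spans 600 pixels
    # in a camera with an (assumed) focal length of 800 pixels.
    z = distance_from_known_width(focal_px=800, real_width_m=0.2794, observed_width_px=600)
    print(f"estimated distance: {z:.2f} m")   # ~0.37 m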

Wang initially presented the glove-tracking system at last year's Siggraph, the premier conference on computer graphics. But at the time, he says, the system took nearly a half-hour to calibrate, and it didn't work nearly as well in environments with a lot of light. Now that the glove tracking is working well, however, he's expanding on the idea, with the design of similarly patterned shirts that can be used to capture information about whole-body motion. Such systems are already commonly used to evaluate athletes' form or to convert actors' live performances into digital animations, but a system based on Wang and Popović's technique could prove dramatically cheaper and easier to use.


Gesture recognition is a topic in computer science and language technology with the goal of interpreting human gestures via mathematical algorithms. Gestures can originate from any bodily motion or state but commonly originate from the face or hand. Current focuses in the field include emotion recognition from the face and hand gesture recognition. Many approaches have been made using cameras and computer vision algorithms to interpret sign language. However, the identification and recognition of posture, gait, proxemics, and human behaviors is also the subject of gesture recognition techniques.[1]

Gesture recognition can be seen as a way for computers to begin to understand human body language, thus building a richer bridge between machines and humans than primitive text user interfaces or even GUIs (graphical user interfaces), which still limit the majority of input to keyboard and mouse.

Gesture recognition enables humans to interface with the machine (HMI) and interact naturally without any mechanical devices. Using the concept of gesture recognition, it is possible to point a finger at the computer screen so that the cursor will move accordingly. This could potentially make conventional input devices such as mice, keyboards and even touch-screens redundant.
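To illustrate the point-to-move idea, the sketch below maps a tracked fingertip position (in normalized camera coordinates) onto screen coordinates and moves the OS cursor there. It is a minimal sketch, not part of any system described here: get_fingertip() is a hypothetical stand-in for whatever tracker supplies the fingertip position, and pyautogui is just one convenient way to move the cursor.

    # Minimal sketch, not a complete system: map a tracked fingertip position
    # (normalized camera coordinates in [0, 1]) to screen coordinates and move
    # the OS cursor there. get_fingertip() is a hypothetical stand-in tracker.
    import pyautogui

    SCREEN_W, SCREEN_H = pyautogui.size()

    def get_fingertip():
        # Placeholder: a real implementation would return the tracked fingertip.
        return 0.5, 0.5

    def point_to_cursor():
        x_norm, y_norm = get_fingertip()
        # Mirror x so that moving the hand right moves the cursor right
        # when facing the camera.
        pyautogui.moveTo((1.0 - x_norm) * SCREEN_W, y_norm * SCREEN_H)

    point_to_cursor()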

Although this technology is still in its infancy, applications are beginning to appear. Flutter, a start-up out of Palo Alto, CA, lets anyone with a Mac/Windows computer and a webcam download an app that allows them to control music and video apps such as Spotify, iTunes, Windows Media Player, QuickTime, and VLC using gestures.

Gesture recognition can be conducted with techniques from computer vision and image processing. The literature includes ongoing work in the computer vision field on capturing gestures or, more generally, human pose and movements by cameras connected to a computer.[2][3][4][5]

    Gesture recognition and pen computing:


In some literature, the term gesture recognition has been used to refer more narrowly to non-text-input handwriting symbols, such as inking on a graphics tablet, multi-touch gestures, and mouse gesture recognition. This is computer interaction through the drawing of symbols with a pointing device cursor (see the discussion at Pen computing).


Gesture types

In computer interfaces, two types of gestures are distinguished:[6] we consider online gestures, which can also be regarded as direct manipulations like scaling and rotating. In contrast, offline gestures are usually processed after the interaction is finished; e.g. a circle is drawn to activate a context menu.

Offline gestures: Those gestures that are processed after the user's interaction with the object. An example is the gesture drawn to activate a menu (a minimal classification sketch follows this list).

Online gestures: Direct manipulation gestures. They are used to scale or rotate a tangible object.
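As a concrete, purely illustrative example of the offline case, the sketch below waits until a stroke is complete and then decides whether the recorded point trail is roughly a circle, by checking how much the point-to-centroid distances vary. The 0.2 tolerance is an arbitrary threshold, not taken from any cited system.

    # Minimal sketch of an *offline* gesture check: after the stroke ends,
    # decide whether the recorded (x, y) trail is roughly a circle. The
    # centroid serves as the circle centre; the spread of point-to-centre
    # distances is the test.
    import numpy as np

    def is_circle(points, tol=0.2):
        pts = np.asarray(points, dtype=float)
        centre = pts.mean(axis=0)
        radii = np.linalg.norm(pts - centre, axis=1)
        mean_r = radii.mean()
        if mean_r == 0:
            return False
        # A circular trail has a nearly constant radius around the centroid.
        return radii.std() / mean_r < tol

    theta = np.linspace(0, 2 * np.pi, 50)
    circle_trail = np.c_[np.cos(theta), np.sin(theta)] + 0.02 * np.random.randn(50, 2)
    line_trail = np.c_[np.linspace(0, 1, 50), np.linspace(0, 1, 50)]
    print(is_circle(circle_trail), is_circle(line_trail))   # True False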

Uses

Gesture recognition is useful for processing information from humans that is not conveyed through speech or typing. There are also various types of gestures which can be identified by computers.

Sign language recognition. Just as speech recognition can transcribe speech to text, certain types of gesture recognition software can transcribe the symbols represented through sign language into text.[7]

For socially assistive robotics. By using proper sensors (accelerometers and gyros) worn on the body of a patient and by reading the values from those sensors, robots can assist in patient rehabilitation. The best example is stroke rehabilitation.

Directional indication through pointing. Pointing has a very specific purpose in our society: to reference an object or location based on its position relative to ourselves. The use of gesture recognition to determine where a person is pointing is useful for identifying the context of statements or instructions. This application is of particular interest in the field of robotics,[8] and a minimal pointing-ray sketch appears after this list.

Control through facial gestures. Controlling a computer through facial gestures is a useful application of gesture recognition for users who may not physically be able to use a mouse or keyboard. Eye tracking in particular may be of use for controlling cursor motion or focusing on elements of a display.

Alternative computer interfaces. Foregoing the traditional keyboard and mouse setup to interact with a computer, strong gesture recognition could allow users to accomplish frequent or common tasks using hand or face gestures to a camera.[9][10][11][12][13]

Immersive game technology. Gestures can be used to control interactions within video games to try and make the game player's experience more interactive or immersive.

Virtual controllers. For systems where the act of finding or acquiring a physical controller could require too much time, gestures can be used as an alternative control mechanism. Controlling secondary devices in a car, or controlling a television set, are examples of such usage.[14]

Affective computing. In affective computing, gesture recognition is used in the process of identifying emotional expression through computer systems.

Remote control. Through the use of gesture recognition, "remote control with the wave of a hand" of various devices is possible. The signal must not only indicate the desired response, but also which device is to be controlled.[15][16][17]
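Relating to the directional-pointing item above, one common geometric treatment is to cast a ray from the wrist through the fingertip and intersect it with a known plane (a screen or a table top). The sketch below is a hedged illustration of that idea; all coordinates are arbitrary example values in metres.

    # Hedged sketch: turn a pointing gesture into a target by casting a ray
    # from the wrist through the fingertip and intersecting it with a plane.
    import numpy as np

    def pointing_target(wrist, fingertip, plane_point, plane_normal):
        """Intersect the wrist-to-fingertip ray with a plane; None if parallel."""
        wrist, fingertip = np.asarray(wrist, float), np.asarray(fingertip, float)
        p0, n = np.asarray(plane_point, float), np.asarray(plane_normal, float)
        direction = fingertip - wrist
        denom = direction.dot(n)
        if abs(denom) < 1e-9:
            return None                      # ray is parallel to the plane
        t = (p0 - wrist).dot(n) / denom
        return None if t < 0 else wrist + t * direction

    # Wrist at the origin, fingertip slightly forward; the screen is the plane z = 2.
    target = pointing_target([0, 0, 0], [0.05, 0.02, 0.2], [0, 0, 2], [0, 0, 1])
    print(target)    # point on the screen plane the user is pointing at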

Input devices

The ability to track a person's movements and determine what gestures they may be performing can be achieved through various tools. Although a large amount of research has been done on image/video-based gesture recognition, there is some variation in the tools and environments used between implementations.

Wired gloves. These can provide input to the computer about the position and rotation of the hands using magnetic or inertial tracking devices. Furthermore, some gloves can detect finger bending with a high degree of accuracy (5-10 degrees), or even provide haptic feedback to the user, which is a simulation of the sense of touch. The first commercially available hand-tracking glove-type device was the DataGlove,[18] which could detect hand position, movement and finger bending. It uses fiber-optic cables running down the back of the hand. Light pulses are created and, when the fingers are bent, light leaks through small cracks and the loss is registered, giving an approximation of the hand pose.

Depth-aware cameras. Using specialized cameras such as structured light or time-of-flight cameras, one can generate a depth map of what is being seen through the camera at short range, and use this data to approximate a 3D representation of what is being seen. These can be effective for detection of hand gestures due to their short-range capabilities.[19]

Stereo cameras. Using two cameras whose relations to one another are known, a 3D representation can be approximated from the output of the cameras. To get the cameras' relations, one can use a positioning reference such as a lexian-stripe or infrared emitters.[20] In combination with direct motion measurement (6D-Vision), gestures can be detected directly. A minimal depth-from-disparity sketch appears after this list.

Controller-based gestures. These controllers act as an extension of the body so that when gestures are performed, some of their motion can be conveniently captured by software. Mouse gestures are one such example, where the motion of the mouse is correlated to a symbol being drawn by a person's hand, as is the Wii Remote, which can study changes in acceleration over time to represent gestures.[21][22][23] Devices such as the LG Electronics Magic Wand, the Loop and the Scoop use Hillcrest Labs' Freespace technology, which uses MEMS accelerometers, gyroscopes and other sensors to translate gestures into cursor movement. The software also compensates for human tremor and inadvertent movement.[24][25][26]

Single camera. A normal camera can be used for gesture recognition where the resources/environment would not be convenient for other forms of image-based recognition. It was earlier thought that a single camera may not be as effective as stereo or depth-aware cameras, but a start-up based out of Palo Alto named Flutter is challenging this assumption. It has released an app that can be downloaded to any Windows/Mac computer with a built-in webcam, making the approach accessible to a wider audience.[27]
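As promised under the stereo-camera item above, the sketch below shows the basic geometry such setups rely on: for a rectified stereo pair with focal length f (in pixels) and baseline B (in metres), a feature seen at horizontal positions xL and xR has disparity d = xL - xR and depth Z = f * B / d. The focal length, baseline and pixel positions used here are illustrative assumptions.

    # Sketch of the geometry behind the stereo-camera item above: depth of a
    # matched feature from its disparity in a rectified stereo pair.
    def stereo_depth(focal_px, baseline_m, x_left_px, x_right_px):
        """Depth Z = f * B / d for disparity d = x_left - x_right."""
        disparity = x_left_px - x_right_px
        if disparity <= 0:
            raise ValueError("feature must have positive disparity")
        return focal_px * baseline_m / disparity

    # A fingertip seen 24 pixels further left in the left image, with an
    # (assumed) 700-pixel focal length and 6 cm baseline:
    print(f"{stereo_depth(700, 0.06, 320, 296):.2f} m")   # ~1.75 m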

Algorithms

Different ways of tracking and analyzing gestures exist, and a basic layout is given in the diagram above. For example, volumetric models convey the necessary information required for an elaborate analysis; however, they prove to be very intensive in terms of computational power and require further technological developments in order to be implemented for real-time analysis. On the other hand, appearance-based models are easier to process but usually lack the generality required for human-computer interaction.

Depending on the type of the input data, the approach for interpreting a gesture can be carried out in different ways. However, most of the techniques rely on key pointers represented in a 3D coordinate system. Based on the relative motion of these, the gesture can be detected with high accuracy, depending on the quality of the input and the algorithm's approach.
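As a minimal illustration of detection from the relative motion of 3D key points, the sketch below flags a "pinch" whenever the distance between the thumb tip and index tip falls below a threshold across a stream of frames. The key-point stream and the 3 cm threshold are made-up illustrative values.

    # Minimal sketch of detection from the relative motion of 3D key points:
    # flag a "pinch" when the thumb-tip / index-tip distance drops below a threshold.
    import numpy as np

    PINCH_THRESHOLD_M = 0.03

    def detect_pinch(frames):
        """frames: iterable of dicts with 3D positions for 'thumb_tip' and 'index_tip'."""
        events = []
        pinched = False
        for i, kp in enumerate(frames):
            d = np.linalg.norm(np.asarray(kp["thumb_tip"]) - np.asarray(kp["index_tip"]))
            if d < PINCH_THRESHOLD_M and not pinched:
                events.append(("pinch_start", i))
                pinched = True
            elif d >= PINCH_THRESHOLD_M and pinched:
                events.append(("pinch_end", i))
                pinched = False
        return events

    frames = [
        {"thumb_tip": (0.00, 0.00, 0.3), "index_tip": (0.08, 0.00, 0.3)},
        {"thumb_tip": (0.02, 0.00, 0.3), "index_tip": (0.04, 0.00, 0.3)},
        {"thumb_tip": (0.03, 0.00, 0.3), "index_tip": (0.035, 0.00, 0.3)},
        {"thumb_tip": (0.00, 0.00, 0.3), "index_tip": (0.08, 0.00, 0.3)},
    ]
    print(detect_pinch(frames))   # [('pinch_start', 1), ('pinch_end', 3)]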

In order to interpret movements of the body, one has to classify them according to common properties and the message the movements may express. For example, in sign language each gesture represents a word or phrase. A taxonomy that seems very appropriate for human-computer interaction has been proposed by Quek in "Toward a Vision-Based Hand Gesture Interface".[28] He presents several interactive gesture systems in order to capture the whole space of the gestures: 1. manipulative; 2. semaphoric; 3. conversational.

Some literature differentiates two approaches to gesture recognition: a 3D-model-based one and an appearance-based one.[29] The former makes use of 3D information about key elements of the body parts in order to obtain several important parameters, like palm position or joint angles. Appearance-based systems, on the other hand, use images or videos for direct interpretation.


A real hand (left) is interpreted as a collection of vertices and lines in the 3D mesh version (right), and the software uses their relative position and interaction in order to infer the gesture.

3D model-based algorithms

The 3D model approach can use volumetric or skeletal models, or even a combination of the two. Volumetric approaches have been used heavily in the computer animation industry and for computer vision purposes. The models are generally built from complicated 3D surfaces, like NURBS or polygon meshes.

The drawback of this method is that it is very computationally intensive, and systems for live analysis are still to be developed. For the moment, a more interesting approach is to map simple primitive objects to the person's most important body parts (for example, cylinders for the arms and neck, a sphere for the head) and analyse the way these interact with each other. Furthermore, some abstract structures like superquadrics and generalised cylinders may be even more suitable for approximating the body parts. What is very exciting about this approach is that the parameters for these objects are quite simple. In order to better model the relation between these, we make use of constraints and hierarchies between our objects.
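To make the primitive-plus-constraints idea a little more concrete, the sketch below represents an arm as two cylinder primitives in a parent/child hierarchy and checks a joint-angle constraint between them. The dimensions, directions and the elbow limit are arbitrary illustrative values, not parameters from any cited system.

    # Tiny illustrative sketch: an arm as two cylinder primitives in a
    # parent/child hierarchy, with a joint-angle constraint check.
    import numpy as np
    from dataclasses import dataclass, field

    @dataclass
    class Cylinder:
        name: str
        length: float                     # metres
        radius: float                     # metres
        direction: np.ndarray             # unit vector along the axis
        children: list = field(default_factory=list)

    def joint_angle(parent, child):
        """Angle (degrees) between a primitive and its child at their joint."""
        cosang = np.clip(np.dot(parent.direction, child.direction), -1.0, 1.0)
        return np.degrees(np.arccos(cosang))

    upper_arm = Cylinder("upper_arm", 0.30, 0.045, np.array([0.0, -1.0, 0.0]))
    forearm = Cylinder("forearm", 0.27, 0.035,
                       np.array([0.0, -0.4, 0.9]) / np.linalg.norm([0.0, -0.4, 0.9]))
    upper_arm.children.append(forearm)

    ELBOW_LIMIT_DEG = 150                 # arbitrary constraint on elbow flexion
    angle = joint_angle(upper_arm, forearm)
    print(f"elbow angle {angle:.1f} deg, within limit: {angle <= ELBOW_LIMIT_DEG}")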

The skeletal version (right) effectively models the hand (left). It has fewer parameters than the volumetric version and is easier to compute, making it suitable for real-time gesture analysis systems.

Skeletal-based algorithms

Instead of using intensive processing of the 3D models and dealing with a lot of parameters, one can just use a simplified version of joint angle parameters along with segment lengths. This is known as a skeletal representation of the body, where a virtual skeleton of the person is computed and parts of the body are mapped to certain segments. The analysis here is done using the position and orientation of these segments and the relation between each of them (for example, the angle between the joints and the relative position or orientation).

    Advantages of using skeletal models:

    Algorithms are faster because only key parameters are analyzed.

Pattern matching against a template database is possible (a minimal matching sketch follows this list)

    Using key points allows the detection program to focus on the significant parts of the body
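The sketch below illustrates the template-matching point above: a pose is reduced to a vector of joint angles and classified by picking the nearest entry in a small template database. The gesture templates and angle values are invented for illustration.

    # Minimal sketch of skeletal template matching: reduce a pose to joint
    # angles and pick the closest entry in a small template database.
    import numpy as np

    TEMPLATES = {
        # joint-angle vectors in degrees: [thumb, index, middle, ring, pinky] flexion
        "open_hand": np.array([10.0, 5.0, 5.0, 5.0, 10.0]),
        "fist":      np.array([80.0, 160.0, 160.0, 160.0, 150.0]),
        "point":     np.array([70.0, 10.0, 160.0, 160.0, 150.0]),
    }

    def classify_pose(joint_angles_deg):
        """Return the template name with the smallest Euclidean angle distance."""
        angles = np.asarray(joint_angles_deg, dtype=float)
        return min(TEMPLATES, key=lambda name: np.linalg.norm(TEMPLATES[name] - angles))

    print(classify_pose([65, 15, 150, 155, 140]))   # -> 'point'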


These binary silhouette (left) or contour (right) images represent typical input for appearance-based algorithms. They are compared with different hand templates and, if they match, the corresponding gesture is inferred.

Appearance-based models

These models no longer use a spatial representation of the body, because they derive the parameters directly from the images or videos using a template database. Some are based on deformable 2D templates of parts of the human body, particularly the hands. Deformable templates are sets of points on the outline of an object, used as interpolation nodes for approximating the object's outline. One of the simplest interpolation functions is linear, which performs an average shape from point sets, point variability parameters and external deformators. These template-based models are mostly used for hand tracking, but could also be of use for simple gesture classification.

A second approach to gesture detection using appearance-based models uses image sequences as gesture templates. Parameters for this method are either the images themselves, or certain features derived from them. Most of the time, only one (monoscopic) or two (stereoscopic) views are used.
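One standard, off-the-shelf way to compare a binary silhouette against stored hand templates is Hu-moment shape matching, for example via OpenCV's matchShapes. The sketch below uses synthetic shapes (a filled circle and square) as stand-ins for real silhouettes; the 0.1 acceptance threshold is illustrative, and OpenCV 4.x is assumed (findContours returning two values).

    # Sketch: compare a binary silhouette with stored templates using
    # Hu-moment shape matching (cv2.matchShapes). OpenCV 4.x assumed.
    import cv2
    import numpy as np

    def largest_contour(binary_img):
        # OpenCV 4.x: findContours returns (contours, hierarchy).
        contours, _ = cv2.findContours(binary_img, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
        return max(contours, key=cv2.contourArea)

    def best_template(query_mask, template_masks, threshold=0.1):
        """Return the best-matching template name, or None if nothing is close."""
        q = largest_contour(query_mask)
        scores = {name: cv2.matchShapes(q, largest_contour(m), cv2.CONTOURS_MATCH_I1, 0.0)
                  for name, m in template_masks.items()}
        name = min(scores, key=scores.get)
        return name if scores[name] < threshold else None

    # Synthetic stand-ins for silhouettes: a filled circle and a filled square.
    circle = np.zeros((200, 200), np.uint8); cv2.circle(circle, (100, 100), 60, 255, -1)
    square = np.zeros((200, 200), np.uint8); cv2.rectangle(square, (50, 50), (150, 150), 255, -1)
    query = np.zeros((200, 200), np.uint8); cv2.circle(query, (90, 110), 40, 255, -1)

    print(best_template(query, {"circle": circle, "square": square}))   # -> 'circle'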

Challenges

There are many challenges associated with the accuracy and usefulness of gesture recognition software. For image-based gesture recognition there are limitations on the equipment used and on image noise. Images or video may not be under consistent lighting, or in the same location. Items in the background or distinct features of the users may make recognition more difficult.

    The variety of implementations for image-based gesture recognition may also cause issue for viability

    of the technology to general usage. For example, an algorithm calibrated for one camera may not

    work for a different camera. The amount of background noise also causes tracking and recognition

    difficulties, especially when occlusions (partial and full) occur. Furthermore, the distance from the

    camera, and the camera's resolution and quality, also cause variations in recognition accuracy.

    In order to capture human gestures by visual sensors, robust computer vision methods are also

    required, for example for hand tracking and hand posture recognition[30][31][32][33][34][35][36][37][38]

    or for

    capturing movements of the head, facial expressions or gaze direction.

    [edit]"Gorilla arm"

    "Gorilla arm" was a side-effect of vertically-oriented touch-screen or light-pen use. In periods of

    prolonged use, users' arms began to feel fatigue and/or discomfort. This effect contributed to the

    decline of touch-screen input despite initial popularity in the 1980s.[39][40]

    Gorilla arm is not a problem for short-term use, since they only involve brief interactions which do not

    last long enough to cause gorilla arm.

See also

Pen computing - discussion of gesture recognition for tablet computers

Mouse gesture

Computer vision

Dialogue-Assisted Visual Environment for Geoinformation (DAVE_G)

    Gestures

    Hidden Markov model

    Language technology

    Omek Interactive


    SixthSense

    SoftKinetic

    Sketch recognition

    Multi-touch gestures

    AnyTouch by ayotle and digitas_fr

    Ayotle Home Page

Recognizing gestures: Interface design beyond point-and-click

By Robert Cravotta, Technical Editor - August 16, 2007

The simplest and most basic gesture is pointing, and it is an effective way for most people to communicate with each other, even in the presence of language barriers. However, pointing quickly fails as a way to communicate when the object or concept that a person is trying to convey is not in sight to point at. Taking gesture recognition beyond simple pointing greatly increases the type of information that two people can communicate with each other. Gesture communication is so natural and powerful that parents are increasingly using it to enable their babies to engage in direct, two-way communication with their caregivers, through baby sign language, long before the babies can clearly speak (Reference 1).

The level of communication between users and their electronic devices has been largely limited to a pointing interface. To date, only a few common extensions to the pointing interface exist. They include single- versus double-click or -tap devices and devices that allow users to hold down a button while moving the pointing focus, such as mice, trackballs, and touchscreens. A user's ability to communicate naturally with a computing device through a gesture interface and a speech-recognition interface, such as a multitouch display or an optical-input system, is still largely an emerging capability.

Consider the new and revolutionary mobile phone that relies on a touchscreen-driven user interface instead of physical buttons and uses a predictive engine that helps users with typing on the flat panel. This description could apply to Apple's iPhone, which the company launched in June, but it could also apply to the IBM Simon, which the company launched with Bell South in 1993, 14 years before the iPhone. Differences exist between the two touch interfaces. For example, the newer units support multitouch gestures, such as pinching an image to size it and flicking the display to scroll the content. This article touches on how gesture interfaces are evolving and what they mean for future interfaces.
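As an aside on the multitouch gestures mentioned above, the sketch below shows one common way a pinch gesture can be mapped to a zoom factor: track the distance between two touch points and use the ratio of the current to the initial distance as the scale. The class and method names are hypothetical and do not correspond to any particular phone's API.

```python
import math

class PinchTracker:
    """Minimal pinch-to-zoom sketch: two touch points map to a zoom factor."""

    def __init__(self):
        self.initial_distance = None

    @staticmethod
    def _distance(p1, p2):
        return math.hypot(p1[0] - p2[0], p1[1] - p2[1])

    def on_touch_update(self, p1, p2):
        """p1, p2 are (x, y) positions of the two fingers."""
        d = self._distance(p1, p2)
        if self.initial_distance is None:
            self.initial_distance = d
            return 1.0                        # gesture just started, no zoom yet
        return d / self.initial_distance      # >1 zooms in, <1 zooms out

    def on_touch_end(self):
        self.initial_distance = None

# Usage: the fingers move apart, so the content should scale up by 1.5x.
tracker = PinchTracker()
tracker.on_touch_update((100, 100), (200, 100))        # initial spread: 100 px
zoom = tracker.on_touch_update((80, 100), (230, 100))  # spread: 150 px -> 1.5
```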

Much of the technology driving many of today's latest and most innovative gesture-like interfaces is not exactly new: most of these interfaces can trace their heritage to products or projects from the past few decades. According to Reference 2, multitouch-panel interfaces have existed for at least 25 years, and that length of time is on a par with the 30 years that elapsed between the invention of the mouse in 1965 and the mouse's reaching its tipping point as a ubiquitous pointing device, which happened with the release of Microsoft Windows 95. Improvements in the hardware for these types of interfaces enable designers to shrink end systems and lower their cost. More important, however, these improved interfaces let designers leverage additional low-cost software-processing capacity to identify more contexts and thus better interpret what a user is trying to tell the system to do. In other words, most of the advances in emerging gesture interfaces will come not so much from new hardware as from more complex software algorithms that exploit the strengths and compensate for the weaknesses of each type of input interface. Reference 3 provides a work-in-progress directory of sources for input technologies.

In addition to the commercial launch of the iPhone, this year has seen the Korean and European launch of the LG Electronics-manufactured, Prada-designed LG Prada phone, the successful commercial launch of Nintendo's Wii gesture-interface console, and the pending launch of the multitouch Microsoft Surface platform (see sidebar, "Multitouch surfaces"). Are the lessons designers learned from previous iterations of gesture interfaces sufficient to give today's latest innovative products the legs they need to survive more than a year or two and finally usher in the promising age of more natural communication between humans and machines? These platforms have access to large amounts of memory and worldwide connectivity through the Internet for software updates. So, perhaps the more relevant question is: can the flexible, programmable nature of these platforms enable the gesture interfaces to adjust to the set of as-yet-unlearned lessons without going back to the drawing board?

Gesture-recognition interfaces are not limited to just gaming and infotainment products. Users of Segway's PTs (personal transporters) intuitively command their transporters by leaning in the appropriate direction to move forward, stop, and turn left or right (Figure 1). Some interfaces focus on capturing a rich range of subtle gestures to emulate the use of a real-world tool rather than issuing abstract commands to a computer. For example, Wacom's Intuos and Cintiq tablets, coupled with tablet-enhanced paint and graphics software, can faithfully capture an artist's hand and tool motions in six dimensions: up and down, left and right, downward pressure on the tablet surface, stylus-tilt angle, stylus-tilt direction, and stylus rotation. This feature enables the software to re-create not only the gross motions but also the fine motions, such as the twisting of a user's hand, to more realistically emulate the behavior of complex objects, such as paint and drawing tools.
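One way to picture those six dimensions is as a single record per stylus sample, which the paint software can then map onto brush behavior. The field names and the pressure-to-width mapping below are illustrative assumptions, not Wacom's actual driver interface.

```python
from dataclasses import dataclass

@dataclass
class StylusSample:
    """One sample of pen input, capturing the six dimensions described above."""
    x: float               # left/right position on the tablet
    y: float               # up/down position on the tablet
    pressure: float        # downward pressure on the tablet surface (0..1)
    tilt_angle: float      # how far the stylus leans from vertical, in degrees
    tilt_direction: float  # compass direction of the lean, in degrees
    rotation: float        # rotation of the barrel about its own axis, in degrees

def brush_width(sample: StylusSample, max_width: float = 20.0) -> float:
    """Example use: let pressure drive stroke width, as a paint program might."""
    return max_width * sample.pressure

stroke = StylusSample(x=120.0, y=80.0, pressure=0.6,
                      tilt_angle=30.0, tilt_direction=90.0, rotation=15.0)
width = brush_width(stroke)   # 12.0 units wide at 60% pressure
```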


Another example of capturing subtle motions to emulate the direct manipulation of real-world tools is Intuitive Surgical's da Vinci Surgical System. This system employs a proprietary 3-D-vision system and two sets of robots (the masters and the EndoWrist instruments) to faithfully translate the intent of a surgeon's hand and finger motions on the masters into control of the EndoWrist instruments during robotic laparoscopic surgery (Figure 2). Decoupling the surgeon's hand motions from the on-site surgical instruments through the masters not only allows the surgery to require only a few small cuts to insert the surgical tools into the patient, but also affords the surgeon a better posture, delaying the onset of fatigue during long procedures. It also enables greater surgical precision, an increased range of motion, and improved dexterity through digital filtering, compared with the surgeon directly manipulating the surgical tools, as in traditional laparoscopic surgery.
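The digital filtering mentioned above can be thought of as smoothing and scaling the surgeon's hand motion before it drives the instruments. The sketch below uses a simple exponential moving average plus a fixed motion-scaling factor as a generic illustration; it is not Intuitive Surgical's actual algorithm.

```python
class MotionFilter:
    """Smooth and scale an input motion stream (generic illustration only)."""

    def __init__(self, smoothing=0.2, scale=0.25):
        self.smoothing = smoothing   # 0..1; lower values smooth more heavily
        self.scale = scale           # e.g. 0.25 gives a 4:1 motion reduction
        self._state = None

    def update(self, position):
        """position: (x, y, z) of the master control; returns the filtered,
        scaled position to send to the instrument."""
        if self._state is None:
            self._state = position
        else:
            self._state = tuple(
                prev + self.smoothing * (new - prev)
                for prev, new in zip(self._state, position)
            )
        return tuple(self.scale * v for v in self._state)

f = MotionFilter()
command = f.update((10.0, 0.0, 5.0))   # smoothed, scaled instrument command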

The 3-D-vision system is a critical feedback interface that enables surgeons to use the da Vinci Surgical System effectively and avoid mistakes. Additionally, the system complements the visual-feedback interface with some simple haptics, or force feedback, such as detecting when internal and external collisions occur during a motion. Research organizations, such as Johns Hopkins University, are using the da Vinci Surgical System to study technologies that support a sense of touch. "The da Vinci is a perfect 'laboratory,' as it provides high-quality motion and video data of a focused and stylized set of goal-directed tasks," says Gregory D Hager, professor of computer science at Johns Hopkins. "We envision using the statistical models we develop as a way of making the device more 'intelligent' by allowing it to recognize what is happening in the surgical field."

Unseen potential

"Great experiences don't happen by accident," says Bill Buxton, principal researcher at Microsoft. "They are the result of deep thought and deliberation." His decidedly low-tech example involves two manual juicers that look similar and have the same user interface (Reference 4). If you can use one, you can use the other. The juice tastes the same from each, and each takes the same amount of time to make the juice. However, they differ in the method and the timing of the user's applying the maximum force. The juicer with the constant-gear-ratio effect requires the user to apply the maximum force at the end of the lever pull, whereas the other juicer delivers a variable-gear-ratio effect that reduces the pressure the user needs to apply at the end of the lever pull. In essence, the qualitative difference between the juicers is the result of nonobvious mechanisms hidden in the interface.

These examples of gesture-recognition interfaces are direct-control interfaces, in which users explicitly tell or direct the system to do what they want. However, the emerging trend toward embedded or invisible human-machine interfaces is an area of even greater potential. Embedded processing, which is usually invisible to the end user, continues to enable designers to make their products perform more functions at lower cost and with better energy efficiency. As the cost of sensors and processing capacity continues to drop and the processors are able to optimize the essential functions of the systems they control, an opportunity arises for the extra available processing to provide an implicit, or embedded, human-machine interface between the user and the system. In other words, users may imply their intent to the system without consciously being aware that they are doing so. This emerging capability is essential to enabling systems to use predictive compensation to better accommodate a user's inexperience or errors and still perform what the user intended.

The Simon's PredictaKey keyboard explicitly listed its top six predicted-letter candidates and allowed the user to select from that list. To take advantage of the prediction engine, the user had to explicitly engage with the engine's suggestions and choose from them. In contrast, the iPhone's typing interface manifests itself in several obvious and hidden ways to improve typing speed and accuracy. First, it presents a specialized key layout for each application, so that only keys relevant to that application are available for input. As the user types, the system may predict the word and present it to the user during typing; if the word is correct, the user can select it by pressing the space key on the display or simply continue typing. Likewise, the system tries to identify potentially misspelled words and presents the correct spelling in a similar fashion, allowing the user to accept or ignore the proposed correction.

However, the new and invisible magic in the iPhone typing interface is that it compensates for the possibility of the user's pressing the wrong letter on the display panel by dynamically resizing the target area, or tap zone, assigned to each letter without changing the displayed size of any of the letters, based on its typing engine's predictions of what letter the user will select next (Reference 5). The letters that the prediction engine believes the user may press next receive a larger tap zone that can overlap with the display area of nearby, lower-probability letters, which receive a smaller tap zone as a result. This feature increases the chances of selecting the predicted letter and decreases the chances of selecting an unpredicted letter that is adjacent to the predicted letter.
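The invisible tap-zone resizing can be sketched as a hit test that is biased by each letter's predicted probability: the drawn keys stay the same size, but a touch that lands between two keys resolves to the more likely one. The key geometry and probabilities below are invented for illustration; this is not Apple's implementation.

```python
import math

def resolve_key(touch, key_centers, next_letter_probs):
    """Pick the key for a touch point, biased toward predicted letters.

    touch: (x, y) of the tap.
    key_centers: dict letter -> (x, y) centre of the drawn key.
    next_letter_probs: dict letter -> probability from the prediction engine.
    A higher probability effectively enlarges that key's invisible tap zone.
    """
    best_letter, best_score = None, float("inf")
    for letter, (kx, ky) in key_centers.items():
        distance = math.hypot(touch[0] - kx, touch[1] - ky)
        prob = next_letter_probs.get(letter, 0.01)
        score = distance / prob          # likely letters tolerate larger misses
        if score < best_score:
            best_letter, best_score = letter, score
    return best_letter

# A tap lands midway between 'q' and 'w'; the prediction engine expects 'w'.
keys = {"q": (10, 10), "w": (30, 10)}
probs = {"q": 0.05, "w": 0.60}
resolve_key((20, 10), keys, probs)   # resolves to 'w'
```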

Although not considered a strict user interface, such as that between a user and a computer, some automobile-safety features implement an early form of implicit communication interface for predictive-safety features. For example, to determine whether to warn the driver of an imminent lane departure, the system can examine the turn signal to determine whether the impending lane departure is intentional or accidental. People unintentionally and implicitly communicate their presence to passenger-detection systems that may control whether safety systems should deploy in the event of an accident; for example, the automobile may adjust how the air bag deploys to avoid certain types of injuries for passengers of different sizes. Electronic stability-control systems can compare the driver's implied intention, by examining the steering and braking inputs, with the vehicle's actual motion; they can then appropriately apply the brakes at each wheel and reduce engine power to help correct understeer (plowing), oversteer (fishtailing), and drive-wheel slippage, helping the driver maintain some control of the vehicle.
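A toy version of the lane-departure decision described above: the warning is suppressed when the turn signal indicates that the departure is intentional. The function and argument names are hypothetical, not taken from any vehicle system.

```python
def should_warn_lane_departure(drifting_toward, turn_signal):
    """Warn only when the car drifts toward a lane edge the driver has not
    signalled for.

    drifting_toward: 'left', 'right', or None if the car is holding its lane.
    turn_signal: 'left', 'right', or None if no indicator is active.
    """
    if drifting_toward is None:
        return False                        # staying in lane, nothing to warn about
    return drifting_toward != turn_signal   # a signalled departure is intentional

should_warn_lane_departure("left", None)    # True: unintentional drift
should_warn_lane_departure("left", "left")  # False: driver signalled the change
```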

The control systems of the most highly maneuverable fighter aircraft offer some insight into the possible future of consumer-level control of complex systems. Because these aircraft rely on high levels of instability to achieve their maneuverability, the pilot can no longer explicitly and directly control the aircraft's subsystems; rather, the embedded-processing system handles those details and enables the pilot to focus on higher-level tasks. As automobile-control systems become better able to predict a driver's intentions and correlate those intentions with the state of the vehicle and the surrounding environment, they may be able to deliver even higher levels of energy efficiency by reducing energy loads in situations in which they are currently unnecessary, without sacrificing safety. In each case, the ability of the system to better understand the user's intention and act appropriately correlates with the system's ability to invisibly and accurately predict what the user can and might do next.

No matter how rich and intuitive an interface is, its ultimate success and adoption depend on how well the user and the system can signal each other and compensate for the possible range of misunderstandings. Uncertainty or unpredictability between how to command a system and its resulting behavior can kill the immediate usefulness and delay the adoption of a gesture interface. Merely informing the user repeatedly that there is an error is insufficient in modern electronic equipment; these devices often guide the user about the nature of the error or misunderstanding and how they might correct the condition. Modern interfaces employ a combination of sensors, improved processing algorithms, and user feedback. This combination provides a variety of mechanisms to reduce ambiguity and uncertainty between the user and the system so that each can more quickly and meaningfully compensate for unexpected behavior of the other (see sidebar, "I'll compensate for you").

One way to compensate for potential misunderstandings is for the system to control and reduce the set of possible inputs to only those with a valid context, as with the iPhone's specialized key layouts. Applications that can segment and isolate narrow contexts and apply strong goal-defined tasks within each one are good candidates for this type of compensation. Handwriting systems based on the Graffiti recognition system, such as Palm PDAs, improved the usability of a handwriting interface by narrowing the possibility of erroneous inputs, but doing so involved a significant learning curve before users could reliably use the system. Speech-recognition systems that require no training from a speaker increase their success rate by significantly limiting the number of words the systems can recognize, such as the 10 digits, or by presenting the user with a short menu of responses.
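The context-narrowing strategy can be sketched by keeping, for each application state, only the commands that are currently valid, so the recognizer never has to distinguish between words that cannot occur at that point. The command sets and confidence scores below are invented for illustration.

```python
# Hypothetical per-context vocabularies: the recognizer only ever has to
# choose among the words that are valid in the current state.
CONTEXT_VOCAB = {
    "main_menu": {"play", "settings", "quit"},
    "enter_pin": {"zero", "one", "two", "three", "four",
                  "five", "six", "seven", "eight", "nine"},
}

def recognize(utterance_scores, context):
    """utterance_scores: dict word -> recognizer confidence for one utterance.
    Restrict the choice to the words allowed in the current context."""
    allowed = CONTEXT_VOCAB[context]
    candidates = {w: s for w, s in utterance_scores.items() if w in allowed}
    return max(candidates, key=candidates.get) if candidates else None

# 'quick' scores slightly higher, but it is not a valid menu command,
# so the context-restricted recognizer still returns 'quit'.
recognize({"quit": 0.4, "quick": 0.5}, "main_menu")
```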

Another method of compensating for misunderstandings is to eliminate or move translations from the user to the system. HP Labs India is working with a pen-based device, the GKB (gesture keyboard), which allows users to enter phonetic scripts, such as Devanagari and Tamil, as text input without the benefit of a language-specific keyboard. Another example is the Segway PT, which once required a user to translate a forward or backward twist into a signal to turn left or right. Now, it instead allows the user to indicate left or right by leaning in the desired direction. The newer interface control removes the ambiguity of which twist direction aligns with which turn direction, and it aligns the control with the natural center-of-gravity use scenario for the system, which greatly increases its chances of being a useful and sustainable interface.

Another important way to compensate for potential errors or misunderstandings is to give users enough relevant feedback that they can appropriately change their expectations or behavior. Visual feedback is a commonly used mechanism. The mouse cursor on most systems performs more functions than just acting as a pointing focus; it also acts as primary feedback to the user about when the system is busy and why. The success of the gesture interface with the Wii remote hinges in part on how well the system software improves over time to provide better sensitivity to player gestures. It also depends on how well the system provides feedback, such as a visual cue on the display, that points out how users can make small adjustments to their motions so that the system properly interprets their intended gestures.


Haptic, or tactile, feedback engages the user's sense of touch; it is a growing area for feedback, especially as a component of multimodal feedback involving more than a single sense. Game consoles have employed rumble features for years in their handheld controllers. The Segway PT signals error conditions to the user through force feedback in the control stick. The da Vinci Surgical System uses force feedback to signal boundary collisions, such as when the EndoWrist instrument makes contact with the surface of the cutting target. Haptic feedback can compensate for the weaknesses of other feedback methods, such as audio cues in noisy environments.

Haptic feedback can also help offload visual sensory overload by freeing the user's eyes from seeking visual confirmation that the system has received an input, allowing the user to focus on other details. For example, the iPhone keypad does not implement haptic feedback to signal which key was pressed and when, so the user must visually confirm each key press that the system processes. One company, Immersion, offers a way to simulate a tactile sensation for mobile devices by issuing precise pulse control over a device's vibration actuator within a 5-msec window of the input event.

When all other compensation methods fail to eliminate a misunderstanding, designers can employ a context-relevant response to address the uncertainty of a given input. A common response is to issue a warning and ask the user to repeat the input, but this risks frustrating the user if the system repeatedly requests the input with no additional guidance about what it needs. The system can make a best guess as to what the input was and ask the user to confirm that the guess is correct; this scenario can also frustrate the user if no method is available to refine the guess on a second try or if the system must confirm an input too often. A possible strategy for minimizing the use of these types of responses is for the system to profile the user'