
SIGN LANGUAGE LEAPS TO ENGLISH

Nicholas Frost, William Grant, Kien Nguyen, and Parth Parikh
{naf77, wrg34, khn22, prp60}@scarletmail.rutgers.edu
Advisor: Professor Anand Sarwate
Rutgers University, Department of Electrical and Computer Engineering

ABSTRACT

§ A valuable device would translate sign language into spoken words, easing the communication barrier between people who know only sign language and people who know only English.

§ Microsoft has demonstrated a sign language interpreter using its Kinect product, but the device had trouble identifying all the fingers; we intend to improve on those results by using the Leap Motion device.

§ The Leap is desirable because it focuses on hand movements and is smaller and more portable. It is also cheaper and runs easily on the three major operating systems.

§ We use a recurrent neural network to train on and analyze sign language data from the Leap, observing 372 features of the hand, including the three-dimensional coordinates of all the joints and of the center of the palm.

§ We have achieved 95% accuracy over 26 classes, namely the letters of the alphabet.

BACKGROUND

§ American Sign Language (ASL): This language is used by approximately 500 thousand to 2 million people across the United States. It is most common among the deaf community.

§ Previous work: Microsoft's Kinect has been used in conjunction with a random forest to classify stationary ASL letters with 92% accuracy. Further, the Leap has been used with a support vector machine, attaining 80% accuracy.

§ Leap Motion: This is a piece of hardware that connects to a computer via USB. We use it to collect approximately 60 frames per second, where each frame holds 372 features on the hands. The raw data takes the form of a grayscale stereo image of the near-infrared light spectrum, separated into the left and right cameras.
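A minimal sketch of how one captured frame could be flattened into the 372-feature vector described above; the JSON key names below are hypothetical placeholders, not the official Leap SDK schema, which varies by SDK version.

    import json
    import numpy as np

    def frame_to_features(frame_json):
        """Flatten one Leap frame (JSON) into a single feature vector.

        Assumes the recording lists, per hand, the 3-D palm center and
        the 3-D position of every finger joint. The key names below are
        hypothetical placeholders, not the official Leap SDK schema.
        """
        frame = json.loads(frame_json)
        features = []
        for hand in frame["hands"]:
            features.extend(hand["palm_center"])       # x, y, z of the palm
            for joint in hand["joints"]:               # every finger joint
                features.extend(joint["position"])     # x, y, z per joint
        return np.asarray(features, dtype=np.float32)  # 372 values per frame here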

METHODOLOGY

Leap: Use a hand to sign a letter.

Data reader and raw data listener: Produces a JSON file with all the information on the hand.

Transformer: Samples, filters, and normalizes the raw data (a sketch follows below).

Recurrent neural network: Trains on and classifies the letter via the Keras library.

Output: The English letter and a histogram of the top five classifications, shown in a friendly user interface.
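A minimal sketch of the transformer stage, assuming that "sample" means resampling each recording to a fixed number of frames and "normalize" means centering and scaling to unit spread; the poster does not specify the exact filter, so the choices below are illustrative.

    import numpy as np

    def transform(frames, target_len=60):
        """Sample, filter, and normalize one raw recording.

        frames: (T, 372) array of feature vectors captured at ~60 fps.
        target_len is an assumed fixed sequence length for the classifier.
        """
        frames = np.asarray(frames, dtype=np.float32)
        idx = np.linspace(0, len(frames) - 1, target_len).astype(int)
        seq = frames[idx]                            # uniform temporal sample
        seq = seq - seq.mean(axis=0, keepdims=True)  # center the coordinates
        return seq / (seq.std() + 1e-8)              # scale to unit spread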

§ Conventional classification methods do not work well with time series data (i.e., letters signed with motion).

§ We solve this by employing a recurrent neural network, specifically long short-term memory (LSTM), to extract temporal features; a sketch of such a model follows.
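The poster names Keras and LSTM but not the exact architecture, so the following is a minimal sketch: the layer size, 60-frame windows, and training settings are assumptions, and top_five illustrates how the interface's top-five histogram could be produced.

    import numpy as np
    from keras.models import Sequential
    from keras.layers import LSTM, Dense

    # Assumed shapes: 60 frames per sample, 372 features per frame, 26 letters.
    model = Sequential([
        LSTM(64, input_shape=(60, 372)),   # extracts temporal features
        Dense(26, activation="softmax"),   # one output class per letter
    ])
    model.compile(optimizer="adam",
                  loss="categorical_crossentropy",
                  metrics=["accuracy"])

    def top_five(sequence):
        """Return the five most likely letters for one (60, 372) sequence."""
        probs = model.predict(sequence[np.newaxis])[0]   # (26,) class scores
        best = np.argsort(probs)[::-1][:5]
        return [(chr(ord("A") + i), float(probs[i])) for i in best]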

RESULTS AND FUTURE WORK

§ We have achieved 95% accuracy in identifying the 26 letters of the alphabet.

§ Confusion matrix: [figure on the original poster]

§ Key takeaway: We have demonstrated proof of concept for a Leap device to be used as a sign language translator. This directly impacts the deaf population across the world.

§ Future work:
  § Extend the capability of the software to identify more classes, including words and phrases.
  § Include video data to identify signs that depend on the relative location of the body.
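The confusion matrix itself appears only as a figure on the original poster; as a hedged illustration, the sketch below shows one common way to compute such a matrix from held-out predictions using scikit-learn, an assumed dependency that the poster does not name.

    import numpy as np
    from sklearn.metrics import confusion_matrix

    # y_true and y_pred are integer letter labels (0-25) on a held-out set;
    # the arrays below are placeholder values, not the project's real output.
    y_true = np.array([0, 1, 2, 2, 3])
    y_pred = np.array([0, 1, 2, 3, 3])
    cm = confusion_matrix(y_true, y_pred, labels=list(range(26)))  # (26, 26)
    per_letter_acc = cm.diagonal() / np.maximum(cm.sum(axis=1), 1)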

REFERENCES

§ C. Dong, M. C. Leu, and Z. Yin, "American Sign Language Alphabet Recognition Using Microsoft Kinect," 2015 IEEE Conference on Computer Vision and Pattern Recognition, 2015.

§ T. Kim, J. Keane, W. Wang, H. Tang, J. Riggle, G. Shakhnarovich, D. Brentari, and K. Livescu, "Lexicon-Free Fingerspelling Recognition from Video: Data, Models, and Signer Adaptation," arXiv:1609.07876v1, 2016.

§ H. Sakoe and S. Chiba, "Dynamic programming algorithm optimization for spoken word recognition," IEEE Trans. Acoust., Speech, Signal Process., vol. 26, no. 1, pp. 43–49, 1978.

§ C. H. Chuan, E. Regina, and C. Guardino, "American Sign Language Recognition Using Leap Motion Sensor," Machine Learning and Applications (ICMLA), 2014 13th International Conference on, Detroit, MI, 2014, pp. 541–544.

ACKNOWLEDGEMENTS

We would like to acknowledge Professor Hana Godrich and undergraduate student Michael Soskind for their consistent support throughout the project. We further would like to thank the Rutgers Sign Language Club, and specifically Isabeau Touchard, for being the source of authentic sign language data.

Image credits (data collection example and Leap diagnostic viewer):
https://www.google.com/url?sa=i&rct=j&q=&esrc=s&source=images&cd=&cad=rja&uact=8&ved=0ahUKEwjM-K-bjKfTAhXJ7CYKHfr9CgkQjRwIBw&url=https%3A%2F%2Fwww.pubnub.com%2Fblog%2F2015-08-19-motion-controlled-servos-with-leap-motion-raspberry-pi%2F&psig=AFQjCNGjqqDvWcMW45-rLNyC52-9QbXGSQ&ust=1492367466269697
http://colah.github.io/posts/2015-08-Understanding-LSTMs/
