Natural Language Processing in iOS / OSX
Tech Talk NLP Tools in iOS/OSX
Todd Kramer
NLP Tools in iOS/OSX: Topics
• CFStringTransform
  • transliteration, normalization
• CFStringTokenizer
  • string tokenization, language identification
• UITextChecker
  • spell check
• NSLinguisticTagger
  • parts of speech tagging, named entity recognition, lemmatization, language/script identification
• NSDataDetector
  • data detection
NLP Tools in iOS/OSX: CFStringTransform
The CFStringTransform Function
Transliterate Thai to Latin
Original: สวัสดี; Transformed: sw̄ạsd̄ī
Transliterate Latin to Gujarati
Original: Gujarātī; Transformed: ગુજરાતી
Remove Diacritics and Accents
Original: sw̄ạsd̄ī; Transformed: swasdi
Describe Unicode Characters
Original: 👍; Transformed: \N{THUMBS UP SIGN}
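The slides that held the code for these transformations did not survive in this transcript. The following is a minimal Swift sketch of how output like the above could be produced. It is not the speaker's original code. `applyTransform` is a hypothetical helper; `CFStringTransform` and the `kCFStringTransform…` constants are Core Foundation API. Which transforms the talk actually used is an assumption, as is the ICU identifier `"Latin-Gujarati"` passed as a plain string.

```swift
import Foundation

/// Hypothetical helper (not from the talk): applies a transform to a copy of
/// `input` and returns the result, or nil if the transform fails.
func applyTransform(_ transform: CFString, to input: String, reverse: Bool = false) -> String? {
    // CFStringTransform mutates its argument in place, so work on a mutable copy.
    let buffer = NSMutableString(string: input)
    // A nil range means "transform the whole string".
    guard CFStringTransform(buffer as CFMutableString, nil, transform, reverse) else {
        return nil
    }
    return buffer as String
}

// Transliterate to Latin script.
let latin = applyTransform(kCFStringTransformToLatin, to: "สวัสดี")
// Strip accents and other combining marks.
let plain = applyTransform(kCFStringTransformStripCombiningMarks, to: "café")
// Replace each character with its Unicode name.
let named = applyTransform(kCFStringTransformToUnicodeName, to: "👍")
// Assumption: the Gujarati slide used an ICU transform ID passed as a string.
let gujarati = applyTransform("Latin-Gujarati" as CFString, to: "Gujarātī")

print(latin ?? "nil", plain ?? "nil", named ?? "nil", gujarati ?? "nil")
```

Returning an optional keeps a failed or unknown transform identifier visible to the caller instead of silently handing back the untransformed input.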
CFStringTokenizer
NLP Tools in iOS/OSX: CFStringTokenizer
Tokenize Into Words: Simplified Chinese
Tokens: [人, 人生, 而, 自由, 在, 尊严, 和, 权利, 上, 一律, 平等, 他们, 赋有, 理性, 和, 良心, 并, 应, 以, 兄弟, 关系, 的, 精神, 互相, 对待]
Transliterate Tokens: Simplified Chinese
Tokens: [rén, rénshēng, ér, zìyóu, zài, zūnyán, hé, quánlì, shàng, yīlǜ, píngděng, tāmén, fùyǒu, lǐxìng, hé, liángxīn, bìng, yìng, yǐ, xiōngdì, guānxī, de, jīngshén, hùxiāng, duìdài]
Language Identification: Icelandic
Language Code: is
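As a rough illustration of the three CFStringTokenizer results above (word tokens, Latin transcriptions, and a language code), here is a minimal Swift sketch. `WordToken`, `wordTokens(in:)` and `bestLanguage(for:)` are hypothetical names introduced for this example; the `CFStringTokenizer…` functions and constants are Core Foundation API. Creating the tokenizer with the word unit only and then asking for the Latin transcription per token is an assumption about how the talk did it.

```swift
import Foundation

/// One word token plus its Latin transcription, when the tokenizer offers one.
struct WordToken {
    let text: String
    let latin: String?
}

/// Hypothetical helper (not from the talk): splits `text` into word tokens.
func wordTokens(in text: String) -> [WordToken] {
    let cfText = text as CFString
    let nsText = text as NSString
    let fullRange = CFRangeMake(0, CFStringGetLength(cfText))
    guard let tokenizer = CFStringTokenizerCreate(
        kCFAllocatorDefault, cfText, fullRange,
        CFOptionFlags(kCFStringTokenizerUnitWord), CFLocaleCopyCurrent()
    ) else { return [] }

    var tokens: [WordToken] = []
    // A raw value of 0 is kCFStringTokenizerTokenNone: no more tokens.
    while CFStringTokenizerAdvanceToNextToken(tokenizer).rawValue != 0 {
        let r = CFStringTokenizerGetCurrentTokenRange(tokenizer)
        let word = nsText.substring(with: NSRange(location: r.location, length: r.length))
        let latin = CFStringTokenizerCopyCurrentTokenAttribute(
            tokenizer, CFOptionFlags(kCFStringTokenizerAttributeLatinTranscription)
        ) as? String
        tokens.append(WordToken(text: word, latin: latin))
    }
    return tokens
}

/// Hypothetical helper: returns the best-guess language code, or nil if none.
func bestLanguage(for text: String) -> String? {
    let cfText = text as CFString
    let range = CFRangeMake(0, CFStringGetLength(cfText))
    guard let code = CFStringTokenizerCopyBestStringLanguage(cfText, range) else {
        return nil
    }
    return code as String
}

let words = wordTokens(in: "Hello brave new world")
print(words.map { $0.text })
print(bestLanguage(for: "Hello brave new world") ?? "unknown")
```

Language guessing is statistical, so short inputs may yield no code or a wrong one; longer passages tend to give more dependable results.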
UITextChecker
NLP Tools in iOS/OSX: UITextChecker
Spell Check
Misspelled Range: (7,4); Guesses: Optional([ice, Bice, bide, nice, vice, bike, bile, bite, bace, bbce, bcce, bdce, bece, bfce, dice, lice, mice, pice, rice, brice, bicep])
Misspelled Range: (12,3); Guesses: Optional([ay, cay, day, say])
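The input sentence and the code behind this output are not in the transcript. A minimal sketch of a loop that could produce ranges and guesses in this shape follows. `Misspelling` and `misspellings(in:language:)` are hypothetical names, and the `"en_US"` default is an assumption; `rangeOfMisspelledWord(in:range:startingAt:wrap:language:)` and `guesses(forWordRange:in:language:)` are UITextChecker methods. The example needs UIKit, so it only builds for iOS targets.

```swift
import UIKit

/// A misspelled range and the checker's suggested replacements.
struct Misspelling {
    let range: NSRange
    let guesses: [String]
}

/// Hypothetical helper (not from the talk): collects every misspelled word.
func misspellings(in text: String, language: String = "en_US") -> [Misspelling] {
    let checker = UITextChecker()
    let fullRange = NSRange(location: 0, length: text.utf16.count)
    var found: [Misspelling] = []
    var offset = 0

    while offset < fullRange.length {
        let range = checker.rangeOfMisspelledWord(
            in: text, range: fullRange, startingAt: offset,
            wrap: false, language: language)
        // NSNotFound signals that nothing misspelled remains after `offset`.
        if range.location == NSNotFound { break }
        // guesses(...) returns an optional array, hence the Optional([...]) output.
        let guesses = checker.guesses(forWordRange: range, in: text, language: language) ?? []
        found.append(Misspelling(range: range, guesses: guesses))
        // Continue searching just past the word that was reported.
        offset = range.location + range.length
    }
    return found
}
```

Passing `wrap: false` and advancing `offset` past each hit is what keeps the loop from reporting the same word twice.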
NSLinguisticTagger
NLP Tools in iOS/OSX: NSLinguisticTagger
Parts of Speech Tagging and Named Entity Recognition
NSLinguisticTagger Schemes
Parts of Speech Tagging and Named Entity Recognition
Token: What; Tag: Pronoun
Token: is; Tag: Verb
Token: the; Tag: Determiner
Token: capital; Tag: Noun
Token: of; Tag: Preposition
Token: New York; Tag: PlaceName
Script Identification
Token: hello; Tag: Latn
Token: สวัสดี; Tag: Thai
Token: bonjour; Tag: Latn
Token: 你; Tag: Hani
Token: 好; Tag: Hani
Token: હેલો; Tag: Gujr
Token: привет; Tag: Cyrl
Token: नमस्ते; Tag: Deva
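Both tagging examples (parts of speech with named entities, and script identification) differ only in the tag scheme, so one sketch can cover them. The helper `tags(in:scheme:options:)` is a hypothetical name, and the option sets are assumptions inferred from the output: `.joinNames` would explain "New York" appearing as a single token. The tagger calls themselves are Foundation API, using the `enumerateTags(in:unit:scheme:options:using:)` variant from newer SDKs than the talk's.

```swift
import Foundation

/// Hypothetical helper (not from the talk): returns (token, tag) pairs for one scheme.
func tags(in text: String,
          scheme: NSLinguisticTagScheme,
          options: NSLinguisticTagger.Options = [.omitWhitespace, .omitPunctuation])
    -> [(token: String, tag: String)] {
    let tagger = NSLinguisticTagger(tagSchemes: [scheme], options: 0)
    tagger.string = text
    let fullRange = NSRange(location: 0, length: text.utf16.count)
    var results: [(token: String, tag: String)] = []

    tagger.enumerateTags(in: fullRange, unit: .word, scheme: scheme, options: options) {
        tag, tokenRange, _ in
        // Tokens without a tag for this scheme are skipped.
        guard let tag = tag else { return }
        let token = (text as NSString).substring(with: tokenRange)
        results.append((token: token, tag: tag.rawValue))
    }
    return results
}

// Parts of speech and named entities; .joinNames keeps multi-word names together.
let question = "What is the capital of New York"
for (token, tag) in tags(in: question, scheme: .nameTypeOrLexicalClass,
                         options: [.omitWhitespace, .omitPunctuation, .joinNames]) {
    print("Token: \(token); Tag: \(tag)")
}

// Script identification reports ISO 15924 codes such as Latn or Cyrl.
let scripts = tags(in: "hello привет", scheme: .script)
for (token, tag) in scripts {
    print("Token: \(token); Tag: \(tag)")
}
```

Part-of-speech and name tags come from a statistical model, so the exact tags can vary between OS versions.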
NSDataDetector
NLP Tools in iOS/OSX: NSDataDetector
Extracting Structured Data
Match: Lunch tomorrow at 12:30PM; - Date: Optional(2014-11-20 20:30:00 +0000)
Match: 1600 Pennsylvania Ave. NW, Washington, D.C. 20500; - Street: Optional(1600 Pennsylvania Ave.); - Zip: Optional(20500)
Match: 202-456-1414
Match: 2:15PM; - Date: Optional(2014-11-19 22:15:00 +0000)
Match: Southwest Airlines Flight 737
Match: www.southwest.com
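The matches listed for this section cover dates, an address, a phone number, a flight, and a link. A minimal sketch of a detector configured for those types is shown here. `detectedItems(in:)` is a hypothetical helper and the set of checking types is inferred from the output, not taken from the talk; `NSDataDetector`, `matches(in:options:range:)` and the `NSTextCheckingResult` properties are Foundation API.

```swift
import Foundation

/// Hypothetical helper (not from the talk): runs a data detector over `text`
/// and returns each matched substring with a short description.
func detectedItems(in text: String) throws -> [(match: String, detail: String)] {
    let types: NSTextCheckingResult.CheckingType =
        [.date, .address, .phoneNumber, .link, .transitInformation]
    let detector = try NSDataDetector(types: types.rawValue)
    let fullRange = NSRange(location: 0, length: text.utf16.count)

    return detector.matches(in: text, options: [], range: fullRange).map {
        result -> (match: String, detail: String) in
        let match = (text as NSString).substring(with: result.range)
        let detail: String
        switch result.resultType {
        case .date:
            detail = "Date: \(String(describing: result.date))"
        case .address:
            let parts = result.addressComponents
            detail = "Street: \(String(describing: parts?[.street])); "
                   + "Zip: \(String(describing: parts?[.zip]))"
        case .phoneNumber:
            detail = "Phone: \(String(describing: result.phoneNumber))"
        case .link:
            detail = "URL: \(String(describing: result.url))"
        default:
            // Transit information and anything else: report the match only.
            detail = ""
        }
        return (match: match, detail: detail)
    }
}

let items = (try? detectedItems(in: "Book at www.southwest.com")) ?? []
for item in items {
    print("Match: \(item.match); \(item.detail)")
}
```

Relative dates such as "tomorrow" are resolved against the current date and time zone, so the printed `Date` values depend on when and where the code runs.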