Center for Research in Urdu Language Processing
PAN Localization Project
A Regional Initiative to Develop Local Language Computing Capacity in Asia
SANA GUL
Center for Research in Urdu Language Processing
Pakistan, 2005

Presentation Highlights
► Introduction to Center for Research in Urdu Language Processing
► Introduction to PAN Localization Project
► Scope of Localization & Introduction to this Training
Center for Research in Urdu Language Processing

CRULP Objectives
► To conduct linguistic research for Urdu and regional languages
► To participate in standardization efforts in Urdu and regional languages
► To evolve computational models of Urdu and regional languages
► To promote content development in Urdu and regional languages

CRULP Research
► Linguistics
► Script Processing
► Language Processing
► Speech Processing

CRULP Resources
► Team
    4 Full-time Faculty Members
    Adjunct Faculty
    12 Graduate Students
    45 Undergraduate Students
    25 Full-time Staff

CRULP Coursework
► Phonetics and Phonology
► Morphology and Syntax
► Digital Signal Processing
► Random Variables and Stochastic Processes
► Speech Processing
► Computational Linguistics
► Image Processing
► Calligraphy and Font Development

CRULP Research - Linguistics
► Areas
    Acoustic Phonetics
    Phonology
    Morphology
    Syntax

CRULP Research - Script
► Font Development: Nafees Font Family
    Nafees Nasta’leeq, Nafees Naskh, Nafees Pakistani Naskh (Urdu, Punjabi, Pashto, Sindhi, Balochi, Siraiki)
    Freely downloadable from www.crulp.org
    Supported mainly by the UNDP/IDRC/APNIC Small Grants Program and partially by Microsoft, Pakistan
► Optical Character Recognition
    Naskh (segmentation based)
    Nasta’leeq (ligature based)

[Figure: samples of calligraphic styles: Nasta’leeq, Kufi, Sulus, Diwani, Riqa, Naskh]

CRULP Research - Language
► Corpus Development
► Computational Linguistic Applications
    Spell Checker
    Grammar Checker
    Lexicon
    English to Urdu Machine Translation

CRULP Research - Speech
► Text to Speech Synthesis
► Automatic Speech Recognition

Projects
► Nafees Font Family
► Urdu Localization Project
► Microsoft Spell Checker
► PAN Localization
PAN Localization Project
www.panl10n.net

PAN Localization Project
► Partnership
    PAN program of IDRC
    CRULP at NUCES
► Objectives
    Develop localization technology for Asian languages
    Develop human resources to develop and use localized computing
    Research the policy framework needed to develop local language computing
► Timeline
    January 2004 to December 2006

PAN L10n Project Collaborations
1. BRAC University, Bangladesh
2. Department of IT, Ministry of Information and Communications, Bhutan
3. Khmer Computerization Committee, National ICT Development Agency, Cambodia
4. Science Technology and Environment Agency, Laos
5. Madan Puraskar Pustakalaya & Tribhuvan University, Nepal
6. University of Colombo School of Computing, Sri Lanka
7. …

Salient PAN L10n Project Outputs
► Localization Technology
► Asian Localization Peer Support Network
► Bibliography of Asian Localization
► Who’s Who of Asian Localization
► Multi-lingual Website: www.PANL10n.net
► Asian Localization Handbook

Country-wise Project Outputs

Scope of Localization

Localization
“enabling computing experience according to linguistic culture of the user”

Localization Requirements
► Standards
► Basic Applications
► Intermediate Applications
► Advanced Applications
► Soft Issues

Standards
► Character Set
► Keyboard/Keypad Layout
► Locale
► Collation Sequence
► Terminology Translation
► Fonts (?)
► …
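
A collation sequence needs its own standard because sorting by raw character codes does not follow the alphabet. The Python sketch below sorts by an explicit sequence instead. The eight-letter `URDU_ORDER` fragment and the `collation_key` helper are illustrative assumptions, not a published collation standard.

```python
# Illustrative fragment of an Urdu alphabetical order (an assumption for this
# sketch, not an official standard): alef, be, pe, te, tte, se, jeem, che.
URDU_ORDER = "\u0627\u0628\u067e\u062a\u0679\u062b\u062c\u0686"
RANK = {ch: i for i, ch in enumerate(URDU_ORDER)}


def collation_key(word: str) -> tuple:
    """Map each character to its rank; unknown characters sort after known ones."""
    return tuple(RANK.get(ch, len(URDU_ORDER) + ord(ch)) for ch in word)


letters = ["\u062b", "\u0679", "\u067e", "\u0628"]  # se, tte, pe, be

by_code_point = sorted(letters)                    # be, se, tte, pe
by_alphabet = sorted(letters, key=collation_key)   # be, pe, tte, se
```

Code-point order places pe and tte after se because they were encoded later, while the explicit sequence restores their alphabetical positions.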

Basic Applications
► Character set encoding(s)
► Utility for converting among various encodings
► Keyboard/Keypad drivers
► Collation algorithm
► Local language interface
► Fonts for various devices
► …
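
As a rough illustration of an encoding conversion utility, the Python sketch below decodes bytes from one encoding and re-encodes them in another. The `convert_encoding` helper is hypothetical, and the choice of Windows-1256 (a legacy 8-bit Arabic-script code page) as the source encoding is an assumption made only for the example.

```python
def convert_encoding(data: bytes, source: str, target: str) -> bytes:
    """Decode bytes from the source encoding and re-encode them in the target."""
    text = data.decode(source)   # raises UnicodeDecodeError on invalid input
    return text.encode(target)   # raises UnicodeEncodeError if unrepresentable


word = "\u0627\u0631\u062f\u0648"   # the word "Urdu" in Arabic script
legacy = word.encode("cp1256")      # one byte per letter in the legacy code page
utf8 = convert_encoding(legacy, "cp1256", "utf-8")
```

A real utility would also need mapping tables for vendor-specific encodings that have no built-in codec.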

Intermediate Applications
► Find/Replace utility
► Natural language processor/Bidirectional processor
► Lexicon
► Spell checker
► …
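
A bidirectional processor has to know the direction of each character before it can order mixed right-to-left and left-to-right text. The Python sketch below shows only that first lookup step, using the standard `unicodedata` module. The `bidi_classes` helper and the sample string are assumptions for illustration, and the full reordering rules of the Unicode Bidirectional Algorithm are not implemented here.

```python
import unicodedata


def bidi_classes(text: str) -> list:
    """Return (character, bidi class) pairs from the Unicode Character Database."""
    return [(ch, unicodedata.bidirectional(ch)) for ch in text]


# The word "Urdu" in Arabic script, then "PC" and the digit 1.
sample = "\u0627\u0631\u062f\u0648 PC 1"
classes = [cls for _, cls in bidi_classes(sample)]
```

Here `AL` marks Arabic letters, `L` left-to-right letters, `EN` European digits, and `WS` whitespace.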

Advanced Applications
► Grammar checker
► Automatic speech recognition
► Text to speech system
► Automatic machine translation
► Optical character recognition
► Handwriting recognition
► Speech to speech translation
► …

Introduction to Training
► Objectives
    Overview the scope of localization
    Study in detail the basic issues regarding localization standards and development
    Develop an Asian peer support network

Summary of Topics
► Encoding Standards
► Font Development
► Localization on Microsoft Platform
► Localization on Linux Platform
► Defining Normalization and Collation
► Overview of Advanced Applications
► Overview of Software Engineering
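
To make the normalization topic concrete, the Python sketch below shows one letter that Unicode can represent in two ways and how normalization reconciles them. The example letter (alef with madda) is chosen here as an assumption for illustration.

```python
import unicodedata

composed = "\u0622"          # ARABIC LETTER ALEF WITH MADDA ABOVE (one code point)
decomposed = "\u0627\u0653"  # ALEF followed by combining MADDAH ABOVE

# The two strings look identical on screen but compare as different.
same_before = composed == decomposed

# Normalizing both to one form makes comparison, search, and sorting reliable.
nfc = unicodedata.normalize("NFC", decomposed)  # composes to the single code point
nfd = unicodedata.normalize("NFD", composed)    # decomposes to base + combining mark
```

Without this step, find/replace and spell checking can miss words that differ only in how they were typed.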

Thank you
SANA GUL
[email protected]
Regional Research Officer, PAN Localization Project
(www.panl10n.net)