Sindhi Computing in Human Language Technology
PRESENTED BY AMAR FAYAZ BURIRO
Where are we?
SINDHI COMPUTING IN
HUMAN LANGUAGE TECHNOLOGY
HUMAN LANGUAGE TECHNOLOGY
AMAR FAYAZ BURIRO (Language Engineer)
HLT Definition
Alternative Names
• Language Technology
• Natural Language Technology
Human Language Technology is the study of
computational systems that process natural
languages in all forms, i.e. text, image or voice.
THREE LAYERS OF HLT
01. WORD PROCESSING
Input & output of alphabet & grammar (PoS),
locally or on remote servers; font development,
translation, transliteration, Web embedding,
layering of core coding, etc.
02. IMAGE PROCESSING
Optical Character Recognition (OCR) of core
& traditional typographies, boxing (XY
determination) of traditional typography (fonts)
03. VOICE PROCESSING
Text to Speech & Speech to Text transformation,
training of dialects, age, gender, moods, atmosphere,
etc. in HLT.
Voice Process
Image Process
Word Process
And finally Artificial Intelligence
Artificial Intelligence
Fourth destination & gateway to infinity
In this arena, languages in HLT can be incorporated
into all life-related engineering, technologies,
medical science, space science, agriculture,
motors, robots, etc.
Mechanism of Human Language Technology
Word Processing
o Alphabet & Coding
o Font development
o Content Migration
o Web embedding
o Standardization of Fontgraphy
o Grammar
o Parts of Speech mechanism
o Translation
o Transliteration
Image Processing
o OCR
o Post-result correction
o Alphabet recognition
o Training of alphabetical letters
o Boxing of Alphabet
o XY standardization of alphabet
Voice Processing
o Text to speech layering
o Speech to text layering
o Dialects recognition
o Training on voices of different ages and genders
o Standby algorithm
o Educational help liners
o Reshaping languages to Artificial Intelligence
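The "Boxing of Alphabet" and "XY standardization" items under image processing can be illustrated with a small sketch. It assumes a scanned glyph has already been turned into a tiny black-and-white bitmap and finds the XY box that encloses the ink. The bitmap, the `#` ink marker and the function name `bounding_box` are hypothetical, chosen only for illustration; a real OCR pipeline would work on actual image data.

```python
from typing import List, Optional, Tuple

Box = Tuple[int, int, int, int]  # (x_min, y_min, x_max, y_max)


def bounding_box(bitmap: List[str], ink: str = "#") -> Optional[Box]:
    """Return the smallest XY box containing every ink pixel, or None if blank."""
    # Collect the column index of every ink pixel and the row index of every inked row.
    xs = [x for row in bitmap for x, px in enumerate(row) if px == ink]
    ys = [y for y, row in enumerate(bitmap) if ink in row]
    if not xs:
        return None
    return min(xs), min(ys), max(xs), max(ys)


# Hypothetical 7x5 scan of a single glyph: '#' is ink, '.' is paper.
GLYPH = [
    ".......",
    "..##...",
    ".#..#..",
    ".####..",
    ".......",
]

print(bounding_box(GLYPH))  # box coordinates as (x_min, y_min, x_max, y_max)
```

Once every glyph has a box like this, the boxes can be compared or rescaled to a common size before recognition, which is one plausible reading of "XY standardization".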
Where Sindhi Language stands
We have worked on:
1. Adding the language locally and on remote servers.
2. Adding the Sindhi alphabet in core query.
3. Font development.
4. Web embedding.
5. Content compilation.
6. ????????????????
Word: Unicode allotment, alphabet, font embedding, content migration
Image: No advancement, but development is in progress
Voice: No advancement
Just 10% of the work has been done!
Source: Center of Language Engineering, Lahore
Sindh in Information Technology @ Glance
Think about the future of Sindh and its
synchronization with the global market!
According to local market and IP as well as IPS
statistics.
And this is the technology of the future, which
keeps rising higher and higher.
This literacy ratio is based on census and collected
data provided by NADRA.
English: 10% of people understand it
Literate persons: 58%
Computers: 11% of people have one
Mobile phone access: 70%
Human Language Technology is necessary to bridge the gap.
Minus & Plus FACTORS
Lack of institute
Because there is no autonomous institute where a
large budget could be used for the advancement of
language computing, the HLT of the Sindhi language
is an orphan.
Lack of Language Engineers
The unavailability of language engineers brings
irrational emotionalism and a non-scientific
approach. Similar-sounding letters in the Sindhi
alphabet have closed the doors to further advancement.
UNICODE, Nasakh, Dedicated developers
Freelance Sindhi developers have been programming
content shifting; UNICODE has included the Sindhi
language, so it can be encoded in UTF-8; and the
typography of Sindhi is Nasakh, which makes it
easy to access.
Factors where Sindhi Language is in driving mode or in retrieving path.
WHAT WE NEED?
Autonomous Institute
Where experts with a sociological background,
linguists and computational engineers
may be appointed and a handsome
budget may be allotted.
Research Approach
Research on all developments and
advances in programming must be
incorporated into Sindhi HLT.
Visionary Developers
Binary coders, developers of web, apps and
portals, and cloud computing experts
must be identified and hired as soon as possible.
Discuss Dialogue
Dialogue with linguists, publishers,
political leaders, etc. is needed regarding
similar-sounding alphabetical characters
and forthcoming challenges.
SOME MYTHS & REALITIES
It is stated everywhere that the Sindhi
language has not been completely
added by the UNICODE Consortium, and
that this is why some characters
do not work properly.
The truth is that all Arabic-script
languages, including Sindhi, have been
added to Unicode (encoded as UTF-8), and
all characters, glyphs and punctuation
marks now have their own proper codes.
The actual problem relates to font
developers and to the core alphabet
of the Sindhi language.
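The claim that Sindhi letters have their own codes can be checked with a short sketch using Python's standard `unicodedata` module. The selection of letters below is an illustrative sample, not the full Sindhi alphabet, and the helper name `describe` is hypothetical.

```python
import unicodedata

# A sample of letters used in the Sindhi Arabic-script alphabet, written as
# escapes so the example does not depend on the editor's font or text direction.
SINDHI_LETTERS = [
    "\u067B", "\u0680", "\u0684", "\u0683",
    "\u068F", "\u06B3", "\u06B1", "\u06BB",
]


def describe(ch: str) -> dict:
    """Return the code point, official Unicode name and UTF-8 bytes of one character."""
    return {
        "code_point": f"U+{ord(ch):04X}",
        "name": unicodedata.name(ch),
        "utf8": " ".join(f"{b:02x}" for b in ch.encode("utf-8")),
    }


for letter in SINDHI_LETTERS:
    info = describe(letter)
    print(info["code_point"], info["name"], "->", info["utf8"])
```

Each letter reports a code point in the Arabic block (U+0600 to U+06FF) and a two-byte UTF-8 encoding, so the encoding side is in place regardless of whether a given font can draw the letter.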
ABOUT UNICODE SINDHI COMPUTING LINGUISTICS / LAN-ENG
Some hold that font development alone
is Sindhi computing, which isn't
correct.
Fonts are simply dresses over a
skeleton or body. Fonts relate
to the design of written
material, and design is
not core development
or programming. It is just
overwriting on demo content.
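One way to read the "dress over a skeleton" image is that the stored text is a sequence of character codes, while the visible shape of each letter is supplied at display time. The sketch below is an interpretation, not something the slides spell out: it uses the ordinary Arabic letter beh and its four legacy "presentation form" code points, and shows that Unicode's NFKC normalization maps every shape back to the same underlying letter. The helper name `to_base_letter` is hypothetical.

```python
import unicodedata

BEH = "\u0628"  # ARABIC LETTER BEH: the character that is actually stored

# Legacy presentation-form code points, each freezing one visual shape of beh.
SHAPES = {
    "isolated": "\uFE8F",
    "final": "\uFE90",
    "initial": "\uFE91",
    "medial": "\uFE92",
}


def to_base_letter(text: str) -> str:
    """Fold shape-specific compatibility characters back to their base letters."""
    return unicodedata.normalize("NFKC", text)


for shape, ch in SHAPES.items():
    base = to_base_letter(ch)
    print(shape, f"U+{ord(ch):04X}", "->", f"U+{ord(base):04X}")
```

All four shapes fold back to U+0628, which is the sense in which the letter is the body and the shape is only its dress.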
Some mobile phone models
don't support the Sindhi language;
therefore, it is said, the Sindhi
language is not universal.
The reality is that the default font of
such mobile systems is not yet eligible as
per INTEL standards and is in a beta
phase.
We do believe that a linguist is
also a language engineer.
The reality is that there is a huge
difference between a linguist and a
language engineer. A
linguist focuses on the meaning
of words and the proper usage
of the language, but an
engineer places emphasis on the
characters and their migration
to universalized digital
formats.
A language engineer processes the
language in multi-mode
programming of computing
technology.
AND NOW TIME OF RETHINKING
REMAIN or CHANGE
THANK YOU DEARS
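اَلا! اِئين مَ ٿِئي جو ڪِتابن ۾ پَڙهَجي،
ته هُئي سِنڌ ۽ سِنڌ وارن جي ٻولي!
(شيام نارائڻ)
"O God! May it never happen that one reads in books
that there once was a Sindh, and a language of the people of Sindh!"
(Narayan Shyam)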
Amar Fayaz Buriro
IT Specialist & Language Engineer