FLAVIUS Meeting – WP4
June 7, 2010
Bogdan Giurgiu, William Wong
Agenda
• About Language Weaver
• R&D work
• Customer experience
• LW core mission
• LW architectural contribution
• Deliverables
• Roadmap
• Internal milestone rollup
• Questions & Answers
Language Weaver at a Glance
Founded: 2002. The first commercial science breakthrough in high speed statistical human language translation
Offices: Los Angeles (HQ), Washington DC, Boston, San Francisco, Paris, London, Brussels, Tokyo, & Cluj
Employees: 105
Management
• Mark Tapling, President & CEO
• Daniel Marcu, Founder & CTO
• William Wong, Founder & VP Engineering
• Adrian Gocan, Country Manager – Cluj Office
Markets Served
• Digital Content / Social Media
• Customer Support
• Government Intelligence
Language Weaver delivers human communication solutions through trusted automated language translation
Language Weaver Romania
Established: 2008
Employees: ~50, 5 open positions
Key Areas of Expertise: Development, engineering, telemarketing/marketing, linguists
2 active contracts with the European Union are driven out of this location – FAUST and FLAVIUS
Partnerships: Language Weaver Srl has a partnership with Cambridge University to deliver research solutions
FLAVIUS Contributors
• LW SRL, Romania
– Daniel Marcu, CTO
– Ionel Condor, Engineering Manager
– Bogdan Giurgiu, Project Manager
– Ana Totea, Engineer
– Bogdan Faraga, Engineer
– Daniel Sarbe, Engineer
– Matei Nicolae, Engineer
R&D Projects
• Research Projects
– Improve syntax-based SMT (DARPA funded)
– Small footprint systems for SMT
– Domain customization techniques for SMT
• R&D Projects
– GALE Operational Engines
– Broadcast Monitoring Solutions (speech2text translation)
– FAUST – FP7 EC Project
Currently Available Language Pairs
Western European
Danish to/from English, Dutch to/from English, French to/from English, French to/from Spanish, French to/from German, Italian to/from English, Italian to/from Spanish, German to/from English, German to/from Spanish, Greek to/from English, Norwegian to/from English, Portuguese to/from English, Spanish to/from English, Swedish to/from English
Middle Eastern & African
Arabic to/from English, Arabic to/from French, Arabic to/from Spanish, Dari to/from English, Hebrew to/from English, Hausa to/from English, Pashto to/from English, Persian to/from English, Somali to/from English, Turkish to/from English, Urdu to/from English
Eastern European
Bulgarian to/from English, Czech to/from English, Hungarian to/from English, Polish to/from English, Romanian to/from English, Russian to/from English, Serbian to/from English
Asian
Simplified Chinese to/from English, Traditional Chinese to/from English, Hindi to/from English, Japanese to/from English, Korean to/from English, Thai to/from English, Bengali to/from English
*Latest product release enables LW to translate to and from any language that is available with limited quality
Our Customer Deployments
[Figure: chart labeled QUALITY with levels Baseline, Trained, Post Edit; segments Customer Care, Digital Content, Legal, Marketing, Publications, Governments; axis labels FACT and INFLUENCE]
Our Customer Experience
[Figure: QUALITY chart (Baseline, Trained, Post Edit) repeated from the previous slide]
• Baselines are inadequate for FAUT (fully automated useful translation)
• Lacks utility of translation (usefulness)
• Basic translation (gisting) does not convey publisher needs such as terminology
Customer Experience
[Figure: QUALITY chart (Baseline, Trained, Post Edit) repeated from the previous slide]
• Human post edit for preservation of publisher voice
• Human productivity limited to 2,500 words per day
• High cost prevents time-critical, high-volume publication and user generated content
Customer Experience
[Figure: QUALITY chart (Baseline, Trained, Post Edit) repeated from the previous slide]
• Convergence of utility vs. ROI
• Proven trust in actionable content over baseline engines
• Significant cost reduction from influence oriented communications
• Liberates publisher & user generated content
Core Mission for FLAVIUS
Accelerate the adoption of FAUT on a broad scale by leveraging easy customization of domain verticals for content publishers.
FLAVIUS Content Management System
Language Weaver’s Contribution
Keys to a Successful Partner Integration
1. Ability to integrate with Language Weaver Machine Translation for development and testing
2. Ability to customize baseline engines with dictionaries
3. Ability to customize baseline engines with training of domain/customer specific vertical system
Accomplishments To Date (M3)
• No pre-financing is expected
• Negotiated purchase agreement between LW SRL and Dell Computers
• Purchased 14 Dell servers
• Purchased Cisco network switch
• Entered into colocation agreement between LW SRL and Latisys (hosting location in Irvine, CA)
• All hardware delivered to Latisys
• LW Inc. IT staff installed and deployed to TOD (Translations on Demand) at 0 cost to LW SRL.
• Available languages: English to French, Spanish, Italian, German, Polish, Romanian, Swedish and vice-versa
Current Activities (M3)
[Diagram: REST API, TOD]
• LW sets up integration partner accounts
• Partners start development using the TOD REST API:
– HTTP-based communication protocol
  • Web 2.0 used by Amazon, Twitter, etc.
– Supported text formats: TXT, HTML, TMX, XLIFF
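To make the HTTP-based integration concrete, here is a minimal sketch of how a partner might assemble a translation request. Only the four text formats come from the slide. The base URL, the `/translations` path, the JSON field names and the bearer-token header are all hypothetical placeholders, since the deck does not describe the actual TOD REST API. The request is built but never sent.

```python
import json
import urllib.request

# Formats listed on the slide; everything else in this sketch is hypothetical.
SUPPORTED_FORMATS = {"TXT", "HTML", "TMX", "XLIFF"}


def build_translation_request(base_url, api_key, source_lang, target_lang, text, fmt="TXT"):
    """Build (but do not send) an HTTP POST for a hypothetical translation endpoint."""
    fmt = fmt.upper()
    if fmt not in SUPPORTED_FORMATS:
        raise ValueError(f"unsupported format: {fmt}")
    payload = {
        "source_language": source_lang,  # hypothetical field names
        "target_language": target_lang,
        "format": fmt,
        "text": text,
    }
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/translations",  # hypothetical path
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",  # hypothetical auth scheme
        },
        method="POST",
    )


request = build_translation_request(
    "https://tod.example.invalid/api", "PARTNER_KEY", "eng", "fra", "Hello, world", "txt"
)
print(request.get_method(), request.full_url)
```

Rejecting unsupported formats before any network call keeps partner-side errors cheap to detect during development and testing.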
Upgrade TOD Framework (LW Milestone)
• Internal milestone for LW to migrate partner accounts to the upgraded TOD framework in month 9
• Provides new functionality that is outside of the FLAVIUS project but materially benefits the teams
• Extends current REST API
• Trustscore™ enabled baseline engines
– Utility, not quality, based assessment
– Deployed for TripAdvisor and Dell
• Reporting of basic statistics
[Diagram: REST API, TOD, Reporting, Trustscore™]
Customization via Dictionary (M12)
• REST API enabled dictionary support
• Dictionary upload through API
• A dictionary will be specific to an account, per language pair
• e.g. Dell (account), Eng-Spa (LP), Servers Terminology (dictionary – 1+)
[Diagram: REST API, TOD, Reporting, Trustscore™, Dictionary]
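The scoping rule above (one or more dictionaries per account and language pair) can be sketched as a small in-memory registry. This is an illustration only, not the TOD implementation: the class name, method names and the sample term entry are assumptions, and only the Dell / Eng-Spa / Servers Terminology example comes from the slide.

```python
from collections import defaultdict


class DictionaryRegistry:
    """Keeps terminology dictionaries scoped to (account, language pair)."""

    def __init__(self):
        # (account, language_pair) -> {dictionary name -> {source term -> target term}}
        self._store = defaultdict(dict)

    def upload(self, account, language_pair, name, entries):
        """Register or replace one named dictionary for an account and language pair."""
        self._store[(account, language_pair)][name] = dict(entries)

    def lookup(self, account, language_pair, term):
        """Return the first matching target term in that scope's dictionaries, or None."""
        for entries in self._store.get((account, language_pair), {}).values():
            if term in entries:
                return entries[term]
        return None


registry = DictionaryRegistry()
# The term pair below is an illustrative entry, not taken from the deck.
registry.upload("Dell", "Eng-Spa", "Servers Terminology", {"server": "servidor"})
print(registry.lookup("Dell", "Eng-Spa", "server"))  # servidor
print(registry.lookup("Dell", "Eng-Fra", "server"))  # None
```

Keying on the (account, language pair) tuple means a term uploaded for one language pair never leaks into another, which matches the per-account, per-language-pair scoping stated on the slide.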
Customization via Training (M21)
[Diagram: training workflow]
• Parallel Aligned Text
• Optional: Regression Text
• Optional: Test Text
Evaluation
Data:
• Fix noisy text
• More text
• Text alignment
• Text segmentation
LW Training Compute Cloud
Product Delivery via TOD
[Diagram: REST API, TOD, Reporting, Trustscore™, Dictionary, Training]
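As a rough illustration of the training inputs and the "fix noisy text" step, the sketch below groups the three text sets named on the slide and applies one simple cleanup to the parallel data. The container, the function and the sample sentences are hypothetical, and the cleanup shown (dropping pairs with an empty side and exact duplicates) is just one assumed form of noise removal, not Language Weaver's actual procedure.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

SegmentPairs = List[Tuple[str, str]]  # (source segment, target segment)


@dataclass
class TrainingPackage:
    """Hypothetical container for the inputs named on the slide."""
    parallel: SegmentPairs                      # parallel aligned text (required)
    regression: Optional[SegmentPairs] = None   # optional regression text
    test: Optional[SegmentPairs] = None         # optional test text


def clean_parallel(pairs):
    """Drop pairs with an empty side and exact duplicates, keeping first-seen order."""
    seen = set()
    cleaned = []
    for source, target in pairs:
        pair = (source.strip(), target.strip())
        if not pair[0] or not pair[1] or pair in seen:
            continue
        seen.add(pair)
        cleaned.append(pair)
    return cleaned


package = TrainingPackage(parallel=clean_parallel([
    ("Power on the server.", "Encienda el servidor."),
    ("Power on the server.", "Encienda el servidor."),  # exact duplicate
    ("Orphan segment", "   "),                          # missing target side
]))
print(len(package.parallel))  # 1
```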
Complete Picture
[Diagram: REST API, TOD, Reporting, Trustscore™, Dictionary, Training]
FLAVIUS Language Weaver Roadmap
Internal milestones
Who | What | When
LW  | Translation Engines Up and Running | June 30th
LW  | REST API to be used to access the SMT | June 30th
SFT | Architecture document should include the details of the translation API | TBD
TBD | Project presentation | June 30th
TBD | Project website | June 30th
    | Project logo | June 30th
Questions & Answers
Thank you!
Accelerating the way the world communicates