machine translation manuel herranz pangeamt taus barcelona


Upload: manuel-herranz

Post on 20-Jun-2015

Category: Technology

DESCRIPTION

How machine translation is about empowering users, and how users can use DIY SMT technology to build their own statistical machine translation (SMT) solutions.

TRANSCRIPT

Page 1: machine translation manuel herranz PangeaMT TAUS Barcelona

Don't be afraid to provide the tools to those who need them

Manuel Herranz – PangeaMT - Pangeanic

www.pangea.com.mt

- User empowerment
- DIY SMT

Page 2:

PangeaMT – putting open standards to work… well

http://t.co/HDTboxQ

USERS

80% like, 19% not like, 1% done before

User empowerment

Page 3:

USERS

80% like, 19% not like, 1% done before

User empowerment

Meaning of USER becoming closely related to COMMUNITY, POWER, FEEDBACK, ACCOUNTABILITY

Page 4:

Humankind's constant search for

TOOLS

better, more, other things

Page 5:

Humankind's constant search for

TOOLS

better, more, other things

An instrument for making material changes on other objects […]. Tools are the primary means by which human beings control and manipulate their physical environment – Encyclopedia Britannica.

Page 6:

History of the world largely a fight to control

resources, tools [technology]

MT: Another translator out of business...?

Page 7:

History of the world largely a fight to control

resources, tools [technology]

In the 20th-21st century, also a fight to control and manipulate

INFORMATION [data]

ACCESS [data]

Page 8:

History of the world largely a fight to control

INFORMATION [data]

ACCESS [data]

21st century IS THE ERA OF
• SHARING
• OPEN

Page 9:

History of the world largely a fight to control

INFORMATION [data]

ACCESS [data]

21st century IS THE ERA OF
• SHARING
• OPEN

* Communities
* Source (Linux, others)
* Data

Page 10:

History of the world largely a fight to control

INFORMATION [data]

ACCESS [data]

21st century IS THE ERA OF
• SHARING
• OPEN

* Communities
* Source (Linux, others)
* Data

USERS have the power

“We cannot solve the problem using the same tools and the way of thinking that created it” A. Einstein

Page 11:

MT at Pangeanic, from Trial to Production

2007 and before
• Rule-based (RB) tests with commercial software
• Insufficiently good output
• Only internal production
• EU Post-Editing Award

2007/08
• V1: Small data sets (2-5M words), automotive & electronics (ES), then Fr/It/De in other fields

2009/10
• Division born
• 00's of engine trials and language combinations
• Open-Source to commercial
• TMX / XLIFF workflows

2011/12
• DIY SMT
• Empower Users
• Glossary
• Automated re-training
• Transfer architecture and know-how to users
• Compatibility with commercial formats (ttx, sdlxliff, itd)

Page 12:

MT at Pangeanic, from Trial to Production

• Users provide information to improve [they are the source & target]
• Potential MT users wanted to be another Pangeanic = build their own systems
  - Some can, some can’t
  - Others want turnkey developments
  - Others prefer SaaS
  - Most want to unwrap the black box but without walking the road

Page 13:

[Chart: timeline 2009-2018; axis labels "User empowerment" and "000's of customized MT systems"; marker "YEAR 2016"]

Predictions

PangeaMT

Tech. not the realm of a few providers


Page 15:

[Chart: timeline with the years 2009, 2010, 2017 and 2018 labeled]

PangeaMT

Page 16:

[Chart: timeline 2009-2018; axis labels "MT acceptance", "User empowerment" and "000's of customized MT systems"; marker "YEAR 2016"]

Until 2011
• MT acceptance growth.
• Translator engagement challenge.
• Need for data has been addressed – still more work to be done.
• Users and practitioners can now build their own systems.

In 5 years... after 2016

Predictions
• Combinations??
• Supra-engines??
• World-knowledge??
…suggestions…???

PangeaMT

Tech. not the realm of a few providers

Page 17:

Summary

• USER EMPOWERMENT: give people the tools so they can grow their own solutions
• PangeaMT provides infrastructure
• Cloud Training: so users concentrate on production, not on technical bits & updates
• Pressure for data availability coming from users will benefit standardization efforts

PangeaMT

Page 18:

Thank you!

MANY QUESTIONS PLEASE!!

[email protected]