TRANSCRIPT
14-16 September 2009 | Cancun, Mexico
Communities, Collaboration, Content: Can We Make It Work? Don DePalma, Common Sense Advisory
Horst Liebscher, euroscript Systems (with guest appearances from Adobe Systems)
Host: Hans Fenstermacher, Translations.com
Speakers
• Don DePalma – Common Sense Advisory
• Horst Liebscher – Solution Architect, euroscript Systems
• Dirk Meyer – Product Manager, Adobe Systems
• Jean-François Vanreusel – Director of Product Localization, Adobe Systems
Communities and Crowds
• “Crowdsourcing”
– Wikipedia: “Taking a task traditionally performed by an employee or a contractor to a group of people or community in the form of an open call.”
– Not new, but blossoming through availability of new interaction channels
– Potential for reward (and failure) is significant
• Communities
– Changing the way companies interact with their customers
– Companies need to be more open for bi-directional interaction
– Many forms of user involvement and empowerment (open vs. controlled, growing organically vs. filling frameworks/crowdsourcing in its strict sense)
– Cost savings are not the main driver
– Can increase brand awareness & customer loyalty and help market products
Some Challenges
• Evolving role of communities in product development
• Changes in supporting workflows (e.g., localization)
• Giving up control
• Industry adaptation to new paradigms
Changes in Scope
• New types of content
– Content companies wouldn’t spend time/money to get translated
– User-generated content (blogs, how-tos, community forums), UI (Facebook), videos
• New types of services
– Building and managing the community
– Managing consistency
– Tools development to enable community workflows
LSP Reactions to Community Paradigm
• Criticize quality/results of community and collaborative translations (“stand their ground”)?
Or
• Seize new opportunities?
Changes Coming for LSPs
• Compensation models
• Time tables
• Quality expectations
• Infrastructure and technologies
Collaborative Content
Don DePalma Common Sense Advisory, Inc.
Too much information…
• “Post-privacy society” means lots of UGC
• Library of Congress catalogues every tweet ever made
• Blippy (purchases), Foursquare (location), Skimble (exercise regime), TripIt (travel)
… or fodder for business intelligence?
• Finding: “Number of cheese sandwiches being consumed at any point in time”
• Collective intelligence
• Sentiment analysis
• Actionable info: Milk more cows, bake more bread
Holy Trinity for analyst community: People, process, technology
• People = employees, customers, business partners, extraterrestrials, and other random visitors
• Process = collection, management, and display of content
• Technology = automation for dealing with all of this stuff
Content
• Inputs
– UGC types: text, images, interactive multimedia
– Input mechanisms: e-mail, blogs, Twitter, SMS
• Conversions (operational & platform): file type (PDF, etc.), operating system, database, CMS, syndication, form factors, combinations, collaborative filtering
• Transformations (market): personalization, demographic, commerce, compliance, speech, register, language, dialect
• Variables: volume, venue, variety, volatility, velocity
Runaway cardinality
• Cardinality = Input (x) * Conversion (y) * Transformation (z)
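The cardinality relation on this slide appears to be a product of three factors: inputs (x), conversions (y), and transformations (z). Read that way, a minimal Python sketch shows why the count "runs away": adding one option to each factor multiplies through the others. The function name and all counts below are hypothetical and chosen only for illustration; they do not come from the presentation.

```python
def cardinality(inputs: int, conversions: int, transformations: int) -> int:
    """Number of distinct content variants, assuming every input can pass
    through every conversion and every transformation (x * y * z)."""
    return inputs * conversions * transformations


# Hypothetical counts, used only to show the growth pattern.
baseline = cardinality(inputs=12, conversions=8, transformations=8)

# Add a single option to each factor and recompute.
grown = cardinality(inputs=13, conversions=9, transformations=9)

print(baseline, grown)  # prints "768 1053"
```

One extra option per factor adds 285 variants in this toy case, which is the kind of growth the triage on the next slide is meant to contain.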
Copyright © 2010 by Common Sense Advisory, Inc.
Triage to fix the toxic terabyte problem
• Focus resources on managing high-value information, while spending less time on low-value content
• Improve noisy data (tweets, chat, broadcast, newswire)
• Deal with IP and other ownership issues
• Automatically tag and categorize content from known, “good” applications
• Determine content flows by language
The technology to support it
• Open repository for managing linguistic assets, regardless of input source
• Crowd control to manage translators and volunteers – and technology usage
• Machine translation to chip away at the huge volumes
• “g-Discovery” for harvesting multilingual information flows
Two approaches
• MT for user-generated content, with post-editing flag for most frequently searched-for, requested, or found content
• Set up communities on the same technology stack so that they cross borders, rather than building country-specific communities
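The first approach, machine-translating all user-generated content and flagging only the most frequently requested items for post-editing, could be sketched as follows. This is an illustration under stated assumptions, not the speaker's implementation: the request log, the item IDs, the function name `flag_for_post_editing`, and the cut-off `top_n` are all hypothetical.

```python
from collections import Counter


def flag_for_post_editing(request_log, top_n):
    """Return the IDs of the top_n most frequently requested items.

    request_log is a hypothetical list with one item ID per search hit,
    request, or view. Everything is assumed to receive raw MT; only the
    returned IDs would be queued for human post-editing.
    """
    counts = Counter(request_log)
    return {item_id for item_id, _ in counts.most_common(top_n)}


# Hypothetical log: "faq-1" requested three times, "post-7" twice, "post-9" once.
log = ["faq-1", "post-7", "faq-1", "post-9", "faq-1", "post-7"]
flagged = flag_for_post_editing(log, top_n=2)
print(sorted(flagged))  # prints "['faq-1', 'post-7']"
```

In practice the cut-off would more likely be a request-count threshold or a budget than a fixed `top_n`; the fixed number just keeps the sketch short.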
Collaborative Content
Horst Liebscher :: Gala :: Prague, May 12th 2010
Background :: euroscript
Brussels, Belgium: euroscript Delt Belgium
Bertrange, Luxembourg: euroscript International, euroscript Luxembourg, euroscript Delt Luxembourg
Eindhoven, Netherlands: Delt ICT Services
Riga, Latvia: euroscript Baltija
Krakow, Poland: euroscript Polska
Budapest, Hungary: euroscript Magyarország
Bucharest & Sibiu, Romania: Euroscript-Certitude
Zurich, Geneva, Kreuzlingen, Switzerland: euroscript Switzerland
Berlin, Augsburg, Bonn, Frankfurt/Main, Germany: euroscript Deutschland, euroscript Süddeutschland, docConsult
Montreal, Canada: syselog Canada
New Delhi, India: syselog India
Paris, Bagnols s/Cèze, Brest, Cherbourg, Guyancourt, Le Havre, Lorient, Lyon, Marseille, Montpellier, Nantes, Toulon, Toulouse, France: eurodoc Services, eurodoc Systems
Background :: euroscript MCT
Creating, Translating, Publishing
Consulting
Optimization of Processes
Systems, Integrations
Services
Multilingual Content Technologies
Gala 2010 Prague, Horst Liebscher, slide 19
Background :: Service portfolio MCT
• Source Language Quality Optimization / Authoring Support
• Terminology
• Computer-Aided Translation
• Conversions
• Machine Translation
• Multilingual Enterprise Content Management
ECM, MT, LS
Intro :: Services & Industry
An LSP is no bakery. We do not produce translations, put them in a shop, and wait for somebody to come along and buy them. We “produce” on demand: that’s not industry. It’s services.
Intro :: Services & Industry [2]
[Diagram: Services, Industry]
• In the past: manufacturing. Today: more and more, an industrial context for services.
• More and more translations become part of industrial products.
• LSPs’ services have to be integrated into industrial environments.
• New challenges: workflows, organization, QA, calculations, time to market.
• What does this mean regarding content?
Content :: What is Content?
• In general: text? Sentences/paragraphs? Phrases? n-grams? Words?
• Stored information plus metadata plus structure plus context, etc.
• In other words: documents/modules, TMS content, terminology
Content :: Classes
• Bi-, tri-, n-lingual: TM, terminology, parallel corpora
• Multilingual content
• Language models (SMT)
• AA
Content :: Questions
• How was the content born? How was it created?
• Who is allowed to share what?
• Who has the rights?
– Source :: technical writer/developer?
– Target :: translator?
Collaboration :: Synergy
• Direct reuse of content (AA, TMS)
• Indirect reuse (statistically based systems such as SMT)
• Where is the level of differentiation?
– Global collaboration?
– Branches (industry sectors), technologies?
– Enterprises?
– Language-specific?
• Risk of running into legal problems
Collaboration :: Example TM Systems
• TMS = collaboration tool
• Level of standardization
• Formats
• Origins
Collaboration, Content :: Interests
• Who needs content? What for?
• “Give us your content!”: interest of third parties (AO, Google, LW, TAUS)
• Own interest (terminology in RBMT systems)
• MT, reuse
• “Be happy & translate my content & have fun”: crowdsourcing
• Traditional creation/translation
Collaborative Content :: be careful!
• Global content collaboration
• Content collaboration across branches (industry sectors), technologies
• Clouds, black holes
• Complicated “neutralization” of content
• Minor effects
• Legal problems
• Quality problems
Collaborative Content :: do it!
• Collaborate in a useful way
• Control!
• Collaborate on qualified content only
• Classes of content involved
• QA processes
• Collaborate “internally”
• “Closed environments”
• TMS, MT, terminology
• Terminology as the central aspect
Thank you for your attention.
Horst Liebscher, euroscript Systems