Internationalization and Translatability for Beginners AGIS 09, University of Limerick, 21-September-2009 Ultan Ó Broin


Page 1: Internationalization and Translatability for Beginners

Internationalization and Translatability for Beginners
AGIS 09, University of Limerick, 21-September-2009

Ultan Ó Broin


© Ultan Ó Broin September 2009

About

• Ultan Ó Broin
• Microsoft and Oracle localization and internationalization
• Oracle Applications User Experience
• Localization World, Multilingual Web Site, Internationalization Roundtable Advisory Boards
• Editorial Board Multilingual Magazine
• Blogos
• Social media use by disabled research (TCD)
• Caveats about this presentation
  • Personal perspective and opinion
  • Not those of Oracle Corporation
  • Don’t rush out and buy/sell ORCL stock as a result
• Copyright and usage
  • Share-alike non-attribution non-commercial please
  • Screenshots and images remain the copyright of respective owners
  • Products and services may be trademarks of their respective owners
  • Reproduction for promotional work for non-profit use is fine, but play nice and say where you got the information and from whom (@ultan)


Agenda: Internationalization and Translatability

• Definitions
• Organization and process
• Internationalization issues
  • Character Processing
  • International Variables
  • Translatability
• Tools and environments
• What makes sense for the little guy?
• Resources

© Oracle Corporation 2009


Presentation Objectives

• Software and documentation-centric
• Learn about internationalization (I18n) process and responsibilities
• Understand core I18n issues
• Identify key I18n considerations for content development
• Consider what makes sense for you
• Obtain resources for further investigation
• Global user experience


Internationalization Definitions

• “Internationalization is the process of designing a product so that it can be easily localized without the need for redesign… it is the process of designing and implementing a product which is as culturally and technically “neutral” as possible, and which can therefore easily be localized for a specific culture or cultures.”

Localization Industry Standards Association (http://www.lisa.org, accessed 15 April 2007)


Internationalization Definitions

• “Internationalization is the process of re-engineering any information product so that it can be easily localized for export to any country in the world. An internationalized information product consists of two components: core information and international variables.”

Nancy Hoft, International Technical Communication, 1995, p. 19

• Core information: Same code used by the same product in different environments

• International variables: political, economic, social, religious, educational, linguistic, technological – the localizable, cultural, user experience elements


Internationalization Definitions

• “The process of making information technology flexible enough to be used in different cultural and linguistic environments without changing source code.”

• “Allows choice of language and locale (collection of language and cultural preferences).”

• “Internationalized and localized products provide equivalent functionality to users in their own language while observing their cultural conventions.”

Oracle University, Introduction to Product Globalization, 2001


Internationalization Definitions

• “Internationalization: The process of developing a program core whose feature design and code design don’t make assumptions based on a single language or locale and whose source code base simplifies the creation of different language editions of a program.”

Nadine Kano, Developing International Software, 1995, p. 4


Forget This

English Is Just Another Language

• Why assume it’s always written in English anyway?


Questions

• Can you translate any software?

• Should you?


Internationalization: Why?

• The user experience
• Communicate efficiently globally using language, customs, symbols, conventions
• Facilitates cultural adaptation – localization (L10n), customization
• Eliminates cultural bias
• Minimizes management
• Correct market functionality
• Allows for process efficiencies in development and localization - scalability


I18n Costs

• Fix once at source during development
• Very costly to fix later

© LingoPort / Multilingual Magazine 2009


Internationalization: Why?

• Rationale
• You must localize
• Globalization
• Competition
• Market share
• Revenue
• Internet
• SimShip
• Legal requirements
• User experience, engagement, communication

© Salesforce.com 2007


I18n Myths Exploded

• “They all speak English”
• “Once it’s in the reader’s language, it’s OK”
• “Only the stuff they see needs attention”
• “Won’t be translated anyway, so no need”
• “Costs too much”
• “Fix it later - if we have to”
• “Wrote it in Japanese, so it’s fine”
• “It’s open source, whatever”
• “It’s Java”
• “We’re giving it away for free”
• “The user can translate it”


Example

© Oracle Corporation 2009


Questions

• Global information sharing
• Can you afford not to internationalize?
• To sell? To communicate? To share information?

• Language matters, but it’s NOT enough.


Internationalization Process

• Internationalization is a development responsibility
• Designed
• Core practice
• Modularization
• Integrate process
• Build, test environments
• No linguistic expertise required


Internationalization Standards

• Some standards/guidelines for free
  • Unicode, XML, Java, HTML and so on
  • Ken Lunde's CJKV Information Processing is a great overview
• Not enough
  • Application of standards
  • Educate developers
  • Provide tools
  • Enforce standards, audit
  • Set priorities
  • Determine warnings versus failures
  • Common sense


Internationalization Standards Examples

© Oracle Corporation 2009


Questions

• Which is better: forced or voluntary i18n?
• How might you sell i18n to developers?
  • No lectures, slogans, linguistics


Internationalization Issues

• Character processing
• International variables
• Translatability


Character Processing

Character Sets

A character set is a collection of letters, numbers, punctuation marks and signs which are needed to support the creation of text in a language or languages

[Figure: a set of characters (letters, numbers, punctuation marks, and signs) grouped into Character Set A and Character Set B]

© Oracle Corporation 2009


Character Processing

Character Encoding

Encoding is the process of mapping a character to a bit sequence used to represent the data on the computer

[Figure: the characters H e l l o ! mapped to the hexadecimal byte values 48 65 6c 6c 6f 21]

© Oracle Corporation 2009
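The mapping on this slide can be reproduced in a few lines. The following is a minimal sketch in Python (the language choice and the helper name `to_hex` are assumptions made for illustration, not part of the original presentation); it encodes a string and shows the resulting bytes in hexadecimal.

```python
def to_hex(text: str, encoding: str = "ascii") -> str:
    """Return the encoded bytes of text as space-separated hex pairs."""
    return text.encode(encoding).hex(" ")


print(to_hex("Hello!"))  # 48 65 6c 6c 6f 21
```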


Character Processing

Single-Byte Character Sets

The English language can be represented by 7 bits (ASCII)

To accommodate Western European languages, an extra bit is required (256 characters)

Category      Characters                                                         Count
Alphabet      A to Z, a to z                                                     52
Numbers       0 to 9                                                             10
Punctuation   ! " # $ % & ' ( ) * + , - . / : ; < = > ? @ [ \ ] ^ _ ` { | } ~    33 (including the space character)

© Oracle Corporation 2009
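As a rough illustration of the 7-bit versus 8-bit distinction, the sketch below checks whether a string can be represented in a given single-byte character set. The helper name `encodable` is hypothetical; the codec names `ascii` and `latin-1` are standard Python codecs.

```python
def encodable(text: str, encoding: str) -> bool:
    """Report whether every character of text exists in the given encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


word = "caf\u00e9"  # "café", with a precomposed e-acute
print(encodable("cafe", "ascii"))    # True
print(encodable(word, "ascii"))      # False: e-acute is outside 7-bit ASCII
print(encodable(word, "latin-1"))    # True: e-acute fits in one 8-bit byte
print("\u00e9".encode("latin-1").hex())  # e9
```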


Character Processing

Multi-Byte Characters

To represent more than 256 characters, more than one byte is used

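Sample: Traditional Chinese Big5

[Figure: a sample string encoded in Big5, mixing multi-byte characters with single-byte characters such as ! (21); hex values shown on the slide: fe c9 a5 40 ac 21 ab a5 a2 c5 6f]

© Oracle Corporation 2009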
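A small sketch of the same idea, assuming Python's built-in `big5` codec: it reports how many bytes each character of a string occupies. The sample string and the helper name `byte_widths` are illustrative choices, not taken from the slide.

```python
def byte_widths(text: str, encoding: str) -> list:
    """Return the number of bytes each character occupies in the encoding."""
    return [len(ch.encode(encoding)) for ch in text]


sample = "\u4e2d\u6587!"  # two Chinese characters followed by an ASCII "!"
print(byte_widths(sample, "big5"))  # [2, 2, 1]
print(len(sample), "characters,", len(sample.encode("big5")), "bytes")
```

The character count and the byte count differ, which is the root of the column-width problems discussed later in the deck.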


Character Processing

• “Native” Character Sets
  • ISO 8859-1, Windows-1250, CP852, Shift JIS, Big5, EUC-KR ...
• Different code support
• Conflicts
• Gaps
• Multiple-tier / differences

© Oracle Corporation 2009


Character Processing

Unicode
• Multilingual character encoding standard
• Consistent way of encoding multilingual text data internationally
• Foundation for global software

[Figure: Arabic, Chinese, French, English, Japanese, and German all covered by Unicode]

© Oracle Corporation 2009


Character Processing

Unicode encoding UTF-16 and UTF-8
• UTF-16 uses two bytes for most characters and four bytes (a surrogate pair) for characters outside the Basic Multilingual Plane
• UTF-8 is a variable-length encoding scheme

Character              US-ASCII CharSet   Latin1 CharSet   UTF-8 Encoding   UTF-16 Encoding
A (English alphabet)   41                 41               41               00 41
Ç (French alphabet)    N/A                c7               c3 87            00 c7
あ (Japanese)          N/A                N/A              e3 81 82         30 42

© Oracle Corporation 2009
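The byte values on this slide can be checked directly. The sketch below assumes Python and its standard `utf-8` and `utf-16-be` codecs (big-endian is chosen so that no byte-order mark is emitted); the helper name `hex_bytes` is hypothetical.

```python
def hex_bytes(ch: str, encoding: str) -> str:
    """Return the encoded form of ch as space-separated hex pairs."""
    return ch.encode(encoding).hex(" ")


# "A", C-cedilla (U+00C7), and Hiragana A (U+3042)
for ch in ("A", "\u00c7", "\u3042"):
    print(ch, "| UTF-8:", hex_bytes(ch, "utf-8"),
          "| UTF-16:", hex_bytes(ch, "utf-16-be"))

# A character outside the Basic Multilingual Plane needs four bytes in UTF-16.
print(len("\U0001F600".encode("utf-16-be")))  # 4
```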


Character Processing Impact

Impact of Multi-Byte Character Sets
• Database column widths specified in bytes but HTML form input fields specified in characters
• Committing mismatched data = error: Inserted value too large for the column

[Figure: a table with columns ID NUMBER(3) and NAME VARCHAR2(5) (table column size) beside an input text field containing "aaaaa" (input text field size)]

© Oracle Corporation 2009
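The mismatch between characters and bytes can be caught before data is committed. The following is a minimal sketch, assuming a column whose width is declared in bytes and data stored as UTF-8; the function name `fits_column` is hypothetical.

```python
def fits_column(value: str, max_bytes: int, encoding: str = "utf-8") -> bool:
    """Check a value against a column width that is measured in bytes."""
    return len(value.encode(encoding)) <= max_bytes


print(fits_column("aaaaa", 5))        # True: 5 characters, 5 bytes
print(fits_column("\u00e9" * 5, 5))   # False: 5 characters, 10 bytes
print(fits_column("\u3042" * 5, 5))   # False: 5 characters, 15 bytes
print(fits_column("\u3042" * 5, 15))  # True
```

All four inputs are five characters long, so a form field limited to five characters would accept every one of them.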


Character Processing Impact

• Code conversion takes place during transfer of data between tiers that use different encoding methods

• Code conversion required if different character sets exist in a single system

• Code conversion number one issue?

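[Figure: a client using ISO8859-6 and a server using UTF8; data is converted from ISO8859-6 to UTF8 when the client sends to the server, and from UTF8 to ISO8859-6 when the client receives from the server]

© Oracle Corporation 2009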
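A sketch of the round trip between tiers, assuming Python's standard `iso8859_6` and `utf-8` codecs. The helper name `convert` and the sample text are illustrative assumptions.

```python
def convert(data: bytes, source: str, target: str) -> bytes:
    """Re-encode bytes from the source character set to the target one."""
    return data.decode(source).encode(target)


arabic = "\u0645\u0631\u062d\u0628\u0627"  # Arabic word, five letters
client_bytes = arabic.encode("iso8859_6")                   # client tier
server_bytes = convert(client_bytes, "iso8859_6", "utf-8")  # sent to server
returned = convert(server_bytes, "utf-8", "iso8859_6")      # received back

print(len(client_bytes), len(server_bytes))  # 5 10
print(returned == client_bytes)              # True

# Conversion is lossy when the target set lacks a character.
print("\u00e9".encode("iso8859_6", errors="replace"))  # b'?'
```

The last line shows why conversion deserves attention: a character missing from the target character set is silently replaced unless the conversion is configured to fail.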


Character Processing Impact

© Oracle Corporation 2009


Character Processing Impact

Use Unicode support for:
• Storing
• Inserting
• Editing
• Sorting
• Deleting
• Searching
• Wrapping
• Shaping
• Rendering
• ….

[Figure: text field length compared with table column size; the column grows from VARCHAR2(5) to VARCHAR2(15)]

Solution: enlarge the UTF-8 database column to three times the original size


What Character Set to Use?

• Ah, Unicode
• But…
• Which one?
• And what about legacy content?
• Moving native to Unicode?
• Remember data conversion!


International Variables - Locale

• Numbers
• Dates
• Currencies
• Time and time zone
• Address format
• Name format
• Telephone number format
• Statutory compliance
• Language
• More…


Dates

• Different date formats for different locales

© Oracle Corporation 2009
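As a hedged illustration of locale-dependent dates, the sketch below formats the same date with a small, hand-written pattern table. The table `DATE_PATTERNS` and the function `format_date` are hypothetical; a production system would normally take its patterns from the operating system or a library such as ICU rather than hard-coding them.

```python
from datetime import date

# Illustrative short-date patterns, keyed by locale code (an assumption).
DATE_PATTERNS = {
    "en_US": "%m/%d/%Y",
    "en_GB": "%d/%m/%Y",
    "de_DE": "%d.%m.%Y",
    "ja_JP": "%Y/%m/%d",
}


def format_date(value: date, locale_code: str) -> str:
    """Format a date with the short pattern registered for the locale."""
    return value.strftime(DATE_PATTERNS[locale_code])


conference_day = date(2009, 9, 21)
for code in DATE_PATTERNS:
    print(code, format_date(conference_day, code))
```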


Numbers

• Different number formats

Country    Number format
US         1,234,567.89
Finland    1.234.567,89
Korea      1’234’567,89
Germany    1.234.567,89

© Oracle Corporation 2009
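The sketch below shows one way to keep separators out of the formatting logic, using the US and German formats shown on this slide. The function name `format_number` and its parameters are assumptions for illustration; real applications would usually delegate this to locale libraries.

```python
def format_number(value: float, group_sep: str, decimal_sep: str) -> str:
    """Format value with two decimals and the given separators."""
    text = f"{value:,.2f}"  # e.g. "1,234,567.89"
    # Swap via a placeholder so the two replacements cannot collide.
    return (text.replace(",", "\0")
                .replace(".", decimal_sep)
                .replace("\0", group_sep))


print(format_number(1234567.89, ",", "."))  # 1,234,567.89
print(format_number(1234567.89, ".", ","))  # 1.234.567,89
```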


Currency

• Different currencies worldwide

© Oracle Corporation 2009


Others
• Salutation, Telephone, Address formats
• Sort orders (linguistic v logical,…)
• Units of Measure (Imperial, Metric)
• Calendar (Gregorian, Japanese, Islamic…)
• Business, legal rules (HRM, Financials, Manufacturing)
  • VAT
  • Tax
  • GAAP
  • Statutory compliance
  • SSN, PPS, IDs
  • Sarbanes-Oxley
  • Data protection, Privacy
  • …


And of Course: Language

Source: Wikipedia.org entry on Internationalization and localization, accessed 15 April 2007


Locale Variables: Solutions

• Don’t hard-code
• Store independently (MLS)
• Rely on O/S language and country/region settings, ICU, NLS class libraries to display
• Auto-detect, then allow user to select

© Oracle Corporation 2009
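"Auto-detect, then allow user to select" can be sketched for a web application as follows. This is an illustrative outline only: the function names, the lowercase set of supported languages, and the simplified handling of the HTTP Accept-Language header are all assumptions.

```python
def parse_accept_language(header: str) -> list:
    """Return language tags from an Accept-Language header, best first."""
    weighted = []
    for position, item in enumerate(header.split(",")):
        tag, _, params = item.strip().partition(";")
        tag = tag.strip().lower()
        quality = 1.0
        params = params.strip()
        if params.startswith("q="):
            try:
                quality = float(params[2:])
            except ValueError:
                quality = 0.0
        if tag:
            weighted.append((-quality, position, tag))
    return [tag for _, _, tag in sorted(weighted)]


def choose_locale(user_choice, accept_language, supported, default="en"):
    """Prefer the user's explicit choice, then the detected languages."""
    if user_choice and user_choice.lower() in supported:
        return user_choice.lower()
    for tag in parse_accept_language(accept_language or ""):
        if tag in supported:
            return tag
        base = tag.split("-")[0]
        if base in supported:
            return base
    return default


supported = {"en", "fr", "ja"}
print(choose_locale(None, "fr-CH, fr;q=0.9, en;q=0.8", supported))  # fr
print(choose_locale("ja", "fr-CH, fr;q=0.9, en;q=0.8", supported))  # ja
```

The explicit user choice always wins over detection, so a visitor on a borrowed machine or abroad is never locked into the detected language.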


Question

• Any other international variables you can think of?
• Reading Writing Direction (BiDi, Vertical)
  • HTML DIR="RTL"
  • CSS direction
  • unicode-bidi property
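A minimal sketch of setting the HTML direction attribute from the language rather than hard-coding it. The set `RTL_LANGUAGES` is a deliberately short, illustrative list and the function name `html_open_tag` is hypothetical.

```python
# Illustrative subset of right-to-left languages (an assumption, not a full list).
RTL_LANGUAGES = {"ar", "fa", "he", "ur"}


def html_open_tag(lang: str) -> str:
    """Build an opening <html> tag with the direction implied by the language."""
    base = lang.split("-")[0].lower()
    direction = "rtl" if base in RTL_LANGUAGES else "ltr"
    return f'<html lang="{lang}" dir="{direction}">'


print(html_open_tag("ar"))     # <html lang="ar" dir="rtl">
print(html_open_tag("en-US"))  # <html lang="en-US" dir="ltr">
```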


Translatability

Translatability means that the product can be translated easily into other languages using an efficient, common-sense, scalable process


Translatability

    <INPUT name=ExchangeRateType type=radio value=USER
      <% if((currencyRadioButton==null)||currencyRadioButton.equals("")||currencyRadioButton.equals("USER"))
           out.print("CHECKED"); %>>
    User rates specified in the table below </TD>

Externalization - separating strings from software code makes translation safer and easier

Tokens

Context

MESSAGE_CODE  MESSAGE_TEXT  LANGUAGE
------------  ------------  ---------
HELLO         Hello         EN
HELLO                       JA
HELLO         Bonjour       F

© Oracle Corporation 2009
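Externalization can be sketched as a lookup keyed by message code and language, mirroring the table on this slide. The dictionary below stands in for a database table or resource file; the function name `get_message` and the fallback to EN are assumptions for illustration.

```python
# Message texts live outside the program logic, keyed by (code, language).
MESSAGES = {
    ("HELLO", "EN"): "Hello",
    ("HELLO", "F"): "Bonjour",
}


def get_message(code: str, language: str, fallback: str = "EN") -> str:
    """Look up a message, falling back to the source language if untranslated."""
    try:
        return MESSAGES[(code, language)]
    except KeyError:
        return MESSAGES[(code, fallback)]


print(get_message("HELLO", "F"))   # Bonjour
print(get_message("HELLO", "DE"))  # Hello (no DE entry, so EN is used)
```

Adding a language then means adding rows, not editing code.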


Translatability

• Separation of structure and rendering
• Ready-made in some cases
  • XML
  • XSLT
  • CSS
• File formats
  • Minimize
  • XLIFF
  • Standardize where you can
• Context versus Preview
• Text formats

<trans-unit id="HcmPayBalTop_60FB2908EE5DCCCAE040D30A68810384V000">

<source>You can view a single balance (the accumulated result of a payroll calculation) and groups of balances. Review balance results to confirm that the payroll run has completed successfully, to verify that a worker has the correct pay and amount of tax deducted, and to check a balance before and after adjusting it.</source>

<note>Product feature: “HRM: Workforce Deployment”</note>

<note>Page title: "Search: Balances By Country"</note>

</trans-unit>
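The value of a structured format is that tools can read the text and its context notes without custom parsing. The following is a small sketch using Python's standard `xml.etree.ElementTree`; the source sentence is shortened for brevity, the fragment has no XLIFF namespace (as on the slide), and the function name `read_trans_unit` is hypothetical.

```python
import xml.etree.ElementTree as ET

SAMPLE = """<trans-unit id="HcmPayBalTop_60FB2908EE5DCCCAE040D30A68810384V000">
  <source>You can view a single balance and groups of balances.</source>
  <note>Product feature: "HRM: Workforce Deployment"</note>
  <note>Page title: "Search: Balances By Country"</note>
</trans-unit>"""


def read_trans_unit(xml_text: str) -> dict:
    """Extract the identifier, source text, and context notes of one unit."""
    unit = ET.fromstring(xml_text)
    return {
        "id": unit.get("id"),
        "source": unit.findtext("source"),
        "notes": [note.text for note in unit.findall("note")],
    }


unit = read_trans_unit(SAMPLE)
print(unit["id"])
for note in unit["notes"]:
    print("context:", note)
```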


Translatability

Ensure quality translation - allow for text expansion:
• 200-300% for less than 20 characters in US English
• 150% for 21-50 characters
• 130% for over 50 characters

• Technologies that allow expansion
• Context description

© Oracle Corporation 2001

© Richard Ishida, 1999, Xerox, Designing International User Interfaces
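The expansion guidelines can be turned into a quick sizing check. This sketch rests on several assumptions that are not stated on the slide: the percentages are read as the translated length relative to the source length, the upper bound of 300% is used for short strings, and a string of exactly 20 characters is placed in the first band. The function name `target_length` is hypothetical.

```python
def target_length(source_length: int) -> int:
    """Return the space to reserve for a translation, in characters."""
    if source_length <= 20:
        percent = 300  # upper end of the 200-300% guideline
    elif source_length <= 50:
        percent = 150
    else:
        percent = 130
    # Integer arithmetic, rounded up, avoids floating-point surprises.
    return (source_length * percent + 99) // 100


for length in (10, 30, 100):
    print(length, "->", target_length(length))
```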


Translatability

Tokens
• Replaceable run-time variables
• Efficient programming technique
• Dangerous to use for translated words and verbs

Correct: This purchase order must be approved.

Incorrect: This purchase order must be &ACTION.

• Use for non-translatables: file or server names, dates, currency amounts, system user names, or numbers

• Shortcut can prove costly in long run

Problems with concatenation
• <string1>+<string2>=<string3>

Terminal is operational -> Terminal est opérationnel
Terminal is not operational -> Terminal n’est pas opérationnel

© Oracle Corporation 2009
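One way to avoid both pitfalls is to store each state as a complete sentence and reserve the token for a non-translatable value. The sketch below is illustrative: the message keys, the terminal name token, and the exact French wording are assumptions rather than content from the slide.

```python
# Whole sentences per state; {name} is a non-translatable terminal identifier.
MESSAGES = {
    "en": {
        "TERMINAL_UP": "Terminal {name} is operational",
        "TERMINAL_DOWN": "Terminal {name} is not operational",
    },
    "fr": {
        "TERMINAL_UP": "Le terminal {name} est opérationnel",
        "TERMINAL_DOWN": "Le terminal {name} n'est pas opérationnel",
    },
}


def terminal_status(lang: str, name: str, operational: bool) -> str:
    """Pick the complete sentence for the state, then fill in the token."""
    key = "TERMINAL_UP" if operational else "TERMINAL_DOWN"
    return MESSAGES[lang][key].format(name=name)


print(terminal_status("en", "T-42", False))
print(terminal_status("fr", "T-42", False))
```

Building the English sentence by inserting "not " would not carry over to French, where the negation wraps around the verb ("n'est pas"), so each state needs its own translatable string.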


Translatability

• Sorting Orders
  • Don’t hard-code
  • Manually Expensive
  • O/S, DB collation or generate w/ XSL
• Multilingual files
  • Separate files
  • Bilingual translation memories
  • Easier maintenance
  • Faster turnaround
• Translation kit structure
  • Common, then folder for each language
  • Reflect storage
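To see why sort order should come from a collation service rather than from the order in which content happens to be stored or encoded, compare a plain code-point sort with a key that ignores case and accents. The key below is a crude stand-in, written only for illustration; it is not a substitute for operating system, database, or ICU collation, and the function name `crude_key` is hypothetical.

```python
import unicodedata


def crude_key(text: str) -> str:
    """Approximate a linguistic sort key by dropping accents and case."""
    decomposed = unicodedata.normalize("NFD", text)
    without_marks = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    )
    return without_marks.casefold()


words = ["zebra", "\u00c4pfel", "apple", "Zulu"]
print(sorted(words))                 # code-point order: capitals first, accented last
print(sorted(words, key=crude_key))  # closer to what a reader expects
```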


Translatability

• Identifiers
  • Use for leveraging, context security
  • Scale
  • Development, storage efficiencies too

<p><!--BOLOC intro1019756-->Traditional performance measurement systems typically do not provide top managers with a comprehensive view of the organization. The Balanced Scorecard is a performance measurement methodology, developed by Kaplan and Norton, that exceeds the typical scope of traditional performance measurement systems. The Balanced Scorecard methodology links the financial goals of an enterprise with the drivers that determine future success.<!--EOLOC intro1019756--></p>
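Identifiers such as the BOLOC/EOLOC comment pair make it possible to pull translatable blocks out of content mechanically. The sketch below assumes the comment syntax shown in the sample; the regular expression, the shortened sample text, and the function name `extract_blocks` are illustrative.

```python
import re

# Matches <!--BOLOC id-->text<!--EOLOC id--> with the same id on both ends.
LOC_BLOCK = re.compile(r"<!--BOLOC (\S+?)-->(.*?)<!--EOLOC \1-->", re.DOTALL)


def extract_blocks(html: str) -> dict:
    """Map each block identifier to the text it encloses."""
    return {m.group(1): m.group(2).strip() for m in LOC_BLOCK.finditer(html)}


sample = (
    "<p><!--BOLOC intro1019756-->Traditional performance measurement "
    "systems typically do not provide top managers with a comprehensive "
    "view of the organization.<!--EOLOC intro1019756--></p>"
)
blocks = extract_blocks(sample)
print(list(blocks))  # ['intro1019756']
```

Because the identifier is stable, a translation stored against it can be reused when the surrounding page changes.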


Translatability

• Source Content
  • Approved terminology
  • Glossary and style guide
  • Write for the intended audience
  • Care with lang, cultural references, humor…
  • Active voice?
  • Eliminate wordiness (cost, time, user experience)
  • Care with symbols, characters, acronyms, and other “shortcuts”
  • Avoid temporary or placeholder text

© Oracle Corporation 2009


Translatability

Cost Control
• Common sense
• Basic writing standards
• Examples

Solution:

Not needed: You must enter the username that you want to log on with. (12 words saved)

Wordy: Click the Apply button to save your work, and then click the Continue button.
Better: Save your work and continue. (9 words saved)

© Oracle Corporation 2009


Translatability

Graphics
• Nonlocalizable if possible
• If not, externalize text (SVG, XLIFF, and so on)
• Store separately
• Single tool for authoring/L10n if possible
• Unicode fonts
• Allow resizing
• Care with images:
  • Hands, Body Parts
  • Flags, Maps
  • People
  • Directionality
  • Color when associated with objects

Sound
• Nonlocalizable if possible
• Audio -> Recording timing, transitions


Translatability

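    import java.util.ListResourceBundle;

    public class OEXBundle extends ListResourceBundle {
        // Keys are fixed identifiers; only the values are translated.
        static final Object[][] contents = {
            {"EDITDETAILS", "Edit Details"},
            {"ADDTOCART", "Add to Cart"},
            // further entries are cut off in the original slide
        };

        @Override
        public Object[][] getContents() {
            return contents;
        }
    }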

Images for buttons and labels generated from a translated text file at run-time

© Oracle Corporation 2009


Questions

• Can you think of other problematic graphics?
• Can you see what’s wrong with this file? What are the solutions?


Tools and Environments

• Development Tools “Baked-In” I18N
• Information Quality authoring (e.g., Acrolinx IQ Suite)
  • Enforces terminology and style standards, reuse
• Write your own scripts for text processing
• Pseudotranslation tools
  • Externalization
  • Hard-coding
  • Text expansion
  • Tokens
  • O/S, run-time exes
  • Character set support
  • Translation tool synergies
• Test with translated data, localized O/S and environments
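A home-made pseudotranslation script can expose several of these problems at once. The sketch below is one possible approach, not a description of any particular tool: it accents vowels (character set support), pads the string (text expansion), wraps it in brackets so truncation is visible, and leaves `{tokens}` untouched. The token syntax, the 130% default, and all names are assumptions.

```python
import re

ACCENTED = str.maketrans("AEIOUaeiou", "ÀÉÎÕÜàéîõü")
TOKEN = re.compile(r"(\{[^{}]*\})")  # run-time tokens such as {name}


def pseudotranslate(text: str, expansion_percent: int = 130) -> str:
    """Return a readable fake translation of text for testing purposes."""
    parts = TOKEN.split(text)
    # With a capturing group, odd-numbered parts are the tokens themselves.
    body = "".join(
        part if index % 2 else part.translate(ACCENTED)
        for index, part in enumerate(parts)
    )
    wanted = (len(text) * expansion_percent + 99) // 100
    padding = "~" * max(0, wanted - len(body))
    return f"[{body}{padding}]"


print(pseudotranslate("Add to Cart"))
print(pseudotranslate("Hello {name}"))
```

Any string that still appears unaccented and unbracketed in the running application was probably hard-coded rather than externalized.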


Globalyzer Tool

© LingoPort, 2005, 2007


Pseudotranslated Environments

© Oracle Corporation 2009


For the Little Guy
• Don’t be intimidated by the “GILT Industry”
• Who can afford to pay for conferences, reports, tools?
• Use common sense
  • Leverage what’s provided by technology for free
  • Write well in English (no translation guidelines)
  • Prioritize (graphics 5%? Don’t sweat it)
  • Obtain pseudotranslations using Google Translate (AR, F, JA)
  • Pseudotranslate using your translation tool
  • Visually inspect on different browsers, platforms
  • Make your own checklist, write your own tools
  • Discount usability and I18n testing – black-team
  • Social media
  • Engage the community, volunteers
  • Engineer for user participation and input
  • Beg, borrow, steal ideas and tools
  • If it works for you, go for it, but architect for expansion and scale


Resources
• Web
  • www.w3c.org, www.xliff.org
  • www.multilingual.com (plus guides)
  • www.i18nguy.com
  • www.globalyzer.com
  • www.opentag.com
• Social Media
  • Blogos
  • LinkedIn (groups)
  • @r12a, @localization, #agis09, #i18n on Twitter
• Publications
  • Multilingual Magazine
  • Lunde, K. 2008. CJKV Information Processing
  • Hall, B. 2004. Globalization Handbook for the .NET Platform
  • Savourel, Y. 2001. XML Internationalization and Localization
  • Graham, T. 2000. Unicode: A Primer
  • Apple Computer Inc. 1992. Guide to Macintosh Software Localization


Summary

• Definitions
• Organizational and Process
• Internationalization Issues
• Tools, Environments
• … References


Contact Information

• Information
  • [email protected]
  • http://www.multilingualblog.com
  • @localization
• Thank You
• Questions?