internationalization 2014.pptx
Post on 08-Sep-2015
Internationalization
Internationalization
Localization
Locale
Java
Regional Requirements affecting the UI Design
It is important to bear in mind that more than 40 languages need to be supported, and that number will grow
The UI and software design should be as generic as possible
At the very least, they should be easy to customize
Customization is needed for creating language, regional, and operator variants
Definition
It is the process of designing a program from the ground up so that it can be changed to reflect the expectations of a new user community without having to modify or recompile its executable code
Designing the program from the ground up with this kind of customization in mind is vital
Different user populations, particularly those speaking different languages or living in different countries, have widely varying expectations for how a computer program interacts with them
Definitions
Translation
The process of converting text in one language to text in another language
Localization
The process of modifying a program to conform to the expectations of a given user community.
This can involve not only translating text, but also altering pictures, colors, and window layouts and changing the program's behavior
Internationalization
This is not localization itself; it is a technique that greatly simplifies the localization process
Localization (which includes translation) is what translation houses do to software to prepare it for a particular market. Internationalization is what programmers do to make sure the program can be localized easily
Definition
Locale includes
language, currency formatting, time and date formatting, and numeric formatting
Separating the locale-specific information from the code
Internationalization isn't a feature
Making sure the product is localized is just part of designing a good user interface
Designing the program from the start with eventual localization in mind can save considerable time down the road
The built-in Java I18N functions support over 70 language-country combinations
Purpose of I18N
Translation can be complicated
Retrofitting an application so that it can be translated is incredibly difficult
Design
Basic approaches
Design one all-inclusive product that is shipped everywhere but with different defaults
Design a modular product
Plug in localization modules as required before shipping to specific locales.
Procedure
Gather information about target locales
Determine the target audience
Make an international impact assessment
Determine which features will function identically across international boundaries
Determine which features are generally OK but have to be implemented differently in target locales (e.g., addresses)
Determine which features have to be discarded or completely re-engineered
Specific ideas
Use visual rather than verbal feedback.
Reduce the number of commands = Empower the mouse.
Make use of multi-cultural images
arrows
books, newspapers or magazines
calculators, computers, monitors or keyboards
How Java helps
Java supplies an extensive library of classes and functions to help you internationalize your programs
Some I18N support comes for free or at very little cost
This often includes partial support for some languages your program doesn't explicitly support
Avoid ad-hoc solutions in favor of the standard ones whenever possible
The Java libraries are more thorough and more thoroughly tested than most ad-hoc solutions would be
Bug fixes and support for new languages come for free
Rules for Internationalization
Separate program code from user interface
keep user-interface data (labels and messages, pictures, window layouts, etc.) out of program code
Text elements may grow or shrink dramatically when translated
An English message can get much smaller when translated into Japanese and much larger when translated into Italian
Overall arrangement of UI elements may change depending on writing direction of text
UI elements themselves may change shape or arrangement depending on writing direction of text
Handling User-Visible Text
Rules for Internationalization
Rely on external libraries whenever possible
In Java, this means using routines and classes in java.text and java.util whenever possible
If you need locale-specific capabilities that Java doesn't provide and you implement them yourself, keep them separate from the rest of your program logic, and allow for graceful degradation when you're operating in a language they weren't designed for
Watch out for hidden assumptions
Be careful to keep hidden assumptions about locale and UI out of your internal processing code
Be careful when converting a piece of data from its internal representation to a human-visible representation
Use locale-sensitive APIs whenever possible
Numeric values
Currency and other denominated numeric values
Dates and times
Characteristics
With the addition of localized data, the same executable can run worldwide
Textual elements, such as status messages and the GUI component labels, are not hardcoded in the program
Support for new languages does not require recompilation
Culturally-dependent data, such as dates and currencies, appear in formats that conform to the end user's region and language
It can be localized quickly
Unicode
A universal character-encoding standard
Unicode starts with a single character set that includes the characters used in the world's major (and quite a few minor) writing systems
Unicode provides several character encoding systems that allow the representation of all these characters, all at the same time
Most newer programming languages (including both Java and JavaScript) are being designed with Unicode as their native character-string format, and Unicode support is appearing in more and more operating systems and applications
Unicode and I18N
Internationalization is completely possible without Unicode
But internationalization is much easier with Unicode
No need for character-set tagging
Easier to implement language-specific processes
Easier to handle multilingual text
Java and Unicode
All text in a running Java program is Unicode
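The primitive type char is a 16-bit UTF-16 code unit; characters outside the Basic Multilingual Plane take two chars
The String type is a sequence of char
The java.io package can convert between Unicode and other character encodings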
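As a rough illustration of the java.io conversion mentioned above, the sketch below writes a string out through an OutputStreamWriter and reads it back through an InputStreamReader. The class name EncodingRoundTrip and the sample text are assumptions made for this example, not part of the original slides.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; the name is not from the slides.
class EncodingRoundTrip {
    // Converts in-memory Unicode text to bytes in the given encoding.
    static byte[] encode(String text, Charset charset) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (Writer writer = new OutputStreamWriter(bytes, charset)) {
            writer.write(text);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Converts bytes in the given encoding back to Unicode text.
    static String decode(byte[] data, Charset charset) {
        StringBuilder text = new StringBuilder();
        try (Reader reader = new InputStreamReader(new ByteArrayInputStream(data), charset)) {
            int c;
            while ((c = reader.read()) != -1) {
                text.append((char) c);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return text.toString();
    }

    public static void main(String[] args) {
        String text = "Gr\u00fc\u00dfe"; // five chars, two of them non-ASCII
        byte[] latin1 = encode(text, StandardCharsets.ISO_8859_1);
        byte[] utf8 = encode(text, StandardCharsets.UTF_8);
        System.out.println("ISO-8859-1 bytes: " + latin1.length);
        System.out.println("UTF-8 bytes: " + utf8.length);
        System.out.println("Round trip intact: " + text.equals(decode(utf8, StandardCharsets.UTF_8)));
    }
}
```

The same text occupies a different number of bytes in each encoding, while the String inside the program stays the same.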
Java and Unicode
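// Java identifiers may contain Unicode letters; the names below are illustrative
class Données {
    String nord = "north";
    double π = 3.14159;
}
class UnicodeTest {
    public static void main(String[] arg) {
        Données x1 = new Données();
        System.out.println(x1.nord);
        System.out.println(x1.π);
    }
}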
Resource Bundles
Resource bundles can contain not only messages and other user-visible strings, but icons and pictures, actual UI elements like menus and buttons, and even whole window layouts
Java provides an abstract ResourceBundle object that represents a resource bundle
The process at a glance
ListResourceBundle
import java.awt.Button;
import java.util.ListResourceBundle;

public class MyResource extends ListResourceBundle {
    // Key/value pairs; a value can be any Object, not only a String
    private static final Object[][] contents = {
        { "HELLO_TEXT", "Hello, world!" },
        { "GOODBYE_TEXT", "Goodbye everyone!" },
        { "CANCEL_BUTTON", new Button("Cancel") }
    };

    // ListResourceBundle requires this method to return Object[][]
    @Override
    protected Object[][] getContents() {
        return contents;
    }
}
ListResourceBundle
ListResourceBundle allows you to store any class of object
It implements both handleGetObject and getKeys for you
The purpose of this bundle is to allow you to define localizable elements as a two-dimensional array of key/value pairs
The Code
Step 1
Create the resource bundles
Create properties files
These are in plain-text format
Store the translatable text of the messages to be displayed
File MyResources_bn.properties will contain the Bengali text corresponding to the keys
Create a ListResourceBundle class
Step 2
Define the locale
The Locale object identifies a particular language and country
Locale frLocale = new Locale("fr", "FR");
String language = args[0];
String country = args[1];
Locale currentLocale = new Locale(language, country);
Locale objects are only identifiers
The object is passed to other locale-sensitive objects that perform useful tasks, such as formatting dates and numbers
A ResourceBundle is an example of a locale-sensitive object
Step 3
Create ResourceBundle
contain locale-specific objects
isolate locale-sensitive data
ResourceBundle myResources = ResourceBundle.getBundle("MyResources", currentLocale);
ResourceBundle has two subclasses
ListResourceBundle and PropertyResourceBundle
What's in a Name?
If you create MyResources to store all English text, you will create a similarly named file to store the French text
MyResources_language_country
The getBundle method provides a graceful degradation algorithm that attempts to find the nearest matched bundle in cases where the specified bundle can't be found or doesn't exist
MyResource_fr_FR
MyResource_fr_CA
The Algorithm
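Candidate bundle names are tried in this order (1 = requested locale, 2 = default locale):
baseName + "_" + language1 + "_" + country1 + "_" + variant1
baseName + "_" + language1 + "_" + country1
baseName + "_" + language1
baseName + "_" + language2 + "_" + country2 + "_" + variant2
baseName + "_" + language2 + "_" + country2
baseName + "_" + language2
baseName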
An Example
Suppose the default locale determined from the operating system is U.S. English
You may want to load MyResource for the Canadian French locale instead
The call to ResourceBundle.getBundle("MyResource", new Locale("fr","CA")) would produce the following search order
MyResource_fr_CA
MyResource_fr
MyResource_en_US
MyResource_en
MyResource
Inheritance
The ResourceBundle class associates a parent to any bundle
If an object value cannot be found in the specified bundle, ResourceBundle searches the parent bundle
This relationship among bundles is established by giving them the same base name
Step 4
Fetch the text from the Resource Bundle
The properties files contain key-value pairs
The key is hardcoded in the program and it must be present in the properties files
String msg1 = myResources.getString("greetings");
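The key lookup in Step 4 can be tried without any files on disk. The sketch below is only an approximation: it builds a PropertyResourceBundle from an in-memory string standing in for a properties file. The class name BundleLookup and the French sample text are assumptions made for this example.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.PropertyResourceBundle;
import java.util.ResourceBundle;

// Hypothetical demo class; the name is not from the slides.
class BundleLookup {
    // Stand-in for the contents of a file such as MessagesBundle_fr_FR.properties
    static final String FRENCH_PROPERTIES =
        "greetings = Bonjour.\n"
        + "farewell = Au revoir.\n";

    // Parses key-value pairs in properties format into a bundle.
    static ResourceBundle load(String propertiesText) {
        try {
            return new PropertyResourceBundle(new StringReader(propertiesText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        ResourceBundle messages = load(FRENCH_PROPERTIES);
        // The key is hardcoded in the program; the value comes from the bundle
        System.out.println(messages.getString("greetings"));
        System.out.println(messages.getString("farewell"));
        // A key missing from the bundle would make getString throw MissingResourceException
        System.out.println("has inquiry? " + messages.containsKey("inquiry"));
    }
}
```

In a real program the bundle would normally come from ResourceBundle.getBundle, which also applies the locale fallback search.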
Example
import java.util.Locale;
import java.util.ResourceBundle;

public class I18NSample {
    public static void main(String[] args) {
        // Default to U.S. English unless a language and country are supplied
        String language = "en";
        String country = "US";
        if (args.length == 2) {
            language = args[0];
            country = args[1];
        }
        Locale currentLocale = new Locale(language, country);
        // Finds the closest matching MessagesBundle for the locale
        ResourceBundle messages = ResourceBundle.getBundle("MessagesBundle", currentLocale);
        System.out.println(messages.getString("greetings"));
        System.out.println(messages.getString("inquiry"));
        System.out.println(messages.getString("farewell"));
    }
}
java.text Architecture
Data Driven Model
Most i18n classes are pure execution engines that derive their exact behavior from some kind of textual description
The class's actual behavior is specified by a description (usually a String) that is supplied from outside
The application supplies it at construction time
or
The framework loads one from a resource bundle
Abstract classes and factory methods
Sometimes different code is required to support certain locales.
The Java i18n frameworks are based on abstract classes and factory methods
Factory methods are static methods that return an instance of the native class
like Calendar.getInstance
Factory methods:
have names, unlike constructors, which can clarify code
do not need to create a new object upon each invocation - objects can be cached and reused, if necessary.
can return a subtype of their return type; in particular, they can return an object whose implementation class is unknown to the caller. This is a valuable and widely used feature in frameworks that use interfaces as the return type of static factory methods.
Common names for factory methods include getInstance and valueOf
The Framework-Overview
The main API classes are all abstract; many of the implementation classes are internal
Collator.getInstance(Locale.FRANCE);
Framework instantiates a subclass based on parameters
Many of the implementation classes are also public
These classes can be instantiated directly by the user
more control
less flexibility as special cases could not be handled
Most classes have multiple factory methods:
DateFormat.getInstance()
DateFormat.getTimeInstance()
DateFormat.getTimeInstance(style)
DateFormat.getTimeInstance(style, locale)
This allows the user to achieve a fair amount of control over the result without having to call the implementation class directly
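A small sketch of the abstract-class-plus-factory-method idea, using Collator as on the slide. The class name FactoryDemo and its helper methods are assumptions for this example. The concrete class returned by the factory is an implementation detail, so the code prints it rather than relying on it.

```java
import java.lang.reflect.Modifier;
import java.text.Collator;
import java.util.Locale;

// Hypothetical demo class; the name is not from the slides.
class FactoryDemo {
    // The API class itself is abstract and cannot be instantiated directly.
    static boolean apiClassIsAbstract() {
        return Modifier.isAbstract(Collator.class.getModifiers());
    }

    // The factory method chooses a concrete subclass for the requested locale.
    static Collator frenchCollator() {
        return Collator.getInstance(Locale.FRANCE);
    }

    public static void main(String[] args) {
        Collator collator = frenchCollator();
        System.out.println("Collator is abstract: " + apiClassIsAbstract());
        System.out.println("Concrete class returned: " + collator.getClass().getName());
    }
}
```

The caller programs against Collator only, so the framework is free to return a different implementation class for another locale.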
Locales
A Locale has three parts:
Language ID (drawn from ISO 639): e.g. de = German
Country/Region ID (drawn from ISO 3166): e.g. AT = Austria
Variant code (ad-hoc): can be used to specify Euro currency
Locale objects don't contain data
This approach allows different subsystems to support different sets of locales
Java doesn't follow the POSIX setlocale() model
Instead of setting a locale and then doing something, a Locale object is passed to an i18n object's constructor or factory method
I18n objects for several locales can coexist easily
Default Locale
There is, however, a default locale:
Used when the user doesn't supply a locale
Used as a fallback when looking for resource bundles
Picked up from the underlying environment or specified on the command line
(e.g., java -Duser.language=fr -Duser.country=CA MyProgram)
Can be changed (Locale.setDefault()), but doing so is not thread-safe
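The difference between relying on the default locale and passing one explicitly can be sketched as follows. The class name DefaultLocaleDemo and its method are assumptions for this example. The code changes the JVM-wide default only briefly and restores it afterwards.

```java
import java.text.NumberFormat;
import java.util.Locale;

// Hypothetical demo class; the name is not from the slides.
class DefaultLocaleDemo {
    // Returns { formatted with the default locale, formatted with an explicit locale }.
    static String[] formatBothWays(double value, Locale temporaryDefault, Locale explicit) {
        Locale saved = Locale.getDefault();
        try {
            // Affects the whole JVM, which is why the slide warns about threads
            Locale.setDefault(temporaryDefault);
            String usingDefault = NumberFormat.getInstance().format(value);
            String usingExplicit = NumberFormat.getInstance(explicit).format(value);
            return new String[] { usingDefault, usingExplicit };
        } finally {
            Locale.setDefault(saved); // always restore the original default
        }
    }

    public static void main(String[] args) {
        String[] results = formatBothWays(1234.5, Locale.GERMANY, Locale.US);
        System.out.println("default locale (de_DE): " + results[0]);
        System.out.println("explicit locale (en_US): " + results[1]);
    }
}
```

Passing the Locale explicitly keeps the result independent of whatever default the environment supplies.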
ResourceBundle Hierarchy
Resource Bundles
ResourceBundle provides
A generic interface to any type of actual repository of resource data
A graceful fallback in case information for a particular locale isn't there
All of the resources must be present in the root resource bundle
If you have a resource bundle with a language and a country, DO NOT omit the bundle with just the language
Root resource bundle can be in any language
There is no requirement that all of the bundles in the hierarchy be of the same class
Make all of your resource bundles descend directly from ListResourceBundle and not from each other
Programmatic ID vs Display Names
Locale IDs and time zone IDs (and so forth) are meant only for internal programmatic use
Don't use getName() to get user-visible text;
Use getDisplayName() instead
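To make the distinction concrete, the sketch below prints the programmatic ID of a locale next to its display name. The class name DisplayNameDemo and its methods are assumptions for this example.

```java
import java.util.Locale;

// Hypothetical demo class; the name is not from the slides.
class DisplayNameDemo {
    // Programmatic ID: stable, meant for code and file names, not for users.
    static String programmaticId(Locale locale) {
        return locale.toString();
    }

    // Display name: localized text suitable for showing to a user.
    static String displayName(Locale locale, Locale userLocale) {
        return locale.getDisplayName(userLocale);
    }

    public static void main(String[] args) {
        Locale target = Locale.FRANCE;
        System.out.println("ID: " + programmaticId(target));
        System.out.println("Shown to an English-speaking user: " + displayName(target, Locale.ENGLISH));
        System.out.println("Shown to a French-speaking user: " + displayName(target, Locale.FRENCH));
    }
}
```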
Message Format
The search found 23 files containing hello on disk MyDisk.
Es gibt 23 Dateien auf Platte MyDisk, die hello enthalten.
The code
dialog.add("Center",
new Label("The search found " + hits + " files containing \"" + searchString + "\" on disk \"" + searchRoot+ "\"."));
The hidden assumption is that the blanks will come in the same order in every language
Use MessageFormat
dialog.add("Center",
    new Label(MessageFormat.format(
        "The search found {0} files containing "
        + "\"{1}\" on disk \"{2}\".",
        new Object[] {
            Integer.valueOf(hits),
            searchString,
            searchRoot
        })));
The localizable part of this statement is the pattern string
The pattern string can use some parameters (not all) and can also use a parameter multiple times
Using MessageFormat with a resource bundle
dialog.add("Center",
    new Label(MessageFormat.format(
        resources.getString("ResultMessage"),
        new Object[] {
            Integer.valueOf(hits),
            searchString,
            searchRoot
        })));
{ "ResultMessage",
  "The search found {0} files "
  + "containing \"{1}\" on disk "
  + "\"{2}\"." }
{ "ResultMessage",
  "Es gibt {0} Dateien "
  + "auf Platte {2}, "
  + "die {1} enthalten." }
Handling Plurals
The search found 1 files containing hello on disk MyDisk.
The search found 1 file(s) containing hello on disk MyDisk.
The search found {0} files containing "{1}" on disk "{2}".
The search found {0,choice, 0#no files|1#one file|2#{0} files} containing "{1}" on disk "{2}".
Let's say the root of the search could either be a whole disk or a single folder
The search found {0,choice, 0#no files|1#one file|2#{0} files} containing "{1}" {3,choice,0#on disk "{2}" |1#in folder "{2}"}
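The choice pattern above can be exercised with a few hit counts. This is a minimal sketch, assuming a hypothetical class PluralDemo and a fixed Locale.US so that the number formatting is predictable. The pattern is the one from the slide, with the space after "choice," removed.

```java
import java.text.MessageFormat;
import java.util.Locale;

// Hypothetical demo class; the name is not from the slides.
class PluralDemo {
    // In a real program this pattern would be loaded from a resource bundle.
    static final String PATTERN =
        "The search found {0,choice,0#no files|1#one file|2#{0} files} "
        + "containing \"{1}\" on disk \"{2}\".";

    static String report(int hits, String searchString, String searchRoot) {
        MessageFormat format = new MessageFormat(PATTERN, Locale.US);
        return format.format(new Object[] { hits, searchString, searchRoot });
    }

    public static void main(String[] args) {
        // The choice format selects a phrase based on the value of {0}
        System.out.println(report(0, "hello", "MyDisk"));
        System.out.println(report(1, "hello", "MyDisk"));
        System.out.println(report(23, "hello", "MyDisk"));
    }
}
```

Because the whole sentence lives in the pattern string, a translator can reorder the parameters or change the plural phrases without touching the code.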
Handling dynamically generated text
American Format
French Format
Swiss German Format
Arabic Format
Japanese Format
Handling Numbers
DO NOT use toString() to format user-visible numbers!
DO NOT use parseInt() or other similar functions to parse numeric user input!
Use NumberFormat.format() and NumberFormat.parse() instead
lbl = new Label(Double.toString(milesTraveled));
NumberFormat fmtr = NumberFormat.getInstance();
lbl = new Label(fmtr.format(milesTraveled));
Consider making fmtr static so that the formatter is created only once
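A short sketch of both directions, formatting and parsing, with the locale passed explicitly. The class name NumberDemo and its methods are assumptions for this example.

```java
import java.text.NumberFormat;
import java.text.ParseException;
import java.util.Locale;

// Hypothetical demo class; the name is not from the slides.
class NumberDemo {
    // Formats a value the way users in the given locale expect to read it.
    static String show(double value, Locale locale) {
        return NumberFormat.getNumberInstance(locale).format(value);
    }

    // Parses user input written with the given locale's separators.
    static double read(String input, Locale locale) {
        try {
            return NumberFormat.getNumberInstance(locale).parse(input).doubleValue();
        } catch (ParseException e) {
            throw new IllegalArgumentException("Not a number in " + locale + ": " + input, e);
        }
    }

    public static void main(String[] args) {
        double milesTraveled = 1234.56;
        System.out.println(show(milesTraveled, Locale.US));
        System.out.println(show(milesTraveled, Locale.GERMANY));
        // The same digits mean different things depending on the locale's separators
        System.out.println(read("1.234,56", Locale.GERMANY));
    }
}
```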
Number Formatters
All formatters both format and parse
public final String format(Object obj);
public Object parseObject(String source);
There are convenience methods that take a long or double (for formatting) or a String (for parsing) rather than an Object
NumberFormat provides four factory methods
NumberFormat.getInstance()
NumberFormat.getNumberInstance()
Formats numbers in a general-purpose format
NumberFormat.getPercentInstance()
NumberFormat.getCurrencyInstance()
Decimal Format
This class is used to format numbers using standard Western positional notation and the decimal numeration system
The minimum and maximum number of digits on either side of the decimal point can be fixed
A different subclass is needed for
formatting numbers in Chinese characters, or
formatting numbers into words
Decimal Format
String pattern = "000000.000";
DecimalFormat myFormatter = new DecimalFormat(pattern);
String output = myFormatter.format(value);
System.out.println(value + " " + pattern + " " + output);
Locale-sensitive formatting
NumberFormat nf = NumberFormat.getNumberInstance(loc);
DecimalFormat df = (DecimalFormat)nf;
Patterns
DecimalFormat provides a pattern language as a shortcut way to specify many options at once
0 specifies a required digit position
0000
# specifies an optional digit position
0.###
, specifies the use and position of a grouping separator
#,##0.00
Prefixes and suffixes can be added
Value=12345.67
Output=$12,345.67
Pattern=$###,###.###
; separates positive and negative patterns
$#,##0.00
$#,##0.00;($#,##0.00)
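The pattern characters listed above can be tried side by side. The sketch assumes a hypothetical class PatternDemo and pins the symbols to Locale.US so the separators match the slide examples.

```java
import java.text.DecimalFormat;
import java.text.DecimalFormatSymbols;
import java.util.Locale;

// Hypothetical demo class; the name is not from the slides.
class PatternDemo {
    // Applies a DecimalFormat pattern using U.S. separators.
    static String apply(String pattern, double value) {
        DecimalFormatSymbols symbols = DecimalFormatSymbols.getInstance(Locale.US);
        return new DecimalFormat(pattern, symbols).format(value);
    }

    public static void main(String[] args) {
        System.out.println(apply("000000.000", 123.78));              // required digits pad with zeros
        System.out.println(apply("0.###", 0.5));                      // optional digits are dropped
        System.out.println(apply("#,##0.00", 12345.67));              // grouping separator
        System.out.println(apply("$###,###.###", 12345.67));          // literal prefix
        System.out.println(apply("$#,##0.00;($#,##0.00)", -1234.5));  // separate negative pattern
    }
}
```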
Handling Currency
Handling multiple currencies at the same time can be tricky
You may need to keep track of the units for each value
You may need to perform currency conversions
Handling Dates and Times
Calendar
Handling Dates and Times
Today is Friday, July 2, 1999.
Heute ist Friday, July 2, 1999.
Heute ist Freitag, 2. Juli 1999.
Not only do the words for the days and months change, but so does the order of the fields themselves and the punctuation around them.
In fact, in some countries, the calendar system in use also changes: In Hebrew, for example, April 2, 1999 is the 16th of Nisan, 5759
Handling Dates and Times
Use DateFormat:
DateFormat fmt = DateFormat.getDateTimeInstance(
DateFormat.FULL, DateFormat.DEFAULT);
System.out.println(fmt.format(new Date()));
Use MessageFormat:
MessageFormat.format("It is {0,time,medium} on {0,date,full}.", new Object[] { new Date() });
DateFormat also offers a selection of formats (short, medium, long, and full)
The date and time can be set independently
Date Style
Time Style
Handling Dates and Times
Provides four factory methods:
getInstance()
getDateInstance()
August 26, 1999
getTimeInstance()
12:47 PM
getDateTimeInstance()
August 26, 1999 12:47 PM
getInstance works in the same way as getDateTimeInstance()
Handling Dates and Times
Four time styles:
Short (12:54 PM)
Medium/Default: (12:54:56 PM)
Long: (12:54:56 PM PDT)
Full:(12:54:56.034 PM PDT)
Four date styles:
Short: (8/26/99)
Medium/Default: (8/26/1999 or Aug 26, 1999)
Long: (August 26, 1999)
Full:(Thursday, August 26, 1999)
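The four date styles can be printed for more than one locale with a loop like the one below. This is only a sketch: the class name DateStyleDemo is an assumption, the time zone is fixed to UTC for repeatability, and the exact output of some styles depends on the locale data shipped with the Java runtime.

```java
import java.text.DateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical demo class; the name is not from the slides.
class DateStyleDemo {
    // Formats only the date part of an instant in the given style and locale.
    static String formatDate(int style, Locale locale, Date when) {
        DateFormat format = DateFormat.getDateInstance(style, locale);
        format.setTimeZone(TimeZone.getTimeZone("UTC"));
        return format.format(when);
    }

    public static void main(String[] args) {
        int[] styles = { DateFormat.SHORT, DateFormat.MEDIUM, DateFormat.LONG, DateFormat.FULL };
        String[] names = { "SHORT", "MEDIUM", "LONG", "FULL" };
        Date now = new Date();
        for (Locale locale : new Locale[] { Locale.US, Locale.GERMANY }) {
            for (int i = 0; i < styles.length; i++) {
                System.out.println(locale + " " + names[i] + ": " + formatDate(styles[i], locale, now));
            }
        }
    }
}
```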
Calendar
java.util.Date
# of milliseconds since midnight, January 1, 1970 GMT (signed 64-bit integer)
Composing and decomposing
DO NOT use Date.getMonth(), Date.getDate(), Date.getYear(), etc.
Use java.util.Calendar:
Calendar cal = Calendar.getInstance();
cal.setTime(myDate);
myDay = cal.get(Calendar.DAY_OF_MONTH);
myMonth = cal.get(Calendar.MONTH) + 1;
myYear = cal.get(Calendar.YEAR);
The API is better, and it'll work with multiple calendar systems
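A self-contained version of the Calendar snippet might look like the following. The class name CalendarDemo and the choice of UTC and Locale.US are assumptions made so that the result is repeatable. The sample instant is noon UTC on the July 2, 1999 date used earlier in the slides.

```java
import java.util.Calendar;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical demo class; the name is not from the slides.
class CalendarDemo {
    // Returns { year, month (1-based), day of month } for the given instant.
    static int[] decompose(Date date) {
        Calendar cal = Calendar.getInstance(TimeZone.getTimeZone("UTC"), Locale.US);
        cal.setTime(date);
        return new int[] {
            cal.get(Calendar.YEAR),
            cal.get(Calendar.MONTH) + 1, // Calendar months are zero-based
            cal.get(Calendar.DAY_OF_MONTH)
        };
    }

    public static void main(String[] args) {
        Date sample = new Date(930916800000L); // 1999-07-02 12:00:00 UTC
        int[] parts = decompose(sample);
        System.out.println(parts[0] + "-" + parts[1] + "-" + parts[2]);
    }
}
```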
Handling searching and sorting
String comparison is very language-specific
Different definitions of letter
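In English, ä sorts as a variant of a, and v and w are distinct letters
In Swedish, ä is a separate letter from a, and v and w have traditionally been treated as variants of one letter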
In Spanish, ch and ll are considered single letters, not pairs of letters
Expanding character sequences
In German, ä may be treated as ae and ß as ss
Ignorable characters
e-mail and email are the same word
String Comparison
list[i].compareTo(list[i + 1])
Collator coll = Collator.getInstance();
coll.compare(list[i], list[i + 1]) > 0
There are various levels of equivalence for searching
Primary differences
if somewhere they have different letters (according to the language) in corresponding positions
Different letters: resume vs repeat
Secondary differences
if they don't have a primary difference, but do have two corresponding letters with a diacritic or variant-form difference.
Different diacritics: résumé vs resume
Tertiary differences
if they don't have a primary or secondary difference, but two corresponding letters have different case
There's also a fourth level of difference, identity difference, which is when there are no tertiary differences, but the strings still are different in terms of the actual hex codes
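The comparison levels can be observed by changing a Collator's strength. This is a minimal sketch, assuming a hypothetical class CollatorDemo and the Locale.US collation rules.

```java
import java.text.Collator;
import java.util.Locale;

// Hypothetical demo class; the name is not from the slides.
class CollatorDemo {
    // Compares two strings, ignoring differences below the given strength.
    static int compare(String a, String b, int strength) {
        Collator collator = Collator.getInstance(Locale.US);
        collator.setStrength(strength);
        return collator.compare(a, b);
    }

    public static void main(String[] args) {
        String accented = "r\u00e9sum\u00e9"; // "resume" with acute accents
        // Primary strength ignores accents and case
        System.out.println(compare("resume", accented, Collator.PRIMARY) == 0);
        // Secondary strength notices accents but still ignores case
        System.out.println(compare("resume", accented, Collator.SECONDARY) == 0);
        System.out.println(compare("resume", "Resume", Collator.SECONDARY) == 0);
        // Tertiary strength notices case as well
        System.out.println(compare("resume", "Resume", Collator.TERTIARY) == 0);
    }
}
```

A search feature would typically pick the lowest strength that matches what its users consider "the same word".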
Whole word searches
Definition of word varies with language
Other Issues
Use Unicode Character Properties
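char ch; ... if ((ch >= 'a' && ch <= 'z') || (ch >= 'A' && ch <= 'Z')) ... // recognizes only unaccented Latin letters; prefer Character.isLetter(ch)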
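The difference between a hardcoded range check and a Unicode character property can be sketched as follows. The class name LetterCheck and its method names are assumptions for this example.

```java
// Hypothetical demo class; the name is not from the slides.
class LetterCheck {
    // Range check with a hidden assumption: only A-Z and a-z are letters.
    static boolean asciiLetter(char ch) {
        return (ch >= 'a' && ch <= 'z') || (ch >= 'A' && ch <= 'Z');
    }

    // Uses the Unicode character property instead of hardcoded ranges.
    static boolean unicodeLetter(char ch) {
        return Character.isLetter(ch);
    }

    public static void main(String[] args) {
        char[] samples = { 'a', '\u00e9', '\u03b1', '7' }; // a, e-acute, Greek alpha, digit 7
        for (char ch : samples) {
            System.out.println("U+" + Integer.toHexString(ch)
                + " ascii=" + asciiLetter(ch)
                + " unicode=" + unicodeLetter(ch));
        }
    }
}
```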