Notes_007_Software Internationalization (i18n)

SE 3730 / CS 5730 – Software Quality

©2011 Mike Rowe, 01/05/2011 8:52 AM


Localization Internationalization I-18-N Testing – Introduction

Even small companies are making their software globally available.

· In the latest survey of CSSE alumni for ABET objectives (folks 3-5 years after graduation) over 75% of the respondents indicated that they had worked on software for international use. This is from the Spring 2010 ABET Alumni Survey.

Internationalized software supports multiple languages and cultures.

Localized software can be switched to a specific language and culture.

Internationalized software requires special attention in each of these areas:

· Design

· Implementation

· Testing

· Deployment and Sales

· Installation

· Support and maintenance

Design and Implementation:

· Is the text independent from the code? This can be done by keeping all messages in text files or resource files. It is hard to make the text completely independent of the code because of sentence structure and other language differences.

· Example:

· Embedded Text – Not good!

cout << "Error[4535]: The system has detected water on the disk "

     << "drive. Do you wish to drain the disk?" << endl;

· Separated Code and Text – Better Way!

static int Language = DE; // Global that determines the current language

// (DE is a language constant defined elsewhere).

// It is used by the Xlate() function to retrieve

// the proper string message.

cout << Xlate( 4535 ) << endl; // Xlate can also be used to set GUI labels,

// text boxes, etc.
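
The notes do not show how Xlate() is implemented. The following is only a minimal sketch of one way it could work, assuming an in-memory table keyed by language and message number. The LanguageId enum, the table layout, the fallback behavior and the German wording are all illustrative assumptions; a real product would more likely load the strings from text or resource files.

```cpp
#include <initializer_list>
#include <map>
#include <string>

// Hypothetical language identifiers; the notes only mention DE.
enum LanguageId { EN, DE };

// Global that selects the current language, as in the notes.
static LanguageId Language = DE;

// Look up message `id` for the current language.
// Falls back to English, then to a visible placeholder, if no text exists.
std::string Xlate(int id) {
    // Illustrative table only; the German text is an unverified sample.
    static const std::map<LanguageId, std::map<int, std::string>> catalog = {
        {EN, {{4535, "Error[4535]: The system has detected water on the disk drive."}}},
        {DE, {{4535, "Fehler[4535]: Das System hat Wasser im Laufwerk erkannt."}}},
    };
    for (LanguageId lang : {Language, EN}) {
        auto table = catalog.find(lang);
        if (table == catalog.end()) continue;
        auto message = table->second.find(id);
        if (message != table->second.end()) return message->second;
    }
    return "[missing message " + std::to_string(id) + "]";
}
```

Returning a visible placeholder for an unknown message number makes missing translations easy for testers to spot.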

Testing

· Must have both hardware and software to support the testing.

· Must have testers that would recognize language or cultural defects.

Deployment and Sales

· Must follow business rules and regulations of the countries in which you sell.

· Copyrights and anti-piracy practices

Support and maintenance

· Must be able to communicate in the customer’s language and during the customer’s regular business hours.

· Must have the hardware needed to carry out the support effort.

· All documentation must be kept synchronized in multiple languages with the product.

Generally, multiple language support is added to existing products after design.

· This can result in huge rework and inefficiencies

· Also, it is added to a product rather than having a true variant of the product for each language supported.

Installation tests

The installer must be multilingual so that it can direct users to their native language.

Customers must be able to get to their native language immediately.

Character sets test

Different languages need different character sets.

· The graphical representations of characters are contained in Code Pages.

· In the Windows world, code page 437 works for English and German and is the standard in the U.S.

· Code page 850 will work for most Western European languages.

· There are hundreds of Code pages available on computers.

Microsoft Windows OEM Code Pages http://www.microsoft.com/globaldev/reference/oem.mspx

The list below provides links to graphical representations, and textual listings, of each of the Windows OEM code pages:


437 (US)

720 (Arabic)

737 (Greek)

775 (Baltic)

850 (Multilingual Latin I)

852 (Latin II)

855 (Cyrillic)

857 (Turkish)

858 (Multilingual Latin I + Euro)

862 (Hebrew)

866 (Russian)

Windows ANSI and OEM Code Pages. The following codepages are used as both Windows ANSI and OEM codepages:

874 (Thai)

932 (Japanese Shift-JIS)

936 (Simplified Chinese GBK)

949 (Korean)

950 (Traditional Chinese Big5)

1258 (Vietnam)

Code Pages supported by IBM http://www-03.ibm.com/servers/eserver/iseries/software/globalization/codepages.html

· All of these code page images are Copyright IBM Corp. 1994-2001. 

· Here are the double-byte Chinese, Japanese, and Korean code pages. Since some of these files are large (greater than 200 pages), a sample page of each standard is provided for quick reference.

Simplified Chinese

EBCDIC & PC, 1993 version

EBCDIC & PC GBK, 1997 version

GB18030, 2000 version

EUC (Unix)

Single-byte Simplified Chinese code pages

Traditional Chinese

EBCDIC & PC, 1999 version

Big 5, 1999 version

EUC (Unix)

Single-byte Traditional Chinese code pages

Japanese

EBCDIC & PC, 1999 version

EBCDIC & PC, 1996 version

EUC (Unix)

Single-byte Japanese code pages

Korean

EBCDIC & PC, 1999 version

EBCDIC & PC, 1992 version

EUC (Unix) (000)

Single-byte Korean code pages

Here are the rest of the code pages

38xx/4250 (00383)

7-Bit Denmark (01017)

7-Bit Finland/Sweden (01018)

7-Bit France (01010)

7-Bit Germany F.R. (01011)

7-Bit Italy (01012)

7-Bit Netherlands (01019)

7-Bit Norway (01016)

7-Bit Portugal (01015)

7-Bit Spain (01014)

7-Bit United Kingdom (01013)

8-bit Estonia with euro (00902)

Adobe (PostScript) Latin 1 (01277)

Adobe (PostScript) Standard Encoding (01276)

Adobe (PostScript) Standard Encoding (01278)

APL ASCII (3812) (00907)

Apple Central European (01282)

Apple Cyrillic (01283)

Apple Greek (01280)

Apple Turkish (01281)

Apple, Icelandic (01286)

Apple, Latin 1 (01275)

Apple, Romanian (01285)

Arabic - Personal Computer (00864)

Arabic 8-Bit ISO/ASCII (01008)

Arabic Bilingual (00420)

Arabic Code Page, Data Storage & Interchange (01089)

Arabic Extended (01029)

Arabic Extended (01046)

Arabic (XCOM2) (01007)

Arabic/French - Personal Computer (01127)

Arabic/Latin for OS/390 Open Edition (00425)

ASCII (00367)

ASCII - APL (00371)

Australia, New Zealand, United States, Canada (English) - 38xx/4250 (00395)

Austria, Germany ECECP (Euro) (01141)

Austria, Germany F.R. - 38xx/4250 (00382)

Austria/Germany F.R., Alternate (3270) (00286)

Baltic - Multilingual, superset of ISO 8859-13 (00921)

Bar Code, Code-128 (01303)

BCDIC-A (00353)

Brazil (00021)

Brazil - 38xx/4250 (00384)

Brazil - CECP (00275)

Brazil - CECP (01073)

British NRC Set (01101)

Canada (English) (00010)

Canada (French) (00011)

Canada (French) - 38xx/4250 (00385)

Canada (French) - 94 (00276)

Canadian French - 116 (00260)

Canadian French - Personal Computer (00863)

CCITT T.61 (BTTX) Code Page (01036)

China (Hong Kong S.A.R.) (00251)

China (Hong Kong S.A.R.), United Kingdom, Ireland - 38xx/4250 (00394)

Cyrillic - PC with euro (00872)

Cyrillic - Personal Computer (00855)

Cyrillic, 8-Bit (00915)

Cyrillic, Multilingual (00410)

Cyrillic, Multilingual (00880)

Cyrillic, Multilingual (01025)

Cyrillic, Ukraine (01123)

Cyrillic, Ukraine (01124)

Czechoslovakia (00033)

Czechoslovakia (00034)

Czechoslovakia (Czech/Slovak) (00032)

DCF Compatibility (01068)

DCF Release 2 Compatibility (01002)

DCF, US Text Subset (01003)

DEC Greek 8-Bit (01287)

DEC Turkish 8-Bit (01288)

Denmark, Norway - 38xx/4250 (00386)

Denmark, Norway - CECP (00277)

Denmark, Norway - CECP (01074)

Denmark, Norway ECECP (Euro) (01142)

Denmark/Norway (00020)

Denmark/Norway, Alternate (3270) (00287)

Devanagari EBCDIC (01137)

DITROFF Base Compatibility (01108)

DITROFF Specials Compatibility (01109)

Dutch NRC Set (01102)

EBCDIC Baltic Multi with euro (01156)

EBCDIC Cyrillic, Multilingual with euro (01154)

EBCDIC Cyrillic, Ukraine with euro (01158)

EBCDIC Encoding of T.61 (BTTX) Characters (01024)

EBCDIC Estonia with euro (01157)

EBCDIC Latin 2 Multilingual with euro (01153)

EBCDIC Turkey with euro (01155)

EBCDIC, OCR A (00892)

EBCDIC, OCR B (00893)

Estonia, EBCDIC (01122)

Estonia, Personal Computer (01116)

Estonia, similar to ISO 8859-x (00922)

Farsi - Personal Computer (01098)

Farsi Bilingual - EBCDIC (01097)

Finland, Sweden - 38xx/4250 (00387)

Finland, Sweden - CECP (00278)

Finland, Sweden - CECP (01075)

Finland, Sweden ECECP (Euro) (01143)

Finland/Sweden, Alternate (3270) (00288)

Finnish NRC Set (01103)

France - 38xx/4250 (00388)

France - 5080 Graphics System (00885)

France - CECP (00297)

France - CECP (01081)

France AZERTY - 5080 Graphics System (00888)

France ECECP (Euro) (01147)

France, Belgium (00009)

French - 94 (00279)

French NRC Set (01104)

Germany F.R. (00008)

Germany F.R. - 5080 Graphics System (00884)

Germany F.R./Austria (00007)

Germany F.R./Austria - CECP (00273)

Germany F.R./Austria - CECP (01071)

GML Compatibility (01039)

Graphic Escape APL/TN (00310)

Greece (00875)

Greece - 183 (00423)

Greece - Personal Computer (00851)

Greece - Personal Computer (00869)

Greece ISO 8859-7 (00813)

Greece (Latin) (00027)

H-P Emulation, ASCII (01054)

H-P Emulation, Gothic 1 (01053)

H-P Emulation, Gothic Legal (01052)

H-P Emulation, IBM - DN (01058)

H-P Emulation, IBM - US (01057)

H-P Emulation, PC Line (01055)

H-P Emulation, PC Line Draw (01056)

H-P Emulation, Roman 8 (01051)

H-P Emulation, Roman 8 Extended (01050)

Hebrew - Personal Computer (00856)

Hebrew Character Set A (00803)

Hebrew Publishing (01028)

Hebrew (Latin) (00916)

Hitachi EBCDIC Katakana (01136)

Hungary (00254)

Hungary (00320)

IBM Logo (01093)

Iceland (00029)

Iceland - CECP (00871)

Iceland - CECP (01085)

Iceland - Personal Computer (00861)

Iceland ECECP (Euro) (01149)

International - APF 38xx/4250 (00361)

International ECECP (Euro) (01148)

International Set #5 3812/3820 (00906)

International #1 (00256)

International #2 (00257)

International #3 (00258)

International #4 (ROECE/Latin, Multilingual) (00330)

International #5 (00500)

International #5 (01084)

ISO IRV (01009)

ISO Text Communication Isomorphic (01005)

ISO/ANSI Multilingual (00819)

Israel - Personal Computer (00862)

Israel (Hebrew) (00424)

Israel (Hebrew) (01082)

Israel (Hebrew) (01083)

Italy (00012)

Italy - 38xx/4250 (00389)

Italy - 5080 Graphics System (00886)

Italy - CECP (00280)

Italy - CECP (01076)

Italy ECECP (Euro) (01144)

Japan - 5080 Graphics System (00887)

Japan 7-Bit Katakana Extended (00896)

Japan 7-Bit Latin (00895)

Japan Alphanumeric Katakana (01139)

Japan PC #1 (00897)

Japan PC #1 (01086)

Japan PC #2 (00911)

Japan (Katakana) (00298)

Japan (Latin) (00025)

Japan (Latin) (00026)

Japan (Latin) - 38xx/4250 (00390)

Japan (Latin) - CECP (00281)

Japan (Latin) - CECP (01077)

Japanese Extended - Personal Computer (01041)

Japanese (Katakana) Extended (00290)

Japanese (Katakana) Extended (01030)

Japanese (Latin) Extended (01027)

Japanese (Latin) Extended (01031)

Korea - 5080/6090 Graphics System (01037)

Korea - Personal Computer (00891)

Korean - Personal Computer for Windows (01126)

Korean Extended (00833)

Korean Extended - Personal Computer (01040)

Korean extended w/ box characters (01150)

Lao EBCDIC (01132)

Lao ISO-8 (01133)

Latin-1 Extended, Desk Top Publishing/Windows (01004)

Latin 1/Open Systems (01047)

Latin 2 (01111)

Latin 2 - EBCDIC Multilingual (00870)

Latin 2 - ISO (00912)

Latin 2 - Personal Computer (00852)

Latin 2 EBCDIC/Open Systems (01165)

Latin 2, EBCDIC Multilingual (01110)

Latin 3 - EBCDIC (00905)

Latin 3 - ISO (00913)

Latin 3 - Personal Computer (00853)

Latin 4 (00914)

Latin 4, EBCDIC (01069)

Latin 6 - EBCDIC (01113)

Latin 6 - ISO (00919)

Latin 9 (00923)

Latin 9 EBCDIC (00924)

Latin America (Puerto Rico, Costa Rica) (00006)

Latin America (Spanish Speaking) - 38xx/4250 (00393)

Latin #5 - Turkey (00920)

Latin #5 - Turkey (01026)

Latin #5, Turkey - Personal Computer (00857)

Latvia, Personal Computer (01117)

Lithuanian and Russian, Personal Computer (01119)

Lithuania, Personal Computer (01118)

Maghreb/French (00421)

Math Symbols (00829)

MICR (01001)

MICR, CMC-7 Combined (01033)

MICR, E13-B Combined (01032)

Modified Symbols - Personal Computer (01092)

Modified Symbols, Set 7 (01091)

MS DOS Arabic (Transparent ASMO) (00720)

MS DOS Baltic Rim (00775)

MS DOS Greek (00737)

Multinational Emulation (01100)

Netherlands (00013)

Nordic - Personal Computer (00865)

Norwegian/Danish NRC Alternate (01107)

Norwegian/Danish NRC Set (01105)

OCR A (00876)

OCR B (00877)

Old Belgium Code Page (00274)

PC - APL (USA) (00909)

PC - APL (USA) (00910)

PC Baltic Multi with euro (00901)

PC Data, Cyrillic, Belorussian (01131)

PC Data, Cyrillic, Belorussian with euro (00849)

PC Data, Cyrillic, Russian (00866)

PC Data, Cyrillic, Russian with euro (00808)

PC Indian Script Code (ISCII-91) (00806)

PC Latin 9 (00859)

PC, Cyrillic, Ukrainian (01125)

PC, Cyrillic, Ukrainian with euro (00848)

People's Republic of China (PRC)-PC (00903)

People's Republic of China (PRC)-PC (01115)

Personal Computer (00437)

Personal Computer - Multilingual Page (00850)

Personal Computer - Multilingual with euro (00858)

Poland (00252)

Portugal (00022)

Portugal - 38xx/4250 (00391)

Portugal - CECP (00282)

Portugal - CECP (01078)

Portugal - Personal Computer (00860)

Print Train & Text Processing Extended (00264)

Printer Application - Shipping Label, Set #1 (01044)

Printer Application - Shipping Label, Set #2 (01034)

Printing and Publishing Option (00352) 

PTTC/BCD Correspondence Option (00358)

PTTC/BCD Duocase Option (00360)

PTTC/BCD H Option (00357)

PTTC/BCD Monocase Option (00359)

PTTC/BCD Standard Option (00355)

Revised Korean - Personal Computer (01088)

Romania (00035)

Romania (00036)

Russian internet koi8-r (00878)

Simplified Chinese Extended (00836)

Simplified Chinese Extended - PC (01042)

Simplified Chinese extended w/ box characters (01151)

South Africa (00031)

Spain (00014)

Spain - 190 (00283)

Spain Variant (01023)

Spain, Alternate (3270) (00289)

Spain, Latin America (Spanish) ECECP (Euro) (01145)

Spain, Philippines - 38xx/4250 (00392)

Spain/Latin America - CECP (00284)

Spain/Latin America - CECP (01079)

Special Characters and Line Drawing Set (01090)

Sweden - 5080 Graphics System (00883)

Sweden/Finland (00018)

Sweden/Finland WP, Version 2 (00019)

Swedish NRC Set (01106)

Switzerland Variant (01021)

Switzerland (French) (00015)

Switzerland (French/German) (00016)

Switzerland (German) (00017)

Symbol - Personal Computer (00899)

Symbol Set (Adobe) (01038)

Symbol Set (Adobe) - EBCDIC (01087)

Symbols, Set 7 (00259)

Symbols, Set 8 (00363)

Taiwan - Personal Computer (00904)

Taiwan - Personal Computer (01114)

Traditional Chinese EBCDIC (01159)  Also uses 00037

Teletext Isomorphic (00435)

Thai MS Windows (01162)

Thai with Low Tone Marks & Ancient Characters (00838)

Thai with Low Tone Marks & Ancient Characters (01160)

Thai with Low Tone Marks & Ancient Chars - PC (00874)

Thai with Low Tone Marks & Ancient Chars - PC (01161)

Thailand (00889)

Traditional Chinese Extended - PC (01043)

Traditional Chinese extended w/ box characters (01152)

Turkey (00030)

Turkey (00322)

UCS-2

Unicode

UK ECECP (Euro) (01146)

United Kingdom (00023)

United Kingdom (00024)

United Kingdom (00040)

United Kingdom - 5080 Graphics System (00882)

United Kingdom - CECP (00285)

United Kingdom - CECP (01080)

United Kingdom, Israel (Latin) (00039)

United States (00004)

United States (00005)

United States - 5080 Graphics System (00881)

Urdu - Personal Computer (00868)

Urdu Bilingual (00918)

Urdu, 8-Bit (01006)

US - ASCII Character Set (00038)

USA (00002)

USA WP, Original (00001)

USA, Accounting, Version A (00003)

USA, Canada, etc. ECECP (Euro) (01140)

USA/Canada - CECP (00037)

USA/Canada - CECP (01070)

UTF-8

UTF-16

UTF-32

Vietnamese EBCDIC (01130)

Vietnamese EBCDIC with euro (01164)

Vietnamese ISO-8 (01129)

Vietnamese ISO-8 with euro (01163)

Windows, Arabic (01256)

Windows, Baltic Rim (01257)

Windows, Cyrillic (01251)

Windows, Greek (01253)

Windows, Hebrew (01255)

Windows, Latin 1 (01252)

Windows, Latin 2 (01250)

Windows, Turkish (01254)

Windows, Vietnamese (01258)

Word Processing Multilingual - PC (00898)

Yugoslavia (00321)

Yugoslavia (00890)

Unicode and different multi-byte character sets are needed to handle world-wide situations. Different environments may have different character set selections.

Need to test the availability and language switching mechanism as your software is installed and run. There is significant momentum toward Unicode.

Windows has supported Unicode strings since Windows NT. There are two separate sets of APIs. One set ends with a ‘W’ for “wide” and handles Unicode characters. The other set ends in ‘A’ and covers “ANSI” characters (1 byte per character). See http://msdn.microsoft.com/en-us/library/cc500321.aspx for a very good discussion of how Unicode and ANSI characters are handled.
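
One practical consequence of multi-byte character sets is that the number of bytes in a string no longer equals the number of characters. The sketch below illustrates this for UTF-8, which is an assumption here, since the notes do not name a specific encoding. Utf8Length is a hypothetical helper, not a Windows API.

```cpp
#include <cstddef>
#include <string>

// Count Unicode code points in a UTF-8 string by counting every byte that
// is not a continuation byte (continuation bytes have the form 10xxxxxx).
// Assumes the input is well-formed UTF-8.
std::size_t Utf8Length(const std::string& text) {
    std::size_t count = 0;
    for (unsigned char byte : text) {
        if ((byte & 0xC0) != 0x80) ++count;
    }
    return count;
}
```

For the German word "Größe" encoded as UTF-8, size() reports 7 bytes while Utf8Length reports 5 characters, so buffer sizes and field widths need to be tested with non-English text.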

Example: German characters are entered differently on US hardware.

PART 1 - For this German character, type...

These codes work with most fonts. Some fonts may vary. For the PC codes, always use the numeric (extended) keypad on the right of your keyboard and not the row of numbers at the top. (On a laptop you may have to use "num lock" and the special number keys.)

German letter/symbol      PC code (Alt +)    Mac code (option +)
ä                         0228               u, then a
Ä                         0196               u, then A
é (e, acute accent)       0233               E
ö                         0246               u, then o
Ö                         0214               u, then O
ü                         0252               u, then u
Ü                         0220               u, then U
ß (sharp s / es-zett)     0223               S

Size of text messages

English requires fewer characters than most other western languages. As a rule of thumb,

· French is 15% longer,

· German is 25% longer.

· Eastern languages (traditional or simplified Chinese, Japanese, and Korean) require many fewer characters (2-3 character positions per word).

Special consideration must be made for dialog design and functionality to handle different length text messages of the languages supported. Will labels line up over buttons, will text fit inside of boxes or buttons, …

Message lengths also greatly complicate business form and report designs.
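
The rule-of-thumb percentages above can be turned into a rough space budget when sizing labels and fields. This is only a sketch: ReservedLength and the constant names are hypothetical, and real translations should still be measured, since individual messages can grow far more than the average.

```cpp
#include <cstddef>

// Rule-of-thumb growth relative to English, in percent (figures from the notes).
constexpr int kFrenchGrowthPct = 15;
constexpr int kGermanGrowthPct = 25;

// Characters to reserve for a translated message, rounded up.
constexpr std::size_t ReservedLength(std::size_t englishLength, int growthPct) {
    return (englishLength * static_cast<std::size_t>(100 + growthPct) + 99) / 100;
}
```

For the 33-character message “Contact customer support for help”, this reserves 38 characters for French and 42 for German.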

Let’s look at the differences in the message “Contact customer support for help” in different languages.

The comparison below was found by Amy Olstead (class of 2010). Can you pick out any particular languages?

Keyboard tests

Keyboard layouts were originally designed to minimize the jamming of old-fashioned hammer typewriters. The design rule was to separate keys that were often used in sequence, which minimized the time that their hammers spent in close proximity. Since different languages have different common character sequences, their keyboards were originally laid out differently.

Languages and cultures have different characters and special characters.

Keyboards differ from country to country to support their character sets and usage patterns.

French keyboards do not have dedicated ‘[’ or ‘]’ keys. French, German, Scandinavian, Greek, Eastern European, Eastern Asian, etc. all have different keyboards.

These keyboards generate interrupts that must match the loaded code page.

Different keyboards commonly available:

Arabic, Armenian, Belarusian, Belgian Dutch, Belgian French, Brazilian, Bulgarian, Chinese, Croatian, Danish, Dari, English, English US, Farsi, French Canadian, French European, German, Greek, Gujarati, Hebrew, Hindi, Hungarian, Italian, Japanese, Korean, Latin American, Netherlands, Norwegian, Polish, Portuguese, Punjabi, Russian Cyrillic, Spanish, Swedish/Finnish, Swiss, Thai, Urdu, Vietnamese

[Figures: keyboard layouts for English (lower-case), Russian (upper-case), German, Swedish, French, and Arabic/English]

Text filter and special character tests

Sometimes software will block upper ASCII, or other codes. These codes may be needed to support non-English languages.

Special characters in the middle of names may cause problems, for example “O’Kelly”, ñ, ß, Ü. “O’Kelly” will sometimes sort with the ‘O’s and sometimes with the ‘K’s.

Loading, Saving, Importing and Exporting High and Low ASCII tests

Make sure files containing all supported ASCII values can be loaded, saved, imported and exported properly with all supported code pages.
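
A test for this kind of defect can be sketched as follows. StripHighBytes is a deliberately defective stand-in for software that blocks upper ASCII, and SurvivesRoundTrip is a hypothetical test helper; in practice the function under test would be the product's own save/load or import/export path.

```cpp
#include <functional>
#include <string>

// A deliberately defective filter of the kind described above:
// it silently drops every byte above 127 ("upper ASCII").
std::string StripHighBytes(const std::string& in) {
    std::string out;
    for (unsigned char byte : in) {
        if (byte < 128) out.push_back(static_cast<char>(byte));
    }
    return out;
}

// Test helper: feeds every byte value 1..255 through `process` and reports
// whether the data comes back unchanged.
bool SurvivesRoundTrip(const std::function<std::string(const std::string&)>& process) {
    std::string all;
    for (int value = 1; value <= 255; ++value) {
        all.push_back(static_cast<char>(value));
    }
    return process(all) == all;
}
```

With the defective filter, the Latin-1 name "M\xFCller" (Müller) comes back as "Mller", and SurvivesRoundTrip(StripHighBytes) reports false.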

OS localization Tests

There is not just one Windows XP; there is Windows XP German, Windows XP French, etc. You need to test completely on all supported OS localizations.

Are wild cards and file filters the same across all languages?

What Service Packs need to be loaded for each OS to be compatible with your software?

Hot key tests

We may want Hot keys and Shortcuts to be different because the words on the menus are different.

“Stop” alt-S , what should it be for “Finis” or “Halt”?

Hot key conventions differ – sometimes applications just stick with the English hot key or shortcut regardless of what the local command starts with.

Garbled in translation tests

Sometimes messages are built in fragments, “In file ____, a ____ error has been detected.” may have been built from two or more different functions or output statements.

Typical English sentence structure is subject-verb-object (“S-V-O”). Sentence structure may differ from language to language; therefore, the software must be language sensitive with respect to sentence structure.

C# has positional output parameters (placeholders such as {0} and {1} in a format string) that help facilitate this. They allow you to alter the order in which parameters are output.
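
The same idea can be sketched in C++. FormatPositional below is a hypothetical helper written for illustration, not a standard library function, and it only handles single-digit placeholders {0} through {9}. The point is that the whole sentence lives in one translatable pattern, so a translator can reorder the parameters without touching the code.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Replace placeholders {0}..{9} in `pattern` with the matching entry of `args`.
// A placeholder without a matching argument is copied through unchanged.
std::string FormatPositional(const std::string& pattern,
                             const std::vector<std::string>& args) {
    std::string out;
    for (std::size_t i = 0; i < pattern.size(); ++i) {
        const bool isPlaceholder =
            i + 2 < pattern.size() && pattern[i] == '{' &&
            pattern[i + 1] >= '0' && pattern[i + 1] <= '9' &&
            pattern[i + 2] == '}';
        const std::size_t index =
            isPlaceholder ? static_cast<std::size_t>(pattern[i + 1] - '0') : 0;
        if (isPlaceholder && index < args.size()) {
            out += args[index];
            i += 2;  // skip the rest of "{n}"
        } else {
            out += pattern[i];
        }
    }
    return out;
}
```

With the arguments "report.txt" and "checksum", the pattern "In file {0}, a {1} error has been detected." and the reordered pattern "A {1} error has been detected in file {0}." both produce complete sentences from the same call.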

Error identifiers and translations tests

Make sure the error identifiers are consistent across languages (Error number 42) and that the messages are translated properly. This helps keep manuals and the software consistent.

Hyphenation Rule tests

In some languages, words that are hyphenated are spelled differently. There are also different rules on where to split words on end of line boundaries.

In other languages there are no such rules, or the rules are different.

What are the English hyphenation rules, what are the German rules, and what are the rules of any other supported language?

In English, words are generally split on the accented syllable.

Spelling Rules

Spelling rules may differ across dialects of the same language.

English U.S. v. English Canadian v. English British v. English Zimbabwe v. English Texan “ya’all fixn ta study arrd”.

You may need multiple spell checkers for your system.

Need to test and document the dialects which are supported.

Sorting Rules

Where do the characters of a specific language need to fall into a collating sequence? This needs to be localized for people to use lists naturally.

English is commonly sorted by plain ASCII value sequence, which places all upper-case letters before all lower-case letters.

How are umlauts handled in German sorting?

In Iceland, phone books are ordered by first names.
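
The effect of sorting by raw character value can be seen in a small sketch. Both function names are hypothetical. The second function only folds ASCII case, so it is not a real localized collation; it is included to show that even a simple fix leaves umlauts in the wrong place.

```cpp
#include <algorithm>
#include <cctype>
#include <string>
#include <vector>

// Plain byte-value ordering: every upper-case ASCII letter sorts before
// every lower-case one, and bytes above 127 sort after 'z'.
std::vector<std::string> SortByByteValue(std::vector<std::string> names) {
    std::sort(names.begin(), names.end());
    return names;
}

// Case-insensitive ordering for ASCII letters only. Accented letters are
// not folded, so this is still not a correct collation for German.
std::vector<std::string> SortIgnoringAsciiCase(std::vector<std::string> names) {
    auto lower = [](std::string s) {
        for (char& c : s) {
            c = static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
        }
        return s;
    };
    std::sort(names.begin(), names.end(),
              [&](const std::string& a, const std::string& b) {
                  return lower(a) < lower(b);
              });
    return names;
}
```

Given "apple", "Zebra", "banana" and the Latin-1 encoded "Ärger", byte-value sorting yields Zebra, apple, banana, Ärger. The case-insensitive version yields apple, banana, Zebra, Ärger, which still leaves the umlauted name at the end rather than near the A entries where a German reader would look for it.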

Upper/Lower Case conversions

English case conversion (toUpper, toLower) is commonly done by adding or subtracting 32 from the ASCII character.

This doesn’t work for all code sets. Some letters may not even have an uppercase/lowercase form. Upper/lower case conversions may produce “smiling faces”. What happens when you toUpper or toLower an umlauted character?
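
A minimal sketch of the problem, assuming code page 437, where ä is 0x84, Ä is 0x8E and the value 0x01 is displayed as a smiling face. Both function names are hypothetical.

```cpp
// Naive conversion described in the notes: subtract 32, with no range check.
constexpr unsigned char NaiveToUpper(unsigned char c) {
    return static_cast<unsigned char>(c - 32);
}

// ASCII-only conversion: changes 'a'..'z' and leaves every other value
// alone. It does not convert accented letters at all.
constexpr unsigned char AsciiToUpper(unsigned char c) {
    return (c >= 'a' && c <= 'z') ? static_cast<unsigned char>(c - 32) : c;
}
```

NaiveToUpper turns 'a' into 'A' as intended, but it turns '!' into 0x01 (a smiling face in code page 437) and ä (0x84) into 0x64, which is the letter 'd'. AsciiToUpper avoids the corruption but leaves ä in lower case, so a correct conversion needs a table for the active code page.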

Underscoring Rules

Underlining conventions differ from country to country. These need to be tested for locality compliance.

Printers

Printers are becoming more universal, but there are still printers that are manufactured, marketed and sold with only local support.

· This is cheaper as they don’t need as much memory to store fonts for many languages.

Printers will generally not support every character set – sometimes different character sets may be downloaded. One printer for the CSSE department has difficulties with some character sets. On the last OOA&D test the key was garbled when sent to the department’s color printer – it printed just fine on the other printer.

Testing must be aware of these non-I18N printers and test for compatibility.

Sizes of Paper

Common European paper sizes (such as A3 and A4) are metric. A4 is slightly longer and narrower than the North American 8 ½ by 11 inches: A4 is 8.27 by 11.69 inches (21 by 29.7 cm).

Many programs (like Office) allow adjustment of documents to different size forms and templates. But this generally produces non-optimal layout, formatting, and pagination.

Figures and tables can be especially problematic.

This has been a very real problem with JIM thesis work. I adjusted a JIM thesis for letter size paper and it went from 70 pages to over 200 pages.

CPUs and Video Support

This is less of a problem today, but there may still be REALLY OLD locally manufactured and marketed equipment that will behave differently.

Amstrad and Olivetti equipment in Europe often behaved slightly differently than major U.S. brand video cards.

Mouse

Mice often come with their own specific drivers. These are very low tech and there was a proliferation of local Mouse factories, each with their own specific drivers.

The major problem is with the mouse pointer supported by these non-standard drivers. Some may not support the same variety of pointers (hour-glass for busy-wait, etc.). There also may be selection differences between Eastern Asian languages and Western languages.

This is becoming less of a problem, but there are still some REALLY OLD mice out there.

Clippy

The “text-tool” captions may need to be modified if a computer is switched from one language to another.

Also hover-over text needs to switch languages.

Other peripherals

This is still a very big issue. Wireless support (cell phones: GSM, CDMA, 3G, TDMA) and other devices, recordable media support (DVDs, memory sticks, tape drives), high-speed internet, etc. have different availabilities around the world.

DVDs are region-coded at the player level. A DVD sold in Asia or Europe may not play in a US player.

Data Formats and setup Options

Time format – 12 v. 24 hour clock, colon or dashes or slashes or other separators, dates may be DD/MM/YY or MM/DD/YY, etc. need to be handled for local custom.
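
The date-order ambiguity can be made concrete with a small sketch. DateOrder and FormatDate are hypothetical names used for illustration only; a real product would normally take the format from the operating system's locale settings.

```cpp
#include <cstdio>
#include <string>

// Which field comes first in a short numeric date.
enum class DateOrder { MonthDayYear, DayMonthYear };

// Format a date as NN/NN/YY in the requested field order.
std::string FormatDate(int day, int month, int twoDigitYear, DateOrder order) {
    const int first = (order == DateOrder::MonthDayYear) ? month : day;
    const int second = (order == DateOrder::MonthDayYear) ? day : month;
    char buffer[40];
    std::snprintf(buffer, sizeof buffer, "%02d/%02d/%02d", first, second, twoDigitYear);
    return buffer;
}
```

The same day, 1 May 2011, prints as 05/01/11 in month-first order and 01/05/11 in day-first order, and a reader cannot tell which convention produced either string.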

Time zones and daylight saving time – time conventions and switchover dates are not standard (even within the US). When does Germany switch to/from daylight saving time?

Numeric Separators – In the US we use commas to separate groups of 3 digits; in much of Europe dots or spaces are used.

In the US dots are used as decimal points; in much of Europe commas are used as decimal points.

· How does this impact numeric input? How does cin >> x; behave with a comma instead of a decimal point?

All of the above can cause major problems in input and output. Expect to find problems in this area while testing.
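
The question about cin >> x can be explored without installing any system locales. The sketch below uses std::istringstream in place of cin and a custom std::numpunct facet; CommaDecimal, ParseClassic and ParseCommaDecimal are hypothetical names, and the facet is a simplified stand-in for a real European locale.

```cpp
#include <locale>
#include <sstream>
#include <string>

// Facet that makes ',' the decimal point. Digit grouping stays disabled
// (the std::numpunct default), so '.' is never consumed as a separator.
struct CommaDecimal : std::numpunct<char> {
    char do_decimal_point() const override { return ','; }
    char do_thousands_sep() const override { return '.'; }
};

// Parse with the classic "C" locale, where the decimal point is '.'.
double ParseClassic(const std::string& text) {
    std::istringstream in(text);
    in.imbue(std::locale::classic());
    double value = 0.0;
    in >> value;
    return value;
}

// Parse with a locale whose decimal point is ','.
double ParseCommaDecimal(const std::string& text) {
    std::istringstream in(text);
    in.imbue(std::locale(std::locale::classic(), new CommaDecimal));
    double value = 0.0;
    in >> value;
    return value;
}
```

With the classic locale, the input "3,5" is read as 3 and extraction stops at the comma without reporting an error, which is exactly the kind of silent defect to look for. With the comma-decimal facet the same input is read as 3.5.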

The ‘#’ sign (there are other local symbols) – is not a universal symbol for a number.

Money symbols differ – both the actual symbols (€, $, ₤, ₣, ₫) and their placement (before, after, other) in relation to the number. You may need a string of multiple characters to label currencies that share a symbol, such as $ for USD, HKD, etc.

Exchange Rates – Need to make sure that when multiple currencies can be used in the system there are rules for handling exchange rates properly as they fluctuate in real-time. Companies can lose huge amounts of money by not paying attention to currency exchange fluctuations. (Some companies make a lot of money also)

Address formats – vary from country to country. ZIP and Postal code formats vary. States may or may not be used. Once had an address from Southeast Asia that read “on the alley behind the XYZ bank”

Phone number formats – vary from country to country and parts of the country. Lived in a small town in Texas in 1979 that still only needed 4 digits dialed for local phone numbers.

· The North American dialing plan is very different from that of “most” of the rest of the world.

Personal identification – US social security numbers aren’t meaningful outside of the US.

Rulers and Measurements

Need to work in local units, make sure conversions are correct (the Mars Climate Orbiter was lost to a unit mismatch), and make sure engineering or other units are labeled to avoid misunderstandings.

Metric, English, Picas, Points, etc.

Rulers, grids, etc. in on-line word processing or graphics programs need to be supported in local measurement standards.

Culture-Bound Graphics

Some icons, art, etc may be culturally sensitive to different groups. Need to have local people test for this.

Gender sensitivity is important in some cultures – voice or images may or may not be appropriate.

· In the US we are much more sensitive to ensuring gender neutrality than the rest of the world.

Colors may have different connotations: Red in North America and Europe generally means “STOP” or “Danger”, while Red in some Eastern Asian cultures means “Good Luck” -- different decisions may be made just on color meaning.

Culture Bound output

Calendar formatting (non-Gregorian calendars), voice output, standard business forms, etc. differ from country to country.

Related to this are holidays and different religious events.

Translations of technical terminology by non-technical personnel

Legend has it that “hydraulic ram” was translated from English to Russian and then back to English and came out “water sheep”. Beware of the people translating.

Local Regulatory/Certification Compliance

Some countries are more regulated and require certification of products to specific regulations.

Automated Testing

Tools that you have used for automating testing will often need to be significantly reconfigured to work with multiple languages.

Adding a single language can often more than double the testing load for a product. This needs to be considered and planned for early in a project.

Predicting software Quality

Failure Probability – the number of times we expect to experience a failure divided by the total number of attempts. What is a good number? Is 1 in 1000 (failure probability of 0.001) acceptable? It depends on the application. If it is a system that is critical for landing an aircraft, a failure rate of 1 in 1000 is way too high.

Mean Time to Failure (MTTF) – generally based on time, but sometimes on attempts. If a system is time based, we talk about MTTF in hours, days, months or years.

Reliability – is the probability that a system will not fail under normal operation. This is not a particularly good measure, in that unless the failure probability is zero, the more we use a system the more likely it is to fail. $R = (1 - f)^{K}$, where f is the failure probability and K is the number of executions.
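
As a worked example of the reliability formula, R equals (1 minus f) raised to the power K, the sketch below computes R directly. It assumes that each execution fails independently with the same probability f; the function name is hypothetical.

```cpp
#include <cmath>

// R = (1 - f)^K: probability of K consecutive executions without failure,
// assuming each execution fails independently with probability f.
double Reliability(double failureProbability, double executions) {
    return std::pow(1.0 - failureProbability, executions);
}
```

With f = 0.001, one execution succeeds with probability 0.999, but after K = 1000 executions R has dropped to roughly 0.37, which shows how quickly reliability falls as usage grows.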

Down Time – how long a failed system is out of service. This is a function of MTTF and how long it takes a system to recover.

Dependability – is the trust that a system will not fail catastrophically. Good example is a telephone central switch. These generally rely on redundant processors and databases. If one node fails another node(s) will know that there has been a failure and can assume the processing and handling of already connected calls. A central switch may have a capacity to handle from 5 – 10 million calls per hour and may be required to have a down time of less than 5 minutes per year. If this down time is exceeded, then the manufacturer is in trouble.
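
The down-time requirement above can be expressed as an availability figure. This is a simple sketch assuming a 365-day year; the names are hypothetical.

```cpp
// Minutes in a 365-day year: 365 * 24 * 60 = 525,600.
constexpr double kMinutesPerYear = 365.0 * 24.0 * 60.0;

// Fraction of the year that the system is in service.
constexpr double Availability(double downMinutesPerYear) {
    return 1.0 - downMinutesPerYear / kMinutesPerYear;
}
```

Five minutes of down time per year corresponds to an availability of about 0.99999, that is, the switch must be in service roughly 99.999% of the time.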

Managing Testing Tools

Scripting tools

Scripting tools allow you to automate the execution of test cases.

These can range from writing driver programs and input/output simulation programs to tools that capture user interactions with a system and allow one to replay those interactions (WinRunner).

These tools generally have a capture/play back capability and allow scripting. Often a capture is done and the script is edited to really hammer the system with input. This is called stress testing.

Coverage Tools

Instrument your code to keep track of which functions, blocks of code or even which statements are executed.

These are invasive tools and may alter some aspects of a program’s behavior (certainly timings and memory footprints). (PureCoverage).
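
What instrumentation means can be sketched by hand. Real coverage tools insert equivalent probes automatically; the COVER macro, the counter table and the Absolute function below are hypothetical and are not how PureCoverage itself is implemented.

```cpp
#include <map>
#include <string>

// Hit counters keyed by probe name.
std::map<std::string, int>& CoverageCounters() {
    static std::map<std::string, int> counters;
    return counters;
}

// Probe: records that execution reached this point.
#define COVER(name) (++CoverageCounters()[name])

// Example function under test with one probe per branch.
int Absolute(int x) {
    if (x < 0) {
        COVER("Absolute:negative");
        return -x;
    }
    COVER("Absolute:non-negative");
    return x;
}
```

After a test run that only calls Absolute(5), the counter table has no entry for "Absolute:negative", which tells the tester that the negative branch was never executed. The extra map lookups also illustrate why instrumented code runs slower and uses more memory than the original.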

Test Results Need to be Recorded

Legal evidence that due diligence has been served.

In some regulated industries (FAA, FDA, DoE, DoD, etc.) one is contractually required to maintain test results data, and these results need to be delivered with the product.

Individual test engineers and their managers can be held professionally liable for falsification. A TI VP was reassigned to a desolate west Texas plant with a string trimmer for test results fraud.

Managing/Planning

Make sure there is time scheduled for developing and executing tests. If schedules slip, you cannot generally contract the time scheduled for testing.

Start test development as early as possible – with requirements and use cases.

Start testing as early as possible – finding bugs at the unit level makes it easier to correct the bugs (less code to search).

Testing early can detect misunderstandings in requirements and specifications with time to fix them.

Make sure there is time to fix defects that are found and rerun tests.
