


CIT3611 Software i18n Wk 4: Code sets, Online Help, Prototyping

David Tuffley

School of Computing & IT

Griffith University


CIT3611 Week 5: Code sets 2

Internationalisation - Basic Rules

Never hard-code translatable text

Do not reuse the same string in different contexts

1 byte ≠ 1 character ≠ 1 glyph

Watch for strings with several parameters
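A minimal sketch of these rules, assuming a hypothetical in-memory message catalog (`MESSAGES` and `copied_message` are illustrative names, not part of the course material). Named placeholders let a translator reorder several parameters, and the last lines show that one character is not necessarily one byte:

```python
# Hypothetical message catalog: in a real product these strings would live
# in resource files, not in the source code.
MESSAGES = {
    "en": "Copied {count} files to {folder}.",
    "de": "In {folder} wurden {count} Dateien kopiert.",
}


def copied_message(locale: str, count: int, folder: str) -> str:
    """Look the template up by locale and fill in the named parameters."""
    return MESSAGES[locale].format(count=count, folder=folder)


print(copied_message("en", 3, "Backup"))
print(copied_message("de", 3, "Backup"))

# 1 byte is not 1 character: U+00F1 (n with tilde) is one character,
# but it takes two bytes once encoded as UTF-8.
text = "\u00f1"
char_count = len(text)
byte_count = len(text.encode("utf-8"))
print(char_count, byte_count)
```

Because the parameters are named rather than positional, the German template can place the folder before the count without any change to the calling code.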


Internationalisation - Goals

Making sure:

Your application is able to process text from any locale

The interface can be localised without changes in the source code

The documents or data created by your application are easy to localise


Internationalisation - Code Sets

A character set is like a "bag" of characters. Example: A, B, d, ñ

A code set (also called a coded character set or code page) contains the same characters as the character set, but a specific value, the code (or code point), is assigned to each character. Example: A=65, B=66, d=100, ñ=241
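The codes in the example can be checked with a short sketch. It assumes Python's built-in `ord`, which returns the Unicode code point; for these four characters that value is the same as the Latin-1 code:

```python
# The "bag" of characters from the example (U+00F1 is n with tilde).
characters = ["A", "B", "d", "\u00f1"]

# Assign each character its code: ord() gives the Unicode code point.
codes = {ch: ord(ch) for ch in characters}
for ch, code in codes.items():
    print(f"{ch} = {code}")

# The same value is the single byte used by the Latin-1 code set.
latin1_byte = "\u00f1".encode("latin-1")[0]
print(latin1_byte)
```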


Code Sets - Get Your Facts Straight

The vocabulary pertaining to code sets is often used incorrectly.

The terms code set and code page are interchangeable.

Microsoft documentation is confusing regarding code sets.

Nadine Kano's book helps


Windows "ANSI" is not the real ANSI

The first version of Windows used ISO-8859-1 (Latin-1) as its code set. Then Microsoft introduced 24 extra characters (at codes in the range 0x80 to 0x9F) that are not part of Latin-1.

This is noticeable in some of the fonts still shipped with Windows: MS Sans Serif has no glyph defined for these code points.

The code set for Windows US should be called Windows Latin-1, or code page 1252.
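The difference between the two code sets can be seen with a small sketch, assuming Python's bundled `cp1252` and `latin-1` codecs (the helper `decode_or_none` is an illustrative name). The codec table reflects the current definition of code page 1252, so the number of characters it lists may differ from the figure quoted above:

```python
def decode_or_none(byte_value: int, codec: str):
    """Decode a single byte, returning None if the code is unassigned."""
    try:
        return bytes([byte_value]).decode(codec)
    except UnicodeDecodeError:
        return None


# Collect the printable characters Windows places in the range 0x80-0x9F.
cp1252_extras = {}
for b in range(0x80, 0xA0):
    ch = decode_or_none(b, "cp1252")
    if ch is not None:
        cp1252_extras[b] = ch

for b, ch in cp1252_extras.items():
    print(f"0x{b:02X} -> {ch} (U+{ord(ch):04X})")

# In Latin-1 the same byte maps to a control code, not a printable character.
latin1_char = bytes([0x93]).decode("latin-1")
print(f"Latin-1 0x93 -> U+{ord(latin1_char):04X}")
```

For example, byte 0x93 is a left double quotation mark in code page 1252 but a control code in Latin-1, which is why text with "smart quotes" breaks when it is treated as Latin-1.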


"ANSI" not "Windows code set"

Some documents call the Windows code set "ANSI" even though, on a differently localised version of Windows, it is actually the Windows Cyrillic, Windows Greek or Windows Turkish code set.

Just as the documentation uses "OEM" to refer to the DOS code page, it should use a generic term for the Windows code set, rather than "ANSI."


Don’t use ‘character sets’ or ‘charsets’ when you mean code sets

A code set is an implementation of the character set

Several code sets can implement the same character set. In this case, the list of characters supported is the same, but the codes are different. E.g. UCS-2 and UTF-8 are two different code sets, but they both implement the Unicode character set.
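A short sketch of the UCS-2 versus UTF-8 example. Python has no codec named UCS-2, so this assumes `utf-16-be` as a stand-in; for characters in the basic 16-bit range the two produce the same bytes:

```python
ch = "\u00f1"  # n with tilde, one character in the Unicode character set

utf8_bytes = ch.encode("utf-8")       # two bytes: C3 B1
ucs2_bytes = ch.encode("utf-16-be")   # two bytes: 00 F1 (UCS-2 stand-in)

print(utf8_bytes.hex(" "))
print(ucs2_bytes.hex(" "))

# Different codes, same character: both decode back to the same text.
same_character = utf8_bytes.decode("utf-8") == ucs2_bytes.decode("utf-16-be")
print(same_character)
```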


Don’t mix up file format and file code set

People mix up the content and the container: the format of the file and its code set. They will say: "I saved this file in ASCII" when they really mean "I saved this file in Plain text." A plain text file could be in ASCII, but can also contain extended characters.
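The distinction can be illustrated with a sketch that treats two byte strings as the contents of plain text files (the helper `is_pure_ascii` is a hypothetical name, and code page 1252 is only an assumed example of a code set with extended characters):

```python
def is_pure_ascii(data: bytes) -> bool:
    """True if every byte is in the 7-bit ASCII range (0x00-0x7F)."""
    return all(b < 0x80 for b in data)


# Both values are "plain text": no formatting, just character codes.
plain_ascii = "Hello".encode("cp1252")
plain_extended = "caf\u00e9".encode("cp1252")  # contains e with acute accent

print(is_pure_ascii(plain_ascii))
print(is_pure_ascii(plain_extended))
```

Both byte strings are plain text, but only the first one is ASCII.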


Code Set - Families

DOS

ISO

Macintosh

Windows

IBM mainframe
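As an illustration, the sketch below picks one representative code set per family from the codecs bundled with Python and shows the code each gives to the same character. The choice of representatives (`cp437`, `iso-8859-1`, `mac_roman`, `cp1252`, `cp037`) is an assumption made for the example; each family contains many more code sets:

```python
# One example codec per family (an illustrative selection only).
FAMILY_EXAMPLES = {
    "DOS": "cp437",
    "ISO": "iso-8859-1",
    "Macintosh": "mac_roman",
    "Windows": "cp1252",
    "IBM mainframe": "cp037",
}

ch = "\u00e9"  # e with acute accent

# Encode the same character in each code set and keep its single-byte code.
codes = {family: ch.encode(codec)[0] for family, codec in FAMILY_EXAMPLES.items()}

for family, code in codes.items():
    print(f"{family:14} {FAMILY_EXAMPLES[family]:11} 0x{code:02X}")
```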


Code Sets - Unicode

Unicode is an international character set

It has the principal scripts of the world

The Unicode standard is the foundation for the internationalisation and localisation of software

There are three levels of support for Unicode:

1: Combining characters not allowed

2: Avoid duplicate coded representations

3: All combining characters are allowed


Han unification

To fit the tens of thousands of Chinese, Japanese and Korean ideograms in a space of 64K (65,536) codes, Unicode uses Han unification: because the Japanese and Korean characters are derived from the Chinese characters, a character shared by these languages is encoded only once.

In many cases the same symbol will mean the same thing.


Character Composition

To support complex characters with diacritics, Unicode defines a generic way to encode a complex character. Instead of coding it in whole (precomposed) form, you can code any character with diacritics as a base character followed by non-spacing marks.

Character composition is used, for example, to encode the Vietnamese characters.
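A brief sketch of composition using Python's standard `unicodedata` module. The Vietnamese letter chosen here (U+1EC7, e with circumflex and dot below) is an illustrative pick; normalisation form NFD splits it into a base letter plus non-spacing marks, and NFC recombines them:

```python
import unicodedata

composed = "\u1ec7"  # Vietnamese e with circumflex and dot below

# NFD: decompose into the base letter followed by its combining marks.
decomposed = unicodedata.normalize("NFD", composed)
code_points = [f"U+{ord(c):04X}" for c in decomposed]
print(code_points)

# Non-spacing marks have a non-zero combining class; the base letter has 0.
mark_flags = [unicodedata.combining(c) != 0 for c in decomposed]
print(mark_flags)

# NFC: recompose the sequence into the single precomposed character.
recomposed = unicodedata.normalize("NFC", decomposed)
print(recomposed == composed)
```

The two forms display identically but compare as different strings unless both are normalised first, which matters for searching and sorting.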


Surrogates

Hopefully you will not have to deal with surrogates. They are the mechanism put in place in Unicode to access the additional planes of ISO-10646. You can see them as "double-bytes," except they are double-wide-chars.
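For reference, the surrogate mechanism can be sketched as follows. The function name `to_surrogates` is hypothetical, and the result is checked against Python's own `utf-16-be` encoder using U+1D11E (musical symbol G clef) as a sample character from an additional plane:

```python
def to_surrogates(code_point: int) -> tuple:
    """Split a code point above U+FFFF into a (high, low) surrogate pair."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("only code points beyond U+FFFF need surrogates")
    offset = code_point - 0x10000          # 20 significant bits remain
    high = 0xD800 + (offset >> 10)         # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)        # bottom 10 bits
    return high, low


high, low = to_surrogates(0x1D11E)
print(f"U+1D11E -> {high:04X} {low:04X}")

# Cross-check against the built-in encoder: two 16-bit units, four bytes.
encoded = "\U0001D11E".encode("utf-16-be")
print(encoded.hex(" "))
```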


Code Sets - Conversion

Converting from one code set to another is easy when you are only dealing with single-byte code sets.
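A minimal sketch of a single-byte conversion, assuming Python's built-in codecs and a hypothetical helper `convert` that goes through Unicode as the intermediate form. It converts text from a DOS code page (437) to the Windows code page (1252):

```python
def convert(data: bytes, source: str, target: str, errors: str = "strict") -> bytes:
    """Re-encode bytes from one code set to another via Unicode."""
    return data.decode(source).encode(target, errors)


dos_bytes = b"caf\x82"                      # "cafe" with e-acute in code page 437
windows_bytes = convert(dos_bytes, "cp437", "cp1252")
print(windows_bytes)

# Even the easy case has a limit: a character with no equivalent in the
# target code set (here a DOS box-drawing line) cannot be converted exactly.
try:
    convert(b"\xc4", "cp437", "cp1252")
    lossless = True
except UnicodeEncodeError:
    lossless = False
print(lossless)

# With errors="replace" the unmappable character becomes a question mark.
replaced = convert(b"\xc4", "cp437", "cp1252", errors="replace")
print(replaced)
```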


Screen-based help

plain text "Read Me" files, tutorial files, custom integrated help, sample files and stand-alone hypertext help.


General Guidelines

Text Expansion

Jargon, Humor, Use of Gender- or Culture-Related Roles, Characteristics, or Issues

Consistency with Software, Hardware, and Documentation

Hypertext Links

Text Styles and Formatting


General Guidelines cont.

On-Screen Controls

File Format


Windows Online Help

"Title" Footnote Text

"Keyword List" Footnote Text

Definitions (Pop-up Topics)


Prototyping the key to success

Effective prototyping may be the most valuable core competence an innovative organisation can hope to have (Michael Schrage)

‘Spec Driven’: put much effort into developing a specification before proceeding with production

‘Prototype Driven’: begin with an early prototype, then proceed with many iterations


Prototyping the essential medium of:

Information transmission

Interaction

Integration

Collaboration


Work as play, play as work

You can ‘play your way’ to successful, innovative product development

At odds with traditional management models that champion predictability and control


Supported by research

Research by Tabrizi & Eisenhardt (Stanford) looked at 72 product development projects in 36 companies in Asia, North America and Europe

Most effective were those that iterated constantly

Least effective were the hyper-organised "plan, plan, plan" planners

Strong prototyping cultures therefore produce strong products