Black Hat USA July 2009 Chris Weber www.lookout.net [email protected] Casaba Security Unraveling Unicode: A Bag of Tricks for Bug Hunting


Unraveling Unicode: A Bag of Tricks for Bug Hunting

Black Hat USA, July 2009

Chris Weber
www.lookout.net
chris@casabasecurity.com
Casaba Security

Black Hat USA - July 2009 © 2009 Chris Weber www.casabasecurity.com

Can you tell the difference?

How about now?

The Transformers: When good input turns bad

<scrİpt>

becomes

<script>

Unicode Transformations: Agenda

• Unicode crash course
• Root Causes
• Attack Vectors
• Tools
  – Find Unicode issues in Web-testing
  – Visual Spoofing Detection


Unicode Crash Course: The Unicode Attack Surface

• End users
• Applications
• Databases
• Programming languages
• Operating Systems

Unicode Crash Course: Unthink it

Unicode Crash Course

• A large and complex standard

code points, encodings, categorization, normalization, binary properties, case mapping, conversion tables, bi-directional properties, canonical mappings, decomposition types, case folding, best-fit mapping, 17 planes, private use ranges, script blocks, escapings

Unicode Crash Course: Code pages and charsets

Shift_JIS
GB2312
ISCII
Windows-1252
ISO-8859-1
EBCDIC 037

Unicode Crash Course: Ad Infinitum

• Unicode can represent them all
• ASCII range is preserved
  – U+0000 to U+007F are mapped to ASCII

Unicode Crash Course: Code points

• Unicode 5.1 uses a 21-bit scalar value with space for over 1,100,000 code points

U+0000 to U+10FFFF

Unicode Crash Course: Code Points

A = U+0041

Every character has a unique number
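The code point model on these slides can be illustrated with a minimal Python sketch. Python is chosen here only for illustration and the helper name `code_point_label` is hypothetical; none of this code comes from the talk.

```python
import sys

def code_point_label(ch: str) -> str:
    """Format one character as U+XXXX (hypothetical helper for illustration)."""
    return f"U+{ord(ch):04X}"

# Every character maps to exactly one number, and back again.
print(code_point_label("A"))        # U+0041
print(chr(0x41))                    # A

# The code space runs from U+0000 to U+10FFFF (17 planes of 65,536).
print(sys.maxunicode == 0x10FFFF)   # True
print(0x10FFFF + 1)                 # 1114112
```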

Unicode Crash Course

A: U+0041

Unicode Crash Course

ſ: U+017F

Unicode Crash Course: Encodings

UTF-8
  – Variable width, 1 to 4 bytes (used to be 6)

UTF-16
  – Endianness
  – Variable width, 2 or 4 bytes
  – Surrogate pairs

UTF-32
  – Endianness
  – Fixed width, 4 bytes
  – Fixed mapping, no algorithms needed
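To make the width differences concrete, here is a small sketch that measures one character in each encoding form. The function name `encoded_sizes` is an assumption made for this example; the big-endian codec variants are used so that no byte order mark is counted.

```python
def encoded_sizes(ch: str) -> dict:
    """Return the byte length of one character in each Unicode encoding form."""
    return {
        "utf-8": len(ch.encode("utf-8")),
        "utf-16": len(ch.encode("utf-16-be")),  # explicit endianness, no BOM
        "utf-32": len(ch.encode("utf-32-be")),
    }

print(encoded_sizes("A"))           # ASCII range
print(encoded_sizes("\uff21"))      # BMP character outside ASCII
print(encoded_sizes("\U0001d160"))  # supplementary plane: surrogate pair in UTF-16

# Endianness changes the byte order, not the code point.
print("A".encode("utf-16-be").hex())  # 0041
print("A".encode("utf-16-le").hex())  # 4100
```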

Unicode Crash Course: Encodings and Escape sequences

U+FF21 FULLWIDTH LATIN CAPITAL LETTER A

%EF%BC%A1
&#xFF21;
&#65313;
\xEF\xBC\xA1
\uFF21
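The escape forms listed for U+FF21 can be derived mechanically. The sketch below is one way to produce them in Python; `escape_forms` and its dictionary keys are names invented for this example.

```python
from urllib.parse import quote

def escape_forms(c: str) -> dict:
    """Return common escaped spellings of a single character."""
    utf8 = c.encode("utf-8")
    return {
        "percent": quote(c),                                   # URL percent-encoding of the UTF-8 bytes
        "html_hex": f"&#x{ord(c):X};",                         # hexadecimal character reference
        "html_dec": f"&#{ord(c)};",                            # decimal character reference
        "byte_escape": "".join(f"\\x{b:02X}" for b in utf8),   # per-byte escape
        "unicode_escape": f"\\u{ord(c):04X}",                  # code point escape (BMP only)
    }

for name, form in escape_forms("\uff21").items():
    print(name, form)
```

A filter that recognizes only one of these spellings will miss the others, which is why each decoding layer matters when testing.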


Unicode Transformations: Overview

• Unicode crash course
• Root Causes
  – Visual Spoofing and IDNs
  – Best-fit mappings
  – Normalization
  – Overlong UTF-8
  – Over-consumption
  – Character substitution
  – Character deletion
  – Casing
  – Buffer overflows
  – Controlling Syntax
  – Charset transformations
  – Charset mismatches
• Tools

Root Causes: Visual Spoofing

• Over 100,000 assigned characters
• Many lookalikes within and across scripts

AΑАᐱᗅᗋᗩᴀᴬA

Attack Vectors: IDN homograph attacks

Some browsers allow .COM IDNs based on script family
  – (Latin has a big family)

Attack Vectors: IDN homograph attacks

Safari

Attack Vectors: IDN homograph attacks

Opera

Attack Vectors: IDN homograph attacks

www.google.com is not www.gooɡle.com

Latin U+0067
Latin U+0261

Root Causes: The state of International Domain Names

ICANN guidelines v2.0
  – Inclusion-based
  – Script limitations
  – Character limitations

Deny-all default seems to be the right concept.

A script can cross many blocks. Even with limited script choices there's plenty to choose from.

Great for domain labels, but sub-domain labels are still open to punctuation and syntax spoofing.

Attack Vectors: Visual spoofing Vectors

• Non-Unicode attacks
• Confusables
• Invisibles
• Problematic font-rendering
• Manipulating Combining Marks
• Bidi and syntax spoofing

Attack Vectors: Non-Unicode homograph attacks

rn can look like m in certain fonts

www.mullets.com is not www.rnullets.com

Latin U+006D
Latin U+0072 U+006E

Attack Vectors: Non-Unicode homograph attacks

Are you using mono-width fonts?

0 and O
1 and l
5 and S

Attack Vectors: Non-Unicode homograph attacks

Classic long URLs

httploginfacebookintvitationvideomessageid-h048892r39sessionnfbidcomhomehtmdisbursements

Attack Vectors: Single-script and The Confusables

www.ɑpple.com
All Latin, using Latin small letter alpha ‘ɑ’

www.faϲebook.com
Mixed Latin/Greek, with lunate sigma symbol ‘ϲ’

www.аЬс.com
All Cyrillic ‘аЬс’
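A rough way to spot the mixed-script case is to look at which scripts a label draws from. The sketch below approximates the script by taking the first word of each character's Unicode name; this is a heuristic assumed for illustration, not the real Unicode Script property, and the function names are hypothetical.

```python
import unicodedata

def scripts_used(label: str) -> set:
    """Approximate the scripts in a label from the first word of each character name."""
    found = set()
    for ch in label:
        if ch.isalpha():
            found.add(unicodedata.name(ch).split()[0])
    return found

def is_mixed_script(label: str) -> bool:
    """True when letters from more than one (approximate) script appear."""
    return len(scripts_used(label)) > 1

print(scripts_used("fa\u03f2ebook"))         # Latin plus Greek lunate sigma
print(is_mixed_script("\u0251pple"))         # all Latin, so not flagged
print(scripts_used("\u0430\u042c\u0441"))    # all Cyrillic, so not flagged
```

Note that the single-script examples pass this check, which is the point of the slide: script mixing is only one signal, and whole-script lookalikes need a confusables table.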

Attack Vectors: IDN homograph attacks

Browsers whitelist .ORG

Attack Vectors: IDN homograph attacks

Others don't necessarily, but…

Attack Vectors: IDN homograph attacks

www.mozilla.org is not www.mozílla.org

Latin U+0069
Latin U+00ED

Attack Vectors: IDN Syntax Spoofing with lookalikes

(This case doesn't work anymore)

http://www.google.com／path／file.nottrusted.org

FULLWIDTH SOLIDUS U+FF0F

Attack Vectors: IDN Syntax Spoofing with lookalikes

(Normalized to a U+002F)

http://www.google.com/path/file.nottrusted.org

SOLIDUS U+002F

Attack Vectors: IDN Syntax Spoofing with lookalikes

(However, punctuation not required…)

http://www.google.comノpathノfile.nottrusted.org

Katakana No U+FF89
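The difference between the two lookalikes can be checked with compatibility normalization. This is a minimal sketch using Python's `unicodedata`; the helper name `nfkc` and the shortened host strings are assumptions for the example.

```python
import unicodedata

def nfkc(text: str) -> str:
    """Apply NFKC (compatibility composition) to a string."""
    return unicodedata.normalize("NFKC", text)

# FULLWIDTH SOLIDUS folds to the ASCII slash, so it stops looking like part of the host.
print(nfkc("www.google.com\uff0fpath") == "www.google.com/path")   # True

# HALFWIDTH KATAKANA LETTER NO folds to the ordinary katakana letter, not to punctuation,
# so it survives normalization and still resembles a slash in many fonts.
print(nfkc("\uff89") == "\u30ce")                                  # True
print("/" in nfkc("www.google.com\uff89path"))                     # False
```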

Attack Vectors: The Invisibles

Attack Vectors: Visual Spoofing with Bidi Explicit Directional Overrides

Root Causes: Best-fit mappings

Commonly occur in charset transformations and even innocuous APIs

Impact: Filter evasion, Enable code execution

When σ becomes s
U+03C3 GREEK SMALL LETTER SIGMA

When ′ becomes '
U+2032 PRIME

Root Causes: Guidance for Best-Fit mappings

• Scrutinize character/charset manipulation APIs
• Use EncoderFallback with System.Text.Encoding
• Set WC_NO_BEST_FIT_CHARS flag with WideCharToMultiByte()
• Use Unicode end-to-end
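The guidance above names .NET and Win32 APIs. As a comparable sketch in Python (an assumption of this example, not something the talk covers), strict encoding refuses characters the target charset cannot represent instead of quietly choosing a lookalike. The function name `to_legacy` is hypothetical.

```python
from typing import Optional

def to_legacy(text: str, charset: str = "windows-1252") -> Optional[bytes]:
    """Encode strictly; return None rather than substituting a best-fit character."""
    try:
        return text.encode(charset, errors="strict")
    except UnicodeEncodeError:
        return None

print(to_legacy("caf\u00e9"))   # representable in Windows-1252
print(to_legacy("\u03c3"))      # GREEK SMALL LETTER SIGMA: rejected, not mapped to 's'
print(to_legacy("\u2032"))      # PRIME: rejected, not mapped to an apostrophe
```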

Case Study: Social Networking (Best-fit mappings)

• A popular social networking site in 2008
• Implemented complex filtering logic to prevent XSS
  – Attack: Filter evasion, code execution
  – Exploit: Bypass filtering logic with best-fit mappings to leverage cross-site scripting
  – Root Cause: best-fit mappings

Case Study: Social Networking (Best-fit mappings)

-moz-binding()

was not allowed, but…

-[U+FF4D]oz-binding()

would best-fit map
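The ordering problem in this case study can be reproduced in a few lines. The site's actual filter is not public, so everything below is a hypothetical stand-in: `naive_filter` is an invented blocklist check, and NFKC normalization substitutes for the best-fit conversion because it also folds U+FF4D to `m`.

```python
import unicodedata

BLOCKED = "-moz-binding"

def naive_filter(css: str) -> bool:
    """Return True when the input passes a simple blocklist check."""
    return BLOCKED not in css.lower()

def later_transform(css: str) -> str:
    """Stand-in for a best-fit conversion applied after filtering (assumption)."""
    return unicodedata.normalize("NFKC", css)

payload = "-\uff4doz-binding()"

print(naive_filter(payload))                    # the filter sees nothing wrong
print(BLOCKED in later_transform(payload))      # the dangerous token appears afterwards
print(naive_filter(later_transform(payload)))   # transforming first would have caught it
```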

Root Causes: Normalization

Normalizing strings after validation is dangerous

Impact: Filter evasion, Enable code execution

Root Causes: Normalization

İ becomes I + ◌̇

U+0130 becomes U+0049 U+0307

Root Causes: Normalization

But are there dangerous characters?

You bet… with NFKC and NFKD you could control HTML or other parsing

﹤ becomes <

U+FE64 becomes U+003C

Root Causes: Normalization

﹤ becomes <

U+FE64 becomes U+003C

toNFKC("﹤script>") = "<script>"

Root Causes: Guidance for Normalization

Normalize strings before validation

NFKC: first defense against visual spoofing
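The normalization examples above can be checked directly with Python's `unicodedata` module. The validator below is a deliberately simple sketch with a hypothetical name; it only shows the "normalize first, then validate" ordering.

```python
import unicodedata

def validate_after_normalizing(text: str) -> bool:
    """Normalize to NFKC first, then reject angle brackets."""
    normalized = unicodedata.normalize("NFKC", text)
    return "<" not in normalized and ">" not in normalized

# Canonical decomposition: U+0130 becomes U+0049 U+0307.
print(unicodedata.normalize("NFD", "\u0130") == "I\u0307")                 # True

# Compatibility normalization: U+FE64 becomes U+003C.
print(unicodedata.normalize("NFKC", "\ufe64script>") == "<script>")       # True

# Checking the raw string misses the small less-than sign; checking after NFKC does not.
print("<" in "\ufe64script\ufe65")                                         # False
print(validate_after_normalizing("\ufe64script\ufe65"))                    # False
```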

Root Causes: Non-shortest form UTF-8

Non-shortest or overlong UTF-8

Impact: Filter evasion, Enable code execution

Application gets C0 A7
OS/Framework sees 27
Database gets '

Root Causes: Guidance for Non-shortest form UTF-8

• Unicode specification forbids
  – Generation of non-shortest form
  – Interpretation of non-shortest form for BMP
• Validate UTF-8 encoding (throw on error)
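The sketch below shows why C0 A7 is dangerous and what "validate and throw" looks like. `lenient_two_byte_decode` is a hypothetical, intentionally unsafe decoder written for this example; `strict_decode` simply wraps Python's built-in UTF-8 codec, which rejects overlong forms.

```python
def lenient_two_byte_decode(lead: int, trail: int) -> int:
    """Decode a 2-byte sequence WITHOUT the shortest-form check (unsafe on purpose)."""
    return ((lead & 0x1F) << 6) | (trail & 0x3F)

def strict_decode(data: bytes) -> str:
    """Validate UTF-8 and raise UnicodeDecodeError on ill-formed input."""
    return data.decode("utf-8", errors="strict")

# The overlong pair C0 A7 carries the payload bits of U+0027, the apostrophe.
print(hex(lenient_two_byte_decode(0xC0, 0xA7)))   # 0x27

try:
    strict_decode(b"\xc0\xa7")
except UnicodeDecodeError as exc:
    print("rejected:", exc.reason)
```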

Attack Vectors: Directory traversal

How many ways can you say

Attack Vectors

Root Causes: Handling the Unexpected

• Unassigned code points
  – U+2073
• Illegal code points
  – Half a surrogate pair
• Code points with special meaning
  – U+FEFF is the BOM
• Impact: Filter evasion, Enable code execution

Root Causes: Handling the Unexpected: Over-consumption

Over-consuming ill-formed byte sequences

Big problem with MBCS lead bytes

<41 C2 3E 41> becomes <41 41>

Root Causes: Handling the Unexpected: Over-consumption

<img src=[0xC2]> onerror=alert(1)<br >

becomes

<img src=> onerror=alert(1)<br >
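To illustrate the byte sequence above, the following sketch contrasts a hypothetical decoder that swallows whatever follows a lead byte with Python's built-in codec, which replaces only the ill-formed byte. `over_consuming_decode` is written for this example and handles only ASCII and two-byte sequences.

```python
def over_consuming_decode(data: bytes) -> str:
    """Toy decoder with the over-consumption bug (ASCII and 2-byte forms only)."""
    out = []
    i = 0
    while i < len(data):
        b = data[i]
        if b < 0x80:
            out.append(chr(b))
            i += 1
        elif 0xC2 <= b <= 0xDF:
            if i + 1 < len(data) and 0x80 <= data[i + 1] <= 0xBF:
                out.append(chr(((b & 0x1F) << 6) | (data[i + 1] & 0x3F)))
            # BUG: the next byte is skipped even when it is not a valid trail byte.
            i += 2
        else:
            i += 1
    return "".join(out)

data = bytes([0x41, 0xC2, 0x3E, 0x41])

print(over_consuming_decode(data))                          # the '>' (0x3E) disappears
print(ascii(data.decode("utf-8", errors="replace")))        # 'A\ufffd>A': the '>' survives
```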

Root Causes: Handling the Unexpected: Character-substitution

Correcting insecurely rather than failing
  – Substituting a ‘’ or a ‘’ would be bad

Root Causes: Handling the Unexpected: Character-deletion

"deletion of noncharacters" (UTR-36)

Root Causes: Handling the Unexpected: Character-deletion

<scr[U+FEFF]ipt> becomes <script>

Root Causes: Solutions for Handling the Unexpected

• Fail or error
• Use U+FFFD instead
  – A common alternative is ‘’ which can be safe
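The difference between deleting and replacing an unexpected character can be sketched as follows. The filter and both clean-up functions are hypothetical names made up for this example.

```python
BOM = "\ufeff"

def is_blocked(markup: str) -> bool:
    """Toy blocklist check for a script tag."""
    return "<script" in markup.lower()

def delete_bom(markup: str) -> str:
    """Unsafe: silently deleting U+FEFF can join two harmless halves into a tag."""
    return markup.replace(BOM, "")

def replace_bom(markup: str) -> str:
    """Safer: substitute U+FFFD so the surrounding text cannot fuse."""
    return markup.replace(BOM, "\ufffd")

payload = "<scr\ufeffipt>"

print(is_blocked(payload))                # False: the filter does not see a tag
print(is_blocked(delete_bom(payload)))    # True: deletion recreated <script>
print(is_blocked(replace_bom(payload)))   # False: replacement keeps the halves apart
```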

Attack Vectors: Filter evasion

• Bypass filters, WAFs, NIDS, and validation
• Exploit delivery techniques
  – E.g. Cross-site scripting (buffer overflow of the Web)

Case Study: Apple and Mozilla

Safari and Firefox BOM consumption
  – Attack: Filter evasion, code execution
  – Exploit: Bypass filtering logic with specially crafted strings to leverage cross-site scripting
  – Root Cause: Character deletion

<a href="java[U+FEFF]script:alert('XSS')>

Can be nastier

<a h[U+FEFF]ref="java[U+FEFF]script:al[U+FEFF]ert('XSS')>

A Closer Look: The BOM

BOM U+FEFF

Root Causes: Casing

• Attackers manipulate casing operations to inject otherwise prohibited characters
• Casing can multiply the buffer sizes needed
• Impact: Filter evasion, Enable code execution

Root Causes: Casing

toLower("İ") == "i"

toLower("scrİpt") == "script"

Root Causes: Casing

len(x) != len(toLower(x))

Root Causes: Guidance for Casing

• Perform casing operations before validation
• Leverage existing frameworks and APIs
  – ICU, .NET
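The length claim is easy to confirm. The sketch below uses Python's built-in case mapping, which is an assumption of this example; results for U+0130 differ between platforms, and Python maps it to `i` followed by U+0307 rather than to a bare `i` as on the slide.

```python
def case_lengths(ch: str) -> tuple:
    """Return (original, lowercased, uppercased) lengths in code points."""
    return (len(ch), len(ch.lower()), len(ch.upper()))

print(case_lengths("\u0130"))   # LATIN CAPITAL LETTER I WITH DOT ABOVE grows when lowercased
print(case_lengths("\u00df"))   # LATIN SMALL LETTER SHARP S grows when uppercased
print(case_lengths("\u0390"))   # GREEK SMALL LETTER IOTA WITH DIALYTIKA AND TONOS triples

# Python's result for the slide's example: 'i' plus a combining dot, not plain 'i'.
print("\u0130".lower() == "i\u0307")   # True
```

Whatever the platform's mapping is, the safe order stays the same: apply the casing operation first, then validate and size buffers from the result.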

Root Causes: Buffer Overflows

• Incorrect assumptions about string sizes (chars vs bytes)
• Improper width calculations
• Impact: Enable code execution

Root Causes: Buffer Overflows

Casing - maximum expansion factors

Operation   UTF         Factor   Sample
Lower       8           1.5      Ⱥ U+023A
            16, 32      1        A U+0041
Upper       8, 16, 32   3        ΐ U+0390

Source: Unicode Technical Report 36

Root Causes: Buffer Overflows

Normalization - maximum expansion factors

Operation   UTF      Factor   Sample
NFC         8        3X       𝅘𝅥𝅮 U+1D160
            16, 32   3X       שּׁ U+FB2C
NFD         8        3X       ΐ U+0390
            16, 32   4X       ᾂ U+1F82
NFKC/NFKD   8        11X      ﷺ U+FDFA
            16, 32   18X      ﷺ U+FDFA

Source: Unicode Technical Report 36

Root Causes: Guidance for Buffer Overflows

• Know the difference between bytes and chars
• Secure coding
• Leverage existing frameworks and APIs
  – ICU, .NET
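Several of the expansion factors in the table can be reproduced by measuring a string before and after normalization. The helper name `expansion` is an assumption for this sketch; the factor is computed in encoded bytes, which for UTF-16 is proportional to code units.

```python
import unicodedata

def expansion(ch: str, form: str, encoding: str) -> float:
    """Ratio of encoded size after normalization to encoded size before."""
    before = len(ch.encode(encoding))
    after = len(unicodedata.normalize(form, ch).encode(encoding))
    return after / before

print(expansion("\ufdfa", "NFKC", "utf-8"))       # 11.0
print(expansion("\ufdfa", "NFKC", "utf-16-be"))   # 18.0
print(expansion("\u1f82", "NFD", "utf-16-be"))    # 4.0
print(expansion("\u0390", "NFD", "utf-8"))        # 3.0
```

A buffer sized from the input length alone is therefore too small in the worst case; sizing from the normalized result avoids the guesswork.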

Root Causes: Controlling Syntax

• White space and line breaks
  – E.g. when U+180E acts like U+0020
• Quotation marks
• Impact: Filter evasion, Enable code execution

Attacks and Exploits: Controlling syntax

• Manipulate HTML parsers and JavaScript interpreters
• Control protocols

Case Study: Opera

• Unicode formatter characters exploited for XSS
  – Damage: Filter evasion, controlling syntax
  – Exploit: Bypass filtering logic with specially crafted characters to leverage cross-site scripting
  – Root Cause: Interpreting "white space"
  – A problem with HTML 4.0 spec

Case Study: Opera

<a href=[U+180E]onclick=alert()>

Case Study: Opera

MVS U+180E

Root Causes: Guidance for Controlling Syntax

• Question specifications
• Be careful…
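One defensive reading of "be careful" is to refuse any character that a parser might treat as a separator. The sketch below is an assumed policy, not guidance from the talk: it flags every character in a separator (Z*) or control/format/unassigned (C*) general category except the plain space, which covers U+180E regardless of how a given Unicode version classifies it.

```python
import unicodedata

def has_unexpected_separator(value: str) -> bool:
    """Flag separator (Z*) or control/format (C*) characters other than U+0020."""
    for ch in value:
        if ch == " ":
            continue
        if unicodedata.category(ch)[0] in ("Z", "C"):
            return True
    return False

print(has_unexpected_separator("#\u180eonclick=alert()"))   # True
print(has_unexpected_separator("index.html"))               # False
print(has_unexpected_separator("a\u00a0b"))                 # True: NO-BREAK SPACE
```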

Root Causes: Specifications

1) Character stability
  – IDNA/Nameprep based on Unicode 3.2

2) Designs
  – Specs are carefully designed but not always perfect
    • This could have been a problem
      – "When designing a markup language or data protocol, the use of U+FEFF can be restricted to that of Byte Order Mark. In that case, any U+FEFF occurring in the middle of the file can be ignored, or treated as an error."
  – HTML 4.01
    • Defines four whitespace characters and explicitly leaves handling other characters up to implementer

Root Causes: Charset Transformations

• Converting between charsets is dangerous
• Mapping tables and algorithms vary across platforms
• Impact: Filter evasion, Enable code execution, Data-loss

Root Causes: Guidance for Charset Transformations

• Avoid if possible
• Use Unicode as the broker
• Beware the PUA mappings
• Transform case and normalize prior to validation and redisplay

Root Causes: Charset Mismatches

• Some charset identifiers are ill-defined
• Vendor implementations vary
• User-agents may sniff if confused
• Attackers manipulate behavior
• Impact: Filter evasion, Enable code execution

Root Causes: Charset Mismatches

Content-Type: charset=ISO-8859-1

<meta http-equiv="Content-Type" content="text/html; charset=shift_jis">

Attacker-controlled input

Root Causes: Guidance for Charset Mismatches

• Force UTF-8
• Error if uncertain
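The danger of a header and a meta tag disagreeing is that the same bytes mean different things under each charset. The following sketch picks one well-known byte pair as an illustration; the helper name `decode_as` is an assumption.

```python
def decode_as(data: bytes, charset: str) -> str:
    """Decode strictly under the given charset label."""
    return data.decode(charset, errors="strict")

data = b"\x83\x5c"

as_sjis = decode_as(data, "shift_jis")      # one katakana character, U+30BD
as_latin1 = decode_as(data, "iso-8859-1")   # a C1 control followed by a backslash

print(len(as_sjis), len(as_latin1))         # 1 2
print("\\" in as_sjis)                      # False
print("\\" in as_latin1)                    # True: a syntax character appears
```

A filter that inspects the bytes as Shift_JIS sees no backslash, while a consumer that treats them as ISO-8859-1 does, which is why forcing a single declared encoding matters.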


Tools

• Watcher
  – Passive Web-app security testing and auditing
• Unibomber
  – XSS autopwn testing tool

Tools: Watcher – Some of the Passive Checks Included

• Unicode transformation hot-spots
• User-controlled HTML
• Cross-domain issues
• Insecure cookies
• Insecure HTTP/HTTPS transitions
• SSL protocol and certificate issues
• XSS hot-spots
• Flash issues
• Silverlight issues
• Information disclosure

Tools

Tools: Watcher - Web-app Security Testing and Auditing

http://websecuritytool.codeplex.com

Tools: Unibomber – runtime XSS testing tool

• Deterministic testing
• Auto-inject payloads
• Unicode transformers
  – < > ‘ “ etc.
• Detect transformations and encoding hotspots

Thank you

Casaba Security
www.casabasecurity.com

Chris Weber
Blog: www.lookout.net
Email: chris@casabasecurity.com
LinkedIn: http://www.linkedin.com/in/chrisweber

Page 2: Unraveling Unicode: A Bag of Tricks for Bug Hunting...Unicode Crash Course code points encodings categorization normalization binary properties case mapping conversion tables bi-directional

Black Hat USA - July 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Can you tell the difference

Black Hat USA - July 2009 copy 2009 Chris Weberwwwcasabasecuritycom

How about now

Black Hat USA - July 2009 copy 2009 Chris Weber

ltscrİptgt

becomes

ltscriptgt

wwwcasabasecuritycom

The TransformersWhen good input turns bad

Black Hat USA - July 2009 copy 2009 Chris Weber

Agenda

wwwcasabasecuritycom

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Unicode crash course

bull Root Causes

bull Attack Vectors

bull Tools

ndash Find Unicode issues in Web-testing

ndash Visual Spoofing Detection

wwwcasabasecuritycom

Unicode TransformationsAgenda

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Unicode crash course

bull Root Causes

bull Attack Vectors

bull Tools

wwwcasabasecuritycom

Unicode TransformationsAgenda

Black Hat USA - July 2009 copy 2009 Chris Weber

bull End users

bull Applications

bull Databases

bull Programming languages

bull Operating Systems

wwwcasabasecuritycom

Unicode Crash CourseThe Unicode Attack Surface

Black Hat USA - July 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Unicode Crash CourseUnthink it

Black Hat USA - July 2009 copy 2009 Chris Weber

bull A large and complex standard

Unicode Crash Course

code pointsencodingscategorizationnormalizationbinary propertiescase mappingconversion tablesbi-directional properties

canonical mappingsdecomposition typescase foldingbest-fit mapping17 planesprivate use rangesscript blocks

escapings

Black Hat USA - July 2009 copy 2009 Chris Weber

Shift_jis

Gb2312

ISCII

Windows-1252

ISO-8859-1

EBCDIC 037

wwwcasabasecuritycom

Unicode Crash CourseCode pages and charsets

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Unicode can represent them all

bull ASCII range is preserved

ndash U+0000 to U+007F are mapped to ASCII

wwwcasabasecuritycom

Unicode Crash CourseAd Infinitum

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Unicode 51 uses a 21-bit scalar value with space for over 1100000 code points

U+0000 to U+10FFFF

wwwcasabasecuritycom

Unicode Crash CourseCode points

Black Hat USA - July 2009 copy 2009 Chris Weber

A = U+0041

Every character has a unique number

wwwcasabasecuritycom

Unicode Crash CourseCode Points

Black Hat USA - July 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Unicode Crash Course

AU+0041

Black Hat USA - July 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Unicode Crash Course

ſU+017F

Black Hat USA - July 2009 copy 2009 Chris Weber

UTF-8 ndash variable width 1 to 4 bytes (used to be 6)

UTF-16ndash Endianessndash Variable width 2 or 4 bytesndash Surrogate pairs

UTF-32ndash Endianessndash Fixed width 4 bytesndash Fixed mapping no algorithms needed

wwwcasabasecuritycom

Unicode Crash CourseEncodings

Black Hat USA - July 2009 copy 2009 Chris Weber

U+FF21 FULLWIDTH LATIN CAPITAL LETTER A

EFBCA1

ampxFF21

amp65313

xEFxBCxA1

uFF21

wwwcasabasecuritycom

Unicode Crash CourseEncodings and Escape sequences

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Unicode crash course

bull Root Causes

bull Attack Vectors

bull Tools

wwwcasabasecuritycom

Unicode TransformationsAgenda

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Unicode crash course

bull Root Causes

bull Attack Vectors

bull Tools

wwwcasabasecuritycom

Unicode TransformationsAgenda

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Unicode crash coursebull Root Causes

ndash Visual Spoofing and IDNrsquosndash Best-fit mappingsndash Normalizationndash Overlong UTF-8ndash Over-consumptionndash Character substitutionndash Character deletionndash Casingndash Buffer overflowsndash Controlling Syntaxndash Charset transformationsndash Charset mismatches

bull Tools

wwwcasabasecuritycom

Unicode TransformationsOverview

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Over 100000 assigned characters

bull Many lookalikes within and across scripts

AΑАᐱᗅᗋᗩᴀᴬA

wwwcasabasecuritycom

Root CausesVisual Spoofing

Black Hat USA - July 2009 copy 2009 Chris Weber

Some browsers allow COM IDNrsquos

based on script family

ndash (Latin has a big family)

wwwcasabasecuritycom

Attack VectorsIDN homograph attacks

Black Hat USA - July 2009 copy 2009 Chris Weber

Safari

wwwcasabasecuritycom

Attack VectorsIDN homograph attacks

Black Hat USA - July 2009 copy 2009 Chris Weber

Opera

wwwcasabasecuritycom

Attack VectorsIDN homograph attacks

Black Hat USA - July 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsIDN homograph attacks

wwwgooglecom is not wwwgooɡlecom

Latin U+0069

LatinU+0261

Black Hat USA - July 2009 copy 2009 Chris Weber

ICANN guidelines v20

ndash Inclusion-based

ndash Script limitations

ndash Character limitations

wwwcasabasecuritycom

Root CausesThe state of International Domain Names

Deny-all default seems to be the right concept

A script can cross many blocks Even with limited script choices therersquos plenty to choose from

Great for domain labels but sub domain labels still open to punctuation and syntax spoofing

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Non-Unicode attacks

bull Confusables

bull Invisibles

bull Problematic font-rendering

bull Manipulating Combining Marks

bull Bidi and syntax spoofing

wwwcasabasecuritycom

Attack VectorsVisual spoofing Vectors

Black Hat USA - July 2009 copy 2009 Chris Weber

rn can look like m in certain fonts

wwwcasabasecuritycom

Attack VectorsNon-Unicode homograph attacks

wwwmulletscom is not wwwrnulletscom

Latin U+006D

LatinU+0073 U+006E

Black Hat USA - July 2009 copy 2009 Chris Weber

Are you using mono-width fonts

0 and O

1 and l

5 and S

wwwcasabasecuritycom

Attack VectorsNon-Unicode homograph attacks

Black Hat USA - July 2009 copy 2009 Chris Weber

Classic long URLrsquos

httploginfacebookintvitationvideomessageid-

h048892r39sessionnfbidcomhomehtmdisbursements

wwwcasabasecuritycom

Attack VectorsNon-Unicode homograph attacks

Black Hat USA - July 2009 copy 2009 Chris Weber

wwwɑpplecom

All Latin using Latin small letter Alpha lsquoɑrsquo

wwwfaϲebookcom

Mixed LatinGreek with lunate sigma symbol lsquocrsquo

wwwаЬсcom

All Cyrillic lsquoabcrsquo

wwwcasabasecuritycom

Attack VectorsSingle-script and The Confusables

Black Hat USA - July 2009 copy 2009 Chris Weber

Browsers whitelist ORG

wwwcasabasecuritycom

Attack VectorsIDN homograph attacks

Black Hat USA - July 2009 copy 2009 Chris Weber

Others donrsquot necessarily buthellip

wwwcasabasecuritycom

Attack VectorsIDN homograph attacks

Black Hat USA - July 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsIDN homograph attacks

wwwmozillaorg is not wwwmoziacutellaorg

Latin U+0069

LatinU+00ED

Black Hat USA - July 2009 copy 2009 Chris Weber

(This case doesnrsquot work anymore)

wwwcasabasecuritycom

Attack VectorsIDN Syntax Spoofing with lookalikes

httpwwwgooglecompathfilenottrustedorg

FULLWIDTH SOLIDUSU+FF0F

Black Hat USA - July 2009 copy 2009 Chris Weber

(Normalized to a U+002F)

wwwcasabasecuritycom

Attack VectorsIDN Syntax Spoofing with lookalikes

httpwwwgooglecompathfilenottrustedorg

SOLIDUSU+002F

Black Hat USA - July 2009 copy 2009 Chris Weber

(However punctuation not requiredhellip)

wwwcasabasecuritycom

Attack VectorsIDN Syntax Spoofing with lookalikes

httpwwwgooglecomノpathノfilenottrustedorg

Katakana NoU+FF89

Black Hat USA - July 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsThe Invisibles

Black Hat USA - July 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsVisual Spoofing with Bidi Explicit Directional Overrides

Black Hat USA - July 2009 copy 2009 Chris Weber

Commonly occur in charset transformations and even innocuous APIrsquos

Impact Filter evasion Enable code execution

When σ becomes s

U+03C3 GREEK SMALL LETTER SIGMA

When prime becomes

U+2032 PRIME

wwwcasabasecuritycom

Root CausesBest-fit mappings

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Scrutinize charactercharset manipulation APIrsquos

bull Use EncoderFallback with SystemTextEncoding

bull Set WC_NO_BEST_FIT_CHARS flag with WideCharToMultiByte()

bull Use Unicode end-to-end

wwwcasabasecuritycom

Root CausesGuidance for Best-Fit mappings

Black Hat USA - July 2009 copy 2009 Chris Weber

bull A popular social networking site in 2008

bull Implemented complex filtering logic to prevent XSS

ndash Attack Filter evasion code execution

ndash Exploit Bypass filtering logic with best-fit mappings to leverage cross-site scripting

ndash Root Cause best-fit mappings

wwwcasabasecuritycom

Case Study Social NetworkingBest-fit mappings

Black Hat USA - July 2009 copy 2009 Chris Weber

-moz-binding()

was not allowed buthellip

-[U+ff4d]oz-binding()

would best-fit map

wwwcasabasecuritycom

Case Study Social NetworkingBest-fit mappings

Black Hat USA - July 2009 copy 2009 Chris Weber

Normalizing strings after validation is dangerous

Impact Filter evasion Enable code execution

wwwcasabasecuritycom

Root CausesNormalization

Black Hat USA - July 2009 copy 2009 Chris Weber

İ becomes I +

wwwcasabasecuritycom

Root CausesNormalization

U+0130 U+0049 U+0307

Black Hat USA - July 2009 copy 2009 Chris Weber

But are there dangerous characters

You bethellip with NFKC and NFKD you could control HTML or other parsing

﹤ becomes lt

wwwcasabasecuritycom

Root CausesNormalization

U+FE64 U+003C

Black Hat USA - July 2009 copy 2009 Chris Weber

﹤ becomes lt

toNFKC(ldquo﹤scriptgtrdquo) = ldquoltscriptgtrdquo

wwwcasabasecuritycom

Root CausesNormalization

U+FE64 U+003C

Black Hat USA - July 2009 copy 2009 Chris Weber

Normalize strings before validation

NFKC first defense against Visual spoofing

wwwcasabasecuritycom

Root CausesGuidance for Normalization

Black Hat USA - July 2009 copy 2009 Chris Weber

Non-shortest or overlong UTF-8

Impact Filter evasion Enable code execution

Application gets C0A7

OSFramework sees 27

Database gets

wwwcasabasecuritycom

Root CausesNon-shortest form UTF-8

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Unicode specification forbids

ndash Generation of non-shortest form

ndash Interpretation of non-shortest form for BMP

bull Validate UTF-8 encoding (throw on error)

wwwcasabasecuritycom

Root CausesGuidance for Non-shortest form UTF-8

Black Hat USA - July 2009 copy 2009 Chris Weber

How many ways can you say

wwwcasabasecuritycom

Attack VectorsDirectory traversal

Black Hat USA - July 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack Vectors

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Unassigned code points

ndash U+2073

bull Illegal code points

ndash Half a surrogate pair

bull Code points with special meaning

ndash U+FEFF is the BOM

bull Impact Filter evasion Enable code execution

wwwcasabasecuritycom

Root CausesHandling the Unexpected

Black Hat USA - July 2009 copy 2009 Chris Weber

Over-consuming ill-formed byte sequences

Big problem with MBCS lead bytes

lt41 C2 3E 41gt becomes

lt41 41gt

wwwcasabasecuritycom

Root CausesHandling the Unexpected Over-consumption

Black Hat USA - July 2009 copy 2009 Chris Weber

ltimg src=[0xC2]gt onerror=alert(1)ltbr gt

becomes

ltimg src=gt onerror=alert(1)ltbr gt

wwwcasabasecuritycom

Root CausesHandling the Unexpected Over-consumption

Correcting insecurely rather than failing
  - Substituting a ‘’ or a ‘’ would be bad

Root Causes: Handling the Unexpected - Character-substitution

“deletion of noncharacters” (UTR-36)

Root Causes: Handling the Unexpected - Character-deletion

<scr[U+FEFF]ipt> becomes <script>

Root Causes: Handling the Unexpected - Character-deletion

• Fail or error
• Use U+FFFD instead
  - A common alternative is ‘?’ which can be safe

Root Causes: Solutions for Handling the Unexpected
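The difference between deleting a character and replacing it with U+FFFD shows up in a few lines of Python. The filter `looks_dangerous` is a hypothetical, deliberately simplistic check used only to show the ordering problem; it is not a recommended XSS defense.

```python
BOM = "\ufeff"


def looks_dangerous(text: str) -> bool:
    """Hypothetical naive filter: flags any literal "<script"."""
    return "<script" in text.lower()


def sanitize_by_deleting(text: str) -> str:
    """Risky: deletion can join the pieces of a blocked token back together."""
    return text.replace(BOM, "")


def sanitize_by_replacing(text: str) -> str:
    """Safer: U+FFFD keeps the pieces apart."""
    return text.replace(BOM, "\ufffd")


payload = "<scr\ufeffipt>"
print(looks_dangerous(payload))                         # filter sees nothing
print(looks_dangerous(sanitize_by_deleting(payload)))   # deletion rebuilt the tag
print(looks_dangerous(sanitize_by_replacing(payload)))  # replacement did not
```

Rejecting the input outright ("fail or error") remains the simplest option when the application can afford it.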

• Bypass filters, WAFs, NIDS, and validation
• Exploit delivery techniques
  - E.g. Cross-site scripting (buffer overflow of the Web)

Attack Vectors: Filter evasion

Safari and Firefox BOM consumption
  - Attack: Filter evasion, code execution
  - Exploit: Bypass filtering logic with specially crafted strings to leverage cross-site scripting
  - Root Cause: Character deletion

<a href="java[U+FEFF]script:alert('XSS')">

Can be nastier

<a h[U+FEFF]ref="java[U+FEFF]script:al[U+FEFF]ert('XSS')">

Case Study: Apple and Mozilla

A Closer Look: The BOM

BOM U+FEFF

• Attackers manipulate casing operations to inject otherwise prohibited characters
• Casing can multiply the buffer sizes needed
• Impact: Filter evasion, Enable code execution

Root Causes: Casing

toLower("İ") == "i"

toLower("scrİpt") == "script"

Root Causes: Casing

len(x) != len(toLower(x))

Root Causes: Casing
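That case mapping can change a string's length is easy to observe in Python, whose `str.lower()` and `str.upper()` apply full case mappings. The helper `case_growth` is a made-up name for this sketch; results in other runtimes (for example ones using simple case mappings) may differ.

```python
def case_growth(ch: str) -> dict:
    """Report code point and UTF-8 byte counts before and after case mapping."""
    def measure(s: str) -> dict:
        return {"code_points": len(s), "utf8_bytes": len(s.encode("utf-8"))}

    return {
        "original": measure(ch),
        "lower": measure(ch.lower()),
        "upper": measure(ch.upper()),
    }


# U+0130 grows when lowercased, U+023A grows in UTF-8 bytes, U+0390 grows when uppercased.
for sample in ("\u0130", "\u023a", "\u0390"):
    print(f"U+{ord(sample):04X}", case_growth(sample))
```

Any buffer sized from the input length before casing is therefore a candidate for overflow or truncation.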

• Perform casing operations before validation
• Leverage existing frameworks and APIs
  - ICU, .NET

Root Causes: Guidance for Casing
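"Perform casing operations before validation" can be illustrated with U+212A KELVIN SIGN, which lowercases to an ordinary "k". The blocklist and both pipeline functions below are hypothetical and exist only to contrast the two orderings; a real application would not rely on a blocklist alone.

```python
BLOCKED = ("onclick",)  # hypothetical blocklist, for illustration only


def validate(text: str) -> bool:
    """Return True if none of the blocked words appear."""
    return not any(word in text for word in BLOCKED)


def unsafe_pipeline(text: str) -> str:
    """Validates first, then lowercases: the check sees the wrong string."""
    if not validate(text):
        raise ValueError("blocked content")
    return text.lower()


def safer_pipeline(text: str) -> str:
    """Lowercases first, then validates the string that will actually be used."""
    lowered = text.lower()
    if not validate(lowered):
        raise ValueError("blocked content")
    return lowered


payload = "onclic\u212a"  # ends with KELVIN SIGN rather than "k"
```

Here `unsafe_pipeline(payload)` returns the blocked word, while `safer_pipeline(payload)` raises.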

• Incorrect assumptions about string sizes (chars vs bytes)
• Improper width calculations
• Impact: Enable code execution

Root Causes: Buffer Overflows

Casing - maximum expansion factors

Root Causes: Buffer Overflows

Operation   UTF         Factor   Sample
Lower       8           1.5      U+023A Ⱥ
Lower       16, 32      1        U+0041 A
Upper       8, 16, 32   3        U+0390 ΐ

Source: Unicode Technical Report 36

Normalization - maximum expansion factors

Root Causes: Buffer Overflows

Operation   UTF      Factor   Sample
NFC         8        3X       U+1D160
NFC         16, 32   3X       U+FB2C שּׁ
NFD         8        3X       U+0390 ΐ
NFD         16, 32   4X       U+1F82 ᾂ
NFKC/NFKD   8        11X      U+FDFA ﷺ
NFKC/NFKD   16, 32   18X      U+FDFA ﷺ

Source: Unicode Technical Report 36
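The normalization expansion factors can be checked against the Unicode data shipped with Python. The function name `expansion` is an assumption of this sketch; it returns the growth in UTF-8 bytes, UTF-16 code units, and code points (UTF-32 units) for one sample character.

```python
import unicodedata


def expansion(ch: str, form: str) -> tuple:
    """Return (utf8, utf16, utf32) growth factors for normalizing ch to form."""
    normalized = unicodedata.normalize(form, ch)
    utf8 = len(normalized.encode("utf-8")) / len(ch.encode("utf-8"))
    utf16 = len(normalized.encode("utf-16-le")) / len(ch.encode("utf-16-le"))
    utf32 = len(normalized) / len(ch)
    return utf8, utf16, utf32


for code_point, form in (("\U0001d160", "NFC"), ("\u1f82", "NFD"), ("\ufdfa", "NFKC")):
    print(f"U+{ord(code_point):04X}", form, expansion(code_point, form))
```

These are worst cases for a single character; a buffer sized for the worst case of the chosen form and encoding avoids having to reason about typical input.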

• Know the difference between bytes and chars
• Secure coding
• Leverage existing frameworks and APIs
  - ICU, .NET

Root Causes: Guidance for Buffer Overflows
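As one small example of keeping bytes and characters apart, the sketch below truncates text to a byte budget without splitting a multi-byte sequence. `truncate_utf8` is a hypothetical helper, and it assumes the limit is expressed in UTF-8 bytes (as it often is for database columns or fixed buffers).

```python
def truncate_utf8(text: str, max_bytes: int) -> str:
    """Cut text so its UTF-8 encoding fits in max_bytes, never mid-character."""
    encoded = text.encode("utf-8")
    if len(encoded) <= max_bytes:
        return text
    # The input was valid UTF-8, so the only ill-formed part after slicing is an
    # incomplete sequence at the very end; errors="ignore" drops exactly that.
    return encoded[:max_bytes].decode("utf-8", errors="ignore")
```

For instance, "aΐb" is three characters but four UTF-8 bytes, so a two-byte budget keeps only "a".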

• White space and line breaks
  - E.g. when U+180E acts like U+0020
• Quotation marks
• Impact: Filter evasion, Enable code execution

Root Causes: Controlling Syntax

• Manipulate HTML parsers and JavaScript interpreters
• Control protocols

Attacks and Exploits: Controlling syntax

• Unicode formatter characters exploited for XSS
  - Damage: Filter evasion, controlling syntax
  - Exploit: Bypass filtering logic with specially crafted characters to leverage cross-site scripting
  - Root Cause: Interpreting “white space”
  - A problem with HTML 4.0 spec

Case Study: Opera

<a href=[U+180E]onclick=alert()>

Case Study: Opera

Case Study: Opera

MVS U+180E (MONGOLIAN VOWEL SEPARATOR)

• Question specifications
• Be careful…

Root Causes: Guidance for Controlling Syntax

1) Character stability
  - IDNA/Nameprep based on Unicode 3.2
2) Designs
  - Specs are carefully designed but not always perfect
  • This could have been a problem
    - “When designing a markup language or data protocol, the use of U+FEFF can be restricted to that of Byte Order Mark. In that case, any U+FEFF occurring in the middle of the file can be ignored, or treated as an error.”
  - HTML 4.01
    • Defines four whitespace characters and explicitly leaves handling other characters up to implementer

Root Causes: Specifications
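Since the specification leaves other characters to the implementer, one defensive option is to flag every separator-like character outside an explicit list. The sketch below assumes the four HTML 4.01 white space characters are space, tab, form feed, and zero-width space, plus CR and LF as line breaks; the names `ALLOWED` and `unexpected_separators` are made up, and the category of U+180E varies with the Unicode version (Zs in older data, Cf in newer), which is why both are treated as suspect.

```python
import unicodedata

# Assumed explicit list: HTML 4.01 white space characters plus CR/LF line breaks.
ALLOWED = frozenset("\u0020\u0009\u000c\u200b\u000d\u000a")

# Categories that some parser might treat as separators or silently skip.
SUSPECT_CATEGORIES = {"Zs", "Zl", "Zp", "Cf", "Cc"}


def unexpected_separators(text: str) -> list:
    """Return separator-like characters that are not on the explicit list."""
    return [
        ch
        for ch in text
        if ch not in ALLOWED
        and (ch.isspace() or unicodedata.category(ch) in SUSPECT_CATEGORIES)
    ]
```

An attribute value containing U+180E would be reported here, so the caller can reject it instead of guessing how a browser will tokenize it.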

• Converting between charsets is dangerous
• Mapping tables and algorithms vary across platforms
• Impact: Filter evasion, Enable code execution, Data-loss

Root Causes: Charset Transformations

• Avoid if possible
• Use Unicode as the broker
• Beware the PUA mappings
• Transform, case, and normalize prior to validation and redisplay

Root Causes: Guidance for Charset Transformations
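"Use Unicode as the broker" can be sketched as a two-step conversion that fails rather than substituting. The function name `transcode` is hypothetical; the sketch relies on Python's codecs raising on unmappable characters instead of applying best-fit mappings.

```python
def transcode(raw: bytes, source: str, target: str) -> bytes:
    """Convert bytes between charsets via Unicode, failing on anything lossy."""
    text = raw.decode(source, errors="strict")   # legacy bytes -> Unicode
    return text.encode(target, errors="strict")  # Unicode -> target, no substitution
```

Converting "σ" to ISO-8859-1 this way raises `UnicodeEncodeError` rather than quietly turning it into "s".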

• Some charset identifiers are ill-defined
• Vendor implementations vary
• User-agents may sniff if confused
• Attackers manipulate behavior
• Impact: Filter evasion, Enable code execution

Root Causes: Charset Mismatches

Content-Type: charset=ISO-8859-1

<meta http-equiv="Content-Type" content="text/html; charset=shift_jis">

Attacker-controlled input

• Force UTF-8
• Error if uncertain

Root Causes: Guidance for Charset Mismatches
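"Force UTF-8" and "error if uncertain" might be combined as below when accepting a request or document body. The helper names `charset_of` and `decode_body` are assumptions of this sketch; the standard-library `email.message.Message` class is used only to parse the `charset` parameter of a Content-Type value.

```python
from email.message import Message
from typing import Optional


def charset_of(content_type: str) -> Optional[str]:
    """Extract the lowercased charset parameter, or None if it is absent."""
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset()


def decode_body(raw: bytes, content_type: str) -> str:
    """Accept only bodies that are declared as UTF-8 and really are UTF-8."""
    charset = charset_of(content_type)
    if charset != "utf-8":
        raise ValueError(f"unsupported or missing charset: {charset!r}")
    return raw.decode("utf-8", errors="strict")
```

On the output side the same idea applies: declare UTF-8 in both the HTTP header and the meta element so the two cannot disagree.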

• Unicode crash course
• Root Causes
• Attack Vectors
• Tools

Unicode Transformations: Agenda


• Watcher
  - Passive Web-app security testing and auditing
• Unibomber
  - XSS autopwn testing tool

Tools

• Unicode transformation hot-spots
• User-controlled HTML
• Cross-domain issues
• Insecure cookies
• Insecure HTTP/HTTPS transitions
• SSL protocol and certificate issues
• XSS hot-spots
• Flash issues
• Silverlight issues
• Information disclosure

Tools: Watcher - Some of the Passive Checks Included

Tools

http://websecuritytool.codeplex.com

Tools: Watcher - Web-app Security Testing and Auditing

• Deterministic testing
• Auto-inject payloads
• Unicode transformers
  - < > ‘ “ etc.
• Detect transformations and encoding hotspots

Tools: Unibomber - runtime XSS testing tool

Thank you

Casaba Security
www.casabasecurity.com

Chris Weber
Blog: www.lookout.net
Email: chris@casabasecurity.com
LinkedIn: http://www.linkedin.com/in/chrisweber


Page 4: Unraveling Unicode: A Bag of Tricks for Bug Hunting...Unicode Crash Course code points encodings categorization normalization binary properties case mapping conversion tables bi-directional

Black Hat USA - July 2009 copy 2009 Chris Weber

ltscrİptgt

becomes

ltscriptgt

wwwcasabasecuritycom

The TransformersWhen good input turns bad

Black Hat USA - July 2009 copy 2009 Chris Weber

Agenda

wwwcasabasecuritycom

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Unicode crash course

bull Root Causes

bull Attack Vectors

bull Tools

ndash Find Unicode issues in Web-testing

ndash Visual Spoofing Detection

wwwcasabasecuritycom

Unicode TransformationsAgenda

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Unicode crash course

bull Root Causes

bull Attack Vectors

bull Tools

wwwcasabasecuritycom

Unicode TransformationsAgenda

Black Hat USA - July 2009 copy 2009 Chris Weber

bull End users

bull Applications

bull Databases

bull Programming languages

bull Operating Systems

wwwcasabasecuritycom

Unicode Crash CourseThe Unicode Attack Surface

Black Hat USA - July 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Unicode Crash CourseUnthink it

Black Hat USA - July 2009 copy 2009 Chris Weber

bull A large and complex standard

Unicode Crash Course

code pointsencodingscategorizationnormalizationbinary propertiescase mappingconversion tablesbi-directional properties

canonical mappingsdecomposition typescase foldingbest-fit mapping17 planesprivate use rangesscript blocks

escapings

Black Hat USA - July 2009 copy 2009 Chris Weber

Shift_jis

Gb2312

ISCII

Windows-1252

ISO-8859-1

EBCDIC 037

wwwcasabasecuritycom

Unicode Crash CourseCode pages and charsets

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Unicode can represent them all

bull ASCII range is preserved

ndash U+0000 to U+007F are mapped to ASCII

wwwcasabasecuritycom

Unicode Crash CourseAd Infinitum

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Unicode 51 uses a 21-bit scalar value with space for over 1100000 code points

U+0000 to U+10FFFF

wwwcasabasecuritycom

Unicode Crash CourseCode points

Black Hat USA - July 2009 copy 2009 Chris Weber

A = U+0041

Every character has a unique number

wwwcasabasecuritycom

Unicode Crash CourseCode Points

Black Hat USA - July 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Unicode Crash Course

AU+0041

Black Hat USA - July 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Unicode Crash Course

ſU+017F

Black Hat USA - July 2009 copy 2009 Chris Weber

UTF-8 ndash variable width 1 to 4 bytes (used to be 6)

UTF-16ndash Endianessndash Variable width 2 or 4 bytesndash Surrogate pairs

UTF-32ndash Endianessndash Fixed width 4 bytesndash Fixed mapping no algorithms needed

wwwcasabasecuritycom

Unicode Crash CourseEncodings

Black Hat USA - July 2009 copy 2009 Chris Weber

U+FF21 FULLWIDTH LATIN CAPITAL LETTER A

EFBCA1

ampxFF21

amp65313

xEFxBCxA1

uFF21

wwwcasabasecuritycom

Unicode Crash CourseEncodings and Escape sequences

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Unicode crash course

bull Root Causes

bull Attack Vectors

bull Tools

wwwcasabasecuritycom

Unicode TransformationsAgenda

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Unicode crash course

bull Root Causes

bull Attack Vectors

bull Tools

wwwcasabasecuritycom

Unicode TransformationsAgenda

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Unicode crash coursebull Root Causes

ndash Visual Spoofing and IDNrsquosndash Best-fit mappingsndash Normalizationndash Overlong UTF-8ndash Over-consumptionndash Character substitutionndash Character deletionndash Casingndash Buffer overflowsndash Controlling Syntaxndash Charset transformationsndash Charset mismatches

bull Tools

wwwcasabasecuritycom

Unicode TransformationsOverview

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Over 100000 assigned characters

bull Many lookalikes within and across scripts

AΑАᐱᗅᗋᗩᴀᴬA

wwwcasabasecuritycom

Root CausesVisual Spoofing

Black Hat USA - July 2009 copy 2009 Chris Weber

Some browsers allow COM IDNrsquos

based on script family

ndash (Latin has a big family)

wwwcasabasecuritycom

Attack VectorsIDN homograph attacks

Black Hat USA - July 2009 copy 2009 Chris Weber

Safari

wwwcasabasecuritycom

Attack VectorsIDN homograph attacks

Black Hat USA - July 2009 copy 2009 Chris Weber

Opera

wwwcasabasecuritycom

Attack VectorsIDN homograph attacks

Black Hat USA - July 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsIDN homograph attacks

wwwgooglecom is not wwwgooɡlecom

Latin U+0069

LatinU+0261

Root Causes: The state of International Domain Names

ICANN guidelines v2.0

– Inclusion-based

– Script limitations

– Character limitations

Deny-all default seems to be the right concept.

A script can cross many blocks. Even with limited script choices there's plenty to choose from.

Great for domain labels, but subdomain labels are still open to punctuation and syntax spoofing.

Attack Vectors: Visual spoofing vectors

• Non-Unicode attacks

• Confusables

• Invisibles

• Problematic font-rendering

• Manipulating Combining Marks

• Bidi and syntax spoofing

Attack Vectors: Non-Unicode homograph attacks

rn can look like m in certain fonts

www.mullets.com is not www.rnullets.com

Latin U+006D

Latin U+0072 U+006E

Are you using mono-width fonts?

0 and O

1 and l

5 and S

Classic long URLs

httploginfacebookintvitationvideomessageid-h048892r39sessionnfbidcomhomehtmdisbursements

Attack Vectors: Single-script and the Confusables

www.ɑpple.com

All Latin, using Latin small letter alpha 'ɑ'

www.faϲebook.com

Mixed Latin/Greek, with Greek lunate sigma symbol 'ϲ'

www.аЬс.com

All Cyrillic 'abc'
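One way to see what is really inside these labels is to print the Unicode name of every character. The sketch below guesses the script from the first word of the character name, which is a rough heuristic chosen for brevity (the authoritative data is the Unicode Scripts.txt file); the function names are invented for this example.

```python
import unicodedata

def describe(label: str) -> list:
    """List (char, code point, Unicode name) for each character of a label."""
    return [(c, f"U+{ord(c):04X}", unicodedata.name(c, "<unnamed>")) for c in label]

def scripts_hint(label: str) -> set:
    """Rough script guess: the first word of each letter's Unicode name.

    Heuristic only. It is good enough to show the idea, not to ship.
    """
    return {unicodedata.name(c, "UNKNOWN").split()[0] for c in label if c.isalpha()}

labels = [
    "\u0251pple",            # LATIN SMALL LETTER ALPHA + "pple"
    "fa\u03F2ebook",         # GREEK LUNATE SIGMA SYMBOL in place of "c"
    "\u0430\u042C\u0441",    # three Cyrillic letters that resemble "abc"
]

for label in labels:
    print(label, sorted(scripts_hint(label)))
    for row in describe(label):
        print("   ", row)
```

Note that the first and third labels are single-script, so a mixed-script check alone would not flag them. That is the point of the slide: confusables exist inside a single script as well as across scripts.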

Attack Vectors: IDN homograph attacks

Browsers whitelist .ORG

Others don't necessarily, but…

www.mozilla.org is not www.mozílla.org

Latin U+0069

Latin U+00ED

Attack Vectors: IDN Syntax Spoofing with lookalikes

http://www.google.com／path／file.nottrusted.org

FULLWIDTH SOLIDUS U+FF0F

(This case doesn't work anymore)

http://www.google.com/path/file.nottrusted.org

SOLIDUS U+002F

(Normalized to a U+002F)

http://www.google.comノpathノfile.nottrusted.org

Katakana No U+FF89

(However, punctuation is not required…)
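The difference between the fullwidth solidus and the Katakana lookalike can be checked with compatibility normalization. The following sketch (the helper `hidden_syntax` and the set of syntax characters are assumptions made for illustration) reports characters that are not URL syntax themselves but become URL syntax under NFKC.

```python
import unicodedata

# Characters treated as URL syntax for this illustration (an assumption,
# not an exhaustive list).
SYNTAX_CHARS = set("/?#@:.\\")

def hidden_syntax(text: str) -> list:
    """Return (code point, NFKC result) for characters that turn into URL syntax."""
    found = []
    for c in text:
        folded = unicodedata.normalize("NFKC", c)
        if c not in SYNTAX_CHARS and any(f in SYNTAX_CHARS for f in folded):
            found.append((f"U+{ord(c):04X}", folded))
    return found

fullwidth = "www.google.com\uFF0Fpath\uFF0Ffile.nottrusted.org"
katakana = "www.google.com\uFF89path\uFF89file.nottrusted.org"

print(hidden_syntax(fullwidth))  # the fullwidth solidus folds to "/"
print(hidden_syntax(katakana))   # nothing is reported
```

U+FF0F is caught because it normalizes to U+002F, which matches the "normalized to a U+002F" slide. The Katakana letter is a pure visual lookalike: normalization never turns it into a slash, so a check of this kind says nothing about it.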

Attack Vectors: The Invisibles

Attack Vectors: Visual Spoofing with Bidi Explicit Directional Overrides

Root Causes: Best-fit mappings

Commonly occur in charset transformations and even innocuous APIs

Impact: Filter evasion, Enable code execution

When σ becomes s

U+03C3 GREEK SMALL LETTER SIGMA

When ′ becomes '

U+2032 PRIME

Root Causes: Guidance for Best-Fit mappings

• Scrutinize character/charset manipulation APIs

• Use EncoderFallback with System.Text.Encoding

• Set WC_NO_BEST_FIT_CHARS flag with WideCharToMultiByte()

• Use Unicode end-to-end
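The guidance above names .NET and Win32 settings. The same idea, failing instead of silently substituting a lookalike, can be sketched in Python, whose built-in codecs do not perform best-fit mapping when asked to be strict. The wrapper name `to_legacy` and the choice of Windows-1252 as the target are assumptions for this example.

```python
def to_legacy(text: str, charset: str = "cp1252") -> bytes:
    """Encode for a legacy charset, raising instead of substituting a lookalike."""
    return text.encode(charset, errors="strict")

samples = [
    "caf\u00E9",               # representable in Windows-1252
    "\u03C3",                  # GREEK SMALL LETTER SIGMA
    "\u2032",                  # PRIME
    "-\uFF4Doz-binding()",     # FULLWIDTH LATIN SMALL LETTER M
]

for sample in samples:
    try:
        print(repr(sample), "->", to_legacy(sample))
    except UnicodeEncodeError as exc:
        # The caller must now decide what to do; nothing was silently rewritten.
        print(repr(sample), "rejected:", exc.reason)
```

A conversion that rejects unmappable input cannot turn a harmless-looking character into `s`, `'`, or `m` behind the validator's back.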

Case Study: Social Networking, Best-fit mappings

• A popular social networking site in 2008

• Implemented complex filtering logic to prevent XSS

– Attack: Filter evasion, code execution

– Exploit: Bypass filtering logic with best-fit mappings to leverage cross-site scripting

– Root Cause: best-fit mappings

-moz-binding()

was not allowed, but…

-[U+FF4D]oz-binding()

would best-fit map

Root Causes: Normalization

Normalizing strings after validation is dangerous

Impact: Filter evasion, Enable code execution

İ becomes I + ◌̇

U+0130 → U+0049 U+0307

But are there dangerous characters?

You bet… with NFKC and NFKD you could control HTML or other parsing

﹤ becomes <

U+FE64 → U+003C

toNFKC("﹤script>") = "<script>"

Root Causes: Guidance for Normalization

Normalize strings before validation

NFKC is a first defense against visual spoofing
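The ordering problem is easy to demonstrate with Python's `unicodedata` module. The validator `is_safe` below is a deliberately simplistic stand-in invented for this example, not a recommended filter.

```python
import unicodedata

def is_safe(text: str) -> bool:
    """Toy validator: reject anything containing markup delimiters."""
    return not any(c in text for c in "<>\"'")

def validate_then_normalize(text: str) -> str:
    # Dangerous order: the check sees the string before it is transformed.
    if not is_safe(text):
        raise ValueError("rejected")
    return unicodedata.normalize("NFKC", text)

def normalize_then_validate(text: str) -> str:
    # Safer order: the check sees the same string that is used downstream.
    normalized = unicodedata.normalize("NFKC", text)
    if not is_safe(normalized):
        raise ValueError("rejected")
    return normalized

# U+FE64 SMALL LESS-THAN SIGN ... U+FE65 SMALL GREATER-THAN SIGN
payload = "\uFE64script\uFE65"

print(validate_then_normalize(payload))      # the filter was bypassed
try:
    normalize_then_validate(payload)
except ValueError:
    print("rejected after normalization")

# The canonical decomposition from the earlier slide: U+0130 -> U+0049 U+0307
print([f"U+{ord(c):04X}" for c in unicodedata.normalize("NFD", "\u0130")])
```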

Root Causes: Non-shortest form UTF-8

Non-shortest or overlong UTF-8

Impact: Filter evasion, Enable code execution

Application gets C0 A7

OS/Framework sees 27

Database gets '

Root Causes: Guidance for Non-shortest form UTF-8

• Unicode specification forbids

– Generation of non-shortest form

– Interpretation of non-shortest form for BMP

• Validate UTF-8 encoding (throw on error)
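To make the C0 A7 example concrete: a lenient two-byte decoder that never checks for the shortest form computes U+0027 from those bytes, while a conforming decoder throws. The function `naive_two_byte_decode` is an intentionally broken illustration written for this example.

```python
def naive_two_byte_decode(b1: int, b2: int) -> str:
    """What a lenient decoder computes for a two-byte sequence.

    The bug: it never checks that the result actually needed two bytes.
    """
    return chr(((b1 & 0x1F) << 6) | (b2 & 0x3F))

def decode_utf8_strict(data: bytes) -> str:
    """Decode UTF-8, raising UnicodeDecodeError on any ill-formed sequence."""
    return data.decode("utf-8", errors="strict")

overlong = b"\xC0\xA7"

print(repr(naive_two_byte_decode(overlong[0], overlong[1])))  # an apostrophe

try:
    decode_utf8_strict(overlong)
except UnicodeDecodeError as exc:
    print("strict decoder refused:", exc.reason)
```

A filter that looks for byte 0x27 never sees it in the raw input, yet a lenient layer further down produces the quote anyway. Validating up front and throwing on error closes that gap.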

Attack Vectors: Directory traversal

How many ways can you say

Attack Vectors

Root Causes: Handling the Unexpected

• Unassigned code points

– U+2073

• Illegal code points

– Half a surrogate pair

• Code points with special meaning

– U+FEFF is the BOM

• Impact: Filter evasion, Enable code execution
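The three kinds of unexpected input on this slide can be spotted with the general category data in Python's `unicodedata`. This is a sketch with invented helper names; the labels it returns are only for display.

```python
import unicodedata

def classify(c: str) -> str:
    """Label a single code point the way a cautious input filter might."""
    category = unicodedata.category(c)
    if category == "Cs":
        return "lone surrogate"
    if category == "Cn":
        return "unassigned or noncharacter"
    if c == "\uFEFF":
        return "BOM / zero width no-break space"
    return "ok"

def unexpected(text: str) -> list:
    """Return (index, code point, label) for every character that is not 'ok'."""
    return [
        (i, f"U+{ord(c):04X}", classify(c))
        for i, c in enumerate(text)
        if classify(c) != "ok"
    ]

# U+2073 is unassigned, U+D800 is half a surrogate pair, U+FEFF is the BOM.
sample = "a\u2073b\uD800c\uFEFFd"
for item in unexpected(sample):
    print(item)

# A conforming UTF-8 encoder refuses the lone surrogate outright.
try:
    "\uD800".encode("utf-8")
except UnicodeEncodeError:
    print("lone surrogate cannot be encoded as UTF-8")
```

Finding these characters is the easy part. What the application does next (fail, replace, or delete) is where the following slides show things going wrong.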

Root Causes: Handling the Unexpected, Over-consumption

Over-consuming ill-formed byte sequences

Big problem with MBCS lead bytes

<41 C2 3E 41> becomes

<41 41>

<img src=[0xC2]> onerror=alert(1)<br />

becomes

<img src=> onerror=alert(1)<br />
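The byte sequence on the slide can be run through a deliberately buggy decoder and through Python's built-in one to compare the results. `over_consuming_decode` is a toy written for this example; it models the flaw and is not taken from any real product.

```python
def over_consuming_decode(data: bytes) -> str:
    """Deliberately buggy decoder: a 2-byte lead byte always swallows the next byte."""
    out = []
    i = 0
    while i < len(data):
        b = data[i]
        if 0xC2 <= b <= 0xDF:                      # lead byte of a 2-byte sequence
            nxt = data[i + 1] if i + 1 < len(data) else None
            if nxt is not None and 0x80 <= nxt <= 0xBF:
                out.append(chr(((b & 0x1F) << 6) | (nxt & 0x3F)))
            # Bug: the following byte is skipped even when it is not a trail byte.
            i += 2
        elif b < 0x80:
            out.append(chr(b))
            i += 1
        else:
            i += 1                                  # other bytes ignored for brevity
    return "".join(out)

data = b"\x41\xC2\x3E\x41"                          # 'A', stray lead byte, '>', 'A'

print(repr(over_consuming_decode(data)))            # the '>' has disappeared
print(repr(data.decode("utf-8", errors="replace"))) # only the bad byte is replaced
```

The built-in decoder replaces just the ill-formed byte with U+FFFD and keeps the `>` that follows, so the markup structure seen by the validator and by the consumer stays the same.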

Root Causes: Handling the Unexpected, Character-substitution

Correcting insecurely rather than failing

– Substituting a '' or a '' would be bad

Root Causes: Handling the Unexpected, Character-deletion

"deletion of noncharacters" (UTR #36)

<scr[U+FEFF]ipt> becomes <script>

Root Causes: Solutions for Handling the Unexpected

• Fail or error

• Use U+FFFD instead

– A common alternative is '?', which can be safe
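The difference between deleting and replacing an unexpected character is visible in a few lines. The blocklist filter below is a toy assumed for this example; the point is only what each clean-up strategy leaves behind.

```python
BLOCKLIST = ("<script",)

def passes_filter(text: str) -> bool:
    """Toy blocklist check (hypothetical): True when no banned token is present."""
    lowered = text.lower()
    return not any(bad in lowered for bad in BLOCKLIST)

def delete_bom(text: str) -> str:
    """Insecure correction: silently drop U+FEFF."""
    return text.replace("\uFEFF", "")

def replace_bom(text: str) -> str:
    """Safer correction: swap U+FEFF for U+FFFD REPLACEMENT CHARACTER."""
    return text.replace("\uFEFF", "\uFFFD")

payload = "<scr\uFEFFipt>"

print(passes_filter(payload))                 # True: the filter sees no "<script"
print(repr(delete_bom(payload)))              # deletion re-creates the tag
print(passes_filter(replace_bom(payload)))    # True, and the result stays inert
```

Deleting after validation stitches the tag back together. Replacing with U+FFFD keeps the two halves apart, and failing outright avoids the question entirely.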

Attack Vectors: Filter evasion

• Bypass filters, WAFs, NIDS, and validation

• Exploit delivery techniques

– E.g. cross-site scripting (buffer overflow of the Web)

Case Study: Apple and Mozilla

Safari and Firefox BOM consumption

– Attack: Filter evasion, code execution

– Exploit: Bypass filtering logic with specially crafted strings to leverage cross-site scripting

– Root Cause: Character deletion

<a href="java[U+FEFF]script:alert('XSS')">

Can be nastier

<a h[U+FEFF]ref="java[U+FEFF]script:al[U+FEFF]ert('XSS')">

A Closer Look: The BOM

BOM U+FEFF

Root Causes: Casing

• Attackers manipulate casing operations to inject otherwise prohibited characters

• Casing can multiply the buffer sizes needed

• Impact: Filter evasion, Enable code execution

toLower("İ") == "i"

toLower("scrİpt") == "script"

len(x) != len(toLower(x))

Root Causes: Guidance for Casing

• Perform casing operations before validation

• Leverage existing frameworks and APIs

– ICU, .NET
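Casing results depend on the platform and locale. The slides show a `toLower` that turns U+0130 into a plain `i`, which is what the one-to-one mapping in the Unicode data yields; Python's `str.lower()` applies the full mapping and produces `i` followed by U+0307 instead. The sketch below therefore uses a hypothetical `simple_lower` to mimic the slide's behaviour and then shows Python's own length changes.

```python
def simple_lower(text: str) -> str:
    """Hypothetical stand-in for the slide's toLower: U+0130 becomes plain 'i'."""
    return "".join("i" if c == "\u0130" else c.lower() for c in text)

def validate_then_lower(text: str) -> str:
    # Dangerous order: the check runs on the string before casing.
    if "script" in text:
        raise ValueError("rejected")
    return simple_lower(text)

def lower_then_validate(text: str) -> str:
    # Safer order: casing first, then the check on the final string.
    lowered = simple_lower(text)
    if "script" in lowered:
        raise ValueError("rejected")
    return lowered

payload = "scr\u0130pt"
print(validate_then_lower(payload))            # the banned word appears after casing

# Python's full case mappings change lengths, which matters for buffer sizing.
print(len("\u0130"), len("\u0130".lower()))    # code points before and after
print(len("\u0390"), len("\u0390".upper()))
print(len("\u023A".encode("utf-8")), len("\u023A".lower().encode("utf-8")))
```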

Root Causes: Buffer Overflows

• Incorrect assumptions about string sizes (chars vs bytes)

• Improper width calculations

• Impact: Enable code execution

Root Causes: Buffer Overflows

Casing - maximum expansion factors

Operation   UTF         Factor   Sample
Lower       8           1.5      Ⱥ U+023A
Lower       16, 32      1        A U+0041
Upper       8, 16, 32   3        ΐ U+0390

Source: Unicode Technical Report #36

Root Causes: Buffer Overflows

Normalization - maximum expansion factors

Operation    UTF      Factor   Sample
NFC          8        3X       U+1D160
NFC          16, 32   3X       שּׁ U+FB2C
NFD          8        3X       ΐ U+0390
NFD          16, 32   4X       ᾂ U+1F82
NFKC/NFKD    8        11X      ﷺ U+FDFA
NFKC/NFKD    16, 32   18X      ﷺ U+FDFA

Source: Unicode Technical Report #36

Root Causes: Guidance for Buffer Overflows

• Know the difference between bytes and chars

• Secure coding

• Leverage existing frameworks and APIs

– ICU, .NET
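The normalization expansion factors quoted from UTR #36 can be measured directly. The sketch below compares encoded sizes before and after normalization for the sample characters; the function and dictionary names are made up for this example.

```python
import unicodedata

# Little-endian codecs are used so that no BOM is added to the byte counts.
ENCODINGS = {"UTF-8": "utf-8", "UTF-16": "utf-16-le", "UTF-32": "utf-32-le"}

def expansion(ch: str, form: str) -> dict:
    """Ratio of encoded size after normalization to encoded size before."""
    normalized = unicodedata.normalize(form, ch)
    return {
        name: len(normalized.encode(codec)) / len(ch.encode(codec))
        for name, codec in ENCODINGS.items()
    }

samples = [
    ("U+1D160", "\U0001D160", "NFC"),
    ("U+FB2C", "\uFB2C", "NFC"),
    ("U+0390", "\u0390", "NFD"),
    ("U+1F82", "\u1F82", "NFD"),
    ("U+FDFA", "\uFDFA", "NFKC"),
]

for label, ch, form in samples:
    print(label, form, expansion(ch, form))
```

Sizing an output buffer from the input length alone is therefore unsafe. Either allocate for the worst-case factor of the operation being performed, or let the library report the required size and allocate after the fact.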

Root Causes: Controlling Syntax

• White space and line breaks

– E.g. when U+180E acts like U+0020

• Quotation marks

• Impact: Filter evasion, Enable code execution

Attacks and Exploits: Controlling syntax

• Manipulate HTML parsers and JavaScript interpreters

• Control protocols

Case Study: Opera

• Unicode formatter characters exploited for XSS

– Damage: Filter evasion, controlling syntax

– Exploit: Bypass filtering logic with specially crafted characters to leverage cross-site scripting

– Root Cause: Interpreting "white space"

– A problem with HTML 4.0 spec

<a href=[U+180E]onclick=alert()>

MVS U+180E

Root Causes: Guidance for Controlling Syntax

• Question specifications

• Be careful…
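One defensive reading of "be careful" is to allow only the separators a parser is known to agree on and to flag everything else that could act as white space or formatting. The sketch below uses the four white space characters that HTML 4.01 defines plus CR and LF as the allow-list; the category set and the function name are assumptions made for illustration.

```python
import unicodedata

# HTML 4.01 white space: space, tab, form feed, zero-width space.
# CR and LF are added here because the spec treats them as line breaks.
ALLOWED_SEPARATORS = {"\u0020", "\u0009", "\u000C", "\u200B", "\r", "\n"}

# Separator, format, and control categories that a lenient parser might
# treat as white space (an assumption for this sketch).
SUSPECT_CATEGORIES = {"Zs", "Zl", "Zp", "Cf", "Cc"}

def suspicious_separators(text: str) -> list:
    """Return code points that may act as white space but are not on the allow-list."""
    return [
        f"U+{ord(c):04X}"
        for c in text
        if c not in ALLOWED_SEPARATORS
        and unicodedata.category(c) in SUSPECT_CATEGORIES
    ]

markup = "<a href=\u180Eonclick=alert()>"
print(suspicious_separators(markup))
print(suspicious_separators("<a href=x onclick=y>"))
```

U+180E is flagged whichever category the local Unicode data assigns it, since both the older space-separator and the newer format classification are in the suspect set.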

Root Causes: Specifications

1) Character stability

– IDNA/Nameprep based on Unicode 3.2

2) Designs

– Specs are carefully designed but not always perfect

• This could have been a problem

– "When designing a markup language or data protocol, the use of U+FEFF can be restricted to that of Byte Order Mark. In that case, any U+FEFF occurring in the middle of the file can be ignored, or treated as an error."

– HTML 4.01

• Defines four whitespace characters and explicitly leaves handling other characters up to implementer

Root Causes: Charset Transformations

• Converting between charsets is dangerous

• Mapping tables and algorithms vary across platforms

• Impact: Filter evasion, Enable code execution, Data-loss

Root Causes: Guidance for Charset Transformations

• Avoid if possible

• Use Unicode as the broker

• Beware the PUA mappings

• Transform case and normalize prior to validation and redisplay

Root Causes: Charset Mismatches

• Some charset identifiers are ill-defined

• Vendor implementations vary

• User-agents may sniff if confused

• Attackers manipulate behavior

• Impact: Filter evasion, Enable code execution

Content-Type: charset=ISO-8859-1

<meta http-equiv="Content-Type" content="text/html; charset=shift_jis">

Attacker-controlled input

Root Causes: Guidance for Charset Mismatches

• Force UTF-8

• Error if uncertain
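A mismatch between ISO-8859-1 and Shift_JIS, the two charsets shown on the previous slide, can undo server-side escaping. In the sketch below the server escapes quotes while believing the bytes are ISO-8859-1, and the client decodes the same bytes as Shift_JIS. The escaping routine and the payload are invented for this example.

```python
def escape_quotes(data: bytes) -> bytes:
    """Server-side escaping that treats the input as single-byte text."""
    return data.replace(b'"', b'\\"')

# Attacker input: a Shift_JIS lead byte placed directly before a double quote.
attacker = b'\x83"; alert(1)//'

escaped = escape_quotes(attacker)

server_view = escaped.decode("iso-8859-1")   # what the server thinks it sent
client_view = escaped.decode("shift_jis")    # what a Shift_JIS client parses

print(repr(server_view))   # the quote is preceded by a backslash
print(repr(client_view))   # the backslash was absorbed into a two-byte character
```

Under Shift_JIS the bytes 0x83 0x5C form a single Katakana character, so the backslash that was supposed to neutralize the quote no longer exists. Declaring one encoding everywhere, and refusing input that does not decode cleanly in it, removes this ambiguity.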


Tools

• Watcher

– Passive Web-app security testing and auditing

• Unibomber

– XSS autopwn testing tool

Tools: Watcher – Some of the Passive Checks Included

• Unicode transformation hot-spots

• User-controlled HTML

• Cross-domain issues

• Insecure cookies

• Insecure HTTP/HTTPS transitions

• SSL protocol and certificate issues

• XSS hot-spots

• Flash issues

• Silverlight issues

• Information disclosure

Tools

Tools: Watcher - Web-app Security Testing and Auditing

http://websecuritytool.codeplex.com

Tools: Unibomber – runtime XSS testing tool

• Deterministic testing

• Auto-inject payloads

• Unicode transformers

– < > ' " etc.

• Detect transformations and encoding hotspots

Thank you

Casaba Security

www.casabasecurity.com

Chris Weber

Blog: www.lookout.net

Email: chris@casabasecurity.com

LinkedIn: http://www.linkedin.com/in/chrisweber

Unicode Crash CourseEncodings

Black Hat USA - July 2009 copy 2009 Chris Weber

U+FF21 FULLWIDTH LATIN CAPITAL LETTER A

EFBCA1

ampxFF21

amp65313

xEFxBCxA1

uFF21

wwwcasabasecuritycom

Unicode Crash CourseEncodings and Escape sequences

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Unicode crash course

bull Root Causes

bull Attack Vectors

bull Tools

wwwcasabasecuritycom

Unicode TransformationsAgenda

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Unicode crash course

bull Root Causes

bull Attack Vectors

bull Tools

wwwcasabasecuritycom

Unicode TransformationsAgenda

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Unicode crash coursebull Root Causes

ndash Visual Spoofing and IDNrsquosndash Best-fit mappingsndash Normalizationndash Overlong UTF-8ndash Over-consumptionndash Character substitutionndash Character deletionndash Casingndash Buffer overflowsndash Controlling Syntaxndash Charset transformationsndash Charset mismatches

bull Tools

wwwcasabasecuritycom

Unicode TransformationsOverview

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Over 100000 assigned characters

bull Many lookalikes within and across scripts

AΑАᐱᗅᗋᗩᴀᴬA

wwwcasabasecuritycom

Root CausesVisual Spoofing

Black Hat USA - July 2009 copy 2009 Chris Weber

Some browsers allow COM IDNrsquos

based on script family

ndash (Latin has a big family)

wwwcasabasecuritycom

Attack VectorsIDN homograph attacks

Black Hat USA - July 2009 copy 2009 Chris Weber

Safari

wwwcasabasecuritycom

Attack VectorsIDN homograph attacks

Black Hat USA - July 2009 copy 2009 Chris Weber

Opera

wwwcasabasecuritycom

Attack VectorsIDN homograph attacks

Black Hat USA - July 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsIDN homograph attacks

wwwgooglecom is not wwwgooɡlecom

Latin U+0069

LatinU+0261

Black Hat USA - July 2009 copy 2009 Chris Weber

ICANN guidelines v20

ndash Inclusion-based

ndash Script limitations

ndash Character limitations

wwwcasabasecuritycom

Root CausesThe state of International Domain Names

Deny-all default seems to be the right concept

A script can cross many blocks Even with limited script choices therersquos plenty to choose from

Great for domain labels but sub domain labels still open to punctuation and syntax spoofing

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Non-Unicode attacks

bull Confusables

bull Invisibles

bull Problematic font-rendering

bull Manipulating Combining Marks

bull Bidi and syntax spoofing

wwwcasabasecuritycom

Attack VectorsVisual spoofing Vectors

Black Hat USA - July 2009 copy 2009 Chris Weber

rn can look like m in certain fonts

wwwcasabasecuritycom

Attack VectorsNon-Unicode homograph attacks

wwwmulletscom is not wwwrnulletscom

Latin U+006D

LatinU+0073 U+006E

Black Hat USA - July 2009 copy 2009 Chris Weber

Are you using mono-width fonts

0 and O

1 and l

5 and S

wwwcasabasecuritycom

Attack VectorsNon-Unicode homograph attacks

Black Hat USA - July 2009 copy 2009 Chris Weber

Classic long URLrsquos

httploginfacebookintvitationvideomessageid-

h048892r39sessionnfbidcomhomehtmdisbursements

wwwcasabasecuritycom

Attack VectorsNon-Unicode homograph attacks

Black Hat USA - July 2009 copy 2009 Chris Weber

wwwɑpplecom

All Latin using Latin small letter Alpha lsquoɑrsquo

wwwfaϲebookcom

Mixed LatinGreek with lunate sigma symbol lsquocrsquo

wwwаЬсcom

All Cyrillic lsquoabcrsquo

wwwcasabasecuritycom

Attack VectorsSingle-script and The Confusables

Black Hat USA - July 2009 copy 2009 Chris Weber

Browsers whitelist ORG

wwwcasabasecuritycom

Attack VectorsIDN homograph attacks

Black Hat USA - July 2009 copy 2009 Chris Weber

Others donrsquot necessarily buthellip

wwwcasabasecuritycom

Attack VectorsIDN homograph attacks

Black Hat USA - July 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsIDN homograph attacks

wwwmozillaorg is not wwwmoziacutellaorg

Latin U+0069

LatinU+00ED

Black Hat USA - July 2009 copy 2009 Chris Weber

(This case doesnrsquot work anymore)

wwwcasabasecuritycom

Attack VectorsIDN Syntax Spoofing with lookalikes

httpwwwgooglecompathfilenottrustedorg

FULLWIDTH SOLIDUSU+FF0F

Black Hat USA - July 2009 copy 2009 Chris Weber

(Normalized to a U+002F)

wwwcasabasecuritycom

Attack VectorsIDN Syntax Spoofing with lookalikes

httpwwwgooglecompathfilenottrustedorg

SOLIDUSU+002F

Black Hat USA - July 2009 copy 2009 Chris Weber

(However punctuation not requiredhellip)

wwwcasabasecuritycom

Attack VectorsIDN Syntax Spoofing with lookalikes

httpwwwgooglecomノpathノfilenottrustedorg

Katakana NoU+FF89

Black Hat USA - July 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsThe Invisibles

Black Hat USA - July 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsVisual Spoofing with Bidi Explicit Directional Overrides

Black Hat USA - July 2009 copy 2009 Chris Weber

Commonly occur in charset transformations and even innocuous APIrsquos

Impact Filter evasion Enable code execution

When σ becomes s

U+03C3 GREEK SMALL LETTER SIGMA

When prime becomes

U+2032 PRIME

wwwcasabasecuritycom

Root CausesBest-fit mappings

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Scrutinize charactercharset manipulation APIrsquos

bull Use EncoderFallback with SystemTextEncoding

bull Set WC_NO_BEST_FIT_CHARS flag with WideCharToMultiByte()

bull Use Unicode end-to-end

wwwcasabasecuritycom

Root CausesGuidance for Best-Fit mappings

Black Hat USA - July 2009 copy 2009 Chris Weber

bull A popular social networking site in 2008

bull Implemented complex filtering logic to prevent XSS

ndash Attack Filter evasion code execution

ndash Exploit Bypass filtering logic with best-fit mappings to leverage cross-site scripting

ndash Root Cause best-fit mappings

wwwcasabasecuritycom

Case Study Social NetworkingBest-fit mappings

Black Hat USA - July 2009 copy 2009 Chris Weber

-moz-binding()

was not allowed buthellip

-[U+ff4d]oz-binding()

would best-fit map

wwwcasabasecuritycom

Case Study Social NetworkingBest-fit mappings

Black Hat USA - July 2009 copy 2009 Chris Weber

Normalizing strings after validation is dangerous

Impact Filter evasion Enable code execution

wwwcasabasecuritycom

Root CausesNormalization

Black Hat USA - July 2009 copy 2009 Chris Weber

İ becomes I +

wwwcasabasecuritycom

Root CausesNormalization

U+0130 U+0049 U+0307

Black Hat USA - July 2009 copy 2009 Chris Weber

But are there dangerous characters

You bethellip with NFKC and NFKD you could control HTML or other parsing

﹤ becomes lt

wwwcasabasecuritycom

Root CausesNormalization

U+FE64 U+003C

Black Hat USA - July 2009 copy 2009 Chris Weber

﹤ becomes lt

toNFKC(ldquo﹤scriptgtrdquo) = ldquoltscriptgtrdquo

wwwcasabasecuritycom

Root CausesNormalization

U+FE64 U+003C

Black Hat USA - July 2009 copy 2009 Chris Weber

Normalize strings before validation

NFKC first defense against Visual spoofing

wwwcasabasecuritycom

Root CausesGuidance for Normalization

Black Hat USA - July 2009 copy 2009 Chris Weber

Non-shortest or overlong UTF-8

Impact Filter evasion Enable code execution

Application gets C0A7

OSFramework sees 27

Database gets

wwwcasabasecuritycom

Root CausesNon-shortest form UTF-8

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Unicode specification forbids

ndash Generation of non-shortest form

ndash Interpretation of non-shortest form for BMP

bull Validate UTF-8 encoding (throw on error)

wwwcasabasecuritycom

Root CausesGuidance for Non-shortest form UTF-8

Black Hat USA - July 2009 copy 2009 Chris Weber

How many ways can you say

wwwcasabasecuritycom

Attack VectorsDirectory traversal

Black Hat USA - July 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack Vectors

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Unassigned code points

ndash U+2073

bull Illegal code points

ndash Half a surrogate pair

bull Code points with special meaning

ndash U+FEFF is the BOM

bull Impact Filter evasion Enable code execution

wwwcasabasecuritycom

Root CausesHandling the Unexpected

Black Hat USA - July 2009 copy 2009 Chris Weber

Over-consuming ill-formed byte sequences

Big problem with MBCS lead bytes

lt41 C2 3E 41gt becomes

lt41 41gt

wwwcasabasecuritycom

Root CausesHandling the Unexpected Over-consumption

Black Hat USA - July 2009 copy 2009 Chris Weber

ltimg src=[0xC2]gt onerror=alert(1)ltbr gt

becomes

ltimg src=gt onerror=alert(1)ltbr gt

wwwcasabasecuritycom

Root CausesHandling the Unexpected Over-consumption

Black Hat USA - July 2009 copy 2009 Chris Weber

Correcting insecurely rather than failing

ndash Substituting a lsquorsquo or a lsquorsquo would be bad

wwwcasabasecuritycom

Root CausesHandling the Unexpected Character-substitution

Black Hat USA - July 2009 copy 2009 Chris Weber

ldquodeletion of noncharactersrdquo (UTR-36)

wwwcasabasecuritycom

Root CausesHandling the Unexpected Character-deletion

Black Hat USA - July 2009 copy 2009 Chris Weber

ltscr[U+FEFF]iptgt becomes ltscriptgt

wwwcasabasecuritycom

Root CausesHandling the Unexpected Character-deletion

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Fail or error

bull Use U+FFFD instead

ndash A common alternative is lsquorsquo which can be safe

wwwcasabasecuritycom

Root CausesSolutions for Handling the Unexpected

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Bypass filters WAFrsquos NIDS and validation

bull Exploit delivery techniques

ndash Eg Cross-site scripting (buffer overflow of the Web)

wwwcasabasecuritycom

Attack VectorsFilter evasion

Black Hat USA - July 2009 copy 2009 Chris Weber

Safari and Firefox BOM consumptionndash Attack Filter evasion code execution

ndash Exploit Bypass filtering logic with specially crafted strings to leverage cross-site scripting

ndash Root Cause Character deletion

lta href=ldquojava[U+FEFF]scriptalert(bdquoXSS‟)gt

Can be nastier

lta h[U+FEFF]ref=ldquojava[U+FEFF]scriptal[U+FEFF]ert(bdquoXSS‟)gt

wwwcasabasecuritycom

Case Study Apple and Mozilla

Black Hat USA - July 2009 copy 2009 Chris Weberwwwcasabasecuritycom

A Closer Look The BOM

BOMU+FEFF

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Attackers manipulate casing operations to inject otherwise prohibited characters

bull Casing can multiply the buffer sizes needed

bull Impact Filter evasion Enable code execution

wwwcasabasecuritycom

Root CausesCasing

Black Hat USA - July 2009 copy 2009 Chris Weber

toLower(ldquoİrdquo) == ldquoirdquo

toLower(ldquoscrİptrdquo) == ldquoscriptrdquo

wwwcasabasecuritycom

Root CausesCasing

Black Hat USA - July 2009 copy 2009 Chris Weber

len(x) = len(toLower(x))

wwwcasabasecuritycom

Root CausesCasing

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Perform casing operations before validation

bull Leverage existing frameworks and APIrsquos

ndash ICU Net

wwwcasabasecuritycom

Root CausesGuidance for Casing

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Incorrect assumptions about string sizes (chars vs bytes)

bull Improper width calculations

bull Impact Enable code execution

wwwcasabasecuritycom

Root CausesBuffer Overflows

Black Hat USA - July 2009 copy 2009 Chris Weber

Casing - maximum expansion factors

wwwcasabasecuritycom

Root CausesBuffer Overflows

Operation UTF Factor Sample

Lower 8 15 Ⱥ U+023A

16 32 1 A U+0041

Upper 8 16 32 3 ΐ U+0390Source Unicode Technical Report 36

Black Hat USA - July 2009 copy 2009 Chris Weber

Normalization- maximum expansion factors

wwwcasabasecuritycom

Root CausesBuffer Overflows

Operation UTF Factor Sample

NFC8 3X 119136 U+1D160

16 32 3X ש U+FB2C

NFD8 3X ΐ U+0390

16 32 4X ᾂ U+1F82

NFKCNFKD8 11X

ملسو هيلع هللا ىلص U+FDFA16 32 18X

Source Unicode Technical Report 36

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Know the difference between bytes and chars

bull Secure coding

bull Leverage existing frameworks and APIrsquos

ndash ICU Net

wwwcasabasecuritycom

Root CausesGuidance for Buffer Overflows

Black Hat USA - July 2009 copy 2009 Chris Weber

bull White space and line breaks

ndash Eg when U+180E acts like U+0020

bull Quotation marks

bull Impact Filter evasion Enable code execution

wwwcasabasecuritycom

Root CausesControlling Syntax

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Manipulate HTML parsers and javascriptinterpreters

bull Control protocols

wwwcasabasecuritycom

Attacks and ExploitsControlling syntax

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Unicode formatter characters exploited for XSS

ndash Damage Filter evasion controlling syntax

ndash Exploit Bypass filtering logic with specially crafted characters to leverage cross-site scripting

ndash Root Cause Interpreting ldquowhite spacerdquo



Black Hat USA - July 2009 © 2009 Chris Weber

• Unicode crash course
• Root Causes
• Attack Vectors
• Tools
  - Find Unicode issues in Web-testing
  - Visual Spoofing Detection

www.casabasecurity.com

Unicode Transformations: Agenda


• End users
• Applications
• Databases
• Programming languages
• Operating Systems

Unicode Crash Course: The Unicode Attack Surface

Unicode Crash Course: Unthink it

• A large and complex standard

code points, encodings, categorization, normalization, binary properties, case mapping, conversion tables, bi-directional properties, canonical mappings, decomposition types, case folding, best-fit mapping, 17 planes, private use ranges, script blocks, escapings

Unicode Crash Course

Shift_JIS
GB2312
ISCII
Windows-1252
ISO-8859-1
EBCDIC 037

Unicode Crash Course: Code pages and charsets

• Unicode can represent them all
• ASCII range is preserved
  - U+0000 to U+007F are mapped to ASCII

Unicode Crash Course: Ad Infinitum

• Unicode 5.1 uses a 21-bit scalar value with space for over 1,100,000 code points

U+0000 to U+10FFFF

Unicode Crash Course: Code points
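The arithmetic behind the 21-bit claim can be checked with a few lines of Python. This is only a sketch; the constant and helper names are illustrative and not part of the talk.

```python
# Sketch: the size of the Unicode code space, derived from its upper bound.
MAX_CODE_POINT = 0x10FFFF
PLANE_SIZE = 0x10000  # 65,536 code points per plane

code_space_size = MAX_CODE_POINT + 1         # total code points, U+0000..U+10FFFF
plane_count = code_space_size // PLANE_SIZE  # number of planes
bits_needed = MAX_CODE_POINT.bit_length()    # bits required to hold any code point


def describe(ch: str) -> str:
    """Format one character as 'U+XXXX (plane N)'."""
    cp = ord(ch)
    return f"U+{cp:04X} (plane {cp // PLANE_SIZE})"


if __name__ == "__main__":
    print(code_space_size, plane_count, bits_needed)
    print(describe("A"))
    print(describe("\U0001D160"))
```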

A = U+0041

Every character has a unique number

Unicode Crash Course: Code Points

Unicode Crash Course

A
U+0041

Unicode Crash Course

ſ
U+017F

UTF-8
- Variable width, 1 to 4 bytes (used to be 6)

UTF-16
- Endianness
- Variable width, 2 or 4 bytes
- Surrogate pairs

UTF-32
- Endianness
- Fixed width, 4 bytes
- Fixed mapping, no algorithms needed

Unicode Crash Course: Encodings
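A short sketch of those widths using Python's built-in codecs. The sample characters are taken from elsewhere in the deck, and the helper name `widths` is an assumption made for illustration.

```python
def widths(ch: str) -> dict:
    """Return the encoded size in bytes of one character in each UTF."""
    return {
        "utf-8": len(ch.encode("utf-8")),
        "utf-16": len(ch.encode("utf-16-be")),  # explicit byte order, no BOM
        "utf-32": len(ch.encode("utf-32-be")),
    }


# Endianness changes the byte order, not the code point.
little = "A".encode("utf-16-le")
big = "A".encode("utf-16-be")

# A code point above U+FFFF becomes a surrogate pair in UTF-16.
surrogate_pair_hex = "\U0001D160".encode("utf-16-be").hex()

if __name__ == "__main__":
    for ch in ("A", "\u017F", "\uFF21", "\U0001D160"):
        print(f"U+{ord(ch):04X}", widths(ch))
    print(little, big, surrogate_pair_hex)
```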

U+FF21 FULLWIDTH LATIN CAPITAL LETTER A

%EF%BC%A1
&#xFF21;
&#65313;
\xEF\xBC\xA1
\uFF21

Unicode Crash Course: Encodings and Escape sequences
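The same character can reach an application in any of these spellings. A minimal sketch that generates them with the Python standard library follows; the dictionary keys are illustrative labels only.

```python
import html
from urllib.parse import quote, unquote

ch = "\uFF21"  # FULLWIDTH LATIN CAPITAL LETTER A

forms = {
    "utf8_hex": ch.encode("utf-8").hex().upper(),
    "percent": quote(ch),                     # URL percent-encoding of the UTF-8 bytes
    "html_hex": f"&#x{ord(ch):X};",           # hexadecimal character reference
    "html_dec": f"&#{ord(ch)};",              # decimal character reference
    "escape": ch.encode("unicode_escape").decode("ascii"),
}

if __name__ == "__main__":
    for label, value in forms.items():
        print(f"{label:10} {value}")
    # Every spelling decodes back to the same single code point.
    print(unquote(forms["percent"]) == html.unescape(forms["html_hex"]) == ch)
```

A filter that inspects only one of these spellings can be bypassed by another, which is why decoding has to happen before validation.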

• Unicode crash course
• Root Causes
• Attack Vectors
• Tools

Unicode Transformations: Agenda


• Unicode crash course
• Root Causes
  - Visual Spoofing and IDNs
  - Best-fit mappings
  - Normalization
  - Overlong UTF-8
  - Over-consumption
  - Character substitution
  - Character deletion
  - Casing
  - Buffer overflows
  - Controlling Syntax
  - Charset transformations
  - Charset mismatches
• Tools

Unicode Transformations: Overview

• Over 100,000 assigned characters
• Many lookalikes within and across scripts

AΑАᐱᗅᗋᗩᴀᴬA

Root Causes: Visual Spoofing

Some browsers allow .COM IDNs based on script family
- (Latin has a big family)

Attack Vectors: IDN homograph attacks

Safari

Attack Vectors: IDN homograph attacks

Opera

Attack Vectors: IDN homograph attacks

www.google.com is not www.gooɡle.com

Latin U+0067
Latin U+0261

Attack Vectors: IDN homograph attacks

ICANN guidelines v2.0
- Inclusion-based
- Script limitations
- Character limitations

Root Causes: The state of International Domain Names

Deny-all default seems to be the right concept.

A script can cross many blocks. Even with limited script choices there's plenty to choose from.

Great for domain labels, but sub-domain labels are still open to punctuation and syntax spoofing.

• Non-Unicode attacks
• Confusables
• Invisibles
• Problematic font-rendering
• Manipulating Combining Marks
• Bidi and syntax spoofing

Attack Vectors: Visual spoofing Vectors

rn can look like m in certain fonts

www.mullets.com is not www.rnullets.com

Latin U+006D
Latin U+0072 U+006E

Attack Vectors: Non-Unicode homograph attacks

Are you using mono-width fonts?

0 and O
1 and l
5 and S

Attack Vectors: Non-Unicode homograph attacks

Classic long URLs

httploginfacebookintvitationvideomessageid-
h048892r39sessionnfbidcomhomehtmdisbursements

Attack Vectors: Non-Unicode homograph attacks

www.ɑpple.com
All Latin, using Latin small letter alpha 'ɑ'

www.faϲebook.com
Mixed Latin/Greek, with lunate sigma symbol 'ϲ'

www.аЬс.com
All Cyrillic 'abc'

Attack Vectors: Single-script and The Confusables
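The three labels above show why a mixed-script check helps but is not sufficient. The sketch below approximates a character's script by the first word of its Unicode name. That is a rough heuristic chosen for brevity, not the Unicode Script property, and the function names are hypothetical.

```python
import unicodedata


def scripts(label: str) -> set:
    """Approximate the scripts used in a label from Unicode character names."""
    return {unicodedata.name(ch).split()[0] for ch in label if ch.isalpha()}


def is_mixed_script(label: str) -> bool:
    """True when letters from more than one (approximate) script appear."""
    return len(scripts(label)) > 1


latin_alpha = "\u0251pple"          # all Latin, but 'ɑ' is not 'a'
lunate_sigma = "fa\u03F2ebook"      # Latin plus GREEK LUNATE SIGMA SYMBOL
cyrillic_abc = "\u0430\u042C\u0441" # all Cyrillic lookalikes of 'abc'

if __name__ == "__main__":
    for label in (latin_alpha, lunate_sigma, cyrillic_abc):
        print(label, scripts(label), is_mixed_script(label))
```

Only the mixed Latin/Greek label is flagged. The single-script cases pass, which matches the point of the slide: single-script confusables need a separate confusables check.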

Browsers whitelist .ORG

Attack Vectors: IDN homograph attacks

Others don't necessarily, but…

Attack Vectors: IDN homograph attacks

www.mozilla.org is not www.mozílla.org

Latin U+0069
Latin U+00ED

Attack Vectors: IDN homograph attacks

(This case doesn't work anymore)

http://www.google.com／path／file.nottrusted.org

FULLWIDTH SOLIDUS
U+FF0F

Attack Vectors: IDN Syntax Spoofing with lookalikes

(Normalized to a U+002F)

http://www.google.com/path/file.nottrusted.org

SOLIDUS
U+002F

Attack Vectors: IDN Syntax Spoofing with lookalikes

(However, punctuation not required…)

http://www.google.comノpathノfile.nottrusted.org

Katakana No
U+FF89

Attack Vectors: IDN Syntax Spoofing with lookalikes

Attack Vectors: The Invisibles

Attack Vectors: Visual Spoofing with Bidi Explicit Directional Overrides

Commonly occur in charset transformations and even innocuous APIs

Impact: Filter evasion, Enable code execution

When σ becomes s
U+03C3 GREEK SMALL LETTER SIGMA

When ′ becomes '
U+2032 PRIME

Root Causes: Best-fit mappings

• Scrutinize character/charset manipulation APIs
• Use EncoderFallback with System.Text.Encoding
• Set WC_NO_BEST_FIT_CHARS flag with WideCharToMultiByte()
• Use Unicode end-to-end

Root Causes: Guidance for Best-Fit mappings

• A popular social networking site in 2008
• Implemented complex filtering logic to prevent XSS
  - Attack: Filter evasion, code execution
  - Exploit: Bypass filtering logic with best-fit mappings to leverage cross-site scripting
  - Root Cause: best-fit mappings

Case Study: Social Networking - Best-fit mappings

-moz-binding()
was not allowed, but…
-[U+ff4d]oz-binding()
would best-fit map

Case Study: Social Networking - Best-fit mappings
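The bypass can be reproduced in miniature. Python's codecs do not perform best-fit mapping themselves, so the table below is a hypothetical stand-in for a platform's best-fit behavior (fullwidth ASCII variants mapped to their ASCII lookalikes). The filter and function names are also assumptions.

```python
# Hypothetical best-fit table: fullwidth forms U+FF01..U+FF5E -> ASCII U+0021..U+007E.
BEST_FIT = {cp: cp - 0xFEE0 for cp in range(0xFF01, 0xFF5F)}


def is_blocked(css: str) -> bool:
    """Naive filter that rejects the dangerous CSS property by name."""
    return "-moz-binding" in css.lower()


def best_fit_to_ascii(text: str) -> str:
    """Simulate a lossy conversion that silently substitutes lookalikes."""
    return text.translate(BEST_FIT)


def strict_to_cp1252(text: str) -> bytes:
    """Strict conversion: raises UnicodeEncodeError instead of guessing."""
    return text.encode("cp1252")


payload = "-\uFF4Doz-binding()"           # fullwidth 'm' hides the keyword
passed_filter = not is_blocked(payload)   # the filter sees nothing wrong
converted = best_fit_to_ascii(payload)    # a later conversion restores the keyword

if __name__ == "__main__":
    print(passed_filter, converted)
    try:
        strict_to_cp1252(payload)
    except UnicodeEncodeError as exc:
        print("strict conversion refused:", exc.reason)
```

The strict path plays the same role as an exception fallback or the WC_NO_BEST_FIT_CHARS flag mentioned in the guidance: unmappable input becomes an error rather than a different character.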

Normalizing strings after validation is dangerous

Impact: Filter evasion, Enable code execution

Root Causes: Normalization

İ becomes I + U+0307 (combining dot above)
U+0130 becomes U+0049 U+0307

Root Causes: Normalization

But are there dangerous characters?

You bet… with NFKC and NFKD you could control HTML or other parsing

﹤ becomes <
U+FE64 becomes U+003C

toNFKC("﹤script>") = "<script>"

Root Causes: Normalization
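Both transformations can be observed with Python's `unicodedata` module. The sketch contrasts validating before normalization (unsafe) with normalizing first. The filter is deliberately simplistic and its name is illustrative.

```python
import unicodedata


def looks_safe(s: str) -> bool:
    """Naive validation: reject anything containing an opening script tag."""
    return "<script" not in s.lower()


payload = "\uFE64script>"  # SMALL LESS-THAN SIGN followed by 'script>'

# Unsafe order: validate, then normalize.
validated_first = looks_safe(payload)
after_normalization = unicodedata.normalize("NFKC", payload)

# Safer order: normalize, then validate the normalized form.
normalized = unicodedata.normalize("NFKC", payload)
validated_after = looks_safe(normalized)

# Canonical decomposition of U+0130, as on the earlier slide.
decomposed_i = unicodedata.normalize("NFD", "\u0130")

if __name__ == "__main__":
    print(validated_first, repr(after_normalization))
    print(validated_after)
    print([f"U+{ord(c):04X}" for c in decomposed_i])
```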

Normalize strings before validation

NFKC: first defense against visual spoofing

Root Causes: Guidance for Normalization

Non-shortest or overlong UTF-8

Impact: Filter evasion, Enable code execution

Application gets C0 A7
OS/Framework sees 27
Database gets '

Root Causes: Non-shortest form UTF-8
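Why C0 A7 turns into 27 follows from the 2-byte UTF-8 bit layout. The decoder below is a deliberately broken sketch that skips the shortest-form check, shown next to Python's strict decoder. The function name is hypothetical.

```python
def naive_decode_2byte(data: bytes) -> str:
    """Decode a 2-byte UTF-8 sequence WITHOUT rejecting overlong forms (the bug)."""
    lead, trail = data
    # 110xxxxx 10yyyyyy -> xxxxxyyyyyy, with no check that the result needs 2 bytes.
    return chr(((lead & 0x1F) << 6) | (trail & 0x3F))


overlong_apostrophe = bytes([0xC0, 0xA7])
naive_result = naive_decode_2byte(overlong_apostrophe)

# A conforming decoder refuses the sequence outright.
try:
    overlong_apostrophe.decode("utf-8")
    strict_rejected = False
except UnicodeDecodeError:
    strict_rejected = True

if __name__ == "__main__":
    print(repr(naive_result), strict_rejected)
```

A filter that looks for the byte 0x27 never sees it in the input, while a lenient component further down the stack produces an apostrophe from it.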

• Unicode specification forbids
  - Generation of non-shortest form
  - Interpretation of non-shortest form for BMP
• Validate UTF-8 encoding (throw on error)

Root Causes: Guidance for Non-shortest form UTF-8

How many ways can you say

Attack Vectors: Directory traversal

Attack Vectors

• Unassigned code points
  - U+2073
• Illegal code points
  - Half a surrogate pair
• Code points with special meaning
  - U+FEFF is the BOM
• Impact: Filter evasion, Enable code execution

Root Causes: Handling the Unexpected
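A sketch of how these three kinds of unexpected input can be recognized with the Python standard library. The `classify` helper and its return labels are assumptions for illustration. Note that general category `Cn` covers noncharacters as well as unassigned code points.

```python
import unicodedata


def classify(ch: str) -> str:
    """Label a single code point as surrogate, unassigned, bom, or ok."""
    cp = ord(ch)
    if 0xD800 <= cp <= 0xDFFF:
        return "surrogate"      # half of a surrogate pair, illegal on its own
    if cp == 0xFEFF:
        return "bom"            # special meaning: byte order mark
    if unicodedata.category(ch) == "Cn":
        return "unassigned"     # not assigned in this Python's Unicode data
    return "ok"


# A lone surrogate cannot be encoded as well-formed UTF-8.
try:
    "\ud800".encode("utf-8")
    lone_surrogate_rejected = False
except UnicodeEncodeError:
    lone_surrogate_rejected = True

if __name__ == "__main__":
    for ch in ("A", "\u2073", "\ud800", "\uFEFF"):
        print(f"U+{ord(ch):04X}", classify(ch))
    print(lone_surrogate_rejected)
```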

Over-consuming ill-formed byte sequences

Big problem with MBCS lead bytes

<41 C2 3E 41> becomes <41 41>

Root Causes: Handling the Unexpected - Over-consumption

<img src=[0xC2]> onerror=alert(1)<br />

becomes

<img src=> onerror=alert(1)<br />

Root Causes: Handling the Unexpected - Over-consumption
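The byte example above can be replayed with a deliberately buggy decoder. This is a sketch of the flaw, not any particular product's code. It handles only ASCII and 2-byte sequences, and the function name is hypothetical.

```python
def over_consuming_decode(data: bytes) -> str:
    """Buggy decoder: a 2-byte lead always swallows the following byte."""
    out, i = [], 0
    while i < len(data):
        b = data[i]
        if b < 0x80:
            out.append(chr(b))
            i += 1
        elif 0xC2 <= b <= 0xDF and i + 1 < len(data):
            trail = data[i + 1]
            if 0x80 <= trail <= 0xBF:
                out.append(chr(((b & 0x1F) << 6) | (trail & 0x3F)))
            # BUG: the next byte is consumed even when it is not a continuation byte.
            i += 2
        else:
            i += 1  # drop any other stray byte
    return "".join(out)


raw = bytes([0x41, 0xC2, 0x3E, 0x41])          # 'A', lone lead byte, '>', 'A'
buggy = over_consuming_decode(raw)              # the '>' disappears
careful = raw.decode("utf-8", errors="replace") # only the bad byte is replaced

if __name__ == "__main__":
    print(repr(buggy), repr(careful))
```

Python's own decoder replaces just the ill-formed lead byte with U+FFFD and keeps the `>`, so the markup delimiter survives.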


Correcting insecurely rather than failing

ndash Substituting a lsquorsquo or a lsquorsquo would be bad

Root Causes: Handling the Unexpected - Character-substitution

"deletion of noncharacters" (UTR-36)

Root Causes: Handling the Unexpected - Character-deletion

<scr[U+FEFF]ipt> becomes <script>

Root Causes: Handling the Unexpected - Character-deletion

• Fail or error
• Use U+FFFD instead

ndash A common alternative is lsquorsquo which can be safe

Root Causes: Solutions for Handling the Unexpected
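The difference between deleting and replacing an unexpected character is easy to demonstrate. In this sketch the filter and helper names are illustrative. U+FEFF stands in for any code point a component might silently drop.

```python
BOM = "\uFEFF"
REPLACEMENT = "\uFFFD"


def is_blocked(s: str) -> bool:
    """Naive filter that looks for an opening script tag."""
    return "<script" in s.lower()


def delete_unexpected(s: str) -> str:
    """Risky: silently removes the character, joining the text around it."""
    return s.replace(BOM, "")


def replace_unexpected(s: str) -> str:
    """Safer: substitutes U+FFFD so the surrounding text stays separated."""
    return s.replace(BOM, REPLACEMENT)


payload = "<scr\uFEFFipt>"
deleted = delete_unexpected(payload)
replaced = replace_unexpected(payload)

if __name__ == "__main__":
    print(is_blocked(payload), repr(deleted), is_blocked(deleted))
    print(repr(replaced), is_blocked(replaced))
```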

• Bypass filters, WAFs, NIDS, and validation
• Exploit delivery techniques
  - E.g. Cross-site scripting (buffer overflow of the Web)

Attack Vectors: Filter evasion

Safari and Firefox BOM consumption
- Attack: Filter evasion, code execution
- Exploit: Bypass filtering logic with specially crafted strings to leverage cross-site scripting
- Root Cause: Character deletion

<a href="java[U+FEFF]script:alert('XSS')>

Can be nastier

<a h[U+FEFF]ref="java[U+FEFF]script:al[U+FEFF]ert('XSS')>

Case Study: Apple and Mozilla

A Closer Look: The BOM

BOM
U+FEFF

• Attackers manipulate casing operations to inject otherwise prohibited characters
• Casing can multiply the buffer sizes needed
• Impact: Filter evasion, Enable code execution

Root Causes: Casing

toLower("İ") == "i"
toLower("scrİpt") == "script"

Root Causes: Casing

len(x) != len(toLower(x))

Root Causes: Casing
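Casing results differ between frameworks and locales, so the exact output on the slide is not universal. The sketch below shows what Python's locale-independent full case mapping does with two characters from this deck. The filter name is hypothetical.

```python
def is_blocked(s: str) -> bool:
    """Naive filter: lowercases, then looks for the keyword."""
    return "script" in s.lower()


dotted_capital_i = "\u0130"           # İ
lowered = dotted_capital_i.lower()    # Python yields 'i' + U+0307, two code points

long_s_payload = "\u017Fcript"        # ſcript, LATIN SMALL LETTER LONG S
missed_by_filter = not is_blocked(long_s_payload)
after_uppercase = long_s_payload.upper()  # a later casing step produces plain ASCII

# Case folding, rather than lowercasing, exposes the lookalike before validation.
caught_by_casefold = "script" in long_s_payload.casefold()

if __name__ == "__main__":
    print(len(dotted_capital_i), len(lowered))
    print(missed_by_filter, after_uppercase, caught_by_casefold)
```

The first pair of lengths illustrates that a casing operation can change the string length. The second part illustrates why casing has to happen before validation, using the same transformation the application applies later.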

• Perform casing operations before validation
• Leverage existing frameworks and APIs
  - ICU, .NET

Root Causes: Guidance for Casing

• Incorrect assumptions about string sizes (chars vs. bytes)
• Improper width calculations
• Impact: Enable code execution

Root Causes: Buffer Overflows

Casing - maximum expansion factors

Root Causes: Buffer Overflows

Operation   UTF          Factor   Sample
Lower       8            1.5      Ⱥ U+023A
Lower       16, 32       1        A U+0041
Upper       8, 16, 32    3        ΐ U+0390

Source: Unicode Technical Report 36
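The casing factors in the table can be reproduced directly. This is a small sketch assuming Python's built-in full case mappings. The helper name `utf8_factor` is illustrative.

```python
def utf8_factor(before: str, after: str) -> float:
    """Ratio of UTF-8 byte lengths after and before a transformation."""
    return len(after.encode("utf-8")) / len(before.encode("utf-8"))


lower_sample = "\u023A"   # Ⱥ, lowercases to U+2C65
upper_sample = "\u0390"   # ΐ, uppercases to three code points

lower_factor = utf8_factor(lower_sample, lower_sample.lower())
upper_factor = utf8_factor(upper_sample, upper_sample.upper())
upper_code_points = len(upper_sample.upper())

if __name__ == "__main__":
    print(lower_factor, upper_factor, upper_code_points)
```

A buffer sized for the input string is therefore not guaranteed to hold the cased output.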

Normalization - maximum expansion factors

Root Causes: Buffer Overflows

Operation    UTF       Factor   Sample
NFC          8         3X       𝅘𝅥𝅮 U+1D160
NFC          16, 32    3X       שּׁ U+FB2C
NFD          8         3X       ΐ U+0390
NFD          16, 32    4X       ᾂ U+1F82
NFKC/NFKD    8         11X      ﷺ U+FDFA
NFKC/NFKD    16, 32    18X      ﷺ U+FDFA

Source: Unicode Technical Report 36
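The normalization factors can be checked the same way with `unicodedata.normalize`. The sketch assumes the sample code points listed in UTR 36. The helper name `expansion` is an assumption.

```python
import unicodedata


def expansion(ch: str, form: str, encoding: str) -> float:
    """Encoded size after normalization divided by encoded size before."""
    before = len(ch.encode(encoding))
    after = len(unicodedata.normalize(form, ch).encode(encoding))
    return after / before


factors = {
    ("NFC", "utf-8"): expansion("\U0001D160", "NFC", "utf-8"),
    ("NFC", "utf-16"): expansion("\uFB2C", "NFC", "utf-16-be"),
    ("NFD", "utf-8"): expansion("\u0390", "NFD", "utf-8"),
    ("NFD", "utf-16"): expansion("\u1F82", "NFD", "utf-16-be"),
    ("NFKC", "utf-8"): expansion("\uFDFA", "NFKC", "utf-8"),
    ("NFKC", "utf-16"): expansion("\uFDFA", "NFKC", "utf-16-be"),
}

if __name__ == "__main__":
    for key, value in factors.items():
        print(key, value)
```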

• Know the difference between bytes and chars
• Secure coding
• Leverage existing frameworks and APIs
  - ICU, .NET

Root Causes: Guidance for Buffer Overflows

• White space and line breaks
  - E.g. when U+180E acts like U+0020
• Quotation marks
• Impact: Filter evasion, Enable code execution

Root Causes: Controlling Syntax

• Manipulate HTML parsers and JavaScript interpreters
• Control protocols

Attacks and Exploits: Controlling syntax

• Unicode formatter characters exploited for XSS
  - Damage: Filter evasion, controlling syntax
  - Exploit: Bypass filtering logic with specially crafted characters to leverage cross-site scripting
  - Root Cause: Interpreting "white space"
  - A problem with HTML 4.0 spec

Case Study: Opera

<a href=[U+180E]onclick=alert()>

Case Study: Opera

MVS (Mongolian Vowel Separator)
U+180E

Case Study: Opera

• Question specifications
• Be careful…

Root Causes: Guidance for Controlling Syntax
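One way to "be careful" is to flag any separator or format character that a parser might treat as white space but that HTML 4.01 does not list as such. The sketch below uses the four white space characters HTML 4.01 defines. The category set and function name are assumptions chosen for illustration.

```python
import unicodedata

# The four white space characters HTML 4.01 defines.
HTML4_WHITESPACE = {"\u0020", "\u0009", "\u000C", "\u200B"}

# Space separators, line/paragraph separators, and format characters.
SEPARATOR_LIKE = {"Zs", "Zl", "Zp", "Cf"}


def suspicious_separators(markup: str) -> list:
    """List code points that could act as white space but are not in the spec's set."""
    return [
        f"U+{ord(ch):04X}"
        for ch in markup
        if unicodedata.category(ch) in SEPARATOR_LIKE and ch not in HTML4_WHITESPACE
    ]


opera_payload = "<a href=\u180Eonclick=alert()>"
flagged = suspicious_separators(opera_payload)
clean = suspicious_separators("<a href=x onclick=alert()>")

if __name__ == "__main__":
    print(flagged, clean)
```

U+180E has been classified as either a space separator or a format character depending on the Unicode version, so the check covers both categories.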

1) Character stability
- IDNA/Nameprep based on Unicode 3.2

2) Designs
- Specs are carefully designed but not always perfect
  • This could have been a problem
    - "When designing a markup language or data protocol, the use of U+FEFF can be restricted to that of Byte Order Mark. In that case, any U+FEFF occurring in the middle of the file can be ignored, or treated as an error."
- HTML 4.01
  • Defines four whitespace characters and explicitly leaves handling other characters up to implementer

Root Causes: Specifications

• Converting between charsets is dangerous
• Mapping tables and algorithms vary across platforms
• Impact: Filter evasion, Enable code execution, Data-loss

Root Causes: Charset Transformations

• Avoid if possible
• Use Unicode as the broker
• Beware the PUA mappings
• Transform case and normalize prior to validation and redisplay

Root Causes: Guidance for Charset Transformations

• Some charset identifiers are ill-defined
• Vendor implementations vary
• User-agents may sniff if confused
• Attackers manipulate behavior
• Impact: Filter evasion, Enable code execution

Root Causes: Charset Mismatches

Content-Type: charset=ISO-8859-1

<meta http-equiv="Content-Type" content="text/html; charset=shift_jis">

Attacker-controlled input

Root Causes: Charset Mismatches

• Force UTF-8
• Error if uncertain

Root Causes: Guidance for Charset Mismatches
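A two-byte sketch of why a mismatch matters: the same bytes contain a backslash under one declared charset and none under the other. The byte pair is a well-known Shift_JIS example chosen for illustration. The `decode_or_fail` helper and the header constant are assumptions, not part of the talk.

```python
raw = bytes([0x83, 0x5C])

as_shift_jis = raw.decode("shift_jis")   # one katakana character, no backslash
as_latin1 = raw.decode("iso-8859-1")     # a control character followed by a backslash

# "Force UTF-8": declare one charset explicitly everywhere it can be declared.
FORCED_HEADER = "Content-Type: text/html; charset=utf-8"


def decode_or_fail(data: bytes) -> str:
    """'Error if uncertain': strict UTF-8 decoding, no sniffing and no fallback."""
    return data.decode("utf-8")  # raises UnicodeDecodeError on ill-formed input


if __name__ == "__main__":
    print(repr(as_shift_jis), repr(as_latin1))
    try:
        decode_or_fail(raw)
    except UnicodeDecodeError:
        print("rejected: not valid UTF-8")
```

A filter that escapes or inspects the text under one interpretation can be defeated when the user agent settles on the other.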

• Unicode crash course
• Root Causes
• Attack Vectors
• Tools

Unicode Transformations: Agenda


• Watcher
  - Passive Web-app security testing and auditing
• Unibomber
  - XSS autopwn testing tool

Tools

• Unicode transformation hot-spots
• User-controlled HTML
• Cross-domain issues
• Insecure cookies
• Insecure HTTP/HTTPS transitions
• SSL protocol and certificate issues
• XSS hot-spots
• Flash issues
• Silverlight issues
• Information disclosure

Tools: Watcher - Some of the Passive Checks Included

Tools

http://websecuritytool.codeplex.com

Tools: Watcher - Web-app Security Testing and Auditing

• Deterministic testing
• Auto-inject payloads
• Unicode transformers
  - < > ‘ “ etc.
• Detect transformations and encoding hotspots

Tools: Unibomber - runtime XSS testing tool

Thank you

Casaba Security
www.casabasecurity.com

Chris Weber
Blog: www.lookout.net
Email: chris@casabasecurity.com
LinkedIn: http://www.linkedin.com/in/chrisweber

Page 7: Unraveling Unicode: A Bag of Tricks for Bug Hunting...Unicode Crash Course code points encodings categorization normalization binary properties case mapping conversion tables bi-directional

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Unicode crash course

bull Root Causes

bull Attack Vectors

bull Tools

wwwcasabasecuritycom

Unicode TransformationsAgenda

Black Hat USA - July 2009 copy 2009 Chris Weber

bull End users

bull Applications

bull Databases

bull Programming languages

bull Operating Systems

wwwcasabasecuritycom

Unicode Crash CourseThe Unicode Attack Surface

Black Hat USA - July 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Unicode Crash CourseUnthink it

Black Hat USA - July 2009 copy 2009 Chris Weber

bull A large and complex standard

Unicode Crash Course

code pointsencodingscategorizationnormalizationbinary propertiescase mappingconversion tablesbi-directional properties

canonical mappingsdecomposition typescase foldingbest-fit mapping17 planesprivate use rangesscript blocks

escapings

Black Hat USA - July 2009 copy 2009 Chris Weber

Shift_jis

Gb2312

ISCII

Windows-1252

ISO-8859-1

EBCDIC 037

wwwcasabasecuritycom

Unicode Crash CourseCode pages and charsets

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Unicode can represent them all

bull ASCII range is preserved

ndash U+0000 to U+007F are mapped to ASCII

wwwcasabasecuritycom

Unicode Crash CourseAd Infinitum

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Unicode 51 uses a 21-bit scalar value with space for over 1100000 code points

U+0000 to U+10FFFF

wwwcasabasecuritycom

Unicode Crash CourseCode points

Black Hat USA - July 2009 copy 2009 Chris Weber

A = U+0041

Every character has a unique number

wwwcasabasecuritycom

Unicode Crash CourseCode Points

Black Hat USA - July 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Unicode Crash Course

AU+0041

Black Hat USA - July 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Unicode Crash Course

ſU+017F

Black Hat USA - July 2009 copy 2009 Chris Weber

UTF-8 ndash variable width 1 to 4 bytes (used to be 6)

UTF-16ndash Endianessndash Variable width 2 or 4 bytesndash Surrogate pairs

UTF-32ndash Endianessndash Fixed width 4 bytesndash Fixed mapping no algorithms needed

wwwcasabasecuritycom

Unicode Crash CourseEncodings

Black Hat USA - July 2009 copy 2009 Chris Weber

U+FF21 FULLWIDTH LATIN CAPITAL LETTER A

EFBCA1

ampxFF21

amp65313

xEFxBCxA1

uFF21

wwwcasabasecuritycom

Unicode Crash CourseEncodings and Escape sequences

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Unicode crash course

bull Root Causes

bull Attack Vectors

bull Tools

wwwcasabasecuritycom

Unicode TransformationsAgenda

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Unicode crash course

bull Root Causes

bull Attack Vectors

bull Tools

wwwcasabasecuritycom

Unicode TransformationsAgenda

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Unicode crash coursebull Root Causes

ndash Visual Spoofing and IDNrsquosndash Best-fit mappingsndash Normalizationndash Overlong UTF-8ndash Over-consumptionndash Character substitutionndash Character deletionndash Casingndash Buffer overflowsndash Controlling Syntaxndash Charset transformationsndash Charset mismatches

bull Tools

wwwcasabasecuritycom

Unicode TransformationsOverview

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Over 100000 assigned characters

bull Many lookalikes within and across scripts

AΑАᐱᗅᗋᗩᴀᴬA

wwwcasabasecuritycom

Root CausesVisual Spoofing

Black Hat USA - July 2009 copy 2009 Chris Weber

Some browsers allow COM IDNrsquos

based on script family

ndash (Latin has a big family)

wwwcasabasecuritycom

Attack VectorsIDN homograph attacks

Black Hat USA - July 2009 copy 2009 Chris Weber

Safari

wwwcasabasecuritycom

Attack VectorsIDN homograph attacks

Black Hat USA - July 2009 copy 2009 Chris Weber

Opera

wwwcasabasecuritycom

Attack VectorsIDN homograph attacks

Black Hat USA - July 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsIDN homograph attacks

wwwgooglecom is not wwwgooɡlecom

Latin U+0069

LatinU+0261

Black Hat USA - July 2009 copy 2009 Chris Weber

ICANN guidelines v20

ndash Inclusion-based

ndash Script limitations

ndash Character limitations

wwwcasabasecuritycom

Root CausesThe state of International Domain Names

Deny-all default seems to be the right concept

A script can cross many blocks Even with limited script choices therersquos plenty to choose from

Great for domain labels but sub domain labels still open to punctuation and syntax spoofing

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Non-Unicode attacks

bull Confusables

bull Invisibles

bull Problematic font-rendering

bull Manipulating Combining Marks

bull Bidi and syntax spoofing

wwwcasabasecuritycom

Attack VectorsVisual spoofing Vectors

Black Hat USA - July 2009 copy 2009 Chris Weber

rn can look like m in certain fonts

wwwcasabasecuritycom

Attack VectorsNon-Unicode homograph attacks

wwwmulletscom is not wwwrnulletscom

Latin U+006D

LatinU+0073 U+006E

Black Hat USA - July 2009 copy 2009 Chris Weber

Are you using mono-width fonts

0 and O

1 and l

5 and S

wwwcasabasecuritycom

Attack VectorsNon-Unicode homograph attacks

Black Hat USA - July 2009 copy 2009 Chris Weber

Classic long URLrsquos

httploginfacebookintvitationvideomessageid-

h048892r39sessionnfbidcomhomehtmdisbursements

wwwcasabasecuritycom

Attack VectorsNon-Unicode homograph attacks

Black Hat USA - July 2009 copy 2009 Chris Weber

wwwɑpplecom

All Latin using Latin small letter Alpha lsquoɑrsquo

wwwfaϲebookcom

Mixed LatinGreek with lunate sigma symbol lsquocrsquo

wwwаЬсcom

All Cyrillic lsquoabcrsquo

wwwcasabasecuritycom

Attack VectorsSingle-script and The Confusables

Black Hat USA - July 2009 copy 2009 Chris Weber

Browsers whitelist ORG

wwwcasabasecuritycom

Attack VectorsIDN homograph attacks

Black Hat USA - July 2009 copy 2009 Chris Weber

Others donrsquot necessarily buthellip

wwwcasabasecuritycom

Attack VectorsIDN homograph attacks

Black Hat USA - July 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsIDN homograph attacks

wwwmozillaorg is not wwwmoziacutellaorg

Latin U+0069

LatinU+00ED

Black Hat USA - July 2009 copy 2009 Chris Weber

(This case doesnrsquot work anymore)

wwwcasabasecuritycom

Attack VectorsIDN Syntax Spoofing with lookalikes

httpwwwgooglecompathfilenottrustedorg

FULLWIDTH SOLIDUSU+FF0F

Black Hat USA - July 2009 copy 2009 Chris Weber

(Normalized to a U+002F)

wwwcasabasecuritycom

Attack VectorsIDN Syntax Spoofing with lookalikes

httpwwwgooglecompathfilenottrustedorg

SOLIDUSU+002F

Black Hat USA - July 2009 copy 2009 Chris Weber

(However punctuation not requiredhellip)

wwwcasabasecuritycom

Attack VectorsIDN Syntax Spoofing with lookalikes

httpwwwgooglecomノpathノfilenottrustedorg

Katakana NoU+FF89

Black Hat USA - July 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsThe Invisibles

Black Hat USA - July 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsVisual Spoofing with Bidi Explicit Directional Overrides

Black Hat USA - July 2009 copy 2009 Chris Weber

Commonly occur in charset transformations and even innocuous APIrsquos

Impact Filter evasion Enable code execution

When σ becomes s

U+03C3 GREEK SMALL LETTER SIGMA

When prime becomes

U+2032 PRIME

wwwcasabasecuritycom

Root CausesBest-fit mappings

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Scrutinize charactercharset manipulation APIrsquos

bull Use EncoderFallback with SystemTextEncoding

bull Set WC_NO_BEST_FIT_CHARS flag with WideCharToMultiByte()

bull Use Unicode end-to-end

wwwcasabasecuritycom

Root CausesGuidance for Best-Fit mappings

Black Hat USA - July 2009 copy 2009 Chris Weber

bull A popular social networking site in 2008

bull Implemented complex filtering logic to prevent XSS

ndash Attack Filter evasion code execution

ndash Exploit Bypass filtering logic with best-fit mappings to leverage cross-site scripting

ndash Root Cause best-fit mappings

wwwcasabasecuritycom

Case Study Social NetworkingBest-fit mappings

Black Hat USA - July 2009 copy 2009 Chris Weber

-moz-binding()

was not allowed buthellip

-[U+ff4d]oz-binding()

would best-fit map

wwwcasabasecuritycom

Case Study Social NetworkingBest-fit mappings

Black Hat USA - July 2009 copy 2009 Chris Weber

Normalizing strings after validation is dangerous

Impact Filter evasion Enable code execution

wwwcasabasecuritycom

Root CausesNormalization

Black Hat USA - July 2009 copy 2009 Chris Weber

İ becomes I +

wwwcasabasecuritycom

Root CausesNormalization

U+0130 U+0049 U+0307

Black Hat USA - July 2009 copy 2009 Chris Weber

But are there dangerous characters

You bethellip with NFKC and NFKD you could control HTML or other parsing

﹤ becomes lt

wwwcasabasecuritycom

Root CausesNormalization

U+FE64 U+003C

Black Hat USA - July 2009 copy 2009 Chris Weber

﹤ becomes lt

toNFKC(ldquo﹤scriptgtrdquo) = ldquoltscriptgtrdquo

wwwcasabasecuritycom

Root CausesNormalization

U+FE64 U+003C

Black Hat USA - July 2009 copy 2009 Chris Weber

Normalize strings before validation

NFKC first defense against Visual spoofing

wwwcasabasecuritycom

Root CausesGuidance for Normalization

Black Hat USA - July 2009 copy 2009 Chris Weber

Non-shortest or overlong UTF-8

Impact Filter evasion Enable code execution

Application gets C0A7

OSFramework sees 27

Database gets

wwwcasabasecuritycom

Root CausesNon-shortest form UTF-8

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Unicode specification forbids

ndash Generation of non-shortest form

ndash Interpretation of non-shortest form for BMP

bull Validate UTF-8 encoding (throw on error)

wwwcasabasecuritycom

Root CausesGuidance for Non-shortest form UTF-8

Black Hat USA - July 2009 copy 2009 Chris Weber

How many ways can you say

wwwcasabasecuritycom

Attack VectorsDirectory traversal

Black Hat USA - July 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack Vectors

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Unassigned code points

ndash U+2073

bull Illegal code points

ndash Half a surrogate pair

bull Code points with special meaning

ndash U+FEFF is the BOM

bull Impact Filter evasion Enable code execution

wwwcasabasecuritycom

Root CausesHandling the Unexpected

Black Hat USA - July 2009 copy 2009 Chris Weber

Over-consuming ill-formed byte sequences

Big problem with MBCS lead bytes

lt41 C2 3E 41gt becomes

lt41 41gt

wwwcasabasecuritycom

Root CausesHandling the Unexpected Over-consumption

Black Hat USA - July 2009 copy 2009 Chris Weber

ltimg src=[0xC2]gt onerror=alert(1)ltbr gt

becomes

ltimg src=gt onerror=alert(1)ltbr gt

wwwcasabasecuritycom

Root CausesHandling the Unexpected Over-consumption

Root Causes: Handling the Unexpected, Character-substitution

Correcting insecurely rather than failing
– Substituting a ‘’ or a ‘’ would be bad

Root Causes: Handling the Unexpected, Character-deletion

“deletion of noncharacters” (UTR-36)

<scr[U+FEFF]ipt> becomes <script>

Root Causes: Solutions for Handling the Unexpected

• Fail or error
• Use U+FFFD instead
  – A common alternative is ‘’ which can be safe
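A minimal sketch of the "use U+FFFD instead" advice, contrasted with deletion. The helper names and the exact set of code points treated as unexpected (U+FEFF, noncharacters, unassigned code points, lone surrogates) are assumptions chosen for illustration.

```python
import unicodedata

REPLACEMENT = "\uFFFD"


def is_unexpected(ch: str) -> bool:
    """Assumed policy: BOM, noncharacters, unassigned code points, lone surrogates."""
    cp = ord(ch)
    if cp == 0xFEFF:
        return True
    if 0xFDD0 <= cp <= 0xFDEF or (cp & 0xFFFE) == 0xFFFE:
        return True
    return unicodedata.category(ch) in ("Cn", "Cs")


def delete_unexpected(text: str) -> str:
    """UNSAFE: deletion can join the surrounding text into something new."""
    return "".join(ch for ch in text if not is_unexpected(ch))


def replace_unexpected(text: str) -> str:
    """Safer: keep a visible placeholder so the neighbors never join."""
    return "".join(REPLACEMENT if is_unexpected(ch) else ch for ch in text)


payload = "<scr\uFEFFipt>"
deleted = delete_unexpected(payload)
replaced = replace_unexpected(payload)
```

Failing outright (raising an error when `is_unexpected` returns true) remains the stricter option.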

Attack Vectors: Filter evasion

• Bypass filters, WAFs, NIDS, and validation
• Exploit delivery techniques
  – E.g. Cross-site scripting (buffer overflow of the Web)

Case Study: Apple and Mozilla

Safari and Firefox BOM consumption
– Attack: Filter evasion, code execution
– Exploit: Bypass filtering logic with specially crafted strings to leverage cross-site scripting
– Root Cause: Character deletion

<a href="java[U+FEFF]script:alert('XSS')">

Can be nastier

<a h[U+FEFF]ref="java[U+FEFF]script:al[U+FEFF]ert('XSS')">

A Closer Look: The BOM

BOM: U+FEFF

Root Causes: Casing

• Attackers manipulate casing operations to inject otherwise prohibited characters
• Casing can multiply the buffer sizes needed
• Impact: Filter evasion, Enable code execution

Root Causes: Casing

toLower("İ") == "i"

toLower("scrİpt") == "script"

Root Causes: Casing

len(x) != len(toLower(x))

Root Causes: Guidance for Casing

• Perform casing operations before validation
• Leverage existing frameworks and APIs
  – ICU, .NET
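A sketch of "casing before validation" in Python. Casing results vary by framework and locale: Python's `str.lower()` turns İ (U+0130) into "i" followed by U+0307 rather than a bare "i", so this example uses U+212A KELVIN SIGN, which lowercases to an ASCII "k". The banned word and function names are hypothetical.

```python
BANNED = "onclick"


def validate_then_lower_unsafe(text: str) -> str:
    """UNSAFE order: the check runs on text that casing will later change."""
    if BANNED in text:
        raise ValueError("banned token")
    return text.lower()


def lower_then_validate(text: str) -> str:
    """Safer order: apply the casing operation first, then validate the result."""
    lowered = text.lower()
    if BANNED in lowered:
        raise ValueError("banned token")
    return lowered


payload = "onclic\u212A"          # ends with U+212A KELVIN SIGN

slipped_through = validate_then_lower_unsafe(payload)

# Casing can also change string length: U+0130 lowercases to two code points.
dotted_i_lower = "\u0130".lower()
```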

Root Causes: Buffer Overflows

• Incorrect assumptions about string sizes (chars vs. bytes)
• Improper width calculations
• Impact: Enable code execution

Root Causes: Buffer Overflows

Casing - maximum expansion factors

Operation  UTF        Factor  Sample
Lower      8          1.5     Ⱥ U+023A
           16, 32     1       A U+0041
Upper      8, 16, 32  3       ΐ U+0390

Source: Unicode Technical Report 36

Root Causes: Buffer Overflows

Normalization - maximum expansion factors

Operation  UTF     Factor  Sample
NFC        8       3X      𝅘𝅥𝅮 U+1D160
           16, 32  3X      ש U+FB2C
NFD        8       3X      ΐ U+0390
           16, 32  4X      ᾂ U+1F82
NFKC/NFKD  8       11X     ﷺ U+FDFA
           16, 32  18X     ﷺ U+FDFA

Source: Unicode Technical Report 36

Root Causes: Guidance for Buffer Overflows

• Know the difference between bytes and chars
• Secure coding
• Leverage existing frameworks and APIs
  – ICU, .NET
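Several of the expansion factors quoted from UTR 36 can be checked directly, which also makes the bytes-versus-chars distinction concrete. The helper names below are illustrative; the sample characters are the ones from the tables.

```python
import unicodedata


def utf8_len(s: str) -> int:
    """Size in bytes when encoded as UTF-8."""
    return len(s.encode("utf-8"))


def utf16_units(s: str) -> int:
    """Size in 16-bit code units when encoded as UTF-16."""
    return len(s.encode("utf-16-le")) // 2


# Lowercasing U+023A: 2 UTF-8 bytes become 3 (factor 1.5).
lower_sample = "\u023A"
lower_growth = (utf8_len(lower_sample), utf8_len(lower_sample.lower()))

# Uppercasing U+0390: one code point becomes three (factor 3).
upper_sample = "\u0390"
upper_result = upper_sample.upper()

# NFKC of U+FDFA: 1 code point becomes 18, and 3 UTF-8 bytes become 33 (factor 11).
nfkc_result = unicodedata.normalize("NFKC", "\uFDFA")
```

A buffer sized from the input length before casing or normalization is therefore too small in the worst case.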

Root Causes: Controlling Syntax

• White space and line breaks
  – E.g. when U+180E acts like U+0020
• Quotation marks
• Impact: Filter evasion, Enable code execution

Attacks and Exploits: Controlling syntax

• Manipulate HTML parsers and JavaScript interpreters
• Control protocols

Case Study: Opera

• Unicode formatter characters exploited for XSS
  – Damage: Filter evasion, controlling syntax
  – Exploit: Bypass filtering logic with specially crafted characters to leverage cross-site scripting
  – Root Cause: Interpreting “white space”
  – A problem with HTML 4.0 spec

<a href=[U+180E]onclick=alert()>

MVS (Mongolian Vowel Separator): U+180E

Root Causes: Guidance for Controlling Syntax

• Question specifications
• Be careful…
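One way to "be careful" is to flag any character that a parser might treat as white space or silently ignore, apart from a short allow-list. This is a sketch under assumed policy: the allow-list (space, tab, CR, LF) and the set of general categories checked are illustrative choices, not a rule from the talk. U+180E has been classified as a space separator (Zs) in older Unicode versions and as a format character (Cf) in newer ones, so both categories are covered.

```python
import unicodedata

ALLOWED_WHITESPACE = {" ", "\t", "\r", "\n"}            # assumed allow-list
SUSPECT_CATEGORIES = {"Zs", "Zl", "Zp", "Cf", "Cc"}     # separators, format, control


def find_suspect_whitespace(text: str) -> list[tuple[int, str]]:
    """Return (index, 'U+XXXX') for each separator-like character not allow-listed."""
    hits = []
    for index, ch in enumerate(text):
        if ch in ALLOWED_WHITESPACE:
            continue
        if unicodedata.category(ch) in SUSPECT_CATEGORIES:
            hits.append((index, f"U+{ord(ch):04X}"))
    return hits


hits = find_suspect_whitespace("<a href=\u180Eonclick=alert()>")
```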

Root Causes: Specifications

1) Character stability
  – IDNA/Nameprep based on Unicode 3.2
2) Designs
  – Specs are carefully designed but not always perfect

• This could have been a problem
  – “When designing a markup language or data protocol, the use of U+FEFF can be restricted to that of Byte Order Mark. In that case, any U+FEFF occurring in the middle of the file can be ignored, or treated as an error.”
  – HTML 4.01
    • Defines four whitespace characters and explicitly leaves handling other characters up to implementer

Root Causes: Charset Transformations

• Converting between charsets is dangerous
• Mapping tables and algorithms vary across platforms
• Impact: Filter evasion, Enable code execution, Data-loss

Root Causes: Guidance for Charset Transformations

• Avoid if possible
• Use Unicode as the broker
• Beware the PUA mappings
• Transform case and normalize prior to validation and redisplay

Root Causes: Charset Mismatches

• Some charset identifiers are ill-defined
• Vendor implementations vary
• User-agents may sniff if confused
• Attackers manipulate behavior
• Impact: Filter evasion, Enable code execution

Root Causes: Charset Mismatches

Content-Type: charset=ISO-8859-1

<meta http-equiv="Content-Type" content="text/html; charset=shift_jis">

Attacker-controlled input

Root Causes: Guidance for Charset Mismatches

• Force UTF-8
• Error if uncertain

Unicode Transformations: Agenda

• Unicode crash course
• Root Causes
• Attack Vectors
• Tools

Tools

• Watcher
  – Passive Web-app security testing and auditing
• Unibomber
  – XSS autopwn testing tool

Tools: Watcher – Some of the Passive Checks Included

• Unicode transformation hot-spots
• User-controlled HTML
• Cross-domain issues
• Insecure cookies
• Insecure HTTP/HTTPS transitions
• SSL protocol and certificate issues
• XSS hot-spots
• Flash issues
• Silverlight issues
• Information disclosure

Tools: Watcher - Web-app Security Testing and Auditing

http://websecuritytool.codeplex.com

Tools: Unibomber – runtime XSS testing tool

• Deterministic testing
• Auto-inject payloads
• Unicode transformers
  – < > ‘ “ etc.
• Detect transformations and encoding hotspots

Thank you

Casaba Security
www.casabasecurity.com

Chris Weber
Blog: www.lookout.net
Email: chris@casabasecurity.com
LinkedIn: http://www.linkedin.com/in/chrisweber


Unicode Crash Course: The Unicode Attack Surface

• End users
• Applications
• Databases
• Programming languages
• Operating Systems

Unicode Crash Course: Unthink it

Unicode Crash Course

• A large and complex standard

code points, encodings, categorization, normalization, binary properties, case mapping, conversion tables, bi-directional properties, canonical mappings, decomposition types, case folding, best-fit mapping, 17 planes, private use ranges, script blocks, escapings

Unicode Crash Course: Code pages and charsets

Shift_jis
Gb2312
ISCII
Windows-1252
ISO-8859-1
EBCDIC 037

Unicode Crash Course: Ad Infinitum

• Unicode can represent them all
• ASCII range is preserved
  – U+0000 to U+007F are mapped to ASCII

Unicode Crash Course: Code points

• Unicode 5.1 uses a 21-bit scalar value with space for over 1,100,000 code points
  U+0000 to U+10FFFF

Unicode Crash Course: Code Points

A = U+0041

Every character has a unique number

A: U+0041

ſ: U+017F

Unicode Crash Course: Encodings

UTF-8
– Variable width, 1 to 4 bytes (used to be 6)

UTF-16
– Endianness
– Variable width, 2 or 4 bytes
– Surrogate pairs

UTF-32
– Endianness
– Fixed width, 4 bytes
– Fixed mapping, no algorithms needed
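The widths listed above are easy to confirm with Python's built-in codecs. The helper `encoded_sizes` is an illustrative name; the sample characters are A (U+0041), ſ (U+017F), and U+1D160, which lies outside the BMP and therefore needs a surrogate pair in UTF-16.

```python
def encoded_sizes(ch: str) -> dict[str, int]:
    """Byte length of one character in each Unicode encoding form."""
    return {enc: len(ch.encode(enc)) for enc in ("utf-8", "utf-16-le", "utf-32-le")}


ascii_a = encoded_sizes("A")
long_s = encoded_sizes("\u017F")
supplementary = encoded_sizes("\U0001D160")

# The UTF-16 form of U+1D160 is a surrogate pair: D834 DD60.
surrogate_pair_hex = "\U0001D160".encode("utf-16-be").hex()
```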

Unicode Crash Course: Encodings and Escape sequences

U+FF21 FULLWIDTH LATIN CAPITAL LETTER A

%EF%BC%A1
&#xFF21;
&#65313;
\xEF\xBC\xA1
\uFF21
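Each escape form on this slide can be generated from the same code point, which is a useful reminder that a filter has to recognize all of them. A short sketch using only the standard library; the dictionary keys are illustrative labels.

```python
import html
from urllib.parse import quote, unquote

ch = "\uFF21"  # FULLWIDTH LATIN CAPITAL LETTER A

forms = {
    "percent_encoded": quote(ch),                                   # UTF-8 bytes, URL-escaped
    "hex_ncr": f"&#x{ord(ch):X};",                                  # HTML hex character reference
    "dec_ncr": f"&#{ord(ch)};",                                     # HTML decimal character reference
    "byte_escape": "".join(f"\\x{b:02X}" for b in ch.encode("utf-8")),
    "unicode_escape": f"\\u{ord(ch):04X}",
}

# All of the reversible forms decode back to the same single character.
round_trips = (
    unquote(forms["percent_encoded"]),
    html.unescape(forms["hex_ncr"]),
    html.unescape(forms["dec_ncr"]),
)
```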

Unicode Transformations: Agenda

• Unicode crash course
• Root Causes
• Attack Vectors
• Tools

Unicode Transformations: Overview

• Unicode crash course
• Root Causes
  – Visual Spoofing and IDNs
  – Best-fit mappings
  – Normalization
  – Overlong UTF-8
  – Over-consumption
  – Character substitution
  – Character deletion
  – Casing
  – Buffer overflows
  – Controlling Syntax
  – Charset transformations
  – Charset mismatches
• Tools

Root Causes: Visual Spoofing

• Over 100,000 assigned characters
• Many lookalikes within and across scripts

AΑАᐱᗅᗋᗩᴀᴬA

Attack Vectors: IDN homograph attacks

Some browsers allow .COM IDNs based on script family
– (Latin has a big family)

Attack Vectors: IDN homograph attacks

Safari

Opera

www.google.com is not www.gooɡle.com

Latin g: U+0067
Latin ɡ: U+0261

Root Causes: The state of International Domain Names

ICANN guidelines v2.0
– Inclusion-based
– Script limitations
– Character limitations

Deny-all default seems to be the right concept.

A script can cross many blocks. Even with limited script choices, there’s plenty to choose from.

Great for domain labels, but sub domain labels are still open to punctuation and syntax spoofing.

Attack Vectors: Visual spoofing Vectors

• Non-Unicode attacks
• Confusables
• Invisibles
• Problematic font-rendering
• Manipulating Combining Marks
• Bidi and syntax spoofing

Attack Vectors: Non-Unicode homograph attacks

rn can look like m in certain fonts

www.mullets.com is not www.rnullets.com

Latin m: U+006D
Latin rn: U+0072 U+006E

Attack Vectors: Non-Unicode homograph attacks

Are you using mono-width fonts?

0 and O
1 and l
5 and S

Attack Vectors: Non-Unicode homograph attacks

Classic long URLs

httploginfacebookintvitationvideomessageid-
h048892r39sessionnfbidcomhomehtmdisbursements

Attack Vectors: Single-script and The Confusables

www.ɑpple.com
All Latin, using Latin small letter Alpha ‘ɑ’

www.faϲebook.com
Mixed Latin/Greek with lunate sigma symbol ‘c’

www.аЬс.com
All Cyrillic ‘abc’
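A rough sketch of mixed-script detection for the three labels above. Python's standard library does not expose the Unicode Script property, so this heuristic takes the first word of each character's name as a stand-in; that shortcut, and the function name `scripts_used`, are assumptions for illustration only. Note that single-script spoofs pass this check, which is exactly the point of the first and third examples.

```python
import unicodedata


def scripts_used(label: str) -> set[str]:
    """Approximate the scripts in a label from the first word of each letter's name."""
    scripts = set()
    for ch in label:
        if ch.isalpha():
            scripts.add(unicodedata.name(ch).split(" ")[0])
    return scripts


all_latin = scripts_used("\u0251pple")               # LATIN SMALL LETTER ALPHA + "pple"
mixed = scripts_used("fa\u03F2ebook")                # GREEK LUNATE SIGMA SYMBOL inside Latin
all_cyrillic = scripts_used("\u0430\u042C\u0441")    # looks like "abc"
```

Only the mixed label is caught by a "one script per label" rule; the other two need a confusables table.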

Attack Vectors: IDN homograph attacks

Browsers whitelist .ORG

Others don’t necessarily, but…

www.mozilla.org is not www.mozílla.org

Latin i: U+0069
Latin í: U+00ED

Attack Vectors: IDN Syntax Spoofing with lookalikes

(This case doesn’t work anymore)

http://www.google.com／path／file.nottrusted.org

FULLWIDTH SOLIDUS: U+FF0F

(Normalized to a U+002F)

http://www.google.com/path/file.nottrusted.org

SOLIDUS: U+002F

(However, punctuation not required…)

http://www.google.comﾉpathﾉfile.nottrusted.org

Katakana No: U+FF89
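The difference between the two lookalikes shows up under NFKC: the fullwidth solidus folds to an ASCII slash, while the halfwidth Katakana No folds to another Katakana letter and keeps its slash-like appearance. A small check using the standard library; the host string is the one from the slide.

```python
import unicodedata


def nfkc(text: str) -> str:
    """Apply Unicode NFKC normalization."""
    return unicodedata.normalize("NFKC", text)


fullwidth_solidus = nfkc("\uFF0F")       # becomes U+002F
katakana_no = nfkc("\uFF89")             # becomes U+30CE, still a letter

spoofed_host = "www.google.com\uFF89path\uFF89file.nottrusted.org"
still_no_slash = "/" not in nfkc(spoofed_host)
```

Normalization alone therefore removes the first spoof but not the second.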

Attack Vectors: The Invisibles

Attack Vectors: Visual Spoofing with Bidi Explicit Directional Overrides

Root Causes: Best-fit mappings

Commonly occur in charset transformations and even innocuous APIs

Impact: Filter evasion, Enable code execution

When σ becomes s
U+03C3 GREEK SMALL LETTER SIGMA

When ′ becomes '
U+2032 PRIME

Root Causes: Guidance for Best-Fit mappings

• Scrutinize character/charset manipulation APIs
• Use EncoderFallback with System.Text.Encoding
• Set WC_NO_BEST_FIT_CHARS flag with WideCharToMultiByte()
• Use Unicode end-to-end
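The same "fail instead of guessing" idea can be sketched in Python, used here only as an analogue of the .NET and Win32 settings named above. Python's built-in codecs do not apply best-fit tables, and with `errors="strict"` an unmappable character raises instead of being substituted. The function name and the choice of `cp1252` as the legacy charset are assumptions.

```python
def to_legacy_charset(text: str, charset: str = "cp1252") -> bytes:
    """Encode to a legacy charset, raising rather than substituting a lookalike."""
    return text.encode(charset, errors="strict")


def try_encode(text: str) -> bool:
    """Return True if the text maps exactly, False if any character has no mapping."""
    try:
        to_legacy_charset(text)
        return True
    except UnicodeEncodeError:
        return False


sigma_ok = try_encode("\u03C3")    # GREEK SMALL LETTER SIGMA
prime_ok = try_encode("\u2032")    # PRIME
cafe_bytes = to_legacy_charset("caf\u00E9")
```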

Case Study: Social Networking, Best-fit mappings

• A popular social networking site in 2008
• Implemented complex filtering logic to prevent XSS
  – Attack: Filter evasion, code execution
  – Exploit: Bypass filtering logic with best-fit mappings to leverage cross-site scripting
  – Root Cause: best-fit mappings

-moz-binding()

was not allowed, but…

-[U+ff4d]oz-binding()

would best-fit map

Root Causes: Normalization

Normalizing strings after validation is dangerous

Impact: Filter evasion, Enable code execution

Root Causes: Normalization

İ (U+0130) becomes I (U+0049) + combining dot above (U+0307)
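The decomposition on this slide, and why it matters when normalization runs after validation, can be sketched as follows. The `strip_marks` helper is a hypothetical post-processing step (a common "remove accents" routine), included to show how a harmless-looking cleanup can produce a banned word after the filter has already run.

```python
import unicodedata


def strip_marks(text: str) -> str:
    """Decompose to NFD, then drop combining marks (a typical accent-stripping step)."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


decomposed_i = unicodedata.normalize("NFD", "\u0130")

payload = "SCR\u0130PT"
passes_naive_filter = "SCRIPT" not in payload    # validation sees nothing wrong
after_cleanup = strip_marks(payload)             # later processing creates the word
```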

Root Causes: Normalization

But are there dangerous characters?

You bet… with NFKC and NFKD you could control HTML or other parsing

﹤ (U+FE64) becomes < (U+003C)


Page 9: Unraveling Unicode: A Bag of Tricks for Bug Hunting...Unicode Crash Course code points encodings categorization normalization binary properties case mapping conversion tables bi-directional

Black Hat USA - July 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Unicode Crash CourseUnthink it

Black Hat USA - July 2009 copy 2009 Chris Weber

bull A large and complex standard

Unicode Crash Course

code pointsencodingscategorizationnormalizationbinary propertiescase mappingconversion tablesbi-directional properties

canonical mappingsdecomposition typescase foldingbest-fit mapping17 planesprivate use rangesscript blocks

escapings

Black Hat USA - July 2009 copy 2009 Chris Weber

Shift_jis

Gb2312

ISCII

Windows-1252

ISO-8859-1

EBCDIC 037

wwwcasabasecuritycom

Unicode Crash CourseCode pages and charsets

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Unicode can represent them all

bull ASCII range is preserved

ndash U+0000 to U+007F are mapped to ASCII

wwwcasabasecuritycom

Unicode Crash CourseAd Infinitum

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Unicode 51 uses a 21-bit scalar value with space for over 1100000 code points

U+0000 to U+10FFFF

wwwcasabasecuritycom

Unicode Crash CourseCode points

Black Hat USA - July 2009 copy 2009 Chris Weber

A = U+0041

Every character has a unique number

wwwcasabasecuritycom

Unicode Crash CourseCode Points

Black Hat USA - July 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Unicode Crash Course

AU+0041

Black Hat USA - July 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Unicode Crash Course

ſU+017F

Black Hat USA - July 2009 copy 2009 Chris Weber

UTF-8 ndash variable width 1 to 4 bytes (used to be 6)

UTF-16ndash Endianessndash Variable width 2 or 4 bytesndash Surrogate pairs

UTF-32ndash Endianessndash Fixed width 4 bytesndash Fixed mapping no algorithms needed

wwwcasabasecuritycom

Unicode Crash CourseEncodings

Black Hat USA - July 2009 copy 2009 Chris Weber

U+FF21 FULLWIDTH LATIN CAPITAL LETTER A

EFBCA1

ampxFF21

amp65313

xEFxBCxA1

uFF21

wwwcasabasecuritycom

Unicode Crash CourseEncodings and Escape sequences

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Unicode crash course

bull Root Causes

bull Attack Vectors

bull Tools

wwwcasabasecuritycom

Unicode TransformationsAgenda

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Unicode crash course

bull Root Causes

bull Attack Vectors

bull Tools

wwwcasabasecuritycom

Unicode TransformationsAgenda

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Unicode crash coursebull Root Causes

ndash Visual Spoofing and IDNrsquosndash Best-fit mappingsndash Normalizationndash Overlong UTF-8ndash Over-consumptionndash Character substitutionndash Character deletionndash Casingndash Buffer overflowsndash Controlling Syntaxndash Charset transformationsndash Charset mismatches

bull Tools

wwwcasabasecuritycom

Unicode TransformationsOverview

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Over 100000 assigned characters

bull Many lookalikes within and across scripts

AΑАᐱᗅᗋᗩᴀᴬA

wwwcasabasecuritycom

Root CausesVisual Spoofing

Black Hat USA - July 2009 copy 2009 Chris Weber

Some browsers allow COM IDNrsquos

based on script family

ndash (Latin has a big family)

wwwcasabasecuritycom

Attack VectorsIDN homograph attacks

Black Hat USA - July 2009 copy 2009 Chris Weber

Safari

wwwcasabasecuritycom

Attack VectorsIDN homograph attacks

Black Hat USA - July 2009 copy 2009 Chris Weber

Opera

wwwcasabasecuritycom

Attack VectorsIDN homograph attacks

Black Hat USA - July 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsIDN homograph attacks

wwwgooglecom is not wwwgooɡlecom

Latin U+0069

LatinU+0261

Black Hat USA - July 2009 copy 2009 Chris Weber

ICANN guidelines v20

ndash Inclusion-based

ndash Script limitations

ndash Character limitations

wwwcasabasecuritycom

Root CausesThe state of International Domain Names

Deny-all default seems to be the right concept

A script can cross many blocks Even with limited script choices therersquos plenty to choose from

Great for domain labels but sub domain labels still open to punctuation and syntax spoofing

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Non-Unicode attacks

bull Confusables

bull Invisibles

bull Problematic font-rendering

bull Manipulating Combining Marks

bull Bidi and syntax spoofing

wwwcasabasecuritycom

Attack VectorsVisual spoofing Vectors

Black Hat USA - July 2009 copy 2009 Chris Weber

rn can look like m in certain fonts

wwwcasabasecuritycom

Attack VectorsNon-Unicode homograph attacks

wwwmulletscom is not wwwrnulletscom

Latin U+006D

LatinU+0073 U+006E

Black Hat USA - July 2009 copy 2009 Chris Weber

Are you using mono-width fonts

0 and O

1 and l

5 and S

wwwcasabasecuritycom

Attack VectorsNon-Unicode homograph attacks

Black Hat USA - July 2009 copy 2009 Chris Weber

Classic long URLrsquos

httploginfacebookintvitationvideomessageid-

h048892r39sessionnfbidcomhomehtmdisbursements

wwwcasabasecuritycom

Attack VectorsNon-Unicode homograph attacks

Attack Vectors: Single-script and the Confusables

www.ɑpple.com
All Latin, using LATIN SMALL LETTER ALPHA ‘ɑ’

www.faϲebook.com
Mixed Latin/Greek, with GREEK LUNATE SIGMA SYMBOL ‘ϲ’

www.аЬс.com
All Cyrillic ‘abc’
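The three confusable hostnames above can be told apart mechanically. The following is a minimal sketch, assuming that the first word of a character's Unicode name ("LATIN", "GREEK", "CYRILLIC") is an acceptable stand-in for the Script property, which the Python standard library does not expose directly. The helper names are hypothetical and not part of the talk.

```python
import unicodedata


def scripts_in_label(label: str) -> set:
    """Return a rough set of scripts used by the letters of one domain label.

    Assumption: the first word of the Unicode character name approximates
    the character's script. This is a heuristic, not the Script property.
    """
    scripts = set()
    for ch in label:
        if ch.isalpha():
            scripts.add(unicodedata.name(ch, "UNKNOWN").split()[0])
    return scripts


def is_mixed_script(label: str) -> bool:
    """True when the letters of the label come from more than one script."""
    return len(scripts_in_label(label)) > 1
```

A check like this flags the mixed Latin/Greek label but says nothing about the all-Latin `\u0251pple` or the all-Cyrillic label, which is the point of the single-script examples: script mixing rules alone do not stop confusables.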

Attack Vectors: IDN homograph attacks

Browsers whitelist .ORG

Others don’t necessarily, but…

www.mozilla.org is not www.mozílla.org

Latin U+0069 (i) versus Latin U+00ED (í)

Attack Vectors: IDN Syntax Spoofing with lookalikes

(This case doesn’t work anymore)
http://www.google.com／path／file.nottrusted.org
FULLWIDTH SOLIDUS U+FF0F

(Normalized to a U+002F)
http://www.google.com/path/file.nottrusted.org
SOLIDUS U+002F

(However, punctuation not required…)
http://www.google.comノpathノfile.nottrusted.org
Katakana No U+FF89
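Why the first lookalike stopped working while the last one still does can be checked with compatibility normalization. This is a small sketch using Python's `unicodedata`; the function name is an illustrative assumption.

```python
import unicodedata

FULLWIDTH_SOLIDUS = "\uff0f"
HALFWIDTH_KATAKANA_NO = "\uff89"


def nfkc(text: str) -> str:
    """Apply Unicode compatibility normalization (NFKC)."""
    return unicodedata.normalize("NFKC", text)


# U+FF0F folds to a real "/" (U+002F), so a normalizing client exposes the spoof.
slash_spoof = nfkc("www.google.com" + FULLWIDTH_SOLIDUS + "path")

# U+FF89 folds to U+30CE, a Katakana letter. It still looks like a slash,
# but it never becomes one, so normalization alone does not catch it.
katakana_spoof = nfkc("www.google.com" + HALFWIDTH_KATAKANA_NO + "path")
```

Under these assumptions `slash_spoof` contains an ASCII slash and `katakana_spoof` does not, which is why a letter that merely resembles punctuation needs a separate confusables check.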

Attack Vectors: The Invisibles

Attack Vectors: Visual Spoofing with Bidi Explicit Directional Overrides

Root Causes: Best-fit mappings

Commonly occur in charset transformations and even innocuous APIs

Impact: Filter evasion, Enable code execution

When σ becomes s
U+03C3 GREEK SMALL LETTER SIGMA

When ′ becomes '
U+2032 PRIME

Root Causes: Guidance for Best-Fit mappings

• Scrutinize character/charset manipulation APIs
• Use EncoderFallback with System.Text.Encoding
• Set WC_NO_BEST_FIT_CHARS flag with WideCharToMultiByte()
• Use Unicode end-to-end
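The guidance above names .NET and Win32 APIs. As a language-neutral illustration of the same idea (fail instead of silently picking a lookalike), here is a minimal Python sketch. The function name `encode_strict` is hypothetical. Python's built-in codecs are used here because they raise on unmappable characters rather than best-fitting them.

```python
def encode_strict(text: str, charset: str) -> bytes:
    """Convert text to a legacy charset, refusing any inexact mapping.

    With errors="strict" the codec raises instead of substituting a
    similar-looking character, so U+03C3 can never turn into "s".
    """
    try:
        return text.encode(charset, errors="strict")
    except UnicodeEncodeError as exc:
        bad = exc.object[exc.start:exc.end]
        raise ValueError(f"{bad!r} has no exact mapping in {charset}") from exc
```

Calling `encode_strict("\u03c3", "cp1252")` or `encode_strict("\u2032", "cp1252")` raises, so a filter that already ran on the Unicode string is not undone by the conversion.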

Case Study: Social Networking, Best-fit mappings

• A popular social networking site in 2008
• Implemented complex filtering logic to prevent XSS
– Attack: Filter evasion, code execution
– Exploit: Bypass filtering logic with best-fit mappings to leverage cross-site scripting
– Root Cause: best-fit mappings

-moz-binding()

was not allowed, but…

-[U+ff4d]oz-binding()

would best-fit map

Root Causes: Normalization

Normalizing strings after validation is dangerous

Impact: Filter evasion, Enable code execution

İ becomes I + ̇
U+0130 becomes U+0049 U+0307

But are there dangerous characters?

You bet… with NFKC and NFKD you could control HTML or other parsing

﹤ becomes <
U+FE64 becomes U+003C

toNFKC(“﹤script>”) = “<script>”

Root Causes: Guidance for Normalization

Normalize strings before validation

NFKC: first defense against visual spoofing
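The ordering rule above (normalize first, validate second) can be sketched as follows. The forbidden-character set and the function name are assumptions chosen for illustration, not a complete XSS filter.

```python
import unicodedata

# Illustrative assumption: characters this toy validator refuses.
FORBIDDEN = set("<>\"'&")


def normalize_then_validate(text: str) -> str:
    """Apply NFKC first, then validate the normalized result.

    Validating the raw input and normalizing afterwards would let
    U+FE64 SMALL LESS-THAN SIGN through and then turn it into "<".
    """
    normalized = unicodedata.normalize("NFKC", text)
    bad = FORBIDDEN.intersection(normalized)
    if bad:
        raise ValueError(f"forbidden characters after NFKC: {sorted(bad)}")
    return normalized
```

With this order, `normalize_then_validate("\ufe64script")` is rejected because the check sees the `<` that NFKC produces.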

Root Causes: Non-shortest form UTF-8

Non-shortest or overlong UTF-8

Impact: Filter evasion, Enable code execution

Application gets C0 A7

OS/Framework sees 27

Database gets '

Root Causes: Guidance for Non-shortest form UTF-8

• Unicode specification forbids
– Generation of non-shortest form
– Interpretation of non-shortest form for BMP
• Validate UTF-8 encoding (throw on error)
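"Validate UTF-8 encoding (throw on error)" can be shown with a strict decoder. This sketch relies on Python 3's built-in UTF-8 codec, which treats overlong forms as errors; the wrapper name is hypothetical.

```python
def decode_utf8_strict(raw: bytes) -> str:
    """Decode UTF-8 and raise UnicodeDecodeError on any ill-formed sequence.

    Overlong forms such as C0 A7 (a two-byte spelling of U+0027) are
    ill-formed, so they are rejected instead of being read as "'".
    """
    return raw.decode("utf-8", errors="strict")
```

Only the shortest form `b"\x27"` decodes to an apostrophe. The overlong `b"\xc0\xa7"` raises before any downstream component can reinterpret it.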

Attack Vectors: Directory traversal

How many ways can you say

Root Causes: Handling the Unexpected

• Unassigned code points
– U+2073
• Illegal code points
– Half a surrogate pair
• Code points with special meaning
– U+FEFF is the BOM
• Impact: Filter evasion, Enable code execution

Root Causes: Handling the Unexpected, Over-consumption

Over-consuming ill-formed byte sequences

Big problem with MBCS lead bytes

<41 C2 3E 41> becomes <41 41>

<img src=[0xC2]> onerror=alert(1)<br />

becomes

<img src=> onerror=alert(1)<br />

Root Causes: Handling the Unexpected, Character-substitution

Correcting insecurely rather than failing
– Substituting a ‘’ or a ‘’ would be bad

Root Causes: Handling the Unexpected, Character-deletion

“deletion of noncharacters” (UTR-36)

<scr[U+FEFF]ipt> becomes <script>

Root Causes: Solutions for Handling the Unexpected

• Fail or error
• Use U+FFFD instead
– A common alternative is ‘’ which can be safe
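Both solutions above (fail, or substitute U+FFFD) can be sketched in a few lines. The example reuses the byte sequence 41 C2 3E 41 from the over-consumption slide and the stray U+FEFF from the deletion slide; the function names are assumptions for illustration.

```python
BOM = "\ufeff"


def decode_with_replacement(raw: bytes) -> str:
    """Decode UTF-8, replacing each ill-formed sequence with U+FFFD.

    Python's decoder replaces only the orphaned lead byte (C2) and resumes
    at the next byte, so the following ">" (3E) is not swallowed.
    """
    return raw.decode("utf-8", errors="replace")


def reject_stray_bom(text: str) -> str:
    """Fail when U+FEFF appears anywhere after the first position.

    Deleting it silently would splice "scr" + "ipt" back into "script"
    after a filter has already run.
    """
    if BOM in text[1:]:
        raise ValueError("U+FEFF found after the start of the text")
    return text
```

Replacement keeps the string length honest and leaves the `>` in place, while rejection avoids the reassembly problem entirely. Which of the two fits depends on whether the caller can tolerate an error.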

Attack Vectors: Filter evasion

• Bypass filters, WAFs, NIDS, and validation
• Exploit delivery techniques
– E.g. Cross-site scripting (buffer overflow of the Web)

Case Study: Apple and Mozilla

Safari and Firefox BOM consumption
– Attack: Filter evasion, code execution
– Exploit: Bypass filtering logic with specially crafted strings to leverage cross-site scripting
– Root Cause: Character deletion

<a href="java[U+FEFF]script:alert('XSS')>

Can be nastier:

<a h[U+FEFF]ref="java[U+FEFF]script:al[U+FEFF]ert('XSS')>

A Closer Look: The BOM

BOM U+FEFF

Root Causes: Casing

• Attackers manipulate casing operations to inject otherwise prohibited characters
• Casing can multiply the buffer sizes needed
• Impact: Filter evasion, Enable code execution

toLower(“İ”) == “i”

toLower(“scrİpt”) == “script”

len(x) != len(toLower(x))

Root Causes: Guidance for Casing

• Perform casing operations before validation
• Leverage existing frameworks and APIs
– ICU, .NET
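The two casing points above (case first, then validate; lengths can change) can be illustrated with Python's built-in full case mappings. The blocked-word list and function names are hypothetical and serve only as a sketch.

```python
def length_delta_after_lower(text: str) -> int:
    """Return how many code points lowercasing adds (or removes)."""
    return len(text.lower()) - len(text)


def lower_then_validate(text: str, blocked=("script",)) -> str:
    """Lowercase first, then run the validation on the cased result."""
    lowered = text.lower()
    for word in blocked:
        if word in lowered:
            raise ValueError(f"blocked word after casing: {word}")
    return lowered
```

In Python, `"\u0130".lower()` yields `i` followed by U+0307 COMBINING DOT ABOVE, so the string grows by one code point and a buffer sized from the input would be too small. Note that `"scr\u0130pt".lower()` therefore does not contain `script` as a substring. It only becomes `script` if a later step drops the combining mark, which is one more reason to validate after every transformation.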

Root Causes: Buffer Overflows

• Incorrect assumptions about string sizes (chars vs bytes)
• Improper width calculations
• Impact: Enable code execution

Casing - maximum expansion factors

Operation    UTF          Factor   Sample
Lower        8            1.5      Ⱥ U+023A
             16, 32       1        A U+0041
Upper        8, 16, 32    3        ΐ U+0390

Source: Unicode Technical Report 36

Normalization - maximum expansion factors

Operation    UTF          Factor   Sample
NFC          8            3X       𝅘𝅥 U+1D160
             16, 32       3X       שּׁ U+FB2C
NFD          8            3X       ΐ U+0390
             16, 32       4X       ᾂ U+1F82
NFKC/NFKD    8            11X      ﷺ U+FDFA
             16, 32       18X      ﷺ U+FDFA

Source: Unicode Technical Report 36

Root Causes: Guidance for Buffer Overflows

• Know the difference between bytes and chars
• Secure coding
• Leverage existing frameworks and APIs
– ICU, .NET
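The expansion factors quoted from Unicode Technical Report 36 can be reproduced by measuring a sample character before and after normalization. This is a small sketch; the function name is an assumption, and the little-endian codecs are used only to avoid counting a byte order mark.

```python
import unicodedata


def expansion_factors(ch: str, form: str) -> tuple:
    """Return (UTF-8 factor, UTF-16 factor) for normalizing ch to form.

    Each factor is the encoded size after normalization divided by the
    encoded size before it.
    """
    normalized = unicodedata.normalize(form, ch)
    utf8 = len(normalized.encode("utf-8")) / len(ch.encode("utf-8"))
    utf16 = len(normalized.encode("utf-16-le")) / len(ch.encode("utf-16-le"))
    return utf8, utf16
```

For U+FDFA under NFKC this gives 11 in UTF-8 (3 bytes become 33) and 18 in UTF-16 (1 code unit becomes 18), which is why an output buffer sized from the input length is unsafe.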

Root Causes: Controlling Syntax

• White space and line breaks
– E.g. when U+180E acts like U+0020
• Quotation marks
• Impact: Filter evasion, Enable code execution

Attacks and Exploits: Controlling syntax

• Manipulate HTML parsers and JavaScript interpreters
• Control protocols

Case Study: Opera

• Unicode formatter characters exploited for XSS
– Damage: Filter evasion, controlling syntax
– Exploit: Bypass filtering logic with specially crafted characters to leverage cross-site scripting
– Root Cause: Interpreting “white space”
– A problem with HTML 4.0 spec

<a href=[U+180E]onclick=alert()>

MVS U+180E

Root Causes: Guidance for Controlling Syntax

• Question specifications
• Be careful…
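One concrete way to "be careful" is to flag characters that a parser might treat as white space or silently ignore. The sketch below is an assumption about what such a check could look like; the category set and function name are illustrative. U+180E is covered whether the running Unicode version classifies it as a space separator or as a format character.

```python
import unicodedata

# Format characters and all separator categories. The ASCII space is
# allowed explicitly in the function below.
SUSPECT_CATEGORIES = {"Cf", "Zs", "Zl", "Zp"}


def find_hidden_syntax_chars(value: str) -> list:
    """List code points that could act as invisible or alternate white space."""
    return [
        f"U+{ord(ch):04X}"
        for ch in value
        if ch != " " and unicodedata.category(ch) in SUSPECT_CATEGORIES
    ]
```

Applied to the Opera payload, `find_hidden_syntax_chars("#\u180eonclick=alert()")` reports `U+180E`, so the value can be rejected before any browser decides how to interpret it.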

Root Causes: Specifications

1) Character stability
– IDNA/Nameprep based on Unicode 3.2

2) Designs
– Specs are carefully designed but not always perfect

• This could have been a problem
– “When designing a markup language or data protocol, the use of U+FEFF can be restricted to that of Byte Order Mark. In that case, any U+FEFF occurring in the middle of the file can be ignored, or treated as an error.”
– HTML 4.01
• Defines four whitespace characters and explicitly leaves handling other characters up to implementer

Root Causes: Charset Transformations

• Converting between charsets is dangerous
• Mapping tables and algorithms vary across platforms
• Impact: Filter evasion, Enable code execution, Data-loss

Root Causes: Guidance for Charset Transformations

• Avoid if possible
• Use Unicode as the broker
• Beware the PUA mappings
• Transform case and normalize prior to validation and redisplay

Root Causes: Charset Mismatches

• Some charset identifiers are ill-defined
• Vendor implementations vary
• User-agents may sniff if confused
• Attackers manipulate behavior
• Impact: Filter evasion, Enable code execution

Content-Type: charset=ISO-8859-1

<meta http-equiv="Content-Type" content="text/html; charset=shift_jis">

Attacker-controlled input

Root Causes: Guidance for Charset Mismatches

• Force UTF-8
• Error if uncertain
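"Force UTF-8" and "error if uncertain" might look like the following on the receiving side. This is a sketch under the assumption that the application controls both the declared charset it accepts and the header it emits; the names are hypothetical.

```python
import codecs

# Header value the application would always send (assumption: HTML output).
HTML_CONTENT_TYPE = "text/html; charset=utf-8"


def decode_request_body(body: bytes, declared_charset: str) -> str:
    """Accept only bodies declared as UTF-8 and decode them strictly."""
    try:
        canonical = codecs.lookup(declared_charset).name
    except LookupError as exc:
        raise ValueError(f"unknown charset: {declared_charset}") from exc
    if canonical != "utf-8":
        raise ValueError(f"refusing charset {declared_charset}, UTF-8 required")
    # Ill-formed bytes raise UnicodeDecodeError rather than being guessed at.
    return body.decode("utf-8", errors="strict")
```

Looking the label up through `codecs.lookup` accepts aliases such as `UTF8` while still refusing `shift_jis` or `ISO-8859-1`, so no sniffing or fallback path is left for an attacker to steer.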

Unicode Transformations: Agenda

• Unicode crash course
• Root Causes
• Attack Vectors
• Tools

Tools

• Watcher
– Passive Web-app security testing and auditing
• Unibomber
– XSS autopwn testing tool

Tools: Watcher – Some of the Passive Checks Included

• Unicode transformation hot-spots
• User-controlled HTML
• Cross-domain issues
• Insecure cookies
• Insecure HTTP/HTTPS transitions
• SSL protocol and certificate issues
• XSS hot-spots
• Flash issues
• Silverlight issues
• Information disclosure

Tools: Watcher - Web-app Security Testing and Auditing

http://websecuritytool.codeplex.com

Tools: Unibomber – runtime XSS testing tool

• Deterministic testing
• Auto-inject payloads
• Unicode transformers
– < > ‘ “ etc.
• Detect transformations and encoding hotspots

Thank you

Casaba Security
www.casabasecurity.com

Chris Weber
Blog: www.lookout.net
Email: chris@casabasecurity.com
LinkedIn: http://www.linkedin.com/in/chrisweber


Unicode Crash Course

• A large and complex standard

code points
encodings
categorization
normalization
binary properties
case mapping
conversion tables
bi-directional properties
canonical mappings
decomposition types
case folding
best-fit mapping
17 planes
private use ranges
script blocks
escapings

Unicode Crash Course: Code pages and charsets

Shift_jis
Gb2312
ISCII
Windows-1252
ISO-8859-1
EBCDIC 037

Unicode Crash Course: Ad Infinitum

• Unicode can represent them all
• ASCII range is preserved
– U+0000 to U+007F are mapped to ASCII

Unicode Crash Course: Code points

• Unicode 5.1 uses a 21-bit scalar value with space for over 1,100,000 code points

U+0000 to U+10FFFF

Unicode Crash Course: Code Points

A = U+0041

Every character has a unique number

A U+0041

ſ U+017F

Unicode Crash Course: Encodings

UTF-8
– Variable width, 1 to 4 bytes (used to be 6)

UTF-16
– Endianness
– Variable width, 2 or 4 bytes
– Surrogate pairs

UTF-32
– Endianness
– Fixed width, 4 bytes
– Fixed mapping, no algorithms needed
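The widths listed above can be confirmed by encoding a few sample characters. A minimal sketch follows; the function name is an assumption, and the explicit little-endian codecs are chosen so that no byte order mark is counted.

```python
def encoded_widths(ch: str) -> dict:
    """Return the number of bytes one character takes in each encoding form."""
    return {name: len(ch.encode(name)) for name in ("utf-8", "utf-16-le", "utf-32-le")}
```

An ASCII letter takes 1, 2, and 4 bytes respectively. U+FF21 takes 3, 2, and 4. A supplementary-plane character such as U+1D160 takes 4 bytes in all three, using a surrogate pair in UTF-16.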

Unicode Crash Course: Encodings and Escape sequences

U+FF21 FULLWIDTH LATIN CAPITAL LETTER A

%EF%BC%A1
&#xFF21;
&#65313;
\xEF\xBC\xA1
\uFF21
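The different spellings of U+FF21 listed above can be generated from the code point itself, which is useful when building test payloads. This is a sketch; the dictionary keys and function name are illustrative assumptions.

```python
from urllib.parse import quote


def escape_forms(ch: str) -> dict:
    """Return common escaped spellings of a single BMP character."""
    cp = ord(ch)
    utf8 = ch.encode("utf-8")
    return {
        "percent": quote(ch),                       # URL percent-encoding of the UTF-8 bytes
        "html_hex": f"&#x{cp:X};",                  # hexadecimal character reference
        "html_dec": f"&#{cp};",                     # decimal character reference
        "byte_escape": "".join(f"\\x{b:02X}" for b in utf8),
        "unicode_escape": f"\\u{cp:04X}",           # valid for BMP code points only
    }
```

Every one of these forms decodes back to the same character, so a filter has to canonicalize all of them before it compares input against a blocklist.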

Unicode Transformations: Agenda

• Unicode crash course
• Root Causes
• Attack Vectors
• Tools

Unicode Transformations: Overview

• Unicode crash course
• Root Causes
– Visual Spoofing and IDNs
– Best-fit mappings
– Normalization
– Overlong UTF-8
– Over-consumption
– Character substitution
– Character deletion
– Casing
– Buffer overflows
– Controlling Syntax
– Charset transformations
– Charset mismatches
• Tools

Root Causes: Visual Spoofing

• Over 100,000 assigned characters
• Many lookalikes within and across scripts

AΑАᐱᗅᗋᗩᴀᴬA


Page 11: Unraveling Unicode: A Bag of Tricks for Bug Hunting...Unicode Crash Course code points encodings categorization normalization binary properties case mapping conversion tables bi-directional

Black Hat USA - July 2009 © 2009 Chris Weber

Shift_JIS

GB2312

ISCII

Windows-1252

ISO-8859-1

EBCDIC 037

www.casabasecurity.com

Unicode Crash Course: Code pages and charsets

• Unicode can represent them all
• ASCII range is preserved
  – U+0000 to U+007F are mapped to ASCII

Unicode Crash Course: Ad Infinitum

• Unicode 5.1 uses a 21-bit scalar value with space for over 1,100,000 code points
  – U+0000 to U+10FFFF

Unicode Crash Course: Code points

A = U+0041

Every character has a unique number

Unicode Crash Course: Code Points

A: U+0041

ſ: U+017F

UTF-8
  – Variable width, 1 to 4 bytes (used to be 6)

UTF-16
  – Endianness
  – Variable width, 2 or 4 bytes
  – Surrogate pairs

UTF-32
  – Endianness
  – Fixed width, 4 bytes
  – Fixed mapping, no algorithms needed

Unicode Crash Course: Encodings

U+FF21 FULLWIDTH LATIN CAPITAL LETTER A

%EF%BC%A1

&#xFF21;

&#65313;

\xEF\xBC\xA1

\uFF21

Unicode Crash Course: Encodings and Escape sequences
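The forms listed on this slide can be reproduced with a few lines of Python. This is only an illustrative sketch using the standard library; the dictionary keys are labels chosen for this example and are not from the talk.

```python
from urllib.parse import quote

ch = "\uFF21"  # FULLWIDTH LATIN CAPITAL LETTER A

# One code point, several byte-level and escaped representations.
forms = {
    "code point": f"U+{ord(ch):04X}",
    "utf-8 hex": ch.encode("utf-8").hex().upper(),
    "percent-encoded": quote(ch),            # URL escaping of the UTF-8 bytes
    "hex NCR": f"&#x{ord(ch):X};",           # HTML numeric character reference
    "decimal NCR": f"&#{ord(ch)};",
    "utf-16-be hex": ch.encode("utf-16-be").hex().upper(),
    "utf-32-be hex": ch.encode("utf-32-be").hex().upper(),
}

for label, value in forms.items():
    print(f"{label:16} {value}")
```

A filter that only recognizes one of these spellings can miss the same character arriving in another.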

• Unicode crash course
• Root Causes
• Attack Vectors
• Tools

Unicode Transformations: Agenda


• Unicode crash course
• Root Causes
  – Visual Spoofing and IDNs
  – Best-fit mappings
  – Normalization
  – Overlong UTF-8
  – Over-consumption
  – Character substitution
  – Character deletion
  – Casing
  – Buffer overflows
  – Controlling Syntax
  – Charset transformations
  – Charset mismatches
• Tools

Unicode Transformations: Overview

• Over 100,000 assigned characters
• Many lookalikes within and across scripts

AΑАᐱᗅᗋᗩᴀᴬA

Root Causes: Visual Spoofing

Some browsers allow .COM IDNs based on script family
  – (Latin has a big family)

Attack Vectors: IDN homograph attacks

Safari

Opera

www.google.com is not www.gooɡle.com

Latin g: U+0067

Latin ɡ: U+0261

ICANN guidelines v2.0
  – Inclusion-based
  – Script limitations
  – Character limitations

Root Causes: The state of International Domain Names

Deny-all default seems to be the right concept.

A script can cross many blocks. Even with limited script choices there's plenty to choose from.

Great for domain labels, but subdomain labels are still open to punctuation and syntax spoofing.

• Non-Unicode attacks
• Confusables
• Invisibles
• Problematic font-rendering
• Manipulating Combining Marks
• Bidi and syntax spoofing

Attack Vectors: Visual spoofing Vectors

rn can look like m in certain fonts

Attack Vectors: Non-Unicode homograph attacks

www.mullets.com is not www.rnullets.com

Latin m: U+006D

Latin r n: U+0072 U+006E

Are you using mono-width fonts?

0 and O

1 and l

5 and S

Attack Vectors: Non-Unicode homograph attacks

Classic long URLs

httploginfacebookintvitationvideomessageid-

h048892r39sessionnfbidcomhomehtmdisbursements

Attack Vectors: Non-Unicode homograph attacks

www.ɑpple.com

All Latin, using Latin small letter alpha ‘ɑ’

www.faϲebook.com

Mixed Latin/Greek, with lunate sigma symbol ‘ϲ’

www.аЬс.com

All Cyrillic ‘abc’

Attack Vectors: Single-script and The Confusables
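The three labels above can be told apart programmatically. The sketch below approximates each character's script by taking the first word of its Unicode character name; that is an assumption made for brevity (the real Script property lives in the Unicode data file Scripts.txt, which Python's standard library does not expose), and `scripts_in` is a hypothetical helper.

```python
import unicodedata

def scripts_in(label: str) -> set[str]:
    """Approximate the scripts used in a label from Unicode character names."""
    return {unicodedata.name(ch).split()[0] for ch in label if ch.isalpha()}

labels = {
    "apple with Latin alpha": "\u0251pple",
    "facebook with lunate sigma": "fa\u03f2ebook",
    "all-Cyrillic abc lookalike": "\u0430\u042c\u0441",
}

for description, label in labels.items():
    print(f"{description:28} {sorted(scripts_in(label))}")
```

Only the second label mixes scripts. The first and third are single-script, which is why a mixed-script check alone does not catch every confusable.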

Browsers whitelist .ORG

Others don't necessarily, but…

Attack Vectors: IDN homograph attacks

www.mozilla.org is not www.mozílla.org

Latin i: U+0069

Latin í: U+00ED

(This case doesn't work anymore)

Attack Vectors: IDN Syntax Spoofing with lookalikes

http://www.google.com／path／file.nottrusted.org

FULLWIDTH SOLIDUS U+FF0F

(Normalized to a U+002F)

http://www.google.com/path/file.nottrusted.org

SOLIDUS U+002F

(However, punctuation is not required…)

http://www.google.comノpathノfile.nottrusted.org

Katakana No U+FF89

Attack Vectors: The Invisibles

Attack Vectors: Visual Spoofing with Bidi Explicit Directional Overrides

Commonly occur in charset transformations and even innocuous APIs

Impact: Filter evasion, Enable code execution

When σ becomes s

U+03C3 GREEK SMALL LETTER SIGMA

When ′ becomes '

U+2032 PRIME

Root Causes: Best-fit mappings

• Scrutinize character/charset manipulation APIs
• Use EncoderFallback with System.Text.Encoding
• Set WC_NO_BEST_FIT_CHARS flag with WideCharToMultiByte()
• Use Unicode end-to-end

Root Causes: Guidance for Best-Fit mappings
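The slide names .NET and Win32 APIs. As a rough analogue in Python, the sketch below encodes to a legacy charset in strict mode so that an unmappable character raises instead of being silently replaced by a lookalike. The helper name `to_legacy_strict` and the choice of cp1252 are assumptions for illustration.

```python
def to_legacy_strict(text: str, charset: str = "cp1252") -> bytes:
    """Encode to a legacy charset, raising rather than substituting lookalikes."""
    return text.encode(charset, errors="strict")

# U+00E9 exists in cp1252; GREEK SMALL LETTER SIGMA and PRIME do not.
for sample in ("caf\u00e9", "\u03c3", "\u2032"):
    try:
        print(repr(sample), "->", to_legacy_strict(sample))
    except UnicodeEncodeError as exc:
        print(repr(sample), "rejected:", exc.reason)
```

Failing loudly keeps σ from turning into s, or ′ into an apostrophe, after validation has already run.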

• A popular social networking site in 2008
• Implemented complex filtering logic to prevent XSS
  – Attack: Filter evasion, code execution
  – Exploit: Bypass filtering logic with best-fit mappings to leverage cross-site scripting
  – Root Cause: best-fit mappings

Case Study: Social Networking (Best-fit mappings)

-moz-binding()

was not allowed, but…

-[U+FF4D]oz-binding()

would best-fit map to -moz-binding()

Case Study: Social Networking (Best-fit mappings)
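The order of operations is what makes this bypass work. The sketch below is a simplified model, not the site's actual code: `BEST_FIT` is a hypothetical one-entry mapping table, `naive_filter` stands in for the blacklist, and `legacy_convert` stands in for a later charset conversion.

```python
# Hypothetical best-fit table: FULLWIDTH LATIN SMALL LETTER M folds to ASCII 'm'.
BEST_FIT = {0xFF4D: "m"}

def naive_filter(css: str) -> bool:
    """Return True if the input passes a blacklist check for -moz-binding."""
    return "-moz-binding" not in css.lower()

def legacy_convert(css: str) -> str:
    """Stand-in for a later conversion step that applies best-fit mappings."""
    return css.translate(BEST_FIT)

payload = "-\uff4doz-binding(url(x))"

passed_filter = naive_filter(payload)        # the filter sees no ASCII 'm'
after_conversion = legacy_convert(payload)   # the prohibited token reappears

print(passed_filter, after_conversion)
```

Running the conversion first and filtering its output would have rejected the payload.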

Normalizing strings after validation is dangerous

Impact: Filter evasion, Enable code execution

Root Causes: Normalization

İ becomes I + ◌̇

U+0130 → U+0049 U+0307

But are there dangerous characters?

You bet… with NFKC and NFKD you could control HTML or other parsing

﹤ becomes <

U+FE64 → U+003C

toNFKC("﹤script>") = "<script>"

Normalize strings before validation

NFKC: first defense against visual spoofing

Root Causes: Guidance for Normalization
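A minimal sketch of "normalize, then validate" using Python's `unicodedata` module. The function name `validate_after_nfkc` and its rule (reject angle brackets) are assumptions chosen to mirror the `<script>` example above, not a complete XSS defense.

```python
import unicodedata

def validate_after_nfkc(text: str) -> str:
    """Normalize to NFKC first, then validate the normalized form."""
    normalized = unicodedata.normalize("NFKC", text)
    if "<" in normalized or ">" in normalized:
        raise ValueError("markup characters present after normalization")
    return normalized

# The transformations shown on the preceding slides:
print(unicodedata.normalize("NFD", "\u0130") == "I\u0307")
print(unicodedata.normalize("NFKC", "\ufe64script>"))

# A string whose compatibility characters only become markup after NFKC.
try:
    validate_after_nfkc("\ufe64script\ufe65")
except ValueError as exc:
    print("rejected:", exc)
```

The value returned by the function is the one to store and redisplay, so that no later step normalizes it again into something the check never saw.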

Non-shortest or overlong UTF-8

Impact: Filter evasion, Enable code execution

Application gets C0 A7

OS/Framework sees 27

Database gets '

Root Causes: Non-shortest form UTF-8

• Unicode specification forbids
  – Generation of non-shortest form
  – Interpretation of non-shortest form for BMP
• Validate UTF-8 encoding (throw on error)

Root Causes: Guidance for Non-shortest form UTF-8
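To see why C0 A7 is read as 27 by a careless decoder, the sketch below contrasts a deliberately unsafe two-byte decoder with Python's strict UTF-8 codec. `lenient_decode_2byte` is a hypothetical function written only to show the missing shortest-form check.

```python
def lenient_decode_2byte(raw: bytes) -> str:
    """Deliberately unsafe: decode a 2-byte UTF-8 sequence with no shortest-form check."""
    lead, trail = raw
    return chr(((lead & 0x1F) << 6) | (trail & 0x3F))

overlong = bytes([0xC0, 0xA7])

# The unsafe decoder turns the overlong sequence into an apostrophe (U+0027).
print(repr(lenient_decode_2byte(overlong)))

# A conforming decoder throws instead of interpreting the bytes.
try:
    overlong.decode("utf-8")
except UnicodeDecodeError as exc:
    print("strict decoder rejected:", exc.reason)
```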

How many ways can you say

Attack Vectors: Directory traversal

• Unassigned code points
  – U+2073
• Illegal code points
  – Half a surrogate pair
• Code points with special meaning
  – U+FEFF is the BOM
• Impact: Filter evasion, Enable code execution

Root Causes: Handling the Unexpected

Over-consuming ill-formed byte sequences

Big problem with MBCS lead bytes

<41 C2 3E 41> becomes

<41 41>

Root Causes: Handling the Unexpected (Over-consumption)

<img src=[0xC2]> onerror=alert(1)<br >

becomes

<img src=> onerror=alert(1)<br >

Correcting insecurely rather than failing
  – Substituting a ‘’ or a ‘’ would be bad

Root Causes: Handling the Unexpected (Character-substitution)

“deletion of noncharacters” (UTR-36)

<scr[U+FEFF]ipt> becomes <script>

Root Causes: Handling the Unexpected (Character-deletion)

• Fail or error
• Use U+FFFD instead
  – A common alternative is ‘’ which can be safe

Root Causes: Solutions for Handling the Unexpected
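The three policies (fail, substitute U+FFFD, delete) can be compared on the ill-formed sequence 41 C2 3E 41 from the over-consumption slide. This sketch uses CPython's built-in UTF-8 codec, which does not over-consume the `>` byte; other decoders may behave differently.

```python
data = bytes([0x41, 0xC2, 0x3E, 0x41])  # 'A', stray lead byte, '>', 'A'

# Policy 1: fail or error.
try:
    data.decode("utf-8")  # errors="strict" is the default
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

# Policy 2: substitute U+FFFD for the ill-formed byte only.
replaced = data.decode("utf-8", errors="replace")

# Policy 3: silently delete the ill-formed byte (the risky option).
ignored = data.decode("utf-8", errors="ignore")

print(strict_failed, repr(replaced), repr(ignored))
```

With replacement the `>` survives and the damage stays visible, whereas deletion leaves no trace that the input was ever malformed.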

• Bypass filters, WAFs, NIDS, and validation
• Exploit delivery techniques
  – E.g. cross-site scripting (the buffer overflow of the Web)

Attack Vectors: Filter evasion

Safari and Firefox BOM consumption
  – Attack: Filter evasion, code execution
  – Exploit: Bypass filtering logic with specially crafted strings to leverage cross-site scripting
  – Root Cause: Character deletion

<a href="java[U+FEFF]script:alert('XSS')>

Can be nastier:

<a h[U+FEFF]ref="java[U+FEFF]script:al[U+FEFF]ert('XSS')>

Case Study: Apple and Mozilla

A Closer Look: The BOM

BOM: U+FEFF
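A simplified model of this case study, assuming a server-side filter that checks the URL scheme and a consumer that later drops U+FEFF. Both `looks_safe` and `strip_bom` are hypothetical helpers; the real browsers' parsing logic is not reproduced here.

```python
def looks_safe(url: str) -> bool:
    """Naive blacklist: reject only URLs that literally start with javascript:."""
    return not url.lower().startswith("javascript:")

def strip_bom(text: str) -> str:
    """Model of a consumer that silently deletes U+FEFF wherever it appears."""
    return text.replace("\ufeff", "")

payload = "java\ufeffscript:alert('XSS')"

accepted_by_filter = looks_safe(payload)   # the BOM hides the scheme
seen_by_consumer = strip_bom(payload)      # the scheme is restored downstream

print(accepted_by_filter, seen_by_consumer)
print(looks_safe(strip_bom(payload)))      # deleting first, then filtering
```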

• Attackers manipulate casing operations to inject otherwise prohibited characters
• Casing can multiply the buffer sizes needed
• Impact: Filter evasion, Enable code execution

Root Causes: Casing

toLower("İ") == "i"

toLower("scrİpt") == "script"

len(x) != len(toLower(x))

• Perform casing operations before validation
• Leverage existing frameworks and APIs
  – ICU, .NET

Root Causes: Guidance for Casing
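Casing results depend on the platform and locale. As one data point, the sketch below uses CPython's locale-independent `str.lower()` and `str.upper()`, where U+0130 lowercases to `i` followed by U+0307 rather than to a bare `i`. The length change is the point either way.

```python
samples = {
    "U+0130 capital I with dot above": "\u0130",
    "U+00DF sharp s": "\u00df",
    "U+0390 iota with dialytika and tonos": "\u0390",
}

for name, ch in samples.items():
    lower, upper = ch.lower(), ch.upper()
    print(f"{name:40} len={len(ch)} lower_len={len(lower)} upper_len={len(upper)}")

# Case-fold before validating, and validate the folded string itself.
folded = "scr\u0130pt".lower()
print(repr(folded), folded == "script")
```

Buffers and filters sized or written for the original string may not fit the case-mapped one.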

• Incorrect assumptions about string sizes (chars vs bytes)
• Improper width calculations
• Impact: Enable code execution

Root Causes: Buffer Overflows

Casing: maximum expansion factors

Root Causes: Buffer Overflows

Operation    UTF          Factor   Sample
Lower        8            1.5X     Ⱥ U+023A
             16, 32       1X       A U+0041
Upper        8, 16, 32    3X       ΐ U+0390

Source: Unicode Technical Report 36

Normalization: maximum expansion factors

Operation    UTF          Factor   Sample
NFC          8            3X       𝅘𝅥𝅮 U+1D160
             16, 32       3X       שּׁ U+FB2C
NFD          8            3X       ΐ U+0390
             16, 32       4X       ᾂ U+1F82
NFKC/NFKD    8            11X      ﷺ U+FDFA
             16, 32       18X      ﷺ U+FDFA

Source: Unicode Technical Report 36

• Know the difference between bytes and chars
• Secure coding
• Leverage existing frameworks and APIs
  – ICU, .NET

Root Causes: Guidance for Buffer Overflows
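Some of the expansion factors quoted from UTR 36 can be checked directly by measuring encoded sizes before and after a transformation. The helper `expansion` is a hypothetical name for this sketch, and the exact results depend on the Unicode version bundled with the Python runtime.

```python
import unicodedata

def expansion(ch: str, transform, encoding: str) -> float:
    """Ratio of encoded size after a transformation to encoded size before it."""
    before = len(ch.encode(encoding))
    after = len(transform(ch).encode(encoding))
    return after / before

def nfkc(s: str) -> str:
    return unicodedata.normalize("NFKC", s)

# Bytes (UTF-8) and code units (UTF-16, 2 bytes each) grow by different factors.
print("NFKC  U+FDFA utf-8 :", expansion("\ufdfa", nfkc, "utf-8"))
print("NFKC  U+FDFA utf-16:", expansion("\ufdfa", nfkc, "utf-16-le"))
print("lower U+023A utf-8 :", expansion("\u023a", str.lower, "utf-8"))
print("upper U+0390 utf-8 :", expansion("\u0390", str.upper, "utf-8"))
```

A buffer sized from the input length, in either bytes or characters, is too small in each of these cases.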

• White space and line breaks
  – E.g. when U+180E acts like U+0020
• Quotation marks
• Impact: Filter evasion, Enable code execution

Root Causes: Controlling Syntax

• Manipulate HTML parsers and JavaScript interpreters
• Control protocols

Attacks and Exploits: Controlling syntax

• Unicode formatter characters exploited for XSS
  – Damage: Filter evasion, controlling syntax
  – Exploit: Bypass filtering logic with specially crafted characters to leverage cross-site scripting
  – Root Cause: Interpreting “white space”
  – A problem with HTML 4.0 spec

Case Study: Opera

<a href=[U+180E]onclick=alert()>

MVS: U+180E

• Question specifications
• Be careful…

Root Causes: Guidance for Controlling Syntax

1) Character stability
  – IDNA/Nameprep based on Unicode 3.2

2) Designs
  – Specs are carefully designed but not always perfect
  • This could have been a problem
    – “When designing a markup language or data protocol, the use of U+FEFF can be restricted to that of Byte Order Mark. In that case, any U+FEFF occurring in the middle of the file can be ignored, or treated as an error.”
  – HTML 4.01
    • Defines four whitespace characters and explicitly leaves handling of other characters up to the implementer

Root Causes: Specifications

• Converting between charsets is dangerous
• Mapping tables and algorithms vary across platforms
• Impact: Filter evasion, Enable code execution, Data-loss

Root Causes: Charset Transformations

• Avoid if possible
• Use Unicode as the broker
• Beware the PUA mappings
• Transform case and normalize prior to validation and redisplay

Root Causes: Guidance for Charset Transformations

• Some charset identifiers are ill-defined
• Vendor implementations vary
• User-agents may sniff if confused
• Attackers manipulate behavior
• Impact: Filter evasion, Enable code execution

Root Causes: Charset Mismatches

Content-Type: charset=ISO-8859-1

<meta http-equiv="Content-Type" content="text/html; charset=shift_jis">

Attacker-controlled input

• Force UTF-8
• Error if uncertain

Root Causes: Guidance for Charset Mismatches
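The same two bytes mean different things under the two charsets from the mismatch example, which is what an attacker exploits when a filter and a browser disagree. The sketch below shows this, along with one way to "force UTF-8 and error if uncertain". `decode_utf8_or_fail` is a hypothetical helper, and its exact-match policy on the declared charset is an assumption.

```python
def decode_utf8_or_fail(raw: bytes, declared: str) -> str:
    """Accept only input declared as UTF-8 and well-formed as UTF-8."""
    if declared.strip().lower() != "utf-8":
        raise ValueError(f"unsupported charset declaration: {declared!r}")
    return raw.decode("utf-8")  # strict: raises UnicodeDecodeError if ill-formed

raw = bytes([0x81, 0x5C])

# ISO-8859-1 sees two characters, the second being a backslash.
as_latin1 = raw.decode("iso-8859-1")
# Shift_JIS sees one double-byte character; the 0x5C is swallowed as a trail byte.
as_sjis = raw.decode("shift_jis")

print(len(as_latin1), "\\" in as_latin1)
print(len(as_sjis), "\\" in as_sjis)

try:
    decode_utf8_or_fail(raw, "shift_jis")
except ValueError as exc:
    print("rejected:", exc)
```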


• Watcher
  – Passive Web-app security testing and auditing
• Unibomber
  – XSS autopwn testing tool

Tools

• Unicode transformation hot-spots
• User-controlled HTML
• Cross-domain issues
• Insecure cookies
• Insecure HTTP/HTTPS transitions
• SSL protocol and certificate issues
• XSS hot-spots
• Flash issues
• Silverlight issues
• Information disclosure

Tools: Watcher – Some of the Passive Checks Included

http://websecuritytool.codeplex.com

Tools: Watcher - Web-app Security Testing and Auditing

• Deterministic testing
• Auto-inject payloads
• Unicode transformers
  – < > ‘ “ etc.
• Detect transformations and encoding hotspots

Tools: Unibomber – runtime XSS testing tool

Thank you

Casaba Security: www.casabasecurity.com

Chris Weber

Blog: www.lookout.net

Email: chris@casabasecurity.com

LinkedIn: http://www.linkedin.com/in/chrisweber

Page 12: Unraveling Unicode: A Bag of Tricks for Bug Hunting...Unicode Crash Course code points encodings categorization normalization binary properties case mapping conversion tables bi-directional

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Unicode can represent them all

bull ASCII range is preserved

ndash U+0000 to U+007F are mapped to ASCII

wwwcasabasecuritycom

Unicode Crash CourseAd Infinitum

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Unicode 51 uses a 21-bit scalar value with space for over 1100000 code points

U+0000 to U+10FFFF

wwwcasabasecuritycom

Unicode Crash CourseCode points

Black Hat USA - July 2009 copy 2009 Chris Weber

A = U+0041

Every character has a unique number

wwwcasabasecuritycom

Unicode Crash CourseCode Points

Black Hat USA - July 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Unicode Crash Course

AU+0041

Black Hat USA - July 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Unicode Crash Course

ſU+017F

Black Hat USA - July 2009 copy 2009 Chris Weber

UTF-8 ndash variable width 1 to 4 bytes (used to be 6)

UTF-16ndash Endianessndash Variable width 2 or 4 bytesndash Surrogate pairs

UTF-32ndash Endianessndash Fixed width 4 bytesndash Fixed mapping no algorithms needed

wwwcasabasecuritycom

Unicode Crash CourseEncodings

Black Hat USA - July 2009 copy 2009 Chris Weber

U+FF21 FULLWIDTH LATIN CAPITAL LETTER A

EFBCA1

ampxFF21

amp65313

xEFxBCxA1

uFF21

wwwcasabasecuritycom

Unicode Crash CourseEncodings and Escape sequences

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Unicode crash course

bull Root Causes

bull Attack Vectors

bull Tools

wwwcasabasecuritycom

Unicode TransformationsAgenda

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Unicode crash course

bull Root Causes

bull Attack Vectors

bull Tools

wwwcasabasecuritycom

Unicode TransformationsAgenda

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Unicode crash coursebull Root Causes

ndash Visual Spoofing and IDNrsquosndash Best-fit mappingsndash Normalizationndash Overlong UTF-8ndash Over-consumptionndash Character substitutionndash Character deletionndash Casingndash Buffer overflowsndash Controlling Syntaxndash Charset transformationsndash Charset mismatches

bull Tools

wwwcasabasecuritycom

Unicode TransformationsOverview

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Over 100000 assigned characters

bull Many lookalikes within and across scripts

AΑАᐱᗅᗋᗩᴀᴬA

wwwcasabasecuritycom

Root CausesVisual Spoofing

Black Hat USA - July 2009 copy 2009 Chris Weber

Some browsers allow COM IDNrsquos

based on script family

ndash (Latin has a big family)

wwwcasabasecuritycom

Attack VectorsIDN homograph attacks

Black Hat USA - July 2009 copy 2009 Chris Weber

Safari

wwwcasabasecuritycom

Attack VectorsIDN homograph attacks

Black Hat USA - July 2009 copy 2009 Chris Weber

Opera

wwwcasabasecuritycom

Attack VectorsIDN homograph attacks

Black Hat USA - July 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsIDN homograph attacks

wwwgooglecom is not wwwgooɡlecom

Latin U+0069

LatinU+0261

Black Hat USA - July 2009 copy 2009 Chris Weber

ICANN guidelines v20

ndash Inclusion-based

ndash Script limitations

ndash Character limitations

wwwcasabasecuritycom

Root CausesThe state of International Domain Names

Deny-all default seems to be the right concept

A script can cross many blocks Even with limited script choices therersquos plenty to choose from

Great for domain labels but sub domain labels still open to punctuation and syntax spoofing

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Non-Unicode attacks

bull Confusables

bull Invisibles

bull Problematic font-rendering

bull Manipulating Combining Marks

bull Bidi and syntax spoofing

wwwcasabasecuritycom

Attack VectorsVisual spoofing Vectors

Black Hat USA - July 2009 copy 2009 Chris Weber

rn can look like m in certain fonts

wwwcasabasecuritycom

Attack VectorsNon-Unicode homograph attacks

wwwmulletscom is not wwwrnulletscom

Latin U+006D

LatinU+0073 U+006E

Black Hat USA - July 2009 copy 2009 Chris Weber

Are you using mono-width fonts

0 and O

1 and l

5 and S

wwwcasabasecuritycom

Attack VectorsNon-Unicode homograph attacks

Black Hat USA - July 2009 copy 2009 Chris Weber

Classic long URLrsquos

httploginfacebookintvitationvideomessageid-

h048892r39sessionnfbidcomhomehtmdisbursements

wwwcasabasecuritycom

Attack VectorsNon-Unicode homograph attacks

Black Hat USA - July 2009 copy 2009 Chris Weber

wwwɑpplecom

All Latin using Latin small letter Alpha lsquoɑrsquo

wwwfaϲebookcom

Mixed LatinGreek with lunate sigma symbol lsquocrsquo

wwwаЬсcom

All Cyrillic lsquoabcrsquo

wwwcasabasecuritycom

Attack VectorsSingle-script and The Confusables

Black Hat USA - July 2009 copy 2009 Chris Weber

Browsers whitelist ORG

wwwcasabasecuritycom

Attack VectorsIDN homograph attacks

Black Hat USA - July 2009 copy 2009 Chris Weber

Others donrsquot necessarily buthellip

wwwcasabasecuritycom

Attack VectorsIDN homograph attacks

Black Hat USA - July 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsIDN homograph attacks

wwwmozillaorg is not wwwmoziacutellaorg

Latin U+0069

LatinU+00ED

Black Hat USA - July 2009 copy 2009 Chris Weber

(This case doesnrsquot work anymore)

wwwcasabasecuritycom

Attack VectorsIDN Syntax Spoofing with lookalikes

httpwwwgooglecompathfilenottrustedorg

FULLWIDTH SOLIDUSU+FF0F

Black Hat USA - July 2009 copy 2009 Chris Weber

(Normalized to a U+002F)

wwwcasabasecuritycom

Attack VectorsIDN Syntax Spoofing with lookalikes

httpwwwgooglecompathfilenottrustedorg

SOLIDUSU+002F

Black Hat USA - July 2009 copy 2009 Chris Weber

(However punctuation not requiredhellip)

wwwcasabasecuritycom

Attack VectorsIDN Syntax Spoofing with lookalikes

httpwwwgooglecomノpathノfilenottrustedorg

Katakana NoU+FF89

Black Hat USA - July 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsThe Invisibles

Black Hat USA - July 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsVisual Spoofing with Bidi Explicit Directional Overrides

Black Hat USA - July 2009 copy 2009 Chris Weber

Commonly occur in charset transformations and even innocuous APIrsquos

Impact Filter evasion Enable code execution

When σ becomes s

U+03C3 GREEK SMALL LETTER SIGMA

When prime becomes

U+2032 PRIME

wwwcasabasecuritycom

Root CausesBest-fit mappings

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Scrutinize charactercharset manipulation APIrsquos

bull Use EncoderFallback with SystemTextEncoding

bull Set WC_NO_BEST_FIT_CHARS flag with WideCharToMultiByte()

bull Use Unicode end-to-end

wwwcasabasecuritycom

Root CausesGuidance for Best-Fit mappings

Black Hat USA - July 2009 copy 2009 Chris Weber

bull A popular social networking site in 2008

bull Implemented complex filtering logic to prevent XSS

ndash Attack Filter evasion code execution

ndash Exploit Bypass filtering logic with best-fit mappings to leverage cross-site scripting

ndash Root Cause best-fit mappings

wwwcasabasecuritycom

Case Study Social NetworkingBest-fit mappings

Black Hat USA - July 2009 copy 2009 Chris Weber

-moz-binding()

was not allowed buthellip

-[U+ff4d]oz-binding()

would best-fit map

wwwcasabasecuritycom

Case Study Social NetworkingBest-fit mappings

Black Hat USA - July 2009 copy 2009 Chris Weber

Normalizing strings after validation is dangerous

Impact Filter evasion Enable code execution

wwwcasabasecuritycom

Root CausesNormalization

Black Hat USA - July 2009 copy 2009 Chris Weber

İ becomes I +

wwwcasabasecuritycom

Root CausesNormalization

U+0130 U+0049 U+0307

Black Hat USA - July 2009 copy 2009 Chris Weber

But are there dangerous characters

You bethellip with NFKC and NFKD you could control HTML or other parsing

﹤ becomes lt

wwwcasabasecuritycom

Root CausesNormalization

U+FE64 U+003C

Black Hat USA - July 2009 copy 2009 Chris Weber

﹤ becomes lt

toNFKC(ldquo﹤scriptgtrdquo) = ldquoltscriptgtrdquo

wwwcasabasecuritycom

Root CausesNormalization

U+FE64 U+003C

Black Hat USA - July 2009 copy 2009 Chris Weber

Normalize strings before validation

NFKC first defense against Visual spoofing

wwwcasabasecuritycom

Root CausesGuidance for Normalization

Black Hat USA - July 2009 copy 2009 Chris Weber

Non-shortest or overlong UTF-8

Impact Filter evasion Enable code execution

Application gets C0A7

OSFramework sees 27

Database gets

wwwcasabasecuritycom

Root CausesNon-shortest form UTF-8

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Unicode specification forbids

ndash Generation of non-shortest form

ndash Interpretation of non-shortest form for BMP

bull Validate UTF-8 encoding (throw on error)

wwwcasabasecuritycom

Root CausesGuidance for Non-shortest form UTF-8

Black Hat USA - July 2009 copy 2009 Chris Weber

How many ways can you say

wwwcasabasecuritycom

Attack VectorsDirectory traversal

Black Hat USA - July 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack Vectors

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Unassigned code points

ndash U+2073

bull Illegal code points

ndash Half a surrogate pair

bull Code points with special meaning

ndash U+FEFF is the BOM

bull Impact Filter evasion Enable code execution

wwwcasabasecuritycom

Root CausesHandling the Unexpected

Black Hat USA - July 2009 copy 2009 Chris Weber

Over-consuming ill-formed byte sequences

Big problem with MBCS lead bytes

lt41 C2 3E 41gt becomes

lt41 41gt

wwwcasabasecuritycom

Root CausesHandling the Unexpected Over-consumption

Black Hat USA - July 2009 copy 2009 Chris Weber

ltimg src=[0xC2]gt onerror=alert(1)ltbr gt

becomes

ltimg src=gt onerror=alert(1)ltbr gt

wwwcasabasecuritycom

Root CausesHandling the Unexpected Over-consumption

Black Hat USA - July 2009 copy 2009 Chris Weber

Correcting insecurely rather than failing

ndash Substituting a lsquorsquo or a lsquorsquo would be bad

wwwcasabasecuritycom

Root CausesHandling the Unexpected Character-substitution

Black Hat USA - July 2009 copy 2009 Chris Weber

ldquodeletion of noncharactersrdquo (UTR-36)

wwwcasabasecuritycom

Root CausesHandling the Unexpected Character-deletion

Black Hat USA - July 2009 copy 2009 Chris Weber

ltscr[U+FEFF]iptgt becomes ltscriptgt

wwwcasabasecuritycom

Root CausesHandling the Unexpected Character-deletion

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Fail or error

bull Use U+FFFD instead

ndash A common alternative is lsquorsquo which can be safe

wwwcasabasecuritycom

Root CausesSolutions for Handling the Unexpected

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Bypass filters WAFrsquos NIDS and validation

bull Exploit delivery techniques

ndash Eg Cross-site scripting (buffer overflow of the Web)

wwwcasabasecuritycom

Attack VectorsFilter evasion

Black Hat USA - July 2009 copy 2009 Chris Weber

Safari and Firefox BOM consumptionndash Attack Filter evasion code execution

ndash Exploit Bypass filtering logic with specially crafted strings to leverage cross-site scripting

ndash Root Cause Character deletion

lta href=ldquojava[U+FEFF]scriptalert(bdquoXSS‟)gt

Can be nastier

lta h[U+FEFF]ref=ldquojava[U+FEFF]scriptal[U+FEFF]ert(bdquoXSS‟)gt

wwwcasabasecuritycom

Case Study Apple and Mozilla

Black Hat USA - July 2009 copy 2009 Chris Weberwwwcasabasecuritycom

A Closer Look The BOM

BOMU+FEFF

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Attackers manipulate casing operations to inject otherwise prohibited characters

bull Casing can multiply the buffer sizes needed

bull Impact Filter evasion Enable code execution

wwwcasabasecuritycom

Root CausesCasing

Black Hat USA - July 2009 copy 2009 Chris Weber

toLower(ldquoİrdquo) == ldquoirdquo

toLower(ldquoscrİptrdquo) == ldquoscriptrdquo

wwwcasabasecuritycom

Root CausesCasing

Black Hat USA - July 2009 copy 2009 Chris Weber

len(x) = len(toLower(x))

wwwcasabasecuritycom

Root CausesCasing

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Perform casing operations before validation

bull Leverage existing frameworks and APIrsquos

ndash ICU Net

wwwcasabasecuritycom

Root CausesGuidance for Casing

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Incorrect assumptions about string sizes (chars vs bytes)

bull Improper width calculations

bull Impact Enable code execution

wwwcasabasecuritycom

Root CausesBuffer Overflows

Black Hat USA - July 2009 copy 2009 Chris Weber

Casing - maximum expansion factors

wwwcasabasecuritycom

Root CausesBuffer Overflows

Operation UTF Factor Sample

Lower 8 15 Ⱥ U+023A

16 32 1 A U+0041

Upper 8 16 32 3 ΐ U+0390Source Unicode Technical Report 36

Black Hat USA - July 2009 copy 2009 Chris Weber

Normalization- maximum expansion factors

wwwcasabasecuritycom

Root CausesBuffer Overflows

Operation UTF Factor Sample

NFC8 3X 119136 U+1D160

16 32 3X ש U+FB2C

NFD8 3X ΐ U+0390

16 32 4X ᾂ U+1F82

NFKCNFKD8 11X

ملسو هيلع هللا ىلص U+FDFA16 32 18X

Source Unicode Technical Report 36

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Know the difference between bytes and chars

bull Secure coding

bull Leverage existing frameworks and APIrsquos

ndash ICU Net

wwwcasabasecuritycom

Root CausesGuidance for Buffer Overflows

Black Hat USA - July 2009 copy 2009 Chris Weber

bull White space and line breaks

ndash Eg when U+180E acts like U+0020

bull Quotation marks

bull Impact Filter evasion Enable code execution

wwwcasabasecuritycom

Root CausesControlling Syntax

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Manipulate HTML parsers and javascriptinterpreters

bull Control protocols

wwwcasabasecuritycom

Attacks and ExploitsControlling syntax

Case Study: Opera

• Unicode formatter characters exploited for XSS
– Damage: filter evasion, controlling syntax
– Exploit: bypass filtering logic with specially crafted characters to leverage cross-site scripting
– Root cause: interpreting “white space”
– A problem with the HTML 4.0 spec

<a href=[U+180E]onclick=alert()>

MVS = U+180E (MONGOLIAN VOWEL SEPARATOR)

Root Causes: Guidance for Controlling Syntax

• Question specifications

• Be careful…
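One way to act on this guidance is to flag any character that a downstream parser might treat as white space or silently ignore. The sketch below is an illustration in Python, not the deck's own tooling; the allow-list and the set of suspect general categories are assumptions chosen for the example, and `find_syntax_risks` is a hypothetical name.

```python
import unicodedata

# Assumption: only these ASCII characters are accepted as white space.
ALLOWED_SPACE = {" ", "\t", "\n", "\f", "\r"}

# Separator, format, and control categories that parsers may treat as space
# or drop. U+180E is Cf in current Unicode data and was Zs in older versions,
# so both categories are listed.
SUSPECT_CATEGORIES = {"Zs", "Zl", "Zp", "Cf", "Cc"}


def find_syntax_risks(text: str) -> list[tuple[int, str]]:
    """Return (index, code point) for each character that could alter syntax."""
    risks = []
    for index, char in enumerate(text):
        if char in ALLOWED_SPACE:
            continue
        if unicodedata.category(char) in SUSPECT_CATEGORIES:
            risks.append((index, f"U+{ord(char):04X}"))
    return risks
```

A validator could reject input whenever the returned list is non-empty, which fails closed instead of guessing how a given browser will tokenize the markup.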

Root Causes: Specifications

1) Character stability
– IDNA/Nameprep is based on Unicode 3.2

2) Designs
– Specs are carefully designed but not always perfect

• This could have been a problem:
– “When designing a markup language or data protocol, the use of U+FEFF can be restricted to that of Byte Order Mark. In that case, any U+FEFF occurring in the middle of the file can be ignored, or treated as an error.”

– HTML 4.01
• Defines four whitespace characters and explicitly leaves handling of other characters up to the implementer

Root Causes: Charset Transformations

• Converting between charsets is dangerous

• Mapping tables and algorithms vary across platforms

• Impact: filter evasion, enables code execution, data loss

Root Causes: Guidance for Charset Transformations

• Avoid if possible

• Use Unicode as the broker

• Beware the PUA mappings

• Transform case and normalize prior to validation and redisplay

Root Causes: Charset Mismatches

• Some charset identifiers are ill-defined

• Vendor implementations vary

• User-agents may sniff if confused

• Attackers manipulate behavior

• Impact: filter evasion, enables code execution

Content-Type: charset=ISO-8859-1

<meta http-equiv=Content-Type content="text/html; charset=shift_jis">

Attacker-controlled input

Root Causes: Guidance for Charset Mismatches

• Force UTF-8

• Error if uncertain
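Both bullets can be sketched in a few lines of Python. This is an illustrative example only; `html_response_headers` and `decode_request_body` are hypothetical helpers, not functions from any framework. The response always declares UTF-8, and incoming bytes that are not valid UTF-8 cause an error instead of a guess.

```python
def html_response_headers() -> dict[str, str]:
    """Declare the charset explicitly so the user-agent never has to sniff."""
    return {"Content-Type": "text/html; charset=utf-8"}


def decode_request_body(raw: bytes) -> str:
    """Accept only well-formed UTF-8; raise rather than fall back to a guess."""
    try:
        return raw.decode("utf-8", errors="strict")
    except UnicodeDecodeError as exc:
        raise ValueError("request body is not valid UTF-8") from exc
```

For example, bytes produced by a Shift_JIS encoder are rejected by `decode_request_body` instead of being reinterpreted under a different charset.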

Unicode Transformations: Agenda

• Unicode crash course

• Root Causes

• Attack Vectors

• Tools


Tools

• Watcher
– Passive Web-app security testing and auditing

• Unibomber
– XSS autopwn testing tool

Tools: Watcher – Some of the Passive Checks Included

• Unicode transformation hot-spots
• User-controlled HTML
• Cross-domain issues
• Insecure cookies
• Insecure HTTP/HTTPS transitions
• SSL protocol and certificate issues
• XSS hot-spots
• Flash issues
• Silverlight issues
• Information disclosure

Tools: Watcher - Web-app Security Testing and Auditing

http://websecuritytool.codeplex.com

Tools: Unibomber – runtime XSS testing tool

• Deterministic testing

• Auto-inject payloads

• Unicode transformers
– < > ‘ “ etc.

• Detect transformations and encoding hotspots

Thank you

Casaba Security

www.casabasecurity.com

Chris Weber

Blog: www.lookout.net

Email: chris@casabasecurity.com

LinkedIn: http://www.linkedin.com/in/chrisweber


Unicode Crash Course: Code Points

• Unicode 5.1 uses a 21-bit scalar value with space for over 1,100,000 code points

U+0000 to U+10FFFF

Every character has a unique number

A = U+0041

ſ = U+017F

Unicode Crash Course: Encodings

UTF-8
– Variable width, 1 to 4 bytes (used to be 6)

UTF-16
– Endianness
– Variable width, 2 or 4 bytes
– Surrogate pairs

UTF-32
– Endianness
– Fixed width, 4 bytes
– Fixed mapping, no algorithms needed

Unicode Crash Course: Encodings and Escape Sequences

U+FF21 FULLWIDTH LATIN CAPITAL LETTER A

%EF%BC%A1

&#xFF21;

&#65313;

\xEF\xBC\xA1

\uFF21
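The forms listed for U+FF21 can be reproduced with the Python standard library. This is a small sketch for experimentation; the dictionary keys are just labels chosen for this example. It also encodes one supplementary-plane character to show a UTF-16 surrogate pair.

```python
from urllib.parse import quote

char = "\uFF21"  # FULLWIDTH LATIN CAPITAL LETTER A

forms = {
    "utf8_hex": char.encode("utf-8").hex().upper(),       # three bytes
    "utf16_hex": char.encode("utf-16-be").hex().upper(),  # one 16-bit unit
    "utf32_hex": char.encode("utf-32-be").hex().upper(),  # fixed four bytes
    "percent": quote(char),                               # URL percent-encoding
    "html_hex": f"&#x{ord(char):X};",                     # hex character reference
    "html_dec": f"&#{ord(char)};",                        # decimal character reference
    "escape": f"\\u{ord(char):04X}",                      # \u escape as source text
}

# A code point above U+FFFF needs two 16-bit units (a surrogate pair) in UTF-16.
astral = "\U0001D160"
surrogate_pair_hex = astral.encode("utf-16-be").hex().upper()
```

A filter that only recognizes one of these spellings will miss the others, which is why each decoding layer matters when hunting for transformation bugs.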


Unicode Transformations: Overview

• Unicode crash course

• Root Causes
– Visual Spoofing and IDNs
– Best-fit mappings
– Normalization
– Overlong UTF-8
– Over-consumption
– Character substitution
– Character deletion
– Casing
– Buffer overflows
– Controlling Syntax
– Charset transformations
– Charset mismatches

• Tools

Root Causes: Visual Spoofing

• Over 100,000 assigned characters

• Many lookalikes within and across scripts

AΑАᐱᗅᗋᗩᴀᴬA

Attack Vectors: IDN Homograph Attacks

Some browsers allow .COM IDNs based on script family
– (Latin has a big family)

Safari

Opera

www.google.com is not www.gooɡle.com

Latin g: U+0067

Latin ɡ: U+0261

Root Causes: The State of International Domain Names

ICANN guidelines v2.0

– Inclusion-based: a deny-all default seems to be the right concept

– Script limitations: a script can cross many blocks; even with limited script choices there’s plenty to choose from

– Character limitations: great for domain labels, but sub-domain labels are still open to punctuation and syntax spoofing

Attack Vectors: Visual Spoofing Vectors

• Non-Unicode attacks

• Confusables

• Invisibles

• Problematic font-rendering

• Manipulating Combining Marks

• Bidi and syntax spoofing

Attack Vectors: Non-Unicode Homograph Attacks

rn can look like m in certain fonts

www.mullets.com is not www.rnullets.com

Latin m: U+006D

Latin rn: U+0072 U+006E

Are you using mono-width fonts?

0 and O

1 and l

5 and S

Classic long URLs

httploginfacebookintvitationvideomessageid-

h048892r39sessionnfbidcomhomehtmdisbursements

Attack Vectors: Single-script and The Confusables

www.ɑpple.com

All Latin, using Latin small letter alpha ‘ɑ’

www.faϲebook.com

Mixed Latin/Greek, with lunate sigma symbol ‘ϲ’

www.аЬс.com

All Cyrillic ‘abc’
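A rough way to spot the mixed-script case is to look at the Unicode character names. The sketch below is an approximation and rests on an assumption: Python's `unicodedata` module does not expose the Script property, so the first word of each character name is used as a stand-in. The function names `script_hints` and `describe` are hypothetical.

```python
import unicodedata


def script_hints(label: str) -> set[str]:
    """Approximate the scripts in a label from the first word of each name."""
    hints = set()
    for char in label:
        if char.isalpha():
            hints.add(unicodedata.name(char, "UNKNOWN").split()[0])
    return hints


def describe(label: str) -> list[str]:
    """List each character with its code point and name for manual review."""
    return [f"U+{ord(c):04X} {unicodedata.name(c, 'UNKNOWN')}" for c in label]
```

Note the limit this exposes: the ‘ɑ’ and all-Cyrillic examples are single-script, so a mixed-script check alone does not catch them. They need a confusables comparison instead.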

Attack Vectors: IDN Homograph Attacks

Browsers whitelist .ORG

Others don’t necessarily, but…

www.mozilla.org is not www.mozílla.org

Latin i: U+0069

Latin í: U+00ED

Attack Vectors: IDN Syntax Spoofing with Lookalikes

http://www.google.com／path／file.nottrusted.org

FULLWIDTH SOLIDUS U+FF0F (this case doesn’t work anymore)

http://www.google.com/path/file.nottrusted.org

SOLIDUS U+002F (the fullwidth solidus is normalized to U+002F)

http://www.google.comノpathノfile.nottrusted.org

Katakana No U+FF89 (however, punctuation is not required…)

Attack Vectors: The Invisibles

Attack Vectors: Visual Spoofing with Bidi Explicit Directional Overrides

Root Causes: Best-fit Mappings

Commonly occur in charset transformations and even innocuous APIs

Impact: filter evasion, enables code execution

When σ becomes s

U+03C3 GREEK SMALL LETTER SIGMA

When ′ becomes '

U+2032 PRIME

Root Causes: Guidance for Best-fit Mappings

• Scrutinize character/charset manipulation APIs

• Use EncoderFallback with System.Text.Encoding

• Set the WC_NO_BEST_FIT_CHARS flag with WideCharToMultiByte()

• Use Unicode end-to-end
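The same fail-instead-of-substitute idea can be sketched outside .NET and Win32. The Python example below is an analogue, not the APIs named on the slide; `to_legacy` is a hypothetical helper and `cp1252` is just an example target charset. Python's strict error handler raises when a character has no exact mapping, rather than picking a lookalike.

```python
def to_legacy(text: str, charset: str = "cp1252") -> bytes:
    """Convert to a legacy charset, refusing any character without an exact mapping."""
    try:
        return text.encode(charset, errors="strict")
    except UnicodeEncodeError as exc:
        bad = text[exc.start]  # first character the codec could not map
        raise ValueError(
            f"U+{ord(bad):04X} has no exact mapping in {charset}"
        ) from exc
```

With this behavior U+2032 PRIME is rejected outright, so it can never arrive downstream as an apostrophe that the validator never saw.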

Case Study: Social Networking (Best-fit Mappings)

• A popular social networking site in 2008

• Implemented complex filtering logic to prevent XSS
– Attack: filter evasion, code execution
– Exploit: bypass filtering logic with best-fit mappings to leverage cross-site scripting
– Root cause: best-fit mappings

-moz-binding()

was not allowed, but…

-[U+ff4d]oz-binding()

would best-fit map

Root Causes: Normalization

Normalizing strings after validation is dangerous

Impact: filter evasion, enables code execution

İ becomes I + combining dot above

U+0130 becomes U+0049 U+0307

But are there dangerous characters?

You bet… with NFKC and NFKD you could control HTML or other parsing

﹤ becomes <

U+FE64 becomes U+003C

toNFKC(“﹤script>”) = “<script>”

Root Causes: Guidance for Normalization

Normalize strings before validation

NFKC: a first defense against visual spoofing
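A minimal Python sketch of normalize-then-validate follows. The forbidden-character list and the name `validate` are assumptions made for the example, not a complete XSS filter. The point is only the ordering: the check runs on the NFKC form, and the NFKC form is what gets passed on.

```python
import unicodedata

# Assumption for this example: markup delimiters are not allowed in the field.
FORBIDDEN = ("<", ">")


def validate(text: str) -> str:
    """Normalize first, then validate, and return the normalized string."""
    normalized = unicodedata.normalize("NFKC", text)
    if any(token in normalized for token in FORBIDDEN):
        raise ValueError("input contains forbidden characters after NFKC")
    return normalized
```

Returning the normalized string matters: if a later stage normalized the original input again, the validated value and the used value could differ.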

Root Causes: Non-shortest Form UTF-8

Non-shortest or overlong UTF-8

Impact: filter evasion, enables code execution

Application gets C0 A7

OS/Framework sees 27

Database gets '

Root Causes: Guidance for Non-shortest Form UTF-8

• Unicode specification forbids
– Generation of non-shortest form
– Interpretation of non-shortest form for BMP

• Validate UTF-8 encoding (throw on error)
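The C0 A7 case can be shown in a few lines of Python. `naive_two_byte_decode` is a deliberately unsafe, hypothetical decoder written for this illustration; it applies the two-byte bit arithmetic without the shortest-form check. `strict_decode` relies on Python's built-in UTF-8 codec, which rejects the sequence.

```python
def naive_two_byte_decode(lead: int, trail: int) -> str:
    """Deliberately unsafe: decodes a 2-byte sequence with no shortest-form check."""
    return chr(((lead & 0x1F) << 6) | (trail & 0x3F))


def strict_decode(raw: bytes) -> str:
    """Validate UTF-8 and throw on error (raises UnicodeDecodeError)."""
    return raw.decode("utf-8", errors="strict")


# The overlong pair C0 A7 carries the same payload bits as the single byte 27.
overlong_apostrophe = naive_two_byte_decode(0xC0, 0xA7)
```

A filter that scans raw bytes for 0x27 never sees it in C0 A7, yet the naive decoder hands an apostrophe to the next layer.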

Attack Vectors: Directory Traversal

How many ways can you say

Root Causes: Handling the Unexpected

• Unassigned code points
– U+2073

• Illegal code points
– Half a surrogate pair

• Code points with special meaning
– U+FEFF is the BOM

• Impact: filter evasion, enables code execution

Root Causes: Handling the Unexpected (Over-consumption)

Over-consuming ill-formed byte sequences

Big problem with MBCS lead bytes

<41 C2 3E 41> becomes <41 41>

<img src=[0xC2]> onerror=alert(1)<br >

becomes

<img src=> onerror=alert(1)<br >
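The byte example can be replayed in Python. `over_consuming_decode` is a deliberately flawed, hypothetical decoder that handles only ASCII and two-byte sequences; it exists to reproduce the bug. The safe path uses Python's built-in codec, which replaces only the stray lead byte and leaves the following byte in place.

```python
def over_consuming_decode(data: bytes) -> str:
    """Deliberately unsafe: a bad 2-byte lead also swallows the byte after it."""
    out = bytearray()
    i = 0
    while i < len(data):
        byte = data[i]
        if 0xC2 <= byte <= 0xDF:  # lead byte of a two-byte sequence
            if i + 1 < len(data) and 0x80 <= data[i + 1] <= 0xBF:
                out += data[i:i + 2]  # well-formed pair, keep it
            i += 2  # the bug: skips two bytes even when the pair is ill-formed
        else:
            out.append(byte)
            i += 1
    return out.decode("utf-8", errors="replace")


raw = bytes([0x41, 0xC2, 0x3E, 0x41])  # 'A', stray lead byte, '>', 'A'

unsafe = over_consuming_decode(raw)          # the '>' disappears
safe = raw.decode("utf-8", errors="replace")  # only the bad byte is replaced
```

In the unsafe result the delimiter is gone, which is how a closing quote or angle bracket can vanish from markup.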

Root Causes: Handling the Unexpected (Character Substitution)

Correcting insecurely rather than failing
– Substituting a ‘’ or a ‘’ would be bad

Root Causes: Handling the Unexpected (Character Deletion)

“deletion of noncharacters” (UTR #36)

<scr[U+FEFF]ipt> becomes <script>

Root Causes: Solutions for Handling the Unexpected

• Fail or error

• Use U+FFFD instead
– A common alternative is ‘’ which can be safe
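The difference between deleting and substituting can be sketched as follows. Both helper names are hypothetical. The safe variant makes one assumption that is not on the slide: a U+FEFF at the very start of the text is treated as an encoding signature and dropped, while any other occurrence is replaced with U+FFFD.

```python
BOM = "\uFEFF"
REPLACEMENT = "\uFFFD"


def strip_bom_unsafe(text: str) -> str:
    """Deliberately unsafe: deletion joins the fragments on either side."""
    return text.replace(BOM, "")


def neutralize_bom(text: str) -> str:
    """Drop a leading signature, substitute U+FFFD everywhere else."""
    if text.startswith(BOM):
        text = text[1:]
    return text.replace(BOM, REPLACEMENT)
```

Because U+FFFD stays in the string, the two halves of the keyword never rejoin, and the anomaly remains visible in logs.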

Attack Vectors: Filter Evasion

• Bypass filters, WAFs, NIDS, and validation

• Exploit delivery techniques
– E.g. cross-site scripting (the buffer overflow of the Web)

Case Study: Apple and Mozilla

Safari and Firefox BOM consumption
– Attack: filter evasion, code execution
– Exploit: bypass filtering logic with specially crafted strings to leverage cross-site scripting
– Root cause: character deletion

<a href="java[U+FEFF]script:alert('XSS')">

Can be nastier:

<a h[U+FEFF]ref="java[U+FEFF]script:al[U+FEFF]ert('XSS')">

A Closer Look: The BOM

BOM = U+FEFF

Root Causes: Casing

• Attackers manipulate casing operations to inject otherwise prohibited characters

• Casing can multiply the buffer sizes needed

• Impact: filter evasion, enables code execution

toLower(“İ”) == “i”

toLower(“scrİpt”) == “script”

len(x) != len(toLower(x))

Root Causes: Guidance for Casing

• Perform casing operations before validation

• Leverage existing frameworks and APIs
– ICU, .NET
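Casing results differ between platforms, so validation should run on the output of the same casing function the application actually uses. In the Python sketch below, `legacy_lower` is a hypothetical stand-in for a platform that maps İ (U+0130) to a plain i as in the slide; Python's own `str.lower()` instead yields i followed by U+0307, which also shows the length change.

```python
def legacy_lower(text: str) -> str:
    """Hypothetical stand-in for a platform where U+0130 lowercases to plain 'i'."""
    return text.replace("\u0130", "i").lower()


def blocked_after_casing(text: str, to_lower=str.lower, forbidden: str = "script") -> bool:
    """Apply the casing operation first, then check for the forbidden token."""
    return forbidden in to_lower(text)


dotted_i = "\u0130"
python_lowered = dotted_i.lower()   # two code points in Python: 'i' + U+0307
sharp_s_upper = "\u00DF".upper()    # one code point becomes two: 'SS'
```

Checking the raw input for the token and lowercasing afterwards would miss this payload on the legacy platform, which is the ordering problem the guidance addresses.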


Page 14: Unraveling Unicode: A Bag of Tricks for Bug Hunting...Unicode Crash Course code points encodings categorization normalization binary properties case mapping conversion tables bi-directional

Black Hat USA - July 2009 copy 2009 Chris Weber

A = U+0041

Every character has a unique number

wwwcasabasecuritycom

Unicode Crash CourseCode Points

Black Hat USA - July 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Unicode Crash Course

AU+0041

Black Hat USA - July 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Unicode Crash Course

ſU+017F

Black Hat USA - July 2009 copy 2009 Chris Weber

UTF-8 ndash variable width 1 to 4 bytes (used to be 6)

UTF-16ndash Endianessndash Variable width 2 or 4 bytesndash Surrogate pairs

UTF-32ndash Endianessndash Fixed width 4 bytesndash Fixed mapping no algorithms needed

wwwcasabasecuritycom

Unicode Crash CourseEncodings

Black Hat USA - July 2009 copy 2009 Chris Weber

U+FF21 FULLWIDTH LATIN CAPITAL LETTER A

EFBCA1

ampxFF21

amp65313

xEFxBCxA1

uFF21

wwwcasabasecuritycom

Unicode Crash CourseEncodings and Escape sequences

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Unicode crash course

bull Root Causes

bull Attack Vectors

bull Tools

wwwcasabasecuritycom

Unicode TransformationsAgenda

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Unicode crash course

bull Root Causes

bull Attack Vectors

bull Tools

wwwcasabasecuritycom

Unicode TransformationsAgenda

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Unicode crash coursebull Root Causes

ndash Visual Spoofing and IDNrsquosndash Best-fit mappingsndash Normalizationndash Overlong UTF-8ndash Over-consumptionndash Character substitutionndash Character deletionndash Casingndash Buffer overflowsndash Controlling Syntaxndash Charset transformationsndash Charset mismatches

bull Tools

wwwcasabasecuritycom

Unicode TransformationsOverview

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Over 100000 assigned characters

bull Many lookalikes within and across scripts

AΑАᐱᗅᗋᗩᴀᴬA

wwwcasabasecuritycom

Root CausesVisual Spoofing

Black Hat USA - July 2009 copy 2009 Chris Weber

Some browsers allow COM IDNrsquos

based on script family

ndash (Latin has a big family)

wwwcasabasecuritycom

Attack VectorsIDN homograph attacks

Black Hat USA - July 2009 copy 2009 Chris Weber

Safari

wwwcasabasecuritycom

Attack VectorsIDN homograph attacks

Black Hat USA - July 2009 copy 2009 Chris Weber

Opera

wwwcasabasecuritycom

Attack VectorsIDN homograph attacks

Black Hat USA - July 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsIDN homograph attacks

wwwgooglecom is not wwwgooɡlecom

Latin U+0069

LatinU+0261

Black Hat USA - July 2009 copy 2009 Chris Weber

ICANN guidelines v20

ndash Inclusion-based

ndash Script limitations

ndash Character limitations

wwwcasabasecuritycom

Root CausesThe state of International Domain Names

Deny-all default seems to be the right concept

A script can cross many blocks Even with limited script choices therersquos plenty to choose from

Great for domain labels but sub domain labels still open to punctuation and syntax spoofing

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Non-Unicode attacks

bull Confusables

bull Invisibles

bull Problematic font-rendering

bull Manipulating Combining Marks

bull Bidi and syntax spoofing

wwwcasabasecuritycom

Attack VectorsVisual spoofing Vectors

Black Hat USA - July 2009 copy 2009 Chris Weber

rn can look like m in certain fonts

wwwcasabasecuritycom

Attack VectorsNon-Unicode homograph attacks

wwwmulletscom is not wwwrnulletscom

Latin U+006D

LatinU+0073 U+006E

Black Hat USA - July 2009 copy 2009 Chris Weber

Are you using mono-width fonts

0 and O

1 and l

5 and S

wwwcasabasecuritycom

Attack VectorsNon-Unicode homograph attacks

Black Hat USA - July 2009 copy 2009 Chris Weber

Classic long URLrsquos

httploginfacebookintvitationvideomessageid-

h048892r39sessionnfbidcomhomehtmdisbursements

wwwcasabasecuritycom

Attack VectorsNon-Unicode homograph attacks

Black Hat USA - July 2009 copy 2009 Chris Weber

wwwɑpplecom

All Latin using Latin small letter Alpha lsquoɑrsquo

wwwfaϲebookcom

Mixed LatinGreek with lunate sigma symbol lsquocrsquo

wwwаЬсcom

All Cyrillic lsquoabcrsquo

wwwcasabasecuritycom

Attack VectorsSingle-script and The Confusables

Black Hat USA - July 2009 copy 2009 Chris Weber

Browsers whitelist ORG

wwwcasabasecuritycom

Attack VectorsIDN homograph attacks

Black Hat USA - July 2009 copy 2009 Chris Weber

Others donrsquot necessarily buthellip

wwwcasabasecuritycom

Attack VectorsIDN homograph attacks

Black Hat USA - July 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsIDN homograph attacks

wwwmozillaorg is not wwwmoziacutellaorg

Latin U+0069

LatinU+00ED

Black Hat USA - July 2009 copy 2009 Chris Weber

(This case doesnrsquot work anymore)

wwwcasabasecuritycom

Attack VectorsIDN Syntax Spoofing with lookalikes

httpwwwgooglecompathfilenottrustedorg

FULLWIDTH SOLIDUSU+FF0F

Black Hat USA - July 2009 copy 2009 Chris Weber

(Normalized to a U+002F)

wwwcasabasecuritycom

Attack VectorsIDN Syntax Spoofing with lookalikes

httpwwwgooglecompathfilenottrustedorg

SOLIDUSU+002F

Black Hat USA - July 2009 copy 2009 Chris Weber

(However punctuation not requiredhellip)

wwwcasabasecuritycom

Attack VectorsIDN Syntax Spoofing with lookalikes

httpwwwgooglecomノpathノfilenottrustedorg

Katakana NoU+FF89

Black Hat USA - July 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsThe Invisibles

Black Hat USA - July 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsVisual Spoofing with Bidi Explicit Directional Overrides

Black Hat USA - July 2009 copy 2009 Chris Weber

Commonly occur in charset transformations and even innocuous APIrsquos

Impact Filter evasion Enable code execution

When σ becomes s

U+03C3 GREEK SMALL LETTER SIGMA

When prime becomes

U+2032 PRIME

wwwcasabasecuritycom

Root CausesBest-fit mappings

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Scrutinize charactercharset manipulation APIrsquos

bull Use EncoderFallback with SystemTextEncoding

bull Set WC_NO_BEST_FIT_CHARS flag with WideCharToMultiByte()

bull Use Unicode end-to-end

wwwcasabasecuritycom

Root CausesGuidance for Best-Fit mappings

Black Hat USA - July 2009 copy 2009 Chris Weber

bull A popular social networking site in 2008

bull Implemented complex filtering logic to prevent XSS

ndash Attack Filter evasion code execution

ndash Exploit Bypass filtering logic with best-fit mappings to leverage cross-site scripting

ndash Root Cause best-fit mappings

wwwcasabasecuritycom

Case Study Social NetworkingBest-fit mappings

Black Hat USA - July 2009 copy 2009 Chris Weber

-moz-binding()

was not allowed buthellip

-[U+ff4d]oz-binding()

would best-fit map

wwwcasabasecuritycom

Case Study Social NetworkingBest-fit mappings

Black Hat USA - July 2009 copy 2009 Chris Weber

Normalizing strings after validation is dangerous

Impact Filter evasion Enable code execution

wwwcasabasecuritycom

Root CausesNormalization

Black Hat USA - July 2009 copy 2009 Chris Weber

İ becomes I +

wwwcasabasecuritycom

Root CausesNormalization

U+0130 U+0049 U+0307

Black Hat USA - July 2009 copy 2009 Chris Weber

But are there dangerous characters

You bethellip with NFKC and NFKD you could control HTML or other parsing

﹤ becomes lt

wwwcasabasecuritycom

Root CausesNormalization

U+FE64 U+003C

Black Hat USA - July 2009 copy 2009 Chris Weber

﹤ becomes lt

toNFKC(ldquo﹤scriptgtrdquo) = ldquoltscriptgtrdquo

wwwcasabasecuritycom

Root CausesNormalization

U+FE64 U+003C

Black Hat USA - July 2009 copy 2009 Chris Weber

Normalize strings before validation

NFKC first defense against Visual spoofing

wwwcasabasecuritycom

Root CausesGuidance for Normalization

Black Hat USA - July 2009 copy 2009 Chris Weber

Non-shortest or overlong UTF-8

Impact Filter evasion Enable code execution

Application gets C0A7

OSFramework sees 27

Database gets

wwwcasabasecuritycom

Root CausesNon-shortest form UTF-8

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Unicode specification forbids

ndash Generation of non-shortest form

ndash Interpretation of non-shortest form for BMP

bull Validate UTF-8 encoding (throw on error)

wwwcasabasecuritycom

Root CausesGuidance for Non-shortest form UTF-8

Black Hat USA - July 2009 copy 2009 Chris Weber

How many ways can you say

wwwcasabasecuritycom

Attack VectorsDirectory traversal

Black Hat USA - July 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack Vectors

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Unassigned code points

ndash U+2073

bull Illegal code points

ndash Half a surrogate pair

bull Code points with special meaning

ndash U+FEFF is the BOM

bull Impact Filter evasion Enable code execution

wwwcasabasecuritycom

Root CausesHandling the Unexpected

Black Hat USA - July 2009 copy 2009 Chris Weber

Over-consuming ill-formed byte sequences

Big problem with MBCS lead bytes

lt41 C2 3E 41gt becomes

lt41 41gt

wwwcasabasecuritycom

Root CausesHandling the Unexpected Over-consumption

Black Hat USA - July 2009 copy 2009 Chris Weber

ltimg src=[0xC2]gt onerror=alert(1)ltbr gt

becomes

ltimg src=gt onerror=alert(1)ltbr gt

wwwcasabasecuritycom

Root CausesHandling the Unexpected Over-consumption

Black Hat USA - July 2009 copy 2009 Chris Weber

Correcting insecurely rather than failing

ndash Substituting a lsquorsquo or a lsquorsquo would be bad

wwwcasabasecuritycom

Root CausesHandling the Unexpected Character-substitution

Black Hat USA - July 2009 copy 2009 Chris Weber

ldquodeletion of noncharactersrdquo (UTR-36)

wwwcasabasecuritycom

Root CausesHandling the Unexpected Character-deletion

Black Hat USA - July 2009 copy 2009 Chris Weber

ltscr[U+FEFF]iptgt becomes ltscriptgt

wwwcasabasecuritycom

Root CausesHandling the Unexpected Character-deletion

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Fail or error

bull Use U+FFFD instead

ndash A common alternative is lsquorsquo which can be safe

wwwcasabasecuritycom

Root CausesSolutions for Handling the Unexpected

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Bypass filters WAFrsquos NIDS and validation

bull Exploit delivery techniques

ndash Eg Cross-site scripting (buffer overflow of the Web)

wwwcasabasecuritycom

Attack VectorsFilter evasion

Case Study: Apple and Mozilla

Safari and Firefox BOM consumption
  - Attack: filter evasion, code execution
  - Exploit: bypass filtering logic with specially crafted strings to leverage cross-site scripting
  - Root cause: character deletion

<a href="java[U+FEFF]script:alert('XSS')">

Can be nastier:

<a h[U+FEFF]ref="java[U+FEFF]script:al[U+FEFF]ert('XSS')">

A Closer Look: The BOM

BOM: U+FEFF
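The filter-evasion pattern in this case study can be modeled in a few lines. This is a hedged sketch: `naive_filter_allows` is a hypothetical deny-list check, and `consumer_deletes_bom` only imitates a downstream component that silently drops U+FEFF; neither is taken from any browser or filter product.

```python
BOM = "\ufeff"


def naive_filter_allows(value: str) -> bool:
    """Hypothetical deny-list filter: block values containing 'javascript:'."""
    return "javascript:" not in value.lower()


def consumer_deletes_bom(value: str) -> str:
    """Model of a consumer that silently deletes U+FEFF wherever it appears."""
    return value.replace(BOM, "")


payload = "java\ufeffscript:alert('XSS')"

# The filter sees no 'javascript:' scheme, so the payload is allowed through.
print(naive_filter_allows(payload))
# After deletion, the consumer sees exactly the string the filter was written to block.
print(consumer_deletes_bom(payload))
```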

Root Causes: Casing

• Attackers manipulate casing operations to inject otherwise prohibited characters
• Casing can multiply the buffer sizes needed
• Impact: filter evasion, enable code execution

toLower("İ") == "i"

toLower("scrİpt") == "script"

len(x) != len(toLower(x))

Root Causes: Guidance for Casing

• Perform casing operations before validation
• Leverage existing frameworks and APIs
  - ICU, .NET
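The length change under casing is easy to observe. The sketch below assumes Python's built-in `str.lower()` and `str.upper()`, which apply Unicode full case mappings; note that Python lowercases U+0130 to "i" followed by U+0307 rather than to a bare "i", so the exact `toLower` result on the slide depends on the implementation in use. The helper name is hypothetical.

```python
def utf8_len(text: str) -> int:
    """Number of bytes text occupies when encoded as UTF-8."""
    return len(text.encode("utf-8"))


dotted_capital_i = "\u0130"              # İ
lowered = dotted_capital_i.lower()       # 'i' + COMBINING DOT ABOVE in Python

iota_dialytika_tonos = "\u0390"          # ΐ
uppered = iota_dialytika_tonos.upper()   # expands to three code points

a_with_stroke = "\u023a"                 # Ⱥ
a_lowered = a_with_stroke.lower()        # U+2C65, longer in UTF-8

print(len(dotted_capital_i), len(lowered))
print(len(iota_dialytika_tonos), len(uppered))
print(utf8_len(a_with_stroke), utf8_len(a_lowered))
```

Because lengths and contents both change, validation that runs before the casing step is checking a different string from the one that is finally used.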

Root Causes: Buffer Overflows

• Incorrect assumptions about string sizes (chars vs. bytes)
• Improper width calculations
• Impact: enable code execution

Casing: maximum expansion factors

Operation    UTF          Factor   Sample
Lower        8            1.5      Ⱥ  U+023A
             16, 32       1        A  U+0041
Upper        8, 16, 32    3        ΐ  U+0390

Source: Unicode Technical Report 36

Normalization: maximum expansion factors

Operation    UTF          Factor   Sample
NFC          8            3X       𝅘𝅥𝅮  U+1D160
             16, 32       3X       שּׁ  U+FB2C
NFD          8            3X       ΐ  U+0390
             16, 32       4X       ᾂ  U+1F82
NFKC/NFKD    8            11X      ﷺ  U+FDFA
             16, 32       18X      ﷺ  U+FDFA

Source: Unicode Technical Report 36

Root Causes: Guidance for Buffer Overflows

• Know the difference between bytes and chars
• Secure coding
• Leverage existing frameworks and APIs
  - ICU, .NET
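The expansion factors quoted from UTR 36 can be reproduced for the sample characters. This is a small sketch using Python's standard `unicodedata` module; the `expansion` helper is a hypothetical name, and the little-endian codecs are chosen only so that no byte order mark is added to the measured length.

```python
import unicodedata


def expansion(ch: str, form: str, encoding: str) -> float:
    """Ratio of encoded size after normalization to encoded size before."""
    before = len(ch.encode(encoding))
    after = len(unicodedata.normalize(form, ch).encode(encoding))
    return after / before


samples = [
    ("\U0001d160", "NFC", "utf-8"),
    ("\ufb2c", "NFC", "utf-16-le"),
    ("\u0390", "NFD", "utf-8"),
    ("\u1f82", "NFD", "utf-16-le"),
    ("\ufdfa", "NFKC", "utf-8"),
    ("\ufdfa", "NFKC", "utf-16-le"),
]

for ch, form, encoding in samples:
    print(f"U+{ord(ch):04X} {form} {encoding}: {expansion(ch, form, encoding):g}X")
```

A buffer sized from the input length alone is therefore too small in the worst case; sizing from the normalized output, or using an API that grows the buffer, avoids the assumption.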

Root Causes: Controlling Syntax

• White space and line breaks
  - E.g. when U+180E acts like U+0020
• Quotation marks
• Impact: filter evasion, enable code execution

Attacks and Exploits: Controlling Syntax

• Manipulate HTML parsers and JavaScript interpreters
• Control protocols

Case Study: Opera

• Unicode formatter characters exploited for XSS
  - Damage: filter evasion, controlling syntax
  - Exploit: bypass filtering logic with specially crafted characters to leverage cross-site scripting
  - Root cause: interpreting "white space"
  - A problem with the HTML 4.0 spec

<a href=[U+180E]onclick=alert()>

MVS (Mongolian Vowel Separator): U+180E

Root Causes: Guidance for Controlling Syntax

• Question specifications
• Be careful...
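One way to act on "be careful" is to flag space-like and format characters that a filter does not expect. The sketch below is an assumption-laden illustration: the allow-list of ASCII white space and the set of general categories to flag are choices made for this example, not rules taken from the HTML specification or from the talk.

```python
import unicodedata

# Hypothetical allow-list: the ASCII white space this example is willing to accept.
ALLOWED_WHITESPACE = {" ", "\t", "\n", "\r", "\f"}

# General categories for separators, format characters, and controls.
FLAGGED_CATEGORIES = {"Zs", "Zl", "Zp", "Cf", "Cc"}


def suspicious_separators(text: str) -> list:
    """List code points that may act as white space or be invisible to a filter."""
    found = []
    for ch in text:
        if ch in ALLOWED_WHITESPACE:
            continue
        if unicodedata.category(ch) in FLAGGED_CATEGORIES:
            found.append(f"U+{ord(ch):04X}")
    return found


print(suspicious_separators("<a href=\u180eonclick=alert()>"))
print(suspicious_separators("<a href=x onclick=alert()>"))
```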

Root Causes: Specifications

1) Character stability
  - IDNA/Nameprep is based on Unicode 3.2
2) Designs
  - Specs are carefully designed but not always perfect

• This could have been a problem
  - "When designing a markup language or data protocol, the use of U+FEFF can be restricted to that of Byte Order Mark. In that case, any U+FEFF occurring in the middle of the file can be ignored, or treated as an error."
  - HTML 4.01
    • Defines four whitespace characters and explicitly leaves handling of other characters up to the implementer

Root Causes: Charset Transformations

• Converting between charsets is dangerous
• Mapping tables and algorithms vary across platforms
• Impact: filter evasion, enable code execution, data loss

Root Causes: Guidance for Charset Transformations

• Avoid if possible
• Use Unicode as the broker
• Beware the PUA mappings
• Transform case and normalize prior to validation and redisplay

Root Causes: Charset Mismatches

• Some charset identifiers are ill-defined
• Vendor implementations vary
• User-agents may sniff if confused
• Attackers manipulate behavior
• Impact: filter evasion, enable code execution

Content-Type: charset=ISO-8859-1

<meta http-equiv="Content-Type" content="text/html; charset=shift_jis">

Attacker-controlled input

Root Causes: Guidance for Charset Mismatches

• Force UTF-8
• Error if uncertain
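"Force UTF-8" and "error if uncertain" can be sketched on both the output and the input side. The function names and the accepted charset labels below are assumptions made for illustration; a real application would hook this into its own framework's request and response handling.

```python
from typing import Optional

ACCEPTED_LABELS = {"utf-8", "utf8"}


def content_type_header(media_type: str = "text/html") -> str:
    """Always declare UTF-8 explicitly so the user-agent has nothing to sniff."""
    return f"{media_type}; charset=utf-8"


def decode_body(raw: bytes, declared_charset: Optional[str]) -> str:
    """Decode a body only when it is declared as UTF-8 and is well-formed."""
    if declared_charset is None:
        raise ValueError("no charset declared")
    if declared_charset.strip().lower() not in ACCEPTED_LABELS:
        raise ValueError(f"unexpected charset: {declared_charset!r}")
    # Strict decoding raises UnicodeDecodeError (a ValueError subclass) on bad bytes.
    return raw.decode("utf-8", errors="strict")


def body_is_accepted(raw: bytes, declared_charset: Optional[str]) -> bool:
    """Boolean wrapper around decode_body."""
    try:
        decode_body(raw, declared_charset)
    except ValueError:
        return False
    return True


print(content_type_header())
print(body_is_accepted(b"hello", "UTF-8"))
print(body_is_accepted(b"hello", "shift_jis"))
```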

Unicode Transformations: Agenda

• Unicode crash course
• Root Causes
• Attack Vectors
• Tools

Tools

• Watcher
  - Passive Web-app security testing and auditing
• Unibomber
  - XSS autopwn testing tool

Tools: Watcher (some of the passive checks included)

• Unicode transformation hot-spots
• User-controlled HTML
• Cross-domain issues
• Insecure cookies
• Insecure HTTP/HTTPS transitions
• SSL protocol and certificate issues
• XSS hot-spots
• Flash issues
• Silverlight issues
• Information disclosure

Tools: Watcher (Web-app security testing and auditing)

http://websecuritytool.codeplex.com

Tools: Unibomber (runtime XSS testing tool)

• Deterministic testing
• Auto-inject payloads
• Unicode transformers
  - < > ' " etc.
• Detect transformations and encoding hotspots

Thank you

Casaba Security
www.casabasecurity.com

Chris Weber
Blog: www.lookout.net
Email: chris@casabasecurity.com
LinkedIn: http://www.linkedin.com/in/chrisweber


Unicode Crash Course

A: U+0041

ſ: U+017F

Unicode Crash Course: Encodings

UTF-8
  - Variable width, 1 to 4 bytes (used to be 6)

UTF-16
  - Endianness
  - Variable width, 2 or 4 bytes
  - Surrogate pairs

UTF-32
  - Endianness
  - Fixed width, 4 bytes
  - Fixed mapping, no algorithms needed
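The width and byte-order properties listed for the three encodings can be observed directly. This is a brief sketch using Python's standard codecs; the `widths` helper is a hypothetical name, and the explicit little-endian and big-endian codec names are used so that no byte order mark is prepended.

```python
def widths(ch: str) -> dict:
    """Encoded size in bytes of one character in each encoding form."""
    return {
        "utf-8": len(ch.encode("utf-8")),
        "utf-16": len(ch.encode("utf-16-le")),
        "utf-32": len(ch.encode("utf-32-le")),
    }


for ch in ("A", "\u017f", "\uff21", "\U0001d160"):
    print(f"U+{ord(ch):04X}", widths(ch))

# Endianness: the same code point, two byte orders.
print("A".encode("utf-16-le").hex(), "A".encode("utf-16-be").hex())

# A supplementary code point becomes a surrogate pair in UTF-16.
print("\U0001d160".encode("utf-16-be").hex())
```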

Unicode Crash Course: Encodings and Escape Sequences

U+FF21 FULLWIDTH LATIN CAPITAL LETTER A

%EF%BC%A1

&#xFF21;

&#65313;

\xEF\xBC\xA1

\uFF21
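The escape forms for U+FF21 can be derived from the code point rather than memorized. A minimal sketch follows, using `urllib.parse.quote` from the Python standard library for the percent-encoded form; the `escape_forms` helper and its dictionary keys are hypothetical names for this example.

```python
from urllib.parse import quote


def escape_forms(ch: str) -> dict:
    """Common textual representations of a single BMP character."""
    cp = ord(ch)
    utf8 = ch.encode("utf-8")
    return {
        "percent": quote(ch),                               # URL percent-encoding of UTF-8
        "html_hex": f"&#x{cp:X};",                          # hexadecimal character reference
        "html_dec": f"&#{cp};",                             # decimal character reference
        "byte_escape": "".join(f"\\x{b:02X}" for b in utf8),
        "unicode_escape": f"\\u{cp:04X}",                   # four-digit form, BMP only
    }


for name, value in escape_forms("\uff21").items():
    print(name, value)
```

A filter that recognizes only one of these forms leaves the others available to an attacker, which is why testing tools cycle through all of them.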

Unicode Transformations: Agenda

• Unicode crash course
• Root Causes
• Attack Vectors
• Tools

Unicode Transformations: Overview

• Unicode crash course
• Root Causes
  - Visual spoofing and IDNs
  - Best-fit mappings
  - Normalization
  - Overlong UTF-8
  - Over-consumption
  - Character substitution
  - Character deletion
  - Casing
  - Buffer overflows
  - Controlling syntax
  - Charset transformations
  - Charset mismatches
• Tools

Root Causes: Visual Spoofing

• Over 100,000 assigned characters
• Many lookalikes within and across scripts

AΑАᐱᗅᗋᗩᴀᴬA

Attack Vectors: IDN Homograph Attacks

Some browsers allow .COM IDNs based on script family
  - (Latin has a big family)

Examples: Safari, Opera

www.google.com is not www.gooɡle.com
  - Latin small letter g: U+0067
  - Latin small letter script g: U+0261

Root Causes: The State of International Domain Names

ICANN guidelines v2.0
  - Inclusion-based
  - Script limitations
  - Character limitations

Deny-all default seems to be the right concept.

A script can cross many blocks. Even with limited script choices, there's plenty to choose from.

Great for domain labels, but sub-domain labels are still open to punctuation and syntax spoofing.

Attack Vectors: Visual Spoofing Vectors

• Non-Unicode attacks
• Confusables
• Invisibles
• Problematic font-rendering
• Manipulating combining marks
• Bidi and syntax spoofing

Attack Vectors: Non-Unicode Homograph Attacks

rn can look like m in certain fonts

www.mullets.com is not www.rnullets.com
  - Latin m: U+006D
  - Latin r and n: U+0072 U+006E

Are you using mono-width fonts?
  - 0 and O
  - 1 and l
  - 5 and S

Classic long URLs

httploginfacebookintvitationvideomessageid-

h048892r39sessionnfbidcomhomehtmdisbursements

Attack Vectors: Single-script and the Confusables

www.ɑpple.com
  - All Latin, using Latin small letter alpha 'ɑ'

www.faϲebook.com
  - Mixed Latin/Greek, with lunate sigma symbol 'ϲ'

www.аЬс.com
  - All Cyrillic 'abc'
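A rough mixed-script check illustrates why the three examples above differ. Python's standard library does not expose the Unicode Script property, so this sketch approximates it with the first word of each character's name; that shortcut, and the function names, are assumptions for illustration only and not a replacement for a real confusables check.

```python
import unicodedata


def scripts_in(label: str) -> set:
    """Approximate the scripts used by the letters of a label via character names."""
    scripts = set()
    for ch in label:
        if not ch.isalpha():
            continue
        name = unicodedata.name(ch, "")
        if name:
            scripts.add(name.split()[0])
    return scripts


def is_mixed_script(label: str) -> bool:
    """True when the letters of the label come from more than one script."""
    return len(scripts_in(label)) > 1


print(scripts_in("fa\u03f2ebook"), is_mixed_script("fa\u03f2ebook"))
print(scripts_in("\u0251pple"), is_mixed_script("\u0251pple"))
print(scripts_in("\u0430\u042c\u0441"), is_mixed_script("\u0430\u042c\u0441"))
```

Only the mixed Latin/Greek label is flagged. The all-Latin and all-Cyrillic lookalikes pass a mixed-script test, which is why single-script confusables need a separate defense.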

Attack Vectors: IDN Homograph Attacks

Browsers whitelist .ORG

Others don't necessarily, but...

www.mozilla.org is not www.mozílla.org
  - Latin i: U+0069
  - Latin í: U+00ED

Attack Vectors: IDN Syntax Spoofing with Lookalikes

http://www.google.com／path／file.nottrusted.org
  - FULLWIDTH SOLIDUS: U+FF0F
  - (This case doesn't work anymore)

http://www.google.com/path/file.nottrusted.org
  - SOLIDUS: U+002F
  - (Normalized to a U+002F)

http://www.google.comﾉpathﾉfile.nottrusted.org
  - HALFWIDTH KATAKANA LETTER NO: U+FF89
  - (However, punctuation is not required...)

Attack Vectors: The Invisibles

Attack Vectors: Visual Spoofing with Bidi Explicit Directional Overrides

Root Causes: Best-fit Mappings

Commonly occur in charset transformations and even innocuous APIs

Impact: filter evasion, enable code execution

When σ becomes s
  - U+03C3 GREEK SMALL LETTER SIGMA

When ′ becomes '
  - U+2032 PRIME

Root Causes: Guidance for Best-fit Mappings

• Scrutinize character/charset manipulation APIs
• Use EncoderFallback with System.Text.Encoding
• Set the WC_NO_BEST_FIT_CHARS flag with WideCharToMultiByte()
• Use Unicode end-to-end
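The same fail-rather-than-substitute idea can be shown outside .NET and Win32. The sketch below assumes Python's `cp1252` codec as the legacy target and uses strict error handling, so a character with no exact mapping raises instead of being mapped to a lookalike; the function names are hypothetical.

```python
def to_legacy(text: str, charset: str = "cp1252") -> bytes:
    """Convert to a legacy charset, raising if any character has no exact mapping."""
    return text.encode(charset, errors="strict")


def converts_exactly(text: str, charset: str = "cp1252") -> bool:
    """True when every character of text exists in the target charset."""
    try:
        to_legacy(text, charset)
    except UnicodeEncodeError:
        return False
    return True


print(converts_exactly("plain ascii"))
print(converts_exactly("\u03c3"))   # GREEK SMALL LETTER SIGMA
print(converts_exactly("\u2032"))   # PRIME
```

With this behavior, σ can never silently turn into s and ′ can never turn into an apostrophe after the input has already passed validation.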

Case Study: Social Networking (Best-fit Mappings)

• A popular social networking site in 2008
• Implemented complex filtering logic to prevent XSS
  - Attack: filter evasion, code execution
  - Exploit: bypass filtering logic with best-fit mappings to leverage cross-site scripting
  - Root cause: best-fit mappings

-moz-binding()

was not allowed, but...

-[U+FF4D]oz-binding()

would best-fit map

Root Causes: Normalization

Normalizing strings after validation is dangerous

Impact: filter evasion, enable code execution

İ becomes I + ◌̇ (combining dot above)
  - U+0130 becomes U+0049 U+0307

But are there dangerous characters?

You bet... with NFKC and NFKD you could control HTML or other parsing

﹤ becomes <
  - U+FE64 becomes U+003C

toNFKC("﹤script>") = "<script>"

Root Causes: Guidance for Normalization

• Normalize strings before validation
• NFKC is a first defense against visual spoofing
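The ordering rule, normalize first and validate second, can be sketched with Python's standard `unicodedata` module. The markup check here is a deliberately simple, hypothetical validator used only to show the effect of the ordering.

```python
import unicodedata


def contains_markup(text: str) -> bool:
    """Hypothetical validator: reject anything containing an angle bracket."""
    return "<" in text or ">" in text


def normalize_then_validate(text: str) -> str:
    """Apply NFKC first, then validate the string that will actually be used."""
    normalized = unicodedata.normalize("NFKC", text)
    if contains_markup(normalized):
        raise ValueError("markup found after normalization")
    return normalized


tricky = "\ufe64script"  # SMALL LESS-THAN SIGN followed by 'script'

# Validating the raw string misses the bracket that NFKC will create.
print(contains_markup(tricky))
print(unicodedata.normalize("NFKC", tricky))
print(unicodedata.normalize("NFD", "\u0130") == "I\u0307")
```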

Root Causes: Non-shortest Form UTF-8

Non-shortest or overlong UTF-8

Impact: filter evasion, enable code execution

Application gets C0 A7

OS/Framework sees 27

Database gets '

Root Causes: Guidance for Non-shortest Form UTF-8

• Unicode specification forbids
  - Generation of non-shortest form
  - Interpretation of non-shortest form for BMP
• Validate UTF-8 encoding (throw on error)
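The C0 A7 example can be worked through by hand and then checked against a validating decoder. In this sketch, `naive_two_byte_decode` is a hypothetical model of a decoder that skips the shortest-form check, while `is_well_formed_utf8` relies on Python's strict UTF-8 codec, which rejects overlong sequences.

```python
def naive_two_byte_decode(lead: int, trail: int) -> int:
    """Combine the payload bits of a two-byte sequence without any validity checks."""
    return ((lead & 0x1F) << 6) | (trail & 0x3F)


def is_well_formed_utf8(raw: bytes) -> bool:
    """True when raw decodes as UTF-8 under strict error handling."""
    try:
        raw.decode("utf-8", errors="strict")
    except UnicodeDecodeError:
        return False
    return True


overlong_apostrophe = bytes([0xC0, 0xA7])

# A lenient decoder turns C0 A7 into 0x27, the apostrophe.
print(hex(naive_two_byte_decode(0xC0, 0xA7)))
# A validating decoder refuses the overlong form and accepts the shortest form.
print(is_well_formed_utf8(overlong_apostrophe))
print(is_well_formed_utf8(b"'"))
```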

Attack Vectors: Directory Traversal

How many ways can you say


Page 16: Unraveling Unicode: A Bag of Tricks for Bug Hunting...Unicode Crash Course code points encodings categorization normalization binary properties case mapping conversion tables bi-directional

Black Hat USA - July 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Unicode Crash Course

ſU+017F

Black Hat USA - July 2009 copy 2009 Chris Weber

UTF-8 ndash variable width 1 to 4 bytes (used to be 6)

UTF-16ndash Endianessndash Variable width 2 or 4 bytesndash Surrogate pairs

UTF-32ndash Endianessndash Fixed width 4 bytesndash Fixed mapping no algorithms needed

wwwcasabasecuritycom

Unicode Crash CourseEncodings

Black Hat USA - July 2009 copy 2009 Chris Weber

U+FF21 FULLWIDTH LATIN CAPITAL LETTER A

EFBCA1

ampxFF21

amp65313

xEFxBCxA1

uFF21

wwwcasabasecuritycom

Unicode Crash CourseEncodings and Escape sequences

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Unicode crash course

bull Root Causes

bull Attack Vectors

bull Tools

wwwcasabasecuritycom

Unicode TransformationsAgenda

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Unicode crash course

bull Root Causes

bull Attack Vectors

bull Tools

wwwcasabasecuritycom

Unicode TransformationsAgenda

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Unicode crash coursebull Root Causes

ndash Visual Spoofing and IDNrsquosndash Best-fit mappingsndash Normalizationndash Overlong UTF-8ndash Over-consumptionndash Character substitutionndash Character deletionndash Casingndash Buffer overflowsndash Controlling Syntaxndash Charset transformationsndash Charset mismatches

bull Tools

wwwcasabasecuritycom

Unicode TransformationsOverview

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Over 100000 assigned characters

bull Many lookalikes within and across scripts

AΑАᐱᗅᗋᗩᴀᴬA

wwwcasabasecuritycom

Root CausesVisual Spoofing

Black Hat USA - July 2009 copy 2009 Chris Weber

Some browsers allow COM IDNrsquos

based on script family

ndash (Latin has a big family)

wwwcasabasecuritycom

Attack VectorsIDN homograph attacks

Black Hat USA - July 2009 copy 2009 Chris Weber

Safari

wwwcasabasecuritycom

Attack VectorsIDN homograph attacks

Black Hat USA - July 2009 copy 2009 Chris Weber

Opera

wwwcasabasecuritycom

Attack VectorsIDN homograph attacks

Black Hat USA - July 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsIDN homograph attacks

wwwgooglecom is not wwwgooɡlecom

Latin U+0069

LatinU+0261

Black Hat USA - July 2009 copy 2009 Chris Weber

ICANN guidelines v20

ndash Inclusion-based

ndash Script limitations

ndash Character limitations

wwwcasabasecuritycom

Root CausesThe state of International Domain Names

Deny-all default seems to be the right concept

A script can cross many blocks Even with limited script choices therersquos plenty to choose from

Great for domain labels but sub domain labels still open to punctuation and syntax spoofing

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Non-Unicode attacks

bull Confusables

bull Invisibles

bull Problematic font-rendering

bull Manipulating Combining Marks

bull Bidi and syntax spoofing

wwwcasabasecuritycom

Attack VectorsVisual spoofing Vectors

Black Hat USA - July 2009 copy 2009 Chris Weber

rn can look like m in certain fonts

wwwcasabasecuritycom

Attack VectorsNon-Unicode homograph attacks

wwwmulletscom is not wwwrnulletscom

Latin U+006D

LatinU+0073 U+006E

Black Hat USA - July 2009 copy 2009 Chris Weber

Are you using mono-width fonts

0 and O

1 and l

5 and S

wwwcasabasecuritycom

Attack VectorsNon-Unicode homograph attacks

Black Hat USA - July 2009 copy 2009 Chris Weber

Classic long URLrsquos

httploginfacebookintvitationvideomessageid-

h048892r39sessionnfbidcomhomehtmdisbursements

wwwcasabasecuritycom

Attack VectorsNon-Unicode homograph attacks

Black Hat USA - July 2009 copy 2009 Chris Weber

wwwɑpplecom

All Latin using Latin small letter Alpha lsquoɑrsquo

wwwfaϲebookcom

Mixed LatinGreek with lunate sigma symbol lsquocrsquo

wwwаЬсcom

All Cyrillic lsquoabcrsquo

wwwcasabasecuritycom

Attack VectorsSingle-script and The Confusables

Black Hat USA - July 2009 copy 2009 Chris Weber

Browsers whitelist ORG

wwwcasabasecuritycom

Attack VectorsIDN homograph attacks

Black Hat USA - July 2009 copy 2009 Chris Weber

Others donrsquot necessarily buthellip

wwwcasabasecuritycom

Attack VectorsIDN homograph attacks

Black Hat USA - July 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsIDN homograph attacks

wwwmozillaorg is not wwwmoziacutellaorg

Latin U+0069

LatinU+00ED

Black Hat USA - July 2009 copy 2009 Chris Weber

(This case doesnrsquot work anymore)

wwwcasabasecuritycom

Attack VectorsIDN Syntax Spoofing with lookalikes

httpwwwgooglecompathfilenottrustedorg

FULLWIDTH SOLIDUSU+FF0F

Black Hat USA - July 2009 copy 2009 Chris Weber

(Normalized to a U+002F)

wwwcasabasecuritycom

Attack VectorsIDN Syntax Spoofing with lookalikes

httpwwwgooglecompathfilenottrustedorg

SOLIDUSU+002F

Black Hat USA - July 2009 copy 2009 Chris Weber

(However punctuation not requiredhellip)

wwwcasabasecuritycom

Attack VectorsIDN Syntax Spoofing with lookalikes

httpwwwgooglecomノpathノfilenottrustedorg

Katakana NoU+FF89

Black Hat USA - July 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsThe Invisibles

Black Hat USA - July 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsVisual Spoofing with Bidi Explicit Directional Overrides

Black Hat USA - July 2009 copy 2009 Chris Weber

Commonly occur in charset transformations and even innocuous APIrsquos

Impact Filter evasion Enable code execution

When σ becomes s

U+03C3 GREEK SMALL LETTER SIGMA

When prime becomes

U+2032 PRIME

wwwcasabasecuritycom

Root CausesBest-fit mappings

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Scrutinize charactercharset manipulation APIrsquos

bull Use EncoderFallback with SystemTextEncoding

bull Set WC_NO_BEST_FIT_CHARS flag with WideCharToMultiByte()

bull Use Unicode end-to-end

wwwcasabasecuritycom

Root CausesGuidance for Best-Fit mappings

Black Hat USA - July 2009 copy 2009 Chris Weber

bull A popular social networking site in 2008

bull Implemented complex filtering logic to prevent XSS

ndash Attack Filter evasion code execution

ndash Exploit Bypass filtering logic with best-fit mappings to leverage cross-site scripting

ndash Root Cause best-fit mappings

wwwcasabasecuritycom

Case Study Social NetworkingBest-fit mappings

Black Hat USA - July 2009 copy 2009 Chris Weber

-moz-binding()

was not allowed buthellip

-[U+ff4d]oz-binding()

would best-fit map

wwwcasabasecuritycom

Case Study Social NetworkingBest-fit mappings

Black Hat USA - July 2009 copy 2009 Chris Weber

Normalizing strings after validation is dangerous

Impact Filter evasion Enable code execution

wwwcasabasecuritycom

Root CausesNormalization

Black Hat USA - July 2009 copy 2009 Chris Weber

İ becomes I +

wwwcasabasecuritycom

Root CausesNormalization

U+0130 U+0049 U+0307

Black Hat USA - July 2009 copy 2009 Chris Weber

But are there dangerous characters

You bethellip with NFKC and NFKD you could control HTML or other parsing

﹤ becomes lt

wwwcasabasecuritycom

Root CausesNormalization

U+FE64 U+003C

Black Hat USA - July 2009 copy 2009 Chris Weber

﹤ becomes lt

toNFKC(ldquo﹤scriptgtrdquo) = ldquoltscriptgtrdquo

wwwcasabasecuritycom

Root CausesNormalization

U+FE64 U+003C

Black Hat USA - July 2009 copy 2009 Chris Weber

Normalize strings before validation

NFKC first defense against Visual spoofing

wwwcasabasecuritycom

Root CausesGuidance for Normalization

Root Causes: Non-shortest form UTF-8

Non-shortest or overlong UTF-8

Impact: Filter evasion, Enable code execution

Application gets C0 A7
OS/Framework sees 27
Database gets '

Root Causes: Guidance for Non-shortest form UTF-8

• Unicode specification forbids
  – Generation of non-shortest form
  – Interpretation of non-shortest form for BMP
• Validate UTF-8 encoding (throw on error)
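A small Python sketch of why the overlong pair C0 A7 matters: a decoder that skips the shortest-form check turns it into 0x27 (an apostrophe), while a strict decoder throws. `naive_decode_2byte` is a deliberately broken, hypothetical helper written only for this illustration.

```python
def naive_decode_2byte(lead: int, trail: int) -> str:
    """Decode a 2-byte UTF-8 sequence WITHOUT the shortest-form check (unsafe)."""
    return chr(((lead & 0x1F) << 6) | (trail & 0x3F))


def strict_decode(data: bytes) -> str:
    """Validate while decoding: ill-formed input raises UnicodeDecodeError."""
    return data.decode("utf-8", errors="strict")


if __name__ == "__main__":
    print(repr(naive_decode_2byte(0xC0, 0xA7)))  # the apostrophe sneaks through
    try:
        strict_decode(b"\xc0\xa7")
    except UnicodeDecodeError as exc:
        print("rejected:", exc.reason)
```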

Attack Vectors: Directory traversal

How many ways can you say

Root Causes: Handling the Unexpected

• Unassigned code points
  – U+2073
• Illegal code points
  – Half a surrogate pair
• Code points with special meaning
  – U+FEFF is the BOM
• Impact: Filter evasion, Enable code execution
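The three classes above can be told apart with Python's `unicodedata` module. This is a rough sketch; `classify` is a hypothetical helper, and the "unassigned" result depends on the Unicode version bundled with the interpreter.

```python
import unicodedata


def classify(ch: str) -> str:
    """Label a single code point as unassigned, surrogate, bom, or ok."""
    category = unicodedata.category(ch)
    if category == "Cn":   # not assigned (this category also covers noncharacters)
        return "unassigned"
    if category == "Cs":   # a lone surrogate, i.e. half a surrogate pair
        return "surrogate"
    if ch == "\ufeff":     # BOM / ZERO WIDTH NO-BREAK SPACE
        return "bom"
    return "ok"


if __name__ == "__main__":
    for sample in ("\u2073", "\ud800", "\ufeff", "A"):
        print(f"U+{ord(sample):04X}", classify(sample))
```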

Root Causes: Handling the Unexpected (Over-consumption)

Over-consuming ill-formed byte sequences

Big problem with MBCS lead bytes

<41 C2 3E 41> becomes <41 41>

<img src=[0xC2]> onerror=alert(1)<br >

becomes

<img src=> onerror=alert(1)<br >
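The byte sequence 41 C2 3E 41 can be replayed in Python. The first function is a toy decoder, written here only to mimic the bug (it handles just ASCII and 2-byte sequences); the second uses the standard codec, which replaces the bad lead byte and keeps the 0x3E that follows.

```python
def decode_overconsuming(data: bytes) -> str:
    """Toy decoder with the bug: a 2-byte lead swallows the next byte unchecked."""
    out, i = [], 0
    while i < len(data):
        b = data[i]
        if 0xC2 <= b <= 0xDF:
            trail = data[i + 1] if i + 1 < len(data) else None
            if trail is not None and 0x80 <= trail <= 0xBF:
                out.append(chr(((b & 0x1F) << 6) | (trail & 0x3F)))
            # BUG (deliberate): an invalid trail byte is dropped with the lead
            i += 2
        else:
            out.append(chr(b) if b < 0x80 else "\ufffd")
            i += 1
    return "".join(out)


def decode_safely(data: bytes) -> str:
    """Standard codec: the bad lead byte becomes U+FFFD, the next byte survives."""
    return data.decode("utf-8", errors="replace")


if __name__ == "__main__":
    raw = b"\x41\xc2\x3e\x41"
    print(repr(decode_overconsuming(raw)))
    print(repr(decode_safely(raw)))
```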

Root Causes: Handling the Unexpected (Character-substitution)

Correcting insecurely rather than failing
– Substituting a ‘’ or a ‘’ would be bad

Root Causes: Handling the Unexpected (Character-deletion)

“deletion of noncharacters” (UTR-36)

<scr[U+FEFF]ipt> becomes <script>

Root Causes: Solutions for Handling the Unexpected

• Fail or error
• Use U+FFFD instead
  – A common alternative is ‘’ which can be safe
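A short Python sketch contrasting the options for an unexpected U+FEFF: deleting it silently, replacing it with U+FFFD, or failing. The function names are hypothetical.

```python
BOM = "\ufeff"
REPLACEMENT = "\ufffd"


def strip_bom_unsafe(text: str) -> str:
    """Deletion: the pieces on either side of the BOM fuse together."""
    return text.replace(BOM, "")


def replace_bom(text: str) -> str:
    """Substitution with U+FFFD keeps the pieces apart."""
    return text.replace(BOM, REPLACEMENT)


def reject_bom(text: str) -> str:
    """Fail closed: a BOM anywhere after the first position is an error."""
    if BOM in text[1:]:
        raise ValueError("unexpected U+FEFF inside text")
    return text


if __name__ == "__main__":
    sample = "<scr\ufeffipt>"
    print(strip_bom_unsafe(sample))    # the tag reappears
    print(repr(replace_bom(sample)))   # the tag stays broken
```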

Attack Vectors: Filter evasion

• Bypass filters, WAFs, NIDS, and validation
• Exploit delivery techniques
  – E.g. Cross-site scripting (buffer overflow of the Web)

Case Study: Apple and Mozilla

Safari and Firefox BOM consumption
– Attack: Filter evasion, code execution
– Exploit: Bypass filtering logic with specially crafted strings to leverage cross-site scripting
– Root Cause: Character deletion

<a href=“java[U+FEFF]script:alert(„XSS‟)>

Can be nastier:

<a h[U+FEFF]ref=“java[U+FEFF]script:al[U+FEFF]ert(„XSS‟)>

A Closer Look: The BOM

BOM: U+FEFF

Root Causes: Casing

• Attackers manipulate casing operations to inject otherwise prohibited characters
• Casing can multiply the buffer sizes needed
• Impact: Filter evasion, Enable code execution

toLower(“İ”) == “i”

toLower(“scrİpt”) == “script”

len(x) ≠ len(toLower(x))
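How the dotted capital İ (U+0130) lowercases depends on the framework and locale. As one data point, Python's locale-independent `str.lower()` yields `i` followed by U+0307, so the string grows by one code point. The helper below is a hypothetical sketch that reports the lengths before and after casing.

```python
def casing_lengths(text: str) -> tuple:
    """Return (original, lowercased, uppercased) lengths in code points."""
    return len(text), len(text.lower()), len(text.upper())


if __name__ == "__main__":
    for sample in ("\u0130", "\u00df", "script"):
        print(repr(sample), casing_lengths(sample))
```

Any buffer or length check sized from the input before casing can therefore be wrong afterwards.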

Root Causes: Guidance for Casing

• Perform casing operations before validation
• Leverage existing frameworks and APIs
  – ICU, .NET

Root Causes: Buffer Overflows

• Incorrect assumptions about string sizes (chars vs. bytes)
• Improper width calculations
• Impact: Enable code execution

Casing: maximum expansion factors

Operation   UTF         Factor   Sample
Lower       8           1.5      Ⱥ U+023A
Lower       16, 32      1        A U+0041
Upper       8, 16, 32   3        ΐ U+0390

Normalization: maximum expansion factors

Operation   UTF      Factor   Sample
NFC         8        3X       𝅘𝅥𝅮 U+1D160
NFC         16, 32   3X       שּׁ U+FB2C
NFD         8        3X       ΐ U+0390
NFD         16, 32   4X       ᾂ U+1F82
NFKC/NFKD   8        11X      ﷺ U+FDFA
NFKC/NFKD   16, 32   18X      ﷺ U+FDFA

Source: Unicode Technical Report 36
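The factors quoted from UTR 36 can be spot-checked in Python by comparing encoded sizes before and after each operation. This is a sketch using hypothetical helper names; the results rely on the casing and normalization data shipped with the interpreter.

```python
import unicodedata


def utf8_factor(original: str, transformed: str) -> float:
    """Growth in UTF-8 bytes."""
    return len(transformed.encode("utf-8")) / len(original.encode("utf-8"))


def utf16_factor(original: str, transformed: str) -> float:
    """Growth in UTF-16 code units (2 bytes each, no BOM)."""
    return len(transformed.encode("utf-16-le")) / len(original.encode("utf-16-le"))


if __name__ == "__main__":
    print(utf8_factor("\u023a", "\u023a".lower()))   # Lower, UTF-8
    print(utf8_factor("\u0390", "\u0390".upper()))   # Upper, UTF-8
    nfkc = unicodedata.normalize("NFKC", "\ufdfa")
    print(utf8_factor("\ufdfa", nfkc), utf16_factor("\ufdfa", nfkc))
```

Sizing output buffers from these worst-case factors, rather than from the input length, avoids the overflow.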

Root Causes: Guidance for Buffer Overflows

• Know the difference between bytes and chars
• Secure coding
• Leverage existing frameworks and APIs
  – ICU, .NET

Root Causes: Controlling Syntax

• White space and line breaks
  – E.g. when U+180E acts like U+0020
• Quotation marks
• Impact: Filter evasion, Enable code execution

Attacks and Exploits: Controlling syntax

• Manipulate HTML parsers and JavaScript interpreters
• Control protocols

Case Study: Opera

• Unicode formatter characters exploited for XSS
  – Damage: Filter evasion, controlling syntax
  – Exploit: Bypass filtering logic with specially crafted characters to leverage cross-site scripting
  – Root Cause: Interpreting “white space”
  – A problem with HTML 4.0 spec

<a href=[U+180E]onclick=alert()>

MVS: U+180E
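One way to look for characters like U+180E before a parser sees them is to flag every separator, format, or control character outside an explicit whitespace list. The sketch below is hypothetical; the allowed set reflects one reading of the HTML 4.01 whitespace definition plus CR and LF, and should be treated as an assumption.

```python
import unicodedata

# Assumed allow-list: space, tab, form feed, zero-width space, CR, LF
EXPECTED_WHITESPACE = {" ", "\t", "\f", "\u200b", "\r", "\n"}


def suspicious_separators(text: str) -> list:
    """Return (index, 'U+XXXX', category) for unexpected Z* or C* characters."""
    hits = []
    for index, ch in enumerate(text):
        category = unicodedata.category(ch)
        if category[0] in ("Z", "C") and ch not in EXPECTED_WHITESPACE:
            hits.append((index, f"U+{ord(ch):04X}", category))
    return hits


if __name__ == "__main__":
    print(suspicious_separators("<a href=\u180eonclick=alert()>"))
```

The check is category-based on purpose: U+180E has been classified differently across Unicode versions, and both classifications are caught.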

Root Causes: Guidance for Controlling Syntax

• Question specifications
• Be careful…

Root Causes: Specifications

1) Character stability
  – IDNA/Nameprep based on Unicode 3.2

2) Designs
  – Specs are carefully designed but not always perfect
    • This could have been a problem
      – “When designing a markup language or data protocol, the use of U+FEFF can be restricted to that of Byte Order Mark. In that case, any U+FEFF occurring in the middle of the file can be ignored, or treated as an error.”
  – HTML 4.01
    • Defines four whitespace characters and explicitly leaves handling other characters up to implementer

Root Causes: Charset Transformations

• Converting between charsets is dangerous
• Mapping tables and algorithms vary across platforms
• Impact: Filter evasion, Enable code execution, Data-loss

Root Causes: Guidance for Charset Transformations

• Avoid if possible
• Use Unicode as the broker
• Beware the PUA mappings
• Transform, case, and normalize prior to validation and redisplay

Root Causes: Charset Mismatches

• Some charset identifiers are ill-defined
• Vendor implementations vary
• User-agents may sniff if confused
• Attackers manipulate behavior
• Impact: Filter evasion, Enable code execution

Content-Type: charset=ISO-8859-1

<meta http-equiv=Content-Type content=text/html; charset=shift_jis>

Attacker-controlled input

Root Causes: Guidance for Charset Mismatches

• Force UTF-8
• Error if uncertain
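A minimal Python sketch of "force UTF-8, error if uncertain": accept a response only when the HTTP header declares UTF-8 and any in-page declaration agrees. The function names and the regular expression are illustrative assumptions, not a full Content-Type parser.

```python
import re
from typing import Optional

_CHARSET = re.compile(r"charset\s*=\s*[\"']?([A-Za-z0-9._:-]+)", re.IGNORECASE)


def declared_charset(value: Optional[str]) -> Optional[str]:
    """Extract a charset label from a header value or meta tag, if any."""
    if not value:
        return None
    match = _CHARSET.search(value)
    return match.group(1).lower() if match else None


def resolve_charset(http_content_type: str, meta_tag: Optional[str] = None) -> str:
    """Return 'utf-8' or raise; never guess between conflicting labels."""
    header = declared_charset(http_content_type)
    meta = declared_charset(meta_tag)
    if header != "utf-8":
        raise ValueError(f"header charset is {header!r}, expected utf-8")
    if meta is not None and meta != header:
        raise ValueError(f"meta charset {meta!r} disagrees with header")
    return header


if __name__ == "__main__":
    print(resolve_charset("text/html; charset=UTF-8"))
```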

Unicode Transformations: Agenda

• Unicode crash course
• Root Causes
• Attack Vectors
• Tools


Tools

• Watcher
  – Passive Web-app security testing and auditing
• Unibomber
  – XSS autopwn testing tool

Tools: Watcher (Some of the Passive Checks Included)

• Unicode transformation hot-spots
• User-controlled HTML
• Cross-domain issues
• Insecure cookies
• Insecure HTTP/HTTPS transitions
• SSL protocol and certificate issues
• XSS hot-spots
• Flash issues
• Silverlight issues
• Information disclosure

Tools: Watcher (Web-app Security Testing and Auditing)

http://websecuritytool.codeplex.com

Tools: Unibomber (runtime XSS testing tool)

• Deterministic testing
• Auto-inject payloads
• Unicode transformers
  – < > ‘ “ etc.
• Detect transformations and encoding hotspots

Thank you

Casaba Security
www.casabasecurity.com

Chris Weber
Blog: www.lookout.net
Email: chris@casabasecurity.com
LinkedIn: http://www.linkedin.com/in/chrisweber


Unicode Crash Course: Encodings

UTF-8
– Variable width, 1 to 4 bytes (used to be 6)

UTF-16
– Endianness
– Variable width, 2 or 4 bytes
– Surrogate pairs

UTF-32
– Endianness
– Fixed width, 4 bytes
– Fixed mapping, no algorithms needed

Unicode Crash Course: Encodings and Escape sequences

U+FF21 FULLWIDTH LATIN CAPITAL LETTER A

%EF%BC%A1
&#xFF21;
&#65313;
\xEF\xBC\xA1
\uFF21
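The different spellings of one code point can be generated mechanically, which is handy when building test inputs. A small Python sketch follows; the dictionary keys are hypothetical labels chosen for this example.

```python
from urllib.parse import quote


def escape_forms(ch: str) -> dict:
    """Common textual spellings of a single code point."""
    code_point = ord(ch)
    utf8 = ch.encode("utf-8")
    return {
        "percent": quote(ch),                                  # URL percent-encoding of the UTF-8 bytes
        "html_hex": f"&#x{code_point:X};",                     # hexadecimal character reference
        "html_dec": f"&#{code_point};",                        # decimal character reference
        "byte_escape": "".join(f"\\x{b:02X}" for b in utf8),   # C-style byte escapes
        "unicode_escape": f"\\u{code_point:04X}",              # \uXXXX (BMP only)
    }


if __name__ == "__main__":
    for label, spelling in escape_forms("\uff21").items():
        print(f"{label:15} {spelling}")
```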


Unicode Transformations: Overview

• Unicode crash course
• Root Causes
  – Visual Spoofing and IDNs
  – Best-fit mappings
  – Normalization
  – Overlong UTF-8
  – Over-consumption
  – Character substitution
  – Character deletion
  – Casing
  – Buffer overflows
  – Controlling Syntax
  – Charset transformations
  – Charset mismatches
• Tools

Root Causes: Visual Spoofing

• Over 100,000 assigned characters
• Many lookalikes within and across scripts

AΑАᐱᗅᗋᗩᴀᴬA

Attack Vectors: IDN homograph attacks

Some browsers allow .COM IDNs based on script family
– (Latin has a big family)

Browsers shown: Safari, Opera

www.google.com is not www.gooɡle.com
Latin: U+0067
Latin: U+0261
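Two labels that render alike can be compared code point by code point. The Python sketch below uses a hypothetical helper, `describe_differences`, and assumes both strings have the same length.

```python
import unicodedata


def describe_differences(expected: str, candidate: str) -> list:
    """List (index, expected code point, candidate code point, candidate name)."""
    differences = []
    for index, (a, b) in enumerate(zip(expected, candidate)):
        if a != b:
            differences.append(
                (index, f"U+{ord(a):04X}", f"U+{ord(b):04X}", unicodedata.name(b, "?"))
            )
    return differences


if __name__ == "__main__":
    print(describe_differences("google", "goo\u0261le"))
```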

Root Causes: The state of International Domain Names

ICANN guidelines v2.0
– Inclusion-based
– Script limitations
– Character limitations

Deny-all default seems to be the right concept.

A script can cross many blocks. Even with limited script choices, there’s plenty to choose from.

Great for domain labels, but sub domain labels are still open to punctuation and syntax spoofing.

Attack Vectors: Visual spoofing Vectors

• Non-Unicode attacks
• Confusables
• Invisibles
• Problematic font-rendering
• Manipulating Combining Marks
• Bidi and syntax spoofing

Attack Vectors: Non-Unicode homograph attacks

rn can look like m in certain fonts

www.mullets.com is not www.rnullets.com
Latin: U+006D
Latin: U+0072 U+006E

Are you using mono-width fonts?

0 and O
1 and l
5 and S

Classic long URLs

httploginfacebookintvitationvideomessageid-
h048892r39sessionnfbidcomhomehtmdisbursements

Attack Vectors: Single-script and The Confusables

www.ɑpple.com
All Latin, using Latin small letter alpha ‘ɑ’

www.faϲebook.com
Mixed Latin/Greek with lunate sigma symbol ‘ϲ’

www.аЬс.com
All Cyrillic ‘abc’
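A rough Python heuristic for the mixed-script case: take the first word of each letter's Unicode name as its script. This is an assumption-laden shortcut (Python's standard library has no Script property lookup), and `scripts_in_label` is a hypothetical helper. It catches the mixed Latin/Greek label but, as the examples above warn, not single-script confusables.

```python
import unicodedata


def scripts_in_label(label: str) -> set:
    """Approximate the scripts used by the letters of a domain label."""
    scripts = set()
    for ch in label:
        if ch.isalpha():
            # e.g. "GREEK LUNATE SIGMA SYMBOL" -> "GREEK"
            scripts.add(unicodedata.name(ch, "UNKNOWN").split(" ")[0])
    return scripts


if __name__ == "__main__":
    for label in ("facebook", "fa\u03f2ebook", "\u0251pple", "\u0430\u042c\u0441"):
        print(label, sorted(scripts_in_label(label)))
```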

Attack Vectors: IDN homograph attacks

Browsers whitelist .ORG

Others don’t necessarily, but…

www.mozilla.org is not www.mozílla.org
Latin: U+0069
Latin: U+00ED

Attack Vectors: IDN Syntax Spoofing with lookalikes

(This case doesn’t work anymore)

http://www.google.com／path／file.nottrusted.org
FULLWIDTH SOLIDUS: U+FF0F

(Normalized to a U+002F)

http://www.google.com/path/file.nottrusted.org
SOLIDUS: U+002F

(However, punctuation not required…)

http://www.google.comノpathノfile.nottrusted.org
Katakana No: U+FF89
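Whether a lookalike turns into real URL syntax can be checked with NFKC. The Python sketch below, with a hypothetical helper and an assumed delimiter set, flags characters that fold into a delimiter; U+FF0F does, while U+FF89 folds to another Katakana letter and so remains a purely visual spoof.

```python
import unicodedata

URL_DELIMITERS = set("/?#@:.")  # assumed set of syntax characters, for illustration


def hidden_delimiters(text: str) -> list:
    """Return ('U+XXXX', folded) for characters that NFKC turns into URL syntax."""
    found = []
    for ch in text:
        folded = unicodedata.normalize("NFKC", ch)
        if ch not in URL_DELIMITERS and any(c in URL_DELIMITERS for c in folded):
            found.append((f"U+{ord(ch):04X}", folded))
    return found


if __name__ == "__main__":
    print(hidden_delimiters("www.google.com\uff0fpath"))
    print(hidden_delimiters("www.google.com\uff89path"))
```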

Attack Vectors: The Invisibles

Attack Vectors: Visual Spoofing with Bidi Explicit Directional Overrides

Root Causes: Best-fit mappings

Commonly occur in charset transformations and even innocuous APIs

Impact: Filter evasion, Enable code execution

When σ becomes s
U+03C3 GREEK SMALL LETTER SIGMA

When ′ becomes '
U+2032 PRIME


Page 18: Unraveling Unicode: A Bag of Tricks for Bug Hunting...Unicode Crash Course code points encodings categorization normalization binary properties case mapping conversion tables bi-directional

Black Hat USA - July 2009 copy 2009 Chris Weber

U+FF21 FULLWIDTH LATIN CAPITAL LETTER A

EFBCA1

ampxFF21

amp65313

xEFxBCxA1

uFF21

wwwcasabasecuritycom

Unicode Crash CourseEncodings and Escape sequences

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Unicode crash course

bull Root Causes

bull Attack Vectors

bull Tools

wwwcasabasecuritycom

Unicode TransformationsAgenda

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Unicode crash course

bull Root Causes

bull Attack Vectors

bull Tools

wwwcasabasecuritycom

Unicode TransformationsAgenda

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Unicode crash coursebull Root Causes

ndash Visual Spoofing and IDNrsquosndash Best-fit mappingsndash Normalizationndash Overlong UTF-8ndash Over-consumptionndash Character substitutionndash Character deletionndash Casingndash Buffer overflowsndash Controlling Syntaxndash Charset transformationsndash Charset mismatches

bull Tools

wwwcasabasecuritycom

Unicode TransformationsOverview

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Over 100000 assigned characters

bull Many lookalikes within and across scripts

AΑАᐱᗅᗋᗩᴀᴬA

wwwcasabasecuritycom

Root CausesVisual Spoofing

Black Hat USA - July 2009 copy 2009 Chris Weber

Some browsers allow COM IDNrsquos

based on script family

ndash (Latin has a big family)

wwwcasabasecuritycom

Attack VectorsIDN homograph attacks

Black Hat USA - July 2009 copy 2009 Chris Weber

Safari

wwwcasabasecuritycom

Attack VectorsIDN homograph attacks

Black Hat USA - July 2009 copy 2009 Chris Weber

Opera

wwwcasabasecuritycom

Attack VectorsIDN homograph attacks

Black Hat USA - July 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsIDN homograph attacks

wwwgooglecom is not wwwgooɡlecom

Latin U+0069

LatinU+0261

Black Hat USA - July 2009 copy 2009 Chris Weber

ICANN guidelines v20

ndash Inclusion-based

ndash Script limitations

ndash Character limitations

wwwcasabasecuritycom

Root CausesThe state of International Domain Names

Deny-all default seems to be the right concept

A script can cross many blocks Even with limited script choices therersquos plenty to choose from

Great for domain labels but sub domain labels still open to punctuation and syntax spoofing

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Non-Unicode attacks

bull Confusables

bull Invisibles

bull Problematic font-rendering

bull Manipulating Combining Marks

bull Bidi and syntax spoofing

wwwcasabasecuritycom

Attack VectorsVisual spoofing Vectors

Black Hat USA - July 2009 copy 2009 Chris Weber

rn can look like m in certain fonts

wwwcasabasecuritycom

Attack VectorsNon-Unicode homograph attacks

wwwmulletscom is not wwwrnulletscom

Latin U+006D

LatinU+0073 U+006E

Black Hat USA - July 2009 copy 2009 Chris Weber

Are you using mono-width fonts

0 and O

1 and l

5 and S

wwwcasabasecuritycom

Attack VectorsNon-Unicode homograph attacks

Black Hat USA - July 2009 copy 2009 Chris Weber

Classic long URLrsquos

httploginfacebookintvitationvideomessageid-

h048892r39sessionnfbidcomhomehtmdisbursements

wwwcasabasecuritycom

Attack VectorsNon-Unicode homograph attacks

Black Hat USA - July 2009 copy 2009 Chris Weber

wwwɑpplecom

All Latin using Latin small letter Alpha lsquoɑrsquo

wwwfaϲebookcom

Mixed LatinGreek with lunate sigma symbol lsquocrsquo

wwwаЬсcom

All Cyrillic lsquoabcrsquo

wwwcasabasecuritycom

Attack VectorsSingle-script and The Confusables

Black Hat USA - July 2009 copy 2009 Chris Weber

Browsers whitelist ORG

wwwcasabasecuritycom

Attack VectorsIDN homograph attacks

Black Hat USA - July 2009 copy 2009 Chris Weber

Others donrsquot necessarily buthellip

wwwcasabasecuritycom

Attack VectorsIDN homograph attacks

Black Hat USA - July 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsIDN homograph attacks

wwwmozillaorg is not wwwmoziacutellaorg

Latin U+0069

LatinU+00ED

Black Hat USA - July 2009 copy 2009 Chris Weber

(This case doesnrsquot work anymore)

wwwcasabasecuritycom

Attack VectorsIDN Syntax Spoofing with lookalikes

httpwwwgooglecompathfilenottrustedorg

FULLWIDTH SOLIDUSU+FF0F

Black Hat USA - July 2009 copy 2009 Chris Weber

(Normalized to a U+002F)

wwwcasabasecuritycom

Attack VectorsIDN Syntax Spoofing with lookalikes

httpwwwgooglecompathfilenottrustedorg

SOLIDUSU+002F

Black Hat USA - July 2009 copy 2009 Chris Weber

(However punctuation not requiredhellip)

wwwcasabasecuritycom

Attack VectorsIDN Syntax Spoofing with lookalikes

httpwwwgooglecomノpathノfilenottrustedorg

Katakana NoU+FF89

Attack Vectors: The Invisibles

Attack Vectors: Visual Spoofing with Bidi Explicit Directional Overrides

Best-fit mappings commonly occur in charset transformations and even innocuous APIs

Impact: Filter evasion, Enable code execution

When σ becomes s
U+03C3 GREEK SMALL LETTER SIGMA

When ′ becomes '
U+2032 PRIME

Root Causes: Best-fit mappings

• Scrutinize character/charset manipulation APIs
• Use EncoderFallback with System.Text.Encoding
• Set the WC_NO_BEST_FIT_CHARS flag with WideCharToMultiByte()
• Use Unicode end-to-end

Root Causes: Guidance for Best-Fit mappings
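The guidance above names .NET and Win32 APIs. As a rough analogue in Python (an assumption of this sketch, not something the talk covers), the built-in codecs never apply best-fit mappings, and encoding with `errors="strict"` fails loudly instead of substituting a lookalike. `to_legacy_charset` is a hypothetical helper name:

```python
def to_legacy_charset(text: str, charset: str = "cp1252") -> bytes:
    """Encode to a legacy charset, raising instead of substituting characters."""
    return text.encode(charset, errors="strict")

for sample in ("\u03c3", "\u2032"):  # GREEK SMALL LETTER SIGMA, PRIME
    try:
        to_legacy_charset(sample)
    except UnicodeEncodeError as exc:
        print(f"U+{ord(sample):04X} rejected: {exc.reason}")
```

Failing here is the point: the caller has to decide what to do with the character, rather than having it silently become `s` or `'`.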

• A popular social networking site in 2008
• Implemented complex filtering logic to prevent XSS
  – Attack: Filter evasion, code execution
  – Exploit: Bypass filtering logic with best-fit mappings to leverage cross-site scripting
  – Root Cause: Best-fit mappings

Case Study: Social Networking, Best-fit mappings

-moz-binding()

was not allowed, but…

-[U+FF4D]oz-binding()

would best-fit map to it

Case Study: Social Networking, Best-fit mappings
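The bypass in this case study comes down to ordering: the filter ran before the lossy conversion. A toy reconstruction, assuming a one-entry best-fit table and a substring filter (`BEST_FIT`, `best_fit`, and `is_blocked` are all hypothetical stand-ins; the real site's filter and conversion tables are not described in the talk):

```python
# Hypothetical one-entry best-fit table: FULLWIDTH LATIN SMALL LETTER M -> 'm'.
BEST_FIT = {"\uff4d": "m"}

def best_fit(text: str) -> str:
    """Simulate a lossy charset conversion that applies best-fit mappings."""
    return "".join(BEST_FIT.get(ch, ch) for ch in text)

def is_blocked(css: str) -> bool:
    """Stand-in for the site's filter: reject the -moz-binding property."""
    return "-moz-binding" in css.lower()

payload = "-\uff4doz-binding()"
print(is_blocked(payload))            # filter sees the fullwidth 'm' and lets it pass
print(is_blocked(best_fit(payload)))  # after conversion the blocked token appears
```

Running the filter on the converted string, or avoiding the lossy conversion entirely, closes this particular gap.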

Normalizing strings after validation is dangerous

Impact: Filter evasion, Enable code execution

Root Causes: Normalization

İ becomes I + ◌̇
U+0130 becomes U+0049 U+0307

Root Causes: Normalization

But are there dangerous characters?

You bet… with NFKC and NFKD you could control HTML or other parsing

﹤ becomes <
U+FE64 becomes U+003C

Root Causes: Normalization

toNFKC(“﹤script>”) = “<script>”

Root Causes: Normalization

Normalize strings before validation

NFKC is a first defense against visual spoofing

Root Causes: Guidance for Normalization
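The ordering rule above can be sketched with Python's standard `unicodedata` module. The validator here is a deliberately naive stand-in (any `<` counts as markup), and the function names are hypothetical; the point is only which string the check gets to see:

```python
import unicodedata

def contains_markup(text: str) -> bool:
    """Toy validator: treat any '<' as an attempt to inject markup."""
    return "<" in text

def validate_then_normalize(text: str) -> str:
    """Unsafe order: the check sees the pre-normalization string."""
    if contains_markup(text):
        raise ValueError("markup rejected")
    return unicodedata.normalize("NFKC", text)

def normalize_then_validate(text: str) -> str:
    """Safer order: the check sees the same string that is used afterwards."""
    normalized = unicodedata.normalize("NFKC", text)
    if contains_markup(normalized):
        raise ValueError("markup rejected")
    return normalized

payload = "\ufe64script>"  # starts with U+FE64 SMALL LESS-THAN SIGN
print(validate_then_normalize(payload))  # the '<' only appears after the check
```

Calling `normalize_then_validate(payload)` raises instead, because the less-than sign already exists when the check runs.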

Non-shortest or overlong UTF-8

Impact: Filter evasion, Enable code execution

Application gets C0 A7
OS/Framework sees 27
Database gets '

Root Causes: Non-shortest form UTF-8

• Unicode specification forbids
  – Generation of non-shortest form
  – Interpretation of non-shortest form for BMP
• Validate UTF-8 encoding (throw on error)

Root Causes: Guidance for Non-shortest form UTF-8
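"Validate and throw on error" can be illustrated with Python's built-in UTF-8 decoder, which rejects overlong sequences when used in strict mode. This is a sketch of the idea rather than the talk's own tooling, and `decode_utf8_strict` is a hypothetical helper name:

```python
def decode_utf8_strict(data: bytes) -> str:
    """Decode UTF-8, raising on any ill-formed sequence."""
    return data.decode("utf-8", errors="strict")

overlong_apostrophe = bytes([0xC0, 0xA7])  # non-shortest form of 0x27
try:
    decode_utf8_strict(overlong_apostrophe)
except UnicodeDecodeError as exc:
    print("rejected:", exc.reason)
```

Because the decoder refuses the bytes outright, no later layer gets the chance to reinterpret them as an apostrophe.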

How many ways can you say

Attack Vectors: Directory traversal

Attack Vectors

• Unassigned code points
  – U+2073
• Illegal code points
  – Half a surrogate pair
• Code points with special meaning
  – U+FEFF is the BOM
• Impact: Filter evasion, Enable code execution

Root Causes: Handling the Unexpected

Over-consuming ill-formed byte sequences

Big problem with MBCS lead bytes

<41 C2 3E 41> becomes <41 41>

Root Causes: Handling the Unexpected, Over-consumption

<img src=[0xC2]> onerror=alert(1)<br >

becomes

<img src=> onerror=alert(1)<br >

Root Causes: Handling the Unexpected, Over-consumption
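The `<41 C2 3E 41>` case can be reproduced with a deliberately buggy toy decoder and compared with Python's built-in one. `overconsuming_decode` is a hypothetical illustration written for this sketch, not code from any real product:

```python
def overconsuming_decode(data: bytes) -> str:
    """Deliberately buggy decoder: a two-byte lead (0xC2-0xDF) swallows the
    next byte without checking that it is a valid continuation byte."""
    out = []
    i = 0
    while i < len(data):
        byte = data[i]
        if 0xC2 <= byte <= 0xDF and i + 1 < len(data):
            trail = data[i + 1]
            if 0x80 <= trail <= 0xBF:
                out.append(chr(((byte & 0x1F) << 6) | (trail & 0x3F)))
            # Bug: the next byte is skipped even when it was not a trail byte.
            i += 2
        else:
            out.append(chr(byte))
            i += 1
    return "".join(out)

data = bytes([0x41, 0xC2, 0x3E, 0x41])
print(overconsuming_decode(data))              # the '>' byte (0x3E) is lost
print(data.decode("utf-8", errors="replace"))  # only 0xC2 is replaced; '>' survives
```

The built-in decoder replaces just the stray lead byte with U+FFFD, so the syntax character that follows it is still there for the parser and for any filter.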

Correcting insecurely rather than failing

– Substituting a ‘’ or a ‘’ would be bad

Root Causes: Handling the Unexpected, Character-substitution

“deletion of noncharacters” (UTR-36)

Root Causes: Handling the Unexpected, Character-deletion

<scr[U+FEFF]ipt> becomes <script>

Root Causes: Handling the Unexpected, Character-deletion

• Fail or error
• Use U+FFFD instead
  – A common alternative is ‘?’, which can be safe

Root Causes: Solutions for Handling the Unexpected
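The three options above (delete, replace with U+FFFD, or fail) behave very differently on the `<scr[U+FEFF]ipt>` example. A minimal sketch, assuming for illustration that a mid-string U+FEFF is the only "unexpected" character; the set and the function names are hypothetical:

```python
REPLACEMENT = "\ufffd"
UNEXPECTED = {"\ufeff"}  # assumption: only a stray BOM is treated as unexpected here

def delete_unexpected(text: str) -> str:
    """Risky: silently dropping characters can join two harmless halves."""
    return "".join(ch for ch in text if ch not in UNEXPECTED)

def replace_unexpected(text: str) -> str:
    """Safer: keep a visible U+FFFD marker where the character was."""
    return "".join(REPLACEMENT if ch in UNEXPECTED else ch for ch in text)

def reject_unexpected(text: str) -> str:
    """Strictest: fail instead of correcting."""
    if any(ch in UNEXPECTED for ch in text):
        raise ValueError("unexpected code point in input")
    return text

payload = "<scr\ufeffipt>"
print(delete_unexpected(payload))               # deletion rebuilds the tag
print("script" in replace_unexpected(payload))  # replacement keeps it broken
```

Replacement keeps the two halves apart, so a filter that looked for the tag before the transformation is still correct afterwards.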

• Bypass filters, WAFs, NIDS, and validation
• Exploit delivery techniques
  – E.g., cross-site scripting (the buffer overflow of the Web)

Attack Vectors: Filter evasion

Safari and Firefox BOM consumption
– Attack: Filter evasion, code execution
– Exploit: Bypass filtering logic with specially crafted strings to leverage cross-site scripting
– Root Cause: Character deletion

<a href="java[U+FEFF]script:alert('XSS')>

Can be nastier:

<a h[U+FEFF]ref="java[U+FEFF]script:al[U+FEFF]ert('XSS')>

Case Study: Apple and Mozilla

A Closer Look: The BOM

BOM, U+FEFF

• Attackers manipulate casing operations to inject otherwise prohibited characters
• Casing can multiply the buffer sizes needed
• Impact: Filter evasion, Enable code execution

Root Causes: Casing

toLower(“İ”) == “i”

toLower(“scrİpt”) == “script”

Root Causes: Casing

len(x) != len(toLower(x))

Root Causes: Casing
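The length change is easy to observe in Python, with one caveat: Python's `str.lower()` keeps the combining dot, so it maps İ to `i` followed by U+0307 rather than to the plain `i` shown on the slide (behaviour differs between frameworks and locales). `casing_growth` is a hypothetical helper for this sketch:

```python
def casing_growth(ch: str, op) -> tuple:
    """Return (code point growth, UTF-8 byte growth) for a casing operation."""
    cased = op(ch)
    return (
        len(cased) / len(ch),
        len(cased.encode("utf-8")) / len(ch.encode("utf-8")),
    )

samples = [
    ("\u0130", str.lower),  # LATIN CAPITAL LETTER I WITH DOT ABOVE
    ("\u023a", str.lower),  # LATIN CAPITAL LETTER A WITH STROKE
    ("\u0390", str.upper),  # GREEK SMALL LETTER IOTA WITH DIALYTIKA AND TONOS
]
for ch, op in samples:
    print(f"U+{ord(ch):04X} {op.__name__}: {casing_growth(ch, op)}")
```

Any buffer sized from the input length alone is therefore too small for the worst case.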

• Perform casing operations before validation
• Leverage existing frameworks and APIs
  – ICU, .NET

Root Causes: Guidance for Casing
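Why casing has to come before validation can be shown with a character the talk does not use: U+0131 LATIN SMALL LETTER DOTLESS I, which Python's `str.upper()` maps to ASCII `I`. The filter and both pipelines below are hypothetical toys, chosen only to make the ordering visible:

```python
def looks_like_script_tag(text: str) -> bool:
    """Toy filter: case-insensitive search for the word 'script'."""
    return "script" in text.lower()

def validate_then_upper(text: str) -> str:
    """Unsafe order: validation runs on the string before it is upper-cased."""
    if looks_like_script_tag(text):
        raise ValueError("blocked")
    return text.upper()

def upper_then_validate(text: str) -> str:
    """Safer order: validation runs on the result of the casing operation."""
    cased = text.upper()
    if looks_like_script_tag(cased):
        raise ValueError("blocked")
    return cased

payload = "<scr\u0131pt>"  # dotless i instead of ASCII i
print(validate_then_upper(payload))  # upper-casing produces the blocked tag
```

With `upper_then_validate` the same payload is rejected, because the filter inspects the string that downstream code will actually receive.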

• Incorrect assumptions about string sizes (chars vs. bytes)
• Improper width calculations
• Impact: Enable code execution

Root Causes: Buffer Overflows

Casing - maximum expansion factors

Operation  UTF        Factor  Sample
Lower      8          1.5X    Ⱥ U+023A
Lower      16, 32     1X      A U+0041
Upper      8, 16, 32  3X      ΐ U+0390

Source: Unicode Technical Report 36

Root Causes: Buffer Overflows

Normalization - maximum expansion factors

Operation  UTF     Factor  Sample
NFC        8       3X      𝅘𝅥𝅮 U+1D160
NFC        16, 32  3X      שּׁ U+FB2C
NFD        8       3X      ΐ U+0390
NFD        16, 32  4X      ᾂ U+1F82
NFKC/NFKD  8       11X     ﷺ U+FDFA
NFKC/NFKD  16, 32  18X     ﷺ U+FDFA

Source: Unicode Technical Report 36

Root Causes: Buffer Overflows
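The normalization factors in the table can be recomputed with the standard `unicodedata` module. A short sketch (`expansion` is a hypothetical helper; sizes are measured in encoded bytes, which gives the same ratio as code units):

```python
import unicodedata

def expansion(ch: str, form: str) -> dict:
    """Growth factor of one character under a normalization form, per encoding."""
    normalized = unicodedata.normalize(form, ch)
    return {
        enc: len(normalized.encode(codec)) / len(ch.encode(codec))
        for enc, codec in (("utf-8", "utf-8"),
                           ("utf-16", "utf-16-le"),
                           ("utf-32", "utf-32-le"))
    }

samples = [(0x1D160, "NFC"), (0xFB2C, "NFC"), (0x0390, "NFD"),
           (0x1F82, "NFD"), (0xFDFA, "NFKC")]
for code_point, form in samples:
    print(f"U+{code_point:04X} {form}: {expansion(chr(code_point), form)}")
```

Sizing an output buffer from these worst-case factors, or letting the library allocate the result, avoids the overflow described above.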

• Know the difference between bytes and chars
• Secure coding
• Leverage existing frameworks and APIs
  – ICU, .NET

Root Causes: Guidance for Buffer Overflows

• White space and line breaks
  – E.g., when U+180E acts like U+0020
• Quotation marks
• Impact: Filter evasion, Enable code execution

Root Causes: Controlling Syntax

• Manipulate HTML parsers and JavaScript interpreters
• Control protocols

Attacks and Exploits: Controlling syntax

• Unicode formatter characters exploited for XSS
  – Damage: Filter evasion, controlling syntax
  – Exploit: Bypass filtering logic with specially crafted characters to leverage cross-site scripting
  – Root Cause: Interpreting “white space”
  – A problem with the HTML 4.0 spec

<a href=[U+180E]onclick=alert()>

Case Study: Opera

MVS (MONGOLIAN VOWEL SEPARATOR), U+180E

• Question specifications
• Be careful…

Root Causes: Guidance for Controlling Syntax
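One way to "be careful" is to allowlist the separators a parser is expected to honour and flag everything else that might act like one. A sketch under stated assumptions: the allowlist holds the four HTML 4.01 white space characters plus CR and LF, and any other separator, format, or control character is treated as suspect (`ALLOWED_SEPARATORS`, `SUSPECT_CATEGORIES`, and `suspicious_separators` are hypothetical names):

```python
import unicodedata

# Space, tab, form feed, zero-width space (HTML 4.01 white space), plus LF and CR.
ALLOWED_SEPARATORS = {"\u0020", "\u0009", "\u000c", "\u200b", "\u000a", "\u000d"}

# General categories that could plausibly be treated as white space or ignored.
SUSPECT_CATEGORIES = {"Zs", "Zl", "Zp", "Cf", "Cc"}

def suspicious_separators(text: str) -> list:
    """List code points that may act as separators but are not allowlisted."""
    return [
        f"U+{ord(ch):04X}"
        for ch in text
        if ch not in ALLOWED_SEPARATORS
        and unicodedata.category(ch) in SUSPECT_CATEGORIES
    ]

print(suspicious_separators("<a href=\u180eonclick=alert()>"))
```

Checking by category rather than by a fixed list of code points keeps the check working even though U+180E has been classified differently across Unicode versions.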

1) Character stability
  – IDNA/Nameprep based on Unicode 3.2
2) Designs
  – Specs are carefully designed but not always perfect
• This could have been a problem
  – “When designing a markup language or data protocol, the use of U+FEFF can be restricted to that of Byte Order Mark. In that case, any U+FEFF occurring in the middle of the file can be ignored, or treated as an error.”
  – HTML 4.01
    • Defines four whitespace characters and explicitly leaves handling other characters up to the implementer

Root Causes: Specifications

• Converting between charsets is dangerous
• Mapping tables and algorithms vary across platforms
• Impact: Filter evasion, Enable code execution, Data loss

Root Causes: Charset Transformations

• Avoid if possible
• Use Unicode as the broker
• Beware the PUA mappings
• Transform case and normalize prior to validation and redisplay

Root Causes: Guidance for Charset Transformations

• Some charset identifiers are ill-defined
• Vendor implementations vary
• User-agents may sniff if confused
• Attackers manipulate behavior
• Impact: Filter evasion, Enable code execution

Root Causes: Charset Mismatches

Content-Type: charset=ISO-8859-1

<meta http-equiv="Content-Type" content="text/html; charset=shift_jis">

Attacker-controlled input

• Force UTF-8
• Error if uncertain

Root Causes: Guidance for Charset Mismatches
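"Force UTF-8, error if uncertain" can be sketched as two small functions: one that declares the same charset in the HTTP header and in the markup, and one that refuses input that is not well-formed UTF-8. Both names are hypothetical, and the page template is an assumption made only for this illustration:

```python
import html

def build_html_response(body_text: str):
    """Return (headers, body bytes) with UTF-8 declared consistently."""
    page = (
        "<!DOCTYPE html>\n<html><head>\n"
        '<meta http-equiv="Content-Type" content="text/html; charset=utf-8">\n'
        "</head><body>" + html.escape(body_text) + "</body></html>"
    )
    headers = {"Content-Type": "text/html; charset=utf-8"}
    return headers, page.encode("utf-8")

def decode_request_body(raw: bytes) -> str:
    """Accept only well-formed UTF-8; raise instead of guessing a charset."""
    return raw.decode("utf-8", errors="strict")

headers, body = build_html_response("caf\u00e9")
print(headers["Content-Type"])
```

With the header and the meta element in agreement, the user-agent has no reason to sniff, and input in another charset is rejected instead of being reinterpreted.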

• Unicode crash course
• Root Causes
• Attack Vectors
• Tools

Unicode Transformations: Agenda

• Watcher
  – Passive Web-app security testing and auditing
• Unibomber
  – XSS autopwn testing tool

Tools

• Unicode transformation hot-spots
• User-controlled HTML
• Cross-domain issues
• Insecure cookies
• Insecure HTTP/HTTPS transitions
• SSL protocol and certificate issues
• XSS hot-spots
• Flash issues
• Silverlight issues
• Information disclosure

Tools: Watcher – Some of the Passive Checks Included

http://websecuritytool.codeplex.com

Tools: Watcher - Web-app Security Testing and Auditing

• Deterministic testing
• Auto-inject payloads
• Unicode transformers
  – < > ‘ “ etc.
• Detect transformations and encoding hotspots

Tools: Unibomber – runtime XSS testing tool

Thank you

Casaba Security
www.casabasecurity.com

Chris Weber
Blog: www.lookout.net
Email: chris@casabasecurity.com
LinkedIn: http://www.linkedin.com/in/chrisweber



Page 20: Unraveling Unicode: A Bag of Tricks for Bug Hunting...Unicode Crash Course code points encodings categorization normalization binary properties case mapping conversion tables bi-directional

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Unicode crash course

bull Root Causes

bull Attack Vectors

bull Tools

wwwcasabasecuritycom

Unicode TransformationsAgenda

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Unicode crash coursebull Root Causes

ndash Visual Spoofing and IDNrsquosndash Best-fit mappingsndash Normalizationndash Overlong UTF-8ndash Over-consumptionndash Character substitutionndash Character deletionndash Casingndash Buffer overflowsndash Controlling Syntaxndash Charset transformationsndash Charset mismatches

bull Tools

wwwcasabasecuritycom

Unicode TransformationsOverview

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Over 100000 assigned characters

bull Many lookalikes within and across scripts

AΑАᐱᗅᗋᗩᴀᴬA

wwwcasabasecuritycom

Root CausesVisual Spoofing

Black Hat USA - July 2009 copy 2009 Chris Weber

Some browsers allow COM IDNrsquos

based on script family

ndash (Latin has a big family)

wwwcasabasecuritycom

Attack VectorsIDN homograph attacks

Black Hat USA - July 2009 copy 2009 Chris Weber

Safari

wwwcasabasecuritycom

Attack VectorsIDN homograph attacks

Black Hat USA - July 2009 copy 2009 Chris Weber

Opera

wwwcasabasecuritycom

Attack VectorsIDN homograph attacks

Black Hat USA - July 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsIDN homograph attacks

wwwgooglecom is not wwwgooɡlecom

Latin U+0069

LatinU+0261

Black Hat USA - July 2009 copy 2009 Chris Weber

ICANN guidelines v20

ndash Inclusion-based

ndash Script limitations

ndash Character limitations

wwwcasabasecuritycom

Root CausesThe state of International Domain Names

Deny-all default seems to be the right concept

A script can cross many blocks Even with limited script choices therersquos plenty to choose from

Great for domain labels but sub domain labels still open to punctuation and syntax spoofing

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Non-Unicode attacks

bull Confusables

bull Invisibles

bull Problematic font-rendering

bull Manipulating Combining Marks

bull Bidi and syntax spoofing

wwwcasabasecuritycom

Attack VectorsVisual spoofing Vectors

Black Hat USA - July 2009 copy 2009 Chris Weber

rn can look like m in certain fonts

wwwcasabasecuritycom

Attack VectorsNon-Unicode homograph attacks

wwwmulletscom is not wwwrnulletscom

Latin U+006D

LatinU+0073 U+006E

Black Hat USA - July 2009 copy 2009 Chris Weber

Are you using mono-width fonts

0 and O

1 and l

5 and S

wwwcasabasecuritycom

Attack VectorsNon-Unicode homograph attacks

Black Hat USA - July 2009 copy 2009 Chris Weber

Classic long URLrsquos

httploginfacebookintvitationvideomessageid-

h048892r39sessionnfbidcomhomehtmdisbursements

wwwcasabasecuritycom

Attack VectorsNon-Unicode homograph attacks

Black Hat USA - July 2009 copy 2009 Chris Weber

wwwɑpplecom

All Latin using Latin small letter Alpha lsquoɑrsquo

wwwfaϲebookcom

Mixed LatinGreek with lunate sigma symbol lsquocrsquo

wwwаЬсcom

All Cyrillic lsquoabcrsquo

wwwcasabasecuritycom

Attack VectorsSingle-script and The Confusables

Black Hat USA - July 2009 copy 2009 Chris Weber

Browsers whitelist ORG

wwwcasabasecuritycom

Attack VectorsIDN homograph attacks

Black Hat USA - July 2009 copy 2009 Chris Weber

Others donrsquot necessarily buthellip

wwwcasabasecuritycom

Attack VectorsIDN homograph attacks

Black Hat USA - July 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsIDN homograph attacks

wwwmozillaorg is not wwwmoziacutellaorg

Latin U+0069

LatinU+00ED

Black Hat USA - July 2009 copy 2009 Chris Weber

(This case doesn't work anymore)

http://www.google.com／path／file.nottrusted.org

FULLWIDTH SOLIDUS U+FF0F

Attack Vectors: IDN Syntax Spoofing with lookalikes

(Normalized to a U+002F)

http://www.google.com/path/file.nottrusted.org

SOLIDUS U+002F

Attack Vectors: IDN Syntax Spoofing with lookalikes
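The normalization step on this slide can be reproduced with Python's `unicodedata` module. The sketch below is illustrative only: `URL_SYNTAX` and `syntax_lookalikes` are hypothetical names, and the set of syntax characters is an assumption chosen for the example, not a complete list.

```python
import unicodedata

# Assumed set of URL syntax characters for this illustration
URL_SYNTAX = set("/?#:@.")

def syntax_lookalikes(label):
    """Return characters that are not URL syntax but become URL syntax under NFKC."""
    found = []
    for ch in label:
        folded = unicodedata.normalize("NFKC", ch)
        if ch not in URL_SYNTAX and any(c in URL_SYNTAX for c in folded):
            found.append(ch)
    return found

# U+FF0F FULLWIDTH SOLIDUS folds to U+002F under NFKC
print(syntax_lookalikes("www.google.com\uff0fpath"))
```

A check like this catches the fullwidth solidus because it normalizes to real punctuation. It would not catch a letter that merely looks like a slash, since no normalization form maps a letter to punctuation.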

(However, punctuation not required…)

http://www.google.comノpathノfile.nottrusted.org

Katakana No U+FF89

Attack Vectors: IDN Syntax Spoofing with lookalikes

Attack Vectors: The Invisibles

Attack Vectors: Visual Spoofing with Bidi Explicit Directional Overrides
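Invisible characters and bidi overrides can at least be located mechanically. The following is a minimal sketch using Python's `unicodedata`; the function name and the reason strings are hypothetical, and a real tool would need a policy for what to do with each hit.

```python
import unicodedata

# Bidirectional classes of the explicit embedding, override, and isolate controls
BIDI_CONTROLS = {"LRE", "RLE", "LRO", "RLO", "PDF", "LRI", "RLI", "FSI", "PDI"}

def hidden_characters(text):
    """List (index, code point, reason) for bidi controls and other format characters."""
    hits = []
    for i, ch in enumerate(text):
        if unicodedata.bidirectional(ch) in BIDI_CONTROLS:
            hits.append((i, f"U+{ord(ch):04X}", "bidi control"))
        elif unicodedata.category(ch) == "Cf":
            hits.append((i, f"U+{ord(ch):04X}", "invisible format character"))
    return hits

# U+202E RIGHT-TO-LEFT OVERRIDE hidden inside a file name
print(hidden_characters("evil\u202etxt.exe"))
```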

Commonly occur in charset transformations and even innocuous APIs

Impact: Filter evasion, enable code execution

When σ becomes s
U+03C3 GREEK SMALL LETTER SIGMA

When ′ becomes '
U+2032 PRIME

Root Causes: Best-fit mappings

• Scrutinize character/charset manipulation APIs
• Use EncoderFallback with System.Text.Encoding
• Set WC_NO_BEST_FIT_CHARS flag with WideCharToMultiByte()
• Use Unicode end-to-end

Root Causes: Guidance for Best-Fit mappings
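The same idea, refusing to substitute lookalikes during conversion, can be sketched in Python. This is an illustration under stated assumptions: `BEST_FIT` is a tiny made-up table that imitates a lossy converter (real best-fit tables are far larger), and both function names are hypothetical.

```python
# Illustrative best-fit table (assumption): imitates a lossy legacy converter
BEST_FIT = {"\u03c3": "s", "\u2032": "'", "\uff4d": "m"}

def lossy_best_fit(text):
    """Simulate a converter that silently substitutes lookalike characters."""
    return "".join(BEST_FIT.get(ch, ch) for ch in text)

def strict_convert(text, charset="cp1252"):
    """Convert without substitution; raises UnicodeEncodeError on unmappable input."""
    return text.encode(charset, errors="strict")

# A filter that saw U+2032 PRIME would not recognize the apostrophe it turns into
print(lossy_best_fit("name=\u2032 OR 1=1"))

try:
    strict_convert("name=\u2032 OR 1=1")
except UnicodeEncodeError as err:
    print("rejected:", err.reason)
```

Failing loudly on unmappable input plays the same role as an exception fallback or the WC_NO_BEST_FIT_CHARS flag: the conversion never invents a character the validator did not see.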

• A popular social networking site in 2008
• Implemented complex filtering logic to prevent XSS
  – Attack: Filter evasion, code execution
  – Exploit: Bypass filtering logic with best-fit mappings to leverage cross-site scripting
  – Root Cause: Best-fit mappings

Case Study: Social Networking – Best-fit mappings

-moz-binding()

was not allowed, but…

-[U+FF4D]oz-binding()

would best-fit map to -moz-binding()

Case Study: Social Networking – Best-fit mappings

Normalizing strings after validation is dangerous

Impact: Filter evasion, enable code execution

Root Causes: Normalization

İ becomes I + ◌̇

U+0130 → U+0049 U+0307

Root Causes: Normalization

But are there dangerous characters?

You bet… with NFKC and NFKD you could control HTML or other parsing

﹤ becomes <

U+FE64 → U+003C

Root Causes: Normalization

﹤ becomes <

toNFKC("﹤script>") = "<script>"

U+FE64 → U+003C

Root Causes: Normalization

Normalize strings before validation

NFKC: first defense against visual spoofing

Root Causes: Guidance for Normalization
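The ordering advice can be shown in a few lines of Python. This is a sketch, not a real filter: `is_safe` is a toy validator invented for the example, and `validate` is a hypothetical wrapper that normalizes first and checks second.

```python
import unicodedata

def is_safe(text):
    """Toy validator (assumption): reject any ASCII markup delimiter."""
    return "<" not in text and ">" not in text

def validate(text):
    """Normalize to NFKC first, then validate the normalized string."""
    normalized = unicodedata.normalize("NFKC", text)
    if not is_safe(normalized):
        raise ValueError("rejected after normalization")
    return normalized

# U+FE64 SMALL LESS-THAN SIGN and U+FE65 SMALL GREATER-THAN SIGN
payload = "\ufe64script\ufe65"

# Validating the raw string misses the problem
print(is_safe(payload))

try:
    validate(payload)
except ValueError as err:
    print(err)
```

Whatever string was validated is the string that should be used afterwards. Validating one form and storing or rendering another reopens the gap.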

Non-shortest or overlong UTF-8

Impact: Filter evasion, enable code execution

Application gets C0 A7

OS/Framework sees 27

Database gets '

Root Causes: Non-shortest form UTF-8

• Unicode specification forbids
  – Generation of non-shortest form
  – Interpretation of non-shortest form for BMP
• Validate UTF-8 encoding (throw on error)

Root Causes: Guidance for Non-shortest form UTF-8
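"Validate and throw on error" looks like this with Python 3's built-in UTF-8 decoder, which rejects overlong forms in strict mode. The wrapper name is hypothetical; the sketch only demonstrates the decoder's behavior on the C0 A7 example.

```python
def decode_utf8_strict(data):
    """Decode UTF-8 and raise UnicodeDecodeError on any ill-formed sequence."""
    return data.decode("utf-8", errors="strict")

# The shortest form of the apostrophe is the single byte 0x27
print(decode_utf8_strict(b"\x27"))

# C0 A7 is an overlong encoding of the same character and must be rejected
try:
    decode_utf8_strict(b"\xc0\xa7")
except UnicodeDecodeError as err:
    print("rejected:", err.reason)
```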

How many ways can you say

Attack Vectors: Directory traversal

Attack Vectors

• Unassigned code points
  – U+2073
• Illegal code points
  – Half a surrogate pair
• Code points with special meaning
  – U+FEFF is the BOM
• Impact: Filter evasion, enable code execution

Root Causes: Handling the Unexpected

Over-consuming ill-formed byte sequences

Big problem with MBCS lead bytes

<41 C2 3E 41> becomes <41 41>

Root Causes: Handling the Unexpected – Over-consumption

<img src="[0xC2]"> onerror=alert(1)<br />

becomes

<img src="> onerror=alert(1)<br />

Root Causes: Handling the Unexpected – Over-consumption
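For contrast, a decoder that does not over-consume replaces only the ill-formed lead byte and leaves the following byte alone. The sketch below uses Python 3's UTF-8 decoder with `errors="replace"` on the byte sequence from the earlier slide; the wrapper name is hypothetical.

```python
def decode_with_replacement(data):
    """Decode UTF-8, substituting U+FFFD for each ill-formed subsequence."""
    return data.decode("utf-8", errors="replace")

# 0xC2 is a lead byte with no continuation byte after it
raw = b"\x41\xc2\x3e\x41"
text = decode_with_replacement(raw)

# The '>' (0x3E) that followed the bad lead byte is still present
print([f"U+{ord(ch):04X}" for ch in text])
```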

Correcting insecurely rather than failing
– Substituting a '.' or a '/' would be bad

Root Causes: Handling the Unexpected – Character-substitution

"deletion of noncharacters" (UTR-36)

Root Causes: Handling the Unexpected – Character-deletion

<scr[U+FEFF]ipt> becomes <script>

Root Causes: Handling the Unexpected – Character-deletion

• Fail or error
• Use U+FFFD instead
  – A common alternative is '?', which can be safe

Root Causes: Solutions for Handling the Unexpected
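The difference between deleting an unexpected character and replacing it with U+FFFD is easy to demonstrate. In this sketch every function is hypothetical and `has_script_tag` is a toy check; it only illustrates why deletion can assemble a token that replacement keeps broken.

```python
REPLACEMENT = "\ufffd"  # U+FFFD REPLACEMENT CHARACTER

def delete_bom(text):
    """Unsafe clean-up: silently drop U+FEFF wherever it occurs."""
    return text.replace("\ufeff", "")

def replace_bom(text):
    """Safer clean-up: swap U+FEFF for U+FFFD so the surrounding text stays split."""
    return text.replace("\ufeff", REPLACEMENT)

def has_script_tag(text):
    """Toy filter check (assumption): look for a literal '<script'."""
    return "<script" in text.lower()

payload = "<scr\ufeffipt>"
print(has_script_tag(payload))               # what the filter sees
print(has_script_tag(delete_bom(payload)))   # what deletion produces
print(has_script_tag(replace_bom(payload)))  # what replacement produces
```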

• Bypass filters, WAFs, NIDS, and validation
• Exploit delivery techniques
  – E.g., cross-site scripting (the buffer overflow of the Web)

Attack Vectors: Filter evasion

Safari and Firefox BOM consumption
– Attack: Filter evasion, code execution
– Exploit: Bypass filtering logic with specially crafted strings to leverage cross-site scripting
– Root Cause: Character deletion

<a href="java[U+FEFF]script:alert('XSS')">

Can be nastier:

<a h[U+FEFF]ref="java[U+FEFF]script:al[U+FEFF]ert('XSS')">

Case Study: Apple and Mozilla

A Closer Look: The BOM

BOM U+FEFF

• Attackers manipulate casing operations to inject otherwise prohibited characters
• Casing can multiply the buffer sizes needed
• Impact: Filter evasion, enable code execution

Root Causes: Casing

toLower("İ") == "i"

toLower("scrİpt") == "script"

Root Causes: Casing

len(x) ≠ len(toLower(x))

Root Causes: Casing

• Perform casing operations before validation
• Leverage existing frameworks and APIs
  – ICU, .NET

Root Causes: Guidance for Casing
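Both casing points, fold before validating and expect lengths to change, can be sketched in Python. The function name and the forbidden keyword are hypothetical. Note also that the exact result of lowercasing U+0130 depends on the runtime: CPython 3 applies the full Unicode mapping (i followed by U+0307), whereas the slide shows a framework that returns a plain "i".

```python
def reject_script(text):
    """Case-fold first, then run the (toy) keyword check on the folded string."""
    folded = text.lower()
    if "script" in folded:
        raise ValueError("forbidden keyword")
    return folded

# Validation sees the string in the same case the application will use
try:
    reject_script("ScRiPt")
except ValueError as err:
    print(err)

# Casing can change the length of a string
dotted_i = "\u0130"       # LATIN CAPITAL LETTER I WITH DOT ABOVE
print(len(dotted_i), len(dotted_i.lower()))

iota = "\u0390"           # GREEK SMALL LETTER IOTA WITH DIALYTIKA AND TONOS
print(len(iota), len(iota.upper()))
```

Because the mapping differs between platforms, the check should run on the output of the same casing routine the application actually uses.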

• Incorrect assumptions about string sizes (chars vs. bytes)
• Improper width calculations
• Impact: Enable code execution

Root Causes: Buffer Overflows

Casing: maximum expansion factors

Operation   UTF         Factor   Sample
Lower       8           1.5      Ⱥ U+023A
Lower       16, 32      1        A U+0041
Upper       8, 16, 32   3        ΐ U+0390

Source: Unicode Technical Report 36

Root Causes: Buffer Overflows

Normalization: maximum expansion factors

Operation    UTF      Factor   Sample
NFC          8        3X       𝅘𝅥𝅮 U+1D160
NFC          16, 32   3X       שּׁ U+FB2C
NFD          8        3X       ΐ U+0390
NFD          16, 32   4X       ᾂ U+1F82
NFKC/NFKD    8        11X      ﷺ U+FDFA
NFKC/NFKD    16, 32   18X      ﷺ U+FDFA

Source: Unicode Technical Report 36

Root Causes: Buffer Overflows

• Know the difference between bytes and chars
• Secure coding
• Leverage existing frameworks and APIs
  – ICU, .NET

Root Causes: Guidance for Buffer Overflows
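The expansion factors in the tables above can be checked directly, which also makes the bytes-versus-characters distinction concrete. A minimal sketch, with a hypothetical helper name; the `-le` codec variants are used so that no byte order mark is counted.

```python
import unicodedata

def expansion(ch, form, encoding):
    """Ratio of encoded size after normalization to encoded size before."""
    before = len(ch.encode(encoding))
    after = len(unicodedata.normalize(form, ch).encode(encoding))
    return after / before

# U+FDFA: one code point that NFKC expands to an 18-code-point phrase
print(expansion("\ufdfa", "NFKC", "utf-8"))
print(expansion("\ufdfa", "NFKC", "utf-32-le"))

# U+0390 under NFD in UTF-8
print(expansion("\u0390", "NFD", "utf-8"))
```

A buffer sized from the input length, in either characters or bytes, is therefore not guaranteed to hold the normalized output.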

• White space and line breaks
  – E.g., when U+180E acts like U+0020
• Quotation marks
• Impact: Filter evasion, enable code execution

Root Causes: Controlling Syntax

• Manipulate HTML parsers and JavaScript interpreters
• Control protocols

Attacks and Exploits: Controlling syntax

• Unicode formatter characters exploited for XSS
  – Damage: Filter evasion, controlling syntax
  – Exploit: Bypass filtering logic with specially crafted characters to leverage cross-site scripting
  – Root Cause: Interpreting "white space"
  – A problem with HTML 4.0 spec

Case Study: Opera

<a href=[U+180E]onclick=alert()>

Case Study: Opera

MVS (Mongolian Vowel Separator) U+180E

Case Study: Opera

• Question specifications
• Be careful…

Root Causes: Guidance for Controlling Syntax
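One way to "be careful" is to treat every separator-like character outside a short allow list as suspicious, rather than trusting a parser's idea of white space. The sketch below is an assumption-laden illustration: the allow list and the chosen Unicode categories are this example's choices, not a rule taken from any specification.

```python
import unicodedata

# Assumed allow list: the ASCII white space a filter typically expects
ALLOWED_WHITESPACE = {" ", "\t", "\n", "\r", "\f"}

# Categories treated as separator-like: spaces, line/paragraph separators,
# format characters, and control characters
SUSPECT_CATEGORIES = {"Zs", "Zl", "Zp", "Cf", "Cc"}

def suspicious_separators(text):
    """Return characters that may act as white space but are not in the allow list."""
    return [
        ch for ch in text
        if ch not in ALLOWED_WHITESPACE
        and (ch.isspace() or unicodedata.category(ch) in SUSPECT_CATEGORIES)
    ]

# U+180E MONGOLIAN VOWEL SEPARATOR between an attribute value and a handler
print([f"U+{ord(ch):04X}" for ch in suspicious_separators("<a href=x\u180eonclick=alert()>")])
```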

1) Character stability
   – IDNA/Nameprep based on Unicode 3.2

2) Designs
   – Specs are carefully designed but not always perfect
   • This could have been a problem
     – "When designing a markup language or data protocol, the use of U+FEFF can be restricted to that of Byte Order Mark. In that case, any U+FEFF occurring in the middle of the file can be ignored, or treated as an error."
   – HTML 4.01
     • Defines four whitespace characters and explicitly leaves handling other characters up to the implementer

Root Causes: Specifications

• Converting between charsets is dangerous
• Mapping tables and algorithms vary across platforms
• Impact: Filter evasion, enable code execution, data loss

Root Causes: Charset Transformations

• Avoid if possible
• Use Unicode as the broker
• Beware the PUA mappings
• Transform case and normalize prior to validation and redisplay

Root Causes: Guidance for Charset Transformations

• Some charset identifiers are ill-defined
• Vendor implementations vary
• User-agents may sniff if confused
• Attackers manipulate behavior
• Impact: Filter evasion, enable code execution

Root Causes: Charset Mismatches

Content-Type: charset=ISO-8859-1

<meta http-equiv="Content-Type" content="text/html; charset=shift_jis">

Attacker-controlled input

Root Causes: Charset Mismatches

• Force UTF-8
• Error if uncertain

Root Causes: Guidance for Charset Mismatches
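Why a mismatch matters: the same bytes yield different characters, and different syntax, depending on which charset the consumer picks. The following sketch decodes one byte pair under three labels; the function name and the sample bytes are choices made for this illustration.

```python
def decode_views(data, charsets=("iso-8859-1", "shift_jis", "utf-8")):
    """Decode the same bytes under several charset labels for comparison."""
    return {name: data.decode(name, errors="replace") for name in charsets}

# In Shift_JIS, 0x83 0x5C is a single katakana letter.
# Read as ISO-8859-1, the second byte is a backslash.
raw = b"\x83\x5c"
for name, text in decode_views(raw).items():
    print(name, [f"U+{ord(ch):04X}" for ch in text])
```

A filter that decodes as one charset while the browser decodes as another is effectively inspecting a different string, which is the case for declaring a single encoding and rejecting input that does not decode cleanly under it.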

• Unicode crash course
• Root Causes
• Attack Vectors
• Tools

Unicode Transformations: Agenda

• Watcher
  – Passive Web-app security testing and auditing
• Unibomber
  – XSS autopwn testing tool

Tools

• Unicode transformation hot-spots
• User-controlled HTML
• Cross-domain issues
• Insecure cookies
• Insecure HTTP/HTTPS transitions
• SSL protocol and certificate issues
• XSS hot-spots
• Flash issues
• Silverlight issues
• Information disclosure

Tools: Watcher – Some of the Passive Checks Included

Tools

http://websecuritytool.codeplex.com

Tools: Watcher – Web-app Security Testing and Auditing

• Deterministic testing
• Auto-inject payloads
• Unicode transformers
  – < > ' " etc.
• Detect transformations and encoding hotspots

Tools: Unibomber – runtime XSS testing tool

Thank you

Casaba Security
www.casabasecurity.com

Chris Weber
Blog: www.lookout.net
Email: chris@casabasecurity.com
LinkedIn: http://www.linkedin.com/in/chrisweber

Page 22: Unraveling Unicode: A Bag of Tricks for Bug Hunting...Unicode Crash Course code points encodings categorization normalization binary properties case mapping conversion tables bi-directional

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Over 100000 assigned characters

bull Many lookalikes within and across scripts

AΑАᐱᗅᗋᗩᴀᴬA

wwwcasabasecuritycom

Root CausesVisual Spoofing

Black Hat USA - July 2009 copy 2009 Chris Weber

Some browsers allow COM IDNrsquos

based on script family

ndash (Latin has a big family)

wwwcasabasecuritycom

Attack VectorsIDN homograph attacks

Black Hat USA - July 2009 copy 2009 Chris Weber

Safari

wwwcasabasecuritycom

Attack VectorsIDN homograph attacks

Black Hat USA - July 2009 copy 2009 Chris Weber

Opera

wwwcasabasecuritycom

Attack VectorsIDN homograph attacks

Black Hat USA - July 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsIDN homograph attacks

wwwgooglecom is not wwwgooɡlecom

Latin U+0069

LatinU+0261

Black Hat USA - July 2009 copy 2009 Chris Weber

ICANN guidelines v20

ndash Inclusion-based

ndash Script limitations

ndash Character limitations

wwwcasabasecuritycom

Root CausesThe state of International Domain Names

Deny-all default seems to be the right concept

A script can cross many blocks Even with limited script choices therersquos plenty to choose from

Great for domain labels but sub domain labels still open to punctuation and syntax spoofing

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Non-Unicode attacks

bull Confusables

bull Invisibles

bull Problematic font-rendering

bull Manipulating Combining Marks

bull Bidi and syntax spoofing

wwwcasabasecuritycom

Attack VectorsVisual spoofing Vectors

Black Hat USA - July 2009 copy 2009 Chris Weber

rn can look like m in certain fonts

wwwcasabasecuritycom

Attack VectorsNon-Unicode homograph attacks

wwwmulletscom is not wwwrnulletscom

Latin U+006D

LatinU+0073 U+006E

Black Hat USA - July 2009 copy 2009 Chris Weber

Are you using mono-width fonts

0 and O

1 and l

5 and S

wwwcasabasecuritycom

Attack VectorsNon-Unicode homograph attacks

Black Hat USA - July 2009 copy 2009 Chris Weber

Classic long URLrsquos

httploginfacebookintvitationvideomessageid-

h048892r39sessionnfbidcomhomehtmdisbursements

wwwcasabasecuritycom

Attack VectorsNon-Unicode homograph attacks

Black Hat USA - July 2009 copy 2009 Chris Weber

wwwɑpplecom

All Latin using Latin small letter Alpha lsquoɑrsquo

wwwfaϲebookcom

Mixed LatinGreek with lunate sigma symbol lsquocrsquo

wwwаЬсcom

All Cyrillic lsquoabcrsquo

wwwcasabasecuritycom

Attack VectorsSingle-script and The Confusables

Attack Vectors: IDN homograph attacks

Browsers whitelist .ORG

Others don't necessarily, but…

www.mozilla.org is not www.mozílla.org

i: Latin U+0069
í: Latin U+00ED

Attack Vectors: IDN Syntax Spoofing with lookalikes

http://www.google.com／path／file.nottrusted.org
FULLWIDTH SOLIDUS U+FF0F
(This case doesn't work anymore)

http://www.google.com/path/file.nottrusted.org
SOLIDUS U+002F
(Normalized to a U+002F)

http://www.google.comノpathノfile.nottrusted.org
Katakana No U+FF89
(However, punctuation not required…)
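The difference between the fullwidth solidus and the Katakana lookalike can be checked mechanically. The following is a minimal sketch, assuming a hypothetical helper name and an illustrative set of URL syntax characters: it reports characters in a host string that turn into URL syntax under NFKC.

```python
import unicodedata

# Characters treated as URL syntax for this illustration (an assumption, not a full list).
URL_SYNTAX = set("/?#@:\\")

def hidden_syntax_chars(host):
    """Return characters that only become URL syntax after NFKC normalization."""
    found = []
    for ch in host:
        if ch in URL_SYNTAX:
            continue  # already visible syntax, not a lookalike
        folded = unicodedata.normalize("NFKC", ch)
        if any(c in URL_SYNTAX for c in folded):
            found.append(ch)
    return found

print(hidden_syntax_chars("www.google.com\uff0fpath\uff0ffile.nottrusted.org"))
print(hidden_syntax_chars("www.google.com\uff89path\uff89file.nottrusted.org"))
```

The first call reports both U+FF0F characters. The second reports nothing, because U+FF89 normalizes to another Katakana letter rather than to punctuation, which is why normalization alone does not stop that lookalike.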

Attack Vectors: The Invisibles

Attack Vectors: Visual Spoofing with Bidi Explicit Directional Overrides

Root Causes: Best-fit mappings

Commonly occur in charset transformations and even innocuous APIs.

Impact: Filter evasion; enable code execution

When σ becomes s
U+03C3 GREEK SMALL LETTER SIGMA

When ′ becomes '
U+2032 PRIME

Root Causes: Guidance for Best-Fit mappings

• Scrutinize character/charset manipulation APIs
• Use EncoderFallback with System.Text.Encoding
• Set the WC_NO_BEST_FIT_CHARS flag with WideCharToMultiByte()
• Use Unicode end-to-end
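The guidance names .NET and Win32 APIs. The same idea, failing instead of silently substituting a lookalike, can be sketched in Python. This is an analogous illustration, not the APIs from the slide: `encode_strict` is a hypothetical helper, and it relies on Python's table-driven codecs (such as cp1252) raising on unmappable characters in strict mode.

```python
def encode_strict(text, charset):
    """Encode text to a legacy charset, failing rather than substituting."""
    try:
        return text.encode(charset, errors="strict")
    except UnicodeEncodeError as exc:
        # Report the first character that has no exact mapping.
        raise ValueError(
            "U+%04X has no exact mapping in %s" % (ord(text[exc.start]), charset)
        ) from exc

print(encode_strict("abc", "cp1252"))
try:
    encode_strict("-\uff4doz-binding()", "cp1252")  # FULLWIDTH LATIN SMALL LETTER M
except ValueError as err:
    print("rejected:", err)
```

Rejecting the input keeps a filter and the downstream consumer looking at the same characters, which is the point of disabling best-fit behavior.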

Case Study: Social Networking (Best-fit mappings)

• A popular social networking site in 2008
• Implemented complex filtering logic to prevent XSS
  – Attack: Filter evasion, code execution
  – Exploit: Bypass filtering logic with best-fit mappings to leverage cross-site scripting
  – Root cause: Best-fit mappings

-moz-binding()

was not allowed, but…

-[U+FF4D]oz-binding()

would best-fit map.

Root Causes: Normalization

Normalizing strings after validation is dangerous.

Impact: Filter evasion; enable code execution

İ becomes I + ◌̇
U+0130 → U+0049 U+0307

But are there dangerous characters?

You bet… with NFKC and NFKD you could control HTML or other parsing.

﹤ becomes <
U+FE64 → U+003C

toNFKC("﹤script>") = "<script>"

Root Causes: Guidance for Normalization

Normalize strings before validation.

NFKC: first defense against visual spoofing
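The ordering matters more than the filter itself. A minimal sketch, assuming a deliberately tiny blocklist and a hypothetical `validate` helper, shows normalization happening first so the filter inspects the same characters the parser will later see:

```python
import unicodedata

# Illustrative blocklist only; a real filter would be context-specific.
FORBIDDEN = ("<", ">")

def validate(text):
    """Normalize to NFKC first, then reject markup characters."""
    normalized = unicodedata.normalize("NFKC", text)
    if any(ch in normalized for ch in FORBIDDEN):
        raise ValueError("markup characters present after normalization")
    return normalized

print(unicodedata.normalize("NFKC", "\ufe64script>"))  # U+FE64 folds to U+003C
try:
    validate("\ufe64script\ufe65")
except ValueError as err:
    print("rejected:", err)
```

A filter that ran before normalization would see only U+FE64 and U+FE65 and let the string through.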

Root Causes: Non-shortest form UTF-8

Non-shortest or overlong UTF-8

Impact: Filter evasion; enable code execution

Application gets C0 A7
OS/Framework sees 27
Database gets '

Root Causes: Guidance for Non-shortest form UTF-8

• Unicode specification forbids
  – Generation of non-shortest form
  – Interpretation of non-shortest form for BMP
• Validate UTF-8 encoding (throw on error)
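"Throw on error" can be as simple as decoding in strict mode at the trust boundary. The sketch below uses Python's built-in UTF-8 decoder, which rejects overlong forms and lone surrogates; the wrapper name is hypothetical.

```python
def decode_utf8_strict(raw):
    """Decode bytes as UTF-8, raising on any ill-formed sequence."""
    return raw.decode("utf-8", errors="strict")

print(decode_utf8_strict(b"\x27"))  # the shortest form of U+0027 is accepted
for bad in (b"\xc0\xa7", b"\xed\xa0\x80"):  # overlong apostrophe, half a surrogate pair
    try:
        decode_utf8_strict(bad)
    except UnicodeDecodeError as err:
        print("rejected:", bad, err.reason)
```

Validating once, up front, means every later layer receives text that has exactly one interpretation.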

Attack Vectors: Directory traversal

How many ways can you say

Attack Vectors

Root Causes: Handling the Unexpected

• Unassigned code points
  – U+2073
• Illegal code points
  – Half a surrogate pair
• Code points with special meaning
  – U+FEFF is the BOM
• Impact: Filter evasion; enable code execution

Root Causes: Handling the Unexpected (Over-consumption)

Over-consuming ill-formed byte sequences

Big problem with MBCS lead bytes

<41 C2 3E 41> becomes <41 41>

<img src=[0xC2]> onerror=alert(1)<br />

becomes

<img src=> onerror=alert(1)<br />
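A decoder that behaves well replaces only the ill-formed lead byte and keeps the byte that follows it. The snippet below is a small illustration using Python's UTF-8 decoder in "replace" mode on the byte sequence from the example; the variable names are arbitrary.

```python
# 0xC2 is a UTF-8 lead byte, but 0x3E ('>') is not a valid continuation byte.
raw = bytes([0x41, 0xC2, 0x3E, 0x41])

decoded = raw.decode("utf-8", errors="replace")

# Only the stray lead byte is replaced with U+FFFD; the '>' survives.
print([hex(ord(ch)) for ch in decoded])
print(">" in decoded)
```

An over-consuming decoder would instead swallow the 0x3E along with the lead byte, which is how a delimiter such as a quote or angle bracket can disappear after filtering.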

Root Causes: Handling the Unexpected (Character-substitution)

Correcting insecurely rather than failing

– Substituting a '' or a '' would be bad

Root Causes: Handling the Unexpected (Character-deletion)

"deletion of noncharacters" (UTR-36)

<scr[U+FEFF]ipt> becomes <script>

Root Causes: Solutions for Handling the Unexpected

• Fail or error
• Use U+FFFD instead
  – A common alternative is '' which can be safe
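Replacing rather than deleting keeps the surrounding characters from joining up into something a filter would have blocked. The following is a minimal sketch with hypothetical helper names; it assumes any legitimate leading BOM signature has already been stripped by the caller, so every remaining U+FEFF is treated as unexpected.

```python
def is_noncharacter(cp):
    """True for U+FDD0..U+FDEF and for the last two code points of each plane."""
    return 0xFDD0 <= cp <= 0xFDEF or (cp & 0xFFFE) == 0xFFFE

def neutralize(text):
    """Replace unexpected code points with U+FFFD instead of deleting them."""
    out = []
    for ch in text:
        cp = ord(ch)
        if ch == "\ufeff" or is_noncharacter(cp) or 0xD800 <= cp <= 0xDFFF:
            out.append("\ufffd")
        else:
            out.append(ch)
    return "".join(out)

cleaned = neutralize("<scr\ufeffipt>")
print("script" in cleaned)  # the tag name never re-forms
```

Raising an error instead of returning a cleaned string is the stricter option and matches the first bullet.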

Attack Vectors: Filter evasion

• Bypass filters, WAFs, NIDS, and validation
• Exploit delivery techniques
  – E.g. cross-site scripting (buffer overflow of the Web)

Case Study: Apple and Mozilla

Safari and Firefox BOM consumption
– Attack: Filter evasion, code execution
– Exploit: Bypass filtering logic with specially crafted strings to leverage cross-site scripting
– Root cause: Character deletion

<a href="java[U+FEFF]script:alert('XSS')>

Can be nastier:

<a h[U+FEFF]ref="java[U+FEFF]script:al[U+FEFF]ert('XSS')>

A Closer Look: The BOM

BOM: U+FEFF

Root Causes: Casing

• Attackers manipulate casing operations to inject otherwise prohibited characters
• Casing can multiply the buffer sizes needed
• Impact: Filter evasion; enable code execution

toLower("İ") == "i"

toLower("scrİpt") == "script"

len(x) != len(toLower(x))

Root Causes: Guidance for Casing

• Perform casing operations before validation
• Leverage existing frameworks and APIs
  – ICU, .NET
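Both points, case first and never assume the length is unchanged, can be seen in a few lines. The sketch below uses Python's built-in full case mappings and a hypothetical `contains_script_tag` check. The exact result for U+0130 depends on the framework: Python lowercases it to "i" followed by U+0307, not to the bare "i" shown in the toLower example.

```python
def contains_script_tag(text):
    """Case-fold before checking, so the filter sees what later layers see."""
    return "<script" in text.casefold()

print(contains_script_tag("<SCRIPT>alert(1)</SCRIPT>"))

# Casing can change the length of a string.
print(len("\u0130"), len("\u0130".lower()))  # LATIN CAPITAL LETTER I WITH DOT ABOVE
print(len("\u0390"), len("\u0390".upper()))  # GREEK SMALL LETTER IOTA WITH DIALYTIKA AND TONOS
print("\u00df".upper())                      # LATIN SMALL LETTER SHARP S
```

Any buffer sized from the pre-casing length is therefore a candidate for overflow or truncation.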

Root Causes: Buffer Overflows

• Incorrect assumptions about string sizes (chars vs bytes)
• Improper width calculations
• Impact: Enable code execution

Casing: maximum expansion factors

Operation  UTF        Factor  Sample
Lower      8          1.5X    Ⱥ U+023A
           16, 32     1X      A U+0041
Upper      8, 16, 32  3X      ΐ U+0390

Source: Unicode Technical Report #36

Normalization: maximum expansion factors

Operation  UTF     Factor  Sample
NFC        8       3X      𝅘𝅥𝅮 U+1D160
           16, 32  3X      שּׁ U+FB2C
NFD        8       3X      ΐ U+0390
           16, 32  4X      ᾂ U+1F82
NFKC/NFKD  8       11X     ﷺ U+FDFA
           16, 32  18X     ﷺ U+FDFA

Source: Unicode Technical Report #36

Root Causes: Guidance for Buffer Overflows

• Know the difference between bytes and chars
• Secure coding
• Leverage existing frameworks and APIs
  – ICU, .NET
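The expansion factors from UTR #36 are easy to reproduce, which is a useful sanity check when sizing buffers. The helper below is a hypothetical sketch that compares encoded byte lengths before and after normalization:

```python
import unicodedata

def expansion_factor(text, form, encoding):
    """Ratio of encoded size after normalization to encoded size before."""
    before = len(text.encode(encoding))
    after = len(unicodedata.normalize(form, text).encode(encoding))
    return after / before

# U+FDFA is the worst case for NFKC/NFKD.
print(expansion_factor("\ufdfa", "NFKC", "utf-8"))      # bytes in UTF-8
print(expansion_factor("\ufdfa", "NFKC", "utf-16-le"))  # code units in UTF-16
print(expansion_factor("\u0390", "NFD", "utf-8"))
```

The same measurement works for casing by swapping the normalization call for `str.upper()` or `str.lower()`.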

Root Causes: Controlling Syntax

• White space and line breaks
  – E.g. when U+180E acts like U+0020
• Quotation marks
• Impact: Filter evasion; enable code execution

Attacks and Exploits: Controlling syntax

• Manipulate HTML parsers and JavaScript interpreters
• Control protocols

Case Study: Opera

• Unicode formatter characters exploited for XSS
  – Damage: Filter evasion, controlling syntax
  – Exploit: Bypass filtering logic with specially crafted characters to leverage cross-site scripting
  – Root cause: Interpreting "white space"
  – A problem with the HTML 4.0 spec

<a href=[U+180E]onclick=alert()>

MVS: U+180E

Root Causes: Guidance for Controlling Syntax

• Question specifications
• Be careful…
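One concrete way to "be careful" is to allow only the whitespace you expect and flag every other separator, control, or format character before the input reaches a parser. The sketch below is an illustration under stated assumptions: the allowed set and the category list are choices made for this example, not a rule taken from any specification.

```python
import unicodedata

# Whitespace this example chooses to accept (an assumption for illustration).
ALLOWED_WHITESPACE = {" ", "\t", "\n", "\r", "\f"}

# General categories that may be treated as separators or ignored by some parsers.
SUSPECT_CATEGORIES = {"Zs", "Zl", "Zp", "Cc", "Cf"}

def unexpected_separators(text):
    """Return separator-like characters outside the allowed whitespace set."""
    return [
        ch for ch in text
        if ch not in ALLOWED_WHITESPACE
        and unicodedata.category(ch) in SUSPECT_CATEGORIES
    ]

hits = unexpected_separators("<a href=\u180eonclick=alert()>")
print(["U+%04X" % ord(ch) for ch in hits])
```

Checking both the separator and format categories keeps the result stable across Unicode versions, since characters such as U+180E have been reclassified over time.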

Root Causes: Specifications

1) Character stability
   – IDNA/Nameprep based on Unicode 3.2
2) Designs
   – Specs are carefully designed but not always perfect
     • This could have been a problem
       – "When designing a markup language or data protocol, the use of U+FEFF can be restricted to that of Byte Order Mark. In that case, any U+FEFF occurring in the middle of the file can be ignored, or treated as an error."
   – HTML 4.01
     • Defines four whitespace characters and explicitly leaves handling other characters up to the implementer

Root Causes: Charset Transformations

• Converting between charsets is dangerous
• Mapping tables and algorithms vary across platforms
• Impact: Filter evasion; enable code execution; data loss

Root Causes: Guidance for Charset Transformations

• Avoid if possible
• Use Unicode as the broker
• Beware the PUA mappings
• Transform case and normalize prior to validation and redisplay

Root Causes: Charset Mismatches

• Some charset identifiers are ill-defined
• Vendor implementations vary
• User-agents may sniff if confused
• Attackers manipulate behavior
• Impact: Filter evasion; enable code execution

Content-Type: charset=ISO-8859-1

<meta http-equiv="Content-Type" content="text/html; charset=shift_jis">

Attacker-controlled input

Root Causes: Guidance for Charset Mismatches

• Force UTF-8
• Error if uncertain
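"Force UTF-8, error if uncertain" can be enforced where a Content-Type value is received. The following is a minimal sketch with a hypothetical helper name; it uses the standard library's header parsing and deliberately treats a missing charset as an error instead of guessing.

```python
from email.message import Message

def require_utf8(content_type):
    """Accept a Content-Type value only if it explicitly declares UTF-8."""
    msg = Message()
    msg["Content-Type"] = content_type
    charset = msg.get_content_charset()  # lower-cased, or None when absent
    if charset != "utf-8":
        raise ValueError("refusing charset %r; expected utf-8" % (charset,))
    return charset

print(require_utf8("text/html; charset=UTF-8"))
for value in ("text/html; charset=ISO-8859-1", "text/html"):
    try:
        require_utf8(value)
    except ValueError as err:
        print("rejected:", err)
```

On the sending side, the matching step is to always emit an explicit `charset=utf-8` so that user agents have no reason to sniff.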

Unicode Transformations: Agenda

• Unicode crash course
• Root Causes
• Attack Vectors
• Tools

Tools

• Watcher
  – Passive Web-app security testing and auditing
• Unibomber
  – XSS autopwn testing tool

Tools: Watcher – Some of the Passive Checks Included

• Unicode transformation hot-spots
• User-controlled HTML
• Cross-domain issues
• Insecure cookies
• Insecure HTTP/HTTPS transitions
• SSL protocol and certificate issues
• XSS hot-spots
• Flash issues
• Silverlight issues
• Information disclosure

Tools: Watcher – Web-app Security Testing and Auditing

http://websecuritytool.codeplex.com

Tools: Unibomber – runtime XSS testing tool

• Deterministic testing
• Auto-inject payloads
• Unicode transformers
  – < > ' " etc.
• Detect transformations and encoding hotspots

Thank you

Casaba Security
www.casabasecurity.com

Chris Weber
Blog: www.lookout.net
Email: chris@casabasecurity.com
LinkedIn: http://www.linkedin.com/in/chrisweber

ndash Inclusion-based

ndash Script limitations

ndash Character limitations

wwwcasabasecuritycom

Root CausesThe state of International Domain Names

Deny-all default seems to be the right concept

A script can cross many blocks Even with limited script choices therersquos plenty to choose from

Great for domain labels but sub domain labels still open to punctuation and syntax spoofing

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Non-Unicode attacks

bull Confusables

bull Invisibles

bull Problematic font-rendering

bull Manipulating Combining Marks

bull Bidi and syntax spoofing

wwwcasabasecuritycom

Attack VectorsVisual spoofing Vectors

Black Hat USA - July 2009 copy 2009 Chris Weber

rn can look like m in certain fonts

wwwcasabasecuritycom

Attack VectorsNon-Unicode homograph attacks

wwwmulletscom is not wwwrnulletscom

Latin U+006D

LatinU+0073 U+006E

Black Hat USA - July 2009 copy 2009 Chris Weber

Are you using mono-width fonts

0 and O

1 and l

5 and S

wwwcasabasecuritycom

Attack VectorsNon-Unicode homograph attacks

Black Hat USA - July 2009 copy 2009 Chris Weber

Classic long URLrsquos

httploginfacebookintvitationvideomessageid-

h048892r39sessionnfbidcomhomehtmdisbursements

wwwcasabasecuritycom

Attack VectorsNon-Unicode homograph attacks

Black Hat USA - July 2009 copy 2009 Chris Weber

wwwɑpplecom

All Latin using Latin small letter Alpha lsquoɑrsquo

wwwfaϲebookcom

Mixed LatinGreek with lunate sigma symbol lsquocrsquo

wwwаЬсcom

All Cyrillic lsquoabcrsquo

wwwcasabasecuritycom

Attack VectorsSingle-script and The Confusables

Black Hat USA - July 2009 copy 2009 Chris Weber

Browsers whitelist ORG

wwwcasabasecuritycom

Attack VectorsIDN homograph attacks

Black Hat USA - July 2009 copy 2009 Chris Weber

Others donrsquot necessarily buthellip

wwwcasabasecuritycom

Attack VectorsIDN homograph attacks

Black Hat USA - July 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsIDN homograph attacks

wwwmozillaorg is not wwwmoziacutellaorg

Latin U+0069

LatinU+00ED

Black Hat USA - July 2009 copy 2009 Chris Weber

(This case doesnrsquot work anymore)

wwwcasabasecuritycom

Attack VectorsIDN Syntax Spoofing with lookalikes

httpwwwgooglecompathfilenottrustedorg

FULLWIDTH SOLIDUSU+FF0F

Black Hat USA - July 2009 copy 2009 Chris Weber

(Normalized to a U+002F)

wwwcasabasecuritycom

Attack VectorsIDN Syntax Spoofing with lookalikes

httpwwwgooglecompathfilenottrustedorg

SOLIDUSU+002F

Black Hat USA - July 2009 copy 2009 Chris Weber

(However punctuation not requiredhellip)

wwwcasabasecuritycom

Attack VectorsIDN Syntax Spoofing with lookalikes

httpwwwgooglecomノpathノfilenottrustedorg

Katakana NoU+FF89

Black Hat USA - July 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsThe Invisibles

Black Hat USA - July 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsVisual Spoofing with Bidi Explicit Directional Overrides

Black Hat USA - July 2009 copy 2009 Chris Weber

Commonly occur in charset transformations and even innocuous APIrsquos

Impact Filter evasion Enable code execution

When σ becomes s

U+03C3 GREEK SMALL LETTER SIGMA

When prime becomes

U+2032 PRIME

wwwcasabasecuritycom

Root CausesBest-fit mappings

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Scrutinize charactercharset manipulation APIrsquos

bull Use EncoderFallback with SystemTextEncoding

bull Set WC_NO_BEST_FIT_CHARS flag with WideCharToMultiByte()

bull Use Unicode end-to-end

wwwcasabasecuritycom

Root CausesGuidance for Best-Fit mappings

Black Hat USA - July 2009 copy 2009 Chris Weber

bull A popular social networking site in 2008

bull Implemented complex filtering logic to prevent XSS

ndash Attack Filter evasion code execution

ndash Exploit Bypass filtering logic with best-fit mappings to leverage cross-site scripting

ndash Root Cause best-fit mappings

wwwcasabasecuritycom

Case Study Social NetworkingBest-fit mappings

Black Hat USA - July 2009 copy 2009 Chris Weber

-moz-binding()

was not allowed buthellip

-[U+ff4d]oz-binding()

would best-fit map

wwwcasabasecuritycom

Case Study Social NetworkingBest-fit mappings

Black Hat USA - July 2009 copy 2009 Chris Weber

Normalizing strings after validation is dangerous

Impact Filter evasion Enable code execution

wwwcasabasecuritycom

Root CausesNormalization

Black Hat USA - July 2009 copy 2009 Chris Weber

İ becomes I +

wwwcasabasecuritycom

Root CausesNormalization

U+0130 U+0049 U+0307

Black Hat USA - July 2009 copy 2009 Chris Weber

But are there dangerous characters

You bethellip with NFKC and NFKD you could control HTML or other parsing

﹤ becomes lt

wwwcasabasecuritycom

Root CausesNormalization

U+FE64 U+003C

Black Hat USA - July 2009 copy 2009 Chris Weber

﹤ becomes lt

toNFKC(ldquo﹤scriptgtrdquo) = ldquoltscriptgtrdquo

wwwcasabasecuritycom

Root CausesNormalization

U+FE64 U+003C

Black Hat USA - July 2009 copy 2009 Chris Weber

Normalize strings before validation

NFKC first defense against Visual spoofing

wwwcasabasecuritycom

Root CausesGuidance for Normalization

Black Hat USA - July 2009 copy 2009 Chris Weber

Non-shortest or overlong UTF-8

Impact Filter evasion Enable code execution

Application gets C0A7

OSFramework sees 27

Database gets

wwwcasabasecuritycom

Root CausesNon-shortest form UTF-8

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Unicode specification forbids

ndash Generation of non-shortest form

ndash Interpretation of non-shortest form for BMP

bull Validate UTF-8 encoding (throw on error)

wwwcasabasecuritycom

Root CausesGuidance for Non-shortest form UTF-8

Black Hat USA - July 2009 copy 2009 Chris Weber

How many ways can you say

wwwcasabasecuritycom

Attack VectorsDirectory traversal

Black Hat USA - July 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack Vectors

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Unassigned code points

ndash U+2073

bull Illegal code points

ndash Half a surrogate pair

bull Code points with special meaning

ndash U+FEFF is the BOM

bull Impact Filter evasion Enable code execution

wwwcasabasecuritycom

Root CausesHandling the Unexpected

Black Hat USA - July 2009 copy 2009 Chris Weber

Over-consuming ill-formed byte sequences

Big problem with MBCS lead bytes

lt41 C2 3E 41gt becomes

lt41 41gt

wwwcasabasecuritycom

Root CausesHandling the Unexpected Over-consumption

Black Hat USA - July 2009 copy 2009 Chris Weber

ltimg src=[0xC2]gt onerror=alert(1)ltbr gt

becomes

ltimg src=gt onerror=alert(1)ltbr gt

wwwcasabasecuritycom

Root CausesHandling the Unexpected Over-consumption

Black Hat USA - July 2009 copy 2009 Chris Weber

Correcting insecurely rather than failing

ndash Substituting a lsquorsquo or a lsquorsquo would be bad

wwwcasabasecuritycom

Root CausesHandling the Unexpected Character-substitution

Black Hat USA - July 2009 copy 2009 Chris Weber

ldquodeletion of noncharactersrdquo (UTR-36)

wwwcasabasecuritycom

Root CausesHandling the Unexpected Character-deletion

Black Hat USA - July 2009 copy 2009 Chris Weber

ltscr[U+FEFF]iptgt becomes ltscriptgt

wwwcasabasecuritycom

Root CausesHandling the Unexpected Character-deletion

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Fail or error

bull Use U+FFFD instead

ndash A common alternative is lsquorsquo which can be safe

wwwcasabasecuritycom

Root CausesSolutions for Handling the Unexpected

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Bypass filters WAFrsquos NIDS and validation

bull Exploit delivery techniques

ndash Eg Cross-site scripting (buffer overflow of the Web)

wwwcasabasecuritycom

Attack VectorsFilter evasion

Black Hat USA - July 2009 copy 2009 Chris Weber

Safari and Firefox BOM consumptionndash Attack Filter evasion code execution

ndash Exploit Bypass filtering logic with specially crafted strings to leverage cross-site scripting

ndash Root Cause Character deletion

lta href=ldquojava[U+FEFF]scriptalert(bdquoXSS‟)gt

Can be nastier

lta h[U+FEFF]ref=ldquojava[U+FEFF]scriptal[U+FEFF]ert(bdquoXSS‟)gt

wwwcasabasecuritycom

Case Study Apple and Mozilla

Black Hat USA - July 2009 copy 2009 Chris Weberwwwcasabasecuritycom

A Closer Look The BOM

BOMU+FEFF

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Attackers manipulate casing operations to inject otherwise prohibited characters

bull Casing can multiply the buffer sizes needed

bull Impact Filter evasion Enable code execution

wwwcasabasecuritycom

Root CausesCasing

Black Hat USA - July 2009 copy 2009 Chris Weber

toLower(ldquoİrdquo) == ldquoirdquo

toLower(ldquoscrİptrdquo) == ldquoscriptrdquo

wwwcasabasecuritycom

Root CausesCasing

Black Hat USA - July 2009 copy 2009 Chris Weber

len(x) = len(toLower(x))

wwwcasabasecuritycom

Root CausesCasing

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Perform casing operations before validation

bull Leverage existing frameworks and APIrsquos

ndash ICU Net

wwwcasabasecuritycom

Root CausesGuidance for Casing

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Incorrect assumptions about string sizes (chars vs bytes)

bull Improper width calculations

bull Impact Enable code execution

wwwcasabasecuritycom

Root CausesBuffer Overflows

Black Hat USA - July 2009 copy 2009 Chris Weber

Casing - maximum expansion factors

wwwcasabasecuritycom

Root CausesBuffer Overflows

Operation UTF Factor Sample

Lower 8 15 Ⱥ U+023A

16 32 1 A U+0041

Upper 8 16 32 3 ΐ U+0390Source Unicode Technical Report 36

Black Hat USA - July 2009 copy 2009 Chris Weber

Normalization- maximum expansion factors

wwwcasabasecuritycom

Root CausesBuffer Overflows

Operation UTF Factor Sample

NFC8 3X 119136 U+1D160

16 32 3X ש U+FB2C

NFD8 3X ΐ U+0390

16 32 4X ᾂ U+1F82

NFKCNFKD8 11X

ملسو هيلع هللا ىلص U+FDFA16 32 18X

Source Unicode Technical Report 36

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Know the difference between bytes and chars

bull Secure coding

bull Leverage existing frameworks and APIrsquos

ndash ICU Net

wwwcasabasecuritycom

Root CausesGuidance for Buffer Overflows

Black Hat USA - July 2009 copy 2009 Chris Weber

bull White space and line breaks

ndash Eg when U+180E acts like U+0020

bull Quotation marks

bull Impact Filter evasion Enable code execution

wwwcasabasecuritycom

Root CausesControlling Syntax

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Manipulate HTML parsers and javascriptinterpreters

bull Control protocols

wwwcasabasecuritycom

Attacks and ExploitsControlling syntax

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Unicode formatter characters exploited for XSS

ndash Damage Filter evasion controlling syntax

ndash Exploit Bypass filtering logic with specially crafted characters to leverage cross-site scripting

ndash Root Cause Interpreting ldquowhite spacerdquo

ndash A problem with HTML 40 spec

wwwcasabasecuritycom

Case Study Opera

Black Hat USA - July 2009 copy 2009 Chris Weber

lta href=[U+180E]onclick=alert()gt

wwwcasabasecuritycom

Case Study Opera

Black Hat USA - July 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Case Study Opera

MVSU+180E

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Question specifications

bull Be carefulhellip

wwwcasabasecuritycom

Root CausesGuidance for Controlling Syntax

Black Hat USA - July 2009 copy 2009 Chris Weber

1) Character stabilityndash IDNANameprep based on Unicode 32

2) Designsndash Specs are carefully designed but not always perfect

bull This could have been a problemndash ldquoWhen designing a markup language or data protocol the use of

U+FEFF can be restricted to that of Byte Order Mark In that case any U+FEFF occurring in the middle of the file can be ignored or treated as an error rdquo

ndash HTML 401 bull Defines four whitespace characters and explicitly leaves

handling other characters up to implementer

wwwcasabasecuritycom

Root CausesSpecifications

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Converting between charsets is dangerous

bull Mapping tables and algorithms vary across platforms

bull Impact Filter evasion Enable code execution Data-loss

wwwcasabasecuritycom

Root CausesCharset Transformations

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Avoid if possible

bull Use Unicode as the broker

bull Beware the PUA mappings

bull Transform case and normalize prior to validation and redisplay

wwwcasabasecuritycom

Root CausesGuidance for Charset Transformations

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Some charset identifiers are ill-defined

bull Vendor implementations vary

bull User-agents may sniff if confused

bull Attackers manipulate behavior

bull Impact Filter evasion Enable code execution

wwwcasabasecuritycom

Root CausesCharset Mismatches

Black Hat USA - July 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Root CausesCharset Mismatches

Content-Type charset=ISO-8859-1

ltmeta http-equiv=Content-Type content=texthtml charset=shift_jisgt

Attacker-controlled input

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Force UTF-8

bull Error if uncertain

wwwcasabasecuritycom

Root CausesGuidance for Charset Mismatches

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Unicode crash course

bull Root Causes

bull Attack Vectors

bull Tools

wwwcasabasecuritycom

Unicode TransformationsAgenda

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Unicode crash course

bull Root Causes

bull Attack Vectors

bull Tools

wwwcasabasecuritycom

Unicode TransformationsAgenda

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Watcher

ndash Passive Web-app security testing and auditing

bull Unibomber

ndash XSS autopwn testing tool

wwwcasabasecuritycom

Tools

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Unicode transformation hot-spotsbull User-controlled HTMLbull Cross-domain issuesbull Insecure cookiesbull Insecure HTTPHTTPS transitionsbull SSL protocol and certificate issuesbull XSS hot-spotsbull Flash issuesbull Silverlight issuesbull Information disclosure

wwwcasabasecuritycom

ToolsWatcher ndash Some of the Passive Checks Included

Black Hat USA - July 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Tools

Black Hat USA - July 2009 copy 2009 Chris Weber

httpwebsecuritytoolcodeplexcom

wwwcasabasecuritycom

ToolsWatcher - Web-app Security Testing and Auditing

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Deterministic testing

bull Auto-inject payloads

bull Unicode transformers

ndash lt gt lsquo ldquo etc

bull Detect transformations and encoding hotspots

wwwcasabasecuritycom

ToolsUnibomberndash runtime XSS testing tool

Thank you

Casaba Security

wwwcasabasecuritycom

Chris Weber

Blog wwwlookoutnet

Email chriscasabasecuritycom

LinkedIn httpwwwlinkedincominchrisweber

Page 24: Unraveling Unicode: A Bag of Tricks for Bug Hunting...Unicode Crash Course code points encodings categorization normalization binary properties case mapping conversion tables bi-directional

Black Hat USA - July 2009 copy 2009 Chris Weber

Safari

wwwcasabasecuritycom

Attack VectorsIDN homograph attacks

Black Hat USA - July 2009 copy 2009 Chris Weber

Opera

wwwcasabasecuritycom

Attack VectorsIDN homograph attacks

Black Hat USA - July 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsIDN homograph attacks

wwwgooglecom is not wwwgooɡlecom

Latin U+0069

LatinU+0261

Black Hat USA - July 2009 copy 2009 Chris Weber

ICANN guidelines v20

ndash Inclusion-based

ndash Script limitations

ndash Character limitations

wwwcasabasecuritycom

Root CausesThe state of International Domain Names

Deny-all default seems to be the right concept

A script can cross many blocks Even with limited script choices therersquos plenty to choose from

Great for domain labels but sub domain labels still open to punctuation and syntax spoofing

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Non-Unicode attacks

bull Confusables

bull Invisibles

bull Problematic font-rendering

bull Manipulating Combining Marks

bull Bidi and syntax spoofing

wwwcasabasecuritycom

Attack VectorsVisual spoofing Vectors

Black Hat USA - July 2009 copy 2009 Chris Weber

rn can look like m in certain fonts

wwwcasabasecuritycom

Attack VectorsNon-Unicode homograph attacks

wwwmulletscom is not wwwrnulletscom

Latin U+006D

LatinU+0073 U+006E

Black Hat USA - July 2009 copy 2009 Chris Weber

Are you using mono-width fonts

0 and O

1 and l

5 and S

wwwcasabasecuritycom

Attack VectorsNon-Unicode homograph attacks

Black Hat USA - July 2009 copy 2009 Chris Weber

Classic long URLrsquos

httploginfacebookintvitationvideomessageid-

h048892r39sessionnfbidcomhomehtmdisbursements

wwwcasabasecuritycom

Attack VectorsNon-Unicode homograph attacks

Black Hat USA - July 2009 copy 2009 Chris Weber

wwwɑpplecom

All Latin using Latin small letter Alpha lsquoɑrsquo

wwwfaϲebookcom

Mixed LatinGreek with lunate sigma symbol lsquocrsquo

wwwаЬсcom

All Cyrillic lsquoabcrsquo

wwwcasabasecuritycom

Attack VectorsSingle-script and The Confusables

Black Hat USA - July 2009 copy 2009 Chris Weber

Browsers whitelist ORG

wwwcasabasecuritycom

Attack VectorsIDN homograph attacks

Black Hat USA - July 2009 copy 2009 Chris Weber

Others donrsquot necessarily buthellip

wwwcasabasecuritycom

Attack VectorsIDN homograph attacks

Black Hat USA - July 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsIDN homograph attacks

wwwmozillaorg is not wwwmoziacutellaorg

Latin U+0069

LatinU+00ED

Black Hat USA - July 2009 copy 2009 Chris Weber

(This case doesnrsquot work anymore)

wwwcasabasecuritycom

Attack VectorsIDN Syntax Spoofing with lookalikes

httpwwwgooglecompathfilenottrustedorg

FULLWIDTH SOLIDUSU+FF0F

Black Hat USA - July 2009 copy 2009 Chris Weber

(Normalized to a U+002F)

wwwcasabasecuritycom

Attack VectorsIDN Syntax Spoofing with lookalikes

httpwwwgooglecompathfilenottrustedorg

SOLIDUSU+002F

Black Hat USA - July 2009 copy 2009 Chris Weber

(However punctuation not requiredhellip)

wwwcasabasecuritycom

Attack VectorsIDN Syntax Spoofing with lookalikes

httpwwwgooglecomノpathノfilenottrustedorg

Katakana NoU+FF89

Black Hat USA - July 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsThe Invisibles

Black Hat USA - July 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsVisual Spoofing with Bidi Explicit Directional Overrides

Root Causes: Best-fit mappings

Commonly occur in charset transformations and even innocuous APIs

Impact: Filter evasion, Enable code execution

When σ becomes s
U+03C3 GREEK SMALL LETTER SIGMA

When ′ becomes '
U+2032 PRIME

Root Causes: Guidance for Best-Fit mappings

• Scrutinize character/charset manipulation APIs
• Use EncoderFallback with System.Text.Encoding
• Set WC_NO_BEST_FIT_CHARS flag with WideCharToMultiByte()
• Use Unicode end-to-end

Case Study: Social Networking (Best-fit mappings)

• A popular social networking site in 2008
• Implemented complex filtering logic to prevent XSS
  - Attack: Filter evasion, code execution
  - Exploit: Bypass filtering logic with best-fit mappings to leverage cross-site scripting
  - Root Cause: best-fit mappings

Case Study: Social Networking (Best-fit mappings)

-moz-binding()
was not allowed, but…
-[U+FF4D]oz-binding()
would best-fit map to -moz-binding()
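The bypass can be sketched as follows. Everything here is illustrative: `BEST_FIT` is a hypothetical three-entry table standing in for a platform's real best-fit mappings (which vary by code page), and `naive_filter` is an assumed stand-in for the site's filtering logic, which the talk does not show. The last helper shows the strict alternative: fail instead of guessing a replacement.

```python
# Hypothetical best-fit table; real tables are platform- and code-page-specific.
BEST_FIT = {"\uff4d": "m", "\u03c3": "s", "\u2032": "'"}


def naive_filter(css: str) -> bool:
    """Assumed filter: allow the input unless it contains the blocked token."""
    return "-moz-binding" not in css.lower()


def best_fit(text: str) -> str:
    """Simulate a lossy charset conversion that applies best-fit mappings."""
    return "".join(BEST_FIT.get(ch, ch) for ch in text)


def strict_ascii(text: str) -> bytes:
    """Strict conversion: raises UnicodeEncodeError rather than best-fitting."""
    return text.encode("ascii")


payload = "-\uff4doz-binding()"

if __name__ == "__main__":
    print(naive_filter(payload))  # the filter sees no blocked token
    print(best_fit(payload))      # the later conversion recreates it
```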

Root Causes: Normalization

Normalizing strings after validation is dangerous

Impact: Filter evasion, Enable code execution

Root Causes: Normalization

İ becomes I + combining dot above
U+0130 becomes U+0049 U+0307

But are there dangerous characters?

You bet… with NFKC and NFKD you could control HTML or other parsing

﹤ becomes <
U+FE64 becomes U+003C

toNFKC("﹤script>") = "<script>"

Root Causes: Guidance for Normalization

Normalize strings before validation

NFKC: first defense against visual spoofing
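A minimal sketch of why the order matters, using Python's standard `unicodedata` module. The validator `is_safe` and both wrapper functions are hypothetical names for illustration. The payload uses the small-form less-than and greater-than signs (U+FE64, U+FE65), which NFKC folds to '<' and '>'.

```python
import unicodedata


def is_safe(text: str) -> bool:
    """Assumed validator: reject anything containing angle brackets."""
    return "<" not in text and ">" not in text


def validate_then_normalize(text: str) -> str:
    """Dangerous order: the check runs before the text changes shape."""
    if not is_safe(text):
        raise ValueError("rejected")
    return unicodedata.normalize("NFKC", text)


def normalize_then_validate(text: str) -> str:
    """Safer order: validate the exact string that will be used."""
    normalized = unicodedata.normalize("NFKC", text)
    if not is_safe(normalized):
        raise ValueError("rejected")
    return normalized


payload = "\ufe64script\ufe65"

if __name__ == "__main__":
    print(validate_then_normalize(payload))  # markup slips past the check
```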

Root Causes: Non-shortest form UTF-8

Non-shortest or overlong UTF-8

Impact: Filter evasion, Enable code execution

Application gets C0 A7
OS/Framework sees 27
Database gets '

Root Causes: Guidance for Non-shortest form UTF-8

• Unicode specification forbids
  - Generation of non-shortest form
  - Interpretation of non-shortest form for BMP
• Validate UTF-8 encoding (throw on error)
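The C0 A7 example can be reproduced with a deliberately broken decoder next to a conformant one. `naive_decode_2byte` is a hypothetical function written only to show the flaw. Python's built-in UTF-8 codec is the strict side: it rejects the overlong form by raising an error.

```python
OVERLONG_APOSTROPHE = bytes([0xC0, 0xA7])


def naive_decode_2byte(data: bytes) -> str:
    """Deliberately flawed: decodes a 2-byte sequence without checking
    that the result could have been encoded in a single byte."""
    lead, trail = data
    return chr(((lead & 0x1F) << 6) | (trail & 0x3F))


def strict_decode(data: bytes) -> str:
    """Conformant decoding: raises UnicodeDecodeError on ill-formed input."""
    return data.decode("utf-8", errors="strict")


if __name__ == "__main__":
    print(repr(naive_decode_2byte(OVERLONG_APOSTROPHE)))
    try:
        strict_decode(OVERLONG_APOSTROPHE)
    except UnicodeDecodeError as exc:
        print("rejected:", exc.reason)
```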

Attack Vectors: Directory traversal

How many ways can you say

Attack Vectors

Root Causes: Handling the Unexpected

• Unassigned code points
  - U+2073
• Illegal code points
  - Half a surrogate pair
• Code points with special meaning
  - U+FEFF is the BOM
• Impact: Filter evasion, Enable code execution
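The three classes above can be recognized with the general category from Python's `unicodedata` module. This is a sketch under assumptions: `classify` and its labels are hypothetical, and the result for an unassigned code point such as U+2073 depends on the Unicode version bundled with the interpreter.

```python
import unicodedata


def classify(ch: str) -> str:
    """Label a single code point by the problem class it falls into."""
    category = unicodedata.category(ch)
    if category == "Cn":   # unassigned (this category also covers noncharacters)
        return "unassigned"
    if category == "Cs":   # half of a surrogate pair
        return "surrogate"
    if ch == "\ufeff":     # BOM / zero width no-break space
        return "bom"
    return "ok"


def first_problem(text: str):
    """Return (index, label) of the first unexpected code point, or None."""
    for index, ch in enumerate(text):
        label = classify(ch)
        if label != "ok":
            return index, label
    return None
```

A caller would typically reject the input when `first_problem` returns anything other than `None`.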

Root Causes: Handling the Unexpected (Over-consumption)

Over-consuming ill-formed byte sequences

Big problem with MBCS lead bytes

<41 C2 3E 41> becomes <41 41>

<img src=[0xC2]> onerror=alert(1)<br />
becomes
<img src= onerror=alert(1)<br />
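The byte example above can be sketched with a deliberately flawed decoder. `over_consuming_decode` is hypothetical and handles only two-byte lead bytes, which is enough to show the bug. Python's built-in codec is used for contrast: with `errors="replace"` it substitutes only the bad lead byte and keeps the 0x3E ('>') that follows.

```python
SAMPLE = bytes.fromhex("41C23E41")


def over_consuming_decode(data: bytes) -> str:
    """Deliberately flawed: trusts a 2-byte lead byte and swallows the
    next byte without checking that it is a continuation byte."""
    out = []
    i = 0
    while i < len(data):
        byte = data[i]
        if 0xC2 <= byte <= 0xDF:
            i += 2               # BUG (intentional): drops the lead and the next byte
            continue
        out.append(chr(byte))    # sketch only: other bytes are passed through as-is
        i += 1
    return "".join(out)


def careful_decode(data: bytes) -> str:
    """Built-in decoder: replaces the ill-formed byte, keeps what follows."""
    return data.decode("utf-8", errors="replace")


if __name__ == "__main__":
    print(repr(over_consuming_decode(SAMPLE)))
    print(repr(careful_decode(SAMPLE)))
```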

Root Causes: Handling the Unexpected (Character-substitution)

Correcting insecurely rather than failing
  - Substituting a '' or a '' would be bad

Root Causes: Handling the Unexpected (Character-deletion)

"deletion of noncharacters" (UTR-36)

<scr[U+FEFF]ipt> becomes <script>

Root Causes: Solutions for Handling the Unexpected

• Fail or error
• Use U+FFFD instead
  - A common alternative is '?', which can be safe
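A short sketch contrasting deletion with replacement, assuming a hypothetical blocklist filter (`filter_allows`) that runs before the cleanup step. Deleting U+FEFF reassembles the blocked tag after the check has already passed. Substituting U+FFFD keeps the two halves apart.

```python
def filter_allows(text: str) -> bool:
    """Assumed filter: block input that contains a script tag."""
    return "<script" not in text.lower()


def delete_bom(text: str) -> str:
    """Insecure correction: silently delete U+FEFF."""
    return text.replace("\ufeff", "")


def replace_bom(text: str) -> str:
    """Safer correction: substitute U+FFFD REPLACEMENT CHARACTER."""
    return text.replace("\ufeff", "\ufffd")


payload = "<scr\ufeffipt>"

if __name__ == "__main__":
    print(filter_allows(payload))      # the filter sees no script tag
    print(delete_bom(payload))         # deletion recreates it
    print(repr(replace_bom(payload)))  # replacement does not
```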

Attack Vectors: Filter evasion

• Bypass filters, WAFs, NIDS, and validation
• Exploit delivery techniques
  - E.g. Cross-site scripting (buffer overflow of the Web)

Case Study: Apple and Mozilla

Safari and Firefox BOM consumption
  - Attack: Filter evasion, code execution
  - Exploit: Bypass filtering logic with specially crafted strings to leverage cross-site scripting
  - Root Cause: Character deletion

<a href="java[U+FEFF]script:alert('XSS')>

Can be nastier:

<a h[U+FEFF]ref="java[U+FEFF]script:al[U+FEFF]ert('XSS')>

A Closer Look: The BOM

BOM U+FEFF

Root Causes: Casing

• Attackers manipulate casing operations to inject otherwise prohibited characters
• Casing can multiply the buffer sizes needed
• Impact: Filter evasion, Enable code execution

Root Causes: Casing

toLower("İ") == "i"
toLower("scrİpt") == "script"

len(x) != len(toLower(x))

Root Causes: Guidance for Casing

• Perform casing operations before validation
• Leverage existing frameworks and APIs
  - ICU, .NET
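A sketch of both casing points in Python, whose `str.lower()` applies full case mappings. That is an assumption about this one runtime: İ (U+0130) lowers to 'i' plus U+0307 here, whereas a runtime using simple mappings would produce a bare 'i' as on the slide. Either way the string changes after a naive check, and its length can change too. `lower_then_validate` and `is_allowed` are hypothetical names.

```python
def is_allowed(text: str) -> bool:
    """Assumed validator: block anything that spells 'script'."""
    return "script" not in text


def lower_then_validate(text: str) -> str:
    """Case-map first, then validate the exact string that will be used."""
    lowered = text.lower()
    if not is_allowed(lowered):
        raise ValueError("rejected")
    return lowered


sample = "scr\u0130pt"          # contains LATIN CAPITAL LETTER I WITH DOT ABOVE
lowered_sample = sample.lower()

if __name__ == "__main__":
    print(len(sample), len(lowered_sample))        # lengths differ after lowering
    print([f"U+{ord(c):04X}" for c in lowered_sample])
```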

Root Causes: Buffer Overflows

• Incorrect assumptions about string sizes (chars vs bytes)
• Improper width calculations
• Impact: Enable code execution

Root Causes: Buffer Overflows

Casing - maximum expansion factors

Operation  UTF        Factor  Sample
Lower      8          1.5     Ⱥ U+023A
Lower      16, 32     1       A U+0041
Upper      8, 16, 32  3       ΐ U+0390

Source: Unicode Technical Report 36

Root Causes: Buffer Overflows

Normalization - maximum expansion factors

Operation  UTF     Factor  Sample
NFC        8       3X      U+1D160 (MUSICAL SYMBOL EIGHTH NOTE)
NFC        16, 32  3X      שּׁ U+FB2C
NFD        8       3X      ΐ U+0390
NFD        16, 32  4X      ᾂ U+1F82
NFKC/NFKD  8       11X     ﷺ U+FDFA
NFKC/NFKD  16, 32  18X     ﷺ U+FDFA

Source: Unicode Technical Report 36
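The factors in the table can be checked with a few lines of Python. This is a sketch, and `expansion` is a hypothetical helper: it measures growth in UTF-8 bytes and in UTF-16 code units for one character under a given normalization form, which is the figure a fixed-size buffer has to allow for.

```python
import unicodedata


def expansion(ch: str, form: str) -> tuple:
    """Return (UTF-8 growth, UTF-16 growth) when normalizing `ch` to `form`."""
    out = unicodedata.normalize(form, ch)
    utf8_before = len(ch.encode("utf-8"))
    utf8_after = len(out.encode("utf-8"))
    utf16_before = len(ch.encode("utf-16-le")) // 2   # code units, not bytes
    utf16_after = len(out.encode("utf-16-le")) // 2
    return utf8_after / utf8_before, utf16_after / utf16_before


if __name__ == "__main__":
    for code_point, form in ((0xFDFA, "NFKC"), (0x1F82, "NFD"), (0x0390, "NFD")):
        print(f"U+{code_point:04X}", form, expansion(chr(code_point), form))
```

Sizing an output buffer from the input length alone ignores this growth, which is the overflow the table warns about.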

Root Causes: Guidance for Buffer Overflows

• Know the difference between bytes and chars
• Secure coding
• Leverage existing frameworks and APIs
  - ICU, .NET

Root Causes: Controlling Syntax

• White space and line breaks
  - E.g. when U+180E acts like U+0020
• Quotation marks
• Impact: Filter evasion, Enable code execution

Attacks and Exploits: Controlling syntax

• Manipulate HTML parsers and JavaScript interpreters
• Control protocols

Case Study: Opera

• Unicode formatter characters exploited for XSS
  - Damage: Filter evasion, controlling syntax
  - Exploit: Bypass filtering logic with specially crafted characters to leverage cross-site scripting
  - Root Cause: Interpreting "white space"
  - A problem with HTML 4.0 spec

<a href=[U+180E]onclick=alert()>

Mongolian Vowel Separator (MVS), U+180E

Root Causes: Guidance for Controlling Syntax

• Question specifications
• Be careful…

Root Causes: Specifications

1) Character stability
  - IDNA/Nameprep based on Unicode 3.2
2) Designs
  - Specs are carefully designed but not always perfect
• This could have been a problem
  - "When designing a markup language or data protocol, the use of U+FEFF can be restricted to that of Byte Order Mark. In that case, any U+FEFF occurring in the middle of the file can be ignored, or treated as an error."
  - HTML 4.01
    • Defines four whitespace characters and explicitly leaves handling other characters up to implementer

Root Causes: Charset Transformations

• Converting between charsets is dangerous
• Mapping tables and algorithms vary across platforms
• Impact: Filter evasion, Enable code execution, Data-loss

Root Causes: Guidance for Charset Transformations

• Avoid if possible
• Use Unicode as the broker
• Beware the PUA mappings
• Transform case and normalize prior to validation and redisplay

Root Causes: Charset Mismatches

• Some charset identifiers are ill-defined
• Vendor implementations vary
• User-agents may sniff if confused
• Attackers manipulate behavior
• Impact: Filter evasion, Enable code execution

Root Causes: Charset Mismatches

Content-Type: charset=ISO-8859-1

<meta http-equiv="Content-Type" content="text/html; charset=shift_jis">

Attacker-controlled input

Root Causes: Guidance for Charset Mismatches

• Force UTF-8
• Error if uncertain
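One way the two bullets could look in application code, as a sketch rather than a prescribed implementation. The function names and the accepted-label set are assumptions: the response always declares UTF-8 explicitly, and incoming bytes are decoded strictly, with an error instead of a guess when the declared charset is missing or different.

```python
from typing import Optional

ACCEPTED_LABELS = {"utf-8", "utf8"}


def content_type_header() -> str:
    """Always declare the charset so the user-agent has no reason to sniff."""
    return "text/html; charset=utf-8"


def decode_body(raw: bytes, declared_charset: Optional[str]) -> str:
    """Error if uncertain: accept only input that is declared and valid UTF-8."""
    if declared_charset is None or declared_charset.strip().lower() not in ACCEPTED_LABELS:
        raise ValueError("unsupported or missing charset")
    return raw.decode("utf-8", errors="strict")
```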

Unicode Transformations: Agenda

• Unicode crash course
• Root Causes
• Attack Vectors
• Tools

Tools

• Watcher
  - Passive Web-app security testing and auditing
• Unibomber
  - XSS autopwn testing tool

Tools: Watcher - Some of the Passive Checks Included

• Unicode transformation hot-spots
• User-controlled HTML
• Cross-domain issues
• Insecure cookies
• Insecure HTTP/HTTPS transitions
• SSL protocol and certificate issues
• XSS hot-spots
• Flash issues
• Silverlight issues
• Information disclosure

Tools: Watcher - Web-app Security Testing and Auditing

http://websecuritytool.codeplex.com

Tools: Unibomber - runtime XSS testing tool

• Deterministic testing
• Auto-inject payloads
• Unicode transformers
  - < > ' " etc.
• Detect transformations and encoding hotspots

Thank you

Casaba Security
www.casabasecurity.com

Chris Weber
Blog: www.lookout.net
Email: chris@casabasecurity.com
LinkedIn: http://www.linkedin.com/in/chrisweber


Attack Vectors: IDN homograph attacks

Opera

www.google.com is not www.gooɡle.com

Latin 'g' (U+0067) versus Latin script 'ɡ' (U+0261)

Root Causes: The state of International Domain Names

ICANN guidelines v2.0
  - Inclusion-based: Deny-all default seems to be the right concept
  - Script limitations: A script can cross many blocks. Even with limited script choices, there's plenty to choose from
  - Character limitations: Great for domain labels, but sub domain labels still open to punctuation and syntax spoofing

Attack Vectors: Visual spoofing Vectors

• Non-Unicode attacks
• Confusables
• Invisibles
• Problematic font-rendering
• Manipulating Combining Marks
• Bidi and syntax spoofing

Attack Vectors: Non-Unicode homograph attacks

rn can look like m in certain fonts

www.mullets.com is not www.rnullets.com

Latin 'm' (U+006D) versus Latin 'r' + 'n' (U+0072 U+006E)

Attack Vectors: Non-Unicode homograph attacks

Are you using mono-width fonts?

0 and O
1 and l
5 and S

Black Hat USA - July 2009 copy 2009 Chris Weber

Classic long URLrsquos

httploginfacebookintvitationvideomessageid-

h048892r39sessionnfbidcomhomehtmdisbursements

wwwcasabasecuritycom

Attack VectorsNon-Unicode homograph attacks

Black Hat USA - July 2009 copy 2009 Chris Weber

wwwɑpplecom

All Latin using Latin small letter Alpha lsquoɑrsquo

wwwfaϲebookcom

Mixed LatinGreek with lunate sigma symbol lsquocrsquo

wwwаЬсcom

All Cyrillic lsquoabcrsquo

wwwcasabasecuritycom

Attack VectorsSingle-script and The Confusables

Black Hat USA - July 2009 copy 2009 Chris Weber

Browsers whitelist ORG

wwwcasabasecuritycom

Attack VectorsIDN homograph attacks

Black Hat USA - July 2009 copy 2009 Chris Weber

Others donrsquot necessarily buthellip

wwwcasabasecuritycom

Attack VectorsIDN homograph attacks

Black Hat USA - July 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsIDN homograph attacks

wwwmozillaorg is not wwwmoziacutellaorg

Latin U+0069

LatinU+00ED

Black Hat USA - July 2009 copy 2009 Chris Weber

(This case doesnrsquot work anymore)

wwwcasabasecuritycom

Attack VectorsIDN Syntax Spoofing with lookalikes

httpwwwgooglecompathfilenottrustedorg

FULLWIDTH SOLIDUSU+FF0F

Black Hat USA - July 2009 copy 2009 Chris Weber

(Normalized to a U+002F)

wwwcasabasecuritycom

Attack VectorsIDN Syntax Spoofing with lookalikes

httpwwwgooglecompathfilenottrustedorg

SOLIDUSU+002F

Black Hat USA - July 2009 copy 2009 Chris Weber

(However punctuation not requiredhellip)

wwwcasabasecuritycom

Attack VectorsIDN Syntax Spoofing with lookalikes

httpwwwgooglecomノpathノfilenottrustedorg

Katakana NoU+FF89

Black Hat USA - July 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsThe Invisibles

Black Hat USA - July 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsVisual Spoofing with Bidi Explicit Directional Overrides

Black Hat USA - July 2009 copy 2009 Chris Weber

Commonly occur in charset transformations and even innocuous APIrsquos

Impact Filter evasion Enable code execution

When σ becomes s

U+03C3 GREEK SMALL LETTER SIGMA

When prime becomes

U+2032 PRIME

wwwcasabasecuritycom

Root CausesBest-fit mappings

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Scrutinize charactercharset manipulation APIrsquos

bull Use EncoderFallback with SystemTextEncoding

bull Set WC_NO_BEST_FIT_CHARS flag with WideCharToMultiByte()

bull Use Unicode end-to-end

wwwcasabasecuritycom

Root CausesGuidance for Best-Fit mappings

Black Hat USA - July 2009 copy 2009 Chris Weber

bull A popular social networking site in 2008

bull Implemented complex filtering logic to prevent XSS

ndash Attack Filter evasion code execution

ndash Exploit Bypass filtering logic with best-fit mappings to leverage cross-site scripting

ndash Root Cause best-fit mappings

wwwcasabasecuritycom

Case Study Social NetworkingBest-fit mappings

Black Hat USA - July 2009 copy 2009 Chris Weber

-moz-binding()

was not allowed buthellip

-[U+ff4d]oz-binding()

would best-fit map

wwwcasabasecuritycom

Case Study Social NetworkingBest-fit mappings

Black Hat USA - July 2009 copy 2009 Chris Weber

Normalizing strings after validation is dangerous

Impact Filter evasion Enable code execution

wwwcasabasecuritycom

Root CausesNormalization

Black Hat USA - July 2009 copy 2009 Chris Weber

İ becomes I +

wwwcasabasecuritycom

Root CausesNormalization

U+0130 U+0049 U+0307

Black Hat USA - July 2009 copy 2009 Chris Weber

But are there dangerous characters

You bethellip with NFKC and NFKD you could control HTML or other parsing

﹤ becomes lt

wwwcasabasecuritycom

Root CausesNormalization

U+FE64 U+003C

Black Hat USA - July 2009 copy 2009 Chris Weber

﹤ becomes lt

toNFKC(ldquo﹤scriptgtrdquo) = ldquoltscriptgtrdquo

wwwcasabasecuritycom

Root CausesNormalization

U+FE64 U+003C

Black Hat USA - July 2009 copy 2009 Chris Weber

Normalize strings before validation

NFKC first defense against Visual spoofing

wwwcasabasecuritycom

Root CausesGuidance for Normalization

Black Hat USA - July 2009 copy 2009 Chris Weber

Non-shortest or overlong UTF-8

Impact Filter evasion Enable code execution

Application gets C0A7

OSFramework sees 27

Database gets

wwwcasabasecuritycom

Root CausesNon-shortest form UTF-8

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Unicode specification forbids

ndash Generation of non-shortest form

ndash Interpretation of non-shortest form for BMP

bull Validate UTF-8 encoding (throw on error)

wwwcasabasecuritycom

Root CausesGuidance for Non-shortest form UTF-8

Black Hat USA - July 2009 copy 2009 Chris Weber

How many ways can you say

wwwcasabasecuritycom

Attack VectorsDirectory traversal

Black Hat USA - July 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack Vectors

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Unassigned code points

ndash U+2073

bull Illegal code points

ndash Half a surrogate pair

bull Code points with special meaning

ndash U+FEFF is the BOM

bull Impact Filter evasion Enable code execution

wwwcasabasecuritycom

Root CausesHandling the Unexpected

Black Hat USA - July 2009 copy 2009 Chris Weber

Over-consuming ill-formed byte sequences

Big problem with MBCS lead bytes

lt41 C2 3E 41gt becomes

lt41 41gt

wwwcasabasecuritycom

Root CausesHandling the Unexpected Over-consumption

Black Hat USA - July 2009 copy 2009 Chris Weber

ltimg src=[0xC2]gt onerror=alert(1)ltbr gt

becomes

ltimg src=gt onerror=alert(1)ltbr gt

wwwcasabasecuritycom

Root CausesHandling the Unexpected Over-consumption

Black Hat USA - July 2009 copy 2009 Chris Weber

Correcting insecurely rather than failing

ndash Substituting a lsquorsquo or a lsquorsquo would be bad

wwwcasabasecuritycom

Root CausesHandling the Unexpected Character-substitution

Black Hat USA - July 2009 copy 2009 Chris Weber

ldquodeletion of noncharactersrdquo (UTR-36)

wwwcasabasecuritycom

Root CausesHandling the Unexpected Character-deletion

Black Hat USA - July 2009 copy 2009 Chris Weber

ltscr[U+FEFF]iptgt becomes ltscriptgt

wwwcasabasecuritycom

Root CausesHandling the Unexpected Character-deletion

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Fail or error

bull Use U+FFFD instead

ndash A common alternative is lsquorsquo which can be safe

wwwcasabasecuritycom

Root CausesSolutions for Handling the Unexpected

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Bypass filters WAFrsquos NIDS and validation

bull Exploit delivery techniques

ndash Eg Cross-site scripting (buffer overflow of the Web)

wwwcasabasecuritycom

Attack VectorsFilter evasion

Black Hat USA - July 2009 copy 2009 Chris Weber

Safari and Firefox BOM consumptionndash Attack Filter evasion code execution

ndash Exploit Bypass filtering logic with specially crafted strings to leverage cross-site scripting

ndash Root Cause Character deletion

lta href=ldquojava[U+FEFF]scriptalert(bdquoXSS‟)gt

Can be nastier

lta h[U+FEFF]ref=ldquojava[U+FEFF]scriptal[U+FEFF]ert(bdquoXSS‟)gt

wwwcasabasecuritycom

Case Study Apple and Mozilla

Black Hat USA - July 2009 copy 2009 Chris Weberwwwcasabasecuritycom

A Closer Look The BOM

BOMU+FEFF

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Attackers manipulate casing operations to inject otherwise prohibited characters

bull Casing can multiply the buffer sizes needed

bull Impact Filter evasion Enable code execution

wwwcasabasecuritycom

Root CausesCasing

Black Hat USA - July 2009 copy 2009 Chris Weber

toLower(ldquoİrdquo) == ldquoirdquo

toLower(ldquoscrİptrdquo) == ldquoscriptrdquo

wwwcasabasecuritycom

Root CausesCasing

Black Hat USA - July 2009 copy 2009 Chris Weber

len(x) = len(toLower(x))

wwwcasabasecuritycom

Root CausesCasing

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Perform casing operations before validation

bull Leverage existing frameworks and APIrsquos

ndash ICU Net

wwwcasabasecuritycom

Root CausesGuidance for Casing

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Incorrect assumptions about string sizes (chars vs bytes)

bull Improper width calculations

bull Impact Enable code execution

wwwcasabasecuritycom

Root CausesBuffer Overflows

Black Hat USA - July 2009 copy 2009 Chris Weber

Casing - maximum expansion factors

wwwcasabasecuritycom

Root CausesBuffer Overflows

Operation UTF Factor Sample

Lower 8 15 Ⱥ U+023A

16 32 1 A U+0041

Upper 8 16 32 3 ΐ U+0390Source Unicode Technical Report 36

Black Hat USA - July 2009 copy 2009 Chris Weber

Normalization- maximum expansion factors

wwwcasabasecuritycom

Root CausesBuffer Overflows

Operation UTF Factor Sample

NFC8 3X 119136 U+1D160

16 32 3X ש U+FB2C

NFD8 3X ΐ U+0390

16 32 4X ᾂ U+1F82

NFKCNFKD8 11X

ملسو هيلع هللا ىلص U+FDFA16 32 18X

Source Unicode Technical Report 36

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Know the difference between bytes and chars

bull Secure coding

bull Leverage existing frameworks and APIrsquos

ndash ICU Net

wwwcasabasecuritycom

Root CausesGuidance for Buffer Overflows

Black Hat USA - July 2009 copy 2009 Chris Weber

bull White space and line breaks

ndash Eg when U+180E acts like U+0020

bull Quotation marks

bull Impact Filter evasion Enable code execution

wwwcasabasecuritycom

Root CausesControlling Syntax

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Manipulate HTML parsers and javascriptinterpreters

bull Control protocols

wwwcasabasecuritycom

Attacks and ExploitsControlling syntax

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Unicode formatter characters exploited for XSS

ndash Damage Filter evasion controlling syntax

ndash Exploit Bypass filtering logic with specially crafted characters to leverage cross-site scripting

ndash Root Cause Interpreting ldquowhite spacerdquo

ndash A problem with HTML 40 spec

wwwcasabasecuritycom

Case Study Opera

Black Hat USA - July 2009 copy 2009 Chris Weber

lta href=[U+180E]onclick=alert()gt

wwwcasabasecuritycom

Case Study Opera

Black Hat USA - July 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Case Study Opera

MVSU+180E

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Question specifications

bull Be carefulhellip

wwwcasabasecuritycom

Root CausesGuidance for Controlling Syntax

Black Hat USA - July 2009 copy 2009 Chris Weber

1) Character stabilityndash IDNANameprep based on Unicode 32

2) Designsndash Specs are carefully designed but not always perfect

bull This could have been a problemndash ldquoWhen designing a markup language or data protocol the use of

U+FEFF can be restricted to that of Byte Order Mark In that case any U+FEFF occurring in the middle of the file can be ignored or treated as an error rdquo

ndash HTML 401 bull Defines four whitespace characters and explicitly leaves

handling other characters up to implementer

wwwcasabasecuritycom

Root CausesSpecifications

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Converting between charsets is dangerous

bull Mapping tables and algorithms vary across platforms

bull Impact Filter evasion Enable code execution Data-loss

wwwcasabasecuritycom

Root CausesCharset Transformations

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Avoid if possible

bull Use Unicode as the broker

bull Beware the PUA mappings

bull Transform case and normalize prior to validation and redisplay

wwwcasabasecuritycom

Root CausesGuidance for Charset Transformations

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Some charset identifiers are ill-defined

bull Vendor implementations vary

bull User-agents may sniff if confused

bull Attackers manipulate behavior

bull Impact Filter evasion Enable code execution

wwwcasabasecuritycom

Root CausesCharset Mismatches

Black Hat USA - July 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Root CausesCharset Mismatches

Content-Type charset=ISO-8859-1

ltmeta http-equiv=Content-Type content=texthtml charset=shift_jisgt

Attacker-controlled input

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Force UTF-8

bull Error if uncertain

wwwcasabasecuritycom

Root CausesGuidance for Charset Mismatches

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Unicode crash course

bull Root Causes

bull Attack Vectors

bull Tools

wwwcasabasecuritycom

Unicode TransformationsAgenda

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Unicode crash course

bull Root Causes

bull Attack Vectors

bull Tools

wwwcasabasecuritycom

Unicode TransformationsAgenda

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Watcher

ndash Passive Web-app security testing and auditing

bull Unibomber

ndash XSS autopwn testing tool

wwwcasabasecuritycom

Tools

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Unicode transformation hot-spotsbull User-controlled HTMLbull Cross-domain issuesbull Insecure cookiesbull Insecure HTTPHTTPS transitionsbull SSL protocol and certificate issuesbull XSS hot-spotsbull Flash issuesbull Silverlight issuesbull Information disclosure

wwwcasabasecuritycom

ToolsWatcher ndash Some of the Passive Checks Included

Black Hat USA - July 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Tools

Black Hat USA - July 2009 copy 2009 Chris Weber

httpwebsecuritytoolcodeplexcom

wwwcasabasecuritycom

ToolsWatcher - Web-app Security Testing and Auditing

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Deterministic testing

bull Auto-inject payloads

bull Unicode transformers

ndash lt gt lsquo ldquo etc

bull Detect transformations and encoding hotspots

wwwcasabasecuritycom

ToolsUnibomberndash runtime XSS testing tool

Thank you

Casaba Security

wwwcasabasecuritycom

Chris Weber

Blog wwwlookoutnet

Email chriscasabasecuritycom

LinkedIn httpwwwlinkedincominchrisweber

Page 26: Unraveling Unicode: A Bag of Tricks for Bug Hunting...Unicode Crash Course code points encodings categorization normalization binary properties case mapping conversion tables bi-directional

Black Hat USA - July 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsIDN homograph attacks

wwwgooglecom is not wwwgooɡlecom

Latin U+0069

LatinU+0261

Black Hat USA - July 2009 copy 2009 Chris Weber

ICANN guidelines v20

ndash Inclusion-based

ndash Script limitations

ndash Character limitations

wwwcasabasecuritycom

Root CausesThe state of International Domain Names

Deny-all default seems to be the right concept

A script can cross many blocks Even with limited script choices therersquos plenty to choose from

Great for domain labels but sub domain labels still open to punctuation and syntax spoofing

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Non-Unicode attacks

bull Confusables

bull Invisibles

bull Problematic font-rendering

bull Manipulating Combining Marks

bull Bidi and syntax spoofing

wwwcasabasecuritycom

Attack VectorsVisual spoofing Vectors

Black Hat USA - July 2009 copy 2009 Chris Weber

rn can look like m in certain fonts

wwwcasabasecuritycom

Attack VectorsNon-Unicode homograph attacks

wwwmulletscom is not wwwrnulletscom

Latin U+006D

LatinU+0073 U+006E

Black Hat USA - July 2009 copy 2009 Chris Weber

Are you using mono-width fonts

0 and O

1 and l

5 and S

wwwcasabasecuritycom

Attack VectorsNon-Unicode homograph attacks

Black Hat USA - July 2009 copy 2009 Chris Weber

Classic long URLrsquos

httploginfacebookintvitationvideomessageid-

h048892r39sessionnfbidcomhomehtmdisbursements

wwwcasabasecuritycom

Attack VectorsNon-Unicode homograph attacks

Black Hat USA - July 2009 copy 2009 Chris Weber

wwwɑpplecom

All Latin using Latin small letter Alpha lsquoɑrsquo

wwwfaϲebookcom

Mixed LatinGreek with lunate sigma symbol lsquocrsquo

wwwаЬсcom

All Cyrillic lsquoabcrsquo

wwwcasabasecuritycom

Attack VectorsSingle-script and The Confusables

Black Hat USA - July 2009 copy 2009 Chris Weber

Browsers whitelist ORG

wwwcasabasecuritycom

Attack VectorsIDN homograph attacks

Black Hat USA - July 2009 copy 2009 Chris Weber

Others donrsquot necessarily buthellip

wwwcasabasecuritycom

Attack VectorsIDN homograph attacks

Black Hat USA - July 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsIDN homograph attacks

wwwmozillaorg is not wwwmoziacutellaorg

Latin U+0069

LatinU+00ED

Black Hat USA - July 2009 copy 2009 Chris Weber

(This case doesnrsquot work anymore)

wwwcasabasecuritycom

Attack VectorsIDN Syntax Spoofing with lookalikes

httpwwwgooglecompathfilenottrustedorg

FULLWIDTH SOLIDUSU+FF0F

Black Hat USA - July 2009 copy 2009 Chris Weber

(Normalized to a U+002F)

wwwcasabasecuritycom

Attack VectorsIDN Syntax Spoofing with lookalikes

httpwwwgooglecompathfilenottrustedorg

SOLIDUSU+002F

Black Hat USA - July 2009 copy 2009 Chris Weber

(However punctuation not requiredhellip)

wwwcasabasecuritycom

Attack VectorsIDN Syntax Spoofing with lookalikes

httpwwwgooglecomノpathノfilenottrustedorg

Katakana NoU+FF89

Black Hat USA - July 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsThe Invisibles

Black Hat USA - July 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsVisual Spoofing with Bidi Explicit Directional Overrides

Black Hat USA - July 2009 copy 2009 Chris Weber

Commonly occur in charset transformations and even innocuous APIrsquos

Impact Filter evasion Enable code execution

When σ becomes s

U+03C3 GREEK SMALL LETTER SIGMA

When prime becomes

U+2032 PRIME

wwwcasabasecuritycom

Root CausesBest-fit mappings

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Scrutinize charactercharset manipulation APIrsquos

bull Use EncoderFallback with SystemTextEncoding

bull Set WC_NO_BEST_FIT_CHARS flag with WideCharToMultiByte()

bull Use Unicode end-to-end

wwwcasabasecuritycom

Root CausesGuidance for Best-Fit mappings

Black Hat USA - July 2009 copy 2009 Chris Weber

bull A popular social networking site in 2008

bull Implemented complex filtering logic to prevent XSS

ndash Attack Filter evasion code execution

ndash Exploit Bypass filtering logic with best-fit mappings to leverage cross-site scripting

ndash Root Cause best-fit mappings

wwwcasabasecuritycom

Case Study Social NetworkingBest-fit mappings

Black Hat USA - July 2009 copy 2009 Chris Weber

-moz-binding()

was not allowed buthellip

-[U+ff4d]oz-binding()

would best-fit map

wwwcasabasecuritycom

Case Study Social NetworkingBest-fit mappings

Root Causes: Normalization

Normalizing strings after validation is dangerous

Impact: Filter evasion, enable code execution

İ becomes I + ◌̇
U+0130 becomes U+0049 U+0307

But are there dangerous characters?

You bet… with NFKC and NFKD you could control HTML or other parsing

﹤ becomes <
U+FE64 becomes U+003C

toNFKC("﹤script>") = "<script>"
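Both transformations can be reproduced with Python's standard `unicodedata` module. The sketch below is illustrative only. `naive_filter_allows` is a hypothetical stand-in for a filter that runs before normalization.

```python
import unicodedata

# Canonical decomposition (NFD): U+0130 becomes U+0049 U+0307.
dotted_i = unicodedata.normalize("NFD", "\u0130")

# Compatibility composition (NFKC): U+FE64 becomes U+003C.
payload = "\uFE64script>"
after = unicodedata.normalize("NFKC", payload)


def naive_filter_allows(s):
    """Hypothetical filter that only looks for a literal '<'."""
    return "<" not in s


# The raw payload passes the filter, but the normalized string would not.
print(naive_filter_allows(payload), naive_filter_allows(after))
```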

Root Causes: Guidance for Normalization

Normalize strings before validation

NFKC: first defense against visual spoofing
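One way to follow this ordering is to normalize first and hand back only the normalized value. In the sketch below, the `FORBIDDEN` deny-list and the `validate` name are assumptions for illustration. A real validator would use whatever rules the application already has.

```python
import unicodedata

FORBIDDEN = ("<", ">", '"', "'")  # hypothetical deny-list, for illustration only


def validate(text):
    """Normalize first, then validate the exact string that will be used."""
    normalized = unicodedata.normalize("NFKC", text)
    if any(ch in normalized for ch in FORBIDDEN):
        raise ValueError("input contains forbidden characters after NFKC")
    # Callers should use this return value, not the raw input.
    return normalized
```

Returning the normalized string matters: validating one form and then storing or rendering another reopens the gap.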

Root Causes: Non-shortest form UTF-8

Non-shortest or overlong UTF-8

Impact: Filter evasion, enable code execution

Application gets C0 A7
OS/Framework sees 27
Database gets '

Root Causes: Guidance for Non-shortest form UTF-8

• Unicode specification forbids
  – Generation of non-shortest form
  – Interpretation of non-shortest form for BMP
• Validate UTF-8 encoding (throw on error)
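The C0 A7 case can be sketched in Python. `naive_decode_two_byte` is a deliberately unsafe, hypothetical decoder that skips the shortest-form check. `strict_decode` relies on Python 3's built-in UTF-8 decoder, which rejects overlong sequences.

```python
def naive_decode_two_byte(b0, b1):
    """Decode a 2-byte UTF-8 sequence WITHOUT the shortest-form check (unsafe)."""
    return chr(((b0 & 0x1F) << 6) | (b1 & 0x3F))


def strict_decode(data):
    """Validate UTF-8 and raise on any ill-formed sequence."""
    return data.decode("utf-8", errors="strict")


# The overlong pair C0 A7 collapses to U+0027 in the naive decoder.
print(repr(naive_decode_two_byte(0xC0, 0xA7)))
```

The strict path raises `UnicodeDecodeError` for the same two bytes, which is the "throw on error" behavior the guidance asks for.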

Attack Vectors: Directory traversal

How many ways can you say

Attack Vectors

Root Causes: Handling the Unexpected

• Unassigned code points
  – U+2073
• Illegal code points
  – Half a surrogate pair
• Code points with special meaning
  – U+FEFF is the BOM
• Impact: Filter evasion, enable code execution
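The three classes above can be told apart with the general category from Python's `unicodedata`. The `classify` helper and its labels are hypothetical, and what counts as unassigned depends on the Unicode version bundled with the interpreter.

```python
import unicodedata


def classify(ch):
    """Label a single code point according to the three cases on the slide."""
    cat = unicodedata.category(ch)
    if cat == "Cn":
        return "unassigned"   # also covers noncharacters
    if cat == "Cs":
        return "surrogate"    # half of a surrogate pair
    if ch == "\uFEFF":
        return "BOM"
    return "ok"
```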

Root Causes: Handling the Unexpected, Over-consumption

Over-consuming ill-formed byte sequences

Big problem with MBCS lead bytes

<41 C2 3E 41> becomes
<41 41>

(C2 is a lead byte; the decoder consumes the following 3E along with it.)

<img src=[0xC2]> onerror=alert(1)<br >

becomes

<img src=> onerror=alert(1)<br >
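The byte example above can be reproduced with a deliberately flawed decoder. `overconsuming_decode` is a hypothetical sketch of the bug, limited to two-byte sequences. `safe_decode` uses Python 3's built-in decoder, which replaces only the ill-formed lead byte and resumes at the next byte.

```python
def overconsuming_decode(data):
    """Flawed decoder: a 2-byte lead always swallows the next byte."""
    out = []
    i = 0
    while i < len(data):
        b = data[i]
        if b < 0x80:
            out.append(chr(b))
            i += 1
        elif 0xC2 <= b <= 0xDF:
            nxt = data[i + 1] if i + 1 < len(data) else None
            if nxt is not None and 0x80 <= nxt <= 0xBF:
                out.append(chr(((b & 0x1F) << 6) | (nxt & 0x3F)))
            # BUG: the next byte is skipped even when it is not a trail byte.
            i += 2
        else:
            i += 1  # longer sequences are out of scope for this sketch
    return "".join(out)


def safe_decode(data):
    """Replace ill-formed bytes with U+FFFD without consuming valid ones."""
    return data.decode("utf-8", errors="replace")
```

With the input `41 C2 3E 41`, the flawed decoder loses the `>` while the safe one keeps it.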

Root Causes: Handling the Unexpected, Character-substitution

Correcting insecurely rather than failing
  – Substituting a ‘’ or a ‘’ would be bad

Root Causes: Handling the Unexpected, Character-deletion

“deletion of noncharacters” (UTR-36)

<scr[U+FEFF]ipt> becomes <script>

Root Causes: Solutions for Handling the Unexpected

• Fail or error
• Use U+FFFD instead
  – A common alternative is ‘’ which can be safe
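The difference between deleting and replacing can be sketched as follows. The helper names and the rule for what counts as unexpected (U+FEFF, unassigned code points, lone surrogates) are assumptions for illustration.

```python
import unicodedata

REPLACEMENT = "\uFFFD"


def is_unexpected(ch):
    """Hypothetical policy: BOM, unassigned code points, and lone surrogates."""
    return ch == "\uFEFF" or unicodedata.category(ch) in ("Cn", "Cs")


def sanitize_by_deleting(text):
    """Unsafe: deletion can splice two harmless halves into a dangerous token."""
    return "".join(ch for ch in text if not is_unexpected(ch))


def sanitize_by_replacing(text):
    """Safer: U+FFFD keeps the halves apart."""
    return "".join(REPLACEMENT if is_unexpected(ch) else ch for ch in text)
```

Deleting turns `<scr[U+FEFF]ipt>` into `<script>`, while replacing leaves `<scr\uFFFDipt>`, which no parser will treat as a script tag.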

Attack Vectors: Filter evasion

• Bypass filters, WAFs, NIDS, and validation
• Exploit delivery techniques
  – E.g. cross-site scripting (buffer overflow of the Web)

Case Study: Apple and Mozilla

Safari and Firefox BOM consumption
  – Attack: Filter evasion, code execution
  – Exploit: Bypass filtering logic with specially crafted strings to leverage cross-site scripting
  – Root Cause: Character deletion

<a href="java[U+FEFF]script:alert('XSS')">

Can be nastier:

<a h[U+FEFF]ref="java[U+FEFF]script:al[U+FEFF]ert('XSS')">

A Closer Look: The BOM

BOM, U+FEFF

Root Causes: Casing

• Attackers manipulate casing operations to inject otherwise prohibited characters
• Casing can multiply the buffer sizes needed
• Impact: Filter evasion, enable code execution

toLower("İ") == "i"

toLower("scrİpt") == "script"

len(x) != len(toLower(x))

Root Causes: Guidance for Casing

• Perform casing operations before validation
• Leverage existing frameworks and APIs
  – ICU, .NET
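A short Python sketch of both points: casing can change string length, and validation belongs after the casing step. The function names and the `banned` list are hypothetical. Note that Python's `str.lower()` uses the full Unicode mapping, so "İ" lowers to "i" followed by U+0307 rather than the bare "i" shown above. The exact result depends on the framework and locale in use.

```python
def lowered_length_changes(text):
    """True when lowercasing changes the number of code points."""
    return len(text) != len(text.lower())


def validate_after_casing(text, banned=("script",)):
    """Apply the casing operation first, then validate the result."""
    lowered = text.lower()
    if any(word in lowered for word in banned):
        raise ValueError("banned token present after lowercasing")
    return lowered
```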

Root Causes: Buffer Overflows

• Incorrect assumptions about string sizes (chars vs. bytes)
• Improper width calculations
• Impact: Enable code execution

Casing: maximum expansion factors

Operation   UTF         Factor   Sample
Lower       8           1.5      Ⱥ U+023A
Lower       16, 32      1        A U+0041
Upper       8, 16, 32   3        ΐ U+0390

Source: Unicode Technical Report 36

Normalization: maximum expansion factors

Operation   UTF         Factor   Sample
NFC         8           3X       𝅘𝅥𝅮 U+1D160
NFC         16, 32      3X       שּׁ U+FB2C
NFD         8           3X       ΐ U+0390
NFD         16, 32      4X       ᾂ U+1F82
NFKC/NFKD   8           11X      ﷺ U+FDFA
NFKC/NFKD   16, 32      18X      ﷺ U+FDFA

Source: Unicode Technical Report 36

Root Causes: Guidance for Buffer Overflows

• Know the difference between bytes and chars
• Secure coding
• Leverage existing frameworks and APIs
  – ICU, .NET
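Several of the factors in the tables can be checked directly by measuring encoded sizes before and after the operation. The helper names below are hypothetical, and the results assume the Unicode data shipped with a current Python 3.

```python
import unicodedata


def expansion(ch, form, encoding):
    """Ratio of encoded size after normalization to encoded size before."""
    before = len(ch.encode(encoding))
    after = len(unicodedata.normalize(form, ch).encode(encoding))
    return after / before


def case_expansion(ch, op, encoding):
    """Ratio of encoded size after a case operation ('lower' or 'upper')."""
    mapped = ch.lower() if op == "lower" else ch.upper()
    return len(mapped.encode(encoding)) / len(ch.encode(encoding))


# U+FDFA under NFKC, measured in UTF-8 bytes and UTF-32 code units.
print(expansion("\uFDFA", "NFKC", "utf-8"), expansion("\uFDFA", "NFKC", "utf-32-le"))
```

The `-le` encodings are used so that no byte order mark is added to the measured output.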

Root Causes: Controlling Syntax

• White space and line breaks
  – E.g. when U+180E acts like U+0020
• Quotation marks
• Impact: Filter evasion, enable code execution

Attacks and Exploits: Controlling syntax

• Manipulate HTML parsers and JavaScript interpreters
• Control protocols

Case Study: Opera

• Unicode formatter characters exploited for XSS
  – Damage: Filter evasion, controlling syntax
  – Exploit: Bypass filtering logic with specially crafted characters to leverage cross-site scripting
  – Root Cause: Interpreting “white space”
  – A problem with HTML 4.0 spec

<a href=[U+180E]onclick=alert()>

MVS, U+180E

Root Causes: Guidance for Controlling Syntax

• Question specifications
• Be careful…
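One cautious approach is to allow only a short, explicit list of white space and flag every other separator, control, or format character. The allowlist and the `unexpected_spacing` name are assumptions for illustration. The general category of U+180E differs between Unicode versions (Zs in older data, Cf in newer), so the sketch checks both.

```python
import unicodedata

# Hypothetical allowlist: the only white space this sketch tolerates in markup.
ALLOWED_SPACE = {" ", "\t", "\n", "\r", "\f"}


def unexpected_spacing(text):
    """Return 'U+XXXX' for separators, controls, and format characters off the allowlist."""
    flagged = []
    for ch in text:
        if ch in ALLOWED_SPACE:
            continue
        if unicodedata.category(ch) in ("Zs", "Zl", "Zp", "Cc", "Cf"):
            flagged.append(f"U+{ord(ch):04X}")
    return flagged
```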

Root Causes: Specifications

1) Character stability
  – IDNA/Nameprep based on Unicode 3.2

2) Designs
  – Specs are carefully designed but not always perfect
    • This could have been a problem
      – “When designing a markup language or data protocol, the use of U+FEFF can be restricted to that of Byte Order Mark. In that case, any U+FEFF occurring in the middle of the file can be ignored, or treated as an error.”
  – HTML 4.01
    • Defines four whitespace characters and explicitly leaves handling other characters up to implementer

Root Causes: Charset Transformations

• Converting between charsets is dangerous
• Mapping tables and algorithms vary across platforms
• Impact: Filter evasion, enable code execution, data loss

Root Causes: Guidance for Charset Transformations

• Avoid if possible
• Use Unicode as the broker
• Beware the PUA mappings
• Transform, case, and normalize prior to validation and redisplay

Root Causes: Charset Mismatches

• Some charset identifiers are ill-defined
• Vendor implementations vary
• User-agents may sniff if confused
• Attackers manipulate behavior
• Impact: Filter evasion, enable code execution

Content-Type: charset=ISO-8859-1

<meta http-equiv="Content-Type" content="text/html; charset=shift_jis">

Attacker-controlled input

Root Causes: Guidance for Charset Mismatches

• Force UTF-8
• Error if uncertain

Unicode Transformations: Agenda

• Unicode crash course
• Root Causes
• Attack Vectors
• Tools

Tools

• Watcher
  – Passive Web-app security testing and auditing
• Unibomber
  – XSS autopwn testing tool

Tools: Watcher – Some of the Passive Checks Included

• Unicode transformation hot-spots
• User-controlled HTML
• Cross-domain issues
• Insecure cookies
• Insecure HTTP/HTTPS transitions
• SSL protocol and certificate issues
• XSS hot-spots
• Flash issues
• Silverlight issues
• Information disclosure

Tools: Watcher - Web-app Security Testing and Auditing

http://websecuritytool.codeplex.com

Tools: Unibomber – runtime XSS testing tool

• Deterministic testing
• Auto-inject payloads
• Unicode transformers
  – < > ‘ “ etc.
• Detect transformations and encoding hotspots

Thank you

Casaba Security
www.casabasecurity.com

Chris Weber
Blog: www.lookout.net
Email: chris@casabasecurity.com
LinkedIn: http://www.linkedin.com/in/chrisweber

Root Causes: The state of International Domain Names

ICANN guidelines v2.0
  – Inclusion-based: deny-all default seems to be the right concept
  – Script limitations: a script can cross many blocks; even with limited script choices there’s plenty to choose from
  – Character limitations: great for domain labels, but subdomain labels are still open to punctuation and syntax spoofing

Attack Vectors: Visual spoofing Vectors

• Non-Unicode attacks
• Confusables
• Invisibles
• Problematic font-rendering
• Manipulating Combining Marks
• Bidi and syntax spoofing

Attack Vectors: Non-Unicode homograph attacks

rn can look like m in certain fonts

www.mullets.com is not www.rnullets.com

Latin m: U+006D
Latin rn: U+0072 U+006E

Are you using mono-width fonts?

0 and O
1 and l
5 and S

Classic long URLs

httploginfacebookintvitationvideomessageid-
h048892r39sessionnfbidcomhomehtmdisbursements

Attack Vectors: Single-script and The Confusables

www.ɑpple.com
All Latin, using Latin small letter alpha ‘ɑ’

www.faϲebook.com
Mixed Latin/Greek, with lunate sigma symbol ‘ϲ’

www.аЬс.com
All Cyrillic ‘abc’
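A rough mixed-script check can be sketched with the standard library by taking the first word of each character name as its script. This is an approximation for illustration, not the Unicode Script property, and `scripts_used` is a hypothetical helper. It catches the mixed Latin/Greek label, but the all-Latin ‘ɑ’ and the all-Cyrillic label pass as single-script, which is why confusables need separate handling.

```python
import unicodedata


def scripts_used(label):
    """Approximate the scripts in a label from the first word of each character name."""
    scripts = set()
    for ch in label:
        if ch.isalpha():
            scripts.add(unicodedata.name(ch, "UNKNOWN").split(" ")[0])
    return scripts


print(scripts_used("fa\u03F2ebook"))  # mixed label: Latin plus Greek lunate sigma
```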

Attack Vectors: IDN homograph attacks

Browsers whitelist .ORG

Others don’t necessarily, but…

www.mozilla.org is not www.mozílla.org

Latin i: U+0069
Latin í: U+00ED
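Converting a hostname to its ASCII (Punycode) form makes the difference between the two names visible. The sketch assumes Python's built-in `idna` codec, which implements the older IDNA 2003 rules. `to_ascii` is a hypothetical helper name.

```python
def to_ascii(hostname):
    """Convert a hostname to its ASCII-compatible form, label by label."""
    return hostname.encode("idna").decode("ascii")


# The look-alike name gains an "xn--" label; the plain ASCII name is unchanged.
print(to_ascii("www.mozilla.org"))
print(to_ascii("www.moz\u00EDlla.org"))
```

Displaying or comparing the ASCII form is one way to keep the two names from being mistaken for each other.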

Attack Vectors: IDN Syntax Spoofing with lookalikes

(This case doesn’t work anymore)

http://www.google.com／path／file.nottrusted.org

FULLWIDTH SOLIDUS, U+FF0F

(Normalized to a U+002F)

http://www.google.com/path/file.nottrusted.org

SOLIDUS, U+002F
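The normalization step can be checked with Python's `unicodedata`. The sketch also shows why a look-alike letter such as Katakana No (U+FF89) is not neutralized the same way: NFKC folds it to the full-size Katakana letter, not to a solidus. `fold_compat` is a hypothetical helper name.

```python
import unicodedata


def fold_compat(text):
    """Apply NFKC compatibility normalization."""
    return unicodedata.normalize("NFKC", text)


# FULLWIDTH SOLIDUS folds to "/", while halfwidth Katakana No stays a letter.
print(fold_compat("\uFF0F"), fold_compat("\uFF89"))
```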

Page 28: Unraveling Unicode: A Bag of Tricks for Bug Hunting...Unicode Crash Course code points encodings categorization normalization binary properties case mapping conversion tables bi-directional

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Non-Unicode attacks

bull Confusables

bull Invisibles

bull Problematic font-rendering

bull Manipulating Combining Marks

bull Bidi and syntax spoofing

wwwcasabasecuritycom

Attack VectorsVisual spoofing Vectors

Black Hat USA - July 2009 copy 2009 Chris Weber

rn can look like m in certain fonts

wwwcasabasecuritycom

Attack VectorsNon-Unicode homograph attacks

wwwmulletscom is not wwwrnulletscom

Latin U+006D

LatinU+0073 U+006E

Black Hat USA - July 2009 copy 2009 Chris Weber

Are you using mono-width fonts

0 and O

1 and l

5 and S

wwwcasabasecuritycom

Attack VectorsNon-Unicode homograph attacks

Black Hat USA - July 2009 copy 2009 Chris Weber

Classic long URLrsquos

httploginfacebookintvitationvideomessageid-

h048892r39sessionnfbidcomhomehtmdisbursements

wwwcasabasecuritycom

Attack VectorsNon-Unicode homograph attacks

Black Hat USA - July 2009 copy 2009 Chris Weber

wwwɑpplecom

All Latin using Latin small letter Alpha lsquoɑrsquo

wwwfaϲebookcom

Mixed LatinGreek with lunate sigma symbol lsquocrsquo

wwwаЬсcom

All Cyrillic lsquoabcrsquo

wwwcasabasecuritycom

Attack VectorsSingle-script and The Confusables

Black Hat USA - July 2009 copy 2009 Chris Weber

Browsers whitelist ORG

wwwcasabasecuritycom

Attack VectorsIDN homograph attacks

Black Hat USA - July 2009 copy 2009 Chris Weber

Others donrsquot necessarily buthellip

wwwcasabasecuritycom

Attack VectorsIDN homograph attacks

Black Hat USA - July 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsIDN homograph attacks

wwwmozillaorg is not wwwmoziacutellaorg

Latin U+0069

LatinU+00ED

Black Hat USA - July 2009 copy 2009 Chris Weber

(This case doesnrsquot work anymore)

wwwcasabasecuritycom

Attack VectorsIDN Syntax Spoofing with lookalikes

httpwwwgooglecompathfilenottrustedorg

FULLWIDTH SOLIDUSU+FF0F

Black Hat USA - July 2009 copy 2009 Chris Weber

(Normalized to a U+002F)

wwwcasabasecuritycom

Attack VectorsIDN Syntax Spoofing with lookalikes

httpwwwgooglecompathfilenottrustedorg

SOLIDUSU+002F

Black Hat USA - July 2009 copy 2009 Chris Weber

(However punctuation not requiredhellip)

wwwcasabasecuritycom

Attack VectorsIDN Syntax Spoofing with lookalikes

httpwwwgooglecomノpathノfilenottrustedorg

Katakana NoU+FF89

Black Hat USA - July 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsThe Invisibles

Black Hat USA - July 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsVisual Spoofing with Bidi Explicit Directional Overrides

Black Hat USA - July 2009 copy 2009 Chris Weber

Commonly occur in charset transformations and even innocuous APIrsquos

Impact Filter evasion Enable code execution

When σ becomes s

U+03C3 GREEK SMALL LETTER SIGMA

When prime becomes

U+2032 PRIME

wwwcasabasecuritycom

Root CausesBest-fit mappings

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Scrutinize charactercharset manipulation APIrsquos

bull Use EncoderFallback with SystemTextEncoding

bull Set WC_NO_BEST_FIT_CHARS flag with WideCharToMultiByte()

bull Use Unicode end-to-end

wwwcasabasecuritycom

Root CausesGuidance for Best-Fit mappings

Black Hat USA - July 2009 copy 2009 Chris Weber

bull A popular social networking site in 2008

bull Implemented complex filtering logic to prevent XSS

ndash Attack Filter evasion code execution

ndash Exploit Bypass filtering logic with best-fit mappings to leverage cross-site scripting

ndash Root Cause best-fit mappings

wwwcasabasecuritycom

Case Study Social NetworkingBest-fit mappings

Black Hat USA - July 2009 copy 2009 Chris Weber

-moz-binding()

was not allowed buthellip

-[U+ff4d]oz-binding()

would best-fit map

wwwcasabasecuritycom

Case Study Social NetworkingBest-fit mappings

Black Hat USA - July 2009 copy 2009 Chris Weber

Normalizing strings after validation is dangerous

Impact Filter evasion Enable code execution

wwwcasabasecuritycom

Root CausesNormalization

Black Hat USA - July 2009 copy 2009 Chris Weber

İ becomes I +

wwwcasabasecuritycom

Root CausesNormalization

U+0130 U+0049 U+0307

Black Hat USA - July 2009 copy 2009 Chris Weber

But are there dangerous characters

You bethellip with NFKC and NFKD you could control HTML or other parsing

﹤ becomes lt

wwwcasabasecuritycom

Root CausesNormalization

U+FE64 U+003C

Black Hat USA - July 2009 copy 2009 Chris Weber

﹤ becomes lt

toNFKC(ldquo﹤scriptgtrdquo) = ldquoltscriptgtrdquo

wwwcasabasecuritycom

Root CausesNormalization

U+FE64 U+003C

Black Hat USA - July 2009 copy 2009 Chris Weber

Normalize strings before validation

NFKC first defense against Visual spoofing

wwwcasabasecuritycom

Root CausesGuidance for Normalization

Black Hat USA - July 2009 copy 2009 Chris Weber

Non-shortest or overlong UTF-8

Impact Filter evasion Enable code execution

Application gets C0A7

OSFramework sees 27

Database gets

wwwcasabasecuritycom

Root CausesNon-shortest form UTF-8

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Unicode specification forbids

ndash Generation of non-shortest form

ndash Interpretation of non-shortest form for BMP

bull Validate UTF-8 encoding (throw on error)

wwwcasabasecuritycom

Root CausesGuidance for Non-shortest form UTF-8

Black Hat USA - July 2009 copy 2009 Chris Weber

How many ways can you say

wwwcasabasecuritycom

Attack VectorsDirectory traversal

Black Hat USA - July 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack Vectors

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Unassigned code points

ndash U+2073

bull Illegal code points

ndash Half a surrogate pair

bull Code points with special meaning

ndash U+FEFF is the BOM

bull Impact Filter evasion Enable code execution

wwwcasabasecuritycom

Root CausesHandling the Unexpected

Black Hat USA - July 2009 copy 2009 Chris Weber

Over-consuming ill-formed byte sequences

Big problem with MBCS lead bytes

lt41 C2 3E 41gt becomes

lt41 41gt

wwwcasabasecuritycom

Root CausesHandling the Unexpected Over-consumption

Black Hat USA - July 2009 copy 2009 Chris Weber

ltimg src=[0xC2]gt onerror=alert(1)ltbr gt

becomes

ltimg src=gt onerror=alert(1)ltbr gt

wwwcasabasecuritycom

Root CausesHandling the Unexpected Over-consumption

Black Hat USA - July 2009 copy 2009 Chris Weber

Correcting insecurely rather than failing

ndash Substituting a lsquorsquo or a lsquorsquo would be bad

wwwcasabasecuritycom

Root CausesHandling the Unexpected Character-substitution

Root Causes: Handling the Unexpected - Character-deletion

"deletion of noncharacters" (UTR-36)

<scr[U+FEFF]ipt> becomes <script>
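The deletion problem can be sketched in Python. Everything here is illustrative: `naive_filter` stands in for a blacklist check, `downstream_strip` stands in for a later component that silently drops U+FEFF, and `strict_filter` is one possible way to fail instead.

```python
BOM = "\uFEFF"

def naive_filter(text: str) -> str:
    """Hypothetical blacklist check applied before any character deletion."""
    if "<script" in text.lower():
        raise ValueError("blocked")
    return text

def downstream_strip(text: str) -> str:
    """Stands in for a later component that silently deletes U+FEFF."""
    return text.replace(BOM, "")

def strict_filter(text: str) -> str:
    """Treat U+FEFF anywhere after the first position as an error."""
    if BOM in text[1:]:
        raise ValueError("unexpected U+FEFF inside text")
    return naive_filter(text)

payload = "<scr\uFEFFipt>"
print(downstream_strip(naive_filter(payload)))  # prints <script>
```

The naive filter never sees the string `<script`, yet the output contains it; the strict variant rejects the same payload.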

Root Causes: Solutions for Handling the Unexpected

• Fail or error
• Use U+FFFD instead
  - A common alternative is ‘’ which can be safe
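Both options can be tried with Python's standard decoder, using the ill-formed sequence 41 C2 3E 41 from the over-consumption slide. The helper names are assumptions for this sketch.

```python
ill_formed = bytes([0x41, 0xC2, 0x3E, 0x41])  # 'A', lone lead byte, '>', 'A'

def decode_or_fail(raw: bytes) -> str:
    """Option 1: fail. Raises UnicodeDecodeError on ill-formed input."""
    return raw.decode("utf-8")

def decode_with_replacement(raw: bytes) -> str:
    """Option 2: substitute U+FFFD for the ill-formed byte only."""
    return raw.decode("utf-8", errors="replace")

print(decode_with_replacement(ill_formed) == "A\uFFFD>A")  # prints True
```

Only the lone lead byte is replaced here; the 3E that follows it survives as `>`, so nothing is over-consumed.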

Attack Vectors: Filter evasion

• Bypass filters, WAFs, NIDS, and validation
• Exploit delivery techniques
  - E.g. cross-site scripting (buffer overflow of the Web)

Case Study: Apple and Mozilla

Safari and Firefox BOM consumption
  - Attack: filter evasion, code execution
  - Exploit: bypass filtering logic with specially crafted strings to leverage cross-site scripting
  - Root cause: character deletion

<a href="java[U+FEFF]script:alert('XSS')>

Can be nastier:

<a h[U+FEFF]ref="java[U+FEFF]script:al[U+FEFF]ert('XSS')>

A Closer Look: The BOM

BOM: U+FEFF

Root Causes: Casing

• Attackers manipulate casing operations to inject otherwise prohibited characters
• Casing can multiply the buffer sizes needed
• Impact: filter evasion, enable code execution

toLower("İ") == "i"
toLower("scrİpt") == "script"

len(x) != len(toLower(x))

Root Causes: Guidance for Casing

• Perform casing operations before validation
• Leverage existing frameworks and APIs
  - ICU, .NET

Root Causes: Buffer Overflows

• Incorrect assumptions about string sizes (chars vs. bytes)
• Improper width calculations
• Impact: enable code execution

Root Causes: Buffer Overflows

Casing: maximum expansion factors

Operation   UTF         Factor   Sample
Lower       8           1.5X     Ⱥ U+023A
            16, 32      1X       A U+0041
Upper       8, 16, 32   3X       ΐ U+0390

Source: Unicode Technical Report 36
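The table's factors can be reproduced with a short Python sketch. The `expansion` helper is an assumption made for this example; the sample characters are the ones listed in the table, and Python's `str.lower` and `str.upper` apply full Unicode case mappings.

```python
def expansion(before: str, after: str, encoding: str) -> float:
    """Ratio of encoded sizes after and before a transformation."""
    return len(after.encode(encoding)) / len(before.encode(encoding))

lower_sample = "\u023A"  # LATIN CAPITAL LETTER A WITH STROKE
upper_sample = "\u0390"  # GREEK SMALL LETTER IOTA WITH DIALYTIKA AND TONOS

print(expansion(lower_sample, lower_sample.lower(), "utf-8"))      # 1.5
print(expansion(upper_sample, upper_sample.upper(), "utf-8"))      # 3.0
print(expansion(upper_sample, upper_sample.upper(), "utf-16-le"))  # 3.0

# Case conversion can also change the number of code points.
dotted_i = "\u0130"  # LATIN CAPITAL LETTER I WITH DOT ABOVE
print(len(dotted_i), len(dotted_i.lower()))  # 1 2
```

A buffer sized from the input length is therefore too small for the converted string in each of these cases.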

Root Causes: Buffer Overflows

Normalization: maximum expansion factors

Operation   UTF      Factor   Sample
NFC         8        3X       𝅘𝅥𝅮 U+1D160
            16, 32   3X       שּׁ U+FB2C
NFD         8        3X       ΐ U+0390
            16, 32   4X       ᾂ U+1F82
NFKC/NFKD   8        11X      ﷺ U+FDFA
            16, 32   18X      ﷺ U+FDFA

Source: Unicode Technical Report 36
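The normalization factors can be checked the same way with the standard `unicodedata` module. The helper name `normalization_expansion` is hypothetical.

```python
import unicodedata

def normalization_expansion(form: str, sample: str, encoding: str) -> float:
    """Encoded size after normalization divided by encoded size before."""
    normalized = unicodedata.normalize(form, sample)
    return len(normalized.encode(encoding)) / len(sample.encode(encoding))

print(normalization_expansion("NFKC", "\uFDFA", "utf-8"))      # 11.0
print(normalization_expansion("NFKC", "\uFDFA", "utf-16-le"))  # 18.0
print(normalization_expansion("NFD", "\u1F82", "utf-16-le"))   # 4.0
```

Code that normalizes into a fixed-size buffer needs to allow for the worst-case factor of the form and encoding it uses, not the input length.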

Root Causes: Guidance for Buffer Overflows

• Know the difference between bytes and chars
• Secure coding
• Leverage existing frameworks and APIs
  - ICU, .NET
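To make the bytes-versus-chars distinction concrete, this sketch measures one string three ways. The `sizes` helper and the sample string are assumptions chosen for illustration.

```python
def sizes(text: str) -> dict:
    """Report the length of a string in three different units."""
    return {
        "code_points": len(text),
        "utf8_bytes": len(text.encode("utf-8")),
        "utf16_units": len(text.encode("utf-16-le")) // 2,
    }

# 'a' (1 byte in UTF-8), U+00E9 (2 bytes), U+1D160 (4 bytes, a surrogate pair in UTF-16)
print(sizes("a\u00E9\U0001D160"))
# {'code_points': 3, 'utf8_bytes': 7, 'utf16_units': 4}
```

Three "characters" need seven bytes in UTF-8 and four code units in UTF-16, so a size computed in one unit cannot be used to allocate in another.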

Root Causes: Controlling Syntax

• White space and line breaks
  - E.g. when U+180E acts like U+0020
• Quotation marks
• Impact: filter evasion, enable code execution

Attacks and Exploits: Controlling syntax

• Manipulate HTML parsers and JavaScript interpreters
• Control protocols

Case Study: Opera

• Unicode formatter characters exploited for XSS
  - Damage: filter evasion, controlling syntax
  - Exploit: bypass filtering logic with specially crafted characters to leverage cross-site scripting
  - Root cause: interpreting "white space"
  - A problem with the HTML 4.0 spec

<a href=[U+180E]onclick=alert()>

MVS: U+180E
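One way a filter could look for such characters is sketched below, assuming it wants to flag any separator-like character outside an explicit allow list. The allow list (the four white space characters named by HTML 4.01, plus CR and LF) and the category set are choices made for this example, not rules taken from the talk.

```python
import unicodedata

# Space, tab, form feed, zero-width space, plus CR and LF line breaks.
HTML4_WHITESPACE = {"\u0020", "\u0009", "\u000C", "\u200B", "\u000D", "\u000A"}

# General categories covering separators, format characters, and controls.
SEPARATOR_LIKE = {"Zs", "Zl", "Zp", "Cf", "Cc"}

def suspicious_separators(markup: str) -> list:
    """Return (index, code point) for separator-like characters not on the allow list."""
    hits = []
    for index, ch in enumerate(markup):
        if ch in HTML4_WHITESPACE:
            continue
        if unicodedata.category(ch) in SEPARATOR_LIKE:
            hits.append((index, f"U+{ord(ch):04X}"))
    return hits

print(suspicious_separators("<a href=\u180Eonclick=alert()>"))  # [(8, 'U+180E')]
```

The check covers both a space separator and a format character because the general category of U+180E depends on the Unicode version in use.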

Root Causes: Guidance for Controlling Syntax

• Question specifications
• Be careful…

Root Causes: Specifications

1) Character stability
  - IDNA/Nameprep based on Unicode 3.2
2) Designs
  - Specs are carefully designed but not always perfect
• This could have been a problem
  - "When designing a markup language or data protocol, the use of U+FEFF can be restricted to that of Byte Order Mark. In that case, any U+FEFF occurring in the middle of the file can be ignored, or treated as an error."
  - HTML 4.01
    • Defines four whitespace characters and explicitly leaves handling other characters up to the implementer

Root Causes: Charset Transformations

• Converting between charsets is dangerous
• Mapping tables and algorithms vary across platforms
• Impact: filter evasion, enable code execution, data loss

Root Causes: Guidance for Charset Transformations

• Avoid if possible
• Use Unicode as the broker
• Beware the PUA mappings
• Transform, case, and normalize prior to validation and redisplay

Root Causes: Charset Mismatches

• Some charset identifiers are ill-defined
• Vendor implementations vary
• User-agents may sniff if confused
• Attackers manipulate behavior
• Impact: filter evasion, enable code execution

Content-Type: charset=ISO-8859-1
<meta http-equiv="Content-Type" content="text/html; charset=shift_jis">
Attacker-controlled input

Root Causes: Guidance for Charset Mismatches

• Force UTF-8
• Error if uncertain
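A rough sketch of "force UTF-8, error if uncertain", assuming a server-side check that compares the HTTP header with any charset declared in the markup. The function names and the regular expression are hypothetical and deliberately simplistic; a real implementation would parse the header and the document properly.

```python
import re

# Matches charset=VALUE with an optional opening quote (illustrative only).
CHARSET_RE = re.compile(r"charset\s*=\s*[\"']?([A-Za-z0-9._:-]+)", re.IGNORECASE)

def declared_charset(text: str):
    """Return the first declared charset in lowercase, or None."""
    match = CHARSET_RE.search(text)
    return match.group(1).lower() if match else None

def require_utf8(content_type_header: str, html_head: str) -> str:
    """Raise unless the header says UTF-8 and the markup does not disagree."""
    header_charset = declared_charset(content_type_header)
    meta_charset = declared_charset(html_head)
    if header_charset != "utf-8":
        raise ValueError(f"header charset is {header_charset!r}, expected 'utf-8'")
    if meta_charset is not None and meta_charset != header_charset:
        raise ValueError(f"meta charset {meta_charset!r} conflicts with header")
    return header_charset

meta = '<meta http-equiv="Content-Type" content="text/html; charset=shift_jis">'
try:
    require_utf8("text/html; charset=ISO-8859-1", meta)
    mismatch_rejected = False
except ValueError:
    mismatch_rejected = True
print(mismatch_rejected)  # prints True
```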

Unicode Transformations: Agenda

• Unicode crash course
• Root Causes
• Attack Vectors
• Tools

Tools

• Watcher
  - Passive Web-app security testing and auditing
• Unibomber
  - XSS autopwn testing tool

Tools: Watcher - Some of the Passive Checks Included

• Unicode transformation hot-spots
• User-controlled HTML
• Cross-domain issues
• Insecure cookies
• Insecure HTTP/HTTPS transitions
• SSL protocol and certificate issues
• XSS hot-spots
• Flash issues
• Silverlight issues
• Information disclosure

Tools: Watcher - Web-app Security Testing and Auditing

http://websecuritytool.codeplex.com

Tools: Unibomber - runtime XSS testing tool

• Deterministic testing
• Auto-inject payloads
• Unicode transformers
  - < > ' " etc.
• Detect transformations and encoding hotspots

Thank you

Casaba Security
www.casabasecurity.com

Chris Weber
Blog: www.lookout.net
Email: chris@casabasecurity.com
LinkedIn: http://www.linkedin.com/in/chrisweber


Attack Vectors: Non-Unicode homograph attacks

"rn" can look like "m" in certain fonts
www.mullets.com is not www.rnullets.com
  Latin m: U+006D
  Latin r, n: U+0072 U+006E

Are you using mono-width fonts?
  0 and O
  1 and l
  5 and S

Classic long URLs
httploginfacebookintvitationvideomessageid-h048892r39sessionnfbidcomhomehtmdisbursements

Attack Vectors: Single-script and The Confusables

www.ɑpple.com
  All Latin, using Latin small letter alpha 'ɑ'
www.faϲebook.com
  Mixed Latin/Greek, with lunate sigma symbol 'ϲ'
www.аЬс.com
  All Cyrillic 'аЬс'
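A crude script-mixing check can be sketched with the standard `unicodedata` module. Taking the first word of each character name as its script is a heuristic assumed for this example (the module does not expose the Unicode Script property), and `scripts_used` is a hypothetical helper.

```python
import unicodedata

def scripts_used(label: str) -> set:
    """Approximate the scripts in a label from the first word of each letter's name."""
    scripts = set()
    for ch in label:
        if ch.isalpha():
            scripts.add(unicodedata.name(ch, "UNKNOWN").split(" ")[0])
    return scripts

print(sorted(scripts_used("fa\u03F2ebook")))      # ['GREEK', 'LATIN']
print(sorted(scripts_used("\u0251pple")))         # ['LATIN']
print(sorted(scripts_used("\u0430\u042C\u0441"))) # ['CYRILLIC']
```

Only the mixed Latin/Greek label is flagged. The all-Latin and all-Cyrillic lookalikes pass a mixing check, which is why single-script confusables need separate handling.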

Attack Vectors: IDN homograph attacks

Browsers whitelist .ORG
Others don't necessarily, but…

www.mozilla.org is not www.mozílla.org
  Latin i: U+0069
  Latin í: U+00ED
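For illustration, converting both hostnames to their ASCII (Punycode) form makes the difference visible. This sketch uses Python's built-in `idna` codec, which implements the older IDNA 2003 rules; `to_ascii_hostname` is a hypothetical helper name.

```python
def to_ascii_hostname(hostname: str) -> str:
    """Convert a hostname to its ASCII-compatible form using the built-in idna codec."""
    return hostname.encode("idna").decode("ascii")

genuine = to_ascii_hostname("www.mozilla.org")
lookalike = to_ascii_hostname("www.moz\u00EDlla.org")

print(genuine)
print(lookalike)
print(genuine == lookalike)  # prints False
```

The lookalike label comes out with an `xn--` prefix, so a comparison or display based on the ASCII form does not confuse the two names.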

Attack Vectors: IDN Syntax Spoofing with lookalikes

(This case doesn't work anymore)
http://www.google.com／path／file.nottrusted.org
FULLWIDTH SOLIDUS: U+FF0F

(Normalized to a U+002F)
http://www.google.com/path/file.nottrusted.org
SOLIDUS: U+002F

(However, punctuation is not required…)
http://www.google.comノpathノfile.nottrusted.org
Katakana No: U+FF89

Attack Vectors: The Invisibles

Attack Vectors: Visual Spoofing with Bidi Explicit Directional Overrides

Root Causes: Best-fit mappings

Commonly occur in charset transformations and even innocuous APIs
Impact: filter evasion, enable code execution

When σ becomes s
  U+03C3 GREEK SMALL LETTER SIGMA
When ′ becomes '
  U+2032 PRIME

Root Causes: Guidance for Best-Fit mappings

• Scrutinize character/charset manipulation APIs
• Use EncoderFallback with System.Text.Encoding
• Set WC_NO_BEST_FIT_CHARS flag with WideCharToMultiByte()
• Use Unicode end-to-end
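The guidance above names .NET and Win32 APIs. As a rough analogue in Python, and not a substitute for those APIs, encoding with `errors="strict"` fails on characters the target charset cannot represent instead of substituting a lookalike. The `to_legacy` helper is hypothetical.

```python
def to_legacy(text: str, charset: str) -> bytes:
    """Encode to a legacy charset, failing on anything it cannot represent."""
    return text.encode(charset, errors="strict")

def is_representable(text: str, charset: str) -> bool:
    """True when the text encodes without any substitution."""
    try:
        to_legacy(text, charset)
        return True
    except UnicodeEncodeError:
        return False

print(is_representable("\u03C3", "ascii"))   # GREEK SMALL LETTER SIGMA: False
print(is_representable("\u2032", "cp1252"))  # PRIME: False
print(is_representable("s'", "cp1252"))      # True
```

Failing here keeps σ from quietly turning into s, and ′ from turning into an apostrophe, after validation has already run.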

Case Study: Social Networking - Best-fit mappings

• A popular social networking site in 2008
• Implemented complex filtering logic to prevent XSS
  - Attack: filter evasion, code execution
  - Exploit: bypass filtering logic with best-fit mappings to leverage cross-site scripting
  - Root cause: best-fit mappings

-moz-binding()
was not allowed, but…
-[U+FF4D]oz-binding()
would best-fit map

Root Causes: Normalization

Normalizing strings after validation is dangerous
Impact: filter evasion, enable code execution

İ (U+0130) becomes I (U+0049) + combining dot above (U+0307)

But are there dangerous characters?
You bet… with NFKC and NFKD you could control HTML or other parsing

﹤ (U+FE64) becomes < (U+003C)

toNFKC("﹤script>") = "<script>"

Page 30: Unraveling Unicode: A Bag of Tricks for Bug Hunting...Unicode Crash Course code points encodings categorization normalization binary properties case mapping conversion tables bi-directional

Black Hat USA - July 2009 copy 2009 Chris Weber

Are you using mono-width fonts

0 and O

1 and l

5 and S

wwwcasabasecuritycom

Attack VectorsNon-Unicode homograph attacks

Black Hat USA - July 2009 copy 2009 Chris Weber

Classic long URLrsquos

httploginfacebookintvitationvideomessageid-

h048892r39sessionnfbidcomhomehtmdisbursements

wwwcasabasecuritycom

Attack VectorsNon-Unicode homograph attacks

Black Hat USA - July 2009 copy 2009 Chris Weber

wwwɑpplecom

All Latin using Latin small letter Alpha lsquoɑrsquo

wwwfaϲebookcom

Mixed LatinGreek with lunate sigma symbol lsquocrsquo

wwwаЬсcom

All Cyrillic lsquoabcrsquo

wwwcasabasecuritycom

Attack VectorsSingle-script and The Confusables

Black Hat USA - July 2009 copy 2009 Chris Weber

Browsers whitelist ORG

wwwcasabasecuritycom

Attack VectorsIDN homograph attacks

Black Hat USA - July 2009 copy 2009 Chris Weber

Others donrsquot necessarily buthellip

wwwcasabasecuritycom

Attack VectorsIDN homograph attacks

Black Hat USA - July 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsIDN homograph attacks

wwwmozillaorg is not wwwmoziacutellaorg

Latin U+0069

LatinU+00ED

Black Hat USA - July 2009 copy 2009 Chris Weber

(This case doesnrsquot work anymore)

wwwcasabasecuritycom

Attack VectorsIDN Syntax Spoofing with lookalikes

httpwwwgooglecompathfilenottrustedorg

FULLWIDTH SOLIDUSU+FF0F

Black Hat USA - July 2009 copy 2009 Chris Weber

(Normalized to a U+002F)

wwwcasabasecuritycom

Attack VectorsIDN Syntax Spoofing with lookalikes

httpwwwgooglecompathfilenottrustedorg

SOLIDUSU+002F

Black Hat USA - July 2009 copy 2009 Chris Weber

(However punctuation not requiredhellip)

wwwcasabasecuritycom

Attack VectorsIDN Syntax Spoofing with lookalikes

httpwwwgooglecomノpathノfilenottrustedorg

Katakana NoU+FF89

Black Hat USA - July 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsThe Invisibles

Black Hat USA - July 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsVisual Spoofing with Bidi Explicit Directional Overrides

Black Hat USA - July 2009 copy 2009 Chris Weber

Commonly occur in charset transformations and even innocuous APIrsquos

Impact Filter evasion Enable code execution

When σ becomes s

U+03C3 GREEK SMALL LETTER SIGMA

When prime becomes

U+2032 PRIME

wwwcasabasecuritycom

Root CausesBest-fit mappings

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Scrutinize charactercharset manipulation APIrsquos

bull Use EncoderFallback with SystemTextEncoding

bull Set WC_NO_BEST_FIT_CHARS flag with WideCharToMultiByte()

bull Use Unicode end-to-end

wwwcasabasecuritycom

Root CausesGuidance for Best-Fit mappings

Black Hat USA - July 2009 copy 2009 Chris Weber

bull A popular social networking site in 2008

bull Implemented complex filtering logic to prevent XSS

ndash Attack Filter evasion code execution

ndash Exploit Bypass filtering logic with best-fit mappings to leverage cross-site scripting

ndash Root Cause best-fit mappings

wwwcasabasecuritycom

Case Study Social NetworkingBest-fit mappings

Black Hat USA - July 2009 copy 2009 Chris Weber

-moz-binding()

was not allowed buthellip

-[U+ff4d]oz-binding()

would best-fit map

wwwcasabasecuritycom

Case Study Social NetworkingBest-fit mappings

Black Hat USA - July 2009 copy 2009 Chris Weber

Normalizing strings after validation is dangerous

Impact Filter evasion Enable code execution

wwwcasabasecuritycom

Root CausesNormalization

Black Hat USA - July 2009 copy 2009 Chris Weber

İ becomes I +

wwwcasabasecuritycom

Root CausesNormalization

U+0130 U+0049 U+0307

Black Hat USA - July 2009 copy 2009 Chris Weber

But are there dangerous characters

You bethellip with NFKC and NFKD you could control HTML or other parsing

﹤ becomes lt

wwwcasabasecuritycom

Root CausesNormalization

U+FE64 U+003C

Black Hat USA - July 2009 copy 2009 Chris Weber

﹤ becomes lt

toNFKC(ldquo﹤scriptgtrdquo) = ldquoltscriptgtrdquo

wwwcasabasecuritycom

Root CausesNormalization

U+FE64 U+003C

Black Hat USA - July 2009 copy 2009 Chris Weber

Normalize strings before validation

NFKC first defense against Visual spoofing

wwwcasabasecuritycom

Root CausesGuidance for Normalization

Black Hat USA - July 2009 copy 2009 Chris Weber

Non-shortest or overlong UTF-8

Impact Filter evasion Enable code execution

Application gets C0A7

OSFramework sees 27

Database gets

wwwcasabasecuritycom

Root CausesNon-shortest form UTF-8

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Unicode specification forbids

ndash Generation of non-shortest form

ndash Interpretation of non-shortest form for BMP

bull Validate UTF-8 encoding (throw on error)

wwwcasabasecuritycom

Root CausesGuidance for Non-shortest form UTF-8

Black Hat USA - July 2009 copy 2009 Chris Weber

How many ways can you say

wwwcasabasecuritycom

Attack VectorsDirectory traversal

Black Hat USA - July 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack Vectors

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Unassigned code points

ndash U+2073

bull Illegal code points

ndash Half a surrogate pair

bull Code points with special meaning

ndash U+FEFF is the BOM

bull Impact Filter evasion Enable code execution

wwwcasabasecuritycom

Root CausesHandling the Unexpected

Black Hat USA - July 2009 copy 2009 Chris Weber

Over-consuming ill-formed byte sequences

Big problem with MBCS lead bytes

lt41 C2 3E 41gt becomes

lt41 41gt

wwwcasabasecuritycom

Root CausesHandling the Unexpected Over-consumption

Black Hat USA - July 2009 copy 2009 Chris Weber

ltimg src=[0xC2]gt onerror=alert(1)ltbr gt

becomes

ltimg src=gt onerror=alert(1)ltbr gt

wwwcasabasecuritycom

Root CausesHandling the Unexpected Over-consumption

Black Hat USA - July 2009 copy 2009 Chris Weber

Correcting insecurely rather than failing

ndash Substituting a lsquorsquo or a lsquorsquo would be bad

wwwcasabasecuritycom

Root CausesHandling the Unexpected Character-substitution

Black Hat USA - July 2009 copy 2009 Chris Weber

ldquodeletion of noncharactersrdquo (UTR-36)

wwwcasabasecuritycom

Root CausesHandling the Unexpected Character-deletion

Black Hat USA - July 2009 copy 2009 Chris Weber

ltscr[U+FEFF]iptgt becomes ltscriptgt

wwwcasabasecuritycom

Root CausesHandling the Unexpected Character-deletion

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Fail or error

bull Use U+FFFD instead

ndash A common alternative is lsquorsquo which can be safe

wwwcasabasecuritycom

Root CausesSolutions for Handling the Unexpected

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Bypass filters WAFrsquos NIDS and validation

bull Exploit delivery techniques

ndash Eg Cross-site scripting (buffer overflow of the Web)

wwwcasabasecuritycom

Attack VectorsFilter evasion

Black Hat USA - July 2009 copy 2009 Chris Weber

Safari and Firefox BOM consumptionndash Attack Filter evasion code execution

ndash Exploit Bypass filtering logic with specially crafted strings to leverage cross-site scripting

ndash Root Cause Character deletion

lta href=ldquojava[U+FEFF]scriptalert(bdquoXSS‟)gt

Can be nastier

lta h[U+FEFF]ref=ldquojava[U+FEFF]scriptal[U+FEFF]ert(bdquoXSS‟)gt

wwwcasabasecuritycom

Case Study Apple and Mozilla

Black Hat USA - July 2009 copy 2009 Chris Weberwwwcasabasecuritycom

A Closer Look The BOM

BOMU+FEFF

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Attackers manipulate casing operations to inject otherwise prohibited characters

bull Casing can multiply the buffer sizes needed

bull Impact Filter evasion Enable code execution

wwwcasabasecuritycom

Root CausesCasing

Black Hat USA - July 2009 copy 2009 Chris Weber

toLower(ldquoİrdquo) == ldquoirdquo

toLower(ldquoscrİptrdquo) == ldquoscriptrdquo

wwwcasabasecuritycom

Root CausesCasing

Black Hat USA - July 2009 copy 2009 Chris Weber

len(x) = len(toLower(x))

wwwcasabasecuritycom

Root CausesCasing

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Perform casing operations before validation

bull Leverage existing frameworks and APIrsquos

ndash ICU Net

wwwcasabasecuritycom

Root CausesGuidance for Casing

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Incorrect assumptions about string sizes (chars vs bytes)

bull Improper width calculations

bull Impact Enable code execution

wwwcasabasecuritycom

Root CausesBuffer Overflows

Black Hat USA - July 2009 copy 2009 Chris Weber

Casing - maximum expansion factors

wwwcasabasecuritycom

Root CausesBuffer Overflows

Operation UTF Factor Sample

Lower 8 15 Ⱥ U+023A

16 32 1 A U+0041

Upper 8 16 32 3 ΐ U+0390Source Unicode Technical Report 36

Black Hat USA - July 2009 copy 2009 Chris Weber

Normalization- maximum expansion factors

wwwcasabasecuritycom

Root CausesBuffer Overflows

Operation UTF Factor Sample

NFC8 3X 119136 U+1D160

16 32 3X ש U+FB2C

NFD8 3X ΐ U+0390

16 32 4X ᾂ U+1F82

NFKCNFKD8 11X

ملسو هيلع هللا ىلص U+FDFA16 32 18X

Source Unicode Technical Report 36

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Know the difference between bytes and chars

bull Secure coding

bull Leverage existing frameworks and APIrsquos

ndash ICU Net

wwwcasabasecuritycom

Root CausesGuidance for Buffer Overflows

Black Hat USA - July 2009 copy 2009 Chris Weber

bull White space and line breaks

ndash Eg when U+180E acts like U+0020

bull Quotation marks

bull Impact Filter evasion Enable code execution

wwwcasabasecuritycom

Root CausesControlling Syntax

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Manipulate HTML parsers and javascriptinterpreters

bull Control protocols

wwwcasabasecuritycom

Attacks and ExploitsControlling syntax

Case Study: Opera

• Unicode formatter characters exploited for XSS
  - Damage: Filter evasion, controlling syntax
  - Exploit: Bypass filtering logic with specially crafted characters to leverage cross-site scripting
  - Root Cause: Interpreting "white space"
  - A problem with HTML 4.0 spec

Case Study: Opera

<a href=[U+180E]onclick=alert()>

Case Study: Opera

MVS (MONGOLIAN VOWEL SEPARATOR), U+180E

Root Causes: Guidance for Controlling Syntax

• Question specifications
• Be careful…

Root Causes: Specifications

1) Character stability
   - IDNA/Nameprep based on Unicode 3.2
2) Designs
   - Specs are carefully designed but not always perfect
     • This could have been a problem
       - "When designing a markup language or data protocol, the use of U+FEFF can be restricted to that of Byte Order Mark. In that case, any U+FEFF occurring in the middle of the file can be ignored, or treated as an error."
   - HTML 4.01
     • Defines four whitespace characters and explicitly leaves handling other characters up to implementer
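One way to act on the point about unspecified whitespace is to flag any space-like or format character that falls outside the explicitly defined set. The sketch below assumes the four white space characters HTML 4.01 names (space, tab, form feed, zero-width space) plus CR and LF as the allow-list; the names `HTML401_WHITESPACE`, `SUSPECT_CATEGORIES` and `suspicious_separators` are hypothetical.

```python
import unicodedata

# The four white space characters HTML 4.01 names explicitly, plus CR and LF.
HTML401_WHITESPACE = {"\u0020", "\u0009", "\u000c", "\u200b", "\r", "\n"}

# General categories for separators and format controls.
SUSPECT_CATEGORIES = {"Zs", "Zl", "Zp", "Cf"}


def suspicious_separators(text: str) -> list:
    """Return (index, 'U+XXXX') for space-like or format characters
    that are not in the explicitly allowed set."""
    found = []
    for index, ch in enumerate(text):
        if ch in HTML401_WHITESPACE:
            continue
        if unicodedata.category(ch) in SUSPECT_CATEGORIES:
            found.append((index, "U+%04X" % ord(ch)))
    return found


print(suspicious_separators("<a href=x\u180eonclick=alert()>"))
```

U+180E was reclassified from a space separator to a format character in later Unicode versions, which is why the sketch checks both categories instead of relying on a single "is whitespace" test.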

Root Causes: Charset Transformations

• Converting between charsets is dangerous
• Mapping tables and algorithms vary across platforms
• Impact: Filter evasion, Enable code execution, Data-loss

Root Causes: Guidance for Charset Transformations

• Avoid if possible
• Use Unicode as the broker
• Beware the PUA mappings
• Transform, case, and normalize prior to validation and redisplay

Root Causes: Charset Mismatches

• Some charset identifiers are ill-defined
• Vendor implementations vary
• User-agents may sniff if confused
• Attackers manipulate behavior
• Impact: Filter evasion, Enable code execution

Root Causes: Charset Mismatches

Content-Type: charset=ISO-8859-1

<meta http-equiv="Content-Type" content="text/html; charset=shift_jis">

Attacker-controlled input

Root Causes: Guidance for Charset Mismatches

• Force UTF-8
• Error if uncertain
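The "force UTF-8, error if uncertain" guidance could be enforced on outgoing pages with a check like the following. This is only a sketch: the regular expressions are a simplification of real header and HTML parsing, and the function names `declared_charsets` and `require_utf8` are hypothetical.

```python
import re

# Matches charset=LABEL, with or without a quote before the label.
CHARSET_RE = re.compile(r"charset\s*=\s*[\"']?([A-Za-z0-9._:-]+)", re.IGNORECASE)


def declared_charsets(content_type_header: str, html: str) -> list:
    """Collect every charset label declared in the header and in <meta> tags."""
    labels = CHARSET_RE.findall(content_type_header)
    for meta in re.findall(r"<meta[^>]*>", html, re.IGNORECASE):
        labels.extend(CHARSET_RE.findall(meta))
    return [label.lower() for label in labels]


def require_utf8(content_type_header: str, html: str) -> None:
    """Raise ValueError unless at least one declaration exists and all say UTF-8."""
    labels = declared_charsets(content_type_header, html)
    if not labels or any(label != "utf-8" for label in labels):
        raise ValueError("charset declarations missing or inconsistent: %r" % labels)


header = "text/html; charset=ISO-8859-1"
page = '<meta http-equiv="Content-Type" content="text/html; charset=shift_jis">'
print(declared_charsets(header, page))
```

Calling `require_utf8(header, page)` on the mismatched pair from the slide raises, which is the "error if uncertain" behavior.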

Unicode Transformations: Agenda

• Unicode crash course
• Root Causes
• Attack Vectors
• Tools

Tools

• Watcher
  - Passive Web-app security testing and auditing
• Unibomber
  - XSS autopwn testing tool

Tools: Watcher - Some of the Passive Checks Included

• Unicode transformation hot-spots
• User-controlled HTML
• Cross-domain issues
• Insecure cookies
• Insecure HTTP/HTTPS transitions
• SSL protocol and certificate issues
• XSS hot-spots
• Flash issues
• Silverlight issues
• Information disclosure

Tools: Watcher - Web-app Security Testing and Auditing

http://websecuritytool.codeplex.com

Tools: Unibomber - runtime XSS testing tool

• Deterministic testing
• Auto-inject payloads
• Unicode transformers
  - < > ' " etc.
• Detect transformations and encoding hotspots

Thank you

Casaba Security
www.casabasecurity.com

Chris Weber
Blog: www.lookout.net
Email: chris@casabasecurity.com
LinkedIn: http://www.linkedin.com/in/chrisweber


Attack Vectors: Non-Unicode homograph attacks

Classic long URLs

httploginfacebookintvitationvideomessageid-
h048892r39sessionnfbidcomhomehtmdisbursements

Attack Vectors: Single-script and The Confusables

www.ɑpple.com
All Latin, using Latin small letter alpha 'ɑ'

www.faϲebook.com
Mixed Latin/Greek, with lunate sigma symbol 'ϲ'

www.аЬс.com
All Cyrillic 'abc'
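A rough mixed-script check for the labels above can be sketched in Python. The standard library does not expose the Unicode Script property, so this sketch approximates the script from the first word of each character's Unicode name; that heuristic and the function name `scripts_in_label` are assumptions for illustration only.

```python
import unicodedata


def scripts_in_label(label: str) -> set:
    """Approximate the scripts used in a label from each letter's Unicode name."""
    scripts = set()
    for ch in label:
        if ch.isalpha():
            scripts.add(unicodedata.name(ch, "UNKNOWN").split(" ")[0])
    return scripts


# apple, apple with U+0251, facebook with U+03F2, and an all-Cyrillic 'abc' lookalike.
for label in ["apple", "\u0251pple", "fa\u03f2ebook", "\u0430\u042c\u0441"]:
    print(ascii(label), sorted(scripts_in_label(label)))
```

Only the mixed Latin/Greek label reports more than one script. The all-Latin and all-Cyrillic spoofs pass a mixed-script test, which is why single-script confusables need a separate lookalike comparison.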

Attack Vectors: IDN homograph attacks

Browsers whitelist .ORG

Others don't necessarily, but…

www.mozilla.org is not www.mozílla.org

Latin i, U+0069
Latin í, U+00ED

Attack Vectors: IDN Syntax Spoofing with lookalikes

(This case doesn't work anymore)

http://www.google.com／path／file.nottrusted.org
FULLWIDTH SOLIDUS, U+FF0F

(Normalized to a U+002F)

http://www.google.com/path/file.nottrusted.org
SOLIDUS, U+002F

(However, punctuation not required…)

http://www.google.comノpathノfile.nottrusted.org
Katakana No, U+FF89
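The two cases above behave differently under normalization, which a small Python sketch can show. It assumes NFKC as the normalization step (Nameprep is built on it); the set `URL_SYNTAX` and the function name `syntax_lookalikes` are hypothetical.

```python
import unicodedata

# Characters that carry structural meaning in a URL.
URL_SYNTAX = set("/?#@:\\.")


def syntax_lookalikes(host: str) -> list:
    """Flag non-ASCII characters whose NFKC form contains URL syntax characters."""
    flagged = []
    for ch in host:
        if ord(ch) < 0x80:
            continue
        normalized = unicodedata.normalize("NFKC", ch)
        if any(c in URL_SYNTAX for c in normalized):
            flagged.append(("U+%04X" % ord(ch), normalized))
    return flagged


# FULLWIDTH SOLIDUS normalizes to an ASCII slash and is flagged.
print(syntax_lookalikes("www.google.com\uff0fpath\uff0ffile.nottrusted.org"))

# HALFWIDTH KATAKANA LETTER NO normalizes to another katakana, so it is not.
print(syntax_lookalikes("www.google.com\uff89path\uff89file.nottrusted.org"))
```

The second call returns an empty list: a letter that merely looks like a slash survives normalization untouched, so this kind of check has to be paired with a visual-confusable review.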

Attack Vectors: The Invisibles

Attack Vectors: Visual Spoofing with Bidi Explicit Directional Overrides

Root Causes: Best-fit mappings

Commonly occur in charset transformations and even innocuous APIs

Impact: Filter evasion, Enable code execution

When σ becomes s
U+03C3 GREEK SMALL LETTER SIGMA

When ′ becomes '
U+2032 PRIME

Root Causes: Guidance for Best-Fit mappings

• Scrutinize character/charset manipulation APIs
• Use EncoderFallback with System.Text.Encoding
• Set WC_NO_BEST_FIT_CHARS flag with WideCharToMultiByte()
• Use Unicode end-to-end
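The guidance above names .NET and Win32 APIs. As a rough analogue in Python, whose built-in codecs do not perform best-fit substitution, the same "fail instead of approximating" behavior comes from encoding with `errors="strict"`. The wrapper name `to_legacy_bytes` and the choice of `cp1252` are assumptions for illustration.

```python
def to_legacy_bytes(text: str, encoding: str = "cp1252") -> bytes:
    """Encode to a legacy charset, failing instead of substituting characters."""
    return text.encode(encoding, errors="strict")


# U+FF4D FULLWIDTH LATIN SMALL LETTER M has no cp1252 representation.
try:
    to_legacy_bytes("-\uff4doz-binding()")
except UnicodeEncodeError as exc:
    print("rejected:", exc.reason)

# Plain ASCII input passes through unchanged.
print(to_legacy_bytes("-moz-binding()"))
```

Rejecting the string outright keeps a filter's view of the data identical to what later components receive.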

Case Study: Social Networking - Best-fit mappings

• A popular social networking site in 2008
• Implemented complex filtering logic to prevent XSS
  - Attack: Filter evasion, code execution
  - Exploit: Bypass filtering logic with best-fit mappings to leverage cross-site scripting
  - Root Cause: best-fit mappings

-moz-binding()

was not allowed, but…

-[U+ff4d]oz-binding()

would best-fit map

Root Causes: Normalization

Normalizing strings after validation is dangerous

Impact: Filter evasion, Enable code execution

İ becomes I + combining dot above
U+0130 → U+0049 U+0307

But are there dangerous characters?

You bet… with NFKC and NFKD you could control HTML or other parsing

﹤ becomes <
U+FE64 → U+003C

toNFKC("﹤script>") = "<script>"

Root Causes: Guidance for Normalization

Normalize strings before validation

NFKC: first defense against visual spoofing
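The ordering rule can be sketched with Python's standard `unicodedata` module. The markup check here is deliberately simplistic and the function name `is_safe_after_nfkc` is hypothetical; the point is only that validation runs on the normalized string.

```python
import unicodedata


def is_safe_after_nfkc(value: str) -> bool:
    """Normalize to NFKC first, then run a (deliberately simple) markup check."""
    normalized = unicodedata.normalize("NFKC", value)
    return "<" not in normalized and ">" not in normalized


# U+FE64 SMALL LESS-THAN SIGN and U+FE65 SMALL GREATER-THAN SIGN
payload = "\ufe64script\ufe65"

print("<" in payload)               # the raw string contains no ASCII '<'
print(is_safe_after_nfkc(payload))  # the normalized string does, so it is rejected
```

A filter that inspected `payload` before normalization would pass it, and a later NFKC step would then hand `<script>` to the parser.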

Root Causes: Non-shortest form UTF-8

Non-shortest or overlong UTF-8

Impact: Filter evasion, Enable code execution

Application gets C0 A7
OS/Framework sees 27
Database gets '

Root Causes: Guidance for Non-shortest form UTF-8

• Unicode specification forbids
  - Generation of non-shortest form
  - Interpretation of non-shortest form for BMP
• Validate UTF-8 encoding (throw on error)
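To make the C0 A7 example concrete, the sketch below contrasts a decoder that skips the shortest-form check with Python's strict UTF-8 decoder. The function `naive_decode_2byte` is a hypothetical, intentionally unsafe illustration, not code from any real library.

```python
def naive_decode_2byte(data: bytes) -> str:
    """Decode a 2-byte UTF-8 sequence WITHOUT the shortest-form check (unsafe)."""
    lead, trail = data
    return chr(((lead & 0x1F) << 6) | (trail & 0x3F))


overlong = b"\xc0\xa7"

# The unsafe decoder turns the overlong sequence into an apostrophe (U+0027).
print(repr(naive_decode_2byte(overlong)))

# Strict decoding, the default in Python, rejects the sequence instead.
try:
    overlong.decode("utf-8")
except UnicodeDecodeError as exc:
    print("rejected:", exc.reason)
```

Validating the encoding once at the boundary, and throwing on error, means no later layer ever sees the smuggled apostrophe.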

Attack Vectors: Directory traversal

How many ways can you say

Root Causes: Handling the Unexpected

• Unassigned code points
  - U+2073
• Illegal code points
  - Half a surrogate pair
• Code points with special meaning
  - U+FEFF is the BOM
• Impact: Filter evasion, Enable code execution

Root Causes: Handling the Unexpected: Over-consumption

Over-consuming ill-formed byte sequences

Big problem with MBCS lead bytes

<41 C2 3E 41> becomes <41 41>

<img src=[0xC2]> onerror=alert(1)<br />

becomes

<img src=> onerror=alert(1)<br />
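The byte sequence from the slide can be run through Python's UTF-8 decoder to see what a non-over-consuming decoder does with it. This is a sketch using only built-in `bytes.decode`; the function names `decode_strict` and `decode_replacing` are hypothetical wrappers.

```python
# 'A', a stray lead byte, '>', 'A'
data = bytes([0x41, 0xC2, 0x3E, 0x41])


def decode_strict(raw: bytes) -> str:
    """Fail on any ill-formed sequence (raises UnicodeDecodeError)."""
    return raw.decode("utf-8")


def decode_replacing(raw: bytes) -> str:
    """Replace only the ill-formed byte with U+FFFD and keep what follows."""
    return raw.decode("utf-8", errors="replace")


print(ascii(decode_replacing(data)))

try:
    decode_strict(data)
except UnicodeDecodeError as exc:
    print("rejected at byte offset", exc.start)
```

The 0x3E byte ('>') survives in the replacing decoder's output; a decoder that swallowed it along with the bad lead byte would change the structure of the surrounding markup.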

Root Causes: Handling the Unexpected: Character-substitution

Correcting insecurely rather than failing
  - Substituting a '' or a '' would be bad

Root Causes: Handling the Unexpected: Character-deletion

"deletion of noncharacters" (UTR-36)

<scr[U+FEFF]ipt> becomes <script>

Root Causes: Solutions for Handling the Unexpected

• Fail or error
• Use U+FFFD instead
  - A common alternative is '' which can be safe
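The difference between deleting and replacing an unexpected character can be sketched as follows. The filter `contains_script_tag` is a deliberately naive stand-in for real validation, and all function names here are hypothetical.

```python
BOM = "\ufeff"          # U+FEFF, also ZERO WIDTH NO-BREAK SPACE
REPLACEMENT = "\ufffd"  # U+FFFD REPLACEMENT CHARACTER


def contains_script_tag(text: str) -> bool:
    """Naive filter: look for a literal <script> tag."""
    return "<script>" in text.lower()


def strip_bom_unsafe(text: str) -> str:
    """Delete U+FEFF, silently joining the characters around it."""
    return text.replace(BOM, "")


def replace_bom(text: str) -> str:
    """Substitute U+FFFD so the surrounding characters stay separated."""
    return text.replace(BOM, REPLACEMENT)


payload = "<scr\ufeffipt>"
print(contains_script_tag(payload))                    # filter sees nothing
print(contains_script_tag(strip_bom_unsafe(payload)))  # deletion creates the tag
print(contains_script_tag(replace_bom(payload)))       # replacement does not
```

Deletion after validation reassembles the blocked token, while U+FFFD keeps the two halves apart.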

Attack Vectors: Filter evasion

• Bypass filters, WAFs, NIDS, and validation
• Exploit delivery techniques
  - E.g. Cross-site scripting (buffer overflow of the Web)

Case Study: Apple and Mozilla

Safari and Firefox BOM consumption
  - Attack: Filter evasion, code execution
  - Exploit: Bypass filtering logic with specially crafted strings to leverage cross-site scripting
  - Root Cause: Character deletion

<a href="java[U+FEFF]script:alert('XSS')">

Can be nastier

<a h[U+FEFF]ref="java[U+FEFF]script:al[U+FEFF]ert('XSS')">

A Closer Look: The BOM

BOM, U+FEFF

Root Causes: Casing

• Attackers manipulate casing operations to inject otherwise prohibited characters
• Casing can multiply the buffer sizes needed
• Impact: Filter evasion, Enable code execution



Page 33: Unraveling Unicode: A Bag of Tricks for Bug Hunting...Unicode Crash Course code points encodings categorization normalization binary properties case mapping conversion tables bi-directional

Attack Vectors: IDN homograph attacks

Browsers whitelist .ORG
Others don’t necessarily, but…

www.mozilla.org is not www.mozílla.org
Latin i (U+0069) versus Latin í (U+00ED)

Attack Vectors: IDN Syntax Spoofing with lookalikes

http://www.google.com／path／file.nottrusted.org
FULLWIDTH SOLIDUS U+FF0F
(This case doesn’t work anymore)

http://www.google.com/path/file.nottrusted.org
SOLIDUS U+002F
(Normalized to a U+002F)

http://www.google.comノpathノfile.nottrusted.org
Katakana No U+FF89
(However, punctuation is not required…)
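The lookalike slides above can be illustrated with a small check. This is only a sketch: the helper name `syntax_lookalikes` and the set of "syntax" characters are assumptions made for illustration, not something from the talk. It flags non-ASCII characters in a host string whose NFKC form turns into URL punctuation, and it shows why that alone is not enough: the Katakana character has no such normalization, so it slips through.

```python
import unicodedata

# Characters that carry meaning in URL syntax (an illustrative choice).
URL_SYNTAX = set("/?#:@.\\")


def syntax_lookalikes(host: str) -> list[tuple[int, str]]:
    """Return (index, 'U+XXXX') for non-ASCII characters whose NFKC form is URL syntax."""
    hits = []
    for i, ch in enumerate(host):
        if ord(ch) < 0x80:
            continue  # plain ASCII is handled by the normal URL parser
        folded = unicodedata.normalize("NFKC", ch)
        if any(c in URL_SYNTAX for c in folded):
            hits.append((i, f"U+{ord(ch):04X}"))
    return hits


fullwidth = "www.google.com\uff0fpath\uff0ffile.nottrusted.org"
katakana = "www.google.com\uff89path\uff89file.nottrusted.org"

print(syntax_lookalikes(fullwidth))  # the two FULLWIDTH SOLIDUS characters
print(syntax_lookalikes(katakana))   # nothing: U+FF89 does not normalize to '/'
```

A normalization-based check catches the fullwidth case only; purely visual lookalikes need a separate confusables-style comparison.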

Attack Vectors: The Invisibles

Attack Vectors: Visual Spoofing with Bidi Explicit Directional Overrides

Root Causes: Best-fit mappings

Commonly occur in charset transformations and even innocuous APIs
Impact: Filter evasion, Enable code execution

When σ becomes s
U+03C3 GREEK SMALL LETTER SIGMA

When ′ becomes '
U+2032 PRIME

Root Causes: Guidance for Best-Fit mappings

• Scrutinize character/charset manipulation APIs
• Use EncoderFallback with System.Text.Encoding
• Set WC_NO_BEST_FIT_CHARS flag with WideCharToMultiByte()
• Use Unicode end-to-end
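The guidance above names .NET and Win32 switches. As a rough analogue in Python (a sketch, with hypothetical helper names), the same idea is to make a legacy-charset conversion fail loudly instead of substituting a "close enough" character. Python's built-in codecs do not perform best-fit mapping, so strict mode simply raises for the two sample characters from the slide.

```python
from typing import Optional


def encode_strict(text: str, charset: str) -> bytes:
    """Encode to a legacy charset, raising instead of substituting anything."""
    return text.encode(charset, errors="strict")


def try_encode(text: str, charset: str) -> Optional[bytes]:
    """Return the encoded bytes, or None when a character has no exact mapping."""
    try:
        return encode_strict(text, charset)
    except UnicodeEncodeError:
        return None


# U+03C3 (sigma) and U+2032 (prime) have no exact windows-1252 equivalent,
# so the conversion is refused rather than turned into 's' or an apostrophe.
for sample in ("\u03c3", "\u2032", "abc"):
    print(repr(sample), try_encode(sample, "cp1252"))
```

Refusing the conversion keeps a filter that ran on the Unicode string in agreement with what the downstream consumer actually receives.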

Case Study: Social Networking, Best-fit mappings

• A popular social networking site in 2008
• Implemented complex filtering logic to prevent XSS
  – Attack: Filter evasion, code execution
  – Exploit: Bypass filtering logic with best-fit mappings to leverage cross-site scripting
  – Root Cause: best-fit mappings

-moz-binding()
was not allowed, but…
-[U+ff4d]oz-binding()
would best-fit map

Root Causes: Normalization

Normalizing strings after validation is dangerous
Impact: Filter evasion, Enable code execution

İ becomes I + combining dot above
U+0130 becomes U+0049 U+0307

But are there dangerous characters?
You bet… with NFKC and NFKD you could control HTML or other parsing

﹤ becomes <
U+FE64 becomes U+003C

toNFKC(“﹤script>”) = “<script>”

Root Causes: Guidance for Normalization

Normalize strings before validation
NFKC: first defense against visual spoofing
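The ordering rule above (normalize first, then validate) can be sketched with Python's standard `unicodedata` module. The function names and the tiny forbidden-character list are assumptions for illustration only; a real filter would be far more thorough.

```python
import unicodedata

FORBIDDEN = ("<", ">")


def naive_is_safe(text: str) -> bool:
    """Validates the raw input; normalization happens later, somewhere else."""
    return not any(ch in text for ch in FORBIDDEN)


def is_safe(text: str) -> bool:
    """Normalizes to NFKC first, then validates the string that will be used."""
    normalized = unicodedata.normalize("NFKC", text)
    return not any(ch in normalized for ch in FORBIDDEN)


# U+FE64 / U+FE65 are SMALL LESS-THAN SIGN / SMALL GREATER-THAN SIGN.
payload = "\ufe64script\ufe65"

print(naive_is_safe(payload))                  # True: the filter sees no '<'
print(unicodedata.normalize("NFKC", payload))  # <script>
print(is_safe(payload))                        # False: caught after NFKC
```

The same module shows the canonical case from the slide: `unicodedata.normalize("NFD", "\u0130")` yields U+0049 followed by U+0307.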

Root Causes: Non-shortest form UTF-8

Non-shortest or overlong UTF-8
Impact: Filter evasion, Enable code execution

Application gets C0 A7
OS/Framework sees 27
Database gets '

Root Causes: Guidance for Non-shortest form UTF-8

• Unicode specification forbids
  – Generation of non-shortest form
  – Interpretation of non-shortest form for BMP
• Validate UTF-8 encoding (throw on error)
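To make the C0 A7 example concrete, here is a short sketch. `naive_decode_two_byte` is a deliberately flawed, hypothetical decoder that applies the two-byte bit arithmetic without checking for the shortest form; `strict_decode` relies on Python's built-in UTF-8 codec, which rejects the overlong sequence.

```python
def naive_decode_two_byte(b1: int, b2: int) -> str:
    """Flawed on purpose: combines the payload bits without a shortest-form check."""
    return chr(((b1 & 0x1F) << 6) | (b2 & 0x3F))


def strict_decode(raw: bytes) -> str:
    """Validate while decoding; ill-formed input raises UnicodeDecodeError."""
    return raw.decode("utf-8", errors="strict")


# The overlong pair C0 A7 carries the same bits as the single byte 0x27.
print(repr(naive_decode_two_byte(0xC0, 0xA7)))  # an apostrophe

try:
    strict_decode(b"\xc0\xa7")
except UnicodeDecodeError as exc:
    print("rejected:", exc.reason)
```

Throwing on error at the boundary means no later layer gets the chance to interpret the bytes differently.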

Attack Vectors: Directory traversal

How many ways can you say

Attack Vectors

Root Causes: Handling the Unexpected

• Unassigned code points
  – U+2073
• Illegal code points
  – Half a surrogate pair
• Code points with special meaning
  – U+FEFF is the BOM
• Impact: Filter evasion, Enable code execution

Root Causes: Handling the Unexpected, Over-consumption

Over-consuming ill-formed byte sequences
Big problem with MBCS lead bytes

<41 C2 3E 41> becomes <41 41>

<img src=[0xC2]> onerror=alert(1)<br >
becomes
<img src=> onerror=alert(1)<br >
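The byte sequence 41 C2 3E 41 from the slide can be replayed in a few lines. `over_consuming_decode` is a hypothetical, intentionally broken decoder written only to reproduce the failure; the comparison uses Python's built-in codec with `errors="replace"`, which replaces the stray lead byte and leaves the following `>` in place.

```python
def over_consuming_decode(raw: bytes) -> str:
    """Flawed on purpose: a 2-byte lead byte always swallows the next byte."""
    out = []
    i = 0
    while i < len(raw):
        b = raw[i]
        if 0xC2 <= b <= 0xDF:
            nxt = raw[i + 1] if i + 1 < len(raw) else None
            if nxt is not None and 0x80 <= nxt <= 0xBF:
                out.append(raw[i:i + 2].decode("utf-8"))
            # Bug being illustrated: an ill-formed pair is dropped entirely,
            # including the innocent byte that followed the lead byte.
            i += 2
        elif b < 0x80:
            out.append(chr(b))
            i += 1
        else:
            out.append("\ufffd")
            i += 1
    return "".join(out)


data = bytes.fromhex("41c23e41")  # 'A', stray lead byte, '>', 'A'

print(over_consuming_decode(data))                    # AA  (the '>' is gone)
print(repr(data.decode("utf-8", errors="replace")))  # 'A\ufffd>A'
```

Losing the `>` is what lets attacker-controlled text after it be parsed as part of the tag.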

Root Causes: Handling the Unexpected, Character-substitution

Correcting insecurely rather than failing
  – Substituting a ‘’ or a ‘’ would be bad

Root Causes: Handling the Unexpected, Character-deletion

“deletion of noncharacters” (UTR-36)

<scr[U+FEFF]ipt> becomes <script>

Root Causes: Solutions for Handling the Unexpected

• Fail or error
• Use U+FFFD instead
  – A common alternative is ‘’ which can be safe
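The difference between deleting an unexpected code point and replacing it with U+FFFD shows up directly in the `<scr[U+FEFF]ipt>` example. The sketch below uses a toy blocklist filter and hypothetical helper names; it is meant to show the ordering problem, not to be a usable XSS filter.

```python
BLOCKLIST = ("<script",)


def filter_passes(text: str) -> bool:
    """Toy validation step: reject input containing a blocklisted token."""
    return not any(token in text.lower() for token in BLOCKLIST)


def delete_bom(text: str) -> str:
    """Unsafe clean-up: silently removes U+FEFF after validation has run."""
    return text.replace("\ufeff", "")


def replace_bom(text: str) -> str:
    """Safer clean-up: substitutes U+FFFD so the surrounding text cannot rejoin."""
    return text.replace("\ufeff", "\ufffd")


payload = "<scr\ufeffipt>"

print(filter_passes(payload))               # True: no '<script' yet
print(delete_bom(payload))                  # <script>
print(filter_passes(delete_bom(payload)))   # False: too late if deletion ran last
print(filter_passes(replace_bom(payload)))  # True, and it stays harmless
```

Failing outright is the stricter option; replacement with U+FFFD is the fallback when the input must be kept.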

Attack Vectors: Filter evasion

• Bypass filters, WAFs, NIDS, and validation
• Exploit delivery techniques
  – E.g. Cross-site scripting (buffer overflow of the Web)

Case Study: Apple and Mozilla

Safari and Firefox BOM consumption
  – Attack: Filter evasion, code execution
  – Exploit: Bypass filtering logic with specially crafted strings to leverage cross-site scripting
  – Root Cause: Character deletion

<a href=“java[U+FEFF]script:alert(„XSS‟)>

Can be nastier:

<a h[U+FEFF]ref=“java[U+FEFF]script:al[U+FEFF]ert(„XSS‟)>

A Closer Look: The BOM

BOM U+FEFF

Root Causes: Casing

• Attackers manipulate casing operations to inject otherwise prohibited characters
• Casing can multiply the buffer sizes needed
• Impact: Filter evasion, Enable code execution

toLower(“İ”) == “i”
toLower(“scrİpt”) == “script”

len(x) != len(toLower(x))

Root Causes: Guidance for Casing

• Perform casing operations before validation
• Leverage existing frameworks and APIs
  – ICU, .Net
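A brief sketch of "case first, validate second", using Python's built-in full case mappings. Note that results differ between frameworks: the slide shows a toLower that yields a plain "i", whereas Python's `str.lower()` turns U+0130 into "i" plus U+0307. Either way the string changes, and so can its length. The function name and token list are illustrative assumptions.

```python
def lower_then_validate(text: str, forbidden: tuple = ("script",)) -> str:
    """Apply the case mapping first, then validate the string that will be used."""
    lowered = text.lower()
    if any(token in lowered for token in forbidden):
        raise ValueError("forbidden token after case mapping")
    return lowered


# Case mapping is not length-preserving.
dotted_i = "\u0130"                                # LATIN CAPITAL LETTER I WITH DOT ABOVE
print(len(dotted_i), len(dotted_i.lower()))        # 1 2
print(len("\u0390"), len("\u0390".upper()))        # 1 3

print(lower_then_validate("Hello"))                # hello
try:
    lower_then_validate("SCRIPT")
except ValueError as exc:
    print("rejected:", exc)
```

Sizing an output buffer from the input length is the mistake the second bullet on the Casing slide warns about.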

Root Causes: Buffer Overflows

• Incorrect assumptions about string sizes (chars vs. bytes)
• Improper width calculations
• Impact: Enable code execution

Casing - maximum expansion factors

Operation    UTF         Factor   Sample
Lower        8           1.5      Ⱥ U+023A
Lower        16, 32      1        A U+0041
Upper        8, 16, 32   3        ΐ U+0390
Source: Unicode Technical Report 36

Normalization - maximum expansion factors

Operation    UTF         Factor   Sample
NFC          8           3X       𝅘𝅥𝅮 U+1D160
NFC          16, 32      3X       שּׁ U+FB2C
NFD          8           3X       ΐ U+0390
NFD          16, 32      4X       ᾂ U+1F82
NFKC/NFKD    8           11X      ﷺ U+FDFA
NFKC/NFKD    16, 32      18X      ﷺ U+FDFA
Source: Unicode Technical Report 36

Root Causes: Guidance for Buffer Overflows

• Know the difference between bytes and chars
• Secure coding
• Leverage existing frameworks and APIs
  – ICU, .Net
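The expansion factors in the tables above can be reproduced with a few lines of Python. This is a sketch: `expansion_factor` is a hypothetical helper that compares encoded sizes before and after a transformation, and `utf-16-le` is used so that no byte order mark is counted.

```python
import unicodedata
from typing import Callable


def expansion_factor(ch: str, transform: Callable[[str], str], encoding: str) -> float:
    """Encoded size after the transformation divided by encoded size before it."""
    before = len(ch.encode(encoding))
    after = len(transform(ch).encode(encoding))
    return after / before


def nfkc(s: str) -> str:
    return unicodedata.normalize("NFKC", s)


def nfd(s: str) -> str:
    return unicodedata.normalize("NFD", s)


print(expansion_factor("\ufdfa", nfkc, "utf-8"))          # NFKC, UTF-8
print(expansion_factor("\ufdfa", nfkc, "utf-16-le"))      # NFKC, UTF-16
print(expansion_factor("\u1f82", nfd, "utf-16-le"))       # NFD, UTF-16
print(expansion_factor("\u0390", str.upper, "utf-8"))     # upper-casing, UTF-8
print(expansion_factor("\u023a", str.lower, "utf-8"))     # lower-casing, UTF-8
```

Buffers sized from the input length need to allow for the worst-case factor of the operation and encoding in use.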

Root Causes: Controlling Syntax

• White space and line breaks
  – E.g. when U+180E acts like U+0020
• Quotation marks
• Impact: Filter evasion, Enable code execution

Attacks and Exploits: Controlling syntax

• Manipulate HTML parsers and JavaScript interpreters
• Control protocols

Case Study: Opera

• Unicode formatter characters exploited for XSS
  – Damage: Filter evasion, controlling syntax
  – Exploit: Bypass filtering logic with specially crafted characters to leverage cross-site scripting
  – Root Cause: Interpreting “white space”
  – A problem with HTML 4.0 spec

<a href=[U+180E]onclick=alert()>

MVS U+180E

Root Causes: Guidance for Controlling Syntax

• Question specifications
• Be careful…
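One cautious reading of "be careful" is to flag any non-ASCII space or format character before markup reaches a parser that might treat it as white space. The sketch below is an assumption-laden illustration: the category set and the helper name are choices made here, and U+180E is matched by category because its General_Category has differed between Unicode versions (space separator in older data, format character in newer data).

```python
import unicodedata

# Space separators, line/paragraph separators, and format characters.
SUSPECT_CATEGORIES = {"Zs", "Zl", "Zp", "Cf"}


def find_suspect_chars(markup: str) -> list[tuple[int, str]]:
    """Return (index, 'U+XXXX') for non-ASCII characters a parser may treat as syntax."""
    return [
        (i, f"U+{ord(ch):04X}")
        for i, ch in enumerate(markup)
        if ord(ch) > 0x7F and unicodedata.category(ch) in SUSPECT_CATEGORIES
    ]


payload = "<a href=\u180eonclick=alert()>"

print(find_suspect_chars(payload))       # the MONGOLIAN VOWEL SEPARATOR at index 8
print(find_suspect_chars("<a href=x>"))  # nothing flagged
```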

Root Causes: Specifications

1) Character stability
  – IDNA/Nameprep based on Unicode 3.2
2) Designs
  – Specs are carefully designed but not always perfect

• This could have been a problem
  – “When designing a markup language or data protocol, the use of U+FEFF can be restricted to that of Byte Order Mark. In that case, any U+FEFF occurring in the middle of the file can be ignored, or treated as an error.”
  – HTML 4.01
    • Defines four whitespace characters and explicitly leaves handling other characters up to implementer

Root Causes: Charset Transformations

• Converting between charsets is dangerous
• Mapping tables and algorithms vary across platforms
• Impact: Filter evasion, Enable code execution, Data-loss

Root Causes: Guidance for Charset Transformations

• Avoid if possible
• Use Unicode as the broker
• Beware the PUA mappings
• Transform case and normalize prior to validation and redisplay

Root Causes: Charset Mismatches

• Some charset identifiers are ill-defined
• Vendor implementations vary
• User-agents may sniff if confused
• Attackers manipulate behavior
• Impact: Filter evasion, Enable code execution

Content-Type: charset=ISO-8859-1

<meta http-equiv="Content-Type" content="text/html; charset=shift_jis">

Attacker-controlled input
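The mismatch above (an HTTP header declaring ISO-8859-1 while a meta element declares shift_jis) can be detected before any decoding happens. The following is a minimal sketch under stated assumptions: the helper names are hypothetical, the regular expressions are a simplification of real charset sniffing, and the policy shown is the deck's own guidance of forcing UTF-8 and erroring when uncertain.

```python
import re


def declared_charsets(content_type_header: str, html_head: bytes) -> set[str]:
    """Collect every charset label declared by the header and by the markup."""
    found = set()
    m = re.search(r"charset=([\w-]+)", content_type_header, re.I)
    if m:
        found.add(m.group(1).lower())
    for m in re.finditer(rb"""charset=["']?([\w-]+)""", html_head, re.I):
        found.add(m.group(1).decode("ascii").lower())
    return found


def decode_or_fail(content_type_header: str, body: bytes) -> str:
    """Accept only an unambiguous UTF-8 declaration, then decode strictly."""
    charsets = declared_charsets(content_type_header, body)
    if charsets != {"utf-8"}:
        raise ValueError(f"refusing ambiguous or non-UTF-8 charsets: {sorted(charsets)}")
    return body.decode("utf-8", errors="strict")


header = "text/html; charset=ISO-8859-1"
page = b'<meta http-equiv="Content-Type" content="text/html; charset=shift_jis">'

print(sorted(declared_charsets(header, page)))
try:
    decode_or_fail(header, page)
except ValueError as exc:
    print(exc)
```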

Page 35: Unraveling Unicode: A Bag of Tricks for Bug Hunting...Unicode Crash Course code points encodings categorization normalization binary properties case mapping conversion tables bi-directional

Black Hat USA - July 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsIDN homograph attacks

wwwmozillaorg is not wwwmoziacutellaorg

Latin U+0069

LatinU+00ED

Black Hat USA - July 2009 copy 2009 Chris Weber

(This case doesnrsquot work anymore)

wwwcasabasecuritycom

Attack VectorsIDN Syntax Spoofing with lookalikes

httpwwwgooglecompathfilenottrustedorg

FULLWIDTH SOLIDUSU+FF0F

Black Hat USA - July 2009 copy 2009 Chris Weber

(Normalized to a U+002F)

wwwcasabasecuritycom

Attack VectorsIDN Syntax Spoofing with lookalikes

httpwwwgooglecompathfilenottrustedorg

SOLIDUSU+002F

Black Hat USA - July 2009 copy 2009 Chris Weber

(However punctuation not requiredhellip)

wwwcasabasecuritycom

Attack VectorsIDN Syntax Spoofing with lookalikes

httpwwwgooglecomノpathノfilenottrustedorg

Katakana NoU+FF89

Black Hat USA - July 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsThe Invisibles

Black Hat USA - July 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsVisual Spoofing with Bidi Explicit Directional Overrides

Black Hat USA - July 2009 copy 2009 Chris Weber

Commonly occur in charset transformations and even innocuous APIrsquos

Impact Filter evasion Enable code execution

When σ becomes s

U+03C3 GREEK SMALL LETTER SIGMA

When prime becomes

U+2032 PRIME

wwwcasabasecuritycom

Root CausesBest-fit mappings

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Scrutinize charactercharset manipulation APIrsquos

bull Use EncoderFallback with SystemTextEncoding

bull Set WC_NO_BEST_FIT_CHARS flag with WideCharToMultiByte()

bull Use Unicode end-to-end

wwwcasabasecuritycom

Root CausesGuidance for Best-Fit mappings

Black Hat USA - July 2009 copy 2009 Chris Weber

bull A popular social networking site in 2008

bull Implemented complex filtering logic to prevent XSS

ndash Attack Filter evasion code execution

ndash Exploit Bypass filtering logic with best-fit mappings to leverage cross-site scripting

ndash Root Cause best-fit mappings

wwwcasabasecuritycom

Case Study Social NetworkingBest-fit mappings

Black Hat USA - July 2009 copy 2009 Chris Weber

-moz-binding()

was not allowed buthellip

-[U+ff4d]oz-binding()

would best-fit map

wwwcasabasecuritycom

Case Study Social NetworkingBest-fit mappings

Black Hat USA - July 2009 copy 2009 Chris Weber

Normalizing strings after validation is dangerous

Impact Filter evasion Enable code execution

wwwcasabasecuritycom

Root CausesNormalization

Black Hat USA - July 2009 copy 2009 Chris Weber

İ becomes I +

wwwcasabasecuritycom

Root CausesNormalization

U+0130 U+0049 U+0307

Black Hat USA - July 2009 copy 2009 Chris Weber

But are there dangerous characters

You bethellip with NFKC and NFKD you could control HTML or other parsing

﹤ becomes lt

wwwcasabasecuritycom

Root CausesNormalization

U+FE64 U+003C

Black Hat USA - July 2009 copy 2009 Chris Weber

﹤ becomes lt

toNFKC(ldquo﹤scriptgtrdquo) = ldquoltscriptgtrdquo

wwwcasabasecuritycom

Root CausesNormalization

U+FE64 U+003C

Black Hat USA - July 2009 copy 2009 Chris Weber

Normalize strings before validation

NFKC first defense against Visual spoofing

wwwcasabasecuritycom

Root CausesGuidance for Normalization

Black Hat USA - July 2009 copy 2009 Chris Weber

Non-shortest or overlong UTF-8

Impact Filter evasion Enable code execution

Application gets C0A7

OSFramework sees 27

Database gets

wwwcasabasecuritycom

Root CausesNon-shortest form UTF-8

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Unicode specification forbids

ndash Generation of non-shortest form

ndash Interpretation of non-shortest form for BMP

bull Validate UTF-8 encoding (throw on error)

wwwcasabasecuritycom

Root CausesGuidance for Non-shortest form UTF-8

Black Hat USA - July 2009 copy 2009 Chris Weber

How many ways can you say

wwwcasabasecuritycom

Attack VectorsDirectory traversal

Black Hat USA - July 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack Vectors

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Unassigned code points

ndash U+2073

bull Illegal code points

ndash Half a surrogate pair

bull Code points with special meaning

ndash U+FEFF is the BOM

bull Impact Filter evasion Enable code execution

wwwcasabasecuritycom

Root CausesHandling the Unexpected

Black Hat USA - July 2009 copy 2009 Chris Weber

Over-consuming ill-formed byte sequences

Big problem with MBCS lead bytes

lt41 C2 3E 41gt becomes

lt41 41gt

wwwcasabasecuritycom

Root CausesHandling the Unexpected Over-consumption

Black Hat USA - July 2009 copy 2009 Chris Weber

ltimg src=[0xC2]gt onerror=alert(1)ltbr gt

becomes

ltimg src=gt onerror=alert(1)ltbr gt

wwwcasabasecuritycom

Root CausesHandling the Unexpected Over-consumption

Black Hat USA - July 2009 copy 2009 Chris Weber

Correcting insecurely rather than failing

ndash Substituting a lsquorsquo or a lsquorsquo would be bad

wwwcasabasecuritycom

Root CausesHandling the Unexpected Character-substitution

Black Hat USA - July 2009 copy 2009 Chris Weber

ldquodeletion of noncharactersrdquo (UTR-36)

wwwcasabasecuritycom

Root CausesHandling the Unexpected Character-deletion

Black Hat USA - July 2009 copy 2009 Chris Weber

ltscr[U+FEFF]iptgt becomes ltscriptgt

wwwcasabasecuritycom

Root CausesHandling the Unexpected Character-deletion

Root Causes: Solutions for Handling the Unexpected

• Fail or error
• Use U+FFFD instead
  – A common alternative is ‘’ which can be safe
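
One way to read the two options, sketched under the assumption that only an interior U+FEFF is treated as unexpected (a leading one is taken to be a legitimate byte order mark). The function names are illustrative.

```python
BOM = "\ufeff"
REPLACEMENT = "\ufffd"


def fail_on_interior_bom(text: str) -> str:
    """Option 1: fail or error when U+FEFF appears anywhere but the start."""
    if BOM in text[1:]:
        raise ValueError("unexpected U+FEFF inside text")
    return text


def replace_interior_bom(text: str) -> str:
    """Option 2: substitute U+FFFD so the surrounding characters never join up."""
    head, tail = text[:1], text[1:]
    return head + tail.replace(BOM, REPLACEMENT)
```

Because U+FFFD stays in the string, `<scr[U+FEFF]ipt>` can no longer collapse into `<script>`.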

Attack Vectors: Filter evasion

• Bypass filters, WAFs, NIDS, and validation
• Exploit delivery techniques
  – E.g. Cross-site scripting (buffer overflow of the Web)

Case Study: Apple and Mozilla

Safari and Firefox BOM consumption
  – Attack: Filter evasion, code execution
  – Exploit: Bypass filtering logic with specially crafted strings to leverage cross-site scripting
  – Root Cause: Character deletion

<a href=“java[U+FEFF]script:alert(„XSS‟)>

Can be nastier

<a h[U+FEFF]ref=“java[U+FEFF]script:al[U+FEFF]ert(„XSS‟)>

A Closer Look: The BOM

BOM U+FEFF

Root Causes: Casing

• Attackers manipulate casing operations to inject otherwise prohibited characters
• Casing can multiply the buffer sizes needed
• Impact: Filter evasion, Enable code execution

toLower(“İ”) == “i”

toLower(“scrİpt”) == “script”
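
The same ordering problem can be sketched in Python. Python lowercases İ (U+0130) to "i" plus U+0307 rather than a bare "i", so this sketch swaps in a different character, U+212A KELVIN SIGN, which lowercases to ASCII "k". The blocked keyword and both function names are assumptions for illustration.

```python
BLOCKED = "onclick"


def validate_then_lower(value: str) -> str:
    """Risky order: the check runs before the casing operation changes the text."""
    if BLOCKED in value:
        raise ValueError("blocked keyword")
    return value.lower()


def lower_then_validate(value: str) -> str:
    """Safer order: apply the casing operation first, then validate the result."""
    lowered = value.lower()
    if BLOCKED in lowered:
        raise ValueError("blocked keyword")
    return lowered
```

The input `onclic[U+212A]=alert(1)` slips through the first function and comes out as `onclick=alert(1)`; the second function rejects it.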

Root Causes: Casing

len(x) != len(toLower(x))
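
A quick way to see the length change, using Python's built-in full case mappings. The helper name is an assumption, and the lengths are counted in code points, not bytes.

```python
def casing_lengths(text: str) -> dict:
    """Return code point counts before and after lower/upper casing."""
    return {
        "original": len(text),
        "lower": len(text.lower()),
        "upper": len(text.upper()),
    }
```

For example, İ (U+0130) grows from one code point to two when lowercased, and ΐ (U+0390) grows from one to three when uppercased.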

Root Causes: Guidance for Casing

• Perform casing operations before validation
• Leverage existing frameworks and APIs
  – ICU, .NET

Root Causes: Buffer Overflows

• Incorrect assumptions about string sizes (chars vs. bytes)
• Improper width calculations
• Impact: Enable code execution

Root Causes: Buffer Overflows

Casing - maximum expansion factors

Operation  UTF         Factor  Sample
Lower      8           1.5     Ⱥ U+023A
           16, 32      1       A U+0041
Upper      8, 16, 32   3       ΐ U+0390

Source: Unicode Technical Report 36
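
The factors in the table can be checked with a short sketch that compares encoded sizes before and after a casing operation. The function name is an assumption; the `-le` codec variants are used so that no byte order mark is counted.

```python
def casing_expansion(ch: str, operation, encoding: str) -> float:
    """Ratio of encoded size after the casing operation to encoded size before it."""
    before = len(ch.encode(encoding))
    after = len(operation(ch).encode(encoding))
    return after / before
```

Usage: `casing_expansion("\u023a", str.lower, "utf-8")` compares the 2-byte Ⱥ with its 3-byte lowercase form.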

Root Causes: Buffer Overflows

Normalization - maximum expansion factors

Operation   UTF      Factor  Sample
NFC         8        3X      𝅘𝅥𝅮 U+1D160
            16, 32   3X      שּׁ U+FB2C
NFD         8        3X      ΐ U+0390
            16, 32   4X      ᾂ U+1F82
NFKC/NFKD   8        11X     ﷺ U+FDFA
            16, 32   18X     ﷺ U+FDFA

Source: Unicode Technical Report 36
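
A sketch for checking these normalization factors with Python's `unicodedata.normalize`. The function name is an assumption, and the results depend on the Unicode data shipped with the interpreter.

```python
import unicodedata


def normalization_expansion(ch: str, form: str, encoding: str) -> float:
    """Ratio of encoded size after normalizing to `form` to encoded size before."""
    before = len(ch.encode(encoding))
    after = len(unicodedata.normalize(form, ch).encode(encoding))
    return after / before
```

The worst case in the table, U+FDFA, is a single code point that expands to 18 code points under NFKC or NFKD, so a buffer sized from the input length would be far too small.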

Root Causes: Guidance for Buffer Overflows

• Know the difference between bytes and chars
• Secure coding
• Leverage existing frameworks and APIs
  – ICU, .NET

Root Causes: Controlling Syntax

• White space and line breaks
  – E.g. when U+180E acts like U+0020
• Quotation marks
• Impact: Filter evasion, Enable code execution

Attacks and Exploits: Controlling syntax

• Manipulate HTML parsers and JavaScript interpreters
• Control protocols

Case Study: Opera

• Unicode formatter characters exploited for XSS
  – Damage: Filter evasion, controlling syntax
  – Exploit: Bypass filtering logic with specially crafted characters to leverage cross-site scripting
  – Root Cause: Interpreting “white space”
  – A problem with HTML 4.0 spec

<a href=[U+180E]onclick=alert()>
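
A hedged sketch of a check a tester might run for this class of bug: flag any separator-like or format character that is not one of the white space characters HTML 4.01 names (space, tab, form feed, zero-width space) or a line break. The function name and the choice of general categories are assumptions, not how Opera or any filter actually works.

```python
import unicodedata

HTML401_WHITE_SPACE = {"\u0020", "\u0009", "\u000c", "\u200b"}
LINE_BREAKS = {"\r", "\n"}
# U+180E has been classed as Zs or Cf depending on the Unicode version,
# so both separator and format categories are flagged.
SUSPECT_CATEGORIES = {"Zs", "Zl", "Zp", "Cf"}


def unexpected_separators(markup: str) -> list:
    """List code points a parser might treat as white space outside the expected set."""
    flagged = []
    for ch in markup:
        if ch in HTML401_WHITE_SPACE or ch in LINE_BREAKS:
            continue
        if unicodedata.category(ch) in SUSPECT_CATEGORIES:
            flagged.append("U+%04X" % ord(ch))
    return flagged
```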

Case Study: Opera

MVS (Mongolian Vowel Separator) U+180E

Root Causes: Guidance for Controlling Syntax

• Question specifications
• Be careful…

Root Causes: Specifications

1) Character stability
  – IDNA/Nameprep based on Unicode 3.2
2) Designs
  – Specs are carefully designed but not always perfect
• This could have been a problem
  – “When designing a markup language or data protocol, the use of U+FEFF can be restricted to that of Byte Order Mark. In that case, any U+FEFF occurring in the middle of the file can be ignored, or treated as an error.”
  – HTML 4.01
    • Defines four whitespace characters and explicitly leaves handling other characters up to implementer

Root Causes: Charset Transformations

• Converting between charsets is dangerous
• Mapping tables and algorithms vary across platforms
• Impact: Filter evasion, Enable code execution, Data-loss

Root Causes: Guidance for Charset Transformations

• Avoid if possible
• Use Unicode as the broker
• Beware the PUA mappings
• Transform, case, and normalize prior to validation and redisplay

Root Causes: Charset Mismatches

• Some charset identifiers are ill-defined
• Vendor implementations vary
• User-agents may sniff if confused
• Attackers manipulate behavior
• Impact: Filter evasion, Enable code execution

Content-Type: charset=ISO-8859-1

<meta http-equiv="Content-Type" content="text/html; charset=shift_jis">

Attacker-controlled input

Root Causes: Guidance for Charset Mismatches

• Force UTF-8
• Error if uncertain
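
"Force UTF-8, error if uncertain" might look like the following on the receiving side. The function name and the idea of passing in a declared charset string are assumptions; no sniffing or fallback is attempted.

```python
import codecs
from typing import Optional


def require_utf8(declared_charset: Optional[str], body: bytes) -> str:
    """Accept only bodies that are declared as UTF-8 and decode strictly as UTF-8."""
    if declared_charset is None:
        raise ValueError("no charset declared")
    try:
        canonical = codecs.lookup(declared_charset).name
    except LookupError:
        raise ValueError("unknown charset: %r" % declared_charset)
    if canonical != "utf-8":
        raise ValueError("charset must be UTF-8, got %r" % declared_charset)
    # UnicodeDecodeError (a ValueError subclass) propagates on ill-formed input.
    return body.decode("utf-8", errors="strict")
```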

Unicode Transformations: Agenda

• Unicode crash course
• Root Causes
• Attack Vectors
• Tools

Tools

• Watcher
  – Passive Web-app security testing and auditing
• Unibomber
  – XSS autopwn testing tool

Tools: Watcher - Some of the Passive Checks Included

• Unicode transformation hot-spots
• User-controlled HTML
• Cross-domain issues
• Insecure cookies
• Insecure HTTP/HTTPS transitions
• SSL protocol and certificate issues
• XSS hot-spots
• Flash issues
• Silverlight issues
• Information disclosure

Tools

Tools: Watcher - Web-app Security Testing and Auditing

http://websecuritytool.codeplex.com

Tools: Unibomber - runtime XSS testing tool

• Deterministic testing
• Auto-inject payloads
• Unicode transformers
  – < > ‘ “ etc.
• Detect transformations and encoding hotspots

Thank you

Casaba Security
www.casabasecurity.com

Chris Weber
Blog: www.lookout.net
Email: chris@casabasecurity.com
LinkedIn: http://www.linkedin.com/in/chrisweber

Attack Vectors: IDN Syntax Spoofing with lookalikes

FULLWIDTH SOLIDUS U+FF0F (This case doesn’t work anymore)

http://www.google.com／path／file.nottrusted.org

SOLIDUS U+002F (Normalized to a U+002F)

http://www.google.com/path/file.nottrusted.org

Katakana No U+FF89 (However punctuation not required…)

http://www.google.comノpathノfile.nottrusted.org
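
A sketch of how a reviewer might surface lookalikes in a host name: list every non-ASCII character together with its Unicode name and its NFKC form. The function name is an assumption, and NFKC folding is only one of several signals a real IDN check would use.

```python
import unicodedata


def lookalike_report(hostname: str) -> list:
    """Return (code point, name, NFKC form) for each non-ASCII character in hostname."""
    report = []
    for ch in hostname:
        if ord(ch) < 128:
            continue
        report.append((
            "U+%04X" % ord(ch),
            unicodedata.name(ch, "<unnamed>"),
            unicodedata.normalize("NFKC", ch),
        ))
    return report
```

The fullwidth solidus folds to a plain `/` under NFKC, while the halfwidth katakana No folds to another katakana letter, so it needs a visual or script-based check instead.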

Attack Vectors: The Invisibles

Attack Vectors: Visual Spoofing with Bidi Explicit Directional Overrides

Root Causes: Best-fit mappings

Commonly occur in charset transformations and even innocuous APIs

Impact: Filter evasion, Enable code execution

When σ becomes s
U+03C3 GREEK SMALL LETTER SIGMA

When ′ becomes '
U+2032 PRIME

Root Causes: Guidance for Best-Fit mappings

• Scrutinize character/charset manipulation APIs
• Use EncoderFallback with System.Text.Encoding
• Set WC_NO_BEST_FIT_CHARS flag with WideCharToMultiByte()
• Use Unicode end-to-end
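
The bullets above name .NET and Win32 settings. As a loose analogue only, and not the same API, Python's codecs raise an error when encoding with `errors="strict"` instead of substituting a lookalike. The function name and the default `cp1252` target are assumptions.

```python
def encode_without_best_fit(text: str, charset: str = "cp1252") -> bytes:
    """Encode to a legacy charset, raising instead of substituting a lookalike."""
    # errors="strict" raises UnicodeEncodeError for any unmappable character.
    return text.encode(charset, errors="strict")
```

Both σ (U+03C3) and ′ (U+2032) cause an error here, which is the "fail rather than guess" behavior the guidance asks for.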

Case Study: Social Networking - Best-fit mappings

• A popular social networking site in 2008
• Implemented complex filtering logic to prevent XSS
  – Attack: Filter evasion, code execution
  – Exploit: Bypass filtering logic with best-fit mappings to leverage cross-site scripting
  – Root Cause: best-fit mappings

-moz-binding()

was not allowed, but…

-[U+FF4D]oz-binding()

would best-fit map
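
The site's real filter and its best-fit table are not given, so this sketch uses NFKC normalization as a stand-in for the charset transformation; it happens to fold U+FF4D to "m" the same way. The function names and the single blocked token are assumptions.

```python
import unicodedata

BLOCKED = "-moz-binding"


def filter_then_transform(css: str) -> str:
    """Vulnerable order: the filter sees the text before the transformation."""
    if BLOCKED in css:
        raise ValueError("blocked CSS property")
    return unicodedata.normalize("NFKC", css)  # stand-in for a best-fit conversion


def transform_then_filter(css: str) -> str:
    """Safer order: filter the text in the form the consumer will actually see."""
    folded = unicodedata.normalize("NFKC", css)
    if BLOCKED in folded:
        raise ValueError("blocked CSS property")
    return folded
```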

Root Causes: Normalization

Normalizing strings after validation is dangerous

Impact: Filter evasion, Enable code execution

İ becomes I + ̇
U+0130 becomes U+0049 U+0307

But are there dangerous characters?

You bet… with NFKC and NFKD you could control HTML or other parsing

﹤ becomes <
U+FE64 becomes U+003C

toNFKC(“﹤script>”) = “<script>”
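
To see which normalization forms change a given character, a small sketch with `unicodedata.normalize` can list the resulting code points per form. The function name is an assumption.

```python
import unicodedata

FORMS = ("NFC", "NFD", "NFKC", "NFKD")


def normalization_forms(ch: str) -> dict:
    """Map each normalization form to the list of code points it produces for ch."""
    return {
        form: ["U+%04X" % ord(c) for c in unicodedata.normalize(form, ch)]
        for form in FORMS
    }
```

For U+FE64 only the compatibility forms (NFKC, NFKD) produce U+003C; the canonical forms leave it untouched.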

Root Causes: Guidance for Normalization

Normalize strings before validation

NFKC is a first defense against visual spoofing

Root Causes: Non-shortest form UTF-8

Non-shortest or overlong UTF-8

Impact: Filter evasion, Enable code execution

Application gets C0A7

OS/Framework sees 27

Database gets '
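
The arithmetic behind C0 A7 turning into 27 can be sketched with a decoder that applies the 2-byte UTF-8 bit layout without a shortest-form check. The function name is an assumption, and the flaw is deliberate.

```python
def lenient_two_byte_decode(lead: int, trail: int) -> int:
    """Apply the 2-byte layout 110xxxxx 10xxxxxx with no shortest-form check."""
    # Flaw (on purpose): lead bytes C0 and C1 should be rejected outright,
    # because every value they can encode fits in a single byte.
    return ((lead & 0x1F) << 6) | (trail & 0x3F)
```

`lenient_two_byte_decode(0xC0, 0xA7)` yields 0x27, the apostrophe, which is how a quote can pass a filter that only looks for the byte 27.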

Page 37: Unraveling Unicode: A Bag of Tricks for Bug Hunting...Unicode Crash Course code points encodings categorization normalization binary properties case mapping conversion tables bi-directional

Black Hat USA - July 2009 copy 2009 Chris Weber

(Normalized to a U+002F)

wwwcasabasecuritycom

Attack VectorsIDN Syntax Spoofing with lookalikes

httpwwwgooglecompathfilenottrustedorg

SOLIDUSU+002F

Black Hat USA - July 2009 copy 2009 Chris Weber

(However punctuation not requiredhellip)

wwwcasabasecuritycom

Attack VectorsIDN Syntax Spoofing with lookalikes

httpwwwgooglecomノpathノfilenottrustedorg

Katakana NoU+FF89

Black Hat USA - July 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsThe Invisibles

Black Hat USA - July 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsVisual Spoofing with Bidi Explicit Directional Overrides

Black Hat USA - July 2009 copy 2009 Chris Weber

Commonly occur in charset transformations and even innocuous APIrsquos

Impact Filter evasion Enable code execution

When σ becomes s

U+03C3 GREEK SMALL LETTER SIGMA

When prime becomes

U+2032 PRIME

wwwcasabasecuritycom

Root CausesBest-fit mappings

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Scrutinize charactercharset manipulation APIrsquos

bull Use EncoderFallback with SystemTextEncoding

bull Set WC_NO_BEST_FIT_CHARS flag with WideCharToMultiByte()

bull Use Unicode end-to-end

wwwcasabasecuritycom

Root CausesGuidance for Best-Fit mappings

Black Hat USA - July 2009 copy 2009 Chris Weber

bull A popular social networking site in 2008

bull Implemented complex filtering logic to prevent XSS

ndash Attack Filter evasion code execution

ndash Exploit Bypass filtering logic with best-fit mappings to leverage cross-site scripting

ndash Root Cause best-fit mappings

wwwcasabasecuritycom

Case Study Social NetworkingBest-fit mappings

Black Hat USA - July 2009 copy 2009 Chris Weber

-moz-binding()

was not allowed buthellip

-[U+ff4d]oz-binding()

would best-fit map

wwwcasabasecuritycom

Case Study Social NetworkingBest-fit mappings

Black Hat USA - July 2009 copy 2009 Chris Weber

Normalizing strings after validation is dangerous

Impact Filter evasion Enable code execution

wwwcasabasecuritycom

Root CausesNormalization

Black Hat USA - July 2009 copy 2009 Chris Weber

İ becomes I +

wwwcasabasecuritycom

Root CausesNormalization

U+0130 U+0049 U+0307

Black Hat USA - July 2009 copy 2009 Chris Weber

But are there dangerous characters

You bethellip with NFKC and NFKD you could control HTML or other parsing

﹤ becomes lt

wwwcasabasecuritycom

Root CausesNormalization

U+FE64 U+003C

Black Hat USA - July 2009 copy 2009 Chris Weber

﹤ becomes lt

toNFKC(ldquo﹤scriptgtrdquo) = ldquoltscriptgtrdquo

wwwcasabasecuritycom

Root CausesNormalization

U+FE64 U+003C

Black Hat USA - July 2009 copy 2009 Chris Weber

Normalize strings before validation

NFKC first defense against Visual spoofing

wwwcasabasecuritycom

Root CausesGuidance for Normalization

Black Hat USA - July 2009 copy 2009 Chris Weber

Non-shortest or overlong UTF-8

Impact Filter evasion Enable code execution

Application gets C0A7

OSFramework sees 27

Database gets

wwwcasabasecuritycom

Root CausesNon-shortest form UTF-8

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Unicode specification forbids

ndash Generation of non-shortest form

ndash Interpretation of non-shortest form for BMP

bull Validate UTF-8 encoding (throw on error)

wwwcasabasecuritycom

Root CausesGuidance for Non-shortest form UTF-8

Black Hat USA - July 2009 copy 2009 Chris Weber

How many ways can you say

wwwcasabasecuritycom

Attack VectorsDirectory traversal

Black Hat USA - July 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack Vectors

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Unassigned code points

ndash U+2073

bull Illegal code points

ndash Half a surrogate pair

bull Code points with special meaning

ndash U+FEFF is the BOM

bull Impact Filter evasion Enable code execution

wwwcasabasecuritycom

Root CausesHandling the Unexpected

Black Hat USA - July 2009 copy 2009 Chris Weber

Over-consuming ill-formed byte sequences

Big problem with MBCS lead bytes

lt41 C2 3E 41gt becomes

lt41 41gt

wwwcasabasecuritycom

Root CausesHandling the Unexpected Over-consumption

Black Hat USA - July 2009 copy 2009 Chris Weber

ltimg src=[0xC2]gt onerror=alert(1)ltbr gt

becomes

ltimg src=gt onerror=alert(1)ltbr gt

wwwcasabasecuritycom

Root CausesHandling the Unexpected Over-consumption

Black Hat USA - July 2009 copy 2009 Chris Weber

Correcting insecurely rather than failing

ndash Substituting a lsquorsquo or a lsquorsquo would be bad

wwwcasabasecuritycom

Root CausesHandling the Unexpected Character-substitution

Black Hat USA - July 2009 copy 2009 Chris Weber

ldquodeletion of noncharactersrdquo (UTR-36)

wwwcasabasecuritycom

Root CausesHandling the Unexpected Character-deletion

Black Hat USA - July 2009 copy 2009 Chris Weber

ltscr[U+FEFF]iptgt becomes ltscriptgt

wwwcasabasecuritycom

Root CausesHandling the Unexpected Character-deletion

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Fail or error

bull Use U+FFFD instead

ndash A common alternative is lsquorsquo which can be safe

wwwcasabasecuritycom

Root CausesSolutions for Handling the Unexpected

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Bypass filters WAFrsquos NIDS and validation

bull Exploit delivery techniques

ndash Eg Cross-site scripting (buffer overflow of the Web)

wwwcasabasecuritycom

Attack VectorsFilter evasion

Black Hat USA - July 2009 copy 2009 Chris Weber

Safari and Firefox BOM consumptionndash Attack Filter evasion code execution

ndash Exploit Bypass filtering logic with specially crafted strings to leverage cross-site scripting

ndash Root Cause Character deletion

lta href=ldquojava[U+FEFF]scriptalert(bdquoXSS‟)gt

Can be nastier

lta h[U+FEFF]ref=ldquojava[U+FEFF]scriptal[U+FEFF]ert(bdquoXSS‟)gt

wwwcasabasecuritycom

Case Study Apple and Mozilla

Black Hat USA - July 2009 copy 2009 Chris Weberwwwcasabasecuritycom

A Closer Look The BOM

BOMU+FEFF

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Attackers manipulate casing operations to inject otherwise prohibited characters

bull Casing can multiply the buffer sizes needed

bull Impact Filter evasion Enable code execution

wwwcasabasecuritycom

Root CausesCasing

Black Hat USA - July 2009 copy 2009 Chris Weber

toLower(ldquoİrdquo) == ldquoirdquo

toLower(ldquoscrİptrdquo) == ldquoscriptrdquo

wwwcasabasecuritycom

Root CausesCasing

Black Hat USA - July 2009 copy 2009 Chris Weber

len(x) = len(toLower(x))

wwwcasabasecuritycom

Root CausesCasing

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Perform casing operations before validation

bull Leverage existing frameworks and APIrsquos

ndash ICU Net

wwwcasabasecuritycom

Root CausesGuidance for Casing

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Incorrect assumptions about string sizes (chars vs bytes)

bull Improper width calculations

bull Impact Enable code execution

wwwcasabasecuritycom

Root CausesBuffer Overflows

Black Hat USA - July 2009 copy 2009 Chris Weber

Casing - maximum expansion factors

wwwcasabasecuritycom

Root CausesBuffer Overflows

Operation UTF Factor Sample

Lower 8 15 Ⱥ U+023A

16 32 1 A U+0041

Upper 8 16 32 3 ΐ U+0390Source Unicode Technical Report 36

Black Hat USA - July 2009 copy 2009 Chris Weber

Normalization- maximum expansion factors

wwwcasabasecuritycom

Root CausesBuffer Overflows

Operation UTF Factor Sample

NFC8 3X 119136 U+1D160

16 32 3X ש U+FB2C

NFD8 3X ΐ U+0390

16 32 4X ᾂ U+1F82

NFKCNFKD8 11X

ملسو هيلع هللا ىلص U+FDFA16 32 18X

Source Unicode Technical Report 36

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Know the difference between bytes and chars

bull Secure coding

bull Leverage existing frameworks and APIrsquos

ndash ICU Net

wwwcasabasecuritycom

Root CausesGuidance for Buffer Overflows

Black Hat USA - July 2009 copy 2009 Chris Weber

bull White space and line breaks

ndash Eg when U+180E acts like U+0020

bull Quotation marks

bull Impact Filter evasion Enable code execution

wwwcasabasecuritycom

Root CausesControlling Syntax

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Manipulate HTML parsers and javascriptinterpreters

bull Control protocols

wwwcasabasecuritycom

Attacks and ExploitsControlling syntax

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Unicode formatter characters exploited for XSS

ndash Damage Filter evasion controlling syntax

ndash Exploit Bypass filtering logic with specially crafted characters to leverage cross-site scripting

ndash Root Cause Interpreting ldquowhite spacerdquo

ndash A problem with HTML 40 spec

wwwcasabasecuritycom

Case Study Opera

Black Hat USA - July 2009 copy 2009 Chris Weber

lta href=[U+180E]onclick=alert()gt

wwwcasabasecuritycom

Case Study Opera

Black Hat USA - July 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Case Study Opera

MVSU+180E

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Question specifications

bull Be carefulhellip

wwwcasabasecuritycom

Root CausesGuidance for Controlling Syntax

Black Hat USA - July 2009 copy 2009 Chris Weber

1) Character stabilityndash IDNANameprep based on Unicode 32

2) Designsndash Specs are carefully designed but not always perfect

bull This could have been a problemndash ldquoWhen designing a markup language or data protocol the use of

U+FEFF can be restricted to that of Byte Order Mark In that case any U+FEFF occurring in the middle of the file can be ignored or treated as an error rdquo

ndash HTML 401 bull Defines four whitespace characters and explicitly leaves

handling other characters up to implementer

wwwcasabasecuritycom

Root CausesSpecifications

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Converting between charsets is dangerous

bull Mapping tables and algorithms vary across platforms

bull Impact Filter evasion Enable code execution Data-loss

wwwcasabasecuritycom

Root CausesCharset Transformations

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Avoid if possible

bull Use Unicode as the broker

bull Beware the PUA mappings

bull Transform case and normalize prior to validation and redisplay

wwwcasabasecuritycom

Root CausesGuidance for Charset Transformations

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Some charset identifiers are ill-defined

bull Vendor implementations vary

bull User-agents may sniff if confused

bull Attackers manipulate behavior

bull Impact Filter evasion Enable code execution

wwwcasabasecuritycom

Root CausesCharset Mismatches

Black Hat USA - July 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Root CausesCharset Mismatches

Content-Type charset=ISO-8859-1

ltmeta http-equiv=Content-Type content=texthtml charset=shift_jisgt

Attacker-controlled input

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Force UTF-8

bull Error if uncertain

wwwcasabasecuritycom

Root CausesGuidance for Charset Mismatches

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Unicode crash course

bull Root Causes

bull Attack Vectors

bull Tools

wwwcasabasecuritycom

Unicode TransformationsAgenda

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Unicode crash course

bull Root Causes

bull Attack Vectors

bull Tools

wwwcasabasecuritycom

Unicode TransformationsAgenda

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Watcher

ndash Passive Web-app security testing and auditing

bull Unibomber

ndash XSS autopwn testing tool

wwwcasabasecuritycom

Tools

Tools: Watcher – Some of the Passive Checks Included

• Unicode transformation hot-spots
• User-controlled HTML
• Cross-domain issues
• Insecure cookies
• Insecure HTTP/HTTPS transitions
• SSL protocol and certificate issues
• XSS hot-spots
• Flash issues
• Silverlight issues
• Information disclosure

Tools: Watcher - Web-app Security Testing and Auditing

http://websecuritytool.codeplex.com

Tools: Unibomber – runtime XSS testing tool

• Deterministic testing
• Auto-inject payloads
• Unicode transformers
  – < > ‘ “ etc.
• Detect transformations and encoding hotspots

Thank you

Casaba Security
www.casabasecurity.com

Chris Weber
Blog: www.lookout.net
Email: chris@casabasecurity.com
LinkedIn: http://www.linkedin.com/in/chrisweber


Attack Vectors: IDN Syntax Spoofing with lookalikes

(However, punctuation not required…)

http://www.google.comノpathノfile.nottrusted.org

Katakana No: U+FF89
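A short Python sketch shows why the lookalike matters: a URL parser does not treat U+FF89 as a path separator, so the whole string up to the real top-level domain is the host. The helper `has_non_ascii` is a hypothetical name introduced for illustration, and the check it performs is only a first-pass heuristic, not a complete spoofing detector.

```python
from urllib.parse import urlsplit

# The slide's URL, with U+FF89 written as an escape so it cannot be mistaken for "/".
spoofed = "http://www.google.com\uff89path\uff89file.nottrusted.org"

parts = urlsplit(spoofed)
host = parts.hostname  # everything after "//", because no real "/" follows


def has_non_ascii(text: str) -> bool:
    """Flag hosts that contain anything outside ASCII for closer inspection."""
    return any(ord(ch) > 0x7F for ch in text)


suspicious = has_non_ascii(host)
```

Here the parsed path is empty and the host ends in `nottrusted.org`, even though a reader sees what looks like a path on google.com.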

Attack Vectors: The Invisibles

Attack Vectors: Visual Spoofing with Bidi Explicit Directional Overrides

Root Causes: Best-fit mappings

Commonly occur in charset transformations and even innocuous APIs

Impact: filter evasion, enabling code execution

When σ becomes s
U+03C3 GREEK SMALL LETTER SIGMA

When ′ becomes '
U+2032 PRIME

Root Causes: Guidance for Best-fit mappings

• Scrutinize character/charset manipulation APIs
• Use EncoderFallback with System.Text.Encoding
• Set the WC_NO_BEST_FIT_CHARS flag with WideCharToMultiByte()
• Use Unicode end-to-end

Case Study: Social Networking (Best-fit mappings)

• A popular social networking site in 2008
• Implemented complex filtering logic to prevent XSS
  – Attack: filter evasion, code execution
  – Exploit: bypass filtering logic with best-fit mappings to leverage cross-site scripting
  – Root cause: best-fit mappings

Case Study: Social Networking (Best-fit mappings)

-moz-binding()

was not allowed, but…

-[U+FF4D]oz-binding()

would best-fit map
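The bypass can be reproduced in miniature. The sketch below is an assumption-laden model, not the site's actual code: `BEST_FIT` is a hypothetical three-entry table standing in for a real code page's best-fit mappings (its entries are the examples from these slides), and `is_blocked` stands in for the filter.

```python
# Hypothetical best-fit table; a real code page conversion has many more entries.
BEST_FIT = {
    "\uff4d": "m",   # FULLWIDTH LATIN SMALL LETTER M
    "\u03c3": "s",   # GREEK SMALL LETTER SIGMA
    "\u2032": "'",   # PRIME
}


def best_fit_convert(text: str) -> str:
    """Mimic a lossy conversion that swaps unmappable characters for lookalikes."""
    return "".join(BEST_FIT.get(ch, ch) for ch in text)


def is_blocked(css: str) -> bool:
    """Stand-in for the filter: reject the dangerous CSS property by name."""
    return "-moz-binding" in css.lower()


payload = "-\uff4doz-binding()"
passes_filter = not is_blocked(payload)      # the filter sees a fullwidth m
stored = best_fit_convert(payload)           # the later conversion restores ASCII m
blocked_after_conversion = is_blocked(stored)
```

The filter accepts the payload, and the conversion that runs afterwards turns it into exactly the string the filter was written to stop.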

Root Causes: Normalization

Normalizing strings after validation is dangerous

Impact: filter evasion, enabling code execution

İ becomes I + combining dot above
U+0130 → U+0049 U+0307

Root Causes: Normalization

But are there dangerous characters?

You bet… with NFKC and NFKD you could control HTML or other parsing

﹤ becomes <
U+FE64 → U+003C

toNFKC("﹤script>") = "<script>"

Root Causes: Guidance for Normalization

• Normalize strings before validation
• NFKC is a first defense against visual spoofing
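The ordering problem is easy to demonstrate with Python's standard `unicodedata` module. In this sketch `contains_tag_open` is a hypothetical, deliberately simplistic validator; only the order of the two steps is the point.

```python
import unicodedata


def contains_tag_open(text: str) -> bool:
    """Deliberately simple validator: does the text contain an ASCII '<'?"""
    return "<" in text


raw = "\ufe64script>"  # U+FE64 SMALL LESS-THAN SIGN followed by ASCII text

# Wrong order: validate first, normalize afterwards.
accepted_before_normalizing = not contains_tag_open(raw)
after_late_normalization = unicodedata.normalize("NFKC", raw)

# Right order: normalize first, then validate what will actually be used.
normalized = unicodedata.normalize("NFKC", raw)
rejected_after_normalizing = contains_tag_open(normalized)

# Canonical decomposition from the earlier slide: U+0130 -> U+0049 U+0307.
dotted_i_nfd = unicodedata.normalize("NFD", "\u0130")
```

With the wrong order the validator approves a string that later becomes `<script>`; with the right order the same validator rejects it.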

Root Causes: Non-shortest form UTF-8

Non-shortest or overlong UTF-8

Impact: filter evasion, enabling code execution

Application gets C0 A7
OS/Framework sees 27
Database gets '

Root Causes: Guidance for Non-shortest form UTF-8

• Unicode specification forbids
  – Generation of non-shortest form
  – Interpretation of non-shortest form for BMP
• Validate UTF-8 encoding (throw on error)
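A small Python sketch makes the C0 A7 example concrete. `naive_decode_2byte` is a hypothetical, intentionally unsafe decoder written for this illustration; `strict_decode` simply relies on Python's built-in UTF-8 codec, which rejects overlong forms.

```python
def naive_decode_2byte(seq: bytes) -> int:
    """Unsafe: decode a two-byte sequence without the shortest-form check."""
    return ((seq[0] & 0x1F) << 6) | (seq[1] & 0x3F)


def strict_decode(data: bytes) -> str:
    """Validate while decoding; raises UnicodeDecodeError on ill-formed input."""
    return data.decode("utf-8", errors="strict")


overlong = bytes([0xC0, 0xA7])

# A lenient decoder turns the overlong pair into 0x27, the apostrophe.
naive_result = naive_decode_2byte(overlong)

# A validating decoder throws instead of interpreting the sequence.
try:
    strict_decode(overlong)
    strict_rejected = False
except UnicodeDecodeError:
    strict_rejected = True
```

A filter that inspects the raw bytes never sees an apostrophe, while a lenient component further down the chain produces one.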

Attack Vectors: Directory traversal

How many ways can you say

Attack Vectors

Root Causes: Handling the Unexpected

• Unassigned code points
  – U+2073
• Illegal code points
  – Half a surrogate pair
• Code points with special meaning
  – U+FEFF is the BOM
• Impact: filter evasion, enabling code execution

Root Causes: Handling the Unexpected – Over-consumption

Over-consuming ill-formed byte sequences

Big problem with MBCS lead bytes

<41 C2 3E 41> becomes <41 41>

<img src=[0xC2]> onerror=alert(1)<br />

becomes

<img src=> onerror=alert(1)<br />
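The byte sequence on these slides can be run through two decoders for comparison. `over_consuming_decode` is a hypothetical toy that handles only ASCII and two-byte sequences and contains the bug on purpose; the second result uses Python's built-in codec with `errors="replace"`.

```python
def over_consuming_decode(data: bytes) -> str:
    """Unsafe toy decoder: a bad lead byte swallows the byte that follows it."""
    out = []
    i = 0
    while i < len(data):
        b = data[i]
        if b < 0x80:
            out.append(chr(b))
            i += 1
        elif 0xC2 <= b <= 0xDF and i + 1 < len(data) and (data[i + 1] & 0xC0) == 0x80:
            out.append(chr(((b & 0x1F) << 6) | (data[i + 1] & 0x3F)))
            i += 2
        else:
            # Deliberate bug: skip the ill-formed lead byte AND its neighbour.
            i += 2
    return "".join(out)


ill_formed = bytes([0x41, 0xC2, 0x3E, 0x41])   # A, stray lead byte, >, A

eaten = over_consuming_decode(ill_formed)
safe = ill_formed.decode("utf-8", errors="replace")
```

The toy decoder loses the `>` (0x3E), which is how a tag delimiter can disappear from markup; the built-in decoder replaces only the stray lead byte with U+FFFD and keeps the `>`.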

Root Causes: Handling the Unexpected – Character-substitution

Correcting insecurely rather than failing
– Substituting a ‘’ or a ‘’ would be bad

Root Causes: Handling the Unexpected – Character-deletion

“deletion of noncharacters” (UTR-36)

<scr[U+FEFF]ipt> becomes <script>

Root Causes: Solutions for Handling the Unexpected

• Fail or error
• Use U+FFFD instead
  – A common alternative is ‘’ which can be safe
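The difference between deleting, replacing, and failing can be sketched as three small Python functions. All three names are hypothetical, and the sketch treats every U+FEFF as unexpected, which is an assumption; an application that accepts a leading BOM would strip that one occurrence first.

```python
BOM = "\ufeff"
REPLACEMENT = "\ufffd"  # U+FFFD REPLACEMENT CHARACTER


def delete_unexpected(text: str) -> str:
    """Unsafe: silently removing the character joins the text around it."""
    return text.replace(BOM, "")


def replace_unexpected(text: str) -> str:
    """Safer: substitute U+FFFD so the surrounding text stays separated."""
    return text.replace(BOM, REPLACEMENT)


def reject_unexpected(text: str) -> str:
    """Strictest: fail instead of repairing the input."""
    if BOM in text:
        raise ValueError("unexpected U+FEFF in input")
    return text


payload = "<scr\ufeffipt>"
deleted = delete_unexpected(payload)
replaced = replace_unexpected(payload)
```

Deletion rebuilds `<script>` out of a string that contained no such tag, while replacement leaves a visible marker in the middle of it.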

Attack Vectors: Filter evasion

• Bypass filters, WAFs, NIDS, and validation
• Exploit delivery techniques
  – E.g. cross-site scripting (buffer overflow of the Web)

Case Study: Apple and Mozilla

Safari and Firefox BOM consumption
– Attack: filter evasion, code execution
– Exploit: bypass filtering logic with specially crafted strings to leverage cross-site scripting
– Root cause: character deletion

<a href="java[U+FEFF]script:alert('XSS')">

Can be nastier

<a h[U+FEFF]ref="java[U+FEFF]script:al[U+FEFF]ert('XSS')">

A Closer Look: The BOM

BOM: U+FEFF

Root Causes: Casing

• Attackers manipulate casing operations to inject otherwise prohibited characters
• Casing can multiply the buffer sizes needed
• Impact: filter evasion, enabling code execution

toLower("İ") == "i"

toLower("scrİpt") == "script"

Root Causes: Casing

len(x) != len(toLower(x))
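The length change is straightforward to observe in Python. Note one assumption that differs from the slide: Python's `str.lower()` maps U+0130 to `i` followed by U+0307 rather than to a bare `i`, so this sketch illustrates only the change in length, not the filter bypass. The names `CASES`, `growth`, and `report` are hypothetical.

```python
# Each entry: (label, original string, case-mapped string).
CASES = [
    ("lower U+0130", "\u0130", "\u0130".lower()),
    ("upper U+0390", "\u0390", "\u0390".upper()),
    ("lower U+023A", "\u023a", "\u023a".lower()),
]


def growth(original: str, mapped: str) -> dict:
    """Report sizes before and after, in code points and in UTF-8 bytes."""
    return {
        "code_points": (len(original), len(mapped)),
        "utf8_bytes": (len(original.encode("utf-8")), len(mapped.encode("utf-8"))),
    }


report = {label: growth(orig, mapped) for label, orig, mapped in CASES}
```

A buffer sized from the input length is too small in all three cases: one code point becomes two or three, or two UTF-8 bytes become three.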

Root Causes: Guidance for Casing

• Perform casing operations before validation
• Leverage existing frameworks and APIs
  – ICU, .NET

Root Causes: Buffer Overflows

• Incorrect assumptions about string sizes (chars vs. bytes)
• Improper width calculations
• Impact: enabling code execution

Root Causes: Buffer Overflows

Casing - maximum expansion factors

Operation  UTF        Factor  Sample
Lower      8          1.5X    Ⱥ U+023A
           16, 32     1X      A U+0041
Upper      8, 16, 32  3X      ΐ U+0390

Source: Unicode Technical Report 36

Root Causes: Buffer Overflows

Normalization - maximum expansion factors

Operation  UTF     Factor  Sample
NFC        8       3X      𝅘𝅥𝅮 U+1D160
           16, 32  3X      שּׁ U+FB2C
NFD        8       3X      ΐ U+0390
           16, 32  4X      ᾂ U+1F82
NFKC/NFKD  8       11X     ﷺ U+FDFA
           16, 32  18X     ﷺ U+FDFA

Source: Unicode Technical Report 36
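The factors in the normalization table can be checked with Python's `unicodedata` module. The helper `expansion` is a hypothetical name; it measures encoded size, using `utf-16-le` so that no byte order mark is counted.

```python
import unicodedata


def expansion(form: str, ch: str, encoding: str) -> float:
    """Ratio of encoded size after normalization to encoded size before."""
    before = len(ch.encode(encoding))
    after = len(unicodedata.normalize(form, ch).encode(encoding))
    return after / before


factors = {
    "NFC, UTF-8, U+1D160": expansion("NFC", "\U0001d160", "utf-8"),
    "NFC, UTF-16, U+FB2C": expansion("NFC", "\ufb2c", "utf-16-le"),
    "NFD, UTF-8, U+0390": expansion("NFD", "\u0390", "utf-8"),
    "NFD, UTF-16, U+1F82": expansion("NFD", "\u1f82", "utf-16-le"),
    "NFKC, UTF-8, U+FDFA": expansion("NFKC", "\ufdfa", "utf-8"),
    "NFKC, UTF-16, U+FDFA": expansion("NFKC", "\ufdfa", "utf-16-le"),
}
```

Sizing an output buffer from these worst cases, rather than from the input length, is one way to act on the "bytes vs. chars" warning.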

Root Causes: Guidance for Buffer Overflows

• Know the difference between bytes and chars
• Secure coding
• Leverage existing frameworks and APIs
  – ICU, .NET

Root Causes: Controlling Syntax

• White space and line breaks
  – E.g. when U+180E acts like U+0020
• Quotation marks
• Impact: filter evasion, enabling code execution

Attacks and Exploits: Controlling syntax

• Manipulate HTML parsers and JavaScript interpreters
• Control protocols

Case Study: Opera

• Unicode formatter characters exploited for XSS
  – Damage: filter evasion, controlling syntax
  – Exploit: bypass filtering logic with specially crafted characters to leverage cross-site scripting
  – Root cause: interpreting “white space”
  – A problem with the HTML 4.0 spec

<a href=[U+180E]onclick=alert()>

MVS: U+180E

Root Causes: Guidance for Controlling Syntax

• Question specifications
• Be careful…
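One way to "be careful" is to flag any separator or format character that a parser might treat as white space but a filter does not expect. The Python sketch below is a heuristic under stated assumptions: the allow-list holds the four HTML 4.01 white space characters (space, tab, form feed, zero-width space) plus CR and LF, and the names `ALLOWED_SEPARATORS`, `SUSPECT_CATEGORIES`, and `unexpected_separators` are hypothetical.

```python
import unicodedata

# HTML 4.01 white space characters, plus the CR and LF line breaks.
ALLOWED_SEPARATORS = {"\u0020", "\u0009", "\u000c", "\u200b", "\r", "\n"}

# General categories for separators, format characters, and controls.
# U+180E is Zs in older Unicode versions and Cf in newer ones; both are listed.
SUSPECT_CATEGORIES = {"Zs", "Zl", "Zp", "Cf", "Cc"}


def unexpected_separators(text: str) -> list:
    """List code points that may act as white space but are not on the allow-list."""
    return [
        f"U+{ord(ch):04X}"
        for ch in text
        if ch not in ALLOWED_SEPARATORS
        and unicodedata.category(ch) in SUSPECT_CATEGORIES
    ]


flagged = unexpected_separators("<a href=\u180eonclick=alert()>")
clean = unexpected_separators("<a href=x onclick=y>")
```

Input that trips this check can be rejected or inspected before any parser gets the chance to interpret the character as a delimiter.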

Root Causes: Specifications

1) Character stability
   – IDNA/Nameprep based on Unicode 3.2
2) Designs
   – Specs are carefully designed but not always perfect
   • This could have been a problem
     – “When designing a markup language or data protocol, the use of U+FEFF can be restricted to that of Byte Order Mark. In that case, any U+FEFF occurring in the middle of the file can be ignored, or treated as an error.”
   – HTML 4.01
     • Defines four whitespace characters and explicitly leaves the handling of other characters up to the implementer

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Converting between charsets is dangerous

bull Mapping tables and algorithms vary across platforms

bull Impact Filter evasion Enable code execution Data-loss

wwwcasabasecuritycom

Root CausesCharset Transformations

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Avoid if possible

bull Use Unicode as the broker

bull Beware the PUA mappings

bull Transform case and normalize prior to validation and redisplay

wwwcasabasecuritycom

Root CausesGuidance for Charset Transformations

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Some charset identifiers are ill-defined

bull Vendor implementations vary

bull User-agents may sniff if confused

bull Attackers manipulate behavior

bull Impact Filter evasion Enable code execution

wwwcasabasecuritycom

Root CausesCharset Mismatches

Black Hat USA - July 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Root CausesCharset Mismatches

Content-Type charset=ISO-8859-1

ltmeta http-equiv=Content-Type content=texthtml charset=shift_jisgt

Attacker-controlled input

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Force UTF-8

bull Error if uncertain

wwwcasabasecuritycom

Root CausesGuidance for Charset Mismatches

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Unicode crash course

bull Root Causes

bull Attack Vectors

bull Tools

wwwcasabasecuritycom

Unicode TransformationsAgenda

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Unicode crash course

bull Root Causes

bull Attack Vectors

bull Tools

wwwcasabasecuritycom

Unicode TransformationsAgenda

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Watcher

ndash Passive Web-app security testing and auditing

bull Unibomber

ndash XSS autopwn testing tool

wwwcasabasecuritycom

Tools

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Unicode transformation hot-spotsbull User-controlled HTMLbull Cross-domain issuesbull Insecure cookiesbull Insecure HTTPHTTPS transitionsbull SSL protocol and certificate issuesbull XSS hot-spotsbull Flash issuesbull Silverlight issuesbull Information disclosure

wwwcasabasecuritycom

ToolsWatcher ndash Some of the Passive Checks Included

Black Hat USA - July 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Tools

Black Hat USA - July 2009 copy 2009 Chris Weber

httpwebsecuritytoolcodeplexcom

wwwcasabasecuritycom

ToolsWatcher - Web-app Security Testing and Auditing

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Deterministic testing

bull Auto-inject payloads

bull Unicode transformers

ndash lt gt lsquo ldquo etc

bull Detect transformations and encoding hotspots

wwwcasabasecuritycom

ToolsUnibomberndash runtime XSS testing tool

Thank you

Casaba Security

wwwcasabasecuritycom

Chris Weber

Blog wwwlookoutnet

Email chriscasabasecuritycom

LinkedIn httpwwwlinkedincominchrisweber

Page 39: Unraveling Unicode: A Bag of Tricks for Bug Hunting...Unicode Crash Course code points encodings categorization normalization binary properties case mapping conversion tables bi-directional

Black Hat USA - July 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsThe Invisibles

Black Hat USA - July 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsVisual Spoofing with Bidi Explicit Directional Overrides

Black Hat USA - July 2009 copy 2009 Chris Weber

Commonly occur in charset transformations and even innocuous APIrsquos

Impact Filter evasion Enable code execution

When σ becomes s

U+03C3 GREEK SMALL LETTER SIGMA

When prime becomes

U+2032 PRIME

wwwcasabasecuritycom

Root CausesBest-fit mappings

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Scrutinize charactercharset manipulation APIrsquos

bull Use EncoderFallback with SystemTextEncoding

bull Set WC_NO_BEST_FIT_CHARS flag with WideCharToMultiByte()

bull Use Unicode end-to-end

wwwcasabasecuritycom

Root CausesGuidance for Best-Fit mappings

Black Hat USA - July 2009 copy 2009 Chris Weber

bull A popular social networking site in 2008

bull Implemented complex filtering logic to prevent XSS

ndash Attack Filter evasion code execution

ndash Exploit Bypass filtering logic with best-fit mappings to leverage cross-site scripting

ndash Root Cause best-fit mappings

wwwcasabasecuritycom

Case Study Social NetworkingBest-fit mappings

Black Hat USA - July 2009 copy 2009 Chris Weber

-moz-binding()

was not allowed buthellip

-[U+ff4d]oz-binding()

would best-fit map

wwwcasabasecuritycom

Case Study Social NetworkingBest-fit mappings

Black Hat USA - July 2009 copy 2009 Chris Weber

Normalizing strings after validation is dangerous

Impact Filter evasion Enable code execution

wwwcasabasecuritycom

Root CausesNormalization

Black Hat USA - July 2009 copy 2009 Chris Weber

İ becomes I +

wwwcasabasecuritycom

Root CausesNormalization

U+0130 U+0049 U+0307

Black Hat USA - July 2009 copy 2009 Chris Weber

But are there dangerous characters

You bethellip with NFKC and NFKD you could control HTML or other parsing

﹤ becomes lt

wwwcasabasecuritycom

Root CausesNormalization

U+FE64 U+003C

Black Hat USA - July 2009 copy 2009 Chris Weber

﹤ becomes lt

toNFKC(ldquo﹤scriptgtrdquo) = ldquoltscriptgtrdquo

wwwcasabasecuritycom

Root CausesNormalization

U+FE64 U+003C

Black Hat USA - July 2009 copy 2009 Chris Weber

Normalize strings before validation

NFKC first defense against Visual spoofing

wwwcasabasecuritycom

Root CausesGuidance for Normalization

Black Hat USA - July 2009 copy 2009 Chris Weber

Non-shortest or overlong UTF-8

Impact Filter evasion Enable code execution

Application gets C0A7

OSFramework sees 27

Database gets

wwwcasabasecuritycom

Root CausesNon-shortest form UTF-8

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Unicode specification forbids

ndash Generation of non-shortest form

ndash Interpretation of non-shortest form for BMP

bull Validate UTF-8 encoding (throw on error)

wwwcasabasecuritycom

Root CausesGuidance for Non-shortest form UTF-8

Black Hat USA - July 2009 copy 2009 Chris Weber

How many ways can you say

wwwcasabasecuritycom

Attack VectorsDirectory traversal

Black Hat USA - July 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack Vectors

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Unassigned code points

ndash U+2073

bull Illegal code points

ndash Half a surrogate pair

bull Code points with special meaning

ndash U+FEFF is the BOM

bull Impact Filter evasion Enable code execution

wwwcasabasecuritycom

Root CausesHandling the Unexpected

Black Hat USA - July 2009 copy 2009 Chris Weber

Over-consuming ill-formed byte sequences

Big problem with MBCS lead bytes

lt41 C2 3E 41gt becomes

lt41 41gt

wwwcasabasecuritycom

Root CausesHandling the Unexpected Over-consumption

Black Hat USA - July 2009 copy 2009 Chris Weber

ltimg src=[0xC2]gt onerror=alert(1)ltbr gt

becomes

ltimg src=gt onerror=alert(1)ltbr gt

wwwcasabasecuritycom

Root CausesHandling the Unexpected Over-consumption

Black Hat USA - July 2009 copy 2009 Chris Weber

Correcting insecurely rather than failing

ndash Substituting a lsquorsquo or a lsquorsquo would be bad

wwwcasabasecuritycom

Root CausesHandling the Unexpected Character-substitution

Black Hat USA - July 2009 copy 2009 Chris Weber

ldquodeletion of noncharactersrdquo (UTR-36)

wwwcasabasecuritycom

Root CausesHandling the Unexpected Character-deletion

Black Hat USA - July 2009 copy 2009 Chris Weber

ltscr[U+FEFF]iptgt becomes ltscriptgt

wwwcasabasecuritycom

Root CausesHandling the Unexpected Character-deletion

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Fail or error

bull Use U+FFFD instead

ndash A common alternative is lsquorsquo which can be safe

wwwcasabasecuritycom

Root CausesSolutions for Handling the Unexpected

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Bypass filters WAFrsquos NIDS and validation

bull Exploit delivery techniques

ndash Eg Cross-site scripting (buffer overflow of the Web)

wwwcasabasecuritycom

Attack VectorsFilter evasion

Black Hat USA - July 2009 copy 2009 Chris Weber

Safari and Firefox BOM consumptionndash Attack Filter evasion code execution

ndash Exploit Bypass filtering logic with specially crafted strings to leverage cross-site scripting

ndash Root Cause Character deletion

lta href=ldquojava[U+FEFF]scriptalert(bdquoXSS‟)gt

Can be nastier

lta h[U+FEFF]ref=ldquojava[U+FEFF]scriptal[U+FEFF]ert(bdquoXSS‟)gt

wwwcasabasecuritycom

Case Study Apple and Mozilla

Black Hat USA - July 2009 copy 2009 Chris Weberwwwcasabasecuritycom

A Closer Look The BOM

BOMU+FEFF

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Attackers manipulate casing operations to inject otherwise prohibited characters

bull Casing can multiply the buffer sizes needed

bull Impact Filter evasion Enable code execution

wwwcasabasecuritycom

Root CausesCasing

Black Hat USA - July 2009 copy 2009 Chris Weber

toLower(ldquoİrdquo) == ldquoirdquo

toLower(ldquoscrİptrdquo) == ldquoscriptrdquo

wwwcasabasecuritycom

Root CausesCasing

Black Hat USA - July 2009 copy 2009 Chris Weber

len(x) = len(toLower(x))

wwwcasabasecuritycom

Root CausesCasing

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Perform casing operations before validation

bull Leverage existing frameworks and APIrsquos

ndash ICU Net

wwwcasabasecuritycom

Root CausesGuidance for Casing

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Incorrect assumptions about string sizes (chars vs bytes)

bull Improper width calculations

bull Impact Enable code execution

wwwcasabasecuritycom

Root CausesBuffer Overflows

Black Hat USA - July 2009 copy 2009 Chris Weber

Casing - maximum expansion factors

wwwcasabasecuritycom

Root CausesBuffer Overflows

Operation UTF Factor Sample

Lower 8 15 Ⱥ U+023A

16 32 1 A U+0041

Upper 8 16 32 3 ΐ U+0390Source Unicode Technical Report 36

Black Hat USA - July 2009 copy 2009 Chris Weber

Normalization- maximum expansion factors

wwwcasabasecuritycom

Root CausesBuffer Overflows

Operation UTF Factor Sample

NFC8 3X 119136 U+1D160

16 32 3X ש U+FB2C

NFD8 3X ΐ U+0390

16 32 4X ᾂ U+1F82

NFKCNFKD8 11X

ملسو هيلع هللا ىلص U+FDFA16 32 18X

Source Unicode Technical Report 36

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Know the difference between bytes and chars

bull Secure coding

bull Leverage existing frameworks and APIrsquos

ndash ICU Net

wwwcasabasecuritycom

Root CausesGuidance for Buffer Overflows

Black Hat USA - July 2009 copy 2009 Chris Weber

bull White space and line breaks

ndash Eg when U+180E acts like U+0020

bull Quotation marks

bull Impact Filter evasion Enable code execution

wwwcasabasecuritycom

Root CausesControlling Syntax

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Manipulate HTML parsers and javascriptinterpreters

bull Control protocols

wwwcasabasecuritycom

Attacks and ExploitsControlling syntax

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Unicode formatter characters exploited for XSS

ndash Damage Filter evasion controlling syntax

ndash Exploit Bypass filtering logic with specially crafted characters to leverage cross-site scripting

ndash Root Cause Interpreting ldquowhite spacerdquo

ndash A problem with HTML 40 spec

wwwcasabasecuritycom

Case Study Opera

Black Hat USA - July 2009 copy 2009 Chris Weber

lta href=[U+180E]onclick=alert()gt

wwwcasabasecuritycom

Case Study Opera

Black Hat USA - July 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Case Study Opera

MVSU+180E

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Question specifications

bull Be carefulhellip

wwwcasabasecuritycom

Root CausesGuidance for Controlling Syntax

Black Hat USA - July 2009 copy 2009 Chris Weber

1) Character stabilityndash IDNANameprep based on Unicode 32

2) Designsndash Specs are carefully designed but not always perfect

bull This could have been a problemndash ldquoWhen designing a markup language or data protocol the use of

U+FEFF can be restricted to that of Byte Order Mark In that case any U+FEFF occurring in the middle of the file can be ignored or treated as an error rdquo

ndash HTML 401 bull Defines four whitespace characters and explicitly leaves

handling other characters up to implementer

wwwcasabasecuritycom

Root CausesSpecifications

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Converting between charsets is dangerous

bull Mapping tables and algorithms vary across platforms

bull Impact Filter evasion Enable code execution Data-loss

wwwcasabasecuritycom

Root CausesCharset Transformations

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Avoid if possible

bull Use Unicode as the broker

bull Beware the PUA mappings

bull Transform case and normalize prior to validation and redisplay

wwwcasabasecuritycom

Root CausesGuidance for Charset Transformations

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Some charset identifiers are ill-defined

bull Vendor implementations vary

bull User-agents may sniff if confused

bull Attackers manipulate behavior

bull Impact Filter evasion Enable code execution

wwwcasabasecuritycom

Root CausesCharset Mismatches

Black Hat USA - July 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Root CausesCharset Mismatches

Content-Type charset=ISO-8859-1

ltmeta http-equiv=Content-Type content=texthtml charset=shift_jisgt

Attacker-controlled input

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Force UTF-8

bull Error if uncertain

wwwcasabasecuritycom

Root CausesGuidance for Charset Mismatches

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Unicode crash course

bull Root Causes

bull Attack Vectors

bull Tools

wwwcasabasecuritycom

Unicode TransformationsAgenda

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Unicode crash course

bull Root Causes

bull Attack Vectors

bull Tools

wwwcasabasecuritycom

Unicode TransformationsAgenda

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Watcher

ndash Passive Web-app security testing and auditing

bull Unibomber

ndash XSS autopwn testing tool

wwwcasabasecuritycom

Tools

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Unicode transformation hot-spotsbull User-controlled HTMLbull Cross-domain issuesbull Insecure cookiesbull Insecure HTTPHTTPS transitionsbull SSL protocol and certificate issuesbull XSS hot-spotsbull Flash issuesbull Silverlight issuesbull Information disclosure

wwwcasabasecuritycom

ToolsWatcher ndash Some of the Passive Checks Included

Black Hat USA - July 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Tools

Black Hat USA - July 2009 copy 2009 Chris Weber

httpwebsecuritytoolcodeplexcom

wwwcasabasecuritycom

ToolsWatcher - Web-app Security Testing and Auditing

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Deterministic testing

bull Auto-inject payloads

bull Unicode transformers

ndash lt gt lsquo ldquo etc

bull Detect transformations and encoding hotspots

wwwcasabasecuritycom

ToolsUnibomberndash runtime XSS testing tool

Thank you

Casaba Security

wwwcasabasecuritycom

Chris Weber

Blog wwwlookoutnet

Email chriscasabasecuritycom

LinkedIn httpwwwlinkedincominchrisweber

Page 40: Unraveling Unicode: A Bag of Tricks for Bug Hunting...Unicode Crash Course code points encodings categorization normalization binary properties case mapping conversion tables bi-directional

Black Hat USA - July 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack VectorsVisual Spoofing with Bidi Explicit Directional Overrides

Black Hat USA - July 2009 copy 2009 Chris Weber

Commonly occur in charset transformations and even innocuous APIrsquos

Impact Filter evasion Enable code execution

When σ becomes s

U+03C3 GREEK SMALL LETTER SIGMA

When prime becomes

U+2032 PRIME

wwwcasabasecuritycom

Root CausesBest-fit mappings

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Scrutinize charactercharset manipulation APIrsquos

bull Use EncoderFallback with SystemTextEncoding

bull Set WC_NO_BEST_FIT_CHARS flag with WideCharToMultiByte()

bull Use Unicode end-to-end

wwwcasabasecuritycom

Root CausesGuidance for Best-Fit mappings

Black Hat USA - July 2009 copy 2009 Chris Weber

bull A popular social networking site in 2008

bull Implemented complex filtering logic to prevent XSS

ndash Attack Filter evasion code execution

ndash Exploit Bypass filtering logic with best-fit mappings to leverage cross-site scripting

ndash Root Cause best-fit mappings

wwwcasabasecuritycom

Case Study Social NetworkingBest-fit mappings

Black Hat USA - July 2009 copy 2009 Chris Weber

-moz-binding()

was not allowed buthellip

-[U+ff4d]oz-binding()

would best-fit map

wwwcasabasecuritycom

Case Study Social NetworkingBest-fit mappings

Black Hat USA - July 2009 copy 2009 Chris Weber

Normalizing strings after validation is dangerous

Impact Filter evasion Enable code execution

wwwcasabasecuritycom

Root CausesNormalization

Black Hat USA - July 2009 copy 2009 Chris Weber

İ becomes I +

wwwcasabasecuritycom

Root CausesNormalization

U+0130 U+0049 U+0307

Black Hat USA - July 2009 copy 2009 Chris Weber

But are there dangerous characters

You bethellip with NFKC and NFKD you could control HTML or other parsing

﹤ becomes lt

wwwcasabasecuritycom

Root CausesNormalization

U+FE64 U+003C

Black Hat USA - July 2009 copy 2009 Chris Weber

﹤ becomes lt

toNFKC(ldquo﹤scriptgtrdquo) = ldquoltscriptgtrdquo

wwwcasabasecuritycom

Root CausesNormalization

U+FE64 U+003C

Black Hat USA - July 2009 copy 2009 Chris Weber

Normalize strings before validation

NFKC first defense against Visual spoofing

wwwcasabasecuritycom

Root CausesGuidance for Normalization

Black Hat USA - July 2009 copy 2009 Chris Weber

Non-shortest or overlong UTF-8

Impact Filter evasion Enable code execution

Application gets C0A7

OSFramework sees 27

Database gets

wwwcasabasecuritycom

Root CausesNon-shortest form UTF-8

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Unicode specification forbids

ndash Generation of non-shortest form

ndash Interpretation of non-shortest form for BMP

bull Validate UTF-8 encoding (throw on error)

wwwcasabasecuritycom

Root CausesGuidance for Non-shortest form UTF-8

Black Hat USA - July 2009 copy 2009 Chris Weber

How many ways can you say

wwwcasabasecuritycom

Attack VectorsDirectory traversal

Black Hat USA - July 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack Vectors

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Unassigned code points

ndash U+2073

bull Illegal code points

ndash Half a surrogate pair

bull Code points with special meaning

ndash U+FEFF is the BOM

bull Impact Filter evasion Enable code execution

wwwcasabasecuritycom

Root CausesHandling the Unexpected

Black Hat USA - July 2009 copy 2009 Chris Weber

Over-consuming ill-formed byte sequences

Big problem with MBCS lead bytes

lt41 C2 3E 41gt becomes

lt41 41gt

wwwcasabasecuritycom

Root CausesHandling the Unexpected Over-consumption

Black Hat USA - July 2009 copy 2009 Chris Weber

ltimg src=[0xC2]gt onerror=alert(1)ltbr gt

becomes

ltimg src=gt onerror=alert(1)ltbr gt

wwwcasabasecuritycom

Root CausesHandling the Unexpected Over-consumption

Black Hat USA - July 2009 copy 2009 Chris Weber

Correcting insecurely rather than failing

ndash Substituting a lsquorsquo or a lsquorsquo would be bad

wwwcasabasecuritycom

Root CausesHandling the Unexpected Character-substitution

Black Hat USA - July 2009 copy 2009 Chris Weber

ldquodeletion of noncharactersrdquo (UTR-36)

wwwcasabasecuritycom

Root CausesHandling the Unexpected Character-deletion

Black Hat USA - July 2009 copy 2009 Chris Weber

ltscr[U+FEFF]iptgt becomes ltscriptgt

wwwcasabasecuritycom

Root CausesHandling the Unexpected Character-deletion

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Fail or error

bull Use U+FFFD instead

ndash A common alternative is lsquorsquo which can be safe

wwwcasabasecuritycom

Root CausesSolutions for Handling the Unexpected

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Bypass filters WAFrsquos NIDS and validation

bull Exploit delivery techniques

ndash Eg Cross-site scripting (buffer overflow of the Web)

wwwcasabasecuritycom

Attack VectorsFilter evasion

Black Hat USA - July 2009 copy 2009 Chris Weber

Safari and Firefox BOM consumptionndash Attack Filter evasion code execution

ndash Exploit Bypass filtering logic with specially crafted strings to leverage cross-site scripting

ndash Root Cause Character deletion

lta href=ldquojava[U+FEFF]scriptalert(bdquoXSS‟)gt

Can be nastier

lta h[U+FEFF]ref=ldquojava[U+FEFF]scriptal[U+FEFF]ert(bdquoXSS‟)gt

wwwcasabasecuritycom

Case Study Apple and Mozilla

Black Hat USA - July 2009 copy 2009 Chris Weberwwwcasabasecuritycom

A Closer Look The BOM

BOMU+FEFF

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Attackers manipulate casing operations to inject otherwise prohibited characters

bull Casing can multiply the buffer sizes needed

bull Impact Filter evasion Enable code execution

wwwcasabasecuritycom

Root CausesCasing

Black Hat USA - July 2009 copy 2009 Chris Weber

toLower(ldquoİrdquo) == ldquoirdquo

toLower(ldquoscrİptrdquo) == ldquoscriptrdquo

wwwcasabasecuritycom

Root CausesCasing

Black Hat USA - July 2009 copy 2009 Chris Weber

len(x) = len(toLower(x))

wwwcasabasecuritycom

Root CausesCasing

• Perform casing operations before validation

• Leverage existing frameworks and APIs

– ICU, .NET

Root Causes: Guidance for Casing

• Incorrect assumptions about string sizes (chars vs. bytes)

• Improper width calculations

• Impact: Enable code execution

Root Causes: Buffer Overflows

Casing: maximum expansion factors

Root Causes: Buffer Overflows

Operation   UTF         Factor   Sample
Lower       8           1.5X     Ⱥ U+023A
Lower       16, 32      1X       A U+0041
Upper       8, 16, 32   3X       ΐ U+0390

Source: Unicode Technical Report 36
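The casing factors can be reproduced with Python's built-in case mapping. This is only a spot check of the sample characters from the table; the helper names are illustrative.

```python
def utf8_factor(before: str, after: str) -> float:
    """Growth in UTF-8 bytes from before to after."""
    return len(after.encode("utf-8")) / len(before.encode("utf-8"))

def utf16_factor(before: str, after: str) -> float:
    """Growth in UTF-16 bytes (no BOM) from before to after."""
    return len(after.encode("utf-16-le")) / len(before.encode("utf-16-le"))

lower_utf8 = utf8_factor("\u023a", "\u023a".lower())    # 2 bytes -> 3 bytes
lower_utf16 = utf16_factor("A", "A".lower())            # 2 bytes -> 2 bytes
upper_utf8 = utf8_factor("\u0390", "\u0390".upper())    # 2 bytes -> 6 bytes
upper_utf16 = utf16_factor("\u0390", "\u0390".upper())  # 2 bytes -> 6 bytes
```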

Normalization: maximum expansion factors

Root Causes: Buffer Overflows

Operation    UTF      Factor   Sample
NFC          8        3X       U+1D160 (𝅘𝅥𝅮)
NFC          16, 32   3X       U+FB2C (שּׁ)
NFD          8        3X       U+0390 (ΐ)
NFD          16, 32   4X       U+1F82 (ᾂ)
NFKC/NFKD    8        11X      U+FDFA (ﷺ)
NFKC/NFKD    16, 32   18X      U+FDFA (ﷺ)

Source: Unicode Technical Report 36
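The normalization factors can be checked the same way with the standard `unicodedata` module. The `factor` helper is an illustrative name, and the sketch only measures the sample characters listed in the table.

```python
import unicodedata

def factor(form: str, ch: str, encoding: str) -> float:
    """Size of normalize(form, ch) relative to ch, in bytes of `encoding`."""
    out = unicodedata.normalize(form, ch)
    return len(out.encode(encoding)) / len(ch.encode(encoding))

nfc_utf8 = factor("NFC", "\U0001d160", "utf-8")     # 4 bytes -> 12 bytes
nfc_utf16 = factor("NFC", "\ufb2c", "utf-16-le")    # 1 unit  -> 3 units
nfd_utf8 = factor("NFD", "\u0390", "utf-8")         # 2 bytes -> 6 bytes
nfd_utf16 = factor("NFD", "\u1f82", "utf-16-le")    # 1 unit  -> 4 units
nfkc_utf8 = factor("NFKC", "\ufdfa", "utf-8")       # 3 bytes -> 33 bytes
nfkc_utf16 = factor("NFKC", "\ufdfa", "utf-16-le")  # 1 unit  -> 18 units
```

When a buffer is allocated before normalizing, these worst-case factors, not the input length, are the safe sizing basis.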

• Know the difference between bytes and chars

• Secure coding

• Leverage existing frameworks and APIs

– ICU, .NET

Root Causes: Guidance for Buffer Overflows

• White space and line breaks

– E.g. when U+180E acts like U+0020

• Quotation marks

• Impact: Filter evasion, Enable code execution

Root Causes: Controlling Syntax

• Manipulate HTML parsers and JavaScript interpreters

• Control protocols

Attacks and Exploits: Controlling syntax

• Unicode formatter characters exploited for XSS

– Damage: Filter evasion, controlling syntax

– Exploit: Bypass filtering logic with specially crafted characters to leverage cross-site scripting

– Root Cause: Interpreting “white space”

– A problem with HTML 4.0 spec

Case Study: Opera

<a href=[U+180E]onclick=alert()>

Case Study: Opera
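A minimal sketch of why this payload slips past a filter. The regular expression is a hypothetical filter rule that only recognizes ASCII white space before an event handler, and `lenient_parser_view` models (as an assumption) a parser that treats U+180E like U+0020.

```python
import re

MVS = "\u180e"  # U+180E MONGOLIAN VOWEL SEPARATOR

# Hypothetical filter rule: an event-handler attribute preceded by ASCII white space.
EVENT_ATTR = re.compile(r"[ \t\n\f\r]on\w+\s*=", re.IGNORECASE)

def filter_allows(markup: str) -> bool:
    """True when the hypothetical filter finds no event-handler attribute."""
    return EVENT_ATTR.search(markup) is None

def lenient_parser_view(markup: str) -> str:
    """Model of a parser that treats U+180E as a space (assumption)."""
    return markup.replace(MVS, " ")

payload = "<a href=" + MVS + "onclick=alert()>"
allowed = filter_allows(payload)                # filter sees no separator
seen_by_parser = lenient_parser_view(payload)   # parser sees ' onclick='
would_have_blocked = not filter_allows(seen_by_parser)
```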

Case Study: Opera

MVS: U+180E

• Question specifications

• Be careful…

Root Causes: Guidance for Controlling Syntax

1) Character stability

– IDNA/Nameprep based on Unicode 3.2

2) Designs

– Specs are carefully designed but not always perfect

• This could have been a problem

– “When designing a markup language or data protocol, the use of U+FEFF can be restricted to that of Byte Order Mark. In that case, any U+FEFF occurring in the middle of the file can be ignored, or treated as an error.”

– HTML 4.01

• Defines four whitespace characters and explicitly leaves handling other characters up to implementer

Root Causes: Specifications

• Converting between charsets is dangerous

• Mapping tables and algorithms vary across platforms

• Impact: Filter evasion, Enable code execution, Data-loss

Root Causes: Charset Transformations

• Avoid if possible

• Use Unicode as the broker

• Beware the PUA mappings

• Transform case and normalize prior to validation and redisplay

Root Causes: Guidance for Charset Transformations

• Some charset identifiers are ill-defined

• Vendor implementations vary

• User-agents may sniff if confused

• Attackers manipulate behavior

• Impact: Filter evasion, Enable code execution

Root Causes: Charset Mismatches

Root Causes: Charset Mismatches

Content-Type: charset=ISO-8859-1

<meta http-equiv="Content-Type" content="text/html; charset=shift_jis">

Attacker-controlled input
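To see why the mismatch matters, the same attacker-supplied bytes can be decoded under both declared charsets. The two-byte input is an invented example; which interpretation a given server-side filter or browser actually picks is an assumption that varies by implementation.

```python
raw = b"\x83\x5c"  # attacker-controlled bytes (illustrative)

# Interpretation A: a component that trusts the HTTP header.
as_latin1 = raw.decode("iso-8859-1")

# Interpretation B: a component that trusts the <meta> declaration.
as_sjis = raw.decode("shift_jis")

backslash_in_a = "\\" in as_latin1  # 0x5C is a backslash in ISO-8859-1
backslash_in_b = "\\" in as_sjis    # in Shift_JIS it is only a trail byte
```

A backslash that one layer relies on for escaping simply does not exist for the other layer.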

• Force UTF-8

• Error if uncertain

Root Causes: Guidance for Charset Mismatches

• Unicode crash course

• Root Causes

• Attack Vectors

• Tools

Unicode Transformations: Agenda


• Watcher

– Passive Web-app security testing and auditing

• Unibomber

– XSS autopwn testing tool

Tools

• Unicode transformation hot-spots

• User-controlled HTML

• Cross-domain issues

• Insecure cookies

• Insecure HTTP/HTTPS transitions

• SSL protocol and certificate issues

• XSS hot-spots

• Flash issues

• Silverlight issues

• Information disclosure

Tools: Watcher – Some of the Passive Checks Included


Tools

http://websecuritytool.codeplex.com

Tools: Watcher - Web-app Security Testing and Auditing

• Deterministic testing

• Auto-inject payloads

• Unicode transformers

– < > ' " etc.

• Detect transformations and encoding hotspots

Tools: Unibomber – runtime XSS testing tool

Thank you

Casaba Security

www.casabasecurity.com

Chris Weber

Blog: www.lookout.net

Email: chris@casabasecurity.com

LinkedIn: http://www.linkedin.com/in/chrisweber


Commonly occur in charset transformations and even innocuous APIs

Impact: Filter evasion, Enable code execution

When σ becomes s

U+03C3 GREEK SMALL LETTER SIGMA

When ′ becomes '

U+2032 PRIME

Root Causes: Best-fit mappings

• Scrutinize character/charset manipulation APIs

• Use EncoderFallback with System.Text.Encoding

• Set WC_NO_BEST_FIT_CHARS flag with WideCharToMultiByte()

• Use Unicode end-to-end

Root Causes: Guidance for Best-Fit mappings
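The same fail-closed idea can be sketched in Python, which is used here only as an analogue and is not the .NET or Win32 API named above. With `errors="strict"`, a conversion to a legacy charset raises instead of substituting a look-alike; `to_legacy` and `converts_cleanly` are illustrative names.

```python
def to_legacy(text: str, charset: str = "cp1252") -> bytes:
    """Convert to a legacy charset, refusing anything it cannot represent."""
    return text.encode(charset, errors="strict")

def converts_cleanly(text: str, charset: str = "cp1252") -> bool:
    """True when every character has an exact mapping in `charset`."""
    try:
        to_legacy(text, charset)
        return True
    except UnicodeEncodeError:
        return False

sigma_ok = converts_cleanly("\u03c3")  # GREEK SMALL LETTER SIGMA
prime_ok = converts_cleanly("\u2032")  # PRIME
ascii_ok = converts_cleanly("script")
```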

• A popular social networking site in 2008

• Implemented complex filtering logic to prevent XSS

– Attack: Filter evasion, code execution

– Exploit: Bypass filtering logic with best-fit mappings to leverage cross-site scripting

– Root Cause: Best-fit mappings

Case Study: Social Networking: Best-fit mappings

-moz-binding()

was not allowed, but…

-[U+FF4D]oz-binding()

would best-fit map

Case Study: Social Networking: Best-fit mappings
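The bypass can be modeled with a one-entry mapping table. The table and both functions are hypothetical: real best-fit tables are platform-specific and much larger, and the site's actual filter is not public in this deck.

```python
# Hypothetical one-entry best-fit table (real tables are platform-specific).
BEST_FIT = {"\uff4d": "m"}  # FULLWIDTH LATIN SMALL LETTER M -> 'm'

def filter_allows(css: str) -> bool:
    """Hypothetical blacklist check applied before any charset conversion."""
    return "-moz-binding" not in css.lower()

def best_fit(text: str) -> str:
    """Apply the illustrative best-fit table character by character."""
    return "".join(BEST_FIT.get(ch, ch) for ch in text)

payload = "-\uff4doz-binding()"
allowed = filter_allows(payload)  # the blacklist does not match
stored = best_fit(payload)        # the conversion produces the banned token
```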

Normalizing strings after validation is dangerous

Impact: Filter evasion, Enable code execution

Root Causes: Normalization

İ (U+0130) becomes I (U+0049) + ◌̇ (U+0307)

Root Causes: Normalization

But are there dangerous characters?

You bet… with NFKC and NFKD you could control HTML or other parsing

﹤ (U+FE64) becomes < (U+003C)

Root Causes: Normalization

﹤ (U+FE64) becomes < (U+003C)

toNFKC("﹤script>") = "<script>"

Root Causes: Normalization
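This transformation is easy to reproduce with Python's standard `unicodedata` module. The `looks_safe` check is a deliberately naive stand-in for a validator that runs before normalization.

```python
import unicodedata

payload = "\ufe64script>"  # starts with U+FE64 SMALL LESS-THAN SIGN

looks_safe = "<" not in payload                 # naive pre-normalization check
nfc = unicodedata.normalize("NFC", payload)     # canonical form: unchanged
nfkc = unicodedata.normalize("NFKC", payload)   # compatibility form: folds to '<'
```

Only the compatibility forms (NFKC, NFKD) fold U+FE64 to U+003C; NFC leaves the string alone.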

Normalize strings before validation

NFKC: a first defense against visual spoofing

Root Causes: Guidance for Normalization
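One way to sketch the recommended order in Python is to canonicalize first and validate the canonical string. The `FORBIDDEN` list is a deliberately tiny, hypothetical blacklist, and NFKC followed by `casefold()` is an assumed canonical form, not a complete policy.

```python
import unicodedata

FORBIDDEN = ("<", ">", "script")  # hypothetical, deliberately tiny blacklist

def canonicalize(text: str) -> str:
    """Normalize and case-fold BEFORE any validation looks at the text."""
    return unicodedata.normalize("NFKC", text).casefold()

def validate(text: str) -> str:
    """Return the canonical form, or raise if it contains a forbidden token."""
    canon = canonicalize(text)
    if any(token in canon for token in FORBIDDEN):
        raise ValueError("input rejected after canonicalization")
    return canon

def is_rejected(text: str) -> bool:
    try:
        validate(text)
        return False
    except ValueError:
        return True
```

The value that gets stored or displayed should be the canonical string that was validated, not the original input.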

Non-shortest or overlong UTF-8

Impact: Filter evasion, Enable code execution

Application gets C0 A7

OS/Framework sees 27

Database gets '

Root Causes: Non-shortest form UTF-8
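The arithmetic behind C0 A7 turning into 27 can be shown with a hand-rolled decoder. `naive_decode_2byte` is a hypothetical, intentionally lenient routine; Python's own UTF-8 decoder is used as the strict counterexample.

```python
def naive_decode_2byte(b1: int, b2: int) -> str:
    """Decode a two-byte UTF-8 sequence WITHOUT rejecting overlong forms."""
    return chr(((b1 & 0x1F) << 6) | (b2 & 0x3F))

overlong = bytes([0xC0, 0xA7])
naive = naive_decode_2byte(overlong[0], overlong[1])  # yields U+0027

# A conforming decoder refuses the sequence outright.
try:
    overlong.decode("utf-8")
    strict_rejects = False
except UnicodeDecodeError:
    strict_rejects = True
```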

• Unicode specification forbids

– Generation of non-shortest form

– Interpretation of non-shortest form for BMP

• Validate UTF-8 encoding (throw on error)

Root Causes: Guidance for Non-shortest form UTF-8

How many ways can you say

Attack Vectors: Directory traversal


Attack Vectors

• Unassigned code points

– U+2073

• Illegal code points

– Half a surrogate pair

• Code points with special meaning

– U+FEFF is the BOM

• Impact: Filter evasion, Enable code execution

Root Causes: Handling the Unexpected
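A short Python sketch shows how the three kinds of unexpected code point can be told apart using general categories from the standard `unicodedata` module. The labels returned by `describe` are illustrative, and the category of an unassigned code point can change if a later Unicode version assigns it.

```python
import unicodedata

def describe(ch: str) -> str:
    """Classify a single code point into the buckets used above."""
    category = unicodedata.category(ch)
    if category == "Cn":
        return "unassigned or noncharacter"
    if category == "Cs":
        return "surrogate"
    if ch == "\ufeff":
        return "byte order mark"
    return "ordinary"

def can_encode_utf8(text: str) -> bool:
    """Python refuses to encode a lone surrogate as UTF-8."""
    try:
        text.encode("utf-8")
        return True
    except UnicodeEncodeError:
        return False

labels = [describe(c) for c in ("\u2073", "\ud800", "\ufeff", "A")]
lone_surrogate_encodable = can_encode_utf8("\ud800")
```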

Over-consuming ill-formed byte sequences

Big problem with MBCS lead bytes

<41 C2 3E 41> becomes

<41 41>

Root Causes: Handling the Unexpected: Over-consumption
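The byte sequence above can be run through a deliberately buggy decoder to reproduce the loss of 0x3E. `over_consuming_decode` is a hypothetical routine written only to exhibit the bug; Python's built-in decoder with `errors="replace"` is shown for comparison.

```python
def over_consuming_decode(data: bytes) -> str:
    """Hypothetical buggy decoder: a lead byte always swallows the next byte."""
    out = []
    i = 0
    while i < len(data):
        b = data[i]
        if b < 0x80:
            out.append(chr(b))
            i += 1
        elif 0xC2 <= b <= 0xDF and i + 1 < len(data):
            nxt = data[i + 1]
            if 0x80 <= nxt <= 0xBF:
                out.append(chr(((b & 0x1F) << 6) | (nxt & 0x3F)))
            # BUG: the next byte is consumed even when it is not a trail byte.
            i += 2
        else:
            i += 1  # drop anything this sketch does not handle
    return "".join(out)

data = bytes([0x41, 0xC2, 0x3E, 0x41])
buggy = over_consuming_decode(data)            # the '>' (0x3E) disappears
safe = data.decode("utf-8", errors="replace")  # only 0xC2 is replaced
```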

<img src=[0xC2]> onerror=alert(1)<br />

becomes

<img src=> onerror=alert(1)<br />

Root Causes: Handling the Unexpected: Over-consumption

Correcting insecurely rather than failing

– Substituting a ‘’ or a ‘’ would be bad

Root Causes: Handling the Unexpected: Character-substitution



Page 43: Unraveling Unicode: A Bag of Tricks for Bug Hunting...Unicode Crash Course code points encodings categorization normalization binary properties case mapping conversion tables bi-directional

Black Hat USA - July 2009 copy 2009 Chris Weber

bull A popular social networking site in 2008

bull Implemented complex filtering logic to prevent XSS

ndash Attack Filter evasion code execution

ndash Exploit Bypass filtering logic with best-fit mappings to leverage cross-site scripting

ndash Root Cause best-fit mappings

wwwcasabasecuritycom

Case Study Social NetworkingBest-fit mappings

Black Hat USA - July 2009 copy 2009 Chris Weber

-moz-binding()

was not allowed buthellip

-[U+ff4d]oz-binding()

would best-fit map

wwwcasabasecuritycom

Case Study Social NetworkingBest-fit mappings

Black Hat USA - July 2009 copy 2009 Chris Weber

Normalizing strings after validation is dangerous

Impact Filter evasion Enable code execution

wwwcasabasecuritycom

Root CausesNormalization

Black Hat USA - July 2009 copy 2009 Chris Weber

İ becomes I +

wwwcasabasecuritycom

Root CausesNormalization

U+0130 U+0049 U+0307

Black Hat USA - July 2009 copy 2009 Chris Weber

But are there dangerous characters

You bethellip with NFKC and NFKD you could control HTML or other parsing

﹤ becomes lt

wwwcasabasecuritycom

Root CausesNormalization

U+FE64 U+003C

Black Hat USA - July 2009 copy 2009 Chris Weber

﹤ becomes lt

toNFKC(ldquo﹤scriptgtrdquo) = ldquoltscriptgtrdquo

wwwcasabasecuritycom

Root CausesNormalization

U+FE64 U+003C

Black Hat USA - July 2009 copy 2009 Chris Weber

Normalize strings before validation

NFKC first defense against Visual spoofing

wwwcasabasecuritycom

Root CausesGuidance for Normalization

Black Hat USA - July 2009 copy 2009 Chris Weber

Non-shortest or overlong UTF-8

Impact Filter evasion Enable code execution

Application gets C0A7

OSFramework sees 27

Database gets

wwwcasabasecuritycom

Root CausesNon-shortest form UTF-8

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Unicode specification forbids

ndash Generation of non-shortest form

ndash Interpretation of non-shortest form for BMP

bull Validate UTF-8 encoding (throw on error)

wwwcasabasecuritycom

Root CausesGuidance for Non-shortest form UTF-8

Attack Vectors: Directory traversal

How many ways can you say

Root Causes: Handling the Unexpected

• Unassigned code points
  – U+2073
• Illegal code points
  – Half a surrogate pair
• Code points with special meaning
  – U+FEFF is the BOM
• Impact: filter evasion, enabling code execution
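These three classes can be told apart with the Unicode general category. The sketch below uses the standard `unicodedata` module; the function `classify` and its labels are illustrative, and the answer for any given code point depends on the Unicode version bundled with the interpreter. Note that category `Cn` also covers noncharacters.

```python
import unicodedata

def classify(ch: str) -> str:
    """Label a single code point using its Unicode general category."""
    category = unicodedata.category(ch)
    if category == "Cn":
        return "unassigned"      # also returned for noncharacters
    if category == "Cs":
        return "surrogate"       # half of a surrogate pair
    if ch == "\ufeff":
        return "BOM"
    return "ordinary"

print(classify("\u2073"), classify("\ud800"), classify("\ufeff"), classify("A"))
```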

Root Causes: Handling the Unexpected (over-consumption)

Over-consuming ill-formed byte sequences

Big problem with MBCS lead bytes

<41 C2 3E 41> becomes <41 41>

<img src=[0xC2]> onerror=alert(1)<br >

becomes

<img src=> onerror=alert(1)<br >
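The byte example can be reproduced with a deliberately buggy routine. `over_consume` is a hypothetical function that mimics the flaw; it is contrasted with Python's own decoder, which replaces only the stray lead byte.

```python
def over_consume(data: bytes) -> bytes:
    """Hypothetical buggy decoder: a lead byte swallows the byte after it."""
    out = bytearray()
    i = 0
    while i < len(data):
        if 0xC2 <= data[i] <= 0xDF:            # two-byte lead byte
            if i + 1 < len(data) and 0x80 <= data[i + 1] <= 0xBF:
                out += data[i:i + 2]           # well-formed pair, keep it
            i += 2                             # BUG: skips two bytes regardless
        else:
            out.append(data[i])
            i += 1
    return bytes(out)

raw = bytes([0x41, 0xC2, 0x3E, 0x41])
buggy = over_consume(raw)                      # drops both C2 and 3E
safe = raw.decode("utf-8", errors="replace")   # replaces only C2
print(buggy.hex(" "), repr(safe))
```

In the buggy path the `>` (0x3E) disappears along with the lead byte, which is what lets the attacker's text escape the attribute value.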

Root Causes: Handling the Unexpected (character substitution)

Correcting insecurely rather than failing
  – Substituting a ‘’ or a ‘’ would be bad

Root Causes: Handling the Unexpected (character deletion)

“deletion of noncharacters” (UTR-36)

<scr[U+FEFF]ipt> becomes <script>

Root Causes: Solutions for Handling the Unexpected

• Fail or error
• Use U+FFFD instead
  – A common alternative is ‘’ which can be safe
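The difference between deleting and replacing can be sketched as follows. The filter `blocklist_ok` and the two clean-up helpers are hypothetical names chosen for this example.

```python
BOM = "\ufeff"

def blocklist_ok(text: str) -> bool:
    """Hypothetical filter: pass input that does not contain "<script"."""
    return "<script" not in text.lower()

def delete_bom(text: str) -> str:
    """Risky clean-up: silently delete U+FEFF."""
    return text.replace(BOM, "")

def replace_bom(text: str) -> str:
    """Safer clean-up: substitute U+FFFD so the token stays broken."""
    return text.replace(BOM, "\ufffd")

payload = "<scr" + BOM + "ipt>"
print(blocklist_ok(payload), delete_bom(payload), replace_bom(payload))
```

Deletion reassembles `<script>` after the filter has already approved the input, while substitution with U+FFFD keeps the two halves apart.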

Attack Vectors: Filter evasion

• Bypass filters, WAFs, NIDS, and validation
• Exploit delivery techniques
  – E.g. cross-site scripting (the buffer overflow of the Web)

Case Study: Apple and Mozilla

Safari and Firefox BOM consumption
  – Attack: filter evasion, code execution
  – Exploit: bypass filtering logic with specially crafted strings to leverage cross-site scripting
  – Root cause: character deletion

<a href="java[U+FEFF]script:alert('XSS')>

Can be nastier:

<a h[U+FEFF]ref="java[U+FEFF]script:al[U+FEFF]ert('XSS')>

A Closer Look: The BOM

BOM: U+FEFF

Root Causes: Casing

• Attackers manipulate casing operations to inject otherwise prohibited characters
• Casing can multiply the buffer sizes needed
• Impact: filter evasion, enabling code execution

toLower("İ") == "i"

toLower("scrİpt") == "script"

len(x) != len(toLower(x))
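The length change is easy to observe in Python, whose `str.lower()` and `str.upper()` apply locale-independent full case mappings. In Python, lowercasing U+0130 yields `i` followed by U+0307 rather than a bare `i`, so the exact result shown on the slide presumably depends on the runtime and culture settings in use. The helper `case_lengths` is an illustrative name.

```python
def case_lengths(sample: str) -> tuple[int, int, int]:
    """Return code-point counts of the original, lowercased, and uppercased text."""
    return len(sample), len(sample.lower()), len(sample.upper())

for sample in ("\u0130", "\u0390", "\u00df"):
    original, lowered, uppered = case_lengths(sample)
    print(f"U+{ord(sample):04X}: len={original} lower={lowered} upper={uppered}")
```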

Root Causes: Guidance for Casing

• Perform casing operations before validation
• Leverage existing frameworks and APIs
  – ICU, .NET

Root Causes: Buffer Overflows

• Incorrect assumptions about string sizes (chars vs. bytes)
• Improper width calculations
• Impact: enabling code execution

Casing: maximum expansion factors

Operation    UTF          Factor   Sample
Lower        8            1.5X     Ⱥ U+023A
             16, 32       1X       A U+0041
Upper        8, 16, 32    3X       ΐ U+0390

Source: Unicode Technical Report 36

Normalization: maximum expansion factors

Operation    UTF          Factor   Sample
NFC          8            3X       U+1D160
             16, 32       3X       U+FB2C
NFD          8            3X       ΐ U+0390
             16, 32       4X       ᾂ U+1F82
NFKC/NFKD    8            11X      ﷺ U+FDFA
             16, 32       18X      ﷺ U+FDFA

Source: Unicode Technical Report 36
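Several of the factors in the two tables can be checked directly. This is a minimal sketch using the standard library; `expansion`, `nfd`, and `nfkc` are helper names introduced here, and `utf-16-le` is used so that no byte order mark is counted.

```python
import unicodedata

def expansion(sample: str, transform, encoding: str) -> float:
    """Ratio of encoded size after the transform to encoded size before it."""
    before = len(sample.encode(encoding))
    after = len(transform(sample).encode(encoding))
    return after / before

def nfd(s: str) -> str:
    return unicodedata.normalize("NFD", s)

def nfkc(s: str) -> str:
    return unicodedata.normalize("NFKC", s)

print(expansion("\u023a", str.lower, "utf-8"))     # lowercase, UTF-8
print(expansion("\u0390", str.upper, "utf-8"))     # uppercase, UTF-8
print(expansion("\u1f82", nfd, "utf-16-le"))       # NFD, UTF-16
print(expansion("\ufdfa", nfkc, "utf-8"))          # NFKC, UTF-8
print(expansion("\ufdfa", nfkc, "utf-16-le"))      # NFKC, UTF-16
```

A buffer sized from the input length alone is too small in each of these cases, which is why the size has to be computed after the transformation or padded by the worst-case factor.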

Root Causes: Guidance for Buffer Overflows

• Know the difference between bytes and chars
• Secure coding
• Leverage existing frameworks and APIs
  – ICU, .NET

Root Causes: Controlling Syntax

• White space and line breaks
  – E.g. when U+180E acts like U+0020
• Quotation marks
• Impact: filter evasion, enabling code execution

Attacks and Exploits: Controlling syntax

• Manipulate HTML parsers and JavaScript interpreters
• Control protocols

Case Study: Opera

• Unicode formatter characters exploited for XSS
  – Damage: filter evasion, controlling syntax
  – Exploit: bypass filtering logic with specially crafted characters to leverage cross-site scripting
  – Root cause: interpreting “white space”
  – A problem with the HTML 4.0 spec

<a href=[U+180E]onclick=alert()>

MVS: U+180E (Mongolian Vowel Separator)

Root Causes: Guidance for Controlling Syntax

• Question specifications
• Be careful…
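One cautious approach is to flag any non-ASCII separator or format character before markup reaches a parser. The sketch below is an assumption about how such a check might look, not a method from the talk; `SUSPECT_CATEGORIES` and `suspicious_separators` are hypothetical names. U+180E has been classified as a space separator in older Unicode versions and as a format character in newer ones, so both categories are included.

```python
import unicodedata

# Categories that can act as invisible separators: space, line, paragraph, format.
SUSPECT_CATEGORIES = {"Zs", "Zl", "Zp", "Cf"}

def suspicious_separators(text: str) -> list[str]:
    """List non-ASCII characters that a parser might treat as white space."""
    return [f"U+{ord(ch):04X}" for ch in text
            if ord(ch) > 0x7F and unicodedata.category(ch) in SUSPECT_CATEGORIES]

print(suspicious_separators("<a href=\u180eonclick=alert()>"))
```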

Root Causes: Specifications

1) Character stability
  – IDNA/Nameprep based on Unicode 3.2
2) Designs
  – Specs are carefully designed but not always perfect
  • This could have been a problem
    – “When designing a markup language or data protocol, the use of U+FEFF can be restricted to that of Byte Order Mark. In that case, any U+FEFF occurring in the middle of the file can be ignored, or treated as an error.”
  – HTML 4.01
    • Defines four whitespace characters and explicitly leaves handling other characters up to the implementer

Root Causes: Charset Transformations

• Converting between charsets is dangerous
• Mapping tables and algorithms vary across platforms
• Impact: filter evasion, enabling code execution, data loss

Root Causes: Guidance for Charset Transformations

• Avoid if possible
• Use Unicode as the broker
• Beware the PUA mappings
• Transform case and normalize prior to validation and redisplay
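"Use Unicode as the broker" can be sketched as a two-step conversion that fails loudly instead of substituting. The function name `transcode` is an assumption; the codec names are standard Python codecs.

```python
def transcode(data: bytes, source: str, target: str) -> bytes:
    """Convert between charsets via Unicode, failing on anything unmappable."""
    text = data.decode(source, errors="strict")   # legacy bytes -> Unicode
    return text.encode(target, errors="strict")   # Unicode -> target bytes

print(transcode(b"caf\xe9", "iso-8859-1", "utf-8"))
```

Converting a character the target cannot represent, such as the euro sign into ISO-8859-1, raises `UnicodeEncodeError` rather than silently picking a look-alike.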

Root Causes: Charset Mismatches

• Some charset identifiers are ill-defined
• Vendor implementations vary
• User-agents may sniff if confused
• Attackers manipulate behavior
• Impact: filter evasion, enabling code execution

Content-Type: charset=ISO-8859-1

<meta http-equiv="Content-Type" content="text/html; charset=shift_jis">

Attacker-controlled input

Root Causes: Guidance for Charset Mismatches

• Force UTF-8
• Error if uncertain
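A mismatch like the one on the slide can be detected by comparing the two declarations. This is a simplified sketch: `header_charset` relies on the standard `email.message.Message` parser, while `meta_charset` uses a rough regular expression that is an assumption of this example and not a full HTML parser.

```python
import re
from email.message import Message

def header_charset(content_type: str):
    """Extract the charset parameter from a Content-Type header value."""
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset()

def meta_charset(html: str):
    """Find a charset declared in a <meta> tag (simplified regex sketch)."""
    match = re.search(r"<meta[^>]*charset=[\"']?([\w-]+)", html, re.IGNORECASE)
    return match.group(1).lower() if match else None

header = "text/html; charset=ISO-8859-1"
page = '<meta http-equiv="Content-Type" content="text/html; charset=shift_jis">'
print(header_charset(header), meta_charset(page),
      header_charset(header) == meta_charset(page))
```

When the two values disagree, following the guidance above means rejecting the response or forcing UTF-8 rather than letting the user agent guess.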

Unicode Transformations: Agenda

• Unicode crash course
• Root Causes
• Attack Vectors
• Tools

Tools

• Watcher
  – Passive Web-app security testing and auditing
• Unibomber
  – XSS autopwn testing tool

Tools: Watcher (some of the passive checks included)

• Unicode transformation hot-spots
• User-controlled HTML
• Cross-domain issues
• Insecure cookies
• Insecure HTTP/HTTPS transitions
• SSL protocol and certificate issues
• XSS hot-spots
• Flash issues
• Silverlight issues
• Information disclosure

Tools: Watcher (Web-app security testing and auditing)

http://websecuritytool.codeplex.com

Tools: Unibomber (runtime XSS testing tool)

• Deterministic testing
• Auto-inject payloads
• Unicode transformers
  – < > ‘ “ etc.
• Detect transformations and encoding hotspots

Thank you

Casaba Security
www.casabasecurity.com

Chris Weber
Blog: www.lookout.net
Email: chris@casabasecurity.com
LinkedIn: http://www.linkedin.com/in/chrisweber



Page 46: Unraveling Unicode: A Bag of Tricks for Bug Hunting...Unicode Crash Course code points encodings categorization normalization binary properties case mapping conversion tables bi-directional

Black Hat USA - July 2009 copy 2009 Chris Weber

İ becomes I +

wwwcasabasecuritycom

Root CausesNormalization

U+0130 U+0049 U+0307

Black Hat USA - July 2009 copy 2009 Chris Weber

But are there dangerous characters

You bethellip with NFKC and NFKD you could control HTML or other parsing

﹤ becomes lt

wwwcasabasecuritycom

Root CausesNormalization

U+FE64 U+003C

Black Hat USA - July 2009 copy 2009 Chris Weber

﹤ becomes lt

toNFKC(ldquo﹤scriptgtrdquo) = ldquoltscriptgtrdquo

wwwcasabasecuritycom

Root CausesNormalization

U+FE64 U+003C

Black Hat USA - July 2009 copy 2009 Chris Weber

Normalize strings before validation

NFKC first defense against Visual spoofing

wwwcasabasecuritycom

Root CausesGuidance for Normalization

Root Causes: Non-shortest form UTF-8

Non-shortest or overlong UTF-8

Impact: Filter evasion, Enable code execution

Application gets C0 A7
OS/Framework sees 27
Database gets ' (the apostrophe, 0x27)

Root Causes: Guidance for Non-shortest form UTF-8

• Unicode specification forbids
  – Generation of non-shortest form
  – Interpretation of non-shortest form for BMP
• Validate UTF-8 encoding (throw on error)
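The "validate and throw on error" guidance can be sketched with Python's built-in strict UTF-8 decoder, which rejects the overlong sequence C0 A7 instead of mapping it to 0x27. The helper name is an assumption for this example.

```python
def decode_utf8_strict(data: bytes) -> str:
    """Decode UTF-8 and raise UnicodeDecodeError on any ill-formed input."""
    return data.decode("utf-8", errors="strict")

overlong_apostrophe = b"\xC0\xA7"  # non-shortest form of U+0027

try:
    decode_utf8_strict(overlong_apostrophe)
    rejected = False
except UnicodeDecodeError:
    rejected = True

print(rejected)                    # True: the overlong form never becomes "'"
print(decode_utf8_strict(b"'"))    # the shortest form decodes normally
```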

Attack Vectors: Directory traversal

How many ways can you say

Attack Vectors

Root Causes: Handling the Unexpected

• Unassigned code points
  – U+2073
• Illegal code points
  – Half a surrogate pair
• Code points with special meaning
  – U+FEFF is the BOM
• Impact: Filter evasion, Enable code execution

Root Causes: Handling the Unexpected: Over-consumption

Over-consuming ill-formed byte sequences

Big problem with MBCS lead bytes

<41 C2 3E 41> becomes <41 41>

Root Causes: Handling the Unexpected: Over-consumption

<img src="[0xC2]"> " onerror="alert(1)"<br />

becomes

<img src="> " onerror="alert(1)"<br />

The 0xC2 lead byte swallows the closing quote that follows it, so the attribute value runs on to the next quote and the onerror handler becomes live markup.
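The over-consumption bug can be illustrated with a deliberately broken toy decoder, compared against Python's built-in decoder. `naive_decode` is hypothetical code written for this sketch; it is not taken from any real product.

```python
def naive_decode(data: bytes) -> str:
    """Toy decoder with the bug: it trusts a two-byte lead byte and drops
    the whole pair when the sequence turns out to be ill-formed."""
    out = []
    i = 0
    while i < len(data):
        b = data[i]
        if b < 0x80:
            out.append(chr(b))
            i += 1
        elif 0xC2 <= b <= 0xDF:
            chunk = data[i:i + 2]
            try:
                out.append(chunk.decode("utf-8"))
            except UnicodeDecodeError:
                pass  # BUG: the byte after the lead byte is discarded too
            i += 2
        else:
            i += 1  # other lead bytes are out of scope for this sketch
    return "".join(out)

data = bytes([0x41, 0xC2, 0x3E, 0x41])
over_consumed = naive_decode(data)                   # the '>' (0x3E) vanishes
well_behaved = data.decode("utf-8", errors="replace")  # only 0xC2 is replaced

print(repr(over_consumed))
print(repr(well_behaved))
```

The built-in decoder replaces only the ill-formed lead byte with U+FFFD and resumes at the following byte, so the `>` survives.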

Root Causes: Handling the Unexpected: Character-substitution

Correcting insecurely rather than failing
  – Substituting a character that carries syntactic meaning would be bad

Root Causes: Handling the Unexpected: Character-deletion

"deletion of noncharacters" (UTR-36)

Root Causes: Handling the Unexpected: Character-deletion

<scr[U+FEFF]ipt> becomes <script>
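A short sketch of why deletion is dangerous when it happens after filtering. Both functions are hypothetical stand-ins: one for a filter, one for a later processing stage that silently drops U+FEFF.

```python
BOM = "\uFEFF"

def naive_filter(text: str) -> bool:
    """Hypothetical filter: accept input unless it contains '<script'."""
    return "<script" not in text.lower()

def later_stage(text: str) -> str:
    """Hypothetical downstream component that deletes U+FEFF."""
    return text.replace(BOM, "")

payload = "<scr\uFEFFipt>"
print(naive_filter(payload))  # True: the filter sees no '<script'
print(later_stage(payload))   # <script>
```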

Root Causes: Solutions for Handling the Unexpected

• Fail or error
• Use U+FFFD instead
  – A common alternative is '?', which can be safe
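Python's decoder error handlers map onto these options fairly directly. The sketch below assumes UTF-8 input and uses a stray 0xC2 lead byte as the ill-formed data.

```python
data = b"<scr\xC2ipt>"  # a stray lead byte inside otherwise ASCII text

# Option 1: fail. The strict handler raises instead of guessing.
try:
    data.decode("utf-8", errors="strict")
    failed = False
except UnicodeDecodeError:
    failed = True

# Option 2: substitute U+FFFD. The tag name stays broken.
replaced = data.decode("utf-8", errors="replace")

# Anti-pattern: deletion. The ill-formed byte disappears and the tag heals.
deleted = data.decode("utf-8", errors="ignore")

print(failed)           # True
print(repr(replaced))
print(repr(deleted))
```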

Attack Vectors: Filter evasion

• Bypass filters, WAFs, NIDS, and validation
• Exploit delivery techniques
  – E.g. cross-site scripting (the buffer overflow of the Web)

Case Study: Apple and Mozilla

Safari and Firefox BOM consumption
  – Attack: Filter evasion, code execution
  – Exploit: Bypass filtering logic with specially crafted strings to leverage cross-site scripting
  – Root Cause: Character deletion

<a href="java[U+FEFF]script:alert('XSS')>

Can be nastier:

<a h[U+FEFF]ref="java[U+FEFF]script:al[U+FEFF]ert('XSS')>

A Closer Look: The BOM

BOM: U+FEFF

Root Causes: Casing

• Attackers manipulate casing operations to inject otherwise prohibited characters
• Casing can multiply the buffer sizes needed
• Impact: Filter evasion, Enable code execution

Root Causes: Casing

toLower("İ") == "i"

toLower("scrİpt") == "script"

Root Causes: Casing

len(x) != len(toLower(x))
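The length change is easy to observe in Python, whose `str.lower()` applies the full Unicode case mapping. Note that this is one implementation's behavior: here U+0130 lowercases to "i" followed by U+0307, whereas other frameworks or locales may return a plain "i".

```python
original = "\u0130"          # LATIN CAPITAL LETTER I WITH DOT ABOVE
lowered = original.lower()   # full case mapping, no locale tailoring

print(len(original))                              # 1
print(len(lowered))                               # 2
print([f"U+{ord(ch):04X}" for ch in lowered])     # ['U+0069', 'U+0307']
```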

Root Causes: Guidance for Casing

• Perform casing operations before validation
• Leverage existing frameworks and APIs
  – ICU, .NET

Root Causes: Buffer Overflows

• Incorrect assumptions about string sizes (chars vs bytes)
• Improper width calculations
• Impact: Enable code execution

Root Causes: Buffer Overflows

Casing - maximum expansion factors

Operation  UTF        Factor  Sample
Lower      8          1.5     Ⱥ U+023A
           16, 32     1       A U+0041
Upper      8, 16, 32  3       ΐ U+0390

Source: Unicode Technical Report 36
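The UTF-8 rows of the casing table can be checked with a few lines of Python. This is a sketch using Python's own case mappings; `utf8_factor` is a helper invented for the example.

```python
def utf8_factor(before: str, after: str) -> float:
    """Ratio of UTF-8 byte lengths after and before a transformation."""
    return len(after.encode("utf-8")) / len(before.encode("utf-8"))

lower_sample = "\u023A"               # Ⱥ
upper_sample = "\u0390"               # ΐ

lowered = lower_sample.lower()        # U+2C65: 2 bytes grow to 3
uppered = upper_sample.upper()        # U+0399 U+0308 U+0301: 2 bytes grow to 6

print(utf8_factor(lower_sample, lowered))   # 1.5
print(utf8_factor(upper_sample, uppered))   # 3.0
print(len(uppered))                         # 3 code points from 1
```

A buffer sized from the input length alone would therefore be too small for the uppercased result.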

Root Causes: Buffer Overflows

Normalization - maximum expansion factors

Operation  UTF     Factor  Sample
NFC        8       3X      𝅘𝅥𝅮 U+1D160
           16, 32  3X      שּׁ U+FB2C
NFD        8       3X      ΐ U+0390
           16, 32  4X      ᾂ U+1F82
NFKC/NFKD  8       11X     ﷺ U+FDFA
           16, 32  18X     ﷺ U+FDFA

Source: Unicode Technical Report 36
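Several rows of the normalization table can be reproduced with `unicodedata`. The `expansion` helper is an assumption made for this sketch; it reports the number of resulting code points and the UTF-8 byte growth for a single input character.

```python
import unicodedata

def expansion(form: str, ch: str) -> tuple[int, float]:
    """Return (code points after normalization, UTF-8 byte growth factor)."""
    out = unicodedata.normalize(form, ch)
    return len(out), len(out.encode("utf-8")) / len(ch.encode("utf-8"))

print(expansion("NFKC", "\uFDFA"))      # (18, 11.0)
print(expansion("NFD", "\u0390"))       # (3, 3.0)
print(expansion("NFC", "\U0001D160"))   # (3, 3.0)
print(expansion("NFD", "\u1F82")[0])    # 4 code points
```

The NFC row may look surprising: U+1D160 is excluded from composition, so even the composing form leaves it decomposed into three code points.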

Root Causes: Guidance for Buffer Overflows

• Know the difference between bytes and chars
• Secure coding
• Leverage existing frameworks and APIs
  – ICU, .NET

Root Causes: Controlling Syntax

• White space and line breaks
  – E.g. when U+180E acts like U+0020
• Quotation marks
• Impact: Filter evasion, Enable code execution

Attacks and Exploits: Controlling syntax

• Manipulate HTML parsers and JavaScript interpreters
• Control protocols

Case Study: Opera

• Unicode formatter characters exploited for XSS
  – Damage: Filter evasion, controlling syntax
  – Exploit: Bypass filtering logic with specially crafted characters to leverage cross-site scripting
  – Root Cause: Interpreting "white space"
  – A problem with HTML 4.0 spec

Case Study: Opera

<a href=[U+180E]onclick=alert()>
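One defensive idea is to flag separator and format characters that a parser might treat as white space even though a filter does not. The sketch below is an assumption-laden illustration: the allow-list and the set of general categories are choices made for this example, not rules from the talk or from any specification.

```python
import unicodedata

# Hypothetical allow-list of white space the application expects to see.
ALLOWED_WHITESPACE = {" ", "\t", "\n", "\r", "\f"}

# Space separators, line/paragraph separators, format and control characters.
SUSPECT_CATEGORIES = {"Zs", "Zl", "Zp", "Cf", "Cc"}

def find_suspect_separators(text: str) -> list[str]:
    """List code points that could act as unexpected separators."""
    return [
        f"U+{ord(ch):04X}"
        for ch in text
        if ch not in ALLOWED_WHITESPACE
        and unicodedata.category(ch) in SUSPECT_CATEGORIES
    ]

payload = "<a href=\u180Eonclick=alert()>"
print(find_suspect_separators(payload))   # ['U+180E']
```

U+180E has been classified as a space separator in some Unicode versions and as a format character in others, which is why both categories are in the set.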

Case Study: Opera

MVS (Mongolian Vowel Separator): U+180E

Root Causes: Guidance for Controlling Syntax

• Question specifications
• Be careful…

Root Causes: Specifications

1) Character stability
  – IDNA/Nameprep based on Unicode 3.2
2) Designs
  – Specs are carefully designed but not always perfect
• This could have been a problem
  – "When designing a markup language or data protocol, the use of U+FEFF can be restricted to that of Byte Order Mark. In that case, any U+FEFF occurring in the middle of the file can be ignored, or treated as an error."
  – HTML 4.01
    • Defines four whitespace characters and explicitly leaves handling other characters up to implementer

Root Causes: Charset Transformations

• Converting between charsets is dangerous
• Mapping tables and algorithms vary across platforms
• Impact: Filter evasion, Enable code execution, Data-loss

Root Causes: Guidance for Charset Transformations

• Avoid if possible
• Use Unicode as the broker
• Beware the PUA mappings
• Transform case and normalize prior to validation and redisplay
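"Use Unicode as the broker" can be sketched as a two-step conversion that decodes the source bytes into a Unicode string and then encodes strictly into the target charset, failing rather than silently substituting a best-fit character. The `transcode` helper is hypothetical.

```python
def transcode(data: bytes, source: str, target: str) -> bytes:
    """Convert between charsets via a Unicode string, failing on any loss."""
    text = data.decode(source, errors="strict")
    return text.encode(target, errors="strict")

# A character that exists in both charsets converts cleanly.
latin1_e_acute = "\u00E9".encode("iso-8859-1")
print(transcode(latin1_e_acute, "iso-8859-1", "utf-8"))   # b'\xc3\xa9'

# A character with no mapping in the target raises instead of being guessed.
hiragana_a = "\u3042".encode("shift_jis")
try:
    transcode(hiragana_a, "shift_jis", "iso-8859-1")
    lossy = True
except UnicodeEncodeError:
    lossy = False
print(lossy)   # False
```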

Root Causes: Charset Mismatches

• Some charset identifiers are ill-defined
• Vendor implementations vary
• User-agents may sniff if confused
• Attackers manipulate behavior
• Impact: Filter evasion, Enable code execution

Root Causes: Charset Mismatches

Content-Type: charset=ISO-8859-1

<meta http-equiv="Content-Type" content="text/html; charset=shift_jis">

Attacker-controlled input

Root Causes: Guidance for Charset Mismatches

• Force UTF-8
• Error if uncertain
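A minimal sketch of "force UTF-8, error if uncertain" for the header and meta mismatch shown earlier. The parsing is intentionally simple and the function names are hypothetical; a production system would rely on its framework's header handling.

```python
from typing import Optional

def charset_of(content_type: str) -> Optional[str]:
    """Extract the charset parameter from a Content-Type value, if any."""
    for part in content_type.split(";")[1:]:
        name, _, value = part.strip().partition("=")
        if name.lower() == "charset":
            return value.strip('"').lower()
    return None

def require_utf8(header_value: str, meta_value: str) -> None:
    """Raise unless both declarations are present and both say UTF-8."""
    declared = {charset_of(header_value), charset_of(meta_value)}
    if declared != {"utf-8"}:
        raise ValueError(f"charset mismatch or missing: {declared}")

print(charset_of("text/html; charset=ISO-8859-1"))   # iso-8859-1

try:
    require_utf8("text/html; charset=ISO-8859-1", "text/html; charset=shift_jis")
    accepted = True
except ValueError:
    accepted = False
print(accepted)   # False
```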

Unicode Transformations: Agenda

• Unicode crash course
• Root Causes
• Attack Vectors
• Tools


Tools

• Watcher
  – Passive Web-app security testing and auditing
• Unibomber
  – XSS autopwn testing tool

Tools: Watcher – Some of the Passive Checks Included

• Unicode transformation hot-spots
• User-controlled HTML
• Cross-domain issues
• Insecure cookies
• Insecure HTTP/HTTPS transitions
• SSL protocol and certificate issues
• XSS hot-spots
• Flash issues
• Silverlight issues
• Information disclosure

Tools

Tools: Watcher - Web-app Security Testing and Auditing

http://websecuritytool.codeplex.com

Tools: Unibomber – runtime XSS testing tool

• Deterministic testing
• Auto-inject payloads
• Unicode transformers
  – < > ‘ “ etc.
• Detect transformations and encoding hotspots

Thank you

Casaba Security
www.casabasecurity.com

Chris Weber
Blog: www.lookout.net
Email: chris@casabasecurity.com
LinkedIn: http://www.linkedin.com/in/chrisweber

Page 49: Unraveling Unicode: A Bag of Tricks for Bug Hunting...Unicode Crash Course code points encodings categorization normalization binary properties case mapping conversion tables bi-directional

Black Hat USA - July 2009 copy 2009 Chris Weber

Normalize strings before validation

NFKC first defense against Visual spoofing

wwwcasabasecuritycom

Root CausesGuidance for Normalization

Black Hat USA - July 2009 copy 2009 Chris Weber

Non-shortest or overlong UTF-8

Impact Filter evasion Enable code execution

Application gets C0A7

OSFramework sees 27

Database gets

wwwcasabasecuritycom

Root CausesNon-shortest form UTF-8

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Unicode specification forbids

ndash Generation of non-shortest form

ndash Interpretation of non-shortest form for BMP

bull Validate UTF-8 encoding (throw on error)

wwwcasabasecuritycom

Root CausesGuidance for Non-shortest form UTF-8

Black Hat USA - July 2009 copy 2009 Chris Weber

How many ways can you say

wwwcasabasecuritycom

Attack VectorsDirectory traversal

Black Hat USA - July 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack Vectors

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Unassigned code points

ndash U+2073

bull Illegal code points

ndash Half a surrogate pair

bull Code points with special meaning

ndash U+FEFF is the BOM

bull Impact Filter evasion Enable code execution

wwwcasabasecuritycom

Root CausesHandling the Unexpected

Black Hat USA - July 2009 copy 2009 Chris Weber

Over-consuming ill-formed byte sequences

Big problem with MBCS lead bytes

lt41 C2 3E 41gt becomes

lt41 41gt

wwwcasabasecuritycom

Root CausesHandling the Unexpected Over-consumption

Root Causes: Handling the Unexpected: Over-consumption

<img src=[0xC2]> onerror=alert(1)<br >
becomes
<img src=> onerror=alert(1)<br >
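To make the over-consumption failure concrete, here is a deliberately flawed toy decoder next to Python's built-in codec. `over_consuming_decode` is a hypothetical illustration written for this example, not code from any real product.

```python
def over_consuming_decode(raw: bytes) -> str:
    """Deliberately flawed: a lead byte swallows the next byte unconditionally."""
    out = []
    i = 0
    while i < len(raw):
        byte = raw[i]
        if byte < 0x80:
            out.append(chr(byte))
            i += 1
        elif 0xC2 <= byte <= 0xDF:
            if i + 1 < len(raw) and (raw[i + 1] & 0xC0) == 0x80:
                out.append(chr(((byte & 0x1F) << 6) | (raw[i + 1] & 0x3F)))
            # BUG being illustrated: two bytes are skipped even when the
            # second one is not a continuation byte, so it is silently lost.
            i += 2
        else:
            i += 1  # longer sequences are out of scope for this sketch
    return "".join(out)


data = bytes([0x41, 0xC2, 0x3E, 0x41])
flawed = over_consuming_decode(data)             # the 3E (">") disappears
safer = data.decode("utf-8", errors="replace")   # C2 becomes U+FFFD, ">" survives
```

The safer behaviour replaces only the ill-formed lead byte and then resumes decoding at the very next byte, so syntax characters such as ">" or a closing quote cannot be swallowed.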

Root Causes: Handling the Unexpected: Character-substitution

Correcting insecurely rather than failing
  – Substituting a ‘’ or a ‘’ would be bad

Root Causes: Handling the Unexpected: Character-deletion

“deletion of noncharacters” (UTR-36)

Root Causes: Handling the Unexpected: Character-deletion

<scr[U+FEFF]ipt> becomes <script>

Root Causes: Solutions for Handling the Unexpected

• Fail or error
• Use U+FFFD instead
  – A common alternative is ‘’ which can be safe
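A small sketch contrasting deletion with substitution of U+FFFD. The filter and both cleanup helpers are hypothetical names invented for this illustration.

```python
BOM = "\ufeff"
REPLACEMENT = "\ufffd"


def filter_blocks(text: str) -> bool:
    """Hypothetical deny-list filter that runs before any cleanup."""
    return "<script" in text.lower()


def clean_by_deleting(text: str) -> str:
    return text.replace(BOM, "")            # unsafe: the fragments rejoin


def clean_by_replacing(text: str) -> str:
    return text.replace(BOM, REPLACEMENT)   # the fragments stay separated


payload = "<scr" + BOM + "ipt>"
passes_filter = not filter_blocks(payload)   # the filter sees no "<script"
after_delete = clean_by_deleting(payload)    # the tag is re-formed
after_replace = clean_by_replacing(payload)  # the tag stays broken
```

Deleting after validation silently changes what was validated; substituting U+FFFD (or rejecting the input outright) keeps the validated and the consumed strings equivalent for filtering purposes.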

Attack Vectors: Filter evasion

• Bypass filters, WAFs, NIDS, and validation
• Exploit delivery techniques
  – E.g. cross-site scripting (the buffer overflow of the Web)

Case Study: Apple and Mozilla

Safari and Firefox BOM consumption
  – Attack: Filter evasion, code execution
  – Exploit: Bypass filtering logic with specially crafted strings to leverage cross-site scripting
  – Root Cause: Character deletion

<a href="java[U+FEFF]script:alert('XSS')>

Can be nastier:

<a h[U+FEFF]ref="java[U+FEFF]script:al[U+FEFF]ert('XSS')>

A Closer Look: The BOM

BOM: U+FEFF

Root Causes: Casing

• Attackers manipulate casing operations to inject otherwise prohibited characters
• Casing can multiply the buffer sizes needed
• Impact: Filter evasion, enable code execution

Root Causes: Casing

toLower("İ") == "i"
toLower("scrİpt") == "script"

Root Causes: Casing

len(x) != len(toLower(x))

Root Causes: Guidance for Casing

• Perform casing operations before validation
• Leverage existing frameworks and APIs
  – ICU, .NET
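A short Python sketch of the two casing points: case mapping can change string length, and validation should see the string after casing. `validate_after_casing` and its deny-list are hypothetical. Note that results differ by platform and locale: Python applies the full Unicode mapping, so U+0130 lower-cases to "i" followed by U+0307, whereas other frameworks or locales may produce a plain "i".

```python
def validate_after_casing(user_input: str, blocked=("<script",)) -> bool:
    """Lower-case first, then validate the string that will actually be used."""
    lowered = user_input.lower()
    return not any(token in lowered for token in blocked)


dotted_i = "\u0130"                  # LATIN CAPITAL LETTER I WITH DOT ABOVE
lowered = dotted_i.lower()           # Python: "i" + U+0307 COMBINING DOT ABOVE
length_changed = len(lowered) != len(dotted_i)

sharp_s_upper = "\u00df".upper()     # LATIN SMALL LETTER SHARP S becomes "SS"
```

Because the mapping is platform dependent, the casing call used for validation should be the same one (same library, same locale) that the consuming component applies.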

Root Causes: Buffer Overflows

• Incorrect assumptions about string sizes (chars vs. bytes)
• Improper width calculations
• Impact: Enable code execution

Root Causes: Buffer Overflows

Casing: maximum expansion factors

Operation   UTF         Factor   Sample
Lower       8           1.5      Ⱥ U+023A
Lower       16, 32      1        A U+0041
Upper       8, 16, 32   3        ΐ U+0390

Source: Unicode Technical Report 36

Root Causes: Buffer Overflows

Normalization: maximum expansion factors

Operation   UTF      Factor   Sample
NFC         8        3X       U+1D160
NFC         16, 32   3X       שּׁ U+FB2C
NFD         8        3X       ΐ U+0390
NFD         16, 32   4X       ᾂ U+1F82
NFKC/NFKD   8        11X      ﷺ U+FDFA
NFKC/NFKD   16, 32   18X      ﷺ U+FDFA

Source: Unicode Technical Report 36
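The expansion factors quoted from UTR-36 can be reproduced with a few lines of Python. `expansion_factor`, `nfkc`, and `nfd` are hypothetical helper names; sizes are measured in encoded bytes, which gives the same ratio as counting code units.

```python
import unicodedata


def expansion_factor(text: str, transform, encoding: str) -> float:
    """Ratio of encoded size after the transform to encoded size before it."""
    before = len(text.encode(encoding))
    after = len(transform(text).encode(encoding))
    return after / before


def nfkc(s: str) -> str:
    return unicodedata.normalize("NFKC", s)


def nfd(s: str) -> str:
    return unicodedata.normalize("NFD", s)


# "utf-16-le" is used so that no byte order mark is added to the byte count.
nfkc_utf8 = expansion_factor("\ufdfa", nfkc, "utf-8")        # 33 bytes / 3 bytes
nfkc_utf16 = expansion_factor("\ufdfa", nfkc, "utf-16-le")   # 36 bytes / 2 bytes
nfd_utf8 = expansion_factor("\u0390", nfd, "utf-8")          # 6 bytes / 2 bytes
lower_utf8 = expansion_factor("\u023a", str.lower, "utf-8")  # 3 bytes / 2 bytes
```

When sizing an output buffer, the worst-case factor for the chosen operation and encoding is the number to allocate for, not the input length.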

Root Causes: Guidance for Buffer Overflows

• Know the difference between bytes and chars
• Secure coding
• Leverage existing frameworks and APIs
  – ICU, .NET
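One way to respect the bytes-versus-chars distinction is to budget in bytes and cut only on code point boundaries. The helper name `truncate_utf8` is an assumption made for this sketch.

```python
def truncate_utf8(text: str, max_bytes: int) -> str:
    """Cut text so its UTF-8 form fits in max_bytes without splitting a code point."""
    encoded = text.encode("utf-8")
    if len(encoded) <= max_bytes:
        return text
    cut = max(max_bytes, 0)
    # Back up while the byte at the cut position is a continuation byte (10xxxxxx).
    while cut > 0 and (encoded[cut] & 0xC0) == 0x80:
        cut -= 1
    return encoded[:cut].decode("utf-8")


sample = "\u023abc"                  # U+023A takes 2 bytes in UTF-8, 4 bytes total
char_count = len(sample)             # 3 characters
byte_count = len(sample.encode("utf-8"))
```

A limit expressed as "3 characters" and one expressed as "3 bytes" give different results for this string, which is exactly the kind of mismatch that leads to undersized buffers in lower-level code.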

Root Causes: Controlling Syntax

• White space and line breaks
  – E.g. when U+180E acts like U+0020
• Quotation marks
• Impact: Filter evasion, enable code execution

Attacks and Exploits: Controlling syntax

• Manipulate HTML parsers and JavaScript interpreters
• Control protocols

Case Study: Opera

• Unicode formatter characters exploited for XSS
  – Damage: Filter evasion, controlling syntax
  – Exploit: Bypass filtering logic with specially crafted characters to leverage cross-site scripting
  – Root Cause: Interpreting “white space”
  – A problem with the HTML 4.0 spec

Case Study: Opera

<a href=[U+180E]onclick=alert()>

Case Study: Opera

MVS (Mongolian Vowel Separator): U+180E

Root Causes: Guidance for Controlling Syntax

• Question specifications
• Be careful…
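One defensive reading of "be careful" is to flag any separator, control, or format character that is not on an explicit allow-list, since a parser downstream may treat it as white space. The allow-list, the category set, and `find_suspicious` are assumptions made for this sketch, and the `href` value `x` in the sample is a placeholder.

```python
import unicodedata

# Assumption: only these characters are accepted as white space by the filter.
ALLOWED_WHITESPACE = {" ", "\t", "\n", "\r"}
# General categories that are invisible or act as separators in some parsers.
SUSPICIOUS_CATEGORIES = {"Cc", "Cf", "Zs", "Zl", "Zp"}


def find_suspicious(text: str) -> list:
    """Return (index, 'U+XXXX') for each separator/format char not allow-listed."""
    hits = []
    for index, char in enumerate(text):
        if char in ALLOWED_WHITESPACE:
            continue
        if unicodedata.category(char) in SUSPICIOUS_CATEGORIES:
            hits.append((index, f"U+{ord(char):04X}"))
    return hits


sample = "<a href=x\u180eonclick=alert()>"
found = find_suspicious(sample)
```

Checking by general category rather than by a fixed list of code points also covers U+180E across Unicode versions, since it has been classified both as a space separator and as a format character.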

Root Causes: Specifications

1) Character stability
  – IDNA/Nameprep based on Unicode 3.2
2) Designs
  – Specs are carefully designed but not always perfect
• This could have been a problem
  – “When designing a markup language or data protocol, the use of U+FEFF can be restricted to that of Byte Order Mark. In that case, any U+FEFF occurring in the middle of the file can be ignored, or treated as an error.”
  – HTML 4.01
    • Defines four whitespace characters and explicitly leaves handling other characters up to implementer

Root Causes: Charset Transformations

• Converting between charsets is dangerous
• Mapping tables and algorithms vary across platforms
• Impact: Filter evasion, enable code execution, data loss

Root Causes: Guidance for Charset Transformations

• Avoid if possible
• Use Unicode as the broker
• Beware the PUA mappings
• Transform case and normalize prior to validation and redisplay
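"Use Unicode as the broker" can be sketched as a two-step conversion that fails rather than guessing. `transcode` is a hypothetical helper; the codec names are standard Python codecs.

```python
def transcode(raw: bytes, source: str, target: str) -> bytes:
    """Convert between charsets via Unicode, failing on anything unmappable."""
    text = raw.decode(source, errors="strict")    # legacy bytes -> Unicode
    return text.encode(target, errors="strict")   # Unicode -> target bytes


ok = transcode("caf\u00e9".encode("latin-1"), "latin-1", "utf-8")

try:
    # U+FF1C FULLWIDTH LESS-THAN SIGN has no Latin-1 mapping. A best-fit
    # table might turn it into "<"; strict mode raises instead.
    transcode("\uff1c".encode("utf-8"), "utf-8", "latin-1")
    rejected = False
except UnicodeEncodeError:
    rejected = True
```

Strict error handling avoids best-fit substitutions, which are one way a harmless-looking character turns into a syntax character after validation has already run.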

Root Causes: Charset Mismatches

• Some charset identifiers are ill-defined
• Vendor implementations vary
• User-agents may sniff if confused
• Attackers manipulate behavior
• Impact: Filter evasion, enable code execution

Root Causes: Charset Mismatches

Content-Type: charset=ISO-8859-1
<meta http-equiv="Content-Type" content="text/html; charset=shift_jis">
Attacker-controlled input

Root Causes: Guidance for Charset Mismatches

• Force UTF-8
• Error if uncertain
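A framework-independent sketch of "force UTF-8, error if uncertain". `response_headers` and `decode_request_body` are hypothetical helpers; a real application would wire the same two rules into its web framework.

```python
CONTENT_TYPE = "text/html; charset=utf-8"


def response_headers() -> dict:
    """Always declare the charset explicitly so the user agent need not sniff."""
    return {"Content-Type": CONTENT_TYPE}


def decode_request_body(raw: bytes, declared_charset: str) -> str:
    """Accept only UTF-8; raise instead of guessing when anything is off."""
    if declared_charset.strip().lower() not in ("utf-8", "utf8"):
        raise ValueError(f"unsupported charset: {declared_charset!r}")
    return raw.decode("utf-8", errors="strict")
```

Declaring one charset in the HTTP header and using the same one in any `meta` element removes the mismatch that lets attacker-controlled input steer the browser's choice.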

Unicode Transformations: Agenda

• Unicode crash course
• Root Causes
• Attack Vectors
• Tools

Tools

• Watcher
  – Passive Web-app security testing and auditing
• Unibomber
  – XSS autopwn testing tool

Tools: Watcher – Some of the Passive Checks Included

• Unicode transformation hot-spots
• User-controlled HTML
• Cross-domain issues
• Insecure cookies
• Insecure HTTP/HTTPS transitions
• SSL protocol and certificate issues
• XSS hot-spots
• Flash issues
• Silverlight issues
• Information disclosure

Tools: Watcher - Web-app Security Testing and Auditing

http://websecuritytool.codeplex.com

Tools: Unibomber – runtime XSS testing tool

• Deterministic testing
• Auto-inject payloads
• Unicode transformers
  – < > ‘ “ etc.
• Detect transformations and encoding hotspots

Thank you

Casaba Security
www.casabasecurity.com

Chris Weber
Blog: www.lookout.net
Email: chris@casabasecurity.com
LinkedIn: http://www.linkedin.com/in/chrisweber

Page 52: Unraveling Unicode: A Bag of Tricks for Bug Hunting...Unicode Crash Course code points encodings categorization normalization binary properties case mapping conversion tables bi-directional

Black Hat USA - July 2009 copy 2009 Chris Weber

How many ways can you say

wwwcasabasecuritycom

Attack VectorsDirectory traversal

Black Hat USA - July 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Attack Vectors

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Unassigned code points

ndash U+2073

bull Illegal code points

ndash Half a surrogate pair

bull Code points with special meaning

ndash U+FEFF is the BOM

bull Impact Filter evasion Enable code execution

wwwcasabasecuritycom

Root CausesHandling the Unexpected

Black Hat USA - July 2009 copy 2009 Chris Weber

Over-consuming ill-formed byte sequences

Big problem with MBCS lead bytes

lt41 C2 3E 41gt becomes

lt41 41gt

wwwcasabasecuritycom

Root CausesHandling the Unexpected Over-consumption

Black Hat USA - July 2009 copy 2009 Chris Weber

ltimg src=[0xC2]gt onerror=alert(1)ltbr gt

becomes

ltimg src=gt onerror=alert(1)ltbr gt

wwwcasabasecuritycom

Root CausesHandling the Unexpected Over-consumption

Black Hat USA - July 2009 copy 2009 Chris Weber

Correcting insecurely rather than failing

ndash Substituting a lsquorsquo or a lsquorsquo would be bad

wwwcasabasecuritycom

Root CausesHandling the Unexpected Character-substitution

Black Hat USA - July 2009 copy 2009 Chris Weber

ldquodeletion of noncharactersrdquo (UTR-36)

wwwcasabasecuritycom

Root CausesHandling the Unexpected Character-deletion

Black Hat USA - July 2009 copy 2009 Chris Weber

ltscr[U+FEFF]iptgt becomes ltscriptgt

wwwcasabasecuritycom

Root CausesHandling the Unexpected Character-deletion

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Fail or error

bull Use U+FFFD instead

ndash A common alternative is lsquorsquo which can be safe

wwwcasabasecuritycom

Root CausesSolutions for Handling the Unexpected

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Bypass filters WAFrsquos NIDS and validation

bull Exploit delivery techniques

ndash Eg Cross-site scripting (buffer overflow of the Web)

wwwcasabasecuritycom

Attack VectorsFilter evasion

Black Hat USA - July 2009 copy 2009 Chris Weber

Safari and Firefox BOM consumptionndash Attack Filter evasion code execution

ndash Exploit Bypass filtering logic with specially crafted strings to leverage cross-site scripting

ndash Root Cause Character deletion

lta href=ldquojava[U+FEFF]scriptalert(bdquoXSS‟)gt

Can be nastier

lta h[U+FEFF]ref=ldquojava[U+FEFF]scriptal[U+FEFF]ert(bdquoXSS‟)gt

wwwcasabasecuritycom

Case Study Apple and Mozilla

Black Hat USA - July 2009 copy 2009 Chris Weberwwwcasabasecuritycom

A Closer Look The BOM

BOMU+FEFF

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Attackers manipulate casing operations to inject otherwise prohibited characters

bull Casing can multiply the buffer sizes needed

bull Impact Filter evasion Enable code execution

wwwcasabasecuritycom

Root CausesCasing

Black Hat USA - July 2009 copy 2009 Chris Weber

toLower(ldquoİrdquo) == ldquoirdquo

toLower(ldquoscrİptrdquo) == ldquoscriptrdquo

wwwcasabasecuritycom

Root CausesCasing

Black Hat USA - July 2009 copy 2009 Chris Weber

len(x) = len(toLower(x))

wwwcasabasecuritycom

Root CausesCasing

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Perform casing operations before validation

bull Leverage existing frameworks and APIrsquos

ndash ICU Net

wwwcasabasecuritycom

Root CausesGuidance for Casing

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Incorrect assumptions about string sizes (chars vs bytes)

bull Improper width calculations

bull Impact Enable code execution

wwwcasabasecuritycom

Root CausesBuffer Overflows

Black Hat USA - July 2009 copy 2009 Chris Weber

Casing - maximum expansion factors

wwwcasabasecuritycom

Root CausesBuffer Overflows

Operation UTF Factor Sample

Lower 8 15 Ⱥ U+023A

16 32 1 A U+0041

Upper 8 16 32 3 ΐ U+0390Source Unicode Technical Report 36

Root Causes: Buffer Overflows

Normalization - maximum expansion factors

Operation  UTF     Factor  Sample
NFC        8       3X      𝅘𝅥𝅮 U+1D160
           16, 32  3X      שּׁ U+FB2C
NFD        8       3X      ΐ U+0390
           16, 32  4X      ᾂ U+1F82
NFKC/NFKD  8       11X     ﷺ U+FDFA
           16, 32  18X     ﷺ U+FDFA

Source: Unicode Technical Report 36
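The factors in the two expansion tables can be checked with Python's standard `unicodedata` module and built-in casing. This is a sketch rather than a reference implementation; the helper names `code_units`, `expansion`, and `normalization_expansion` are invented here, and the results depend on the Unicode data shipped with the interpreter.

```python
import unicodedata

CODECS = {8: "utf-8", 16: "utf-16-le", 32: "utf-32-le"}


def code_units(text: str, utf: int) -> int:
    """Number of code units `text` occupies in UTF-8, UTF-16 or UTF-32."""
    return len(text.encode(CODECS[utf])) // (utf // 8)


def expansion(before: str, after: str, utf: int) -> float:
    """Growth factor, in code units, from `before` to `after`."""
    return code_units(after, utf) / code_units(before, utf)


def normalization_expansion(sample: str, form: str, utf: int) -> float:
    """Growth factor when `sample` is normalized to `form` (NFC, NFD, NFKC, NFKD)."""
    return expansion(sample, unicodedata.normalize(form, sample), utf)


if __name__ == "__main__":
    print(normalization_expansion("\ufdfa", "NFKC", 8))
    print(normalization_expansion("\ufdfa", "NFKC", 16))
    print(expansion("\u023a", "\u023a".lower(), 8))
    print(expansion("\u0390", "\u0390".upper(), 16))
```

Sizing an output buffer as input length times the worst-case factor for the chosen operation and encoding form is the conservative reading of these tables.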

Root Causes: Guidance for Buffer Overflows

• Know the difference between bytes and chars
• Secure coding
• Leverage existing frameworks and APIs
– ICU, .NET

Root Causes: Controlling Syntax

• White space and line breaks
– E.g. when U+180E acts like U+0020
• Quotation marks
• Impact: Filter evasion, enable code execution

Attacks and Exploits: Controlling syntax

• Manipulate HTML parsers and JavaScript interpreters
• Control protocols

Case Study: Opera

• Unicode formatter characters exploited for XSS
– Damage: Filter evasion, controlling syntax
– Exploit: Bypass filtering logic with specially crafted characters to leverage cross-site scripting
– Root cause: Interpreting “white space”
– A problem with HTML 4.0 spec

Case Study: Opera

<a href=[U+180E]onclick=alert()>
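The disagreement about what counts as white space can be modelled with two small functions. Both are hypothetical: `filter_allows` stands in for a filter that only knows ASCII white space, and `lenient_attribute_names` for a tokenizer that also splits on U+180E. Neither reproduces Opera's actual parser.

```python
import re

MVS = "\u180e"  # MONGOLIAN VOWEL SEPARATOR

# Hypothetical filter: an event handler must be preceded by ASCII white space.
HANDLER = re.compile(r"[ \t\r\n\f]on\w+\s*=", re.IGNORECASE)


def filter_allows(tag: str) -> bool:
    """True when the filter finds no event-handler attribute in the tag."""
    return HANDLER.search(tag) is None


def lenient_attribute_names(tag: str) -> list:
    """Hypothetical tokenizer that treats U+180E as an attribute separator too."""
    body = tag.strip("<>")
    parts = re.split("[ \t\r\n\f" + MVS + "]+", body)
    # parts[0] is the element name; the rest are attribute tokens
    return [part.split("=", 1)[0] for part in parts[1:] if part]


payload = "<a href=" + MVS + "onclick=alert()>"

if __name__ == "__main__":
    print(filter_allows(payload))
    print(lenient_attribute_names(payload))
```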

Case Study: Opera

MVS: U+180E

Root Causes: Guidance for Controlling Syntax

• Question specifications
• Be careful…

Root Causes: Specifications

1) Character stability
– IDNA/Nameprep based on Unicode 3.2
2) Designs
– Specs are carefully designed but not always perfect
• This could have been a problem
– “When designing a markup language or data protocol, the use of U+FEFF can be restricted to that of Byte Order Mark. In that case, any U+FEFF occurring in the middle of the file can be ignored, or treated as an error.”
– HTML 4.01
• Defines four whitespace characters and explicitly leaves handling other characters up to implementer

Root Causes: Charset Transformations

• Converting between charsets is dangerous
• Mapping tables and algorithms vary across platforms
• Impact: Filter evasion, enable code execution, data loss

Root Causes: Guidance for Charset Transformations

• Avoid if possible
• Use Unicode as the broker
• Beware the PUA mappings
• Transform case and normalize prior to validation and redisplay
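"Use Unicode as the broker" can be read as: never map one legacy charset directly onto another, always decode to Unicode first and fail on anything unmappable. A minimal sketch with Python's built-in codecs follows; the function names `transcode` and `contains_private_use` are assumptions made for this example.

```python
import unicodedata


def transcode(data: bytes, source: str, target: str) -> bytes:
    """Convert between charsets with Unicode (str) as the intermediate form.

    Strict error handling raises instead of substituting a best-fit character.
    """
    text = data.decode(source, errors="strict")
    return text.encode(target, errors="strict")


def contains_private_use(text: str) -> bool:
    """Flag Private Use Area code points (general category Co)."""
    return any(unicodedata.category(ch) == "Co" for ch in text)
```

A caller might run `contains_private_use` on the decoded text and reject or log the input, since PUA mappings differ between vendors.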

Root Causes: Charset Mismatches

• Some charset identifiers are ill-defined
• Vendor implementations vary
• User-agents may sniff if confused
• Attackers manipulate behavior
• Impact: Filter evasion, enable code execution

Root Causes: Charset Mismatches

Content-Type: charset=ISO-8859-1

<meta http-equiv="Content-Type" content="text/html; charset=shift_jis">

Attacker-controlled input

Root Causes: Guidance for Charset Mismatches

• Force UTF-8
• Error if uncertain
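Both bullets fit in a few lines: declare UTF-8 explicitly on output and refuse input that is not well-formed UTF-8 instead of guessing. The names `decode_request_body` and `content_type_header` are hypothetical; a real application would hook this into its own framework.

```python
def decode_request_body(raw: bytes) -> str:
    """Accept only well-formed UTF-8; raise rather than sniff another charset."""
    try:
        return raw.decode("utf-8", errors="strict")
    except UnicodeDecodeError as exc:
        raise ValueError("input is not valid UTF-8") from exc


def content_type_header() -> str:
    """Always state the charset so the user agent has nothing to sniff."""
    return "Content-Type: text/html; charset=utf-8"
```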

Unicode Transformations: Agenda

• Unicode crash course
• Root Causes
• Attack Vectors
• Tools


Tools

• Watcher
– Passive Web-app security testing and auditing
• Unibomber
– XSS autopwn testing tool

Tools: Watcher – Some of the Passive Checks Included

• Unicode transformation hot-spots
• User-controlled HTML
• Cross-domain issues
• Insecure cookies
• Insecure HTTP/HTTPS transitions
• SSL protocol and certificate issues
• XSS hot-spots
• Flash issues
• Silverlight issues
• Information disclosure


Tools: Watcher - Web-app Security Testing and Auditing

http://websecuritytool.codeplex.com

Tools: Unibomber – runtime XSS testing tool

• Deterministic testing
• Auto-inject payloads
• Unicode transformers
– < > ‘ “ etc.
• Detect transformations and encoding hotspots

Thank you

Casaba Security
www.casabasecurity.com

Chris Weber
Blog: www.lookout.net
Email: chris@casabasecurity.com
LinkedIn: http://www.linkedin.com/in/chrisweber


Attack Vectors

Root Causes: Handling the Unexpected

• Unassigned code points
– U+2073
• Illegal code points
– Half a surrogate pair
• Code points with special meaning
– U+FEFF is the BOM
• Impact: Filter evasion, enable code execution
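The three kinds of unexpected input listed above can be told apart with the standard `unicodedata` module. The sketch below is illustrative only; `classify` and `can_encode_utf8` are names invented here, and the "unassigned" verdict depends on the Unicode version bundled with the interpreter.

```python
import unicodedata


def classify(ch: str) -> str:
    """Rough classification of a single code point for input handling."""
    category = unicodedata.category(ch)
    if category == "Cn":
        return "unassigned"       # category Cn also covers noncharacters
    if category == "Cs":
        return "surrogate"        # half of a surrogate pair on its own
    if ch == "\ufeff":
        return "special (BOM)"
    return "ordinary"


def can_encode_utf8(text: str) -> bool:
    """A lone surrogate cannot be encoded as well-formed UTF-8."""
    try:
        text.encode("utf-8")
        return True
    except UnicodeEncodeError:
        return False
```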

Root Causes: Handling the Unexpected: Over-consumption

Over-consuming ill-formed byte sequences

Big problem with MBCS lead bytes

<41 C2 3E 41> becomes <41 41>

Root Causes: Handling the Unexpected: Over-consumption

<img src=[0xC2]> onerror=alert(1)<br />

becomes

<img src=> onerror=alert(1)<br />
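Over-consumption is easiest to see next to a decoder that does not do it. The function `over_consuming_decode` below is a deliberately flawed toy (it handles only 1- and 2-byte sequences and is not taken from any real product); Python's built-in UTF-8 decoder is shown for contrast.

```python
def over_consuming_decode(data: bytes) -> str:
    """Toy decoder with an intentional bug: a 2-byte lead byte always swallows
    the following byte, even when that byte is not a continuation byte."""
    out, i = [], 0
    while i < len(data):
        lead = data[i]
        if 0xC2 <= lead <= 0xDF and i + 1 < len(data):
            trail = data[i + 1]
            if 0x80 <= trail <= 0xBF:
                out.append(chr(((lead & 0x1F) << 6) | (trail & 0x3F)))
            # BUG (intentional): an invalid trail byte is dropped with the lead
            i += 2
        else:
            out.append(chr(lead) if lead < 0x80 else "\ufffd")
            i += 1
    return "".join(out)


ILL_FORMED = bytes([0x41, 0xC2, 0x3E, 0x41])

if __name__ == "__main__":
    print(ascii(over_consuming_decode(ILL_FORMED)))
    print(ascii(ILL_FORMED.decode("utf-8", errors="replace")))
```

The built-in decoder replaces only the stray lead byte and keeps the `>` (0x3E), so markup delimiters survive for later filtering.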

Root Causes: Handling the Unexpected: Character-substitution

Correcting insecurely rather than failing
– Substituting a ‘’ or a ‘’ would be bad

Root Causes: Handling the Unexpected: Character-deletion

“deletion of noncharacters” (UTR-36)

Root Causes: Handling the Unexpected: Character-deletion

<scr[U+FEFF]ipt> becomes <script>

Root Causes: Solutions for Handling the Unexpected

• Fail or error
• Use U+FFFD instead
– A common alternative is ‘’ which can be safe
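The first two options can be sketched as follows. The set `UNEXPECTED` and both function names are assumptions for illustration; which characters an application treats as unexpected is its own policy decision.

```python
REPLACEMENT = "\ufffd"   # U+FFFD REPLACEMENT CHARACTER
UNEXPECTED = {"\ufeff"}  # hypothetical policy: no BOM inside a string


def reject_unexpected(text: str) -> str:
    """Option 1: fail outright when an unexpected character is present."""
    if any(ch in UNEXPECTED for ch in text):
        raise ValueError("unexpected character in input")
    return text


def replace_unexpected(text: str) -> str:
    """Option 2: substitute U+FFFD instead of deleting the character."""
    return "".join(REPLACEMENT if ch in UNEXPECTED else ch for ch in text)
```

Because the character is replaced rather than removed, `<scr[U+FEFF]ipt>` never collapses into `<script>`.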

Attack Vectors: Filter evasion

• Bypass filters, WAFs, NIDS, and validation
• Exploit delivery techniques
– E.g. Cross-site scripting (buffer overflow of the Web)

Page 55: Unraveling Unicode: A Bag of Tricks for Bug Hunting...Unicode Crash Course code points encodings categorization normalization binary properties case mapping conversion tables bi-directional

Black Hat USA - July 2009 copy 2009 Chris Weber

Over-consuming ill-formed byte sequences

Big problem with MBCS lead bytes

lt41 C2 3E 41gt becomes

lt41 41gt

wwwcasabasecuritycom

Root CausesHandling the Unexpected Over-consumption

Black Hat USA - July 2009 copy 2009 Chris Weber

ltimg src=[0xC2]gt onerror=alert(1)ltbr gt

becomes

ltimg src=gt onerror=alert(1)ltbr gt

wwwcasabasecuritycom

Root CausesHandling the Unexpected Over-consumption

Black Hat USA - July 2009 copy 2009 Chris Weber

Correcting insecurely rather than failing

ndash Substituting a lsquorsquo or a lsquorsquo would be bad

wwwcasabasecuritycom

Root CausesHandling the Unexpected Character-substitution

Black Hat USA - July 2009 copy 2009 Chris Weber

ldquodeletion of noncharactersrdquo (UTR-36)

wwwcasabasecuritycom

Root CausesHandling the Unexpected Character-deletion

Black Hat USA - July 2009 copy 2009 Chris Weber

ltscr[U+FEFF]iptgt becomes ltscriptgt

wwwcasabasecuritycom

Root CausesHandling the Unexpected Character-deletion

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Fail or error

bull Use U+FFFD instead

ndash A common alternative is lsquorsquo which can be safe

wwwcasabasecuritycom

Root CausesSolutions for Handling the Unexpected

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Bypass filters WAFrsquos NIDS and validation

bull Exploit delivery techniques

ndash Eg Cross-site scripting (buffer overflow of the Web)

wwwcasabasecuritycom

Attack VectorsFilter evasion

Black Hat USA - July 2009 copy 2009 Chris Weber

Safari and Firefox BOM consumptionndash Attack Filter evasion code execution

ndash Exploit Bypass filtering logic with specially crafted strings to leverage cross-site scripting

ndash Root Cause Character deletion

lta href=ldquojava[U+FEFF]scriptalert(bdquoXSS‟)gt

Can be nastier

lta h[U+FEFF]ref=ldquojava[U+FEFF]scriptal[U+FEFF]ert(bdquoXSS‟)gt

wwwcasabasecuritycom

Case Study Apple and Mozilla

Black Hat USA - July 2009 copy 2009 Chris Weberwwwcasabasecuritycom

A Closer Look The BOM

BOMU+FEFF

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Attackers manipulate casing operations to inject otherwise prohibited characters

bull Casing can multiply the buffer sizes needed

bull Impact Filter evasion Enable code execution

wwwcasabasecuritycom

Root CausesCasing

Black Hat USA - July 2009 copy 2009 Chris Weber

toLower(ldquoİrdquo) == ldquoirdquo

toLower(ldquoscrİptrdquo) == ldquoscriptrdquo

wwwcasabasecuritycom

Root CausesCasing

Black Hat USA - July 2009 copy 2009 Chris Weber

len(x) = len(toLower(x))

wwwcasabasecuritycom

Root CausesCasing

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Perform casing operations before validation

bull Leverage existing frameworks and APIrsquos

ndash ICU Net

wwwcasabasecuritycom

Root CausesGuidance for Casing

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Incorrect assumptions about string sizes (chars vs bytes)

bull Improper width calculations

bull Impact Enable code execution

wwwcasabasecuritycom

Root CausesBuffer Overflows

Black Hat USA - July 2009 copy 2009 Chris Weber

Casing - maximum expansion factors

wwwcasabasecuritycom

Root CausesBuffer Overflows

Operation UTF Factor Sample

Lower 8 15 Ⱥ U+023A

16 32 1 A U+0041

Upper 8 16 32 3 ΐ U+0390Source Unicode Technical Report 36

Black Hat USA - July 2009 copy 2009 Chris Weber

Normalization- maximum expansion factors

wwwcasabasecuritycom

Root CausesBuffer Overflows

Operation UTF Factor Sample

NFC8 3X 119136 U+1D160

16 32 3X ש U+FB2C

NFD8 3X ΐ U+0390

16 32 4X ᾂ U+1F82

NFKCNFKD8 11X

ملسو هيلع هللا ىلص U+FDFA16 32 18X

Source Unicode Technical Report 36

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Know the difference between bytes and chars

bull Secure coding

bull Leverage existing frameworks and APIrsquos

ndash ICU Net

wwwcasabasecuritycom

Root CausesGuidance for Buffer Overflows

Black Hat USA - July 2009 copy 2009 Chris Weber

bull White space and line breaks

ndash Eg when U+180E acts like U+0020

bull Quotation marks

bull Impact Filter evasion Enable code execution

wwwcasabasecuritycom

Root CausesControlling Syntax

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Manipulate HTML parsers and javascriptinterpreters

bull Control protocols

wwwcasabasecuritycom

Attacks and ExploitsControlling syntax

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Unicode formatter characters exploited for XSS

ndash Damage Filter evasion controlling syntax

ndash Exploit Bypass filtering logic with specially crafted characters to leverage cross-site scripting

ndash Root Cause Interpreting ldquowhite spacerdquo

ndash A problem with HTML 40 spec

wwwcasabasecuritycom

Case Study Opera

Black Hat USA - July 2009 copy 2009 Chris Weber

lta href=[U+180E]onclick=alert()gt

wwwcasabasecuritycom

Case Study Opera

Black Hat USA - July 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Case Study Opera

MVSU+180E

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Question specifications

bull Be carefulhellip

wwwcasabasecuritycom

Root CausesGuidance for Controlling Syntax

Black Hat USA - July 2009 copy 2009 Chris Weber

1) Character stabilityndash IDNANameprep based on Unicode 32

2) Designsndash Specs are carefully designed but not always perfect

bull This could have been a problemndash ldquoWhen designing a markup language or data protocol the use of

U+FEFF can be restricted to that of Byte Order Mark In that case any U+FEFF occurring in the middle of the file can be ignored or treated as an error rdquo

ndash HTML 401 bull Defines four whitespace characters and explicitly leaves

handling other characters up to implementer

wwwcasabasecuritycom

Root CausesSpecifications

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Converting between charsets is dangerous

bull Mapping tables and algorithms vary across platforms

bull Impact Filter evasion Enable code execution Data-loss

wwwcasabasecuritycom

Root CausesCharset Transformations

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Avoid if possible

bull Use Unicode as the broker

bull Beware the PUA mappings

bull Transform case and normalize prior to validation and redisplay

wwwcasabasecuritycom

Root CausesGuidance for Charset Transformations

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Some charset identifiers are ill-defined

bull Vendor implementations vary

bull User-agents may sniff if confused

bull Attackers manipulate behavior

bull Impact Filter evasion Enable code execution

wwwcasabasecuritycom

Root CausesCharset Mismatches

Black Hat USA - July 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Root CausesCharset Mismatches

Content-Type charset=ISO-8859-1

ltmeta http-equiv=Content-Type content=texthtml charset=shift_jisgt

Attacker-controlled input

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Force UTF-8

bull Error if uncertain

wwwcasabasecuritycom

Root CausesGuidance for Charset Mismatches

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Unicode crash course

bull Root Causes

bull Attack Vectors

bull Tools

wwwcasabasecuritycom

Unicode TransformationsAgenda

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Unicode crash course

bull Root Causes

bull Attack Vectors

bull Tools

wwwcasabasecuritycom

Unicode TransformationsAgenda

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Watcher

ndash Passive Web-app security testing and auditing

bull Unibomber

ndash XSS autopwn testing tool

wwwcasabasecuritycom

Tools

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Unicode transformation hot-spotsbull User-controlled HTMLbull Cross-domain issuesbull Insecure cookiesbull Insecure HTTPHTTPS transitionsbull SSL protocol and certificate issuesbull XSS hot-spotsbull Flash issuesbull Silverlight issuesbull Information disclosure

wwwcasabasecuritycom

ToolsWatcher ndash Some of the Passive Checks Included

Black Hat USA - July 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Tools

Black Hat USA - July 2009 copy 2009 Chris Weber

httpwebsecuritytoolcodeplexcom

wwwcasabasecuritycom

ToolsWatcher - Web-app Security Testing and Auditing

Tools: Unibomber – runtime XSS testing tool

• Deterministic testing
• Auto-inject payloads
• Unicode transformers
  – < > ‘ “ etc.
• Detect transformations and encoding hotspots

Thank you

Casaba Security
www.casabasecurity.com

Chris Weber
Blog: www.lookout.net
Email: chris@casabasecurity.com
LinkedIn: http://www.linkedin.com/in/chrisweber


Root Causes: Handling the Unexpected – Over-consumption

<img src="[0xC2]"> " onerror="alert(1)"<br />

becomes

<img src="> " onerror="alert(1)"<br />

The stray UTF-8 lead byte 0xC2 swallows the quotation mark that follows it, so the src value runs on to the next quote and onerror is parsed as an attribute.

Root Causes: Handling the Unexpected – Character-substitution

Correcting insecurely rather than failing
  – Substituting a character that carries syntactic meaning, such as a path or tag delimiter, would be bad

Root Causes: Handling the Unexpected – Character-deletion

“Deletion of noncharacters” (UTR-36)

<scr[U+FEFF]ipt> becomes <script>
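The deletion problem can be sketched in a few lines. Both functions below are hypothetical stand-ins: one for a blacklist filter that looks for a literal tag name, the other for a downstream component that silently drops U+FEFF.

```python
BOM = "\ufeff"  # ZERO WIDTH NO-BREAK SPACE / byte order mark


def naive_filter_allows(markup: str) -> bool:
    """Hypothetical blacklist filter: rejects only the literal substring '<script'."""
    return "<script" not in markup.lower()


def lossy_consumer(markup: str) -> str:
    """Stand-in for a component that deletes U+FEFF instead of failing."""
    return markup.replace(BOM, "")


payload = "<scr" + BOM + "ipt>"
passed_filter = naive_filter_allows(payload)   # the filter sees no '<script'
delivered = lossy_consumer(payload)            # the consumer reassembles the tag
```

The filter and the consumer disagree about what the string is, and the attacker only needs the second opinion.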

Root Causes: Solutions for Handling the Unexpected

• Fail or error
• Use U+FFFD instead
  – A common alternative is ‘?’, which can be safe
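Both options on this slide map onto standard decoder error handlers. The sketch below uses Python's built-in `strict` and `replace` handlers; the sample bytes are an assumption chosen to mirror an ill-formed lead byte followed by a quotation mark.

```python
ILL_FORMED = b'value="\xc2" next'  # 0xC2 is a lead byte with no continuation byte


def decode_or_fail(raw: bytes) -> str:
    """Option 1: fail or error on ill-formed input."""
    return raw.decode("utf-8", errors="strict")


def decode_with_replacement(raw: bytes) -> str:
    """Option 2: substitute U+FFFD for the ill-formed byte, keeping what follows."""
    return raw.decode("utf-8", errors="replace")


replaced = decode_with_replacement(ILL_FORMED)
```

With replacement, the quotation mark after the bad byte survives, so the decoder neither over-consumes nor deletes.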

Attack Vectors: Filter evasion

• Bypass filters, WAFs, NIDS, and validation
• Exploit delivery techniques
  – E.g. cross-site scripting (the buffer overflow of the Web)

Case Study: Apple and Mozilla

Safari and Firefox BOM consumption
  – Attack: filter evasion, code execution
  – Exploit: bypass filtering logic with specially crafted strings to leverage cross-site scripting
  – Root cause: character deletion

<a href="java[U+FEFF]script:alert('XSS')">

Can be nastier:

<a h[U+FEFF]ref="java[U+FEFF]script:al[U+FEFF]ert('XSS')">

A Closer Look: The BOM

BOM: U+FEFF

Root Causes: Casing

• Attackers manipulate casing operations to inject otherwise prohibited characters
• Casing can multiply the buffer sizes needed
• Impact: Filter evasion, Enable code execution

Root Causes: Casing

toLower("İ") == "i"
toLower("scrİpt") == "script"

len(x) != len(toLower(x))
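The length inequality is easy to observe. The sketch below uses Python's `str.lower()` and `str.upper()`, which apply full Unicode case mappings, so "İ" lowercases to "i" plus U+0307 rather than the plain "i" that other implementations return. The helper name is hypothetical.

```python
def case_lengths(s: str):
    """Return (original, lowercased, uppercased) lengths in code points."""
    return (len(s), len(s.lower()), len(s.upper()))


dotted_capital_i = case_lengths("\u0130")  # İ lowercases to 'i' + U+0307
sharp_s = case_lengths("\u00df")           # ß uppercases to 'SS'
iota_dialytika = case_lengths("\u0390")    # ΐ uppercases to three code points
```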

Root Causes: Guidance for Casing

• Perform casing operations before validation
• Leverage existing frameworks and APIs
  – ICU, .NET
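Why the order matters can be shown with a toy pipeline. `simple_lower` is a hypothetical stand-in for a platform toLower() that maps U+0130 to a plain "i"; the filter is an assumed blacklist check, not any particular product.

```python
def simple_lower(s: str) -> str:
    """Hypothetical toLower() that maps U+0130 (İ) to plain 'i'."""
    return s.replace("\u0130", "i").lower()


def contains_script_tag(s: str) -> bool:
    return "<script" in s


def validate_then_lower(s: str) -> str:
    """Unsafe order: check first, transform afterwards."""
    if contains_script_tag(s):
        raise ValueError("rejected")
    return simple_lower(s)


def lower_then_validate(s: str) -> str:
    """Safer order: transform first, then check the string that will be used."""
    lowered = simple_lower(s)
    if contains_script_tag(lowered):
        raise ValueError("rejected")
    return lowered


payload = "<scr\u0130pt>"
slipped_through = validate_then_lower(payload)  # the tag forms after the check
```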

Root Causes: Buffer Overflows

• Incorrect assumptions about string sizes (chars vs. bytes)
• Improper width calculations
• Impact: Enable code execution

Root Causes: Buffer Overflows

Casing - maximum expansion factors

Operation  UTF        Factor  Sample
Lower      8          1.5X    Ⱥ U+023A
           16, 32     1X      A U+0041
Upper      8, 16, 32  3X      ΐ U+0390

Source: Unicode Technical Report 36
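The casing factors in the table can be reproduced by comparing encoded sizes before and after the operation. This is a small sketch with a hypothetical helper; it uses the little-endian codec names so that no byte order mark is counted.

```python
def case_expansion(ch: str, op: str, encoding: str) -> float:
    """Ratio of encoded size after str.lower()/str.upper() to size before."""
    before = len(ch.encode(encoding))
    after = len(getattr(ch, op)().encode(encoding))
    return after / before


lower_utf8 = case_expansion("\u023a", "lower", "utf-8")       # Ⱥ: 2 bytes to 3
lower_utf16 = case_expansion("A", "lower", "utf-16-le")       # A: unchanged
upper_utf8 = case_expansion("\u0390", "upper", "utf-8")       # ΐ: 2 bytes to 6
upper_utf32 = case_expansion("\u0390", "upper", "utf-32-le")  # ΐ: 4 bytes to 12
```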

Root Causes: Buffer Overflows

Normalization - maximum expansion factors

Operation  UTF     Factor  Sample
NFC        8       3X      U+1D160
           16, 32  3X      U+FB2C
NFD        8       3X      ΐ U+0390
           16, 32  4X      ᾂ U+1F82
NFKC/NFKD  8       11X     ﷺ U+FDFA
           16, 32  18X     ﷺ U+FDFA

Source: Unicode Technical Report 36
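The normalization factors can be checked the same way with the standard `unicodedata` module. The helper name is hypothetical; sizes are measured in encoded bytes, which gives the same ratio as code units for a fixed encoding.

```python
import unicodedata


def normalization_expansion(ch: str, form: str, encoding: str) -> float:
    """Ratio of encoded size after normalization to encoded size before."""
    before = len(ch.encode(encoding))
    after = len(unicodedata.normalize(form, ch).encode(encoding))
    return after / before


nfc_utf8 = normalization_expansion("\U0001d160", "NFC", "utf-8")
nfc_utf16 = normalization_expansion("\ufb2c", "NFC", "utf-16-le")
nfd_utf8 = normalization_expansion("\u0390", "NFD", "utf-8")
nfd_utf16 = normalization_expansion("\u1f82", "NFD", "utf-16-le")
nfkc_utf8 = normalization_expansion("\ufdfa", "NFKC", "utf-8")
nfkd_utf16 = normalization_expansion("\ufdfa", "NFKD", "utf-16-le")
```

A buffer sized from the input length alone is too small by exactly these factors in the worst case.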

Root Causes: Guidance for Buffer Overflows

• Know the difference between bytes and chars
• Secure coding
• Leverage existing frameworks and APIs
  – ICU, .NET

Root Causes: Controlling Syntax

• White space and line breaks
  – E.g. when U+180E acts like U+0020
• Quotation marks
• Impact: Filter evasion, Enable code execution

Attacks and Exploits: Controlling syntax

• Manipulate HTML parsers and JavaScript interpreters
• Control protocols

Case Study: Opera

• Unicode formatter characters exploited for XSS
  – Damage: filter evasion, controlling syntax
  – Exploit: bypass filtering logic with specially crafted characters to leverage cross-site scripting
  – Root cause: interpreting “white space”
  – A problem with HTML 4.0 spec

Case Study: Opera

<a href=[U+180E]onclick=alert()>

MVS: U+180E
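A sketch of the disagreement behind this case study: a filter that recognizes only ASCII white space before an event handler, and a lenient parser that also treats U+180E as an attribute separator. Both are hypothetical stand-ins, not Opera's actual parser or any specific filter.

```python
import re

MVS = "\u180e"  # MONGOLIAN VOWEL SEPARATOR

# Hypothetical filter: an event-handler attribute preceded by ASCII white space.
EVENT_HANDLER = re.compile(r"[ \t\r\n\f]on\w+\s*=", re.IGNORECASE)


def filter_allows(markup: str) -> bool:
    return EVENT_HANDLER.search(markup) is None


def lenient_attribute_names(tag: str):
    """Stand-in for a parser that also splits attributes on U+180E."""
    inner = tag.strip("<>")
    parts = re.split(r"[ \t\r\n\f\u180e]+", inner)
    return [p.split("=", 1)[0].lower() for p in parts[1:] if p]


payload = "<a href=" + MVS + "onclick=alert()>"
allowed = filter_allows(payload)
parsed = lenient_attribute_names(payload)
```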

Root Causes: Guidance for Controlling Syntax

• Question specifications
• Be careful…

Root Causes: Specifications

1) Character stability
  – IDNA/Nameprep based on Unicode 3.2
2) Designs
  – Specs are carefully designed but not always perfect
    • This could have been a problem
      – “When designing a markup language or data protocol, the use of U+FEFF can be restricted to that of Byte Order Mark. In that case, any U+FEFF occurring in the middle of the file can be ignored, or treated as an error.”
  – HTML 4.01
    • Defines four whitespace characters and explicitly leaves handling other characters up to implementer

Root Causes: Charset Transformations

• Converting between charsets is dangerous
• Mapping tables and algorithms vary across platforms
• Impact: Filter evasion, Enable code execution, Data loss

Root Causes: Guidance for Charset Transformations

• Avoid if possible
• Use Unicode as the broker
• Beware the PUA mappings
• Transform case and normalize prior to validation and redisplay
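"Use Unicode as the broker" can be sketched as a two-step conversion that always passes through a Unicode string and fails on anything unmappable, instead of applying a best-fit table. The function name is hypothetical; the codec names are Python's standard ones.

```python
def transcode(raw: bytes, source: str, target: str) -> bytes:
    """Convert between charsets via Unicode, raising on unmappable data."""
    text = raw.decode(source, errors="strict")   # legacy charset -> Unicode
    return text.encode(target, errors="strict")  # Unicode -> target charset


hiragana_a = "\u3042".encode("shift_jis")
as_utf8 = transcode(hiragana_a, "shift_jis", "utf-8")
```

Converting the same bytes to ISO-8859-1 raises `UnicodeEncodeError`, which is the "fail" behavior the earlier slides recommend over silent substitution.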




Page 59: Unraveling Unicode: A Bag of Tricks for Bug Hunting...Unicode Crash Course code points encodings categorization normalization binary properties case mapping conversion tables bi-directional

Black Hat USA - July 2009 copy 2009 Chris Weber

ltscr[U+FEFF]iptgt becomes ltscriptgt

wwwcasabasecuritycom

Root CausesHandling the Unexpected Character-deletion

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Fail or error

bull Use U+FFFD instead

ndash A common alternative is lsquorsquo which can be safe

wwwcasabasecuritycom

Root CausesSolutions for Handling the Unexpected

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Bypass filters WAFrsquos NIDS and validation

bull Exploit delivery techniques

ndash Eg Cross-site scripting (buffer overflow of the Web)

wwwcasabasecuritycom

Attack VectorsFilter evasion

Black Hat USA - July 2009 copy 2009 Chris Weber

Safari and Firefox BOM consumptionndash Attack Filter evasion code execution

ndash Exploit Bypass filtering logic with specially crafted strings to leverage cross-site scripting

ndash Root Cause Character deletion

lta href=ldquojava[U+FEFF]scriptalert(bdquoXSS‟)gt

Can be nastier

lta h[U+FEFF]ref=ldquojava[U+FEFF]scriptal[U+FEFF]ert(bdquoXSS‟)gt

wwwcasabasecuritycom

Case Study Apple and Mozilla

Black Hat USA - July 2009 copy 2009 Chris Weberwwwcasabasecuritycom

A Closer Look The BOM

BOMU+FEFF

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Attackers manipulate casing operations to inject otherwise prohibited characters

bull Casing can multiply the buffer sizes needed

bull Impact Filter evasion Enable code execution

wwwcasabasecuritycom

Root CausesCasing

Black Hat USA - July 2009 copy 2009 Chris Weber

toLower(ldquoİrdquo) == ldquoirdquo

toLower(ldquoscrİptrdquo) == ldquoscriptrdquo

wwwcasabasecuritycom

Root CausesCasing

Black Hat USA - July 2009 copy 2009 Chris Weber

len(x) = len(toLower(x))

wwwcasabasecuritycom

Root CausesCasing

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Perform casing operations before validation

bull Leverage existing frameworks and APIrsquos

ndash ICU Net

wwwcasabasecuritycom

Root CausesGuidance for Casing

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Incorrect assumptions about string sizes (chars vs bytes)

bull Improper width calculations

bull Impact Enable code execution

wwwcasabasecuritycom

Root CausesBuffer Overflows

Black Hat USA - July 2009 copy 2009 Chris Weber

Casing - maximum expansion factors

wwwcasabasecuritycom

Root CausesBuffer Overflows

Operation UTF Factor Sample

Lower 8 15 Ⱥ U+023A

16 32 1 A U+0041

Upper 8 16 32 3 ΐ U+0390Source Unicode Technical Report 36

Black Hat USA - July 2009 copy 2009 Chris Weber

Normalization- maximum expansion factors

wwwcasabasecuritycom

Root CausesBuffer Overflows

Operation UTF Factor Sample

NFC8 3X 119136 U+1D160

16 32 3X ש U+FB2C

NFD8 3X ΐ U+0390

16 32 4X ᾂ U+1F82

NFKCNFKD8 11X

ملسو هيلع هللا ىلص U+FDFA16 32 18X

Source Unicode Technical Report 36

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Know the difference between bytes and chars

bull Secure coding

bull Leverage existing frameworks and APIrsquos

ndash ICU Net

wwwcasabasecuritycom

Root CausesGuidance for Buffer Overflows

Black Hat USA - July 2009 copy 2009 Chris Weber

bull White space and line breaks

ndash Eg when U+180E acts like U+0020

bull Quotation marks

bull Impact Filter evasion Enable code execution

wwwcasabasecuritycom

Root CausesControlling Syntax

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Manipulate HTML parsers and javascriptinterpreters

bull Control protocols

wwwcasabasecuritycom

Attacks and ExploitsControlling syntax

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Unicode formatter characters exploited for XSS

ndash Damage Filter evasion controlling syntax

ndash Exploit Bypass filtering logic with specially crafted characters to leverage cross-site scripting

ndash Root Cause Interpreting ldquowhite spacerdquo

ndash A problem with HTML 40 spec

wwwcasabasecuritycom

Case Study Opera

Black Hat USA - July 2009 copy 2009 Chris Weber

lta href=[U+180E]onclick=alert()gt

wwwcasabasecuritycom

Case Study Opera

Black Hat USA - July 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Case Study Opera

MVSU+180E

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Question specifications

bull Be carefulhellip

wwwcasabasecuritycom

Root CausesGuidance for Controlling Syntax

Black Hat USA - July 2009 copy 2009 Chris Weber

1) Character stabilityndash IDNANameprep based on Unicode 32

2) Designsndash Specs are carefully designed but not always perfect

bull This could have been a problemndash ldquoWhen designing a markup language or data protocol the use of

U+FEFF can be restricted to that of Byte Order Mark In that case any U+FEFF occurring in the middle of the file can be ignored or treated as an error rdquo

ndash HTML 401 bull Defines four whitespace characters and explicitly leaves

handling other characters up to implementer

wwwcasabasecuritycom

Root CausesSpecifications

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Converting between charsets is dangerous

bull Mapping tables and algorithms vary across platforms

bull Impact Filter evasion Enable code execution Data-loss

wwwcasabasecuritycom

Root CausesCharset Transformations

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Avoid if possible

bull Use Unicode as the broker

bull Beware the PUA mappings

bull Transform case and normalize prior to validation and redisplay

wwwcasabasecuritycom

Root CausesGuidance for Charset Transformations

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Some charset identifiers are ill-defined

bull Vendor implementations vary

bull User-agents may sniff if confused

bull Attackers manipulate behavior

bull Impact Filter evasion Enable code execution

wwwcasabasecuritycom

Root CausesCharset Mismatches

Black Hat USA - July 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Root CausesCharset Mismatches

Content-Type charset=ISO-8859-1

ltmeta http-equiv=Content-Type content=texthtml charset=shift_jisgt

Attacker-controlled input

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Force UTF-8

bull Error if uncertain

wwwcasabasecuritycom

Root CausesGuidance for Charset Mismatches

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Unicode crash course

bull Root Causes

bull Attack Vectors

bull Tools

wwwcasabasecuritycom

Unicode TransformationsAgenda


• Watcher
  – Passive Web-app security testing and auditing
• Unibomber
  – XSS autopwn testing tool

Tools

• Unicode transformation hot-spots
• User-controlled HTML
• Cross-domain issues
• Insecure cookies
• Insecure HTTP/HTTPS transitions
• SSL protocol and certificate issues
• XSS hot-spots
• Flash issues
• Silverlight issues
• Information disclosure

Tools: Watcher – Some of the Passive Checks Included

Tools

http://websecuritytool.codeplex.com

Tools: Watcher - Web-app Security Testing and Auditing

• Deterministic testing
• Auto-inject payloads
• Unicode transformers
  – < > ' " etc.
• Detect transformations and encoding hotspots

Tools: Unibomber – runtime XSS testing tool
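The idea of a "Unicode transformer" can be illustrated by searching for characters that turn into a syntax character after normalization. This is only a sketch of the concept; it does not reflect how Unibomber itself is implemented, and `find_transformers` is a hypothetical name.

```python
import unicodedata


def find_transformers(target, form="NFKC"):
    """Return code points (other than target itself) that normalize to target."""
    hits = []
    for cp in range(0x110000):
        if 0xD800 <= cp <= 0xDFFF:   # skip surrogates, they are not characters
            continue
        ch = chr(cp)
        if ch != target and unicodedata.normalize(form, ch) == target:
            hits.append(cp)
    return hits


if __name__ == "__main__":
    for target in "<>'\"":
        found = find_transformers(target)
        print(repr(target), [f"U+{cp:04X}" for cp in found])
```

Injecting each hit into a parameter and checking whether the ASCII form comes back in the response is one way to detect a transformation hot-spot. The exact list depends on the Unicode version bundled with the interpreter.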

Thank you

Casaba Security
www.casabasecurity.com

Chris Weber
Blog: www.lookout.net
Email: chris@casabasecurity.com
LinkedIn: http://www.linkedin.com/in/chrisweber

Page 60: Unraveling Unicode: A Bag of Tricks for Bug Hunting...Unicode Crash Course code points encodings categorization normalization binary properties case mapping conversion tables bi-directional

• Fail or error
• Use U+FFFD instead
  – A common alternative is '?', which can be safe

Root Causes: Solutions for Handling the Unexpected
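The difference between replacing and deleting an unexpected byte can be sketched with Python's decoder error handlers. The sample bytes and function names are illustrative only.

```python
RAW = b"java\xffscript:alert(1)"   # 0xFF can never appear in valid UTF-8


def decode_replacing(raw):
    """Substitute U+FFFD for each undecodable byte (the safer choice)."""
    return raw.decode("utf-8", errors="replace")


def decode_deleting(raw):
    """Silently drop undecodable bytes (the risky choice)."""
    return raw.decode("utf-8", errors="ignore")


if __name__ == "__main__":
    print(ascii(decode_replacing(RAW)))  # the keyword stays broken apart
    print(ascii(decode_deleting(RAW)))   # the keyword is stitched back together
```

Replacement keeps the position of the bad input visible, whereas deletion can join two harmless fragments into something a filter would have blocked.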

• Bypass filters, WAFs, NIDS, and validation
• Exploit delivery techniques
  – E.g. cross-site scripting (buffer overflow of the Web)

Attack Vectors: Filter evasion

Safari and Firefox BOM consumption
  – Attack: Filter evasion, code execution
  – Exploit: Bypass filtering logic with specially crafted strings to leverage cross-site scripting
  – Root Cause: Character deletion

<a href="java[U+FEFF]script:alert('XSS')">

Can be nastier:

<a h[U+FEFF]ref="java[U+FEFF]script:al[U+FEFF]ert('XSS')">

Case Study: Apple and Mozilla
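A toy model of the character-deletion problem described above. The filter and the "consumer" are hypothetical stand-ins (they are not browser code); the sketch only shows that a check made before U+FEFF is removed says nothing about the string that is eventually interpreted.

```python
BOM = "\ufeff"


def naive_filter_blocks(url):
    """Hypothetical filter: block URLs that start with the javascript: scheme."""
    return url.strip().lower().startswith("javascript:")


def consume_bom(text):
    """Model a consumer that silently deletes U+FEFF wherever it appears."""
    return text.replace(BOM, "")


def safer_filter_blocks(url):
    """Treat any embedded U+FEFF as an error before looking at the scheme."""
    return BOM in url or naive_filter_blocks(url)


if __name__ == "__main__":
    payload = "java\ufeffscript:alert('XSS')"
    print(naive_filter_blocks(payload))                 # not blocked
    print(naive_filter_blocks(consume_bom(payload)))    # what the consumer runs
    print(safer_filter_blocks(payload))                 # blocked
```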

A Closer Look: The BOM

BOM (Byte Order Mark) U+FEFF

• Attackers manipulate casing operations to inject otherwise prohibited characters
• Casing can multiply the buffer sizes needed
• Impact: Filter evasion, enable code execution

Root Causes: Casing

toLower("İ") == "i"

toLower("scrİpt") == "script"

Root Causes: Casing
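The same class of problem can be reproduced with Python's built-in case mappings. Results vary by platform and locale: Python lowercases U+0130 to "i" plus a combining dot (U+0307), whereas the slide shows an implementation that yields a plain "i". The filter and sink functions below are hypothetical.

```python
KELVIN = "\u212a"     # KELVIN SIGN, lowercases to ASCII "k"
LONG_S = "\u017f"     # LATIN SMALL LETTER LONG S, uppercases to ASCII "S"
DOTTED_I = "\u0130"   # LATIN CAPITAL LETTER I WITH DOT ABOVE


def filter_blocks(text):
    """Hypothetical filter: lowercases, then looks for the word 'script'."""
    return "script" in text.lower()


def sink_emits(text):
    """Hypothetical later stage that uppercases the value before use."""
    return text.upper()


if __name__ == "__main__":
    payload = LONG_S + "cript"
    print(filter_blocks(payload))          # the filter sees no "script"
    print(sink_emits(payload))             # the sink produces "SCRIPT"
    print(ascii(KELVIN.lower()))           # non-ASCII input, ASCII output
    print(ascii(DOTTED_I.lower()))         # "i" followed by U+0307 in Python
```

Using `casefold()` in the filter would catch the long s in this particular case, but the general lesson is to validate after the same transformation the sink applies.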

len(x) != len(toLower(x))

Root Causes: Casing
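A short sketch of the length change, using Python's full case mappings. `length_report` is a hypothetical helper; the sample characters are well-known cases where a mapping expands one code point into several.

```python
def length_report(text):
    """Compare the code point count before and after case conversion."""
    return {
        "original": len(text),
        "lower": len(text.lower()),
        "upper": len(text.upper()),
    }


if __name__ == "__main__":
    for sample in ("\u0130", "\u00df", "\u0390"):
        print(f"U+{ord(sample):04X}", length_report(sample))
```

Any length limit or buffer size therefore has to be checked after the casing operation, not before it.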

• Perform casing operations before validation
• Leverage existing frameworks and APIs
  – ICU, .NET

Root Causes: Guidance for Casing

• Incorrect assumptions about string sizes (chars vs. bytes)
• Improper width calculations
• Impact: Enable code execution

Root Causes: Buffer Overflows
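The chars-versus-bytes distinction can be sketched as follows. The helper names are hypothetical; the point is that the same string has three different "lengths", and that truncating by bytes must not split a code point.

```python
def sizes(text):
    """Report the length of text in code points, UTF-8 bytes, and UTF-16 units."""
    return {
        "code_points": len(text),
        "utf8_bytes": len(text.encode("utf-8")),
        "utf16_units": len(text.encode("utf-16-le")) // 2,
    }


def truncate_utf8(text, max_bytes):
    """Truncate to at most max_bytes of UTF-8 without splitting a code point."""
    clipped = text.encode("utf-8")[:max_bytes]
    # A partial trailing sequence is dropped instead of being emitted as garbage.
    return clipped.decode("utf-8", errors="ignore")


if __name__ == "__main__":
    sample = "a\u20ac\U0001d160"
    print(sizes(sample))
    print(ascii(truncate_utf8("a\u20ac", 2)))
```

In languages with manual memory management the same mismatch is what turns into an overflow: a buffer sized in characters but filled in bytes (or the reverse).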

Casing - maximum expansion factors

Root Causes: Buffer Overflows

Operation   UTF         Factor   Sample
Lower       8           1.5      Ⱥ U+023A
Lower       16, 32      1        A U+0041
Upper       8, 16, 32   3        ΐ U+0390

Source: Unicode Technical Report 36
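The factors in the table can be checked with a few lines of Python. This is a sketch: `expansion` is a hypothetical helper, and the results depend on the Unicode data shipped with the interpreter.

```python
def expansion(before, after, encoding):
    """Ratio of encoded sizes after and before a transformation."""
    return len(after.encode(encoding)) / len(before.encode(encoding))


if __name__ == "__main__":
    a_stroke = "\u023a"   # lowercases to U+2C65, which needs one more UTF-8 byte
    iota = "\u0390"       # uppercases to three code points
    print(expansion(a_stroke, a_stroke.lower(), "utf-8"))
    print(expansion("A", "A".lower(), "utf-16-le"))
    for enc in ("utf-8", "utf-16-le", "utf-32-le"):
        print(enc, expansion(iota, iota.upper(), enc))
```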

Normalization - maximum expansion factors

Root Causes: Buffer Overflows

Operation    UTF      Factor   Sample
NFC          8        3X       U+1D160 (musical symbol eighth note)
NFC          16, 32   3X       שּׁ U+FB2C
NFD          8        3X       ΐ U+0390
NFD          16, 32   4X       ᾂ U+1F82
NFKC/NFKD    8        11X      ﷺ U+FDFA
NFKC/NFKD    16, 32   18X      ﷺ U+FDFA

Source: Unicode Technical Report 36
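The normalization factors above can be reproduced with the standard `unicodedata` module. A sketch only: `normalization_expansion` is a hypothetical helper, and the numbers assume the interpreter's Unicode data agrees with the table.

```python
import unicodedata


def normalization_expansion(ch, form, encoding):
    """Ratio of encoded sizes after and before normalizing a single character."""
    normalized = unicodedata.normalize(form, ch)
    return len(normalized.encode(encoding)) / len(ch.encode(encoding))


if __name__ == "__main__":
    cases = [
        ("\U0001d160", "NFC", "utf-8"),
        ("\ufb2c", "NFC", "utf-16-le"),
        ("\u0390", "NFD", "utf-8"),
        ("\u1f82", "NFD", "utf-16-le"),
        ("\ufdfa", "NFKC", "utf-8"),
        ("\ufdfa", "NFKC", "utf-16-le"),
    ]
    for ch, form, enc in cases:
        print(f"U+{ord(ch):04X}", form, enc, normalization_expansion(ch, form, enc))
```

A buffer sized for the input therefore needs headroom of up to the listed factor when normalization happens in place.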

• Know the difference between bytes and chars
• Secure coding
• Leverage existing frameworks and APIs
  – ICU, .NET

Root Causes: Guidance for Buffer Overflows

• White space and line breaks
  – E.g. when U+180E acts like U+0020
• Quotation marks
• Impact: Filter evasion, enable code execution

Root Causes: Controlling Syntax

• Manipulate HTML parsers and JavaScript interpreters
• Control protocols

Attacks and Exploits: Controlling syntax

• Unicode formatter characters exploited for XSS
  – Damage: Filter evasion, controlling syntax
  – Exploit: Bypass filtering logic with specially crafted characters to leverage cross-site scripting
  – Root Cause: Interpreting "white space"
  – A problem with HTML 4.0 spec

Case Study: Opera

<a href=[U+180E]onclick=alert()>

Case Study: Opera
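A toy tokenizer can show why disagreement about "white space" matters for the payload above. This is an assumption-laden sketch, not a model of Opera's parser: it splits a tag body on a configurable separator set, once with the four whitespace characters HTML 4.01 names (space, tab, form feed, zero-width space) and once with U+180E added.

```python
HTML401_WHITESPACE = {" ", "\t", "\f", "\u200b"}
MVS = "\u180e"  # MONGOLIAN VOWEL SEPARATOR


def split_on(text, separators):
    """Split text wherever a character from separators appears."""
    tokens, current = [], []
    for ch in text:
        if ch in separators:
            if current:
                tokens.append("".join(current))
                current = []
        else:
            current.append(ch)
    if current:
        tokens.append("".join(current))
    return tokens


if __name__ == "__main__":
    tag_body = "a href=" + MVS + "onclick=alert()"
    # A filter following the spec sees one attribute with an odd value.
    print([ascii(t) for t in split_on(tag_body, HTML401_WHITESPACE)])
    # A lenient parser that treats U+180E as white space sees a new attribute.
    print(split_on(tag_body, HTML401_WHITESPACE | {MVS}))
```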

Case Study: Opera

MVS (Mongolian Vowel Separator) U+180E

• Question specifications
• Be careful…

Root Causes: Guidance for Controlling Syntax

1) Character stability
  – IDNA/Nameprep based on Unicode 3.2

2) Designs
  – Specs are carefully designed but not always perfect
  • This could have been a problem
    – "When designing a markup language or data protocol, the use of U+FEFF can be restricted to that of Byte Order Mark. In that case, any U+FEFF occurring in the middle of the file can be ignored, or treated as an error."
    – HTML 4.01
      • Defines four whitespace characters and explicitly leaves handling other characters up to the implementer

Root Causes: Specifications
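Of the two options the quoted text allows for a mid-file U+FEFF, "treat as an error" is the one that avoids silent deletion. A minimal sketch, assuming UTF-8 input and a hypothetical `decode_document` helper:

```python
BOM = "\ufeff"


def decode_document(raw):
    """Accept one leading BOM, reject U+FEFF anywhere else in the data."""
    text = raw.decode("utf-8-sig")   # strips a single leading BOM if present
    if BOM in text:
        raise ValueError("U+FEFF found in the middle of the data")
    return text


if __name__ == "__main__":
    print(decode_document(b"\xef\xbb\xbfhello"))
    try:
        decode_document("java\ufeffscript".encode("utf-8"))
    except ValueError as err:
        print("rejected:", err)
```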





Page 64: Unraveling Unicode: A Bag of Tricks for Bug Hunting...Unicode Crash Course code points encodings categorization normalization binary properties case mapping conversion tables bi-directional

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Attackers manipulate casing operations to inject otherwise prohibited characters

bull Casing can multiply the buffer sizes needed

bull Impact Filter evasion Enable code execution

wwwcasabasecuritycom

Root CausesCasing

Black Hat USA - July 2009 copy 2009 Chris Weber

toLower(ldquoİrdquo) == ldquoirdquo

toLower(ldquoscrİptrdquo) == ldquoscriptrdquo

wwwcasabasecuritycom

Root CausesCasing

Black Hat USA - July 2009 copy 2009 Chris Weber

len(x) = len(toLower(x))

wwwcasabasecuritycom

Root CausesCasing

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Perform casing operations before validation

bull Leverage existing frameworks and APIrsquos

ndash ICU Net

wwwcasabasecuritycom

Root CausesGuidance for Casing

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Incorrect assumptions about string sizes (chars vs bytes)

bull Improper width calculations

bull Impact Enable code execution

wwwcasabasecuritycom

Root CausesBuffer Overflows

Root Causes: Buffer Overflows

Casing - maximum expansion factors

Operation   UTF         Factor   Sample
Lower       8           1.5      Ⱥ U+023A
            16, 32      1        A U+0041
Upper       8, 16, 32   3        ΐ U+0390

Source: Unicode Technical Report 36
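The factors in the casing table can be reproduced with a short sketch, assuming Python 3's built-in case mappings and codecs. The helper name `expansion` is made up for this example.

```python
def expansion(before: str, after: str, encoding: str) -> float:
    """Ratio of encoded sizes in bytes, after / before a transformation."""
    return len(after.encode(encoding)) / len(before.encode(encoding))


samples = [
    ("lower", "\u023A", str.lower),  # Ⱥ becomes U+2C65
    ("lower", "A", str.lower),
    ("upper", "\u0390", str.upper),  # ΐ becomes three code points
]

for name, ch, op in samples:
    out = op(ch)
    factors = {enc: expansion(ch, out, enc)
               for enc in ("utf-8", "utf-16-le", "utf-32-le")}
    print(name, f"U+{ord(ch):04X}", factors)
```

A buffer sized from the input length alone is therefore too small in the worst case.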

Root Causes: Buffer Overflows

Normalization - maximum expansion factors

Operation   UTF      Factor   Sample
NFC         8        3X       𝅘𝅥𝅮 U+1D160
            16, 32   3X       שּׁ U+FB2C
NFD         8        3X       ΐ U+0390
            16, 32   4X       ᾂ U+1F82
NFKC/NFKD   8        11X      ﷺ U+FDFA
            16, 32   18X      ﷺ U+FDFA

Source: Unicode Technical Report 36
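The normalization factors can be checked the same way. This is a sketch that assumes Python's standard `unicodedata` module. The helper `factor` is hypothetical, and the UTF-16 column is measured with the `utf-16-le` codec.

```python
import unicodedata


def factor(ch: str, form: str, encoding: str) -> float:
    """Encoded size of the normalized text divided by the size of the input."""
    out = unicodedata.normalize(form, ch)
    return len(out.encode(encoding)) / len(ch.encode(encoding))


rows = [
    ("NFC", "\U0001D160", "utf-8"),
    ("NFC", "\uFB2C", "utf-16-le"),
    ("NFD", "\u0390", "utf-8"),
    ("NFD", "\u1F82", "utf-16-le"),
    ("NFKC", "\uFDFA", "utf-8"),
    ("NFKC", "\uFDFA", "utf-16-le"),
]
for form, ch, enc in rows:
    print(form, f"U+{ord(ch):04X}", enc, factor(ch, form, enc))
```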

Root Causes: Guidance for Buffer Overflows

• Know the difference between bytes and chars

• Secure coding

• Leverage existing frameworks and APIs

– ICU, .NET
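The bytes-versus-chars distinction can be shown in a few lines. The function `fits` and the 16-byte limit are assumptions for illustration.

```python
def fits(text: str, max_bytes: int, encoding: str = "utf-8") -> bool:
    """Check the encoded size, not the number of characters."""
    return len(text.encode(encoding)) <= max_bytes


name = "\u00E9" * 10              # ten characters
print(len(name))                  # 10 code points
print(len(name.encode("utf-8")))  # 20 bytes
print(fits(name, 16))             # False, although len(name) <= 16
```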

Root Causes: Controlling Syntax

• White space and line breaks

– E.g. when U+180E acts like U+0020

• Quotation marks

• Impact: Filter evasion, Enable code execution
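One way to look for such characters is to flag anything separator-like that falls outside the white space a format actually defines. The sketch below assumes HTML 4.01's white-space set plus CR and LF. The category list and the function name are choices made for this example.

```python
import unicodedata

# White space defined by HTML 4.01 (section 9.1), plus CR and LF line breaks.
HTML4_SPACE = {"\u0020", "\u0009", "\u000C", "\u200B", "\u000D", "\u000A"}
# General categories: separators (Zs, Zl, Zp), format (Cf), control (Cc).
SUSPECT_CATEGORIES = {"Zs", "Zl", "Zp", "Cf", "Cc"}


def suspicious_separators(text: str) -> list[tuple[int, str]]:
    """Return (index, U+XXXX) for separator-like characters outside HTML4_SPACE."""
    return [
        (i, f"U+{ord(ch):04X}")
        for i, ch in enumerate(text)
        if ch not in HTML4_SPACE
        and unicodedata.category(ch) in SUSPECT_CATEGORIES
    ]


print(suspicious_separators("a href=\u180Eonclick"))  # [(7, 'U+180E')]
```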

Attacks and Exploits: Controlling syntax

• Manipulate HTML parsers and JavaScript interpreters

• Control protocols

Case Study: Opera

• Unicode formatter characters exploited for XSS

– Damage: Filter evasion, controlling syntax

– Exploit: Bypass filtering logic with specially crafted characters to leverage cross-site scripting

– Root Cause: Interpreting “white space”

– A problem with HTML 4.0 spec

Case Study: Opera

<a href=[U+180E]onclick=alert()>
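The bypass can be sketched with a hypothetical filter that only recognizes ASCII white space before an event-handler attribute. If the consuming parser treats U+180E as white space, as this case study describes, the filter and the parser disagree about where the attribute starts.

```python
import re

# Hypothetical filter: an event-handler attribute must follow ASCII white space.
EVENT_ATTR = re.compile(r"[ \t\r\n\f]on[a-z]+\s*=", re.IGNORECASE)


def looks_safe(markup: str) -> bool:
    """Return True when the filter finds no event-handler attribute."""
    return EVENT_ATTR.search(markup) is None


plain = "<a href= onclick=alert()>"
crafted = "<a href=\u180Eonclick=alert()>"
print(looks_safe(plain))    # False: the filter sees " onclick="
print(looks_safe(crafted))  # True: U+180E is not in the filter's white-space set
```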

Case Study: Opera

MVS (Mongolian Vowel Separator), U+180E

Root Causes: Guidance for Controlling Syntax

• Question specifications

• Be careful…

Root Causes: Specifications

1) Character stability

– IDNA/Nameprep based on Unicode 3.2

2) Designs

– Specs are carefully designed but not always perfect

• This could have been a problem

– “When designing a markup language or data protocol, the use of U+FEFF can be restricted to that of Byte Order Mark. In that case, any U+FEFF occurring in the middle of the file can be ignored, or treated as an error.”

– HTML 4.01

• Defines four whitespace characters and explicitly leaves handling other characters up to implementer
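The quoted U+FEFF guidance leaves a choice between ignoring and rejecting. A minimal sketch of the stricter option, with a hypothetical function name, could look like this:

```python
BOM = "\uFEFF"


def strip_bom_strict(text: str) -> str:
    """Drop one leading U+FEFF and reject any later occurrence."""
    if text.startswith(BOM):
        text = text[1:]
    if BOM in text:
        raise ValueError("U+FEFF found outside the byte order mark position")
    return text


print(strip_bom_strict("\uFEFF<p>ok</p>"))  # <p>ok</p>
```

Rejecting rather than silently dropping the character avoids cases where a filter sees `scr[U+FEFF]ipt` and a later stage sees `script`.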

Root Causes: Charset Transformations

• Converting between charsets is dangerous

• Mapping tables and algorithms vary across platforms

• Impact: Filter evasion, Enable code execution, Data-loss

Root Causes: Guidance for Charset Transformations

• Avoid if possible

• Use Unicode as the broker

• Beware the PUA mappings

• Transform case and normalize prior to validation and redisplay
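"Unicode as the broker" can be sketched as decoding to Unicode text, checking it, and only then encoding to the target charset, with strict error handling at both steps. The function name and the decision to reject BMP private-use code points (U+E000 to U+F8FF) outright are assumptions for this example.

```python
def convert(data: bytes, src: str, dst: str) -> bytes:
    """Convert between charsets via Unicode, failing on anything unmappable."""
    text = data.decode(src, errors="strict")
    if any(0xE000 <= ord(ch) <= 0xF8FF for ch in text):
        raise ValueError("private-use code point in converted text")
    return text.encode(dst, errors="strict")


print(convert(b"caf\xe9", "iso-8859-1", "utf-8"))  # b'caf\xc3\xa9'
try:
    convert(b"caf\xe9", "iso-8859-1", "ascii")
except UnicodeEncodeError as exc:
    print("refused lossy conversion:", exc.reason)
```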

Root Causes: Charset Mismatches

• Some charset identifiers are ill-defined

• Vendor implementations vary

• User-agents may sniff if confused

• Attackers manipulate behavior

• Impact: Filter evasion, Enable code execution

Root Causes: Charset Mismatches

Content-Type: charset=ISO-8859-1

<meta http-equiv=Content-Type content="text/html; charset=shift_jis">

Attacker-controlled input
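The effect of two components disagreeing about the charset can be sketched with the same bytes decoded two ways. Note the substitution: this example uses UTF-7 instead of the slide's Shift_JIS, because UTF-7 shows the problem with pure ASCII bytes. The payload is illustrative.

```python
raw = b"+ADw-script+AD4-alert(1)+ADw-/script+AD4-"

as_latin1 = raw.decode("iso-8859-1")  # what a filter assuming ISO-8859-1 sees
as_utf7 = raw.decode("utf-7")         # what a consumer assuming UTF-7 sees

print("<" in as_latin1)  # False: nothing for the filter to flag
print(as_utf7)           # <script>alert(1)</script>
```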

Root Causes: Guidance for Charset Mismatches

• Force UTF-8

• Error if uncertain
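Both bullets can be sketched together: accept only UTF-8 and fail closed on anything else. The function names and the accepted labels are assumptions for illustration, not part of any framework.

```python
from typing import Optional


def decode_body(raw: bytes, declared_charset: Optional[str]) -> str:
    """Accept only UTF-8 and raise on a missing, different, or malformed charset."""
    label = (declared_charset or "").strip().lower()
    if label not in ("utf-8", "utf8"):
        raise ValueError(f"unsupported or missing charset: {declared_charset!r}")
    return raw.decode("utf-8", errors="strict")  # raises on malformed bytes


def content_type_header() -> str:
    """Header value to send so user agents have no reason to sniff."""
    return "text/html; charset=utf-8"
```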

Unicode Transformations: Agenda

• Unicode crash course

• Root Causes

• Attack Vectors

• Tools

Tools

• Watcher

– Passive Web-app security testing and auditing

• Unibomber

– XSS autopwn testing tool

Tools: Watcher – Some of the Passive Checks Included

• Unicode transformation hot-spots

• User-controlled HTML

• Cross-domain issues

• Insecure cookies

• Insecure HTTP/HTTPS transitions

• SSL protocol and certificate issues

• XSS hot-spots

• Flash issues

• Silverlight issues

• Information disclosure

Tools: Watcher - Web-app Security Testing and Auditing

http://websecuritytool.codeplex.com

Tools: Unibomber – runtime XSS testing tool

• Deterministic testing

• Auto-inject payloads

• Unicode transformers

– < > ‘ “ etc.

• Detect transformations and encoding hotspots
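A list of candidate "transformer" characters for a payload can be generated from normalization data. The sketch below is not Unibomber's implementation. It is an assumption-laden illustration that scans the Basic Multilingual Plane with Python's `unicodedata` for characters that normalize to text containing a target such as `<`.

```python
import unicodedata


def transformers(target: str, form: str = "NFKC") -> list[str]:
    """BMP characters, other than target, whose normalization contains target."""
    found = []
    for cp in range(0x10000):
        if 0xD800 <= cp <= 0xDFFF:  # skip surrogate code points
            continue
        ch = chr(cp)
        if ch != target and target in unicodedata.normalize(form, ch):
            found.append(f"U+{cp:04X}")
    return found


print(transformers("<"))
```

Injecting each candidate and checking whether the response contains the raw target character is one way to detect a transformation hot-spot.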

Thank you

Casaba Security

www.casabasecurity.com

Chris Weber

Blog: www.lookout.net

Email: chris@casabasecurity.com

LinkedIn: http://www.linkedin.com/in/chrisweber

Page 69: Unraveling Unicode: A Bag of Tricks for Bug Hunting...Unicode Crash Course code points encodings categorization normalization binary properties case mapping conversion tables bi-directional

Black Hat USA - July 2009 copy 2009 Chris Weber

Casing - maximum expansion factors

wwwcasabasecuritycom

Root CausesBuffer Overflows

Operation UTF Factor Sample

Lower 8 15 Ⱥ U+023A

16 32 1 A U+0041

Upper 8 16 32 3 ΐ U+0390Source Unicode Technical Report 36

Black Hat USA - July 2009 copy 2009 Chris Weber

Normalization- maximum expansion factors

wwwcasabasecuritycom

Root CausesBuffer Overflows

Operation UTF Factor Sample

NFC8 3X 119136 U+1D160

16 32 3X ש U+FB2C

NFD8 3X ΐ U+0390

16 32 4X ᾂ U+1F82

NFKCNFKD8 11X

ملسو هيلع هللا ىلص U+FDFA16 32 18X

Source Unicode Technical Report 36

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Know the difference between bytes and chars

bull Secure coding

bull Leverage existing frameworks and APIrsquos

ndash ICU Net

wwwcasabasecuritycom

Root CausesGuidance for Buffer Overflows

Black Hat USA - July 2009 copy 2009 Chris Weber

bull White space and line breaks

ndash Eg when U+180E acts like U+0020

bull Quotation marks

bull Impact Filter evasion Enable code execution

wwwcasabasecuritycom

Root CausesControlling Syntax

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Manipulate HTML parsers and javascriptinterpreters

bull Control protocols

wwwcasabasecuritycom

Attacks and ExploitsControlling syntax

Case Study: Opera

• Unicode formatter characters exploited for XSS
  – Damage: Filter evasion, controlling syntax
  – Exploit: Bypass filtering logic with specially crafted characters to leverage cross-site scripting
  – Root Cause: Interpreting “white space”
  – A problem with HTML 4.0 spec

Case Study: Opera

<a href=[U+180E]onclick=alert()>
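The payload above works when a filter and a parser disagree about what counts as a separator. The sketch below is illustrative only: `NAIVE_HANDLER` stands in for a hypothetical filter that looks for an event-handler attribute after ASCII whitespace, and `SUSPECT_CATEGORIES` is an assumed list of general categories that a lenient parser might treat as white space.

```python
import re
import unicodedata

# Hypothetical filter: an "on...=" attribute preceded by ASCII white space.
NAIVE_HANDLER = re.compile(r"[ \t\r\n\f]on[a-z]+=", re.IGNORECASE)

# Assumed set of categories worth flagging: separators, format and control characters.
SUSPECT_CATEGORIES = {"Zs", "Zl", "Zp", "Cf", "Cc"}

def naive_filter_blocks(markup: str) -> bool:
    """True if the naive ASCII-only filter would catch an event handler."""
    return NAIVE_HANDLER.search(markup) is not None

def has_suspect_separator(markup: str) -> bool:
    """True if the markup contains a non-ASCII-space separator-like character."""
    return any(
        ch != " " and unicodedata.category(ch) in SUSPECT_CATEGORIES
        for ch in markup
    )

plain = "<a href=x onclick=alert()>"
payload = "<a href=\u180eonclick=alert()>"

print(naive_filter_blocks(plain), naive_filter_blocks(payload))
print(has_suspect_separator(plain), has_suspect_separator(payload))
```

The naive filter catches the plain version but misses the U+180E variant, while the category check flags it regardless of which Unicode version classifies the character.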

Case Study: Opera

MVS U+180E

Root Causes: Guidance for Controlling Syntax

• Question specifications
• Be careful…

Root Causes: Specifications

1) Character stability
  – IDNA/Nameprep based on Unicode 3.2
2) Designs
  – Specs are carefully designed but not always perfect
• This could have been a problem
  – “When designing a markup language or data protocol, the use of U+FEFF can be restricted to that of Byte Order Mark. In that case, any U+FEFF occurring in the middle of the file can be ignored, or treated as an error.”
  – HTML 4.01
    • Defines four whitespace characters and explicitly leaves handling other characters up to implementer
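The quoted U+FEFF guidance leaves a choice between ignoring and rejecting. A minimal sketch of the stricter option follows; `strip_bom_strict` is a hypothetical helper for this example, and choosing to reject rather than ignore is an assumption, not something the slides prescribe.

```python
BOM = "\ufeff"

def strip_bom_strict(text: str) -> str:
    """Remove one leading byte order mark; reject U+FEFF anywhere else."""
    body = text[1:] if text.startswith(BOM) else text
    if BOM in body:
        # Treating an interior U+FEFF as an error avoids silently
        # changing how later components tokenize the text.
        raise ValueError("U+FEFF found in the middle of the input")
    return body

cleaned = strip_bom_strict("\ufeffabc")
print(cleaned)
```

Rejecting is the safer default when the text will later pass through a filter, because a silently ignored character can join two tokens that the filter saw as separate.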

Root Causes: Charset Transformations

• Converting between charsets is dangerous
• Mapping tables and algorithms vary across platforms
• Impact: Filter evasion, Enable code execution, Data-loss

Root Causes: Guidance for Charset Transformations

• Avoid if possible
• Use Unicode as the broker
• Beware the PUA mappings
• Transform case and normalize prior to validation and redisplay
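The last bullet is about ordering: validation only means something if it runs on the form of the text that will actually be used. A sketch under stated assumptions: `canonicalize` and `contains_markup` are made-up helpers, and NFKC plus `casefold()` is one possible canonical form, not necessarily the one a given application needs.

```python
import unicodedata

def canonicalize(value: str) -> str:
    """Apply compatibility normalization, then case folding."""
    return unicodedata.normalize("NFKC", value).casefold()

def contains_markup(value: str) -> bool:
    """Very small stand-in for a validation step."""
    return "<" in value or ">" in value

# FULLWIDTH LESS-THAN SIGN and FULLWIDTH GREATER-THAN SIGN around "SCRIPT".
raw = "\uff1cSCRIPT\uff1e"

checked_too_early = contains_markup(raw)               # validation before normalization
checked_after = contains_markup(canonicalize(raw))     # validation after normalization

print(checked_too_early, checked_after, canonicalize(raw))
```

Validating the raw input passes it as harmless; validating after normalization sees the angle brackets that a later transformation would otherwise introduce.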

Root Causes: Charset Mismatches

• Some charset identifiers are ill-defined
• Vendor implementations vary
• User-agents may sniff if confused
• Attackers manipulate behavior
• Impact: Filter evasion, Enable code execution

Root Causes: Charset Mismatches

Content-Type: charset=ISO-8859-1

<meta http-equiv="Content-Type" content="text/html; charset=shift_jis">

Attacker-controlled input
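When the header and the meta tag disagree, the same bytes can be read two ways. The sketch below uses Python's standard `iso-8859-1` and `shift_jis` codecs; the two-byte sample is chosen for illustration and does not come from the slides.

```python
# Two bytes that a Shift_JIS decoder reads as a single katakana character.
data = b"\x83\x5c"

as_latin1 = data.decode("iso-8859-1")  # two characters; the second is a backslash
as_sjis = data.decode("shift_jis")     # one character, no backslash at all

print(repr(as_latin1), repr(as_sjis))
print("\\" in as_latin1, "\\" in as_sjis)
```

A filter that inspects the ISO-8859-1 reading sees a backslash, while a consumer that follows the Shift_JIS label does not, so the two components disagree about where escapes and delimiters fall.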

Root Causes: Guidance for Charset Mismatches

• Force UTF-8
• Error if uncertain
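Both bullets can be expressed in a few lines. This is a minimal sketch: `decode_request_body` and `RESPONSE_CONTENT_TYPE` are hypothetical names for this example, and how a real application wires them into its framework is left open.

```python
# Declare the charset explicitly on every response instead of letting the client guess.
RESPONSE_CONTENT_TYPE = "text/html; charset=utf-8"

def decode_request_body(raw: bytes) -> str:
    """Accept input only if it is well-formed UTF-8; never fall back to another charset."""
    try:
        return raw.decode("utf-8", errors="strict")
    except UnicodeDecodeError as exc:
        raise ValueError("input is not valid UTF-8; rejecting") from exc

ok = decode_request_body("caf\u00e9".encode("utf-8"))
print(ok)
```

Failing closed on malformed input avoids the guessing step that charset sniffing relies on.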

Unicode Transformations: Agenda

• Unicode crash course
• Root Causes
• Attack Vectors
• Tools

Tools

• Watcher
  – Passive Web-app security testing and auditing
• Unibomber
  – XSS autopwn testing tool

Tools: Watcher – Some of the Passive Checks Included

• Unicode transformation hot-spots
• User-controlled HTML
• Cross-domain issues
• Insecure cookies
• Insecure HTTP/HTTPS transitions
• SSL protocol and certificate issues
• XSS hot-spots
• Flash issues
• Silverlight issues
• Information disclosure

Tools: Watcher - Web-app Security Testing and Auditing

http://websecuritytool.codeplex.com

Tools: Unibomber – runtime XSS testing tool

• Deterministic testing
• Auto-inject payloads
• Unicode transformers
  – < > ‘ “ etc.
• Detect transformations and encoding hotspots
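One way to find "transformer" characters for a given target is to enumerate every code point and keep those that normalize to it. The sketch below is not Unibomber's implementation; `normalizes_to` is a hypothetical helper, and NFKC is only one of several transformations (case mapping, best-fit charset conversion) a tester might probe.

```python
import sys
import unicodedata

def normalizes_to(target: str, form: str = "NFKC") -> list:
    """Return every other single code point whose normalization equals target."""
    hits = []
    for cp in range(sys.maxunicode + 1):
        if 0xD800 <= cp <= 0xDFFF:
            continue  # skip surrogate code points
        ch = chr(cp)
        if ch != target and unicodedata.normalize(form, ch) == target:
            hits.append(ch)
    return hits

less_than_lookalikes = normalizes_to("<")
print([f"U+{ord(ch):04X}" for ch in less_than_lookalikes])
```

Injecting each of these and checking whether the response contains a literal `<` shows where an application normalizes input after its filter has run.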

Thank you

Casaba Security
www.casabasecurity.com

Chris Weber
Blog: www.lookout.net
Email: chris@casabasecurity.com
LinkedIn: http://www.linkedin.com/in/chrisweber

Page 75: Unraveling Unicode: A Bag of Tricks for Bug Hunting...Unicode Crash Course code points encodings categorization normalization binary properties case mapping conversion tables bi-directional

Black Hat USA - July 2009 copy 2009 Chris Weber

lta href=[U+180E]onclick=alert()gt

wwwcasabasecuritycom

Case Study Opera

Black Hat USA - July 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Case Study Opera

MVSU+180E

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Question specifications

bull Be carefulhellip

wwwcasabasecuritycom

Root CausesGuidance for Controlling Syntax

Black Hat USA - July 2009 copy 2009 Chris Weber

1) Character stabilityndash IDNANameprep based on Unicode 32

2) Designsndash Specs are carefully designed but not always perfect

bull This could have been a problemndash ldquoWhen designing a markup language or data protocol the use of

U+FEFF can be restricted to that of Byte Order Mark In that case any U+FEFF occurring in the middle of the file can be ignored or treated as an error rdquo

ndash HTML 401 bull Defines four whitespace characters and explicitly leaves

handling other characters up to implementer

wwwcasabasecuritycom

Root CausesSpecifications

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Converting between charsets is dangerous

bull Mapping tables and algorithms vary across platforms

bull Impact Filter evasion Enable code execution Data-loss

wwwcasabasecuritycom

Root CausesCharset Transformations

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Avoid if possible

bull Use Unicode as the broker

bull Beware the PUA mappings

bull Transform case and normalize prior to validation and redisplay

wwwcasabasecuritycom

Root CausesGuidance for Charset Transformations

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Some charset identifiers are ill-defined

bull Vendor implementations vary

bull User-agents may sniff if confused

bull Attackers manipulate behavior

bull Impact Filter evasion Enable code execution

wwwcasabasecuritycom

Root CausesCharset Mismatches

Black Hat USA - July 2009 copy 2009 Chris Weberwwwcasabasecuritycom

Root CausesCharset Mismatches

Content-Type charset=ISO-8859-1

ltmeta http-equiv=Content-Type content=texthtml charset=shift_jisgt

Attacker-controlled input

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Force UTF-8

bull Error if uncertain

wwwcasabasecuritycom

Root CausesGuidance for Charset Mismatches

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Unicode crash course

bull Root Causes

bull Attack Vectors

bull Tools

wwwcasabasecuritycom

Unicode TransformationsAgenda

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Unicode crash course

bull Root Causes

bull Attack Vectors

bull Tools

wwwcasabasecuritycom

Unicode TransformationsAgenda

Black Hat USA - July 2009 copy 2009 Chris Weber

bull Watcher

ndash Passive Web-app security testing and auditing

bull Unibomber

ndash XSS autopwn testing tool

wwwcasabasecuritycom

Tools

• Unicode transformation hot-spots

• User-controlled HTML

• Cross-domain issues

• Insecure cookies

• Insecure HTTP/HTTPS transitions

• SSL protocol and certificate issues

• XSS hot-spots

• Flash issues

• Silverlight issues

• Information disclosure

Tools: Watcher – Some of the Passive Checks Included
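As a rough illustration of what a passive check for Unicode transformation hot-spots could look like, the Python sketch below compares a value sent in a request with the response body. This is not Watcher's actual logic; the function name and the set of transforms are assumptions.

```python
import unicodedata
from typing import Optional

# Candidate transforms an application might apply to input before echoing it.
TRANSFORMS = {
    "NFC": lambda s: unicodedata.normalize("NFC", s),
    "NFKC": lambda s: unicodedata.normalize("NFKC", s),
    "lowercase": str.lower,
    "uppercase": str.upper,
}

def find_transformation(sent: str, response_body: str) -> Optional[str]:
    """Return the name of the transform under which `sent` reappears, if any."""
    if sent in response_body:
        return None  # echoed unchanged: not a transformation hot-spot
    for name, transform in TRANSFORMS.items():
        changed = transform(sent)
        if changed != sent and changed in response_body:
            return name
    return None

print(find_transformation("\uFF1Cb\uFF1E", "<p>You searched for <b></p>"))
```

A hit only says that the value came back in a transformed form; whether that is exploitable still needs manual review.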

http://websecuritytool.codeplex.com

Tools: Watcher - Web-app Security Testing and Auditing

• Deterministic testing

• Auto-inject payloads

• Unicode transformers

  – < > ' " etc.

• Detect transformations and encoding hotspots

Tools: Unibomber – runtime XSS testing tool
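The idea of "Unicode transformers" for characters such as < and > can be sketched in Python by searching for code points that normalize to them. This is an illustration under the assumption that NFKC is the transformation of interest; it is not how Unibomber itself builds its payloads, and `transformers_for` is a hypothetical helper.

```python
import unicodedata
from typing import Dict, List

def transformers_for(targets: str, form: str = "NFKC") -> Dict[str, List[str]]:
    """Map each target character to the other code points that normalize to it."""
    found: Dict[str, List[str]] = {t: [] for t in targets}
    for cp in range(0x110000):
        if 0xD800 <= cp <= 0xDFFF:          # skip surrogate code points
            continue
        ch = chr(cp)
        normalized = unicodedata.normalize(form, ch)
        if normalized in found and normalized != ch:
            found[normalized].append(ch)
    return found

table = transformers_for("<>'\"")
for target, hits in table.items():
    print(repr(target), [f"U+{ord(c):04X}" for c in hits])
```

Each hit is a candidate payload character: if it is injected and the plain ASCII form comes back in the response, a transformation took place somewhere in the application.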

Thank you

Casaba Security

www.casabasecurity.com

Chris Weber

Blog: www.lookout.net

Email: chris@casabasecurity.com

LinkedIn: http://www.linkedin.com/in/chrisweber

Case Study: Opera

Mongolian Vowel Separator (MVS), U+180E

• Question specifications

• Be careful…

Root Causes: Guidance for Controlling Syntax

1) Character stability
  – IDNA/Nameprep based on Unicode 3.2

2) Designs
  – Specs are carefully designed but not always perfect

• This could have been a problem
  – “When designing a markup language or data protocol, the use of U+FEFF can be restricted to that of Byte Order Mark. In that case, any U+FEFF occurring in the middle of the file can be ignored, or treated as an error.”

  – HTML 4.01
    • Defines four whitespace characters and explicitly leaves handling other characters up to implementer

Root Causes: Specifications
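A small Python sketch of the gap this slide describes: it flags separator and format characters that fall outside the four white space characters HTML 4.01 defines (space, tab, form feed, zero-width space). The function name and the choice of Unicode categories are assumptions made for illustration.

```python
import unicodedata

# The four characters HTML 4.01 defines as white space.
HTML401_WHITESPACE = {"\u0020", "\u0009", "\u000C", "\u200B"}
LINE_BREAKS = {"\r", "\n"}

def unexpected_invisibles(text: str):
    """Yield (index, code point) for separator or format characters that
    HTML 4.01 does not define as white space."""
    for i, ch in enumerate(text):
        if ch in HTML401_WHITESPACE or ch in LINE_BREAKS:
            continue
        if unicodedata.category(ch) in {"Zs", "Zl", "Zp", "Cf"}:
            yield i, f"U+{ord(ch):04X}"

sample = "<a\u180Ehref=x>\uFEFFtext</a>"
print(list(unexpected_invisibles(sample)))
```

Characters flagged here are the ones an implementer has to decide on alone, which is where parsers and filters tend to disagree.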
