Using XMLIndex and Binary XML for Motorola BIS
Aris Prassinos, Distinguished Member of Technical Staff, Motorola
Asha Tarachandani, Senior Member of Technical Staff, Oracle Inc.
The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. The development, release, and timing of any features or functionality described for Oracle’s products remain at the sole discretion of Oracle.
Introduction
• Motorola Printrak: Biometrics Identification Solution
• Oracle XMLIndex
• Binary XML
• Oracle XMLIndex team: Thomas Baby, Sivasankaran Chandrasekaran, Asha Tarachandani, Anh-Tuan Tran
Using XMLIndex and Binary XML for Motorola BIS
• Motorola Printrak
• BIS Indexing
  • Requirements
  • 10g
  • 11g
• Oracle XMLIndex
  • Details
  • Maintenance overhead – Asynchronous and Path-subsetting
  • BIS Usage
  • Features
• Oracle Binary XML
• Conclusion
Motorola Printrak Biometrics Identification Solution
• A comprehensive solution for investigation, identification and verification in criminal and civil markets
  • criminal investigation
  • applicant background checks
  • biometric visa and passport
  • border patrol and security
  • social services fraud detection
• Provides full biometric integration
  • fingerprints, palmprints, facial images, irises, signatures, descriptive data and documents
BIS Application Characteristics
• OLTP
  • Read intensive, frequent inserts, occasional updates / deletes
• Structure of data different in each deployment
  • Each customer stores different demographics and arrest information as well as custom defined elements
  • Schema may also change over time within the same system
• Designed to be deployed without extensive custom configuration and to operate without onsite DBA
BIS Indexing Needs
• Schema-less XML chosen as a storage format for maximum flexibility
• Several million XML documents stored per table
• Several thousand documents inserted / updated daily
• Size of XML documents ranges from 1K to 20K
• Number of tags per XML document ranges from 10 to 100
• Documents may contain collection elements
• XML documents contain 5 – 20 searchable tags
• XML data must be indexed without prior knowledge of the paths that will be queried; if the paths are known in advance, that knowledge can be used to optimize the indexing
BIS 10g Indexing Approach
• Functional indexes
  • Allow range queries, arithmetic, aggregation
  • Fastest possible query performance if the XPath expressions used in queries are known in advance
  • Not possible to automatically index all paths
  • Index maintenance cost climbs as the number of indexed nodes increases
  • Cannot index collection elements
BIS 10g Indexing Approach (contd.)
• Oracle Text Index
  • No prior knowledge of queries necessary
  • Index creation and maintenance overhead is minimized by:
    • Selective exclusion / inclusion of tags or attributes as well as bypassing entire rows
    • Asynchronous index maintenance
  • Satisfactory query performance
  • Does not allow range queries, arithmetic, aggregation
  • Prefix indexing necessary to avoid ‘query too complex’ when doing wildcard queries on short strings
  • Periodic optimization needed due to fragmentation
BIS 11g Indexing Approach – Oracle XMLIndex
• Meets all BIS Indexing needs
• Index specialized for XML Data and Queries
• Resolves querying limitations of Text and Functional indexes without sacrificing performance
• Allows range queries, arithmetic, aggregation
• Allows wildcard queries on short strings
• Can index collection elements
• Can extract fragments
Oracle XMLIndex
• Universal indexing solution for XML Data and Queries
• Provides improved query performance
• Schema-less and schema-based data
• Binary-XML and CLOB storage
• SQLX and XQuery Data Model
• Wide range of XPath expressions
• Index creation, maintenance and queries can run in parallel
• Ad hoc queries can be supported
Oracle XMLIndex – Details
<descriptors>
  <d_child1>…</d_child1>
  <d_child2>…</d_child2>
  <addr>
    <a_child1>…</a_child1>
    <a_child2>…</a_child2>
    <a_child3>…</a_child3>
    <a_child4>…</a_child4>
    <zip>65487</zip>
  </addr>
</descriptors>
• Row ID: Row ID of Base XML table
• Path ID: Token for each path
• Order Key: Position of this node in the XML doc
• Locator: Offsets into Base XML table column, additional info
• Value: If any

Row ID | Path ID                           | Order Key | Locator              | Value
111111 | Path ID for /descriptors          | 1         | Start 0, End 2050    |
111111 | Path ID for /descriptors/addr     | 1.3       | Start 1047, End 1065 |
111111 | Path ID for /descriptors/addr/zip | 1.3.5     |                      | 65487
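The row layout above can be imitated with a small Python sketch. This is only a conceptual model, not Oracle's implementation: the function name shred is hypothetical, the full path string stands in for the Path ID token, the Locator column is left out, and the "…" placeholders in the sample document are replaced with made-up text so that it parses.

```python
import xml.etree.ElementTree as ET

def shred(xml_text, row_id):
    """Return path-table-like rows: (row_id, path, order_key, value)."""
    rows = []

    def walk(elem, path, order_key):
        # Only leaf elements carry a value in this toy model.
        text = (elem.text or "").strip()
        value = None if len(elem) > 0 else (text or None)
        rows.append((row_id, path, order_key, value))
        # The order key of a child is the parent's key plus its position.
        for position, child in enumerate(elem, start=1):
            walk(child, f"{path}/{child.tag}", f"{order_key}.{position}")

    root = ET.fromstring(xml_text)
    walk(root, "/" + root.tag, "1")
    return rows

# Sample document shaped like the one on the slide (leaf text is invented).
SAMPLE = """<descriptors>
  <d_child1>x</d_child1>
  <d_child2>y</d_child2>
  <addr>
    <a_child1>a</a_child1>
    <a_child2>b</a_child2>
    <a_child3>c</a_child3>
    <a_child4>d</a_child4>
    <zip>65487</zip>
  </addr>
</descriptors>"""

rows = shred(SAMPLE, row_id=111111)
for row in rows:
    print(row)
```

Because addr is the third child of descriptors and zip is the fifth child of addr, the sketch assigns them the order keys 1.3 and 1.3.5, matching the rows shown on the slide.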
Oracle XMLIndex – Asynchronous Maintenance
• Cost of immediate index maintenance is avoided
• Improves DML performance
• Sync performance is optimized by batching up rows to be indexed
• Index can be synced automatically or manually
• Dictionary Views are available to check the current state of the index
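The idea of deferring index maintenance and applying it in batches can be sketched in Python as follows. This is a toy model under stated assumptions: the class AsyncIndex and its methods are hypothetical names, the pending list plays the role of a pending table, and nothing here reflects how Oracle actually stores or applies pending rows.

```python
class AsyncIndex:
    """Toy index whose maintenance is deferred to an explicit sync step."""

    def __init__(self):
        self.entries = {}   # value -> set of row ids (the "index")
        self.pending = []   # (row_id, value) pairs awaiting sync

    def insert(self, row_id, value):
        # DML only appends to the pending list; the index is not touched.
        self.pending.append((row_id, value))

    def sync(self):
        # Apply all pending rows in one batch, then clear the queue.
        batch, self.pending = self.pending, []
        for row_id, value in batch:
            self.entries.setdefault(value, set()).add(row_id)
        return len(batch)

    def lookup(self, value):
        # In this toy model only synced rows are visible through the index.
        return sorted(self.entries.get(value, set()))


idx = AsyncIndex()
idx.insert(1, "C01")
idx.insert(2, "C01")
before = idx.lookup("C01")   # nothing synced yet
synced = idx.sync()          # both rows applied in a single batch
after = idx.lookup("C01")
print(before, synced, after)
```

In the sketch, sync() would be called either on a schedule or on demand, which corresponds to the automatic and manual options listed above. How queries treat rows that have not yet been synced is not covered by the slides and is simply modeled here as "not visible".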
Oracle XMLIndex – Path Subsetting
• Specify either
  • Paths that will be used in common queries (to be indexed), or
  • Paths that will rarely be used (to be left out of the index)
• Can change the specified paths later
• Better DDL, DML performance
• Reduces size of primary and secondary indexes, giving less storage overhead
• Transparent to queries
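A minimal Python sketch of path subsetting, assuming a simplified pattern syntax in which //Name matches any path ending in /Name. The functions matches and subset, and all element names other than ArrestCode and Sex, are hypothetical and chosen only for illustration.

```python
def matches(path, pattern):
    """True if an absolute path matches a simple pattern.

    '//Name' matches any path ending in '/Name'; anything else
    must match the path exactly.
    """
    if pattern.startswith("//"):
        return path.endswith("/" + pattern[2:])
    return path == pattern


def subset(rows, include):
    """Keep only the (path, value) rows whose path is in the subset."""
    return [row for row in rows if any(matches(row[0], p) for p in include)]


# Candidate index rows for one document (paths and values are invented).
rows = [
    ("/Record/Arrest/ArrestCode", "C01"),
    ("/Record/Person/Sex", "M"),
    ("/Record/Person/Name", "Doe"),
    ("/Record/Notes", "free text"),
]

kept = subset(rows, ["//ArrestCode", "//Sex"])
print(kept)
```

Only two of the four candidate rows are kept, which is the source of the smaller index and cheaper DML listed above; paths outside the subset would have to be evaluated without the index.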
BIS 11g Approach – XMLIndex Usage
• create index ads_xml_index on ads_element(desc)
    indextype is XDB.XMLINDEX
    parameters('PATH TABLE ADS_PATH_TABLE
                PENDING TABLE ADS_PEND_TABLE
                ASYNC (SYNC EVERY
                  "FREQ=MINUTELY; INTERVAL=2")
                PATHS(//ArrestCode
                      //Sex
                      //Classification)');
• select …
    where extractValue(desc,'//ArrestCode') = 'C01'
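To illustrate why a predicate such as //ArrestCode = 'C01' can be answered from an index keyed on path and value, here is a conceptual Python sketch. It is not Oracle's implementation: the class PathValueIndex, its methods, and the /Record/Arrest prefix are hypothetical, and values are compared as plain strings.

```python
import bisect
from collections import defaultdict


class PathValueIndex:
    """Toy secondary index: path -> sorted list of (value, row_id)."""

    def __init__(self):
        self._by_path = defaultdict(list)

    def add(self, row_id, path, value):
        # Keep each path's entries sorted so range scans are possible.
        bisect.insort(self._by_path[path], (value, row_id))

    def _paths_for(self, pattern):
        # '//Name' matches any indexed path ending in '/Name'.
        if pattern.startswith("//"):
            suffix = "/" + pattern[2:]
            return [p for p in self._by_path if p.endswith(suffix)]
        return [p for p in self._by_path if p == pattern]

    def between(self, pattern, low, high):
        """Row ids having a value in [low, high] at a matching path."""
        hits = set()
        for path in self._paths_for(pattern):
            entries = self._by_path[path]
            start = bisect.bisect_left(entries, (low,))
            for value, row_id in entries[start:]:
                if value > high:
                    break
                hits.add(row_id)
        return sorted(hits)

    def equals(self, pattern, value):
        return self.between(pattern, value, value)


idx = PathValueIndex()
idx.add(1, "/Record/Arrest/ArrestCode", "C01")
idx.add(2, "/Record/Arrest/ArrestCode", "C07")
idx.add(3, "/Record/Arrest/ArrestCode", "D02")
idx.add(3, "/Record/Arrest/ArrestCode", "C01")  # row 3 holds a collection

exact = idx.equals("//ArrestCode", "C01")
ranged = idx.between("//ArrestCode", "C00", "C99")
print(exact, ranged)
```

Row 3 contributes two ArrestCode entries, so a document with a repeating element is still found by either of its values, and the sorted entries make the range lookup a short scan rather than a pass over every document.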
BIS 11g Approach – XMLIndex Usage
• XML-aware index performs well for XML Data
  • For example, queries on collection elements can make use of the index.
• Path subsetting and Asynchronous maintenance alleviate Index maintenance overhead
• Querying XML Data
  • Allows range queries, arithmetic, aggregation
  • Allows wildcard queries on short strings
• No periodic defragmentation necessary as was the case with the Text index
• Can be combined with Functional Indexes on selected paths when maximum query performance is required
Oracle XMLIndex – features
• Asynchronous index maintenance
• Path subsetting
• XQuery support
• Support for indexing CLOBs embedded within O-R storage
• XML-DB repository
• Partitioned index
• Parallel index creation, maintenance and query
• Binary XML support
Oracle Binary XML
• Encoding format intended for use in all tiers of the Oracle stack
  • Oracle XML DB
  • Oracle iAS / XDK Java
• Improved storage, retrieval
  • Parsing, validation and conversion costs are reduced or eliminated
  • Smaller footprint results in less disk IO
  • Reduced CPU cost for loading XML info-set into memory
• Query Performance
  • Improved fragment extraction using XMLIndex
  • Streaming single-pass evaluation of many XPaths when not using XMLIndex
Oracle Binary XML (contd.)
• Support for schema-based and schema-less documents
• Exploits XML Schema information about data-types and structure
• Preserves Infoset or Data Model fidelity
Conclusion
• XMLIndex is the complete indexing solution for XML
• Universal framework allows expanding to all XML DB areas (Binary XML, O-R storage, Repository, XQuery, etc.) and Oracle DB areas (partitioning, parallelism, relational views, etc.)
• Motorola Biometrics plans to use XMLIndex for query performance with Asynchronous index maintenance and Path Subsetting
The preceding is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. The development, release, and timing of any features or functionality described for Oracle’s products remain at the sole discretion of Oracle.
Other XML talks at Oracle Open World, 2006
• State of California Legislative Data Center: 10/24/2006, 2:30 PM - 3:30 PM, Hilton Hotel, Continental Parlor 3
• On-Demand XML Information Solutions: 10/24/2006, 1:45 PM - 2:45 PM, Moscone West, 3004 West
• Developing XML Applications Using Oracle Fusion Middleware: 10/26/2006, 8:00 AM - 9:00 AM, Moscone South, 304 South
• Demo grounds: Every day