Oracle Open World 2006
MOTOROLA and the Stylized M Logo are registered in the US Patent & Trademark Office. All other product or service names are the property of their respective owners. © Motorola, Inc. 2005
Data Pump Practical Lessons, Tips and Techniques
Aris Prassinos
Distinguished Member of Technical Staff
Motorola
Slide 2
Motorola Printrak Biometrics Identification Solution
• Motorola Printrak BIS provides full biometric integration with inclusion of fingerprints, palmprints, facial images, irises, signatures, descriptive data and documents to deliver a comprehensive solution for investigation, identification and verification in both the criminal and civil markets
• Current applications include criminal investigation, applicant background checks, biometric visa and passport, border patrol and security, and social services fraud detection
• Deployed in a variety of customer environments and configurations
Slide 3
Application Characteristics
• OLTP
• Large amount of LOB data
  • BLOB and XML stored as CLOB
  • Deployed systems range from 500 GB to several TB
• Small number of tables, indexes, and packages
  • Hash partitioning used for large systems
  • Oracle Text indexes used for XML data
Slide 4
Data Movement Needs
• Initial loading of system with back data
  • 500 GB – 1 TB
  • 5 – 10 million rows per table
• Residual loading of back data to production system
• Selective archiving / restore
• Moving test data
Slide 5
Data Movement Methods
• Prior to Oracle 10g, Transportable Tablespace was our method of choice for medium to large transfers because of the performance problems of original Export and Import
• With Data Pump available in 10g, we use Transportable Tablespace only for very large transfers, and only when the performance gains justify its additional complexity and limitations
Slide 6
Parameters for Large Imports
• DISK_ASYNCH_IO=TRUE
• NOARCHIVELOG
  • Very significant difference over ARCHIVELOG
• DB_BLOCK_CHECKING=FALSE, DB_BLOCK_CHECKSUM=FALSE
  • Very little noticeable difference in our environment
• Disable Block Change Tracking
  • Very little noticeable difference in our environment
• Size data files properly before a large import
  • Performance degradation observed with AUTOEXTEND
Slide 7
Using Parallelism
• PARALLEL is only available with Enterprise Edition
  • With Standard Edition, multiple import sessions can be used with fine-grained object selection to load different tables, partitions or indexes in parallel
• More information on parallelism:
  • http://www.oracle.com/technology/products/database/utilities/pdf/parallel_cap_datapump.pdf
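One way to apply the Standard Edition approach is to split the tables into one group per import session so that each session carries a similar amount of data. The Python sketch below does this with a simple largest-first greedy assignment. The table names, sizes and session count are hypothetical, and the balancing heuristic is only an illustration, not something the slides prescribe.

```python
import heapq

def split_tables(table_sizes, sessions):
    """Assign tables to import sessions, largest first, always giving the
    next table to the session that has the least data so far."""
    heap = [(0, i) for i in range(sessions)]  # (size assigned, session index)
    groups = [[] for _ in range(sessions)]
    # Sort by size descending, then by name so the result is deterministic.
    for name, size in sorted(table_sizes.items(), key=lambda kv: (-kv[1], kv[0])):
        load, i = heapq.heappop(heap)
        groups[i].append(name)
        heapq.heappush(heap, (load + size, i))
    return groups

# Hypothetical table names with sizes in GB.
sizes = {"APP.PERSON": 40, "APP.FINGERPRINT": 300,
         "APP.PALMPRINT": 200, "APP.FACE": 120}
for n, tables in enumerate(split_tables(sizes, 2), start=1):
    print(f"session {n}: TABLES={','.join(tables)}")
```

Each printed TABLES list would go into the parameter file of a separate impdp session.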
Slide 8
Two Levels of Parallelism
• Parallelism implemented by Data Pump workers and server PX processes
  • Both workers and PX processes count against the degree of parallelism
• Workers implement inter-segment parallelism using the DIRECT PATH or EXTERNAL TABLE access methods for each segment (segment refers to a table partition or an entire unpartitioned table)
• PX processes implement intra-segment parallelism using the EXTERNAL TABLE access method for each segment
  • Due to bug 5212908 this is not working properly in versions up to 10.2.0.3. Fixed in 11g. A patch will be available for 10g in the future.
  • Intra-segment parallelism is not possible for LOBs
• In a RAC, workers run only on the originating instance but PX processes may run on multiple instances
  • Put dump files on shared storage if planning to use parallelism
Slide 9
Access Methods
• A combination of workers and PX processes (and hence access methods) is automatically chosen by Data Pump
  • Depending on the number and size of segments as well as the timing of the job, some segments will be loaded with inter-segment parallelism and some with intra-segment parallelism
• Automatic access method choice can be overridden by the hidden parameter access_method=direct_path, which effectively disables intra-segment parallelism (hence no PX processes will be used)
  • If DIRECT PATH can’t be used for a segment it will not get imported!
• Two equally sized segments will be loaded faster with two parallel direct path streams than with two parallel external table streams for the first followed by two parallel external table streams for the second
  • We have been able to get 10%-20% improvement for datasets containing equally sized segments by forcing direct path with PARALLEL
Slide 10
Parallel Index Creation
• Indexes are created one at a time by a single worker using multiple parallel PX processes
  • Standard rules for parallel index creation apply
  • Depending on number and type of indexes this can take as much time as, or longer than, the data import itself
• For best results the degrees of parallelism for data loading and index creation can be different
  • Even if I/O bandwidth is not sufficient for parallel data import, indexes may still be able to benefit from parallel creation
Slide 11
Issues with LOBs
• Some bugs affect the performance of importing LOBs in versions up to 10.2.0.3
  • Bug 5555463: Import of LOBs via external table access is very slow (about 30 times slower than direct path in our environment). This makes any operation that requires external table access unusable for large tables containing LOBs. Fixed in 11g for LOBs smaller than 256K. A patch will be available for 10g in the future. For larger LOBs, the issue remains.
  • Bug 5212908: When PARALLEL>1, intra-segment parallelism is attempted by Data Pump for tables with LOBs, even though this is not supported for LOBs. This leads to external table access, which is in turn affected by the previous bug 5555463. Fixed in 11g. A patch will be available for 10g in the future. Alternatively on 10g, access_method=direct_path may be forced, if otherwise applicable.
Slide 12
Parallelism Recommendations
• The standard recommendations:
  • Degree of parallelism approximately 2x number of CPUs
  • Degree of parallelism during import not much larger than number of dump files
  • Use multiple dumpfile templates when exporting (e.g. exp1_%U, exp2_%U etc)
  • Spread I/O
• Monitor the I/O to ensure system can handle it
  • Performance substantially degrades if you overshoot degree of parallelism, especially during import
  • Degree of parallelism can be adjusted on the fly
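As a rough illustration of these rules of thumb, the Python sketch below derives a starting degree of parallelism from the CPU count and builds one dump file template per directory object so that the files can be spread over different devices. The directory object names, the schema name and the CPU count are hypothetical, and the 2x factor is only the approximate starting point quoted above.

```python
def export_parameters(cpu_count, directory_objects, schema):
    """Return expdp parameter-file lines following the rules of thumb above."""
    degree = 2 * cpu_count  # starting point only: about 2x the number of CPUs
    # One dump file template per directory object, to spread the I/O.
    templates = [
        f"{directory}:exp{n}_%U.dmp"
        for n, directory in enumerate(directory_objects, start=1)
    ]
    return [
        f"SCHEMAS={schema}",
        f"DUMPFILE={','.join(templates)}",
        f"PARALLEL={degree}",
    ]

# Hypothetical values: 4 CPUs and two directory objects on separate devices.
print("\n".join(export_parameters(4, ["DPUMP_DIR1", "DPUMP_DIR2"], "APP")))
```

The computed degree is a first guess. The I/O still has to be monitored, and the degree can be changed while the job is running, for example with the PARALLEL command in Data Pump's interactive mode.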
Slide 13
Network Import
• Some performance gains over the equivalent export-transfer-import when used with non-LOB data if the network can handle the traffic
• Performance degradation with LOB data stored out-of-line
  • Multiple LOB locators returned in a single fetch; however the actual LOB data is individually fetched, resulting in multiple network round trips
  • In our environment, performance degraded by a factor of 3 over the equivalent export-transfer-import
• Only inter-segment parallelism is supported
• Should be used for convenience rather than performance
Slide 14
Different Partitioning Schemes
• If source and target systems use different partitioning schemes
  1. Create empty tables on target system with desired partitioning scheme (edit sqlfile created by Data Pump)
  2. Import with table_exists_action=append
    • APPEND will use external table access method
    • Cannot force access_method=direct_path in this case
    • Be aware of its implications if importing tables with LOBs
• Oracle Database 11g Data Pump has greatly enhanced partition handling capabilities
Slide 15
Fine Grained Object Selection
• Significantly enhanced over original Export and Import
  • Any type of object can be filtered as well as subsets of objects within an object type
  • For example, only include selected packages, exclude indexes that do not apply to the target system etc
• QUERY can also be specified during import
  • Allows filtering without using a staging area
  • A different QUERY can be specified for each table
  • Simplifies partial export / import of multiple tables with referential integrity constraints
• Allows DDL only export / import for some tables and full for others
• Put filters in a parameter file rather than command line
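To make the filter syntax concrete, the Python sketch below renders EXCLUDE and per-table QUERY lines for a parameter file. The table names, the index name pattern and the key range are hypothetical. Both tables use the same key range so that parent and child rows stay consistent, which is one simple way to do a partial transfer of tables linked by referential integrity.

```python
def filter_parameters(exclude, queries):
    """Render EXCLUDE and per-table QUERY lines for a Data Pump parameter file."""
    lines = [
        f'EXCLUDE={object_type}:"{name_clause}"'
        for object_type, name_clause in exclude
    ]
    lines += [f'QUERY={table}:"{where}"' for table, where in queries.items()]
    return lines

# Hypothetical filters: skip some indexes, keep one key range in both tables.
exclude = [("INDEX", "LIKE 'LEGACY%'")]
queries = {
    "APP.PERSON": "WHERE person_id <= 1000000",
    "APP.FINGERPRINT": "WHERE person_id <= 1000000",
}
print("\n".join(filter_parameters(exclude, queries)))
```

Written to a file and passed with PARFILE=, these lines avoid the shell escaping that the embedded quotes would need on the command line.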
Slide 16
Fine Grained Selection Caveats
• Excluding/including an object will also exclude/include its dependent objects
  • Check database_export_objects, schema_export_objects, table_export_objects views for the list of dependencies
  • Statistics imported by default – make sure they are accurate
  • Even if indexes are excluded, the ones needed to support PK and UK constraints will be imported
• During import, if an object type that does not exist in the dump file is excluded / included, the import will be aborted with ORA-39168: Object path was not found
  • This behavior is not observed during export
• QUERY will use external table access method
  • Cannot force access_method=direct_path in this case
  • Be aware of its implications if importing tables with LOBs
Slide 17
Consistent Exports
• Flashback Query is needed for consistent exports
  • By default, Data Pump only guarantees consistency for single tables
• LOBs should use RETENTION rather than PCTVERSION
  • RETENTION obeys UNDO_RETENTION init parameter but PCTVERSION doesn’t
• Advisable to use FLASHBACK_SCN rather than FLASHBACK_TIME
  • Be aware of possible ORA-1466 due to time-to-SCN mapping granularity when using FLASHBACK_TIME
Slide 18
Very Large Consistent Exports
• For very lengthy consistent exports, it may not be realistic to set a sufficiently high UNDO_RETENTION, especially in an OLTP system
• Transportable Tablespace is a good choice in this case
  • If the system cannot be set to read only while the data files are being copied, transportable tablespaces can also be created directly from backups using RMAN TRANSPORT TABLESPACE
Slide 19
Archiving Data
• External tables created with ORACLE_DATAPUMP driver are a good choice for data that needs to be archived and possibly later reloaded
  • Multiple dump files generated at different times can be combined transparently during reload
  • Complex logic (e.g. merge, error handling) can be performed during reload
  • Data can be transformed (e.g. denormalized) during unload
  • Archived data can be accessed without even reloading it back
  • Dump file format is not compatible with Data Pump expdp / impdp
  • DDL is not contained in the dump file
Slide 20
Minimizing Redo Generation
• If database is in ARCHIVELOG mode
  • Prior to import, put tablespace(s) in NOLOGGING mode
  OR
  1. Precreate the tables with content=metadata_only exclude=index,constraint,ref_constraint
  2. Put tables in NOLOGGING mode
  3. Import with table_exists_action=append (Be aware of external table access implications if importing LOBs)
  4. Create indexes with NOLOGGING (edit sqlfile created by Data Pump)
  • Put objects back to LOGGING mode as applicable
• Be aware of the backup / recovery / standby implications of NOLOGGING operations before doing any of the above
Slide 21
Tracing
• TRACE=480300 parameter for expdp / impdp
  • Extended log information, but not so verbose that it will fill up the disk
  • Check Metalink Note 286496.1 for more info
• Use it along with the standard log file
  • Access method used per object. Some example messages from the trace files:
    In dm trc file:
      Direct path selected for next work item
      External table selected for next work item
    In dw trc file:
      load method of: 1
      load method of: 2
  • Detailed timing
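Because the access method matters so much for LOB performance, it can help to tally which method was chosen across a whole job. The Python sketch below counts the two dm trace messages quoted above. The line prefixes in the sample are hypothetical, since real trace lines carry additional text around the message, so the code matches on substrings only.

```python
from collections import Counter

# Messages quoted from the dm trace file, mapped to a short label.
MARKERS = {
    "Direct path selected for next work item": "direct_path",
    "External table selected for next work item": "external_table",
}

def tally_access_methods(lines):
    """Count how often each access method was selected in dm trace lines."""
    counts = Counter()
    for line in lines:
        for marker, method in MARKERS.items():
            if marker in line:
                counts[method] += 1
    return counts

# Hypothetical trace excerpt; the prefixes are made up for illustration.
sample = [
    "job1: Direct path selected for next work item",
    "job1: External table selected for next work item",
    "job1: Direct path selected for next work item",
    "job1: some unrelated trace message",
]
print(dict(tally_access_methods(sample)))
```

An unexpectedly high external table count for tables containing LOBs is a sign that one of the LOB-related issues described earlier may apply.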
Slide 22
Conclusion
• Data Pump offers significant performance and functionality improvements over original Export / Import
• Spend more time tuning your I/O subsystem than Data Pump itself
• If planning to use parallelism make sure it is applicable
• Special consideration necessary when dealing with LOBs
• Very important to understand the differences between Direct Path and External Table access methods
Slide 23
Questions - Answers