The Oracle 10g Data Pump API Speeds Up the ETL Process : Page 4

How can you keep slow extract, transform, and load operations from bogging down your production instance? Oracle 10g's Data Pump API offers a great solution.




Large Amounts of Data

If you're transferring a large amount of data and your source and target databases have limited bandwidth between them, then it might be faster to export the tablespaces on the source machine, move the file over, and import the tablespaces on your target machine. You can do this either with the dbms_datapump API or with expdp/impdp.
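With the command-line utilities, a transportable-tablespace export and import might look something like the following sketch (the tablespace, user, dump file, and datafile names here are illustrative, and the tablespace must be read-only before the export):

ALTER TABLESPACE users READ ONLY;

expdp bulkload/password DIRECTORY=dpump_dir DUMPFILE=ts_users.dmp TRANSPORT_TABLESPACES=users TRANSPORT_FULL_CHECK=Y

After copying ts_users.dmp and the tablespace's datafiles to the target machine, plug the tablespace in:

impdp bulkload/password DIRECTORY=dpump_dir DUMPFILE=ts_users.dmp TRANSPORT_DATAFILES='/oracle/dpumpdir/users01.dbf'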

You'll need to define a directory object on the source and target machines to hold the dump file sets:

CREATE DIRECTORY dpump_dir AS '/oracle/dpumpdir';

If you're using ASM, you'll need to create the directory as a reference to the appropriate disk group:
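For example, assuming a disk group named +DGROUP1 (substitute your own disk group and path):

CREATE DIRECTORY dpump_dir AS '+DGROUP1/dpumpdir';

Note that in 10g, Data Pump can write its dump files to ASM but not its log files, so you'll also want a directory object (dpump_log below) that points to a regular file system for the log file.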


The user executing the Data Pump commands needs to have read and write access to these directories:

grant read, write on directory dpump_dir to bulkload;
grant read, write on directory dpump_log to bulkload;

If your source and target databases are on the same machine, the fastest approach may be to point the Data Pump directory object in both instances at the same operating-system directory: run Data Pump Export on the source instance to create the tablespace dump file set, leave the files in place, and run Data Pump Import on the target instance against that same dump file set.


Oracle 10g provides another new way to copy a tablespace from one database to another. It isn't part of the Data Pump API, but it's worth a mention. DBMS_STREAMS_TABLESPACE_ADM.PULL_SIMPLE_TABLESPACE will copy a single-datafile tablespace from one database to another over a database link. Like the preceding Data Pump examples, it can even do this cross-platform.
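A call might look something like this sketch; the tablespace, database link, and directory names are illustrative, and you should verify the exact parameter list against the PL/SQL Packages and Types Reference for your release:

BEGIN
  DBMS_STREAMS_TABLESPACE_ADM.PULL_SIMPLE_TABLESPACE(
    tablespace_name  => 'USERS',
    database_link    => 'source_db',
    directory_object => 'DPUMP_DIR');
END;
/

Remember the restriction in the procedure's name: the tablespace being pulled must consist of a single datafile.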

Resolve a Common Data Warehousing Issue

You can use the solution presented in this article to handle a common dilemma in data warehousing environments. Suppose you're building a data mart that combines information from several slow data warehouses across different segments of the enterprise to support top-level management decisions. You want the data mart to be as fast, and as highly available, as possible. Now you know how to implement the best solution using the DBMS_DATAPUMP API.

Natalka Roshak is a database administrator, analyst, and architect based in Ontario, Canada. Find more of her articles and scripts at toolkit.rdbms-insight.com. Contact Natalka at www.rdbms-insight.com/contact.php.