

The Oracle 10g Data Pump API Speeds Up the ETL Process : Page 3

How can you keep slow extract, transform, and load operations from bogging down your production instance? Oracle 10g's Data Pump API offers a great solution.





Putting It Together

Now you're ready to put Steps 1, 2, and 3 together and kick off the job. See Listing 1 for the code.
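In outline, the combined job has roughly this shape (a minimal sketch, not the article's Listing 1; the job name, database link, and table name below are placeholders):

   DECLARE
      my_handle NUMBER;
   BEGIN
      -- Step 1: open a network-mode table import over a database link
      my_handle := dbms_datapump.open (
         operation   => 'IMPORT',
         job_mode    => 'TABLE',
         remote_link => 'SOURCE_DB',     -- placeholder database link
         job_name    => 'ETL_REFRESH');  -- placeholder job name

      -- Step 2: restrict the job to the table(s) you want
      dbms_datapump.metadata_filter (
         handle      => my_handle,
         name        => 'NAME_EXPR',
         value       => '= ''MY_TABLE''',
         object_type => 'TABLE');

      -- Step 3: kick off the job, then detach so it runs in the background
      dbms_datapump.start_job (handle => my_handle);
      dbms_datapump.detach (handle => my_handle);
   END;
   /

Detaching after start_job leaves the job running server-side, which is what lets you monitor it from another session as described below.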

You can keep an eye on the job while it's running by querying USER_DATAPUMP_JOBS. You can also see details about the sessions attached to a job in the view DBA_DATAPUMP_SESSIONS. As the job progresses, its state should change from EXECUTING to COMPLETING, and the row disappears once the job finishes.
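A quick way to check from SQL*Plus (the job name here is a placeholder; the DBA_ views require DBA privileges):

   -- Data Pump jobs owned by the current user
   SELECT job_name, operation, job_mode, state, attached_sessions
     FROM user_datapump_jobs;

   -- sessions attached to any Data Pump job
   SELECT j.job_name, s.saddr
     FROM dba_datapump_jobs j
     JOIN dba_datapump_sessions s ON s.job_name = j.job_name;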

If you query the master table while the job is running, you'll note that it changes as the job progresses, keeping track of job progress in case the job is stopped.
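The master table is created in your schema under the same name as the job, so you can inspect it directly. A sketch, assuming the placeholder job name ETL_REFRESH (the column names here reflect the 10g master-table layout and are an assumption, not documented API):

   SELECT object_name, object_type, processing_state, processing_status
     FROM etl_refresh
    WHERE process_order > 0;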

The next section looks at more detailed job monitoring.

Monitoring a Running Job
One of the big advantages of DBMS_DATAPUMP is its sophisticated monitoring support. You can attach to a running job and retrieve detailed status information. All of this information is accessed through the DBMS_DATAPUMP.GET_STATUS procedure, covered in Sidebar 3: About GET_STATUS.

The information returned by GET_STATUS provides running updates on the job status in the code in Listing 2.
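As a rough sketch of how such a monitoring loop is wired together (the mask constants, ku$_Status type, and ATTACH/GET_STATUS calls are part of DBMS_DATAPUMP; the loop structure and job name here are assumptions, not the article's Listing 2):

   DECLARE
      my_handle NUMBER := dbms_datapump.attach (job_name => 'ETL_REFRESH');
      job_state VARCHAR2(30);
      status    ku$_Status;
   BEGIN
      LOOP
         -- ask for overall job status plus work-in-progress messages
         dbms_datapump.get_status (
            handle    => my_handle,
            mask      => dbms_datapump.ku$_status_job_status
                       + dbms_datapump.ku$_status_wip,
            timeout   => 10,             -- wait up to 10 seconds per call
            job_state => job_state,
            status    => status);
         dbms_output.put_line ('Job state: ' || job_state);
         EXIT WHEN job_state IN ('COMPLETED', 'STOPPED');
      END LOOP;
      dbms_datapump.detach (handle => my_handle);
   END;
   /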

Fetching a Lot of Tables
Data Pump is a bit of overkill if you're fetching only one table, of course. Indeed, one of the reasons to use Data Pump is that it makes specifying a set of tables so easy. Suppose your load process needs to refresh all the tables in your schema whose names match ALL_BILL%. You can simply supply that expression to dbms_datapump.metadata_filter:

   dbms_datapump.metadata_filter (
      handle      => my_handle,
      name        => 'NAME_EXPR',
      value       => 'LIKE ''ALL_BILL%''',
      object_type => 'TABLE');

Or you can fetch all the tables in a given schema or a given tablespace. For example, to refresh all the tables in the DATA_TO_LOAD tablespace on the source instance:

   dbms_datapump.metadata_filter (
      handle      => my_handle,
      name        => 'TABLESPACE_EXPR',
      value       => '= ''DATA_TO_LOAD''',
      object_type => 'TABLE');

See Listing 3 and Listing 4 for the full code for these examples.

Note that the last example is pulling an entire tablespace. However, unlike transportable tablespaces (TTS), you don't need to put the source tablespace in read-only mode, because the tables are being pulled one at a time. DBMS_DATAPUMP also supports transportable tablespace export/import (see Chapter 27 of the Oracle Database PL/SQL Packages and Types Reference for details—login required).

You can also do a schema-mode import. In this case, you would call dbms_datapump.open with job_mode => 'SCHEMA', and use dbms_datapump.metadata_filter with the SCHEMA_EXPR filter to specify the schema names you want to load.
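A minimal sketch of a schema-mode network import (the database link, schema name, and job name are placeholders, not values from the article):

   DECLARE
      my_handle NUMBER;
   BEGIN
      -- open a schema-mode import over a database link
      my_handle := dbms_datapump.open (
         operation   => 'IMPORT',
         job_mode    => 'SCHEMA',
         remote_link => 'SOURCE_DB',        -- placeholder database link
         job_name    => 'SCHEMA_REFRESH');  -- placeholder job name

      -- pick the schema(s) to load
      dbms_datapump.metadata_filter (
         handle => my_handle,
         name   => 'SCHEMA_EXPR',
         value  => 'IN (''APP_OWNER'')');   -- placeholder schema name

      dbms_datapump.start_job (handle => my_handle);
      dbms_datapump.detach (handle => my_handle);
   END;
   /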
