A common misconception among Oracle developers is that they can run stored procedures in parallel only from outside the database. For example, many use Java-based code that spawns multiple threads and runs each procedure on a separate thread (using the java.sql.CallableStatement object), or UNIX shell scripts scheduled with cron, to achieve this. But there's no need to reinvent the wheel this way: you can execute stored procedures in parallel inside an Oracle database using PL/SQL code. This 10-Minute Solution describes how, and it explains which types of tasks are suitable for this technique.
How do I run stored procedures in parallel from inside an Oracle database?
Use one of Oracle’s prepackaged APIs, the DBMS_JOB package, to submit multiple jobs to a job queue and then run them in parallel.
Using Oracle Native APIs: DBMS_JOB Package
Oracle databases provide a rich set of prepackaged APIs in the form of Supplied PL/SQL Packages. I browsed the Oracle 9.2 documentation and found references to 101 packages, although issuing the following query returns 320:
select count(*) from dba_objects where object_type = 'PACKAGE BODY' and owner = 'SYS'
See the complete list at Oracle9i Supplied PL/SQL Packages and Types Reference, Release 2 (9.2). Access requires an Oracle.com login.
Before you start programming your own code, always check the Supplied PL/SQL Packages. Most of the time, you'll find a useful API for your task that the Oracle development team has already developed (and tested). The DBMS_JOB package is one of those APIs. It's Oracle's native scheduler, and it provides a set of APIs for submitting and managing jobs in a queue.
How does it help with parallel execution? Well, if you submit multiple jobs to a job queue and schedule them all to start immediately, then they will start and run in parallel. The useful API in this case is DBMS_JOB.SUBMIT, which submits a job to a job queue for execution. The next_date parameter of DBMS_JOB.SUBMIT determines the next date the job will run. It has a default value of SYSDATE, so if you don't specify a date, the job starts immediately. However, the job is submitted to the job queue only after a commit, which makes this process transaction-based.
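For example, a minimal submit-and-commit block might look like this (MyProc is a placeholder procedure name):

```sql
declare
  vJob number;  -- OUT parameter: the job number assigned by Oracle
begin
  dbms_job.submit(
    job  => vJob,
    what => 'MyProc;'  -- the PL/SQL to run; note the trailing semicolon
    -- next_date defaults to SYSDATE, so the job is eligible to run at once
  );
  commit;  -- the job enters the queue only when this transaction commits
end;
/
```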
You won't find this functionality in UNIX shell scripts, which run with no transaction awareness. As a matter of fact, such scripts will start (and subsequently fail) even if the database is down. DBMS_JOB-based code has no such problems because it lives inside the database.
Another problem with UNIX-based scripts is security. Because they reside outside of the database, they have to contain a password to connect to it, and hard-coding passwords in scripts is not a secure solution. The same problem applies to Java-based code, since it also lives outside the database.
Another big advantage of the DBMS_JOB-based solution is load balancing. A system-modifiable database parameter, job_queue_processes, determines the maximum number of concurrent jobs. In other words, it controls how many jobs can run simultaneously. Consider the following scenario:
- You develop code that detects out-of-balance indexes, which can result from massive delete operations.
- You want to rebuild them, and to save time you decide to do it in parallel.
Now, how many indexes do you rebuild in parallel at the same time, knowing the number of out-of-balance indexes will vary? For example, if you have three "bad" indexes, you'd rebuild them all at the same time. However, if you have 30 indexes to rebuild and only a four-CPU server, you'd want to rebuild only some (maybe eight) indexes simultaneously and process more as the previous ones finish. That's exactly what the job_queue_processes parameter enables you to do.
The DBA sets this parameter based on the database server's resources and workload. For example, a DBA might set it to 8 on a four-CPU machine:
alter system set job_queue_processes = 8;
As a result, when you rebuild 30 indexes, all 30 jobs will be submitted to the queue but only eight will start. As soon as a job finishes, the next job from the queue starts. Only eight indexes are rebuilt concurrently to conserve database server resources.
DBMS_JOB is a nice way to execute jobs in parallel because you control the degree of concurrency. You could set up to 36 job queue processes (in 8i) or 1,000 (in 9i) and that would be the degree of concurrency.
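If you're not sure what your instance is currently set to, a quick check (assuming you have the privileges to query v$parameter) is:

```sql
-- Maximum number of concurrent job queue processes,
-- i.e. the effective degree of concurrency for DBMS_JOB
select value
  from v$parameter
 where name = 'job_queue_processes';
```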
The performance benefits can be substantial. For example, I ran a performance test on two 1-million-row tables with four indexes to rebuild. The following snippet is the first block of code I used, which rebuilds indexes with the NOLOGGING and PARALLEL options serially, using the RebuildIndex procedure of my Maintenance package:
declare
  vStart number;
begin
  vStart := dbms_utility.get_time;
  Maintenance.RebuildIndex('ORDERS_PK',  'NOLOGGING PARALLEL');
  Maintenance.RebuildIndex('SALES_PK',   'NOLOGGING PARALLEL');
  Maintenance.RebuildIndex('ORDERS_AK1', 'NOLOGGING PARALLEL');
  Maintenance.RebuildIndex('SALES_AK1',  'NOLOGGING PARALLEL');
  dbms_output.put_line('Elapsed time: ' ||
    round((dbms_utility.get_time - vStart)/100, 2) || ' sec');
end;
/
The second block of code rebuilds indexes in parallel using DBMS_JOB:
declare
  vJob number;
begin
  dbms_job.submit(job => vJob,
    what => 'Maintenance.RebuildIndex(''ORDERS_PK'', ''NOLOGGING PARALLEL'');');
  dbms_output.put_line('Index ORDERS_PK: Job ' || to_char(vJob));
  dbms_job.submit(job => vJob,
    what => 'Maintenance.RebuildIndex(''SALES_PK'', ''NOLOGGING PARALLEL'');');
  dbms_output.put_line('Index SALES_PK: Job ' || to_char(vJob));
  dbms_job.submit(job => vJob,
    what => 'Maintenance.RebuildIndex(''ORDERS_AK1'', ''NOLOGGING PARALLEL'');');
  dbms_output.put_line('Index ORDERS_AK1: Job ' || to_char(vJob));
  dbms_job.submit(job => vJob,
    what => 'Maintenance.RebuildIndex(''SALES_AK1'', ''NOLOGGING PARALLEL'');');
  dbms_output.put_line('Index SALES_AK1: Job ' || to_char(vJob));
  commit;
end;
/
The above block executes almost immediately and produces the following output:
Index ORDERS_PK: Job 10
Index SALES_PK: Job 11
Index ORDERS_AK1: Job 12
Index SALES_AK1: Job 13
However, these results don't mean the indexes got rebuilt that fast. Indexes get built in the background. To view the progress, you can query the user_jobs view:
select job, total_time from user_jobs where job in (10, 11, 12, 13);
The output of this query shows that jobs 10–13 have been running for two seconds:
10    2
11    2
12    2
13    2
If you keep querying for a while, some jobs will finish and disappear from the query output:
12    127
13    127
Jobs 10 and 11 are done: indexes ORDERS_PK and SALES_PK are rebuilt, and it took somewhere between 117 and 127 seconds (since you keep querying every 10 seconds). When the above query returns no rows, the index-rebuilding jobs are complete. The second block of code, which rebuilds indexes in parallel, executed 30–35 percent faster than the first block. That's a significant performance gain.
Now that the execution has completed, you can tell one piece of code is definitely missing from the parallel execution block listed previously: a procedure that automates detecting when execution is complete. The way it currently works is not elegant at all: opening another SQL*Plus session and issuing the same query from it every 10 seconds. (See the WaitForCompletion procedure in JobsPkg in Listings 1 and 2.)
A possible solution would be a procedure that polls the user_jobs view in a loop until it returns no rows. But again, as I said at the beginning, there's no need to reinvent the wheel. Oracle provides two packages for communicating between database sessions: DBMS_ALERT and DBMS_PIPE. I chose DBMS_ALERT because it lets each job, upon completion, send an "I'm done" signal to the main block of code that originally submitted it to the job queue.
Using DBMS_ALERT to Signal Job Completion
The DBMS_ALERT package supports asynchronous notification of database events (alerts). It provides APIs to send alerts, register for alerts, and wait to receive them. The procedures this Solution uses are SIGNAL, REGISTER, and WAITANY:
- The DBMS_ALERT.SIGNAL procedure signals an alert. The SIGNAL call takes effect only when the transaction in which it is made commits. If the transaction rolls back, SIGNAL has no effect. All sessions that have registered interest in this alert are notified. (If the application does not require transaction-based alerts, the DBMS_PIPE package may provide a useful alternative.)
- The DBMS_ALERT.REGISTER procedure registers interest in an alert.
- The DBMS_ALERT.WAITANY procedure waits for any of the alerts for which the current session is registered to occur. (For more details, see Oracle9i Supplied PL/SQL Packages and Types Reference, Release 2 (9.2), DBMS_ALERT package. Access requires an Oracle.com login.)
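A bare-bones illustration of the three calls, outside of any package (the alert name JOB_DONE and the 600-second timeout are arbitrary choices for this sketch):

```sql
-- Waiting session: register first, then block until an alert arrives
declare
  vName    varchar2(30);
  vMessage varchar2(1800);  -- alert messages are limited to 1,800 bytes
  vStatus  integer;         -- 0 = alert received, 1 = timed out
begin
  dbms_alert.register('JOB_DONE');
  dbms_alert.waitany(vName, vMessage, vStatus, timeout => 600);
end;
/

-- Signaling session: the alert is delivered only after commit
begin
  dbms_alert.signal('JOB_DONE', 'I''m done');
  commit;
end;
/
```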
As you see, the DBMS_ALERT package has exactly what you need. The main session carries out the following procedure:
- Runs N jobs in parallel
- Registers for N alerts (one per job)
- Submits the jobs to a job queue
- Waits to be notified by all of the alerts
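The full listings for JobsPkg are in the downloadable package; as a rough sketch of the first three steps (the exact signatures in the real package may differ), an Execute procedure could look like this:

```sql
-- Sketch only: register one alert per job, then submit the jobs.
-- tArrayJob is an associative array of job strings; BuildAlertName is a
-- private helper that turns a job string into a legal alert name.
procedure Execute (pJobs in tArrayJob) is
  vJob number;
begin
  for i in 1 .. pJobs.count loop
    dbms_alert.register(BuildAlertName(pJobs(i)));  -- register before submit
    dbms_job.submit(job => vJob, what => pJobs(i) || ';');
  end loop;
  commit;  -- the jobs start running only after this commit
end Execute;
```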
Take a closer look at the code in the WaitForCompletion procedure. It takes the following steps:
- Builds an associative array of N jobs indexed by string (new in Oracle 9i)
- Registers for all N alerts
- Waits for alerts (As soon as it is notified, it removes that alert from the associative array built in step 1 and goes back into waiting mode. When the array is empty, the procedure has received all of the alerts it registered for, all of the jobs are finished, and it's time to exit the loop.)
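Under the same assumptions as above, the heart of WaitForCompletion might be sketched as follows (the real package may differ in details):

```sql
-- Sketch only: wait until every expected alert has been received.
procedure WaitForCompletion (pJobs in tArrayJob) is
  type tPending is table of boolean index by varchar2(30);
  vPending tPending;            -- step 1: jobs keyed by alert name
  vName    varchar2(30);
  vMessage varchar2(1800);
  vStatus  integer;
begin
  for i in 1 .. pJobs.count loop
    vPending(BuildAlertName(pJobs(i))) := true;
    dbms_alert.register(BuildAlertName(pJobs(i)));  -- step 2 (harmless if
  end loop;                                         -- already registered)
  while vPending.count > 0 loop                     -- step 3
    dbms_alert.waitany(vName, vMessage, vStatus);
    if vStatus = 0 and vPending.exists(vName) then
      dbms_output.put_line('Received alert: ' || vName ||
                           ' with message: ' || vMessage);
      vPending.delete(vName);
    end if;
  end loop;
end WaitForCompletion;
```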
Author warning: Oracle restricts alert names to 30 characters and converts them to uppercase internally! Therefore, if a session signals a DemoPkg.Sleep alert, the sessions interested in that alert have to register for the DEMOPKG.SLEEP alert. Otherwise, these sessions would remain in waiting mode forever. That "small" detail can cause some frustration, especially when debugging processes that run in the background. Trust me, I learned the hard way. The code in the private BuildAlertName function resolves these case-sensitivity and length-limitation issues.
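The BuildAlertName source isn't reproduced here, but given that description it presumably amounts to something like:

```sql
-- Sketch: make a job string a legal alert name by truncating it to
-- 30 characters and uppercasing it, matching Oracle's internal behavior.
function BuildAlertName (pJob varchar2) return varchar2 is
begin
  return upper(substr(pJob, 1, 30));
end BuildAlertName;
```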
To demonstrate how the JobsPkg code works, I created a DemoPkg package with one simple procedure, Sleep. This procedure "sleeps" for the number of seconds passed as a parameter (click here to download the package).
The following code shows the DemoPkg package specification and body:
create or replace package DemoPkg is
  procedure Sleep ( pTime integer );
end DemoPkg;
/
create or replace package body DemoPkg is
  procedure Sleep ( pTime integer ) is
  begin
    dbms_lock.sleep(pTime);
    JobsPkg.SignalCompletion('DemoPkg.Sleep(' || to_char(pTime) || ')');
  end Sleep;
end DemoPkg;
/
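JobsPkg.SignalCompletion isn't listed in this Solution, but given the DBMS_ALERT semantics described earlier it presumably wraps DBMS_ALERT.SIGNAL, something like:

```sql
-- Sketch: signal job completion; the alert is delivered on commit.
procedure SignalCompletion (pJob varchar2) is
begin
  dbms_alert.signal(BuildAlertName(pJob), pJob || ' completed.');
  commit;  -- safe here: each job runs in its own background session
end SignalCompletion;
```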
You can use this package to execute code similar to the parallel execution block in the Control Concurrency section and run three Sleep procedures in parallel:
- DemoPkg.Sleep(10) will sleep for 10 seconds and then send an alert.
- DemoPkg.Sleep(15) will sleep for 15 seconds.
- DemoPkg.Sleep(20) will sleep for 20 seconds.
The following code executes these procedures:
set timing on
declare
  vJobs JobsPkg.tArrayJob;
begin
  vJobs(1) := 'DemoPkg.Sleep(10)';
  vJobs(2) := 'DemoPkg.Sleep(15)';
  vJobs(3) := 'DemoPkg.Sleep(20)';
  JobsPkg.Execute(vJobs);
  JobsPkg.WaitForCompletion(vJobs);
end;
/
Unlike the parallel execution block code from the Control Concurrency section, the above code does not return control immediately. It waits for completion of all the jobs from the vJobs array. Its output shows that the execution time is a little more than 20 seconds, as expected:
Received alert: DEMOPKG.SLEEP(10) with message: DemoPkg.Sleep(10) completed.
Received alert: DEMOPKG.SLEEP(15) with message: DemoPkg.Sleep(15) completed.
Received alert: DEMOPKG.SLEEP(20) with message: DemoPkg.Sleep(20) completed.
PL/SQL procedure successfully completed.
Elapsed: 00:00:22.05
When Would I Use This Technique?
During a recent project, I had to speed up legacy code that was so slow that running it overnight didn't leave enough time for it to complete. It was an intensive data-manipulation batch process that consisted of 14 major steps, all executed serially. Looking at the dependencies, I found the following:
- Steps 1–5 were independent.
- Steps 6–9 were also independent, but they relied on the results from Steps 1–5.
- Steps 10 and 11 had to run after Steps 6–9, one after the other.
- Steps 12–14 were independent, but they relied on results from Step 11.
Using the JobsPkg APIs, I modified the code by creating the following four code blocks and executing them in sequence:
- Block 1: execute five steps (1–5) in parallel
- Block 2: execute four steps (6–9) in parallel
- Block 3: execute steps 10 and 11 serially
- Block 4: execute the last three steps (12–14) in parallel
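With JobsPkg, the driver for the first two blocks might look like this (the Batch.StepN procedure names are placeholders for the real batch steps):

```sql
declare
  vJobs JobsPkg.tArrayJob;
begin
  -- Block 1: steps 1-5 in parallel
  vJobs(1) := 'Batch.Step1';  vJobs(2) := 'Batch.Step2';
  vJobs(3) := 'Batch.Step3';  vJobs(4) := 'Batch.Step4';
  vJobs(5) := 'Batch.Step5';
  JobsPkg.Execute(vJobs);
  JobsPkg.WaitForCompletion(vJobs);  -- Block 2 depends on these results
  vJobs.delete;
  -- Block 2: steps 6-9 in parallel
  vJobs(1) := 'Batch.Step6';  vJobs(2) := 'Batch.Step7';
  vJobs(3) := 'Batch.Step8';  vJobs(4) := 'Batch.Step9';
  JobsPkg.Execute(vJobs);
  JobsPkg.WaitForCompletion(vJobs);
  -- Blocks 3 (serial steps 10-11) and 4 (steps 12-14) follow similarly
end;
/
```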
Simply by restructuring the code, I was able to cut execution time by 60 percent! And I hadn't even touched the code itself yet, just the order of execution.
You can easily execute stored procedures in parallel—and reap significant performance benefits—by using code like that in the JobsPkg package. You can use the package as a base and add more functionality based on your own business rules and requirements.