Login | Register   
LinkedIn
Google+
Twitter
RSS Feed
Download our iPhone app
TODAY'S HEADLINES  |   ARTICLE ARCHIVE  |   FORUMS  |   TIP BANK
Browse DevX
Sign up for e-mail newsletters from DevX


advertisement
 

Optimize the Massive DELETE Operation in Oracle, Part 2 : Page 4

Perform the massive DELETE operation as a swap delete with parallel execution. This technique can offer significant performance gains over a straight or in-place delete operation.


advertisement

In-place Delete vs. Swap Delete Performance
The first test deletes the results from runs with even runIDs (2, 4, 6, ..., 24), which will be roughly 50 percent of the data in the TestData table:


select sum(case when mod(runId, 2) = 0 then 1 else 0 end) as To_Delete, 
       sum(case when mod(runId, 2) > 0 then 1 else 0 end) as To_Keep 
  from TestData;

 TO_DELETE    TO_KEEP
---------- ----------
    245952     266448

Run an in-place delete first:




delete
   from TestData
  where mod(runId, 2) = 0;

245952 rows deleted.

Elapsed: 00:00:39.87

Next, run a swap delete:


begin
   SwapOps.FullInsert('TestData', 'mod(runId, 2) > 0');
end;
/

Elapsed: 00:00:14.28

The in-place delete took roughly 41 seconds, while the swap delete took slightly more than 14 seconds!

The TestData table has one primary key index and one non-unique index. I added one more non-unique index, then another non-unique index to test the index factor. The numbers in Table 1 show that the swap delete kept outperforming the in-place delete.

In-place Delete (sec.)Swap Delete (sec.)
Table with one primary key index and one non-unique index 39.87 14.28
Table with one primary key index and two non-unique indexes 52.39 24.12
Table with one primary key index and three non-unique indexes 64.18 34.90
Table 1. Results of Swap Delete vs. In-place Delete Performance Test

Is In-place Delete Better Than Swap Delete?
Deleting almost half of the data (48 percent) obviously is much faster via a swap delete, but what about when the percentage of data you want to delete goes down (40, 25, 10, etc.)? At which point does an in-place delete become more attractive than a swap delete? Well, the less data you need to delete and the more data you need to insert, the more work you'll have to do to rebuild indexes. For example, deleting 10 percent of the data via a swap delete means that 90 percent will be inserted in a new table and indexes will be rebuilt on 90 percent of the original data. Rebuilding indexes is a time-consuming operation. Believe or not, it's the biggest time component of the swap delete process: during its 15-second total in Table 1, 5 seconds went to the actual insert and 9 seconds were spent on index creation.

The current code in the SwapOps package rebuilds indexes serially, one after another. One way to speed up that step is to rebuild the indexes in parallel using Oracle's DBMS_JOB package, which I discuss it later in the article. (See Oracle9i Supplied PL/SQL Packages and Types Reference Release 2(9.2) for detailed information on DBMS_JOB package.)

In the case of partitioned indexes, you can achieve parallel functionality by utilizing the DBMS_PCLXUTIL package. It provides intra-partition parallelism for creating partition-wise local indexes. (See Oracle9i Supplied PL/SQL Packages and Types Reference Release 2(9.2) for detailed information on DBMS_PCLXUTIL package.)

I ran the performance tests on the TestData table with the following where clauses:

mod(runId, 2) = 0  – 48% rows to delete, 52% - to keep (initial test)
mod(runId, 3) = 0  – 32% rows to delete, 68% - to keep
mod(runId, 4) = 0  – 24% rows to delete, 76% - to keep
mod(runId, 5) = 0  – 20% rows to delete, 80% - to keep
mod(runId, 8) = 0  – 12% rows to delete, 88% - to keep

Table 2 shows the performance results for the TestData table with one primary key index and one non-unique index.

Rows to Delete (%)/Rows to Keep (%)In-place Delete (sec.)Swap Delete (sec.)
48/52 39.87 14.28
24/76 24.96 19.16
20/80 20.12 20.88
12/88 12.64 24.12
Table 2. Performance Results for the TestData Table (one primary key index, one non-unique index)

The results show that the threshold for when an in-place delete becomes more attractive than a swap delete is very close to 20 percent: all delete operations that remove less then 20 percent of rows are better done by with in-place delete, and any delete operations that remove 20 percent rows or higher are faster via a swap delete.

Since rebuilding indexes is the most time-consuming component of the swap delete process, I believe the number of indexes in a table is a mayor factor that affects the threshold—the more indexes a table has, the higher threshold is. To test that theory, I added two more non-unique indexes on the TestData table:


create index TestData_IdName
on TestData
(
  runId,
  objectName
);

create index TestData_IdDate
on TestData
(
  runId,
  createDate,
  status  
);

Table 3 shows the performance results for the TestData table with one primary key index and three non-unique indexes.

Rows to Delete (%)/Rows to Keep (%)In-place Delete (sec.)Swap Delete (sec.)
48/52 64.18 34.90
32/68 48.94 36.91
24/76 33.44 38.21
20/80 27.46 39.83
12/88 17.61 44.72
Table 3. Performance Results for the TestData (one primary key index, three non-unique indexes)

The results show that the threshold is higher for the heavier indexed TestData table. It moved from 20 percent to more than 25 percent.

Indexes are just one mayor factor that affects the threshold value. Other factors that may influence the threshold value include table width/number of columns, CLOB/BLOB columns existence, and partitioned tables.

Keep in mind that the actual threshold value may change from one database server to another—especially if they have different configurations that affect their logging performances. I ran the same tests on another database server with less efficient logging than the first and its threshold value came out lower, around 15 percent.



Comment and Contribute

 

 

 

 

 


(Maximum characters: 1200). You have 1200 characters left.

 

 

Sitemap