Optimize the Massive DELETE Operation in Oracle, Part 1

Optimize the Massive DELETE Operation in Oracle, Part 1

massive DELETE operation deletes millions of rows from a table with indexes and constraints. This operation is database intensive and time consuming, mainly because it forces the database to generate and save to disk significant amounts (possibly gigabytes) of redo and undo data.

You can perform massive DELETEs as bulk INSERT operations: instead of removing data you no longer need, you insert data you want to keep. The key to this option is performing it efficiently with minimum logging by using direct-path INSERT. Part 1 of this article series compares direct-path INSERT and regular INSERT. It also describes the technique of performing DELETEs as INSERTs. Part 2 will discuss the packaged APIs that implement the bulk INSERT technique.



How do I complete a massive DELETE operation in my Oracle database without having to pay the heavy performance overhead?



Perform the massive DELETE operation as a direct-path (direct-load) INSERT (INSERT with APPEND hint) into a table whose logging parameter is set to NOLOGGING. This operation will complete significantly faster than DELETE and with minimum logging, but you have to take a backup afterwards to establish a new baseline.

Why INSERT Could Be Faster Than DELETE (or UPDATE)
Direct-path INSERT is a special database operation. Like SQL*Loader, it writes data directly to the database files, bypassing the buffer cache. It does this with minimum logging, recording only data dictionary changes. The logic behind this method is that because the data files are already up-to-date when an instance failure occurs, logging isn’t necessary.

The two distinct cases in which direct-path INSERT is important are when:

  1. The database is in noarchivelog mode. Media recovery is not possible, and you don’t need redo data for that either.
  2. The database is in archivelog mode. It logs redo blocks for media recovery by default. However, if you explicitly set a table to NOLOGGING mode, the database won’t log redo blocks.

Therefore, with the direct-path INSERT when the database is in noarchivelog mode or when it is in archivelog mode and the table is in NOLOGGING mode, it performs only minimum redo logging—to protect the data dictionary.

Need Proof? Generate Test Data
To demonstrate the difference between direct-path and regular INSERTs, create test data. First, create a table named TestData with half a million rows, a primary key, one non-unique index, and a check constraint. You will use this table in all of the article’s code and performance tests. It employs data from the all_objects view as its source.

First, create the table with the following code:

drop table TestData;create table TestData(   objectId     integer      not null,   objectName   varchar2(30) not null,   objectType   varchar2(30) not null,   runId        integer      not null,   createDate   date                 ,   status       varchar2(7)          ,   description  varchar2(255))nologging;

Next, populate it with data using the following block (Since the all_objects view (Oracle 9.2) has 20K+ rows, you need to do the insert 25 times to total 500K+ rows in the TestData table):

declare  vCount pls_integer := 25;begin   for nIndex in 1..25   loop      insert /*+ APPEND */      into TestData      (         objectId,         objectName,         objectType,         runId,         createDate,         status,         description      )      select object_id,             object_name,             object_type,             nIndex,             created,             status,             object_name        from all_objects;              commit;   end loop;end;/

Check the rows count:

select count(*)   from TestData;  COUNT(*)----------    547875

Add the primary key on the objectId and runId columns, one non-unique index on the objectName and objectType columns, and one check constraint on the runId:

alter table TestDataadd constraint TestData_PKprimary key(  objectId,  runId);create index TestData_NameTypeon TestData(  objectName,  objectType);alter table TestDataadd constraint TestData_CKcheck (runId > 0);

Gather the statistics using the following block:

begin   dbms_stats.gather_table_stats   (      ownname    => user,      tabname    => 'TestData',      method_opt => 'FOR ALL INDEXED COLUMNS SIZE AUTO',      degree     => 4,      cascade    => true   );end;/

Now, create two tables, TestData_Logging and TestData2_Nologging, with identical structures to TestData but with no indexes or constraints:

create table TestData_Logging asselect *   from TestData where 1 = 0; create table TestData_Nologging nologging asselect *   from TestData where 1 = 0;

TestData_Logging is in LOGGING mode, while TestData_Nologging is in NOLOGGING mode:

select table_name,       logging  from user_tables where table_name like 'TESTDATA\_%' escape '';TABLE_NAME                     LOGGING------------------------------ -------TESTDATA_LOGGING               YESTESTDATA_NOLOGGING             NO

Take a snapshot of the redo size before doing a regular INSERT:

select a.name,       b.value  from v$statname a,       v$mystat b where a.statistic# = b.statistic#   and a.name = 'redo size';NAME              VALUE----------------- -----------redo size            39867216

Perform the INSERT into the TestData_Logging table (Scenario 1):

set timing oninsert into TestData_Loggingselect *  from TestData;547875 rows created.Elapsed: 00:00:08.28

Take a snapshot of redo and undo after the INSERT:

select a.name,       b.value  from v$statname a,       v$mystat b where a.statistic# = b.statistic#   and a.name = 'redo size';NAME                    VALUE----------------- -----------redo size            93019404select used_ublk   from v$transaction;USED_UBLK----------       238commit;

Populating the TestData_Logging table with 547,875 rows in LOGGING mode took 8.28 seconds and forced the database to generate 50.69MB (93,019,404 – 39,867,216 = 53,152,188) of redo and 238 undo blocks.

Now, repeat the same steps for the table in NOLOGGING mode (Scenario 2):

  1. Obtain performance results before load:
    select a.name,       b.value  from v$statname a,       v$mystat b where a.statistic# = b.statistic#   and a.name = 'redo size';NAME                    VALUE----------------- -----------redo size            93019496insert /*+ APPEND */into TestData_Nologgingselect *  from TestData;547875 rows created.Elapsed: 00:00:05.39
  2. Obtain performance results after load:
    select a.name,       b.value  from v$statname a,       v$mystat b where a.statistic# = b.statistic#   and a.name = 'redo size';NAME                    VALUE----------------- -----------redo size            93019404select used_ublk   from v$transaction;USED_UBLK----------       1

Populating the TestData_Nologging table with 547,875 rows in NOLOGGING mode took 5.39 seconds and generated only 0.01MB (!) (93,120,212 – 93,019,496= 100,716) of redo and 1 undo block.

Table 1 shows the performance numbers for both scenarios.

Elapsed Time (sec.) Redo Size (MB) Undo Blocks
Scenario 1: Regular INSERT, LOGGING mode 8.28 50.69 238
Scenario 2: Direct-path INSERT, NOLOGGING mode 5.39 0.01 1
Table 1. Performance Results for Scenarios 1 and 2

However, if you add just a primary key constraint to the TestData_Nologging table (Scenario 3), the redo size skyrockets even though both the table and the primary key index are in NOLOGGING mode:

drop table TestData_Nologging;create table TestData_Nologging nologging asselect *   from TestData where 1 = 0; alter table TestData_Nologgingadd constraint TestData_Nologging_PKprimary key(  objectId,  runId);alter index TestData_Nologging_PK nologging;

Obtain performance results before load:

select a.name,       b.value  from v$statname a,       v$mystat b where a.statistic# = b.statistic#   and a.name = 'redo size';NAME                    VALUE----------------- -----------redo size            93198748insert /*+ APPEND */into TestData_Nologgingselect *  from TestData;547875 rows created.Elapsed: 00:00:15.61

Obtain performance results after load:

select a.name,       b.value  from v$statname a,       v$mystat b where a.statistic# = b.statistic#   and a.name = 'redo size';NAME                    VALUE----------------- -----------redo size            131033428select used_ublk   from v$transaction;USED_UBLK----------       817commit;

As you can see, adding index led to an elapsed time increase from 5.39 to 15.61 seconds, a redo size increase from 0.01MB to 36MB (131,033,428 – 93,198,748 = 37,834,680), and an undo blocks increase from 1 to 817 (!). The reason is the database did heavy logging for the primary key index. Putting a primary key index in NOLOGGING mode did not help because NOLOGGING applies only to a limited number of operations. Here is a list of such operations from Oracle9i SQL Reference Release 2 (9.2):

DML:
  • Direct-path INSERT (serial or parallel)
  • Direct Loader (SQL*Loader)

DDL:

  • CREATE TABLE ... AS SELECT
  • ALTER TABLE ... MOVE
  • ALTER TABLE ... [all partition operations that involve data movement]
    • ALTER TABLE ... ADD PARTITION (hash partition only)
    • ALTER TABLE ... MERGE PARTITIONS
    • ALTER TABLE ... SPLIT PARTITION
    • ALTER TABLE ... MOVE PARTITION
    • ALTER TABLE ... MODIFY PARTITION ... ADD SUBPARTITION
    • ALTER TABLE ... MODIFY PARTITION ... COALESCE SUBPARTITION
    • ALTER TABLE ... MODIFY PARTITION ... REBUILD UNUSABLE INDEXES
  • CREATE INDEX
  • ALTER INDEX ... REBUILD
  • ALTER INDEX ... REBUILD PARTITION
  • ALTER INDEX ... SPLIT PARTITION

See Oracle9i SQL Reference Release 2 (9.2) at for more details on NOLOGGING clause (login required).

The above list shows that you can take advantage of the CREATE INDEX operation by rearranging your steps:

  1. Load the data into a table with no indexes.
  2. Create indexes in NOLOGGING PARALLEL mode.
  3. Create constraints.

The following series of commands for the TestData_Nologging table implements the above technique (Scenario 4):

drop table TestData_Nologging;create table TestData_Nologging nologging asselect *   from TestData where 1 = 0; insert /*+ APPEND */into TestData_Nologgingselect *  from TestData;  create unique index TestData_Nologging_PKon TestData_Nologging(  objectId,  runId)nologging parallel;alter table TestData_Nologgingadd constraint TestData_Nologging_PKprimary key(  objectId,  runId);

Scenario 4 turns out to be more efficient than Scenario 3: redo size went down from 36.08MB to 0.17 and undo blocks from 817 to just 2.

Table 2 shows the performance numbers for both scenarios 3 and 4.

Elapsed Time (sec.) Redo Size (MB) Undo Blocks
Scenario 3: Direct-path INSERT into table with existing primary key 15.61 36.08 817
Scenario 4: Direct-path INSERT into table with no primary key, creating primary index in NOLOGGING PARALLEL mode, then adding primary key 12.34 0.17 2
Table 2. Performance Results for Scenarios 3 and 4

The numbers clearly show that Scenario 4 is the way to go. Now it’s time to talk about implementing DELETEs as bulk INSERTs.

Performing Massive DELETEs as Bulk INSERT
The following steps implement the technique of implementing DELETEs as direct-path INSERTs:

  1. Create a table (TestData_X) with the identical structure as the TestData table in the previous section but with no primary key, alternate key(s), or foreign key(s) constraints (NOLOGGING option).
  2. Insert data that you need to be keep into the newly created table using direct-path INSERT.
  3. Create indexes on the new table with NOLOGGING PARALLEL options.
  4. Create constraints on the new table. Use the ENABLE NOVALIDATE option for referential integrity and check constraints.
  5. Swap the TestData and TestData_X tables: rename TestData to TestData_T, rename TestData_X to TestData.
  6. Restore the original logging settings.
  7. Verify the results and drop the TestData_T table.

Optimize the Massive DELETE Operation in Oracle, Part 2 will present the SwapOps package, which contains all of the APIs necessary to implement the above steps. It also will present performance tests for DELETEs and direct-load INSERT, and compare the two.

devx-admin

devx-admin

Share the Post:
Razer Discount

Unbelievable Razer Blade 17 Discount

On September 24, 2023, it was reported that Razer, a popular brand in the premium gaming laptop industry, is offering an exceptional deal on their

Innovation Ignition

New Fintech Innovation Ignites Change

The fintech sector continues to attract substantial interest, as demonstrated by a dedicated fintech stage at a recent event featuring panel discussions and informal conversations

Import Easing

Easing Import Rules for Big Tech

India has chosen to ease its proposed restrictions on imports of laptops, tablets, and other IT hardware, allowing manufacturers like Apple Inc., HP Inc., and

Anthropic Investment

Amazon’s Bold Anthropic Investment

On Monday, Amazon announced its plan to invest up to $4 billion in the AI firm Anthropic, acquiring a minority stake in the process. This

Razer Discount

Unbelievable Razer Blade 17 Discount

On September 24, 2023, it was reported that Razer, a popular brand in the premium gaming laptop industry, is offering an exceptional deal on their Razer Blade 17 model. Typically

Innovation Ignition

New Fintech Innovation Ignites Change

The fintech sector continues to attract substantial interest, as demonstrated by a dedicated fintech stage at a recent event featuring panel discussions and informal conversations with industry professionals. The gathering,

Import Easing

Easing Import Rules for Big Tech

India has chosen to ease its proposed restrictions on imports of laptops, tablets, and other IT hardware, allowing manufacturers like Apple Inc., HP Inc., and Dell Technologies Inc. more time

Semiconductor Stock Plummet

Dramatic Downturn in Semiconductor Stocks Looms

Recent events show that the S&P Semiconductors Select Industry Index seems to be experiencing a downturn, which could result in a decline in semiconductor stocks. Known as a key indicator

Anthropic Investment

Amazon’s Bold Anthropic Investment

On Monday, Amazon announced its plan to invest up to $4 billion in the AI firm Anthropic, acquiring a minority stake in the process. This decision demonstrates Amazon’s commitment to

AI Experts Get Hired

Tech Industry Rehiring Wave: AI Experts Wanted

A few months ago, Big Tech companies were downsizing their workforce, but currently, many are considering rehiring some of these employees, especially in popular fields such as artificial intelligence. The

Lagos Migration

Middle-Class Migration: Undermining Democracy?

As the middle class in Lagos, Nigeria, increasingly migrates to private communities, a PhD scholar from a leading technology institute has been investigating the impact of this development on democratic

AI Software Development

ChatGPT is Now Making Video Games

Pietro Schirano’s foray into using ChatGPT, an AI tool for programming, has opened up new vistas in game and software development. As design lead at business finance firm Brex, Schirano

Llama Codebot

Developers! Here’s Your Chatbot

Meta Platforms has recently unveiled Code Llama, a free chatbot designed to aid developers in crafting coding scripts. This large language model (LLM), developed using Meta’s Llama 2 model, serves

Tech Layoffs

Unraveling the Tech Sector’s Historic Job Losses

Throughout 2023, the tech sector has experienced a record-breaking number of job losses, impacting tens of thousands of workers across various companies, including well-established corporations and emerging startups in areas

Chinese 5G Limitation

Germany Considers Limiting Chinese 5G Tech

A recent report has put forth the possibility that Germany’s Federal Ministry of the Interior and Community may consider limiting the use of Chinese 5G technology by local network providers

Modern Warfare

The Barak Tank is Transforming Modern Warfare

The Barak tank is a groundbreaking addition to the Israeli Defense Forces’ arsenal, significantly enhancing their combat capabilities. This AI-powered military vehicle is expected to transform the way modern warfare

AI Cheating Growth

AI Plagiarism Challenges Shake Academic Integrity

As generative AI technologies like ChatGPT become increasingly prevalent among students and raise concerns about widespread cheating, prominent universities have halted their use of AI detection software, such as Turnitin’s

US Commitment

US Approves Sustainable Battery Research

The US Department of Energy has revealed a $325 million commitment in the research of innovative battery types, designed to enable solar and wind power as continuous, 24-hour energy sources.

Netanyahu Musk AI

Netanyahu and Musk Discuss AI Future

On September 22, 2023, Israeli Prime Minister Benjamin Netanyahu met with entrepreneur Elon Musk in San Francisco prior to attending the United Nations. In a live-streamed discussion, Netanyahu lauded Musk

Urban Gardening

Creating Thriving Cities Through Urban Gardening

The rising popularity of urban gardening is receiving increased recognition for its numerous advantages, as demonstrated in a recent study featured in the Environmental Research Letters journal. Carried out by

What You Need to Know About Cloud Security Strategies

What You Need to Know About Cloud Security Strategies

Today, many businesses are adopting cloud computing services. As a result, it’s important to recognize that security measures for data in the cloud are different from those in traditional on-premises

Romanian Energy Security

Eastern Europe is Achieving Energy Security

Canada and Romania have solidified their commitment to energy security and independence from Russian energy exports by signing a $3-billion export development agreement. The deal is centered on constructing two

Seamless Integration

Unlocking Seamless Smart Home Integration

The vision of an intelligently organized and interconnected smart home that conserves time, energy, and resources has long been desired by many homeowners. However, this aspiration has often been hindered

New Algorithm

MicroAlgo’s Groundbreaking Algorithm

MicroAlgo Inc. has revealed the creation of a knowledge-augmented backtracking search algorithm, developed through extensive research in evolutionary computational techniques. The algorithm is designed to boost problem-solving effectiveness, precision, and

Poland Energy Future

Westinghouse Builds Polish Power Plant

Westinghouse Electric Company and Bechtel have come together to establish a formal partnership in order to design and construct Poland’s inaugural nuclear power plant at the Lubiatowo-Kopalino site in Pomerania.

EV Labor Market

EV Industry Hurting For Skilled Labor

The United Auto Workers strike has highlighted the anticipated change towards a future dominated by electric vehicles (EVs), a shift which numerous people think will result in job losses. However,