Don’t Let Referential Integrity Degrade Your Database Performance

racle databases provide a powerful mechanism for enforcing business rules, called referential integrity (RI), which is implemented either via a declaration of foreign key (FK) constraints (“declarative RI”) or via triggers. It is almost impossible to find an Oracle database with no declarative RI implemented.

Declarative RI protects the integrity of related data by defining the relation between data in different tables. Relation between two tables is defined using a combination of primary, or unique, key (PK) and FK constraints. A PK constraint uniquely identifies a row in a database table, while an FK constraint actually declares the rules that will be applied to the related data. Declarative RI rules basically dictate that FK values in a column, or set of columns, of one table match the PK or unique constraint values in a related table.

In the SCOTT schema provided with Oracle’s sample database we could illustrate the declarative RI using the sample EMP table, which contains employee information, and DEPT table (department information). DEPT and EMP are in a one-to-many relationship: one department (parent) could contain multiple employees (children). Each employee belongs to a certain department, so each DeptNo value in the EMP table (FK) has to exist as a DeptNo value in the DEPT table (PK). You don’t need to write any PL/SQL code to enforce this rule; all you need to do is DECLARE an FK constraint using the following command:

alter table EMPadd constraint EMP_DEPTNO_FKforeign key (DEPTNO)references DEPT(DEPTNO);
That’s why it’s called “declarative RI.” It assures that all of the references within the database are valid. In the case of the EMP and DEPT tables, the EMP_DEPT_FK FK constraint makes sure that nobody could delete a parent record from the DEPT table if it references at least one EMP record. This ensures that you don’t get employees in a nonexistent department, which would be called orphan records. This FK constraint also does not allow you to update any EMP records with new DeptNo value that do not exist in the DEPT table. As you see, this FK constraint protects DEPT and EMP data integrity by always making sure that data are completely in sync.

It is possible to specify an FK declaration that would automatically delete all child records when the parent record gets deleted. This could be achieved with the ON DELETE CASCADE clause:

alter table EMPadd constraint EMP_DEPTNO_FKforeign key (DEPTNO)references DEPT(DEPTNO)on delete cascade;
This is dangerous and it could lead to unexpected results, because data in child tables would be removed automatically in the background without informing the user. It provides the cascading capability only for the DELETE operation; for UPDATE you have to write a database trigger to achieve that.

Usually, FK constraints are supported by corresponding indexes. (In our example, it is an index on the DeptNo column in the EMP table.) However, the FK constraint declaration does not automatically create a supporting index. Unindexed FK constraints can often lead to poor performance.



Foreign key (FK) constraints without supporting indexes can cause severe performance problems during SQL queries and massive DML (Data Manipulation Language) operations, such as DELETEs or UPDATEs.



Use a SQL script to detect unindexed foreign key constraints, review each constraint case individually, and either add an index?if it doesn’t slow down your DML operations substantially?or disable FK constraints just to perform massive DML operations, and then enable them back.

Reasons for Foreign Key Constraints
There are many reasons for creating an index that supports the foreign key (FK) constraint. The first and the most obvious one is that the FK constraint represents a relation between tables, and if these tables are part of a SQL query, there’s a 99.9 percent chance that they will be joined on the constraint column. In this case you definitely need an index that supports SQL queries.

Another reason to have indexes for FK constraints is to avoid locking conflicts during an update of the child table. Oracle provides a row-level locking mechanism that removes many locking problems contained in databases with page-level and block-level locking. However, even row-level locking won’t help if frequent updates to the child table cause it to lock the entire parent table just because the supporting index is missing.

There is also a less-obvious problem that occurs with massive DELETE operations on the parent table. This is dangerous because, unlike the SELECT command, it triggers SQL queries implicitly in the background and you could spend hours and days trying to figure out why your simple DELETE command is unable to complete.

Problem with Massive DELETEs on the Parent Table
Recently I had to solve a problem with a procedure that was written to delete data from a set of related tables for a specified time period in order to prepare the tables for the next run. There were five tables involved: one transactions table (the parent) and four child tables?Credits, Deposits, Incentives, and Payments. All four tables were in one-to-many relationship with the Transactions table, which had a PK based on the TransactionID column. All child tables also included a TransactionID column and a foreign key that referenced the PK of the Transactions table.

As you can see, declarative RI would not allow me to delete any records in the Transactions table. So a procedure was written in such a way that it deleted records in four child tables for the specified time period and only after that deleted records from the parent table for the same time period. It worked just fine for a while, but then it started running really slow (1216 hours) on big data sets. In my case, I was trying to delete 10,000 transactions from a table with 200,000 transactions total. Here were the stats for the other tables:

  • Credit: 3 million records to delete, 25 million total
  • Deposits: 100,000 records to delete, 1.5 million total
  • Incentives: 150,000 records to delete, 2.5 million total
  • Payments: 75,000 records to delete, 2.5 million total
First I tried to locate the particular DELETE statement that was the source of the problem. So I put debugging messages in the code and recompiled and ran the procedure. To my surprise, during the first 15 minutes the procedure completed the deletion of records in all child tables and started executing the last DELETE statement, which deleted data from the Transactions table. I sat there for 6 hours (!) before I lost my patience and killed it. The last statement had the following, very simple syntax:
delete   from Transactions   where Period = 23      and OriginType = ‘calculated’;
I checked on all available indexes of the Transactions table and, sure enough, the composite index on Period and OriginType was there. I checked the execution plan for the corresponding SELECT statement:
select *   from Transactions   where Period = 23      and OriginType = ‘calculated’;
The execution plan was correct: it was using that index. SELECTing count(*) with the same WHERE clause gave me the correct result (10,000) in less than one second. OK, so what was the next step?

What’s the difference between SELECT and DELETE with the same WHERE clause? The answer is that DELETE triggers RI rules, so the database has to issue some SQL commands in the background to enforce those rules. I now had to find which SQL statements were being executed.

Using TOAD to View SQL in Real Time
I decided to find out exactly what the database was executing during that six-hour period while sitting on the same DELETE statement. I used TOAD from Quest Software, Inc. as my main Oracle database developer’s tool. (In one of my previous 10-Minute Solutions, “Profiling in Oracle 8i,” I show how to utilize TOAD’s profiling capabilities.)

In TOAD 7.1, the latest version, there’s a Kill/Trace Session option under the DBA menu. Instead of killing a session, I was able to see all current sessions including the one that was running the procedure. After I selected that session, I saw SQL statements in the lower panel popping up and going away as they were being executed in real time. That was really useful!

I found that this SQL query was being executed again and again and again:

select count(0)   from Credits   where TransactionID = :b1;
This statement would appear on the screen for five to seconds, on average, only to go away and come back again.

Obviously, this was not part of the procedure’s code. It was generated by the database due to the presence of the bind variable (:b1) in the SQL statement. It was the SQL generated by the declarative RI rules! Even though I had deleted all the child records, the database was not aware of this, so it was enforcing FK constraints by checking whether any children existed in four related tables. In the case of the Credits table, it was getting a record count for each TransactionID it was trying to delete to make sure it’s zero. If it was greater than zero, it would’ve raised an exception error: “ORA-02292 integrity constraint violated child record found.”

I generated an execution plan for that SQL statement to find out why it was taking so long to get a simple count. Sure enough, the execution plan showed that the optimizer selected a full table scan?which, in the case of the Credits table, amounted to a scan of 25 million rows for each of the 10,000 TransactionIDs selected from the Transactions table. The total time for the last DELETE was 50,000 seconds (5 seconds times 10,000) or almost 14 hours!

After adding an index to the TransactionID in the Credits table and running the procedure again, I saw that it took less than 14 minutes. Obviously, the database was using the newly created index to perform declarative RI checking.

First Solution: Detect Unindexed Foreign Key Constraints and Add Indexes to Support Them
After what happened, I wanted to find all constraints that didn’t have indexes to support them. Oracle 9i includes the SKIP SCAN index feature that allows the optimizer to choose an index even if the leading column is not part of the query. (Previous versions of Oracle allow you to utilize an index only if the leading column(s) is present in the WHERE clause.) To support an FK constraint on the TransactionID I needed a single index on the TransactionID or composite index, with TransactionID as a leading column.

I wrote the SQL script shown in Listing 1 to perform this task by making an outer join between two inline views. The first view is built by joining two Oracle views with constraint information?User_Constraints and User_Cons_Columns. The second inline view is created from an Oracle view with information from User_Ind_Columns. Then the outer SELECT joins these two inline views on the table name and the comma-delimited string of columns. The condition “Table_name is null” filters out only the constraint rows from the first inline view, which do not have matching index rows from the second view. This fact tells us that indexes with a matching comma-delimited string of columns don’t exist, and FK constraints selected by this script were created with no index support.

Here is the output from the SQL script shown in Listing 1:

TABLE_NAME         CONSTRAINT_NAME           CONSTRAINT_COLUMNS—————— ————————- ——————-CLASSIFIER         CLASSIFIER_PERIODID_FK    PERIODIDCREDIT             CREDIT_TRANSACTIONID_FK   TRANSACTIONID
As you can see, there are two FK constraints with no supporting indexes. One of them is CREDIT_TRANSACTIONID_FK, which is part of the problem described above.

Second Solution: Disable Constraints Temporarily to Speed Up Massive DML
Adding an index to support the FK constraint generally is a good idea. However, each additional index slows the performance of your DML (Data Manipulation Language) statements because the database has to update the index to keep it in sync with the table. In my case, the CREDIT table contains very dynamic data and adding an index was not a desirable solution.

Instead, I decided to disable this specific constraint temporarily to speed up my DELETE statement, and enable it back with the NOVALIDATE clause after DELETE completed. I knew that data integrity would be preserved because I deleted all the child records from the other four tables first. Therefore I could enable the constraint without having to validate the existing data, which lead to a significant performance gain because enabling the constraint with NOVALIDATE was almost instantaneous, whereas doing the same operation without that clause took 25 minutes! Here is the fragment of code, which is utilizing Native Dynamic SQL introduced in Oracle 8i (see my 10-Minute Solution, “New PL/SQL Features in Oracle 8i: Part I“) to disable and enable the FK constraint:

…/* Disable the constraint */vSQLStatement := ‘alter table Credit ‘                        ||                 ‘modify constraint CREDIT_TRANSACTIONID_FK ‘ ||                 ‘disable’;execute immediate vSQLStatement;delete   from Transactions   where Period = 23      and OriginType = ‘calculated’; /* Enable the constraint */vSQLStatement := ‘alter table Credit ‘                        ||                 ‘modify constraint CREDIT_TRANSACTIONID_FK ‘ ||                 ‘enable novalidate’;execute immediate vSQLStatement;
Now the whole procedure took only 12.5 minutes to execute.

Later I encapsulated “enable/disable constraint” code inside the function as a part of my maintenance package. This is good programming practice because it hides the implementation details and makes the code reusable and clean:

/* Disable the constraint */Maintenance.AlterConstraint(‘CREDIT’, ‘CREDIT_TRANSACTIONID_FK’, 	‘DISABLE’);delete   from Transactions   where Period = 23      and OriginType = ‘calculated’; /* Enable the constraint */Maintenance.AlterConstraint(‘CREDIT’, ‘CREDIT_TRANSACTIONID_FK’, 	‘ENABLE NOVALIDATE’);
As you can see, FK constraints are a powerful mechanism for enforcing RI rules within a database, and you have to provide it with the necessary supporting indexes to ensure optimal performance.

Share the Post:
Share on facebook
Share on twitter
Share on linkedin

Related Posts