Retrieve Rows Set by Set from the Database—Without Duplicates: Page 4




Packages to the Rescue
Clearly, the problem with the previously discussed solutions is that they try to retrieve results again and again from the beginning. Wouldn't it be better to make the database remember what it had already shown in previous batches? That would completely eliminate the inefficiency of the unnecessary sorting, and it would even take care of the duplicates problem.

What would enable such a mechanism? Packages. I am always surprised at how many Oracle programmers move from SQL to PL/SQL, learn how to write procedures and functions, and yet stop one step short of packages. The author of my favorite Oracle book ("Oracle PL/SQL Programming"), Steven Feuerstein, wrote "Packages are among the least understood and most underutilized features of PL/SQL. That is a shame, because the package structure is also one of the most useful constructs for building well-designed PL/SQL-based applications."

Packages enable you to keep data in memory (for the quickest access) and share it across all your application components. The data persists for the duration of the session. Unless you want to persist data to permanent or temporary tables, and worry about how to maintain and clean them up later, packages are exactly what you need.
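To illustrate the point before getting to the real API: a package-level variable is initialized once per session and keeps its value between calls. (This throwaway SessionCounter package is just an illustration, not part of the solution.)

```sql
create or replace package SessionCounter as
   procedure Increment;
   function CurrentValue return pls_integer;
end SessionCounter;
/
create or replace package body SessionCounter as
   -- Package-level variable: initialized once per session,
   -- keeps its value across calls until the session ends.
   vCount pls_integer := 0;

   procedure Increment is
   begin
      vCount := vCount + 1;
   end Increment;

   function CurrentValue return pls_integer is
   begin
      return vCount;
   end CurrentValue;
end SessionCounter;
/
```

Call SessionCounter.Increment twice from the same session and CurrentValue returns 2; a new session starts again from 0. The GetNextSet package below relies on exactly this behavior to keep its cursor open between calls.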

Develop an API to Perform Retrieval of the Next N Rows
The API I propose will be based on a package with one cursor and multiple procedures to open, close, and fetch rows from the cursor. The cursor stays open throughout the session, and it remembers the fetch history because it automatically sets its pointer to the next available row: fetching the first 10 rows would move the pointer to row 11, the next 10 to row 21, etc.

The package specification consists of five procedures:

  1. OpenCursor
  2. CloseCursor
  3. RetrieveRows
  4. SetRows - to set the default value of rows to fetch
  5. SetSalary - to set the default value of the salary parameter that will be passed to the cursor
And two types:
  1. tName - PL/SQL table of Emp.Ename type
  2. tSal - PL/SQL table of Emp.Sal type
The following listing shows the GetNextSet package specification:

create or replace package GetNextSet as
   type tName is table of Emp.Ename%type index by binary_integer;
   type tSal is table of Emp.Sal%type index by binary_integer;
   procedure CloseCursor;
   procedure OpenCursor (pSalary pls_integer := null);
   procedure RetrieveRows (pRows pls_integer := null,
                           pSalary pls_integer := null);
   procedure SetRows (pRows pls_integer);
   procedure SetSalary (pSalary pls_integer);
end GetNextSet;
/

The RetrieveRows procedure does the actual work of retrieving the next N rows. It uses the new Oracle 8i BULK COLLECT clause of the FETCH INTO command with the LIMIT keyword. The BULK COLLECT clause bulk-fetches rows from the cursor into the vName collection of employees' names and the vSal collection of salaries, while the LIMIT keyword sets the size of the batch:

fetch csrEmployees bulk collect into vName, vSal limit vRows;

That's it! That one line of code does all the work! Passing the batch size as a parameter to the RetrieveRows procedure allows you to change it from one fetch to another (e.g., fetch the first 10 rows, then the next 25, and so on).

Listing 1 shows the GetNextSet package body.
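If Listing 1 is not at hand, the body's essential structure can be sketched as follows. This is a simplified reconstruction consistent with the specification and the output shown below (the real listing may differ in details such as error handling):

```sql
create or replace package body GetNextSet as
   vRows   pls_integer := 10;  -- default batch size
   vSalary pls_integer := 0;   -- default salary filter

   cursor csrEmployees (pSalary pls_integer) is
      select Ename, Sal
        from Emp
       where Sal >= pSalary
       order by Sal, Ename;

   procedure OpenCursor (pSalary pls_integer := null) is
   begin
      if pSalary is not null then
         vSalary := pSalary;  -- remember the last explicit filter
      end if;
      if not csrEmployees%isopen then
         dbms_output.put_line('Opening the cursor.');
         open csrEmployees(vSalary);
      end if;
   end OpenCursor;

   procedure CloseCursor is
   begin
      if csrEmployees%isopen then
         dbms_output.put_line('Closing the cursor.');
         close csrEmployees;
      end if;
   end CloseCursor;

   procedure RetrieveRows (pRows pls_integer := null,
                           pSalary pls_integer := null) is
      vName tName;
      vSal  tSal;
   begin
      if pRows is not null then
         vRows := pRows;  -- remember the last explicit batch size
      end if;
      OpenCursor(pSalary);
      -- the one line that does all the work:
      fetch csrEmployees bulk collect into vName, vSal limit vRows;
      for i in 1..vName.count loop
         dbms_output.put_line(vName(i) || ' ' || vSal(i));
      end loop;
      if csrEmployees%notfound then
         dbms_output.put_line('All rows retrieved.');
         CloseCursor;
      end if;
   end RetrieveRows;

   procedure SetRows (pRows pls_integer) is
   begin
      vRows := pRows;
   end SetRows;

   procedure SetSalary (pSalary pls_integer) is
   begin
      vSalary := pSalary;
   end SetSalary;
end GetNextSet;
/
```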

Use the GetNextSet Package
You use the GetNextSet package by calling its procedures from SQL*Plus. First, you need to enable screen output by setting the SERVEROUTPUT value to ON:

set serveroutput on

Now, start retrieving the rows of all employees with salaries greater than or equal to 800 in chunks of three (remember, there are 14 rows total in the Emp table):

begin
   GetNextSet.RetrieveRows(3, 800);
end;
/

Here is the output from the above code:

Opening the cursor.
SMITH 800
JAMES 950
ADAMS 1100

The first three rows come back sorted by salary and then by name, because the cursor's SQL uses the Emp_Sal_Ename index as you planned. Now retrieve the next set:

begin
   GetNextSet.RetrieveRows;
end;
/

Here is the output of the above code:

MARTIN 1250
WARD 1250
MILLER 1300

There are only 14 rows in total, and you're probably getting a little impatient. Change the size of the set from three to six:

begin
   GetNextSet.RetrieveRows(6);
end;
/

Now you're rolling:

TURNER 1500
ALLEN 1600
CLARK 2450
BLAKE 2850
JONES 2975
FORD 3000

Now, perform the last fetch:

begin
   GetNextSet.RetrieveRows;
end;
/

Here are the results:

SCOTT 3000
KING 5000
All rows retrieved.
Closing the cursor.

To try out the other procedures in the GetNextSet package, execute the following PL/SQL block, which changes your defaults to a set size of five and a salary of 2000:

begin
   GetNextSet.SetRows(5);
   GetNextSet.SetSalary(2000);
end;
/

Now you will start retrieving all high-salaried employees in chunks of five:

begin
   GetNextSet.RetrieveRows;
end;
/

Here is the output:

Opening the cursor.
CLARK 2450
BLAKE 2850
JONES 2975
FORD 3000
SCOTT 3000

The next fetch:

begin
   GetNextSet.RetrieveRows;
end;
/

appears to be your last one:

KING 5000
All rows retrieved.
Closing the cursor.

Looks good and runs instantly! Of course, the table is tiny—only 14 rows. What would happen in a table with hundreds of thousands of rows? I conducted simple performance testing to find out.

As I mentioned previously, the packages solution eliminates the need to re-retrieve or re-sort rows for subsequent requests. The cursor keeps its pointer on the next available row, so you get just the next N rows from the cursor every time. Easy and simple!

Performance Testing Results
I populated a table with 200,000 rows and retrieved rows in chunks of 10, with a total of 100,000 rows matching my filter criteria. Opening the cursor took 0.2 seconds, and fetching the next 10 rows took 0.15 seconds, regardless of whether it was the first 10 rows or a set in the middle of the 100,000. I consider these good results, particularly in light of Solution #2 from the Erroneous Solutions section. Executing the Solution #2 code got slower with each subsequent request, and it became really painful toward the middle of the 100,000 rows: 15-17 seconds, which is completely unacceptable for retrieving just 10 rows.

What to Improve If You Are on Oracle 9i
Oracle 8i imposes a restriction on the usage of the BULK COLLECT clause: the target cannot be a collection of records; it has to be one individual collection per column. That's why I declared two types (tName and tSal) and initialized two collections (vName and vSal). Oracle lifted this restriction in 9i: you just declare a single record type and initialize one collection of that type.
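Under 9i, for example, the two parallel collections could collapse into one. The following anonymous block is an illustrative sketch of that approach (the type name tEmpRec is my own):

```sql
declare
   cursor csrEmployees is
      select Ename, Sal from Emp order by Sal, Ename;
   -- 9i allows bulk-collecting into a collection of records,
   -- so one collection replaces the separate tName/tSal pair
   type tEmpRec is table of csrEmployees%rowtype
      index by binary_integer;
   vEmp tEmpRec;
begin
   open csrEmployees;
   fetch csrEmployees bulk collect into vEmp limit 10;
   for i in 1..vEmp.count loop
      dbms_output.put_line(vEmp(i).Ename || ' ' || vEmp(i).Sal);
   end loop;
   close csrEmployees;
end;
/
```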

Two new PL/SQL features in Oracle 8i that I like are Native Dynamic SQL (see New PL/SQL Features in Oracle 8i: Part I) and Bulk Binds (see New PL/SQL Features in Oracle 8i: Part II). The problem is that you cannot use them together in 8i: if you open a cursor with Native Dynamic SQL, you cannot bulk-fetch from it. That's why I couldn't use Native Dynamic SQL and had to hardcode the cursor in my package body. The current code works only against the Emp table and retrieves only two fields, name and salary.

But if you are on Oracle 9i, you could make the cursor definition more generic by constructing the SQL string from passed-in parameters: table name(s), list of fields, WHERE and ORDER BY clauses; all of it could be parameterized. Once your package is more generic, you can use it in many places. (To clarify: you can build a cursor's SQL dynamically in Oracle 8i as well, but then you cannot use the BULK BINDS/BULK COLLECT feature and have to settle for fetching rows one at a time, which is not nearly as efficient as bulk-fetching.)
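In 9i, that combination becomes possible: a ref cursor opened with Native Dynamic SQL can be bulk-fetched. A simplified sketch, reusing the collection types from the GetNextSet specification (the hardcoded SQL string and the literal bind value 800 stand in for what would be parameters in a generic package):

```sql
declare
   type tCur is ref cursor;
   vCursor tCur;
   vName   GetNextSet.tName;
   vSal    GetNextSet.tSal;
   -- in a generic package, table and column names would be
   -- assembled from parameters instead of hardcoded here
   vSQL varchar2(200) :=
      'select Ename, Sal from Emp where Sal >= :s order by Sal, Ename';
begin
   open vCursor for vSQL using 800;
   -- 9i: bulk-fetching from a dynamically opened cursor works
   fetch vCursor bulk collect into vName, vSal limit 10;
   for i in 1..vName.count loop
      dbms_output.put_line(vName(i) || ' ' || vSal(i));
   end loop;
   close vCursor;
end;
/
```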

The good thing about packages is that the existing package structure lets you add more procedures to support additional functionality (e.g., skipping forward or backward, or getting the total number of sets (pages) for a specified set size). That's just one more example of why packages are the way to go in PL/SQL development.

Boris Milrud has 10 years of experience as a software developer. He is a Senior Database Engineer with Callidus Software, Inc., located in San Jose, CA. He specializes in all aspects of Oracle database software development, including database design, programming, optimization, and tuning.