Retrieve Rows Set by Set from the Database—Without Duplicates

Retrieve Rows Set by Set from the Database—Without Duplicates

ince the beginning of the year I’ve received multiple emails from DevX readers regarding a discussion titled “How can I retrieve set by set records from a table?” on the DevX Oracle discussion group. The discussion revolves around a reader who wants to retrieve rows in sets of 10 and each time he or she retrieves another set the 10 records must not be previously obtained (i.e., the first time the reader gets rows 1 to 10; the second time, 11 to 20 without getting rows 1 to 10 again; the next time, 21 to 30, etc.).

The reader interest has been so high and the number of erroneous approaches so prevalent that I have decided to dedicate this 10-Minute Solution to describing a solution to the reader’s request.



How do I retrieve rows set by set from a database without getting previously obtained rows in subsequent sets?



Use Oracle’s package structure as a holder for session-persistent data in combination with the new Oracle 8i BULK COLLECT feature and LIMIT keyword.

Use the EMP Table
In this 10-Minute Solution, I am going to use the EMP table in the query examples. This table belongs to the SCOTT schema of the Oracle starter database (provided with Oracle database server versions 7.3 to 9i).

Here is the table’s structure:

Name         Null?     Type———–  ——–  ————EMPNO        NOT NULL  NUMBER(4)ENAME                  VARCHAR2(10)JOB                    VARCHAR2(9)MGR                    NUMBER(4)HIREDATE               DATESAL                    NUMBER(7,2)COMM                   NUMBER(7,2)DEPTNO                 NUMBER(2)

Run the following query to get a look at the data in the EMP table:

select Empno, Ename, Job, Mgr, Hiredate, Salfrom   Emp;

The following listing shows the output of the above code:

  EMPNO ENAME      JOB              MGR HIREDATE         SAL——- ———- ——— ———- ——— ———-   7369 SMITH      CLERK           7902 17-DEC-80        800   7499 ALLEN      SALESMAN        7698 20-FEB-81       1600   7521 WARD       SALESMAN        7698 22-FEB-81       1250   7566 JONES      MANAGER         7839 02-APR-81       2975   7654 MARTIN     SALESMAN        7698 28-SEP-81       1250   7698 BLAKE      MANAGER         7839 01-MAY-81       2850   7782 CLARK      MANAGER         7839 09-JUN-81       2450   7788 SCOTT      ANALYST         7566 19-APR-87       3000   7839 KING       PRESIDENT            17-NOV-81       5000   7844 TURNER     SALESMAN        7698 08-SEP-81       1500   7876 ADAMS      CLERK           7788 23-MAY-87       1100   7900 JAMES      CLERK           7698 03-DEC-81        950   7902 FORD       ANALYST         7566 03-DEC-81       3000   7934 MILLER     CLERK           7782 23-JAN-82       130014 rows selected. 

The query returns only 14 rows, but that is enough if you want to retrieve records in sets of three (i.e., if you choose the value of N = 3). For all the code examples, you will retrieve the names and salaries (ENAME and SAL columns) of employees whose salaries are greater than or equal to $750.

The following query returns all the results in one batch:

select Ename,       Salfrom   Empwhere  Sal >= 750;

The following listing shows the output from the above code:

ENAME             SAL———- ———-SMITH             800ALLEN            1600WARD             1250JONES            2975MARTIN           1250BLAKE            2850CLARK            2450SCOTT            3000KING             5000TURNER           1500ADAMS            1100JAMES             950FORD             3000MILLER           130014 rows selected.

The query ran almost instantly and the results look great. Again, you have only 14 rows.

Erroneous Solutions: What You Should Not Do and Why
A common mistake among the erroneous solutions for retrieving record sets row by row is trying to utilize the Oracle pseudocolumn ROWNUM. For each row returned by a query, the ROWNUM pseudocolumn returns a number that indicates the order in which Oracle selects the row or a set of joined rows from a table. The first row selected has a ROWNUM of 1, the second has 2, and so on.

The first erroneous solution posted in the discussion used the following syntax:

select Ename, Salfrom   Empwhere  sal    >= 750and    rownum >= 1and    rownum <= 3;

The following listing shows the output:

ENAME             SAL———- ———-SMITH             800ALLEN            1600WARD             1250

The obvious question is how do you then get the next N (three) rows? The author suggested that you just need to keep adding N to the lower and upper boundaries, like this:

select Ename, Salfrom   Empwhere  sal    >= 750and    rownum >= 4and    rownum <= 6;

If the author had run his suggested solution more than once, he would have noticed that such a query returns nothing. The reason is right in ROWNUM’s definition: if you want to get only rows 4 through 6, that means you don’t want rows 1 to 3. But that’s the only way to get to rows 4 to 6 with this solution.

Solution 2 offered a slightly more complex syntax that aimed to correct the ROWNUM problem by moving ROWNUM filtering from the WHERE clause to the HAVING clause:

select rownum,       max(Ename) as Ename,       max(Sal) as Salfrom Empwhere sal >= 750group by rownumhaving rownum > = 1and    rownum < = 3;

The following listing shows the output:

  ROWNUM ENAME             SAL——– ———- ———-       1 SMITH             800       2 ALLEN            1600       3 WARD             1250

The solution then called for the user to query to get the next set:

select rownum,       max(Ename) as Ename,       max(Sal) as Salfrom Empwhere sal >= 750group by rownumhaving rownum > = 4and    rownum < = 6;

The following listing shows the output of the above code:

   ROWNUM ENAME             SAL——— ———- ———-        4 JONES            2975        5 MARTIN           1250        6 BLAKE            2850

These results look much better than those for Solution 1, but they are far from perfect. Why?

First of all, this solution uses a GROUP BY clause so the SQL engine will accept the HAVING clause syntax. But using GROUP BY necessitates the MAX() function for each of the columns in the FROM list. That is totally unnecessary and inefficient because it leads to an additional SORT GROUP BY operation. Most importantly, filtering in the HAVING clause applies only after all of the matching rows are retrieved and sorted. So if 100,000 rows match the criteria in the WHERE clause, every time you ask for the next 10 rows, the database will retrieve and sort all 100,000 and then filter them by row number. Extremely inefficient!

In addition, the solution assumes that the database will return rows in the same order for the same SQL query every time. That’s a dangerous assumption! The user cannot guarantee that the database will retrieve rows in the same sequence unless it is enforced by the ORDER BY clause. The ORDER BY clause would force sorting of the whole set of retrieved rows, which is time-consuming and may not be necessary at all. In the example of 100,000 rows, every time you ask for the next 10, the database will retrieve all 100,000 rows, sort them (SORT ORDER BY operation), and then return a batch of 10 rows. That does not sound good at all!

Sort order is not important in certain situations, and an index could be used to avoid using the ORDER BY clause. For example, if you create a composite index on the SAL and ENAME columns, you would get rows sorted by salary and then by name without using the ORDER BY clause and the SORT ORDER BY operation.

Try using an index in your queries:

create index Emp_Sal_Enameon emp(sal, ename)compute statistics;

Even with the GROUP BY clause enforcing the same sequence in the retrieved rows, this solution has a potential problem. Let’s say you’re searching multiple Web sites for a new house. After all, the house market is still strong–unlike the NASDAQ. You set your preferences to sort houses by price and display 10 per page. While you are browsing through the first 10, the database gets updated with three new inexpensive offers. (Living in the San Francisco Bay Area has erased the word cheap from my vocabulary.) That would shift some of the houses you’re viewing three positions down. So what you actually are looking at are no longer rows 1 through 10. You’re seeing the first, third, and fourth, and 7 through 13 (the newly inserted bargains take the second, fifth, and sixth positions).

This should not be a problem at all. After all, databases have to be up-to-date. Plus, the Web site cannot lock the database while you view the data; you are one of many users and the database cannot work exclusively for you. However, what would happen when you click the Next button to get the next 10 records? You would see the first three (11, 12, and 13) repeated again. That would either imply software problems or duplicates (or dupes) in that Web site’s listings.

Packages to the Rescue
Clearly, the problem with the previously discussed solutions is that they try to retrieve results again and again from the beginning. Wouldn’t making the database remember what it had shown in previous batches be better? It would completely eliminate the inefficiency problem that unnecessary sorting causes, and it would even take care of the dupes problem.

What would enable such a mechanism? Packages. I am always surprised at how many Oracle programmers move from SQL to PL/SQL, learn how to write procedures and functions, and yet stop one step short of packages. The author of my favorite Oracle book (“Oracle PL/SQL Programming”), Steven Feuerstein, wrote “Packages are among the least understood and most underutilized features of PL/SQL. That is a shame, because the package structure is also one of the most useful constructs for building well-designed PL/SQL-based applications.”

Packages enable you to keep data in memory (for the quickest access) and share it across all your application components. The data would persist for the duration of the session. Unless you want to persist data to the table(s), permanent or temporary, and worry about how to maintain/clean it up later, packages are exactly what you need.

Develop an API to Perform Retrieval of the Next N Rows
The API I propose will be based on a package with one cursor and multiple procedures to open, close, and fetch rows from the cursor. The cursor stays open throughout the session, and it remembers the fetch history because it automatically sets its pointer to the next available row: fetching the first 10 rows would move the pointer to row 11, the next 10 to row 21, etc.

The package specification consists of five procedures:

  1. OpenCursor
  2. CloseCursor
  3. RetrieveRows
  4. SetRows – to set the default value of rows to fetch
  5. SetSalary – to set the default value of the salary parameter that will be passed to the cursor
And two types:
  1. tName – PL/SQL table of Emp.Ename type
  2. tSal – PL/SQL table of Emp.Sal type
The following listing shows the GetNextSet package specification:
create or replace package GetNextSet as      type tName is   table of Emp.Ename%type   index by binary_integer;      type tSal is   table of Emp.Sal%type   index by binary_integer;      procedure CloseCursor;      procedure OpenCursor (pSalary pls_integer := null);   procedure RetrieveRows (pRows   pls_integer := null,                           pSalary pls_integer := null);                              procedure SetRows (pRows pls_integer);      procedure SetSalary(pSalary pls_integer);end GetNextSet;/

The RetrieveRows procedure does the actual work of retrieving the next N rows. It uses the new Oracle 8i BULK COLLECT clause of the FETCH INTO command with the LIMIT keyword. The BULK COLLECT clause allows us to bulk-fetch rows from the cursor into the vName collection of employees’ names and the vSal collection of salaries, while LIMIT keyword sets the size of the batch:

fetch csrEmployees bulk collect into vName, vSal limit vRows;

That’s it! That one line of code does all the work! Passing batch size as a parameter to RetrieveRows procedure allows you to change it from one fetch to another (i.e., give me the first 10 rows, then the next 25 rows, etc.).

Listing 1 shows the GetNextSet package body.

Use the GetNextSet Package
You use the GetNextSet package by calling its procedures from SQL*Plus. First, you need to enable screen output by setting the SERVEROUTPUT value to ON:

set serveroutput on

Now, start retrieving rows of all employees with salaries greater or equal to 800 in chunks of three (remember, you have 14 total rows in the Emp table):

begin   GetNextSet.RetrieveRows(3, 800);end;/

Here is the output from the above code:

Opening the cursor.SMITH                 800JAMES                 950ADAMS                1100

You’ve sorted the first three rows by salary and then by name, because the cursor definition in your SQL is using the Emp_Sal_Ename index as you planned. Now retrieve the next set:

begin   GetNextSet.RetrieveRows;end;/

Here is the output of the above code:

MARTIN               1250WARD                 1250MILLER               1300

Only 14 rows are returned and you’re probably getting a little impatient. Change the size of the set from three to six:

begin   GetNextSet.RetrieveRows(6);end;/

Now you’re rolling:

TURNER               1500ALLEN                1600CLARK                2450BLAKE                2850JONES                2975FORD                 3000

Now, perform the last fetch:

begin   GetNextSet.RetrieveRows;end;/

Here are the results:

SCOTT                3000KING                 5000All rows retrieved. Closing the cursor.

To try out other procedures in the GetNextSet package, execute the following PL/SQL block, which changes your default settings for set size to five and for salary to 2000:

begin   GetNextSet.SetRows(5);   GetNextSet.SetSalary(2000);end;/

Now you will start retrieving all high-salaried employees in chunks of five:

begin   GetNextSet.RetrieveRows;end;/

Here is the output:

Opening the cursor.CLARK                2450BLAKE                2850JONES                2975FORD                 3000SCOTT                3000

The next fetch:

begin   GetNextSet.RetrieveRows;end;/

appears to be your last one:

KING                 5000All rows retrieved. Closing the cursor.

Looks good and runs instantly! Of course, the table is tiny?only 14 rows. What would happen in a table with hundreds of thousands of rows? I conducted simple performance testing to find out.

As I mentioned previously, the packages solution eliminates the need to retrieve or sort for the consequent requests. The cursor keeps the pointer on the next available row, so you get just the next N rows from the cursor every time. Easy and simple!

Performance Testing Results
I populated a table with 200,000 rows and retrieved rows in chunks of 10 with a total of 100,000 matching my filter criteria. Opening the cursor took 0.2 seconds, and fetching the next 10 rows took 0.15 seconds?regardless of whether it was the first 10 rows or a set in the middle of the 100,000 rows. I consider these good results, particularly in light of Solution #2 from the Erroneous Solutions section. Executing Solution #2 code got slower for subsequent requests and it became really painful as I got to the middle of the 100,000 rows. It took 15-17 seconds, which is completely unacceptable for retrieving just 10 rows.

What to Improve If You Are on Oracle 9i
Oracle 8i imposes a restriction on the usage of the BULK COLLECT clause: the target cannot be a collection of records; it has to be individual collections. That’s why I declared two types (tName and tSal) and initialized two collections (vName and vSal). Oracle lifted this restriction in 9i. You need to just declare a type and initialize a collection of that type.

Two new PL/SQL features in Oracle 8i that I like are Native Dynamic SQL (see New PL/SQL Features in Oracle 8i: Part I) and Bulk Binds (see New PL/SQL Features in Oracle 8i: Part II). The problem is you cannot use them together in 8i. If you open a cursor with Native Dynamic SQL, you cannot perform bulk-fetching. That’s why I couldn’t use Native Dynamic SQL and had to hardcode the cursor in my package body. The current code works only against Emp table and retrieves only two fields, name and salary.

But if you are on Oracle 9i, you could make the cursor definition code more generic by constructing a SQL string using passed in parameters: table name(s), list of fields, WHERE or ORDER BY clauses?all of it could be parameterized. Once your package is more generic, you can use it in many places. (Just to clarify: building a cursor’s SQL dynamically can be done in Oracle 8i also, but then you can not use the BULK BINDS/BULK COLLECT feature and you have to settle for fetching rows one at a time, which is not nearly as efficient as bulk-fetching.)

The good thing about packages is that the existing package structure allows you to add more procedures to support additional functionality (e.g., skipping forward or backward, getting the total amount of sets (pages) with a specified set size, and more). That’s just another example that shows why packages is the way to go in PL/SQL development.

devx-admin

devx-admin

Share the Post:
USA Companies

Top Software Development Companies in USA

Navigating the tech landscape to find the right partner is crucial yet challenging. This article offers a comparative glimpse into the top software development companies

Software Development

Top Software Development Companies

Looking for the best in software development? Our list of Top Software Development Companies is your gateway to finding the right tech partner. Dive in

India Web Development

Top Web Development Companies in India

In the digital race, the right web development partner is your winning edge. Dive into our curated list of top web development companies in India,

USA Web Development

Top Web Development Companies in USA

Looking for the best web development companies in the USA? We’ve got you covered! Check out our top 10 picks to find the right partner

Clean Energy Adoption

Inside Michigan’s Clean Energy Revolution

Democratic state legislators in Michigan continue to discuss and debate clean energy legislation in the hopes of establishing a comprehensive clean energy strategy for the

Chips Act Revolution

European Chips Act: What is it?

In response to the intensifying worldwide technology competition, Europe has unveiled the long-awaited European Chips Act. This daring legislative proposal aims to fortify Europe’s semiconductor

USA Companies

Top Software Development Companies in USA

Navigating the tech landscape to find the right partner is crucial yet challenging. This article offers a comparative glimpse into the top software development companies in the USA. Through a

Software Development

Top Software Development Companies

Looking for the best in software development? Our list of Top Software Development Companies is your gateway to finding the right tech partner. Dive in and explore the leaders in

India Web Development

Top Web Development Companies in India

In the digital race, the right web development partner is your winning edge. Dive into our curated list of top web development companies in India, and kickstart your journey to

USA Web Development

Top Web Development Companies in USA

Looking for the best web development companies in the USA? We’ve got you covered! Check out our top 10 picks to find the right partner for your online project. Your

Clean Energy Adoption

Inside Michigan’s Clean Energy Revolution

Democratic state legislators in Michigan continue to discuss and debate clean energy legislation in the hopes of establishing a comprehensive clean energy strategy for the state. A Senate committee meeting

Chips Act Revolution

European Chips Act: What is it?

In response to the intensifying worldwide technology competition, Europe has unveiled the long-awaited European Chips Act. This daring legislative proposal aims to fortify Europe’s semiconductor supply chain and enhance its

Revolutionized Low-Code

You Should Use Low-Code Platforms for Apps

As the demand for rapid software development increases, low-code platforms have emerged as a popular choice among developers for their ability to build applications with minimal coding. These platforms not

Cybersecurity Strategy

Five Powerful Strategies to Bolster Your Cybersecurity

In today’s increasingly digital landscape, businesses of all sizes must prioritize cyber security measures to defend against potential dangers. Cyber security professionals suggest five simple technological strategies to help companies

Global Layoffs

Tech Layoffs Are Getting Worse Globally

Since the start of 2023, the global technology sector has experienced a significant rise in layoffs, with over 236,000 workers being let go by 1,019 tech firms, as per data

Huawei Electric Dazzle

Huawei Dazzles with Electric Vehicles and Wireless Earbuds

During a prominent unveiling event, Huawei, the Chinese telecommunications powerhouse, kept quiet about its enigmatic new 5G phone and alleged cutting-edge chip development. Instead, Huawei astounded the audience by presenting

Cybersecurity Banking Revolution

Digital Banking Needs Cybersecurity

The banking, financial, and insurance (BFSI) sectors are pioneers in digital transformation, using web applications and application programming interfaces (APIs) to provide seamless services to customers around the world. Rising

FinTech Leadership

Terry Clune’s Fintech Empire

Over the past 30 years, Terry Clune has built a remarkable business empire, with CluneTech at the helm. The CEO and Founder has successfully created eight fintech firms, attracting renowned

The Role Of AI Within A Web Design Agency?

In the digital age, the role of Artificial Intelligence (AI) in web design is rapidly evolving, transitioning from a futuristic concept to practical tools used in design, coding, content writing

Generative AI Revolution

Is Generative AI the Next Internet?

The increasing demand for Generative AI models has led to a surge in its adoption across diverse sectors, with healthcare, automotive, and financial services being among the top beneficiaries. These

Microsoft Laptop

The New Surface Laptop Studio 2 Is Nuts

The Surface Laptop Studio 2 is a dynamic and robust all-in-one laptop designed for creators and professionals alike. It features a 14.4″ touchscreen and a cutting-edge design that is over

5G Innovations

GPU-Accelerated 5G in Japan

NTT DOCOMO, a global telecommunications giant, is set to break new ground in the industry as it prepares to launch a GPU-accelerated 5G network in Japan. This innovative approach will

AI Ethics

AI Journalism: Balancing Integrity and Innovation

An op-ed, produced using Microsoft’s Bing Chat AI software, recently appeared in the St. Louis Post-Dispatch, discussing the potential concerns surrounding the employment of artificial intelligence (AI) in journalism. These

Savings Extravaganza

Big Deal Days Extravaganza

The highly awaited Big Deal Days event for October 2023 is nearly here, scheduled for the 10th and 11th. Similar to the previous year, this autumn sale has already created

Cisco Splunk Deal

Cisco Splunk Deal Sparks Tech Acquisition Frenzy

Cisco’s recent massive purchase of Splunk, an AI-powered cybersecurity firm, for $28 billion signals a potential boost in tech deals after a year of subdued mergers and acquisitions in the

Iran Drone Expansion

Iran’s Jet-Propelled Drone Reshapes Power Balance

Iran has recently unveiled a jet-propelled variant of its Shahed series drone, marking a significant advancement in the nation’s drone technology. The new drone is poised to reshape the regional

Solar Geoengineering

Did the Overshoot Commission Shoot Down Geoengineering?

The Overshoot Commission has recently released a comprehensive report that discusses the controversial topic of Solar Geoengineering, also known as Solar Radiation Modification (SRM). The Commission’s primary objective is to

Remote Learning

Revolutionizing Remote Learning for Success

School districts are preparing to reveal a substantial technological upgrade designed to significantly improve remote learning experiences for both educators and students amid the ongoing pandemic. This major investment, which

Revolutionary SABERS Transforming

SABERS Batteries Transforming Industries

Scientists John Connell and Yi Lin from NASA’s Solid-state Architecture Batteries for Enhanced Rechargeability and Safety (SABERS) project are working on experimental solid-state battery packs that could dramatically change the

Build a Website

How Much Does It Cost to Build a Website?

Are you wondering how much it costs to build a website? The approximated cost is based on several factors, including which add-ons and platforms you choose. For example, a self-hosted