
Tip of the Day
Language: SQL Server
Expertise: Beginner
Nov 16, 1999


Importing Gigabytes of Data from a Flat File

I'm currently working on a data warehousing application using SQL Server 7. The problem I have is that I have to import 610,000 records of mainframe data per day into my database from a flat file.

I'm currently using BCP to copy the data into an initial import table, but then I have to perform a number of conversions on datetime, money, and some business-specific fields before writing the data into five main DB tables.
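For reference, a character-mode BCP load into a staging table looks something like the following (the database, table, and file names here are hypothetical stand-ins for your own):

```sql
-- Run from the command line, not from Query Analyzer.
-- -c: character data, -t: field terminator, -T: trusted connection
-- bcp Warehouse..ImportStaging in daily_extract.dat -c -t"," -S MYSERVER -T
```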

So far I'm using a cursor that is populated with the contents of the import table and then calls a number of stored procedures to format and convert the data before finally writing it to its respective tables. The server is a powerful machine (250GB HDD, 1GB RAM) so I thought it could handle the workload, but 100,000 records and 5 hours later I'm beginning to wonder.
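The row-by-row approach described above presumably looks roughly like this sketch (cursor, column, and procedure names are invented for illustration); the per-row stored-procedure call is the expensive part:

```sql
-- Hypothetical row-at-a-time processing of the staging table
DECLARE @RawDate   char(8),
        @RawAmount varchar(20)

DECLARE ImportCursor CURSOR FOR
    SELECT RawDate, RawAmount FROM ImportStaging

OPEN ImportCursor
FETCH NEXT FROM ImportCursor INTO @RawDate, @RawAmount

WHILE @@FETCH_STATUS = 0
BEGIN
    -- One procedure call per row: 610,000 calls per day's load
    EXEC FormatAndInsertRow @RawDate, @RawAmount
    FETCH NEXT FROM ImportCursor INTO @RawDate, @RawAmount
END

CLOSE ImportCursor
DEALLOCATE ImportCursor
```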

If you have any great tips, tricks, or advice, that would be great as I have three years' worth of data—around 180GB to process my way through. Help!

To be honest, you've already anticipated everything I would have suggested first, but here are a few ideas, grasping at straws...

BCP is fast, but I hear that the Bulk Transfer in DTS is no slouch either, so maybe that bears investigating as an alternative to straight BCP. I know it sounds like more overhead, but it may be worth a shot.

Is there any possibility of reconceptualizing the solution without using a cursor? If the data is already being scrubbed in an intermediate staging area, replacing the cursor with a single set-based SQL statement to move the data can yield enough of a performance improvement to make the effort worthwhile. Cursors, though convenient, are never the choice for speed.
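As a sketch of what a single set-based statement might look like, the per-row conversions can usually be folded into one INSERT...SELECT against the staging table (again, all table and column names here are hypothetical; style 112 assumes yyyymmdd mainframe dates):

```sql
-- One pass over the staging table instead of one procedure call per row
INSERT INTO SalesFact (TradeDate, Amount, AccountCode)
SELECT CONVERT(datetime, RawDate, 112),   -- yyyymmdd -> datetime
       CONVERT(money, RawAmount),         -- string -> money
       LTRIM(RTRIM(RawAccount))           -- trim mainframe padding
FROM ImportStaging
WHERE ISDATE(RawDate) = 1                 -- skip unparseable dates
```

Rows that fail the WHERE filter can be swept into an error table in a second, similar statement, so nothing is silently dropped.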

That's all I can suggest. Look at the bulk facilities in DTS—or, since you're already doing these transformations, perhaps a data-driven query would save you some time in the overall process, though the loading step itself may be just as slow.

DevX Pro