Question:
I’m currently working on a data warehousing application using SQL Server 7. The problem I have is that I have to import 610,000 records of mainframe data per day into my database from a flat file.
I’m currently using BCP to copy the data into an initial import table, but then I have to perform a number of conversions on datetime, money, and some business-specific fields before writing the data into five main DB tables.
So far I’m using a cursor that is populated with the contents of the import table and then calls a number of stored procedures to format and convert the data before finally writing it to its respective tables. The server is a powerful machine (250GB HDD, 1GB RAM) so I thought it could handle the workload, but 100,000 records and 5 hours later I’m beginning to wonder.
If you have any great tips, tricks, or advice, I’d be grateful, as I have three years’ worth of data (around 180GB) to work my way through. Help!
Answer:
To be honest, everything I would have suggested first you’ve already covered, but here are a few grasping-at-straws ideas…
BCP is fast, but I hear the Bulk Insert task in DTS is no slouch either, so that bears investigating as an alternative to straight BCP. I know it sounds like more overhead, but it may be worth a shot.
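If it helps to benchmark the raw load outside DTS, SQL Server 7 also exposes bulk loading through the T-SQL BULK INSERT statement (which is what the DTS Bulk Insert task drives). The following is only a sketch; the file path, staging table name, and delimiters are assumptions you would need to match to your extract:

-- Sketch only: hypothetical file path, staging table, and delimiters.
BULK INSERT dbo.ImportStaging
FROM 'D:\feeds\mainframe_extract.dat'
WITH (
    FIELDTERMINATOR = ',',    -- assumed field delimiter
    ROWTERMINATOR = '\n',     -- assumed row terminator
    TABLOCK                   -- table lock helps keep the load minimally logged
)

Running something like this against a copy of one day’s file would at least tell you whether the bottleneck is the load itself or the downstream conversions.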
Is there any possibility of reconceptualizing the solution without using a cursor? If the data is already being scrubbed in an intermediate staging area, finding a single set-based SQL statement to move it will yield enough of a performance improvement to make the search well worth the effort. Cursors, though convenient, are never the fast choice.
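To make that concrete, here is a rough sketch of a set-based load for one of your five target tables. All of the table and column names are invented, and the conversions shown (a CCYYMMDD character date and an amount with two implied decimal places) are only stand-ins for your business-specific rules:

-- Sketch only: invented staging and target tables; substitute your own
-- columns and conversion rules.
INSERT INTO dbo.SalesFact (TradeDate, Amount, AccountCode)
SELECT
    CONVERT(datetime, stg.TradeDateRaw, 112),  -- style 112 parses 'yyyymmdd'
    CONVERT(money, stg.AmountRaw) / 100,       -- assumes two implied decimals
    LTRIM(RTRIM(stg.AccountCodeRaw))
FROM dbo.ImportStaging AS stg
WHERE stg.RecordType = 'S'                     -- assumed per-table row filter

One statement like this per target table replaces the row-by-row cursor loop, and the optimizer gets to work on the whole set at once instead of 610,000 individual procedure calls.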
That’s all I can suggest. Look at the bulk-load options in DTS, or, since you’re already doing these transformations, perhaps a data-driven query would save you some time in the overall process, though the loading part may be just as slow.