Question:
I’m working with a large multidimensional dataset (more than 10 dimensions and more than 100,000 points). One particular problem I have is with virtual memory use (for example, via new). When I input a dataset with more than 6 dimensions and 100,000 points, the program halts with the message: “virtual memory exceeded in new()”. Can you give me some general guidelines for working with large datasets?
The program I created uses a dynamic linked-list structure, a fixed-size lookup table (equal in size to the dataset), and dynamic arrays created in a loop (I use new and delete within the loop so that the memory can be freed and reused).
Answer:
It’s hard to be very specific without more information, but, basically, it sounds like you are simply running out of memory.
While I don’t know how much memory each item in your dataset takes, even if each one were only 1 byte (very unlikely), a dense array with 100,000 cells along each of six dimensions works out to (10^5)^6 cells; the number of bytes you’d need to allocate would be something like 1 with 30 zeros following. It sounded like you were even trying to allocate memory on top of that!
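To put that number in perspective, here is a rough back-of-the-envelope sketch in C++. The dense-grid interpretation and the per-point layout (ten double coordinates per point) are assumptions made purely for illustration: a dense grid blows up astronomically, while simply storing the 100,000 points themselves fits in a few megabytes.

#include <cstdint>
#include <cstdio>
#include <cmath>

int main()
{
    // Dense-grid view: 100,000 cells along each of 6 dimensions.
    // (10^5)^6 = 10^30 cells -- far more than a 64-bit size can even
    // express, let alone allocate with new.
    const double cells_per_dim = 100000.0;
    const double dims          = 6.0;
    const double dense_cells   = std::pow(cells_per_dim, dims);   // 1e30

    // Sparse view: store only the 100,000 points that actually exist,
    // each holding 10 double-precision coordinates (assumed layout).
    const std::uint64_t points       = 100000;
    const std::uint64_t coords       = 10;
    const std::uint64_t sparse_bytes = points * coords * sizeof(double); // ~8 MB

    std::printf("dense grid cells : %.3g\n", dense_cells);
    std::printf("sparse storage   : %llu bytes (~%.1f MB)\n",
                (unsigned long long)sparse_bytes,
                sparse_bytes / (1024.0 * 1024.0));
    return 0;
}

The point of the arithmetic is simply that any approach which touches every cell of the full multidimensional space is doomed, while an approach that only stores the points you actually have is trivially small.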
I know you provided more details, but it wasn’t entirely clear how those pieces fit together, and without understanding the task I can’t be much more specific.
You need to rethink your approach; see if there are ways to use less memory; see if you can load only small portions of your data at a time. Either that, or move to a purely disk-based approach and wait about five or ten years for hard disks that can hold that much data.
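If you do move toward processing small portions at a time from disk, the usual pattern is a single fixed-size buffer that is filled, processed, and reused. Here is a minimal sketch, assuming a hypothetical raw binary file points.bin that holds the coordinates back-to-back, ten doubles per point; the file name, record layout, chunk size, and process_chunk() are all stand-ins for whatever your program actually needs.

#include <cstdio>
#include <cstdlib>
#include <vector>

const std::size_t DIMS      = 10;     // coordinates per point (assumed)
const std::size_t CHUNK_PTS = 4096;   // points held in memory at once

// Placeholder for whatever analysis the real program performs on a slice.
void process_chunk(const double* pts, std::size_t count)
{
    double sum = 0.0;
    for (std::size_t i = 0; i < count * DIMS; ++i)
        sum += pts[i];
    std::printf("processed %lu points (sum %.3f)\n",
                (unsigned long)count, sum);
}

int main()
{
    // points.bin is a hypothetical file of raw double coordinates,
    // DIMS per point, written back-to-back.
    std::FILE* in = std::fopen("points.bin", "rb");
    if (!in) { std::perror("points.bin"); return EXIT_FAILURE; }

    // One reusable buffer, allocated once: at most CHUNK_PTS * DIMS
    // doubles (about 320 KB here) are ever resident in memory.
    std::vector<double> buf(CHUNK_PTS * DIMS);

    std::size_t got;
    while ((got = std::fread(buf.data(), sizeof(double) * DIMS,
                             CHUNK_PTS, in)) > 0)
    {
        process_chunk(buf.data(), got);   // work on this slice only
    }

    std::fclose(in);
    return EXIT_SUCCESS;
}

With a chunk of 4,096 points, only a few hundred kilobytes are ever resident at once, no matter how large the file on disk grows.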