Just how big is Big Data, anyway? There's no fixed answer. Wikipedia, for one, defines Big Data as a "collection of data sets so large and complex that it becomes difficult to process using on-hand database management tools or traditional data processing applications."
Other definitions follow the same pattern: if your data set is so large that traditional or available tools aren’t up to the task of processing it, then it counts as Big Data.
Not only is this definition relative to the industry and the type of data problem, it's also a moving target. Today, for example, a data set that requires Hadoop and MapReduce for processing would typically fall into the Big Data bucket — for now. But once Hadoop matures enough to qualify as a traditional data processing application, any data challenge within its reach can no longer be considered Big Data.
Does that mean that once our current crop of Big Data tools — Hadoop and its brethren — reaches maturity, no data sets will qualify as Big Data? Not on your life. Remember, the quantity of data we may wish to process continues to grow exponentially. There will always be a Big Data category at the fringe of what our tools can currently handle. As the tools mature, the threshold for how big a data set must be to qualify as Big Data will keep rising. But we'll never, ever run out of Big Data.