big data

There used to be big oil, big tobacco, and big pharma... now it's big data.

In information technology, "big data" traditionally refers to a collection of data so large and complex that it becomes difficult to process using conventional database management tools. The challenges include capture, curation, storage, search, sharing, analysis, and visualization.

Big data is difficult to work with using relational database management systems (RDBMS) and desktop statistics and visualization packages, requiring instead "massively parallel software running on tens, hundreds, or even thousands of servers". What is considered "big data" varies depending on the capabilities of the organization managing the data set. "For some organizations, facing hundreds of gigabytes of data for the first time may trigger a need to reconsider data management options. For others, it may take tens or hundreds of terabytes before data size becomes a significant consideration."

Historical perspective: Data sets grow in size in part because they are increasingly being gathered by ubiquitous information-sensing mobile devices, aerial sensory technologies (remote sensing), software logs, cameras, microphones, radio-frequency identification readers, and wireless sensor networks. According to Wikipedia, the world's technological per-capita capacity to store information has roughly doubled every 40 months since the 1980s; as of 2012, 2.5 quintillion (2.5×10^18) bytes of data were created every day. Data are often said to be proliferating at a rate that outpaces Moore's Law, although a 40-month doubling period is itself slower than the roughly two-year doubling usually attributed to Moore's Law.
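
To put the doubling figure in perspective, a doubling period can be converted to an annual growth rate as 2^(12/months) - 1. The following is a minimal sketch of that arithmetic, not a measurement. The 24-month doubling period for Moore's Law is an assumption (the commonly quoted value, not taken from the text above), and the function and variable names are illustrative only.

```python
def annual_growth_rate(doubling_months: float) -> float:
    """Annual growth rate implied by a doubling period given in months."""
    return 2 ** (12 / doubling_months) - 1


# Per-capita storage capacity: doubles roughly every 40 months (figure from the text).
storage_rate = annual_growth_rate(40)

# Moore's Law: ASSUMED here to mean a doubling every 24 months.
moore_rate = annual_growth_rate(24)

# 2.5 quintillion bytes per day (figure from the text), expressed in exabytes.
BYTES_PER_DAY = 2.5e18
exabytes_per_day = BYTES_PER_DAY / 1e18  # 1 exabyte = 10^18 bytes

print(f"40-month doubling: {storage_rate:.1%} per year")
print(f"24-month doubling: {moore_rate:.1%} per year")
print(f"Data created per day: {exabytes_per_day} EB")
```

Under these assumptions, a 40-month doubling works out to roughly 23% growth per year, against roughly 41% per year for a 24-month doubling, and 2.5 quintillion bytes is 2.5 exabytes.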