There Is No Such Thing as Biomedical “Big Data”

At the moment, the world is obsessed with “Big Data”, yet it sometimes seems that people who use this phrase don’t have a good grasp of its meaning. Like most good buzzwords, “Big Data” evokes something grand and complicated while sounding ordinary enough that listeners feel they understand the concept intuitively. However, “Big Data” actually carries a specific technical meaning, which is getting lost as the term becomes more popular.

The phrase’s predecessor, “Data Mining”, was equally misunderstood. Originally called “database mining” (a subsequently trademarked term), “Data Mining” became common during the 1990s as many businesses rapidly adopted relational database management systems (RDBMS) such as Oracle. An RDBMS stores, optimizes, and manages large amounts of data on physical disks for the purpose of rapid search, retrieval, and update. These large collections of data enabled businesses to extract new knowledge useful to their practices by examining patterns within the data. Data mining refers to a collection of algorithms that attempt to extract knowledge (in the form of rules or associations) from large amounts of data by processing it in place on the disk, either within the RDBMS or within large flat files. This is an important distinction, as the optimization and speed of algorithms that access data from the disk can be quite different from those of algorithms that examine data within active memory.
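To make the disk-versus-memory distinction concrete, here is a minimal sketch in Python of one very simple kind of association mining: counting which pairs of items appear together across transactions stored in a flat file. Everything here is an illustrative assumption rather than any particular product's method: the function names `frequent_pairs` and `demo`, the one-transaction-per-line comma-separated layout, and the grocery items are all made up. The point is only that the file is read one line at a time, so the raw data stays on disk and only the running pair counts are held in memory.

```python
import itertools
import os
import tempfile
from collections import Counter


def frequent_pairs(path, min_count):
    """Stream a flat file of transactions and count co-occurring item pairs.

    Assumed (hypothetical) layout: one transaction per line, items separated
    by commas. Only the pair counts live in memory, never the whole file.
    """
    pair_counts = Counter()
    with open(path, "r", encoding="utf-8") as handle:
        for line in handle:  # read one transaction at a time from disk
            # De-duplicate and sort so each pair has one canonical ordering.
            items = sorted({item.strip() for item in line.split(",") if item.strip()})
            pair_counts.update(itertools.combinations(items, 2))
    # Keep only pairs seen at least min_count times.
    return {pair: n for pair, n in pair_counts.items() if n >= min_count}


def demo():
    """Write a tiny made-up transaction file, mine it, then clean up."""
    rows = ["bread,milk", "bread,butter,milk", "butter,milk", "bread,milk,eggs"]
    with tempfile.NamedTemporaryFile(
        "w", suffix=".csv", delete=False, encoding="utf-8"
    ) as tmp:
        tmp.write("\n".join(rows) + "\n")
        path = tmp.name
    try:
        return frequent_pairs(path, min_count=2)
    finally:
        os.remove(path)


if __name__ == "__main__":
    print(demo())
```

On the four made-up transactions in `demo`, only the pairs (bread, milk) and (butter, milk) occur at least twice. A real data mining system would add far more than this (indexing, candidate pruning, rule confidence measures), but the access pattern is the relevant part: a sequential pass over data that is never assumed to fit in memory.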