Like a "system" or "software" bug, a "data" bug is a defect, fault, flaw, or imperfection in data. Data bugs may be hidden and difficult to find. Consider the following:

Data quality issues

Data veracity issues

Data bias issues

Data cherry-picking issues

Data selection bias issues

Further, humans are flawed: they have both overt and hidden biases, as well as other incentives to skew data to obtain a desired result. These include:

Confirmation bias issues

Narrative fallacy issues

Cognitive bias issues

The consequences of data bugs for data science results may be extremely serious. Data bugs are sometimes very difficult or impossible to detect, and they may trigger errors with a myriad of secondary effects, creating an illusion of reality and leading to bad decisions.

Moreover, data bugs may remain undetected for long periods of time. Data has many secondary uses, and the barriers to sharing it, combining it with other data sources, and transforming or manipulating it are low, so a single bug can spread well beyond its source.

What is urgently needed is a new "meta-data reporting system" that labels, defines, rates and categorizes all new and transformed data (structured, semi-structured and raw unstructured). This goes beyond the traditional, simple definitions of "meta-data". Meta-data is information about data: it describes how, when and by whom a particular set of data was collected, and how the data is formatted. Descriptive meta-data covers the data content and the creation, validation and transformation of the data, as well as specific instances of data application. Structural meta-data provides information about the technical design and specification of data structures.

A new meta-data reporting system should include:

Creation and origins of data (sensor, author, computer system)

Data transformation history (integration with other data)

Data quality ratings

Data veracity ratings

Data validation ratings

Records of specific instances of data applications and results

Potential bias ratings

Potential data manipulation ratings
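
As a rough illustration, the items listed above could be captured in a single record that travels with a data set. The sketch below is only one possible shape: the class name `MetaDataReport`, the field names, and the 0.0 to 1.0 rating scale are all assumptions made for the example, not part of any existing standard.

```python
import json
from dataclasses import dataclass, field, asdict
from typing import List

# Hypothetical names of the rating fields; the 0.0-1.0 scale is an assumption.
RATING_FIELDS = (
    "quality_rating",
    "veracity_rating",
    "validation_rating",
    "bias_rating",
    "manipulation_rating",
)


@dataclass
class MetaDataReport:
    """Hypothetical record covering the items a reporting system should include."""

    origin: str                  # creation and origin: sensor, author, computer system
    quality_rating: float        # assumed scale: 0.0 (worst) to 1.0 (best)
    veracity_rating: float       # assumed scale: 0.0 (worst) to 1.0 (best)
    validation_rating: float     # assumed scale: 0.0 (worst) to 1.0 (best)
    bias_rating: float           # assumed scale: 0.0 (low risk) to 1.0 (high risk)
    manipulation_rating: float   # assumed scale: 0.0 (low risk) to 1.0 (high risk)
    transformation_history: List[str] = field(default_factory=list)
    applications: List[str] = field(default_factory=list)

    def __post_init__(self) -> None:
        # Reject ratings outside the assumed scale so bad labels are caught early.
        for name in RATING_FIELDS:
            value = getattr(self, name)
            if not 0.0 <= value <= 1.0:
                raise ValueError(f"{name} must be between 0.0 and 1.0, got {value}")

    def to_json(self) -> str:
        """Serialize the report so it can be stored or shipped next to the data."""
        return json.dumps(asdict(self), indent=2)
```

How the ratings themselves would be assigned, and by whom, is the hard part and is left open here; the record only gives them a place to live.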

This detailed meta-data should follow the data like a "chain of data evidence" for future users of the data. This is especially useful after the data is sliced, diced and combined with other data sources.
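
One way to make such a "chain of data evidence" tamper-evident is to link each history entry to the previous one with a cryptographic hash. This is a technique chosen for illustration, not something the proposal itself prescribes, and the function names `add_entry` and `verify` and the entry fields are hypothetical. A minimal sketch using only the Python standard library:

```python
import hashlib
import json


def _digest(body):
    """Hash an entry body deterministically (keys sorted before encoding)."""
    encoded = json.dumps(body, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def add_entry(chain, action, details):
    """Append a history entry that records the hash of the entry before it."""
    previous_hash = chain[-1]["hash"] if chain else ""
    body = {"action": action, "details": details, "previous_hash": previous_hash}
    chain.append({**body, "hash": _digest(body)})
    return chain


def verify(chain):
    """Return True if every entry's hash and link to its predecessor check out."""
    previous_hash = ""
    for entry in chain:
        body = {key: entry[key] for key in ("action", "details", "previous_hash")}
        if entry["previous_hash"] != previous_hash or entry["hash"] != _digest(body):
            return False
        previous_hash = entry["hash"]
    return True


# Example: record a creation step and a merge, then check the chain.
history = []
add_entry(history, "created", {"origin": "roadside sensor"})
add_entry(history, "merged", {"with": "weather feed"})
print(verify(history))
```

A check like this only shows that the recorded history has not been edited after the fact. It says nothing about whether the entries were accurate when they were written, which is where the quality, veracity and bias ratings come in.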

With sensors everywhere, from our cars and phones to roads and medical equipment, opportunities to collect data are endless. Combined with large-scale analytics, this is giving rise to new data-driven business models that are affecting healthcare, city planning, public transportation, crime prevention, power utilization, and local commerce.

This panel of experts will weigh in on the implications of this trend and examine the areas of greatest opportunity for innovation and business growth.