BDGMM: The 1st IEEE International Workshop on Big Data Governance and Metadata Management

Call for Papers

Big Data is a collection of data so large, so complex, so distributed, and growing so fast that it is often characterized by the "5Vs": volume, variety, velocity, veracity, and value. It has been shown to unlock new sources of economic value, provide fresh insights into the sciences, and assist in policy making. However, Big Data is not practically consumable until it can be aggregated and integrated in a manner that a computer system can process. For instance, in the Internet of Things (IoT) environment, there is a great deal of variation in the hardware, software, coding methods, terminologies, and nomenclatures used among data generation systems. Given the variety of data locations, formats, structures, and access policies, data aggregation is extremely complex and difficult. As a concrete example, suppose a health researcher is interested in answering a series of questions, such as: "How is the gene 'myosin light chain 2' associated with chamber-type hypertrophic cardiomyopathy? How similar is it to a subset of other genes' features? What are the potential connections among pairs of genes?" To answer these questions, one may retrieve information from familiar databases, such as the NCBI Gene database or PubMed. In the Big Data era, however, it is highly likely that other repositories also store relevant data. This raises two questions:

Is there an approach to managing such big data so that a single search engine can obtain all relevant information drawn from a variety of data sources and present it as a whole?

How do we know whether the data provided are relevant to our study?

To achieve this objective, we need a mechanism that describes a digital source precisely enough to be understood by both humans and machines. Metadata is "data about data": descriptive information about a particular dataset, object, or resource, including how it is formatted and when and by whom it was collected. With this information, finding and working with particular instances of Big Data becomes much easier. In addition, Big Data must be managed effectively; this need has partially manifested in new data models, commonly known as "NoSQL". The goal of this multidisciplinary workshop is to gather both researchers and practitioners to discuss methodological, technical, and standards-related aspects of Big Data management. Papers describing original research on both theoretical and practical aspects of metadata for Big Data management are solicited.
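To make the notion of a descriptive metadata record concrete, the sketch below shows what such a record might look like for a hypothetical dataset. The field names are illustrative assumptions, loosely inspired by Dublin Core-style elements; no particular standard is implied, and the dataset and URL are placeholders.

```python
import json

# An illustrative metadata record ("data about data") for a hypothetical
# gene-expression dataset. All field names and values are placeholders,
# loosely following Dublin Core-style terms; no specific standard is implied.
metadata = {
    "title": "Hypothetical cardiomyopathy gene-expression dataset",
    "creator": "Example Research Lab",          # who collected the data
    "created": "2017-05-01",                    # when it was collected
    "format": "text/csv",                       # how it is formatted
    "source": "https://example.org/datasets/cardio-genes",  # placeholder URL
    "subject": ["myosin light chain 2", "hypertrophic cardiomyopathy"],
    "rights": "CC-BY-4.0",
}

# Serializing the record as JSON makes it consumable by both humans
# and machines, which is the property the workshop text calls for.
print(json.dumps(metadata, indent=2))
```

A search engine that indexes such records, rather than the raw data, can discover relevant datasets across repositories without needing to parse each repository's native formats.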

Scope of the Workshop

Topics include, but are not limited to:

Metadata standard(s) development for Big Data management

Methodologies, architecture and tools for metadata annotation, discovery, and interpretation

In addition to the accepted papers, the workshop intends to have an industry focus through a keynote speaker and hackathon challenges. The hackathon session will explore scalable, interoperable data infrastructure for Big Data governance and metadata management that enables Findability, Accessibility, Interoperability, and Reusability (FAIR) across heterogeneous datasets from various domains, independent of data source and structure.