Bottom Line:
However, due to the diverse nature of the data, it is difficult to predict outcomes from them. The proposed tool implements a user-centric, priority-based approach that can resolve the problems of unified data modeling and overlapping attributes across multiple datasets. An evaluation of the tool on six different sets of locally created diverse datasets shows that, on average, it reduces the time effort of the experts and knowledge engineer by 94.1% when creating unified datasets.

Abstract:
A wide array of biomedical data is generated and made available to healthcare experts. However, due to the diverse nature of the data, it is difficult to predict outcomes from them. It is therefore necessary to combine these diverse data sources into a single unified dataset. This paper proposes a global unified data model (GUDM) that provides a global unified data structure for all data sources, together with a "data modeler" tool that generates the unified dataset. The proposed tool implements a user-centric, priority-based approach that can resolve the problems of unified data modeling and overlapping attributes across multiple datasets. The tool is illustrated using sample diabetes mellitus data. The diverse data sources used to generate the unified dataset for diabetes mellitus include clinical trial information, a social media interaction dataset, and physical activity data collected using different sensors. To demonstrate the significance of the unified dataset, we adopted a well-known rule creation process based on rough set theory to create rules from it. An evaluation of the tool on six different sets of locally created diverse datasets shows that, on average, it reduces the time effort of the experts and knowledge engineer by 94.1% when creating unified datasets.
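As a rough illustration of what a priority-based resolution of overlapping attributes could look like, here is a minimal Python sketch. It is not the paper's data modeler: the function name `build_unified_record`, the source names, the attribute names, and the sample values are all hypothetical, and the rule "the highest-priority source that has a non-missing value wins" is an assumption about how such a user-supplied priority might be applied.

```python
from typing import Dict, List

Record = Dict[str, object]


def build_unified_record(sources: Dict[str, Record], priority: List[str]) -> Record:
    """Merge one subject's records from several sources into a single record.

    `priority` lists source names from highest to lowest priority (assumed to
    be chosen by the user). When an attribute appears in more than one source,
    the value from the highest-priority source with a non-missing value is kept.
    """
    unified: Record = {}
    # Walk from lowest to highest priority so that later writes overwrite earlier ones.
    for name in reversed(priority):
        record = sources.get(name, {})
        for attr, value in record.items():
            if value is not None:  # treat None as "missing", never as an override
                unified[attr] = value
    return unified


# Hypothetical sample data for one subject; not taken from the paper.
sources = {
    "clinical": {"patient_id": 1, "weight_kg": 82.0, "hba1c": 7.4},
    "sensor": {"patient_id": 1, "weight_kg": 81.2, "steps": 5400},
    "social": {"patient_id": 1, "weight_kg": None, "mood": "positive"},
}
priority = ["clinical", "sensor", "social"]

unified = build_unified_record(sources, priority)
print(unified)
```

In this sketch the overlapping attribute `weight_kg` takes the clinical value, because "clinical" is listed first, while attributes that occur in only one source (`steps`, `mood`, `hba1c`) are carried over unchanged.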

sensors-15-15772-f007: Time comparison of the proposed data modeler with the traditional MS Excel program.

Mentions:
We observed in our experiments that the proposed data modeler improves the average performance of both the domain expert and the knowledge engineer by 84.1 percent, i.e., it saves 84.1 percent of their time. Evaluated separately, the tool saves 81.9% of the expert's time and 84.9% of the knowledge engineer's time. Figure 7 shows the time comparison between the proposed tool and the MS Excel program, in which the datasets are manually combined into a unified dataset.
