The Life of a Data Point at the HII

The Life of a Data Point at the University of South Florida Health Informatics Institute
Steven W. Fiske, M.A.
Kenneth G. Young II, M.Ed.

Impact
World's largest grant-funded Type I Diabetes Data Coordinating Center.
500,000+ clinical research study participants.
187+ million participant responses to clinical research questions.
Numerous large-scale international multi-institute clinical research trials.

What does IT do in an epidemiological research environment?
The Information Technology team comprises about 50 employees: software engineers, computer engineers, solutions architects, database engineers, database administrators, statistical programmers, data scientists, quality assurance analysts, and business analysts.
IT is holistically involved from data collection to analysis.
Technology plays a vital role in the conduct and operation of scientific research.
Crucial component of our institute.
Constantly evolving to meet rapid changes in technology.

Screening and Recruitment
Recruitment can be a challenging part of a clinical research study. For example, over 100,000 participants were screened on the TEDDY study for a study sample size of less than 10,000.
Online screening.
Google Analytics.
Marketing: social media, partners, campaigns, and contact registries are used to recruit participants, along with other marketing approaches.
7 billion = world population.
6 billion = people with cellphones.

Data Collection
Sources: forms, labs, vocabularies, participants, institutions.
Secure web-based system to facilitate the collection of clinical research data.
Electronic case report forms.
Specimen/lab system.
Adverse event system.
Patient portal.
Globalization/localization.

Data Formats
Data may be structured, semi-structured, or unstructured.
Data formats vary: forms, accelerometer, images, video.
90% of the world's data was created in the last few years.
80% of the world's data is unstructured.
1 TB of information is stored during each trading session on Wall Street.
Medical image archives are increasing by 20-40% annually.
30 MB X-ray.
120 MB mammogram.
150 MB MRI.
1 GB 3D CT scan.

Data Storage
Various systems are used to store data: Oracle, Hadoop (Big Data), SQL Server, file system.
Security.
Private cloud.
In 1979 a 250 MB hard drive weighed 550 lbs.
In 2014 a 16 GB microSD card weighs 4/10 of a gram.
The cost of storage has decreased, but the amount of data has increased.
504 TB Hadoop cluster (1 TB = 1,048,576 MB).
336 core processing units.
30 nodes.
1,792 GB RAM.
High-density storage servers.
Horizontally scaling compute and storage system.
Sponsor a Rack!

Data Processing
Structuring and cleaning.
Data values must be verified for integrity.
Validation rules must be applied to data. A date of birth should not be April 29, 3014.
Data should be run through a series of data checking operations.
Developed systems to improve data quality through automated data cleaning.
Program systems to aggregate data from various sources.

Data Warehousing
The Data Warehouse consolidates data from a variety of sources to present a unified view of the data.
Consists of clinical, laboratory, operational, and financial data.
Data is in a more readable form and easily accessible for planning and executing clinical research.
Improves data integrity.
Removes data structure complexity for researchers.
Policies: data sharing policies to comply with established standards in the field.

Data Transformation
Techniques.
Data must be combined from various sources and transformed into a readable format.
Leverage tools to migrate and transfer data.
Internal and external researchers need access to transformed data.
Provide a suite of options to access and download curated data: download, RESTful API, tranSMART.
Interoperable with statistical programs.
Secure interfaces give external collaborators the ability to run analysis on our high-performance computing cluster (Big Data).

Exploration, Analysis, and Publication
The emerging field of data science is revolutionizing how we explore data.
Data scientists assist in the planning, collection, transformation, analysis, and reporting of clinical trial data and the communication of their results.
Data sets can be large and complex.
Researchers must perform advanced statistical analysis and predictive analytics on large amounts of data.
Technologies and tools provide researchers with advanced capabilities: high-powered computing cluster, grid computing, graph databases, data mining.
Tools: SAS, R, PlinQ, Metaphlan.
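
The validation rule quoted above (a date of birth should not be April 29, 3014) could be sketched roughly as follows. This is only an illustration, not the institute's actual system: the function name `validate_date_of_birth`, the `max_age_years` cutoff of 120, and the message strings are all assumptions made for the example.

```python
from datetime import date


def validate_date_of_birth(dob, today=None, max_age_years=120):
    """Return a list of rule violations for one date-of-birth value.

    Hypothetical helper: the rules and the 120-year cutoff are
    illustrative assumptions, not rules taken from the presentation.
    """
    today = today or date.today()
    problems = []
    if dob > today:
        # Catches values such as April 29, 3014.
        problems.append("date of birth is in the future")
    elif today.year - dob.year > max_age_years:
        # Rough year-based check; ignores month and day.
        problems.append("date of birth implies an implausible age")
    return problems


# Example: check two records against a fixed reference date.
reference = date(2014, 4, 29)
print(validate_date_of_birth(date(3014, 4, 29), today=reference))
print(validate_date_of_birth(date(2004, 4, 29), today=reference))
```

Returning a list of violations rather than raising an error lets a cleaning pipeline collect every problem in a record before deciding whether to flag it for review.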