{"title"=>"Using network methodology to infer population substructure", "type"=>"journal", "authors"=>[{"first_name"=>"Dmitry", "last_name"=>"Prokopenko", "scopus_author_id"=>"57093909200"}, {"first_name"=>"Julian", "last_name"=>"Hecker", "scopus_author_id"=>"56779754700"}, {"first_name"=>"Edwin", "last_name"=>"Silverman", "scopus_author_id"=>"7201673300"}, {"first_name"=>"Markus M.", "last_name"=>"Nöthen", "scopus_author_id"=>"35355123900"}, {"first_name"=>"Matthias", "last_name"=>"Schmid", "scopus_author_id"=>"55684265900"}, {"first_name"=>"Christoph", "last_name"=>"Lange", "scopus_author_id"=>"7202366568"}, {"first_name"=>"Heide Loehlein", "last_name"=>"Fier", "scopus_author_id"=>"36637266800"}], "year"=>2015, "source"=>"PLoS ONE", "identifiers"=>{"scopus"=>"2-s2.0-84939127783", "sgr"=>"84939127783", "issn"=>"19326203", "doi"=>"10.1371/journal.pone.0130708", "pmid"=>"26098940", "pui"=>"605586177"}, "id"=>"e0e061bf-b205-36c3-86ab-d30347ab73fa", "abstract"=>"One of the main caveats of association studies is possible bias due to population stratification. Existing methods rely on model-based approaches like structure and ADMIXTURE or on principal component analysis like EIGENSTRAT. Here we provide a novel visualization technique and describe the problem of population substructure from a graph-theoretical point of view. We group the sequenced individuals into triads, which depict the relational structure, on the basis of a predefined pairwise similarity measure. We then merge the triads into a network and apply community detection algorithms in order to identify homogeneous subgroups or communities, which can further be incorporated as covariates into logistic regression. We apply our method to populations from different continents in the 1000 Genomes Project and evaluate the type 1 error based on the empirical p-values. 
The application to 1000 Genomes data suggests that the network approach provides a very fine resolution of the underlying ancestral population structure. Furthermore, we show in simulations that, in the presence of discrete population structures, our approach maintains the type 1 error more precisely than existing approaches.", "link"=>"http://www.mendeley.com/research/using-network-methodology-infer-population-substructure", "reader_count"=>5, "reader_count_by_academic_status"=>{"Student > Ph. D. Student"=>2, "Student > Master"=>1, "Student > Bachelor"=>1, "Unspecified"=>1}, "reader_count_by_user_role"=>{"Student > Ph. D. Student"=>2, "Student > Master"=>1, "Student > Bachelor"=>1, "Unspecified"=>1}, "reader_count_by_subject_area"=>{"Agricultural and Biological Sciences"=>2, "Computer Science"=>2, "Unspecified"=>1}, "reader_count_by_subdiscipline"=>{"Agricultural and Biological Sciences"=>{"Agricultural and Biological Sciences"=>2}, "Computer Science"=>{"Computer Science"=>2}, "Unspecified"=>{"Unspecified"=>1}}, "reader_count_by_country"=>{"United Kingdom"=>1}, "group_count"=>0}

{"files"=>["https://ndownloader.figshare.com/files/2129870", "https://ndownloader.figshare.com/files/2129871", "https://ndownloader.figshare.com/files/2129872", "https://ndownloader.figshare.com/files/2129873"], "description"=>"<div><p>One of the main caveats of association studies is possible bias due to population stratification. Existing methods rely on model-based approaches like <i>structure</i> and ADMIXTURE or on principal component analysis like EIGENSTRAT. Here we provide a novel visualization technique and describe the problem of population substructure from a graph-theoretical point of view. We group the sequenced individuals into triads, which depict the relational structure, on the basis of a predefined pairwise similarity measure. We then merge the triads into a network and apply community detection algorithms in order to identify homogeneous subgroups or communities, which can further be incorporated as covariates into logistic regression. We apply our method to populations from different continents in the 1000 Genomes Project and evaluate the type 1 error based on the empirical p-values. The application to 1000 Genomes data suggests that the network approach provides a very fine resolution of the underlying ancestral population structure. 
Furthermore, we show in simulations that, in the presence of discrete population structures, our approach maintains the type 1 error more precisely than existing approaches.</p></div>", "links"=>[], "tags"=>["type 1 error", "population structures", "1000 Genomes data", "population substructure", "admixture", "1000 genomes project", "Network Methodology", "predefined pairwise similarity measure", "Association Studies", "method", "population structure", "eigenstrat", "component analysis", "novel visualization technique", "community detection algorithms", "network approach", "Infer Population Substructure", "population stratification", "triad"], "article_id"=>1457440, "categories"=>["Uncategorised"], "users"=>["Dmitry Prokopenko", "Julian Hecker", "Edwin Silverman", "Markus M. Nöthen", "Matthias Schmid", "Christoph Lange", "Heide Loehlein Fier"], "doi"=>["https://dx.doi.org/10.1371/journal.pone.0130708.s001", "https://dx.doi.org/10.1371/journal.pone.0130708.s002", "https://dx.doi.org/10.1371/journal.pone.0130708.s003", "https://dx.doi.org/10.1371/journal.pone.0130708.s004"], "stats"=>{"downloads"=>28, "page_views"=>12, "likes"=>0}, "figshare_url"=>"https://figshare.com/articles/_Using_Network_Methodology_to_Infer_Population_Substructure_/1457440", "title"=>"Using Network Methodology to Infer Population Substructure", "pos_in_sequence"=>0, "defined_type"=>4, "published_date"=>"2015-06-22 04:09:07"}