Genetics and South Asia

Behar et al Data

In their paper "The genome-wide structure of the Jewish people", Behar et al. analyzed the genomes of a number of Jewish groups. For our purposes, the Jewish samples (which include two South Asian Jewish groups) matter less than the South Asian, Middle Eastern, and European groups they sampled:

Ethnic group     Count
Saudis           20
Jordanians       20
Georgians        20
Turks            19
Iranians         19
Hungarians       19
Ethiopians       19
Armenians        19
Lezgins          18
Chuvashs         17
Syrians          16
Romanians        16
Uzbeks           15
Spaniards        12
Egyptians        12
Cypriots         12
Moroccans        10
Lithuanians      10
North Kannadi    9
Belorussian      9
Yemenese         8
Lebanese         7
Sakilli          4
Paniya           4
Cochin Jews      4
Bene Israel      4
Samaritians      2
Russian          2
Malayan          2

Of the 466 samples, I excluded 8 because they were either duplicates of other samples or genetically too similar to them.
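The post does not say how the duplicate and near-duplicate samples were identified. As a rough illustration only, one common approach is to compare every pair of samples by identity-by-state (IBS) and drop one member of any pair that is too similar. The sketch below assumes genotypes coded as 0/1/2 minor-allele counts; the function names, the toy data, and the 0.95 threshold are all hypothetical placeholders, not values used for this dataset.

```python
from itertools import combinations


def ibs_similarity(a, b):
    """Mean identity-by-state between two genotype vectors.

    Genotypes are coded 0/1/2 (allele counts); None marks a missing call.
    Identical genotypes score 1, opposite homozygotes score 0.
    """
    shared = 0.0
    compared = 0
    for x, y in zip(a, b):
        if x is None or y is None:
            continue  # skip sites missing in either sample
        shared += 1.0 - abs(x - y) / 2.0
        compared += 1
    return shared / compared if compared else 0.0


def flag_near_duplicates(genotypes, threshold=0.95):
    """Return the sample IDs to drop.

    For each pair at or above the (assumed) threshold, the ID that sorts
    first is kept and the other is dropped.
    """
    dropped = set()
    for s1, s2 in combinations(sorted(genotypes), 2):
        if s1 in dropped or s2 in dropped:
            continue
        if ibs_similarity(genotypes[s1], genotypes[s2]) >= threshold:
            dropped.add(s2)
    return dropped


# Toy data: B is an exact copy of A, C is clearly different.
genotypes = {
    "A": [0, 1, 2, 1, 0, 2],
    "B": [0, 1, 2, 1, 0, 2],
    "C": [2, 0, 0, 1, 2, 0],
}
print(flag_near_duplicates(genotypes))  # {'B'}
```

With real genome-wide data the raw IBS between unrelated people is already fairly high, so a sensible cutoff depends on the marker set and has to be chosen by looking at the distribution of pairwise values.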

16 Comments.

I really appreciate this transparency about the datasets you're using; it lets us lowly commenters play along at home. Quick question: When you're pruning for linkage disequilibrium, what R^2 threshold are you using? It would be neat to see your summary statistics or your plink arguments.

This probably says more about how neurotic I am than anything else, but Behar et al.'s labeling of their South Indian sample as "North_Kannadi" has always annoyed me. It's one of those fake eastern adjectivalizations, like jihadi. It would've been better to use Kannadiga, or even Canarese.