X-chromosome (X DNA) => Topic started by: Seán MacGorman Powell on January 20, 2009, 02:19:55 AM

Edit 2 June 2014: Please note that due to changes in the chip design used by 23andMe for their DNA tests, I can no longer accept new data submissions for the X-chromosome Project, either from 23andMe or from any other lab.

The project was designed for the old v2 chip formerly used by 23andMe, and due to the inclusion of many more SNPs on the later chip versions, these newer chips do not produce data output in a format which can be easily matched up with the old chip versions' data.

I will maintain the old data on the following page, for any researchers who are interested in viewing it:

http://1drv.ms/1eykBQY

Sean MacGorman Powell

====================================

I am starting to play around with the data uploading functionality of the X-chromosome Project website. I've created a sample of one possible way that I envision X haploblock data being tabulated. This is just a very rough "alpha version" of the table, but I wanted to post it for opinions before I proceed any farther with adding additional haploblock candidates.

The project website is adopted from the WorldFamilies Y-DNA projects, so is not really optimal for X DNA data yet. Please ignore the "Notes for understanding results" below the spreadsheet, as that is the standard language posted on all Y-DNA results pages, and at present cannot be deleted (we're working on that).

Here are some things to note when you look at the results spreadsheet:

1) I have colorized all SNP values that have the same nucleotide among all the testees, with the same color. This should facilitate the identification of different haplotypes within the blocks. The more uniform the color is across a block, the better-conserved the block is over time (and the more pronounced is the linkage disequilibrium for that region of the chromosome).

2) You will notice that the rows and columns are transposed on this spreadsheet compared with the way that these haploblocks were presented over at dna-forums.org (so you read down a column for any given person here, not across). This was necessary for several reasons, but primarily because we only currently have the capability to add a total of two spreadsheets to the project. I therefore decided to include all haploblocks on the same spreadsheet, to be ordered in the same order as their position on the chromosome.

Excel has many more rows than columns that can hold data, so it will be necessary to put the SNPs in rows, and the less-numerous testees in columns. People who have been studying Adriano Squecco's Y-DNA spreadsheet and Ben Moscia's X-DNA spreadsheet will be used to this format, but people who are used to the horizontal format we've been using at dna-forums may find that they need to mentally adjust the way they look at this data.

If anybody has trouble visualizing things in this orientation, you can easily transpose the rows and columns into an Excel spreadsheet of your own. Feel free to ask me how to do this if you're not familiar with that function. If we eventually gain the ability to post multiple spreadsheets on this website, such that we can have a separate spreadsheet for each haploblock, then we can decide if it would be more informative to go back to a horizontal layout. In the meantime, I'd be interested in hearing your thoughts about which way works better for you, and also whether you feel it would be better to have all the data in one spreadsheet, versus broken out into separate spreadsheets by haploblock.

3) I originally had the names/ID numbers of the testees spelled out sideways in a top row of this spreadsheet, but the forum apparently can't handle that kind of formatting (or at least I haven't figured out how to make it work in the project's text editor). I therefore replaced everybody's names with a shorter "Testee ID" code, with a legend near the top right showing which code corresponds with which testee.

4) There is an invisible column on the left margin of the spreadsheet that I will later use to mark the position of the centromere relative to the SNP sequences.

5) Using this spreadsheet format, with various haploblocks arranged all on the same page, I am not able to separately group different "Types" for each haploblock the way we have been doing at dna-forums (due to the fact that the order of the testees must always be the same from one haploblock to the next, in order for the columns to line up properly). Hopefully the colorization will make this grouping unnecessary, as the groupings should be readily visible at a glance.

6) Underneath each haploblock candidate are some statistics that I calculated for that block. I discussed some of the assumptions for these calculations on this post over at dna-forums:

If anybody feels that these calculations are invalid or uninformative as stated, I would welcome any discussions here about a better way to summarize information about the recombination rates of the various haploblocks.
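For anyone who would rather do the transposing from item 2 or the color check from item 1 outside of Excel, here is a rough sketch in Python. It is only an illustration, not how the project chart is actually built: the sample values are made up, and the "-" no-call symbol and the function names are my own assumptions.

```python
from collections import Counter

def transpose(table):
    """Swap rows and columns of a rectangular list-of-lists table."""
    return [list(column) for column in zip(*table)]

def majority_share(calls, no_call="-"):
    """Return (most common nucleotide, fraction of called testees carrying it)."""
    called = [c for c in calls if c != no_call]
    if not called:
        return None, 0.0
    base, count = Counter(called).most_common(1)[0]
    return base, count / len(called)

# Hypothetical data in the chart's layout: one row per SNP, one column per testee.
snp_rows = [
    ["A", "A", "A", "A"],
    ["C", "C", "T", "C"],
    ["G", "-", "G", "G"],
]

# A share close to 1.0 at every SNP corresponds to a uniformly colored block.
for row in snp_rows:
    print(majority_share(row))

# Horizontal layout: one row per testee, one column per SNP.
by_testee = transpose(snp_rows)
print(by_testee[2])
```

The majority share is just one possible way to put a number on "how uniform the color is"; it says nothing about which specific recombination events produced the minority values.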

So please take a look and let me know if you think this format will be useful to you. There are several more haploblocks that I can then add to the spreadsheet (as we've been posting over at dna-forums), but it requires quite a lot of work to set up this spreadsheet, so please bear with me as I add them one or two haploblocks at a time.

I've just added a third haploblock candidate to the project results page, again copied from what was previously posted at dna-forums.org, but converted to the new project format.

As I noted earlier, it is quite a lot of work to convert from the dna-forums.org format to what is being posted here (mainly due to all the colorization and the rearranging of all the columns to the same order for each block), so please bear with me while I add these blocks one at a time. I am starting with the blocks that I contributed, simply because they are easier for me to deal with, but I'll get around to adding everybody else's blocks too.

I've also done some minor reformatting of the spreadsheet, and I've renumbered the testee ID numbers. I may periodically reorder the columns in the spreadsheet, so please don't assume that the ID number that is currently assigned to you will be the same in the next iteration of the spreadsheet. Always check the legend on the right margin of the spreadsheet for the current testee ID assignments.

If anybody would like their name listed in a different way, just let me know.

Again, please note that these blocks are being arranged on the spreadsheet in the order of their relative positions on the X chromosome (not in the order that they are added to the spreadsheet). The latest block is the one that's in between the other two.

I've just added a fourth haploblock to the spreadsheet, again extracted from dna-forums.org:

http://www.worldfamilies.net/geo/xdna/results/raw

I've also started adding X-STR marker locations to the spreadsheet, in the appropriate positions in the chromosome sequence (as they've been getting posted on the Rootsweb list). If I miss any, or if anybody knows where the remaining STRs that are currently being tested are located in the nucleotide sequence, please let me know and I'll add them.

I'm trying to come up with a better way to present X-STR test results given the limited number of spreadsheet slots that I have available on the project website, so stay tuned.

I've just added a couple more haploblock candidates to the spreadsheet, extracted from the posts at dna-forums.org:

http://www.worldfamilies.net/geo/xdna/results/raw

I'm discovering that it's no easy matter to consolidate all the haploblocks that people have been posting, such that they are all formatted the same way! Also, people's names are listed in different ways from one source to another, so it's proving to be a challenge at times to figure out who is whom. People are listed by Ben's spreadsheet codes in some places, but by name in others, so it is possible that I have you listed by name in one of the right-hand columns of the spreadsheet for one haploblock, whereas I have you listed by ID number for a different haploblock. This makes no difference at all to any data analyses, but if you would prefer that I list you in a different way or consolidate your entries into a single column, just let me know. Otherwise, it's no big deal to have you listed in two different ways.

I'm adding researchers' haploblock contributions one at a time now in the order that they occur on the chromosome as I work my way down the chromosome (not in the order that they were discovered), so please bear with me if I haven't posted yours yet--it's going to take me a while to get to them all.

That's really up to the person who identified the haploblock candidate. In some cases, they started out simply as a female testee noticing that she was completely homozygous (i.e., the same nucleotide value duplicated at any given SNP position for both of her X chromosomes) for a long sequence of SNPs (indicating that both of her parents had exactly the same SNP values for that block), so she invited other people to post their results for that block to see if they too shared those values. Chances are excellent that if a female is homozygous for a stretch of SNPs, then many other people will turn out to be too, and a new haploblock is born.
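The homozygous-run search described above can be sketched in a few lines of Python. This is a minimal illustration under my own assumptions: genotypes are given as two-letter strings in chromosome order, anything that is not two identical letters from A, C, G, T (including a "--" no-call) ends a run, and the `min_length` cutoff is arbitrary.

```python
def is_homozygous(genotype):
    """True for a two-letter call with the same base twice, e.g. "AA"."""
    return (
        len(genotype) == 2
        and genotype[0] == genotype[1]
        and genotype[0] in "ACGT"
    )

def homozygous_runs(genotypes, min_length=3):
    """Return (start, end) index pairs, inclusive, for runs of homozygous SNPs."""
    runs = []
    start = None
    for i, genotype in enumerate(genotypes):
        if is_homozygous(genotype):
            if start is None:
                start = i
        else:
            if start is not None and i - start >= min_length:
                runs.append((start, i - 1))
            start = None
    # Close a run that reaches the end of the list.
    if start is not None and len(genotypes) - start >= min_length:
        runs.append((start, len(genotypes) - 1))
    return runs

# Hypothetical calls for ten consecutive SNPs.
calls = ["AA", "CT", "GG", "TT", "CC", "AA", "AG", "TT", "--", "CC"]
print(homozygous_runs(calls))  # [(2, 5)]
```

A long run found this way is only a candidate; whether other people share the same values still has to be checked against their results.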

Speaking for myself, I visually scanned Ben Moscia's spreadsheet of X DNA testees (with the aid of some conditional formatting) and looked for blocks where most people (if not everyone) had the same SNP values as me. I ignored any no-calls and treated them as "wildcards" (as Jim Turner put it). Then I subjectively decided on upper and lower bounds for the sequence, to bracket a block that demonstrated minimal-to-no evidence of recombination. So the haploblocks that I contributed were visualized by comparing several different people simultaneously. It's effectively the same technique as a female looking for homozygous blocks in her own test results, but expanded to more than just two people being compared at a time.
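A rough Python sketch of that multi-person comparison follows. It is stricter than what I did by eye: it only accepts SNPs where every called value agrees, whereas I allowed "most people" and set the block bounds subjectively. The "-" no-call symbol, the sample rows, and the `min_length` cutoff are assumptions for illustration.

```python
from itertools import groupby

def conserved_at(calls, no_call="-"):
    """True if all testees with a call share one nucleotide (no-calls are wildcards)."""
    called = {c for c in calls if c != no_call}
    return len(called) <= 1

def conserved_blocks(snp_rows, min_length=3, no_call="-"):
    """Return (start, end) row indexes, inclusive, of runs of conserved SNPs."""
    flags = [conserved_at(row, no_call) for row in snp_rows]
    blocks = []
    index = 0
    for is_conserved, group in groupby(flags):
        length = len(list(group))
        if is_conserved and length >= min_length:
            blocks.append((index, index + length - 1))
        index += length
    return blocks

# Hypothetical data: one row per SNP, one column per testee.
rows = [
    ["A", "A", "G"],
    ["C", "C", "C"],
    ["T", "-", "T"],
    ["G", "G", "G"],
    ["A", "C", "A"],
]
print(conserved_blocks(rows))  # [(1, 3)]
```

Note that a row consisting only of no-calls counts as conserved here, which may or may not be what you want.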

Note that I am calling these blocks "haploblock candidates," because we don't really know at the time of their discovery how well-conserved they are across everybody, and different people have different definitions for the population that a block has to be conserved across (and for the length of the sequences) for it to be defined as a "haploblock." So people can decide for themselves if it really is a haploblock.

The trick is finding haploblocks that don't contain any genes though, because we don't want to include any of those, for privacy reasons.

I've rearranged the columns so that the testees are in alphabetical order, and reassigned new ID numbers to everyone.

In the words of Jim Carrey, "smokin". What an amazing amount of info is now, thanks to your tireless efforts GhostX, available for consideration. We are well and truly underway. Hopefully Terry will be able to add a bit more "X specificity" and functional space for what we hope to do. I expect that this will happen in due course, and in the meanwhile we have plenty to explore. I have a few things to add today.

Thanks for the comments, David. The thanks really deserve to be spread around among all of us enthusiastic amateur genetic genealogists who take the trouble to look for these haploblocks (and other kinds of data), and the testing lab customers who contribute their test results. It's obvious that it's not the sole domain of professional geneticists to advance this field of study (though their contributions are invaluable, and we wouldn't be where we are without them).

I agree with David, this is good work GhostX! This method is good for smaller interesting blocks, but I don't see how comparing several hundred SNPs would fit so well using this method.

Thanks for the comment, Svaale. I agree with you that this wouldn't work so well for really large blocks of several hundred SNPs, but blocks that large would be much more specific to a smaller subset of the population of testees, and would be outside of the scope of what we are able to present here on the results page (indeed, it would probably exceed the size of the spreadsheet slot available to us!). It's going to be up to specific researchers who have interest in those large blocks to analyze them on their own, and hopefully report their findings here.

My goal with the spreadsheet that's posted here, at least for the present, is to identify haploblock candidates that are extremely well-conserved across most people, which most people would therefore have common interest in, and to consolidate the blocks that have been examined so far into a single place, with a common format. These blocks appear to be largely in the range of about 20-60 SNPs or so (judging from the contributions so far).

I'm actually beginning to wonder how much bigger the spreadsheet can get before we run out of space available to us at the website...

This one was by far the hardest one for me to colorize, and some of the color groupings are largely arbitrary, with various possible recombination and mutation events producing them. There's quite a lot of recombination that obviously happened in this block, though there are still some sub-segments that are well-conserved.

It seems to me that blocks like this one might be more useful for separating out subgroups and identifying more recent lineage splits (relatively speaking--we're still potentially talking about many thousands of years between crossover events), whereas blocks that are more uniformly-colored might be more useful for reconstructing very ancient ancestry, and for determining what the earliest versions of the human X chromosome looked like--at least as far as is possible to trace through the X chromosome.

I've also recolorized some of the sequences in a few of the blocks. The more I stare at a given haploblock, the more the patterns of recombination seem to rearrange themselves in my mind! There are several different ways that the recombination events could have been indicated (and on many occasions it is not clear whether a recombination or a mutation was the cause of a variant), so this is just my interpretation.

I think that's the last of the haploblocks that's been discussed on the "X-DNA Haploblocks" sub-forum at dna-forums. If I missed any, or if anybody has any corrections or data additions, please feel free to send me a PM and I'll be happy to fix them.

If anybody has a new haploblock that they've discovered, please feel free to start a new topic thread on this forum board, and I'll post it to the spreadsheet if it looks like a lot of people share the sequence. Again, for privacy reasons, please be careful not to post any sequences that contain genes.

Hi Sean, I am in the spreadsheet under some haploblocks as testee 55 "Eldon" and under some other haploblocks as testee 125 "Wade". That was not your mistake but maybe you could fix it. There are six haploblocks for which I have not yet provided any information. I have that information in an Excel spreadsheet. How may I get that spreadsheet to you? Regards, Eldon Wade ewade@cfl.rr.com

Hi Eldon,

I'll be happy to make those corrections for you. I'll send you a PM with some details in a minute.

1) Added data from Ben Moscia's spreadsheet for all haploblocks for which this data was incomplete. This updates the haploblock data by over 30 new testees for several haploblocks.

2) Renumbered the Testee ID numbers, due to new members having been inserted into various places in the spreadsheet. Please check the legend for your current ID number each time you visit the results page.

3) Added ancestry information to testees, where available (info taken from Ben Moscia's spreadsheet and David Faux's spreadsheets). See the legend in the right margin of the spreadsheet, and please let me know if anything is incorrect or missing.

When reporting ancestry, please make sure you are reporting your X-DNA lineages only. X-DNA ancestor contributors can be determined from charts found on the following websites:

I would prefer to show actual percentage contributions for all X-DNA ancestors by major ethnic group/country, if known (see the spreadsheet's legend for some examples--look for the entries showing percentages). These percentages can be calculated from the above charts.
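For those who prefer to compute rather than read the percentages off a chart, the X-inheritance rule can be sketched in Python: a man's single X comes from his mother, and a woman gets one X from each parent. The function name, the path notation, and the "Origin A/B" labels below are hypothetical, and the results are expected averages only; actual shares vary from person to person because of recombination.

```python
def x_fraction(testee_sex, path):
    """Expected share of the testee's X-DNA from the ancestor reached by `path`.

    testee_sex is "male" or "female"; path is a sequence of "father" or
    "mother" steps, e.g. ("mother", "father") for the maternal grandfather.
    """
    fraction = 1.0
    sex = testee_sex
    for step in path:
        if sex == "male":
            # A man's only X chromosome comes from his mother.
            fraction *= 1.0 if step == "mother" else 0.0
        else:
            # A woman has one X from each parent, so half on average per parent.
            fraction *= 0.5
        sex = "male" if step == "father" else "female"
    return fraction

# Hypothetical male testee whose maternal grandparents have different origins.
ancestors = {
    ("mother", "father"): "Origin A",
    ("mother", "mother"): "Origin B",
}
totals = {}
for path, origin in ancestors.items():
    totals[origin] = totals.get(origin, 0.0) + x_fraction("male", path)
print(totals)  # {'Origin A': 0.5, 'Origin B': 0.5}
```

Any path that passes from a man to his father returns zero, which is why whole branches of the family tree drop out of the X-DNA ancestry.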

A new haploblock candidate has just been posted on the project's X-SNP results page. It's between position numbers 105,197,873 and 105,712,333 (SNPs rs5916968 thru rs5917009):

http://www.worldfamilies.net/geo/xdna/results

This is an intergenic region, as far as I am able to determine.

I have only posted the results for people who have given me explicit permission for this particular block, even if they had previously told me that they no longer care to be anonymous in Ben's spreadsheet. I have also posted the results of all people from Ben's spreadsheet who are still completely anonymous. If you are one of the people who is in Ben's spreadsheet, but who I had previously removed from the anonymous section, I have your data all ready to add to the results page, so just send me a PM and let me know that I have your permission to do so (check the project results page to see if your data is there first).

Anybody who is not on Ben's spreadsheet can also PM me with their e-mail address if they want to be included, and I will send you a data submission form on which you can enter your data. This also applies to any other haploblock candidates for which you want to submit data.

1) Added a new project member, currently assigned Testee ID#35 (ID numbers subject to change, so always consult the legend on the right margin of the results chart).

2) Added a new haploblock candidate, between positions 66,573,001 (rs2497931) and 67,018,756 (rs2781516)

3) Began adding X-STR positions to the results chart to show their locations relative to the SNPs. So far I have only done this to the new block mentioned above, but I will continue to add more as they are found to be relevant to a particular haploblock candidate.

It would not be difficult to gather all the SNP results from the HapMap Utah residents and put these in our haploblocks. It is a task to go through all the batches of results, but it can be done.

This is the group I am talking about: “30 mother-father-child trios from the CEPH collection (Utah residents with ancestry from northern and western Europe), representing one of the populations studied in the International HapMap project (http://www.hapmap.org). See http://www.hapmap.org/citinghapmap.html.en for further information about this population and others studied in the project. http://www.hapmap.org/hapmappopulations.html.en also has relevant information.”

But I don’t think we can find STR results on any of these individuals. Maybe they are all LDS members who have extensive pedigrees, and some of them would not mind at least giving us their X percentages and would be willing to run their DNA for STR testing. Well you never know, some of them may read this forum some day and want to participate by allowing us to use their intergenic results in order to push genetic genealogy forward.

That's a really good idea, Kathy. I'm rather bogged down with some data analysis issues at my "real job" right now, and may not have time to look into that for a while, but if anybody else cares to do that and submit the data to me, then I'll be happy to add it to the project results chart (assuming there are no restrictions on reproducing such data). I have a data submission form already prepared (for use for new members to submit their data) that would work very nicely for such a purpose.

You mean you are not ready to quit your day job? Anthro-X-genetic genealogy pays so well.

I've just added a new person's data to the X-SNP results chart. This is testee ID #35, an anonymous contributor from Ben's spreadsheet, with reported ancestry from Ukraine, Belarus, and Lithuania:

http://www.worldfamilies.net/geo/xdna/results/raw

I've also renumbered most of the testee ID numbers, to make room for this contributor and other future contributors from Ben's spreadsheet, so be sure to check the blue legend box on the far right of the chart for your current ID number.

Also, I would like to again urge everybody to make sure that the ancestry that they reported is for their X chromosome only, as discussed in this thread:

1) Added results for new contributor "ID037," with 100% German ancestry

2) Added results for new contributor "Nevelainen," with 100% Finnish ancestry

3) Added some incomplete data for previous contributor "ID006," with British Isles ancestry.

As usual, I've renumbered the testee ID numbers, so check the legend.

http://www.worldfamilies.net/geo/xdna/results?raw=1

This new Finn "Nevelainen" is interesting in my favorite slow-moving ancient block between 66,228,526 and 66,564,941. My parents could be a combination of two uncommon haplotypes, this Finn in one X block and also the X that has the nice crossover recombination, Italian/Ashkenazi “Warwick”. So if you will permit me to (again) engage in seemingly senseless unscientific speculation, that could mean a couple of ancient founders since I have neither Finnish nor Italian nor Ashkenazi in my background. So we will have to be on the lookout for more people falling into these two haplotypes to see if we can find the X founders in some likely geographical area. Kathy J.

That's probably my favorite haploblock too. It's so stable, and such a long block to be so well-conserved. It really impresses me that over 336,000 base pairs could be that preserved (with just a bit of variation) over what is obviously a very ancient span of time.
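As a quick check on that figure, the span of a block is simply the difference between its end positions. A small Python sketch, using position pairs quoted earlier in this thread (the function name is my own):

```python
def block_span(start, end):
    """Distance in base pairs between the first and last SNP position of a block."""
    return end - start

# Position pairs as quoted in this thread.
blocks = {
    "66,228,526 to 66,564,941": (66_228_526, 66_564_941),
    "66,573,001 to 67,018,756": (66_573_001, 67_018_756),
    "105,197,873 to 105,712,333": (105_197_873, 105_712_333),
}
for label, (start, end) in blocks.items():
    print(label, block_span(start, end))
# 66,228,526 to 66,564,941 336415
# 66,573,001 to 67,018,756 445755
# 105,197,873 to 105,712,333 514460
```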

I love it when a new project contributor shows up who is 100% of any ethnicity. I think I actually uttered out loud, "Yessss, another one!" when I saw the latest contributor.

I checked the HapMap group to see if I could find European matches with our more unusual haplotypes in this block, and it looks like in the Utah group, NA07022, NA10837 and NA12752 match our #12; NA11891, NA11930 and NA12413 match Nevelainen; and NA07349 and NA11839 match Warwick. I think most of the others fit into our main haplotypes but I was only eyeballing the results so I could have missed something.
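Instead of eyeballing, the matching could be done by grouping samples whose calls are identical across the block. Here is a minimal Python sketch; the sample IDs and sequences are invented placeholders (not actual HapMap or project data), and it requires exact matches, so no-calls would need separate wildcard handling.

```python
from collections import defaultdict

def group_by_haplotype(haplotypes):
    """Map each distinct haplotype string to the list of sample IDs carrying it."""
    groups = defaultdict(list)
    for sample_id, calls in haplotypes.items():
        groups["".join(calls)].append(sample_id)
    return dict(groups)

# Hypothetical samples, each with calls for the same four SNPs in the same order.
samples = {
    "sample_1": ["A", "C", "G", "T"],
    "sample_2": ["A", "C", "G", "T"],
    "sample_3": ["A", "T", "G", "T"],
    "sample_4": ["A", "T", "G", "T"],
    "sample_5": ["G", "C", "G", "T"],
}
for haplotype, members in group_by_haplotype(samples).items():
    print(haplotype, members)
```

Groups with a single member are the ones worth a second look, either as rare haplotypes or as possible data problems.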

So the prize goes to the person who can figure out who these founder X-eves could have been. When and where did each one live? Were they Europeans? Kathy J.

I added two more contributors to the X-SNP results chart today, labeled as follows:

1) "Wehba," with 50% Lebanese / 50% unspecified Colonial American ancestry. That Lebanese part is a very welcome contribution to the project! This is a female, so her results are presented as usual in an "a" and a "b" column, with asterisks replacing the heterozygous SNPs.

2) "Heffernan," with British Isles ancestry.

I've renumbered the testee ID numbers as usual, to make room for the new contributors.

I added two additional contributors' results today, one labeled "Cofgene2 (father)," with 100% German ancestry, and the other adding some previously-incomplete data for member "Haegen" (Belgian ancestry).

I just added a column to the legend (the blue box at the far left margin of the results chart) for contributors to note their mtDNA haplogroup.

Note that the mtDNA lineage is only a small proportion of a person's X-DNA (a very small proportion, after going back several generations), but this information may be helpful to some researchers.

If you've already contributed data to the results chart, I encourage you to send me a PM with your mtDNA haplogroup, and I will add it to the chart. While you're at it, please make sure that your ancestry percentages are only for your X-chromosome lineages, not your overall ancestry. There are links on the bottom of the results chart explaining how to calculate this ancestry.

http://www.worldfamilies.net/geo/xdna/results?raw=1

When you reply, please let me know which chart(s) you have data submitted to (X-STR and/or X-SNP), and how you are identified in the chart(s). Please do not refer to the "Testee ID" number on the X-SNP chart, because those numbers change frequently, but it's okay to refer to an ID number on the X-STR chart, because those numbers are relatively fixed.

Please note that I am no longer extracting data for female contributors from Ben's spreadsheet, because it is too time-consuming to manually phase the data and delete all the heterozygous SNPs (i.e., SNPs for which you have two different nucleotide base letters in your results, indicating a different nucleotide inherited from each parent), and because presenting such incomplete data adds little useful information to the results chart. However, any women who wish to be included can send me a PM with their e-mail address, and I will send you a data submission form with instructions on how you can report just your homozygous results to me, after which I would be happy to add it to the chart. Your results will be presented anonymously unless you choose to be shown otherwise.
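For any women preparing their own submission, the filtering step described above can be sketched in Python. This is only an illustration and not the project's submission form: it assumes genotypes are two-letter strings, shows a homozygous call as its single base, replaces a heterozygous call with an asterisk as on the results chart, and uses "-" for anything else such as a no-call (that last symbol is my assumption).

```python
def mask_heterozygous(genotypes):
    """Reduce two-letter genotype calls to one symbol per SNP.

    "AA" -> "A" (homozygous), "AG" -> "*" (heterozygous, unresolved),
    anything else, such as the no-call "--", -> "-".
    """
    masked = []
    for genotype in genotypes:
        valid = len(genotype) == 2 and all(base in "ACGT" for base in genotype)
        if not valid:
            masked.append("-")
        elif genotype[0] == genotype[1]:
            masked.append(genotype[0])
        else:
            masked.append("*")
    return masked

# Hypothetical calls for five consecutive SNPs.
print(mask_heterozygous(["AA", "AG", "--", "TT", "CT"]))
# ['A', '*', '-', 'T', '*']
```

The asterisks are not errors; they mark positions where the two X chromosomes differ and cannot be assigned to one parent or the other without phasing.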

Results chart: http://www.worldfamilies.net/geo/xdna/results?raw=1

The chart didn't get uploaded properly for some reason earlier today. It should be visible now.

There are some interesting new haplotypes contributed by some of these people for the haploblock candidate that runs from position 66,228,526 to 66,564,941. Specifically, see the new sub-block shown in green between positions 66,456,836 and 66,484,467. Interestingly, the two people who show that new green block do not have any reported common recent ancestral origins.

Sean

Check your data. I sent in number ID043 who should match the yellow group. There was a frameshift in the data. Thanks for all your help. Kathy

Thanks Kathy, you're right. There appears to have been a bug in the program I wrote to convert the data from Ben's spreadsheet to the format of the project results chart. Why that only affected two of the datasets I added today, and not the others, is a mystery to me, but I set it so that it shouldn't happen again. No wonder the data looked so strange for those two people! The fact that two different people shared the same odd recombination should have tipped me off. I'm glad you were paying attention.

Hopefully there are no other frameshifts with any of the other data. It should stand out very clearly if it happened elsewhere, and I don't see any such standouts.

I added a new contributor's results today, labeled "ASmith." Her X-ancestry is 100% English. Note that because she is female, all of her heterozygous results have been replaced with asterisks, as they could not be resolved between her two X chromosomes.
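For anyone curious how the asterisk substitution described above can be automated, here is a minimal Python sketch. It is not the program used for the project chart. It assumes each genotype call arrives as a two-letter string such as "AA" or "AG", and the function name and the "-" marker for no-calls are made up for illustration.

```python
def mask_heterozygous(calls):
    """Collapse each two-letter genotype call to one chart symbol.

    Homozygous calls (e.g. "AA") become the single base letter.
    Heterozygous calls (e.g. "AG") become "*", since the two X
    chromosomes cannot be told apart without phasing.
    Anything else is treated as a no-call and becomes "-"
    (the "-" marker is an assumption, not the project's convention).
    """
    bases = set("ACGT")
    masked = []
    for call in calls:
        if len(call) == 2 and call[0] in bases and call[1] in bases:
            masked.append(call[0] if call[0] == call[1] else "*")
        else:
            masked.append("-")
    return masked


if __name__ == "__main__":
    print(mask_heterozygous(["AA", "AG", "CC", "--", "TT"]))
```
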

I added two new contributors' results today, labeled "ID047" and "ID048." Unfortunately, neither of them provided any X-chromosome ancestry information.

http://www.worldfamilies.net/geo/xdna/results?raw=1

ID047 has a very interesting and rather unique haplotype in the block (actually a subsection of one of the haploblock candidates) that runs from positions 75,322,143 to 76,202,684. Parts of that block have never been seen before in the dataset, and parts of it are similar to Currie's results, and I therefore wonder if they might represent African or Native American (or possibly earlier Asian) ancestry. I'd have to examine the SNPs one-by-one in dbSNP before I could really come to that conclusion, but it is interesting.

Hopefully we can get the ancestry info for these contributors, as it could be particularly enlightening in this case.

Some of the SNPs could have had a founder in Asia (or Native American?) such as rs6648142 (75844970) and rs958410 (76143152) but his 66 million block sure looks like the sub-Saharan African founder. So as you say, perhaps a mixture (why not African, Native American + European or similar heritage?) but I know better than to rule out anything. It would sure be interesting to see where these ancestors have been in the ancient past, or could there be a more recent ancestor in common between the two, ID047 and Currie? Such secrets are hidden in our genes.

I am starting to see a block that seems like it may be strongly associated with British Isles ancestry. It's the green block that runs from position 75,919,876 to 76,202,684. It could possibly be associated with German ancestry too, but so far, every contributor who has this block has British Isles ancestry. That is not to say the converse: that every contributor with British Isles ancestry has this block.

Of 83 contributors who reported data for this block, only 4 of them (5%) have this haplotype, and all 4 of them have British Isles ancestry. Of course that's nowhere near enough of a sample to come to any conclusions (especially given how common British Isles ancestry is among DNA testees), but I thought it was an interesting trend that may merit attention as more people contribute their data.
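To put a rough number on how little 4 out of 83 can tell us, here is a small Python sketch that recomputes the percentage and attaches a Wilson score interval to it. The interval method is my own choice for illustration and is not part of the project's analysis. It also assumes the 83 contributors can be treated as an independent random sample, which is unlikely to hold for self-selected DNA testees.

```python
import math


def wilson_interval(successes, total, z=1.96):
    """Wilson score interval for a proportion (z=1.96 is roughly 95%)."""
    p = successes / total
    denom = 1 + z**2 / total
    center = (p + z**2 / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return center - half, center + half


carriers, contributors = 4, 83  # counts quoted in the post above
share = carriers / contributors
low, high = wilson_interval(carriers, contributors)
print(f"{share:.1%} carry the haplotype (interval {low:.1%} to {high:.1%})")
```
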

Thank you for your patience waiting for this update, as I have been out of town on a long trip, and am just now getting caught up.

I noted one interesting observation regarding ID070: If you compare his results with those of ID047 for the haploblock candidate that runs from positions 75,003,877 to 76,603,730, these two people are identical, and have a highly unusual haplotype (pay particular attention to the single black-colored SNP and the olive-green block). They are not identical for any other haploblock in the spreadsheet. I compared their raw data from Ben's spreadsheet, and extending the ends of this haploblock, they actually share a 6.64 Mb block, not counting no-calls (the haploblock itself is 1.60 Mb). They obviously share a distant ancestor whose haplotype, with respect to that haploblock, is rare.
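The comparison described above (start from the haploblock, then keep extending both ends while the two people still agree) can be sketched in Python as follows. This is only an illustration, not the procedure actually used on Ben's spreadsheet. It assumes each person's results are a list of single-letter alleles in chromosome order, that "-" marks a no-call, and that a no-call is treated as compatible with anything. All function names are hypothetical.

```python
NO_CALL = "-"  # assumed no-call marker


def compatible(a, b):
    """Two alleles agree if they are equal or either one is a no-call."""
    return a == b or NO_CALL in (a, b)


def extend_shared_block(positions, person_a, person_b, start, end):
    """Grow the inclusive index range [start, end] outward while both
    people remain compatible. Returns (first_position, last_position),
    or None if they already disagree inside the starting range."""
    if not all(compatible(person_a[i], person_b[i]) for i in range(start, end + 1)):
        return None
    while start > 0 and compatible(person_a[start - 1], person_b[start - 1]):
        start -= 1
    while end < len(positions) - 1 and compatible(person_a[end + 1], person_b[end + 1]):
        end += 1
    return positions[start], positions[end]


def span_mb(first, last):
    """Length of a block in megabases."""
    return (last - first) / 1_000_000


# The haploblock candidate quoted above: 75,003,877 to 76,603,730
print(f"{span_mb(75_003_877, 76_603_730):.2f} Mb")

# Toy data (made up) showing the extension step
toy_positions = [100, 200, 300, 400, 500, 600]
toy_a = list("AGCTAG")
toy_b = list("TG-TAC")
print(extend_shared_block(toy_positions, toy_a, toy_b, 2, 3))
```

The first print confirms the 1.60 Mb figure from the two endpoint positions. The toy example starts from the two middle SNPs and stops extending at the first real mismatch on each side.
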

Unfortunately, ID047 did not provide any ancestry information with his data submission. That information could be very informative. Ben, if you are reading this, perhaps you could contact him (assuming you still have his contact info) and encourage him to anonymously submit that information to you, and then pass it along to me (or have him contact me directly)?

Just as a side note, I do not routinely extract data from females from Ben's spreadsheet for inclusion in the X-chromosome Project, because the data from females' two X chromosomes is unfortunately too ambiguous and incomplete for it to be of much use in any project data analyses. I will gladly accept data from women who wish to be included for their own purposes, however, upon request.

Due to almost nonexistent participation in this X-DNA forum lately, all future updates to the project's results charts will be announced over at the following dna-forums.org board, until further notice:

I believe that more people will see the updates there. I encourage people to continue to use the WorldFamilies boards for any X-DNA discussions. I have cross-posted this message there, but will not continue to do so after today.