After SNP testing to the hilt with deCODEme and 23andMe, it occurs to me that now is the time to dabble in other informative markers.

For the genetic genealogist Y-STRs clearly prove useful for kinship testing, especially when combined with Y-SNPs to add time depth and ensure that the "match" is not just a case of convergence.

Clearly combining X-SNPs and X-STRs is going to be the wave of the future, but for those of us stuck in the present, we will have to lay the groundwork for appreciating the value of a surf and turf blend of the two marker types.

Not having had any X-STRs tested, it seemed best to choose ones for which there is a publication to refer to. DXS10079, DXS10074 and DXS10075 meet this criterion. Szibor's team (Hering is usually the first author) have published a number of papers concerning the use of these three markers. A simple Google search with the above keywords will work quite well. Two articles of particular interest are:

The second is freely available. The first, which includes tables of percentages in a German population for each three-marker haplotype, is pay-per-view and costs about as much as a dinner for two at a decent restaurant. I must have purchased this item in the days before I retired.

FTDNA will test these markers via their Advanced menu, and if your DNA is in Houston, and if my situation is any indication, the turnaround time will only be a few weeks. All of these markers are located on a 280 kb stretch within the cytoband Xq12. The authors did not find any examples of crossover within this region in the complex kinship tests they did in 152 families. The recombination rate in this area is 0.3 cM (very low), so for untold generations these three markers should "travel together". The average mutation rate of these markers is 0.036 with DXS10075 being more stable and less prone to mutate.

Of course males will only have one repeat number for each marker, and females two. This makes things easier for males (what's new?), so females will have to test relatives (e.g., a brother) to see which combinations of three are found on each chromosome.
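For anyone who wants to see why a female result needs a relative to sort it out, here is a minimal Python sketch. The sister's repeat numbers and the function names are made up for illustration; only the three marker names come from this thread. It assumes no recombination and no mutation within the block, which is only an approximation.

```python
from itertools import product

def possible_haplotype_pairs(genotype):
    """List every way a female's allele pairs could be split across her two X's.

    genotype maps marker name -> (allele_a, allele_b), with no phase information.
    Returns the marker order used and a sorted list of (haplotype, haplotype) pairs.
    """
    markers = sorted(genotype)
    pairs = set()
    for choice in product((0, 1), repeat=len(markers)):
        hap1 = tuple(genotype[m][c] for m, c in zip(markers, choice))
        hap2 = tuple(genotype[m][1 - c] for m, c in zip(markers, choice))
        pairs.add(tuple(sorted((hap1, hap2))))
    return markers, sorted(pairs)

def resolve_with_relative(genotype, relative_haplotype):
    """Pick the split that contains a male relative's haplotype, if there is one.

    Returns (shared_haplotype, other_haplotype), or None when the relative's
    haplotype is not consistent with any split (e.g. he got the mother's other X).
    """
    markers, pairs = possible_haplotype_pairs(genotype)
    rel = tuple(relative_haplotype[m] for m in markers)
    for hap1, hap2 in pairs:
        if rel == hap1:
            return hap1, hap2
        if rel == hap2:
            return hap2, hap1
    return None

# Hypothetical sister: heterozygous at all three markers.
sister = {"DXS10074": (16, 18), "DXS10075": (17, 18), "DXS10079": (19, 20)}
# Hypothetical brother: one haplotype, as for any male.
brother = {"DXS10074": 16, "DXS10075": 17, "DXS10079": 19}

markers, pairs = possible_haplotype_pairs(sister)
print(markers)
print(len(pairs), "possible splits without a relative")
print(resolve_with_relative(sister, brother))
```

With three heterozygous markers there are four possible splits, and the brother's single haplotype narrows that to one, provided he and his sister inherited the same maternal X in this region.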

My results were: DXS10074=16, DXS10075=17, and DXS10079=19.

I looked these up in Hering's charts for a German population and the markers are individually quite common (e.g., DXS10074=16 is found in 20% which is the second highest percentage in the database). However when these markers are combined the haplotype is quite rare, at least in Germany.

My above haplotype was observed in 2 (0.00256) of their sample of 781 German males.
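As a rough check on that figure, the sketch below recomputes the proportion from the counts given above (2 of 781) and attaches a Wilson score interval. The interval is not from Hering's paper; it is just one common way to show how uncertain a frequency based on two observations is.

```python
import math

def wilson_interval(successes, n, z=1.96):
    """Approximate 95% Wilson score interval for a proportion."""
    p = successes / n
    denom = 1 + z ** 2 / n
    centre = (p + z ** 2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    return centre - half, centre + half

observed = 2 / 781                 # counts quoted in the post above
low, high = wilson_interval(2, 781)

print(f"observed frequency: {observed:.5f}")
print(f"approx. 95% interval: {low:.5f} to {high:.5f}")
```

The point is only that "rare in this sample" leaves a fairly wide range for the true frequency, so a match elsewhere is suggestive rather than conclusive.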

My X ancestry is in the signature line and while German is possible, British is more likely. In going to the spreadsheet set up by Sean at: http://www.worldfamilies.net/geo/xdna/pats/raw or the DNA Fingerprint database noted elsewhere, I have a match! Not only that, it is someone I know and to whom I owe lunch: a fellow resident of the OC in California and of British ancestry. I will of course contact my match and compare genealogical "X notes", while realizing that the match could be convergence (just as with Y-DNA work).

Anyway, the feet are now wet. Now they have to get moving on to the next step, whatever that might be.

It's funny you should pick that particular block on this particular day, David, because I have been working on that very sequence today, and was about to post it to the results page!

I've been studying the locations of all the STRs that are on the project results chart (as originally listed by Thomas Krahn), in relation to the SNP results on Ben's spreadsheet. I've found 2 or 3 regions of particular interest. There are others of interest as well, but I've moved them to lower priority for one reason or another, often because they contain too many genes to post, or because they are not yet being tested by FTDNA. The group that you posted about here was the most interesting of the ones I examined. I'll post the other high-priority blocks to the results chart as soon as I am done examining them.

This block does contain a large gene region, but it would be a shame to leave this sequence off the results chart on that basis, given its importance relative to the STRs, so I've gone ahead and posted it, and just left the genic region hidden. It's a shame that I need to do that, as there are some interesting things going on in that hidden region, but the part that I was able to show should be sufficient to give people an idea of the haplotypes involved.

I've modified the results chart to show the approximate positions of the STRs in the left margin. I debated creating a new row for each of them and inserting them into exactly the right rows, but that might make it difficult for people (including me) to paste their raw data into the chart in the proper uninterrupted sequence. I'll have to give that some more thought.

I am also debating whether it would be useful to post people's STR results right in the appropriate place on the SNP results chart, so that STR and SNP results can be directly compared, but that would greatly increase the size of the spreadsheet, probably adding more columns than it can handle (because most of the people who tested X-STRs didn't also test SNPs, and vice versa). I'll give that some further thought as well.

The recombination rate in this area is 0.3 cM (very low), so for untold generations these three markers should "travel together".

In case anybody is wondering why I came up with a slightly different recombination rate for this block on the results chart (0.25 cM), it's just because I posted a larger block than the one that David was citing, and there are some wildly varying recombination rates within that block. This observation makes me wonder how useful it really is to look at average recombination rates... but it's probably still a useful index for comparative purposes.
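To illustrate why the figure shifts with the block boundaries, here is a small Python sketch. Every number in it is invented for the example (they are not values from the results chart or from Ben's spreadsheet), and it assumes each segment comes with a length in kb and a local rate in cM/Mb.

```python
def block_summary(segments):
    """Summarize a block made of (length_kb, rate_cM_per_Mb) segments.

    Returns (total genetic length in cM, length-weighted average rate in cM/Mb).
    """
    total_mb = sum(length_kb for length_kb, _ in segments) / 1000.0
    total_cm = sum(length_kb / 1000.0 * rate for length_kb, rate in segments)
    return total_cm, total_cm / total_mb

# Hypothetical segments: a short hot stretch flanked by quiet ones.
small_block = [(100, 0.1), (80, 2.0), (100, 0.2)]
# The same block extended by a long, very quiet stretch.
large_block = small_block + [(200, 0.05)]

for name, block in (("small", small_block), ("large", large_block)):
    total_cm, avg_rate = block_summary(block)
    print(f"{name}: {total_cm:.3f} cM total, {avg_rate:.3f} cM/Mb on average")
```

In this made-up case most of the recombination sits in one short stretch, so the average depends heavily on how much quiet sequence is included around it.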

David.

I have been talking about this very same region on this web site and others. Let me summarize previous posts for you,

Maybe I am more optimistic or curious but there are STR results that I would like to see already in relation to our SNP blocks that are offered by FTDNA. For example, have you looked at the bimodal distribution of marker DXS10074 at DNA Fingerprint and wondered what caused it?

There are numerous people with 7 repeats from the Ukraine. Some (most?) are Ashkenazim.

So my question is what do their SNP blocks look like and do these differ from the majority of people who have 13 through 20 at DXS10074? I don’t see anyone with 9 or 10 or 11 or 12 although I see some with 8. The closest intergenic SNP that is tested at 23andMe would be rs1931545.

I know that I have nearly matching parents here (in the same block as the STR) in which the SNPs are typically found in Africa but almost never in Asia. So what do their STRs look like? My STR results are pending so I can’t answer that yet. But here are my 24 SNP results from position 66895296 (rs1931545) through 67171306 (rs1576059) for anybody wanting to share SNPs plus STR DXS10074: GCG*GCCCAAGGCTATATGCCTGA (* is heterozygous CT)
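For anyone wanting to compare their own string against this one, a minimal Python sketch follows. It treats `*` as "carries both C and T", so a `*` is counted as compatible with either C or T at that position. The comparison string TCACG is the five-SNP majority pattern mentioned elsewhere in this thread for the first five positions; the function names are just for illustration.

```python
def sites_compatible(a, b, het_alleles=("C", "T")):
    """True if two calls at one SNP could share an allele ('*' = heterozygous C/T)."""
    if a == b:
        return True
    if a == "*":
        return b in het_alleles
    if b == "*":
        return a in het_alleles
    return False

def mismatch_positions(h1, h2):
    """Return the 0-based positions where two equal-length SNP strings disagree."""
    if len(h1) != len(h2):
        raise ValueError("SNP strings must cover the same positions")
    return [i for i, (a, b) in enumerate(zip(h1, h2)) if not sites_compatible(a, b)]

posted = "GCG*GCCCAAGGCTATATGCCTGA"   # the 24 results posted above
majority_first_five = "TCACG"         # five-SNP pattern cited in this thread

print(len(posted))
print(mismatch_positions(posted[:5], majority_first_five))
```

This only flags positions that cannot match; it says nothing about which chromosome a heterozygous call belongs to.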

I have deliberately left out the genes AR and OPHN1. My haplotype should be the minority found in Western Europe. We need also to see what the majority of Western Europeans look like and any other blocks that could show up in Eastern Europe that might be associated with the repeats of 7. Then we should look at Asians and Africans or other associations that are of interest.

So I think it is worth jump-starting this investigation unless you can think of a very good reason not to do so? Privacy was more of a concern to me than cost. Women spend more money on their hair per month than on the cost of these STRs.

Kathy J.

C. In addition:
For marker DXS10079, Thomas Krahn just corrected the position to 66632730 (corresponds to pre-AR gene, also my block ending with 66632931)
For marker DXS10074, the position was corrected to 66893943 (corresponds to post-AR gene, see above block)
For marker DXS10075, the position was corrected to 66915015 (also in above block)

D. This morning I gave my results at DXS10074 as 7 and 8. Both my parents are 13 at DXS10075 and they nearly match in their SNP block here as well.

That means I am expecting my results to fall into the same haplogroup as the few Eastern Europeans who seem to have the 7 at DXS10074 and perhaps others. I am eager to hear from those with either the result of 7 or 8 at this marker because it would be really exciting if we can be the first to actually identify a haplogroup based on both SNP results and STR results. Maybe we can identify a geographical group as well.

David, your results are different at both the SNPs and STRs. You fall into a different haplogroup than me. Cool, huh?

So now we need to find others who match our two haplogroups and hopefully there won't be convergence. Kathy J.

I have been talking about this very same region on this web site and others. Let me summarize previous posts for you,

Kathy,

I do know that you've been discussing those STR markers (that's why I was examining that block at all yesterday, before I saw David's post), but I had no recollection of you having made your post (A) above, where you had actually defined a SNP sequence to study. I must have missed that one completely, but you obviously did post it a while back. I'm so sorry about that!

I just added your results for the portion that overlaps the block that I posted yesterday, and gave you credit on the results page. If you want to add your results for my upper portion (above the gene), feel free, and I'll post those as well.

As for the lower portion of the sequence that you originally posted (which got truncated when I added your results to that block)... I had actually been closely examining that portion of the sequence yesterday too, and decided not to include it because the block starts getting very "messy" in that region, with a lot of recombinant variants. I stuck with the higher-up block because it brackets all three of the STRs (even though it also includes a gene that I had to omit). If you have any particular reason for wanting to include the lower portion in the block, just say the word and I'll add it, but again, it's going to draw the focus away from the well-conserved haploblock that's above it. There's a very clear line of demarcation at the lower end of the block that I posted, when I examine the data in Ben's spreadsheet.

Thanks for summarizing your earlier posts and bringing this to my attention, Kathy, and thanks for getting the ball rolling with this one!

Another set of markers that I MYSELF posted for a friend has that same recombination at 10079. Sean? Anyone? Kathleen

Even though these STRs are located within a well-conserved block of SNPs, it is a block that shows some limited recombination (which makes it more interesting with respect to STRs, in my opinion, than a highly-static block, but also more complicated).

There are at least 3 major variants (types), and some more minor sub-variants, found within this block of SNPs, so when an individual of one type interbreeds with an individual of another type, it might look like recombination has just happened in the offspring, when really it's just a non-recombinant mixing of chromosomes from different types of individuals. Also, keep in mind that the results for females are only presented in numerical order in the STR results chart for any given STR, not necessarily in the order that they are really present on the chromosome for any given row.

Also, there could be a lot more recombinant types than are even shown on the SNP results chart, as the sample size is rather low at the moment. I'd be interested to see what things would look like if we added some Africans and Asians to the chart.

Hi Kathy,

Someone else matches those markers on the Grid. They are 105b2, and it is a sibling pair, both females.

I notice that the 2nd sibling has a slight recombination at 10079.

Does anyone want to speculate or do you know why that is?

Another set of markers that I MYSELF posted for a friend has that same recombination at 10079. Sean? Anyone?

Kathleen

I assume you mean the female siblings have differing allele counts at 10079? It could be that the female siblings are exhibiting both maternal haploblocks, and that those haploblocks are close cousins: matching on 10074 and 10075 and only slightly different on 10079?

I started posting a message here that might address some of the confusion here, but then decided that it would be better to post it on the main "X-STR Results Chart" thread, for general reference. So I'll just post a link to that message here, and then people can come back to this thread if they want to reply:

Am I imagining things, or did a couple of posts disappear from this thread in the last couple of hours? I know that things were getting moved over to a new server a little while ago and they were having some trouble with it, so I wonder if they had to revert to a backup file.

Kathy, I know you wrote a great post here earlier (at least I think it was in this thread, wasn't it?), which I don't see anymore, so you may want to try to reproduce it from memory as best as you can.

Also, Kathy, I revised the X-SNP results chart a bit to show how I envisioned showing STR and SNP data on the same spreadsheet, using your data as an example (after you posted here saying that you didn't mind people knowing your ID number). As I noted earlier, in my post that has disappeared, this format is going to create some data entry problems for me when I try to paste people's SNP results into the spreadsheet (I've already run into a problem with it trying to add somebody's data to the page a few minutes ago), so I'm not sure if this is the best solution, but I thought I'd post it here to get people's feedback, before I decide if I should revert to the format I was using earlier.

Yes, I believe my note at this thread was lost. I will try to recap but I am likely to make my messages even longer. Some of it I kept on another computer so I can retrieve it later. Yes, I think the only way to show the different tribes is to let people visualize what the (mostly European) minority types look like. I did reveal that I was 24a and b because that is the only way to visualize what I am talking about. Sean has the SNPs nicely colored. So maybe those who match my tribe of SNPs will also try to match my STRs. Normally I don’t think it is a good idea to reveal identities, and officially I tell people not to do it, but obviously I am a risk taker when it comes to exposing buried treasure. So only reveal what you feel comfortable exposing.

As we know, the X chromosome is made up of several ancestral lines spliced together. So the goal is to identify actual haploblocks that cross over infrequently so that we can determine ancestral origins for each line within the X. We are now starting to look at STR results as well as SNP results, looking for correlations. The nice thing about STRs is that these are not likely to be associated with genes.

Think of these X SNPs as major markers representing a haplogroup and the STR results would be the hypervariable regions within the haplogroup defining a subclade. I have no idea what population geneticists would call this type of "haplogroup" carried by both X-Eves and X-Adams. Does anybody know? These blocks are always subject to change with crossover, but we think we have identified the best place to find published STRs associated with low recombination.

F.B. Machado has a letter to the editor in press, "Genetic map of human X-linked microsatellites used in forensic practice", Forensic Sci. Int. Genet. (2009), showing this area as well.

I am seeking to find out if the sub-block of 23andMe positions 66895296 (rs1931545) through 66920055 (rs5965443) of GCG*G (* is either C or T) is associated with X-STR results of 7 or 8 repeats at DXS10074 and 13 repeats at DXS10075. These positions were recently corrected at the DNA Fingerprint table if you have not seen these lately. TK just reassured me he would be working on getting it updated at FTDNA. I see that it is not expensive to order just two or three STR markers from FTDNA if you don't want the full panel.

I suspect that the European group that is the majority represented by the five SNP results TCACG will not have low numbers at DXS10074. But we need more people to test.

For now, my question is, does a G at rs1931545 mean a low number such as 7 or 8 at DXS10074 and does a T mean a high number 13 - 20?
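Once enough people have both results, the question could be checked with a simple tally like the Python sketch below. The sample list is entirely made up to show the shape of the table, it is not anyone's real data, and the cut-offs (8 or fewer repeats as "low", 13 or more as "high") just follow the ranges mentioned in this thread. Males are the easiest to tabulate because they have a single allele at each marker.

```python
from collections import Counter

def size_class(repeats, low_max=8, high_min=13):
    """Bucket a DXS10074 repeat count using the ranges discussed in the thread."""
    if repeats <= low_max:
        return "low"
    if repeats >= high_min:
        return "high"
    return "intermediate"

def tabulate(samples):
    """Count (rs1931545 allele, repeat class) combinations for male samples."""
    table = Counter()
    for snp_allele, repeats in samples:
        table[(snp_allele, size_class(repeats))] += 1
    return table

# Hypothetical male results: (rs1931545 allele, DXS10074 repeats).
samples = [("G", 7), ("G", 8), ("T", 16), ("T", 15), ("T", 13), ("G", 17)]

table = tabulate(samples)
for key in sorted(table):
    print(key, table[key])
```

If G really tracks the low repeat numbers, almost all of the G counts should land in the "low" row; any G with a high number, like the last invented sample here, would be a sign of recombination, convergence, or a more complicated history.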

I may match some of the Ashkenazim from the Ukraine with the 10074 of 7 but I don't know if any of these people have done their 23andMe or deCODEme tests. For deCODEme users these 5 SNPs are: rs1931545, rs5918768, rs1337082, rs12010636, rs5965443. However, I just found out that rs1337082 is not offered by deCODEme so that one would have to be left blank. The first one, rs1931545, a transversion, is the most important, at least for STR DXS10074. I have not checked to see if deCODEme has additional SNPs that would be useful.

At some point we may need a full sequence between critical markers, or something like that, to show the different ancestral lines for both deep ancestry and more recent mutations. For those just wanting to order a few markers, the most important, to me anyway, would be 10074, 10075 and 10079.

Yes, I believe my note at this thread was lost. I will try to recap but I am likely to make my messages even longer.

<snip>

Ah, good, Kathy, I'm glad you were able to reproduce the post that got lost. From what I remember of your original post, it looks like you recreated it pretty completely.

I did reply to that lost message of yours last night, and don't know if you had a chance to see it before it disappeared, but basically I was saying that I really like the way you are thinking about these STRs-within-SNPs as potentially providing different levels of resolution for the same haploblocks.

I also had noted last night, in my reply to you, that I reconfigured the X-SNP results chart to show your STR data merged with the SNPs, so that we can evaluate whether this kind of format will be of enough use to justify the minor difficulties that it is causing in transcribing people's data onto the chart.

I really hope we can get more people who have done both types of tests (X-STR and X-SNP) to contribute their results, as this is definitely looking to be a worthwhile area of study to pursue.

If you check out Table 2, you can visualize what different X tribes might look like based on both SNPs and STRs. Just knowing the number of repeats is not enough. You also have to know the "5' Repeat flanking region" and the "Variable repeat structure". Therefore, 15 repeats for the African with SNPs T and C is going to be a completely different "haplogroup" than the 15 repeats for the African with the CT SNPs and that could be diverged from the Germans with the CC SNPs.

Our goal should be to get a full sequence between identifiable markers. I tried to reconstruct my sequence and check for matches at GenBank but it does not look like we have access to the intergenic areas of the human genome project results. I could only find one sequence besides the reference sequence for a person with a disease.

Our goal should be to get a full sequence between identifiable markers.

Yeah, I agree that this sounds like it could be a very informative way to proceed. Unfortunately, I don't see that we'll be able to get this kind of data from the testing companies that project participants have been using (and I highly doubt that the testing labs could be convinced to test that level of data on a routine basis, given that the demand for focused X DNA testing doesn't seem to be very high right now), so I don't foresee that it's the kind of data that we could be posting on the project website anytime soon. Who knows what the near future will bring though.

Of course that doesn't mean that one of you can't analyze the existing data though, if you can get your hands on it somewhere. It's got to be out there somewhere, right?