Which SNPs are selected to go into the Wiki? And where are they selected from?

Anything for which we can find something worthy of recording. Our emphasis is on SNPs and mutations that have significant medical or genealogical consequences and are reproducible (for example, the reported consequence has been independently replicated by at least one group besides the first group reporting the finding). These are typically found in meta-analyses, studies of at least 500 patients, replication studies including those looking at other populations, GWAS findings that meet the genome-wide significance threshold of p < 5 x 10^-8 [PMID 25666886], and/or mutations with historic or proven medical significance. GWAS findings with odds ratios anywhere near 1.0 are in general not particularly interesting. The definition of 'worthy' is very subjective, but the gold standard is a link to a published, peer-reviewed paper with credible statistics. SNPs that are not already in SNPedia but are cited as significant in other credible sources such as OMIM, ClinVar, Coriell, or by the ACMG are certainly worthy of consideration.

It would be possible to load all ~10M SNPs from dbSNP, but then the only thing we could say about 99.99% of them would be 'this is a SNP' and perhaps which microarrays it occurs on. Few people would care.

If you know of a SNP you think should be added, and you're willing to put it in (citing a published reference for your information), it probably qualifies, so please do enter it. Some information for newbies is here. If you're not familiar with entering information in a Wiki, pretty much everything that applies to Wikipedia applies to SNPedia, so you can go to the How To Edit page at Wikipedia for plenty of information. If you're unsure or confused by any of this, you can send an email to info@snpedia.com telling us about the SNPs you'd like to recommend for addition.

Whether captured as genosets or in some other form, genetic risk scores (GRS) are likely to be added to SNPedia in the future. The criteria for deciding which genetic risk scores to include are evolving, but for the moment, the important ones are (1) AUC >0.75, and preferably >0.85, (2) at least two independent GRS scores per condition, (3) internal and external validation, (4) clear statements as to the ethnicities studied, (5) at least 20 patient cases per parameter in the study to reduce the chance of overfitting, (6) preferably, the inclusion of high penetrance mutations in addition to lower penetrance risk factors, and (7) applicability on the personal rather than population level. If you have feedback about these criteria, we encourage you to contact us.

For every complex problem, there is an answer that is clear, simple, and wrong. - H.L. Mencken

dbSNP already assigns rs#s to small indels. A notable example is rs332. This is not a true single nucleotide polymorphism. Instead, it is a deletion of three nucleotides with 2 different insertion texts. dbSNP handles it fine, and so does SNPedia. We also handle

dbVar now assigns names to CNVs, but we've not yet found any notable literature which uses these identifiers. As naming standards emerge for other types of variation, SNPedia expects to be able to handle them too.

The paper above should be cited when discussing SNPedia generally. To cite specific content in SNPedia, you may wish to link to the versioned 'Permanent link' found in the lower left corner of every SNPedia wiki page. This Wikipedia article should further inform your consideration.

MediaWiki (the software which runs this website) prefers all wiki page titles to begin with an uppercase letter, but the NCBI prefers lowercase when referring to a SNP. SNPedia also prefers lowercase, but sometimes the uppercase form is visible. MediaWiki can be configured to allow a lowercase first letter, but so far SNPedia has not adopted this, because SNPedia wants to maintain backwards compatibility with its earliest versions, which required uppercasing the first letter. Existing software now expects the capital letter. Enabling this feature would cause Rs1234 and rs1234 to be 2 different pages, scattering information about a given SNP across 2 locations. In time I hope to have SNPedia supporting both styles smoothly.
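As a rough illustration of the capitalization behavior described above, here is a minimal Python sketch. The function names `to_page_title` and `same_page` are hypothetical helpers invented for this example, not part of MediaWiki or SNPedia, and the sketch assumes the only transformation that matters is uppercasing the first character.

```python
def to_page_title(rs_id: str) -> str:
    """Uppercase only the first character, leaving the rest untouched.

    Assumption: this mirrors the default first-letter capitalization
    described above; other title normalization is ignored here.
    """
    rs_id = rs_id.strip()
    return rs_id[:1].upper() + rs_id[1:]


def same_page(a: str, b: str) -> bool:
    """Return True if two names would resolve to the same page title."""
    return to_page_title(a) == to_page_title(b)


if __name__ == "__main__":
    print(to_page_title("rs1234"))        # Rs1234
    print(same_page("rs1234", "Rs1234"))  # True
```

Software that looks up SNPedia pages by rs number could apply a normalization like this before comparing or requesting titles, so that rs1234 and Rs1234 are treated as one page.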

Why does dbSNP list rs737865 as a C/T variant whereas other sources list it as an A/G variant?

I tested my FGS with familytreeDNA. I had heard that the results contain health related information, and curious, I came to SNPedia to try and see what info I could learn about my FGS mutations. Unfortunately I cannot make heads or tails of what my search yielded. For example, one of my mutations is 1438G, so I searched that and got rs6311. This is where I begin to feel like I am trying to read Klingon... "rs6311 (-1438A>G / A-1438G or -1438G>A / G-1438A)". To me, that says rs6311 times (negative 1438A is greater than A minus 1438G OR negative 1438G is greater than A divided by G minus 1438A). I know that cannot be right. I get further confused the more I read. I just wanted to know what 1438G meant health-wise, as well as my other mutations.

SNPedia didn't make up all of these names and we frequently experience the same pain. In fact, this is the specific reason why we prefer rs#s. rs# names are meaningful across the entire genome. Names such as the ones above are ambiguous unless other information, such as a gene or chromosome name, is given. For your specific case, the gene of interest is HTR2A. The 1438 indicates that the SNP is 1438 bases/nucleotides/letters away from the start of that gene. The minus sign indicates that the SNP is upstream of the start site. A>G means that the reference genome has an A, but that a G was observed instead.
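To make the reading of these names concrete, here is a small Python sketch that pulls the offset, reference base, and observed base out of the two spellings quoted in the question ("-1438A>G" and "A-1438G"). The `PromoterVariant` type and `parse_variant` function are hypothetical names made up for this example, and the sketch only handles these two simple patterns, not variant nomenclature in general.

```python
import re
from typing import NamedTuple


class PromoterVariant(NamedTuple):
    offset: int  # position relative to the gene start; negative = upstream
    ref: str     # base in the reference sequence
    alt: str     # base that was observed instead


# "-1438A>G": offset, reference base, ">", observed base
_ARROW = re.compile(r"^(-?\d+)([ACGT])>([ACGT])$")
# "A-1438G": reference base, offset, observed base
_FLANK = re.compile(r"^([ACGT])(-?\d+)([ACGT])$")


def parse_variant(name: str) -> PromoterVariant:
    """Parse the two simple spellings shown above (illustrative only)."""
    name = name.strip().upper()
    m = _ARROW.match(name)
    if m:
        return PromoterVariant(int(m.group(1)), m.group(2), m.group(3))
    m = _FLANK.match(name)
    if m:
        return PromoterVariant(int(m.group(2)), m.group(1), m.group(3))
    raise ValueError(f"unrecognized variant name: {name!r}")


if __name__ == "__main__":
    v = parse_variant("-1438A>G")
    print(v.offset < 0)                   # True: upstream of the start site
    print(v == parse_variant("A-1438G"))  # True: same variant, two spellings
```

Note that nothing in such a name identifies the gene or the strand, which is exactly why the name alone is ambiguous and an rs# is preferable.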

It gets worse. DNA is made of 2 complementary strands. SNPedia uses the same strand as dbSNP, but many sources, including 23andMe, will sometimes use the other strand. When this happens, all nucleotides need to be switched to the form found on the other strand. So an A becomes a T, and a T becomes an A. C becomes a G, and G becomes a C. For reasons I can't explain, the source you've chosen is referring to As and Gs, which is the opposite strand from dbSNP. In this case it's unambiguous, but there are some nastier cases called ambiguous flips.
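The strand switch described above is a simple base-by-base substitution, sketched below in Python. The helper names `flip_strand` and `is_ambiguous_flip` are hypothetical. The sketch also assumes the usual meaning of "ambiguous flip": a SNP whose two alleles are each other's complement (A/T or C/G), so the alleles look the same on either strand and cannot reveal which strand a source used.

```python
# Watson-Crick complements, as described in the paragraph above.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def flip_strand(alleles):
    """Return the same alleles as read from the opposite strand."""
    return tuple(COMPLEMENT[a] for a in alleles)


def is_ambiguous_flip(allele_pair):
    """True when the SNP's two alleles are complements of each other.

    Assumption: this is what "ambiguous flip" means here. For A/T and
    C/G SNPs, flipping yields the same set of alleles, so the strand
    cannot be inferred from the alleles alone.
    """
    a, b = allele_pair
    return COMPLEMENT[a] == b


if __name__ == "__main__":
    print(flip_strand(("A", "G")))        # ('T', 'C')
    print(is_ambiguous_flip(("A", "G")))  # False: A/G vs C/T is detectable
    print(is_ambiguous_flip(("A", "T")))  # True: A/T looks the same flipped
```

For an A/G SNP such as the one in the question, seeing As and Gs where dbSNP lists Cs and Ts immediately tells you the source used the other strand. For an A/T or C/G SNP there is no such clue.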

As a specific answer to your question: you have a G at rs6311. It's not clear from your wording whether you have one or two copies, so you are either rs6311(C;C) or rs6311(C;T). At present the SNPedia page shows several papers about this SNP, but a clear consensus on the consequences is not yet known. It seems as though you should expect a lower risk of anorexia, bulimia, and tardive dyskinesia but may experience more anger- and aggression-related behavior.

Addendum from Ann Turner: the question came from someone who has 1438G in his complete mitochondrial DNA sequence (called FGS by FTDNA). It is practically universal, since the Cambridge Reference Sequence has the rare allele there. The number 1438 in the mtDNA molecule is unrelated to the site of the mutation in HTR2A.

There appear to be discrepancies between my 23andMe data and my Promethease report. For example, for rs1051730

My genotype according to 23andMe is AA,

Yet the SNPedia literature says that the genotypes are (C;C), (C;T), and (T;T).

Both 23andMe and SNPedia are using rs numbers, so they should be the same SNP?!

Your rs1051730(A;A) is also known as rs1051730(T;T). These are two different but equally valid names for exactly the same thing.

SNPedia has chosen to call it (T;T) because this is consistent with what we consider to be the highest authority, a database called dbSNP run by the NIH. You can see a specific example of that here.

23andMe has in this case chosen to call it (A;A) instead, for an entirely valid reason (this SNP is on the minus strand). It doesn't matter; it is still the same thing. And you're seeing it in your report because Promethease was able to handle this.
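The rs1051730 case can be worked through in a few lines of Python. This is only a sketch of the idea and not how Promethease actually does it. The function `to_snpedia_genotype` and its `flip` argument are hypothetical, the caller is assumed to already know that the SNP is reported on the opposite strand, and sorting the alleles alphabetically is an assumption made to get a stable (C;T)-style name.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def to_snpedia_genotype(raw: str, flip: bool) -> str:
    """Convert a raw two-letter genotype such as "AA" to "(T;T)" style.

    raw:  genotype letters as reported by the testing company
    flip: True if the report uses the strand opposite to dbSNP
          (assumed to be known from elsewhere; it is not inferred here)
    """
    alleles = [COMPLEMENT[base] if flip else base for base in raw.upper()]
    # Assumption: alleles are listed alphabetically, e.g. (C;T) not (T;C).
    return "(" + ";".join(sorted(alleles)) + ")"


if __name__ == "__main__":
    # A raw AA for rs1051730, reported on the opposite strand
    print("rs1051730" + to_snpedia_genotype("AA", flip=True))  # rs1051730(T;T)
```

With `flip=True`, the other two possible raw genotypes, AG and GG, map to (C;T) and (C;C), which are the remaining genotypes listed on the SNPedia page.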