It says that the "reference sequence" is the top line and that I can use the general genetic code to find the reading frame.

I can see that there are two N's instead of a G and a C. However I don't know what this means. I'm also not sure how I'd find the reading frame? I suppose I could look for a start and stop codon but I could read them both ways?

Last question: Is this change likely pathogenic? I'm not sure how to classify a variation as pathogenic or not.

2 Answers

The two Ns that you see are not necessarily variants; more likely they are just low-quality base calls. When you sequence DNA and the sequencer cannot decide which base is at a position, it reports N, meaning that the base could be any of the four DNA bases.

As for finding the reading frame: if you know this sequence contains a stop codon, that helps, so look for any of the three stop codons (TAA, TAG or TGA). If not, translate the sequence in each of the three possible frames. A frame that codes for amino acids along its whole length, with no stop codon interrupting it, is likely the correct one.

For reference here is a link to a codon chart that will help you decipher the reading frame.
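If you would rather not read the chart by hand, here is a minimal Python sketch of the same idea: translate the sequence in each of the three forward frames and compare the results. The function names and the example sequence are made up for illustration, and codons containing an N are shown as X because they cannot be translated unambiguously.

```python
from itertools import product

# Standard genetic code, listed in the conventional T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(seq, frame):
    """Translate seq from offset frame (0, 1 or 2).

    Stop codons become '*'; codons with an uncalled base (N) become 'X'.
    A trailing incomplete codon is ignored.
    """
    seq = seq.upper()
    codons = (seq[i:i + 3] for i in range(frame, len(seq) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def three_frame_translation(seq):
    """Return {frame: protein} for the three forward reading frames."""
    return {frame: translate(seq, frame) for frame in range(3)}


# Hypothetical example sequence, not the one from the question.
for frame, protein in three_frame_translation("ATGGCCTAA").items():
    print(frame, protein)
```

In a real coding fragment, the frame that gives a long stretch of amino acids without an internal `*` is the likely candidate.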

Yes absolutely, could you tell me if you can, whether the mutation is likely to be pathogenic, though? And if so, why?
– Paze, Apr 22 '15 at 16:21

So as cagliari says above it does not look like you have any mutations. You cannot call N a mutation because this could be any base including the one from the reference genome.
– The Nightman, Apr 22 '15 at 16:48

The traces you have come from Sanger sequencing. In sequence notation, N stands for "any nucleotide": it is used when the base at a given position is unknown (it could be any of the four bases).

In your case you have Ns because the base-calling software was unable to determine the nucleotide. The first N is due to two overlapping peaks (G and A signals), and the second one should be a C rather than an N.

From what I can see, you have no apparent mutations, so no, those changes are unlikely to be pathogenic. A mutation can be called when a base that is known in the reference differs in your sample, which is not the case here (you simply have N -> G and N -> C). The signals at your Ns look very similar between your reference and your sample, which does not suggest a mutation.
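To make the rule concrete, here is a small Python sketch that compares two already-aligned sequences and reports only positions where both bases are called and differ. The function name and the example sequences are hypothetical. It assumes the sequences have the same length with no gaps.

```python
def call_differences(reference, sample):
    """Return (position, ref_base, sample_base) for unambiguous mismatches.

    Positions are 1-based. Any position where either sequence has an N
    is skipped, because an uncalled base is compatible with any nucleotide.
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    differences = []
    pairs = zip(reference.upper(), sample.upper())
    for position, (ref_base, sample_base) in enumerate(pairs, start=1):
        if "N" in (ref_base, sample_base):
            continue  # not evidence of a change, just a missing base call
        if ref_base != sample_base:
            differences.append((position, ref_base, sample_base))
    return differences


# Hypothetical sequences: the Ns line up with G and C, so nothing is reported.
print(call_differences("ACGTC", "ACNTN"))
# A real substitution (G -> T at position 3) is reported.
print(call_differences("ACGT", "ACTT"))
```

A position skipped this way still deserves a look at the trace itself, since that is where you decide what the base really is.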

For the reading frame, you have to look for a start codon (ATG) and a stop codon (TAG, TAA or TGA) to identify the open reading frame (ORF). Several programs do this; one is the online NCBI tool called ORF Finder.
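If you want to see what such a tool does in principle, here is a rough Python sketch of a forward-strand ORF search. It is not how ORF Finder is implemented, just the start-to-first-in-frame-stop idea written as a regular expression. The names `ORF_PATTERN`, `find_orfs` and `min_codons` are invented for this example.

```python
import re

# ATG, then as few complete codons as possible, then the first in-frame stop.
# The lookahead lets overlapping candidates (one per ATG) all be reported.
# N is allowed inside codons so an uncalled base does not end the search.
ORF_PATTERN = re.compile(r"(?=(ATG(?:[ACGTN]{3})*?(?:TAG|TAA|TGA)))")


def find_orfs(seq, min_codons=2):
    """Return (start_index, orf_sequence) for each ATG with an in-frame stop.

    start_index is 0-based; orf_sequence includes the stop codon.
    ORFs shorter than min_codons codons (stop included) are dropped.
    """
    seq = seq.upper()
    orfs = []
    for match in ORF_PATTERN.finditer(seq):
        orf = match.group(1)
        if len(orf) // 3 >= min_codons:
            orfs.append((match.start(), orf))
    return orfs


# Hypothetical example: one ORF starting at index 2.
print(find_orfs("CCATGGCCTAAGG"))
```

Note that an ATG lying in-frame inside a longer ORF is reported as its own, shorter candidate. Real tools usually keep only the longest one and apply a much larger minimum length.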

As you pointed out, the ORF can also be on the reverse strand. In that case you don't want to read the sequence backwards, but rather its reverse complement. Tools that detect ORFs usually have an option to search both strands.
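The difference between the reversed sequence and the reverse complement is easy to see in a few lines of Python. This is only an illustrative sketch with a made-up sequence; N is mapped to N since the complement of an unknown base is also unknown.

```python
# Complement each base; an uncalled base stays uncalled.
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T, N)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


# Hypothetical top-strand sequence with an ORF hidden on the other strand.
top_strand = "TTAGGCCAT"
print(top_strand[::-1])                # plain reverse: TACCGGATT
print(reverse_complement(top_strand))  # reverse complement: ATGGCCTAA
```

Only the reverse complement reads ATG ... TAA here, which is why you scan the three frames of the sequence as given and then the three frames of its reverse complement, six frames in total.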