Faith sent me an email expressing an interest in better understanding DNA, genes and chromosomes. She proposed a thread with goals the same as Introduction To Geology, but for genetics instead of geology, and she included a list of questions, which I include below. Faith will be participating so that this can be interactive.

If you have some cells of some creature but don't know what creature it is, is it always easy to identify it from its DNA?

What are the defining factors: the number of chromosomes only?

If you have DNA from strikingly different breeds of, say, dogs, say a greyhound, a chihuahua, a black lab, a Bichon Frise, a Dalmatian, or take your pick of what are the least similar -- are you able to tell which is which from just looking at the DNA and what exactly do you look for?

Apparently it's possible to determine close relatedness of family members from their DNA. What exactly are the most important clues?

Do you ever actually look at the DNA itself or are you looking at some sort of indicator, model, or whatever you call it that you somehow derive from the cell? I know a DNA portrait as it were is often represented by some sort of bars that to me are indecipherable. Do I have to learn what those mean in order to get answers to the sort of questions I'm asking?

Apparently the DNA in different body cells is different. Can you nevertheless identify the creature from any of these different cells? Or, what SORT of difference are we talking about?

Apparently a gene is a segment of the DNA strand and it can be a very long segment, even of thousands of paired chemicals, and there is some kind of chemical sequence that is different from the body of the gene that tells you where it begins and ends. Or something like that?

The function of some genes is known, and can even be predicted across different species. That is, you know where the gene for oh say eye color is located, or the many genes that determine eye color if there is more than one, and this is predictable for many species. Or is it?

Generally speaking, are there many traits that are governed by more than one gene?

Where is the "junk DNA" located on the DNA strand? Is it interspersed with functioning genes or collected all in one place or what? Is there some way you can tell by just looking at it that it's "junk" or how do you tell?

Homozygosity is the pairing of identical forms of the gene, is that right? Does that mean an absolutely identical chemical sequence on both sides of the pair?

Whereas heterozygosity is the pairing of different alleles or different forms of the same gene, meaning a different chemical sequence on both sides of the pair? Or something like that?

If you have some cells of some creature but don't know what creature it is, is it always easy to identify it from its DNA?

At one time the answer would have been "No". In order to identify a species by their DNA you need DNA from a known source. Not too long ago (i.e. 20 years ago) there simply wasn't a lot of DNA sequence known. Luckily, methodologies and technologies have greatly increased our ability to sequence DNA akin to Moore's Law and computer chips. If you are trying to work on identifying the species then a larger BAC clone would be the way to go. In this methodology you use an enzyme to break up the DNA into large chunks (or physical shearing), and then randomly insert those chunks into the BAC plasmid. You then use Sanger sequencing from the known plasmid sequences that flank the insert of interest. This usually gives 300 to 700 bases of good read from both ends of the insert. In the olden days we used radioactive terminators and autoradiography from big gels to run these sequencing reactions. Now it is all automated on capillary gels that use fluorescent dyes instead. The results of the run will look something like this:

Once you have your sequence you then dump it into BLAST. If it is not able to find a perfect match for your sequence it will give a phylogeny of the closest related sequences so you will at least be able to see what kind of critter you are working with. With some luck, you may even be able to identify not only the species, but where in the genome that DNA chunk came from.
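To make the "find the closest match" idea concrete, here is a toy sketch. It is not how BLAST works internally; it just scores a query against a handful of reference sequences by counting shared short substrings (k-mers) and ranks the references. The sequences, the species names and the function names are all made up for illustration.

```python
def kmers(seq, k=4):
    """Return the set of all overlapping substrings of length k."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def similarity(a, b, k=4):
    """Jaccard similarity of the two k-mer sets (1.0 means identical sets)."""
    ka, kb = kmers(a, k), kmers(b, k)
    if not ka or not kb:
        return 0.0
    return len(ka & kb) / len(ka | kb)


def closest_matches(query, references, k=4):
    """Rank (score, name) pairs from most to least similar to the query."""
    scored = [(similarity(query, seq, k), name) for name, seq in references.items()]
    return sorted(scored, reverse=True)


# Hypothetical reference "database"; real databases hold millions of sequences.
REFERENCES = {
    "species_A": "ATGGCGTACGTTAGCCTAGGATCCGATTACA",
    "species_B": "ATGGCGTACGTTAGCCTAGGTTCCGATTACA",  # one base different from A
    "species_C": "TTGACCGGTAAACCCGGGTTTAAACGCGCGA",  # unrelated
}

query = "ATGGCGTACGTTAGCCTAGGATCCGATTACA"  # identical to species_A
ranking = closest_matches(query, REFERENCES)
for score, name in ranking:
    print(name, round(score, 2))
```

The point of the ranking is the same as in the text: even without a perfect match, the near misses at the top of the list tell you what kind of critter you are probably looking at.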

If you have DNA from strikingly different breeds of, say, dogs, say a greyhound, a chihuahua, a black lab, a Bichon Frise, a Dalmatian, or take your pick of what are the least similar -- are you able to tell which is which from just looking at the DNA and what exactly do you look for?

Not without a lot of background work. You would first need to find breed specific sequences from each of those breeds which takes a lot of work. At best, you could determine if another dog is the parent/sibling of another dog assuming you had the STR data needed for such an analysis. You may want to read about human DNA fingerprinting to get a better idea of how this would work:

Do you ever actually look at the DNA itself or are you looking at some sort of indicator, model, or whatever you call it that you somehow derive from the cell? I know a DNA portrait as it were is often represented by some sort of bars that to me are indecipherable. Do I have to learn what those mean in order to get answers to the sort of questions I'm asking?

Most sequencing uses Sanger sequencing (as cited above). DNA profiling uses PCR to amplify short tandem repeats (STRs), and the gel then separates these short sequences by size. Those will produce the bands you see on some of those gels, such as this one:

IOW, there are many ways of doing DNA fingerprints, and not all of them require direct sequencing. PCR and endonucleases are two indispensable tools used in this type of work, so if you want to understand what is going on you should really learn what those tools are and how they are used.
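A rough sketch of why STRs give bands at different positions: two samples carry the same repeated motif at a locus, but a different number of copies, so the amplified piece has a different length. The motif, flanking sequences and function name below are invented for the example, and a real profile looks at many loci, not one.

```python
import re


def longest_repeat_run(sequence, motif):
    """Return the largest number of back-to-back copies of motif in sequence."""
    runs = re.findall(f"(?:{re.escape(motif)})+", sequence)
    if not runs:
        return 0
    return max(len(run) // len(motif) for run in runs)


# Same hypothetical flanking sequence, different repeat counts.
sample_1 = "CCTAG" + "GATA" * 7 + "TTCGA"
sample_2 = "CCTAG" + "GATA" * 11 + "TTCGA"

for name, seq in (("sample_1", sample_1), ("sample_2", sample_2)):
    # Fragment length is what the gel actually separates.
    print(name, "repeats:", longest_repeat_run(seq, "GATA"), "length:", len(seq))
```

The longer fragment travels less far through the gel, so the two samples show a band at different heights even though nobody has read the sequence itself.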

I'm thinking of doing a series on "genetics for beginners" after I'm done with geology. Actually, I've already written most of the articles and drawn the diagrams.

However, I have a lot of things to do right now.

I'll undertake to answer the questions Faith wants answering if she'll turn up on this thread and ask me them one at a time. But I'm not going to write a series of posts explaining genetics from the bottom up until I've finished with geology.

Apparently the DNA in different body cells is different. Can you nevertheless identify the creature from any of these different cells? Or, what SORT of difference are we talking about?

There are cells that are anucleated such as the red blood cell that lacks a nucleus. Gametes will only have half of your genome as per the process of meiosis. Other than that, I am unaware of cells that have "different DNA". The only differences that you will see are epigenetic differences which are patterns of DNA methylation and histone packaging. However, the DNA will still have the same sequence from cell to cell.

The function of some genes is known, and can even be predicted across different species. That is, you know where the gene for oh say eye color is located, or the many genes that determine eye color if there is more than one, and this is predictable for many species. Or is it?

It is usually not a matter of location but of sequence similarity. Genes can move around quite a bit due to recombination events. Synteny and orthology are still important, but for pure protein function the only bit that counts is the sequence. Location tends to be more important for DNA regulation, that is when a gene is turned on and how strongly it is expressed. DNA regulation has as much to do with what an animal looks like as basic protein function.

Generally speaking, are there many traits that are governed by more than one gene?

For metazoans, this is definitely the case. You may want to look into "Evolutionary Developmental Biology". It is a huge field of study that looks at just that, the interaction of genes that give rise to different traits as a product of embryonic development.

Where is the "junk DNA" located on the DNA strand? Is it interspersed with functioning genes or collected all in one place or what? Is there some way you can tell by just looking at it that it's "junk" or how do you tell?

Junk DNA is spread throughout the genome. The most obvious examples are pseudogenes, which carry obvious evidence of past function that has been knocked out by subsequent mutations. One example is the human GULO pseudogene, a broken copy of the gene for the last enzyme in vitamin C synthesis. Because it no longer works we cannot produce our own vitamin C, which is why we need to eat vitamin C to prevent scurvy. There are also old transposons and just long strings of DNA that have no identifiable features, be it a promoter, transcription factor binding site, or open reading frame.

A lot of these stretches of junk DNA can be identified by comparing one species to another. Using some basic assumptions you can determine the rate at which different areas of the genome are accumulating mutations. If the accumulation is consistent with neutral drift then the chances are that there is no DNA sequence specific function in that part of the genome.
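As a very simplified sketch of that comparison, assume you already have the same stretch of genome from two species lined up base for base. You can then count, window by window, what fraction of positions differ. Windows that have changed a lot are candidates for neutral drift; windows that have barely changed are candidates for sequence-specific function. The sequences and window size are made up, and real analyses use proper alignment and statistical models of mutation rather than raw mismatch counts.

```python
def window_divergence(seq_a, seq_b, window=10):
    """Fraction of mismatched positions in each non-overlapping window."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    result = []
    for start in range(0, len(seq_a), window):
        a = seq_a[start:start + window]
        b = seq_b[start:start + window]
        mismatches = sum(1 for x, y in zip(a, b) if x != y)
        result.append(mismatches / len(a))
    return result


# Hypothetical aligned sequences: first window conserved, second window drifting.
species_1 = "ATGGCGTACG" + "TTAGCCTAGG"
species_2 = "ATGGCGTACG" + "TAAGGCTCGG"

divergence = window_divergence(species_1, species_2)
print(divergence)
```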

Homozygosity is the pairing of identical forms of the gene, is that right? Does that mean an absolutely identical chemical sequence on both sides of the pair?

These were initially identified as the same phenotype. As you are probably aware, two alleles can differ in DNA sequence but still produce the same physical trait (and even the same protein sequence). So the answer is yes and no. You just have to be aware of the comparison being made.
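Here is a small sketch of the "yes and no". The codon table below is only a fragment of the standard genetic code, just enough for the example, and the three alleles are invented. Two of them differ in DNA sequence yet translate to the same protein; the third differs in the protein as well.

```python
# Partial standard codon table (only the codons used in this example).
CODON_TABLE = {
    "ATG": "M", "GAA": "E", "GAG": "E", "GAT": "D", "GAC": "D",
    "CTG": "L", "CTA": "L", "AAA": "K", "AAG": "K", "TAA": "*",
}


def translate(dna):
    """Translate DNA to one-letter amino acids, three bases at a time."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


allele_1 = "ATGGAACTGAAATAA"
allele_2 = "ATGGAGCTAAAGTAA"  # three silent differences from allele_1
allele_3 = "ATGGATCTGAAATAA"  # one difference that changes an amino acid

for name, allele in (("allele_1", allele_1), ("allele_2", allele_2), ("allele_3", allele_3)):
    print(name, allele, translate(allele))
```

So a pair of allele_1 and allele_2 is "the same" if you compare proteins and "different" if you compare DNA, which is why it matters which comparison is being made.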

(Took me so long to answer because I didn't see the "request password" link when I tried to log in)

I wrote out one long reply to both your posts and was driven to the conclusion that there’s just too much here to try to respond to all of it at once and maintain any kind of order, so I decided to try to limit things by focusing on one question at a time.

If you have some cells of some creature but don't know what creature it is, is it always easy to identify it from its DNA?

At one time the answer would have been "No". In order to identify a species by their DNA you need DNA from a known source. Not too long ago (i.e. 20 years ago) there simply wasn't a lot of DNA sequence known. Luckily, methodologies and technologies have greatly increased our ability to sequence DNA akin to Moore's Law and computer chips. If you are trying to work on identifying the species then a larger BAC clone would be the way to go. In this methodology you use an enzyme to break up the DNA into large chunks (or physical shearing), and then randomly insert those chunks into the BAC plasmid. You then use Sanger sequencing from the known plasmid sequences that flank the insert of interest. This usually gives 300 to 700 bases of good read from both ends of the insert. In the olden days we used radioactive terminators and autoradiography from big gels to run these sequencing reactions. Now it is all automated on capillary gels that use fluorescent dyes instead. The results of the run will look something like this: [chart]

Once you have your sequence you then dump it into BLAST. If it is not able to find a perfect match for your sequence it will give a phylogeny of the closest related sequences so you will at least be able to see what kind of critter you are working with. With some luck, you may even be able to identify not only the species, but where in the genome that DNA chunk came from.

So you are apparently giving a “yes” to this question because now it is possible, by the various means you mention (that are technically rather over my head), to find out “at least…what kind of critter you are working with.” I gather it’s not “easy” at all since it does require quite a bit of technical work to arrive at the conclusion but that you CAN indeed find out the species.

So I'd sum up:

ANSWER: YES, WE CAN IDENTIFY THE SPECIES, BUT IT’S NOT EXACTLY “EASY.”

Fair enough?

I am a bit surprised because I did assume that the number of chromosomes would be the major indicator, which I also assumed wouldn’t be hard to determine, but you didn’t address that part of the question. (“What are the defining factors: the number of chromosomes only?” )

If my paraphrase of your answer is correct enough I’m happy with it, but I don’t want to move on until any further thoughts on the subject that might occur to anyone are considered.

P.S. I did look up some unfamiliar terminology but think I'll leave the technical parts of the answers to later. Right now they could get to be a long rabbit trail.

Dr. A: Thanks and I may need your help with getting some answers in more accessible English at least, so I may well ask some questions of you.

I’m glad to hear you’ve been thinking of doing a course on genetics for beginners as I very much appreciate your course on geology.

What I told Percy is that I would like to see such a course done but expected it would rapidly get over my head, so I didn’t really think of this thread along those lines, but as a way of learning more about some particular questions I’ve had in mind for some time.

I don't think it needs to get over anyone's head. The tricky bit is the chemistry (I think) but the fact is that you don't need to know any of the chemistry, any more than if I was teaching computer programming I'd get into the details of how electricity works. I don't even know how electricity works. I think it involves electrons ... and magic.

In the meantime, I'm happy to answer particular questions, but as I say I won't start a general course until I've finished with geology. My devoted fans will be pleased to know that I've managed to get an actual job for which I get paid with genuine pieces of green paper, so I don't have as much free time as I used to.

quote: I am a bit surprised because I did assume that the number of chromosomes would be the major indicator, which I also assumed wouldn’t be hard to determine, but you didn’t address that part of the question. (“What are the defining factors: the number of chromosomes only?” )

No, it is quite possible for different species - even those that are not closely related - to have the same number of chromosomes. Chimpanzees, orangutans, gorillas and hares all have 24 pairs of chromosomes. Humans and Reeves's muntjacs have 23 pairs of chromosomes.
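To put the same point as a lookup: if you go from a chromosome count back to a species, you get a list, not an answer. The sketch below uses only the pair counts given in this post; the function name is just for illustration.

```python
from collections import defaultdict

# Pairs of chromosomes, as listed in the post above.
CHROMOSOME_PAIRS = {
    "chimpanzee": 24,
    "orangutan": 24,
    "gorilla": 24,
    "hare": 24,
    "human": 23,
    "Reeves's muntjac": 23,
}


def species_by_count(pairs):
    """Group species names by their number of chromosome pairs."""
    groups = defaultdict(list)
    for species, count in pairs.items():
        groups[count].append(species)
    return dict(groups)


groups = species_by_count(CHROMOSOME_PAIRS)
for count, names in sorted(groups.items()):
    print(count, "pairs:", ", ".join(names))
```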

I am a bit surprised because I did assume that the number of chromosomes would be the major indicator, which I also assumed wouldn’t be hard to determine, but you didn’t address that part of the question. (“What are the defining factors: the number of chromosomes only?” )

If I can jump in to address this - you're right that it's very easy to determine the number of chromosomes that a cell has; unlike the latest and greatest molecular biology techniques, counting chromosomes is something you do with a special dye and a microscope. They're large enough to see when we collect them in their condensed "chromatid" state, found in cells that are captured in the process of division, when the cell very conveniently "packages" its chromosomes so that they can be moved around.

But the wrinkle is that different species may have the same number of chromosomes, especially if they're closely related; different individuals in the same species may have different numbers of chromosomes, different cells in the same organism may have different numbers of chromosomes, and prokaryotic organisms like bacteria typically have only a single circular chromosome, regardless of their species. So what can be gleaned strictly from chromosome count is often subject to what species you're talking about. There's no way to look at just a chromosome number and say "well, that's from Bos taurus."

So you are apparently giving a “yes” to this question because now it is possible, by the various means you mention (that are technically rather over my head), to find out “at least…what kind of critter you are working with.” I gather it’s not “easy” at all since it does require quite a bit of technical work to arrive at the conclusion but that you CAN indeed find out the species.

To sum it up briefly, technology has made it much easier and faster to sequence genomes so many more genomes have been sequenced. This makes it easier to determine which species a given DNA sequence came from, or at least a species group (e.g. genus or family).

A social security number could serve as an analogy here. If all we had was the first number of a SSN it would be difficult to figure out who it belonged to. The more numbers we have the more we can narrow it down. There are still groups of species that we do not have much sequence data from, but for many groups it is equivalent to having all but one number of a SSN which allows us to really narrow it down.
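The same analogy as a toy sketch: the more bases of the read you have, the fewer reference sequences are still consistent with it. The four short "references" are invented, and matching from the start of the sequence is a simplification chosen only to mirror reading an SSN digit by digit.

```python
def candidates(known_bases, references):
    """Names of references whose sequence starts with the bases read so far."""
    return sorted(name for name, seq in references.items() if seq.startswith(known_bases))


# Hypothetical reference sequences.
REFERENCES = {
    "species_A": "ATGGCA",
    "species_B": "ATGGTT",
    "species_C": "ATGCAA",
    "species_D": "TTGCAA",
}

full_read = "ATGGCA"
narrowing = [len(candidates(full_read[:n], REFERENCES)) for n in range(1, len(full_read) + 1)]
print(narrowing)
```

Each entry is the number of species still in the running after one more base is known.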

As to the technical details, I included links so you could check it out yourself. It is a bit difficult to explain these techniques in a forum format. All I can do is encourage you to do some extended searches on some of those keywords if they interest you. I am sure there are some good videos on Sanger sequencing, as one example. A new technology is pyrosequencing which is really interesting as well. This technique is often used on ancient DNA that is made up of many short pieces of DNA.

I am a bit surprised because I did assume that the number of chromosomes would be the major indicator, which I also assumed wouldn’t be hard to determine, but you didn’t address that part of the question.

I guess I ignored it because it is a very minor indicator. Chromosome number can vary even within species, but can be the same between very distantly related species (I believe that tobacco plants and humans have the same number of chromosomes while our closest ape relative has one more than we do). The sequence is way more important for determining relatedness.

Do you ever actually look at the DNA itself or are you looking at some sort of indicator, model, or whatever you call it that you somehow derive from the cell? I know a DNA portrait as it were is often represented by some sort of bars that to me are indecipherable. Do I have to learn what those mean in order to get answers to the sort of questions I'm asking?

Last question first: No. The "black bar" pictures you're familiar with are an older way of characterizing DNA by chemically cutting it at specific sequence sites, and then measuring the length of the fragment - the closer to the bottom of the picture the black bar is, the shorter the fragment is. Different rows of black bars are different samples, and a "ladder" - a mixture of fragments of known sizes - runs along the sides for comparison. The pattern of fragment sizes can be compared with other people's pattern of fragment sizes, assuming you broke up their DNA in the same way.
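If it helps, the cutting step can be sketched in a few lines. The recognition site used here, GAATTC, is the one cut by the enzyme EcoRI (between the G and the first A); the two sample sequences are made up, and the second one has a single-letter change that destroys one of the sites.

```python
def digest(sequence, site="GAATTC", cut_offset=1):
    """Cut sequence at every occurrence of site; return the fragment lengths in order."""
    lengths = []
    start = 0
    pos = sequence.find(site)
    while pos != -1:
        cut = pos + cut_offset  # position just after the G of the site
        lengths.append(cut - start)
        start = cut
        pos = sequence.find(site, pos + 1)
    lengths.append(len(sequence) - start)
    return lengths


# Hypothetical samples: sample_b has lost its second cut site (GAATTC -> GAGTTC).
sample_a = "A" * 12 + "GAATTC" + "T" * 20 + "GAATTC" + "C" * 5
sample_b = "A" * 12 + "GAATTC" + "T" * 20 + "GAGTTC" + "C" * 5

for name, seq in (("sample_a", sample_a), ("sample_b", sample_b)):
    # Longest fragments stay near the top of the gel, shortest run to the bottom.
    print(name, sorted(digest(seq), reverse=True))
```

Two samples that differ by one letter end up with different sets of fragment lengths, which is exactly the difference in "black bar" patterns.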

So that's a case where you're actually looking at DNA - it's been dyed, or sometimes made radioactive, and used to expose a piece of photographic film. That's why it's kind of fuzzy. We have new methods, though, where we can use chemistry and special machines to sequence DNA directly, and have a digital readout of all of its bases. When we do that, we're starting with DNA, usually making a lot of copies of it so that there's more of it to work with (we have a chemical DNA copier called "PCR"), and then producing a digital file that contains all the bases in our sample. Then we just look at it on computers. We're using the third or fourth generation of technologies developed during the Human Genome Project. We're experiencing the same kind of technological growth that we had in computers in the 80's and 90's.

But you can look at DNA right now, without much of anything special:

In this form, it doesn't tell anybody much at all. It's just a slimy fiber. Learning to read this - for that matter, learning what it did - was the central task in biology and biochemistry for about 50 years. As a field, we're largely moving away from "black bars"-type techniques in molecular biology and more into direct sequencing. Actually I'd say that we already have. So, no, you won't have to learn how to read one of those.

I'm going to get lost if we don't stick more or less to the order of the questions. Although that's an artificial order it IS an order.

It looks like Question One is finished so on to Question Two.

If you have DNA from strikingly different breeds of, say, dogs, say a greyhound, a chihuahua, a black lab, a Bichon Frise, a Dalmatian, or take your pick of what are the least similar -- are you able to tell which is which from just looking at the DNA and what exactly do you look for?

Not without a lot of background work. You would first need to find breed specific sequences from each of those breeds which takes a lot of work. At best, you could determine if another dog is the parent/sibling of another dog assuming you had the STR data needed for such an analysis.

Although 99.9% of human DNA sequences are the same in every person, enough of the DNA is different to distinguish one individual from another, unless they are monozygotic twins.[2] DNA profiling uses repetitive ("repeat") sequences that are highly variable,[2] called variable number tandem repeats (VNTRs), particularly short tandem repeats (STRs). VNTR loci are very similar between closely related humans, but so variable that unrelated individuals are extremely unlikely to have the same VNTRs.
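A minimal sketch of how that kind of data gets used for relatedness, assuming each individual is recorded as two repeat counts per locus (one inherited from each parent). A child should share at least one allele with a true parent at every locus, barring the occasional mutation. The locus names, numbers and function name are invented, and real casework uses many more loci plus population statistics.

```python
def consistent_with_parent(child, alleged_parent):
    """True if the child shares at least one allele with the alleged parent at every locus."""
    return all(set(child[locus]) & set(alleged_parent[locus]) for locus in child)


# Hypothetical STR profiles: repeat counts for the two alleles at each locus.
child = {"locus_1": (7, 11), "locus_2": (9, 9), "locus_3": (14, 16)}
candidate_x = {"locus_1": (7, 8), "locus_2": (9, 12), "locus_3": (16, 17)}
candidate_y = {"locus_1": (7, 8), "locus_2": (10, 12), "locus_3": (16, 17)}

print("candidate_x:", consistent_with_parent(child, candidate_x))
print("candidate_y:", consistent_with_parent(child, candidate_y))
```

A single mismatching locus, as with candidate_y at locus_2, is enough to make parentage unlikely, whereas a match at every locus is only consistent with parentage and needs statistics to say how strongly.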

My thoughts on this would probably lead down too many rabbit trails but I’ll mention this much anyway: I think I “knew” that a great percentage of DNA merely identifies us as human so that what identifies us as individuals involves a very small percentage but it helps to have it repeated. I would suppose then that a breed or a “race” of a Species might eventually be identifiable by certain characteristics within the smaller percentage and that this may be established some time in the future. Or make that a question.

Again, I’ll wait for thoughts about this one before going on to No. 3.

Hi Crash. This answer from you is going to be useful when that question comes up but please let's wait. I should have realized all those questions could drown a thread, but some of them will probably be answered fairly quickly.

I'm going to take a shot at answering the questions about chromosomes, but I'm going to keep it simpler than most have. Keep in mind that there can be exceptions to anything, but by and large what I'm going to tell you will be true 99 and 44/100% of the time.

Each species has a fixed number of chromosomes that is the same for all individuals. The number of chromosomes can vary from 1 up to 132 (the highest figure I happened to find) and beyond, but probably very few species have over a hundred chromosomes. Since there are millions of species naturally many species will have the same number of chromosomes, but that's a matter of coincidence and is of no consequence. The number of chromosomes has little effect on what a species looks like.

If someone gave you two cells from two different species and you found that they had the same number of chromosomes, simply assuming they're the same species would be a big mistake.

Each chromosome is a long tightly wound tangle of DNA. Coded within the DNA are the genes. A specific gene always resides on the same chromosome, though it can move around somewhat on that chromosome during reproduction.

I'm not a biologist, so I might be mistaken. I'm inclined to say that in theory, yes; in practice, no.

Most biologists would say that the DNA does determine the breed, but that there isn't any easy way of getting from DNA data to conclusions about which breed. There isn't a nice mathematical formula from one to the other, so getting breed information from DNA information depends on consulting a huge database which does not currently exist.

Reading the remainder of your post, I think you do already get this.
