1
00:00:15 --> 00:00:19
Professor Jacks is out of town so I
am going to tell you about
2
00:00:19 --> 00:00:24
Recombinant DNA 3,
then he's going to come back and
3
00:00:24 --> 00:00:29
tell you about Cell Biology,
and then you will have finished the
4
00:00:29 --> 00:00:34
foundations part of the course.
And we'll move onto things that
5
00:00:34 --> 00:00:38
build on the foundation,
the Formation Module and the part of
6
00:00:38 --> 00:00:43
the Systems Module,
which I'll be teaching you for the
7
00:00:43 --> 00:00:48
next few weeks,
but today is Recombinant DNA 3.
8
00:00:48 --> 00:00:52
And, as you've been hearing for the
last couple of lectures,
9
00:00:52 --> 00:00:57
this is one of the How-To Modules
that we've put in the course.
10
00:00:57 --> 00:01:01
How to make use of the information
that you have been learning in
11
00:01:01 --> 00:01:06
Molecular Biology and in
Biochemistry and in Genetics to use
12
00:01:06 --> 00:01:11
these disciplines or these pieces of
information to do something useful.
13
00:01:11 --> 00:01:15
And recombinant DNA is really an
extraordinary set of technologies
14
00:01:15 --> 00:01:19
that just keeps getting more and
more extraordinary.
15
00:01:19 --> 00:01:23
And the way one can manipulate
biological systems now is really
16
00:01:23 --> 00:01:27
very exciting.
And it continues to be exciting.
17
00:01:27 --> 00:01:31
When I was a beginning graduate
student we were able to clone the
18
00:01:31 --> 00:01:35
first pieces of DNA.
And now we can really do a lot more
19
00:01:35 --> 00:01:39
than just clone DNA.
So I want to tell you about some of
20
00:01:39 --> 00:01:44
the things that are really essential
to understand about this technology,
21
00:01:44 --> 00:01:48
and then take you through some of
the forefronts of where recombinant
22
00:01:48 --> 00:01:52
DNA technology is now.
We're going to cover three things
23
00:01:52 --> 00:02:00
in this lecture.
24
00:02:00 --> 00:02:10
DNA sequencing,
using genetic polymorphisms for
25
00:02:10 --> 00:02:20
various genotyping analyses,
and then I'm going to try to touch
26
00:02:20 --> 00:02:30
on, and we'll have to see how we do
here, making animals that are
27
00:02:30 --> 00:02:38
so-called transgenic.
So transgenic technology.
28
00:02:38 --> 00:02:44
And I'm going to use PowerPoint
pretty much for most of the lecture,
29
00:02:44 --> 00:02:50
so you have most of the relevant
stuff in front of you.
30
00:02:50 --> 00:02:56
I'm going to frame this in terms of
a human disease, familial
31
00:02:56 --> 00:03:02
hypercholesterolemia.
So you may remember way back when in
32
00:03:02 --> 00:03:06
biochemistry we talked about
cholesterol. Anyone remember what
33
00:03:06 --> 00:03:10
class of macromolecules cholesterol
belongs to? Lipids.
34
00:03:10 --> 00:03:14
Thank you. Lipids. OK.
I'm not even going to give a frog
35
00:03:14 --> 00:03:19
for that. And we have this sense of
cholesterol being a really bad kind
36
00:03:19 --> 00:03:23
of molecule but,
in fact, cholesterol is an essential
37
00:03:23 --> 00:03:27
lipid. It's extremely important.
Without cholesterol you'd die and
38
00:03:27 --> 00:03:32
you need it for many things.
Not only for building membranes in
39
00:03:32 --> 00:03:36
your cells but also,
if you think way back,
40
00:03:36 --> 00:03:40
you may remember me telling you that
cholesterol was part of or had a
41
00:03:40 --> 00:03:44
chemical structure that was very
similar to the steroid hormone
42
00:03:44 --> 00:03:48
family. And steroid hormones,
and we'll discuss this more in the
43
00:03:48 --> 00:03:52
future, are very important molecules
that tell one part of the body what
44
00:03:52 --> 00:03:56
to do, that regulate what different
parts of the body are doing.
45
00:03:56 --> 00:04:00
So cholesterol is part of this
whole signaling system.
46
00:04:00 --> 00:04:04
And really it's not actually
understood all of what cholesterol
47
00:04:04 --> 00:04:08
does, but it's very important.
However, too much of it is not good.
48
00:04:08 --> 00:04:12
And it's probably not good because,
but it's not actually clear. I'll
49
00:04:12 --> 00:04:16
tell you what happens if you have
too much cholesterol,
50
00:04:16 --> 00:04:20
but actually why it happens is not
that clear. So let me talk about
51
00:04:20 --> 00:04:24
this slide up here,
and then we'll talk about what too
52
00:04:24 --> 00:04:28
much cholesterol does for you.
So familial hypercholesterolemia is
53
00:04:28 --> 00:04:33
an inherited disease,
and it's caused by mutations in a
54
00:04:33 --> 00:04:39
gene called the LDL receptor,
that encodes for something called
55
00:04:39 --> 00:04:44
the LDL receptor.
Now, LDL stands for low density
56
00:04:44 --> 00:04:49
lipoprotein. And you had this in a
previous lecture because I'd been
57
00:04:49 --> 00:04:55
mentioning these to you.
Low density lipoproteins.
58
00:04:55 --> 00:05:00
And these bind to various lipids,
including cholesterol, and are taken
59
00:05:00 --> 00:05:05
up into the cell.
And some of them are OK,
60
00:05:05 --> 00:05:09
you probably need some LDLs,
but too much LDL is bad. And if you
61
00:05:09 --> 00:05:14
have too much LDL receptor,
the thing that actually binds to the
62
00:05:14 --> 00:05:18
LDLs, you get too much LDL taken up
into the cell.
63
00:05:18 --> 00:05:23
So this LDL receptor,
you'll talk more about this in cell
64
00:05:23 --> 00:05:27
biology, this LDL receptor,
and you've already had some of this,
65
00:05:27 --> 00:05:32
the LDL receptor is a protein that
binds to these LDLs,
66
00:05:32 --> 00:05:37
takes them into the cell,
and then your cell gets full of LDLs.
67
00:05:37 --> 00:05:41
OK? And as a consequence of this,
your cholesterol levels go way up.
68
00:05:41 --> 00:05:46
Now, you can be heterozygote or
homozygote for familial
69
00:05:46 --> 00:05:50
hypercholesterolemia,
for the LDL receptor gene.
70
00:05:50 --> 00:05:55
OK? For the familial
hypercholesterolemia gene.
71
00:05:55 --> 00:06:00
Try to say that one quickly.
All right.
72
00:06:00 --> 00:06:06
So if you're heterozygote,
you have an increased risk of heart
73
00:06:06 --> 00:06:13
disease. In particular for this
thing called atherosclerosis, which I'll
74
00:06:13 --> 00:06:19
talk more about in a moment.
If you are homozygote, so you have
75
00:06:19 --> 00:06:26
two copies of a mutated LDL receptor
gene, you get severe heart symptoms
76
00:06:26 --> 00:06:32
and you die early. OK?
What is atherosclerosis?
77
00:06:32 --> 00:06:38
Atherosclerosis is a disease that
occurs because you get these
78
00:06:38 --> 00:06:44
buildups of stuff in the blood
vessels. And the stuff is fat and
79
00:06:44 --> 00:06:49
it's proteins,
and it basically makes a big lump
80
00:06:49 --> 00:06:55
that eventually occludes or blocks
the blood vessel.
81
00:06:55 --> 00:07:01
And so atherosclerosis is bad
because it impedes blood flow.
82
00:07:01 --> 00:07:06
And if you impede blood flow,
eventually your heart will seize up
83
00:07:06 --> 00:07:12
and you will have a heart attack,
and that can have, obviously, very
84
00:07:12 --> 00:07:17
severe consequences.
So atherosclerosis occurs because
85
00:07:17 --> 00:07:23
you have high levels of LDL.
And it's really, the actual
86
00:07:23 --> 00:07:29
etiology of atherosclerosis
is not really clear.
87
00:07:29 --> 00:07:33
Part of it may be that there's just too
much fat around and that starts
88
00:07:33 --> 00:07:37
actually getting deposited out of
solution, but it's much more
89
00:07:37 --> 00:07:42
complicated than that.
And there seems to be a very
90
00:07:42 --> 00:07:46
complicated chain of events by which
you get these atherosclerotic
91
00:07:46 --> 00:07:50
plaques sitting on the lining of
blood vessels and impeding blood
92
00:07:50 --> 00:07:55
flow. OK. So there is a lot of
interest medically in
93
00:07:55 --> 00:07:59
atherosclerosis,
particularly in countries such as
94
00:07:59 --> 00:08:04
ours where food is plentiful and
people tend to have too much.
95
00:08:04 --> 00:08:08
And obesity is a problem anyway
because that is part of the set of
96
00:08:08 --> 00:08:13
risk factors for atherosclerosis.
So here are the risk factors. High
97
00:08:13 --> 00:08:18
levels of LDL,
high blood pressure,
98
00:08:18 --> 00:08:23
diabetes, cigarette smoke and so on.
And familial hypercholesterolemia
99
00:08:23 --> 00:08:28
is contributory to high levels of
LDL and atherosclerosis. OK.
100
00:08:28 --> 00:08:34
So one of the things I want to do is
to keep thinking about this disorder
101
00:08:34 --> 00:08:40
and walk you through how you figure
out who's got FH.
102
00:08:40 --> 00:08:46
OK. What you can do is to get
blood cells from people at-risk,
103
00:08:46 --> 00:08:52
and you can actually examine the LDL
receptor gene in the blood cells of
104
00:08:52 --> 00:08:59
people who are at-risk for familial
hypercholesterolemia.
105
00:08:59 --> 00:09:03
And what I'll tell you about is how you
can actually sequence the gene,
106
00:09:03 --> 00:09:08
the FH gene, see if you can find the
mutation and see whether or not you
107
00:09:08 --> 00:09:13
can then identify people who are
at-risk for the disorder.
108
00:09:13 --> 00:09:18
So the first thing I want to tell
you about today is DNA sequencing.
109
00:09:18 --> 00:09:23
DNA sequencing. What is DNA
sequencing? Does someone care to
110
00:09:23 --> 00:09:28
give me a definition or think about
what I might mean by
111
00:09:28 --> 00:09:33
DNA sequencing?
In particular,
112
00:09:33 --> 00:09:38
what part of the DNA are we
sequencing? Thank you,
113
00:09:38 --> 00:09:43
Jamie. You want to say it louder?
The bases. Yes. So in DNA
114
00:09:43 --> 00:09:48
sequencing, and maybe I even wrote
this, what is this,
115
00:09:48 --> 00:09:53
what you want to do is to determine
the base sequence of the DNA.
116
00:09:53 --> 00:09:58
OK? You want to determine the
sequence of AGCT along
117
00:09:58 --> 00:10:03
a DNA fragment.
This technique is powerful beyond
118
00:10:03 --> 00:10:07
almost anything else.
It's an extraordinary technique.
119
00:10:07 --> 00:10:11
The ability to sequence DNA is
extraordinary.
120
00:10:11 --> 00:10:16
And it's extraordinary because you
can get out of it information that
121
00:10:16 --> 00:10:20
is absolutely essential for
understanding life.
122
00:10:20 --> 00:10:25
What you can get from DNA
sequencing is an understanding of
123
00:10:25 --> 00:10:29
the coding capacity of a gene.
So, just like you did in your exam,
124
00:10:29 --> 00:10:33
we gave you a string of DNA and you
conceptually translated
125
00:10:33 --> 00:10:38
it into the protein.
Well, you can do that in real life
126
00:10:38 --> 00:10:42
by looking through the genome,
the human genome and finding
127
00:10:42 --> 00:10:46
stretches of DNA and conceptually
turning them into RNA and into
128
00:10:46 --> 00:10:50
protein and saying,
OK, is this a gene?
129
00:10:50 --> 00:10:54
Does it code for something?
And what does it code for? So you
130
00:10:54 --> 00:10:58
can figure out the coding capacity
of a gene. Part of that is actually
131
00:10:58 --> 00:11:02
identifying is a gene a gene?
So we've sequenced the entire human
132
00:11:02 --> 00:11:06
genome. And I've told you
previously that only about 5% of the
133
00:11:06 --> 00:11:10
genome is actually genes and the
rest is other stuff.
134
00:11:10 --> 00:11:14
So one of the things you want to do
with DNA sequencing is to identify
135
00:11:14 --> 00:11:18
genes. And that's actually very
difficult to do it turns out.
136
00:11:18 --> 00:11:22
But that's one of the things you
can do with DNA sequencing.
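One very rough way to ask "is this a gene?" of a stretch of sequence is to look for open reading frames: an ATG followed, in the same frame, by a stop codon. This is only a minimal sketch under simplifying assumptions (forward strand only, no introns, and the names find_orfs and min_codons are made up for illustration). Real gene finding is much harder than this.
```python
# Minimal sketch: scan the three forward reading frames of a DNA string
# for ATG ... stop stretches. Names and thresholds are illustrative only.
STOP_CODONS = {"TAA", "TAG", "TGA"}
def find_orfs(dna, min_codons=2):
    """Return (start, end, frame) per ATG-to-stop stretch; end is exclusive."""
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i                      # first ATG opens a candidate
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None                   # look for the next ATG
    return orfs
example = "CCATGGCTTGGTAACC"
print(find_orfs(example))  # [(2, 14, 2)]
```
In this toy string the only stretch found is ATGGCTTGGTAA, running from index 2 to 14 in the third frame.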
137
00:11:22 --> 00:11:26
I'll talk more about identifying
genes that are associated with
138
00:11:26 --> 00:11:30
disease, that are causative
of disease.
139
00:11:30 --> 00:11:33
And particularly alleles that are
associated with disease such as in
140
00:11:33 --> 00:11:37
the case of familial
hypercholesterolemia.
141
00:11:37 --> 00:11:40
One can figure out evolutionary
relationships between organisms.
142
00:11:40 --> 00:11:44
So you've probably heard for years
about how similar we are to
143
00:11:44 --> 00:11:48
chimpanzees or how similar we are to
dogs or to dolphins or whatever.
144
00:11:48 --> 00:11:51
But, actually, we didn't really
know. Now we can sequence a human
145
00:11:51 --> 00:11:55
genome, we can sequence a chimp
genome, a dog genome,
146
00:11:55 --> 00:11:59
a dolphin genome, and we can
actually look and see
147
00:11:59 --> 00:12:02
how similar we are.
And we can try to figure out,
148
00:12:02 --> 00:12:06
in evolutionary time, what's changed
between the dolphin and ourselves
149
00:12:06 --> 00:12:09
and what makes a dolphin a dolphin
and ourselves ourselves.
150
00:12:09 --> 00:12:13
It's a very tough question,
but DNA sequencing is essential for
151
00:12:13 --> 00:12:16
trying to answer that kind of
question. And then one can ask
152
00:12:16 --> 00:12:20
about the genome in other ways.
Can one find the promoters of all
153
00:12:20 --> 00:12:23
the different genes?
Remember promoters that make genes
154
00:12:23 --> 00:12:27
be transcribed?
The centromeres,
155
00:12:27 --> 00:12:31
the middle of chromosomes.
Various other elements in the genome
156
00:12:31 --> 00:12:36
that are essential for its function.
So I'm going to spend quite some
157
00:12:36 --> 00:12:41
time talking about DNA sequencing
and tell you that DNA sequencing,
158
00:12:41 --> 00:12:45
most of the DNA sequencing we do
uses a trick. And it's a terrific
159
00:12:45 --> 00:12:50
trick. It really is.
So this DNA sequencing,
160
00:12:50 --> 00:12:55
I'll write it because I don't think
I have this on one of
161
00:12:55 --> 00:13:01
your PowerPoints.
The method of DNA sequencing I'm
162
00:13:01 --> 00:13:09
going to tell you about was devised
by a scientist called Fred Sanger.
163
00:13:09 --> 00:13:17
So I'll tell you about it. It's
called dideoxy,
164
00:13:17 --> 00:13:25
it's also called chain termination,
and it's also called Sanger
165
00:13:25 --> 00:13:30
sequencing.
Professor Sanger is a British
166
00:13:30 --> 00:13:34
scientist who received two Nobel
Prizes. The first was for figuring
167
00:13:34 --> 00:13:37
out how proteins,
how to sequence proteins,
168
00:13:37 --> 00:13:41
and the second was for figuring out
how to sequence DNA.
169
00:13:41 --> 00:13:44
When I was a student,
I heard Professor Sanger talk.
170
00:13:44 --> 00:13:48
And he gave a lecture which was
really memorable.
171
00:13:48 --> 00:13:51
It was packed, a packed auditorium.
And he spoke the entire time like
172
00:13:51 --> 00:13:55
this. I don't think he looked up
once. He gave the entire lecture
173
00:13:55 --> 00:13:59
like this, and he was
barely audible.
174
00:13:59 --> 00:14:03
But at the end of the lecture he got
a standing ovation from everybody
175
00:14:03 --> 00:14:07
because really what he's done,
figuring out how to sequence
176
00:14:07 --> 00:14:12
proteins and how to sequence DNA was
really an extraordinary
177
00:14:12 --> 00:14:16
accomplishment.
So that's the method I'll tell you
178
00:14:16 --> 00:14:21
about. And it uses a cool trick.
So you know now that the sugar in
179
00:14:21 --> 00:14:25
DNA has a 3 prime hydroxyl group,
and that hydroxyl group is the group
180
00:14:25 --> 00:14:30
onto which the phosphate
gets added.
181
00:14:30 --> 00:14:35
Right? And without that hydroxyl
group you could not add on the next
182
00:14:35 --> 00:14:40
nucleotide, right?
It's a question. Think about it.
183
00:14:40 --> 00:14:46
OK? I don't mean it to be
rhetorical. I want you to really be
184
00:14:46 --> 00:14:51
thinking, OK, about this,
because otherwise you won't
185
00:14:51 --> 00:14:57
understand the method.
So here's the 3 prime hydroxyl on
186
00:14:57 --> 00:15:02
regular deoxyribose. OK?
In the Sanger or dideoxy method one
187
00:15:02 --> 00:15:07
uses in the reaction mix,
and I'll go through this with you in
188
00:15:07 --> 00:15:13
a moment, a sugar or nucleotide
that's a dideoxy nucleotide.
189
00:15:13 --> 00:15:18
In other words, on both the 2 prime
and the 3 prime of the sugar,
190
00:15:18 --> 00:15:23
of the ribose there is no hydroxyl
group. There are just
191
00:15:23 --> 00:15:28
those hydrogens.
Now, a dideoxy nucleotide such as
192
00:15:28 --> 00:15:34
this one can get incorporated into
DNA just fine because this phosphate,
193
00:15:34 --> 00:15:40
the triphosphate here can react with
a regular nucleotide that's got a 3
194
00:15:40 --> 00:15:46
prime hydroxyl.
However, once it's been
195
00:15:46 --> 00:15:52
incorporated you cannot elongate the
chain anymore because there is no
196
00:15:52 --> 00:15:58
reactive hydroxyl group.
OK. So based on this principle let
197
00:15:58 --> 00:16:03
me explain.
I've got one of your handouts here.
198
00:16:03 --> 00:16:07
OK. So here we go. Revision, your
template, your primer,
199
00:16:07 --> 00:16:11
here's your template strand,
always goes 3 prime to 5 prime.
200
00:16:11 --> 00:16:15
Here's your 5 prime to 3 prime
primer. If you add nucleotides,
201
00:16:15 --> 00:16:19
deoxynucleotide triphosphates and
DNA polymerase,
202
00:16:19 --> 00:16:24
you will polymerize the whole
fragment.
203
00:16:24 --> 00:16:29
If you add, however,
to the mix of dNTPs and DNA
204
00:16:29 --> 00:16:35
polymerase a low-level of dideoxy
nucleotide triphosphates,
205
00:16:35 --> 00:16:41
every time you add on a nucleotide
the polymerase can either use a
206
00:16:41 --> 00:16:46
regular nucleotide triphosphate,
in which case the chain can elongate
207
00:16:46 --> 00:16:52
subsequently, or it can use a
dideoxy nucleotide triphosphate.
208
00:16:52 --> 00:16:58
If it uses one of the dideoxy NTPs
the chain will terminate.
209
00:16:58 --> 00:17:03
It cannot be elongated any further.
So you get something like this.
210
00:17:03 --> 00:17:08
And the trick here is really this
low-level of ddNTPs.
211
00:17:08 --> 00:17:14
OK? So if you have your template
and your primer and you do a
212
00:17:14 --> 00:17:19
reaction with your dNTPs at a
reasonable level and you spike the
213
00:17:19 --> 00:17:24
reaction with a low-level of dideoxy
NTPs, you get a whole bunch of
214
00:17:24 --> 00:17:30
different length chains
polymerized.
215
00:17:30 --> 00:17:35
Because there is some probability,
at every position, that you're
216
00:17:35 --> 00:17:40
either going to get a ddNTP
incorporated, in which case the
217
00:17:40 --> 00:17:45
chain terminates,
or you're going to get a regular
218
00:17:45 --> 00:17:50
nucleotide incorporated in which
case the chain can continue for a
219
00:17:50 --> 00:17:56
bit. OK? So that is paramount to
dideoxy sequencing.
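The role of the low level of ddNTPs can be sketched as a small simulation. This is an illustration, not a model of real polymerase kinetics: it assumes every position has the same chance, p_dd, of taking a dideoxy nucleotide, and the function names and numbers are invented for the example.
```python
import random
def extended_length(template_length, p_dd, rng):
    """Length one molecule reaches before a ddNTP stops it (or full length)."""
    for position in range(1, template_length + 1):
        if rng.random() < p_dd:        # ddNTP incorporated: no 3' hydroxyl
            return position            # chain terminates here
    return template_length             # only regular dNTPs were used
def simulate(template_length=15, p_dd=0.05, molecules=20000, seed=1):
    """Count how many molecules end at each length; counts[n] is length n."""
    rng = random.Random(seed)
    counts = [0] * (template_length + 1)
    for _ in range(molecules):
        counts[extended_length(template_length, p_dd, rng)] += 1
    return counts
counts = simulate()
for n in range(1, len(counts)):
    print(n, counts[n])
```
With a small p_dd every length shows up in the mix. If you raise p_dd toward 1, nearly all molecules stop in the first few positions and the longer fragments disappear, which is why the ddNTPs have to be kept at a low level.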
220
00:17:56 --> 00:18:01
So let's continue now by looking at
a specific polymer and following
221
00:18:01 --> 00:18:06
through exactly what happens.
So here I've given you a template
222
00:18:06 --> 00:18:10
and a primer. And we're going to do
the same reaction that we just did
223
00:18:10 --> 00:18:15
conceptually. We're going to do it
again conceptually except with
224
00:18:15 --> 00:18:19
letters. We're going to mix
together. And we're going to do,
225
00:18:19 --> 00:18:24
and I see a mistake up here already,
but that's OK. You'll bear with me.
226
00:18:24 --> 00:18:29
What I've done here is to put in
some dideoxy ATP.
227
00:18:29 --> 00:18:33
And I meant to say here I've got
dATP at high levels.
228
00:18:33 --> 00:18:37
And I've got all the other
nucleotides here,
229
00:18:37 --> 00:18:41
too, at high levels.
OK? That's my error and I will
230
00:18:41 --> 00:18:45
correct it. You should correct it
now in your handout.
231
00:18:45 --> 00:18:49
So where it says dATP high,
that should actually say dNTPs high,
232
00:18:49 --> 00:18:53
not just dATP. OK? All right. So
let's look and see what happens to
233
00:18:53 --> 00:18:57
this reaction.
And I've noted here that this
234
00:18:57 --> 00:19:02
dideoxy ATP can be radioactive
or fluorescent.
235
00:19:02 --> 00:19:06
Or actually it doesn't have to work
that way but let's just leave it
236
00:19:06 --> 00:19:10
that way for now.
OK. That actually is not
237
00:19:10 --> 00:19:15
necessarily true.
So let's just focus on the ddATP
238
00:19:15 --> 00:19:19
plus the high dNTPs,
and let's see what happens.
239
00:19:19 --> 00:19:24
OK. So one thing that can happen
is that, here's your primer in red
240
00:19:24 --> 00:19:28
and here's the polymerized DNA in
blue, you get a bit
241
00:19:28 --> 00:19:33
of DNA polymerized.
Now here's an A.
242
00:19:33 --> 00:19:38
See? It goes GAGTAA.
And I've given you a reaction where
243
00:19:38 --> 00:19:42
the first two As use regular dATP.
And so the chain will continue
244
00:19:42 --> 00:19:47
after that. All right?
So here we go, GAGTA. And then the
245
00:19:47 --> 00:19:52
next A that's put in is a dideoxy A.
And that's the end of that
246
00:19:52 --> 00:19:57
polymerization reaction,
and the fragment you're going to
247
00:19:57 --> 00:20:02
get out of it is this little red and
blue composite there.
248
00:20:02 --> 00:20:06
You can do the same thing where you
say actually in some molecules you
249
00:20:06 --> 00:20:10
get polymerization past the second A,
and you keep going until you get to
250
00:20:10 --> 00:20:15
the next A. And at that point,
by chance, you get a dideoxy ATP
251
00:20:15 --> 00:20:19
added to some molecules.
That is the end of polymerization
252
00:20:19 --> 00:20:24
for those molecules.
The chain terminates.
253
00:20:24 --> 00:20:28
For some molecules, however,
you'll put in a regular dATP and the
254
00:20:28 --> 00:20:33
chain will continue.
But it will terminate,
255
00:20:33 --> 00:20:38
excuse me, at the next A that's put
in because you put a dideoxy A in.
256
00:20:38 --> 00:20:43
So in different molecules you're
going to land up with a spectrum of
257
00:20:43 --> 00:20:47
elongated products of different
lengths. All right?
258
00:20:47 --> 00:20:52
And what's crucial here is that the
length of the molecules that chain
259
00:20:52 --> 00:20:57
terminate, because they incorporated
a dideoxy nucleotide,
260
00:20:57 --> 00:21:02
correspond to the position of that
particular nucleotide
261
00:21:02 --> 00:21:07
along the chain.
So you're only going to get a
262
00:21:07 --> 00:21:13
molecule chain terminating with A
when there was a T on the template
263
00:21:13 --> 00:21:18
strand. OK? And so you can map the
positions of the T on the template
264
00:21:18 --> 00:21:23
or the A on the elongated strand by
the length of the elongated products
265
00:21:23 --> 00:21:29
that come out of this reaction.
I'm going to assume you're with me
266
00:21:29 --> 00:21:33
here.
OK. So the point is the polymerized
267
00:21:33 --> 00:21:37
fragments terminate where dideoxy A
incorporates. Now,
268
00:21:37 --> 00:21:40
you've got to do four reactions to
determine the sequence of something.
269
00:21:40 --> 00:21:44
OK. And I've noted here. And the
length of the terminated fragment
270
00:21:44 --> 00:21:48
indicates the position of A.
You may need to go and work with
271
00:21:48 --> 00:21:51
this a bit. OK?
It's a very clever method but it
272
00:21:51 --> 00:21:55
may not be something that's
immediately apparent,
273
00:21:55 --> 00:21:59
so go and work with it if you need
to.
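If it helps to work with it, here is a small sketch of the ddATP reaction. It assumes the newly made strand is the handout example (it starts GAGTAA and is read out in full from the gel later in the lecture) and counts lengths from the first added nucleotide, ignoring the primer. The function name is made up for illustration.
```python
def terminated_lengths(new_strand, dd_base):
    """Lengths of the fragments that can end in the given dideoxy base."""
    return [i for i, base in enumerate(new_strand, start=1) if base == dd_base]
new_strand = "GAGTAACGGTATGCA"   # elongated strand, 5' to 3', primer not counted
print(terminated_lengths(new_strand, "A"))  # [2, 5, 6, 11, 15]
```
Each number is the length of one fragment in the ddA reaction, and each one marks a position of A in the elongated strand, that is, a T in the template.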
274
00:21:59 --> 00:22:03
So the length of the terminated
fragments indicates the positions of
275
00:22:03 --> 00:22:07
A in the elongated strand,
or if you want in T of the template
276
00:22:07 --> 00:22:11
strand. In order to get the
positions of all the different
277
00:22:11 --> 00:22:16
nucleotides along that DNA fragment
you have to do four separate
278
00:22:16 --> 00:22:20
reactions. One that includes dideoxy
ATP, one that includes dideoxy CTP,
279
00:22:20 --> 00:22:25
one dideoxy GTP and one dideoxy TTP.
280
00:22:25 --> 00:22:29
And you do those separately so that
you can monitor the positions of
281
00:22:29 --> 00:22:33
each of those four nucleotides by
the position of chain terminating as
282
00:22:33 --> 00:22:38
you're going along.
OK. So assuming that you guys are
283
00:22:38 --> 00:22:42
with me here at this point,
are you? No. That's an honest
284
00:22:42 --> 00:22:46
answer. Raise your hands if you're
with me. OK. If you're not with me,
285
00:22:46 --> 00:22:51
don't worry about it.
You have to go work with it.
286
00:22:51 --> 00:22:55
It's not intuitive. It's very
clever. I mean there's a reason
287
00:22:55 --> 00:23:00
this guy got the Nobel
Prize for this. OK?
288
00:23:00 --> 00:23:03
It's a really clever method.
OK. So the deal is this. So now
289
00:23:03 --> 00:23:07
what you get out of this is a whole
mix of fragments of different
290
00:23:07 --> 00:23:11
lengths that have terminated at
positions of particular nucleotides,
291
00:23:11 --> 00:23:14
depending on how you've spiked the
reaction. And you've got to
292
00:23:14 --> 00:23:18
separate them from one another
somehow to figure out what those
293
00:23:18 --> 00:23:22
positions are.
And you can do this in a couple of
294
00:23:22 --> 00:23:26
ways.
You can use gel electrophoresis,
295
00:23:26 --> 00:23:31
which was discussed with you
previously, where you separate the
296
00:23:31 --> 00:23:36
DNA on the basis of size where the
DNA migrates in a gel in an electric
297
00:23:36 --> 00:23:40
field and long fragments stay near
the top of the gel and short
298
00:23:40 --> 00:23:45
fragments go to the bottom of the
gel because they migrate quickly.
299
00:23:45 --> 00:23:50
And what you can do on a gel, and
you've somehow labeled,
300
00:23:50 --> 00:23:55
don't worry about this right now,
but somehow you're able to detect
301
00:23:55 --> 00:24:00
each of the fragments that has come
out of your mix.
302
00:24:00 --> 00:24:04
OK? So remember you're doing the
sequencing reaction on millions and
303
00:24:04 --> 00:24:08
millions or billions of molecules.
And so you've got this kind of
304
00:24:08 --> 00:24:12
stochastic mix of molecules of
different lengths.
305
00:24:12 --> 00:24:16
And you want to separate this mix
of molecules of different lengths.
306
00:24:16 --> 00:24:20
OK. So what you can end up with,
once you've separated all these
307
00:24:20 --> 00:24:24
different molecules,
is in your dideoxy A reaction mix a
308
00:24:24 --> 00:24:28
series of one,
two, three, four,
309
00:24:28 --> 00:24:33
five different sized fragments.
In your ddG mix,
310
00:24:33 --> 00:24:37
you got out of that also a series of
five different sized fragments.
311
00:24:37 --> 00:24:42
And notice that they're different
in size from the ones in the ddA
312
00:24:42 --> 00:24:47
lane, the ones in the ddC lane and
the ones in the ddT lane.
313
00:24:47 --> 00:24:51
And the reason they're different in
size is because their size indicates
314
00:24:51 --> 00:24:56
the position of where a particular
nucleotide is in the DNA fragment or
315
00:24:56 --> 00:25:01
particular bases in
the DNA fragment.
316
00:25:01 --> 00:25:06
And then the trick is you could look
at this gel and you could read off
317
00:25:06 --> 00:25:11
the sequence. So the shortest
fragments that you're going to get
318
00:25:11 --> 00:25:16
are the ones that are nearest the
beginning of that molecule you made,
319
00:25:16 --> 00:25:21
nearest the 5 prime end. So the
bottom one is G,
320
00:25:21 --> 00:25:26
here's the band in the ddG lane.
Then up above it there is this band
321
00:25:26 --> 00:25:32
indicating a fragment
in the ddA lane.
322
00:25:32 --> 00:25:38
Above it there's one in the G lane
again. Above it there's one in the
323
00:25:38 --> 00:25:44
T lane. So the sequence goes
G-A-G-T, and then you can keep
324
00:25:44 --> 00:25:50
reading A-A-C-G-G-T-A-T-G-C-A.
OK? Literally like that on a gel.
325
00:25:50 --> 00:25:56
OK? So you can do that on a gel.
It's really fantastic.
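Reading the gel from the bottom up amounts to sorting all the bands by fragment length and noting which lane each one sits in. A minimal sketch, assuming the band lengths below (they are worked out from the sequence just read, not measured from the slide) and a made-up function name:
```python
def read_gel(lanes):
    """lanes maps a base to the fragment lengths seen in its ddNTP lane."""
    bands = sorted(
        (length, base) for base, lengths in lanes.items() for length in lengths
    )
    # Shortest fragment first: that is the bottom of the gel, the 5' end.
    return "".join(base for _, base in bands)
lanes = {
    "A": [2, 5, 6, 11, 15],
    "C": [7, 14],
    "G": [1, 3, 8, 9, 13],
    "T": [4, 10, 12],
}
print(read_gel(lanes))  # GAGTAACGGTATGCA
```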
326
00:25:56 --> 00:26:02
And this is what old sequencing
gels look like.
327
00:26:02 --> 00:26:05
And, actually,
I used to run them.
328
00:26:05 --> 00:26:09
I used to spend hours and hours
running these gels.
329
00:26:09 --> 00:26:13
They're very, very thin.
They're about a millimeter thick
330
00:26:13 --> 00:26:16
acrylamide so that you can resolve
the fragments that are one
331
00:26:16 --> 00:26:20
nucleotide different in size.
Think about that. OK? Each of
332
00:26:20 --> 00:26:24
these fragments,
indicated by a band,
333
00:26:24 --> 00:26:28
is one nucleotide different in size.
Otherwise, you couldn't get the one
334
00:26:28 --> 00:26:32
nucleotide resolution.
So you do that by running very,
335
00:26:32 --> 00:26:37
very thin gels so that you can
resolve the fragments well,
336
00:26:37 --> 00:26:42
and then you read off the bottom.
OK? I've thrown out all my old
337
00:26:42 --> 00:26:46
sequencing gels.
And the reason that I have is that
338
00:26:46 --> 00:26:51
there is new technology where you
don't use this kind of display
339
00:26:51 --> 00:26:56
anymore. This is a display where
your fragments were labeled with
340
00:26:56 --> 00:27:01
radioactivity and you exposed them
to x-ray film and you read the
341
00:27:01 --> 00:27:06
sequence after exposure.
Nowadays this is done by machine.
342
00:27:06 --> 00:27:11
And the dideoxy nucleotides are
labeled fluorescently.
343
00:27:11 --> 00:27:15
OK? So they're not labeled with
radioactivity.
344
00:27:15 --> 00:27:20
They're literally labeled with
labels that fluoresce with different
345
00:27:20 --> 00:27:25
colors when you put UV light on them.
And you do your dideoxy reaction
346
00:27:25 --> 00:27:30
and you run a gel. Again,
it's a gel.
347
00:27:30 --> 00:27:35
It's actually a very thin tube of a
gel mostly, but you run your gel.
348
00:27:35 --> 00:27:41
And, again, it's the same idea.
You resolve fragments at single base
349
00:27:41 --> 00:27:46
resolution, single nucleotide
resolution, and they keep,
350
00:27:46 --> 00:27:52
the gel keeps running and running.
And single fragments actually run
351
00:27:52 --> 00:27:57
off the bottom of the gel.
And as they're passing down the gel
352
00:27:57 --> 00:28:03
they are detected by a laser.
A laser excites the fluorochrome.
353
00:28:03 --> 00:28:06
And the detector,
there is a detector which will
354
00:28:06 --> 00:28:10
detect whether or not it's yellow,
orange, blue or green. OK? And
355
00:28:10 --> 00:28:14
that will tell you which base is
being, has been incorporated at that
356
00:28:14 --> 00:28:18
position. So you get things that
come out. It's kind of small but
357
00:28:18 --> 00:28:22
you can go back and look,
where instead of getting a gel with
358
00:28:22 --> 00:28:26
those bands that I showed you,
you get these peaks and valleys that
359
00:28:26 --> 00:28:30
are different colors.
And that's what current DNA
360
00:28:30 --> 00:28:35
sequencing readout looks like.
And, in fact, there are machines.
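The fluorescent readout can be sketched the same way: the shortest fragments reach the detector first, so ordering the detected colors by arrival time gives the sequence. The pairing of colors to bases below is purely hypothetical (the lecture names the four colors but not which base each one marks), and the function name and times are invented for illustration.
```python
# Hypothetical dye assignment; the lecture names the colors, not the pairing.
DYE_TO_BASE = {"green": "A", "blue": "C", "yellow": "G", "orange": "T"}
def call_bases(detections):
    """detections: (arrival_time, color) pairs from the detector, any order."""
    return "".join(DYE_TO_BASE[color] for _, color in sorted(detections))
detections = [(3.0, "yellow"), (1.0, "yellow"), (2.0, "green"), (4.0, "orange")]
print(call_bases(detections))  # GAGT
```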
361
00:28:35 --> 00:28:40
What did I do? Lots of primers.
Well, it depends.
362
00:28:40 --> 00:28:45
Many copies of the same primer,
right. Yes. Dr. Gardel is pointing
363
00:28:45 --> 00:28:50
out that there are many copies of
the same primer in a reaction mix.
364
00:28:50 --> 00:28:55
Certainly there are. There are
billions of molecules in the
365
00:28:55 --> 00:29:00
reaction mix, and so there are
billions of primers.
366
00:29:00 --> 00:29:03
OK, so you have to have a primer for
each molecule.
367
00:29:03 --> 00:29:06
OK. And each band,
you should realize, is not a single
368
00:29:06 --> 00:29:09
molecule. It's a composite of many,
many molecules, many thousands of
369
00:29:09 --> 00:29:12
molecules that have all chain
terminated at the same position.
370
00:29:12 --> 00:29:15
So what I want to point out here is
that this is what today's readout
371
00:29:15 --> 00:29:19
looks like. And,
in fact, nowadays you just get a
372
00:29:19 --> 00:29:22
printout from the company or from
the machine that tells
373
00:29:22 --> 00:29:26
you a DNA sequence.
And it's this improvement in
374
00:29:26 --> 00:29:31
technology, but that basically uses
this chain termination method,
375
00:29:31 --> 00:29:36
that has allowed one to sequence,
rapidly enough to sequence the human
376
00:29:36 --> 00:29:41
genome and to sequence multiple
human genomes and multiple animals.
377
00:29:41 --> 00:29:46
OK. So let's see. Actually, I
have a movie. I guess we can take
378
00:29:46 --> 00:29:51
the time to watch this movie.
Let's see if it will work. All
379
00:29:51 --> 00:29:56
right. So primer template.
Four reactions, each with lots of
380
00:29:56 --> 00:30:01
molecules, each with their primer.
DNA polymerase,
381
00:30:01 --> 00:30:05
dNTPs, dATP, dGTP,
dCTP, dTTP, dCTP, excuse me.
382
00:30:05 --> 00:30:09
OK. They're your four reactions.
OK. I think this is a less dorky movie
383
00:30:09 --> 00:30:13
than some. OK.
So here we go. Here's your primer
384
00:30:13 --> 00:30:17
and your template,
and here's polymerization.
385
00:30:17 --> 00:30:21
And, ah, there we go, chain
termination, dideoxy nucleotide
386
00:30:21 --> 00:30:25
incorporation,
and you cannot get elongation.
387
00:30:25 --> 00:30:30
The poor G is thwarted in its
desire to elongate. OK?
388
00:30:30 --> 00:30:34
So you land up with this mix,
just like I showed you, and you land
389
00:30:34 --> 00:30:38
up with a set of four reactions,
each with molecules of different
390
00:30:38 --> 00:30:42
lengths in them.
And here's your gel,
391
00:30:42 --> 00:30:47
and you load them on your gel,
and they migrate through your
392
00:30:47 --> 00:30:51
electric field.
And there you have your things,
393
00:30:51 --> 00:30:55
you have your fragments. This is a
piece of x-ray film you put on top.
394
00:30:55 --> 00:31:00
There are your little bands, your
radioactive bands, and here we go.
395
00:31:00 --> 00:31:04
GT, you can read it.
OK. Enough. Enough.
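The gel-reading step shown in the movie can be sketched roughly as follows, assuming each of the four lanes is recorded as a list of fragment lengths (bases added to the primer). The `read_gel` helper and the lane data are hypothetical. The shortest fragments run farthest, so reading the gel from the bottom up is the same as reading lengths in ascending order.

```python
def read_gel(lanes):
    """Reconstruct the synthesized sequence from four chain-termination lanes.

    lanes maps a base to the fragment lengths seen in that base's lane.
    A fragment of length n ending in lane X means position n is X.
    """
    base_at_length = {}
    for base, lengths in lanes.items():
        for length in lengths:
            base_at_length[length] = base
    # Ascending length corresponds to reading the gel from bottom to top.
    return "".join(base_at_length[n] for n in sorted(base_at_length))


# Made-up lane contents for a six-base read.
lanes = {"G": [1, 5], "T": [2, 6], "A": [3], "C": [4]}
sequence = read_gel(lanes)
print(sequence)  # GTACGT
```

The sequence read this way is that of the newly synthesized strand, which is complementary to the template.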
396
00:31:04 --> 00:31:09
OK. You can go and look at this
yourself. This is an old gel
397
00:31:09 --> 00:31:13
apparatus that one used to do DNA
sequencing on.
398
00:31:13 --> 00:31:18
This was the first generation of
machine that you could do the
399
00:31:18 --> 00:31:22
fluorescent sequencing on.
This is a room full of sequencing
400
00:31:22 --> 00:31:27
machines of the kind that was used
to sequence the human genome.
401
00:31:27 --> 00:31:30
In fact, many rooms of machines
going all day and all night
402
00:31:30 --> 00:31:34
sequencing and sequencing and
sequencing. We have a lot of
403
00:31:34 --> 00:31:37
nucleotides. And it takes a long
time to sequence.
404
00:31:37 --> 00:31:41
Although, in retrospect it's not
such a long time.
405
00:31:41 --> 00:31:45
And now all the sequencing machines
that sequence the human genome are
406
00:31:45 --> 00:31:48
sitting around looking for other
work because they all exist.
407
00:31:48 --> 00:31:52
And so that is why we are
sequencing things like dolphins and
408
00:31:52 --> 00:31:56
dogs and multiple strains of dogs,
multiple breeds, excuse me, of dogs
409
00:31:56 --> 00:32:00
because we have all these sequencing
machines sitting around.
410
00:32:00 --> 00:32:04
OK. Honestly,
I think that's true,
411
00:32:04 --> 00:32:08
not that it's not useful.
All right. So I'm going to move on
412
00:32:08 --> 00:32:13
here. This is Professor Jacks' joke
that I decided to use also.
413
00:32:13 --> 00:32:17
OK. This is something about DNA
sequencing and the implications of
414
00:32:17 --> 00:32:21
being able to use DNA sequencing for
genotyping. So I'm going to use
415
00:32:21 --> 00:32:26
that. You can go and read that on
your thing. I'm going to move on
416
00:32:26 --> 00:32:30
right to talking about familial
hypercholesterolemia and the notion
417
00:32:30 --> 00:32:35
of a disease allele.
So here's part of the normal FH gene,
418
00:32:35 --> 00:32:40
the LDL receptor gene,
and here it is. And there is a T
419
00:32:40 --> 00:32:45
here in red. And here is the mutant
gene sequence and there is an A.
420
00:32:45 --> 00:32:50
So if you're wild type you have a T
at this position that's arrowed and
421
00:32:50 --> 00:32:55
if you're a mutant you have an A.
And if you do your conceptual
422
00:32:55 --> 00:33:00
protein translation here you get
your amino acid, part of
423
00:33:00 --> 00:33:05
the amino acid chain.
Obviously it's not at the beginning.
424
00:33:05 --> 00:33:09
And obviously this is DNA and this
is protein, so we've removed the RNA
425
00:33:09 --> 00:33:14
here, the RNA step.
And you can see here is the amino
426
00:33:14 --> 00:33:19
acid of your wild type,
the sequence of your wild type gene.
427
00:33:19 --> 00:33:23
And in your LDL receptor mutant
there is a stop codon at this
428
00:33:23 --> 00:33:28
position that terminates the LDL
receptor. And so the receptor gene
429
00:33:28 --> 00:33:33
is mutant and does not function
as it should.
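A small sketch of the conceptual translation described here, showing how a single T to A change can turn a codon into a stop codon. The 12-base sequence is invented for illustration and is not the real LDL receptor sequence; the `translate` helper is likewise an assumption. The codon table is the standard genetic code.

```python
# Standard genetic code, codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna):
    """Translate DNA in frame from its first base, stopping at a stop codon."""
    protein = []
    for start in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[start:start + 3]]
        if amino_acid == "*":  # stop codon terminates the chain
            break
        protein.append(amino_acid)
    return "".join(protein)


# Invented example: TAT (tyrosine) becomes TAA (stop) after a T to A change.
wild_type = "GACTGTTATGGC"
mutant = wild_type[:8] + "A" + wild_type[9:]

print(translate(wild_type))  # DCYG
print(translate(mutant))     # DC
```

The mutant protein is cut short at the new stop codon, which is the kind of truncation described for the LDL receptor mutant.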
430
00:33:33 --> 00:33:38
OK. All right.
So let me move onto the next thing
431
00:33:38 --> 00:33:43
I want to talk about,
which is this question of
432
00:33:43 --> 00:33:48
polymorphisms. What
is a polymorphism?
433
00:33:48 --> 00:34:03
Anyone. All right.
434
00:34:03 --> 00:34:07
I'll tell you what a polymorphism
is. A polymorphism is defined as
435
00:34:07 --> 00:34:12
some kind of variation
in DNA sequence.
436
00:34:12 --> 00:34:23
And it's defined as a variation in
437
00:34:23 --> 00:34:27
DNA sequence at a particular
position.
438
00:34:27 --> 00:34:40
So our DNA, all of us have very
439
00:34:40 --> 00:34:45
similar DNA. If we were to sequence
me and we were to sequence you and
440
00:34:45 --> 00:34:49
we were to sequence you,
we would find that our DNA was
441
00:34:49 --> 00:34:54
greater than 99% identical.
If we lined up our three times ten
442
00:34:54 --> 00:34:59
to the ninth base pairs in a very
long line, we would find
443
00:34:59 --> 00:35:04
it was very similar.
There was about 1% difference in
444
00:35:04 --> 00:35:10
sequence between each of us.
And most of that, some of that
445
00:35:10 --> 00:35:15
corresponds to disease gene alleles.
We all are supposed to carry about
446
00:35:15 --> 00:35:20
a thousand bad genes,
or a thousand genes that if
447
00:35:20 --> 00:35:26
homozygous would give us something
bad, and sometimes do.
448
00:35:26 --> 00:35:31
And some of those correspond to
changes in differences in DNA
449
00:35:31 --> 00:35:37
sequence that are not
directly in genes.
450
00:35:37 --> 00:35:41
All of these differences between
different individuals are called
451
00:35:41 --> 00:35:46
polymorphisms,
DNA sequence variation.
452
00:35:46 --> 00:35:50
And you can use these to help
figure out whether or not someone
453
00:35:50 --> 00:35:55
has a particular disease allele,
and also you can use it to figure
454
00:35:55 --> 00:35:59
out whether the DNA from a sample
comes from me or from you
455
00:35:59 --> 00:36:04
or from Dr. Gardel.
OK? And I'll talk about this,
456
00:36:04 --> 00:36:08
using polymorphisms to map genotype.
I'm going to talk about a
457
00:36:08 --> 00:36:12
particular kind of polymorphism,
and these are called SNPs which is
458
00:36:12 --> 00:36:17
pronounced “snip”.
This stands for single nucleotide
459
00:36:17 --> 00:36:21
polymorphisms.
So I've said again that human
460
00:36:21 --> 00:36:25
genomes are 99% identical,
but there are throughout the genome
461
00:36:25 --> 00:36:30
changes, differences
between regions.
462
00:36:30 --> 00:36:34
Single nucleotide polymorphisms are
variations in one region.
463
00:36:34 --> 00:36:38
Here's a sample sequence I made up.
Here's a G in one individual and an
464
00:36:38 --> 00:36:42
A in another individual.
And if you take the population,
465
00:36:42 --> 00:36:47
you find very often that there just
is a choice of two,
466
00:36:47 --> 00:36:51
sometimes more, but often just a
choice of two nucleotides in one
467
00:36:51 --> 00:36:55
position. Most of the genomes are
identical, but you find these little
468
00:36:55 --> 00:36:59
regions where in many individuals of
a population there are
469
00:36:59 --> 00:37:04
these variations.
In fact, these variations have to be
470
00:37:04 --> 00:37:08
present in more than 1% of the
population for this thing to be
471
00:37:08 --> 00:37:12
called a SNP. This is a definition
that humans have given but it's a
472
00:37:12 --> 00:37:16
useful definition as a genetic tool.
So if there is a polymorphism
473
00:37:16 --> 00:37:20
present in about 1% of the
population, whereby I might have an
474
00:37:20 --> 00:37:24
A here, excuse me,
and Dr. Gardel has a G at that
475
00:37:24 --> 00:37:28
position, that would be a SNP,
and we would be polymorphic for that
476
00:37:28 --> 00:37:32
SNP.
In fact, my two chromosomes,
477
00:37:32 --> 00:37:38
OK, that are homologous chromosomes
might on one copy carry an A and on
478
00:37:38 --> 00:37:43
the other copy carry a G.
Now, these different bases are
479
00:37:43 --> 00:37:49
present at different frequencies.
So, for example, it might be very
480
00:37:49 --> 00:37:54
common to have a G at this position
in the sequence and it might be very
481
00:37:54 --> 00:38:00
rare to have an A at that
position. All right?
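To make the idea of allele frequencies concrete, here is a rough sketch that counts bases at one position across a sample of people and applies the 1% cutoff mentioned above. It assumes the cutoff is applied to the frequency of the rarer allele; the function names and the sample of 100 genotypes are hypothetical.

```python
from collections import Counter


def allele_frequencies(genotypes):
    """Return the frequency of each base across all chromosomes sampled.

    Each genotype is a two-letter string, one letter per homologous chromosome.
    """
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    total = sum(counts.values())
    return {allele: n / total for allele, n in counts.items()}


def is_snp(genotypes, threshold=0.01):
    """True if more than one base occurs and the rarest exceeds the threshold."""
    freqs = allele_frequencies(genotypes)
    return len(freqs) > 1 and min(freqs.values()) > threshold


# Made-up sample: G is common, A is rare.
sample = ["GG"] * 90 + ["GA"] * 9 + ["AA"] * 1
freqs = allele_frequencies(sample)
print(freqs)           # G is about 0.945, A is about 0.055
print(is_snp(sample))  # True
```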
482
00:38:00 --> 00:38:04
And that's useful because you can
use the frequency of these different
483
00:38:04 --> 00:38:09
nucleotides, these different bases
to help you use the SNP to genotype.
484
00:38:09 --> 00:38:13
And I want to point out that
usually SNPs occur outside coding
485
00:38:13 --> 00:38:18
regions because 95%,
actually more than that,
486
00:38:18 --> 00:38:22
99% of the genome is not coding per
se. 95% is not genes,
487
00:38:22 --> 00:38:27
but then if you remove all the
introns and promoters and so on,
488
00:38:27 --> 00:38:32
99% does not code for any protein.
489
00:38:32 --> 00:38:36
OK. So usually these SNPs are
present outside coding regions.
490
00:38:36 --> 00:38:40
So here's to explore this a bit
more. You can find lots of these
491
00:38:40 --> 00:38:44
SNPs. There are about three million
SNPs in the human genome,
492
00:38:44 --> 00:38:49
and a very large percentage of those
SNPs has been identified by DNA
493
00:38:49 --> 00:38:53
sequencing. So you can get the idea.
You have to sequence DNA from lots
494
00:38:53 --> 00:38:57
and lots of individuals to identify
these SNPs, but people
495
00:38:57 --> 00:39:02
have done it.
And we know now more than a million
496
00:39:02 --> 00:39:06
SNPs in the human genome that are
located all over different
497
00:39:06 --> 00:39:10
chromosomes, and we know where
they're located on different
498
00:39:10 --> 00:39:14
chromosomes. And so you can use
these SNPs to make kind of a map,
499
00:39:14 --> 00:39:19
I'll tell you in a moment. So here
are some possible genotypes.
500
00:39:19 --> 00:39:23
I've given you a choice of two for
each of these.
501
00:39:23 --> 00:39:27
OK? So, for example,
for this red SNP here you can be AA,
502
00:39:27 --> 00:39:32
AC or CC on the two homologous
chromosomes.
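A short sketch of why a two-base SNP gives three possible genotypes: the two homologous chromosomes are an unordered pair, so AC and CA count as the same genotype. The `possible_genotypes` helper is a hypothetical name.

```python
from itertools import combinations_with_replacement


def possible_genotypes(alleles):
    """List the unordered pairs of bases that two homologous chromosomes can carry."""
    return ["".join(pair) for pair in combinations_with_replacement(sorted(alleles), 2)]


red_snp = possible_genotypes("AC")
print(red_snp)  # ['AA', 'AC', 'CC']
```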
503
00:39:32 --> 00:39:36
All right. So let's keep going with
this thread. So because you have
504
00:39:36 --> 00:39:41
these SNPs all over your genome and
you know where they are,
505
00:39:41 --> 00:39:46
you can use them to make a map of
your entire genome.
506
00:39:46 --> 00:39:51
That doesn't depend on the genes.
It just depends on the sequence.
507
00:39:51 --> 00:39:56
And knowing these SNPs is a lot
easier to work with than having to
508
00:39:56 --> 00:40:01
sequence the entire genome of
somebody every time you
509
00:40:01 --> 00:40:06
want some information.
So you can use these SNPs to
510
00:40:06 --> 00:40:11
identify each person.
So I have a SNP map of all these
511
00:40:11 --> 00:40:16
hundreds of thousands of SNPs,
or up to a million. The usual maps
512
00:40:16 --> 00:40:20
presently used are about 300,000
SNPs per genome. I have a map of
513
00:40:20 --> 00:40:25
300,000 SNPs where there are
different, actually,
514
00:40:25 --> 00:40:30
I don't, but I could,
where there are different alleles at
515
00:40:30 --> 00:40:35
different frequencies,
different bases present at different
516
00:40:35 --> 00:40:40
frequencies at specific positions.
And we could pick any one of you and
517
00:40:40 --> 00:40:44
make a SNP map for you.
And it would look really different
518
00:40:44 --> 00:40:48
from mine, not because the SNPs
themselves are different,
519
00:40:48 --> 00:40:53
they'd be the same SNPs, but the
actual bases and the combination of
520
00:40:53 --> 00:40:57
bases between all these different
SNPs would be different between
521
00:40:57 --> 00:41:01
different individuals.
And this SNP-type map is the basis
522
00:41:01 --> 00:41:05
for DNA fingerprinting that is used
in forensics and to figure out
523
00:41:05 --> 00:41:09
disease alleles.
I'll talk more about this in a
524
00:41:09 --> 00:41:13
second. I want to point out that
there are other kinds of
525
00:41:13 --> 00:41:16
polymorphisms that are used in
genotyping, restriction fragment
526
00:41:16 --> 00:41:20
length polymorphisms and things
called simple repeat polymorphisms.
527
00:41:20 --> 00:41:24
And you can look in your book for
these restriction fragment length
528
00:41:24 --> 00:41:28
polymorphisms, but let's
talk more about SNPs.
529
00:41:28 --> 00:41:32
So SNP genotyping,
here's a whole list,
530
00:41:32 --> 00:41:36
but the ones I'm going to focus on
are disease gene mapping and
531
00:41:36 --> 00:41:41
forensics. Also,
you use SNP genotyping for paternity
532
00:41:41 --> 00:41:45
suits. OK? So if someone comes and,
you know, if someone says it's my
533
00:41:45 --> 00:41:50
kid and the other one says it's my
kid, you can figure out very easily
534
00:41:50 --> 00:41:54
whose it is by looking at these
various SNPs and figuring out what
535
00:41:54 --> 00:41:59
pattern of SNPs is present
in the offspring. OK.
536
00:41:59 --> 00:42:02
So let me actually consider,
let me not deal with genotyping for
537
00:42:02 --> 00:42:06
disease alleles at this point.
Let me talk about forensics a bit
538
00:42:06 --> 00:42:09
because it's kind of interesting.
So how do you do this? Let's look
539
00:42:09 --> 00:42:13
through this slide.
You have it as a handout.
540
00:42:13 --> 00:42:17
Here are SNPs. And I've just given
you two chromosomes each with two
541
00:42:17 --> 00:42:20
SNPs. OK? And different people
will have different bases at these
542
00:42:20 --> 00:42:24
particular SNPs,
or they'll have different
543
00:42:24 --> 00:42:28
combinations of these bases.
So here's the spot of blood at the
544
00:42:28 --> 00:42:32
crime scene.
OK? Our red blood cells do not have
545
00:42:32 --> 00:42:37
nuclei so you cannot get DNA from
those, but there are enough white
546
00:42:37 --> 00:42:42
blood cells that do have nuclei so
you can. And,
547
00:42:42 --> 00:42:47
actually, you know from PCR now that
you need very little to amplify
548
00:42:47 --> 00:42:52
something up by PCR.
One cell is sufficient,
549
00:42:52 --> 00:42:58
right? It's pushing the technology,
but you can really use one cell.
550
00:42:58 --> 00:43:03
So there are plenty of cells in a
spot of blood at a crime scene to
551
00:43:03 --> 00:43:08
isolate the DNA and to PCR amplify
the regions surrounding the SNP.
552
00:43:08 --> 00:43:13
So you're not just dealing with
these two nucleotides or the choice
553
00:43:13 --> 00:43:19
of these two nucleotides at the SNP.
You've got a little piece of DNA
554
00:43:19 --> 00:43:24
that's usually maybe 20 or so bases
that includes this choice of single
555
00:43:24 --> 00:43:29
nucleotide polymorphism.
So you amplify the SNP region,
556
00:43:29 --> 00:43:34
OK, a region that's constant, that
includes the nucleotide polymorphism,
557
00:43:34 --> 00:43:39
and you determine the sequence at
the different single nucleotide
558
00:43:39 --> 00:43:44
polymorphism regions.
So you might get someone who,
559
00:43:44 --> 00:43:49
at the red position you can be A or C,
at the green you can be G.
560
00:43:49 --> 00:43:54
OK, let's have an example here.
You can get genotypes where at red
561
00:43:54 --> 00:43:59
you're A or C,
green you're G or G,
562
00:43:59 --> 00:44:04
purple GT, and yellow you can be A
or C.
563
00:44:04 --> 00:44:07
And here the example is C and C.
So here are the four suspects,
564
00:44:07 --> 00:44:11
numbers one to four.
OK. And here are their genotypes.
565
00:44:11 --> 00:44:15
OK. And here is the spot of blood
at the crime scene that actually has
566
00:44:15 --> 00:44:18
this genotype.
OK. So let me go back here.
567
00:44:18 --> 00:44:22
This is the genotype in the blood
at the crime scene.
568
00:44:22 --> 00:44:26
OK. So the red sequence on one
chromosome is an A,
569
00:44:26 --> 00:44:30
on the other is a C,
so you have AC.
570
00:44:30 --> 00:44:34
On the other, the green sequence you
have GG, purple you have GT,
571
00:44:34 --> 00:44:38
and yellow CC. So you're looking to
see whether or not any of the
572
00:44:38 --> 00:44:43
suspect genotypes match up with the spot
of blood, right?
573
00:44:43 --> 00:44:47
So we're assuming that a spot of
blood, you know,
574
00:44:47 --> 00:44:52
comes from one of the suspects that
was attacked by the person who was
575
00:44:52 --> 00:44:56
the victim. OK.
So you have a victim with a scratch.
576
00:44:56 --> 00:45:01
Someone has a spot of blood.
And you see whether or not,
577
00:45:01 --> 00:45:05
or you can use semen samples,
you can see whether or not the DNA
578
00:45:05 --> 00:45:09
in the human tissue that is believed
to come from the attacker is
579
00:45:09 --> 00:45:13
matching any of the suspects'
genotypes. So there are a lot of
580
00:45:13 --> 00:45:17
assumptions there,
right? You have to have tissue at
581
00:45:17 --> 00:45:21
the crime scene that you believe to
come from the attacker.
582
00:45:21 --> 00:45:25
And then, once you have that,
you can determine its genotype and
583
00:45:25 --> 00:45:30
compare it to the genotypes
of the suspects.
584
00:45:30 --> 00:45:35
And you find, for example,
here that, let's see, yeah,
585
00:45:35 --> 00:45:40
so I believe the suspect number
three has the same genotype as the
586
00:45:40 --> 00:45:45
DNA that was in the spot of blood at
the crime scene.
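The comparison on the slide can be sketched as below. The crime-scene genotype (red AC, green GG, purple GT, yellow CC) is the one read out in the lecture; the four suspect genotypes are invented here, chosen so that suspect three matches, since the transcript does not list them. The helper names are hypothetical.

```python
def normalize(genotype):
    """Sort the two bases at each SNP so that AC and CA compare as equal."""
    return {snp: "".join(sorted(alleles)) for snp, alleles in genotype.items()}


def matching_suspects(crime_scene, suspects):
    """Return the suspects whose genotype agrees with the sample at every SNP."""
    target = normalize(crime_scene)
    return [name for name, genotype in suspects.items() if normalize(genotype) == target]


crime_scene = {"red": "AC", "green": "GG", "purple": "GT", "yellow": "CC"}

# Invented suspect genotypes for illustration only.
suspects = {
    1: {"red": "AA", "green": "GG", "purple": "GT", "yellow": "AC"},
    2: {"red": "AC", "green": "GT", "purple": "GG", "yellow": "CC"},
    3: {"red": "CA", "green": "GG", "purple": "TG", "yellow": "CC"},
    4: {"red": "CC", "green": "GG", "purple": "TT", "yellow": "AA"},
}

matches = matching_suspects(crime_scene, suspects)
print(matches)  # [3]
```

A mismatch at even one SNP excludes a suspect, which is why this kind of comparison is often more decisive for ruling someone out than for proving identity.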
587
00:45:45 --> 00:45:50
And that would be some evidence
that this suspect number three was
588
00:45:50 --> 00:45:55
the person who did it.
Now, in actual fact, you do this
589
00:45:55 --> 00:46:00
not just for four SNPs,
you do it for thousands of SNPs.
590
00:46:00 --> 00:46:04
You don't usually do this for 300,000
SNPs because that's expensive and
591
00:46:04 --> 00:46:08
it's a lot of work.
And forensics doesn't put that much
592
00:46:08 --> 00:46:12
money into this.
However, the more SNPs you use for
593
00:46:12 --> 00:46:17
genotyping the more sure you are of
the suspect's identity.
594
00:46:17 --> 00:46:21
OK? Because it's really a matter
of frequency of whether or not
595
00:46:21 --> 00:46:25
you're going to get the same
combination of these different SNP
596
00:46:25 --> 00:46:30
bases in different potential
suspects.
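A rough sketch of the frequency argument: the chance that an unrelated person shares the whole profile is the product of the genotype frequencies at each SNP. This assumes the SNPs are inherited independently and that genotype frequencies follow from allele frequencies as p squared for a homozygote and 2pq for a heterozygote (Hardy-Weinberg proportions); neither assumption is stated in the lecture. The allele frequencies of 0.5 and the function names are invented.

```python
from math import prod


def genotype_frequency(genotype, allele_freq):
    """Expected frequency of a two-base genotype under Hardy-Weinberg proportions."""
    first, second = genotype
    if first == second:
        return allele_freq[first] ** 2
    return 2 * allele_freq[first] * allele_freq[second]


def random_match_probability(profile, freqs):
    """Multiply per-SNP genotype frequencies, assuming the SNPs are independent."""
    return prod(genotype_frequency(g, freqs[snp]) for snp, g in profile.items())


# Invented allele frequencies: every base at every SNP is at 50%.
freqs = {
    "red": {"A": 0.5, "C": 0.5},
    "green": {"G": 0.5, "T": 0.5},
    "purple": {"G": 0.5, "T": 0.5},
    "yellow": {"A": 0.5, "C": 0.5},
}
two_snps = {"red": "AC", "green": "GG"}
four_snps = {"red": "AC", "green": "GG", "purple": "GT", "yellow": "CC"}

print(random_match_probability(two_snps, freqs))   # 0.125
print(random_match_probability(four_snps, freqs))  # 0.015625
```

Each added SNP multiplies in another factor below one, so the chance of a coincidental match shrinks quickly as more SNPs are typed.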
597
00:46:30 --> 00:46:33
So the greater the spectrum of SNPs
you look at, the more sure you are
598
00:46:33 --> 00:46:37
of the suspect's identity.
Now, in some cases this has been
599
00:46:37 --> 00:46:41
very, very useful.
And there are a number of people on
600
00:46:41 --> 00:46:45
Death Row who have been exonerated
by going back to DNA recovered from
601
00:46:45 --> 00:46:49
the crime scene sometimes years ago,
doing SNP mapping and showing that
602
00:46:49 --> 00:46:53
they really couldn't have done it
because the genotypes did not match
603
00:46:53 --> 00:46:57
up. Usually these were rape cases
and the semen genotype just did not
604
00:46:57 --> 00:47:01
match up with the semen genotype of
the person on Death Row.
605
00:47:01 --> 00:47:05
So this is very valuable technology.
OK. It was used in the O.J.
606
00:47:05 --> 00:47:10
Simpson trial,
but not as well as it could have
607
00:47:10 --> 00:47:14
been which led to equivocation
there. OK. So time is fleeting.
608
00:47:14 --> 00:47:19
I'm going to mention a technology
to you in the last couple of minutes,
609
00:47:19 --> 00:47:24
and then we'll come back to it as we
go on through later parts of the
610
00:47:24 --> 00:47:29
course. So I've talked today about
DNA sequencing.
611
00:47:29 --> 00:47:33
I've talked about using
polymorphisms to genotype people
612
00:47:33 --> 00:47:38
either, well, for disease alleles I
focused on who-done-its.
613
00:47:38 --> 00:47:43
Something else that I want to throw
out at you at this point is the
614
00:47:43 --> 00:47:48
notion of transgenic technology.
And I'm going to tell you what
615
00:47:48 --> 00:47:52
transgenic organisms are as part of
completing the Recombinant DNA
616
00:47:52 --> 00:47:57
Module. And then we'll come back in
future modules and talk more about
617
00:47:57 --> 00:48:02
how you make these things.
But I want to have this as part of
618
00:48:02 --> 00:48:07
your compendium now.
A transgenic animal or transgenic
619
00:48:07 --> 00:48:12
organism is an organism where you
have manipulated its genome in some
620
00:48:12 --> 00:48:17
way, where you've either inserted
extra DNA into its genome or you've
621
00:48:17 --> 00:48:22
removed DNA from its genome or
you've done something to its genome
622
00:48:22 --> 00:48:27
such that it was not the organism
that you started off with.
623
00:48:27 --> 00:48:32
Genetically modified organisms.
The food that you eat that is
624
00:48:32 --> 00:48:36
genetically modified has had its
genome tampered with.
625
00:48:36 --> 00:48:40
This type of transgenic technology
is very, very useful,
626
00:48:40 --> 00:48:44
not only for creating genetically
modified foods,
627
00:48:44 --> 00:48:49
but it's very, very useful for
creating disease models of animals.
628
00:48:49 --> 00:48:53
And I'll tell you now that there is
a mouse model of human familial
629
00:48:53 --> 00:48:57
hypercholesterolemia that has been
created by making a specific
630
00:48:57 --> 00:49:02
mutation, that T to A mutation in
the mouse LDL receptor gene.
631
00:49:02 --> 00:49:06
Another thing that is extremely
useful about transgenic animals is
632
00:49:06 --> 00:49:10
that you can get them to make
specific proteins.
633
00:49:10 --> 00:49:14
So, for example, there are goats
that have had inserted into their
634
00:49:14 --> 00:49:18
genomes genes that encode for
particular medications,
635
00:49:18 --> 00:49:22
for particular drugs. And you can
get these drugs out of the milk of
636
00:49:22 --> 00:49:26
the goats usually or out of the
serum of the goats because they are
637
00:49:26 --> 00:49:30
constitutively producing them
because you've put various genes
638
00:49:30 --> 00:49:35
into their genome.
So I'm going to leave it there and
639
00:49:35 --> 00:49:38
we'll talk about how to make
transgenics in a future lecture.