Abstract:

Disclosed are methods and kits for genome-wide methylation of GpC sites
and for genome-wide chromatin structural determination. Specifically, the
methods and kits of the present invention make possible the simultaneous
determination of endogenous DNA methylation state and chromatin
architecture across the entire genome.

Claims:

1. A method for genome-wide methylation-sensitive chromatin structure
determination comprising: Providing eukaryotic cells with nuclei
comprised of chromatin, wherein the chromatin is comprised of nucleosomes
having DNA associated with histones and also optionally associated with
one or more tight-binding factors; Extracting the nuclei of the cells;
Methylating substantially all of the GpC sites of the chromatin not
associated with nucleosomes or tight-binding factors; Purifying the DNA;
Bisulfite converting the DNA; and Sequencing the DNA; wherein the
sequencing provides the endogenous methylation state of the DNA and the
GpC sites associated with the nucleosomes or tight-binding factors.

2. The method of claim 1, wherein the step of extracting the nuclei
comprises a step of lysing the cells to lyse the cytoplasmic membrane of
the cell.

3. The method of claim 1, wherein the step of methylating substantially
all of the GpC sites comprises contacting the cells with a GpC
methylating reagent comprising a methyl transfer agent, lysis prevention
agent and an effective amount of a GpC methyltransferase.

4. The method of claim 3, where the GpC methylating reagent further
comprises a buffer.

5. The method of claim 3, wherein the methyl transfer agent is SAM, the
lysing prevention agent is sucrose, and the GpC methyltransferase is M.
CviPI.

6. A kit for genome-wide methylation sensitive chromatin structure
determination comprising: a cytoplasmic membrane lysing reagent; a GpC
methylating reagent; a DNA purifying reagent; and instructions for using
the reagents to prepare chromatin DNA for sequencing, wherein, when used
as instructed, the endogenous methylation state of the DNA is preserved.

7. The method of claim 6, wherein, when used as instructed, the GpC sites
associated with the nucleosomes or tight-binding factors are preserved.

8. The kit according to claim 6 further comprising: a bisulfite
conversion reagent.

10. The method of claim 9, where the GpC methylating reagent further
comprises a buffer.

11. The method of claim 9, wherein the methyl transfer agent is SAM, the
lysing prevention agent is sucrose, and the GpC methyltransferase is M.
CviPI.

12. A method of genome-wide methylation of substantially all DNA GpC
sites not associated with nucleosomes and, optionally, other
tight-binding factors comprising: Providing eukaryotic cells with nuclei
comprised of chromatin, wherein the chromatin is comprised of nucleosomes
having DNA associated with histones and also optionally associated with
tight-binding factors; extracting the nuclei of the cells; contacting the
nuclei with a GpC methylating reagent comprised of a methyl transfer
agent, a lysis prevention agent and an effective amount of GpC
methyltransferase; and incubating the combination of the nuclei and GpC
methylating reagent such that substantially all of the GpC sites of the
nuclei's chromatin not associated with nucleosomes and, optionally,
tight-binding factors are methylated, wherein one or more of endogenous
DNA CpG methylation status, a native chromatin structure and the protein
binding is preserved.

14. The method of claim 12, wherein the lysis prevention agent is
sucrose.

15. The method of claim 12, wherein the step of extracting the nuclei
comprises a step of lysing the cells to lyse the cytoplasmic membrane of
the cell.

16. A kit for genome-wide methylation of substantially all GpC not
associated with nucleosomes or other tight-binding factors comprising: a
cytoplasmic membrane lysing reagent; a GpC methylating reagent comprised
of a methyl transfer agent, lysis prevention agent and an effective
amount of M. CviPI and instructions for using the reagents to methylate
substantially all of the GpC sites of the nuclei's chromatin not
associated with nucleosomes or tight-binding factors, wherein one or more
of endogenous DNA CpG methylation status, native chromatin structure and
protein binding is preserved.

17. The method of claim 16, wherein the endogenous DNA CpG methylation
status and the native chromatin structure and protein binding is
preserved.

18. The method of claim 16, wherein the endogenous DNA CpG methylation
status, the native chromatin structure and the protein binding is
preserved.

19. The kit according to claim 6 further comprising: a bisulfite
conversion reagent.

Description:

CROSS-REFERENCE TO RELATED APPLICATIONS

[0001] This application claims the benefit of U.S. Provisional Application
No. 61/358,767, filed Jun. 25, 2010, the entire contents of which are
incorporated herein by reference.

FIELD OF THE INVENTION

[0003] The present invention relates in general to methods and kits for
genome-wide methylation of GpC sites and for genome-wide chromatin
structural determination. Specifically, the methods and kits of the
present invention make possible the simultaneous determination of
endogenous DNA methylation state and chromatin architecture across the
entire genome.

BACKGROUND OF THE INVENTION

[0004] Gene expression is regulated by genetic and epigenetic mechanisms.
There are a variety of epigenetic mechanisms including DNA methylation
(at CpG dinucleotides) and nucleosome positioning, which work together to
generate chromatin states. Specific chromatin states facilitate, inhibit
or allow for the potential of gene activation. Genome-wide studies of
chromatin states have focused on either DNA methylation or nucleosome
positioning, and as a result a comprehensive, integrated genome-wide view
of DNA methylation and nucleosome positioning has not been achieved.

[0005] Methylation dependent single molecule footprinting techniques
(M-SPA) rely on CpG methylation. Since CpG methylation occurs
endogenously, analysis is limited to regions that are unmethylated. In
addition, CpG sites are predisposed to mutation and thus have become
under-represented in the genome overall and asymmetrically distributed
into CpG rich and CpG poor regions. Thus M-SPA is limited to regions that
are CpG rich. GpC dinucleotides do not have the same propensity for
mutation and are more broadly distributed throughout the genome.

[0006] As such, there is a continuing need for improved methods for
determining endogenous methylation and nucleosome positioning
simultaneously.

[0008] One aspect of the present invention is the discovery that the GpC
methyltransferase enzyme, M. CviPI, only methylates DNA on a genome-wide
basis under very specific conditions. As such, one aspect of the present
invention is the genome-wide methylation of GpC sites, preferably using
M. CviPI. Another aspect of the present invention is a kit for the
genome-wide methylation of GpC sites, also preferably using M. CviPI.

[0009] Another aspect of the present invention is a method for genome-wide
methylation-sensitive chromatin structure determination comprising
providing eukaryotic cells with nuclei comprised of chromatin, wherein
the chromatin is comprised of nucleosomes having DNA associated with
histones and also optionally associated with one or more tight-binding
factors, extracting the nuclei of the cells, methylating substantially
all of the GpC sites of the chromatin not associated with nucleosomes or
tight-binding factors, purifying the DNA, bisulfite converting the DNA,
and sequencing the DNA; wherein the sequencing provides the endogenous
methylation state of the DNA and the GpC sites associated with the
nucleosomes or tight-binding factors. Preferably, the step of extracting
the nuclei comprises a step of lysing the cells to lyse the
cytoplasmic membrane of the cell. Preferably, the step of methylating
substantially all of the GpC sites comprises contacting the cells with a
GpC methylating reagent comprising a methyl transfer agent, lysis
prevention agent and an effective amount of a GpC methyltransferase.

[0010] The GpC methylating reagent preferably also comprises a buffer. In
a further preferred embodiment, the methyl transfer agent is SAM, the
lysing prevention agent is sucrose, and the GpC methyltransferase is M.
CviPI.

[0011] Another aspect of the present invention is directed to a kit for
genome-wide methylation sensitive chromatin structure determination
comprising a cytoplasmic membrane lysing reagent, a GpC methylating
reagent, a DNA purifying reagent; and instructions for using the reagents
to prepare chromatin DNA for sequencing, wherein, when used as
instructed, the endogenous methylation state of the DNA is preserved. The
kit may also include a bisulfite conversion reagent. Preferably, when used
as instructed, the GpC sites associated with the nucleosomes or
tight-binding factors are preserved. The GpC methylating reagent
comprises a methyl transfer agent, lysis prevention agent and an
effective amount of a GpC methyltransferase, and preferably, a buffer.

[0012] Another aspect of the present invention is directed to a method of
genome-wide methylation of substantially all DNA GpC sites not associated
with nucleosomes or other tight-binding factors comprising providing
eukaryotic cells with nuclei comprised of chromatin, wherein the
chromatin is comprised of nucleosomes having DNA associated with histones
and also optionally associated with tight-binding factors, extracting the
nuclei of the cells, contacting the nuclei with a GpC methylating reagent
comprised of a methyl transfer agent, a lysis prevention agent
(preferably sucrose) and an effective amount of GpC methyltransferase;
and incubating the combination of the nuclei and GpC methylating reagent
such that substantially all of the GpC sites of the nuclei's chromatin
not associated with nucleosomes and, optionally, tight-binding factors
are methylated, wherein one or more of endogenous DNA CpG methylation
status, a native chromatin structure and the protein binding is
preserved. Preferably, the DNA CpG methylation status, the native
chromatin structure and the protein binding are preserved. The step of
extracting the nuclei comprises a step of lysing the cells to lyse the
cytoplasmic membrane of the cell.

[0013] Another aspect of the present invention is directed to a kit for
genome-wide methylation of substantially all GpC not associated with
nucleosomes or other tight-binding factors comprising a cytoplasmic
membrane lysing reagent, a GpC methylating reagent comprised of a methyl
transfer agent, lysis prevention agent and an effective amount of M.
CviPI, and instructions for using the reagents to methylate substantially
all of the GpC sites of the nuclei's chromatin not associated with
nucleosomes or tight-binding factors, wherein one or more of endogenous
DNA CpG methylation status, native chromatin structure and protein
binding is preserved.

[0014] Another aspect of the present invention is the use of, amongst
other techniques, GpC methylation and bisulfite conversion, to determine
chromatin structure. Using the methods and kits of the present invention
enables the examination of both nucleosome positioning and endogenous CpG
methylation within the same DNA molecule. Using, for instance, massively
parallel sequencing combined with the GpC footprinting methodology, an
integrated view of DNA methylation and chromatin architecture across the
entire genome will be generated. In a preferred embodiment, cells will be
treated with a GpC methyltransferase enzyme, which will generate a
nucleosome footprint by methylating all GpC dinucleotides that are not
bound by nucleosomes or tight binding proteins. After this enzymatic
treatment, DNA is extracted and bisulfite converted. The resulting
bisulfite converted DNA is used to generate a library that will
subsequently be used for Solexa sequencing on the Illumina Genome
Analyzer. Nucleosome occupancy will be indicated by patches of GpC sites,
which were protected and thus not methylated by the GpC
methyltransferase. Endogenous DNA methylation status will be obtained
from the same regions by examining methylation at CpG sites. Combining
this data will give the first genome-wide correlation of DNA methylation
and nucleosome positioning. Each region of the genome should be examined
approximately 4-5 times to give sufficient coverage and ensure
reliable and meaningful conclusions.
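
By way of non-limiting illustration, the read-out described above can be sketched in Python as follows. The function name classify_cytosines, the gap-free top-strand alignment of the read to the reference, and the separate "GCG" label (for a cytosine that sits in both a GpC and a CpG context and therefore cannot be assigned unambiguously to either the enzyme or endogenous methylation) are assumptions made for this sketch and are not taken from the disclosure.

```python
def classify_cytosines(reference, read):
    """Label every reference cytosine with its context and methylation call.

    reference: top-strand genomic sequence (A/C/G/T).
    read: bisulfite-converted read aligned base-for-base to the reference
          (assumed gap-free and of equal length, for illustration only).
    Returns a list of (position, context, state) tuples.
    """
    if len(reference) != len(read):
        raise ValueError("reference and read must have the same length")
    reference = reference.upper()
    read = read.upper()
    calls = []
    for i, base in enumerate(reference):
        if base != "C":
            continue
        prev_g = i > 0 and reference[i - 1] == "G"
        next_g = i + 1 < len(reference) and reference[i + 1] == "G"
        if prev_g and next_g:
            context = "GCG"    # both GpC and CpG: ambiguous (assumption: kept apart)
        elif prev_g:
            context = "GpC"    # reports accessibility to the GpC methyltransferase
        elif next_g:
            context = "CpG"    # reports endogenous methylation
        else:
            context = "other"
        if read[i] == "C":
            state = "methylated"      # C survived bisulfite conversion
        elif read[i] == "T":
            state = "unmethylated"    # C was converted and reads as T
        else:
            state = "no_call"         # mismatch or sequencing error
        calls.append((i, context, state))
    return calls
```

In such a sketch, unmethylated GpC calls clustered along one read would mark a protected (nucleosome- or factor-bound) patch, while the CpG calls on the same read would give the endogenous methylation state of that molecule.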

[0015] The approach described herein is significantly better than
currently available methods that analyze DNA methylation and protein
binding together. Importantly, in the approach described here, the
nucleosome and binding protein assay is done in living cells, thus
providing an accurate, detailed picture. This is in contrast
to previous methods that determine nucleosome positioning using
sonication or micrococcal nuclease digestion that rely on DNA breakage,
which can be confounded by cleavage sensitivity of different genomic
regions. Thus, commonly used approaches are potentially limited to
regions of the genome that are sensitive to sonication or micrococcal
nuclease digestion and as a result do not provide a true genome-wide
approach.

[0016] As a result footprinting based on GpC methylation can be used to
interrogate both CpG rich and CpG poor regions. Imprinted regions and
X-linked genes are methylated on one allele, thus the positioning of
nucleosomes and other binding proteins cannot be examined using the M-SPA
method. In the technique described here, endogenous methylation is
obtained from the same DNA strand that is used for footprinting of
nucleosome and binding proteins thus making it possible to correlate
mono-allelic gene expression with specific chromatin structures. The use
of the GpC methyltransferase method overcomes the limitations of M-SPA
and can be used to generate an integrated view of methylated and
unmethylated regions, CpG rich and CpG poor regions, imprinted and
X-linked genes at the single molecule level, which has not been possible
up until this point.

[0017] The epigenetic landscape generated by the combined DNA methylation
analysis and nucleosome and binding protein footprint will have several
important implications for biology. The findings will provide valuable
insight into epigenetic changes that occur during a variety of diseases,
including cancer. This technique makes it possible to identify specific
chromatin structures that are correlated with particular disease states
and progression. Furthermore, this combined analysis can lead to the
identification of new drug targets and footprints can be generated as a
way to monitor a patient's response to treatment. The use of single
molecule sequencing is especially important for disease-related
changes. It allows the analysis of single nucleotide polymorphisms (SNPs),
which often predispose an individual to a disease. The presence of
specific SNPs can be correlated with a particular chromatin structure or
methylation level or pattern and the susceptibility to specific diseases.

BRIEF DESCRIPTION OF THE FIGURES

[0018] FIG. 1 shows the schematic of M.SssI footprinting. First chromatin
is treated with M.SssI. This enzyme methylates all CpG sites in purified
DNA, but it cannot methylate the same sites when they are assembled into
nucleosomes or are associated with tight binding factors. Next the DNA is
purified, the sequences are bisulfite converted and individual molecules
are cloned. Patches which are inaccessible to M.SssI are revealed. Red
circles indicate CpG sites that are methylated and white circles indicate
sites that are unmethylated.

[0019] FIG. 2 is a schematic of the protocols according to one embodiment
of the present invention. The procedure can start with basic protocol 1,
which describes nuclei purification and treatment of nuclei with M.SssI,
or with basic protocol 2, which discusses in vitro remodeling and
treatment of the remodeled products with M.SssI. These two protocols are
then followed by bisulfite conversion (basic protocol 3) and PCR
amplification and cloning of individual molecules (basic protocol 4).

[0020] FIG. 3 shows images of cells before and after lysis of the cell
membrane. (A) Microscopic image of cells prior to lysis of the cell
membrane. (B) Microscopic image of cells after lysis of the cell membrane
by incubation with NP-40. (C) Microscopic image of cells after lysis of
the cell membrane by dounce homogenization.

[0021] FIG. 4 is a schematic for the bisulfite conversion of DNA. During
bisulfite treatment of DNA all unmethylated cytosines (C) are converted
into uracils (U). All methylated cytosines remain unchanged. After the
first PCR amplification cycle the U's are complemented with A's (adenine)
in the antisense strand and the methylated C's are complemented with G's
(guanine). Then after subsequent rounds of PCR the U's in the sense
strand become T's (thymine) and the methylated C's in the sense strand
remain C's. Therefore, during the whole process unmethylated C's become
T's and methylated C's remain C's.
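
As an illustrative sketch only, the conversion shown in FIG. 4 can be simulated for a sense strand in Python. The function name bisulfite_convert and the representation of methylation as a collection of 0-based positions are assumptions introduced for this example.

```python
def bisulfite_convert(sense, methylated_positions):
    """Simulate bisulfite conversion followed by PCR for the sense strand.

    sense: sense-strand sequence (A/C/G/T).
    methylated_positions: 0-based positions of methylated cytosines
                          (hypothetical input format).
    Returns (converted_strand, pcr_product).
    """
    methylated = set(methylated_positions)
    converted = []
    for i, base in enumerate(sense.upper()):
        if base == "C" and i not in methylated:
            converted.append("U")   # unmethylated C is deaminated to U
        else:
            converted.append(base)  # methylated C and other bases are unchanged
    converted_strand = "".join(converted)
    # After PCR, every U in the sense strand is read as T.
    pcr_product = converted_strand.replace("U", "T")
    return converted_strand, pcr_product
```

For example, bisulfite_convert("ACGTCCG", [1]) returns ("ACGTUUG", "ACGTTTG"): the methylated C at position 1 remains C, and the two unmethylated C's become T's.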

[0022] FIG. 5 shows the primer design for amplification of bisulfite
converted DNA. First take the genomic sequence and convert all C's that
are not part of a CpG site to T's. Then design a forward primer that is
complementary to the antisense strand. This primer should not contain any
CpG's in it and should end in a converted C (if possible). The primer
should be 18-30-bp and have a melting temperature above 50° C. Do
the same for the reverse primer, but have it complement the sense strand.
CpG sites are marked in red and primer is marked in blue.
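
The primer design rules above can be sketched in Python as follows, for illustration only. The function names are hypothetical, and the melting temperature uses a common basic approximation, Tm = 64.9 + 41 x (G + C - 16.4) / N, which is an assumption of this sketch and not a formula taken from the disclosure; a nearest-neighbor calculation would generally be more accurate.

```python
def convert_template(genomic):
    """Convert every C that is not part of a CpG site to T (top strand)."""
    genomic = genomic.upper()
    out = []
    for i, base in enumerate(genomic):
        in_cpg = base == "C" and i + 1 < len(genomic) and genomic[i + 1] == "G"
        out.append("T" if base == "C" and not in_cpg else base)
    return "".join(out)


def basic_tm(primer):
    """Rough melting temperature in degrees C (assumed approximation)."""
    primer = primer.upper()
    gc = primer.count("G") + primer.count("C")
    return 64.9 + 41.0 * (gc - 16.4) / len(primer)


def check_forward_primer(primer, min_len=18, max_len=30, min_tm=50.0):
    """Check a forward primer against the length, CpG, and Tm rules."""
    primer = primer.upper()
    return {
        "length_ok": min_len <= len(primer) <= max_len,
        "cpg_free": "CG" not in primer,   # primer should contain no CpG
        "tm_ok": basic_tm(primer) > min_tm,
    }
```

Because the forward primer is complementary to the antisense strand, a candidate can be taken as a substring of the output of convert_template and then passed to check_forward_primer; the reverse primer would be handled the same way on the other strand.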

[0023] FIG. 6 shows the methylation of mononucleosomes with increasing
amounts of M.SssI. Open circles represent CpG sites that were
inaccessible to M.SssI and closed circles indicate CpG sites that were
methylated by M.SssI. If too little M.SssI is used or if incubation times
are too short intermittent methylation patterns will be seen, as well as
protection patterns which are >150-bp per nucleosome (Panels A and B).
If the experiment works correctly, a clear protection pattern of 150-bp
per nucleosome will be observed as patches of open CpG sites (Panel C).
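
As a hedged illustration of how such protection patterns might be read from a single molecule, the following Python sketch groups consecutive unmethylated sites into patches. The function name protected_patches, the input format, and the min_span threshold are assumptions made for this example. The span is measured between the first and last unmethylated site in a run, so it is a lower bound on the protected region, whose true edges lie somewhere before the flanking methylated sites.

```python
def protected_patches(sites, min_span=1):
    """Find runs of consecutive unmethylated sites on one molecule.

    sites: list of (position, methylated) tuples sorted by position
           (hypothetical input format).
    min_span: smallest span, in bp, for a patch to be reported.
    Returns a list of (start, end, span) tuples.
    """
    patches = []
    run = []
    for position, methylated in sites:
        if not methylated:
            run.append(position)          # extend the current protected run
            continue
        if run:                           # a methylated site closes the run
            patches.append((run[0], run[-1], run[-1] - run[0] + 1))
            run = []
    if run:                               # run that reaches the end of the molecule
        patches.append((run[0], run[-1], run[-1] - run[0] + 1))
    return [patch for patch in patches if patch[2] >= min_span]
```

A patch spanning roughly 150 bp would be consistent with one nucleosome, whereas many short, interrupted patches or patches well over 150 bp would suggest the under-methylation described for Panels A and B.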

[0024] FIG. 7 shows that the methylation and expression of L1-MET correlate
in cell lines. A. Map of the alternate transcript from L1-MET. Exons are
represented by black boxes and a red box represents the specific L1. The
bent arrows indicate transcriptional start sites and ATGs indicate
translational start sites. Horizontal arrows indicate the primers for PCR
of bisulfite converted DNA and RT-PCR. The bisulfite-specific primers
Bi-L1-5' and Bi-MET-3' were used to amplify L1-MET for methylation
analysis and Bi-L1-5' and Bi-L1-3' for global L1 methylation analysis.
The RT-PCR primers, RT-L1-MET-5' and RT-MET-3' were used to amplify cDNA
of the L1-MET transcript for expression analysis and RT-MET-3' and
RT-MET-5' for the full length MET expression analysis. The lower tick
marks represent each CpG site. Vertical arrows indicate the CpG sites
analyzed by the Ms-SNuPE assay. B. L1-MET methylation (red bars) and L1
methylation (black bars) was analyzed by Ms-SNuPE in 8 normal tissues,
one normal bladder fibroblast cell line (LD419), two non-tumorigenic
urothelial cell lines (UROtsa and NK2426), and 20 bladder carcinoma cell
lines. Values are the average of one CpG site for L1 and an average of
two CpG sites for L1-MET from technical triplicates. Error bars represent
the standard deviation. C. Expression of L1-MET was measured using
real-time RT PCR in one normal bladder fibroblast cell line, two normal
urothelial cell lines and 10 bladder carcinoma cell lines. There is
clearly a strong correlation between DNA methylation and expression in
all 13 cell lines examined. Values are the average from technical
duplicates. Red bars indicate the methylation status of L1-MET, which is
also represented in B, and green bars represent the level of expression
relative to GAPDH.

[0025] FIG. 8 shows that DNA methylation silences the L1-MET promoter. A.
Map of the CpG sites (represented by the lower tick marks) within the
L1-MET anti-sense promoter (ch7:116364010-116364564), which was ligated
into a CpG-less luciferase vector (pCpGL) in both orientations, allowing
for the measurement of either L1-MET activity (red bars) or L1 activity
(black bars). B. The relative luciferase activity (firefly luciferase
light units/Renilla luciferase light units) is represented as the
mean+/-SD and was high in the untreated vector, the methyl donor
S-adenosyl-methionine (SAM) alone, and the CpG methyltransferase (SssI)
alone. When the methyltransferase enzyme and the methyl donor (SssI+SAM)
were added to the luciferase vectors together then promoter activity was
silenced in both directions. The values are the average of three
biological replicates. Error bars represent the standard deviation.

[0026] FIG. 9 shows that chromatin remodeling occurs at an active L1-MET
promoter. A. DNA methylation at L1-MET and global L1s was determined by
pyrosequencing in the immortalized urothelial cell line UROtsa and
bladder carcinoma cell line T24. Chromatin immunoprecipitation was
performed using antibodies for H3K4me3, acetylated H3, and H2A.Z. The
values of the ChIP assay are the average of three experiments with
technical duplicates. Error bars represent the standard deviation, and
p16 represents a single copy gene control. The presence of active histone
marks was associated with absence of DNA methylation at L1-MET in the
cancer cell line. Methylase dependent single promoter analysis (MSPA)
with M. CviPI, a GpC methyltransferase, of the B. endogenously methylated
L1-MET promoter (ch7:116364020-116364664) in the UROtsa immortalized
urothelial cell line and the C. endogenously unmethylated L1-MET promoter
in T24 bladder carcinoma cells. D. DNA methylation at L1-MET and global
L1s was determined by pyrosequencing in the colon cancer cell line HCT116
and HCT116 DKO cells (DNMT1 hypomorph/DNMT3B knockout) [31,32]. Chromatin
immunoprecipitation was performed using antibodies for H2A.Z. The
presence of active histone marks was associated with absence of DNA
methylation at L1-MET in the DKO cell line. Methylase dependent single
promoter analysis (MSPA) with M. CviPI, a GpC methyltransferase, of the
E. endogenously methylated L1-MET promoter in HCT116 colon cancer cells,
and F. endogenously unmethylated L1-MET promoter in HCT116 DKO cells.
White circles indicate unmethylated sites and black circles indicate
methylated sites. Orange bars indicate areas of protection consistent
with the presence of a nucleosome.

[0027] FIG. 10 shows that nucleosome eviction is a frequent occurrence at
L1 promoters. A. Partial MNase digestion of nucleosomes was followed by
fractionation by a sucrose density gradient. When a Southern for genomic
DNA was performed on the DNA in each fraction (6-16), enrichment in the
mono- and dinucleosome fractions was revealed. When a Southern for L1s
was performed enrichment of L1s in the di- and tetranucleosome fractions
was found. According to our model the L1 promoters with a
tetranucleosomal structure should be inactive and methylated.

[0028] FIG. 11 shows that methylation and expression status of L1-MET
correlate in bladder tissues. Horizontal lines represent the mean and n
the number of patient samples. A. Methylation status was analyzed by
Ms-SNuPE in normal tissues (N, green), corresponding normal tissues (CN,
dark blue), and bladder tumors (T, red). Values are an average of two CpG
sites. B. Expression of the alternate transcript from L1-MET and C. the
host gene MET, and the control gene GAPDH was measured by real-time
RT-PCR. *** represents p<0.001, ** represents p<0.01, and *
represents p<0.05 as determined by the Mann-Whitney test. While there
are no error bars for the clinical sample analysis due to the extremely
limited amount of sample DNA, the results show a consistent trend.

[0029] FIG. 12 shows the methylation of L1-MET across the bladder. A.
Tissue samples were taken from the tumors of five patients (red, T) and
at increasing distances from the tumor (0.5 to 2 cm) in the surrounding
normal-appearing tissue in multiple directions (light blue, a to d).
Additionally, distant normal-appearing samples were taken at least 5 cm
from the tumor (dark blue, C). B. Methylation at L1-MET and C. global L1
was measured by pyrosequencing. The green line represents the mean
methylation of normal samples from cancer-free patients. While there are
no error bars for the clinical sample analysis due to the extremely
limited amount of sample DNA, the results show a consistent trend. D.
Bisulfite sequencing of L1-MET was performed on samples from two bladder
cancer-free patients (#4987 and #5240) and one bladder cancer patient
(#6519). White circles represent unmethylated CpGs and black circles
represent methylated CpGs.

[0030] FIG. 13 is a model of the epigenetic alterations that occur between
inactive L1s and active L1s during tumorigenesis. An L1 promoter is
usually silenced by DNA methylation and has a compact chromatin structure
with four nucleosomes occupying the promoter. Upon hypomethylation during
tumorigenesis the L1 promoter becomes transcriptionally active. The
active promoter loses a nucleosome upstream of each of the transcription
start sites, resulting in a dinucleosome structure. The remaining
nucleosomes have acetylated H3, H3K4me3, and H2A.Z. (-1) represents the
nucleosome directly upstream of the transcription start site, while (+1)
represents the nucleosome directly downstream of the transcription start site.

[0031] FIG. 14 shows the specific L1s with alternate transcripts located
in introns of genes. Black boxes represent exons of the host gene while
red boxes represent a specific L1. The black arrow represents the
transcriptional start site of the host gene while the red arrow
represents the alternate transcriptional start site within the
potentially active L1 promoter. GenBank accession numbers for
representative alternate transcripts are followed by the number in
parentheses of similar transcripts transcribed from the individual L1.
All L1s are antisense to their host genes, yielding alternate transcripts
that are sense with their host genes. Found at:
doi:10.1371/journal.pgen.1000917.s001 (0.56 MB TIF)

[0032] FIG. 15 shows the truncated MET protein encoded by L1-MET. (A) The
functional domains of MET include the signal peptide (SP), sema domain at
the N-terminus, the PSI domain, IPT repeats, the transmembrane domain
(TM), and the kinase domain at the C-terminus. The structures of truncated
MET proteins 1 and 2 are shown, encoded by transcripts derived from
placenta (GenBank accession no. BX334980) and a bladder carcinoma cell
line (BF208095), respectively. (B) The two L1-MET transcripts, truncated
L1-MET-1 (T-MET-1) and truncated L1-MET-2 (T-MET-2), were cloned into a
pMEV expression vector with 2 HA tags fused at the N-terminal. HeLa cells
were transfected with either the empty pMEV vector, pMEV T-MET-1, or pMEV
T-MET-2 and protein was extracted after 48 hours. The expression of
truncated MET-1 (90 kDa) and truncated MET-2 (60 kDa) was detected by
western blot using an HA antibody. (C) Results of 5'RACE reveal the start
site for L1-MET within the L1 element. The transcriptional start site of
L1-MET was confirmed by 5'RACE in the T24 cell line which expressed
L1-MET. The underlined sequence is located inside of the LINE-1. (D)
RT-PCR analysis of reactivation of L1-MET by 1 or 3 μM of 5-Aza-CdR
treatment for 24 hours (day 3 after treatment). β-actin expression
level was used as a control. Found at:
doi:10.1371/journal.pgen.1000917.s002 (1.22 MB TIF)

[0033] FIG. 16 shows that the methylation and expression of L1-ACVR1C
correlate in cell lines. (A) Map of alternate transcripts from
L1-ACVR1C. Exons are represented by black boxes while the specific L1s
are represented by red boxes. The lower tick marks represent each CpG
site. The left bent arrow indicates transcriptional start sites and ATGs
indicate translational start sites. Green arrows indicate the primers
used to amplify the pyrosequencing product and the black arrow in between
indicates the location of the pyrosequencing primer for L1-ACVR1C. (B)
L1-ACVR1C methylation (red bars) and L1 methylation (black bars) were
analyzed by pyrosequencing in 6 normal tissues, one normal bladder
fibroblast cell line (LD419), one non-tumorigenic urothelial cell line
(UROtsa), and 10 bladder carcinoma cell lines. Values are the average of
one CpG site for L1 and an average of two CpG sites for L1-ACVR1C from
two technical duplicates. (C) Expression of L1-ACVR1C was measured using
real-time RT-PCR in one normal bladder fibroblast cell line, one normal
urothelial cell line, and 10 bladder carcinoma cell lines. Values are
also the average from two technical duplicates. Red bars indicate the
methylation status of L1-ACVR1C, which is also represented in (B), and
green bars represent the level of expression relative to GAPDH. Found at:
doi:10.1371/journal.pgen.1000917.s003 (0.86 MB TIF)

[0034] FIG. 17 shows that the methylation and expression of L1-RAB3IP
correlate in cell lines. (A) Map of alternate transcripts from
L1-RAB3IP. Exons are represented by black boxes while the specific L1s
are represented by red boxes. The lower tick marks represent each CpG
site. The left bent arrow indicates transcriptional start sites and ATGs
indicate translational start sites. Green arrows indicate the primers
used to amplify the pyrosequencing product and the black arrow in between
indicates the location of the pyrosequencing primer for L1-RAB3IP. (B)
L1-RAB3IP methylation (red bars) and L1 methylation (black bars) was
analyzed by pyrosequencing in 6 normal tissues, one normal bladder
fibroblast cell line (LD419), one non-tumorigenic urothelial cell line
(UROtsa), and 10 bladder carcinoma cell lines. Values are the average of
one CpG site for L1 and an average of two CpG sites for L1-RAB3IP from
two technical duplicates. (C) Expression of L1-RAB3IP was measured using
real-time RT-PCR in one normal bladder fibroblast cell line, one normal
urothelial cell line, and 10 bladder carcinoma cell lines. Values are
also the average from two technical duplicates. Red bars indicate the
methylation status of L1-RAB3IP, which is also represented in (B), and
green bars represent the level of expression relative to GAPDH. Found at:
doi:10.1371/journal.pgen.1000917.s004 (0.88 MB TIF)

[0035] FIG. 18 shows that chromatin remodeling occurs at active L1
promoters. (A) DNA methylation at specific and global L1s (with p16 as a
control) was determined by pyrosequencing in the immortalized urothelial
cell line UROtsa and bladder carcinoma cell line T24. The specific L1s
had less methylation in the cancer cell line. Chromatin
immunoprecipitation was performed using antibodies for (B) H3K4me3; (C)
acetylated H3; and (D) H2A.Z. The values of the ChIP assay are the
average of three experiments with technical duplicates. Error bars
represent the standard deviation. The presence of active histone marks
was associated with absence of DNA methylation at the specific L1s in the
cancer cell line. Found at: doi:10.1371/journal.pgen.1000917.s005 (0.67
MB TIF)

[0037] FIG. 20 shows that methylation and expression status of specific L
is correlates in bladder tissues. Horizontal lines represent the mean.
(A) Methylation status of L1-ACVRIC was analyzed by pyrosequencing in
normal tissues (N), corresponding normal tissues (CN), and bladder tumors
(T). Values are an average of two CpG sites. (B) Expression of the
alternate transcript from L1-ACVRIc and (C) the host gene ACVRIC, and the
control gene GAPDH was measured by real-time RT-PCR. *** represents
p<0.001, ** represents p<0.01, and * represents p<0.05. (D)
Methylation status of L1-RAB3IP was analyzed by pyrosequencing in normal
tissues (N), corresponding normal tissues (CN), and bladder tumors (T).
Values are an average of two CpG sites. (E) Expression of the alternate
transcript from L1-RAB3IP and (F) the host gene RAB3IP, and the control
gene GAPDH was measured by real-time RT-PCR. *** represents p<0.001,
** represents p<0.01, and * represents p<0.05 as determined by the
Mann-Whitney test. While there are no error bars for the clinical sample
analysis due to the extremely limited amount of sample DNA, the results
show a consistent trend. Found at: doi:10.1371/journal.pgen.1000917.s007
(0.58 MB TIF)

[0039] FIG. 22 shows the detection of L1-MET hypomethylation in urine
sediments of patients with bladder cancer. Bisulfite-specific primers and
a probe were designed for the MethyLight assay that amplified only
completely unmethylated strands of L1-MET. Bladder tissues (N) from
age-matched patients without bladder cancer (n=10) and urine (N) from
age-matched healthy volunteers (n=10) showed low levels of L1-MET
hypomethylation. However, urine (n=20) from patients with TCC showed high
levels of L1-MET hypomethylation, which was specific to the bladder since
it was not detected in their white blood cells (WBC) (n=20). Unmethylated
levels (Y axis) indicate the Percent of fully Unmethylated Reference
(PUR) values. Found at: doi:10.1371/journal.pgen.1000917.s009 (0.30 MB
TIF)

[0040] FIG. 23 shows the methylation of specific L1s across the bladder.
(A) Tissue samples were taken from five patients of their tumors (red, T)
and at increasing distances from the tumor (0.5 to 2 cm) in the
surrounding normal-appearing tissue in multiple directions (light blue, a
to d). Additionally, distant normal-appearing samples were taken at least
5 cm from the tumor (dark blue, C). (B) Methylation at L1-ACVRIC and (C)
L1-RAB3IP was measured by pyrosequencing. The green line represents the
mean methylation of 12 normal samples from cancer-free patients. While
there are no error bars for the clinical sample analysis due to the
extremely limited amount of sample DNA, the results show a consistent
trend. Found at: doi:10.1371/journal.pgen.1000917.s010 (1.62 MB TIF)

[0041] FIG. 24 shows the bisulfite sequencing of L1-MET. Biphasic
distribution of L1-MET methylation status in corresponding tissue from a
patient with bladder cancer is revealed by plotting the number of DNA
strands by the percent of CpG sites methylated. Found at:
doi:10.1371/journal.pgen.1000917.s011 (0.18 MB TIF)

[0042] FIG. 25 shows that Ms-SNuPE and pyrosequencing yield similar
methylation results. While both Ms-SNuPE and pyrosequencing are
quantitative assays, pyrosequencing has much higher throughput.
Therefore, we developed a pyrosequencing assay for the remaining studies.
(A) We measured 4 CpG sites by the pyrosequencing assay, in contrast with
the CpG sites measured by Ms-SNuPE. (B) We randomly chose 66 samples
previously analyzed by Ms-SNuPE for pyrosequencing, and the results from
the two assays are very similar (R=0.91). Found at:
doi:10.1371/journal.pgen.1000917.s012 (0.42 MB TIF).

[0043] FIG. 26 shows that the methods of the present invention can
accurately footprint open chromatin structures, without generating
aberrant accessibility in occupied and CpG methylated promoters. Nuclei
from human fibroblasts were treated with different amounts of M.CviPI.
Both GRP78 and MLH1 are expressed (and thus should have a nucleosome
after the TSS and a nucleosome depleted region (NDR) before the TSS).
Accurate footprinting of MLH1 was obtained using 100 U of M.CviPI;
however, accurate footprinting of the NDR of GRP78 required the 200+100
M.CviPI condition. The 200+100 condition also accurately footprinted the
MLH1 promoter. MYOD1 and LAMB3 are not expressed in human fibroblasts and
are occupied by nucleosomes. The 200+100 condition did not result in
aberrant accessibility at these promoters. Combining these results shows
that 200+100 Units of enzyme can accurately footprint accessible
promoters without leading to aberrant GpC methylation of inaccessible
promoters. Black and white circles represent methylated and unmethylated
sites, respectively. M.SssI footprint is shown as a positive control for
GRP78, MLH1 and MYOD1 and endogenous methylation is shown for LAMB3.

[0044] FIG. 27 shows that the methods of the present invention are able to
identify distinct chromatin configurations associated with specific
histone modifications and promoter types. (A) GNOMe-seq demonstrates that
H3K4me3 marked promoters are unmethylated and contain an NDR upstream and
well positioned nucleosomes after the TSS. H3K27me3 marked promoters are
unmethylated and nucleosome occupied as indicated by M.CviPI
inaccessibility. Methylated promoters are nucleosome occupied. (B) CpG
island promoters are characterized by a lack of CpG methylation, an
upstream NDR and well positioned nucleosomes after the TSS. The majority
of CpG island promoters are unmethylated (11,165) and display the same
pattern, while methylated CpG island promoters (781) are nucleosome
occupied and inaccessible to M.CviPI. (C) Non-CpG island promoters are
generally characterized by CpG methylation and inaccessibility to
M.CviPI, indicating nucleosome occupancy.

[0045] FIG. 28 shows that the methods of the present invention are able to
identify differences in chromatin configurations based on gene expression
level. Gene promoters were divided into quartiles based on transcription
level and the corresponding M.CviPI inaccessibility (1-GCH, gray line)
and DNA methylation (CGH, black line) is plotted.

[0047] We found variable chromatin configurations surrounding specific
transcription factor binding sites. (A) At AP-1 binding sites there are
low levels of DNA methylation and nucleosome depletion, while at (B) NF1
binding sites there is also a dip in DNA methylation levels but the sites
are nucleosome occupied. (C) At E2F binding sites there is a peak in
methylation that corresponds to nucleosome occupancy. Interestingly, at
CREB binding sites there is a peak in DNA methylation that corresponds to
a dip in nucleosome occupancy.

DETAILED DESCRIPTION OF THE INVENTION

[0048] Unless otherwise indicated, all terms used herein have the meanings
that the terms would have to those skilled in the art of the present
invention. Practitioners are particularly directed to Alberts et al.,
(2008) Molecular Biology of the Cell (Fifth Edition (Reference Edition))
Garland Science, Taylor & Francis Group, LLC, for definitions and terms
of the art. It is to be understood that this invention is not limited to
the particular methodology, protocols, and reagents described, as these
may vary.

[0049] The term "CpG site" refers to a region of DNA where a cytosine
nucleotide occurs next to a guanine nucleotide in the linear sequence of
bases along its length, 5' . . . CG . . . 3'. "CpG" is shorthand for
"-C-phosphate-G-", that is, cytosine and guanine separated by a
phosphate, which links the two nucleosides together in DNA. The "CpG"
notation is used to distinguish this linear sequence from the CG base
pairing of cytosine and guanine.

[0050] A "GpC site" refers to a region of DNA where a guanine nucleotide
occurs next to a cytosine nucleotide in the linear sequence of bases
along its length, 5' . . . GC . . . 3'. "GpC" is shorthand for
"-G-phosphate-C-", that is, guanine and cytosine separated by a
phosphate, which links the two nucleosides together in DNA. The "GpC"
notation is used to distinguish this linear sequence from the CG base
pairing of cytosine and guanine.

[0051] The method for genome-wide methylation-sensitive chromatin
structure determination of the present invention includes a step of
providing eukaryotic cells with nuclei comprised of chromatin, wherein
the chromatin is comprised of nucleosomes having DNA associated with
histones and also optionally associated with one or more tight-binding
factors. The type of eukaryotic cells is not particularly limited. The
eukaryotic cells may be mammalian or non-mammalian eukaryotic cells. In a
preferred embodiment, the cells are mammalian cells, and, more
preferably, human cells. The cells may be a cell type or population
associated with a disease state or they may be so-called "normal cells,"
i.e. cells not typically associated with a disease state. Preferably, the
eukaryotic cells have a GpC frequency and distribution substantially
the same as human cells. Preferably, the GpC sites of the cells are not
endogenously methylated.

[0052] Preferably, the methods and kits of the present invention are
directed to genome-wide methylation-sensitive chromatin structure
determination. However, the methods and kits of the present invention may
also be used for methylation-sensitive chromatin structure determination
of a subset of the genome. Specifically, the structure of certain subsets
of the genome may be enriched by known methods, and the structure of
these enriched genomic subsets may be analyzed as described herein. For
instance, the genomic DNA may be treated with a restriction enzyme
according to known methods and the restriction fragments may be analyzed
separately. Further, treatment with antibodies according to known methods
may be used to enrich the antibody binding region of the genome. For
instance, an antibody to methylated DNA may be used to generate a
footprint of the subset of the genome that is methylated.

Nuclei Extraction

[0053] The method for genome-wide methylation-sensitive chromatin
structure determination of the present invention includes a step of
extracting the nuclei of the cells provided.

[0054] Preferably, the cells containing the chromatin structure to be
analyzed are first trypsinized. Trypsinization is the process of using
trypsin, a proteolytic enzyme which breaks down proteins, to dissociate
adherent cells from the vessel in which they are being cultured. In
general, when added to a cell culture, trypsin breaks down the proteins
that enable the cells to adhere to the vessel, typically a plastic flask
or plate, in which the cells have been cultivated. Trypsin "digests" the
proteins that facilitate adhesion to the container and between cells. For
instance, in connection with the present invention, the actively growing
cells are trypsinized and washed once with cold phosphate-buffered saline
(PBS). In a preferred embodiment, 250,000 cells are used per reaction, and
each reaction is run in duplicate. An
untreated control is preferably also run. It should be noted that other
methods known to those of ordinary skill that dissociate adherent cells
from the vessel used to cultivate the cell may be used, so long as the
nuclei of the cells are not significantly altered in the process.

[0055] Preferably, the step of extracting the nuclei includes a step of
separating the nuclei of the cells from the other cytoplasmic contents of
the cell. In general, any method for separating the cellular nuclei from
the cytoplasmic content may be used so long as the chromatin remains
substantially unaltered. In a preferred embodiment, the cells are lysed
with a cytoplasmic membrane lysing agent, which is a lysing agent that is
not powerful enough to break the nuclear membrane but can break the
cytoplasmic membrane. As such, the cytoplasmic membrane lysing agent can be
used to separate the cytoplasmic contents of the cells from the nuclei.
In a preferred embodiment, the cytoplasmic membrane lysing agent is NP-40,
a commercially available detergent (Tergitol-type NP-40, nonyl
phenoxypolyethoxylethanol).

[0056] The nuclei may then be separated by known techniques, for instance,
by centrifugation. Preferably, the nuclei are then washed first in a wash
buffer, as described herein. The cells may also be washed, depending on
the application, in either an RSB Buffer+Sucrose wash or an RSB
Buffer+Sucrose+0.4M NaCl wash (a salt wash to eliminate tight-binding
transcription factors). In a typical procedure, 250,000 cells per 100 μl
are used.

Methylating Substantially all the GpC Sites

[0057] The method for genome-wide methylation-sensitive chromatin
structure determination of the present invention includes a step of (and
the associated method for) methylating substantially all of the GpC sites
not associated with the nucleosomes and also, in a preferred embodiment,
GpC sites not associated with tight-binding factors. The step of
methylating substantially all of the GpC sites preferably includes
contacting the cellular nuclei with a GpC methylating reagent. The GpC
methylating reagent preferably comprises a methyl transfer agent, lysing
prevention agent and an effective amount of a GpC methyltransferase. In a
preferred embodiment, the GpC methylating reagent further comprises a
buffer.

[0058] A suitable GpC methyltransferase is one that is capable of
methylating all cytosine residues (C5) within the double-stranded
dinucleotide recognition sequence 5' . . . GC . . . 3' that are not
associated with a nucleosome or a tight binding factor. The methylation
site of the GpC methyltransferase according to the present invention is:

##STR00001##

[0059] One suitable GpC methyltransferase useable in connection with the
present invention is M.CviPI. M.CviPI is isolated from a strain of E.
coli which contains the methyltransferase gene from Chlorella virus. This
construct is fused to the maltose binding protein (MBP). M.CviPI is
commercially available from New England Biolabs.

[0060] The use of a GpC methyltransferase is especially advantageous since
GpC sites are not methylated in humans except in the context of the
sequence 5' . . . GpCpG . . . 3'. As such, so called "GpCpG sites" should
generally be excluded from analysis since it is not possible to
distinguish between endogenous CpG methylation and enzyme-induced GpC
methylation at such loci. The limited number and location of endogenous
CpG sites limits the resolution of prior methods based on CpG
methyltransferase. Therefore, the GpC methyltransferase based reagents
allow an increased resolution over prior CpG methyltransferase based
reagents.
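
The exclusion of GpCpG sites described above can be illustrated with a short sketch. This is not part of the disclosed protocol; the function name and the labels "GCH" and "HCG" (where H stands for A, C or T) are illustrative assumptions, and only the strand passed in is examined.

```python
def classify_cytosines(seq):
    """Label each cytosine in seq by its dinucleotide context.

    Labels (illustrative names, not taken from the protocol):
      "GCH"   - GpC not followed by G: reports enzyme-induced GpC methylation
      "HCG"   - CpG not preceded by G: reports endogenous CpG methylation
      "GCG"   - ambiguous GpCpG site: excluded from analysis
      "other" - cytosine in neither a GpC nor a CpG context
    A missing neighbour at either end of seq is treated as non-G.
    """
    seq = seq.upper()
    calls = []
    for i, base in enumerate(seq):
        if base != "C":
            continue
        prev_g = i > 0 and seq[i - 1] == "G"
        next_g = i + 1 < len(seq) and seq[i + 1] == "G"
        if prev_g and next_g:
            label = "GCG"
        elif prev_g:
            label = "GCH"
        elif next_g:
            label = "HCG"
        else:
            label = "other"
        calls.append((i, label))
    return calls


if __name__ == "__main__":
    for position, label in classify_cytosines("AGCATCGTGCGA"):
        print(position, label)
```

In this sketch only the "GCH" positions would be used for footprinting and only the "HCG" positions for endogenous methylation, so that the two signals do not overlap.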

[0061] The DNA in the nuclei used in connection with the present invention
may be associated with nucleosomes or tight-binding factors. A "GpC
accessible site" is a GpC site that is capable of being methylated by the
GpC methyltransferase. A "GpC inaccessible site" is a site that is not
capable of being methylated by the GpC methyltransferase because the GpC
site is protected by (or associated with) either a nucleosome, or
alternatively, a tight binding factor. In connection with the present
invention, the GpC inaccessible sites thus provide a "footprint" of the
position of the nucleosome and/or the tight binding factors in the
chromatin.

[0062] In one embodiment of the invention, the methods and kits of the
present invention may be used to identify only the footprints of
nucleosomes and not tight binding factors. Specifically, tight binding
factors may be removed by use of a salt wash, for instance a wash that
contains 0.4M NaCl. It should be noted that nucleosomes can be made of
different types of histones. The stability of the nucleosomes depends on
which types of histones are in the nucleosome. Under certain conditions,
the salt wash may eliminate both the transcription factors and less
stable nucleosomes. The resulting footprint would include the more stable
nucleosomes. However, by comparing the size of the GpC inaccessible
region before and after salt treatment, one of ordinary skill can
determine whether the salt treatment washed out a transcription factor or
an unstable nucleosome.
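
The size comparison described above can be sketched as follows. This is a minimal illustration under stated assumptions: the input is a list of (position, is_methylated) calls for the GpC sites of a single DNA molecule, and the 100 bp cutoff used to separate nucleosome-sized footprints from smaller ones is a hypothetical value chosen for the example, not a value from the protocol.

```python
def protected_regions(gpc_calls):
    """Return (start, end) spans of consecutive unmethylated GpC sites.

    gpc_calls: iterable of (position, is_methylated) for one molecule.
    """
    regions = []
    start = end = None
    for pos, methylated in sorted(gpc_calls):
        if not methylated:
            if start is None:
                start = pos
            end = pos
        elif start is not None:
            regions.append((start, end))
            start = end = None
    if start is not None:
        regions.append((start, end))
    return regions


def classify_lost_footprints(before, after, nucleosome_min_bp=100):
    """Label footprints seen before the salt wash that are absent after it.

    nucleosome_min_bp is an assumed, illustrative cutoff.
    """
    lost = []
    for start, end in before:
        # A footprint is kept if any post-wash footprint overlaps it.
        kept = any(a_start <= end and start <= a_end for a_start, a_end in after)
        if kept:
            continue
        size = end - start + 1
        kind = "unstable nucleosome" if size >= nucleosome_min_bp else "tight-binding factor"
        lost.append((start, end, kind))
    return lost
```

In practice the cutoff would need to be validated for the cell type and GpC density of the region, since the apparent footprint size depends on where GpC sites happen to fall.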

[0063] The methods and kits of the present invention require that the GpC
methylating reagent comprise an effective amount of the GpC
methyltransferase and methyl donating agent. An "effective amount"
is an amount necessary to methylate substantially all the GpC
accessible sites under the reaction (alternatively referred to as
"incubation") conditions. For purposes of the invention, an effective
amount of the GpC methylating reagent is an amount required to methylate
at least 80%, more preferably 90% and most preferably 99% of the GpC
accessible sites.

[0064] It is important that the incubation conditions and the amount of GpC
methyltransferase used be sufficient to methylate substantially all the
GpC accessible sites, but also sufficiently low to avoid substantial
methylation of the GpC inaccessible sites (for example, less than 20% of
the GpC inaccessible sites). Methylating substantially all the GpC
accessible sites means to methylate at least 80%, more preferably 90% and
most preferably 99% of the GpC accessible sites. Avoiding substantial
methylation of the GpC inaccessible sites means methylating less than
20%, more preferably less than 10% and most preferably less than 1% of the
GpC inaccessible sites. The amount of the GpC methyltransferase and methyl
donating agent and the incubation conditions may vary according to cell type.
Validation that substantially all the GpC accessible sites are methylated
but not the GpC inaccessible sites may be done in accordance with the examples
(including the protocols) described herein.
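
As an illustration of the thresholds above, the following sketch checks a set of control loci against the 80% and 20% limits. It assumes that GpC methylation calls (True for methylated) are available for control regions whose accessibility is already known; the function name and input layout are hypothetical.

```python
def is_effective(accessible_calls, inaccessible_calls,
                 min_accessible=0.80, max_inaccessible=0.20):
    """Check methylation calls at control sites against the stated limits.

    accessible_calls / inaccessible_calls: lists of booleans, True where the
    GpC site was found methylated after enzyme treatment.
    """
    if not accessible_calls or not inaccessible_calls:
        raise ValueError("both control sets must contain at least one site")
    accessible_fraction = sum(accessible_calls) / len(accessible_calls)
    inaccessible_fraction = sum(inaccessible_calls) / len(inaccessible_calls)
    return (accessible_fraction >= min_accessible
            and inaccessible_fraction < max_inaccessible)
```

Passing min_accessible=0.99 and max_inaccessible=0.01 would apply the most preferred limits instead of the default ones.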

[0065] Preferably, the amount of the GpC methyltransferase is between about
50 and 500 U (U=Units, and one unit is defined as the amount of enzyme
required to protect 1 μg of lambda DNA in a total reaction volume of
20 μl in 1 hour at 37° C. against cleavage by HaeIII
restriction endonuclease). More preferably, the amount of the GpC
methyltransferase is about 100 U.

[0066] It is possible that the total amount of GpC methyltransferase may
be added in more than one aliquot. For instance, nuclei from human
fibroblasts were treated with different amounts of M.CviPI. Both GRP78 and
MLH1 are expressed (and thus should have a nucleosome after the
transcription start site (TSS) and a nucleosome depleted region (NDR)
before the TSS). Accurate footprinting of MLH1 was obtained using 100 U of
M.CviPI; however, accurate footprinting of the NDR of GRP78 required the
200 U+100 U M.CviPI condition. The 200 U+100 U condition also accurately
footprinted the MLH1 promoter. MYOD1 and LAMB3 are not expressed in human
fibroblasts and are occupied by nucleosomes. The 200 U+100 U condition did
not result in aberrant accessibility at these promoters. Combining these
results shows that 200 U+100 U of enzyme can accurately footprint
accessible promoters without leading to aberrant GpC methylation of
inaccessible promoters. A footprint derived from the CpG
methyltransferase enzyme, M.SssI, can be used as a positive control for
GRP78, MLH1 and MYOD1, and endogenous methylation is shown for LAMB3.

[0067] The GpC methylating reagent preferably includes at least one methyl
transfer agent. Generally, any methyl transfer agent that is reactive
under the GpC methylation conditions and results in the donation of a
methyl group (CH3) to the GpC site of the acceptor DNA may be
used. In an especially preferred embodiment, the methyl transfer agent is
S-adenosyl methionine (SAM,
(2S)-2-Amino-4-[[(2S,3S,4R,5R)-5-(6-aminopurin-9-yl)-3,4-dihydroxyoxolan-2-yl]methyl-methylsulfonio]butanoate).
Validation of a methyl transfer
agent for use in connection with the methods and kits of the present
invention may be accomplished by comparison of results using SAM with
results using a candidate methyl transfer agent under analogous
conditions as would be understood by a person of ordinary skill in the
art.

[0068] The GpC methylating reagent also preferably includes a lysis
prevention agent that prevents lysis of the nuclear membrane of the
nuclei under the enzyme conditions necessary for optimal methyl transfer.
Without being limited to theory, it is believed that the lysis prevention
agent adjusts the viscosity of the reaction medium so as to permit the use
of concentrations of GpC methyltransferase necessary for efficient methyl
transfer to the GpC sites while substantially reducing the lysis of the
nuclear membranes. In a preferred embodiment, the lysis prevention agent
is sucrose. Validation of a lysis prevention agent for use in connection
with the methods and kits of the present invention may be accomplished by
comparison of results using sucrose with results using a candidate lysis
prevention agent under analogous conditions as would be understood by a
person of ordinary skill in the art.

[0069] Following the step of contacting the nuclei with the GpC
methylating reagent, methods of the present invention preferably include
a step of isolating the DNA of the nuclei from the other components of
the nuclei. Any known method of isolating the DNA may be used so long as
it does not substantially affect the methylation state or sequence of the
DNA. In a preferred embodiment, the cells are treated with proteinase K,
and the DNA is purified by phenol/chloroform extraction and ethanol
precipitation.

Bisulfite Conversion

[0070] The method for genome-wide methylation-sensitive chromatin
structure determination of the present invention includes a step of
bisulfite conversion of the DNA that has been subject to the methylating
step. The bisulfite conversion reaction was first described in 1980 as a
method for distinguishing between cytosine and 5-methylcytosine (5mC) in
DNA (Wang et al., 1980; FIG. 4). In this reaction, denatured DNA is first
treated with sodium bisulfite to convert cytosine residues to uracil,
under conditions such that 5mC remains essentially non-reactive. The DNA
sequence of interest is then amplified by PCR with primers specific for
bisulfite modified DNA. This leads to the replacement of the converted
uracil residues by thymidine residues. Therefore, during sequencing of
the bisulfite converted DNA, the unmethylated cytosines appear as
thymidine residues. Before bisulfite conversion the genomic DNA should be
digested with restriction enzymes, which cut outside the sequence to be
cloned. (Note: Bisulfite conversion can be done without cleavage of the
DNA, but this may lead to insufficient conversion of some sequences).
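
The logic of reading methylation from bisulfite converted DNA can be sketched in a few lines. This is a simplified illustration, not the disclosed protocol: it models only the top strand, assumes complete conversion and error-free sequencing, and uses hypothetical function names.

```python
def bisulfite_convert(seq, methylated_positions):
    """Simulate the sequence read after bisulfite conversion and PCR.

    Unmethylated C is read as T; methylated C (5mC) is still read as C.
    """
    methylated = set(methylated_positions)
    return "".join(
        "T" if base == "C" and i not in methylated else base
        for i, base in enumerate(seq.upper())
    )


def call_methylation(reference, read):
    """Compare a converted read with its reference, position by position.

    Returns {position: True/False} for each reference C, where True means
    the read kept a C (methylated) and False means it shows a T.
    """
    if len(reference) != len(read):
        raise ValueError("reference and read must have the same length")
    calls = {}
    for i, (ref_base, read_base) in enumerate(zip(reference.upper(), read.upper())):
        if ref_base != "C":
            continue
        if read_base == "C":
            calls[i] = True
        elif read_base == "T":
            calls[i] = False
        # Any other base is a mismatch and is left uncalled.
    return calls
```

For example, converting the reference AGCATCGT with only position 2 methylated gives AGCATTGT, and comparing that read with the reference recovers a methylated call at position 2 and an unmethylated call at position 5.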

[0071] Bisulfite conversion according to the present invention can be
done using methods known to those of ordinary skill in the art.
Preferably, the methylated GpC sites are subjected to bisulfite
conversion using standard methods or commercially available kits, such as
the EZ DNA Methylation Kit, Cat. Nos. D5001 and D5002, available from
Zymo Research.

[0072] The method for genome-wide methylation-sensitive chromatin
structure determination of the present invention includes a step of
sequencing the DNA.

[0073] The step of sequencing the DNA preferably includes a step of
shearing the DNA. The DNA may be sheared according to methods known to
those of ordinary skill in the art. These include MNase digestion,
sonication, nebulization and restriction digestion. The sheared DNA
results in a library of DNA fragments that may be sequenced, after the
library has been suitably prepared.

[0074] Once sheared, the DNA library may be prepared for sequencing
according to known methods. One method of preparing the DNA library for
use in massively parallel sequencing includes steps of end repair,
addition of an `A` base to the 3' end of the DNA fragments, ligation of
adapters to the ends of the DNA fragments, gel purification of the products
from the ligation reaction, and enrichment of the adapter-modified DNA
fragments by PCR as known to those of ordinary skill in the art.

Sequencing and Analysis

[0075] The prepared DNA library may then be sequenced by known sequencing
techniques, including massively parallel sequencing of the fragment
library, preferably Solexa sequencing on the Illumina Genome Analyzer.
Other suitable sequencing platforms include 454 sequencing and SOLiD;
however, these require a different library preparation protocol, and such
protocols are well known to those of skill in the art.

[0076] In another embodiment, paired end libraries were prepared from 5 μg
of DNA as previously described {Lister, 2009; Kelly, 2010} to generate 76
bp reads. Briefly, M.CviPI treated DNA is end repaired (Epicenter),
ligated to methylated adaptors (Illumina), bisulfite converted (Zymo EZ DNA
Methylation Kit) and subjected to 6 cycles of PCR and size selection by gel
purification. Clusters were generated following Illumina protocols and
the resulting library was sequenced on an Illumina HiSeq.

[0077] Using the GpC methyltransferase enzyme in accordance with the
methods and kits of the present invention enables the examination of both
nucleosome positioning and endogenous CpG methylation within the same DNA
molecule. In addition to being able to generate an integrated map of DNA
methylation and positioning of nucleosomes and other binding proteins,
the use of the GpC methyltransferase overcomes the limitations of CpG
methyltransferase based footprinting, as there is no endogenous GpC
methylation, and GpC sites are comparatively more abundant in the genome than CpG
sites.

[0078] Using next-generation sequencing combined with the GpC footprinting
methodology as described herein, an integrated view of DNA methylation
and chromatin architecture across the entire genome can be generated.
Endogenous DNA methylation status will be obtained from the same regions
by examining methylation at CpG sites. Combining these data provides the
first genome-wide correlation of DNA methylation and nucleosome
positioning. Each region of the genome should be examined approximately
2-10 times to give sufficient coverage and ensure reliable and
meaningful conclusions.
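
The coverage target above translates into a read count by simple arithmetic. The sketch below assumes uniformly distributed reads that all map, which real bisulfite data will not fully satisfy. The 76 bp read length comes from the library description in this document; the genome size of 3.1 billion bp is an assumed round figure for the human genome.

```python
import math


def reads_for_coverage(target_coverage, read_length_bp, genome_size_bp):
    """Minimum number of reads for the target mean coverage."""
    return math.ceil(target_coverage * genome_size_bp / read_length_bp)


def mean_coverage(n_reads, read_length_bp, genome_size_bp):
    """Mean coverage obtained from n_reads reads of the given length."""
    return n_reads * read_length_bp / genome_size_bp


if __name__ == "__main__":
    genome_size = 3_100_000_000  # assumed human genome size in bp
    for coverage in (2, 10):
        print(coverage, reads_for_coverage(coverage, 76, genome_size))
```

Because unmapped, duplicate and low-quality reads reduce usable coverage, the number of reads actually sequenced would typically need to be higher than this lower bound.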

[0079] The approach described herein is significantly better than
currently available methods that analyze DNA methylation and protein
binding together. Importantly, in the approach described herein, the
nucleosome and binding protein assay is done concurrently in living cells
thus providing an accurate, detailed picture simultaneously of the
methylation state and the nucleosome binding in living cells.

[0080] In the technique disclosed herein, endogenous methylation is
obtained from the same DNA strand that is used for footprinting of
nucleosome and binding proteins thus making it possible to correlate
mono-allelic gene expression with specific chromatin structures. The
epigenetic landscape generated by the combined DNA methylation analysis
and nucleosome and binding protein footprint has several important
implications for biology. The findings may provide valuable insight into
epigenetic changes that occur during a variety of diseases, including
cancer. This technique makes it possible to identify specific chromatin
structures that are correlated with particular disease states and
progression. Furthermore, this combined analysis can lead to the
identification of new drug targets and footprints can be generated as a
way to monitor a patient's response to treatment. The use of single
molecule sequencing is specifically important for disease related
changes. It allows the analysis of single nucleotide polymorphisms (SNPs),
which often predispose an individual to a disease. The presence of
specific SNPs can be correlated with a particular chromatin structure or
methylation level or pattern and the susceptibility to specific diseases.

[0081] Another aspect of the present invention is directed to a kit for
genome-wide methylation sensitive chromatin structure determination
comprising a cytoplasmic membrane lysing reagent, a GpC methylating
reagent, a DNA purifying reagent; and instructions for using the reagents
to prepare chromatin DNA for sequencing, wherein, when used as
instructed, the endogenous methylation state of the DNA is preserved. The
kit may also include one or more of trypsin and a bisulfite conversion
reagent. Preferably, when used as instructed, the GpC sites associated
with the nucleosomes or tight-binding factors are preserved. The GpC
methylating reagent comprises a methyl transfer agent, lysis prevention
agent and an effective amount of a GpC methyltransferase, and preferably,
a buffer. The kit may also comprise a salt wash together with appropriate
instructions, for removing, for instance, tight binding factors.

[0082] The instructions included with the kit preferably include
instructions on how to use the kit to effect a method for genome-wide
methylation-sensitive chromatin structure determination. The instructions
preferably include, for instance, a description of the eukaryotic cells
useable in connection with the kit, methods for extracting the
nuclei of the cells, and more preferably instructions and protocols for
methylating substantially all of the GpC sites of the chromatin not
associated with nucleosomes or tight-binding factors. Preferably, the kit
also includes instructions and protocols for one or more of purifying the
DNA, bisulfite converting the DNA; and sequencing the DNA; wherein the
sequencing provides the endogenous methylation state of the DNA and the
GpC sites associated with the nucleosomes or tight-binding factors.

[0083] Another aspect of the present invention is directed to a kit for
genome-wide methylation of substantially all GpC sites not associated with
nucleosomes or other tight-binding factors comprising a cytoplasmic
membrane lysing reagent, a GpC methylating reagent comprised of a methyl
transfer agent, lysis prevention agent and an effective amount of
M.CviPI, and instructions for using the reagents to methylate substantially
all of the GpC sites of the nuclei's chromatin not associated with
nucleosomes or tight-binding factors, wherein one or more of endogenous
DNA CpG methylation status, native chromatin structure and protein
binding is preserved.

[0084] The following Examples are provided in order to demonstrate and
further illustrate certain embodiments and aspects of the present
invention and are not to be construed as limiting the scope thereof.

[0097] 4. DNA is purified by phenol/chloroform extraction and ethanol
precipitation. Do not use phase lock tubes as sucrose interferes.
[0098] Proceed with bisulfite conversion for genome-wide sequencing.

B. Bisulfite Conversion

[0099] Bisulfite conversion can be done using different methods.
Preferably, the methylated GpC sites are subjected to bisulfite
conversion using the EZ DNA Methylation Kit, Cat. Nos. D5001 and D5002,
available from Zymo Research.

C. Shearing DNA

[0100] The DNA may be sheared according to methods known to those of
ordinary skill in the art. These include MNase digestion, sonication,
nebulization and restriction digestion. The sheared DNA results in a
library of DNA fragments that may be sequenced, after the library has
been suitably prepared.

D. Prepare Library for Sequencing

[0101] Once sheared, the DNA library may be prepared for sequencing
according to known methods. One method of preparing the DNA library for
use in massively parallel sequencing is as follows:

** To remove remaining unligated adapters and adapters that may have ligated
to each other, and to select a size range of templates to go on the
sequencing platform. Purify up to 2-3 samples on a single gel to prevent
cross-contamination. Often the material will not be sufficient to be
visualized under UV; load ladder on both sides of the sample to estimate
the size of the desired fragments to be isolated.
[0114] Prepare a 2% agarose (Biorad cat#161-3106) gel in a final volume of
100 ml 1×TAE buffer (Biorad cat#161-0743). Add ethidium bromide (Sigma
cat#E1510) to achieve a 400 ng/ml final concentration.
[0115] Add 3 μl of loading buffer (50 mM Tris pH8.0, 40 mM EDTA, 40% (w/v)
sucrose) to 8 μl of the ladder (NEB cat#N3233L) and load all onto the gel.
Add 12 μl of loading buffer to the DNA from section 3 (40 μl). Load all
DNA and leave one empty lane between ladder and sample.
[0116] Run gel at 120V for 60 min.
[0117] Excise bands from 275 bp to 700 bp with a clean scalpel.
[0118] Purify DNA from agarose gel using Gel Extraction kit (Qiagen). Elute
in 23 μl of EB twice. Use 23 μl for the PCR reaction.
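
The reagent amounts implied by the gel recipe above follow from simple arithmetic, shown here as a small sketch. The function names are illustrative only, and the stock concentration of the ethidium bromide solution is not specified in this document, so the result is given as a mass rather than a pipetting volume.

```python
def agarose_grams(percent_w_v, gel_volume_ml):
    """Grams of agarose for a gel of the given % (w/v) and volume in ml."""
    return percent_w_v * gel_volume_ml / 100


def etbr_micrograms(final_ng_per_ml, gel_volume_ml):
    """Micrograms of ethidium bromide for the given final concentration."""
    return final_ng_per_ml * gel_volume_ml / 1000


if __name__ == "__main__":
    print(agarose_grams(2, 100), "g agarose")
    print(etbr_micrograms(400, 100), "ug ethidium bromide")
```

For the 2% gel in 100 ml with 400 ng/ml ethidium bromide described above, this gives 2 g of agarose and 40 μg of ethidium bromide.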

[0127] The prepared DNA library may then be sequenced by known sequencing
techniques, including massively parallel sequencing of the fragment
library, preferably Solexa sequencing on the Illumina Genome Analyzer.
Other suitable sequencing platforms include 454 sequencing, SOLiD;
however these require a different library preparation protocol, which
protocols are well-known to those of skill in the art.

[0128] Using the GpC methyltransferase enzyme in accordance with the
methods and kits of the present invention enables the examination of both
nucleosome positioning and endogenous CpG methylation within the same DNA
molecule. In addition to making it possible to generate an integrated map
of DNA methylation and positioning of nucleosomes and other binding
proteins, the use of the GpC methyltransferase overcomes the limitations
of footprinting with a CpG methyltransferase, as there is no endogenous
GpC methylation, and GpC sites are comparatively more abundant in the
genome than CpG sites.
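
By way of illustration only, the two readouts described in paragraph
[0128] can be separated in silico by sequence context after bisulfite
conversion: a cytosine that still reads as C was methylated, and a
cytosine that reads as T was not. The sketch below is not part of the
disclosed method. It assumes a read already aligned to the top strand of
its reference without gaps, and it sets aside cytosines in a GCG context
as ambiguous, which is an added assumption. All names are hypothetical.

```python
def classify_cytosines(reference, read):
    """Call methylation at each reference C from an aligned bisulfite read.

    Returns a list of (position, context, methylated) tuples, where context is
    "GpC" (accessibility), "CpG" (endogenous methylation), "ambiguous" (GCG),
    or "other".
    """
    if len(reference) != len(read):
        raise ValueError("reference and read must be the same length")
    ref, rd = reference.upper(), read.upper()
    calls = []
    for i, base in enumerate(ref):
        if base != "C":
            continue
        if rd[i] == "C":
            methylated = True      # protected from conversion, so methylated
        elif rd[i] == "T":
            methylated = False     # converted, so unmethylated
        else:
            continue               # mismatch or no-call; skip
        is_gpc = i > 0 and ref[i - 1] == "G"
        is_cpg = i + 1 < len(ref) and ref[i + 1] == "G"
        if is_gpc and is_cpg:
            context = "ambiguous"  # ASSUMPTION: GCG sites are set aside
        elif is_gpc:
            context = "GpC"
        elif is_cpg:
            context = "CpG"
        else:
            context = "other"
        calls.append((i, context, methylated))
    return calls


for call in classify_cytosines("AGCATCGTACA", "AGCATTGTATA"):
    print(call)
```

In this toy example the methylated GpC indicates an accessible site, the
unmethylated CpG reports the endogenous state, and the remaining cytosine
serves as a conversion control.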

[0129] Using next-generation sequencing combined with the GpC footprinting
methodology as described herein, an integrated view of DNA methylation
and chromatin architecture across the entire genome will be generated.
Endogenous DNA methylation status will be obtained from the same regions
by examining methylation at CpG sites. Combining these data will give the
first genome-wide correlation of DNA methylation and nucleosome
positioning. Each region of the genome should be examined approximately
4-5 times to give sufficient coverage and ensure reliable and
meaningful conclusions.
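
The coverage figure in paragraph [0129] can be translated into an
approximate number of reads. The sketch below uses the standard relation
coverage = reads × read length / genome size. The genome size and read
length shown are illustrative assumptions, not values taken from this
disclosure, and reads that fail to align would reduce the effective
coverage.

```python
import math


def reads_needed(coverage, genome_size_bp, read_length_bp):
    """Number of reads required to reach a mean coverage, rounded up."""
    return math.ceil(coverage * genome_size_bp / read_length_bp)


def mean_coverage(n_reads, read_length_bp, genome_size_bp):
    """Mean coverage obtained from a given number of reads."""
    return n_reads * read_length_bp / genome_size_bp


# ASSUMPTIONS: a genome of about 3.1 Gb and 100-bp reads, for illustration only.
GENOME_SIZE_BP = 3_100_000_000
READ_LENGTH_BP = 100

for target in (4, 5):
    n = reads_needed(target, GENOME_SIZE_BP, READ_LENGTH_BP)
    print(f"{target}x coverage needs about {n:,} reads")
```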

[0130] The approach described herein is significantly better than
currently available methods that analyze DNA methylation and protein
binding together. Importantly, in the approach described here, the
nucleosome and binding protein assay is done in living cells, thus
providing an accurate, detailed picture of their chromatin.

[0131] In the technique disclosed herein, endogenous methylation is
obtained from the same DNA strand that is used for footprinting of
nucleosome and binding proteins thus making it possible to correlate
mono-allelic gene expression with specific chromatin structures. The
epigenetic landscape generated by the combined DNA methylation analysis
and nucleosome and binding protein footprint will have several important
implications for biology. The findings will provide valuable insight into
epigenetic changes that occur during a variety of diseases, including
cancer. This technique makes it possible to identify specific chromatin
structures that are correlated with particular disease states and
progression. Furthermore, this combined analysis can lead to the
identification of new drug targets and footprints can be generated as a
way to monitor a patient's response to treatment. The use of
single-molecule sequencing is particularly important for disease-related
changes. It allows the analysis of single nucleotide polymorphisms (SNPs),
which often predispose an individual to a disease. The presence of
specific SNPs can be correlated with a particular chromatin structure or
methylation level or pattern and the susceptibility to specific diseases.

[0132] Disclosed are four different basic protocols (FIG. 2) used in
connection with methods for the study of chromatin structure in purified
nuclei and remodeled reconstituted nucleosomes. Either purified nuclei or
remodeled mononucleosomes are treated with the CpG-specific DNA
methyltransferase SssI (M.SssI). They are presented herein, at least in
part, as guidance for those of ordinary skill in the art in adapting the
methods of the present invention to the use of various cells and
methyltransferase reagents, including GpC methyltransferase, usable in
connection with the present invention. Although certain of the protocols
describe protocols specific to M.SssI DNA methyltransferase, a CpG
methyltransferase, the protocols herein provide general procedures and
guidance for extending the protocols to other systems.

[0133] The first two basic protocols represent two different preparations
of starting material. If the goal is to study chromatin structure in vivo
then basic protocol 1 should be referred to. This protocol describes the
purification of nuclei followed by the treatment of the nuclei with the
M.SssI DNA methyltransferase to obtain a high resolution footprint. If
the objective is to study how a specific chromatin modifier affects
chromatin structure in vitro, then basic protocol 2 should be used. This
section describes how to perform the remodeling reaction followed by
treatment of the remodeled products with M.SssI. Basic protocol 3
presents two conventional bisulfite conversion methods and lists some
commercially available kits. Basic protocol 4 presents strategies for
primer design and PCR amplification, followed by recommended sequence
analysis programs.

[0134] Although these protocols are meant to work together to determine
nucleosomal DNA accessibility at, for instance, unmethylated CpG islands
or on reconstituted nucleosomes, the last two sections can also function
together as independent methods. Bisulfite conversion is a popular
technique used to study CpG methylation. PCR amplification of
the converted DNA is widely used after bisulfite conversion and can be
followed by sequencing, Ms-SNuPE (Gonzalgo and Jones, 1997; Gonzalgo and
Liang, 2007), and pyrosequencing (Tost et al., 2003) for the analysis of
endogenous DNA methylation.

[0135] Also described herein in Part II are methods for the study of
chromatin structure in purified nuclei and remodeled reconstituted
nucleosomes. Either purified nuclei or remodeled mononucleosomes are
treated with the CpG-specific DNA methyltransferase SssI (M.SssI),
followed by bisulfite sequencing of individual progeny DNA molecules
(FIG. 1; Fatemi et al., 2005; Gal-Yam et al., 2006; Lin et al., 2007;
Bouazoune et al., 2009). The basis for this method comes from the
observation that CpG sites within DNA are protected from methylation when
these sequences are wrapped around histones or tightly bound by
transcription factors. This method provides single molecule resolution
over a gene promoter or reconstituted nucleosomes under conditions in
which the physical linkage between nucleosomes and/or the tight binding
of transcription factors are maintained.

[0137] To date, most of the studies investigating nucleosome
rearrangements rely on DNA-cleaving reagents such as nucleases (Rando and
Chang, 2009). While very valuable, these approaches are
limited to analyzing average DNA accessibility. However, promoters are
molecular `modules`, which are controlled as individual entities. When
analyzed by conventional methodologies this modularity is destroyed.
Therefore we have modified a previously described footprinting strategy
(Kladde and Simpson, 1996; Kladde et al., 1996) such that it allows
studying the chromatin structure of individual molecules. This method can
be used to analyze unmethylated CpG islands in vivo by treatment of cell
nuclei with the M.SssI DNA methyltransferase followed by bisulfite
sequencing of individual progeny DNA molecules (Fatemi et al., 2005;
Gal-Yam et al., 2006; Lin et al., 2007). This single-molecule resolution
over the promoter allows for the physical linkage between binding sites
on individual promoter molecules to be maintained. Similarly, in
vitro-reconstituted nucleosomes can be probed for changes in nucleosomal
DNA accessibility after remodeling using M.SssI to circumvent the
limitations of conventional methods, which monitor the remodeled products
in bulk.

[0138] The in vivo method has been used successfully in mammalian cells to
compare nucleosome positioning at the p16 promoter in two cell lines
expressing the p16 gene at different levels (Fatemi et al., 2005), to
identify transcription factor binding sites and their combinatorial
organization during endoplasmic reticulum stress (Gal-Yam et al., 2006),
to study changes in nucleosome occupancy that are involved in the
silencing of three transcription start sites of the bidirectional MLH1
promoter in cancer cells (Lin et al., 2007), to study how methylation of
a 3' promoter-proximal region affects nucleosome positioning at the TATA
box (Appanah et al., 2007), and to correlate de novo methylation patterns
with nucleosome footprint at the p16 promoter (Hinshelwood et al., 2009).
Lastly, the in vitro approach has been used to reveal the heterogeneity
of the products created by hSWI/SNF compared to human ISWI-family
remodeling factors (Bouazoune et al., 2009).

SUMMARY

[0139] Methylation-sensitive single-molecule analysis of chromatin
structure is a high-resolution method developed for studying nucleosome
positioning. As described, this method allows for the analysis of
chromatin structure of unmethylated CpG islands or in vitro-remodeled
nucleosomes by treatment with the CpG-specific DNA methyltransferase SssI
(M.SssI), followed by bisulfite sequencing of individual progeny DNA
molecules. Unlike nuclease-based approaches, this method allows for each
molecule to be viewed as an individual entity instead of an average
population.

[0140] Basic Protocol 1. Treatment of Nuclei with M.SssI

[0141] This section first describes a method for purifying nuclei from
mammalian cells. Once the nuclei are isolated M.SssI is added to
methylate the DNA at CpG sites that are not protected by nucleosomes or
tightly bound transcription factors. Proteins are then degraded and
genomic DNA is purified.

[0152] 1. Trypsinize (APPENDIX 3F) exponentially growing cells and wash
cells once with cold phosphate-buffered saline (PBS). [0153] It is
recommended to start with at least 10^7 cells; however, this
procedure has been done successfully with 2×10^5 cells.

[0156] 3. Following the 10 min incubation, add 0.1 mL of 10% Nonidet P-40
(NP-40) detergent to the cells and homogenize with 15 strokes of the
tight pestle of a Dounce homogenizer. If fewer than 10^7 cells are
used, the cells can be lysed with NP-40 by pipetting up and down 15 times
with a pipette instead of using the Dounce homogenizer. Transfer
homogenized cells to a 1.5 mL Eppendorf tube and spin for 5 min at
800×g at 4° C. Discard supernatant.

[0157] 4. Resuspend nuclei in 1 mL of RSB buffer. At this time a small
aliquot can be checked for intact nuclei and complete lysis of the
cellular membrane under a microscope (FIG. 3). (note: Trypan blue can
also be used to visualize cell lysis under the microscope.) Centrifuge
samples for 5 min at 800×g at 4° C. Discard supernatant.

[0158] To remove tight binding transcription factors that may be
interfering with the nucleosome footprint, nuclei can be treated with RSB
buffer containing 400 mM NaCl for 2 min after the above Step 4. Nuclei
are then spun down at 800×g for 5 min and washed once with the
standard RSB buffer.

[0159] 5. Wash nuclei again with either RSB buffer or with 1×M.SssI
buffer. (It should be noted that epithelial nuclei tend to lyse during
centrifugation if washed with 1×M.SssI buffer, however fibroblast
nuclei stay intact with the 1×M.SssI buffer wash. Nuclei lysis is a
problem since the structure of the chromatin may not be maintained). Then
spin samples for 5 min at 800×g at 4° C. Discard
supernatant.

[0160] 6. Resuspend the nuclei in 1×M.SssI buffer so that there are
10^6 nuclei per 74.25 μL.

[0165] plus H2O to get to a 150 μl total volume

[0166] Incubate at 37° C. for 15 min.

[0167] A no-M.SssI control should also be included to measure endogenous
methylation patterns.

[0168] Adding 15 μL of 10× buffer results in a final concentration >1×;
however, the initial protocol was developed using 15 μL of buffer and
works well.

[0169] When the procedure is done with a small number of cells, the
amount of M.SssI used to treat the nuclei is adjusted proportionally,
while the reaction volume is kept at 150 μL.
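
The resuspension density of step 6 (10^6 nuclei per 74.25 μL) and the
proportional adjustment of enzyme described above lend themselves to a
short calculation. The following is a sketch under stated assumptions:
the function names are hypothetical, and the amount of M.SssI used for
10^6 nuclei is passed in as a parameter because it is not restated here.

```python
UL_PER_MILLION_NUCLEI = 74.25   # from step 6: 10^6 nuclei per 74.25 uL
REACTION_VOLUME_UL = 150.0      # reaction volume is held constant


def resuspension_volume_ul(n_nuclei):
    """Volume of 1x M.SssI buffer giving 10^6 nuclei per 74.25 uL."""
    return n_nuclei / 1_000_000 * UL_PER_MILLION_NUCLEI


def scaled_enzyme_units(n_nuclei, units_per_million_nuclei):
    """Scale the enzyme amount in proportion to the number of nuclei.

    units_per_million_nuclei is a user-supplied reference amount
    (hypothetical parameter; take the value from the protocol in use).
    """
    return units_per_million_nuclei * n_nuclei / 1_000_000


print(resuspension_volume_ul(2_000_000))                         # volume for 2 x 10^6 nuclei
print(scaled_enzyme_units(200_000, units_per_million_nuclei=50)) # 50 U is a placeholder value
print(f"Reaction volume stays at {REACTION_VOLUME_UL} uL")
```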

[0175] This section describes a method to monitor DNA accessibility on in
vitro-reconstituted nucleosomes before and after reactions with
nucleosome remodeling factors. This approach allows dissection of the
effect of a given (set of) purified protein(s) on nucleosomal DNA
accessibility on single molecules and can, in principle, be extended to
analyzing any factor acting on chromatin. In this section, the optimal
M.SssI concentration necessary to efficiently methylate a chosen
nucleosomal template is determined in conditions analogous to a
nucleosome remodeling reaction. Next, the nucleosome remodeling
conditions are optimized. Then remodeling of the nucleosomal template is
performed and the remodeled templates are methylated using the optimized
conditions. Finally, the remodeled products are gel-purified and
subjected to a bisulfite conversion procedure in order to map the sites
of methylation and infer changes in DNA accessibility.

[0193] Nucleosomes are quantified here based on their DNA. Lower amounts
of nucleosomes may also be used if the whole reaction is analyzed without
an electrophoretic purification step (as long as about 50 ng of DNA are
retrieved after the DNA precipitation step, see below). To avoid
non-specific binding of proteins to the tubes, it is recommended to use
low-retention tubes.

[0195] This will allow for the titration of up to 9 μL of studied
enzyme. If your enzyme is in a different buffer, add 9 μL of that
buffer. The final salt concentration should ideally be around 50 mM-75 mM
of monovalent salt, as higher salt will affect the methylation reaction.

[0196] 3. Add 2 μL of 20 mM MgCl2 (in NRB buffer) to each tube.

[0197] 4. Add 1.1 μL of 200 mM ADP to each tube.

[0198] This step is to mimic the remodeling reaction conditions. Omit it
if you are planning on analyzing chromatin-binding proteins that are not
ATP-dependent remodeling factors.

[0199] 5. Add 4 μL of a mix containing 0.125 μL of SAM (160 μM
final) and increasing amounts of M.SssI (e.g. add 0.125 μL (2.5 U) to
one tube; 0.25 μL (5 U) to another tube; 0.5 μL (10 U) to the
remaining tube) in NRB. A no-M.SssI control should be included consisting
of just 4 μL of NRB in the tube.

[0200] Note that the density of CpG dinucleotides varies between DNA
templates. For this reason, the optimal M.SssI concentration has to be
determined empirically for each template.

[0201] 6. Incubate at 37° C. for 15 min.

[0202] Samples may optionally be subjected to electrophoresis after step 6
and processed beginning at step 16 in the section below.

[0204] 7. Stop the reaction by adding an equal volume of phenol/chloroform
to perform a DNA extraction followed by ethanol precipitation (UNIT
2.1A).

[0205] TE buffer and phenol/chloroform may be added to render the aqueous
and organic phase volumes more amenable to manipulations. For example,
the volume of the reaction can be adjusted to 100 μL with TE
and 100 μL phenol/chloroform added accordingly to perform the
DNA extraction.
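
The M.SssI titration of step 5 can be tabulated as follows. This is a
minimal sketch: the stock activity of 20 U/μL is inferred from the
statement that 0.125 μL corresponds to 2.5 U, the NRB volume is computed
as the remainder of the 4 μL mix, and the function name is hypothetical.

```python
STOCK_U_PER_UL = 2.5 / 0.125   # 20 U/uL, implied by "0.125 uL (2.5 U)" in step 5
MIX_VOLUME_UL = 4.0            # each tube receives 4 uL of mix
SAM_UL = 0.125                 # SAM volume per mix


def titration_mix(enzyme_ul):
    """Composition of one 4 uL mix for a given M.SssI volume."""
    nrb_ul = MIX_VOLUME_UL - SAM_UL - enzyme_ul   # NRB makes up the remainder
    if nrb_ul < 0:
        raise ValueError("enzyme volume does not fit in the 4 uL mix")
    return {
        "M.SssI_uL": enzyme_ul,
        "M.SssI_U": enzyme_ul * STOCK_U_PER_UL,
        "SAM_uL": SAM_UL,
        "NRB_uL": nrb_ul,
    }


for volume in (0.125, 0.25, 0.5):
    print(titration_mix(volume))
# The no-M.SssI control is simply 4 uL of NRB, as stated in step 5.
```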

[0211] Titrations may first be performed over a very broad range (e.g.
between 50 ng and 2 μg of studied enzyme) and refined to obtain a
titration that ranges from little to complete change in nucleosome
electrophoretic mobility. Note that the latter case does not necessarily
mean that the end point of the reaction has been reached and it may just
represent a steady state.

[0213] Note that an additional 10 mM of MgCl2 is added in the
presence of ATP as it chelates Mg2+. The minus ATP control may be
carried out for only the highest concentration of remodeler once this
concentration has been determined.

[0214] 12. Incubate at 30° C. for 1 h.

[0215] 13. Add 1.1 μL of 200 mM ADP to inhibit the reaction and
incubate on ice for 10 min. [0216] The appropriate ADP:ATP ratio to
stop the reaction has to be determined empirically for each remodeler.

[0218] Use the optimal M.SssI concentration determined above in "M.SssI
treatment of in vitro-reconstituted nucleosomes"

[0219] 15. Incubate the reaction at 37° C. for 15 min. [0220]
Stop the reaction by adding phenol/chloroform and purify the DNA (UNIT
2.1A) if analysis of the whole reaction is to be performed. Otherwise,
proceed to step 16 to resolve nucleosome subpopulations.

[0221] 16. Add about 2-3 μg (in about 1-3 μL) of competitor
plasmid DNA (to compete the remodeler off of the nucleosomes) and
incubate on ice for 10 min.

[0222] Use a plasmid or a large DNA fragment that will not enter the gel
such that it will not interfere with the bands to be excised. Some
chromatin-binding proteins may require adding more competitor DNA.

[0223] Sample Resolution

[0224] 17. Load the samples onto a 4.5% PAA gel (UNIT 21.6, Support
Protocol 6) and run at 9-10 V/cm for about 2.5 hours.

[0225] Use a gel with large wells (e.g., 11-13 mm) as the reactions
contain a lot of DNA. The reactions also contain enough glycerol to be
loaded directly onto the gel. Pre-run the gel for 1 h and rinse wells
before loading samples. Include loading dyes such as Orange G and
bromophenol blue in one well to monitor the migration. A 100-bp DNA
Ladder (NEB) may be included.

[0226] 18. Disassemble the gel plates when the Orange G dye reaches the
bottom of the gel and carefully transfer the gel into a box containing
100 mL of de-ionized water. Add 0.5 μg/mL ethidium bromide and
incubate for 10 min.

[0227] The low percentage PAA gel can be more easily handled as a `roll`
by folding the sides of the gel twice towards the center.

[0228] 19. Briefly rinse the gel in a beaker containing de-ionized water
and lay the gel on top of a UV table covered with thin plastic wrap, and
visualize the bands to be excised using the 365 nm (lower energy)
wavelength lamp.

[0229] 20. Excise the bands of interest with a scalpel and transfer the
gel slices to individual tubes.

[0230] 21. Add 400 μL of TE per tube to elute the nucleosomes from the
gel overnight at 55° C.

[0234] Bisulfite Conversion can be done using different methods, two of
which are described below. The conventional method is described first,
while a more rapid method is detailed in the alternative protocol.

[0235] The following kits can be used in place of Basic Protocol 3. Most
of the bisulfite conversion methods are interchangeable; however, some
genomic regions will only be converted using a particular method. It is
unknown why some methods are better than others for some genomic regions,
so if one particular method does not work, the others should be tried.
The kits include: 1. EpiTect Bisulfite Conversion from Qiagen; 2. EZ
Methylation Kit from Zymo Research; 3. methylSEQr bisulfite conversion
kit from Applied Biosystems; 4. MethylCode Bisulfite Conversion Kit from
Invitrogen.

[0247] 1. Digest 2-4 μg of DNA with restriction enzymes in a total
volume of 20 μL.

[0248] Commonly used restriction enzymes are HindIII, BamHI and EcoRI.
Make sure to choose an enzyme which does not cleave the sequence you want
to amplify by PCR.

[0249] 2. Denature DNA at 90° C. for 20 min.

[0250] 3. Add 5 μL of 3M NaOH to the denatured DNA and incubate at
45° C. for 20 min. [0251] The 3M NaOH should be made fresh. NaOH reacts
with CO2 in the air over time, forming Na2CO3 and lowering the pH of the
solution.

[0252] 4. Make a 0.1M hydroquinone solution by adding 0.11 g of
hydroquinone to water with a final volume of 10 mL.

[0253] 5. Make a 3.6 M sodium bisulfite solution by adding 3.76 g of
sodium bisulfite to 8.5 mL of water. Then adjust the pH of the solution
to 5.0 with 3M NaOH (note: this takes approximately 1 mL of 3M NaOH).
Bring the final volume to 10 mL with water.

[0260] 9. Spin samples in a microcentrifuge at 14,000×g for 20 min
at 4° C.

[0261] 10. Discard supernatant and wash the pellet once with 70% ethanol.

[0262] 11. Allow pellet to dry and then resuspend the pellet in 40 μL
of water.

[0263] Bisulfite converted DNA can now be stored at -20° C. for at
least 1 year.
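
The masses given in steps 4 and 5 of this protocol can be checked against
the stated molarities with a one-line calculation. In the sketch below
the molecular weights are assumptions supplied for illustration
(hydroquinone, about 110.11 g/mol; sodium bisulfite taken as NaHSO3,
about 104.06 g/mol). Commercial sodium bisulfite may be a mixture, so the
supplier's formula weight should be used in practice.

```python
def molarity(mass_g, mol_weight_g_per_mol, final_volume_ml):
    """Molar concentration from mass, molecular weight and final volume."""
    moles = mass_g / mol_weight_g_per_mol
    return moles / (final_volume_ml / 1000.0)


# ASSUMED molecular weights; confirm against the reagent label.
MW_HYDROQUINONE = 110.11
MW_SODIUM_BISULFITE = 104.06

hq = molarity(0.11, MW_HYDROQUINONE, 10)       # step 4: 0.11 g in 10 mL
bis = molarity(3.76, MW_SODIUM_BISULFITE, 10)  # step 5: 3.76 g in 10 mL
print(f"hydroquinone: {hq:.2f} M, sodium bisulfite: {bis:.2f} M")
```

Under these assumptions the results agree with the stated 0.1 M and
3.6 M concentrations to within rounding.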

[0264] Alternate Protocol-Rapid Bisulfite Conversion

[0265] Basic protocol 2 works best if used with this bisulfite conversion
method. Since protocol 2 uses a uniform population of DNA molecules they
tend to easily anneal together after denaturation. This prevents
efficient conversion. By performing the bisulfite conversion at
90° C. the DNA stays denatured during the reaction. This method
was first described by Shiraishi and Hayatsu (2004).

[0278] 1. If being used with genomic DNA, digest 100 ng-2 μg of DNA with a
restriction enzyme in a total volume of 20 μL. If starting with DNA
from basic protocol 2 then dilute 10 ng-50 ng of DNA in a final volume of
20 μL.

[0279] Commonly used restriction enzymes are HindIII, BamHI and EcoRI.
Make sure to choose an enzyme which does not cleave the sequence you want
to amplify by PCR.

[0280] 2. Denature DNA at 90° C. for 20 min.

[0281] 3. Add 5 μL of 3M NaOH to the denatured DNA and incubate at
45° C. for 20 min. (The NaOH will help to further denature the
DNA). [0282] The 3M NaOH should be made fresh. NaOH reacts with CO2 in
the air over time, forming Na2CO3 and lowering the pH of
the solution.

[0283] 4. Meanwhile mix 2.08 g NaHSO3, 0.67 g
(NH4)2SO3.H2O and 5.0 mL of 50% (NH4)HSO3.
Then heat mixture at 90° C. to obtain a solution of pH 5.2-5.3
(This is the pH of the solution when it has cooled down to room
temperature. However, the solution should be added to sample when it is
at 90° C.). [0284] This solution should be made fresh.

[0295] PCR reactions are performed using bisulfite-specific primers. These
specific primers are designed so that they contain converted C's within
their sequence. These primers must not contain CpG sites in their
sequence as these sites will variably convert depending on their
methylation state.

[0296] 1. Design primers that are specific to bisulfite-converted DNA (See
FIG. 5) and encompass the region of interest. [0297] The sequence for
the forward primer should have all C's replaced by T's (if made from the
sense strand) and the reverse primer should have all the G's replaced by
A's (if made from the antisense strand). Neither primer sequence should
contain CpG sites. The primer set is more specific if one of the primers
ends in at least one converted C. Amplicons longer than 1 kb are inefficiently
amplified by PCR, likely due to breakage that occurs during the bisulfite
conversion. Optimal amplicons are approximately 600-bp.

[0298] 2. 1-2 μL of bisulfite converted DNA is usually used per PCR
reaction and each PCR is performed for 40 cycles when starting with
basic protocol 1 or 20 cycles when using basic protocol 2 (UNIT 15.1).

[0299] A Taq polymerase which adds 3'-A overhangs to the PCR product
should be used. This is necessary for cloning in the TOPO TA vector (see
step 3). In addition, PCR conditions will need to be optimized for each
primer set. For amplicons up to 700-bp, a 1 min extension time is usually
sufficient.

[0301] It is recommended that cloning is done immediately following PCR
amplification. Storage of PCR products prior to cloning results in the
loss of the A overhangs thereby decreasing cloning efficiency.

[0302] 4. Plasmid DNA can be amplified and/or purified by either minipreps
or TempliPhi (GE Healthcare) following the manufacturer's instructions.

[0303] 5. Sequence individual clones.
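
The primer rules of step 1 (all C's replaced by T's in the forward
primer, all G's replaced by A's in the reverse primer, and no CpG sites
in either) can be expressed as a short sketch. It targets the converted
sense strand only, takes the sense-strand sequence of each primer region
as input, and does not address melting temperature or other conventional
design rules. All function names are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def _require_no_cpg(region):
    # CpG sites convert variably depending on methylation, so they are rejected.
    if "CG" in region.upper():
        raise ValueError("primer region contains a CpG site; choose another region")


def forward_primer(sense_region):
    """Forward primer: sense-strand sequence with every C replaced by T."""
    _require_no_cpg(sense_region)
    return sense_region.upper().replace("C", "T")


def reverse_primer(sense_region):
    """Reverse primer: antisense sequence with every G replaced by A."""
    _require_no_cpg(sense_region)
    return reverse_complement(sense_region).replace("G", "A")


def forward_ends_in_converted_c(sense_region):
    """True if the forward primer's 3' end corresponds to a converted C."""
    return sense_region.upper().endswith("C")


print(forward_primer("ACCTGA"), reverse_primer("ACCTGA"))
```

The reverse primer produced this way is the reverse complement of the
converted sense strand, which is why G's, rather than C's, are replaced.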

[0304] Analysis of Sequences

[0305] Many programs can be used to analyze sequences from bisulfite
converted DNA. Two are listed below.

[0337] Filtered with Steriflip (Millipore) can be stored for at least 1
year at 4° C.

[0338] In vitro-reconstituted nucleosomes (see UNIT 21.6).

[0339] dialyzed against NRB

[0340] Nucleosome remodeling enzyme (see Methods Enzymol. 2004; 377).

[0341] in BC 100 buffer

[0342] BC buffer

[0343] 10% Glycerol

[0344] 20 mM HEPES, pH 7.9

[0345] 0.4 mM EDTA

[0346] (BC 100 is supplemented with 100 mM NaCl)

[0347] can be stored for at least 1 year at 4° C.

[0348] TE (TRIS-EDTA)

[0349] 10 mM TRIS pH 8.0

[0350] 1 mM EDTA

[0351] can be stored indefinitely 1 year at room temperature

Critical Parameters and Troubleshooting:

[0352] a. Nuclei Purification [0353] Some nuclei are more fragile than
others and may lyse especially during the high salt treatment (this can
be checked by looking at a small aliquot under the microscope). However,
lysis may be overcome by incubating the nuclei in the high salt buffer
for a couple of minutes and then diluting the sample 10-20 fold with RSB
buffer before centrifuging. Nuclei can also be spun at a lower speed for
a longer amount of time. In addition, nuclei can be resuspended in RSB
buffer containing 200 mM NaCl and then an equal volume of RSB buffer
containing 600 mM NaCl can be carefully added.

[0354] b. M.SssI Treatment

[0355] If the M.SssI concentration used is too low, methylation will be
sporadic and protections larger than 170-bp will be observed while high
concentrations of M.SssI will cause methylation within the
nucleosome-protected DNA (mainly at the entry/exit points of the
nucleosomes). Although both varying M.SssI concentration and time of
incubation may be used to obtain ideal nucleosome footprints, using a
higher M.SssI concentration for a relatively short time (i.e. 15-20 min)
appears to be better than using little enzyme for a longer time. Beware
of incubating too long with M.SssI, as chromatin structure may change over
time.

[0356] c. DNA Templates for In Vitro Remodeling Assay

[0357] It is recommended to use DNA sequences containing a high density of
CpG dinucleotides in order to obtain a high-resolution DNA accessibility
mapping. Since working with a homogeneous starting substrate facilitates
subsequent data analysis, it is also recommended to use DNA templates
containing nucleosome-positioning sequences (see commentary UNIT 21.6).

[0358] d. Conditions for Nucleosome Remodeling or Binding

[0359] Remodeling reactions need to be optimized. Partial remodeling may
result from both using insufficient amounts as well as a vast excess of
remodeler. Hence, the amount of protein that will produce a maximal
change in nucleosome electrophoretic mobility has to be determined
empirically. It will depend on many parameters such as the specific
activity of the tested protein (complex), the quality of the protein
preparation and the assay conditions (e.g. salt concentration, time and
temperature of incubation). The assay conditions may be changed, however
this may impact on methylation efficiency as the NRB was designed to be
similar to the 1×M.SssI buffer (NEB 2). Therefore optimization of
the methylation reaction would have to be repeated with the new
remodeling (or binding) conditions. Note that in this assay the
MgCl2 concentration was reduced compared to 1×NEB 2 buffer to
avoid nucleosome precipitation. Lastly, if you intend to analyze DNA
circles or plasmids assembled onto nucleosomes, it is noteworthy that
M.SssI has been reported to exhibit topoisomerase activity at MgCl2
concentrations above 3 mM (Renbaum et al., 1990).

[0360] e. Primer Design

[0361] In addition to conventional rules that apply to designing PCR
primers (see UNIT 15.1), it is important to make sure that primers are
designed to the converted sequence and do not contain CpG sites within
them. Make sure that at least one primer ends in a converted C. This will
make the primer more specific for the converted DNA. Primers should be
tested on unconverted DNA in order to make sure there is no
amplification.

[0362] f. PCR Amplification (see UNIT 15.1) [0363] Even when careful
consideration is taken to properly design primers, PCR amplification
might fail. It is important that every PCR is optimized for annealing
temperature. In addition, magnesium concentrations can be varied and/or
DMSO can be added to the reaction. Different Taq Polymerases can also be
tried. If all else fails, design new primers. Some primer pairs just
don't work well.

[0364] g. TA Cloning [0365] Sometimes many false positives may be
obtained after TA cloning. This can be due to primer dimers or other
non-specific PCR products formed during PCR amplification. If this is the
case, the PCR product can be gel-purified before cloning. (UNIT 2.6 or
Qiagen Gel Extraction kit).

[0366] h. DNA Sequencing Reveals Unconverted Sequences

[0367] Poorly converted amplicons will automatically be identified by the
BiQ Analyzer program. Proper conversion is defined by having at least 90% of
the Cs found in the amplicon which are not part of CpG sites converted to
Ts. If unconverted or partially converted DNA sequences are retrieved
then try a different bisulfite conversion method, as some methods are not
efficient at converting certain sequences. Alternatively, primers may
need to be redesigned.

[0368] i. DNA Sequences Appear to all have the Same Methylation Pattern

[0369] Caution should be taken to make sure that the results are not due
to the PCR amplification or sequencing of only a few strands of DNA. If
bisulfite-converted DNA is of poor quality or if low amounts of DNA are
being used as a template, then the PCR amplification will result in
amplification of only a few strands. This may be reflected by a weak PCR
product. In this case, the sequences obtained may all have the same
methylation pattern. The BiQ Analyzer allows for determining potential
duplicate sequences.
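
For readers who wish to check the two criteria in items h and i by hand,
the following sketch computes the fraction of non-CpG cytosines converted
to T and flags methylation patterns that occur more than once. It is an
illustration only and does not reproduce the BiQ Analyzer implementation.
It assumes a gap-free, top-strand alignment, and all names are
hypothetical.

```python
from collections import Counter


def conversion_rate(reference, read):
    """Fraction of reference Cs outside CpG sites that read as T."""
    if len(reference) != len(read):
        raise ValueError("reference and read must be the same length")
    ref, rd = reference.upper(), read.upper()
    converted = informative = 0
    for i, base in enumerate(ref):
        if base != "C" or ref[i + 1:i + 2] == "G":
            continue                      # skip non-C bases and CpG sites
        if rd[i] == "T":
            converted += 1
            informative += 1
        elif rd[i] == "C":
            informative += 1              # unconverted non-CpG cytosine
    if informative == 0:
        raise ValueError("no non-CpG cytosines to evaluate")
    return converted / informative


def is_properly_converted(reference, read, threshold=0.90):
    """Apply the 90% criterion stated in item h."""
    return conversion_rate(reference, read) >= threshold


def duplicate_patterns(patterns):
    """Return methylation patterns seen more than once, with their counts."""
    return {p: n for p, n in Counter(patterns).items() if n > 1}


print(conversion_rate("ACATCGACCA", "ATATCGATCA"))
print(duplicate_patterns(["101", "101", "010"]))
```

Identical patterns are not necessarily artifacts, so a flagged duplicate
is a prompt for closer inspection rather than a reason for exclusion.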

[0370] Anticipated Results

[0371] If using purified nuclei as a starting material, the number of
positive colonies obtained after TA cloning will vary relative to PCR
amplification efficiency (before cloning). This will vary from sequence
to sequence. After analysis of the sequencing data, protection patterns
of about 150-bp per nucleosome should be observed (FIG. 6). If
intermittent patterns are observed then the experiment did not work correctly
and may need to be re-optimized for amount of M.SssI used and incubation
time. Smaller protection patterns may be observed for tightly bound
transcription factors.

[0372] For the M.SssI treatment of in vitro-reconstituted nucleosomes
20-100 ng of DNA should be recovered from the gel slices (as measured by
NanoDrop). After the TA cloning 50-100 positive colonies should be
obtained. After sequencing, about 90% of the DNA molecules should show a
nucleosomal protection between 146 and 170-bp (FIG. 6).
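
The protection patterns described above can be located on an individual
sequenced molecule by finding runs of consecutive unmethylated CpG sites.
The sketch below is one possible way to do this, under stated
assumptions: the span is measured between the outermost unmethylated
sites of a run, which underestimates the protected region when CpG
density is low, and the 146-170 bp window is taken from paragraph [0372].
Function names are hypothetical.

```python
def protected_runs(positions, methylated):
    """Return (start, end) for each run of consecutive unmethylated sites.

    positions: sorted CpG coordinates on one molecule.
    methylated: parallel list of booleans (True means methylated by M.SssI).
    """
    runs = []
    start = end = None
    for pos, is_methylated in zip(positions, methylated):
        if is_methylated:
            if start is not None:
                runs.append((start, end))   # a methylated site closes the run
                start = end = None
        else:
            if start is None:
                start = pos
            end = pos
    if start is not None:
        runs.append((start, end))
    return runs


def span_bp(run):
    """Length in bp between the outermost unmethylated sites of a run."""
    return run[1] - run[0] + 1


def nucleosome_sized(run, min_bp=146, max_bp=170):
    """True if the run's span falls in the expected nucleosomal window."""
    return min_bp <= span_bp(run) <= max_bp


sites = [10, 30, 60, 120, 180, 200, 230]
calls = [True, False, False, False, False, True, True]
for run in protected_runs(sites, calls):
    print(run, span_bp(run), nucleosome_sized(run))
```

Shorter runs may correspond to tightly bound transcription factors, as
noted in paragraph [0371].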

[0373] Time Consideration

[0374] If starting with basic protocol 1, the whole procedure up to the
sequencing of clones should take 4 days. On day 1 nuclei isolation and
M.SssI treatment should be completed with the proteinase K digestion
allowed to proceed overnight. On the second day the DNA can be purified
and the bisulfite conversion completed (if the conventional bisulfite
conversion method is used then this reaction can be allowed to proceed
overnight). PCR amplification and TA cloning can be completed on day 3
with the transformed colonies being allowed to grow overnight on LB
plates containing the correct selective antibiotic. On the fourth day
colonies can be screened and submitted for sequencing.

[0375] If starting with protocol 2, the whole procedure should take about
5 days (not including sequencing time). Since polymerization of the 4.5%
PAA gel takes about 1 h, it is better to pour the gel early during the
day or the day before doing the experiment (and keep the gel damp at
4° C.). Since the pre-run of the gel takes about 1 h, it can be
started before setting up the remodeling reactions. Depending on the
number of samples and the number of bands to be excised, the whole
remodeling procedure may take about 5 h to 6 h. Together with the
overnight nucleosome gel-elution and the DNA extraction and
precipitation, count 2 days of work before subjecting the DNAs to the
bisulfite conversion, PCR and cloning.

Hypomethylation of a Line-1 Promoter Activates an Alternate Transcript of
the MET Oncogene in Bladders with Cancer

Introductory Remarks

[0407] Aberrant DNA methylation is involved in the initiation and
progression of carcinogenesis and includes both hypermethylation of CpG
islands at gene promoters and global hypomethylation. While a small
portion of hypomethylation occurs at gene promoters, resulting in
overexpression of certain oncogenes [1,2], the majority occurs at
repetitive elements, such as long interspersed nuclear elements (LINE-1s
or L1s) [3]. Since most of the 500,000 copies of L1 have become
nonfunctional over the course of human evolution [4] and can no longer
transpose, genome-wide hypomethylation at L1s during tumorigenesis is
thought to contribute mainly to chromosomal instability [5]. In mice
hypomethylation of transposable elements can lead to disruption of normal
gene function [6]. Viable yellow agouti (Avy) mice have a
retrotransposon inserted into one allele of the agouti locus and when
this retrotransposon is hypomethylated, which can occur in utero by
limiting the maternal intake of methyl donors, it acts as an alternate
promoter for agouti. Ectopic induction of the agouti gene results in
altered coat color, obesity, and an increased incidence of tumors [6].
While it is well known that repetitive elements are hypomethylated in
cancer, it has never been directly demonstrated that hypomethylation of a
retrotransposon leads to ectopic gene expression in humans.

[0408] A recent study has revealed that more than 30% of transcription
start sites in the human genome are located within repetitive elements,
with just over 7% in L1s [7]. A full length L1 sequence (6 Kb) has a
sense promoter driving transcription of its two open reading frames and
an antisense promoter driving transcription in the opposite direction
that can act as an alternate promoter for surrounding genes [8-10].
Almost 500 of these retrotransposons can induce ectopic gene expression
in embryonic and cancerous tissues, revealing their potential role during
both development and tumorigenesis [7]. However this study did not
address the potential mechanism of how repetitive elements become
transcriptionally active. Since the L1 promoter is a CpG island and
methylated in normal somatic tissues it seems likely that epigenetic
mechanisms are involved in its transcriptional silencing. There are many
layers of epigenetic regulation responsible for regulating expression of
single copy genes, including DNA methylation, histone modifications, and
nucleosome occupancy [11]. While it is known that unmethylated
retrotransposons in Arabidopsis [12] acquire the active histone variant
H2A.Z, the chromatin structure in humans of repetitive elements,
particularly active ones, has been largely ignored.

[0409] Until recently it has not been possible to study the promoters of
individual L1s since the sequences are too similar to design primers for
one particular locus [13-15]. Therefore a direct correlation between the
epigenetic status of a specific L1 and expression of its associated
transcript has not been possible. For the first time to our knowledge, we
have elucidated the role of epigenetics in the transcriptional activity
of L1s by utilizing novel assays capable of examining the methylation
status and chromatin structure of specific L1s and expression of alternate
transcripts originating from the L1 promoters. In addition to L1s being
hypomethylated and transcriptionally active in bladder tumors we also
found that a specific L1 located within the MET oncogene is active across
entire bladders with cancer. The clinical implication of our finding is
that surgical excision of the tumor would leave behind large areas of the
bladder that remain epigenetically altered and express a potential
oncogene. We also provide evidence that an active L1 acquires H2A.Z and
nucleosome free regions upstream of TSSs, which has only been described
previously at single copy genes, and undergoes chromatin remodeling from
an inactive tetranucleosomal structure to an active dinucleosomal
structure.

Discussion of Certain Results

[0410] The consequences of global hypomethylation at repetitive elements
in cancer have long been the subject of speculation regarding the
generation of genomic instability and potential activation of oncogenes.
While hypomethylation during tumorigenesis occurs quite frequently, a
direct demonstration of the impact of hypomethylation of repetitive
elements on gene expression has not been conducted. Using several
specific L1s we have demonstrated the mechanism of transcriptional
activation and, taken together with the results of Faulkner et al. [7],
our results highlight the previously underappreciated impact of
hypomethylation on ectopic gene expression, possibly contributing to
tumorigenesis in a synergistic or cooperative manner (see model in FIG.
13).

[0411] To elucidate the mechanism of transcriptional activation of
repetitive elements, we compare the epigenetic alterations, including
methylation status, histone modifications, and nucleosome positioning,
that occur at a single copy of an L1 between a transcriptionally inactive
and active state. Since current methods did not exist for such a study we
employ several novel assays, including using primers able to amplify
specific L1s, enabling methylation and ChIP assays to be performed on
single copies, and a modification of the method for determining
nucleosome positioning at a single molecule resolution, which allowed for
the determination of nucleosome positioning in a methylated region. We
were able to show that transcription from the L1 promoter is silenced by
DNA methylation, providing direct evidence that one function of DNA
methylation is to protect the human genome from retrotransposons.

[0412] Transcriptional activation of L1 promoters by hypomethylation
results in a chromatin structure similar to that of active single copy
genes such as p16, revealing that the features of active promoters, such
as acquisition of active histone marks, H2A.Z, and nucleosome free
regions upstream of TSSs, are not restricted to canonical gene promoters.
In addition, we found that the unique structure of the L1 promoter
results in two very stable nucleosome occupancy states, the inactive
tetranucleosome structure and the active dinucleosome structure, and that
hypomethylation could result in a switch between the two. It has been
demonstrated that tetranucleosomes form a compact chromatin fiber [37].
Therefore, the widespread chromatin remodeling due to global
hypomethylation of L1 promoters could contribute to chromosomal
instability through the loss of many stabilizing tetranucleosome
structures.

[0413] To our knowledge we have provided the first direct evidence that
transcriptional activation of repetitive elements is caused by
hypomethylation and chromatin remodeling at their promoters, occurs in a
human diseased state, and may play a role in disease predisposition.
Specifically, hypomethylation of a L1 promoter induces an alternate
transcript of the MET oncogene in bladder tumors and across the entire
urothelium of tumor-bearing bladders. The presence of L1-MET
hypomethylation across the entire urothelium of tumor-bearing bladders
has several possible explanations. Epigenetic alterations such as
hypermethylation of tumor suppressor genes and hypomethylation of L1s
have been found in normal epithelia adjacent to several types of tumors,
including breast [38], esophageal [39], and colon [40,41], indicating the
presence of a "field defect". Our data supports the presence of an
epigenetic field defect in bladders with cancer, either due to
independent events across the urothelium or clonal expansion [42].
However, another possible explanation is that the loss of L1-MET
methylation occurred during early development before the bladder was
fully formed. While some evidence for such abnormal epigenetic
programming exists, as a recent study revealed that people who develop
bladder cancer have slightly lower levels of global DNA methylation in
their blood than healthy control cases [43], we did not find any evidence
of a loss of methylation at global L1s or specific L1s in our patient WBC
samples (FIG. 22). Another possibility, which cannot be ruled out by this
data, is that the presence of a tumor causes epigenetic changes across
the bladder.

[0414] Whatever the underlying mechanism, the modulation of gene
expression by hypomethylation of a retrotransposon such as what occurs at
the agouti locus in mice is also found in humans. This leads to the
activation of surrounding genes, which may contribute to tumorigenesis in
a synergistic or cooperative manner. Transurethral resection of bladder
tumors would leave behind large areas of epigenetically altered
urothelium, possibly contributing to the high level of recurrence of
bladder cancer. Fortunately, hypomethylation at specific L1s seems to
provide a valuable biomarker that has the potential to significantly
impact the diagnosis and treatment of bladder cancer.

Results Include:

[0415] Hypomethylation of specific L1s correlates with expression of
alternate gene transcripts. To elucidate the mechanism of transcriptional
activation of repetitive elements we used the sequence of the functional
promoter of L1s to identify specific promoters potentially capable of
expressing alternate transcripts of host genes. FIG. 14 contains the
genomic locations of the L1s, all of which are in an antisense
orientation to the host gene allowing for transcripts in sense
orientation to the gene's coding sequence. Interestingly, most of these ESTs
are from tumor cells. One such L1 is located within the MET oncogene
(L1-MET) [8]. Since MET is known to be overexpressed in bladder cancer
[16-18], we characterized two L1-MET transcripts by sequencing EST clones
obtained from a bladder carcinoma cell line (GenBank accession no.
BF208095) and placenta (BX334980). Both transcripts have start sites
located in the L1 promoter, share the same reading frame as MET (FIG.
15A), and when transiently transfected into Hela cells result in
expression of truncated MET proteins (FIG. 15B). Several truncated forms
of the tyrosine kinase MET, which is the hepatocyte growth factor (HGF)
receptor, are constitutively active and promote invasion and migration
through activation of a variety of signal transduction pathways in
numerous types of carcinomas, including breast, prostate, colorectal, and
lung, in musculoskeletal sarcomas, and also in haematopoietic
malignancies [19,20]. Therefore hypomethylation of L1-MET could lead to
expression of a transcript that encodes a truncated and potentially
constitutively active MET protein.

[0416] To examine the methylation status at a specific L1 we designed
bisulfite-specific PCR primers with one located in the L1 promoter and
the other in the surrounding intronic region of the host gene (FIG. 7A).
The L1-MET promoter was highly methylated in normal cells and tissues,
whereas 18 out of 20 of the bladder carcinoma cell lines showed
significant hypomethylation (p<3.4×10^-10) (FIG. 7B). We
also measured methylation of global L1s using the standard assay with two
primers that anneal within the L1 promoter (FIG. 7A). We found that
hypomethylation of L1s was significant (p<6.4×10^-5) but not
as dramatic as L1-MET hypomethylation and that the methylation pattern
can be quite different between global L1s and a specific L1, such as in
the cell lines LD137, T24, and RT4 (FIG. 7B). This result clearly shows
that global L1 status does not represent the status at specific L1s.

[0417] The transcript from the L1-MET anti-sense promoter contains its own
exons 1 and 2, referred to as L1-MET exon 1 and L1-MET exon 2 (FIG. 7A).
We designed RT-PCR primers with one primer located in either the MET exon
2 or the L1-MET exon 1 and one primer located in the shared exon 3 to
examine the expression of the host gene MET and the alternate transcript
from L1-MET, respectively (FIG. 7A). We confirmed the transcription start
site of L1-MET by 5'RACE in the T24 bladder carcinoma cell line (FIG.
15C) in which the L1-MET promoter is completely unmethylated. The L1-MET
transcript was expressed at low levels in one bladder fibroblast cell line
(LD419) and two non-tumorigenic urothelial cell lines, UROtsa [21] and
NK2426 [22], and highly expressed in most bladder carcinoma cell lines
(FIG. 7C). L1-MET was also not expressed in normal tissues except for
placenta (data not shown). Therefore L1-MET hypomethylation correlated
with the expression of the alternate transcript (FIG. 7C). Treatment of
LD419 with the demethylating agent 5-aza-deoxycytidine led to expression
of L1-MET, suggesting that L1-MET is silenced by DNA methylation (FIG.
15D). We also designed bisulfite-specific PCR primers and RT-PCR primers
for two additional specific L1s from the list shown in FIG. 14, which
were randomly selected. One L1 was located within ACVR1c, a member of the
TGF-Beta family able to induce apoptosis [23], and the other located in
RAB3IP, a protein whose exact function is unknown (FIGS. 16&17).
Hypomethylation of these specific L1s also correlated with expression of
their associated alternate transcripts, suggesting that DNA methylation
plays a role in transcriptional silencing of functional L1 promoters in
general (FIGS. 16&17).

[0418] DNA methylation silences the L1-MET promoter. The data presented
thus far represents an association between hypomethylation of an L1
promoter and ectopic expression of an alternate transcript. To directly
demonstrate that DNA methylation represses transcription of the
bidirectional L1 promoter we utilized a luciferase promoter activity
assay with a pCpGL luciferase reporter construct that has been modified
to not contain any CpG sites [24]. Therefore, after insertion of the
promoter sequence of interest the plasmid can be treated with the CpG
methyltransferase M. SssI and the methyl donor S-adenosyl-methionine
(SAM), allowing the promoter to be methylated without affecting the
plasmid backbone. We created two plasmids, differing only in the
orientation of the L1-MET promoter, allowing us to measure either the L1
transcriptional activity or the L1-MET transcriptional activity
(FIG. 8A). Activity in both directions was inhibited in the methylated
plasmid (FIG. 8B). To our knowledge these data show for the first time
that DNA methylation directly suppresses transcription from the L1 promoter
in both directions, indicating that the ectopic transcripts from L1s found
in cancer [7] are a result of L1 hypomethylation. The relative activity
between the two different promoters indicates that the L1-MET promoter is
much weaker than the L1 promoter.

[0419] Chromatin remodeling accompanies transcriptional activation of L1
promoters. In addition to DNA methylation, epigenetic regulation of gene
transcription also involves chromatin structure, specifically covalent
modifications of histones, incorporation of histone variants, and
nucleosome occupancy. In mice the chromatin structure of global L1s has
been studied, but not in the promoter region [25]. Very few studies have
addressed the chromatin structure at repetitive elements in humans. We
took advantage of our ability to examine specific L1s to analyze the
chromatin remodeling that occurs between the promoters of inactive and
active repetitive elements in humans. Using chromatin immunoprecipitation
(ChIP) we found that the level of DNA methylation at each specific L1 is
inversely proportional to the level of enrichment of active histone marks
(FIGS. 9A & 18), and the chromatin structure at global L1s did not
correlate with the specific L1s. Comparing the structure of the
unmethylated L1-MET promoter in T24 bladder carcinoma cells to the
methylated L1-MET promoter in UROtsa urothelial cells revealed a gain of
the active marks H3K4me3 and acetylated H3 and the histone variant H2A.Z
(FIG. 9A). Therefore transcriptional activation of a repetitive element
results in a similar pattern of chromatin remodeling found in active
single copy genes such as p16 (FIG. 9A) [12,26,27].

[0420] A switch from a tetranucleosome to dinucleosome structure
accompanies transcriptional activation of the L1-MET promoter.
Methylase-sensitive Single Promoter Analysis (M-SPA) has previously been
used to obtain single molecule resolution of nucleosome positioning at
unmethylated CpG island promoters [28]. Briefly, nuclei are isolated and
treated with the CpG methyltransferase M. SssI, followed by DNA
extraction, bisulfite conversion, and genomic sequencing of individual
clones. The resulting pattern of applied DNA methylation reveals patches
of protection, indicating the location of nucleosomes on individual
molecules. Previously, the main limitation of the M-SPA method was that
it could not be used to assess nucleosome positioning in an endogenously
methylated region. However, the enzyme M. CviPI, which methylates GpC
sites [29], can be used to avoid this problem since endogenous GpC sites
are not methylated in humans except in the context of a GpCpG. Therefore,
by modifying our M-SPA method by using a GpC methyltransferase we have
conducted the first single molecule analysis of nucleosome positioning at
a methylated promoter and, in combination with our ability to study
specific L1s, have shown the nucleosome occupancy at a single repetitive
element in both an active and inactive state.
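
The reading of "patches of protection" on a single sequenced clone can be
illustrated with a minimal sketch. It assumes each clone has already been
reduced to a list of GpC positions with a methylated/unmethylated call
(GpCpG sites removed), and it treats a run of unmethylated GpC sites
spanning at least 147 bp as nucleosome-sized. The 147 bp cutoff and the
function name are assumptions made for illustration; they are not values
or names taken from this disclosure.

```python
from typing import List, Tuple


def protected_patches(sites: List[Tuple[int, bool]],
                      min_len: int = 147) -> List[Tuple[int, int]]:
    """Return (start, end) spans of consecutive unmethylated GpC sites.

    sites:   (position, methylated) pairs for one sequenced molecule.
    min_len: assumed minimum span in bp for a nucleosome-sized patch
             (hypothetical cutoff, not stated in the source).
    """
    patches = []
    run_start = None   # position of the first unmethylated site in the run
    prev_pos = None    # position of the latest unmethylated site in the run
    for pos, methylated in sorted(sites):
        if not methylated:
            if run_start is None:
                run_start = pos
            prev_pos = pos
        else:
            # A methylated (accessible) site closes the current run.
            if run_start is not None and prev_pos - run_start >= min_len:
                patches.append((run_start, prev_pos))
            run_start = None
    # Close a run that extends to the end of the molecule.
    if run_start is not None and prev_pos - run_start >= min_len:
        patches.append((run_start, prev_pos))
    return patches
```

A span here runs from the first to the last unmethylated site in a run, so
it is a conservative estimate of the protected region; a stricter analysis
might instead measure the distance between the flanking methylated sites.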

[0421] The endogenously methylated L1-MET promoter in the UROtsa
immortalized urothelial cell line was completely occupied by nucleosomes,
revealing that the methylated L1-MET promoter exists in a
tetranucleosomal structure (FIG. 9B). GpCpG sites were excluded from
analysis since it is not possible to distinguish between endogenous CpG
methylation and enzyme-induced GpC methylation at such loci. When we
performed the same assay on T24 cells in which L1-MET is unmethylated we
found a nucleosome occupying the region downstream of each of the two
transcription start sites and no nucleosome upstream of either (FIG. 9C).
We were able to confirm the results in T24 cells using the CpG
methyltransferase M. SssI, since L1-MET was not endogenously methylated
(FIG. 19). However, the number and location of CpG sites limits the
resolution of this assay since the region upstream of the L1-MET start
site contains only one CpG site. Therefore, the GpC methyltransferase
allowed an increased resolution for this method. The unmethylated MLH1
promoter was used as a positive control for both CpG and GpC
methyltransferase activity and accessibility (data not shown).
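
The exclusion of GpCpG sites can be made concrete with a short sketch that
labels each cytosine on one strand by its neighbors: a C preceded by G but
not followed by G reports only enzyme-applied GpC methylation, a C
followed by G but not preceded by G reports only endogenous CpG
methylation, and a C flanked by G on both sides is ambiguous. The function
name and label strings are hypothetical, chosen only for this
illustration.

```python
def classify_cytosines(seq: str) -> dict:
    """Label interior cytosines of `seq` by sequence context.

    Returns {index: label} where label is one of
    "GpC" (GpC only), "CpG" (CpG only), "GpCpG" (ambiguous) or "other".
    Cytosines at the first or last position are skipped because one
    neighbor is unknown.
    """
    seq = seq.upper()
    labels = {}
    for i in range(1, len(seq) - 1):
        if seq[i] != "C":
            continue
        prev_g = seq[i - 1] == "G"
        next_g = seq[i + 1] == "G"
        if prev_g and next_g:
            labels[i] = "GpCpG"   # excluded from analysis
        elif prev_g:
            labels[i] = "GpC"     # reports M. CviPI accessibility
        elif next_g:
            labels[i] = "CpG"     # reports endogenous methylation
        else:
            labels[i] = "other"
    return labels
```

For the toy sequence AGCATCGAGCGT this labels position 2 as GpC, position
5 as CpG, and position 9 as GpCpG, so only the first two would be kept.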

[0422] Previous work on the MLH1 bidirectional promoter has demonstrated
that while each transcription start site loses the nucleosome directly
upstream when active (-1 nucleosome), the nucleosome directly downstream
is always maintained (+1 nucleosome) [27,30]. The L1 promoter is a
different type of bidirectional promoter that generates partially
overlapping sense and antisense transcripts, commonly referred to as an
antisense promoter (ASP). The L1 ASP has room for two nucleosomes between
the two transcription start sites, therefore each start site has its own
+1 nucleosome. These two +1 nucleosomes are maintained while the active
promoter loses the -1 nucleosome at both start sites. Therefore the
inactive L1 promoter exists in a tetranucleosomal state (two +1 and two
-1 nucleosomes) while the active promoter exists in a dinucleosomal state
(two +1 nucleosomes). In addition, when DNA methylation levels are
reduced by knocking out expression of 2 of the 3 methyltransferases
responsible for maintaining DNA methylation, DNMT1 and DNMT3B [31,32], we
see acquisition of H2A.Z at L1-MET and global L1s (FIG. 9D) along with
induction of expression of L1-MET (data not shown) and nucleosome
eviction at the L1-MET promoter (FIGS. 9E&F), revealing that a switch
from an inactive tetranucleosomal structure to an active dinucleosomal
structure accompanies hypomethylation.

[0423] Many L1 promoters exist in an active chromatin structure. While a
single-molecule analysis of the nucleosome occupancy at the L1-MET
promoter confirmed that an active L1 promoter switches from a
tetranucleosomal structure to a dinucleosomal structure, we cannot
generalize that other L1s exist in these states. To do so we took a
cancer cell line that has a methylated and inactive L1-MET promoter, the
colon cancer cell line HCT116, and performed chromatin fractionation
using MNase digestion followed by sucrose gradient ultracentrifugation
[33]. The fractions were run on an agarose gel and a genomic Southern
using radioactively labeled input DNA was performed. Most of the DNA was
present in the mononucleosome and dinucleosome fractions (FIG. 10). When
the same blot was probed with the L1 promoter sequence, the distribution
of global L1 promoters showed enrichment in both the dinucleosome and
tetranucleosome fractions, indicating that other L1s besides L1-MET could
exist in an inactive tetranucleosome or active dinucleosome structure
(FIG. 10).

[0424] Hypomethylation of and expression from specific L1s occurs in
bladder tumors. Since bladder tumors display both hypomethylation of L1s
[34] and overexpression of MET [16-18], our next step was to determine
whether hypomethylation of the specific L1 promoters and their associated
alternate transcripts, including L1-MET, were present in uncultured
bladder tumors. We found high levels of methylation at L1-MET and low
expression in normal bladder epithelium obtained from age-matched cancer
free bladders (FIGS. 11A&B) and significant hypomethylation of, and
expression from, L1-MET in bladder tumors (FIGS. 11A&B). We also examined
the methylation and expression of two additional specific L1 promoters
located within host genes (FIG. 20). Hypomethylation of the L1-ACVR1c and
L1-RAB3IP promoters occurred in bladder tumors (FIG. 20). Therefore we
have provided the first clinical evidence that hypomethylation of
functional L1 promoters results in ectopic gene expression during
tumorigenesis.

[0425] Surprisingly, we also found hypomethylation and associated
alternate expression of L1-MET in the corresponding histologically normal
tissues from tumor-bearing bladders taken at least 5 cm away from the
tumor (p<0.0001) (FIGS. 11A&B). Hypomethylation and expression of
L1-MET was more prevalent in the corresponding normal tissues than that of
L1-ACVR1c or L1-RAB3IP (FIG. 20) [35]. Therefore, hypomethylation of L1-MET
and activation of alternate transcripts of MET occurs not only during
tumorigenesis but also in premalignant tissue. Receiver operating
characteristic (ROC) curves for L1-MET revealed an extraordinary degree
of both sensitivity and specificity for detecting bladder tumors (AUC of
0.97) and premalignant tissue (AUCs of 0.89) (FIG. 21). Since aberrant
methylation in bladder tumors can be detected in urine sediments [36] and
we are able to detect hypomethylation of L1-MET in urine sediments of
bladder cancer patients (FIG. 22), a noninvasive urine test has the
potential to be developed into an assay for tumor detection and
prediction of high-risk patients.
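
For readers unfamiliar with the AUC figure, the sketch below computes the
area under a ROC curve as the probability that a randomly chosen case
scores higher than a randomly chosen control, with ties counted as one
half. It assumes the score is defined so that larger values indicate
disease (for a hypomethylation marker, for example, 100 minus percent
methylation). The function name is hypothetical and any numbers passed to
it in examples are illustrative, not data from this study.

```python
def roc_auc(case_scores, control_scores):
    """Area under the ROC curve from raw scores.

    Equivalent to the fraction of (case, control) pairs in which the case
    has the higher score; tied pairs contribute 0.5.
    """
    if not case_scores or not control_scores:
        raise ValueError("both groups must be non-empty")
    wins = 0.0
    for c in case_scores:
        for n in control_scores:
            if c > n:
                wins += 1.0
            elif c == n:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))
```

An AUC of 1.0 means every case outranks every control, and 0.5 means the
marker does no better than chance.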

[0426] As expected, the expression of the host gene MET was not correlated
with hypomethylation of the L1-MET promoter, since the expression of MET
is regulated by its endogenous promoter and not by the specific L1
promoter (FIGS. 11A&C). It has previously been shown that overexpression
of MET is correlated with global L1 hypomethylation in chronic myeloid
leukemia (CML) [14]. The biological mechanism behind this correlation is
unclear, as MET is expressed from an entirely different promoter than
L1-MET and we have shown that global L1 methylation does not correlate
with specific L1 methylation. Further, we did not find overexpression of
MET in bladder tumors, suggesting that it may be L1-MET that is
overexpressed instead since many primers used to detect expression can
amplify both products.

[0427] Hypomethylation and expression of L1-MET occurs across the
urothelium of tumor-bearing bladders. Since we observed hypomethylation
at L1-MET in bladder tissues taken at least 5 cm from tumors we collected
histologically normal tissue samples from five tumor-bearing bladders
taken at various distances and directions from the tumors to determine
whether distance has any effect on the level of hypomethylation (FIG.
12A). When compared to the average level of methylation in normal tissues
from cancer-free bladders, L1-MET was dramatically hypomethylated in
normal-appearing tissues across each of the tumor-bearing bladders
independent of the distance from the site of the tumor (FIG. 12B).
However the normal-appearing tissues were not significantly
hypomethylated at L1-ACVR1c, L1-RAB3IP, and global L1 (FIGS. 23 & 12C).
Bisulfite sequencing of L1-MET in the urothelium of patients
without bladder cancer revealed only fully methylated strands while in a
patient with bladder cancer fully unmethylated strands were present in
the tumor and the corresponding normal urothelial tissue independent of
the distance from the tumor (FIGS. 12D & 24). A plot of the
distribution of DNA strands versus the percent of methylated sites
reveals a biphasic distribution in the patient with bladder cancer, with
the majority of strands either fully methylated or fully unmethylated
(FIG. 24). Our in vitro results (FIGS. 8&9) suggest that these fully
unmethylated strands found in tumor-bearing bladders have undergone
chromatin remodeling involving a switch from a tetranucleosome to a
dinucleosome structure and are transcriptionally active. To our knowledge
this is the first alteration, either epigenetic or genetic, that has been
found across an entire tumor-bearing organ.
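
The strand-distribution plot described above can be approximated as
follows. The sketch assumes each sequenced clone is encoded as a string
with one character per CpG site ('1' methylated, '0' unmethylated) and
counts clones per percent-methylation bin. The encoding, the 20% bin width
and the function name are assumptions made for illustration.

```python
from collections import Counter


def strand_methylation_histogram(strands, bin_width=20):
    """Count clones per percent-methylation bin.

    strands:   iterable of strings, one per clone; '1' = methylated site,
               '0' = unmethylated site (hypothetical encoding).
    bin_width: width of each bin in percent; 100% falls in the top bin.
    Returns {bin_lower_edge: clone_count} sorted by bin.
    """
    counts = Counter()
    for s in strands:
        pct = 100.0 * s.count("1") / len(s)
        low = min(int(pct // bin_width) * bin_width, 100 - bin_width)
        counts[low] += 1
    return dict(sorted(counts.items()))
```

A biphasic sample would show most clones in the lowest and highest bins
and few in between.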

Materials and Methods

[0428] Cell Lines. The non-tumorigenic human urothelial cell lines UROtsa
and NK2426 and the normal fibroblast cell line LD419 have been described
previously [21, 22, 36]. Human bladder carcinoma cell lines were obtained
commercially (T24, J82, HT1376, SCaBER, UM-UC-3, TCCSUP, and RT4;
American Type Culture Collection, Manassas, Va.) or derived in our
laboratory (prefix LD). Cell culture, DNA and RNA purification were
performed as previously described [36]. RNA was reverse-transcribed as
previously described [36]. 5'-Rapid Amplification of cDNA Ends (RACE) to
determine the 5' end of the primary transcript of L1-MET was performed
using the RLM-RACE kit (Ambion) according to the manufacturer's
instruction. See Table 1 for primer sequences.

[0429] Tissue Collection. Tumor tissue samples were collected from the
patients undergoing cystectomy or TURBT for bladder cancer. Normal
bladder epithelium was obtained from 12 patients undergoing radical
prostatectomy for prostate cancer (aged from 50 to 80) and 7 autopsy
patients (aged from 34 to 82, 5 of whom were from non-cancer related
deaths and 2 from deaths due to cancers other than bladder). All of these
collections took place at Norris Cancer Hospital in IRB-approved
protocols with patients' consent. Hematoxylin and eosin (H&E) sections
marked with the location of the adjacent urothelium or tumor were used to
guide in microdissection. DNA was bisulfite treated as previously
described [44]. RNA extraction was done using an RNeasy Micro Kit
(Qiagen, Crawley, UK).

[0430] Quantitation of DNA Methylation. Methylation-sensitive single
nucleotide primer extension (MS-SNuPE) was performed as previously
described [44]. See Table 1 for primer sequences. In order to allow for a
higher throughput in methylation analysis pyrosequencing was also
performed as described previously [45]. Testing both methods on the same
set of 66 samples yielded a correlation in the methylation levels of
R=0.91 (FIG. 25). For pyrosequencing, PCR was performed on bisulfite
converted DNA using a biotin-labeled 3' primer to enable purification and
denaturation of the product by Streptavidin Sepharose beads and was
followed by annealing of a sequencing primer to the single-stranded PCR
product. Pyrosequencing was performed using the PSQ HS96 Pyrosequencing
System and the degree of methylation was expressed for each DNA locus as
percentage methylated cytosines over the sum of methylated and
unmethylated cytosines. See Table 1 for primer sequences. To analyze the
methylation status of individual DNA molecules, we cloned bisulfite PCR
fragments into the pCR2.1 vector using the TOPO-TA cloning kit
(Invitrogen, Carlsbad, Calif.). Individual colonies were screened for the
insert and the region of interest was sequenced using M13 primers. See
Table 1 for primer sequences.
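
The degree of methylation defined above, methylated cytosines over the sum
of methylated and unmethylated cytosines, amounts to the following
calculation. The function name and the idea of passing raw signal values
are assumptions for illustration; the instrument software may report the
percentage directly.

```python
def percent_methylation(methylated: float, unmethylated: float) -> float:
    """Percent methylation at one locus.

    methylated, unmethylated: non-negative signal (or read) counts for the
    methylated and unmethylated cytosine at that locus.
    """
    if methylated < 0 or unmethylated < 0:
        raise ValueError("signals must be non-negative")
    total = methylated + unmethylated
    if total == 0:
        raise ValueError("no signal at this locus")
    return 100.0 * methylated / total
```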

[0431] Quantitative RT-PCR. Expression was determined by quantitative
RT-PCR as described previously [27]. See Table 1 for primer sequences.

[0432] Luciferase assay. The L1-MET and L1 promoters were cloned into the
pCpGL luciferase vector [24]. The portion of the L1-MET promoter cloned
was 555 bp, with 535 bp within the L1 and 20 bp within the MET gene
(ch7:116364010-564). These experiments were performed as described
previously [24].

[0433] Chromatin immunoprecipitation. ChIP was performed as described
previously [27]. Briefly, chromatin was isolated from cells and
crosslinked with formaldehyde. The chromatin was then sonicated to less
than 500 bp in length and immunoprecipitated with an antibody to the
histone modification of interest. Enrichment was determined by RT-PCR of
the pulled down DNA. See Table 1 for primer sequences.

[0434] Methylation-dependent single promoter analysis. M-SPA was performed
as described previously [28]. Briefly, chromatin was isolated from
250,000 cells and treated for 15 minutes with 50 U of M. SssI. DNA was
isolated, bisulfite converted, and PCR fragments were cloned for
sequencing of individual molecules. In order to examine endogenously
methylated promoters and increase the resolution of this method,
chromatin from 250,000 cells was treated with the enzyme M. CviPI, which
methylates GpC sites [29], for 15 minutes with 100 U.

[0435] MNase digestion and Southern blot. MNase digestion and sucrose
density gradient centrifugation were performed as described previously
[33]. See Table 1 for primer sequences for the LINE-1 promoter probe.

[0436] Statistical Analyses. Significant differences in methylation and
expression levels in normal, corresponding normal, and tumor tissues were
determined using a Mann-Whitney test.
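
As a sketch of what the Mann-Whitney test computes, the function below
returns the U statistic for the first of two samples using average ranks
for tied values. It deliberately stops at U; converting U to a p-value
needs the exact null distribution or a normal approximation, which an
established statistics package would normally supply. The function name is
hypothetical.

```python
def mann_whitney_u(x, y):
    """Mann-Whitney U statistic for sample x versus sample y.

    Values are pooled and ranked (ties share the average rank); U is the
    rank sum of x minus its minimum possible value n_x * (n_x + 1) / 2.
    """
    # Tag each value with its group: 0 for x, 1 for y.
    pooled = sorted((v, g) for g, sample in enumerate((x, y)) for v in sample)
    rank_sum_x = 0.0
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        # Elements i..j-1 are tied and occupy ranks i+1..j.
        midrank = (i + 1 + j) / 2.0
        n_x_tied = sum(1 for k in range(i, j) if pooled[k][1] == 0)
        rank_sum_x += midrank * n_x_tied
        i = j
    return rank_sum_x - len(x) * (len(x) + 1) / 2.0
```

U equals the number of (x, y) pairs in which the x value is larger, plus
half the number of tied pairs, so it ranges from 0 to len(x) * len(y).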

The Methods of the Present Invention can Accurately Footprint Open
Chromatin Structures without Generating Aberrant Accessibility in
Occupied and Methylated Sequences

[0483] The methods and kits of the present invention can be used to
identify distinct chromatin configurations associated with specific
histone modifications and promoter types. We examined specific promoter
classifications as determined by Hawkins et al. (Hawkins, 2010).
Consistent with their active status, H3K4me3 marked promoters are
unmethylated, show a distinct Nucleosome Depleted Region (NDR) upstream of
the Transcription Start Site (TSS) and at least four well-positioned
nucleosomes downstream of the TSS (FIG. 3). In contrast H3K27me3 marked
promoters are unmethylated but nucleosome occupied, as indicated by
inaccessibility to M.CviPI. DNA methylated promoters are nucleosome
occupied. We next investigated the chromatin configurations of CpG island
and non-CpG island promoters (FIG. 3B,C). In general, CpG island
promoters are unmethylated, show a distinct NDR upstream of the TSS and
well-positioned nucleosomes downstream of the TSS. Separating CpG island
promoters into those that are methylated and unmethylated reveals that
the CpG island promoter pattern is largely driven by unmethylated CpG
island promoters (11,165 promoters) and the few CpG island promoters that
are methylated (781 promoters) do not show an NDR. In general, non-CpG
island promoters are endogenously methylated and nucleosome occupied.
Separating non-CpG island promoters into those that are methylated and
unmethylated reveals that unmethylated non-CpG island promoters also have
an NDR upstream of the TSS and a nucleosome immediately downstream of the
TSS while methylated non-CpG island promoters do not show an NDR.
Methylated non-CpG island promoters show a relative decrease in
endogenous methylation immediately upstream of the TSS which can also be
seen in the overall pattern for non-CpG island promoters.

[0484] We next examined the correlation between chromatin configurations
determined by GNOMe-seq and transcription level (Supplemental FIG. 1). We
divided promoters into quartiles based on their expression level
(Hawkins, 2010). GNOMe-seq shows that promoters in the lowest bin (0-25%)
are nucleosome occupied with intermediate DNA methylation levels, likely
reflecting that inactive promoters can be silenced through DNA dependent
and independent mechanisms. With increasing expression quartiles the NDR
upstream of the promoter and the positioning of the nucleosomes after the
TSS become more apparent. Interestingly, there is a relative increase in
DNA methylation immediately upstream of the TSS in the 50% most expressed
genes.
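
The quartile grouping described above can be illustrated with a minimal Python sketch. The function name `assign_expression_quartiles`, the promoter identifiers, and the use of inclusive quantile cut points are assumptions made for illustration only; the text does not specify how quartile boundaries were computed.

```python
from statistics import quantiles


def assign_expression_quartiles(expression):
    """Map each promoter ID to an expression quartile label.

    `expression` is a dict of promoter ID -> expression value.
    The cut-point method ("inclusive") is an assumption of this sketch.
    """
    q1, q2, q3 = quantiles(sorted(expression.values()), n=4, method="inclusive")
    labels = {}
    for promoter, value in expression.items():
        if value <= q1:
            labels[promoter] = "0-25%"
        elif value <= q2:
            labels[promoter] = "25-50%"
        elif value <= q3:
            labels[promoter] = "50-75%"
        else:
            labels[promoter] = "75-100%"
    return labels


# Hypothetical expression values for eight promoters.
example = {f"p{i}": float(i) for i in range(1, 9)}
quartile_of = assign_expression_quartiles(example)
```

Each quartile group could then be profiled separately for M.CviPI inaccessibility and endogenous DNA methylation around the TSS.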

[0485] As shown in FIG. 27, the methods and kits of the present invention
are able to reveal distinct chromatin configurations associated with
specific histone modifications and promoter types. (A) GNOMe-seq
demonstrates that H3K4me3 marked promoters are unmethylated and contain
an NDR upstream and well positioned nucleosomes after the TSS. H3K27me3
marked promoters are unmethylated and nucleosome occupied as indicated by
M.CviPI inaccessibility. Methylated promoters are nucleosome occupied.
(B) CpG island promoters are characterized by a lack of CpG methylation,
an upstream NDR and well positioned nucleosomes after the TSS. The
majority of CpG island promoters are unmethylated (11,165) and display
the same pattern, while methylated CpG island promoters (781) are
nucleosome occupied and inaccessible to M.CviPI. (C) Non-CpG island
promoters are generally characterized by CpG methylation and
inaccessibility to M.CviPI, indicating nucleosome occupancy. The few
unmethylated non-CpG island promoters (1397) are depleted of nucleosomes
upstream of the TSS, while the majority of non-CpG island promoters
(4668) are nucleosome occupied and inaccessible to M.CviPI. M.CviPI
inaccessibility is plotted (1-GCH) in teal and CpG methylation (CGH) in
black.
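
As a rough illustration of how the two plotted quantities could be derived from bisulfite reads, the following Python sketch counts protected (unconverted) cytosines in GpC and CpG contexts. It rests on several assumptions that the text does not state: reads are same-strand, gap-free, and the same length as the reference; a read `C` at a reference cytosine means methylated and a `T` means unmethylated; GCG trinucleotides are skipped as ambiguous between the two signals; and the CpG context is taken as a C followed by G and not preceded by G. The function names are hypothetical.

```python
def classify_cytosine(ref, i):
    """Return 'GCH', 'CG', or None for the base at index i of ref."""
    if ref[i] != "C" or i == 0 or i + 1 >= len(ref):
        return None
    prev_base, next_base = ref[i - 1], ref[i + 1]
    if prev_base == "G" and next_base == "G":
        return None  # GCG: ambiguous, skipped (assumption of this sketch)
    if prev_base == "G" and next_base in "ACT":
        return "GCH"  # GpC site reporting M.CviPI accessibility
    if next_base == "G":
        return "CG"  # CpG site reporting endogenous methylation
    return None


def summarize(ref, reads):
    """Return M.CviPI inaccessibility (1 - GCH) and CpG methylation."""
    counts = {"GCH": [0, 0], "CG": [0, 0]}  # [methylated, total]
    for read in reads:
        for i, base in enumerate(read):
            context = classify_cytosine(ref, i)
            if context is None or base not in "CT":
                continue
            counts[context][1] += 1
            if base == "C":  # unconverted, hence methylated
                counts[context][0] += 1
    gch_meth, gch_total = counts["GCH"]
    cg_meth, cg_total = counts["CG"]
    return {
        "inaccessibility": 1 - gch_meth / gch_total if gch_total else None,
        "cpg_methylation": cg_meth / cg_total if cg_total else None,
    }


# Hypothetical reference and two bisulfite-converted reads.
reference = "AGCTACGTTGCAACGA"
reads = ["AGCTACGTTGTAACGA", "AGTTACGTTGTAATGA"]
result = summarize(reference, reads)
```

A high inaccessibility value indicates that GpC sites were protected from M.CviPI, consistent with nucleosome or tight-binding factor occupancy.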

[0486] The methods and kits of the present invention are able to identify
differences in chromatin configurations based on gene expression level as
shown in FIG. 28. Gene promoters were divided into quartiles based on
transcription level and the corresponding M.CviPI inaccessibility (1-GCH,
teal line) and DNA methylation (CGH, black line) is plotted.

[0487] The methods and kits of the present invention are also able to
footprint nucleosomes surrounding transcription factor binding sites. As
shown in FIG. 29A-D, the methods and kits of the present invention are
able to identify different chromatin configurations surrounding various
transcription factor binding sites. Reads were aligned to the center of
transcription factor binding consensus sequences. Data are plotted as
M.CviPI inaccessibility (1-GCH, gray line) and DNA methylation (CGH,
black line).
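
The alignment of signal to binding site centers can be sketched in Python as a simple averaged profile. The function name `aggregate_profile`, the dictionary-based signal representation, and the choice to ignore strand orientation are assumptions of this sketch rather than details taken from the text.

```python
def aggregate_profile(signal, centers, flank):
    """Average a per-position signal across windows centered on each site.

    `signal` maps genomic position -> value (for example 1-GCH or CpG
    methylation); `centers` lists the binding site center positions;
    `flank` is the number of bases profiled on each side of the center.
    Positions without data are skipped. Strand is ignored for simplicity.
    """
    width = 2 * flank + 1
    sums = [0.0] * width
    counts = [0] * width
    for center in centers:
        for offset in range(-flank, flank + 1):
            value = signal.get(center + offset)
            if value is None:
                continue
            sums[offset + flank] += value
            counts[offset + flank] += 1
    return [s / c if c else None for s, c in zip(sums, counts)]


# Hypothetical signal values around two site centers.
signal = {9: 1.0, 10: 0.0, 11: 1.0, 19: 1.0, 20: 0.0}
profile = aggregate_profile(signal, centers=[10, 20], flank=1)
```

A dip at the central offset of an inaccessibility profile would correspond to nucleosome depletion at the binding site.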

[0488] We found variable chromatin configurations surrounding specific
transcription factor binding sites. (A) At AP-1 binding sites there are
low levels of DNA methylation and nucleosome depletion, while at (B) NF1
binding sites there is also a dip in DNA methylation levels but the sites
are nucleosome occupied. (C) At E2F binding sites there is a peak in
methylation that corresponds to nucleosome occupancy. (D) Interestingly,
at CREB binding sites there is a peak in DNA methylation that corresponds
to a dip in nucleosome occupancy.

[0489] All publications cited herein are expressly incorporated herein by
reference in their entirety.

[0490] With respect to the use of substantially any plural and/or singular
terms herein, those having skill in the art can translate from the plural
to the singular and/or from the singular to the plural as is appropriate
to the context and/or application. The various singular/plural
permutations may be expressly set forth herein for sake of clarity.

[0491] While various aspects and embodiments have been disclosed herein,
other aspects and embodiments will be apparent to those skilled in the
art. The various aspects and embodiments disclosed herein are for
purposes of illustration and are not intended to be limiting, with the
true scope and spirit being indicated by the following claims. Those
skilled in the art will recognize, or be able to ascertain using no more
than routine experimentation, many equivalents to the specific
embodiments of the method and compositions described herein. Such
equivalents are intended to be encompassed by the following claims.