This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.0/uk/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.

Abstract

Large-scale copy number variants (CNVs) have recently been recognized to play a role in human genome variation and disease. Approaches for analysis of CNVs in small samples such as microdissected tissues can be confounded by limited amounts of material. To facilitate analyses of such samples, whole genome amplification (WGA) techniques were developed. In this study, we explored the impact of Phi29 multiple-strand displacement amplification on detection of CNVs using oligonucleotide arrays. We extracted DNA from fresh frozen lymph node samples and used this for amplification and analysis on the Affymetrix Mapping 500k SNP array platform. We demonstrated that the WGA procedure introduces hundreds of potentially confounding CNV artifacts that can obscure detection of bona fide variants. Our analysis indicates that many artifacts are reproducible, and may correlate with proximity to chromosome ends and GC content. Pair-wise comparison of amplified products considerably reduced the number of apparent artifacts and partially restored the ability to detect real CNVs. Our results suggest WGA material may be appropriate for copy number analysis when amplified samples are compared to similarly amplified samples and that only the CNVs with the greatest significance values detected by such comparisons are likely to be representative of the unamplified samples.

INTRODUCTION

Initial analysis of the human genome identified single nucleotide polymorphisms (SNPs) as the primary source of genotypic and phenotypic variation among humans. However, subsequent studies identified large-scale copy number variants (CNVs) that apparently impacted millions of nucleotides (1–6). These large-scale variants included polymorphic deletions and duplications that are present in >1% of the population and therefore meet the traditional definition of polymorphism (2). As of November 2007, 4878 CNV loci impacting 808 Mbp of DNA sequence have been identified and these are listed in the Database of Genomic Variants (http://projects.tcag.ca/variation/). CNVs are also features of several human diseases including Alzheimer disease (7), Cri du chat syndrome (8), mental retardation (9) and cancer (10,11). As robust array-based methods for copy number detection continue to mature, increasing numbers of these variants are being identified (2).

Current whole-genome methods to detect CNVs require relatively large input quantities of DNA that are difficult or impossible to obtain from rare cell populations such as biopsies and microdissected tissues. To address this challenge, whole genome amplification (WGA) techniques were developed that increase the amount of DNA for analysis. For example, multiple-strand displacement amplification (MDA) using Phi29 DNA polymerase was used to generate microgram quantities of high molecular weight DNA (>30 kb) from nanograms of high quality input material (12,13). A recent report described a protocol for amplification of picogram quantities of DNA from single cells (14), further expanding the applications for this technique.

The replication fidelity of WGA techniques has been investigated (15–20). Estimates of base-pair incorporation errors resulting from Phi29-mediated amplification have ranged from 2.2 × 10−5 (21) to 9.5 × 10−6 (16) and the concordance of genotypes between unamplified and amplified samples was reported to be >99.8% (16,19). Recurrent WGA-induced copy number biases were observed in previous studies (15–20), and were associated with sequence repeats and proximity to chromosome ends (17–20), increased GC content (17,20), and annotated CNVs (17). Many of these associations were explored descriptively without statistical analysis and there was no consensus on the 92 recurrent regions of bias explicitly defined by three of these studies (16,17,20). A recent study of 532 samples subjected to WGA and subsequent analysis using the Affymetrix 10k Mapping array identified a median of 438 WGA-induced copy number artifacts in comparisons between amplified samples and an unamplified reference set (15). While there is a consensus that at least partial compensation of systematic biases can be achieved through the use of an amplified reference (16–20), it is unknown to what degree such comparisons can capture real CNVs detected using more sensitive, higher resolution platforms.

Recently, bias induced by a number of whole genome amplification protocols was examined using a high-throughput, massively parallel whole genome pyrosequencing technique (22). In this comparison, which involved sequencing two bacterial genomes, Phi29 MDA-based approaches generated the most complete genome coverage (50–99%), and introduced the least bias compared to other PCR-based techniques. DNA sequences generated from Phi29-amplified material were 2.9–3.8% lower in GC-content than those from the unamplified material, suggesting a relationship between amplification bias and GC-content. However, over-amplification of certain sequences could not be explained by any of the previously mentioned sources of bias suggesting a need to directly investigate the nature of regions prone to over- or under-amplification. Although the study was of high resolution, direct comparison of the results from this study with those using human samples is difficult due to differences in chromosome organization, size and composition.

In this study, we investigated amplification bias resulting from whole genome amplification on DNA from fresh-frozen human tissues using the Affymetrix 500k Mapping Array Set. We quantified the effects of WGA on microarray signal and background noise, localized and statistically analysed genomic regions of WGA-induced bias, and directly compared the ability to resolve CNVs in comparisons of unamplified and amplified material.

MATERIALS AND METHODS

Tissue material and DNA extraction

Normal lymph nodes from three individuals were fresh frozen in Optimal Cutting Temperature (OCT; Sakura Finetek, Torrance, CA) compound and stored at −80°C by the service pathology laboratory at the BC Cancer Agency. Genomic DNA was extracted from these sources using the Gentra PureGene DNA purification kit (Gentra Systems, Minneapolis, MN). Prior to labelling and microarray hybridization, the genomic DNA was quantified using a NanoDrop spectrophotometer (NanoDrop Technologies, Wilmington, DE). Prior to whole genome amplification, the genomic DNA was diluted to ~1.5 ng/μl and quantified using a PicoGreen assay (Invitrogen, Carlsbad, CA). To ensure consistent DNA quality across all samples, the DNA was visualized on an agarose gel to confirm the presence of undegraded, predominantly high molecular weight (>10 kb) DNA.

Whole genome amplification

We used Qiagen's Repli-G Mini whole genome amplification kit and protocol (QIAgen, Valencia, CA) to amplify 7 ng of PicoGreen-quantified DNA from fresh frozen samples to generate >10 µg of high molecular weight DNA. We performed the isothermal amplification reaction in 1.5 ml microcentrifuge tubes incubated in a 30°C water bath for 18 h and inactivated the enzyme by incubating the tubes in a 65°C water bath for 3 min. The amplified products were purified and quantified as described in the previous section and the amplification products were visualized on a 0.8% agarose gel stained with SYBR Green (Invitrogen, Carlsbad, CA).

Labelling and hybridization to the Affymetrix 500k array

500 ng samples of DNA were processed following the instructions in the GeneChip Mapping 500K manual (Affymetrix, Santa Clara, CA). Briefly, 250 ng of DNA was digested using one of two restriction enzymes, Nsp I or Sty I, and ligated to Nsp I or Sty I adaptors. These adaptor-ligated fragments were amplified by PCR and the purified products quantified using a Bio-Tek PowerWave X spectrophotometer and the concentration normalized to 2 µg/µl. The normalized products were then fragmented and labelled as described in the manual. Samples were hybridized to the GeneChip Human Mapping 250K Nsp or Sty array in an Affymetrix Hybridization Oven 640. Washing and staining of the arrays were performed using an Affymetrix Fluidics Station 450. Images of the arrays were obtained using an Affymetrix GeneChip Scanner 3000.

Sample preparation for NimbleGen 385k CGH array

Samples of >2.5 µg of DNA were prepared following the instructions provided by NimbleGen Systems Inc. (NimbleGen Systems Inc, Madison, Wisconsin). Briefly, purified samples were concentrated to 250 ng/µl and analysed for quality on an agarose gel. Samples were then shipped on ice to NimbleGen for subsequent labelling and hybridization to the 385k Human Whole-Genome CGH array.

Genotype and copy number analysis

Genotype calls were derived from microarray images using the GTYPE v4.0 software program (Affymetrix, Santa Clara, CA). We detected CNVs in individual samples using comparisons to a common reference data set and comparisons between pre- and post-amplification sample pairs (Figure 1). These were performed using a software pipeline (Figure 1) that utilizes the Affymetrix Chromosome Copy Number Analysis Tool (CNAT) version 4.0 (Affymetrix, Santa Clara, CA) and an exhaustive t-score optimization algorithm.

Experimental design. (A) In this study, we aimed to assess the impact of WGA on the detection of CNVs, to explore copy number biases induced by this technique, and to assess the use of pair-wise analysis to address such biases. To this end, DNA samples...

To analyse sample pairs on the Affymetrix platform, we used CNAT to perform quantile normalization of probe intensities from the samples and calculated log2 intensity ratios for each probe set on the array. For unpaired analysis of individual samples against a common reference set, we used a set of average probe intensities from the reference set in place of the second sample. The reference set used for this purpose, referred to hereafter as the ‘Affy48 reference set’, was downloaded from the Affymetrix website (http://www.affymetrix.com/support/technical/sample_data/500k_data.affx) and consisted of 48 samples representing five HapMap CEPH trios, five HapMap Yoruban trios, three other non-HapMap trios, and nine unrelated HapMap Asian samples. To analyse sample pairs on the NimbleGen platform, we used qspline normalized data and log2 intensity ratios provided by NimbleGen for each probe on the array.
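The normalization and ratio step described above can be illustrated with a minimal Python sketch. It is a generic two-sample quantile normalization followed by per-probe log2 ratios, not the CNAT implementation, whose internals are not described here. The function names and the handling of tied intensities (broken by probe order) are assumptions made for illustration.

```python
import math

def quantile_normalize(a, b):
    """Map two equal-length intensity vectors onto a shared distribution.

    Generic sketch: the value at each rank is replaced by the mean of the
    two samples at that rank. Ties are broken by probe order (assumption).
    """
    n = len(a)
    order_a = sorted(range(n), key=lambda i: a[i])
    order_b = sorted(range(n), key=lambda i: b[i])
    # The rank-wise mean of the two samples defines the target distribution.
    target = [(a[order_a[r]] + b[order_b[r]]) / 2.0 for r in range(n)]
    norm_a, norm_b = [0.0] * n, [0.0] * n
    for rank in range(n):
        norm_a[order_a[rank]] = target[rank]
        norm_b[order_b[rank]] = target[rank]
    return norm_a, norm_b

def log2_ratios(test, reference):
    """Per-probe log2 intensity ratio of a test sample over a reference."""
    return [math.log2(t / r) for t, r in zip(test, reference)]
```

For an unpaired analysis, the second argument of `log2_ratios` would hold the average probe intensities of the reference set rather than a second sample.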

To identify significant deviations in the log2 ratio data from both platforms, the following t-score optimization algorithm was used. First, log2 ratios were sorted by genome coordinate and moving windows representing a number of adjacent probes were subjected to a t-test against the rest of the data outside of the window on the same chromosome. This was done across the entire genome for all window sizes from 3 to 30 probe sets for the Affymetrix and NimbleGen data. To establish a comparison-specific false-positive threshold, the order of log2 ratios was then randomized and moving window t-tests were recalculated. Two t-score thresholds, one for amplifications and one for deletions, were then defined at which no amplifications or deletions were identified in the randomized data. These thresholds were then applied to the t-scores derived from the original data and regions with t-scores exceeding these thresholds were identified. To identify apparent variants impacting regions larger than our largest moving window size, t-scores were optimized for aberrations encompassing more than 27 probe sets using larger and larger windows until a local maximum t-score was found. As no CNVs met the false positive thresholds set for the NimbleGen data, a 50 probe window was used to detect statistically significant CNVs and a comparison-specific false positive threshold was not applied.
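The moving-window procedure can be sketched as follows for a single chromosome. This is an illustrative Python outline under stated assumptions, not the pipeline used in this study: it uses Welch's form of the t statistic (the text does not specify the variant), it works with t-scores rather than P-values, and it omits the extension to windows larger than the maximum size. All function and parameter names are hypothetical.

```python
import math
import random

def window_t_scores(ratios, min_size=3, max_size=30):
    """Yield (start, size, t) for each window versus the rest of the chromosome.

    `ratios` must already be sorted by genome coordinate; min_size must be >= 2.
    """
    n = len(ratios)
    # Prefix sums give each window's sum and sum of squares in constant time.
    s1, s2 = [0.0], [0.0]
    for r in ratios:
        s1.append(s1[-1] + r)
        s2.append(s2[-1] + r * r)
    for size in range(min_size, max_size + 1):
        rest = n - size
        if rest < 2:
            break
        for start in range(n - size + 1):
            w1 = s1[start + size] - s1[start]
            w2 = s2[start + size] - s2[start]
            mean_w, mean_r = w1 / size, (s1[n] - w1) / rest
            var_w = max((w2 - size * mean_w ** 2) / (size - 1), 0.0)
            var_r = max((s2[n] - w2 - rest * mean_r ** 2) / (rest - 1), 0.0)
            se = math.sqrt(var_w / size + var_r / rest)
            if se > 0:
                yield start, size, (mean_w - mean_r) / se

def null_thresholds(ratios, rng, **kw):
    """Most extreme t-scores seen after randomizing the order of the ratios."""
    shuffled = list(ratios)
    rng.shuffle(shuffled)
    scores = [t for _, _, t in window_t_scores(shuffled, **kw)]
    return max(scores), min(scores)

def call_windows(ratios, rng, **kw):
    """Keep windows scoring beyond anything found in the randomized data."""
    gain_cut, loss_cut = null_thresholds(ratios, rng, **kw)
    return [(start, size, t) for start, size, t in window_t_scores(ratios, **kw)
            if t > gain_cut or t < loss_cut]
```

A call such as `call_windows(ratios, random.Random(0))` returns candidate windows for one chromosome. Overlapping windows would still need to be merged into regions, a step this sketch leaves out.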

In the analysis of recurrent WGA-induced artifacts, several sets of genomic coordinates were defined based on the human genome reference sequence Build 36/hg18 (released March, 2006) downloaded from the NCBI website (http://www.ncbi.nlm.nih.gov/). To define a set of regions that were consistently over- or under-amplified by the whole genome amplification technique, we analysed apparent variants arising from our comparison of matched pre- and post-WGA samples for overlapping genomic coordinates across all three comparisons and defined minimal overlapping regions (Supplementary Tables 1 and 2). These minimal overlapping regions were defined as the smallest region overlapped by a WGA-induced variant in all three comparisons. To define a subset of recurrently under-amplified chromosome ends, the first or last 2.5% of the reference genome sequence of any chromosome was recorded if it was impacted by a region consistently under-amplified by the WGA technique. To serve as reference sets representing the remainder of the human genome, random sets of coordinates were generated with equivalent size distributions for the regions consistently over- or under-amplified by the whole genome amplification technique and for the subset of recurrently biased regions affecting chromosome ends. In these reference sets, 10 random segments were generated with sizes corresponding to each entry in the list of regions affected by WGA-induced bias (i.e. 1900 amplifications and 750 deletions). The GC and repeat content of each entry in the above sets of coordinates were calculated in the following manner. For each set, the genomic sequence for each coordinate was downloaded from the Ensembl database (http://www.ensembl.org). To calculate the GC content of the sequence, the number of Gs and Cs in the sequence was counted and that number divided by the total length of the sequence. 
To calculate the repeat content of the sequence, the coordinates of the UCSC Genome Browser ‘Simple Repeats’ track generated by Tandem Repeats Finder (23) were used to identify base pairs belonging to repeat sequences. The number of these base pairs was then divided by the total length of the sequence to give the percentage of repeat sequence in the region. As most of the sets were not normally distributed in GC or repeat content as found by the Jarque-Bera test, the two-sample Kolmogorov-Smirnov test (KS test) was used to test whether these sets differed in their distribution of these two parameters.
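The sequence-feature calculations described above can be sketched in Python as follows. This is an illustration under simplifying assumptions rather than the code used in this study: coordinates are half-open intervals on a single hypothetical chromosome, random segments are drawn uniformly, and only the KS statistic D is computed (a statistics library such as SciPy would be needed for the corresponding P-value). All function names are hypothetical.

```python
import random
from bisect import bisect_right

def gc_fraction(seq):
    """Number of G and C bases divided by the total sequence length."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq) if seq else 0.0

def repeat_fraction(start, end, repeats):
    """Fraction of [start, end) covered by repeat intervals, counted once."""
    covered, cursor = 0, start
    for r_start, r_end in sorted(repeats):
        # Clip to the region and skip bases already counted.
        r_start, r_end = max(r_start, cursor), min(r_end, end)
        if r_end > r_start:
            covered += r_end - r_start
            cursor = r_end
    return covered / (end - start)

def random_matched_segments(sizes, chrom_length, per_region, rng):
    """Draw `per_region` random segments for every observed region size."""
    segments = []
    for size in sizes:
        for _ in range(per_region):
            start = rng.randrange(0, chrom_length - size + 1)
            segments.append((start, start + size))
    return segments

def ks_statistic(xs, ys):
    """Two-sample KS statistic: largest gap between the empirical CDFs."""
    xs, ys = sorted(xs), sorted(ys)
    d = 0.0
    for v in xs + ys:
        fx = bisect_right(xs, v) / len(xs)
        fy = bisect_right(ys, v) / len(ys)
        d = max(d, abs(fx - fy))
    return d
```

With `per_region=10`, a list of 190 region sizes yields 1900 random segments, matching the ratio of random to observed regions described above.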

RESULTS

Array noise and CNV in samples pre- and post-WGA

To establish a base line for array noise and CNV detection prior to amplification, each unamplified DNA sample was compared to the Affy48 reference set (Methods; Figure 1b) and candidate CNVs were identified. This comparison versus the Affy48 set was then repeated using amplified samples. As a measure of array noise, we quantified the distribution of log2 ratios resulting from these comparisons by calculating the mean, standard deviation (SD), and interquartile range (IQR) (Table 1, Figure 2). As expected due to normalization by CNAT4, the mean log2 ratios from both unamplified and amplified samples were very close to zero. The SDs and IQRs of log2 ratios from amplified samples were nearly twice those of the unamplified samples suggesting an increase in array noise using WGA material.
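As a small illustration of the noise measures used here, the following Python sketch summarizes a list of log2 ratios by mean, SD and IQR. The function name is hypothetical, and the quartile convention is that of the Python standard library, which may differ slightly from the one used to produce Table 1.

```python
import statistics

def noise_summary(log2_ratios):
    """Mean, sample standard deviation and interquartile range of log2 ratios."""
    q1, _, q3 = statistics.quantiles(log2_ratios, n=4)
    return {
        "mean": statistics.fmean(log2_ratios),
        "sd": statistics.stdev(log2_ratios),
        "iqr": q3 - q1,
    }
```

Comparing the `sd` and `iqr` entries for an unamplified and an amplified sample against the same reference gives the kind of ratio reported above.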

Boxplots comparing the spread of log2 ratios in unamplified and amplified samples. The log2 ratios resulting from comparison of each sample against the Affy48 reference set were plotted using a standard box and whisker plot displaying a five number summary:...

Distribution of log2 ratios from comparison of unamplified and amplified samples versus a common reference set of 48 individuals

To compare the CNVs detected pre- and post-WGA, we counted apparent CNVs with p-values more significant than each comparison's false-positive detection limit (Table 1, Figure 3). The analysis of unamplified samples detected 13 candidate CNVs, 11 of which overlapped the coordinates of genomic variants listed in the Database of Genomic Variants (http://projects.tcag.ca) (5) (Table 2). In contrast, the analysis of the amplified samples identified 1572 apparent CNVs, an approximately 100-fold increase in the number of apparently significant amplifications and deletions versus the unamplified samples (Table 1). These artifactual CNVs are likely the result of WGA-induced biases.

Apparent CNVs in unamplified and amplified samples. The number of variants detected in unamplified and amplified samples from comparison against the Affy48 reference set were counted. The amplified samples appear to contain hundreds of CNVs not seen in...

Apparent amplifications and deletions detected prior to amplification through comparison with a reference set of 48 individuals

To assess experimental variation prior to amplification, each unamplified and amplified sample was subjected to a pair-wise comparison against an experimental replicate of itself (Table 3). The lack of fluctuation in mean, SD and IQR in the log2 ratios from unamplified replicates suggests a high degree of reproducibility of the array method used. Similarly, while still elevated relative to unamplified samples, there is no major fluctuation in these values between amplified replicates further supporting the notion that the WGA method behaves consistently. However, the values obtained from unamplified samples versus values obtained from amplified samples, using the Affy48 reference set, showed a substantial decrease in SDs and IQRs. This indicates that amplified samples produce different signal intensity distributions than unamplified samples, suggesting that comparison of amplified to unamplified data sets is potentially problematic.

Distribution of log2 ratios from pair-wise comparison of experimental replicates of unamplified and amplified samples

CNVs induced by whole genome amplification

To identify apparent CNVs arising from non-uniform amplification bias in the WGA technique, data from paired pre- and post-WGA samples were directly compared to each other (Figure 1b). Our analysis identified apparent WGA-induced over- and under-amplifications in each of the three comparisons of amplified versus unamplified material. In sample 1, we detected 502 amplifications (P-value threshold of detection, P < 1.68 × 10−6) and 580 deletions (P < 1.71 × 10−8). In sample 2, we detected 467 amplifications (P < 1.68 × 10−6) and 202 deletions (P < 1.64 × 10−8). In sample 3, we detected 546 amplifications (P < 1.68 × 10−6) and 259 deletions (P < 3.45 × 10−8). Our analysis also revealed a set of 265 recurrent apparent WGA-associated aberrations that were detected in all three comparisons. This set consisted of 190 over-amplifications (Supplementary Table 1) and 75 under-amplifications (Supplementary Table 2). 39 of these regions overlapped one of the 92 regions of bias (31 of 62 over-amplifications, 8 of 30 under-amplifications) identified by three previous studies (16,17,20). 110 of the regions we identified overlapped genomic regions with known CNVs (2) (64 over-amplifications, 46 under-amplifications) but there was no correlation between regions susceptible to WGA-associated bias and known CNVs (P = 1.00). In a set of 2650 random genomic coordinates with the same size distribution as the WGA-induced artifacts, 36.26% overlapped a known CNV, a proportion near the 41.51% overlap observed with the set of WGA-induced biases.
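The recurrent set depends on intersecting the calls from all three comparisons into minimal overlapping regions (see Methods). A minimal Python sketch of that intersection is given below, assuming calls are half-open `(chromosome, start, end)` tuples and that over- and under-amplifications are processed separately. The function names are hypothetical, and the quadratic pairwise scan is chosen for clarity rather than speed.

```python
from functools import reduce

def intersect(calls_a, calls_b):
    """All non-empty pairwise intersections between two lists of calls."""
    out = []
    for chrom_a, start_a, end_a in calls_a:
        for chrom_b, start_b, end_b in calls_b:
            start, end = max(start_a, start_b), min(end_a, end_b)
            if chrom_a == chrom_b and start < end:
                out.append((chrom_a, start, end))
    return out

def minimal_overlaps(*call_sets):
    """Regions covered by a call in every comparison supplied."""
    return sorted(set(reduce(intersect, call_sets)))
```

For example, calls of `("chr1", 100, 500)`, `("chr1", 300, 700)` and `("chr1", 50, 400)` in the three comparisons give the single minimal overlapping region `("chr1", 300, 400)`.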

The minimal overlapping regions (see Methods) of WGA-induced over-amplifications ranged from 2207 bp to 357 399 bp with a median size of 58 961 bp, an IQR of 66 524 bp and encompassed 13.6 Mbp of the reference human genome sequence. These recurrently over-amplified sites were distributed throughout the genome and had a statistically significant increase in GC content relative to a set of 1900 random genomic segments with identical size distribution (P = 8.36 × 10−40). These over-amplified sites were also enriched for repeat sequences relative to the set of 1900 random genomic segments (P = 1.76 × 10−6). These results are compatible with the notion that over-amplification by the WGA technique is related to the GC and repeat content of the underlying sequence.

The minimal overlapping regions of the recurrent WGA-induced under-amplifications ranged from 5206 bp to 1.93 Mbp with a median size of 75 698 bp, an IQR of 64 619 bp and encompassed 8.37 Mbp of the reference human genome sequence. These regions of under-amplification appeared to fall into two groups: those near chromosome ends and those distributed throughout the genome. Comparison of the 54 under-amplified sites distributed throughout the genome with a set of 540 random genomic segments with identical size distribution found no statistically significant difference in GC content (P = 0.0796) or repeat sequences (P = 0.1901). However, the under-amplifications were greatly depleted for GC-rich regions compared to the over-amplifications (P = 1.93 × 10−5) which supports the notion that WGA amplification efficiency is related to the GC content of the underlying sequence. A plot of GC content versus copy number shows a trend of increasing amplification magnitude (i.e. increasing copy number) with increasing GC content (Figure 4).

Copy number distribution and GC content of WGA-induced CNVs. The number of variants and percentage GC content were plotted against copy number magnitude for all of the CNVs detected by comparisons of each pre- and post-WGA sample pair. There appears to...

Of the 39 chromosome ends (see Methods) assayed by probe sets, 15 contained regions of under-amplification (Table 4). Only three chromosome ends contained over-amplifications, suggesting that under-representation of chromosome ends is a consistent result of whole genome amplification. The set of chromosome end under-amplifications impacted 2.547 Mbp of the reference human genome sequence and the GC content was statistically greater than that of a set of 150 random genomic segments with identical size distribution (P = 1.12 × 10−6). However, there was no statistical difference in GC content between the under-amplified chromosome ends and the 25 appropriately amplified chromosome ends (P = 0.8215). This suggests that amplification bias due to GC content does not play a role in under-amplification of specific subtelomeric regions. Under-amplified chromosome ends were enriched for repetitive sequences (see Methods) relative to both a set of 150 random genomic segments with identical size distribution (P = 1.52 × 10−9) and the 25 assayed chromosome ends that were not under-amplified (P = 0.0022) suggesting that increased repeat content of specific chromosome ends may result in their under-amplification.

To assess WGA-induced CNV artifacts using a second array platform, we compared pre- and post-amplification sample pairs in three comparative genome hybridization (CGH) experiments using the NimbleGen 385k array. The log2 ratios from these experiments were widely distributed (average SD = 0.378, average IQR = 0.457) and while several thousand CNVs were detected, none were identified with P-values passing the stringent false positive thresholds set by our algorithm due to the high level of noise in these data (P < 3.51 × 10−7 for over-amplifications, P < 3.30 × 10−11 for under-amplifications). Analysis of these data using a 50 probe moving window without filtering for false positives detected 2116 WGA-induced CNVs (466 over-amplifications, 1650 under-amplifications) of which 141 occurred in all three comparisons (29 over-amplifications, 112 under-amplifications). Despite their relatively large size (average = 1.06 Mb, median = 0.36 Mb, SD = 4.10 Mb), only 28 of these overlapped recurrent artifacts detected by the Affymetrix comparisons (17 of 190 over-amplifications, 11 of 75 under-amplifications). This amount of overlap is similar to that seen with a set of 2116 random genomic coordinates with the same size distribution as the CNVs detected by the NimbleGen platform, of which 65 overlapped a WGA-induced CNV detected by the Affymetrix platform. These results suggest that these are artifacts resulting from the difficulty in distinguishing real CNVs from background noise when co-hybridizing amplified and unamplified samples even when a large moving window of 50 probes is used.

Use of amplified material for pair-wise copy number comparisons

To assess the use of WGA material in pair-wise comparisons, each sample was compared to the other samples one-by-one and relative differences in copy number in the three samples assessed using: (i) unamplified samples versus unamplified samples, (ii) amplified samples versus unamplified samples, and (iii) amplified samples versus amplified samples (Figure 1d). An example of the output from one such set of comparisons is illustrated in Figure 5.

Example of how a pair-wise comparison of amplified material can partially compensate for WGA-induced bias. Shown is the output of three copy number analyses conducted using our CNV discovery software pipeline. Copy number, calculated directly from log...

The unamplified versus unamplified comparisons identified 21 apparent differences in copy number among the three samples (Tables 5 and 6). These pair-wise comparisons identified 5 of 13 apparent differences expected from the individual comparisons of samples to the Affy48 reference set. Twelve of these apparent differences, including the five differences expected from comparison with the Affy48 set, overlap variants listed in the Database of Genomic Variants (http://projects.tcag.ca). The amplified versus unamplified comparisons identified 3207 apparent differences in copy number among the three samples (Table 5). Only seven of these apparent differences were detected by both unamplified/amplified and amplified/unamplified comparisons suggesting that systematic WGA-induced variants and random WGA-reaction variability mask real events.

Copy number variants detected by pair-wise comparisons of unamplified and amplified sample sets

The amplified versus amplified comparisons identified 275 apparent differences in copy number among the three samples (Table 5). These amplified versus amplified comparisons identified 2 of the 12 apparent amplifications and 5 of the 9 apparent deletions seen in the unamplified comparisons (Table 6), suggesting that pair-wise comparisons of material where both samples have been subjected to WGA can partially compensate for reproducible WGA-induced bias (Figure 5). The most significant deletion identified by each unamplified comparison was recapitulated as the most significant deletion identified by the corresponding amplified comparison (Table 6). This was also true of the most significant amplification in two of the three comparisons (Table 6). The list of variants detected at lower levels of significance than these top scoring events may still contain real CNVs although it is difficult to isolate these from the remaining artifactual events resulting from random experimental variation without independent validation of each one.

Validation of WGA pair-wise comparisons for copy number detection

To determine the extent to which amplified pair-wise comparisons mask known, validated CNVs, DNA from the blood of three father/child pairs with previously described CNVs (9) were subjected to WGA and copy number analysis using the 250k Nsp chip of the Affymetrix 500k set. The original analysis of unamplified DNA performed using the Affymetrix Mapping 100k SNP array set (9) identified a total of 32 CNVs within the three father/child pairs of which five (two amplifications, three deletions) were validated by conventional cytogenetic analysis (Table 7).

The amplified child versus amplified father comparisons identified 63 differences in copy number in total within the three pairs. Analysis of amplified family pair #8379 identified 41 copy number differences (13 relative amplifications P < 3.48 × 10−6, 28 relative deletions P < 8.38 × 10−8), analysis of amplified family pair #1280 identified six copy number differences (two relative amplifications P < 2.14 × 10−6, four relative deletions P < 1.05 × 10−8), and analysis of amplified family pair #3476 identified 16 copy number differences (six relative amplifications P < 2.07 × 10−6, 10 relative deletions P < 6.09 × 10−9). These copy number differences were then ranked by P-value (most significant to least) and the coordinates compared to those of the validated aberrations. The amplified versus amplified comparisons identified four of the five CNVs (two amplifications, two deletions) validated by FISH (9) and each received the lowest P-value for its comparison (Table 7). The single validated CNV that was not detected by the amplified comparisons may have been missed due to a difference in array coverage at this site. On the 250k Nsp array, this region was covered by three probe sets (10 683 bp/probe set) compared to six probe sets (5341 bp/probe set) on the 100k array. This was also the smallest feature of the set of validated CNVs (0.03 Mb) and may reflect a decrease in detection sensitivity when using amplified comparisons. Among the top-ranked variants (i.e. those with the most significant P-values), six variants were identified by the 250k WGA experiment that were not detected by the original experiments. Five of these are covered by six or fewer probe sets (5743–93 452 bp/probe set, one with no probes) on the 100k array. In addition to the possibility of an increased false positive rate due to increased array noise, differences in each array's probe coverage may explain why these regions were only detected by the experiment using amplified samples.

Genotype fidelity

To compare the fidelity of genotype calls derived from WGA product to those from corresponding unamplified samples, data from matched pairs of these sources were compared. Average genotype call rates (±1 SD) were 96.74 ± 1.14% from the unamplified samples and 93.14 ± 2.68% from the WGA samples, suggesting a modest degree of information loss following amplification. Of the SNPs which were unsuccessfully called in the amplified samples, only 2% were common to all three samples and only one of these fell within a region of WGA-induced bias (an over-amplification). Genotype concordance was 98.57 ± 0.53% between calls successfully made from both amplified and unamplified samples in each matched pair. There was very little overlap in the coordinates of SNPs with non-concordant genotypes and regions of recurrent WGA-induced bias. Of the non-concordant calls, 58.77% were called heterozygotes in the unamplified sample and homozygotes in the amplified sample (i.e. AB called as AA or BB) and 0.2% of these were located in regions of WGA-induced over-amplification while none were in regions of WGA-induced under-amplification, 40.66% were called homozygotes in the unamplified sample and heterozygotes in the amplified sample (i.e. AA or BB called as AB) of which none were located in regions of WGA-induced bias, and 0.57% were incorrectly called homozygotes (i.e. AA called as BB or BB called as AA) of which none were located in regions of WGA-induced bias. Twelve regions each containing 3–7 SNPs were identified as displaying loss of heterozygosity (LOH) in total from the three pre- and post-amplification comparisons. Three of the LOH regions had an allele-specific copy number of 3 while the others had a copy number of 2. These regions impacted a total of 58 SNPs, 0.01% of all of the SNPs assayed, and none overlapped a region recurrently over- or under-amplified by WGA. 
These results suggest that increased random array noise is likely a greater source of genotype non-concordance than systematic allele-specific amplification bias or polymerase error.
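The call-rate and concordance figures above can be reproduced in outline with the following Python sketch. It assumes genotypes are encoded as the strings `"AA"`, `"AB"`, `"BB"` and `"NoCall"`, listed in the same SNP order for both samples. This encoding and the function names are illustrative assumptions rather than the GTYPE output format.

```python
NO_CALL = "NoCall"

def call_rate(calls):
    """Fraction of SNPs that received a genotype call."""
    return sum(c != NO_CALL for c in calls) / len(calls)

def compare_genotypes(unamplified, amplified):
    """Concordance over SNPs called in both samples, plus discordance classes."""
    counts = {"concordant": 0, "het_to_hom": 0, "hom_to_het": 0, "hom_to_hom": 0}
    for before, after in zip(unamplified, amplified):
        if NO_CALL in (before, after):
            continue  # only SNPs called in both samples are compared
        if before == after:
            counts["concordant"] += 1
        elif before == "AB":
            counts["het_to_hom"] += 1   # AB called as AA or BB
        elif after == "AB":
            counts["hom_to_het"] += 1   # AA or BB called as AB
        else:
            counts["hom_to_hom"] += 1   # AA called as BB, or BB as AA
    compared = sum(counts.values())
    concordance = counts["concordant"] / compared if compared else 0.0
    return concordance, counts
```

Dividing each discordance class by the total number of non-concordant calls gives percentages of the kind reported above for each class.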

DISCUSSION

The ability to discover CNVs in unamplified human DNA using data generated by the Affymetrix Mapping SNP array platform has been previously demonstrated by our group and others (1–3,9). However, with small amounts of DNA, from tumour biopsies for example, amplification of the starting material prior to discovery of CNVs is often necessary to generate enough material to conduct such analyses. We aimed to assess the nature of biases that are introduced by this amplification, and to determine their impact on copy number detection and whether pair-wise comparisons could compensate for these biases. For the first time, we have used a high resolution microarray platform to explicitly define regions susceptible to WGA-induced bias, statistically assessed the sequence features underlying these biases, and demonstrated an ability to correct for these biases and resolve real CNVs. In this study, three unamplified DNA samples were used to establish a base line for array noise and CNV detection. These were compared to the same DNA samples that were amplified in duplicate using a WGA technique. The apparent CNVs we detected by comparing unamplified samples to the unamplified Affy48 reference set were likely real events, as the variants were relatively large, statistically significant, and 11 of the 13 CNVs corresponded to previously documented genomic variants (5). While our variant detection approach adjusts its threshold of significance based on the level of noise of each array, comparisons using amplified samples still identified hundreds of apparent CNVs not seen in the unamplified comparisons on the Affymetrix array platform. Since these comparisons were performed against an unamplified reference, it is likely that these artifactual apparent CNVs were the result of preferential amplification of certain regions of the genome and not due to an increased level of array noise.
The data from the NimbleGen platform appeared to have a high level of noise that affected our ability to detect WGA-induced CNVs when co-hybridizing unamplified and amplified samples. Our results suggest that amplified and unamplified samples cannot be directly compared to uncover WGA-induced artifacts using the NimbleGen CGH array. However, this should not preclude the comparison of similarly amplified samples on this platform as we have shown using Affymetrix arrays that the biases are largely systematic and the noise is reduced substantially when comparing two amplified samples.

To explore the nature of this bias, we directly compared Affymetrix data from pre- and post-amplification sample pairs and observed a set of regions apparently over- or under-amplified in all three samples. These regions impacted a total of 21.97 Mb of sequence, consisted of 190 over-amplifications and 75 under-amplifications, and overlapped 39 of 92 regions of WGA-induced bias identified by other studies (16,17,20). The low amount of overlap is perhaps due to differences in genome coverage by the arrays used in these studies, particularly as there was no previous consensus on any region being susceptible to WGA-induced bias. Results reported are for DNA amplified using the QIAgen Mini kit and it is conceivable that DNA amplified using different protocols will exhibit different bias. While the lack of a correlation between regions of WGA-induced bias and known CNVs is different from a previous observation (17), we have demonstrated that the degree of overlap of the amplification biases we identified with known CNVs is only slightly greater than would be expected by chance. The amount of overlap observed is likely due to the fact that documented CNVs are generally large, 165 kb on average, and, in total, impact ~27% of the genome.
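The comparison against chance overlap described above can be illustrated with a simple permutation test. The sketch below is a minimal example under stated assumptions, not the procedure used in this study: it treats the genome as a single linear coordinate space, places each biased region at a uniformly random position, and uses invented toy coordinates. The function names (`count_overlapping`, `permutation_p_value`) are hypothetical.

```python
import random


def overlaps_any(start, end, intervals):
    """Return True if the half-open region [start, end) overlaps any (s, e) interval."""
    return any(start < e and s < end for s, e in intervals)


def count_overlapping(regions, cnvs):
    """Count the regions that overlap at least one known CNV."""
    return sum(overlaps_any(s, e, cnvs) for s, e in regions)


def permutation_p_value(regions, cnvs, genome_len, n_iter=1000, seed=0):
    """Empirical P that randomly placed regions overlap CNVs at least as often as observed."""
    rng = random.Random(seed)
    observed = count_overlapping(regions, cnvs)
    hits = 0
    for _ in range(n_iter):
        shuffled = []
        for s, e in regions:
            length = e - s
            # Assumption: every position on the toy genome is equally likely.
            new_start = rng.randint(0, genome_len - length)
            shuffled.append((new_start, new_start + length))
        if count_overlapping(shuffled, cnvs) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_iter + 1)


# Toy data: coordinates are invented for illustration only.
toy_cnvs = [(100, 300), (600, 700)]
toy_regions = [(250, 260), (400, 410), (650, 660)]
observed = count_overlapping(toy_regions, toy_cnvs)
p = permutation_p_value(toy_regions, toy_cnvs, genome_len=1000)
print(observed, p)
```

Because documented CNVs cover a large fraction of the genome, a high P value from a test of this kind indicates that the observed overlap is not distinguishable from random placement. A realistic analysis would also need to respect chromosome boundaries and array coverage.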

The difference in size and size distribution of the over- and under-amplifications that we identified suggests focal over-amplification of specific sequences and broader under-representation of others. We observed a direct relationship between amplification efficiency and GC content, as over-amplified regions had a statistically significant increase in GC content relative to the under-amplified regions (P = 1.93 × 10^−5) and the magnitude of over-amplification appeared to scale directly with GC richness (Figure 4). These results are consistent with the notion that WGA-induced over-amplification bias is related to the increased binding affinity of GC-rich hexamers relative to AT-rich hexamers and not to a shortage of hexamers corresponding to repetitive regions in the genome. There is also the possibility that, unlike many polymerases, Phi29 polymerase is more efficient in synthesizing GC-rich sequences, thereby resulting in over-amplification of these regions. These effects likely also contribute to under-amplification of GC-poor regions distributed throughout the genome, but probably not to the loss of chromosome ends. The lack of a relationship between regions of WGA-induced bias and the presence of known CNVs suggests that different mechanisms account for these phenomena.
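The relationship between GC content and the magnitude of over-amplification can be explored with a short calculation. The following sketch is illustrative only: the sequences and log2 ratios are invented toy values, the helper names (`gc_fraction`, `pearson_r`) are hypothetical, and a Pearson correlation is assumed here for simplicity rather than taken from the methods of this study.

```python
import math


def gc_fraction(seq):
    """Fraction of G or C among unambiguous bases (A, C, G, T); N and others are ignored."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    return gc / total if total else float("nan")


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length numeric lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Toy regions: sequences and log2 ratios are invented for illustration only.
toy_seqs = ["ATATATAT", "ATGCATAT", "GCGCATAT", "GCGCGCAT"]
toy_log2_ratios = [0.0, 0.1, 0.2, 0.3]

gc = [gc_fraction(s) for s in toy_seqs]
r = pearson_r(gc, toy_log2_ratios)
print(gc, r)
```

In practice the GC fraction would be computed over the reference sequence of each over- or under-amplified region, and a rank-based statistic may be preferable if the relationship is not linear.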

The loss of chromosome ends appears to be a consistent result of the WGA procedure, as 15 of the 39 ends assayed were under-amplified in all samples compared to only three that were over-amplified. Relative to chromosome ends that were not affected by bias, the under-amplified ends were enriched for repetitive sequences (P = 0.0022) but did not have a statistically significant difference in GC content (P = 0.8215). These results suggest that the source of amplification bias at chromosome ends is different from GC-content-derived biases affecting the rest of the genome. One possible explanation is the positional effect of having fewer overlapping amplification products at the ends of linear strands of DNA than in the middle. However, if this were the case then all chromosome ends should be similarly under-amplified, which they are not. Another possible explanation is that the limited quantities of hexamers corresponding to subtelomeric repeats result in fewer priming events in these regions. This may account for the loss of repetitive chromosome ends more frequently than less repetitive ends.
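As an illustration only, and not an analysis performed in this study, the asymmetry between under- and over-amplified chromosome ends (15 versus 3) can be examined with an exact two-sided sign test. The sketch assumes that, if the direction of bias were random, an affected end would be equally likely to be under- or over-amplified; the function name `two_sided_sign_test` is hypothetical.

```python
from math import comb


def two_sided_sign_test(k, n):
    """Exact two-sided binomial P for k successes in n trials under p = 0.5."""
    tail = min(k, n - k)
    # At p = 0.5 the distribution is symmetric, so the smaller tail is doubled.
    one_sided = sum(comb(n, i) for i in range(tail + 1)) / 2 ** n
    return min(1.0, 2 * one_sided)


# Counts taken from the text: 15 under-amplified and 3 over-amplified ends.
under, over = 15, 3
p = two_sided_sign_test(under, under + over)
print(round(p, 4))
```

Under that assumption the P value is below 0.01, which is consistent with under-amplification being the predominant effect at affected chromosome ends. The test ignores the 21 ends that showed no consistent bias.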

We found that samples subject to Phi29-based WGA can be used for accurate genotyping, albeit with some data loss. From the WGA samples, we consistently observed a decrease in the average number of genotype calls and a wider range of call rates compared to those from the unamplified samples. However, of the genotype calls that were made, over 98% were concordant between amplified and unamplified sample pairs. Of the fewer than 2% of calls that were non-concordant, 99.43% were discrepant heterozygotes (i.e. AB called as AA or BB, or AA or BB called as AB) rather than incorrectly called homozygotes, and almost none (<0.12%) were located in regions of WGA-induced bias. This discrepancy rate is very near that observed between unamplified replicates on the Affymetrix 500k array (24). It is likely that the genotype call non-concordance reflects the genotyping accuracy of the array in the presence of increased noise due to WGA, and not true genotype changes induced by WGA through allele-specific amplification or polymerase error.
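The call rate and concordance figures discussed above can be computed from paired genotype calls as in the following minimal sketch. The genotype encoding (`"AA"`, `"AB"`, `"BB"`, `"NoCall"`), the function name `genotype_summary` and the toy calls are assumptions made for illustration and do not reflect the output format of any particular genotyping software.

```python
def genotype_summary(unamplified, amplified, no_call="NoCall"):
    """Summarize call rate and concordance for paired per-SNP genotype calls."""
    if len(unamplified) != len(amplified):
        raise ValueError("call lists must have the same length")
    # Concordance is only defined where both samples produced a call.
    both = [(u, a) for u, a in zip(unamplified, amplified)
            if u != no_call and a != no_call]
    discordant = [(u, a) for u, a in both if u != a]
    # A discordant pair involving AB is a discrepant heterozygote.
    het_discordant = [pair for pair in discordant if "AB" in pair]
    return {
        "call_rate_amplified": sum(a != no_call for a in amplified) / len(amplified),
        "concordance": (len(both) - len(discordant)) / len(both) if both else float("nan"),
        "het_fraction_of_discordant": (
            len(het_discordant) / len(discordant) if discordant else float("nan")
        ),
    }


# Toy calls for six SNPs (invented for illustration only).
toy_unamp = ["AA", "AB", "BB", "AB", "AA", "BB"]
toy_amp = ["AA", "AA", "BB", "AB", "NoCall", "AA"]
summary = genotype_summary(toy_unamp, toy_amp)
print(summary)
```

Separating discrepant heterozygotes from homozygote-to-homozygote changes is useful because the former are the expected signature of increased array noise, whereas the latter would be more suggestive of a genuine change in the template.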

Regardless of the source of the systematic biases induced by WGA, we have shown that pair-wise analysis of amplified samples is a viable strategy for CNV detection, provided that an appropriate threshold of significance is applied to filter out the low-significance random artifacts induced by this technique. While the greater number of apparent copy number differences detected using amplified samples has the potential to mask real events, we observed that pair-wise comparisons of such samples can detect real differences between samples. On comparing amplified samples to amplified samples, the number of artifactual copy number differences is reduced by an order of magnitude relative to comparisons of amplified versus unamplified samples, owing to the systematic nature of the bias induced by the technique. Conceivably, the use of a large, amplified reference set would be a practical alternative to pair-wise comparisons for larger batches of amplified samples requiring a universal reference. Of the apparent copy number differences detected by the three pair-wise comparisons using unamplified material, all of the top deletions and two of the three top amplifications were identified as the most significant by the corresponding comparisons using amplified material. When we applied this technique to paired child/father samples with known, validated copy number differences (9), four of the five validated differences detected by the original study using unamplified DNA were the most significant in the same comparisons using amplified DNA. The only validated CNV that was missed using WGA material was due to a difference in coverage by the array platforms used. A similar difference in coverage partially explains the presence of six high confidence CNVs detected by the WGA experiments but not seen in the original study, as one of these has recently been observed in the unamplified material using a higher resolution platform.
Therefore, when evaluating the results from amplified comparisons, CNVs with the top ranked significance are more likely to be real CNVs in the unamplified sample.
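The reason a systematic bias is largely removed by comparing amplified samples to amplified samples can be shown with a toy calculation. The sketch below makes a strong simplifying assumption that is not established by this study: that WGA bias acts as a purely multiplicative, perfectly reproducible factor at each probe. All values and names (`log2_ratios`, `wga_bias`) are hypothetical.

```python
import math


def log2_ratios(test, reference):
    """Per-probe log2 ratio of test signal to reference signal."""
    return [math.log2(t / r) for t, r in zip(test, reference)]


# True copy number at four probes; the test sample has a one-copy loss at probe 3.
true_test = [2.0, 2.0, 1.0, 2.0]
true_ref = [2.0, 2.0, 2.0, 2.0]

# Assumed reproducible WGA bias per probe (invented values).
wga_bias = [1.0, 1.6, 0.7, 1.0]

amp_test = [c * b for c, b in zip(true_test, wga_bias)]
amp_ref = [c * b for c, b in zip(true_ref, wga_bias)]

# Amplified versus unamplified: the bias appears as apparent copy number change.
amp_vs_unamp = log2_ratios(amp_test, true_ref)
# Amplified versus amplified: the shared bias cancels, leaving the real loss.
amp_vs_amp = log2_ratios(amp_test, amp_ref)
print(amp_vs_unamp)
print(amp_vs_amp)
```

In this idealized case the amplified-versus-amplified ratios recover the true one-copy loss exactly, whereas the amplified-versus-unamplified ratios show a spurious gain at the second probe. Real data also contain a random component that does not cancel, which is why only the most significant differences should be trusted.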

SUPPLEMENTARY DATA

ACKNOWLEDGEMENTS

The authors gratefully acknowledge the expert technical assistance of Susanna Chan, Jennifer Asano and Adrian Aly of the Affymetrix Array group at the Genome Sciences Centre, BC Cancer Agency. Support for this work and funding to pay the Open Access publication charges for this article was received from the BC Cancer Foundation, the Canada Foundation for Innovation, and Hoffmann-La Roche Ltd. of Canada. TJP is a Senior Graduate Trainee of the Michael Smith Foundation for Health Research (MSFHR) and the BC Cancer Foundation and has been supported by fellowship grants from Eli Lilly and the University of British Columbia. MG is a Senior Graduate Trainee of the MSFHR and Genome BC and is supported by a fellowship from the Natural Sciences and Engineering Research Council. MAM is a senior scholar of the MSFHR and a Terry Fox Young Investigator.