BMC Genetics 2011, 12:29
doi:10.1186/1471-2156-12-29

Published: 7 March 2011

Abstract

Background

Array-based detection of copy number variations (CNVs) is widely used for identifying
disease-specific genetic variations. However, the accuracy of CNV detection is still
insufficient, and results differ depending on the detection program used and its parameters.
In this study, we evaluated the performance of five widely used CNV detection programs,
Birdsuite (mainly consisting of the Birdseye and Canary modules), Birdseye (part of
Birdsuite), PennCNV, CGHseg, and DNAcopy, on the Affymetrix platform using HapMap data
and other experimental data. Furthermore, we identified CNVs in 180 healthy Japanese
individuals using the parameters that showed the best performance on the HapMap data
and investigated their characteristics.

Results

The results indicate that the hidden Markov model-based programs PennCNV and Birdseye
(part of Birdsuite), as well as Birdsuite itself, show better detection performance than
the other programs when high reproducibility rates for the same individuals and low
Mendelian inconsistencies are considered. Furthermore, when rates of overlap with other
experimental results were taken into account, Birdsuite showed the best performance in
terms of sensitivity but was expected to include many false negatives and some false
positives. The results for the 180 healthy Japanese individuals demonstrate that the
proportion of CNVs containing repeat sequences in both their start and end regions, not
only segmental repeats but also long interspersed nuclear element (LINE) sequences, is
higher for CNVs commonly detected among multiple individuals than for randomly selected
regions, and that the conservation score based on primates is lower in these regions than
in randomly selected regions. Similar tendencies were observed in the HapMap data and
other experimental data.
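
The evaluation criteria named above (reproducibility between replicates of the same
individual, Mendelian inconsistency, and overlap with other experimental results) all come
down to asking whether a CNV call in one set has a counterpart in another set. The abstract
does not state the matching rule, so the following Python sketch is only an illustration
under assumptions: the 50% reciprocal-overlap threshold, the function names, and the
example coordinates are hypothetical and are not taken from the study.

```python
from typing import List, Tuple

# A CNV call as (chromosome, start, end) in base pairs, with start <= end.
CNV = Tuple[str, int, int]


def overlaps(a: CNV, b: CNV, min_fraction: float = 0.5) -> bool:
    """Return True if a and b are on the same chromosome and their shared span
    covers at least min_fraction of each call (assumed reciprocal-overlap rule)."""
    if a[0] != b[0]:
        return False
    shared = min(a[2], b[2]) - max(a[1], b[1])
    if shared <= 0:
        return False
    return (shared >= min_fraction * (a[2] - a[1])
            and shared >= min_fraction * (b[2] - b[1]))


def fraction_matched(calls: List[CNV], reference: List[CNV],
                     min_fraction: float = 0.5) -> float:
    """Fraction of calls that have at least one counterpart in reference.
    Usable as a reproducibility rate (replicate vs. replicate) or as an
    overlap rate against another experimental data set."""
    if not calls:
        return 0.0
    matched = sum(1 for c in calls
                  if any(overlaps(c, r, min_fraction) for r in reference))
    return matched / len(calls)


def mendelian_inconsistency_rate(child: List[CNV], father: List[CNV],
                                 mother: List[CNV],
                                 min_fraction: float = 0.5) -> float:
    """Fraction of the child's calls with no counterpart in either parent."""
    if not child:
        return 0.0
    return 1.0 - fraction_matched(child, father + mother, min_fraction)


# Hypothetical calls, for illustration only.
replicate_1 = [("chr1", 100, 200), ("chr2", 500, 900)]
replicate_2 = [("chr1", 120, 210), ("chr3", 10, 50)]
reproducibility = fraction_matched(replicate_1, replicate_2)

child = [("chr1", 100, 200), ("chr4", 1000, 2000)]
father = [("chr1", 90, 190)]
mother: List[CNV] = []
inconsistency = mendelian_inconsistency_rate(child, father, mother)
```

With these example calls, one of the two calls in each comparison finds a counterpart, so
both the reproducibility rate and the Mendelian inconsistency rate come out at 0.5. A
stricter or looser threshold would change which calls count as matched, which is one way
in which reported performance can depend on the parameters chosen.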

Conclusions

Our results suggest that not only segmental repeats but also interspersed repeats,
especially LINE sequences, are deeply involved in CNVs, particularly in the formation
of common CNVs.