A new algorithm can be submitted to peter_park@harvard.edu. The algorithm should accept parameters either from the command line or as formData and write its results to a file. We will not be responsible for debugging the code if the algorithm does not work. The author is also requested to submit a brief note about the algorithm's memory requirement during runtime. Any questions regarding algorithm submission can be directed to CGHweb.contact@gmail.com.

Select our 3 preferred algorithms

Our three preferred algorithms are (1) Circular Binary Segmentation (Olshen et al., Biostatistics, 2004), (2) Gaussian model with adaptive penalty (Picard et al., BMC Bioinformatics, 2005), and (3) Fused Lasso (Tibshirani and Wang, Biostatistics, 2007). The first two were chosen based on their excellent performance in our comparative study (Lai et al., Bioinformatics, 2005); the third is a more recent method that appears to work well.


BioHMM

BioHMM uses a heterogeneous hidden Markov model. By default, it considers the distance between probes when estimating its parameters and gives higher probabilities to probes that are further apart than others. This algorithm is called from the snapCGH package, which is available in BioConductor.
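To make the distance idea concrete, here is a minimal two-state Viterbi sketch in Python. This is not the snapCGH/BioHMM implementation (which is in R), and the state means, standard deviation, and distance-decay rate are illustrative assumptions; the point is only that the chance of a copy-number state change between adjacent probes grows with the genomic distance separating them.

```python
import math

def change_prob(distance, rate=1e-6):
    # Illustrative (not BioHMM's actual parameterization): the probability
    # of a state change between adjacent probes grows with their distance.
    return 0.5 * (1.0 - math.exp(-rate * distance))

def viterbi_two_state(logratios, positions, means=(0.0, 0.58), sd=0.25):
    """Most likely state path for a two-state (normal/gain) HMM whose
    transition probabilities depend on inter-probe distance."""
    def emit(x, m):  # Gaussian log-likelihood of a log-ratio in a state
        return -0.5 * ((x - m) / sd) ** 2 - math.log(sd * math.sqrt(2 * math.pi))
    n, k = len(logratios), len(means)
    score = [[emit(logratios[0], m) for m in means]]
    back = []
    for t in range(1, n):
        pc = change_prob(positions[t] - positions[t - 1])
        trans = [[math.log(1 - pc), math.log(pc)],
                 [math.log(pc), math.log(1 - pc)]]
        row, ptr = [], []
        for j in range(k):
            best = max(range(k), key=lambda i: score[-1][i] + trans[i][j])
            row.append(score[-1][best] + trans[best][j] + emit(logratios[t], means[j]))
            ptr.append(best)
        score.append(row)
        back.append(ptr)
    path = [max(range(k), key=lambda j: score[-1][j])]
    for ptr in reversed(back):  # trace the best path backwards
        path.append(ptr[path[-1]])
    path.reverse()
    return path
```

With widely spaced probes, a run of elevated log-ratios is called as a gain segment because the distance-dependent transitions make a state change cheap enough to pay for itself.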

Parameters:
Use clone distances: tells the algorithm either to consider probe distances in its calculations or to assume a homogeneous hidden Markov model instead. Enabled by default.

CBS

CBS estimates the location of change-points by calculating a likelihood-ratio statistic for each probe and assessing its significance by permutation. This algorithm is called from the DNAcopy package, which is available in BioConductor.
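The core of the approach can be sketched in a few lines of Python (a simplified stand-in, not the DNAcopy implementation): score every candidate split point by the scaled difference in segment means, take the best split, and judge it against the same statistic computed on permuted data.

```python
import random

def split_statistic(x, i):
    # Absolute mean difference left/right of i, scaled by sqrt(nL*nR/n) —
    # a simplified stand-in for CBS's likelihood-ratio statistic
    # (the real statistic also involves the variance).
    left, right = x[:i], x[i:]
    ml, mr = sum(left) / len(left), sum(right) / len(right)
    return abs(ml - mr) * (len(left) * len(right) / len(x)) ** 0.5

def best_change_point(x):
    # candidate change-point with the largest statistic
    return max(range(1, len(x)), key=lambda i: split_statistic(x, i))

def permutation_pvalue(x, n_perm=200, seed=0):
    # how often does shuffled data produce a statistic at least as large?
    rng = random.Random(seed)
    obs = split_statistic(x, best_change_point(x))
    hits = 0
    for _ in range(n_perm):
        xp = x[:]
        rng.shuffle(xp)
        if split_statistic(xp, best_change_point(xp)) >= obs:
            hits += 1
    return (hits + 1) / (n_perm + 1)
```

A clean step in the data yields the correct change-point and a small permutation p-value, while shuffling destroys the spatial structure the statistic depends on.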

FASeg uses lowess to find the location of possible breakpoints and conducts local ANOVA to identify significant breakpoints. This algorithm is called from the FASeg package, which is available from http://www.sph.emory.edu/bios/FASeg/

cghFLasso

cghFLasso smoothes the data with the fused lasso, a spatial smoothing technique. This algorithm is called from the cghFLasso package, which is available from http://www-stat.stanford.edu/~tibs/cghFLasso.html. Because of cghFLasso's memory requirements, this website will divide the chromosome into smaller pieces if it has more than 10000 probes.
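The splitting step can be sketched as follows; the exact piece boundaries the site uses are an assumption, and this version simply keeps consecutive pieces of near-equal size under the 10000-probe limit.

```python
def split_chromosome(values, max_len=10000):
    """Divide a chromosome's probe values into consecutive pieces of at
    most max_len probes, preserving order (a sketch of the chunking this
    site applies before running memory-hungry algorithms)."""
    n_pieces = -(-len(values) // max_len)        # ceiling division
    base, extra = divmod(len(values), n_pieces)  # near-equal piece sizes
    pieces, start = [], 0
    for i in range(n_pieces):
        size = base + (1 if i < extra else 0)
        pieces.append(values[start:start + size])
        start += size
    return pieces
```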

Parameters:
FDR: False discovery rate (the proportion of true null hypotheses among those called significant). Default: 0.05. Range: min > 0.0, max < 1.0.
Use this value: tells the algorithm to use the FDR to determine significant segments. Disabled by default.
Recalculate Segment Means: a post-processing step (not part of cghFLasso) to recalculate the segment means after finding the breakpoints with cghFLasso. Disabled by default.
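The standard way to turn an FDR level into a significance cutoff is the Benjamini-Hochberg step-up procedure, sketched below in Python; cghFLasso's internal FDR control differs in detail, so treat this only as an illustration of the idea.

```python
def bh_significant(pvalues, fdr=0.05):
    """Benjamini-Hochberg step-up procedure: return the indices of the
    p-values called significant at the given false discovery rate."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    k = 0
    for rank, i in enumerate(order, start=1):
        # largest rank whose p-value falls under the BH line rank*fdr/m
        if pvalues[i] <= rank * fdr / m:
            k = rank
    return sorted(order[:k])
```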

CGHseg

CGHseg estimates breakpoints by constructing a cost matrix, finding all possible breakpoints from this matrix, and selecting the most likely number of breakpoints with an adaptive penalty. The algorithm has been rewritten in C based on the MATLAB code provided by Picard et al. on their website (http://www.inapg.fr/ens_rech/maths/outil_A.html). Because of CGHseg's memory requirements, this website will divide the chromosome into smaller pieces if it has more than 10000 probes.
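The cost-matrix-plus-dynamic-programming idea can be sketched in Python as follows. This is not the C or MATLAB implementation: segment cost here is the within-segment sum of squares, and a fixed per-segment penalty stands in for the adaptive penalty of Picard et al.

```python
def sse(prefix, prefix2, i, j):
    # sum of squared deviations of x[i:j] from its segment mean,
    # computed from prefix sums of x and x**2
    s = prefix[j] - prefix[i]
    return (prefix2[j] - prefix2[i]) - s * s / (j - i)

def segment(x, max_segments=5, penalty=2.0):
    """Optimal segmentation by dynamic programming over segment costs,
    then model selection with a penalty per segment (a sketch of the
    CGHseg idea; the real method uses an adaptive penalty)."""
    n = len(x)
    prefix, prefix2 = [0.0], [0.0]
    for v in x:
        prefix.append(prefix[-1] + v)
        prefix2.append(prefix2[-1] + v * v)
    INF = float("inf")
    # best[k][j] = minimal cost of splitting x[:j] into k segments
    best = [[INF] * (n + 1) for _ in range(max_segments + 1)]
    back = [[0] * (n + 1) for _ in range(max_segments + 1)]
    best[0][0] = 0.0
    for k in range(1, max_segments + 1):
        for j in range(k, n + 1):
            for i in range(k - 1, j):
                c = best[k - 1][i] + sse(prefix, prefix2, i, j)
                if c < best[k][j]:
                    best[k][j], back[k][j] = c, i
    k = min(range(1, max_segments + 1), key=lambda kk: best[kk][n] + penalty * kk)
    cuts, j = [], n
    for kk in range(k, 0, -1):  # backtrack the chosen segmentation
        i = back[kk][j]
        cuts.append((i, j))
        j = i
    return list(reversed(cuts))
```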

GLAD

GLAD smoothes the data with likelihood-based adaptive weights smoothing, removes extraneous breakpoints with a penalized likelihood, and groups the segments with unsupervised clustering. This algorithm is called from the GLAD package, which is available in BioConductor.

LOWESS

LOWESS smoothes the data with robust weighted local polynomial fitting. Probes inside the smoothing window are weighted according to their distance from the center, with the more distant probes having less weight. This algorithm is called from the stats package of R.
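The distance-based weighting is conventionally the tricube kernel, shown here as a small Python sketch (R's stats implementation uses this weight function, though the surrounding fitting and robustness iterations are not reproduced here):

```python
def tricube_weights(distances, bandwidth):
    """Tricube weighting used by lowess-style smoothers: probes farther
    from the window's center get smaller weights, reaching zero at the
    edge of the smoothing window."""
    weights = []
    for d in distances:
        u = abs(d) / bandwidth
        weights.append((1 - u ** 3) ** 3 if u < 1 else 0.0)
    return weights
```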

Parameters:
Width: the number of probes to use when calculating the weights around each probe in the lowess smoother. Default: 15. Range: min 5, max 50.

Citation: Cleveland, W. S. (1981). LOWESS: A program for smoothing scatterplots by robust locally weighted regression. The American Statistician 35:54.


Wavelet smoothing

Wavelet smoothing smoothes the data by transforming the data into frequency components with maximal overlap discrete wavelet transform. The transformed data are filtered through soft SURE thresholding and then transformed back to the time domain to get the smoothed data. This approach is similar to the procedure described in Hsu et al. (2005). This algorithm uses functions from the waveslim package, which is available in the Comprehensive R Archive Network.
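The soft-thresholding step is simple enough to show directly; this Python sketch takes the threshold as an argument rather than choosing it by SURE, and it stands in for the waveslim-based filtering, not the full transform pipeline:

```python
def soft_threshold(coeffs, threshold):
    """Soft thresholding of wavelet coefficients: shrink every coefficient
    toward zero by the threshold, setting small ones exactly to zero."""
    out = []
    for c in coeffs:
        if c > threshold:
            out.append(c - threshold)
        elif c < -threshold:
            out.append(c + threshold)
        else:
            out.append(0.0)
    return out
```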

Quantile smoothing

Quantile smoothing uses penalized quantile regression to find trends in the data. The code closely follows the R code outlined in Eilers and de Menezes (2005), except that we use the sparse implementation of the Frisch-Newton interior-point algorithm. The results represent the 50th quantile (the median). This algorithm uses functions from the quantreg package, which is available in the Comprehensive R Archive Network.
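The objective behind quantile regression is the check (pinball) loss; this Python sketch shows why minimizing it at the 50th quantile fits the median. It is an illustration of the loss only, not the penalized interior-point solver quantreg uses.

```python
def check_loss(u, tau=0.5):
    """Quantile regression's check (pinball) loss: at tau = 0.5 it reduces
    to half the absolute error, so minimizing it fits the median."""
    return u * (tau - (1.0 if u < 0 else 0.0))

def fitted_quantile(values, tau=0.5):
    # brute-force minimizer over the observed values: the tau-th sample
    # quantile minimizes the summed check loss
    return min(values, key=lambda q: sum(check_loss(v - q, tau) for v in values))
```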

This method takes the average of probe values inside a smoothing window. The code is written in C.
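The window averaging can be sketched in a few lines of Python (the site's implementation is in C; how it handles the chromosome ends is an assumption here — this version simply truncates the window):

```python
def running_mean(values, width=15):
    """Average of the probes inside a window centered on each probe;
    the window is truncated at the chromosome ends."""
    half = width // 2
    out = []
    for i in range(len(values)):
        window = values[max(0, i - half):i + half + 1]
        out.append(sum(window) / len(window))
    return out
```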

Parameters:
Width: the number of probes to use around a probe when calculating their means. Default: 15. Range: min 5, max 50.


Specify plotting parameters

These parameters adjust the minimum and maximum log-ratios to show in the graphical results. Log-ratios below the minimum are drawn at the minimum value. Log-ratios above the maximum are drawn at the maximum value.
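This clamping behaves as in the Python sketch below; the -1.0/1.0 defaults are illustrative, not the site's actual plotting limits:

```python
def clip_log_ratios(values, lo=-1.0, hi=1.0):
    """Clamp log-ratios into the plotting range: values below the minimum
    are drawn at the minimum, values above the maximum at the maximum."""
    return [min(max(v, lo), hi) for v in values]
```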

This is a simple way to identify gains and losses in the processed data. Gains are called for regions above the positive threshold. Losses are called for regions below the negative threshold.
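A minimal Python sketch of the thresholding rule (the 0.2 default is illustrative; the site accepts thresholds between 0.01 and 0.5):

```python
def call_gains_losses(segment_means, threshold=0.2):
    """Label each segment mean as a gain (above +threshold), a loss
    (below -threshold), or neutral."""
    calls = []
    for m in segment_means:
        if m > threshold:
            calls.append("gain")
        elif m < -threshold:
            calls.append("loss")
        else:
            calls.append("neutral")
    return calls
```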

Parameters:
Threshold: the cutoff used to call gains and losses. Range: min 0.01, max 0.5.

Email Address [required]

Please enter a valid email address. A link to your results will be sent via email.

Data File

Upload your data file

Upload example data file

Data file to upload:

Multiple Array File

You can upload data from multiple arrays using this feature. Once the computation is finished, you will receive an email with links to the profiles of each array. Because this batch job imposes a substantial load on our server, it will be processed when the server load is low. Please make sure that the data file has been formatted as specified here.