Perform Forward-Backward Fragment-Annealing segmentation

faseg(FASeg)

R Documentation

Description

Given a matrix of chromosome labels, physical locations, and signal strengths (one or more experiments), this function performs test-based segmentation for each experiment and returns a matrix of the same size containing the fitted values.

Usage

faseg(x, sig = 1e-4, delta = 0.2, smooth.range = 75, no.change = NA, show.progress = FALSE, log.transform = TRUE, fine.tune = FALSE, gap = Inf)

Arguments

`x`	a matrix with ncol>=3. The first column is chromosome label; the second column is the physical location on that chromosome; the third column (and beyond) is signal strength.
`sig`	The significance cutoff value for LRT in segmentation.
`delta`	A change above this value will qualify a point as possible edge.
`smooth.range`	The range of the initial loess smoothing, expressed as a number of probes. This number divided by the total number of probes on the chromosome is used as the span parameter in smoothing.
`no.change`	The target value for no copy number change. The default is NA. If provided, the fitted data will be re-scaled based on it.
`show.progress`	If true, the initial fragmentation and the subsequent elimination of edges will be shown graphically.
`log.transform`	If the data fed to this program is the copy number, it is recommended that a log2 transformation is done. However, feeding already log transformed data and asking for log transform again may result in errors.
`fine.tune`	If true, after edge selection, each edge is moved within a small neighborhood to find its optimal position. This is useful when a large smooth.range is used and edge positions for small segments become inaccurate.
`gap`	If two neighboring probesets have a distance that is larger than this value, the data is separated at this gap and analyzed as if from two different chromosomes.
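
As an illustration of the argument descriptions above, the following sketch builds a toy input in the layout described for `x` and shows how `smooth.range` and `gap` might translate into a loess span and a block split. The data, the column names, and the variable `block` are made up for this example, and the code does not call faseg itself.

```r
# Toy input in the layout described for `x`:
# column 1 = chromosome label, column 2 = position, column 3 = signal.
set.seed(1)
n <- 300
x <- cbind(chr   = rep(1, n),
           pos   = sort(sample.int(1e6, n)),
           expt1 = c(rnorm(150, mean = 2, sd = 0.2),
                     rnorm(150, mean = 3, sd = 0.2)))

# smooth.range is given in probes; the loess span is that count divided by
# the number of probes on the chromosome.
smooth.range <- 75
span <- smooth.range / nrow(x)

# Hypothetical illustration of the `gap` rule: start a new block wherever
# two neighbouring probes are further apart than `gap`.
gap <- 50000
block <- cumsum(c(TRUE, diff(x[, "pos"]) > gap))
```

Each distinct value of `block` would then be analyzed as if it came from a separate chromosome.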

Details

Fitting is done one experiment (column), one chromosome at a time.

First, a loess fit is performed, using a span equal to the smooth.range parameter divided by the total number of probes on that chromosome. Then the deltas between consecutive points are taken. If there is a run of deltas with the same sign, they are merged onto the center point of the run. Then all deltas smaller than the "delta" parameter are set to zero. All positions corresponding to non-zero deltas are considered edges.
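
The first step can be sketched in base R as follows. This is a minimal illustration on simulated data, not the package source. It assumes that "merging" a run means summing its deltas onto the middle position, that the threshold is applied to the absolute value, and that a local-linear loess (`degree = 1`) is used; none of these choices is confirmed by this page.

```r
set.seed(1)
signal <- c(rnorm(100, mean = 0, sd = 0.1), rnorm(100, mean = 1, sd = 0.1))
idx <- seq_along(signal)
smooth.range <- 40
delta <- 0.2

# Initial smoothing; span = smooth.range / number of probes.
fit <- loess(signal ~ idx, span = smooth.range / length(signal), degree = 1)
smoothed <- fit$fitted

# Deltas between consecutive smoothed points.
d <- diff(smoothed)

# Merge each run of same-sign deltas onto the centre of the run
# (assumption: the run's deltas are summed).
runs <- rle(sign(d))
ends <- cumsum(runs$lengths)
starts <- ends - runs$lengths + 1
merged <- numeric(length(d))
for (k in seq_along(ends)) {
  centre <- (starts[k] + ends[k]) %/% 2
  merged[centre] <- sum(d[starts[k]:ends[k]])
}

# Zero out small changes; the remaining positions are candidate edges.
merged[abs(merged) < delta] <- 0
edges <- which(merged != 0)
```

With a single simulated jump near probe 100, `edges` should contain a candidate close to that position.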

Second, at each edge, find the ANOVA p-value for merging at this edge, for the data between the previous and the next edge.
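
One way to picture the test in the second step is a one-way ANOVA comparing the probes on either side of the edge, restricted to the data between the previous and the next edge. The helper `merge_pvalue` below is hypothetical and written only for this illustration; the exact test statistic used inside faseg may differ.

```r
# p-value for merging at `edge`, using only y[left:right].
# Probes left..edge form one group, (edge + 1)..right the other.
merge_pvalue <- function(y, left, edge, right) {
  seg <- y[left:right]
  side <- factor(seq(left, right) > edge)
  anova(lm(seg ~ side))[["Pr(>F)"]][1]
}

set.seed(2)
y <- c(rnorm(50, mean = 0, sd = 0.1), rnorm(50, mean = 1, sd = 0.1))

p_real  <- merge_pvalue(y, left = 1, edge = 50, right = 100)  # true change point
p_false <- merge_pvalue(y, left = 1, edge = 25, right = 50)   # no change here
```

A small p-value argues against merging (the two sides differ), while a large one marks the edge as a candidate for removal.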

Third, edges are merged starting from the one with the largest p-value. After merging at each edge, all the neighboring edges that were previously merged are re-examined. If the p-value of such an edge is smaller than that of the edge currently being merged, that edge is reinstated. The p-values of the two neighboring edges are also recalculated. The procedure is iterated until all remaining p-values are smaller than the sig parameter.
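
The merging loop can be approximated by the simplified sketch below. It recomputes every p-value after each merge and removes the edge with the largest one until all remaining p-values fall below `sig`. It deliberately omits the re-examination and reinstatement of previously merged edges described above, so it illustrates the idea only and is not the package's algorithm. The helpers `edge_pvalue` and `prune_edges` are hypothetical names, and each segment is assumed to hold at least two probes.

```r
# ANOVA p-value for the edge between probes `edge` and `edge + 1`,
# using only the data in y[left:right].
edge_pvalue <- function(y, left, edge, right) {
  side <- factor(seq(left, right) > edge)
  anova(lm(y[left:right] ~ side))[["Pr(>F)"]][1]
}

prune_edges <- function(y, edges, sig = 1e-4) {
  edges <- sort(edges)
  while (length(edges) > 0) {
    # Segment boundaries: start of data, current edges, end of data.
    bounds <- c(0, edges, length(y))
    p <- vapply(seq_along(edges), function(k) {
      edge_pvalue(y, bounds[k] + 1, edges[k], bounds[k + 2])
    }, numeric(1))
    worst <- which.max(p)
    if (p[worst] < sig) break   # every remaining edge is significant
    edges <- edges[-worst]      # merge at the least significant edge
  }
  edges
}

set.seed(3)
y <- c(rnorm(60, 0, 0.1), rnorm(60, 1, 0.1), rnorm(60, 0, 0.1))
candidate <- c(30, 60, 90, 120, 150)
kept <- prune_edges(y, candidate, sig = 1e-4)
```

In this simulated example the true change points lie after probes 60 and 120, so those are the edges one would expect to survive pruning.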

Value

A matrix (or data frame, depending on the input) is returned. It has the same attributes as the input matrix. The first two columns are the same as the input matrix. All the other columns are filled with fitted values.

Author(s)

Tianwei Yu (tyu8@sph.emory.edu)

Examples

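library(FASeg)
data(sample.orig)
# Stricter significance cutoff; fitted values are re-scaled so that "no change" equals 2
sample.fitted <- faseg(sample.orig, sig = 1e-6, no.change = 2)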

[Package FASeg version 2.0 Index]