getPotentialDIMP {MethylIT}    R Documentation

Potential methylation signal

Description

This function selects the cytosine sites carrying a potential methylation signal. The potential signals from controls and treatments are used as a prior classification in a further step of signal detection.

Usage

getPotentialDIMP(LR, nlms = NULL, div.col, dist.name = "Weibull2P",
  absolute = FALSE, alpha = 0.05, tv.col = NULL, tv.cut = NULL,
  min.coverage = NULL, hdiv.col = NULL, hdiv.cut = NULL)

Arguments

LR

An object of class 'InfDiv', previously obtained with the function estimateDivergence.

nlms

A list of fitted distribution models (output of the 'fitNonlinearWeibullDist' function) or NULL. If NULL, then the empirical cumulative distribution function is used to get the potential DIMPs.

div.col

Column number of the meta-column where the divergence variable is located.

dist.name

Name of the distribution to fit: two-parameter Weibull ("Weibull2P", the default), three-parameter Weibull ("Weibull3P"), three-parameter gamma ("Gamma3P"), two-parameter gamma ("Gamma2P"), generalized gamma with three parameters ("GGamma3P") or four parameters ("GGamma4P"), the empirical cumulative distribution function ("ECDF"), or "None".

absolute

Logical (default: FALSE). Total variation (TV, the difference of methylation levels) is normally an output in the downstream MethylIT analysis. If 'absolute = TRUE', then TV is transformed into |TV|, which is an information divergence that can be fitted to a Weibull or a generalized gamma distribution. So, if the nonlinear fit was performed for |TV|, then absolute must be set to TRUE.

alpha

A numerical value (default: 0.05; usually alpha <= 0.05) used to select the cytosine sites k whose information divergence DIV_k is greater than the critical value DIV(alpha) of the fitted distribution.

tv.col

Column number for the total variation to be used for filtering cytosine positions (if provided).

tv.cut

If tv.cut and tv.col are provided, then cytosine sites k with abs(TV_k) < tv.cut are removed before performing the ROC analysis.

min.coverage

Cytosine sites with coverage less than min.coverage are discarded. Default: NULL.

hdiv.col

Optional. A column number for the Hellinger distance to be used for filtering cytosine positions. Default is NULL.

hdiv.cut

If hdiv.cut and hdiv.col are provided, then cytosine sites k with hdiv < hdiv.cut are removed.
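
The two filtering rules described for tv.cut and hdiv.cut can be sketched in base R as follows. This is only an illustration of the stated rules, not the package's internal code: a plain data frame stands in for the meta-columns of a GRanges object, and the column names, values and cutoffs are made up for the example.

```r
# Illustrative only: a plain data frame stands in for the meta-columns of a
# GRanges object; the column names and values below are hypothetical.
sites <- data.frame(
    hdiv = c(0.2, 1.5, 3.1, 0.9, 2.4),
    TV   = c(-0.10, 0.35, -0.60, 0.05, 0.25)
)

tv.cut   <- 0.2   # hypothetical cutoff for |TV|
hdiv.cut <- 1     # hypothetical cutoff for the Hellinger divergence

# A site is removed if abs(TV_k) < tv.cut or if hdiv < hdiv.cut,
# so a site is kept only when it passes both thresholds.
keep <- abs(sites$TV) >= tv.cut & sites$hdiv >= hdiv.cut
filtered <- sites[keep, ]
```

Writing the condition on the sites to keep (>=) rather than on the sites to remove (<) makes it easy to combine the two rules in a single logical index.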

Details

The potential signals are cytosine sites k with information divergence (DIV_k) values greater than the critical value DIV(alpha), with alpha = 0.05 by default. A different value of alpha can be specified. For example, potential signals with DIV_k > DIV(alpha = 0.01) can be selected. For each sample, cytosine sites are selected based on the corresponding fitted Weibull distribution model that has been supplied.
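
The selection rule can be illustrated with base R alone. The sketch below is not the MethylIT implementation; it assumes a two-parameter Weibull model whose shape and scale are already known (standing in for estimates from a nonlinear fit) and reads DIV(alpha) as the upper-tail alpha quantile of that model. The ECDF variant at the end mirrors the case where no fitted model is supplied.

```r
# Minimal sketch using only the 'stats' package; parameter values are assumed.
set.seed(123)

shape <- 0.75   # assumed Weibull shape (stand-in for a fitted estimate)
scale <- 1      # assumed Weibull scale (stand-in for a fitted estimate)
div <- rweibull(1000, shape = shape, scale = scale)  # simulated divergences

alpha <- 0.05

# Critical value DIV(alpha): upper-tail alpha quantile of the Weibull model
div_alpha <- qweibull(alpha, shape = shape, scale = scale, lower.tail = FALSE)

# Upper-tail probability of each observed divergence under the model
p_upper <- pweibull(div, shape = shape, scale = scale, lower.tail = FALSE)

# Potential signals: sites whose divergence exceeds the critical value
selected <- which(div > div_alpha)

# Empirical alternative: use the ECDF of the observed divergences instead
Fn <- ecdf(div)
selected_ecdf <- which(1 - Fn(div) < alpha)
```

Lowering alpha (for example to 0.01) raises div_alpha and therefore selects fewer sites.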

Value

A list of GRanges objects, each GRanges object carrying the selected cytosine sites and the Weibull probability P[DIV_k > DIV(alpha)].

Examples

library(MethylIT)
library(GenomicRanges)

## Simulate Hellinger divergence values from a Weibull distribution
num.points <- 1000
HD <- GRangesList( sample1 = makeGRangesFromDataFrame(
        data.frame(chr = "chr1", start = 1:num.points, end = 1:num.points,
            strand = '*',
            hdiv = rweibull(num.points, shape = 0.75, scale = 1)),
        keep.extra.columns = TRUE))
## Fit the distribution model, then select the potential DIMPs
nlms <- nonlinearFitDist(HD, column = 1, verbose = FALSE)
getPotentialDIMP(LR = HD, nlms = nlms, div.col = 1, alpha = 0.05)


[Package MethylIT version 0.3.1 ]