This function selects the cytosine sites carrying a potential methylation signal. The potential signals from controls and treatments are used as a prior classification in a further signal-detection step.

```
getPotentialDIMP(
LR,
nlms = NULL,
div.col,
dist.name = "Weibull2P",
absolute = FALSE,
alpha = 0.05,
pval.col = NULL,
tv.col = NULL,
tv.cut = NULL,
idv.col = NULL,
idv.cut = NULL,
min.coverage = NULL,
hdiv.col = NULL,
hdiv.cut = NULL,
pAdjustMethod = NULL
)
```

- LR
An object from the 'InfDiv' or 'testDMP' class. These objects are previously obtained with the function `estimateDivergence` or `FisherTest`.

- nlms
A list of fitted distribution models (output of the `gofReport` function) or NULL. If NULL, then the empirical cumulative distribution function is used to get the potential DMPs.

- div.col
Number of the meta-column where the divergence variable is located.

- dist.name
Name of the fitted distribution. This can be the name of a single distribution or a character vector of length(nlms). Default is the two-parameter Weibull distribution: 'Weibull2P'. The available options are:

**"Weibull2P"** Weibull with two parameters.

**"Weibull3P"** Weibull with three parameters.

**"Gamma2P"** Gamma with two parameters.

**"Gamma3P"** Gamma with three parameters.

**"GGamma3P"** Generalized gamma with three parameters.

**"GGamma4P"** Generalized gamma with four parameters.

**"ECDF"** The empirical cumulative distribution function.

**"None"** No distribution.

If **dist.name != 'None'** and **nlms != NULL**, then a column named 'wprob' with a probability vector derived from the application of model 'nlms' will be returned.

- absolute
Logical (default: FALSE). Total variation (TV, the difference of methylation levels) is normally an output in the downstream MethylIT analysis. If 'absolute = TRUE', then TV is transformed into |TV|, which is an information divergence that can be fitted to a Weibull or a Generalized Gamma distribution. So, if the nonlinear fit was performed for |TV|, then absolute must be set to TRUE.

- alpha
A numerical value (usually \(\alpha \leq 0.05\)) used to select cytosine sites \(k\) with information divergence (\(DIV_k\)) for which the probabilities hold: \(P(DIV_k > DIV(\alpha))\).

- pval.col
An integer denoting a column from each GRanges object from LR where p-values are provided when **dist.name == 'None'** and **nlms == NULL**. Default is NULL. If NULL and **dist.name == 'None'** and **nlms == NULL**, then a column named **adj.pval** will be used to select the potential DMPs.

- tv.col
Column number for the total variation to be used for filtering cytosine positions (if provided).

- tv.cut
If tv.cut and tv.col are provided, then cytosine sites \(k\) with \(abs(TV_k) < tv.cut\) are removed before performing the ROC analysis.

- min.coverage
Cytosine sites with coverage less than min.coverage are discarded. Default: 0

- hdiv.col
Optional. A column number for the Hellinger distance to be used for filtering cytosine positions. Default is NULL.

- hdiv.cut
If hdiv.cut and hdiv.col are provided, then cytosine sites \(k\) with hdiv < hdiv.cut are removed.

- pAdjustMethod
Method used to adjust the p-values from other approaches, such as Fisher's exact test, which involve multiple comparisons. Default is NULL. Do not apply it when a probability distribution model is used (**when nlms is given**), since it makes no sense.

- idiv.col
Optional. A column number for any of the available information divergences (TV, bay.TV, hdiv, or jdiv) used for filtering cytosine positions. Default is NULL.

- idiv.cut
If idiv.cut and idiv.col are provided, then cytosine sites \(k\) with a divergence value (from column idiv.col) lower than idiv.cut are removed.
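The ECDF fallback (used when nlms is NULL) and the cut-off filters described above can be illustrated on toy data with base R only. This is a minimal sketch under stated assumptions, not MethylIT's internal code: the data frame `sites`, its columns `hdiv` and `TV`, and the column name `p_upper` are hypothetical stand-ins for the meta-columns of one GRanges object.

```r
## Toy stand-in for the meta-columns of one sample (hypothetical data,
## not produced by MethylIT).
set.seed(1)
sites <- data.frame(
    hdiv = rweibull(1000, shape = 0.6, scale = 1.2),  # divergence values
    TV   = runif(1000, min = -1, max = 1)             # methylation level difference
)

alpha  <- 0.05
tv.cut <- 0.25

## Empirical upper-tail probability for each site, analogous to relying
## on the ECDF when no fitted model is supplied.
Fn <- ecdf(sites$hdiv)
sites$p_upper <- 1 - Fn(sites$hdiv)

## Keep sites in the upper alpha tail, then drop sites with |TV| < tv.cut.
keep <- sites$p_upper < alpha & abs(sites$TV) >= tv.cut
potential <- sites[keep, ]
nrow(potential)
```

The two conditions are combined here in a single logical index; the order in which the actual function applies its filters is not stated in this page.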

A list of GRanges objects, each GRanges object carrying the selected cytosine sites and the probabilities that the specified divergence values can be greater than the critical value specified by \(\alpha\): \(P(DIV_k > DIV(\alpha))\).

The potential signals are cytosine sites k with information divergence (DIV_k) values greater than the DIV(alpha = 0.05). The value of alpha can be specified. For example, potential signals with DIV_k > DIV(alpha = 0.01) can be selected. For each sample, cytosine sites are selected based on the corresponding nonlinear fitted distribution model that has been supplied.
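The selection rule can be sketched with base R for a single sample, assuming a two-parameter Weibull model. The shape and scale values below are hypothetical and chosen only for illustration (in practice they would come from the fitted model in nlms), and the divergence values are simulated rather than taken from real data.

```r
## Hypothetical fitted two-parameter Weibull (illustrative values only).
shape <- 0.6
scale <- 1.2
alpha <- 0.05

## Critical value DIV(alpha): the (1 - alpha) quantile of the model.
div_crit <- qweibull(1 - alpha, shape = shape, scale = scale)

## Simulated divergence values standing in for one sample.
set.seed(42)
div <- rweibull(5000, shape = shape, scale = scale)

## Upper-tail probability of each site under the model.
p_upper <- pweibull(div, shape = shape, scale = scale, lower.tail = FALSE)

## Selecting by tail probability or by critical value picks the same sites.
sel_by_prob <- p_upper < alpha
sel_by_crit <- div > div_crit
sum(sel_by_crit)
```

Lowering alpha (for example to 0.01) raises `div_crit` and therefore selects fewer sites.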

```
library(MethylIT)

## Get a dataset of Hellinger divergences of methylation levels and their
## corresponding best nonlinear fit distribution models from the package
data(HD, gof)

## Select the potential DMPs in each sample using its best-fit model
PS <- getPotentialDIMP(LR = HD, nlms = gof$nlms, dist.name = gof$bestModel,
                       div.col = 9L, alpha = 0.05)
```
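For the case where no distribution model is supplied (dist.name = "None" and nlms = NULL), selection relies on p-values, optionally adjusted for multiple comparisons. The sketch below shows such an adjustment with base R's `p.adjust()`. Whether getPotentialDIMP calls `p.adjust()` internally is an assumption, and the p-values are made up for illustration.

```r
## Hypothetical raw p-values for a handful of sites.
pval  <- c(0.0004, 0.003, 0.012, 0.02, 0.04, 0.2, 0.6)
alpha <- 0.05

## Benjamini-Hochberg adjustment with base R's p.adjust().
adj.pval <- p.adjust(pval, method = "BH")

## Indices of the sites whose adjusted p-value stays below alpha.
selected <- which(adj.pval < alpha)
selected
```

The fifth site has a raw p-value below 0.05 but is no longer selected after adjustment, which is the effect the adjustment is meant to have.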