Given a GRanges object with the methylated and unmethylated read counts for control and treatment in its metacolumn, Fisher's exact test is performed for each cytosine site.
FisherTest(
LR,
count.col = c(1, 2),
control.names = NULL,
treatment.names = NULL,
pooling.stat = "sum",
tv.cut = NULL,
hdiv.cut = NULL,
hdiv.col = NULL,
pAdjustMethod = "BH",
pvalCutOff = 0.05,
saveAll = FALSE,
num.cores = 1L,
tasks = 0L,
verbose = FALSE,
progressbar = TRUE,
...
)
A list of GRanges, a GRangesList, a CompressedGRangesList object, or an object from Methyl-IT downstream analyses: 'InfDiv' or 'pDMP' object. Each GRanges object from the list must have two columns: methylated (mC) and unmethylated (uC) counts. The name of each element from the list must coincide with a control or a treatment name.
Vector of two integers with the indexes of the read count columns. If not given, the methylated and unmethylated read counts are assumed to be located in columns 1 and 2 of the metacolumns of each GRanges object. If object LR is the output of Methyl-IT function estimateDivergence, then columns 1:4 are the read count columns: columns 1 and 2 are the methylated and unmethylated read counts from the reference group, while columns 3 and 4 are the methylated and unmethylated read counts from the treatment group, respectively. In this case, if the requested comparison is reference versus treatment, then no specification is needed for count.col. The comparison control versus treatment can be obtained by setting count.col = 3:4 and providing control.names and treatment.names.
Names/IDs of the control and treatment samples, which must be included in the variable GR at the metacolumn. Default is NULL. If provided, Fisher's exact test of control versus treatment is performed. If NULL, then it is assumed that each GRanges object in LR has four columns of counts: the first two columns are the methylated and unmethylated counts from the control/reference, and the other two columns are the methylated and unmethylated counts from the treatment, respectively.
Statistic used to estimate the methylation pool: row sum, row mean or row median of methylated and unmethylated read counts across individuals. If the number of control samples is greater than 2 and pooling.stat is not NULL, then they will be pooled. The same applies to the treatment samples. Otherwise, all the pairwise comparisons will be done.
A cutoff for the total variation distance (TVD; absolute value of the difference of methylation levels), estimated at each site/range as the difference of the group means of methylation levels. If tv.cut is provided, then sites/ranges k with \(|TV_k| < tv.cut\) are removed before the test is performed. Its value must be NULL or a number \(0 < tv.cut < 1\).
An optional cutoff for the Hellinger divergence (hdiv). If the LR object derives from the previous application of function estimateDivergence, then a column with the hdiv values is provided. If combined with tv.cut, this permits a more effective filtering of the signal from the noise. Default is NULL.
Optional. Columns where the hdiv values are located in each GRanges object from LR. It must be provided together with hdiv.cut. Default is NULL.
Method used to adjust the p-values. Default: "BH".
Cutoff applied to the p-values when a p-value adjustment is performed.
If TRUE, all the temporary results are returned.
The number of cores to use, i.e. at most how many child processes will be run simultaneously (see the bplapply function from BiocParallel).
integer(1). The number of tasks per job. The value must be a scalar integer >= 0L. In this documentation a job is defined as a single call to a function, such as bplapply, bpmapply, etc. A task is the division of the X argument into chunks. When tasks == 0 (default), X is divided as evenly as possible over the number of workers (see MulticoreParam from the BiocParallel package).
if TRUE, prints the function log to stdout
logical(1). Enable progress bar
Additional parameters for function uniqueGRanges.
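The pooling controlled by pooling.stat and the filtering controlled by tv.cut can be sketched in base R. This is only an illustration under stated assumptions, not the package's internal code: the count matrices, the helper pool_counts, the cutoff value, and the sign convention (treatment minus control) are all hypothetical choices made for the example.

```r
# Hypothetical read counts: rows are cytosine sites, columns are samples.
ctrl_mC  <- matrix(c(10, 0, 5,   12, 1, 4,    9, 0, 6), nrow = 3)
ctrl_uC  <- matrix(c(2, 20, 5,    3, 18, 6,   1, 22, 4), nrow = 3)
treat_mC <- matrix(c(1, 15, 5,    2, 17, 6,   0, 16, 4), nrow = 3)
treat_uC <- matrix(c(11, 3, 5,   10, 2, 4,   12, 4, 6), nrow = 3)

# Hypothetical helper: pool the counts of one group across its samples.
pool_counts <- function(x, stat = c("sum", "mean", "median")) {
    stat <- match.arg(stat)
    switch(stat,
        sum = rowSums(x),
        mean = rowMeans(x),
        median = apply(x, 1, median))
}

# One pooled count column per group and methylation state.
pooled <- data.frame(
    ctrl_mC  = pool_counts(ctrl_mC, "sum"),
    ctrl_uC  = pool_counts(ctrl_uC, "sum"),
    treat_mC = pool_counts(treat_mC, "sum"),
    treat_uC = pool_counts(treat_uC, "sum")
)

# Group means of the per-sample methylation levels, and their difference
# (assumed sign convention: treatment minus control).
ctrl_level  <- rowMeans(ctrl_mC / (ctrl_mC + ctrl_uC))
treat_level <- rowMeans(treat_mC / (treat_mC + treat_uC))
tv <- treat_level - ctrl_level

# Keep only the sites whose |TV| reaches the hypothetical cutoff.
tv_cut <- 0.25
pooled_kept <- pooled[abs(tv) >= tv_cut, ]
```

In this toy data the third site has nearly identical methylation levels in both groups, so it is dropped before any test would be run on it.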
The input GRanges object with the columns of Fisher's exact test p-value, total variation (difference of methylation levels), and p-value adjustment.
Samples from each group are pooled according to the selected statistic (see parameter pooling.stat), and a unique GRanges object is created with the methylated and unmethylated read counts for each group (control and treatment) in its metacolumns. A contingency table can then be built for each range of the GRanges object.
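The per-site test described above can be sketched with base R's fisher.test and p.adjust. This is a minimal illustration, not the code used by FisherTest: the counts and column names are hypothetical, and the row/column layout of the 2 x 2 table, the sign of the TV, and the strict comparison against the cutoff are assumptions.

```r
# Hypothetical pooled read counts for three cytosine sites.
counts <- data.frame(
    ctrl_mC  = c(30, 5, 20),
    ctrl_uC  = c(10, 45, 20),
    treat_mC = c(12, 40, 21),
    treat_uC = c(28, 10, 19)
)

# One 2 x 2 contingency table per site (assumed layout):
#              mC   uC
#   control     .    .
#   treatment   .    .
fisher_p <- apply(counts, 1, function(x) {
    tab <- matrix(x, nrow = 2, byrow = TRUE,
        dimnames = list(c("control", "treatment"), c("mC", "uC")))
    fisher.test(tab)$p.value
})

# Difference of methylation levels (assumed: treatment minus control).
tv <- counts$treat_mC / (counts$treat_mC + counts$treat_uC) -
    counts$ctrl_mC / (counts$ctrl_mC + counts$ctrl_uC)

# Benjamini-Hochberg adjustment, then the cutoff on the adjusted p-values.
adj_p <- p.adjust(fisher_p, method = "BH")
result <- cbind(counts, TV = tv, pvalue = fisher_p, adj.pval = adj_p)
significant <- result[result$adj.pval < 0.05, ]
```

With these toy counts the first two sites differ strongly between groups and pass the cutoff, while the third site does not.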
## Get a dataset of Hellinger divergence of methylation levels
## from the package
library(MethylIT)
data(HD)
### --- To get the read counts
hd <- lapply(HD, function(x) {
    ## Keep the first 10 ranges and metacolumns 3:4 (treatment counts)
    x <- x[1:10, 3:4]
    colnames(mcols(x)) <- c('mC', 'uC')
    return(x)
})
FisherTest(LR = hd, pooling.stat = NULL, control.names = 'C1',
treatment.names = 'T1', pAdjustMethod='BH', pvalCutOff = 0.05,
num.cores = 1L, verbose=FALSE)
#>
#> list object of length: 1
#> -------
#> $C1.T1
#> GRanges object with 4 ranges and 7 metadata columns:
#> seqnames ranges strand | c1 t1 c2 t2
#> <Rle> <IRanges> <Rle> | <numeric> <numeric> <numeric> <numeric>
#> [1] 1 1 + | 57 7 43 21
#> [2] 1 6 - | 64 49 0 113
#> [3] 1 7 - | 41 0 10 31
#> [4] 1 9 - | 0 165 162 3
#> TV pvalue adj.pval
#> <numeric> <numeric> <numeric>
#> [1] -0.218750 4.84887e-03 1.45466e-02
#> [2] -0.566372 3.16957e-25 1.90174e-24
#> [3] -0.756098 6.01609e-14 2.40644e-13
#> [4] 0.981818 1.61719e-92 1.94062e-91
#> -------
#> seqinfo: 1 sequence from an unspecified genome; no seqlengths
#>
#> ...
#> <0 more GRanges elements>
#> -------
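The TV column of the printed result can be checked by hand from the four count columns. The sketch below copies the counts from the output above; reading c1/t1 as the methylated/unmethylated counts of the first group and c2/t2 as those of the second group is an assumption inferred from the numbers, not something stated in the documentation.

```r
# Counts copied from the printed result above.
res <- data.frame(
    c1 = c(57, 64, 41, 0),
    t1 = c(7, 49, 0, 165),
    c2 = c(43, 0, 10, 162),
    t2 = c(21, 113, 31, 3)
)

# Methylation level of each group under the assumed column meaning.
p_first  <- res$c1 / (res$c1 + res$t1)
p_second <- res$c2 / (res$c2 + res$t2)

# Difference of methylation levels: second group minus first group.
tv <- p_second - p_first
round(tv, 6)
# -0.218750 -0.566372 -0.756098  0.981818
```

Under that reading, the recomputed values agree with the TV column reported by FisherTest.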