classPerform {MethylIT.utils} | R Documentation |
Classification performance based on an information divergence (e.g., Hellinger divergence) carried in a list of GRanges objects. The total variation distance (TVD, the absolute difference of methylation levels) is used as a pivot to specify the cytosine sites considered true positives and true negatives. The function confusionMatrix from package "caret" is applied to obtain the classification performance.
classPerform(LR, min.tv = 0.25, tv.cut, cutoff, tv.col, div.col = NULL, pval.col = NULL, stat = 1)
LR |
A list of GRanges, a GRangesList, or a CompressedGRangesList object. Each GRanges object in the list must have two columns: methylated (mC) and unmethylated (uC) counts. The name of each element of the list must coincide with a control or a treatment name. |
min.tv |
Minimum value for the total variation distance (TVD; absolute value of methylation level differences, TVD = abs(TV)). Only sites/ranges k with TVD_{k} > min.tv are analyzed. Default: min.tv = 0.25. |
tv.cut |
A cutoff for the total variation distance to be applied to each site/range. If tv.cut is provided, then sites/ranges k with TVD_{k} < tv.cut are considered true negatives and those with TVD_{k} > tv.cut true positives. Its value must be NULL or a number 0 < tv.cut < 1. |
cutoff |
A cutoff value for the divergence of methylation levels or for the p-value, i.e., for the magnitude given in div.col or in pval.col, respectively (see below). Values greater than 'cutoff' are predicted TRUE (positives); all other values are predicted FALSE (negatives). |
tv.col |
Column number for the total variation distance (TVD; absolute value of methylation levels differences, TVD = abs(TV)). |
div.col |
Column number for divergence variable used in the performance analysis and estimation of the cutpoints. Default: NULL. One of the parameter values div.col or pval.col must be given. |
pval.col |
Column number for p-value used in the performance analysis and estimation of the cutpoints. Default: NULL. One of the parameter values div.col or pval.col must be given. |
stat |
An integer indicating the statistic to be used in the testing. The mapping of integers to statistic names is: 0 = "All", 1 = "Accuracy", 2 = "Sensitivity", 3 = "Specificity", 4 = "Pos Pred Value", 5 = "Neg Pred Value", 6 = "Precision", 7 = "Recall", 8 = "F1", 9 = "Prevalence", 10 = "Detection Rate", 11 = "Detection Prevalence", 12 = "Balanced Accuracy". |
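The roles of min.tv, tv.cut and cutoff described above can be illustrated with a minimal base R sketch. It is not the package's implementation: the vectors tvd and div are made-up toy values, and the confusion table is built with table() rather than with caret's confusionMatrix, purely to show how the reference and predicted labels would be derived under the rules stated in the arguments.

```r
# Toy values for six cytosine sites (hypothetical, not taken from sim_ps)
tvd <- c(0.30, 0.45, 0.60, 0.28, 0.52, 0.35)  # total variation distance
div <- c(20.1, 75.3, 90.2, 70.0, 50.4, 10.8)  # information divergence

min.tv <- 0.25  # only sites with TVD > min.tv are analyzed
tv.cut <- 0.4   # TVD > tv.cut: true positive; TVD < tv.cut: true negative
cutoff <- 68.7  # divergence > cutoff: predicted positive

keep  <- tvd > min.tv
truth <- tvd[keep] > tv.cut  # reference labels derived from TVD
pred  <- div[keep] > cutoff  # predicted labels derived from the divergence

# 2 x 2 confusion table: rows are predictions, columns are reference labels
conf <- table(
  predicted = factor(pred,  levels = c(TRUE, FALSE)),
  reference = factor(truth, levels = c(TRUE, FALSE))
)

tp <- conf["TRUE",  "TRUE"]
fp <- conf["TRUE",  "FALSE"]
fn <- conf["FALSE", "TRUE"]
tn <- conf["FALSE", "FALSE"]

accuracy    <- (tp + tn) / sum(conf)
sensitivity <- tp / (tp + fn)
specificity <- tn / (tn + fp)
```

In this sketch, accuracy, sensitivity and specificity correspond to the statistics selected with stat = 1, 2 and 3, respectively.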
Samples from each group are pooled according to the statistic selected (see parameter pooling.stat), and a single GRanges object is created with the methylated and unmethylated read counts for each group (control and treatment) in the metacolumns. Thus, a contingency table can be built for each range of the GRanges object.
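As a rough illustration of the contingency table mentioned above, the following base R sketch builds a 2 x 2 table for a single site from pooled counts and derives the TVD from it. The counts and the object names (ctrl, treat, ctab) are hypothetical and chosen only for the example; the actual column layout produced by the package may differ.

```r
# Hypothetical pooled read counts for one cytosine site
ctrl  <- c(mC = 12, uC = 28)  # control group
treat <- c(mC = 30, uC = 10)  # treatment group

# 2 x 2 contingency table: groups in rows, methylation status in columns
ctab <- rbind(control = ctrl, treatment = treat)

# Methylation level of each group: mC / (mC + uC)
meth_level <- ctab[, "mC"] / rowSums(ctab)

# Total variation distance: absolute difference of methylation levels
tvd <- abs(unname(meth_level["treatment"] - meth_level["control"]))
```

A site like this one would pass the default filter min.tv = 0.25, since its methylation levels differ by more than that threshold.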
A list with the classification performance results.
Robersy Sanchez
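library(MethylIT.utils)

# Load simulated data of potential methylation signal
data(sim_ps)

# Report all performance statistics (stat = 0) using the divergence in column 9
classPerform(LR = PS, min.tv = 0.25, tv.cut = 0.4, cutoff = 68.7, tv.col = 7L, div.col = 9, stat = 0)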