Internal function to estimate cutpoint following a machine-learning approach

mlCutpoint(
  LR,
  control.names,
  treatment.names,
  column,
  div.col,
  tv.col = NULL,
  tv.cut,
  post.cut = 0.5,
  classifier1,
  classifier2 = NULL,
  interactions = NULL,
  n.pc,
  prop = 0.6,
  center = FALSE,
  scale = FALSE,
  stat = 0L,
  cut.values = NULL,
  maxnodes = NULL,
  ntree = 400,
  nsplit = 1L,
  num.cores = 1L,
  tasks = 0L,
  ...
)

Arguments

LR, res, control.names, treatment.names

Same as in estimateCutPoint

column, div.col, tv.col, tv.cut, cut.values, stat

Same as in estimateCutPoint

classifier1, classifier2, n.pc, prop, post.cut

Same as in estimateCutPoint

interactions

Used only if a logistic classifier is selected. Variable interactions to consider in the logistic regression model. Any pairwise combination of the variables 'hdiv', 'TV', 'wprob', and 'pos' can be provided, for example: 'hdiv:TV', 'wprob:pos', 'wprob:TV', etc.
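A minimal base-R sketch of what an interaction term such as 'hdiv:TV' means in a logistic regression model. The data are simulated, and the names d, y, and fml are hypothetical. This illustrates the general technique with stats::glm and is not the internal code of mlCutpoint.

```r
## Simulated toy data; column names mirror the variables named above
set.seed(123)
n <- 200
d <- data.frame(hdiv = rexp(n), TV = runif(n, -1, 1))

## Simulated binary response that depends on both variables and their product
eta <- -0.5 + 0.8 * d$hdiv + 1.2 * d$TV + 0.6 * d$hdiv * d$TV
d$y <- rbinom(n, 1, plogis(eta))

## Interaction given as a character string, as in the 'interactions' argument
interactions <- "hdiv:TV"

## Build the formula y ~ hdiv + TV + hdiv:TV and fit a logistic model
fml <- reformulate(c("hdiv", "TV", interactions), response = "y")
fit <- glm(fml, data = d, family = binomial())

## The fitted model carries one coefficient per main effect and interaction
names(coef(fit))
```

Each interaction string adds one product term to the linear predictor, so the effect of one variable is allowed to change with the value of the other.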

center

A logical value indicating whether the variables should be shifted to be zero centered.

scale

A logical value indicating whether the variables should be scaled to have unit variance.
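A minimal base-R sketch of what centering and scaling do to a matrix of predictor variables. The toy matrix x is simulated and its column names are illustrative only. That these flags behave like the center and scale. arguments of stats::prcomp is an assumption, not something this page states.

```r
## Simulated toy predictors (hypothetical values)
set.seed(1)
x <- matrix(rnorm(40, mean = 5, sd = 3), ncol = 4,
            dimnames = list(NULL, c("hdiv", "TV", "wprob", "pos")))

## center = TRUE: subtract each column mean, so columns have mean zero
xc <- scale(x, center = TRUE, scale = FALSE)
colMeans(xc)

## scale = TRUE: additionally divide by the column standard deviation,
## so columns have unit variance
xs <- scale(x, center = TRUE, scale = TRUE)
apply(xs, 2, sd)

## The same options as they appear in a principal component analysis
pc <- prcomp(x, center = TRUE, scale. = TRUE)
summary(pc)
```

Scaling matters mostly when the variables are on very different ranges, since otherwise the variable with the largest variance dominates the leading principal components.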

maxnodes, ntree

Only for the Random Forest classifier (randomForest, 'random_forest'). \(maxnodes\) is the maximum number of terminal nodes that trees in the forest can have. If not given, trees are grown to the maximum possible size. \(ntree\) is the number of trees to grow. It should not be set too small, so that every input row gets predicted at least a few times.

nsplit

Only for the Random Forest classifier (randomForest, 'random_forest'). The randomForest package uses a C+Fortran implementation that only supports integer indexes, so any data frame, data table, or matrix with more than 2^31 - 1 elements (the limit for integers) gives an error. With the option nsplit, \(nsplit\) models are trained, each with \(ntrees=floor(ntree/nsplit)\) trees (that is, \(rep(ntrees,nsplit)\)), and these are finally combined to obtain a forest with \(ntree\) trees.
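The arithmetic behind nsplit can be sketched in base R. The values of ntree and nsplit below are arbitrary, and the names per_model and total are hypothetical. How mlCutpoint handles a remainder when nsplit does not divide ntree is not stated on this page, so the sketch only reports it.

```r
## Integer limit that motivates splitting the training into several models
.Machine$integer.max   # 2147483647, i.e. 2^31 - 1

ntree  <- 400L
nsplit <- 3L

## Trees per model, as in ntrees = floor(ntree / nsplit)
ntrees <- floor(ntree / nsplit)

## One entry per model, as in rep(ntrees, nsplit)
per_model <- rep(ntrees, nsplit)

## Size of the combined forest and the trees left over by the floor()
total     <- sum(per_model)
remainder <- ntree - total

per_model
total
remainder
```

When nsplit divides ntree evenly (for example ntree = 400 and nsplit = 4), the combined forest has exactly ntree trees.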

...

Additional arguments passed to the evaluateDIMPclass function.

cutp_data, num.cores, tasks

Same as in estimateCutPoint

Value

As specified in function estimateCutPoint for the parameter setting simple = FALSE.

Details

This function is called by function estimateCutPoint.

Author

Robersy Sanchez (https://genomaths.com).

Examples

## Get a set of potential DMPs (PS)
data(PS, package = 'MethylIT')

cutp <- mlCutpoint(LR = PS,
                   column = c(hdiv = TRUE, TV = TRUE,
                              wprob = TRUE, pos = TRUE),
                   classifier1 = 'qda', n.pc = 4,
                   control.names = c('C1', 'C2', 'C3'),
                   treatment.names = c('T1', 'T2', 'T3'),
                   tv.cut = 0.68, prop = 0.6,
                   div.col = 9L)
cutp$testSetPerformance
#> Confusion Matrix and Statistics
#> 
#>           Reference
#> Prediction  CT  TT
#>         CT  53   0
#>         TT   0 610
#>                                      
#>                Accuracy : 1          
#>                  95% CI : (0.9945, 1)
#>     No Information Rate : 0.9201     
#>     P-Value [Acc > NIR] : < 2.2e-16  
#>                                      
#>                   Kappa : 1          
#>                                      
#>  Mcnemar's Test P-Value : NA         
#>                                      
#>             Sensitivity : 1.0000     
#>             Specificity : 1.0000     
#>          Pos Pred Value : 1.0000     
#>          Neg Pred Value : 1.0000     
#>              Prevalence : 0.9201     
#>          Detection Rate : 0.9201     
#>    Detection Prevalence : 0.9201     
#>       Balanced Accuracy : 1.0000     
#>                                      
#>        'Positive' Class : TT         
#>
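
The headline statistics in the output above follow directly from the four cells of the confusion matrix. The base-R sketch below recomputes them by hand, taking 'TT' as the positive class as reported. The object names (cm, total, and so on) are hypothetical, and the counts are copied from the printed table rather than obtained by running mlCutpoint.

```r
## Confusion matrix copied from the printed output (rows: prediction)
cm <- matrix(c(53,   0,
                0, 610),
             nrow = 2, byrow = TRUE,
             dimnames = list(Prediction = c("CT", "TT"),
                             Reference  = c("CT", "TT")))

total <- sum(cm)                                   # 663 test-set sites

## Fraction of sites on the diagonal (correctly classified)
accuracy <- sum(diag(cm)) / total

## Fraction of reference sites in the positive class 'TT'
prevalence <- sum(cm[, "TT"]) / total

## True positive and true negative rates
sensitivity <- cm["TT", "TT"] / sum(cm[, "TT"])
specificity <- cm["CT", "CT"] / sum(cm[, "CT"])

## No-information rate: share of the largest reference class
nir <- max(colSums(cm)) / total

round(c(accuracy = accuracy, prevalence = prevalence,
        sensitivity = sensitivity, specificity = specificity,
        nir = nir), 4)
```

Because about 92% of the test sites belong to 'TT', the no-information rate is already 0.9201, which is why the output compares the accuracy against it.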