pcaLogisticR {MethylIT}    R Documentation

Logistic Classification Model using Principal Component Analysis (PCA)


Principal components (PCs) are estimated from the predictor variables provided as input data. Next, the individual coordinates in the selected PCs are used as predictors in the logistic regression.

Logistic regression using Principal Components from PCA as predictor variables


pcaLogisticR(formula = NULL, data = NULL, n.pc = 1, scale = FALSE,
  center = FALSE, tol = 1e-04, max.pc = NULL)

## S3 method for class 'pcaLogisticR'
predict(object, newdata, type = c("class",
  "posterior", "pca.ind.coord", "all"), ...)



formula
Same as in 'glm' from package 'stats'.


data
Same as in 'glm' from package 'stats'.


n.pc
Number of principal components to use in the logistic regression.


scale
Same as parameter 'scale.' of 'prcomp' from package 'stats'.


center
Same as in 'prcomp' from package 'stats'.


tol
Same as in 'prcomp' from package 'stats'.


max.pc
Same as parameter 'rank.' of 'prcomp' from package 'stats'.


object
To use with function 'predict'. A 'pcaLogisticR' object, i.e., a list containing: 1) an object of class inheriting from "glm" and 2) an object of class inheriting from "prcomp".


newdata
To use with function 'predict'. New data for classification prediction.


type
To use with function 'predict'. The type of prediction required: "class", "posterior", "pca.ind.coord", or "all". If type = "all", function 'predict.pcaLogisticR' ('predict') returns a list with: 1) 'class': individual classification. 2) 'posterior': probabilities for the positive class. 3) 'pca.ind.coord': PC individual coordinates. Each element of this list can be requested independently using parameter 'type'.


...
Not in use.


The principal components (PCs) are obtained using function 'prcomp', while the logistic regression is performed using function 'glm', both from R package 'stats'. The current application only uses basic functionalities of these functions. As shown in the example, function 'pcaLogisticR' can be used in general classification problems.
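
The two-step procedure described above can be sketched with base R alone. This is only an illustration, assuming that the PCs are computed on the predictor columns and that a binomial 'glm' is then fitted on the first 'n.pc' coordinates; it is not the internal code of 'pcaLogisticR', and the object names ('dat', 'vars', 'pca', 'pc.data', 'fit') are chosen for this sketch.

```r
## Two-class subset of 'iris' (versicolor vs. virginica)
dat <- iris[iris$Species != "setosa", ]
dat$Species <- droplevels(dat$Species)

## Step 1: PCA on the predictors (centered and scaled)
vars <- c("Petal.Length", "Sepal.Length", "Petal.Width")
pca <- prcomp(dat[, vars], center = TRUE, scale. = TRUE,
              tol = 1e-04, rank. = 2)

## Step 2: logistic regression on the individual coordinates in the PCs
n.pc <- 2
pc.data <- data.frame(Species = dat$Species,
                      pca$x[, seq_len(n.pc), drop = FALSE])
fit <- glm(Species ~ ., data = pc.data, family = binomial(link = "logit"))

## Coefficients for the intercept, PC1 and PC2
coef(fit)
```

Here 'rank.' plays the role that 'max.pc' has in 'pcaLogisticR', and 'n.pc' selects how many of the retained PCs enter the model.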


Function 'pcaLogisticR' returns an object ('pcaLogisticR' class) containing a list of four elements:

  1. 'logistic': an object of class 'glm' from package 'stats'.

  2. 'pca': an object of class 'prcomp' from package 'stats'.

  3. 'reference.level': response level used as reference.

  4. 'positive.level': response level that corresponds to a "positive" result. When 'predict' is called with type = "posterior", the probability vector returned gives, for each individual, the probability of being a positive result, i.e., the probability of belonging to the class of the positive level.

For information on how to use these objects see ?glm and ?prcomp.
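
As a rough guide to how a 'glm' and a 'prcomp' object can be combined for prediction, the sketch below projects new observations onto the PCs and then evaluates the logistic model on those coordinates. It uses only base R; the 0.5 probability cutoff, the choice of the second factor level as the positive level, and all object names are assumptions of this sketch, not documented behavior of 'predict.pcaLogisticR'.

```r
## Fit PCA + logistic regression with base R (versicolor vs. virginica)
dat <- iris[iris$Species != "setosa", ]
dat$Species <- droplevels(dat$Species)
vars <- c("Petal.Length", "Sepal.Length", "Petal.Width")
pca <- prcomp(dat[, vars], center = TRUE, scale. = TRUE, rank. = 2)
fit <- glm(Species ~ ., family = binomial,
           data = data.frame(Species = dat$Species, pca$x))

## Assumed convention: first level is the reference, second is "positive"
reference <- levels(dat$Species)[1]
positive <- levels(dat$Species)[2]

## New observations: project onto the PCs, then predict
newdata <- dat[c(1, 51, 75), vars]
pc.coord <- predict(pca, newdata = newdata)
posterior <- predict(fit, newdata = as.data.frame(pc.coord),
                     type = "response")

## Assumed cutoff of 0.5 on the probability of the positive level
pred.class <- ifelse(posterior > 0.5, positive, reference)

## Same three pieces that type = "all" is documented to return
pred <- list(class = pred.class, posterior = posterior,
             pca.ind.coord = pc.coord)
```

With 'glm' and a two-level factor response, type = "response" yields the probability of the second level, which is why that level is treated as the positive one here.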


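library(MethylIT)
## Two-class problem: keep only 'setosa' and 'versicolor'
data <- iris[ iris$Species != "virginica", ]
data$Species <- droplevels(data$Species)
formula <- Species ~ Petal.Length + Sepal.Length + Petal.Width
## Fit the logistic model on the first two PCs of the scaled, centered predictors
pca.logistic <- pcaLogisticR(formula = formula,
                            data = data, n.pc = 2, scale = TRUE,
                            center = TRUE, max.pc = 2)
## Classify 40 randomly sampled rows of 'iris' (drawn from all three species)
set.seed(123)
newdata <- iris[sample.int(150, 40), 1:4]
newdata.prediction <- predict(pca.logistic, newdata, type = "all")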

[Package MethylIT version 0.3.1 ]