pcaLogisticR {MethylIT}    R Documentation

Logistic Classification Model using Principal Component Analysis (PCA)

Description

Principal components (PCs) are estimated from the predictor variables provided as input data. The individual coordinates on the selected PCs are then used as predictors in the logistic regression.

Logistic regression using Principal Components from PCA as predictor variables
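The two steps described above can be sketched with base R alone. This is only an illustration of the idea, not the internal code of 'pcaLogisticR': the object names ('dat', 'vars', 'train', 'fit') are made up for the sketch, and the versicolor/virginica subset of iris is chosen here because its classes overlap, so the logistic fit converges cleanly.

```r
# Two-class data set with overlapping classes (assumption of this sketch)
dat  <- droplevels(iris[iris$Species != "setosa", ])
vars <- c("Petal.Length", "Sepal.Length", "Petal.Width")

# Step 1: estimate the principal components from the predictors
pca <- prcomp(dat[, vars], center = TRUE, scale. = TRUE)

# Step 2: keep the individual coordinates on the first n.pc components
n.pc  <- 2
train <- data.frame(Species = dat$Species,
                    pca$x[, seq_len(n.pc), drop = FALSE])

# Step 3: logistic regression with the PC coordinates as predictors
fit <- glm(Species ~ ., data = train, family = binomial(link = "logit"))
coef(fit)
```

The fitted model has one coefficient per retained PC plus an intercept, so 'n.pc' directly controls the number of predictors in the regression.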

Usage

pcaLogisticR(formula = NULL, data = NULL, n.pc = 1, scale = FALSE,
  center = FALSE, tol = 1e-04, max.pc = NULL)

## S3 method for class 'pcaLogisticR'
predict(object, newdata, type = c("class",
  "posterior", "pca.ind.coord", "all"), ...)

Arguments

formula

Same as in 'glm' from package 'stats'.

data

Same as in 'glm' from package 'stats'.

n.pc

Number of principal components to use as predictors in the logistic regression.

scale

Same as in 'prcomp' from package 'stats'.

center

Same as in 'prcomp' from package 'stats'.

tol

Same as in 'prcomp' from package 'stats'.

max.pc

Same as parameter 'rank.' in 'prcomp' from package 'stats'.

object

To use with function 'predict'. A 'pcaLogisticR' object: a list that includes 1) an object of class inheriting from "glm" and 2) an object of class inheriting from "prcomp" (see Value).

newdata

To use with function 'predict'. New data for classification prediction.

type

To use with function 'predict'. The type of prediction required: "class", "posterior", "pca.ind.coord", or "all". If type = 'all', function 'predict.pcaLogisticR' ('predict') returns a list with: 1) 'class': the predicted class of each individual; 2) 'posterior': the probabilities for the positive class; 3) 'pca.ind.coord': the individual coordinates on the PCs. Each element of this list can be requested independently using parameter 'type'.

...

Not in use.
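To make the three prediction types concrete, the sketch below derives analogous quantities with 'prcomp' and 'glm' from package 'stats'. It is not the code of 'predict.pcaLogisticR'. The 0.5 probability cutoff used for the class assignment is an assumption of the sketch, as are the object names and the choice of training data.

```r
# Fit PCA + logistic regression on a two-class subset (illustrative data)
dat  <- droplevels(iris[iris$Species != "setosa", ])
vars <- c("Petal.Length", "Sepal.Length", "Petal.Width")
pca  <- prcomp(dat[, vars], center = TRUE, scale. = TRUE)
fit  <- glm(Species ~ PC1,
            data = data.frame(Species = dat$Species, PC1 = pca$x[, 1]),
            family = binomial)

set.seed(123)
newdata <- dat[sample.int(nrow(dat), 10), vars]

# Analogue of "pca.ind.coord": project the new rows onto the fitted PCs
pca.ind.coord <- predict(pca, newdata = newdata)

# Analogue of "posterior": probability of the positive (second) level
posterior <- predict(fit,
                     newdata = data.frame(PC1 = pca.ind.coord[, 1]),
                     type = "response")

# Analogue of "class": assumed cutoff of 0.5 on the posterior probability
lv <- levels(dat$Species)
pred.class <- factor(ifelse(posterior > 0.5, lv[2], lv[1]), levels = lv)

# Analogue of type = "all": the three results collected in one list
pred <- list(class = pred.class, posterior = posterior,
             pca.ind.coord = pca.ind.coord)
str(pred, max.level = 1)
```

New observations must be projected with the PCA fitted on the training data, which is why the sketch calls 'predict' on the 'prcomp' object rather than running a new PCA on 'newdata'.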

Details

The principal components (PCs) are obtained using the function prcomp, while the logistic regression is performed using the function glm, both from the R package 'stats'. The current application only uses basic functionalities of these functions. As shown in the example, the 'pcaLogisticR' function can be used in general classification problems.
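Since 'scale', 'center', 'tol' and 'max.pc' are documented as equivalents of 'prcomp' arguments, their effect can be explored on 'prcomp' directly. The sketch below is a hedged illustration: it assumes that 'scale' and 'max.pc' are forwarded to the 'prcomp' arguments 'scale.' and 'rank.', and the tolerance of 0.5 is deliberately large so that its effect is visible.

```r
x <- iris[, 1:4]

# All components of the centered and scaled data
pca.all <- prcomp(x, center = TRUE, scale. = TRUE)

# rank. caps the number of components that are computed and returned
pca.two <- prcomp(x, center = TRUE, scale. = TRUE, rank. = 2)

# tol drops components whose standard deviation is not larger than
# tol times the standard deviation of the first component
pca.tol <- prcomp(x, center = TRUE, scale. = TRUE, tol = 0.5)

dim(pca.all$x)   # 150 rows, 4 components
dim(pca.two$x)   # 150 rows, 2 components
ncol(pca.tol$x)  # number of components that pass the tolerance
```

Under this assumption, 'n.pc' should not exceed the number of components that survive 'max.pc' and 'tol'.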

Value

Function 'pcaLogisticR' returns an object of class 'pcaLogisticR', a list with the following elements:

  1. 'logistic': an object of class 'glm' from package 'stats'.

  2. 'pca': an object of class 'prcomp' from package 'stats'.

  3. 'reference.level': response level used as reference.

  4. 'positive.level': response level that corresponds to a "positive" result. When type = "response", the returned probability vector gives, for each individual, the probability of a positive result, i.e., the probability of belonging to the class of the positive level.

For information on how to use these objects see ?glm and ?prcomp.
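The reference and positive levels can be illustrated with 'glm' alone. For a binomial model with a factor response, 'glm' treats the first factor level as failure and the remaining level as success. Whether 'pcaLogisticR' sets 'reference.level' and 'positive.level' in exactly this way is an assumption of the sketch, and the variable names simply mirror the element names listed above.

```r
dat <- droplevels(iris[iris$Species != "setosa", ])

lv <- levels(dat$Species)
reference.level <- lv[1]   # first level: modelled as "failure"
positive.level  <- lv[2]   # second level: modelled as "success"

fit <- glm(Species ~ Petal.Length, data = dat, family = binomial)

# Fitted values are probabilities of belonging to the positive level
p <- predict(fit, type = "response")

# Average fitted probability within each true class
tapply(p, dat$Species, mean)
```

The average fitted probability is higher in the class given by 'positive.level', which confirms which level the probabilities refer to. Calling 'relevel' on the response before fitting would change the reference level.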

Examples

## Two-class problem: drop 'virginica' and its unused factor level
data(iris)
data <- iris[ iris$Species != "virginica", ]
data$Species <- droplevels(data$Species)

## Logistic regression on the first two PCs of three predictors
formula <- Species ~ Petal.Length + Sepal.Length + Petal.Width
pca.logistic <- pcaLogisticR(formula = formula,
                            data = data, n.pc = 2, scale = TRUE,
                            center = TRUE, max.pc = 2)

## Classify 40 randomly sampled rows of the full iris data
set.seed(123)
newdata <- iris[sample.int(150, 40), 1:4]
newdata.prediction <- predict(pca.logistic, newdata, type = "all")


[Package MethylIT version 0.3.1 ]