Beta-binomial Posterior Methylation Levels — beta_bin

This function estimates the posterior probabilities of methylation levels, assuming that the methylation levels follow a Beta distribution and taking advantage of the fact that the Beta distribution is a conjugate prior for the Binomial distribution.
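As a minimal sketch of the conjugacy this relies on (not the function's actual internals): with a Beta(alpha, beta) prior and mC methylated reads out of mC + uC at a site, the posterior is Beta(alpha + mC, beta + uC), with mean (alpha + mC) / (alpha + beta + mC + uC). The helper name `posterior_meth`, the counts, and the flat prior below are assumptions made only for illustration.

```r
# Hypothetical helper (not part of the package): posterior mean methylation
# level per site under a Beta(alpha, beta) prior and binomial read counts.
posterior_meth <- function(mC, uC, alpha = 1, beta = 1) {
    (alpha + mC) / (alpha + beta + mC + uC)
}

# Made-up counts for three cytosine sites
mC <- c(0, 5, 20)
uC <- c(10, 5, 0)

# Raw proportions versus posterior means under a flat Beta(1, 1) prior
raw <- mC / (mC + uC)
post <- posterior_meth(mC, uC)
print(cbind(mC, uC, raw, post))
```

With a flat prior the posterior means are pulled slightly away from 0 and 1, which is the practical effect of pseudo-counts greater than zero.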

beta_bin_meth(x, ...)

# S4 method for matrix_OR_data.frame
beta_bin_meth(
  x,
  init.pars = NULL,
  via.optim = TRUE,
  loss.fun = c("linear", "huber", "smooth", "cauchy", "arctg"),
  verbose = TRUE,
  ...
)

# S4 method for GRanges
beta_bin_meth(
  x,
  init.pars = NULL,
  via.optim = TRUE,
  loss.fun = c("linear", "huber", "smooth", "cauchy", "arctg"),
  verbose = TRUE,
  ...
)

# S4 method for GRangesList
beta_bin_meth(
  x,
  init.pars = NULL,
  via.optim = TRUE,
  loss.fun = c("linear", "huber", "smooth", "cauchy", "arctg"),
  num.cores = multicoreWorkers(),
  tasks = 0L,
  verbose = TRUE,
  ...
)

Arguments

x

A GRanges-class object carrying a matrix of counts in its metacolumns, with the counts of methylated (mC) and unmethylated (uC) cytosines for at least 10 cytosine sites. Alternatively, it can be just the 'matrix' or 'data.frame' of counts.

init.pars

Initial parameter values. Default is NULL, in which case an initial guess is estimated using the optim function. If the initial guess fails, the initial parameter values are set to alpha = 1 and beta = 1, which corresponds to the parsimonious choice of pseudo-counts greater than zero.

via.optim

Logical. Whether to estimate the Beta distribution parameters via optim or nls.lm. If either of these approaches fails, the parameters given in init.pars are returned.

loss.fun

Loss function(s) used to estimate the best-fitting Beta distribution model (only applied when Bayesian=TRUE; see (Loss function)). The fitting follows the approach used in the R package usefr. Defining \(z = 1/2 * sum((f(x) - y)^2)\), the available losses are:

"linear": linear function, which gives standard least squares: \(loss(z) = z\).
"huber": Huber loss, \(loss(z) = ifelse(z \leq 1, z, sqrt(z) -1)\).
"smooth": Smooth approximation to the sum of the absolute values of the residuals: \(loss(z) = 2*(sqrt(z + 1) - 1)\).
"cauchy": Cauchy loss: \(loss(z) = log(z + 1)\).
"arctg": arc-tangent loss function: \(loss(z) = atan(z)\).

The 'linear' loss function works well for most methylation datasets of acceptable quality.
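The loss functions and the optim-based fit described above can be sketched in base R as follows. This is an illustration under stated assumptions, not the package's implementation: the losses are transcribed as written in the list, `fit_beta` is a hypothetical helper, the choice of fitting the Beta CDF to the empirical CDF is an assumption, and the data are simulated.

```r
# Loss functions transcribed from the list above; z is the scaled sum of
# squared residuals, z = 1/2 * sum((f(x) - y)^2).
losses <- list(
    linear = function(z) z,
    huber  = function(z) ifelse(z <= 1, z, sqrt(z) - 1),
    smooth = function(z) 2 * (sqrt(z + 1) - 1),
    cauchy = function(z) log(z + 1),
    arctg  = function(z) atan(z)
)

# Hypothetical helper: fit Beta(alpha, beta) to methylation levels p by
# matching the Beta CDF to the empirical CDF with optim().
fit_beta <- function(p, loss.fun = "linear", init.pars = c(1, 1)) {
    loss <- losses[[match.arg(loss.fun, names(losses))]]
    y <- ecdf(p)(p)  # empirical CDF evaluated at the observed levels
    objective <- function(par) {
        z <- 0.5 * sum((pbeta(p, par[1], par[2]) - y)^2)
        loss(z)
    }
    fit <- tryCatch(
        optim(init.pars, objective, method = "L-BFGS-B",
              lower = c(1e-3, 1e-3)),
        error = function(e) NULL
    )
    # Fall back to the initial values if the optimisation fails
    if (is.null(fit) || fit$convergence != 0) return(init.pars)
    fit$par
}

# Simulated methylation levels (assumed data, for illustration only)
set.seed(1)
p <- rbeta(500, shape1 = 2, shape2 = 5)

pars_linear <- fit_beta(p, "linear")
pars_cauchy <- fit_beta(p, "cauchy")
print(rbind(linear = pars_linear, cauchy = pars_cauchy))
```

The fallback to `init.pars` mirrors the behaviour described for `via.optim`; the optimiser, its bounds, and the CDF-matching objective are choices made for this sketch only.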

Author

Robersy Sanchez (https://genomaths.com)