Parameter estimation is carried out on a count data matrix. The estimation relies on the fact that if a variable \(x = (x_1, x_2, \dots, x_n)\) follows a Dirichlet distribution with parameters \(\alpha = (\alpha_1, \dots, \alpha_n)\) (all positive reals), in short \(x \sim Dir(\alpha)\), then each component satisfies \(x_i \sim Beta(\alpha_i, \alpha_0 - \alpha_i)\), where Beta(.) stands for the Beta distribution and \(\alpha_0 = \sum_{i=1}^{n} \alpha_i\).

The Dirichlet distribution is a family of continuous multivariate probability distributions and is the multivariate generalization of the Beta distribution.
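The marginal property stated above can be checked numerically. The following is a minimal sketch in base R, not the package's own code: it assumes Dirichlet samples are drawn by normalizing independent Gamma variates (a standard construction, used here in place of rdirichlet so that the block is self-contained), and the names g, ks_p, emp_mean and theo_mean are illustrative only.

```r
set.seed(123)
alpha <- c(2.1, 3.1, 1.2)
n <- 5000

# Draw Dirichlet(alpha) samples by normalizing independent Gamma variates.
# The matrix is filled column-wise, so column j holds Gamma(shape = alpha[j]) draws.
g <- matrix(rgamma(n * length(alpha), shape = rep(alpha, each = n)), nrow = n)
x <- g / rowSums(g)

alpha0 <- sum(alpha)

# Kolmogorov-Smirnov test of each marginal against Beta(alpha_i, alpha0 - alpha_i)
ks_p <- sapply(seq_along(alpha), function(i) {
  ks.test(x[, i], "pbeta",
          shape1 = alpha[i], shape2 = alpha0 - alpha[i])$p.value
})

# Empirical marginal means versus the theoretical Beta means alpha_i / alpha0
emp_mean  <- colMeans(x)
theo_mean <- alpha / alpha0

print(round(rbind(emp_mean, theo_mean), 3))
print(ks_p)
```

If the property holds, the empirical and theoretical means should agree closely and the Kolmogorov-Smirnov p-values should not be systematically small.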

estimateDirichDist(
  x,
  start = NULL,
  num.cores = 1L,
  tasks = 0L,
  seed = 123,
  refit = TRUE,
  verbose = TRUE,
  ...
)

Arguments

x

A matrix or a data.frame object carrying count data.

start

Initial parameter values for \(\alpha = (\alpha_1, \dots, \alpha_n)\) (all positive reals). Default is NULL.

num.cores, tasks

Parameters for parallel computation using the BiocParallel package: the number of cores to use, i.e., the maximum number of child processes run simultaneously (see bplapply), and the number of tasks per job (only for Linux OS).

verbose

If TRUE, prints the function log to stdout and displays a progress bar.

...

Further arguments for the betaDistEstimation function.

Value

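A list of class "DirchModel" with two elements: alpha, the vector of estimated parameter values, and marginals, the fit summary of the Beta distribution estimated for each marginal (see Examples).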

Details

As with any non-linear fitting, the results strongly depend on the starting parameter values.
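Because the fit is sensitive to the starting values, one possible way to obtain a reasonable starting point is a method-of-moments estimate. The sketch below is an assumption, not part of the package: dirichlet_mom_start is a hypothetical helper, and it assumes that the rows of x are proportions summing to 1. It uses the Dirichlet moments \(E[x_i] = \alpha_i / \alpha_0\) and \(Var[x_i] = E[x_i](1 - E[x_i]) / (\alpha_0 + 1)\).

```r
# Hypothetical helper (not part of the package): method-of-moments
# starting values for the Dirichlet parameters.
# Assumes each row of x holds proportions that sum to 1.
dirichlet_mom_start <- function(x) {
  x <- as.matrix(x)
  m <- colMeans(x)        # estimates alpha_i / alpha0
  v <- apply(x, 2, var)   # estimates m_i * (1 - m_i) / (alpha0 + 1)
  # Solve the variance equation for alpha0 in each column, then average
  alpha0 <- mean(m * (1 - m) / v - 1)
  m * alpha0
}

# Simulated data for illustration (Gamma normalization, base R only)
set.seed(123)
alpha <- c(2.1, 3.1, 1.2)
n <- 5000
g <- matrix(rgamma(n * length(alpha), shape = rep(alpha, each = n)), nrow = n)
x <- g / rowSums(g)

start <- dirichlet_mom_start(x)
print(start)
```

The resulting vector could then be supplied through the start argument, e.g. estimateDirichDist(x, start = start). Whether this improves convergence in a given case is not guaranteed.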

Author

Robersy Sanchez <https://genomaths.com>

Examples

## Random generation of numerical vectors following a Dirichlet distribution
x <- rdirichlet(n = 1000, alpha = c(2.1, 3.1, 1.2))
head(x)
#>             a1        a2         a3
#> [1,] 0.4168350 0.5197227 0.06344232
#> [2,] 0.1273661 0.5615770 0.31105695
#> [3,] 0.2937281 0.5000729 0.20619906
#> [4,] 0.1761859 0.5912126 0.23260156
#> [5,] 0.6001897 0.3025950 0.09721535
#> [6,] 0.2455860 0.4635076 0.29090639

estimateDirichDist(x)
#> $alpha
#> [1] 2.211949 3.462360 1.231335
#> 
#> $marginals
#> $marginals$a1
#>        Estimate  Std.Error  t_value Pr(>|t|) Adj.R.Square       rho R.Cross.val
#> shape1 2.211949 0.01144309 193.2999   <1e-16    0.9999984 0.9999984   0.9996673
#> shape2 4.722073 0.03163860 149.2504   <1e-16           NA        NA          NA
#>              AIC      BIC    n
#> shape1 -6935.663 -6920.94 1000
#> shape2        NA       NA   NA
#> 
#> $marginals$a2
#>        Estimate Std.Error  t_value Pr(>|t|) Adj.R.Square       rho R.Cross.val
#> shape1 3.462360 0.0165572 209.1151   <1e-16    0.9999983 0.9999983   0.9997052
#> shape2 3.578023 0.0316386 113.0904   <1e-16           NA        NA          NA
#>             AIC       BIC    n
#> shape1 -7035.97 -7021.247 1000
#> shape2       NA        NA   NA
#> 
#> $marginals$a3
#>        Estimate   Std.Error  t_value Pr(>|t|) Adj.R.Square       rho
#> shape1 1.231335 0.007393049 166.5531   <1e-16    0.9999993 0.9999993
#> shape2 5.262551 0.031638600 166.3332   <1e-16           NA        NA
#>        R.Cross.val       AIC       BIC    n
#> shape1   0.9997919 -7374.791 -7360.067 1000
#> shape2          NA        NA        NA   NA
#> 
#> 
#> attr(,"class")
#> [1] "DirchModel" "list"