R: Nonlinear fit of a cumulative distribution function

fitCDF {usefr}

R Documentation

Nonlinear fit of a cumulative distribution function

Description

The parameters of a cumulative distribution function (*CDF*) are usually estimated using the corresponding probability density function (*PDF*). Different optimization algorithms can be used to accomplish this task, and different algorithms can yield different estimated parameters. Hence, why not try to fit the CDF directly?
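As a minimal sketch of the idea (not the package's implementation), the following base-R code fits a normal CDF directly to the empirical CDF by nonlinear least squares and sets the result beside the usual likelihood-based estimates. It uses stats::nls, whose default algorithm is Gauss-Newton rather than Levenberg-Marquardt. The variable names, sample size, and starting values are illustrative assumptions.

```r
set.seed(1)
x <- rnorm(2000, mean = 0.5, sd = 1)

# Empirical CDF evaluated at the sorted observations
xs <- sort(x)
Fn <- ecdf(x)(xs)

# Least-squares fit of the normal CDF to the empirical CDF
# (starting values taken from the sample moments)
fit <- nls(Fn ~ pnorm(xs, mean = m, sd = s),
           start = list(m = mean(x), s = sd(x)))

# Estimates from the direct CDF fit
est_cdf <- coef(fit)

# Maximum-likelihood estimates based on the normal PDF (closed form)
est_mle <- c(m = mean(x), s = sqrt(mean((x - mean(x))^2)))

print(rbind(CDF_fit = est_cdf, PDF_MLE = est_mle))
```

The two rows will generally be close but not identical, because each one minimizes a different objective function.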

Usage

fitCDF(varobj, distNames, plot = TRUE, plot.num = 1,
  only.info = FALSE, maxiter = 10^4, maxfev = 1e+05, ptol = 1e-12,
  verbose = TRUE)

Arguments

`varobj`	A vector containing observations, the variable for which the CDF parameters will be estimated.
`distNames`	A vector of distribution numbers selected from the list in the Details section, e.g. c(1:10, 15).
`plot`	Logical. Default TRUE. Whether to produce the plots for the best fitted CDF.
`plot.num`	The number of distributions to be plotted.
`only.info`	Logical. Default FALSE. If TRUE, only information about the parameter estimation is returned.
`maxiter, maxfev, ptol`	Parameters to control various aspects of the Levenberg-Marquardt algorithm through function `nls.lm.control` from the minpack.lm package.
`verbose`	Logical. If TRUE, prints the function log to stdout.

Details

The nonlinear fit (NLF) problem for CDFs is addressed with the Levenberg-Marquardt algorithm implemented in function nls.lm from package *minpack.lm*. This function is inspired by a script for the function fitDistr from the package propagate [1]. Some parts or script ideas from function fitDistr are used, but here we estimate the CDF rather than the PDF, which is what fitDistr fits. More informative output is also incorporated, and the studentized residuals are provided as well. The list (so far) of possible CDFs is:

1. Normal (Wikipedia)
2. Log-normal (Wikipedia)
3. Half-normal (Wikipedia). Alternatively, a scaled precision (inverse of the variance) parametrization can be used (to avoid issues if σ is near zero), obtained by setting θ = sqrt(π)/(σ*sqrt(2)).
4. Generalized Normal (Wikipedia)
5. T-Generalized Normal [2].
6. Laplace (Wikipedia)
7. Gamma (Wikipedia)
8. 3P Gamma [3].
9. Generalized 4P Gamma [3] (Wikipedia)
10. Generalized 3P Gamma [3].
11. Weibull (Wikipedia)
12. 3P Weibull (Wikipedia)
13. Beta (Wikipedia)
14. 3P Beta (Wikipedia)
15. 4P Beta (Wikipedia)
16. Beta-Weibull (ReliaWiki)
17. Generalized Beta (Wikipedia)
18. Rayleigh (Wikipedia)
19. Exponential (Wikipedia)
20. 2P Exponential (Wikipedia)
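The scaled precision parametrization mentioned for the Half-normal entry in the list above can be checked numerically. The sketch below writes the half-normal CDF in both forms and confirms they agree when θ = sqrt(π)/(σ*sqrt(2)). The function names phalfnorm_sigma and phalfnorm_theta are hypothetical helpers written for this illustration, not functions exported by usefr.

```r
# Half-normal CDF in the usual sigma parametrization:
# F(q) = erf(q / (sigma * sqrt(2))) = 2 * pnorm(q / sigma) - 1, for q >= 0
phalfnorm_sigma <- function(q, sigma) 2 * pnorm(q / sigma) - 1

# Same CDF in the theta parametrization, theta = sqrt(pi) / (sigma * sqrt(2)),
# which implies 1 / sigma = theta * sqrt(2 / pi)
phalfnorm_theta <- function(q, theta) 2 * pnorm(q * theta * sqrt(2 / pi)) - 1

sigma <- 0.75
theta <- sqrt(pi) / (sigma * sqrt(2))
q <- seq(0, 3, by = 0.5)

# Both parametrizations should give the same probabilities
same <- isTRUE(all.equal(phalfnorm_sigma(q, sigma), phalfnorm_theta(q, theta)))
print(same)
```

Because θ grows as σ shrinks, an optimizer working on θ stays away from the σ = 0 boundary.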

Value

After the plots are produced, a list with the following values is returned:

aic: Akaike information criterion
fit: list of results of fitted distribution, with parameter values
bestfit: the best fitted distribution according to AIC
fitted: fitted values from the best fit
rstudent: studentized residuals
residuals: residuals

After x = fitCDF(varobj, ...), attributes(x$bestfit) yields:

$names
[1] "par" "hessian" "fvec" "info" "message" "diag" "niter" "rsstrace" "deviance"

$class
[1] "nls.lm"

Fitting details can be retrieved with summary(x$bestfit).
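To illustrate how a best fit can be chosen by AIC, the sketch below fits two candidate CDFs to the same empirical CDF and keeps the one with the smaller AIC. It is an assumption-laden illustration, not the internals of fitCDF: it uses stats::nls and stats::AIC from base R, and the logistic candidate is chosen only because plogis is available in base R (it is not in the list of supported distributions).

```r
set.seed(2)
x <- rnorm(10000, mean = 0.5, sd = 1)
xs <- sort(x)
Fn <- ecdf(x)(xs)

# Candidate 1: normal CDF
fit_norm <- nls(Fn ~ pnorm(xs, mean = m, sd = s),
                start = list(m = mean(x), s = sd(x)))

# Candidate 2: logistic CDF (illustrative alternative only);
# the starting scale matches the sample standard deviation
fit_logis <- nls(Fn ~ plogis(xs, location = m, scale = s),
                 start = list(m = mean(x), s = sd(x) * sqrt(3) / pi))

# Collect the AIC of each candidate and pick the smallest
aic <- c(normal = AIC(fit_norm), logistic = AIC(fit_logis))
best <- names(which.min(aic))

print(aic)
print(best)
```

The same pattern extends to any number of candidates: fit each one, collect the AIC values, and select the minimum.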

Author(s)

Robersy Sanchez (https://genomaths.com).

References

[1] Spiess, A.-N. (2014). propagate: Propagation of Uncertainty. R package version 1.0-4. http://CRAN.R-project.org/package=propagate
[2] Abramowitz, M. and Stegun, I. A. (1972). Handbook of Mathematical Functions. New York: Dover. Chapter 6: Gamma and Related Functions.
[3] Walck, C. Hand-book on Statistical Distributions for Experimentalists (p. 73). Particle Physics Group, Fysikum, University of Stockholm (e-mail: walck@physto.se).

Examples

set.seed(1230)
x1 <- rnorm(10000, mean = 0.5, sd = 1)
cdfp <- fitCDF(x1, distNames = "Normal", plot = FALSE)
summary(cdfp$bestfit)

[Package usefr version 0.1.0 ]