This function performs the nonlinear fit of GGamma CDF of a variable x

fitGGammaDist(
x,
parameter.values,
location.par = FALSE,
sample.size = 20,
npoints = NULL,
maxiter = 1024,
ftol = 1e-12,
ptol = 1e-12,
maxfev = 1e+05,
nlms = FALSE,
verbose = TRUE,
...
)

## Arguments

x

numerical vector

parameter.values

initial parameter values for the nonlinear fit. If the locator parameter is included (mu != 0), this must be given as parameter.values = list(alpha = 'value', scale = 'value', mu = 'value', psi = 'value') or if mu = 0, as: parameter.values =list(alpha = 'value', scale = 'value', psi = 'value'). If not provided, then an initial guess is provided.

location.par

whether to consider the fitting to generalized gamma distribution (GGamma) including the location parameter, i.e., a GGamma with four parameters (GGamma4P).

sample.size

Minimum size of the sample.

npoints

number of points used in the fit. If the number of points if greater than 10^6, then the fit is automatically set to npoints = 999999. However, the reported values for R.Cross.val, AIC, and BIC are computed taking into account the whole set of points.

maxiter

positive integer. Termination occurs when the number of iterations reaches maxiter. Default value: 1024.

ftol

non-negative numeric. Termination occurs when both the actual and predicted relative reductions in the sum of squares are at most ftol. Therefore, ftol measures the relative error desired in the sum of squares. Default value: 1e-12

ptol

non-negative numeric. Termination occurs when the relative error between two consecutive iterates is at most ptol. Therefore, ptol measures the relative error desired in the approximate solution. Default value: 1e-12.

maxfev

integer; termination occurs when the number of calls to fn has reached maxfev. Note that nls.lm sets the value of maxfev to 100*(length(par) + 1) if maxfev = integer(), where par is the list or vector of parameters to be optimized.

nlms

Logical. Whether to return the nonlinear model object nls.lm. Default is FALSE.

verbose

if TRUE, prints the function log to stdout

...

arguments passed to or from other methods.

## Value

Model table with coefficients and goodness-of-fit results: Adj.R.Square, deviance, AIC, R.Cross.val, and rho, as well as, the coefficient covariance matrix. If nlms = TRUE, then a list with nonlinear model object nls.lm is returned.

## Details

The script algorithm tries to fit the three-parameter GGamma CDF ('GGamma3P') or the four-parameter GGamma ('GGamma4P') using a modification of Levenberg-Marquardt algorithm implemented in function 'nls.lm' from 'minpack.lm' package that is used to perform the nonlinear fit. Cross-validations for the nonlinear regressions (R.Cross.val) were performed in each methylome as described in reference (1). In addition, Stein's formula for adjusted R squared (rho) was used as an estimator of the average cross-validation predictive power (1).

If the number of values to fit is >10^6, the fitting to a GGamma CDF would be a time consuming task. To reduce the computational time, the option summarized.data' can be set 'TRUE'. If npoint != NULL, the original variable values are summarized into 'npoint' bins and their midpoints are used as the new predictors. In this case, only the goodness-of-fit indicators AIC and R.Cross.val are estimated based on all the original variable x values.

## References

1. Stevens JP. Applied Multivariate Statistics for the Social Sciences. Fifth Edit. Routledge Academic; 2009.

## Author

Robersy Sanchez - 06/03/2016

## Examples

set.seed(123)
## Fitting GGamma3P
x <- rggamma(3000, alpha = 1.03, psi = 0.75, scale = 1.1)
#>
#> *** Trying nonlinear fit to a generalized 3P Gamma distribution model 3000 values...
#> *** Performing nonlinear regression model  crossvalidation...
#>        Estimate  Std. Error  t value Pr(>|t|))      Adj.R.Square
#> alpha 1.0646011 0.001915538 555.7713         0 0.999999998182352
#> scale 1.1382887 0.002872487 396.2729         0
#> psi   0.7383895 0.001898632 388.9061         0
#>                     rho       R.Cross.val                  DEV
#> alpha 0.999999998178102 0.999951555970568 4.54108853939111e-07
#> scale
#> psi
#>                     AIC               BIC     COV.alpha     COV.scale
#> alpha -26558.6966173308 -26534.6711470602  3.669287e-06  5.367186e-06
#> scale                                      5.367186e-06  8.251182e-06
#> psi                                       -3.602364e-06 -5.426154e-06
#>             COV.psi COV.mu    N    model
#> alpha -3.602364e-06     NA 3000 GGamma3P
#> scale -5.426154e-06     NA 3000
#> psi    3.604802e-06     NA 3000
## Fitting GGamma4P
x <- x + 1
#>
#> *** Trying nonlinear fit to a generalized 4P Gamma distribution model 3000 values...
#> *** Performing nonlinear regression model  crossvalidation...
#>        Estimate   Std. Error    t value Pr(>|t|))      Adj.R.Square
#> alpha 1.0322905 0.0022133696   466.3887         0 0.999999998586538
#> scale 1.0778304 0.0037442874   287.8600         0
#> mu    0.9981787 0.0000679242 14695.4801         0
#> psi   0.7795591 0.0026167686   297.9091         0
#>                     rho       R.Cross.val                  DEV
#> alpha 0.999999998582287 0.999955221091228 3.53012009550101e-07
#> scale
#> mu
#> psi
#>                     AIC              BIC     COV.alpha     COV.scale
#> alpha -26809.4738999022 -26779.442062064  4.899005e-06  8.156174e-06
#> scale                                     8.156174e-06  1.401969e-05
#> mu                                        8.893850e-08  1.675482e-07
#> psi                                      -5.736776e-06 -9.779601e-06
#>             COV.psi        COV.mu    N    model
#> alpha  8.893850e-08 -5.736776e-06 3000 GGamma4P
#> scale  1.675482e-07 -9.779601e-06 3000
#> mu     4.613696e-09 -1.174772e-07 3000
#> psi   -1.174772e-07  6.847478e-06 3000