This function performs a nonlinear fit of the Gamma CDF to a variable x.

```
fitGammaDist(
x,
probability.x,
parameter.values,
location.par = FALSE,
sample.size = 20,
npoints = NULL,
maxiter = 1024,
ftol = 1e-12,
ptol = 1e-12,
maxfev = 1e+05,
nlms = FALSE,
verbose = TRUE
)
```

- x
numerical vector

- probability.x
probability vector of x. If not provided, the values are estimated with the empirical cumulative distribution function ('ecdf') from the 'stats' R package.

- parameter.values
initial parameter values for the nonlinear fit. If the location parameter is included (mu != 0), this must be given as parameter.values = list(shape = 'value', scale = 'value', mu = 'value'); if mu = 0, as parameter.values = list(shape = 'value', scale = 'value'). If not provided, an initial guess is computed.

- location.par
whether to fit a gamma distribution that includes the location parameter, i.e., a Gamma with three parameters (Gamma3P).

- sample.size
size of the sample.

- npoints
number of points used in the fit.

- maxiter
positive integer. Termination occurs when the number of iterations reaches maxiter. Default value: 1024.

- ftol
non-negative numeric. Termination occurs when both the actual and predicted relative reductions in the sum of squares are at most ftol. Therefore, ftol measures the relative error desired in the sum of squares. Default value: 1e-12.

- ptol
non-negative numeric. Termination occurs when the relative error between two consecutive iterates is at most ptol. Therefore, ptol measures the relative error desired in the approximate solution. Default value: 1e-12.

- maxfev
integer; termination occurs when the number of calls to fn has reached maxfev. Note that nls.lm sets the value of maxfev to 100*(length(par) + 1) if maxfev = integer(), where par is the list or vector of parameters to be optimized.

- nlms
Logical. Whether to return the nonlinear model object `nls.lm`. Default is FALSE.

- verbose
if TRUE, prints the function log to stdout
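
As a rough illustration of the inputs described above, the sketch below builds a `probability.x` vector with `ecdf` and a `parameter.values` list from method-of-moments estimates. The moment-based starting values are an assumption made for this example; the initial guess that the function computes internally may be obtained differently.

```r
set.seed(1)
x <- rgamma(500, shape = 2, scale = 1.5)   # simulated data for illustration

# Empirical CDF evaluated at each observation: one probability per x value
probability.x <- ecdf(x)(x)

# Method-of-moments starting values (an assumption for illustration;
# for a Gamma, mean = shape * scale and variance = shape * scale^2)
m <- mean(x)
v <- var(x)
parameter.values <- list(shape = m^2 / v, scale = v / m)

# These objects use the argument names from the signature above, e.g.:
# fitGammaDist(x, probability.x = probability.x,
#              parameter.values = parameter.values)
str(parameter.values)
```

The call itself is left as a comment so that the sketch runs with base R alone.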

A model table with the coefficients and goodness-of-fit results (Adj.R.Square, deviance, AIC, R.Cross.val, and rho), as well as the coefficient covariance matrix.

The algorithm fits the two-parameter Gamma CDF ('Gamma2P') or the three-parameter Gamma CDF ('Gamma3P') by nonlinear regression, using the modified Levenberg-Marquardt algorithm implemented in function 'nls.lm' from the 'minpack.lm' package. Cross-validation of the nonlinear regression (R.Cross.val) is performed as described in reference (1). In addition, Stein's formula for adjusted R squared (rho) is used as an estimator of the average cross-validation predictive power (1).
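
The following is a minimal sketch of the kind of fit described above, not the function's actual implementation. It assumes the 'minpack.lm' package is installed, fits only the two-parameter model, starts from method-of-moments values (an illustrative choice), and computes rho with Stein's formula as given by Stevens. The names `resid_fn`, `start`, `prob`, `r2`, and `rho` are hypothetical.

```r
library(minpack.lm)  # provides nls.lm (assumed to be installed)

set.seed(126)
x <- rgamma(1000, shape = 1.03, scale = 2.1)
prob <- ecdf(x)(x)                   # empirical CDF at each observation

# Residuals between the empirical CDF and the two-parameter Gamma CDF
resid_fn <- function(par, x, prob) {
  prob - pgamma(x, shape = par[["shape"]], scale = par[["scale"]])
}

# Method-of-moments starting values (illustrative assumption)
start <- list(shape = mean(x)^2 / var(x), scale = var(x) / mean(x))

# Levenberg-Marquardt fit with the tolerances listed in the arguments
fit <- nls.lm(par = start, fn = resid_fn, x = x, prob = prob,
              control = nls.lm.control(maxiter = 1024, ftol = 1e-12,
                                       ptol = 1e-12, maxfev = 1e5))
est <- unlist(fit$par)

# Ordinary R squared of the CDF fit
res <- resid_fn(est, x, prob)
r2 <- 1 - sum(res^2) / sum((prob - mean(prob))^2)

# Stein's formula: estimate of the average cross-validated R squared
n <- length(x)   # number of fitted points
k <- length(est) # number of model parameters
rho <- 1 - ((n - 1) / (n - k - 1)) * ((n - 2) / (n - k - 2)) *
  ((n + 1) / n) * (1 - r2)

print(est)
print(c(R.squared = r2, rho = rho))
```

Because the correction factor in Stein's formula is greater than 1, rho is always slightly smaller than the ordinary R squared; the gap shrinks as the number of points grows relative to the number of parameters.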

If the number of values to fit is greater than 10^6, fitting a Gamma CDF can be time-consuming. To reduce the computational time, the data can be 'summarized' into 'npoints' bins, which are then used as the new predictors.
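
One way to picture this summarization is sketched below with base R only. It evaluates the empirical CDF on an equally spaced grid of `npoints` values; this grid-based scheme is an assumption for illustration, and the function's own binning may differ. The names `Fn`, `x.grid`, and `p.grid` are hypothetical.

```r
set.seed(7)
x <- rgamma(2e5, shape = 2, scale = 1.5)  # stand-in for a very large sample
npoints <- 200

Fn <- ecdf(x)                             # empirical CDF of the full data

# Equally spaced summary points spanning the observed range
x.grid <- seq(min(x), max(x), length.out = npoints)

# Empirical CDF evaluated only at the summary points
p.grid <- Fn(x.grid)

# A nonlinear fit would then use (x.grid, p.grid) in place of all of x
length(x.grid)
```

The fit then works with `npoints` pairs instead of the full sample, so each evaluation of the residuals is correspondingly cheaper.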

1. Stevens JP. Applied Multivariate Statistics for the Social Sciences. Fifth Edition. Routledge Academic; 2009.

```
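# Make the simulated sample reproducible
set.seed(126)
# Simulate 1000 draws from a Gamma distribution
x <- rgamma(1000, shape = 1.03, scale = 2.1)
# Fit with the default settings (location.par = FALSE, ecdf-based probabilities)
fitGammaDist(x)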
#>
#> *** Trying nonlinear fit to a 2P Gamma distribution model ...
#> *** Performing nonlinear regression model crossvalidation...
#>        Estimate  Std. Error  t value Pr(>|t|))      Adj.R.Square
#> shape 0.9746242 0.001629900 597.9657         0 0.999999913159578
#> scale 2.0760284 0.005052791 410.8676         0
#>                     rho       R.Cross.val                  DEV
#> shape 0.999999912723457 0.999728239112834 7.22945067894529e-06
#> scale
#>                     AIC               BIC     COV.shape     COV.scale COV.mu
#> shape -7092.63101078795 -7072.99998967202  2.656574e-06 -7.735164e-06     NA
#> scale                                     -7.735164e-06  2.553070e-05     NA
#>        df   model
#> shape 998 Gamma2P
#> scale 998
```