R/bicopulaGOF.R
bicopulaGOF.Rd
Goodness-of-fit (GOF) tests for a two-dimensional copula based, by default, on knowledge of the marginal probability distributions. Several tools from the copula package are integrated to perform GOF tests for copulas that include specific margin parameter settings. In the vocabulary of the copula package, these are GOF tests for copula objects of class Mvdc (also called non-free copulas).
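For orientation, an object of this kind can be built directly with the copula package. The following is a minimal sketch, assuming that copula is installed; the correlation value 0.5 and the object names `mv` and `xy` are arbitrary choices made for illustration and are not taken from bicopulaGOF.

```r
library(copula)

# A bivariate normal copula with an (arbitrary) correlation of 0.5,
# combined with two fully specified normal margins.
mv <- mvdc(
  copula = normalCopula(0.5, dim = 2),
  margins = c("norm", "norm"),
  paramMargins = list(
    list(mean = 0, sd = 10),
    list(mean = 0, sd = 10)
  )
)

set.seed(1)
xy <- rMvdc(500, mv)  # 500 x 2 matrix of draws from the joint distribution
dim(xy)
pMvdc(c(0, 0), mv)    # joint CDF evaluated at the two margin medians
```

Because the margins and their parameters are fixed inside the object, only the dependence parameter remains to be assessed, which is the setting this function addresses.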
bicopulaGOF(
x,
y,
copula = NULL,
margins = NULL,
paramMargins = NULL,
sample.size = NULL,
nboots = 10,
approach = c("adchisq", "adgamma", "chisq", "rmse", "Sn", "SnB", "SnC"),
Rosenblatt = FALSE,
breaks = 12,
method = "ml",
num.cores = 1L,
tasks = 0,
seed = 123,
verbose = TRUE,
...
)
x
Numeric vector with the observations from the first marginal distribution.
y
Numeric vector with the observations from the second marginal distribution.
copula
A copula object of class Mvdc, or a string specifying the name of a copula from the copula package.
margins
A character vector specifying the parametric marginal distributions. See details below.
paramMargins
A list in which each component is a list (or numeric vector) of named components, giving the parameter values of the marginal distributions. See details below.
sample.size
The size of the sample drawn at each resampling step. It is not required for the approaches "Sn", "SnB", and "SnC"; see below.
nboots
The number of bootstrap resamplings to perform.
approach
A character string specifying the goodness-of-fit test statistic to be used, which has to be one (or a unique abbreviation) of the following: "adchisq", "adgamma", "Sn", "SnB", "SnC", "chisq", and "rmse". With the exception of chisq and rmse, all the statistics are the same as in functions gofTstat and gofCopula. The test using chisq implements the approach described in reference [1].
Rosenblatt
Logical. Whether to apply the Rosenblatt transformation. The Anderson–Darling statistic approach using the Rosenblatt transformation is normally used for GOF in function gofCopula from the copula package. Since the current function applies a parametric bootstrap approach that generates random variates from the analytic expressions for the margin CDFs, the test does not depend on the theoretical distribution of the Anderson–Darling statistic. Simulations suggest, so far, that applying the Rosenblatt transformation may not be needed in this case. Hence, the decision on whether to apply it (computationally expensive for big datasets) is left to the user. Function cCopula is used to compute the Rosenblatt transform, so its application is limited to those copulas for which cCopula is implemented.
breaks
A single number giving the number of bins for the computation of Pearson's chi-squared statistic, as suggested in reference [1]. Basically, it is used to split the unit square [0, 1]^2 into bins/regions.
method
A character string specifying the estimation method used to estimate the dependence parameter(s), if the copula needs to be estimated; see fitCopula.
num.cores, tasks
Parameters for parallel computation using package BiocParallel-package: the number of cores to use, i.e., at most how many child processes will be run simultaneously (see bplapply), and the number of tasks per job (only for Linux OS).
seed
An integer used to set a 'seed' for random number generation.
verbose
Logical. If TRUE, comments and a progress bar are printed.
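The margins and paramMargins arguments appear to follow the naming convention used by mvdc in the copula package, where a margin named "norm" refers to the base R functions pnorm, qnorm, and so on. The sketch below uses only base R to show how such a specification maps observations onto the unit square, and what the Rosenblatt transform looks like in the special case of a bivariate normal copula. The helper names `margin_cdf` and `rosenblatt_normal` are hypothetical, and this is not the code used inside bicopulaGOF, which relies on cCopula.

```r
# Hypothetical helper: evaluate the CDF of a named margin, e.g. "norm" -> pnorm
margin_cdf <- function(x, margin, params) {
  do.call(paste0("p", margin), c(list(x), params))
}

# Hypothetical helper: Rosenblatt transform for a bivariate normal copula
# with correlation rho. The first coordinate is kept; the second is replaced
# by the conditional CDF of U2 given U1.
rosenblatt_normal <- function(u, rho) {
  z1 <- qnorm(u[, 1])
  z2 <- qnorm(u[, 2])
  cbind(u[, 1], pnorm((z2 - rho * z1) / sqrt(1 - rho^2)))
}

set.seed(12)
margins <- c("norm", "norm")
parMargins <- list(list(mean = 0, sd = 10), list(mean = 0, sd = 10))
X <- rnorm(1000, mean = 0, sd = 10)
Y <- rnorm(1000, mean = 0, sd = 10)

# Probability integral transform with the known margins
U <- cbind(
  margin_cdf(X, margins[1], parMargins[[1]]),
  margin_cdf(Y, margins[2], parMargins[[2]])
)
range(U)  # all values lie inside (0, 1)

# With rho = 0 the transform leaves U unchanged (up to rounding error)
E <- rosenblatt_normal(U, rho = 0)
```

For copulas without such a closed form, the conditional distribution has to be evaluated numerically, which is one reason the transform can be costly on large datasets.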
The statistic value estimated for the observations, and the estimated bootstrap p.value.
Notice that the copula package already has the function gofCopula to perform GOF tests. However, that function does not support bivariate distributions constructed from a copula with known margins. In addition, its use can be computationally expensive for big datasets.
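The idea of a binned chi-squared statistic combined with a simulated p-value can be illustrated in base R. This is a minimal sketch under stated assumptions, not the implementation used by bicopulaGOF: the unit square is split into a k x k grid (one possible reading of the breaks argument), the null model is the independence copula so that no extra package is needed, and the helper names `chisq_grid_stat` and `mc_pvalue` are hypothetical.

```r
# Pearson chi-squared statistic on a k x k grid over the unit square.
# Under the independence copula every cell has expected count n / k^2.
chisq_grid_stat <- function(u, k) {
  cuts <- seq(0, 1, length.out = k + 1)
  i <- cut(u[, 1], cuts, include.lowest = TRUE)
  j <- cut(u[, 2], cuts, include.lowest = TRUE)
  observed <- table(i, j)  # k x k table, empty cells included
  expected <- nrow(u) / k^2
  sum((observed - expected)^2 / expected)
}

# Monte Carlo p-value: compare the observed statistic with statistics
# computed on samples simulated under the null model.
mc_pvalue <- function(u, k, nboots) {
  n <- nrow(u)
  stat_obs <- chisq_grid_stat(u, k)
  stat_sim <- replicate(
    nboots,
    chisq_grid_stat(matrix(runif(2 * n), ncol = 2), k)
  )
  # One common convention: count the observed statistic as one of the draws
  p <- (1 + sum(stat_sim >= stat_obs)) / (nboots + 1)
  c(statistic = stat_obs, mc_p.value = p)
}

set.seed(123)
u_indep <- matrix(runif(2000), ncol = 2)    # independent coordinates
u_dep <- cbind(u_indep[, 1], u_indep[, 1])  # perfectly dependent coordinates

res_indep <- mc_pvalue(u_indep, k = 4, nboots = 99)
res_dep <- mc_pvalue(u_dep, k = 4, nboots = 99)
res_indep
res_dep  # p-value at its minimum, 1 / (nboots + 1)
```

Because the p-value comes from simulated statistics rather than from a tabulated reference distribution, the same scheme works for any statistic that can be computed on both the observed and the simulated samples.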
[1] Jaworski, P. Copulae in Mathematical and Quantitative Finance. 213, d (2013).
[2] Wang, Y. et al. Multivariate analysis of joint probability of different rainfall frequencies based on copulas. Water (Switzerland) 9, (2017).
require(stats)
set.seed(12)
margins <- c("norm", "norm")
## Random variates from normal distributions
X <- rnorm(2 * 1e3, mean = 0, sd = 10)
Y <- rnorm(2 * 1e3, mean = 0, sd = 10)
parMargins <- list(
list(mean = 0, sd = 10),
list(mean = 0, sd = 10)
)
bicopulaGOF(
x = X, y = Y, copula = "normalCopula", sample.size = 1e2,
margins = margins, paramMargins = parMargins, nboots = 99,
Rosenblatt = TRUE, approach = "adgamma", num.cores = 1L,
verbose = FALSE
)
#> $gof
#> AD.stat mc_p.value sample.size num.sampl
#> 0.2074162 0.9400000 100.0000000 99.0000000
#>
#> $copula
#> Multivariate Distribution Copula based ("mvdc")
#> @ copula:
#> Normal copula, dim. d = 2
#> Dimension: 2
#> Parameters:
#> rho.1 = 0.01905212
#> @ margins:
#> [1] "norm" "norm"
#> with 2 (not identical) margins; with parameters (@ paramMargins)
#> List of 2
#> $ :List of 2
#> ..$ mean: num 0
#> ..$ sd : num 10
#> $ :List of 2
#> ..$ mean: num 0
#> ..$ sd : num 10
#>
bicopulaGOF(
x = X, y = Y, copula = "normalCopula", sample.size = 1e2,
margins = margins, paramMargins = parMargins, nboots = 99,
Rosenblatt = FALSE, approach = "adgamma", num.cores = 1L,
verbose = FALSE
)
#> $gof
#> AD.stat mc_p.value sample.size num.sampl
#> 0.2383179 0.8500000 100.0000000 99.0000000
#>
#> $copula
#> Multivariate Distribution Copula based ("mvdc")
#> @ copula:
#> Normal copula, dim. d = 2
#> Dimension: 2
#> Parameters:
#> rho.1 = 0.01905212
#> @ margins:
#> [1] "norm" "norm"
#> with 2 (not identical) margins; with parameters (@ paramMargins)
#> List of 2
#> $ :List of 2
#> ..$ mean: num 0
#> ..$ sd : num 10
#> $ :List of 2
#> ..$ mean: num 0
#> ..$ sd : num 10
#>
## --- Non-parallel expensive computation ---- -
if (FALSE) {
library(copula) # provides pobs, fitCopula, cCopula, and gofCopula
U <- pobs(cbind(X, Y)) # Compute the pseudo-observations
fit <- fitCopula(normalCopula(), U, method = 'ml')
U <- cCopula(u = U, copula = fit@copula) ## Rosenblatt transformation
parMargins <- list(
list(mean = 0, sd = 10),
list(mean = 0, sd = 10)
)
ptm <- proc.time()
gof <- gofCopula(copula = fit@copula, x = U, N = 99, method = "Sn",
simulation = "pb")
(proc.time() - ptm)[3]/60 # in min
gof
## --- Parallel computation with 2 cores ---- -
## Same algorithm as in 'gofCopula' adapted for parallel computation
ptm <- proc.time()
system.time(
gof <- bicopulaGOF(x = X, y = Y, copula = "normalCopula",
margins = margins, paramMargins = parMargins,
nboots = 99, approach = "Sn", seed = 12,
num.cores = 2L,
verbose = FALSE)
)
(proc.time() - ptm)[3]/60 # in min
gof
}