bicopulaGOF {usefr}    R Documentation

Goodness of fit for Bidimensional Copula with Known Margins

Description

Goodness-of-fit (GOF) tests for a two-dimensional copula based, by default, on knowledge of the marginal probability distributions. Several tools from the copula package are integrated to perform the GOF of copulas that include specific margin parameter settings. In copula package vocabulary, these are GOF tests for copula objects of class Mvdc (also called non-free copulas).

Usage

bicopulaGOF(x, y, copula = NULL, margins = NULL, paramMargins = NULL,
  sample.size = NULL, nboots = 10, approach = c("adchisq", "adgamma",
  "chisq", "rmse", "Sn", "SnB", "SnC"), Rosenblatt = FALSE,
  breaks = 12, method = "ml", num.cores = 1L, tasks = 0,
  seed = 123, verbose = TRUE, ...)

Arguments

x

Numerical vector with the observations from the first margin distribution.

y

Numerical vector with the observations from the second margin distribution.

copula

A copula object of class Mvdc, or a character string giving the name of a copula from the copula package (e.g., "normalCopula").

margins

A character vector specifying the parametric marginal distributions, one per margin, e.g., c("norm", "norm"). See Examples.

paramMargins

A list in which each component is a list (or numeric vector) of named elements giving the parameter values of the corresponding marginal distribution, e.g., list(list(mean = 0, sd = 10), list(mean = 0, sd = 10)). See Examples.

sample.size

The size of the sample drawn in each resampling step. It is not required for the approaches "Sn", "SnB", and "SnC".

nboots

The number of bootstrap resamplings to perform.

approach

A character string specifying the goodness-of-fit test statistic to be used, which has to be one (or a unique abbreviation) of the following: "adchisq", "adgamma", "Sn", "SnB", "SnC", "chisq", and "rmse". With the exception of "chisq" and "rmse", the statistics are the same as in functions gofTstat and gofCopula. The test using "chisq" implements the approach described in reference [1].

Rosenblatt

Logical. Whether to apply the Rosenblatt transformation before computing the statistic. The Anderson–Darling statistic approach using the Rosenblatt transformation is normally used for GOF in function gofCopula from the copula package. Since the current function applies a parametric bootstrap that generates random variates from the analytical expressions of the marginal CDFs, the test does not depend on the theoretical distribution of the Anderson–Darling statistic. Simulations so far suggest that the Rosenblatt transformation may not be needed in this case. The decision on whether to apply it (it is computationally expensive for big datasets) is therefore left to the user.

breaks

A single number giving the number of bins used to compute Pearson's Chi-squared statistic, as suggested in reference [1]. Basically, it is used to split the unit square [0, 1]^2 into bins/regions.

method

A character string specifying the estimation method to be used to estimate the dependence parameter(s); see fitCopula.

num.cores, tasks

Parameters for parallel computation using package BiocParallel: the number of cores to use, i.e., at most how many child processes will be run simultaneously (see bplapply), and the number of tasks per job (only for Linux OS).

seed

An integer used to set a 'seed' for random number generation.

verbose

Logical. If TRUE, comments and a progress bar will be printed.
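
As a rough illustration of how margins, paramMargins, and breaks fit together, the sketch below maps each margin to the unit interval with its known CDF and then counts observations in a breaks x breaks grid over [0, 1]^2. It uses base R only. The helper to_unit is hypothetical and not part of usefr, and the sketch is an assumption about the general idea, not the internal code of bicopulaGOF.

```r
set.seed(12)
x <- rnorm(200, mean = 0, sd = 10)
y <- rnorm(200, mean = 0, sd = 10)

margins <- c("norm", "norm")
paramMargins <- list(list(mean = 0, sd = 10),
                     list(mean = 0, sd = 10))

## Hypothetical helper (not part of usefr): evaluate the CDF "p<name>"
## at the observations, passing the named parameters of that margin.
to_unit <- function(obs, dist, par) {
  do.call(paste0("p", dist), c(list(obs), par))
}

u <- to_unit(x, margins[1], paramMargins[[1]])
v <- to_unit(y, margins[2], paramMargins[[2]])

## Split the unit square into breaks x breaks cells and count the
## observations falling into each cell.
breaks <- 12
edges <- seq(0, 1, length.out = breaks + 1)
counts <- table(cut(u, edges, include.lowest = TRUE),
                cut(v, edges, include.lowest = TRUE))
dim(counts)
```

Each element of paramMargins is matched by position to the corresponding entry of margins, so the parameter names must be valid arguments of the respective CDF (here, pnorm).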

Details

Notice that the copula package already has function gofCopula to perform GOF. However, its use can be computationally expensive for big datasets.

Value

The statistic value estimated for the observations, and the estimated bootstrap p.value.
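
To make the notion of a bootstrap p-value concrete, the following base R sketch runs a parametric bootstrap for the simplest possible null model, the independence copula with uniform margins. The function chisq_stat and the p-value convention (1 + #{T_boot >= T_obs}) / (nboots + 1) are assumptions chosen for illustration; the statistic and the exact formula used by bicopulaGOF may differ.

```r
set.seed(123)

## Hypothetical statistic (not part of usefr): Pearson's Chi-squared on a
## breaks x breaks grid; under independence every cell has the same
## expected count.
chisq_stat <- function(u, v, breaks = 4) {
  edges <- seq(0, 1, length.out = breaks + 1)
  obs <- table(cut(u, edges, include.lowest = TRUE),
               cut(v, edges, include.lowest = TRUE))
  expected <- length(u) / breaks^2
  sum((obs - expected)^2 / expected)
}

n <- 200
u <- runif(n)
v <- runif(n)
stat_obs <- chisq_stat(u, v)

## Parametric bootstrap: simulate new samples from the null model and
## recompute the statistic each time.
nboots <- 99
stat_boot <- replicate(nboots, chisq_stat(runif(n), runif(n)))

## Share of bootstrap statistics at least as extreme as the observed one.
p_value <- (1 + sum(stat_boot >= stat_obs)) / (nboots + 1)
c(statistic = stat_obs, p.value = p_value)
```

Because the reference distribution is built by simulation, the smallest attainable p-value is 1 / (nboots + 1), which is one reason the examples use nboots = 999 rather than the default.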

Author(s)

Robersy Sanchez (https://genomaths.com).

References

  1. Jaworski, P. Copulae in Mathematical and Quantitative Finance. 213, d (2013).

  2. Wang, Y. et al. Multivariate analysis of joint probability of different rainfall frequencies based on copulas. Water (Switzerland) 9, (2017).

See Also

ppCplot, gofCopula, fitCDF, fitdistr, and fitMixDist

Examples

require(stats)

set.seed(12)
margins <- c("norm", "norm")
## Random variates from normal distributions
X <- rnorm(2*1e3, mean = 0, sd = 10)
Y <- rnorm(2*1e3, mean = 0, sd = 10)

parMargins <- list(list(mean = 0, sd = 10),
                   list(mean = 0, sd = 10))

bicopulaGOF(x = X, y = Y, copula = "normalCopula", sample.size = 1e2,
            margins = margins, paramMargins = parMargins, nboots = 999,
            Rosenblatt = TRUE, approach = "adgamma", num.cores = 1L)

bicopulaGOF(x = X, y = Y, copula = "normalCopula", sample.size = 1e2,
            margins = margins, paramMargins = parMargins, nboots = 999,
            Rosenblatt = FALSE, approach = "adgamma", num.cores = 1L)

## --- Non-parallel expensive computation ---
# require(copula)
#
# U <- pobs(cbind(X, Y)) # Compute the pseudo-observations
# fit <- fitCopula(normalCopula(), U, method = 'ml')
# U <- cCopula(u = U, copula = fit@copula) # Rosenblatt transformation
#
# set.seed(123)
# system.time(
#   gof <- gofCopula(copula = fit@copula, x = U, N = 99, method = "Sn",
#             simulation = "pb")
# )
# gof
## About
##    user  system elapsed
## 103.370   0.613 105.022
#
## --- Parallel computation with 2 cores ---
## Same algorithm as in 'gofCopula' adapted for parallel computation
# system.time(
#   gof <- bicopulaGOF(x = X, y = Y, copula = "normalCopula",
#               margins = margins, paramMargins = parMargins, nboots = 99,
#               Rosenblatt = TRUE, approach = "Sn", seed = 123,
#               num.cores = 2L)
# )
# gof
## About
##  user  system elapsed
## 2.491   0.100  51.185
##
## Same algorithm as in 'gofCopula' adapted for parallel computation and
## Rosenblatt = FALSE
# system.time(
#   gof <- bicopulaGOF(x = X, y = Y, copula = "normalCopula",
#               margins = margins, paramMargins = parMargins, nboots = 99,
#               Rosenblatt = FALSE, approach = "Sn", seed = 123,
#               num.cores = 2L)
# )
# gof

[Package usefr version 0.1.0 ]