R/mutualinf.R
mutualinf.Rd
Computes the mutual information for each pair of x and y marginal values, based on their multivariate distribution constructed from a copula.
mutualinf(
x,
y,
copula = NULL,
margins = NULL,
paramMargins = NULL,
method = "ml",
ties.method = "max"
)
x, y: Marginal variates.
copula: A copula object of class Mvdc, or a string specifying the name of a copula from package copula-package.
margins: A character vector specifying the parametric marginal distributions. See details below.
paramMargins: A list in which each component is a list (or numeric vector) of named components, giving the parameter values of the marginal distributions. See details below.
method: A character string specifying the method used to estimate the dependence parameter(s), if the copula needs to be estimated; see fitCopula.
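As a minimal sketch of the structure described for margins and paramMargins, the following builds one named parameter list per marginal distribution. The coefficient values and the names `coefs` and `par_margins` are illustrative assumptions, not fitted results.

```r
# Hypothetical named coefficient vector, e.g. of the kind returned by coef()
coefs <- c(shape = 2, scale = 0.85, mu = 1)

margins <- c("norm", "weibull3p")
par_margins <- list(
    list(mean = 1, sd = 0.2), # parameters of the first marginal
    as.list(coefs)            # one named component per parameter
)

# as.list() splits the vector into named components, whereas list()
# wraps the whole vector in a single unnamed component
str(as.list(coefs))
str(list(coefs))
```

Each component of `par_margins` lines up, by position, with the distribution name at the same position in `margins`.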
A list with a data frame carrying the estimated mutual information for each (x, y) pair, the joint and marginal probabilities, and the "mvdc" copula object.
The mutual information of a pair of marginal values x and y is defined as:
$$I_{x, y} = \log P(x, y) - \left(\log P_1(x) + \log P_2(y)\right)$$
where P(x,y) is the multivariate distribution constructed from a copula, and P_1(x) and P_2(y) are the marginal CDFs.
The value \(I_{x, y}\) measures the relative dependence/independence of x and y at the specified point.
Note that the above definition expresses the difference between two uncertainty variations. For values \(I_{x, y} > 0\), we say that at point (x, y) there is a gain of information for the association of the underlying stochastic processes generating x and y with respect to the independent processes. Conversely, for values \(I_{x, y} < 0\), we say that at point (x, y) there is a loss of information for the association of the underlying stochastic processes generating x and y with respect to the independent processes or, equivalently, a gain of information for the independent processes with respect to their association.
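The definition can be checked by hand with a short base R sketch. The helper `pointwise_mi` is hypothetical and not part of the package. The base-2 logarithm is an assumption inferred from the example output further down, since the source does not state the base. The probabilities are copied from the first two rows of that output.

```r
# Hypothetical helper: pointwise mutual information from the joint CDF
# value and the two marginal CDF values (log base 2 is an assumption)
pointwise_mi <- function(jprob, p1, p2, base = 2) {
    log(jprob, base = base) - (log(p1, base = base) + log(p2, base = base))
}

# Values copied from the first two rows of the example output
jprob <- c(0.012212519, 0.935912349)
p1 <- c(0.06936092, 0.94262173)
p2 <- c(0.03654972, 0.99623362)

mi <- pointwise_mi(jprob, p1, p2)
round(mi, 3) # compare with the mInf column of the example output
sign(mi)     # +1: gain for the association; -1: gain for independence

# Under independence, P(x, y) = P_1(x) * P_2(y), so the value is zero
# up to floating-point error
mi_indep <- pointwise_mi(p1 * p2, p1, p2)
```

With these inputs the first point has a positive value and the second a slightly negative one, matching the interpretation given above.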
ppCplot, bicopulaGOF, gofCopula, fitCDF, fitdistr, and fitMixDist.
require(stats)
set.seed(12) # set a seed for random number generation
## Random generation of a Normal distributed marginal variate
X <- rnorm(2000, mean = 1, sd = 0.2)
## Random generation of a Weibull-3P distributed marginal variate
Y <- X + rweibull3p(2000, shape = 2, scale = 0.85, mu = 1)
## Correlation test
cor.test(X, Y, method = "spearman")
#>
#> Spearman's rank correlation rho
#>
#> data: X and Y
#> S = 742966548, p-value < 2.2e-16
#> alternative hypothesis: true rho is not equal to 0
#> sample estimates:
#> rho
#> 0.4427749
#>
## Non-linear model fit for 'Y' distribution values
fitY <- fitCDF(Y, distNames = 12) # 3P Weibull distribution model
#>
#> *** Fitting 3P Weibull distribution ...
#> .Fitting Done.
#> ** Done ***
coefs <- coef(fitY$bestfit) # model coefficients
## Goodness-of-fit test for the Weibull-3P distribution model
mcgoftest(
varobj = Y, distr = "weibull3p", pars = coefs, num.sampl = 99,
sample.size = 1999, stat = "chisq", num.cores = 4, breaks = 200,
seed = 123
)
#> *** Permutation GoF testing based on Pearson's Chi-squared statistic ( parametric approach ) ...
#>
#>
#> Chisq mc_p.value sample.size num.sampl
#> 246.6798 0.1000 1999.0000 99.0000
## Settings to estimate the mutual information
margins <- c("norm", "weibull3p")
parMargins <- list(
list(mean = 1, sd = 0.2),
as.list(coefs)
) # Notice "as.list" is used here, not "list"
## Finally, estimate the mutual information
mutual.Inf <- mutualinf(
x = X, y = Y, copula = "normalCopula",
margins = margins, paramMargins = parMargins
)
head(mutual.Inf$stat)
#> jprob p1 p2 x y mInf
#> 1 0.012212519 0.06936092 0.03654972 0.7038865 2.063813 2.268233724
#> 2 0.935912349 0.94262173 0.99623362 1.3154339 4.097047 -0.004861521
#> 3 0.026703146 0.16934812 0.05253993 0.8086511 2.106610 1.585531548
#> 4 0.065589060 0.17878501 0.15761051 0.8159990 2.297089 1.218865734
#> 5 0.006073589 0.02287774 0.04545271 0.6004716 2.088608 2.546166746
#> 6 0.321124429 0.39269720 0.65771400 0.9455408 2.902408 0.314182830
## The fitted copula is also returned, so it can be used in downstream
## analyses
mutual.Inf$copula@copula
#> Normal copula, dim. d = 2
#> Dimension: 2
#> Parameters:
#> rho.1 = 0.4647337