A \(d\)-dimensional copula is a multivariate cumulative distribution function
\(C:[0,1]^d\rightarrow[0,1]\) with marginals which are uniformly distributed on the unit interval:
\begin{equation}
C(u_1, ..., u_d)=\mathbb{P}(U_1\le u_1, ..., U_d\le u_d),
\end{equation}
where \(U_j\sim\mathcal{U}[0,1], \ \forall j\).
Definition 2 — Copula density
Given a differentiable copula \(C\), the copula density \(c\) is defined by:
\begin{equation}
c(u_1, ..., u_d) = \frac{\partial^dC(u_1, ..., u_d)}{\partial u_1...\partial u_d}.
\label{eq:copula_density}
\end{equation}
It follows that the marginals of \(C\) are uniform,
that is, \(C(1, ..., 1, u_i, 1, ..., 1)=u_i\) for each component \(u_i\).
Moreover, the copula function \(C\) is isotonic,
that is \(u_j\le v_j\ \forall j \implies C(u_1, ..., u_d)\le C(v_1, ..., v_d)\).
Finally, notice that (under mild regularity conditions) one has the following proposition.
Proposition 3
Given a \(d\)-dimensional copula \(C\) one obtains
the conditional cumulative distribution function
\(F(\mathbf{u}|U_j=u_j)=\mathbb{P}(U_1\le u_1, ..., U_d\le u_d\ |\ U_j=u_j) \) as
\begin{equation}
F(\mathbf{u}|U_j=u_j)=\frac{\partial}{\partial u_j} C(u_1, ..., u_d).
\label{eq:conditional_copula}
\end{equation}
In order to better understand the significance of Definition 1
let us consider a two-dimensional random vector \((X_1, X_2)\)
with marginals \(X_j\sim F_j\) and joint CDF \(F\).
Applying the probability integral transform
one obtains two random variables \(U_j=F_j(X_j)\)
uniform on the unit interval, so that one can write
\begin{equation}
F(x_1, x_2)=C(F_1(x_1), F_2(x_2)).
\label{eq1}
\end{equation}
Therefore, a copula is the recipe needed to link
the marginals to the joint distribution.
Exercise 1
Show that \(U_j=F_j(X_j)\) implies \(U_j\sim\mathcal{U}[0,1]\) (assume for simplicity that \(F_j\) is continuous and strictly increasing, so that \(F_j^{-1}\) is well defined).
Start by considering the CDF \(F_{U_j}\) of \(U_j\). Then:
\begin{align*}
F_{U_j}(u_j) &= \mathbb{P}(U_j\le u_j) \newline
&= \mathbb{P}(F_j(X_j)\le u_j) \newline
&= \mathbb{P}(X_j\le F_j^{-1}(u_j)) \newline
&= F_j(F_j^{-1}(u_j)) = u_j.
\end{align*}
Because \(F_j(X_j)\in [0, 1]\) one can see that \(F_{U_j}(u_j) = u_j\) is the CDF of a
uniform distribution on the unit interval.
Therefore \(U_j\sim\mathcal{U}[0,1]\).
Consider two uniform random variables \(U_1\) and \(U_2\) with known differentiable copula \(C\).
Assuming \(U_1=u_1\) is observed, prove \eqref{eq:conditional_copula} in Proposition 3
for this \(d=2\) case.
Equation \eqref{eq1} can be formalised into the following theorem.
Theorem 4 — Sklar
Given a \(d\)-dimensional cumulative distribution function \(F\), with marginals \(F_j\),
there exists a copula \(C\) such that
\begin{equation}
F(x_1, ..., x_d) = C(F_1(x_1), ..., F_d(x_d)),
\label{eq:sklar}
\end{equation}
for all \(x_j\in\mathbb{R}\); if the marginals \(F_j\) are continuous, this copula is unique. Similarly, given a copula \(C\)
and a set of \(d\) univariate cumulative distribution functions labelled \(F_1, ..., F_d\),
equation \eqref{eq:sklar} uniquely defines a joint multivariate
cumulative distribution function \(F\) with marginals \(F_1, ..., F_d\).
One has the following corollaries.
Corollary 4.1
From Sklar's theorem it follows that given a joint
cumulative distribution function \(F\) and its marginals, the copula is defined by
\begin{equation}
C(u_1, ..., u_d) = F\left(F^{-1}_1(u_1), ..., F^{-1}_d(u_d)\right),
\end{equation}
where \(F^{-1}_j(u_j) = \inf\{x: F_j(x)\ge u_j\}\).
Corollary 4.2
From Sklar's theorem it follows that given the
marginal cumulative distribution functions \(F_i\) and the associated
marginal probability density functions \(f_i\),
the joint probability density function \(f\) can be written as
\begin{equation}
f(x_1, ..., x_d) = c\left(F_1(x_1), ..., F_d(x_d)\right)\prod_{i=1}^df_i(x_i),
\label{eq:density_from_sklar}
\end{equation}
where \(c\) is the copula density defined in \eqref{eq:copula_density}; this follows by differentiating \eqref{eq:sklar} with respect to \(x_1, ..., x_d\) and applying the chain rule.
Fundamental copulae
Let us now consider three fundamental examples corresponding to the cases of
perfect dependence and independence. As we shall discuss,
these are extreme cases which provide bounds to any other copula, and constitute therefore
preliminary definitions for some results which will be presented later.
First we consider the case of independent random variables.
Definition 5 — Independence copula
The joint distribution of \(d\) independent random variables is defined by the
independence copula
\begin{equation}
C_{\text{ind}}(u_1, ..., u_d) = \prod_{i=1}^d u_i.
\label{eq:independence_copula}
\end{equation}
When all the random variables considered are deterministically related, \(X_j = T_{ji}(X_i)\)
for some set of strictly increasing transformations \(T_{ji}=T_{ij}^{-1}\), one can derive a relation between each pair
of cumulative distribution functions as
\[F_i(x) = \mathbb{P}(X_i\le x) = \mathbb{P}\left(X_j\le T_{ji}(x)\right) = F_j(T_{ji}(x)),\]
which for all \(i,j\) implies
\[U_i = F_i(X_i) = F_j(T_{ji}(X_i)) = F_j(X_j) = U_j.\]
Similarly, consider two random variables such that
\(X_2=D(X_1)\) for some strictly decreasing transformation \(D\); assuming continuous marginals one has
\[F_2(x) = \mathbb{P}(D(X_1)\le x) = \mathbb{P}\left(X_1\ge D^{-1}(x)\right) = 1-F_1(D^{-1}(x)),\]
so that
\[U_2 = F_2(X_2) = 1-F_1(X_1) = 1-U_1.\]
This motivates the definition of the following comonotonicity and counter-monotonicity copulae.
Definition 6 — Comonotonicity copula
The copula associated with \(d\) identical uniformly distributed random variables \(U_i=U_j,\ \forall\ i,j\),
is called a comonotonicity copula and takes the form
\begin{equation}
C_{\text{co}}(u_1, ..., u_d) = \min\{u_1, ..., u_d\}.
\label{eq:comonotonicity_copula}
\end{equation}
Definition 7 — Counter-monotonicity function
The copula associated with \(d=2\) uniformly distributed random variables \(U_2=1-U_1\)
is called a counter-monotonicity copula and takes the form
\begin{equation}
W_{\text{counter}}(u_1, u_2) = \max\{u_1 + u_2 -1, 0\}.
\label{eq:counter_monotonicity_copula}
\end{equation}
A generalisation of \eqref{eq:counter_monotonicity_copula} to arbitrary \(d\) is given
by the counter-monotonicity function
\begin{equation}
W_{\text{counter}}(u_1, ..., u_d) = \max\{\sum_{i=1}^d u_i - d + 1, 0\};
\label{eq:counter_monotonicity_function}
\end{equation}
notice however that \eqref{eq:counter_monotonicity_function} is not a copula for \(d\ge 3\).
Exercise 4
Show that given two independent random variables \(X_1\) and \(X_2\)
the associated joint multivariate cumulative distribution function takes
indeed the form \eqref{eq:independence_copula} from Definition 5.
The joint multivariate cumulative distribution function is \(F(x_1, x_2) = C(F_1(x_1), F_2(x_2))\). The independence of the two random variables \(X_1\perp\!\!\!\perp X_2\)
implies that the associated uniform random variables \(U_j=F_j(X_j)\) are also independent:
\(U_1 \perp\!\!\!\perp U_2\).
Therefore, one has:
\begin{align*}
C(u_1, u_2) &= \mathbb{P}(U_1 \leq u_1, U_2 \leq u_2) \newline
&= \mathbb{P}(U_1 \leq u_1) \mathbb{P}(U_2 \leq u_2) \newline
&= u_1 u_2.
\end{align*}
Hence, \(F(x_1, x_2) = F_1(x_1)F_2(x_2)\).
◻
Exercise 5
Show that Definition 6 is indeed the copula associated with perfectly correlated random variables
\(U_1=U_2\) for the case \(d=2\).
As mentioned at the beginning of this section, these are extreme cases which
are fundamental as they provide bounds to any other copula. This concept is formalised
by the following theorem.
Theorem 8 — Fréchet-Hoeffding bounds
Given a copula
\(C(\mathbf{u}) = C(u_1,..., u_d)\)
one always has
\begin{equation}
W_{\text{counter}}(\mathbf{u})\le C(\mathbf{u}) \le C_{\text{co}}(\mathbf{u}).
\label{eq:Frechet_Hoeffding_bounds}
\end{equation}
Fig.1: The bivariate countermonotonicity, independence, and comonotonicity copulae.
Every other copula lies between the countermonotonicity and comonotonicity extremes.
Finally, let us consider the following proposition.
Proposition 9
Consider a random vector \(\mathbf{X}\) with copula \(C\),
and let \(T_1, ..., T_d\) be a set of \(d\) strictly increasing transformations.
Then, the copula of the transformed vector \((T_1(X_1), ..., T_d(X_d))\) is also \(C\).
Parametric families of copulae
The following are some examples of notable parametric families.
We can recognise two main classes of copulae: elliptical and Archimedean.
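For reference, the bivariate forms of these families, matching the implementations in the code below, and their parameter ranges are:
\begin{align*}
C^{\text{Gauss}}_\rho(u_1, u_2) &= \Phi_2\left(\Phi^{-1}(u_1), \Phi^{-1}(u_2); \rho\right), \quad \rho\in[-1,1],\newline
C^{\text{Student-}t}_{\rho,\nu}(u_1, u_2) &= t_{2,\nu}\left(t_\nu^{-1}(u_1), t_\nu^{-1}(u_2); \rho\right), \quad \rho\in[-1,1],\ \nu>0,\newline
C^{\text{Clayton}}_\theta(u_1, u_2) &= \left(\max\{u_1^{-\theta}+u_2^{-\theta}-1,\,0\}\right)^{-1/\theta}, \quad \theta\in[-1,\infty)\setminus\{0\},\newline
C^{\text{Gumbel}}_\theta(u_1, u_2) &= \exp\left(-\left((-\log u_1)^\theta+(-\log u_2)^\theta\right)^{1/\theta}\right), \quad \theta\in[1,\infty),\newline
C^{\text{Joe}}_\theta(u_1, u_2) &= 1-\left((1-u_1)^\theta+(1-u_2)^\theta-(1-u_1)^\theta(1-u_2)^\theta\right)^{1/\theta}, \quad \theta\in[1,\infty),\newline
C^{\text{Frank}}_\theta(u_1, u_2) &= -\frac{1}{\theta}\log\left(1+\frac{(e^{-\theta u_1}-1)(e^{-\theta u_2}-1)}{e^{-\theta}-1}\right), \quad \theta\in\mathbb{R}\setminus\{0\},
\end{align*}
where \(\Phi_2\) and \(t_{2,\nu}\) denote the bivariate standard normal and Student-t cumulative distribution functions.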
Elliptical copulae are implicit copulae arising from
elliptical distributions
(such as the normal, Student-t, and mixtures of normal distributions)
via Sklar's theorem. Elliptical copulae have the advantage of allowing for
flexible modelling of pairwise dependency, are simple to sample from, and their densities are usually known explicitly.
On the other hand, the copula itself is not explicit, and is characterised by
radial symmetry, which implies identical lower and upper tail behaviour,
as we shall discuss.
Archimedean copulae instead are explicit copulae which can be written
in terms of some continuous, strictly decreasing generator
\(\phi:[0,1]\rightarrow [0,\infty]\) such that \(\phi(1)=0\).
The associated Archimedean copula is then written as
\begin{equation}
C(u_1, ..., u_d) = \phi^{[-1]}\left(\sum_{i=1}^d\phi(u_i)\right),
\end{equation}
with the pseudo-inverse \(\phi^{[-1]}(x)=\phi^{-1}(x)\mathbb{1}_{x\le\phi(0)}\).
As an example, the generators of the Archimedean copulae listed above are:
\begin{align*}
\phi_{\text{Clayton}}(x) &= \theta^{-1}(x^{-\theta}-1),\newline
\phi_{\text{Gumbel}}(x) &= (-\log x)^\theta,\newline
\phi_{\text{Joe}}(x) &= -\log(1-(1-x)^\theta),\newline
\phi_{\text{Frank}}(x) &= -\log\frac{e^{-\theta x}-1}{e^{-\theta}-1}.
\end{align*}
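To make the construction concrete, here is a small numpy sketch (not part of the original code; it uses the Clayton generator above with \(\theta>0\)) checking that the generator construction reproduces the closed-form Clayton copula:
import numpy as np

theta = 2.0
phi = lambda t: (t**(-theta) - 1) / theta          # Clayton generator, phi(1) = 0
phi_inv = lambda x: (1 + theta * x)**(-1 / theta)  # its inverse; phi(0) = inf here, so no clipping needed

def clayton_from_generator(u1, u2):
    # Archimedean construction: C(u1, u2) = phi^{[-1]}(phi(u1) + phi(u2))
    return phi_inv(phi(u1) + phi(u2))

u1, u2 = 0.3, 0.7
closed_form = (u1**(-theta) + u2**(-theta) - 1)**(-1 / theta)
assert np.isclose(clayton_from_generator(u1, u2), closed_form)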
The explicit nature of Archimedean copulae makes them particularly suitable for
closed-form calculations, allowing many of their properties to be
explicitly computed in terms of the generator \(\phi\). Moreover, they are no
longer restricted to radial symmetry.
Fig.2: Bivariate probability density functions with normal marginals and selected copulae.
import numpy as np, matplotlib.pyplot as plt, seaborn as sns
import scipy.stats as st

sns.set()

n = 1000
x1 = x2 = np.linspace(-2.7, 2.7, n)
dx = x1[-1] - x1[-2]  # grid spacing, used to normalise the finite differences below
u1, u2 = st.norm.cdf(x1), st.norm.cdf(x2)

# INDEPENDENCE COPULA
def Ind_cop(u1, u2):
    U1, U2 = np.meshgrid(u1, u2)
    return U1 * U2

# GAUSSIAN COPULA
def Gauss_cop(u1, u2, rho=0.5):
    """rho in [-1, 1]"""
    iU1, iU2 = np.meshgrid(st.norm.ppf(u1), st.norm.ppf(u2))
    jnt = st.multivariate_normal.cdf(np.array([iU1, iU2]).T,
                                     mean=[0, 0], cov=[[1, rho], [rho, 1]])
    return jnt

# CLAYTON COPULA
def Clayton_cop(u1, u2, theta=1.):
    """theta in [-1, inf) excluding 0"""
    U1, U2 = np.meshgrid(u1, u2)
    jnt = (U1**(-theta) + U2**(-theta) - 1).clip(min=0) ** (-1/theta)
    return jnt

# GUMBEL COPULA
def Gumbel_cop(u1, u2, theta=1.):
    """theta in [1, inf)"""
    U1, U2 = np.meshgrid(u1, u2)
    jnt = np.exp(-((-np.log(U1))**theta
                   + (-np.log(U2))**theta)**(1/theta))
    return jnt

# JOE COPULA
def Joe_cop(u1, u2, theta=1.):
    """theta in [1, inf)"""
    U1, U2 = np.meshgrid(u1, u2)
    jnt = 1 - ((1-U1)**theta + (1-U2)**theta - ((1-U1)*(1-U2))**theta) ** (1/theta)
    return jnt

# FRANK COPULA
def Frank_cop(u1, u2, theta=1.):
    """theta in (-inf, inf) excluding 0"""
    U1, U2 = np.meshgrid(u1, u2)
    jnt = -(1/theta) * np.log(1 + ((np.exp(-theta*U1) - 1)*(np.exp(-theta*U2) - 1))
                              / (np.exp(-theta) - 1))
    return jnt

# The joint CDF on the x-grid is F(x1, x2) = C(F1(x1), F2(x2)); a double finite
# difference then approximates the joint density f(x1, x2) of Corollary 4.2.
J = Clayton_cop(u1, u2, theta=5)
dJ = (np.diff(np.diff(J).T).T / dx**2).clip(min=1e-10)
plt.contour(x1[1:], x2[1:], dJ, 10)
plt.show()
Exercise 7
Given a vector \((X_1, X_2)\) from the standard bivariate
normal distribution with correlation \(\rho\), derive the gaussian copula
given above.
Writing \(U_i = \Phi(X_i)\), the associated copula is given by
\begin{align*}
C(u_1, u_2) &= \mathbb{P}(U_1 \leq u_1, U_2 \leq u_2) \newline
&= \mathbb{P}(X_1 \leq \Phi^{-1}(u_1), X_2 \leq \Phi^{-1}(u_2)) \newline
&= \Phi_2(\Phi^{-1}(u_1), \Phi^{-1}(u_2); \rho).
\end{align*}
◻
Exercise 8
Show that the limiting copula of the Clayton copula \(C_\theta^{\text{Clayton}}(u_1, u_2)\)
for \(\theta\rightarrow\infty\) is the comonotonicity copula \(C_{\text{co}}\) and
for \(\theta\rightarrow -1\) is the countermonotonicity copula \(W_{\text{counter}}\).
Moreover show that for \(\theta\rightarrow 0\) Clayton's copula tends to the independence copula.
The case \(\theta\rightarrow -1\) is trivial: substituting \(\theta=-1\) into the Clayton copula directly gives \(\max\{u_1+u_2-1,\,0\} = W_{\text{counter}}(u_1, u_2)\).
◻
Let's now consider the case \(\theta\rightarrow \infty\).
Assume \(u_1 \le u_2 \) in \([0,1]\). Then
\[
u_1^{-\theta} \le \max\left\{u_1^{-\theta}+u_2^{-\theta} - 1,\ 0\right\} \le 2u_1^{-\theta}.
\]
This in turns implies
\[
2^{-1/\theta}u_1\le C_\theta^{\text{Clayton}}(u_1, u_2) \le u_1.
\]
Hence, one has
\[
\lim_{\theta\rightarrow \infty}C_\theta^{\text{Clayton}}(u_1, u_2) = \min\{u_1, u_2\}
=C_{\text{co}}(u_1, u_2).
\]
◻
Let's now consider the case \(\theta\rightarrow 0\). For \(u_1, u_2 \in (0,1]\) one has
\[
u_j^{-\theta} = e^{-\theta\log u_j} = 1-\theta\log u_j +o(\theta)\quad\text{as}\ \theta\rightarrow 0.
\]
Hence
\begin{align*}
\log C_\theta^{\text{Clayton}}(u_1, u_2)
&= -\frac{1}{\theta}\log(1-\theta\log(u_1u_2) + o(\theta)) \newline
&\xrightarrow[]{\theta\rightarrow 0}\log(u_1u_2),
\end{align*}
so that \(C_\theta^{\text{Clayton}}(u_1, u_2)\rightarrow u_1u_2 = C_{\text{ind}}(u_1, u_2)\).
◻
Simulating from Copulae
Let us now consider the essential recipes for drawing
random vectors \((U_1, ..., U_d)\) with uniform marginals and a desired copula,
which can be useful e.g. to perform Monte Carlo simulations. Clearly,
arbitrary marginals can be obtained via the quantile transformation as dictated
by the probability integral transform discussed earlier.
Implicit copulae
For implicit copulae derived from a multivariate distribution such as
the Gaussian and Student-t the procedure is quite straightforward.
Algorithm 10 — Gaussian Copula
Given a correlation matrix \(\Pi\):
(1) Perform a Cholesky decomposition, that is, find a lower-triangular \(\Lambda\) such that \(\Pi = \Lambda\Lambda^T\);
(2) Generate iid standard normals \(\mathbf{G}= (G_1, ..., G_d)^T\);
(3) Compute \(\mathbf{X} = \Lambda\mathbf{G}\);
(4) Return the desired vector of \(U_i = \Phi(X_i)\),
with \(\Phi\) the standard normal cumulative distribution function.
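A minimal numpy sketch of Algorithm 10 (function and variable names are illustrative):
import numpy as np
import scipy.stats as st

def gaussian_copula_sample(n, Pi, rng=None):
    """Draw n samples from the Gaussian copula with correlation matrix Pi."""
    rng = np.random.default_rng(rng)
    Lam = np.linalg.cholesky(Pi)               # Pi = Lam @ Lam.T, Lam lower-triangular
    G = rng.standard_normal((n, Pi.shape[0]))  # iid standard normals
    X = G @ Lam.T                              # correlated normals, row-wise X = Lam G
    return st.norm.cdf(X)                      # U_i = Phi(X_i)

U = gaussian_copula_sample(100_000, np.array([[1.0, 0.75], [0.75, 1.0]]))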
Algorithm 11 — Student-t Copula
(1) Given a correlation matrix \(\Pi\), generate a normal vector \(\mathbf{X}\)
performing steps (1) to (3) from Algorithm 10;
(2) Generate \(\xi=\sum_{i=1}^\nu Y_i^2\),
with \(Y_i \stackrel{\text{iid}}{\sim} \mathcal{N}(0,1)\),
so that \(\xi\sim\chi_\nu^2\);
(3) Return the desired vector of \(U_i = t_\nu(X_i/\sqrt{\xi/\nu})\),
with \(t_\nu\) the univariate Student-t cumulative distribution function.
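Again, a minimal numpy sketch of Algorithm 11 (names are illustrative):
import numpy as np
import scipy.stats as st

def student_t_copula_sample(n, Pi, nu, rng=None):
    """Draw n samples from the Student-t copula with correlation matrix Pi and nu degrees of freedom."""
    rng = np.random.default_rng(rng)
    Lam = np.linalg.cholesky(Pi)
    X = rng.standard_normal((n, Pi.shape[0])) @ Lam.T   # steps (1) to (3) of Algorithm 10
    xi = rng.chisquare(nu, size=n)                      # xi ~ chi^2_nu
    T = X / np.sqrt(xi / nu)[:, None]                   # multivariate Student-t sample
    return st.t.cdf(T, df=nu)                           # U_i = t_nu(T_i)

U = student_t_copula_sample(100_000, np.array([[1.0, 0.75], [0.75, 1.0]]), nu=4)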
Explicit copulae
The Archimedean copulae we have considered are actually part of a class of
copulae known as Laplace Transform Archimedean Copulae (LT-Archimedean for short)
where the pseudo-inverse of the generator, \(\phi^{[-1]}\), can be represented as the Laplace-Stieltjes
transform of some distribution function \(G\) on \([0,\infty)\):
\begin{equation}
\phi^{[-1]}(x) = \int_0^\infty e^{-tx}\text{d}G(t).
\label{eq:LT-A}
\end{equation}
More specifically, one can show [4] that all completely monotone
pseudo-inverse generators \(\phi^{[-1]}\) are Laplace-Stieltjes transforms
of distribution functions on \(\mathbb{R}^+\).
With this in mind, we are now ready to consider an algorithm to sample from such
copulae.
Algorithm 12 — LT-Archimedean Copulae
Given an LT-Archimedean copula with generator \(\phi\):
(1) Find a CDF \(G\) which satisfies \eqref{eq:LT-A};
(2) Generate a variate \(V\sim G\);
(3) Generate iid uniforms \(X_1, ..., X_d\);
(4) Return the vector with elements
\(U_i = \phi^{[-1]}\left(-\frac{\log X_i}{V}\right)\).
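A minimal sketch of Algorithm 12 (names are illustrative). For the Clayton copula one may take, as noted in Example 3 below, \(G=\Gamma(\theta^{-1}, 1)\), whose Laplace transform is \((1+x)^{-1/\theta}\); this corresponds to the equivalent generator scaling \(\phi(t)=t^{-\theta}-1\):
import numpy as np

def lt_archimedean_sample(n, d, phi_inv, sample_V, rng=None):
    """Sample n points from a d-dimensional LT-Archimedean copula."""
    rng = np.random.default_rng(rng)
    V = sample_V(n, rng)                       # step (2): V ~ G
    X = rng.uniform(size=(n, d))               # step (3): iid uniforms
    return phi_inv(-np.log(X) / V[:, None])    # step (4)

theta = 2.0
U = lt_archimedean_sample(
    100_000, d=2,
    phi_inv=lambda x: (1 + x)**(-1 / theta),                     # Laplace transform of Gamma(1/theta, 1)
    sample_V=lambda n, rng: rng.gamma(shape=1 / theta, size=n),  # V ~ Gamma(1/theta, 1)
)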
For the special case of \(d=2\) one has however a simpler approach.
Algorithm 13 — 2D Explicit Copulae
Consider the conditional distribution
\(C_{2|1}(u_2|u_1)=\mathbb{P}(U_2\le u_2 | U_1=u_1)\).
Recalling \eqref{eq:conditional_copula} from Proposition 3,
this is equivalent to
\begin{equation*}
C_{2|1}(u_2|u_1) = \frac{\partial}{\partial u_1} C(u_1, u_2).
\end{equation*}
Therefore one obtains the desired
\((U_1,\ U_2)\) via the following algorithm:
(1) Sample \(u_1, t \overset{\text{iid}}{\sim} \mathcal{U}[0,1]\);
(2) Compute \(u_2 = C_{2|1}^{-1}(t|u_1)\), inverting with respect to \(u_2\);
(3) Return \((u_1, u_2)\).
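For instance, for the bivariate Clayton copula with \(\theta>0\) the conditional CDF can be inverted in closed form, leading to the following minimal sketch (the helper name is illustrative):
import numpy as np

def clayton_conditional_sample(n, theta, rng=None):
    """Sample (U1, U2) from the bivariate Clayton copula (theta > 0) via conditional inversion."""
    rng = np.random.default_rng(rng)
    u1, t = rng.uniform(size=(2, n))
    # C_{2|1}(u2|u1) = dC/du1 = u1^(-theta-1) * (u1^-theta + u2^-theta - 1)^(-(1+theta)/theta);
    # solving C_{2|1}(u2|u1) = t for u2 gives:
    u2 = ((t * u1**(theta + 1))**(-theta / (theta + 1)) + 1 - u1**(-theta))**(-1 / theta)
    return u1, u2

U1, U2 = clayton_conditional_sample(100_000, theta=2.0)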
Example 1: gaussian copula and gaussian marginals
The simplest example is that in which we sample
from a gaussian joint distribution, since multivariate normal
sampling is readily available in most programming languages' libraries.
Fig.3 below presents an example where the marginals follow standard
normal distributions and the gaussian copula has linear correlation
\(\rho=0.75\).
Fig.3: Joint distribution with gaussian copula
and standard normal marginals.
import pandas as pd, matplotlib.pyplot as plt, seaborn as sns
import scipy.stats as st
N = 1_000_000
mvnorm = st.multivariate_normal(mean=[0, 0], cov=[[1., 0.75],
[0.75, 1.]])
X = pd.DataFrame(mvnorm.rvs(N), columns=['X1', 'X2'])
jnt_kw = dict(height=5, ratio=5, space=0,
color="tab:blue", alpha=0)
h = (sns.jointplot(data=X, x='X1', y='X2', marginal_kws=dict(bins=30),
xlim=(-3,3), ylim=(-3,3), kind='hex', gridsize=2, **jnt_kw)
.plot_joint(sns.kdeplot, shade=False, cmap='viridis'))
plt.show()
Example 2: gaussian copula and non-gaussian marginals
In order to produce samples from a joint with gaussian copula
but non-gaussian marginals, we first need to obtain the copula's
uniform marginals \(U_i = \Phi(X_i)\).
Fig.4: Gaussian copula
with its uniform marginals.
import pandas as pd, matplotlib.pyplot as plt, seaborn as sns
import scipy.stats as st
N = 1_000_000
mvnorm = st.multivariate_normal(mean=[0, 0], cov=[[1., 0.75],
[0.75, 1.]])
X = pd.DataFrame(mvnorm.rvs(N), columns=['X1', 'X2'])
jnt_kw = dict(height=5, ratio=5, space=0,
color="tab:blue", alpha=0)
U = pd.DataFrame(st.norm().cdf(X), columns=['U1', 'U2'])
h = (sns.jointplot(data=U, x='U1', y='U2', marginal_kws=dict(bins=30),
xlim=(0,1), ylim=(0,1),kind='hex', gridsize=2, **jnt_kw)
.plot_joint(plt.hexbin, gridsize=30, cmap='Blues'))
plt.show()
Finally, the copula's uniforms can be transformed to the desired marginals.
Take as an example \(\widetilde{X}_1\sim \text{Beta}(\alpha=3, \beta=10)\),
and \(\widetilde{X}_2\sim \text{Gumbel}(\mu=0, \beta=1)\): Figure 5 below
plots the joint density with these marginals
and gaussian copula with \(\rho=0.75\),
while Figure 6, plots these two marginals on the independence copula.
Fig.5: Gaussian copula
with \(\text{Beta}(\alpha=3, \beta=10)\) and
\(\text{Gumbel}(\mu=0,\beta=1)\)
marginals.
Fig.6: Independence copula
with \(\text{Beta}(\alpha=3, \beta=10)\) and
\(\text{Gumbel}(\mu=0,\beta=1)\)
marginals.
import pandas as pd, matplotlib.pyplot as plt, seaborn as sns
import scipy.stats as st
N = 1_000_000
mvnorm = st.multivariate_normal(mean=[0, 0], cov=[[1., 0.75],
[0.75, 1.]])
X = pd.DataFrame(mvnorm.rvs(N), columns=['X1', 'X2'])
jnt_kw = dict(height=5, ratio=5, space=0,
color="tab:blue", alpha=0)
U = pd.DataFrame(st.norm().cdf(X), columns=['U1', 'U2'])
# With Gaussian Copula
x1_t = st.beta(a=3, b=10).ppf(U.U1)
x2_t = st.gumbel_l().ppf(U.U2)
X_trans = pd.DataFrame([x1_t, x2_t], index=['XT1', 'XT2']).T
h = (sns.jointplot(data=X_trans, x='XT1', y='XT2', kind='hex', gridsize=2,
marginal_kws=dict(bins=30),
xlim=(0, 0.6), ylim=(-5, 2.0),
**jnt_kw)
.plot_joint(sns.kdeplot, cmap='viridis'))
plt.show()
# With Independence Copula
x1, x2 = st.beta(a=3, b=10).rvs(N), st.gumbel_l().rvs(N)
h = (sns.jointplot(x=x1, y=x2, kind='hex', marginal_kws=dict(bins=30),
xlim=(0,0.6), ylim=(-5,2.0),
**jnt_kw)
.plot_joint(sns.kdeplot, cmap='viridis'))
plt.show()
Example 3: non-gaussian copula and non-gaussian marginals
At last, let's replicate the previous example, replacing the gaussian copula
with e.g. a Clayton copula.
One can show [5] that the correct
cumulative distribution function \(G\) to be used in Algorithm 12 in this case
is that of a Gamma distribution \(\Gamma(\theta^{-1},1)\).
Fig.7: Clayton copula
with \(\theta=2\).
Fig.8: Clayton copula
with \(\text{Beta}(\alpha=3, \beta=10)\) and
\(\text{Gumbel}(\mu=0,\beta=1)\)
marginals.
import numpy as np, pandas as pd, matplotlib.pyplot as plt, seaborn as sns
import scipy.stats as st

jnt_kw = dict(height=5, ratio=5, space=0,
              color="tab:blue", alpha=0)
N = 1_000_000
θ = 2

def clayton(θ, N, d=2):
    # Marshall-Olkin sampling (Algorithm 12): V ~ Gamma(1/θ, 1), U_i = (1 - log(X_i)/V)^(-1/θ)
    Γ = st.gamma.rvs(a=1/θ, scale=1, size=N)
    X = st.uniform.rvs(size=(d, N))
    return (1 - np.log(X) / Γ) ** (-1/θ)

U1, U2 = clayton(θ=θ, N=N, d=2)
U = pd.DataFrame([U1, U2], index=['U1', 'U2']).T
h = (sns.jointplot(data=U, x='U1', y='U2', marginal_kws=dict(bins=30),
                   xlim=(0, 1), ylim=(0, 1),
                   kind='hex', gridsize=2, **jnt_kw)
     .plot_joint(plt.hexbin, gridsize=30, cmap='Blues'))
plt.show()
x1_t = st.beta(a=3, b=10).ppf(U.U1)
x2_t = st.gumbel_l().ppf(U.U2)
X_trans = pd.DataFrame([x1_t, x2_t], index=['XT1', 'XT2']).T
h = (sns.jointplot(data=X_trans, x='XT1', y='XT2', kind='hex', gridsize=2,
marginal_kws=dict(bins=30),
xlim=(0, 0.6), ylim=(-6, 2.0),
**jnt_kw)
.plot_joint(sns.kdeplot, cmap='viridis'))
plt.show()
In Python, the package
OpenTURNS
provides a fast implementation of copulae. As an example, one can sample
from the Clayton copula we have considered so far simply as follows:
import openturns as ot, numpy as np
X = ot.ComposedDistribution([ot.Uniform(0, 1),
ot.Uniform(0, 1)],
ot.ClaytonCopula(2))
U1, U2 = np.array(X.getSample(1_000_000)).T
As usual the two uniform marginals can now be transformed into
whichever marginals one might desire.
Estimation: Fitting Copulae to Data
The task of fitting a copula to data is a rather challenging one.
Generally speaking, one approach could follow a Maximum Likelihood Estimation principle.
For example, given the random vector \(\mathbf{X}=(X_1, ..., X_d)\), and assuming
parametric models for the copula density \(c_{\mathbf{X}}(\cdot|\mathbf{\theta}_C)\)
and for the marginal cumulative distribution functions
\(F_1(\cdot|\mathbf{\theta}_1), ..., F_d(\cdot|\mathbf{\theta}_d)\), one can write
an expression for the density of \(\mathbf{X}\) as in \eqref{eq:density_from_sklar}:
\[
f_\mathbf{X}(\mathbf{x}|\mathbf{\theta}_1, ..., \mathbf{\theta}_d, \mathbf{\theta}_C)=c_{\mathbf{X}}(F_{X_1}(x_1|\mathbf{\theta}_1), ..., F_{X_d}(x_d|\mathbf{\theta}_d)|\mathbf{\theta}_C)\prod_{i=1}^d f_{X_i}(x_i|\mathbf{\theta}_i).
\]
Given a vector of \(n\) observations \(\mathbf{D}_{1:n}=(\mathbf{D}_1, ..., \mathbf{D}_n)\)
one can then attempt
to maximise the log-likelihood
so as to obtain the ML estimators
\(\hat{\mathbf{\theta}}_1, ..., \hat{\mathbf{\theta}}_d, \hat{\mathbf{\theta}}_C\).
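Explicitly, writing \(\mathbf{D}_k=(D_{k,1}, ..., D_{k,d})\), the log-likelihood to be maximised reads
\[
\ell(\mathbf{\theta}_1, ..., \mathbf{\theta}_d, \mathbf{\theta}_C) = \sum_{k=1}^n\left[\log c_{\mathbf{X}}\left(F_{X_1}(D_{k,1}|\mathbf{\theta}_1), ..., F_{X_d}(D_{k,d}|\mathbf{\theta}_d)|\mathbf{\theta}_C\right)+\sum_{i=1}^d\log f_{X_i}(D_{k,i}|\mathbf{\theta}_i)\right].
\]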
The problem with this approach is found in the very large number of parameters
to be estimated, which complicates optimisation. Moreover, the misspecification of
a single univariate distribution \(F_{X_i}(\cdot|\mathbf{\theta}_i)\) can introduce
biases for all other marginals as well as for the copula.
One alternative is provided by pseudo-MLE, whereby one splits the estimation
into a two-step procedure, estimating the marginals first,
and the copula in a second stage.
In particular, one starts by estimating the marginal cumulative distribution functions
\(F_{X_i}(\cdot|\mathbf{\theta}_i)\) yielding the estimator \(\hat{F}_{X_i}\), either using a parametric model to obtain
a univariate ML-estimator for \(\hat{\mathbf{\theta}}_i\), or even considering the
empirical cumulative distribution function. Then, one estimates the copula as a second
step, again maximising the log-likelihood, now over the copula parameters \(\mathbf{\theta}_C\) only.
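In terms of the estimated marginals, this pseudo-log-likelihood reads
\[
\ell_C(\mathbf{\theta}_C) = \sum_{k=1}^n \log c_{\mathbf{X}}\left(\hat{F}_{X_1}(D_{k,1}), ..., \hat{F}_{X_d}(D_{k,d})|\mathbf{\theta}_C\right).
\]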
For Gaussian and Student-t distributed \(\mathbf{X}=(X_1, ..., X_d)\),
and Gaussian or Student-t copulae with correlation matrix \(\Omega\),
the following holds:
\begin{equation}
\rho_\tau(X_i, X_j) = \frac{2}{\pi}\arcsin(\Omega_{ij}),
\end{equation}
where \(\rho_\tau\) denotes Kendall's tau. This relation can be inverted to obtain a rank-based estimate \(\hat{\Omega}_{ij} = \sin\left(\frac{\pi}{2}\hat{\rho}_\tau(X_i, X_j)\right)\) from the empirical Kendall's tau.
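As a quick sketch (variable names are illustrative), this inversion can be checked with scipy's kendalltau:
import numpy as np, scipy.stats as st
rho = 0.75
mvnorm = st.multivariate_normal(mean=[0, 0], cov=[[1., rho], [rho, 1.]])
X = mvnorm.rvs(10_000, random_state=0)
tau = st.kendalltau(X[:, 0], X[:, 1])[0]   # empirical Kendall's tau
rho_hat = np.sin(np.pi * tau / 2)          # invert rho_tau = (2/pi) arcsin(rho)
print(rho_hat)                             # close to 0.75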
A number of non-parametric methods exist as well, such as the
empirical Bernstein copula [6]
as well as kernel smoothing estimators.
The Python package
OpenTURNS
provides
utilities
to fit these non-parametric copula models.
Estimation in practice: parametric estimates
Practically, Python has a number of packages (although still in their infancy)
which implement similar methods. Popular ones are
Copulas,
Copulae
and
OpenTURNS.
Let's consider a few examples with
OpenTURNS.
The simplest case is that of estimating the gaussian copula associated with
a multivariate normal distribution, which effectively boils down
to estimating the correlation matrix.
The code
import openturns as ot, numpy as np, scipy.stats as st
np.random.seed(seed=0)
mvnorm = st.multivariate_normal(mean=[0, 0], cov=[[1., 0.75],
[0.75, 1.]])
X = mvnorm.rvs(10_000)
dist = ot.NormalCopulaFactory().build(X)
print(dist)
returns
NormalCopula(R = [[ 1 0.754492 ]
[ 0.754492 1 ]])
The parameters can then be extracted as
print(dist.getParameter())
which returns
[0.754492]
Similarly, for data with gaussian copula but non-gaussian marginals:
import openturns as ot, numpy as np, scipy.stats as st
np.random.seed(seed=0)
mvnorm = st.multivariate_normal(mean=[0, 0], cov=[[1., 0.75],
[0.75, 1.]])
X = mvnorm.rvs(10_000)
U = st.norm().cdf(X)
X_trans = np.array([st.beta(a=3, b=10).ppf(U[:,0]),
st.gumbel_l().ppf(U[:,1])]).T
dist = ot.NormalCopulaFactory().build(X_trans)
print(dist)
one obtains
NormalCopula(R = [[ 1 0.754492 ]
[ 0.754492 1 ]])
Notice that having different non-gaussian marginals
does not affect the estimates of the parameters.
Fig.9: Convergence of the estimate
of the linear correlation parameter \(\rho\) as
the sample size \(N\) increases. The shaded area represents
the one standard deviation error around the mean estimate.
Finally, consider the following example
for the non-gaussian Clayton copula with \(\theta=2\)
and \(\text{Beta}(\alpha=3, \beta=10)\) and
\(\text{Gumbel}(\mu=0,\beta=1)\)
marginals:
the code
import openturns as ot, numpy as np, scipy.stats as st
J = ot.ComposedDistribution([ot.Uniform(0, 1),
ot.Uniform(0, 1)],
ot.ClaytonCopula(2))
U1, U2 = np.array(J.getSample(10_000)).T
X_trans = np.array([st.beta(a=3, b=10).ppf(U1),
st.gumbel_l().ppf(U2)]).T
dist = ot.ClaytonCopulaFactory().build(X_trans)
print(dist)
returns
ClaytonCopula(theta = 1.98901)
Estimation in practice: non-parametric estimates
As briefly mentioned,
OpenTURNS
also allows us to estimate copulae non-parametrically.
Let us start by sampling from a bivariate normal distribution with
linear correlation coefficient \(\rho=0.75\):
import openturns as ot, numpy as np, scipy.stats as st
N = 1_000
mvnorm = st.multivariate_normal(mean=[0, 0], cov=[[1., 0.75],
[0.75, 1.]])
X = mvnorm.rvs(N)
Now, one can extract a
Bernstein non-parametric estimate, whose key parameter is the number of bins into which each dimension of the unit cube
\([0,1]^d\) is divided in order to estimate the empirical copula.
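A minimal sketch, assuming OpenTURNS' ot.EmpiricalBernsteinCopula constructor, whose second argument is this bin number:
import openturns as ot
# Bernstein copula estimate of the copula of X; 20 bins per dimension is an illustrative choice
Bs_Est = ot.EmpiricalBernsteinCopula(ot.Sample(X), 20)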
Similarly, one can obtain a
Kernel Smoothing
estimate.
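Again a sketch: one can fit a kernel-smoothed joint distribution with ot.KernelSmoothing and extract its copula:
import openturns as ot
# Kernel smoothing: estimate the joint distribution, then take its copula
KS_Est = ot.KernelSmoothing().build(ot.Sample(X)).getCopula()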
It is then possible to obtain a new sample of N points
from the estimated copulae
Bs_Est and
KS_Est
as Bs_Est.getSample(N) and
KS_Est.getSample(N) respectively.
Fig.10: Left: the gaussian copula density
with \(\rho=0.75\).
Center: Bernstein non-parametric copula density estimate for a \(N=1000\) sample
from a multivariate normal distribution with \(\rho=0.75\).
Right: Kernel Smoothing non-parametric copula density estimate for a \(N=1000\) sample
from a multivariate normal distribution with \(\rho=0.75\).