For convenience, one often desires to summarise
the dependency structure of a multivariate distribution
with a single scalar metric. The most common such measure is
linear correlation. However, this metric is known to suffer from
severe pitfalls in many common situations, and better measures should be considered.
Linear correlation
The first measure of dependency we encounter is Pearson's coefficient of
linear correlation:
\begin{equation}
\rho = \rho(X_1, X_2) = \frac{\text{cov}(X_1, X_2)}{\sqrt{\text{var}(X_1)\text{var}(X_2)}}.
\label{eq:lin_corr}
\end{equation}
While very useful in linear or approximately linear scenarios, this metric fails
to capture fundamental properties of more complex and realistic distributions.
In addition, thinking in terms of linear correlation can easily make us prey to insidious
pitfalls and fallacies.
First of all, we notice that \eqref{eq:lin_corr} depends on the marginals
and, in particular, that it is well defined if and only if the second moments exist,
\(\mathbb{E}(X_j^2)<\infty,\ \forall j\). Therefore this metric is not well defined
for a number of marginals, such as many power-law distributions for which the second moment does not exist.
Moreover, the linear correlation \eqref{eq:lin_corr} is also unable to capture
strong non-linear functional dependencies such as \(X_2=X_1^2\) or \(X_2 = \sin(X_1)\).
Indeed in general one has \(|\rho|\le 1\) and \(|\rho|=1\iff X_2 = aX_1+b\)
for some \(a\in\mathbb{R}\backslash \{0\},\ b\in\mathbb{R} \).
The linear correlation \(\rho\) is also invariant under strictly increasing
linear transformations, but not under more general strictly increasing transformations.
One further source of confusion arises from our habit of reasoning in terms of
normal distributions. Indeed, a number of seemingly intuitive statements on
correlations which are true for normal distributions do not generalise beyond them.
As an example, for jointly normal random variables zero correlation is equivalent to independence, which
is no longer true already for, e.g., Student-t distributed random variables.
Another fallacy is to think that the marginals and the correlation matrix
(\(F_1\), \(F_2\), and \(\rho\) in the bivariate case) are sufficient to determine
the joint distribution \(F\). This is true for elliptical distributions, but wrong in general.
Indeed the only mathematical object encoding all information concerning the dependency structure
is the copula itself.
Yet another fallacy is to think that, given two marginals \(F_1\), \(F_2\),
any value of \(\rho\in[-1,1]\) is attainable. Again, this is true for elliptically
distributed \((X_1, X_2)\) with finite second moments, but wrong in general.
The attainable range can be computed via Hoeffding's formula
\begin{equation}
\text{cov}(X_1, X_2) = \int_{-\infty}^\infty\int_{-\infty}^\infty C(F_1(x_1), F_2(x_2))-F_1(x_1)F_2(x_2)\,\text{d} x_1 \text{d} x_2,
\label{eq:Hoeffding_formula}
\end{equation}
where \(\rho_{\text{min}}\) is attained for \(C=W_{\text{counter}}\) and \(\rho_{\text{max}}\) for \(C=C_{\text{co}}\).
The attainable range \([\rho_{\text{min}}, \rho_{\text{max}}]\) can be arbitrarily small for appropriate choices of the marginals \(F_1\) and \(F_2\).
Fig.1: Attainable range \([\rho_{\text{min}}, \rho_{\text{max}}]\)
of the linear correlation coefficient for two random variables
\(\log X_1 \sim \mathcal{N}(0,1)\) and \(\log X_2 \sim \mathcal{N}(0,\sigma^2)\).
See [1] for more details.
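The bounds in this example admit closed forms: for the comonotonic pair \((e^Z, e^{\sigma Z})\) and the countermonotonic pair \((e^Z, e^{-\sigma Z})\), with \(Z\sim\mathcal{N}(0,1)\), a direct computation using \(\mathbb{E}(e^{aZ})=e^{a^2/2}\) gives \(\rho_{\text{max}/\text{min}} = (e^{\pm\sigma}-1)/\sqrt{(e-1)(e^{\sigma^2}-1)}\). A minimal Python sketch evaluating them (the grid of \(\sigma\) values is an arbitrary choice):

```python
import numpy as np

# Attainable correlation bounds for log X1 ~ N(0,1), log X2 ~ N(0, sigma^2):
# rho_max is attained by the comonotonic pair (e^Z, e^{sigma Z}),
# rho_min by the countermonotonic pair (e^Z, e^{-sigma Z}).
def rho_bounds(sigma):
    denom = np.sqrt((np.e - 1) * (np.exp(sigma**2) - 1))
    return (np.exp(-sigma) - 1) / denom, (np.exp(sigma) - 1) / denom

for sigma in [1.0, 3.0, 5.0]:
    print(sigma, rho_bounds(sigma))  # both bounds shrink towards 0 as sigma grows
```

For \(\sigma=1\) one recovers \(\rho_{\text{max}}=1\) (identical marginals), while already for \(\sigma=5\) the whole attainable range collapses to a tiny interval around zero, despite the dependence being perfect in both extreme cases.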
Exercise 1
Consider two independent random variables \(Z, W \sim \mathcal{N}(0,1)\).
The random variables \(X=Z\) and \(Y = ZW\) are clearly not independent.
What's \(\rho(X, Y)\)?
The linear correlation coefficient is
\begin{align*}
\rho(X, Y) &= \text{cov}(X, Y) \newline
&= \mathbb{E}(XY)\newline
&= \mathbb{E}(W)\mathbb{E}(Z^2) = 0,
\end{align*}
where the first equality holds because \(\text{var}(X)=\text{var}(Y)=1\), and the second because \(\mathbb{E}(X)=0\).
◻
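As a quick sanity check, one can verify this numerically. Below is a minimal Monte Carlo sketch (the sample size is an arbitrary choice):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000  # arbitrary sample size

z = rng.standard_normal(n)
w = rng.standard_normal(n)
x, y = z, z * w

# Sample linear correlation: ~0 despite the obvious dependence of Y on X.
print(np.corrcoef(x, y)[0, 1])
```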
Exercise 2
Prove \eqref{eq:Hoeffding_formula}.
Start by considering an independent copy \((Y_1, Y_2)\) of \((X_1, X_2)\),
so that \(Y_j \sim F_j\) and \((Y_1, Y_2) \perp\!\!\!\perp (X_1, X_2)\).
One has:
\begin{align*}
2\text{cov}(X_1, X_2) &= \mathbb{E}((X_1-\mathbb{E}X_1)(X_2-\mathbb{E}X_2)) + \mathbb{E}((Y_1-\mathbb{E}Y_1)(Y_2-\mathbb{E}Y_2)) \newline
&=\mathbb{E}\left(((X_1 - \mathbb{E}X_1)-(Y_1 - \mathbb{E}Y_1))((X_2 - \mathbb{E}X_2)-(Y_2 - \mathbb{E}Y_2))\right) \newline
&= \mathbb{E}((X_1-Y_1)(X_2-Y_2)).
\end{align*}
The first equality holds because the cross terms arising from expanding the product, e.g. \(\mathbb{E}((X_1-\mathbb{E}X_1)(Y_2-\mathbb{E}Y_2))\), vanish by independence.
Now recall that for any \(a,b\in\mathbb{R}\) one has
\[
b-a = \int_{-\infty}^\infty \Theta(x-a) - \Theta(x-b)\text{d}x,
\]
with \(\Theta(x)\) indicating Heaviside's theta function
(with the convention \(\Theta(0)=1\)). Therefore:
\begin{align*}
2\text{cov}(X_1, X_2) &=
\mathbb{E}\int_{-\infty}^\infty\int_{-\infty}^\infty
(\Theta(x_1-Y_1) - \Theta(x_1-X_1))(\Theta(x_2-Y_2) - \Theta(x_2-X_2))
\text{d}x_1\text{d}x_2 \newline
\xrightarrow[]{\text{Fubini}}&=\int_{-\infty}^\infty\int_{-\infty}^\infty
\mathbb{E}\left((\Theta(x_1-Y_1) - \Theta(x_1-X_1))(\Theta(x_2-Y_2) - \Theta(x_2-X_2))\right)
\text{d}x_1\text{d}x_2\newline
&=2\int_{-\infty}^\infty\int_{-\infty}^\infty F(x_1, x_2) - F_1(x_1)F_2(x_2)\text{d}x_1\text{d}x_2.
\end{align*}
◻
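The identity can also be checked numerically. The sketch below (all numerical choices are arbitrary) estimates the right-hand side of \eqref{eq:Hoeffding_formula} from samples of a bivariate normal with correlation \(\rho=0.6\), for which \(\text{cov}(X_1,X_2)=\rho\):

```python
import numpy as np

rng = np.random.default_rng(0)
rho, n = 0.6, 500_000
x = rng.multivariate_normal([0, 0], [[1, rho], [rho, 1]], size=n)

# Empirical joint CDF on a grid, via a cumulated 2D histogram.
edges = np.linspace(-5, 5, 201)
counts, _, _ = np.histogram2d(x[:, 0], x[:, 1], bins=[edges, edges])
F = counts.cumsum(axis=0).cumsum(axis=1) / n  # F(x1, x2) at upper bin edges
F1, F2 = F[:, -1], F[-1, :]                   # marginal CDFs

# Riemann sum of F(x1, x2) - F1(x1) F2(x2) over the grid.
dx = edges[1] - edges[0]
print((F - np.outer(F1, F2)).sum() * dx**2)   # ~0.6 = cov(X1, X2)
```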
Rank correlation
Many of the drawbacks and pitfalls encountered with linear correlation
are resolved when considering rank correlations instead.
As we shall see, rank correlation coefficients are always defined
and are invariant under any strictly increasing transformation, implying
they depend exclusively on the copula.
In the following we
shall present two of the most prominent, namely Kendall's tau and
Spearman's rho.
Definition 1 — Kendall's tau
Consider \((X_1, X_2)\sim F\) and an independent copy \((Y_1, Y_2)\sim F\).
Kendall's tau is defined as
\begin{equation}
\begin{split}
\rho_\tau &= \mathbb{E}(\text{sign}((X_1-Y_1)(X_2-Y_2))) \newline
&= \mathbb{P}((X_1-Y_1)(X_2-Y_2)>0) - \mathbb{P}((X_1-Y_1)(X_2-Y_2)<0).
\end{split}
\label{eq:kendall_tau}
\end{equation}
That is, the probability of concordance minus the probability of discordance
(i.e. the probability that the segment joining two independent points drawn from \(F\) has positive slope, minus the probability that it has negative slope).
Proposition 1.1 — Formula for Kendall's tau
Consider two random variables \(X_1\) and \(X_2\)
with marginals \(F_1\) and \(F_2\) and copula \(C\). One has:
\begin{equation}
\begin{split}
\rho_\tau &= 4\int_{[0,1]^2}C(u_1, u_2)\text{d}C(u_1, u_2) - 1\newline
&=4\mathbb{E}(C(U_1, U_2)) - 1,
\end{split}
\label{eq:formula_kendall}
\end{equation}
with \((U_1, U_2)\sim C\).
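As a sanity check of \eqref{eq:formula_kendall}, one can compare a sample estimate of Kendall's tau with the known closed form \(\rho_\tau = \frac{2}{\pi}\arcsin(\rho)\) for the Gaussian copula. A minimal sketch (sample size and \(\rho\) are arbitrary choices):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
rho, n = 0.5, 20_000
x = rng.multivariate_normal([0, 0], [[1, rho], [rho, 1]], size=n)

# Sample Kendall's tau vs. the Gaussian-copula closed form.
tau_sample, _ = stats.kendalltau(x[:, 0], x[:, 1])
print(tau_sample, 2 / np.pi * np.arcsin(rho))  # both ~1/3
```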
Definition 2 — Spearman's rho
Consider \(X_j \sim F_j\) for \(j \in \{1,2\}\).
Spearman's rho is defined as
\begin{equation}
\rho_S = \rho(F_1(X_1), F_2(X_2)).
\label{eq:spearman_rho}
\end{equation}
Proposition 2.1 — Formula for Spearman's rho
Consider two random variables \(X_1\) and \(X_2\)
with marginals \(F_1\) and \(F_2\) and copula \(C\). One has:
\begin{equation}
\begin{split}
\rho_S &=12 \int_0^1\int_0^1 C(u_1, u_2)\text{d}u_1\text{d}u_2 - 3 \newline
&= 12\mathbb{E}(C(U_1, U_2)) - 3,
\end{split}
\label{eq:formula_spearman}
\end{equation}
with \(U_1, U_2 \sim \mathcal{U}[0,1]\) and \(U_1 \perp\!\!\!\perp U_2\).
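Analogously, \eqref{eq:formula_spearman} can be checked against the known closed form \(\rho_S = \frac{6}{\pi}\arcsin(\rho/2)\) for the Gaussian copula (again, sample size and \(\rho\) are arbitrary choices):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
rho, n = 0.5, 20_000
x = rng.multivariate_normal([0, 0], [[1, rho], [rho, 1]], size=n)

# Sample Spearman's rho vs. the Gaussian-copula closed form.
rho_s, _ = stats.spearmanr(x[:, 0], x[:, 1])
print(rho_s, 6 / np.pi * np.arcsin(rho / 2))  # both ~0.483
```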
Rank correlations are useful to characterise dependence, providing measures that are comparable across different marginals,
and can also serve as a tool for the calibration and estimation of a copula's parameter(s).
As we have mentioned, a number of fallacies are avoided, but not all.
Resolved fallacies
For \(\kappa=\rho_\tau\) as well as for \(\kappa=\rho_S\) one has:
- \(\kappa\) is always well defined;
- \(\kappa\) is invariant under any strictly increasing transformation of the random variables;
- \(\kappa=\pm 1\) if and only if \(X_1\) and \(X_2\) are co-/counter-monotonic;
- any \(\kappa \in [-1,1]\) is attainable.
Unresolved fallacies
However, in general for \(\kappa=\rho_\tau\) as well as for \(\kappa=\rho_S\) one has:
- The marginals \(F_1, F_2\) and the rank correlation \(\kappa\) are still not sufficient
to uniquely determine \(F\).
- While \(X_1 \perp\!\!\!\perp X_2 \implies \kappa=0\), the converse is still not true: \(\kappa=0\) does not imply independence.
The last point is a property one might still desire.
However, one can show that this requirement would be in contradiction with the
fundamental property of invariance under strictly increasing transformations.
Proposition 3
There exists no dependency measure \(\kappa\) such that:
- \(\kappa(X_1, X_2)=0 \iff X_1 \perp\!\!\!\perp X_2\), and
- \(\kappa(T(X_1), X_2)=\kappa(X_1, X_2)\) for \(T\) strictly increasing, with \(\kappa(T(X_1), X_2)=-\kappa(X_1, X_2)\) for \(T\) strictly decreasing.
See Exercise 8 for a proof. Nonetheless, it is still possible to define a dependency measure \(\kappa\) such that
\(\kappa(X_1, X_2)=0 \iff X_1 \perp\!\!\!\perp X_2\), as long as one is willing to trade off
other properties. In particular, it can be shown that one can have
- \(\kappa(X_1, X_2)=0 \iff X_1 \perp\!\!\!\perp X_2\),
- \(0\le \kappa(X_1, X_2)\le 1\),
- \(\kappa(X_1, X_2)=1 \iff X_1,X_2\) co-/counter-monotonic, and
- \(\kappa(T(X_1), X_2)=\kappa(X_1, X_2)\) for \(T\) strictly increasing.
See [1] for more details.
Exercise 5
Show that the comonotonic copula \(C_{\text{co}}\) implies \(\kappa=1\)
for both \(\kappa=\rho_\tau\) and \(\kappa=\rho_S\).
For Spearman's rho, recall that \(C_{\text{co}}(u_1, u_2)=\min(u_1, u_2)\) and take \(U_1, U_2\sim\mathcal{U}[0,1]\) with \(U_1 \perp\!\!\!\perp U_2\). Notice that
\[
\mathbb{P}(\text{min}(U_1, U_2) < t) = 1-\mathbb{P}(\text{min}(U_1, U_2) \ge t) = 1- (1-t)^2,
\]
so that
\(
f_{T=\text{min}(U_1, U_2)}(t) = 2(1-t).
\)
Therefore one has
\begin{align*}
\mathbb{E}(\text{min}(U_1, U_2)) &= \int_0^1 tf_{T=\text{min}(U_1, U_2)}(t) \text{d}t \newline
&= 2\int_0^1 t(1-t) \text{d}t =\frac{1}{3}.
\end{align*}
Hence:
\begin{align*}
\rho_S &= 12\mathbb{E}(\text{min}(U_1, U_2)) - 3 = \frac{12}{3} - 3 = 1.
\end{align*}
Let's consider Kendall's tau now. First, the copula viewed as a random variable
has a distribution function called the "Kendall distribution function", which for an Archimedean copula equals
\[
K_C(t) = t -\frac{\psi(t)}{\psi'(t)},
\]
with \(\psi(t)\) the copula's generator and \(\psi'(t)\) its first derivative.
One can then write the expected value of the copula as
\begin{align*}
\mathbb{E}[C(U_1, U_2)] &= \int_0^1t\text{d}K_C \newline
\xrightarrow[]{\text{by parts}} &= tK_C(t)\Big|_0^1 - \int_0^1K_C(t)\text{d}t.
\end{align*}
Recalling that the Clayton copula tends to \(C_{\text{co}}\) for \(\theta\rightarrow\infty\)
and to \(W_{\text{counter}}\) for \(\theta\rightarrow -1\),
let's consider the generator of
the Clayton copula and its derivative:
\begin{align*}
\psi(t) &= \theta^{-1}(t^{-\theta}-1), \newline
\psi'(t) &= -t^{-(1+\theta)},
\end{align*}
which gives
\[
K_C(t) = t(1+\theta^{-1}(1-t^\theta)).
\]
Then
\begin{align*}
\mathbb{E}[C(U_1, U_2)] &= tK_C(t)\Big|_0^1 - \int_0^1K_C(t)\text{d}t\newline
&= 1-\frac{\theta+3}{2\theta+4}.
\end{align*}
Finally, taking the limits \(\theta\rightarrow\infty\) for \(C_{\text{co}}\) and
\(\theta\rightarrow -1\) for \(W_{\text{counter}}\) one finds:
\begin{align*}
\mathbb{E}[C_{\text{co}}(U_1, U_2)] &= \frac{1}{2},\newline
\mathbb{E}[W_{\text{counter}}(U_1, U_2)] &= 0.\newline
\end{align*}
Therefore,
\begin{align*}
\rho_\tau &= 4\mathbb{E}[C_{\text{co}}(U_1, U_2)]-1 = 1.
\end{align*}
◻
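Incidentally, combining the expression for \(\mathbb{E}[C(U_1, U_2)]\) derived above with \eqref{eq:formula_kendall} gives \(\rho_\tau = \theta/(\theta+2)\) for the Clayton copula at finite \(\theta\). The sketch below (parameter values are arbitrary) checks this by sampling the bivariate Clayton copula via conditional inversion, i.e. by inverting the conditional distribution \(C_{2|1}(u_2|u_1)=\partial C/\partial u_1\):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
theta, n = 2.0, 20_000

# Conditional-inversion sampler for the bivariate Clayton copula.
u1 = rng.uniform(size=n)
v = rng.uniform(size=n)  # uniform draw fed through the inverse conditional CDF
u2 = (u1**(-theta) * (v**(-theta / (1 + theta)) - 1) + 1)**(-1 / theta)

tau, _ = stats.kendalltau(u1, u2)
print(tau, theta / (theta + 2))  # both ~0.5
```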
Exercise 6
Show that the countermonotonic copula \(W_{\text{counter}}\) implies \(\kappa=-1\)
for both \(\kappa=\rho_\tau\) and \(\kappa=\rho_S\).
Proceed as in the solution of Exercise 5. For Kendall's tau, the limit \(\theta\rightarrow -1\) computed there gives \(\rho_\tau = 4\cdot 0 - 1 = -1\). For Spearman's rho, note that \(W_{\text{counter}}(u_1, u_2)=\max(u_1+u_2-1, 0)\), so that with \(U_1 \perp\!\!\!\perp U_2\) one finds \(\mathbb{E}(\max(U_1+U_2-1, 0))=\frac{1}{6}\) and hence \(\rho_S = \frac{12}{6}-3=-1\).
◻
Exercise 7
Write a bivariate joint distribution parametrised by a single parameter \(\lambda\in [0,1]\)
and show that this can attain any \(\kappa \in [-1,1]\) for both \(\kappa=\rho_\tau\) and \(\kappa=\rho_S\).
Consider the following:
\begin{equation*}
F(x_1, x_2) = \lambda C_{\text{co}}(F_1(x_1), F_2(x_2)) + (1-\lambda)W_{\text{counter}}(F_1(x_1), F_2(x_2)).
\end{equation*}
Its copula is the mixture \(C_\lambda = \lambda C_{\text{co}} + (1-\lambda)W_{\text{counter}}\). Since \eqref{eq:formula_spearman} is linear in the copula and assigns the value \(1\) to \(C_{\text{co}}\) and \(-1\) to \(W_{\text{counter}}\) (Exercises 5 and 6), one immediately finds
\[
\rho_S = \lambda - (1-\lambda) = 2\lambda -1,
\]
and a direct computation of \eqref{eq:formula_kendall} for \(C_\lambda\) yields the same value. That is, one has \(\rho_\tau=\rho_S=2\lambda-1\), which spans all of \([-1,1]\) as \(\lambda\) ranges over \([0,1]\).
◻
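A quick numerical check of this construction (a minimal sketch; \(\lambda\) and the sample size are arbitrary choices): with probability \(\lambda\) we set \(U_2=U_1\) (comonotonic component) and otherwise \(U_2=1-U_1\) (countermonotonic component).

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
lam, n = 0.75, 50_000

u1 = rng.uniform(size=n)
comono = rng.uniform(size=n) < lam    # mixture component indicator
u2 = np.where(comono, u1, 1 - u1)     # U2 = U1 w.p. lam, else 1 - U1

print(stats.kendalltau(u1, u2)[0],
      stats.spearmanr(u1, u2)[0],
      2 * lam - 1)                    # all ~0.5
```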
Exercise 8
Prove Proposition 3.
Consider \((X_1, X_2)\) uniformly distributed on the unit circle in \(\mathbb{R}^2\),
so that the vector can be parametrised by \(\phi\sim \mathcal{U}[0,2\pi]\)
as \((X_1, X_2)=(\cos\phi, \sin\phi)\).
Because \((X_1, X_2) \stackrel{\text{d}}{=} (-X_1, X_2)\), while the transformation property implies \(\kappa(-X_1, X_2) = -\kappa(X_1, X_2)\), one has
\begin{equation*}
\kappa(X_1, X_2) = \kappa(-X_1, X_2) = -\kappa(X_1, X_2).
\end{equation*}
This implies \(\kappa(X_1, X_2)=0\) although \(X_1\) and \(X_2\) are not independent,
which is a contradiction.
See also [1].
◻
Coefficients of tail dependence
If one wants to study extreme values, asymptotic measures of tail dependence
can be defined as a function of the copula. In what follows we shall
distinguish between upper tail dependence and lower tail dependence.
Definition 4 — Coefficient of tail dependence
Consider two random variables \(X_j\sim F_j\).
The associated coefficients of upper and lower tail dependence are:
\begin{equation}
\begin{split}
\lambda_{u} &= \lim_{\alpha\rightarrow 1^-}\mathbb{P}(X_2> F_2^{-1}(\alpha)|X_1 > F_1^{-1}(\alpha)),\newline
\lambda_{\ell} &= \lim_{\alpha\rightarrow 0^+}\mathbb{P}(X_2\le F_2^{-1}(\alpha)|X_1\le F_1^{-1}(\alpha)).
\end{split}
\end{equation}
If \(\lambda_{u}\in(0,1]\) (\(\lambda_{\ell}\in(0,1]\)), then \((X_1, X_2)\)
is said to be upper (lower) tail dependent, or more generally, asymptotically dependent. Similarly,
if \(\lambda_{u}=0\) (\(\lambda_{\ell}=0\)), then \((X_1, X_2)\)
is said to be upper (lower) tail independent, or more generally, asymptotically independent.
Proposition 4.1
The coefficients of upper and lower tail dependence can be written as a function
of the copula as:
\begin{equation}
\begin{split}
\lambda_{u} &= \lim_{\alpha\rightarrow 1^-}2-\frac{1-C(\alpha, \alpha)}{1-\alpha},\newline
\lambda_{\ell} &= \lim_{\alpha\rightarrow 0^+}\frac{C(\alpha,\alpha)}{\alpha}.
\end{split}
\end{equation}
Proposition 4.2
For radially symmetric copulae one has \(\lambda_{u}=\lambda_{\ell}\).
Proposition 4.3
For Archimedean copulae with strict generator \(\psi\), here denoting the generator in the convention \(C(u_1, u_2) = \psi(\psi^{-1}(u_1)+\psi^{-1}(u_2))\) (i.e. the inverse of the generator used in Exercise 5), one has
\begin{equation}
\begin{split}
\lambda_{u} &= 2-2\lim_{\alpha\rightarrow 0^+}\frac{\psi'(2\alpha)}{\psi'(\alpha)},\newline
\lambda_{\ell} &= 2\lim_{\alpha\rightarrow\infty}\frac{\psi'(2\alpha)}{\psi'(\alpha)}.
\end{split}
\end{equation}
Figure 2 below presents the coefficient of tail dependence
for the Student-t copula \(C_{\nu,\rho}^t\), for which one can show \(\lambda_\ell=\lambda_u\).
The tail dependence grows with the correlation coefficient \(\rho\) and quickly decreases
with increasing degrees of freedom \(\nu\). Therefore, recalling that the limit \(\nu\rightarrow\infty\)
leads to normality, one can easily see that the gaussian copula is
asymptotically independent for all \(\rho\) except \(\rho=1\).
Fig.2: Coefficient of tail dependence for the Student-t copula \(C_{\nu,\rho}^t\).
Exercise 9
Prove Proposition 4.3. Moreover, compute the upper and lower coefficients
for the Clayton and Gumbel copulae.
Consider the upper coefficient first. One has:
\begin{align*}
\lambda_u &= 2-\lim_{\alpha\rightarrow 1^-}\frac{1-\psi(2\psi^{-1}(\alpha))}{1-\alpha} \newline
\xrightarrow[]{\beta=\psi^{-1}(\alpha)}&= 2-\lim_{\beta\rightarrow 0^+}\frac{1-\psi(2\beta)}{1-\psi(\beta)}\newline
\xrightarrow[]{\text{de l'Hôpital}}&=2-2\lim_{\beta\rightarrow 0^+}\frac{\psi'(2\beta)}{\psi'(\beta)}.
\end{align*}
Now the lower coefficient:
\begin{align*}
\lambda_\ell &= \lim_{\alpha\rightarrow 0^+}\frac{\psi(2\psi^{-1}(\alpha))}{\alpha} \newline
\xrightarrow[]{\beta=\psi^{-1}(\alpha)}&= \lim_{\beta\rightarrow\infty}\frac{\psi(2\beta)}{\psi(\beta)}\newline
\xrightarrow[]{\text{de l'Hôpital}}&=2\lim_{\beta\rightarrow\infty}\frac{\psi'(2\beta)}{\psi'(\beta)}.
\end{align*}
Finally, for the Clayton copula (with \(\theta>0\)) one finds:
\begin{align*}
\lambda_u &= 0,\newline
\lambda_\ell &= 2^{-1/\theta},
\end{align*}
while for the Gumbel copula one has:
\begin{align*}
\lambda_u &= 2-2^{1/\theta},\newline
\lambda_\ell &= 0.
\end{align*}
◻
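As an illustration, the lower tail coefficient of the Clayton copula can also be checked directly from Proposition 4.1, evaluating \(C(\alpha, \alpha)/\alpha\) for decreasing \(\alpha\) (a minimal sketch; the value of \(\theta\) is an arbitrary choice):

```python
import numpy as np

theta = 2.0

def clayton(u1, u2):
    # Bivariate Clayton copula, theta > 0.
    return (u1**(-theta) + u2**(-theta) - 1)**(-1 / theta)

for alpha in [1e-2, 1e-4, 1e-6]:
    print(alpha, clayton(alpha, alpha) / alpha)  # converges to the limit below

print("limit:", 2**(-1 / theta))  # lambda_l = 2^(-1/theta) ~ 0.707
```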
One major issue with gaussian copulae is their asymptotic independence for any \(|\rho|<1\).
The great financial crisis of 2007-2008 is often partly attributed to the misuse of the gaussian
copula model [2][3],
and in particular to the infamous Li model of credit default
[4].
The inadequacy of the gaussian copula is evident:
attempting to model defaults in a portfolio of corporate bonds with it
leads to a crucial underestimation of the likelihood of joint defaults
(because of the gaussian copula's asymptotic independence).
This is especially important in times of financial distress, when defaults
are clustered and, conditional on observing one company defaulting,
one has a high likelihood of observing more defaults.
Consequently, some have argued that one cause of the crisis was a
"misplaced reliance on sophisticated mathematics" [5].
However, this could not be further from the truth. Quite the contrary, in fact:
models based on the gaussian copula were enthusiastically embraced by the financial industry
precisely because of their simplicity. Had the industry employed more 'sophisticated'
mathematics, which in this specific case means asymptotically dependent copulae,
the banks' risk management would arguably have been more sound, and capable of coping with
the materialisation of large clusters of defaults.