Recap
Two weeks ago, we met Maximus Extremus Distributus. He was taking different forms depending on the origin (parent) distribution.
Last week, Mumble, Joe, and Devine met to discuss the central ideas behind extreme value distribution. They derived the exact function for it.
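$F_Y(y) = P(Y \le y) = [F_X(y)]^n$, where $Y = \max(X_1, X_2, \ldots, X_n)$ and $F_X$ is the cumulative function of the parent distribution.
From this cumulative function, we can derive the probability density function, $f_Y(y) = n[F_X(y)]^{n-1} f_X(y)$.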
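Here is a minimal sketch of the exact function in Python, assuming an exponential parent with rate 1. It compares $[F_X(y)]^n$ against a brute-force simulation of the maximum. The function names, the sample size `n = 10`, the point `y = 3.0`, and the seed are illustrative choices, not part of the lesson.

```python
import math
import random

def exact_max_cdf(y, n, lam=1.0):
    """Exact CDF of the maximum of n iid exponential(lam) variables: [F_X(y)]^n."""
    return (1.0 - math.exp(-lam * y)) ** n

def empirical_max_cdf(y, n, lam=1.0, trials=20000, seed=1):
    """Fraction of simulated maxima that fall at or below y."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = 0
    for _ in range(trials):
        biggest = max(rng.expovariate(lam) for _ in range(n))
        if biggest <= y:
            hits += 1
    return hits / trials

n, y = 10, 3.0  # illustrative values only
print("exact    :", round(exact_max_cdf(y, n), 4))
print("simulated:", round(empirical_max_cdf(y, n), 4))
```

The two numbers should agree to within simulation noise, which shrinks as `trials` grows.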
Mumble pointed out that the exact function degenerates as $n \to \infty$ and introduced the idea of normalizing constants to stabilize it.
If there are two normalizing constants $a_n > 0$ and $b_n$, we can create a normalized version of $Y$ as $Y^* = \frac{Y - b_n}{a_n}$.
Before we closed off for the week, we knew that $Y^*$ converges to one of three types: the Type I, Type II, and Type III extreme value distributions.
If there exist normalizing constants $a_n > 0$ and $b_n$, then
$P\left(\frac{Y - b_n}{a_n} \le z\right) \to G(z)$ as $n \to \infty$.
$G(z)$ is a non-degenerate cumulative distribution function.
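Type I (Gumbel distribution): $G(z) = e^{-e^{-z}}$. This is a double exponential function.
Type II (Frechet distribution): $G(z) = e^{-z^{-\alpha}}$ for $z > 0$ and 0 for $z \le 0$, with $\alpha > 0$. This is a single exponential function.
Type III (Weibull distribution): $G(z) = e^{-(-z)^{\alpha}}$ for $z < 0$ and 1 for $z \ge 0$, with $\alpha > 0$. This is also a single exponential function.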
The convergence
Let’s get some intuition on why the parent distributions converge to these three types. The full mathematical foundation goes much deeper. We will save that for later lessons and build a simple intuition here.
Exponential origin: Let’s take Joe’s wait time example from last week. We assume that the time between successive vehicle arrivals has an exponential distribution. Let’s call this random variable $X$.
There are $n$ arrival times between successive vehicles, which can be written as a set of random variables $X_1, X_2, \ldots, X_n$.
The maximum wait time is the max of these numbers. Let’s call this maximum time $Y$, so $Y = \max(X_1, X_2, \ldots, X_n)$.
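The cumulative function for an exponential distribution is $F_X(x) = 1 - e^{-\lambda x}$. Hence, for $Y$ (maximum wait time), it will be
$F_Y(y) = [1 - e^{-\lambda y}]^n$.
For simplicity, let’s assume a value of 1 for $\lambda$ and take the binomial series expansion for $F_Y(y) = [1 - e^{-y}]^n$:
$F_Y(y) = 1 - ne^{-y} + \frac{n(n-1)}{2!}e^{-2y} - \cdots$
For large $n$, $n(n-1) \approx n^2$ (and similarly for the higher terms), so this series approaches $1 - ne^{-y} + \frac{(ne^{-y})^2}{2!} - \cdots = e^{-ne^{-y}}$, an asymptotic double exponential function.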
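To see how good the large-$n$ approximation is, here is a small Python check that compares the exact $[1 - e^{-y}]^n$ with $e^{-ne^{-y}}$ for $\lambda = 1$. The value `n = 100` and the grid of `y` values are arbitrary picks for illustration.

```python
import math

def exact_cdf(y, n):
    """Exact CDF of the maximum of n exponential(1) variables."""
    return (1.0 - math.exp(-y)) ** n

def large_n_approx(y, n):
    """Asymptotic double exponential form exp(-n * exp(-y))."""
    return math.exp(-n * math.exp(-y))

n = 100  # illustrative sample size
for y in (4.0, 5.0, 6.0, 8.0):
    print(y, round(exact_cdf(y, n), 4), round(large_n_approx(y, n), 4))
```

The two columns get closer as `n` increases, which is the sense in which the series approaches the double exponential form.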
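Norming Constants: Now let’s derive the double exponential function with the idea of norming constants. We know $F_Y(y) = [1 - e^{-y}]^n$ for $\lambda = 1$.
Let’s introduce a variable $z = y + \ln(n)$ and evaluate the function $F_Y$ at $z$. We are adding a constant, $\ln(n)$, to $y$.
$F_Y(y + \ln(n)) = [1 - e^{-(y + \ln(n))}]^n = \left[1 - \frac{e^{-y}}{n}\right]^n$
As $n \to \infty$, $F_Y(z)$ converges to $e^{-e^{-y}}$, a double exponential function. If you observe the equation carefully, it is of the form $\left[1 + \frac{x}{n}\right]^n$, which in the limit is $e^x$. In our case, $x = -e^{-y}$. Hence,
$F_Y(z) = e^{-e^{-y}}$.
If we replace $y = z - \ln(n)$, we get,
$F_Y(z) = e^{-e^{-(z - \ln(n))}}$.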
So, with appropriate scaling (stabilization/norming), we see a double exponential function when the origin is an exponential function.
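A quick numerical sketch of this stabilization, assuming an exponential parent with rate 1: without the shift, the CDF of the maximum at a fixed point collapses to 0 as $n$ grows, while the version shifted by $\ln(n)$ settles on the double exponential value. The function names and the evaluation point `y = 1.0` are illustrative.

```python
import math

def unshifted_max_cdf(y, n):
    """[1 - exp(-y)]^n: degenerates to 0 at any fixed y as n grows."""
    return (1.0 - math.exp(-y)) ** n

def shifted_max_cdf(y, n):
    """CDF evaluated at y + ln(n): [1 - exp(-y)/n]^n."""
    return (1.0 - math.exp(-y) / n) ** n

def gumbel_cdf(y):
    """Limiting double exponential function exp(-exp(-y))."""
    return math.exp(-math.exp(-y))

y = 1.0  # illustrative evaluation point
for n in (10, 100, 10_000, 1_000_000):
    print(n, round(unshifted_max_cdf(y, n), 6), round(shifted_max_cdf(y, n), 6))
print("double exponential limit:", round(gumbel_cdf(y), 6))
```

The middle column shrinks toward 0 and the last column stabilizes, which is exactly what the norming constant is for.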
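Power law origin: Let’s try one more parent function. This time, it is a power law function with a cumulative distribution function $F_X(x) = 1 - x^{-\alpha}$ for $x \ge 1$. We did not learn this distribution, but it is like a decay function with $\alpha$ controlling the degree of decay. The distribution of the maximum value of this function is
$F_Y(y) = [1 - y^{-\alpha}]^n$.
We will assume a new variable $z = n^{1/\alpha} y$ and evaluate the function at $z$.
$F_Y(z) = [1 - (n^{1/\alpha} y)^{-\alpha}]^n = \left[1 - \frac{y^{-\alpha}}{n}\right]^n$
In the limit, as $n \to \infty$,
$F_Y(z) = e^{-y^{-\alpha}}$.
Hence,
$F_Y(z) = e^{-\left(\frac{z}{n^{1/\alpha}}\right)^{-\alpha}}$.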
So, parent distributions with a power law tail converge to the single exponential Type II Frechet distribution. Similar norming constants can be found for other distributions that converge to the Type III Weibull distribution.
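The same kind of numerical check works for the power law parent. This sketch assumes the parent CDF $1 - x^{-\alpha}$ on $x \ge 1$ and scales the evaluation point by $n^{1/\alpha}$. The decay value `alpha = 2.0` and the point `y = 1.5` are made up for illustration.

```python
import math

def scaled_max_cdf(y, n, alpha):
    """CDF of the maximum evaluated at n^(1/alpha) * y: [1 - y^(-alpha)/n]^n."""
    return (1.0 - y ** (-alpha) / n) ** n

def frechet_cdf(y, alpha):
    """Limiting single exponential function exp(-y^(-alpha)) for y > 0."""
    return math.exp(-(y ** (-alpha))) if y > 0 else 0.0

alpha, y = 2.0, 1.5  # illustrative values only
for n in (10, 100, 10_000, 1_000_000):
    print(n, round(scaled_max_cdf(y, n, alpha), 6))
print("single exponential limit:", round(frechet_cdf(y, alpha), 6))
```

Here the norming is a multiplication rather than a shift, but the effect is the same: the scaled CDF stops moving as `n` grows.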
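To summarize, the three types are $e^{-e^{-z}}$ (Type I), $e^{-z^{-\alpha}}$ (Type II), and $e^{-(-z)^{\alpha}}$ (Type III).
They are all of the form $e^{-g(z)}$, with $g(z)$ equal to $e^{-z}$, $z^{-\alpha}$, or $(-z)^{\alpha}$.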
Here’s a visual of how these three distributions look.
If the right tail is of exponential type, the extreme value distribution is a Gumbel distribution. Here the parent distribution (or the distribution of $X$) is unbounded on the right tail. Extremes of most common exponential type distributions such as normal, lognormal, exponential and gamma distributions converge to the double exponential Gumbel distribution. It is most commonly used to model maximum streamflow, maximum rainfall, earthquake occurrence and in some cases, maximum wind speed.
The Frechet distribution, like the Gumbel distribution, is unbounded on the right, but its tail is much fatter. Extremes from the Pareto (power law) and Cauchy distributions converge to the Frechet distribution. Rainfall and streamflow extremes, air pollution, and economic impacts can be modeled using this type. Notice how the red line (Frechet distribution) has a heavy tail and sits above the black line (Gumbel distribution).
If the right tail converges to a finite endpoint, it is a Weibull distribution. The parent distribution is also bounded on the right. Remember that the extremes of the Uniform distribution, which is bounded, converged to a Weibull. It is most widely used to model minima, such as the strength of materials, and in fatigue analysis. It is also used in modeling temperature extremes and sea level.
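To put rough numbers on these tail behaviors, here is a small Python sketch that evaluates the exceedance probability $1 - G(z)$ for the three standard types. It assumes the standard forms $e^{-e^{-z}}$, $e^{-z^{-\alpha}}$ and $e^{-(-z)^{\alpha}}$ with an arbitrary $\alpha = 2$; the three are not matched in location or scale, so read it as an illustration of tail shape only.

```python
import math

def gumbel_cdf(z):
    """Type I: exp(-exp(-z)), unbounded on both sides."""
    return math.exp(-math.exp(-z))

def frechet_cdf(z, alpha):
    """Type II: exp(-z^(-alpha)) for z > 0, and 0 otherwise."""
    return math.exp(-(z ** (-alpha))) if z > 0 else 0.0

def weibull_cdf(z, alpha):
    """Type III: exp(-(-z)^alpha) for z < 0, and 1 at or beyond the endpoint 0."""
    return math.exp(-((-z) ** alpha)) if z < 0 else 1.0

alpha = 2.0  # illustrative shape value
for z in (2.0, 5.0, 10.0):
    print(z,
          1.0 - gumbel_cdf(z),          # exponential-type tail
          1.0 - frechet_cdf(z, alpha),  # heavy power law tail
          1.0 - weibull_cdf(z, alpha))  # zero past the finite endpoint
```

The Frechet exceedance probability decays far more slowly than the Gumbel one, and the Weibull one is exactly zero beyond its endpoint.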
The Generalized Extreme Value Distribution (GEV)
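The three types of extreme value distributions can be combined into a single function called the generalized extreme value distribution (GEV). Richard von Mises and Jenkinson independently showed this.
$G(z) = \exp\left\{-\left[1 + \xi\left(\frac{z - \mu}{\sigma}\right)\right]^{-1/\xi}\right\}$, defined where $1 + \xi\left(\frac{z - \mu}{\sigma}\right) > 0$.
$\mu$ is the location parameter. $\sigma > 0$ is the scale parameter. $\xi$ is the shape parameter.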
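When $\xi \to 0$, GEV tends to a Gumbel distribution. It is the same form $\left[1 + \frac{x}{n}\right]^n$, which in the limit is $e^x$ (think of $n = 1/\xi$). For GEV, the exponential term goes to $e^{-\frac{z - \mu}{\sigma}}$, hence yielding the double exponential $e^{-e^{-\frac{z - \mu}{\sigma}}}$.
When $\xi > 0$, GEV takes the form of the Frechet distribution. Replace $\xi = 1$ and see for yourself what you get.
When $\xi < 0$, GEV takes the form of the Weibull distribution. Replace $\xi = -1$ and check.
$\mu$ and $\sigma$ are the surrogate norming variables and $\xi$ controls the shape.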
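If you would rather let a computer do the "replace and check" step, here is a minimal Python sketch of the GEV cumulative function in its usual location, scale, shape form $\exp\{-[1 + \xi(z - \mu)/\sigma]^{-1/\xi}\}$. The function name `gev_cdf`, the cutoff `1e-12` for treating the shape as zero, and the test points are my own choices for illustration.

```python
import math

def gev_cdf(z, mu=0.0, sigma=1.0, xi=0.0):
    """GEV cumulative function with location mu, scale sigma > 0, shape xi."""
    s = (z - mu) / sigma
    if abs(xi) < 1e-12:
        # Shape of (numerically) zero: use the Gumbel limit directly.
        return math.exp(-math.exp(-s))
    t = 1.0 + xi * s
    if t <= 0.0:
        # Outside the support: below the lower bound when xi > 0,
        # above the upper bound when xi < 0.
        return 0.0 if xi > 0 else 1.0
    return math.exp(-(t ** (-1.0 / xi)))

z = 1.0
print("xi -> 0 :", gev_cdf(z, xi=1e-6), "vs Gumbel", math.exp(-math.exp(-z)))
print("xi = 1  :", gev_cdf(z, xi=1.0), "vs exp(-1/(1+z))", math.exp(-1.0 / (1.0 + z)))
print("xi = -1 :", gev_cdf(0.5, xi=-1.0), "vs exp(-(1-z))", math.exp(-(1.0 - 0.5)))
```

With $\xi = 1$ the function is a Frechet form in the shifted variable $1 + z$, and with $\xi = -1$ it is a Weibull form with a finite upper endpoint at $z = 1$ (for $\mu = 0$, $\sigma = 1$).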
GEV folds all three types into one form. The parameters $\mu$, $\sigma$ and $\xi$ can be estimated from the data, removing the need to know in advance which type a parent distribution or data set converges to. The function has a closed form for computing quantiles and probabilities. GEV also has the max-stable property, about which we will learn in later lessons.
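The closed form for quantiles comes from inverting the GEV cumulative function $\exp\{-[1 + \xi(z - \mu)/\sigma]^{-1/\xi}\}$: setting it equal to $p$ gives $z_p = \mu + \frac{\sigma}{\xi}\left[(-\ln p)^{-\xi} - 1\right]$, and $z_p = \mu - \sigma \ln(-\ln p)$ when $\xi = 0$. Here is a sketch in Python with a round-trip check. The function names and the parameter values (`mu = 30`, `sigma = 5`, `xi = 0.1`) are hypothetical, not estimates from any data.

```python
import math

def gev_cdf(z, mu=0.0, sigma=1.0, xi=0.0):
    """GEV cumulative function with location mu, scale sigma > 0, shape xi."""
    s = (z - mu) / sigma
    if abs(xi) < 1e-12:
        return math.exp(-math.exp(-s))
    t = 1.0 + xi * s
    if t <= 0.0:
        return 0.0 if xi > 0 else 1.0
    return math.exp(-(t ** (-1.0 / xi)))

def gev_quantile(p, mu=0.0, sigma=1.0, xi=0.0):
    """Value z_p with gev_cdf(z_p) = p, for 0 < p < 1."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must be strictly between 0 and 1")
    w = -math.log(p)
    if abs(xi) < 1e-12:
        return mu - sigma * math.log(w)           # Gumbel case
    return mu + sigma * (w ** (-xi) - 1.0) / xi   # Frechet and Weibull cases

# Hypothetical parameters, for illustration only.
mu, sigma, xi = 30.0, 5.0, 0.1
z99 = gev_quantile(0.99, mu, sigma, xi)
print("0.99 quantile:", z99)
print("round trip   :", gev_cdf(z99, mu, sigma, xi))
```

The round trip should return the probability you started with, up to floating-point error.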
All these concepts will become more concrete once we play with some data. Aren’t you itching to do some GEV in R? You know what is coming next week.
If you find this useful, please like, share and subscribe. You can also follow me on Twitter @realDevineni for updates on new lessons.