Extreme value theory or extreme value analysis (EVA) is a branch of statistics dealing with the extreme deviations from the median of probability distributions. It seeks to assess, from a given ordered sample of a given random variable, the probability of events that are more extreme than any previously observed. Extreme value analysis is widely used in many disciplines, such as structural engineering, finance, earth sciences, traffic prediction, and geological engineering. For example, EVA might be used in the field of hydrology to estimate the probability of an unusually large flooding event, such as the 100-year flood. Similarly, for the design of a breakwater, a coastal engineer would seek to estimate the 50-year wave and design the structure accordingly.


Two main approaches are used in practical extreme value analysis. The first method relies on deriving block maxima (minima) series as a preliminary step. In many situations it is customary and convenient to extract the annual maxima (minima), generating an "Annual Maxima Series" (AMS).
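As an illustration of the block maxima step, the following minimal sketch extracts an annual maxima series from dated observations. The function name `annual_maxima` and the sample readings are hypothetical and were invented for this example; blocks are assumed to be calendar years.

```python
from datetime import date


def annual_maxima(records):
    """Return {year: largest value seen in that year} for (date, value) pairs."""
    maxima = {}
    for day, value in records:
        year = day.year
        # Keep the largest value observed so far in each calendar year.
        if year not in maxima or value > maxima[year]:
            maxima[year] = value
    return dict(sorted(maxima.items()))


# Hypothetical river-flow readings (m^3/s), invented for illustration only.
records = [
    (date(2001, 3, 14), 120.0),
    (date(2001, 11, 2), 310.5),
    (date(2002, 1, 20), 95.0),
    (date(2002, 6, 8), 402.1),
    (date(2003, 9, 30), 250.0),
]

ams = annual_maxima(records)
print(ams)
```

For a minima series, the comparison would be reversed so that the smallest value per year is kept.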

The second method relies on extracting, from a continuous record, the peak values reached during each period in which values exceed a certain threshold (or fall below a certain threshold, for minima). This method is generally referred to as the "Peak Over Threshold" (POT) method.[1]
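The extraction step can be sketched as follows, assuming that a "period" means a run of consecutive observations above the threshold and that one peak is kept per run. The function name `peaks_over_threshold` and the sample series are hypothetical; practical analyses often add further declustering rules that are not shown here.

```python
def peaks_over_threshold(values, threshold):
    """Return the peak of each run of consecutive values above the threshold."""
    peaks = []
    current_peak = None  # peak of the run in progress, or None outside a run
    for v in values:
        if v > threshold:
            current_peak = v if current_peak is None else max(current_peak, v)
        elif current_peak is not None:
            # The run has ended: record its peak and reset.
            peaks.append(current_peak)
            current_peak = None
    if current_peak is not None:
        # The record ended while still above the threshold.
        peaks.append(current_peak)
    return peaks


# Hypothetical series, invented for illustration only.
series = [1.0, 5.2, 6.8, 2.0, 1.5, 7.1, 4.9, 9.3, 3.0, 5.5]
pot = peaks_over_threshold(series, threshold=5.0)
print(pot)
```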

For AMS data, the analysis may partly rely on the results of the Fisher–Tippett–Gnedenko theorem, leading to the generalized extreme value distribution (GEVD) being selected for fitting.[2][3] However, in practice, various procedures are applied to select between a wider range of distributions. The theorem here relates to the limiting distributions for the minimum or the maximum of a very large collection of independent random variables from the same distribution. Given that the number of relevant random events within a year may be rather limited, it is unsurprising that analyses of observed AMS data often lead to distributions other than the GEVD being selected.[4]
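As a simplified illustration of fitting a distribution to AMS data, the sketch below fits only the Gumbel law, one member of the generalized extreme value family, using the method of moments. This is a deliberately reduced stand-in: a full analysis would typically estimate all GEVD parameters (for example by maximum likelihood) and compare candidate distributions. The function names and the sample maxima are hypothetical.

```python
import math
import statistics

EULER_GAMMA = 0.5772156649015329  # Euler–Mascheroni constant


def fit_gumbel_moments(maxima):
    """Method-of-moments estimates (location, scale) for a Gumbel law."""
    mean = statistics.fmean(maxima)
    stdev = statistics.stdev(maxima)
    # For a Gumbel law: variance = (pi * scale)^2 / 6, mean = loc + gamma * scale.
    scale = stdev * math.sqrt(6.0) / math.pi
    loc = mean - EULER_GAMMA * scale
    return loc, scale


def gumbel_return_level(loc, scale, period):
    """Level whose annual exceedance probability is 1/period (period > 1)."""
    return loc - scale * math.log(-math.log(1.0 - 1.0 / period))


# Hypothetical annual maxima, invented for illustration only.
sample_ams = [310.5, 402.1, 250.0, 365.2, 298.7, 441.0, 275.3, 330.8]
loc, scale = fit_gumbel_moments(sample_ams)
z10 = gumbel_return_level(loc, scale, 10)
z100 = gumbel_return_level(loc, scale, 100)
print(loc, scale, z10, z100)
```

With only a handful of annual maxima, as here, the estimated 100-year level is an extrapolation well beyond the data and carries large uncertainty.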

For POT data, the analysis may involve fitting two distributions: one for the number of events in a time period considered and a second for the size of the exceedances.
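The two-distribution idea can be sketched under explicit assumptions that are not stated in the text above: event counts are modelled as Poisson with a constant yearly rate, and exceedance sizes as exponential (the simplest tail model; other choices, such as the generalized Pareto distribution, are common). The function names and sample values are hypothetical.

```python
import math


def fit_pot(peaks, threshold, years):
    """Estimate (events per year, mean exceedance) from peaks above a threshold."""
    exceedances = [p - threshold for p in peaks if p > threshold]
    if not exceedances:
        raise ValueError("no peaks exceed the threshold")
    rate = len(exceedances) / years  # Poisson rate estimate
    mean_excess = sum(exceedances) / len(exceedances)  # exponential mean estimate
    return rate, mean_excess


def pot_return_level(threshold, rate, mean_excess, period_years):
    """Level exceeded on average once per period_years (needs rate * period > 1)."""
    # Solve rate * exp(-(z - threshold) / mean_excess) = 1 / period_years for z.
    return threshold + mean_excess * math.log(rate * period_years)


# Hypothetical peaks observed over 2 years, invented for illustration only.
rate, mean_excess = fit_pot([6.8, 7.1, 9.3, 5.5], threshold=5.0, years=2.0)
z50 = pot_return_level(5.0, rate, mean_excess, 50.0)
print(rate, mean_excess, z50)
```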

The field of extreme value theory was pioneered by Leonard Tippett (1902–1985). Tippett was employed by the British Cotton Industry Research Association, where he worked to make cotton thread stronger. In his studies, he realized that the strength of a thread was controlled by the strength of its weakest fibres. With the help of R. A. Fisher, Tippett obtained three asymptotic limits describing the distributions of extremes. Emil Julius Gumbel codified this theory in his 1958 book Statistics of Extremes, including the Gumbel distributions that bear his name.

Let $M_n$ denote the maximum of $n$ independent observations with common distribution function $F$. The associated indicator function $I_n = I(M_n > z)$ is a Bernoulli process with success probability $p(z) = 1 - (F(z))^n$ that depends on the magnitude $z$ of the extreme event. The number of extreme events within $n$ trials thus follows a binomial distribution, and the number of trials until an event occurs follows a geometric distribution with expected value and standard deviation of the same order $O(1/p(z))$.
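A small numerical sketch of these quantities follows. The inputs (a per-observation probability F(z) = 0.9999 and n = 365 observations per block) are hypothetical values chosen for illustration, and the function names are not from any library.

```python
import math


def exceedance_probability(cdf_at_z, n):
    """p(z) = 1 - F(z)**n: probability that the maximum of n draws exceeds z."""
    return 1.0 - cdf_at_z ** n


def geometric_waiting_time(p):
    """Mean and standard deviation of the number of trials until the first success."""
    return 1.0 / p, math.sqrt(1.0 - p) / p


# Hypothetical inputs: F(z) = 0.9999 per observation, 365 observations per block.
p = exceedance_probability(0.9999, 365)
mean_wait, sd_wait = geometric_waiting_time(p)
print(p, mean_wait, sd_wait)
```

For small p the standard deviation is close to the mean, which is why a single observed waiting time says little about the true return period.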

In practice, we might not have the distribution function $F$, but the Fisher–Tippett–Gnedenko theorem provides an asymptotic result. If there exist sequences of constants $a_n > 0$ and $b_n \in \mathbb{R}$ such that

$$\Pr\left\{\frac{M_n - b_n}{a_n} \leq z\right\} \to G(z) \quad \text{as } n \to \infty,$$

then

$$G(z) \propto \exp\left[-(1 + \zeta z)^{-1/\zeta}\right],$$

where $\zeta$ depends on the tail shape of the distribution. When normalized, $G$ belongs to one of the following non-degenerate distribution families:

Weibull law: $G(z) = \begin{cases} \exp\left\{-\left(-\left(\frac{z-b}{a}\right)\right)^{\alpha}\right\} & z < b \\ 1 & z \geq b \end{cases}$ when the distribution of $M_n$ has a light tail with finite upper bound. Also known as Type 3.

Gumbel law: $G(z) = \exp\left\{-\exp\left(-\left(\frac{z-b}{a}\right)\right)\right\}$ for $z \in \mathbb{R}$, when the distribution of $M_n$ has an exponential tail. Also known as Type 1.

Fréchet law: $G(z) = \begin{cases} 0 & z \leq b \\ \exp\left\{-\left(\frac{z-b}{a}\right)^{-\alpha}\right\} & z > b \end{cases}$ when the distribution of $M_n$ has a heavy tail (including polynomial decay). Also known as Type 2.
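The three limiting laws listed above can be evaluated directly from their formulas. The sketch below is a plain transcription using the same symbols (scale a > 0, location b, shape α > 0); the function names are chosen for this example and are not from any library.

```python
import math


def weibull_cdf(z, a, b, alpha):
    """Type 3 law: support bounded above at b."""
    if z >= b:
        return 1.0
    return math.exp(-((-(z - b) / a) ** alpha))


def gumbel_cdf(z, a, b):
    """Type 1 law: defined for every real z."""
    return math.exp(-math.exp(-(z - b) / a))


def frechet_cdf(z, a, b, alpha):
    """Type 2 law: support bounded below at b."""
    if z <= b:
        return 0.0
    return math.exp(-(((z - b) / a) ** -alpha))


# With a = 1 and b = 0, each law equals exp(-1) at a convenient reference point.
print(weibull_cdf(-1.0, 1.0, 0.0, 2.0))
print(gumbel_cdf(0.0, 1.0, 0.0))
print(frechet_cdf(1.0, 1.0, 0.0, 2.0))
```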

Fisher, R.A.; Tippett, L.H.C. (1928), "Limiting forms of the frequency distribution of the largest and smallest member of a sample", Proc. Cambridge Phil. Soc., 24: 180–190, doi:10.1017/s0305004100015681