5.11: The F Distribution (2024)

\(\newcommand{\rank}{\operatorname{rank}}\) \(\newcommand{\row}{\text{Row}}\) \(\newcommand{\col}{\text{Col}}\) \(\renewcommand{\row}{\text{Row}}\) \(\newcommand{\nul}{\text{Nul}}\) \(\newcommand{\var}{\text{Var}}\) \(\newcommand{\corr}{\text{corr}}\) \(\newcommand{\len}[1]{\left|#1\right|}\) \(\newcommand{\bbar}{\overline{\bvec}}\) \(\newcommand{\bhat}{\widehat{\bvec}}\) \(\newcommand{\bperp}{\bvec^\perp}\) \(\newcommand{\xhat}{\widehat{\xvec}}\) \(\newcommand{\vhat}{\widehat{\vvec}}\) \(\newcommand{\uhat}{\widehat{\uvec}}\) \(\newcommand{\what}{\widehat{\wvec}}\) \(\newcommand{\Sighat}{\widehat{\Sigma}}\) \(\newcommand{\lt}{<}\) \(\newcommand{\gt}{>}\) \(\newcommand{\amp}{&}\) \(\definecolor{fillinmathshade}{gray}{0.9}\)

    \(\newcommand{\R}{\mathbb{R}}\) \(\newcommand{\N}{\mathbb{N}}\) \(\newcommand{\E}{\mathbb{E}}\) \(\newcommand{\P}{\mathbb{P}}\) \(\newcommand{\var}{\text{var}}\) \(\newcommand{\sd}{\text{sd}}\) \(\newcommand{\skw}{\text{skew}}\) \(\newcommand{\kur}{\text{kurt}}\)

    In this section we will study a distribution that has special importance in statistics. In particular, this distribution arises from ratios of sums of squares when sampling from a normal distribution, and so is important in estimation and in hypothesis testing in the two-sample normal model.

    Basic Theory

    Definition

    Suppose that \(U\) has the chi-square distribution with \(n \in (0, \infty)\) degrees of freedom, \(V\) has the chi-square distribution with \(d \in (0, \infty)\) degrees of freedom, and that \(U\) and \(V\) are independent. The distribution of \[ X = \frac{U / n}{V / d} \] is the \(F\) distribution with \(n\) degrees of freedom in the numerator and \(d\) degrees of freedom in the denominator.
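The definition translates directly into a sampling recipe. The following is a minimal sketch, assuming only the standard fact that a chi-square variable with \(k\) degrees of freedom is a gamma variable with shape \(k/2\) and scale 2; the function name `sample_f`, the seed, and the parameter values are illustrative choices, not part of the text.

```python
import random

def sample_f(n, d, rng):
    """Draw one value of X = (U/n)/(V/d) with independent chi-square U and V."""
    # Chi-square with k degrees of freedom = gamma with shape k/2 and scale 2.
    u = rng.gammavariate(n / 2, 2.0)
    v = rng.gammavariate(d / 2, 2.0)
    return (u / n) / (v / d)

rng = random.Random(0)  # fixed seed, chosen arbitrarily for reproducibility
draws = [sample_f(5, 10, rng) for _ in range(10_000)]
print(min(draws) > 0)            # every draw is positive
print(sum(draws) / len(draws))   # sample mean of the simulated values
```

Because the parameters enter only through the gamma shape, the same sketch works for non-integer degrees of freedom.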

    The \(F\) distribution was first derived by George Snedecor, and is named in honor of Sir Ronald Fisher. In practice, the parameters \( n \) and \( d \) are usually positive integers, but this is not a mathematical requirement.

    Distribution Functions

    Suppose that \(X\) has the \( F \) distribution with \( n \in (0, \infty) \) degrees of freedom in the numerator and \( d \in (0, \infty) \) degrees of freedom in the denominator. Then \( X \) has a continuous distribution on \( (0, \infty) \) with probability density function \( f \) given by \[ f(x) = \frac{\Gamma(n/2 + d/2)}{\Gamma(n / 2) \Gamma(d / 2)} \frac{n}{d} \frac{[(n/d) x]^{n/2 - 1}}{\left[1 + (n / d) x\right]^{n/2 + d/2}}, \quad x \in (0, \infty) \] where \( \Gamma \) is the gamma function.

    Proof

    The trick, once again, is conditioning. The conditional distribution of \( X \) given \( V = v \in (0, \infty) \) is gamma with shape parameter \( n/2 \) and scale parameter \( 2 d / n v \). Hence the conditional PDF is \[ x \mapsto \frac{1}{\Gamma(n/2) \left(2 d / n v\right)^{n/2}} x^{n/2 - 1} e^{-x(nv /2d)} \] By definition, \( V \) has the chi-square distribution with \( d \) degrees of freedom, and so has PDF \[ v \mapsto \frac{1}{\Gamma(d/2) 2^{d/2}} v^{d/2 - 1} e^{-v/2} \] The joint PDF of \( (X, V) \) is the product of these functions: \[g(x, v) = \frac{1}{\Gamma(n/2) \Gamma(d/2) 2^{(n+d)/2}} \left(\frac{n}{d}\right)^{n/2} x^{n/2 - 1} v^{(n+d)/2 - 1} e^{-v( n x / d + 1)/2}; \quad x, \, v \in (0, \infty)\] The PDF of \( X \) is therefore \[ f(x) = \int_0^\infty g(x, v) \, dv = \frac{1}{\Gamma(n/2) \Gamma(d/2) 2^{(n+d)/2}} \left(\frac{n}{d}\right)^{n/2} x^{n/2 - 1} \int_0^\infty v^{(n+d)/2 - 1} e^{-v( n x / d + 1)/2} \, dv \] Except for the normalizing constant, the integrand in the last integral is the gamma PDF with shape parameter \( (n + d)/2 \) and scale parameter \( 2 d \big/ (n x + d) \). Hence the integral evaluates to \[ \Gamma\left(\frac{n + d}{2}\right) \left(\frac{2 d}{n x + d}\right)^{(n + d)/2} \] Simplifying gives the result.

    Recall that the beta function \( B \) can be written in terms of the gamma function by \[ B(a, b) = \frac{\Gamma(a) \Gamma(b)}{\Gamma(a + b)}, \quad a, \, b \in (0, \infty) \] Hence the probability density function of the \( F \) distribution above can also be written as \[ f(x) = \frac{1}{B(n/2, d/2)} \frac{n}{d} \frac{[(n/d) x]^{n/2 - 1}}{\left[1 + (n / d) x\right]^{n/2 + d/2}}, \quad x \in (0, \infty) \] When \( n \ge 2 \), the probability density function is defined at \( x = 0 \), so the support interval is \( [0, \infty) \) in this case.
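For numerical work, the density is usually easier to evaluate on the log scale. The sketch below is one possible implementation of the beta-function form of \( f \), using `math.lgamma` for the normalizing constant; the name `f_pdf` is a hypothetical helper, and the function assumes \( x \gt 0 \).

```python
import math

def f_pdf(x, n, d):
    """Density of the F distribution with n and d degrees of freedom at x > 0."""
    a, b = n / 2, d / 2
    r = (n / d) * x
    # log of 1 / B(a, b), computed with lgamma to avoid overflow for large a, b
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    log_f = (log_norm + math.log(n / d)
             + (a - 1) * math.log(r) - (a + b) * math.log1p(r))
    return math.exp(log_f)

# With n = d = 4 the formula reduces to 6 x / (1 + x)^4, which is 6/16 at x = 1.
print(f_pdf(1.0, 4, 4), 6 / 16)
```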

    In the special distribution simulator, select the \(F\) distribution. Vary the parameters with the scroll bars and note the shape of the probability density function. For selected values of the parameters, run the simulation 1000 times and compare the empirical density function to the probability density function.

    Both parameters influence the shape of the \( F \) probability density function, but some of the basic qualitative features depend only on the numerator degrees of freedom. For the remainder of this discussion, let \( f \) denote the \( F \) probability density function with \( n \in (0, \infty) \) degrees of freedom in the numerator and \( d \in (0, \infty) \) degrees of freedom in the denominator.

    Probability density function \( f \) satisfies the following properties:

    1. If \( 0 \lt n \lt 2 \), \( f \) is decreasing with \( f(x) \to \infty \) as \( x \downarrow 0 \).
    2. If \( n = 2 \), \( f \) is decreasing with mode at \( x = 0 \).
    3. If \( n \gt 2 \), \(f\) increases and then decreases, with mode at \(x = \frac{(n - 2) d}{n (d + 2)}\).
    Proof

    These properties follow from standard calculus. The first derivative of \( f \) is \[ f^\prime(x) = \frac{1}{B(n/2, d/2)} \left(\frac{n}{d}\right)^2 \frac{[(n/d)x]^{n/2-2}}{[1 + (n/d)x]^{n/2 + d/2 + 1}} [(n/2 - 1) - (n/d)(d/2 + 1)x], \quad x \in (0, \infty) \]
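The mode formula for \( n \gt 2 \) can be checked numerically. The sketch below compares the closed form with a brute-force search over a grid; the helper names, the grid, and the parameter values \( n = 6 \), \( d = 10 \) are arbitrary illustrative choices.

```python
import math

def f_pdf(x, n, d):
    """Density of the F distribution with n and d degrees of freedom at x > 0."""
    a, b = n / 2, d / 2
    r = (n / d) * x
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return math.exp(log_norm + math.log(n / d)
                    + (a - 1) * math.log(r) - (a + b) * math.log1p(r))

def f_mode(n, d):
    """Closed-form mode, valid for n > 2."""
    return (n - 2) * d / (n * (d + 2))

n, d = 6, 10
grid = [i / 10_000 for i in range(1, 30_001)]      # x in (0, 3], step 1e-4
x_best = max(grid, key=lambda x: f_pdf(x, n, d))   # grid point with largest density
print(x_best, f_mode(n, d))
```

The two printed values should agree to within the grid spacing.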

    Qualitatively, the second order properties of \( f \) also depend only on \( n \), with transitions at \( n = 2 \) and \( n = 4 \).

    For \( n \gt 2 \), define \begin{align} x_1 & = \frac{d}{n} \frac{(n - 2)(d + 4) - \sqrt{2 (n - 2)(d + 4)(n + d)}}{(d + 2)(d + 4)} \\ x_2 & = \frac{d}{n} \frac{(n - 2)(d + 4) + \sqrt{2 (n - 2)(d + 4)(n + d)}}{(d + 2)(d + 4)} \end{align} The probability density function \( f \) satisfies the following properties:

    1. If \( 0 \lt n \le 2 \), \( f \) is concave upward.
    2. If \( 2 \lt n \le 4 \), \( f \) is concave downward and then upward, with inflection point at \( x_2 \).
    3. If \( n \gt 4 \), \( f \) is concave upward, then downward, then upward again, with inflection points at \( x_1 \) and \( x_2 \).
    Proof

    These results follow from standard calculus. The second derivative of \( f \) is \[ f^{\prime\prime}(x) = \frac{1}{B(n/2, d/2)} \left(\frac{n}{d}\right)^3 \frac{[(n/d)x]^{n/2-3}}{[1 + (n/d)x]^{n/2 + d/2 + 2}}\left[(n/2 - 1)(n/2 - 2) - 2 (n/2 - 1)(d/2 + 2) (n/d) x + (d/2 + 1)(d/2 + 2)(n/d)^2 x^2\right], \quad x \in (0, \infty) \]
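The inflection points are the positive roots of the bracketed quadratic in \( (n/d) x \) that appears in \( f^{\prime\prime} \). As a rough check, the sketch below solves that quadratic directly and then confirms the sign changes of the second derivative with a finite-difference estimate. It assumes \( n \gt 2 \); the function names and the step size `h` are illustrative.

```python
import math

def f_pdf(x, n, d):
    """Density of the F distribution with n and d degrees of freedom at x > 0."""
    a, b = n / 2, d / 2
    r = (n / d) * x
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return math.exp(log_norm + math.log(n / d)
                    + (a - 1) * math.log(r) - (a + b) * math.log1p(r))

def inflection_points(n, d):
    """Positive roots x of the quadratic factor of f'' (assumes n > 2)."""
    a, b = n / 2, d / 2
    # Quadratic c2*u^2 + c1*u + c0 in u = (n/d) x, read off from f''.
    c0 = (a - 1) * (a - 2)
    c1 = -2 * (a - 1) * (b + 2)
    c2 = (b + 1) * (b + 2)
    root = math.sqrt(c1 * c1 - 4 * c2 * c0)
    us = [(-c1 - root) / (2 * c2), (-c1 + root) / (2 * c2)]
    return [(d / n) * u for u in us if u > 0]

def second_diff(x, n, d, h=1e-4):
    """Central finite-difference estimate of f''(x)."""
    return (f_pdf(x + h, n, d) - 2 * f_pdf(x, n, d) + f_pdf(x - h, n, d)) / (h * h)

print(inflection_points(6, 10))   # two points, since n > 4
print(inflection_points(3, 10))   # one point, since 2 < n <= 4
```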

    The distribution function and the quantile function do not have simple, closed-form representations. Approximate values of these functions can be obtained from the special distribution calculator and from most mathematical and statistical software packages.
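When no statistical package is at hand, both functions can be approximated from the density alone. The sketch below integrates \( f \) with the composite Simpson rule and inverts the result by bisection. It is an illustration rather than a production routine: it assumes \( n \ge 2 \) so that the density is finite at 0, the names `f_cdf` and `f_quantile` are hypothetical, and accuracy is lower when \( 2 \lt n \lt 4 \) because \( f^\prime \) is unbounded near 0.

```python
import math

def f_pdf(x, n, d):
    """F density on [0, inf); assumes n >= 2 so the value at 0 is finite."""
    if x == 0:
        return 1.0 if n == 2 else 0.0
    a, b = n / 2, d / 2
    r = (n / d) * x
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return math.exp(log_norm + math.log(n / d)
                    + (a - 1) * math.log(r) - (a + b) * math.log1p(r))

def f_cdf(x, n, d, steps=2000):
    """Approximate P(X <= x) by composite Simpson's rule on [0, x] (steps even)."""
    if x <= 0:
        return 0.0
    h = x / steps
    total = f_pdf(0.0, n, d) + f_pdf(x, n, d)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f_pdf(i * h, n, d)
    return total * h / 3

def f_quantile(p, n, d):
    """Approximate the quantile of order p in (0, 1) by bisection on f_cdf."""
    lo, hi = 0.0, 1.0
    while f_cdf(hi, n, d) < p:      # widen the bracket until it contains the quantile
        hi *= 2
    for _ in range(50):
        mid = (lo + hi) / 2
        if f_cdf(mid, n, d) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

print(f_cdf(3.0, 2, 2))        # compare with 3/4, from the closed form z / (1 + z)
print(f_quantile(0.5, 10, 10))
```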

    In the special distribution calculator, select the \(F\) distribution. Vary the parameters and note the shape of the probability density function and the distribution function. In each of the following cases, find the median, the first and third quartiles, and the interquartile range.

    1. \(n = 5\), \(d = 5\)
    2. \(n = 5\), \(d = 10\)
    3. \(n = 10\), \(d = 5\)
    4. \(n = 10\), \(d = 10\)

    The general probability density function of the \( F \) distribution is a bit complicated, but it simplifies in a couple of special cases.

    Special cases.

    1. If \( n = 2 \), \[ f(x) = \frac{1}{(1 + 2 x / d)^{1 + d / 2}}, \quad x \in (0, \infty) \]
    2. If \( n = d \in (0, \infty)\), \[ f(x) = \frac{\Gamma(n)}{\Gamma^2(n/2)} \frac{x^{n/2-1}}{(1 + x)^n}, \quad x \in (0, \infty)\]
    3. If \( n = d = 2 \), \[ f(x) = \frac{1}{(1 + x)^2}, \quad x \in (0, \infty) \]
    4. If \( n = d = 1 \), \[ f(x) = \frac{1}{\pi \sqrt{x}(1 + x)}, \quad x \in (0, \infty) \]

    Moments

    The random variable representation in the definition, along with the moments of the chi-square distribution, can be used to find the mean, variance, and other moments of the \( F \) distribution. For the remainder of this discussion, suppose that \(X\) has the \(F\) distribution with \(n \in (0, \infty)\) degrees of freedom in the numerator and \(d \in (0, \infty)\) degrees of freedom in the denominator.

    Mean

    1. \(\E(X) = \infty\) if \(0 \lt d \le 2\)
    2. \(\E(X) = \frac{d}{d - 2}\) if \(d \gt 2\)
    Proof

    By independence, \( \E(X) = \frac{d}{n} \E(U) \E\left(V^{-1}\right) \). Recall that \( \E(U) = n \). Similarly if \( d \le 2 \), \( \E\left(V^{-1}\right) = \infty \) while if \( d \gt 2 \), \[ \E\left(V^{-1}\right) = \frac{\Gamma(d/2 - 1)}{2 \Gamma(d/2)} = \frac{1}{d - 2} \]

    Thus, the mean depends only on the degrees of freedom in the denominator.

    Variance

    1. \(\var(X)\) is undefined if \(0 \lt d \le 2\)
    2. \(\var(X) = \infty\) if \(2 \lt d \le 4\)
    3. If \(d \gt 4\) then \[ \var(X) = 2 \left(\frac{d}{d - 2} \right)^2 \frac{n + d - 2}{n (d - 4)} \]
    Proof

    By independence, \( \E\left(X^2\right) = \frac{d^2}{n^2} \E\left(U^2\right) \E\left(V^{-2}\right) \). Recall that \[ \E\left(U^2\right) = 4 \frac{\Gamma(n/2 + 2)}{\Gamma(n/2)} = (n + 2) n \] Similarly if \( d \le 4 \), \( \E\left(V^{-2}\right) = \infty \) while if \( d \gt 4 \), \[ \E\left(V^{-2}\right) = \frac{\Gamma(d/2 - 2)}{4 \Gamma(d/2)} = \frac{1}{(d - 2)(d - 4)} \] Hence \( \E\left(X^2\right) = \infty \) if \( d \le 4 \) while if \( d \gt 4 \), \[ \E\left(X^2\right) = \frac{(n + 2) d^2}{n (d - 2)(d - 4)} \] The results now follow from the previous result on the mean and the computational formula \( \var(X) = \E\left(X^2\right) - \left[\E(X)\right]^2 \).

    In the special distribution simulator, select the \(F\) distribution. Vary the parameters with the scroll bars and note the size and location of the mean \( \pm \) standard deviation bar. For selected values of the parameters, run the simulation 1000 times and compare the empirical mean and standard deviation to the distribution mean and standard deviation.

    General moments. For \( k \gt 0 \),

    1. \(\E\left(X^k\right) = \infty\) if \(0 \lt d \le 2 k\)
    2. If \(d \gt 2 k\) then \[ \E\left(X^k\right) = \left( \frac{d}{n} \right)^k \frac{\Gamma(n/2 + k) \, \Gamma(d/2 - k)}{\Gamma(n/2) \Gamma(d/2)} \]
    Proof

    By independence, \( \E\left(X^k\right) = \left(\frac{d}{n}\right)^k \E\left(U^k\right) \E\left(V^{-k}\right) \). Recall that \[ \E\left(U^k\right) = \frac{2^k \Gamma(n/2 + k)}{\Gamma(n/2)} \] On the other hand, \( \E\left(V^{-k}\right) = \infty \) if \( d/2 \le k \) while if \( d/2 \gt k \), \[ \E\left(V^{-k}\right) = \frac{2^{-k} \Gamma(d/2 - k)}{\Gamma(d/2)} \]
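The general moment formula is straightforward to evaluate with log-gamma functions. The sketch below does so and compares the cases \( k = 1 \) and \( k = 2 \) with the mean and variance formulas given earlier; `f_moment` is a hypothetical helper name and the parameter values are arbitrary.

```python
import math

def f_moment(k, n, d):
    """E(X^k) for the F distribution with k > 0; infinite when d <= 2k."""
    if d <= 2 * k:
        return math.inf
    log_m = (k * math.log(d / n)
             + math.lgamma(n / 2 + k) - math.lgamma(n / 2)
             + math.lgamma(d / 2 - k) - math.lgamma(d / 2))
    return math.exp(log_m)

n, d = 5, 10
mean = f_moment(1, n, d)
variance = f_moment(2, n, d) - mean ** 2
print(mean, d / (d - 2))
print(variance, 2 * (d / (d - 2)) ** 2 * (n + d - 2) / (n * (d - 4)))
```

The formula also covers non-integer orders; for example, with \( n = d = 2 \) and \( k = 1/2 \) it gives \( \Gamma(3/2) \Gamma(1/2) = \pi / 2 \).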

    If \( k \in \N \) and \( d \gt 2 k \), then using the fundamental identity of the gamma function and some algebra, \[ \E\left(X^{k}\right) = \left(\frac{d}{n}\right)^k \frac{n (n + 2) \cdots [n + 2(k - 1)]}{(d - 2)(d - 4) \cdots (d - 2k)} \] From the general moment formula, we can compute the skewness and kurtosis of the \( F \) distribution.

    Skewness and kurtosis

    1. If \( d \gt 6 \), \[ \skw(X) = \frac{(2 n + d - 2) \sqrt{8 (d - 4)}}{(d - 6) \sqrt{n (n + d - 2)}} \]
    2. If \( d \gt 8 \), \[ \kur(X) = 3 + 12 \frac{n (5 d - 22)(n + d - 2) + (d - 4)(d-2)^2}{n(d - 6)(d - 8)(n + d - 2)} \]
    Proof

    These results follow from the formulas for \( \E\left(X^k\right) \) for \( k \in \{1, 2, 3, 4\} \) and the standard computational formulas for skewness and kurtosis.

    Not surprisingly, the \( F \) distribution is positively skewed. Recall that the excess kurtosis is \[ \kur(X) - 3 = 12 \frac{n (5 d - 22)(n + d - 2) + (d - 4)(d-2)^2}{n(d - 6)(d - 8)(n + d - 2)}\]
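The skewness and kurtosis formulas can be cross-checked against the raw moments. The sketch below computes the first four raw moments from the product formula for integer \( k \), converts them to standardized central moments, and compares with the closed forms. It assumes \( d \gt 8 \) so that all four moments are finite; the function names are illustrative.

```python
import math

def raw_moment(k, n, d):
    """E(X^k) for a positive integer k with d > 2k, via the product formula."""
    num = math.prod(n + 2 * i for i in range(k))
    den = math.prod(d - 2 * (i + 1) for i in range(k))
    return (d / n) ** k * num / den

def skew_kurt_from_moments(n, d):
    """Skewness and kurtosis from the raw moments (requires d > 8)."""
    m1, m2, m3, m4 = (raw_moment(k, n, d) for k in range(1, 5))
    var = m2 - m1 ** 2
    mu3 = m3 - 3 * m1 * m2 + 2 * m1 ** 3
    mu4 = m4 - 4 * m1 * m3 + 6 * m1 ** 2 * m2 - 3 * m1 ** 4
    return mu3 / var ** 1.5, mu4 / var ** 2

def skew_kurt_closed_form(n, d):
    """Skewness and kurtosis from the closed-form expressions (requires d > 8)."""
    skew = ((2 * n + d - 2) * math.sqrt(8 * (d - 4))
            / ((d - 6) * math.sqrt(n * (n + d - 2))))
    kurt = 3 + 12 * (n * (5 * d - 22) * (n + d - 2) + (d - 4) * (d - 2) ** 2) / (
        n * (d - 6) * (d - 8) * (n + d - 2))
    return skew, kurt

print(skew_kurt_from_moments(2, 10))
print(skew_kurt_closed_form(2, 10))
```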

    In the special distribution simulator, select the \(F\) distribution. Vary the parameters with the scroll bars and note the shape of the probability density function in light of the previous results on skewness and kurtosis. For selected values of the parameters, run the simulation 1000 times and compare the empirical density function to the probability density function.

    Relations

    The most important relationship is the one in the definition, between the \( F \) distribution and the chi-square distribution. In addition, the \( F \) distribution is related to several other special distributions.

    Suppose that \(X\) has the \(F\) distribution with \(n \in (0, \infty)\) degrees of freedom in the numerator and \(d \in (0, \infty)\) degrees of freedom in the denominator. Then \(1 / X\) has the \(F\) distribution with \(d\) degrees of freedom in the numerator and \(n\) degrees of freedom in the denominator.

    Proof

    This follows easily from the random variable interpretation in the definition. We can write \[ X = \frac{U/n}{V/d} \] where \( U \) and \( V \) are independent and have chi-square distributions with \( n \) and \( d \) degrees of freedom, respectively. Hence \[ \frac{1}{X} = \frac{V/d}{U/n} \]

    Suppose that \(T\) has the \(t\) distribution with \(n \in (0, \infty)\) degrees of freedom. Then \(X = T^2\) has the \(F\) distribution with 1 degree of freedom in the numerator and \(n\) degrees of freedom in the denominator.

    Proof

    This follows easily from the random variable representations of the \( t \) and \( F \) distributions. We can write \[ T = \frac{Z}{\sqrt{V/n}} \] where \( Z \) has the standard normal distribution, \( V \) has the chi-square distribution with \( n \) degrees of freedom, and \( Z \) and \( V \) are independent. Hence \[ T^2 = \frac{Z^2}{V/n} \] Recall that \( Z^2 \) has the chi-square distribution with 1 degree of freedom.
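The same relationship can be seen at the level of densities. If \( h \) denotes the density of \( T \), then by symmetry the density of \( T^2 \) at \( x \gt 0 \) is \( h(\sqrt{x}) / \sqrt{x} \). The sketch below compares this with the \( F \) density with 1 and \( n \) degrees of freedom. The formula used for the \( t \) density is the standard one and is an assumption here, since it is not stated in this section; the function names are illustrative.

```python
import math

def t_pdf(t, n):
    """Density of the t distribution with n degrees of freedom (standard formula)."""
    log_c = (math.lgamma((n + 1) / 2) - math.lgamma(n / 2)
             - 0.5 * math.log(n * math.pi))
    return math.exp(log_c - (n + 1) / 2 * math.log1p(t * t / n))

def f_pdf(x, n, d):
    """Density of the F distribution with n and d degrees of freedom at x > 0."""
    a, b = n / 2, d / 2
    r = (n / d) * x
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return math.exp(log_norm + math.log(n / d)
                    + (a - 1) * math.log(r) - (a + b) * math.log1p(r))

def t_squared_pdf(x, n):
    """Density of T^2 at x > 0; the roots +sqrt(x) and -sqrt(x) contribute equally."""
    s = math.sqrt(x)
    return t_pdf(s, n) / s

for x in (0.1, 1.0, 4.5):
    print(x, t_squared_pdf(x, 7), f_pdf(x, 1, 7))
```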

    Our next relationship is between the \( F \) distribution and the exponential distribution.

    Suppose that \( X \) and \( Y \) are independent random variables, each with the exponential distribution with rate parameter \( r \in (0, \infty) \). Then \(Z = X / Y\) has the \( F \) distribution with \( 2 \) degrees of freedom in both the numerator and denominator.

    Proof

    We first find the distribution function \( F \) of \( Z \) by conditioning on \( X \): \[ F(z) = \P(Z \le z) = \P(Y \ge X / z) = \E\left[\P(Y \ge X / z \mid X)\right] \] But \( \P(Y \ge y) = e^{-r y} \) for \( y \ge 0 \) so \( F(z) = \E\left(e^{-r X / z}\right) \). Also, \( X \) has PDF \( g(x) = r e^{-r x} \) for \( x \ge 0 \) so \[ F(z) = \int_0^\infty e^{- r x / z} r e^{-r x} \, dx = \int_0^\infty r e^{-r x (1 + 1/z)} \, dx = \frac{1}{1 + 1/z} = \frac{z}{1 + z}, \quad z \in (0, \infty) \] Differentiating gives the PDF of \( Z \) \[ f(z) = \frac{1}{(1 + z)^2}, \quad z \in (0, \infty) \] which we recognize as the PDF of the \( F \) distribution with 2 degrees of freedom in the numerator and the denominator.
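A short simulation illustrates the result. The sketch below draws pairs of independent exponential variables, forms their ratio, and compares the empirical distribution function with \( z / (1 + z) \). The rate, the seed, and the sample size are arbitrary choices; the common rate cancels in the ratio.

```python
import random

rng = random.Random(1)   # fixed seed, chosen arbitrarily for reproducibility
rate = 2.5               # any positive rate gives the same distribution for the ratio
ratios = [rng.expovariate(rate) / rng.expovariate(rate) for _ in range(20_000)]

def empirical_cdf(z):
    """Fraction of simulated ratios that are at most z."""
    return sum(r <= z for r in ratios) / len(ratios)

for z in (0.5, 1.0, 3.0):
    print(z, empirical_cdf(z), z / (1 + z))
```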

    A simple transformation can change a variable with the \( F \) distribution into a variable with the beta distribution, and conversely.

    Connections between the \( F \) distribution and the beta distribution.

    1. If \( X \) has the \( F \) distribution with \( n \in (0, \infty) \) degrees of freedom in the numerator and \( d \in (0, \infty) \) degrees of freedom in the denominator, then \[ Y = \frac{(n/d) X}{1 + (n/d) X} \] has the beta distribution with left parameter \( n/2 \) and right parameter \( d/2 \).
    2. If \( Y \) has the beta distribution with left parameter \( a \in (0, \infty) \) and right parameter \( b \in (0, \infty) \) then \[ X = \frac{b Y}{a(1 - Y)} \] has the \( F \) distribution with \( 2 a \) degrees of freedom in the numerator and \( 2 b \) degrees of freedom in the denominator.
    Proof

    The two statements are equivalent and follow from the standard change of variables formula. The function \[ y = \frac{(n/d) x}{1 + (n/d) x} \] maps \( (0, \infty) \) one-to-one onto \( (0, 1) \), with inverse \[ x = \frac{d}{n}\frac{y}{1 - y} \] Let \( f \) denote the PDF of the \( F \) distribution with \( n \) degrees of freedom in the numerator and \( d \) degrees of freedom in the denominator, and let \( g \) denote the PDF of the beta distribution with left parameter \( n/2 \) and right parameter \( d/2 \). Then \( f \) and \( g \) are related by

    1. \( g(y) = f(x) \frac{dx}{dy} \)
    2. \( f(x) = g(y) \frac{dy}{dx} \)
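The change of variables can be verified numerically. With \( x = \frac{d}{n} \frac{y}{1 - y} \) we have \( \frac{dx}{dy} = \frac{d}{n} \frac{1}{(1 - y)^2} \), so the first relation can be checked pointwise. The sketch below does this for a few values of \( y \); the helper names and the parameter values are illustrative.

```python
import math

def f_pdf(x, n, d):
    """Density of the F distribution with n and d degrees of freedom at x > 0."""
    a, b = n / 2, d / 2
    r = (n / d) * x
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return math.exp(log_norm + math.log(n / d)
                    + (a - 1) * math.log(r) - (a + b) * math.log1p(r))

def beta_pdf(y, a, b):
    """Density of the beta distribution with parameters a, b at y in (0, 1)."""
    log_g = (math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
             + (a - 1) * math.log(y) + (b - 1) * math.log1p(-y))
    return math.exp(log_g)

def beta_pdf_via_f(y, n, d):
    """g(y) = f(x) dx/dy with x = (d/n) y / (1 - y)."""
    x = (d / n) * y / (1 - y)
    dx_dy = (d / n) / (1 - y) ** 2
    return f_pdf(x, n, d) * dx_dy

n, d = 5, 8
for y in (0.1, 0.5, 0.9):
    print(y, beta_pdf(y, n / 2, d / 2), beta_pdf_via_f(y, n, d))
```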

    The \( F \) distribution is closely related to the beta prime distribution by a simple scale transformation.

    Connections with the beta prime distributions.

    1. If \( X \) has the \( F \) distribution with \( n \in (0, \infty) \) degrees of freedom in the numerator and \( d \in (0, \infty) \) degrees of freedom in the denominator, then \( Y = \frac{n}{d} X \) has the beta prime distribution with parameters \( n/2 \) and \( d/2 \).
    2. If \( Y \) has the beta prime distribution with parameters \( a \in (0, \infty) \) and \( b \in (0, \infty) \) then \( X = \frac{b}{a} Y \) has the \( F \) distribution with \( 2 a \) degrees of freedom in the numerator and \( 2 b \) degrees of freedom in the denominator.
    Proof

    Let \( f \) denote the PDF of \( X \) and \( g \) the PDF of \( Y \).

    1. By the change of variables formula, \[ g(y) = \frac{d}{n} f\left(\frac{d}{n} y\right), \quad y \in (0, \infty) \] Substituting into the \( F \) PDF shows that \( Y \) has the appropriate beta prime PDF.
    2. Again using the change of variables formula, \[ f(x) = \frac{a}{b} g\left(\frac{a}{b} x\right), \quad x \in (0, \infty) \] Substituting into the beta prime PDF shows that \( X \) has the appropriate \( F \) PDF.

    The Non-Central \( F \) Distribution

    The \( F \) distribution can be generalized in a natural way by replacing the ordinary chi-square variable in the numerator in the definition above with a variable having a non-central chi-square distribution. This generalization is important in analysis of variance.

    Suppose that \(U\) has the non-central chi-square distribution with \(n \in (0, \infty) \) degrees of freedom and non-centrality parameter \(\lambda \in [0, \infty)\), \(V\) has the chi-square distribution with \(d \in (0, \infty)\) degrees of freedom, and that \(U\) and \(V\) are independent. The distribution of \[ X = \frac{U / n}{V / d} \] is the non-central \(F\) distribution with \(n\) degrees of freedom in the numerator, \(d\) degrees of freedom in the denominator, and non-centrality parameter \( \lambda \).
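A simulation sketch may help make the definition concrete. It assumes the usual construction of the non-central chi-square distribution for a positive integer \( n \): a sum of \( n \) squared independent normal variables with unit variance whose means have squares summing to \( \lambda \). That construction, the fact \( \E(U) = n + \lambda \), and the names and parameter values below are assumptions brought in for illustration, not results from this section.

```python
import math
import random

def sample_noncentral_f(n, d, lam, rng):
    """One draw of X = (U/n)/(V/d); n must be a positive integer in this sketch."""
    # Non-central chi-square: the first normal carries the whole mean shift sqrt(lam).
    u = (rng.gauss(0.0, 1.0) + math.sqrt(lam)) ** 2
    u += sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(n - 1))
    # Ordinary chi-square with d degrees of freedom = gamma(shape d/2, scale 2).
    v = rng.gammavariate(d / 2, 2.0)
    return (u / n) / (v / d)

rng = random.Random(2)   # fixed seed, chosen arbitrarily for reproducibility
n, d, lam = 4, 12, 3.0
draws = [sample_noncentral_f(n, d, lam, rng) for _ in range(20_000)]
sample_mean = sum(draws) / len(draws)
# By independence, E(X) = (d/n) E(U) E(1/V) = d (n + lam) / (n (d - 2)) for d > 2.
print(sample_mean, d * (n + lam) / (n * (d - 2)))
```

Setting `lam = 0.0` recovers the ordinary \( F \) distribution, whose mean is \( d / (d - 2) \).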

    One of the most interesting and important results for the non-central chi-square distribution is that it is a Poisson mixture of ordinary chi-square distributions. This leads to a similar result for the non-central \( F \) distribution.

    Suppose that \( N \) has the Poisson distribution with parameter \( \lambda / 2 \), and that the conditional distribution of \( \frac{n}{n + 2 N} X \) given \( N \) is the \( F \) distribution with \( n + 2 N \) degrees of freedom in the numerator and \( d \) degrees of freedom in the denominator, where \( \lambda \in [0, \infty) \) and \( n, \, d \in (0, \infty) \). Then \( X \) has the non-central \( F \) distribution with \( n \) degrees of freedom in the numerator, \( d \) degrees of freedom in the denominator, and non-centrality parameter \( \lambda \).

    Proof

    As in the theorem, let \( N \) have the Poisson distribution with parameter \( \lambda / 2 \), and suppose also that the conditional distribution of \( U \) given \( N \) is chi-square with \( n + 2 N \) degrees of freedom, and that \( V \) has the chi-square distribution with \( d \) degrees of freedom and is independent of \( (N, U) \). Let \( X = (U / n) \big/ (V / d) \). Since \( V \) is independent of \( (N, U) \), the variable \( X \) satisfies the condition in the theorem; that is, given \( N \), \[ \frac{n}{n + 2 N} X = \frac{U / (n + 2 N)}{V / d} \] has the \( F \) distribution with \( n + 2 N \) degrees of freedom in the numerator and \( d \) degrees of freedom in the denominator. But then also, (unconditionally) \( U \) has the non-central chi-square distribution with \( n \) degrees of freedom and non-centrality parameter \( \lambda \), \( V \) has the chi-square distribution with \( d \) degrees of freedom, and \( U \) and \( V \) are independent. So by definition \( X \) has the non-central \( F \) distribution with \( n \) degrees of freedom in the numerator, \( d \) degrees of freedom in the denominator, and non-centrality parameter \( \lambda \).

    From the last result, we can express the probability density function and distribution function of the non-central \( F \) distribution as a series in terms of ordinary \( F \) density and distribution functions. To set up the notation, for \( j, k \in (0, \infty) \) let \( f_{j k} \) be the probability density function and \( F_{j k} \) the distribution function of the \( F \) distribution with \( j \) degrees of freedom in the numerator and \( k \) degrees of freedom in the denominator. For the rest of this discussion, \( \lambda \in [0, \infty) \) and \( n, \, d \in (0, \infty) \) as usual.

    The probability density function \( g \) of the non-central \( F \) distribution with \( n \) degrees of freedom in the numerator, \( d \) degrees of freedom in the denominator, and non-centrality parameter \( \lambda \) is given by \[ g(x) = \sum_{k = 0}^\infty e^{-\lambda / 2} \frac{(\lambda / 2)^k}{k!} \frac{n}{n + 2 k} f_{n + 2 k, d}\left(\frac{n x}{n + 2 k}\right), \quad x \in (0, \infty) \]

    The distribution function \( G \) of the non-central \( F \) distribution with \( n \) degrees of freedom in the numerator, \( d \) degrees of freedom in the denominator, and non-centrality parameter \( \lambda \) is given by \[ G(x) = \sum_{k = 0}^\infty e^{-\lambda / 2} \frac{(\lambda / 2)^k}{k!} F_{n + 2 k, d}\left(\frac{n x}{n + 2 k}\right), \quad x \in (0, \infty) \]


    FAQs

    What does the F distribution tell you? ›

    The F-distribution was developed by Fisher to study the behavior of two variances from random samples taken from two independent normal populations. In applied problems we may be interested in knowing whether the population variances are equal or not, based on the response of the random samples.

    What is the limit of the F distribution? ›

    The F distribution is an asymmetric distribution that has a minimum value of 0, but no maximum value. The curve reaches a peak not far to the right of 0, and then gradually approaches the horizontal axis the larger the F value is. The F distribution approaches, but never quite touches the horizontal axis.

    What is the F distribution v1 and v2? ›

    The F Distribution

    The distribution of all possible values of the f statistic is called an F distribution, with v1 = n1 - 1 and v2 = n2 - 1 degrees of freedom. The curve of the F distribution depends on the degrees of freedom, v1 and v2.

    What is the difference between the T and F distribution? ›

    The t-distribution is used in hypothesis testing and confidence intervals for the mean of a single population, or the difference in means between two populations. The F-distribution is used in hypothesis testing and confidence intervals for the variance ratio of two populations.

    How do you interpret the F-test results? ›

    Result of the F Test (Decided Using F Directly)

    If the F value is smaller than the critical value in the F table, then the model is not significant. If the F value is larger, then the model is significant. Remember that the statistical meaning of significant is slightly different from its everyday usage.

    What is the significance level of the F distribution? ›

    If the F-test statistic is greater than or equal to 2.92, our results are statistically significant. The probability distribution plot below displays this graphically. The shaded area is the probability of F-values falling within the rejection region of the F-distribution when the null hypothesis is true.

    Can F distribution be greater than 1? ›

    Since variances are always positive, if the null hypothesis is false, MSbetween will generally be larger than MSwithin. Then the F-ratio will be larger than one. However, if the population effect is small, it is not unlikely that MSwithin will be larger in a given sample.

    What is the expected value of f distribution? ›

    Note that the F distribution is constructed as the ratio of two independent scaled chi-squared random variables. The expected values and variances of such ratio distributions are often undefined. In this case, the expected value is only defined for ν₂ > 2, and the variance is only defined for ν₂ > 4.

    What is the conclusion of the F distribution? ›

    If the F statistic is larger than the critical value from the F distribution, the null hypothesis is rejected and it can be concluded that there is a significant difference between the means of the groups.

    How to solve F distribution? ›

    F-Distribution Formula

    The formula to calculate the F-statistic, or F-value, is \( F = \sigma_1^2 / \sigma_2^2 \), or Variance 1/Variance 2. In order to accommodate the right-skewed shape of the F-distribution, the larger variance is placed in the numerator and the smaller variance is used in the denominator.

    What is the decision rule for the F distribution? ›

    Decision Rule: Determine the critical value of F from the F-distribution tables for a chosen significance level (α, often 0.05) and the degrees of freedom from both samples. If the calculated F-statistic is greater than the critical value from the F-distribution table, reject H0.

    What is the range of the F distribution? ›

    The F-distribution curve is positively skewed towards the right with a range of 0 and ∞. The value of F is always positive or zero. No negative values. The shape of the distribution depends on the degrees of freedom of numerator ϑ1 and denominator ϑ2.

    What is another name for the F-distribution? ›

    In probability theory and statistics, the F-distribution or F-ratio, also known as Snedecor's F distribution or the Fisher–Snedecor distribution (after Ronald Fisher and George W.

    What is the t-distribution in layman's terms? ›

    What is the t-distribution? The t-distribution describes the standardized distances of sample means to the population mean when the population standard deviation is not known, and the observations come from a normally distributed population.

    When should you use t-distribution? ›

    You must use the t-distribution table when working problems when the population standard deviation (σ) is not known and the sample size is small (n<30). General Correct Rule: If σ is not known, then using t-distribution is correct. If σ is known, then using the normal distribution is correct.

    How do you interpret the F-test results in Excel? ›

    Decoding the F-Test Output in Excel

    F Statistic: Think of it as the ratio of variance between two datasets. A higher value indicates a significant difference in variances, suggesting that not all samples come from populations with the same variance.

    What does F mean in probability distribution? ›

    Definition. The cumulative distribution function (cdf) gives the probability that the random variable X is less than or equal to x and is usually denoted F(x) . The cumulative distribution function of a random variable X is the function given by F(x)=P[X≤x].

    What is the significance F in a regression? ›

    F is a test for statistical significance of the regression equation as a whole. It is obtained by dividing the explained variance by the unexplained variance. By rule of thumb, an F-value of greater than 4.0 is usually statistically significant but you must consult an F-table to be sure.


    Author: Van Hayes
