# Standard Deviation

The **standard deviation** of a probability distribution, like the variance, measures the spread of that distribution. It quantifies how much the outcomes of a probability experiment tend to differ from the expected value.

Standard deviation is often used in the calculation of other statistics such as the \(z\)-score and the coefficient of variation.

## Properties of Standard Deviation

The standard deviation of a random variable \(X\), denoted as \(\sigma\) or \(\sigma(X)\), is defined as the square root of the variance.

The standard deviation of a random variable \(X\) is

\[ \sigma(X) = \sqrt{ \text{Var}[X] }.\]

Equivalently, writing \(\sigma^2\) for the variance of \(X\), the **standard deviation** of a random variable \(X\) is

\[\sigma=\sqrt{\sigma^2}.\]

What is the standard deviation of a fair six-sided die roll?

Let \(X\) be the random variable that represents the result of a fair six-sided die roll.

Recall from another example that \(\text{Var}[X]=\dfrac{35}{12}\). Then,

\[\sigma(X)=\sqrt{\dfrac{35}{12}}\approx 1.708.\ _\square\]
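As a quick numerical check of this example, the following minimal Python sketch recomputes the mean, variance, and standard deviation of a fair die directly from the definitions. The variable names are illustrative only, and exact fractions are used so that the variance can be compared with \(\frac{35}{12}\).

```python
from fractions import Fraction
import math

# Outcomes of a fair six-sided die, each with probability 1/6.
outcomes = range(1, 7)
p = Fraction(1, 6)

mean = sum(p * x for x in outcomes)                    # E[X]
variance = sum(p * (x - mean) ** 2 for x in outcomes)  # Var[X] = E[(X - E[X])^2]
sigma = math.sqrt(variance)                            # standard deviation

print(mean, variance, round(sigma, 3))  # prints: 7/2 35/12 1.708
```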

By the properties of variance, we have the following properties of standard deviation:

For random variable \(X\) and any constant \(c\), we have

\[\sigma(cX ) = \lvert c \rvert \big( \sigma(X) \big).\]

For random variable \(X\) and any constant \(c\), we have

\[\sigma(X + c ) = \sigma(X).\]

Let \(X_1, X_2, \ldots, X_k\) be pairwise independent random variables. Then

\[ \sigma(X_1 + X_2 + \cdots + X_k) = \sqrt{\text{Var}[X_1] + \text{Var}[X_2] + \cdots + \text{Var}[X_k]}.\]
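The three properties above can be checked numerically on a small example. The sketch below is only an illustration, assuming a fair die as the base distribution and the arbitrary constants \(c=-3\) and \(c=10\). The helper `std` is a hypothetical name, not a library function.

```python
from fractions import Fraction
from itertools import product
import math

def std(dist):
    """Standard deviation of a finite distribution given as {value: probability}."""
    mean = sum(p * x for x, p in dist.items())
    var = sum(p * (x - mean) ** 2 for x, p in dist.items())
    return math.sqrt(var)

die = {x: Fraction(1, 6) for x in range(1, 7)}

# Scaling: sigma(cX) = |c| * sigma(X), here with c = -3.
scaled = {-3 * x: p for x, p in die.items()}

# Shifting: sigma(X + c) = sigma(X), here with c = 10.
shifted = {x + 10: p for x, p in die.items()}

# Sum of two independent dice: build the distribution of X1 + X2.
two_dice = {}
for (x1, p1), (x2, p2) in product(die.items(), repeat=2):
    two_dice[x1 + x2] = two_dice.get(x1 + x2, 0) + p1 * p2

scaling_ok = math.isclose(std(scaled), 3 * std(die))
shift_ok = math.isclose(std(shifted), std(die))
sum_ok = math.isclose(std(two_dice), math.sqrt(2 * Fraction(35, 12)))
print(scaling_ok, shift_ok, sum_ok)
```

Note that the standard deviations themselves do not add: the sum of two dice has standard deviation \(\sqrt{2}\,\sigma(X)\), not \(2\sigma(X)\).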

## Gaussian Distribution and \(z\)-Scores

The variance and standard deviation of a random variable intuitively measure the amount of spread, or dispersion of the random variable from the mean. This allows us to approach the following question: what is the probability that the value of the random variable is far from its expected value?

Below, we prove a bound on this probability for a general random variable using Chebyshev's inequality. In the special case of a random variable \(X\) with a normal distribution (or Gaussian distribution) with mean \(\mu\) and standard deviation \(\sigma\), we have the following:

\[\begin{align} P(\mu - \sigma \leq X \leq \mu + \sigma) & \approx 0.6827 \\ P(\mu - 2\sigma \leq X \leq \mu + 2\sigma) & \approx 0.9545 \\ P(\mu - 3\sigma \leq X \leq \mu + 3\sigma) & \approx 0.9973. \end{align}\]

In terms of the probability density function of \(X\), the area under the curve is approximately \(0.6827\) over the interval \( ( \mu - \sigma, \mu + \sigma ) \), approximately \(0.9545\) over \( ( \mu - 2\sigma, \mu + 2\sigma ) \), and approximately \(0.9973\) over \( ( \mu - 3\sigma, \mu + 3\sigma ) \). These probabilities hold regardless of the values of \(\mu\) and \(\sigma\).
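These probabilities can be evaluated numerically. The sketch below relies on the standard identity \(P(\lvert X - \mu \rvert \leq k\sigma) = \operatorname{erf}\big(k/\sqrt{2}\big)\) for a normal random variable, which is not derived in this article. The function name `prob_within` is illustrative.

```python
import math

def prob_within(k):
    """P(mu - k*sigma <= X <= mu + k*sigma) for a normal X, via erf(k / sqrt(2))."""
    return math.erf(k / math.sqrt(2))

# The result does not depend on mu or sigma, only on the number k of standard deviations.
for k in (1, 2, 3):
    print(k, round(prob_within(k), 4))
```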

For a random variable, the **\(z\)-score**, or **standardized score**, of a value is the number of standard deviations by which the value differs from the mean.

Let \(X\) be a random variable. If \(\mu\) is the expected value of \(X\), \(\sigma\) is the standard deviation of \(X\), and \(x\) is a value that \(X\) can take, then the **\(z\)-score** of \(x\) is

\[z=\frac{x-\mu}{\sigma}.\]

This measures how far the value lies from the mean, in units of the standard deviation. If the value is less than the expected value, the \(z\)-score is negative. If the value is greater than the expected value, the \(z\)-score is positive.
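As a small illustration, the sketch below computes \(z\)-scores for the extreme outcomes of a fair die, using \(\mu = 3.5\) and \(\sigma = \sqrt{35/12}\) from the earlier example. The function name `z_score` is a hypothetical helper.

```python
import math

def z_score(x, mu, sigma):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return (x - mu) / sigma

mu = 3.5
sigma = math.sqrt(35 / 12)

z_high = z_score(6, mu, sigma)  # value above the mean: positive z-score
z_low = z_score(1, mu, sigma)   # value below the mean: negative z-score
print(round(z_high, 3), round(z_low, 3))
```

Because the die is symmetric about its mean, the two scores have equal magnitude and opposite sign.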

## Coefficient of Variation

While studying the spread of the random variable, we may want to take into account the magnitude of the expected value. The **coefficient of variation** is the ratio of the standard deviation to the expected value:

\[ CV(X) = \frac{\sigma}{\mu}. \]

The coefficient of variation is a measurement of the amount of deviation in a probability distribution relative to the expected value.
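To see why the magnitude of the mean matters, consider a fair die \(X\) and the shifted variable \(X + 100\). Both have the same standard deviation, but very different coefficients of variation. The sketch below assumes equally likely values and uses the hypothetical helper `coefficient_of_variation`.

```python
import math

def coefficient_of_variation(values):
    """sigma / mu for a uniform distribution over `values` (undefined when mu == 0)."""
    n = len(values)
    mu = sum(values) / n
    if mu == 0:
        raise ValueError("coefficient of variation is undefined when the mean is 0")
    sigma = math.sqrt(sum((x - mu) ** 2 for x in values) / n)
    return sigma / mu

die = [1, 2, 3, 4, 5, 6]
shifted_die = [x + 100 for x in die]  # same spread, much larger mean

cv_die = coefficient_of_variation(die)
cv_shifted = coefficient_of_variation(shifted_die)
print(round(cv_die, 3), round(cv_shifted, 4))
```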

Prove Markov's inequality: For any nonnegative random variable \(X\) and positive constant \(a\),

\[P(X \geq a ) \leq \frac{E[X]}{a}.\]

Consider the event \(A = \{ s : X(s) \geq a \} \). Then

\[ \begin{align} E[X] &= \sum_s P(s)X(s) \\ &= \sum_{s \in A} P(s)X(s) + \sum_{s \not\in A} P(s)X(s) \\ & \geq \sum_{s \in A} P(s)X(s) \qquad (\text{since } X(s) \geq 0 \text{ for all } s) \\ &\geq a \sum_{s \in A} P(s) \qquad (\text{since } X(s) \geq a \text{ for all } s \in A) \\ &= a P(A). \end{align}\]

Therefore, \(E[X] \geq a P( X \geq a) \), implying \( P( X \geq a) \leq \frac{E[X]}{a}.\) \(_\square\)
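The inequality can be sanity-checked on a concrete nonnegative random variable. The sketch below, which is an illustration rather than part of the proof, compares the exact tail probability \(P(X \geq a)\) of a fair die with the Markov bound \(\frac{E[X]}{a}\) for \(a = 1, \ldots, 6\).

```python
from fractions import Fraction

die = {x: Fraction(1, 6) for x in range(1, 7)}
mean = sum(p * x for x, p in die.items())  # E[X] = 7/2

rows = []
for a in range(1, 7):
    tail = sum(p for x, p in die.items() if x >= a)  # exact P(X >= a)
    bound = mean / a                                 # Markov bound E[X] / a
    rows.append((a, tail, bound))
    print(a, tail, bound, tail <= bound)
```

The bound holds in every case but is often loose. For small \(a\) it exceeds \(1\) and therefore carries no information.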

We now consider bounds for a general random variable (not necessarily nonnegative).

Prove Chebyshev's inequality: For a random variable \(X\) with mean \(\mu\) and standard deviation \(\sigma\) and for any positive constant \(a,\)

\[ P \big( \lvert X - \mu \rvert \geq a \sigma \big) \leq \frac{1}{a^2}.\]

Consider the random variable \( (X - \mu)^2\). This is a nonnegative random variable, so we can apply Markov's inequality to obtain

\[ \begin{align} P\big( \lvert X - \mu \rvert \geq a \sigma \big) &= P\big( (X - \mu)^2 \geq a^2 \sigma^2\big) \\\\ &\leq \frac{E\big[ (X - \mu)^2\big] }{a^2 \sigma^2} \\\\ &= \frac{\sigma^2 }{a^2 \sigma^2}\\\\ &= \frac{1}{a^2}. \end{align} \]

Therefore,

\[ P \big( \lvert X - \mu \rvert \geq a \sigma \big) \leq \frac{1}{a^2}.\ _\square\]
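Chebyshev's inequality holds for any distribution with finite variance, so it is usually far from tight for a specific one. The sketch below compares the bound \(\frac{1}{a^2}\) with the exact two-sided tail of a normal distribution, using the standard identity \(P(\lvert X - \mu \rvert \geq a\sigma) = \operatorname{erfc}\big(a/\sqrt{2}\big)\), which is assumed here rather than derived. The function names are illustrative.

```python
import math

def chebyshev_bound(a):
    """Upper bound on P(|X - mu| >= a*sigma) for any X with finite variance."""
    return min(1.0, 1.0 / a ** 2)

def normal_two_sided_tail(a):
    """Exact P(|X - mu| >= a*sigma) when X is normally distributed."""
    return math.erfc(a / math.sqrt(2))

rows = [(a, normal_two_sided_tail(a), chebyshev_bound(a)) for a in (1.5, 2, 3)]
for a, exact, bound in rows:
    print(a, round(exact, 4), round(bound, 4))
```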

## Example Problems

In an annual check-up, the standard fasting blood sugar level has a normal range of 75 to 115 mg/dL, which is claimed to cover approximately 95% of the world's population.

What is the cut-point integer value of high sugar level \(\big(\)beyond \(99^\text{th}\) percentile\(\big)\) in an individual? (You don't have to be a doctor to diagnose diabetes.)
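One possible way to approach this problem numerically is sketched below. It rests on assumptions that the problem statement does not spell out: blood sugar levels are normally distributed, the quoted range is the central 95% interval, and the mean is its midpoint. It uses `statistics.NormalDist` from the Python standard library.

```python
from statistics import NormalDist

low, high = 75.0, 115.0   # quoted normal range in mg/dL
mu = (low + high) / 2     # assumed mean: midpoint of the range

# Assumption: the range is the central 95% interval, so high = mu + z * sigma,
# where z is the 97.5th percentile of the standard normal (about 1.96).
z_975 = NormalDist().inv_cdf(0.975)
sigma = (high - mu) / z_975

# 99th percentile of the assumed distribution.
p99 = NormalDist(mu, sigma).inv_cdf(0.99)
print(round(sigma, 2), round(p99, 1))
```

Under these assumptions the 99th percentile falls between 118 and 119 mg/dL, so integer readings of 119 and above lie beyond it. Using the rougher rule that 95% corresponds to \(\mu \pm 2\sigma\) gives \(\sigma = 10\) and a slightly lower percentile value, but the same integer threshold.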

**Cite as:** Standard Deviation. *Brilliant.org*. Retrieved from https://brilliant.org/wiki/standard-deviation/