# Standard Deviation

The **standard deviation** of a probability distribution, like its variance, measures the spread of that distribution. It quantifies how much the outcomes of a probability experiment tend to differ from the expected value.

Standard deviation is often used in the calculation of other statistics such as the $z$-score and the coefficient of variation.

## Properties of Standard Deviation

The standard deviation of a random variable $X$, denoted as $\sigma$ or $\sigma(X)$, is defined as the square root of the variance.

That is, the standard deviation of a random variable $X$ is

$\sigma(X) = \sqrt{ \text{Var}[X] }.$

Equivalently, writing $\sigma^2$ for the variance, the **standard deviation** of a random variable $X$ is $\sigma=\sqrt{\sigma^2}.$

What is the standard deviation of a fair six-sided die roll?

Let $X$ be the random variable that represents the result of a fair six-sided die roll.

Recall from another example that $\text{Var}[X]=\dfrac{35}{12}$. Then,

$\sigma(X)=\sqrt{\dfrac{35}{12}}\approx 1.708.\ _\square$
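The value above can be reproduced directly from the definition. The following is a minimal sketch in Python, assuming the die is modeled as six equally likely outcomes; the variable names are illustrative and not part of the original example.

```python
from fractions import Fraction
import math

# Outcomes of a fair six-sided die, each with probability 1/6.
outcomes = range(1, 7)
p = Fraction(1, 6)

mean = sum(p * x for x in outcomes)                    # E[X] = 7/2
variance = sum(p * (x - mean) ** 2 for x in outcomes)  # Var[X] = 35/12
sigma = math.sqrt(variance)                            # square root of the variance

print(mean, variance, round(sigma, 3))  # 7/2 35/12 1.708
```

Using `Fraction` keeps the mean and variance exact, so only the final square root is a floating-point approximation.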

By the properties of variance, we have the following properties of standard deviation:

For random variable $X$ and any constant $c$, we have

$\sigma(cX ) = \lvert c \rvert \big( \sigma(X) \big).$

For random variable $X$ and any constant $c$, we have

$\sigma(X + c ) = \sigma(X).$

Let $X_1, X_2, \ldots, X_k$ be pairwise independent random variables. Then

$\sigma(X_1 + X_2 + \cdots + X_k) = \sqrt{\text{Var}[X_1] + \text{Var}[X_2] + \cdots + \text{Var}[X_k]}.$
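These three properties can be checked on a small example. The sketch below assumes finite distributions stored as `{value: probability}` dictionaries and uses a fair die as the test case; the helper `std` and the constants $c = -3$ and $c = 10$ are illustrative choices, not part of the original text.

```python
from fractions import Fraction
from itertools import product
import math

def std(dist):
    """Standard deviation of a finite distribution given as {value: probability}."""
    mean = sum(p * x for x, p in dist.items())
    var = sum(p * (x - mean) ** 2 for x, p in dist.items())
    return math.sqrt(var)

die = {x: Fraction(1, 6) for x in range(1, 7)}

# Scaling: sigma(cX) = |c| * sigma(X), here with c = -3.
scaled = {-3 * x: p for x, p in die.items()}

# Shifting: sigma(X + c) = sigma(X), here with c = 10.
shifted = {x + 10: p for x, p in die.items()}

# Sum of two independent dice: joint probabilities are products.
total = {}
for (x, px), (y, py) in product(die.items(), repeat=2):
    total[x + y] = total.get(x + y, 0) + px * py

print(math.isclose(std(scaled), 3 * std(die)))                    # True
print(math.isclose(std(shifted), std(die)))                       # True
print(math.isclose(std(total), math.sqrt(2 * Fraction(35, 12))))  # True
```

Note that the standard deviation of the sum is $\sqrt{2}\,\sigma(X)$, not $2\sigma(X)$: variances add for independent variables, but standard deviations do not.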

## Gaussian Distribution and $z$-Scores

The variance and standard deviation of a random variable intuitively measure the amount of spread, or dispersion, of the random variable about its mean. This allows us to approach the following question: what is the probability that the value of the random variable is far from its expected value?

In the worked examples below, we will prove bounds on this probability for a general random variable using Chebyshev's inequality. In the special case of a random variable $X$ with a normal distribution (or Gaussian distribution) $N(\mu, \sigma)$ with mean $\mu$ and standard deviation $\sigma$, we have the following:

$\begin{aligned} P(\mu - \sigma \leq X \leq \mu + \sigma) & \approx 0.6827 \\ P(\mu - 2\sigma \leq X \leq \mu + 2\sigma) & \approx 0.9545 \\ P(\mu - 3\sigma \leq X \leq \mu + 3\sigma) & \approx 0.9973. \end{aligned}$

In terms of the probability density function of $X$, the area under the curve over the interval $( \mu - \sigma, \mu + \sigma )$ is approximately $0.6827$, the area over $( \mu - 2\sigma, \mu + 2\sigma )$ is approximately $0.9545$, and the area over $( \mu - 3\sigma, \mu + 3\sigma )$ is approximately $0.9973$. These probabilities hold regardless of the values of $\mu$ and $\sigma$.
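The probabilities of roughly 68%, 95%, and 99.7% can be computed from the error function, since for a normal random variable $P(\mu - k\sigma \leq X \leq \mu + k\sigma) = \operatorname{erf}\big(k/\sqrt{2}\big)$. The sketch below uses Python's standard `math.erf`; the function name `prob_within` is an illustrative choice.

```python
import math

def prob_within(k):
    """P(mu - k*sigma <= X <= mu + k*sigma) for a normal X.

    The result depends only on k, not on mu or sigma.
    """
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(k, round(prob_within(k), 4))
# 1 0.6827
# 2 0.9545
# 3 0.9973
```

Because neither $\mu$ nor $\sigma$ appears in the formula, the same three numbers apply to every normal distribution.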

For a random variable, the **$z$-score**, or **standardized score**, of a value is the number of standard deviations by which the value differs from the mean of the random variable.

Let $X$ be a random variable. If $\mu$ is the expected value of $X$, $\sigma$ is the standard deviation of $X$, and $x$ is a value that $X$ can take, then the **$z$-score** of $x$ is

$z=\frac{x-\mu}{\sigma}.$

This measures how far the value lies from the mean, in units of the standard deviation. If the value is less than the expected value, the $z$-score is negative. If the value is greater than the expected value, the $z$-score is positive.
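As a small illustration, the sketch below computes $z$-scores for outcomes of a fair six-sided die, using $\mu = 3.5$ and $\sigma = \sqrt{35/12}$ from the earlier example. The function name `z_score` and the check that $\sigma > 0$ are assumptions made for this sketch.

```python
import math

def z_score(x, mu, sigma):
    """Number of standard deviations that x lies from mu (requires sigma > 0)."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return (x - mu) / sigma

mu = 3.5
sigma = math.sqrt(35 / 12)

print(round(z_score(6, mu, sigma), 3))  # 1.464, above the mean
print(round(z_score(1, mu, sigma), 3))  # -1.464, below the mean
print(z_score(3.5, mu, sigma))          # 0.0, exactly at the mean
```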

## Coefficient of Variation

When studying the spread of a random variable, we may want to take into account the magnitude of the expected value. The **coefficient of variation** is the ratio of the standard deviation to the expected value, defined when $\mu \neq 0$:

$CV(X) = \frac{\sigma}{\mu}.$

The coefficient of variation is a measurement of the amount of deviation in a probability distribution relative to the expected value.
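The sketch below computes the coefficient of variation for a fair die and for the same die with every outcome multiplied by 10. It assumes a nonzero mean; the function name `coefficient_of_variation` is illustrative.

```python
import math

def coefficient_of_variation(mu, sigma):
    """Ratio sigma / mu; undefined when the mean is zero."""
    if mu == 0:
        raise ValueError("coefficient of variation is undefined for mu = 0")
    return sigma / mu

# Fair six-sided die: mu = 3.5, sigma = sqrt(35/12).
mu = 3.5
sigma = math.sqrt(35 / 12)
cv_die = coefficient_of_variation(mu, sigma)

# Multiplying every outcome by 10 scales both mu and sigma by 10.
cv_scaled = coefficient_of_variation(10 * mu, 10 * sigma)

print(round(cv_die, 3))                 # 0.488
print(math.isclose(cv_die, cv_scaled))  # True
```

The second check shows why the ratio is useful: rescaling the variable by a positive constant (for example, changing units) changes $\sigma$ but leaves the coefficient of variation unchanged.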

Prove Markov's inequality: For any nonnegative random variable $X$ and positive constant $a$,

$P(X \geq a ) \leq \frac{E[X]}{a}.$

Consider the event $A = \{ s : X(s) \geq a \}$. Then

$\begin{aligned} E[X] &= \sum_s P(s)X(s) \\ &= \sum_{s \in A} P(s)X(s) + \sum_{s \not\in A} P(s)X(s) \\ & \geq \sum_{s \in A} P(s)X(s) \qquad (\text{since } X(s) \geq 0 \text{ for all } s) \\ &\geq a \sum_{s \in A} P(s) \qquad (\text{since } X(s) \geq a \text{ for all } s \in A) \\ &= a P(A). \end{aligned}$

Therefore, $E[X] \geq a P( X \geq a)$, implying $P( X \geq a) \leq \frac{E[X]}{a}.$ $_\square$
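Markov's inequality can be checked numerically on a concrete nonnegative random variable. The sketch below uses a fair six-sided die as an assumed test case and compares the exact tail probability with the bound $E[X]/a$; the helper `markov_check` is an illustrative name.

```python
from fractions import Fraction

# Fair six-sided die as a nonnegative random variable.
die = {x: Fraction(1, 6) for x in range(1, 7)}
mean = sum(p * x for x, p in die.items())  # E[X] = 7/2

def markov_check(a):
    """Return (P(X >= a), E[X] / a) for the die and a positive constant a."""
    tail = sum(p for x, p in die.items() if x >= a)
    return tail, mean / a

for a in range(1, 7):
    tail, bound = markov_check(a)
    print(a, tail, bound, tail <= bound)
```

For this distribution the bound holds for every $a$ but is far from tight, for example $P(X \geq 5) = \frac{1}{3}$ against a bound of $\frac{7}{10}$.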

We now consider bounds for a general random variable (not necessarily nonnegative).

Prove Chebyshev's inequality: For a random variable $X$ with mean $\mu$ and standard deviation $\sigma$ and for any positive constant $a,$

$P \big( \lvert X - \mu \rvert \geq a \sigma \big) \leq \frac{1}{a^2}.$

Consider the random variable $(X - \mu)^2$. This is a nonnegative random variable, so we can apply Markov's inequality to obtain

$\begin{aligned} P\big( \lvert X - \mu \rvert \geq a \sigma \big) &= P\big( (X - \mu)^2 \geq a^2 \sigma^2\big) \\\\ &\leq \frac{E\big[ (X - \mu)^2\big] }{a^2 \sigma^2} \\\\ &= \frac{\sigma^2 }{a^2 \sigma^2}\\\\ &= \frac{1}{a^2}. \end{aligned}$

Therefore,

$P \big( \lvert X - \mu \rvert \geq a \sigma \big) \leq \frac{1}{a^2}.\ _\square$
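As with Markov's inequality, the Chebyshev bound can be compared against exact probabilities. The sketch below again assumes a fair six-sided die and evaluates both sides for a few values of $a$; the helper `chebyshev_check` and the chosen values of $a$ are illustrative.

```python
from fractions import Fraction
import math

die = {x: Fraction(1, 6) for x in range(1, 7)}
mu = sum(p * x for x, p in die.items())               # 7/2
var = sum(p * (x - mu) ** 2 for x, p in die.items())  # 35/12
sigma = math.sqrt(var)

def chebyshev_check(a):
    """Return (P(|X - mu| >= a*sigma), 1 / a**2) for the die and a > 0."""
    tail = sum(p for x, p in die.items() if abs(x - mu) >= a * sigma)
    return float(tail), 1 / a ** 2

for a in (1, 1.2, 1.5, 2):
    tail, bound = chebyshev_check(a)
    print(a, round(tail, 4), round(bound, 4), tail <= bound)
```

For $a \leq 1$ the bound is at least $1$ and says nothing; it becomes informative only for $a > 1$. For the die, every outcome lies within $1.5\sigma$ of the mean, so the exact tail probability is already $0$ there while the bound is still $\frac{4}{9}$.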

## Example Problems

In an annual check-up, the standard fasting blood sugar level has a normal range of 75 to 115 mg/dL, which is claimed to cover approximately 95% of the world's population.

What is the cut-point integer value of high sugar level $\big($beyond $99^\text{th}$ percentile$\big)$ in an individual? (You don't have to be a doctor to diagnose diabetes.)

**Cite as:** Standard Deviation. *Brilliant.org*. Retrieved from https://brilliant.org/wiki/standard-deviation/