Mean Absolute Deviation

Mean absolute deviation describes the average distance between the values in a data set and the mean of the set.

For example, a data set with a mean absolute deviation of 3.2 has values that are, on average, 3.2 units away from the mean.

Understanding Variability

A group of kids in town A has a bake sale every week for 5 weeks, making \$40, \$40, \$50, \$55, and \$65. Their mean sales are $\frac{40+40+50+55+65}{5} = 50$ dollars.

A group of kids in town B also has a bake sale every week for 5 weeks, making \$10, \$20, \$35, \$80, and \$105. Their mean sales are $\frac{10+20+35+80+105}{5} = 50$ dollars.

Notice that both groups made the same average amount of money, but group B had much more variability in their sales. We might say that the mean is a better representation of "typical" for group A than for group B.
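The comparison can be checked numerically. The following is a minimal Python sketch; the variable names `town_a` and `town_b` are illustrative and not part of the original example. Both lists have the same mean, but the weekly distances from that mean are much larger for group B.

```python
# Weekly bake sale totals in dollars, taken from the example above.
town_a = [40, 40, 50, 55, 65]
town_b = [10, 20, 35, 80, 105]

mean_a = sum(town_a) / len(town_a)
mean_b = sum(town_b) / len(town_b)

# Distance of each week's sales from that group's mean.
distances_a = [abs(x - mean_a) for x in town_a]
distances_b = [abs(x - mean_b) for x in town_b]

print(mean_a, mean_b)  # 50.0 50.0
print(distances_a)     # [10.0, 10.0, 0.0, 5.0, 15.0]
print(distances_b)     # [40.0, 30.0, 15.0, 30.0, 55.0]
```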

While the mean, and other measures of center, provide us with important information about a data set, measures of variability provide us with additional insight.

Is the following statement true or false?

If two data sets share the same amount of variability, then they will have the same mean.

The statement is false. Variability and mean are independent measurements. For example, adding the same amount to every value in a data set raises the mean but leaves the variability unchanged, so two data sets can share the same variability and still have different means.
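A small counterexample makes this concrete. The sketch below uses hypothetical data (the lists `low` and `high` are invented for illustration) and measures variability as the average distance from the mean.

```python
def mean(values):
    return sum(values) / len(values)

def average_distance_from_mean(values):
    center = mean(values)
    return sum(abs(x - center) for x in values) / len(values)

# Hypothetical data: `high` is `low` with 100 added to every value.
low = [1, 2, 3]
high = [101, 102, 103]

print(mean(low), mean(high))  # 2.0 102.0
print(average_distance_from_mean(low) == average_distance_from_mean(high))  # True
```

The two lists are spread out in exactly the same way, yet their means differ by 100.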

Calculating Mean Absolute Deviation (MAD)

One way to measure variability is to use mean absolute deviation (MAD). It's the average of the absolute values of the differences between each data point and the mean. A data set with a MAD of 7 has more variability than a data set with a MAD of 4.
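Written as a formula, the same description looks like this. The notation is an assumption introduced here, not taken from the text: $x_1, \ldots, x_n$ are the data points and $\bar{x}$ is their mean. \[ \text{MAD} = \frac{1}{n} \sum_{i=1}^{n} \left| x_i - \bar{x} \right|, \qquad \text{where } \bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i. \]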

Let's find the mean absolute deviation for this data set: $\{1, 1, 2, 5, 6\}.$ It has a mean of $\frac{1+1+2+5+6}{5} = 3.$ The differences between the data points and the mean are \[1-3 = -2, \quad 1-3 = -2, \quad 2-3 = -1, \quad 5-3 = 2, \quad 6-3 = 3.\] Averaging the absolute values of these differences, we get $\frac{2+2+1+2+3}{5} = 2.$ The data points are, on average, 2 units away from the mean, so the mean absolute deviation is 2.
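The same steps can be written as a short Python function. This is a sketch rather than a library routine, and the name `mean_absolute_deviation` is chosen here for illustration.

```python
def mean_absolute_deviation(values):
    """Average absolute distance between each value and the mean."""
    center = sum(values) / len(values)             # step 1: the mean
    distances = [abs(x - center) for x in values]  # step 2: absolute differences
    return sum(distances) / len(distances)         # step 3: average them

data = [1, 1, 2, 5, 6]
result = mean_absolute_deviation(data)
print(result)
```

Running it on the data set above is a quick way to check the hand calculation.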

Why do we need to take the absolute value of the differences between the data points and the mean?

The mean is the balance point of a data set: the negative differences (from data points below the mean) exactly cancel the positive differences (from data points above the mean). Summing the signed differences therefore always gives 0, so their average would be 0 for every data set, no matter how spread out it is. Taking absolute values keeps the distances from canceling each other out.
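To see the effect of the absolute value, the sketch below sums the differences from the mean for the data set $\{1, 1, 2, 5, 6\}$ twice: once with their signs and once as absolute values. The variable names are illustrative.

```python
data = [1, 1, 2, 5, 6]
center = sum(data) / len(data)

# Signed differences between each data point and the mean.
signed = [x - center for x in data]

signed_total = sum(signed)
absolute_total = sum(abs(d) for d in signed)

print(signed)          # [-2.0, -2.0, -1.0, 2.0, 3.0]
print(signed_total)    # 0.0
print(absolute_total)  # 10.0
```

With other data the signed total may come out as a tiny nonzero number because of floating-point rounding, but mathematically it is always 0.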

Which of the following has a larger mean absolute deviation?

\[ \begin{aligned} A &= \{ 1, 13, 13, 13, 13, 25 \} \\ B &= \{ 6, 6, 6, 14, 14, 14 \} \end{aligned} \]

The mean absolute deviation for both sets is the same. Both have a total absolute deviation of 24 and both have 6 values, so the mean absolute deviation in each set is 4.

Note that intuitively, $A$ is "spread out" far more than $B$: it has significant outliers and a range of 24 rather than the range of 8 we find in set $B$. However, according to the mean absolute deviation, they have the same amount of variability.
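Both measures can be computed side by side for the two sets. This is a minimal Python sketch; the helper names `mean_absolute_deviation` and `value_range` are chosen here for illustration.

```python
def mean_absolute_deviation(values):
    center = sum(values) / len(values)
    return sum(abs(x - center) for x in values) / len(values)

def value_range(values):
    # Range: largest value minus smallest value.
    return max(values) - min(values)

set_a = [1, 13, 13, 13, 13, 25]
set_b = [6, 6, 6, 14, 14, 14]

print(mean_absolute_deviation(set_a), mean_absolute_deviation(set_b))  # 4.0 4.0
print(value_range(set_a), value_range(set_b))                          # 24 8
```

The two measures disagree because the range looks only at the two most extreme values, while the mean absolute deviation averages over every data point.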
