Statistics is such a powerful tool because it uses data to convey important information. This information may be visualized as a graph, chart, or table. It might be given numerically as the mean, median, mode, or percent change. There are so many different ways to frame information.

Because people can *choose* how to present statistics, the presentation can sometimes be deceiving. Let's look at some examples.

A study is done on 250 people who claim they have psychic abilities.

A regular deck of cards has an equal number of red and black cards. The deck is shuffled, and each person predicts the color of the top card. The top card is then flipped face up to see if they are correct. This is repeated 8 times in a row.

One participant gets all 8 predictions correct! The researcher claims that this person has exhibited psychic abilities.

Based on just the way the experiment is designed, is the researcher's claim a viable hypothesis? (Note: we're not judging the plausibility of psychic ability in itself, just this particular experiment's design.)
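One way to weigh the researcher's claim is to ask how often a perfect score would turn up by luck alone. The sketch below is only an illustration: it assumes every color guess is an independent 50/50 event (roughly true if the deck is reshuffled before each guess), and the function names are made up for this example.

```python
# Assumption: each color guess is an independent 50/50 event.
P_SINGLE_GUESS = 0.5
NUM_GUESSES = 8
NUM_PARTICIPANTS = 250


def prob_perfect_score(num_guesses, p_correct=P_SINGLE_GUESS):
    """Chance that one person guesses every card correctly by luck."""
    return p_correct ** num_guesses


def prob_at_least_one_perfect(num_participants, num_guesses):
    """Chance that at least one participant gets a perfect score by luck."""
    p_perfect = prob_perfect_score(num_guesses)
    # Complement rule: 1 minus the chance that nobody is perfect.
    return 1 - (1 - p_perfect) ** num_participants


p_one_person = prob_perfect_score(NUM_GUESSES)
p_someone = prob_at_least_one_perfect(NUM_PARTICIPANTS, NUM_GUESSES)

print(f"One person, all {NUM_GUESSES} correct: {p_one_person:.4f}")
print(f"At least one of {NUM_PARTICIPANTS} people: {p_someone:.3f}")
```

Under these assumptions a single person has a 1 in 256 chance of a perfect score, but with 250 participants the chance that *somebody* scores perfectly is roughly 62%. A perfect score from one participant is therefore about what luck alone would produce.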

Deception in statistics can be quite intentional. In the first problem, the choice of which months to display did not happen by accident! The researcher probably wanted to present the data in a certain way.

Deception can also be unintentional. In the second problem, the researcher examining psychic phenomena may have been earnest about their hypothesis, but they still interpreted the results of their experiment poorly.

Alternatively, sometimes the statistics themselves can deceive. It depends *who* gathers the data and *how* they do it. The questions that follow will demonstrate this issue.

The Transportation Association of Canada wanted to study the effects of wearing a helmet on motorcycle accidents. The data below was collected by Canadian **hospitals** between 2002 and 2014. Each motorcyclist who was in a traffic accident and arrived at a hospital had the information below recorded.

| Wearing helmet? | Yes | No |
|---|---|---|
| Arrived in ambulance | 18% | 9% |
| Had concussion | 9% | 3% |
| Needed blood transfusion | 0.8% | 0.4% |

Based on the table, what is true?

**A.** Motorcyclists who came to a Canadian hospital and were wearing helmets in traffic accidents were twice as likely to arrive in an ambulance as motorcyclists who were *not* wearing helmets.

**B.** Motorcyclists who came to a Canadian hospital and were wearing helmets in traffic accidents were three times as likely to have a concussion as motorcyclists who were *not* wearing helmets.

**C.** 0.8% of all Canadian motorcycle riders who wear helmets will need a blood transfusion.

(*Data source:* Journal of Internal Medicine.)
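The "times as likely" comparisons in the options come from dividing the helmet column by the no-helmet column. Here is a minimal sketch of that arithmetic using the percentages from the hospital table; the variable and function names are illustrative only.

```python
# Percentages from the hospital table: (wearing helmet, not wearing helmet).
hospital_rates = {
    "Arrived in ambulance": (18.0, 9.0),
    "Had concussion": (9.0, 3.0),
    "Needed blood transfusion": (0.8, 0.4),
}


def helmet_to_no_helmet_ratio(with_helmet, without_helmet):
    """How many times higher the helmet rate is than the no-helmet rate."""
    return with_helmet / without_helmet


ratios = {
    outcome: helmet_to_no_helmet_ratio(with_helmet, without_helmet)
    for outcome, (with_helmet, without_helmet) in hospital_rates.items()
}

for outcome, ratio in ratios.items():
    print(f"{outcome}: {ratio:.1f}x as common among helmet wearers")
```

These ratios describe only motorcyclists who arrived at a hospital. They say nothing on their own about all motorcyclists, or even about all motorcyclists who had an accident.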

The previous problem showed how it is important to remember the data source. Not everyone who gets in an accident goes to the hospital!

The table below shows **all data** regarding motorcycle accidents in Canada between 2002 and 2014.

Which quadrants would be missing from the hospital's data?

The data from the previous two questions demonstrates what's known as *Berkson's bias*. Even a sample that appears representative may exclude a specific group. Hospitals won't see people who are healthy enough not to need one, so hospital data leaves out this part of the population.
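To see how this kind of exclusion can distort a comparison, consider a sketch with made-up numbers. Every count below is hypothetical and chosen only for illustration; none of it comes from the study. It also makes the simplifying assumption that every rider with a concussion ends up at a hospital.

```python
# HYPOTHETICAL counts, invented for illustration only (not study data).
# Simplifying assumption: every concussion case reaches a hospital.
accidents = {
    "helmet": {"total": 1000, "hospital": 100, "concussions": 9},
    "no_helmet": {"total": 1000, "hospital": 600, "concussions": 18},
}


def percent(part, whole):
    """Express part/whole as a percentage."""
    return 100 * part / whole


# Rate among riders the hospital actually sees.
hospital_rate = {
    group: percent(counts["concussions"], counts["hospital"])
    for group, counts in accidents.items()
}

# Rate among all riders who had an accident.
overall_rate = {
    group: percent(counts["concussions"], counts["total"])
    for group, counts in accidents.items()
}

for group in accidents:
    print(
        f"{group}: {hospital_rate[group]:.1f}% of hospital arrivals, "
        f"{overall_rate[group]:.1f}% of all accidents"
    )
```

In this invented scenario, helmet wearers look three times as likely to have a concussion when only hospital arrivals are counted (9% versus 3%), yet across all accidents they are half as likely (0.9% versus 1.8%). The reversal happens because far fewer helmeted riders need a hospital in the first place.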

Recall that the table below shows all motorcycle accidents in Canada between 2002 and 2014.

Look at all accidents involving motorcyclists wearing helmets. What percentage of these accidents did the hospital see?

Statistics cannot be done in a mathematical vacuum. It usually requires real-life knowledge about the circumstances behind the data, and that knowledge can be as important as any calculation.

When looking at graphs or charts, it is important to inspect the axes and labels. When analyzing data, it is important to remember the data source and how the data was collected. This can help you recognize deceptive statistics.

It may seem, at this point, that statistics can only serve to confound. However, you'll find that statistics can be used to pierce deception rather than just create it. Continue the course to find out how!