Data Handling

Statistical literacy

Evaluate claims in media and research: sampling bias, correlation versus causation, and misleading charts.

Statistical literacy is the ability to read claims critically, question the quality of the data behind them, and avoid drawing conclusions the evidence does not support.

Questions to ask about a claim

  • Who collected the data and why?
  • How large and representative is the sample?
  • How was the data measured?
  • Are the conclusions supported by the evidence?

Sampling and bias

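Biased samples produce unreliable conclusions, and a large sample does not fix a biased selection method. Prefer samples that are randomly selected, sufficiently large, and representative of the target population.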
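A minimal sketch of this idea, using invented data: a hypothetical school of 20 classes in which one class is unusually positive about school meals. The class sizes, the answers, and the helper share_yes are all assumptions made for illustration, not real survey results.

```python
import random

# Hypothetical school: 20 classes of 30 students.
# 1 = likes school meals, 0 = does not.
# Class 0 is unusually positive (27 of 30); every other class has 12 of 30.
school = [[1] * 27 + [0] * 3] + [[1] * 12 + [0] * 18 for _ in range(19)]
all_students = [answer for group in school for answer in group]


def share_yes(answers):
    """Proportion of answers equal to 1."""
    return sum(answers) / len(answers)


true_share = share_yes(all_students)     # whole school
one_class_share = share_yes(school[0])   # biased: only one class asked

rng = random.Random(0)                   # fixed seed so the run is repeatable
random_sample = rng.sample(all_students, 100)
random_share = share_yes(random_sample)  # random sample across the school

print(f"whole school:  {true_share:.3f}")
print(f"one class:     {one_class_share:.3f}")
print(f"random sample: {random_share:.3f}")
```

In this made-up school the one-class estimate is far from the whole-school figure, while the random sample of 100 should land much closer to it.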

Correlation vs causation

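Correlation means two variables tend to change together. It does not prove that one causes the other: a confounder (a third variable that affects both) may explain the link, or the causation may run in the opposite direction.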
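The effect of a confounder can be sketched with a small simulation. Everything here is invented for illustration: the model assumes temperature drives both ice cream sales and sunburn cases, and neither outcome appears in the other's formula. The coefficients, noise levels, and the pearson helper are assumptions, not measured values.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


rng = random.Random(1)  # fixed seed so the run is repeatable

# Invented model: temperature (the confounder) drives both outcomes.
days = []
for _ in range(2000):
    temp = rng.uniform(10, 35)                # degrees C
    ice_cream = 20 * temp + rng.gauss(0, 50)  # sales, depends only on temp
    sunburn = 3 * temp + rng.gauss(0, 10)     # cases, depends only on temp
    days.append((temp, ice_cream, sunburn))

r_all = pearson([d[1] for d in days], [d[2] for d in days])

# Hold the confounder roughly constant: keep only days between 24 and 26 degrees.
similar = [d for d in days if 24 <= d[0] <= 26]
r_similar = pearson([d[1] for d in similar], [d[2] for d in similar])

print(f"all days:          r = {r_all:.2f}")
print(f"days at 24 to 26:  r = {r_similar:.2f}")
```

Under these assumptions the two outcomes should be strongly correlated across all days, yet the correlation should largely disappear once temperature is held nearly constant, which is what a confounded relationship looks like.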

Misleading charts

  • Truncated axes that exaggerate differences.
  • Inconsistent scales between charts or along one axis.
  • Cherry-picked time ranges.
  • 3D effects that distort relative sizes.
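The truncated-axis effect is simple arithmetic. The sketch below compares how tall two bars are drawn when the axis starts at 0 and when it starts at 95. The values 96 and 98 and the helper drawn_height_ratio are hypothetical, chosen only to show the calculation.

```python
def drawn_height_ratio(a, b, axis_start=0.0):
    """How many times taller bar b is drawn than bar a
    when the value axis starts at axis_start."""
    return (b - axis_start) / (a - axis_start)


a, b = 96.0, 98.0  # invented values, about 2% apart

full_axis = drawn_height_ratio(a, b, axis_start=0.0)    # 98 / 96
truncated = drawn_height_ratio(a, b, axis_start=95.0)   # 3 / 1

print(f"axis from 0:  bar b is drawn {full_axis:.2f} times as tall as bar a")
print(f"axis from 95: bar b is drawn {truncated:.2f} times as tall as bar a")
```

With the axis starting at 95, a difference of about 2% is drawn as a bar three times as tall.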

Interpreting risk language

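Distinguish absolute change from relative change. Absolute change is the difference between the two risks, in percentage points; relative change is that difference as a proportion of the starting risk. “Risk doubled” may still represent a very small absolute increase.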
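As a worked example, the two measures can be computed side by side. The function name risk_change is an assumption made for this sketch; it takes both risks as percentages.

```python
def risk_change(before_pct, after_pct):
    """Return (absolute change in percentage points,
    relative change as a percentage of the starting risk)."""
    absolute_points = after_pct - before_pct
    relative_pct = absolute_points / before_pct * 100
    return absolute_points, relative_pct


absolute_points, relative_pct = risk_change(1.0, 2.0)
print(absolute_points, relative_pct)  # → 1.0 100.0
```

A headline reporting only the relative figure (“risk doubled”) and one reporting only the absolute figure (“up one percentage point”) describe the same change.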

Exam strategy

  • Quote numerical evidence when critiquing claims.
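  • State any possible bias or confounding explicitly.
  • Write “suggests” rather than “proves” unless the study design (for example, a controlled experiment with random assignment) justifies a causal claim.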
  • Comment on uncertainty and data limitations.

Checkpoints

A survey about school meals only asks students in one class. Name one issue.

Answer: Sampling bias; the sample may not represent the whole school.

Two variables are strongly correlated. Can we conclude causation?

Answer: No. Correlation alone is insufficient; a confounder could explain the relationship.

A bar chart axis starts at 95 instead of 0. Why can this mislead?

Answer: Differences look much larger than they really are.

“Risk increased from 1% to 2%.” State absolute and relative increase.

Answer: Absolute +1 percentage point; relative increase 100% (doubled).

Name one way to improve reliability of a poll.

Answer: Use a larger random sample from the target population.
