Data Handling

Measures of dispersion

Interpret range, interquartile range, variance, and standard deviation for comparing data sets.

Dispersion describes how spread out data is. Two datasets can have the same average but very different variability.

Range and interquartile range

\[ \text{Range}=\text{max}-\text{min},\qquad \text{IQR}=Q_3-Q_1. \]

The IQR is less affected by outliers than the range, because it ignores the lowest and highest quarters of the data.
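As a rough illustration of why the IQR resists outliers, the Python sketch below computes both measures for a small made-up data set, then again after the largest value is replaced by an extreme one. The data and the helper names `data_range` and `iqr` are assumptions for illustration. Quartile conventions differ between textbooks and software; this sketch uses the "inclusive" method of Python's `statistics.quantiles`, so other methods may give slightly different quartiles.

```python
import statistics

def data_range(values):
    """Range = max - min."""
    return max(values) - min(values)

def iqr(values):
    """IQR = Q3 - Q1, using the 'inclusive' quartile convention (an assumption)."""
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return q3 - q1

# Hypothetical data: the second list replaces the largest value with an outlier.
typical = [2, 4, 6, 8, 10, 12, 14, 16, 18]
with_outlier = [2, 4, 6, 8, 10, 12, 14, 16, 100]

print(data_range(typical), iqr(typical))            # range 16, IQR 8.0
print(data_range(with_outlier), iqr(with_outlier))  # range 98, IQR 8.0
```

The single outlier changes the range from 16 to 98, while the IQR stays at 8.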

Variance and standard deviation

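Variance measures the average squared distance of the values from the mean. The standard deviation is the square root of the variance, so it has the same units as the data. The formula below divides by \(n\), which gives the population standard deviation: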

\[ \sigma=\sqrt{\frac{\sum (x-\bar{x})^2}{n}} \]
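The formula above can be sketched step by step in Python. The data values and the function name `population_sd` are made up for illustration. The standard library's `statistics.pvariance` also divides by \(n\), so it is used here as a cross-check.

```python
import math
import statistics

def population_sd(values):
    """Standard deviation with divisor n, matching the formula above."""
    n = len(values)
    mean = sum(values) / n
    # Average of the squared distances from the mean = variance.
    variance = sum((x - mean) ** 2 for x in values) / n
    return math.sqrt(variance)

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical values with mean 5

variance = statistics.pvariance(data)  # population variance (divisor n)
sd = population_sd(data)

print(variance, sd)  # variance 4, standard deviation 2
```

Here the squared distances from the mean add up to 32, so the variance is \(32/8=4\) and the standard deviation is \(\sqrt{4}=2\).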

Comparing datasets

  • Smaller spread statistics \(\Rightarrow\) more consistent data.
  • Use median + IQR for skewed data.
  • Use mean + standard deviation for roughly symmetric data.
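To make the first point concrete, the sketch below compares two made-up sets of scores (the names `team_a` and `team_b` are hypothetical) that share the same mean but differ in spread. Both sets are roughly symmetric, so the mean and standard deviation are used.

```python
import statistics

# Hypothetical scores: same mean, very different spread.
team_a = [48, 49, 50, 51, 52]
team_b = [30, 40, 50, 60, 70]

for name, scores in (("A", team_a), ("B", team_b)):
    mean = statistics.mean(scores)
    sd = statistics.pstdev(scores)  # population SD (divisor n)
    print(f"{name}: mean = {mean}, SD = {sd:.2f}")
```

Both means are 50, but the standard deviations are about 1.41 for A and 14.14 for B, so A is the more consistent set.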

Box plot interpretation

The box shows the middle 50% of the data (\(Q_1\) to \(Q_3\)), and the line inside the box marks the median. A longer box or longer whiskers indicate a larger spread.
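A box plot is drawn from the five-number summary, which can be sketched in Python as follows. The data and the function name `five_number_summary` are assumptions for illustration. The sketch also assumes the whiskers extend to the minimum and maximum (some conventions stop them at \(1.5\times\text{IQR}\) and plot outliers separately) and uses the "inclusive" quartile method.

```python
import statistics

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using 'inclusive' quartiles (an assumption)."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return min(values), q1, median, q3, max(values)

data = [3, 5, 7, 8, 9, 11, 13, 15, 30]  # hypothetical, with one large value
low, q1, median, q3, high = five_number_summary(data)

box_length = q3 - q1        # spread of the middle 50%
lower_whisker = q1 - low    # spread of the lowest quarter
upper_whisker = high - q3   # spread of the highest quarter

print(low, q1, median, q3, high)
print(box_length, lower_whisker, upper_whisker)
```

For this data the summary is 3, 7, 9, 13, 30. The upper whisker (length 17) is much longer than the lower one (length 4), which shows that the highest quarter of the data is far more spread out than the lowest quarter.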

Exam strategy

  • State both center and spread when comparing sets.
  • Do not confuse variance with standard deviation.
  • Remember that statistics calculated from grouped data are estimates, because class midpoints stand in for the actual values.
  • Interpret spread in context (consistency/risk).
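On the grouped-data point, the sketch below shows one common way such estimates arise: each class is represented by its midpoint. The class boundaries and frequencies are made up for illustration, and the results are estimates only, since the original values inside each class are unknown.

```python
import math

# Hypothetical grouped data: (class lower bound, class upper bound, frequency)
classes = [(0, 10, 2), (10, 20, 5), (20, 30, 3)]

n = sum(freq for _, _, freq in classes)
# Represent every value in a class by the class midpoint.
midpoints = [((low + high) / 2, freq) for low, high, freq in classes]

est_mean = sum(mid * freq for mid, freq in midpoints) / n
est_variance = sum(freq * (mid - est_mean) ** 2 for mid, freq in midpoints) / n
est_sd = math.sqrt(est_variance)

print(est_mean, est_variance, est_sd)  # estimates: 16.0, 49.0, 7.0
```

In an answer, these should be reported as an estimated mean of 16 and an estimated standard deviation of 7.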

Checkpoints

Find the range of 12, 9, 15, 20, 13.

Answer: \(20-9=11\).

If \(Q_1=18\) and \(Q_3=31\), find the IQR.

Answer: \(31-18=13\).

Datasets A and B have the same mean. A has a standard deviation of 2 and B has a standard deviation of 8. Which is more consistent?

Answer: A (smaller SD).

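Why is the IQR preferred over the range when outliers are present?

Answer: The IQR depends only on the middle 50% of the data, so extreme values do not distort it, whereas the range is set entirely by the two most extreme values.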
