EST. 2024 · LONDON · MMXXVI SPECIFICATION
AQA · Edexcel · OCR | Foundation + Higher
Statistics & Probability

Sheet № 68 · Foundation + Higher · AQA · Edexcel · OCR

Scatter Graphs and Correlation

Scatter graphs and correlation appear on both Foundation and Higher GCSE Maths papers across AQA, Edexcel and OCR. A scatter graph plots pairs of data to show whether a relationship exists between two variables. If one does, you need to describe the type and strength of the correlation, draw a line of best fit, and use it to make estimates.

§Key definitions

Positive correlation — as one variable increases, the other also increases. The points form a pattern rising from left to right. Example: hours of revision and test score.

Negative correlation — as one variable increases, the other decreases. The points fall from left to right. Example: temperature and number of coats sold.

No correlation — there is no clear pattern. The points are scattered randomly. Example: shoe size and IQ.

Strong — the points lie close to a straight line.

Weak — the points roughly follow a trend but are more spread out.

Interpolation — making an estimate within the range of the data. This is generally reliable.

Extrapolation — making an estimate outside the range of the data. This is unreliable because the trend may not continue.

To draw a line of best fit, position a ruler so that the line passes through the middle of the data, with roughly equal numbers of points above and below it.

Worked example

Question:

A teacher records the number of hours of revision and the test score for eight pupils.

| Hours | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
|---|---|---|---|---|---|---|---|---|
| Score | 30 | 38 | 42 | 50 | 55 | 65 | 60 | 72 |

(a) Plot a scatter graph

Working:

(a) Plot hours on the horizontal axis (0–10) and score on the vertical axis (0–80). Plot each pair as a cross.

(b) The points rise from left to right. There is a strong positive correlation — as hours of revision increase, test scores tend to increase.

(c) Draw a straight line through the data, passing roughly through (2, 30) and (9, 72). The line should have approximately equal numbers of points above and below it.

(d) From 5.5 on the horizontal axis, draw up to the line, then across to the vertical axis. The estimate is approximately 52 marks. This is interpolation because 5.5 is within the range of the data.
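If you want to check the worked example numerically, the sketch below is one way to do it. It is not part of the exam method, which draws the line by eye. It uses the table data, the two points (2, 30) and (9, 72) named in part (c), and two tools the sheet itself does not use: a least-squares line and Pearson's correlation coefficient r. The function names are illustrative choices.

```python
# Data copied from the worked example table.
hours = [2, 3, 4, 5, 6, 7, 8, 9]
scores = [30, 38, 42, 50, 55, 65, 60, 72]


def mean(values):
    return sum(values) / len(values)


def least_squares_line(xs, ys):
    """Return (gradient, intercept) of the least-squares line y = m*x + c."""
    x_bar, y_bar = mean(xs), mean(ys)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    gradient = s_xy / s_xx
    return gradient, y_bar - gradient * x_bar


def correlation_coefficient(xs, ys):
    """Pearson's r: near +1 strong positive, near -1 strong negative, near 0 none."""
    x_bar, y_bar = mean(xs), mean(ys)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    return s_xy / (s_xx * s_yy) ** 0.5


def line_through(p, q):
    """Return (gradient, intercept) of the straight line through points p and q."""
    (x1, y1), (x2, y2) = p, q
    gradient = (y2 - y1) / (x2 - x1)
    return gradient, y1 - gradient * x1


fit_m, fit_c = least_squares_line(hours, scores)
eye_m, eye_c = line_through((2, 30), (9, 72))  # the by-eye line from part (c)
r = correlation_coefficient(hours, scores)

fit_estimate = fit_m * 5.5 + fit_c
eye_estimate = eye_m * 5.5 + eye_c

print(f"least-squares line: score = {fit_m:.2f} * hours + {fit_c:.2f}")
print(f"by-eye line:        score = {eye_m:.2f} * hours + {eye_c:.2f}")
print(f"estimates at 5.5 hours: {fit_estimate:.1f} and {eye_estimate:.1f}")
print(f"r = {r:.3f}")
```

On this data the two estimates come out at about 51.5 and 51 marks, both within reading accuracy of the 52 quoted in part (d). The value of r is about 0.98, which fits the description "strong positive correlation".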

Common mistakes

  • Joining the points — a scatter graph uses individual crosses, not a joined-up line. Only the line of best fit is drawn as a line.
  • Forcing the line through the origin — the line of best fit should follow the data, not necessarily pass through (0, 0).
  • Describing correlation without context — saying "positive correlation" is not enough at Higher level. Relate it to the variables: "As temperature increases, ice cream sales increase."
  • Confusing correlation with causation — a correlation does not prove that one variable causes the other. There may be a third factor.
  • Letting an outlier distort the line — if one point clearly does not fit the pattern, leave it out when drawing your line of best fit and mention it as an outlier.
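To see why forcing the line through the origin is a mistake, the sketch below compares two straight lines fitted to the worked example data: one free to follow the data and one forced through (0, 0). Both are least-squares fits, a technique assumed here for illustration rather than taken from the sheet, and the helper names are hypothetical.

```python
# Data copied from the worked example table.
hours = [2, 3, 4, 5, 6, 7, 8, 9]
scores = [30, 38, 42, 50, 55, 65, 60, 72]


def free_line(xs, ys):
    """Least-squares line y = m*x + c with no restriction on the intercept."""
    x_bar = sum(xs) / len(xs)
    y_bar = sum(ys) / len(ys)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    gradient = s_xy / s_xx
    return gradient, y_bar - gradient * x_bar


def origin_line(xs, ys):
    """Least-squares line forced through (0, 0), so the intercept is 0."""
    gradient = sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)
    return gradient, 0.0


def total_squared_error(xs, ys, gradient, intercept):
    """Sum of squared vertical distances from the points to the line."""
    return sum((y - (gradient * x + intercept)) ** 2 for x, y in zip(xs, ys))


free_m, free_c = free_line(hours, scores)
origin_m, origin_c = origin_line(hours, scores)

free_error = total_squared_error(hours, scores, free_m, free_c)
origin_error = total_squared_error(hours, scores, origin_m, origin_c)

print(f"free line:   gradient {free_m:.2f}, intercept {free_c:.2f}, error {free_error:.0f}")
print(f"origin line: gradient {origin_m:.2f}, intercept {origin_c:.2f}, error {origin_error:.0f}")
```

The line forced through the origin is steeper and sits much further from the points overall (a total squared error of roughly 547 against roughly 64). On a graph, this shows up as a line with most low-hours points above it and most high-hours points below it.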

Exam tips

  • Use a sharp pencil and ruler — accurately plotted points and a ruled line of best fit are what the marks are awarded for.
  • Label your axes — include units. Missing labels lose marks.
  • Show your reading lines — when using the line of best fit, draw dotted guide lines from the axis to the line and across. The examiner needs to see how you obtained your estimate.
  • State interpolation or extrapolation — Higher-tier questions often ask you to comment on reliability. If the value is inside the data range, it is interpolation (reliable). If outside, it is extrapolation (unreliable).
  • Correlation ≠ causation — if the question asks "Does this prove…?", the answer is no. Correlation shows a relationship, not a cause.
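The interpolation or extrapolation decision in the tips above comes down to one comparison against the smallest and largest data values. Here is a minimal sketch using the hours from the worked example and the by-eye line through (2, 30) and (9, 72). The function names and the 12-hour test value are hypothetical, chosen only to illustrate a value outside the data range.

```python
# Hours from the worked example table.
hours = [2, 3, 4, 5, 6, 7, 8, 9]


def classify_estimate(x, xs):
    """Return 'interpolation' if x lies within the range of xs, else 'extrapolation'."""
    return "interpolation" if min(xs) <= x <= max(xs) else "extrapolation"


def estimate_score(h, gradient=6.0, intercept=18.0):
    """Read a score off the by-eye line through (2, 30) and (9, 72)."""
    return gradient * h + intercept


# 5.5 comes from part (d); 12 is a made-up value outside the data range.
for h in (5.5, 12):
    print(h, classify_estimate(h, hours), estimate_score(h))
```

For 12 hours the line gives 90 marks, which is off the top of the 0–80 score axis used in the worked example. That is a sign the trend may not continue so far beyond the data.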
MMXXVI specification · AQA · Edexcel · OCR · gcsemathsai.co.uk/formulas/scatter-graphs-and-correlation