Scatter graphs and correlation appear on both Foundation and Higher GCSE Maths papers across AQA, Edexcel and OCR. A scatter graph plots pairs of data to show whether a relationship exists between two variables — and if it does, you need to describe the type and strength of the correlation, draw a line of best fit, and use it to make estimates. These are accessible marks when you know what to look for, but careless plotting or vague descriptions can cost you. This guide covers all the key skills, gives worked examples at both tiers, highlights common errors, and provides practice questions. For broader revision planning, visit our complete GCSE Maths topics list.
What Is a Scatter Graph?
A scatter graph (also called a scatter diagram or scatter plot) displays two variables on a pair of axes. Each data point is plotted as a cross or dot. The pattern formed by the points reveals whether a correlation (relationship) exists between the variables.
Types of Correlation
- Positive correlation — as one variable increases, the other also increases. The points form a pattern rising from left to right. Example: hours of revision and test score.
- Negative correlation — as one variable increases, the other decreases. The points fall from left to right. Example: temperature and number of coats sold.
- No correlation — there is no clear pattern. The points are scattered randomly. Example: shoe size and IQ.
Strength of Correlation
- Strong — the points lie close to a straight line.
- Weak — the points roughly follow a trend but are more spread out.
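Beyond GCSE Maths, strength is often summarised by a single number, the Pearson correlation coefficient r, which lies between -1 and 1. It is not needed for the descriptions above, but a rough Python sketch may help show what "close to a straight line" means numerically. The data are the revision figures from Worked Example 1; the function names and the 0.7 and 0.2 cut-offs are illustrative assumptions, not exam-board rules.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient for paired data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of products of deviations from the two means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def describe(r, strong_cutoff=0.7, none_cutoff=0.2):
    """Turn r into a GCSE-style phrase (both cut-offs are assumptions)."""
    if abs(r) < none_cutoff:
        return "no correlation"
    strength = "strong" if abs(r) >= strong_cutoff else "weak"
    direction = "positive" if r > 0 else "negative"
    return f"{strength} {direction} correlation"

hours = [2, 3, 4, 5, 6, 7, 8, 9]
scores = [30, 38, 42, 50, 55, 65, 60, 72]
r = pearson_r(hours, scores)
print(round(r, 2), describe(r))
```

A value of r near 1 or -1 corresponds to points lying close to a straight line; a value near 0 corresponds to no clear pattern.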
Line of Best Fit
A line of best fit is a straight line drawn through the data that best represents the overall trend. It should:
- Follow the overall trend, lying as close as possible to the points as a whole (it does not need to pass through any individual point).
- Have roughly equal numbers of points above and below the line.
- Pass through the mean point (x̄, ȳ) if you have calculated it.
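The mean point is simply the mean of the x-values paired with the mean of the y-values. As a small illustration in Python, using the revision data from Worked Example 1 (the function name `mean_point` is made up for this sketch):

```python
def mean_point(xs, ys):
    """Return the mean point (x-bar, y-bar) of paired data."""
    return sum(xs) / len(xs), sum(ys) / len(ys)

hours = [2, 3, 4, 5, 6, 7, 8, 9]
scores = [30, 38, 42, 50, 55, 65, 60, 72]

x_bar, y_bar = mean_point(hours, scores)
# A line of best fit for this data should pass through this point
print(f"Mean point: ({x_bar}, {y_bar})")  # Mean point: (5.5, 51.5)
```

This matches the mean values quoted in Worked Example 2.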
Interpolation and Extrapolation
- Interpolation — making an estimate within the range of the data. This is generally reliable.
- Extrapolation — making an estimate outside the range of the data. This is unreliable because the trend may not continue.
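The distinction comes down to comparing the value you are estimating from with the smallest and largest values in the data. A minimal Python sketch of that check, with a hypothetical helper name and the hours from Worked Example 1:

```python
def estimate_type(x, data_xs):
    """Classify an estimate made at x as interpolation or extrapolation."""
    if min(data_xs) <= x <= max(data_xs):
        return "interpolation"  # inside the range of the data
    return "extrapolation"      # outside the range of the data

hours = [2, 3, 4, 5, 6, 7, 8, 9]
print(estimate_type(5.5, hours))  # interpolation
print(estimate_type(12, hours))   # extrapolation
```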
Step-by-Step Method
Plotting a Scatter Graph
- Draw and label the horizontal and vertical axes with the two variables and their units.
- Choose sensible scales that use most of the available space.
- Plot each pair of values as a small cross (×).
- Do not join the crosses with lines.
Describing Correlation
- Look at the overall pattern. Do the points rise, fall, or show no trend?
- State the type: positive, negative or no correlation.
- State the strength if the question asks for it: strong or weak.
- If possible, put it in context: "There is a strong positive correlation between hours of study and test score, suggesting that students who study more tend to achieve higher marks."
Drawing a Line of Best Fit
- Identify the general trend.
- Position a ruler so that the line passes through the middle of the data with roughly equal points either side.
- Draw the line with a pencil and ruler — it does not have to pass through the origin.
- If given the mean values, ensure the line passes through (x̄, ȳ).
Using the Line of Best Fit
- Find the known value on the appropriate axis.
- Use a ruler to draw a straight line from that value, at right angles to the axis, until it meets the line of best fit.
- Read the corresponding value from the other axis.
- State whether your estimate is interpolation or extrapolation.
Worked Example 1 — Foundation Level
Question: A teacher records the number of hours of revision and the test score for eight pupils.
| Hours of revision | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
|---|---|---|---|---|---|---|---|---|
| Test score (marks) | 30 | 38 | 42 | 50 | 55 | 65 | 60 | 72 |
(a) Plot a scatter graph. (b) Describe the correlation. (c) Draw a line of best fit. (d) Use your line to estimate the score for a pupil who revises for 5.5 hours.
Working:
(a) Plot hours on the horizontal axis (0–10) and score on the vertical axis (0–80). Plot each pair as a cross.
(b) The points rise from left to right. There is a strong positive correlation — as hours of revision increase, test scores tend to increase.
(c) Draw a straight line through the data, passing roughly through (2, 30) and (9, 72). The line should have approximately equal points above and below.
(d) From 5.5 on the horizontal axis, draw up to the line, then across to the vertical axis. The estimate is approximately 52 marks. This is interpolation because 5.5 is within the range of the data.
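As a cross-check on part (d), the reading can be compared with a calculation. The Python sketch below assumes the line passes exactly through (2, 30) and (9, 72), the two points suggested in part (c); a line drawn by eye will differ slightly, so the calculated value will not match a graph reading exactly. The function name is made up for this example.

```python
def line_through(p1, p2):
    """Return (gradient, intercept) of the straight line through two points."""
    (x1, y1), (x2, y2) = p1, p2
    gradient = (y2 - y1) / (x2 - x1)
    intercept = y1 - gradient * x1
    return gradient, intercept

# Assumed points on the line of best fit, taken from part (c)
m, c = line_through((2, 30), (9, 72))
estimate = m * 5.5 + c

print(f"score = {m:g} * hours + {c:g}")        # score = 6 * hours + 18
print(f"Estimate at 5.5 hours: {estimate:g}")  # Estimate at 5.5 hours: 51
```

The calculated 51 is close to the 52 read from the graph, which is the kind of small difference a hand-drawn line produces.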
Worked Example 2 — Higher Level
Question: This question uses the data and scatter graph from Worked Example 1. The mean number of hours is x̄ = 5.5 and the mean score is ȳ = 51.5. (a) Plot the mean point on your scatter graph. (b) Draw a line of best fit through the mean point. (c) Estimate the score for a pupil who revises for 12 hours and comment on the reliability of your estimate.
Working:
(a) Plot (5.5, 51.5) on the scatter graph. Mark it clearly — for example, with a circle.
(b) Draw a line of best fit that passes through (5.5, 51.5) and follows the general trend of the data.
(c) Extending the line to 12 hours gives an estimated score of approximately 90 marks. However, this estimate is unreliable because 12 hours is outside the range of the data (extrapolation). The trend may not continue beyond the data collected, and scores are likely capped at a maximum mark.
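The figure of roughly 90 in part (c) can be sanity-checked with a short calculation. The working below assumes, purely for illustration, that the line through the mean point has a gradient of about 6 marks per hour, which is the gradient of a line through (2, 30) and (9, 72) from Worked Example 1. A line drawn by eye will have a slightly different gradient.

```latex
\[
y - \bar{y} = m\,(x - \bar{x})
\quad\Rightarrow\quad
y \approx 51.5 + 6\,(12 - 5.5) = 51.5 + 39 = 90.5
\]
```

This agrees with an estimate of about 90 marks, but the calculation is no more reliable than the graph reading: it still relies on the trend continuing beyond 9 hours.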
Common Mistakes
- Joining the points — a scatter graph uses individual crosses, not a joined-up line. Only the line of best fit is drawn as a line.
- Forcing the line through the origin — the line of best fit should follow the data, not necessarily pass through (0, 0).
- Describing correlation without context — saying "positive correlation" is not enough at Higher level. Relate it to the variables: "As temperature increases, ice cream sales increase."
- Confusing correlation with causation — a correlation does not prove that one variable causes the other. There may be a third factor.
- Letting an outlier distort the line: if one point clearly does not fit the pattern, do not let it pull your line of best fit towards it. Draw the line through the remaining points and identify the odd point as an outlier.
Exam Tips
- Use a sharp pencil and ruler: plotted points are usually marked to within a small tolerance, so neat, accurate crosses and a ruled line of best fit protect your marks.
- Label your axes — include units. Missing labels lose marks.
- Show your reading lines — when using the line of best fit, draw dotted guide lines from the axis to the line and across. The examiner needs to see how you obtained your estimate.
- State interpolation or extrapolation — Higher-tier questions often ask you to comment on reliability. If the value is inside the data range, it is interpolation (reliable). If outside, it is extrapolation (unreliable).
- Correlation ≠ causation — if the question asks "Does this prove…?", the answer is no. Correlation shows a relationship, not a cause.
- For more on related averages, see mean, median, mode and range. For key formulas, visit our GCSE Maths formulas list.
Practice Questions
Question 1 (Foundation): A scatter graph shows a pattern of points falling from left to right. Describe the correlation.
Question 2 (Foundation): A scatter graph shows no clear pattern. Can you draw a line of best fit?
Question 3 (Higher): A line of best fit on a scatter graph passes through (10, 25) and (30, 65). Estimate the value of y when x = 20.
Question 4 (Higher): A student says: "The scatter graph shows a strong positive correlation between ice cream sales and sunburn cases, so eating ice cream causes sunburn." Is the student correct? Explain.
Related Topics
- Mean, Median, Mode and Range
- Cumulative Frequency and Box Plots
- Sampling Methods
- Probability Basics and Relative Frequency
Summary
Scatter graphs plot pairs of data to show relationships between two variables. You need to identify the type of correlation — positive (points rise), negative (points fall) or none (no pattern) — and describe its strength. A line of best fit follows the trend and passes through the mean point when given. Use the line to make estimates: interpolation (within the data range) is reliable; extrapolation (outside the data range) is not. Always remember that correlation does not prove causation. In the exam, plot neatly, label axes, show reading lines, and relate your answers to the context of the question.