Correlation & Regression

Scatter diagrams, correlation coefficients, regression lines, interpolation and extrapolation.

Statistics AS 45 min

Learning Objectives

  • Draw and interpret scatter diagrams for bivariate data
  • Understand and interpret the product moment correlation coefficient (PMCC)
  • Calculate and interpret the equation of a least squares regression line
  • Distinguish between interpolation and extrapolation, and understand their reliability
  • Use coding to simplify regression and correlation calculations
  • Understand the limitations of correlation and regression analysis
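As a rough illustration of the coding objective, the sketch below checks numerically that the PMCC is unchanged when both variables are coded linearly with positive scale factors. The data values, the coding u = (x − 1000)/10 and v = 10y, and the function name `pmcc` are all assumptions chosen for the example, not part of the lesson.

```python
from math import sqrt, isclose

def pmcc(xs, ys):
    """Product moment correlation coefficient from the summation formulae."""
    n = len(xs)
    s_xx = sum(x * x for x in xs) - sum(xs) ** 2 / n
    s_yy = sum(y * y for y in ys) - sum(ys) ** 2 / n
    s_xy = sum(x * y for x, y in zip(xs, ys)) - sum(xs) * sum(ys) / n
    return s_xy / sqrt(s_xx * s_yy)

# Hypothetical raw data with awkwardly large x-values and decimal y-values
x = [1010, 1020, 1030, 1040, 1050]
y = [2.1, 2.4, 2.9, 3.1, 3.6]

# Assumed coding: u = (x - 1000) / 10 and v = 10 * y (both scale factors positive)
u = [(xi - 1000) / 10 for xi in x]
v = [10 * yi for yi in y]

r_raw = pmcc(x, y)
r_coded = pmcc(u, v)

# The two values agree up to floating-point rounding
print(isclose(r_raw, r_coded, rel_tol=1e-9))
```

The coded values are much easier to work with by hand, and r can be read off directly from them. A negative scale factor on one variable would reverse the sign of r, which is why both factors here are positive.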

Key Formulae

$r = \frac{S_{xy}}{\sqrt{S_{xx} S_{yy}}}$
$S_{xy} = \sum xy - \frac{\sum x \sum y}{n}$
$S_{xx} = \sum x^2 - \frac{(\sum x)^2}{n}$
$b = \frac{S_{xy}}{S_{xx}}, \quad a = \bar{y} - b\bar{x}$
Regression line: $y = a + bx$
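The formulae above can be sketched in a few lines of Python. This is a minimal illustration rather than a prescribed method: the data set (hours of revision against exam score) is made up, the function names are assumptions, and S_yy is taken to be defined in the same way as S_xx with y in place of x.

```python
from math import sqrt

def summary_stats(xs, ys):
    """Return (S_xx, S_yy, S_xy) using the summation formulae."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    s_xx = sum(x * x for x in xs) - sum_x ** 2 / n
    s_yy = sum(y * y for y in ys) - sum_y ** 2 / n  # assumed analogous to S_xx
    s_xy = sum(x * y for x, y in zip(xs, ys)) - sum_x * sum_y / n
    return s_xx, s_yy, s_xy

def pmcc(xs, ys):
    """Product moment correlation coefficient r."""
    s_xx, s_yy, s_xy = summary_stats(xs, ys)
    return s_xy / sqrt(s_xx * s_yy)

def regression_line(xs, ys):
    """Return (a, b) for the regression line of y on x, y = a + bx."""
    n = len(xs)
    s_xx, _, s_xy = summary_stats(xs, ys)
    b = s_xy / s_xx
    a = sum(ys) / n - b * sum(xs) / n  # a = y-bar - b * x-bar
    return a, b

# Hypothetical data: hours of revision (x) and exam score (y)
hours = [1, 2, 3, 4, 5]
scores = [52, 55, 61, 64, 68]

r = pmcc(hours, scores)
a, b = regression_line(hours, scores)
print(f"r = {r:.3f}")               # r = 0.994
print(f"y = {a:.2f} + {b:.2f}x")    # y = 47.70 + 4.10x
```

For this made-up data set, S_xx = 10, S_xy = 41 and S_yy = 170, so b = 41/10 = 4.1 and r = 41/√1700 ≈ 0.994, a strong positive linear correlation.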

Prior Knowledge Check

Answer all 3 questions correctly to complete this section.

Q1. What does it mean if two variables have positive correlation?
Q2. What is interpolation?
Q3. If a scatter diagram shows points closely following a downward line from left to right, which best describes the correlation?

Why This Matters

When we collect data on two variables — such as hours of revision and exam scores, or temperature and ice cream sales — we want to know: is there a relationship, and can we use it to make predictions?

Correlation measures the strength and direction of a linear relationship. Regression gives us an equation to predict one variable from the other. Together, they are among the most widely used tools in data analysis, from medical research to economics.

Scatter Diagrams and Correlation

Least Squares Regression

Exam Practice

Exam Tips

  • The regression line of y on x is used to predict y from x — not the other way round
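  • When interpreting the gradient, refer to the context: 'For each 1-unit increase in x, y changes by b units on average' (an increase if b > 0, a decrease if b < 0)
  • Always state whether a prediction involves interpolation (within the range of the observed x-values, so generally reliable) or extrapolation (outside that range, so potentially unreliable)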
  • Correlation does not imply causation — always consider lurking variables
  • The PMCC r is always between −1 and +1; values close to ±1 indicate strong linear correlation
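The interpolation and extrapolation tip can be made concrete with a small sketch. The helper `predict`, the coefficients a = 47.7 and b = 4.1, and the observed x-values are all hypothetical, chosen only to show how a prediction might be labelled before it is trusted.

```python
def predict(a, b, x_new, x_observed):
    """Predict y from y = a + bx and label the prediction type."""
    y_hat = a + b * x_new
    # Interpolation: x_new lies within the range of the observed x-values
    inside = min(x_observed) <= x_new <= max(x_observed)
    kind = "interpolation" if inside else "extrapolation"
    return y_hat, kind

# Hypothetical regression line of y on x and the x-values it was fitted to
a, b = 47.7, 4.1
hours = [1, 2, 3, 4, 5]

y_in, kind_in = predict(a, b, 3.5, hours)    # 3.5 is inside 1..5
y_out, kind_out = predict(a, b, 10, hours)   # 10 is outside 1..5

print(kind_in)    # interpolation
print(kind_out)   # extrapolation
```

The prediction at x = 10 is still a number, but nothing in the data shows that the linear pattern continues that far, so it should be treated with caution.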

Specification

Edexcel A Level Maths
Statistics > Correlation & Regression
WJEC A Level Maths
Statistics > Data Presentation & Interpretation