You have a set of data points (x1,y1), (x2,y2), ..., (xn,yn), and you have assumed a straight-line model, y = mx + b + e, where e is random error.
You have fit the regression model to obtain estimates of the slope, m, and the intercept, b. To keep the estimates distinct from the true values, call them m̂ and b̂.
Now you can calculate yi - m̂xi - b̂ for i = 1, 2, ..., n. Notice that, for each i, this is an estimate of the error in yi. It's called the residual because it's what's 'left over' in yi after removing the part 'explained' by the regression.
Another way of understanding this is to take a set of linearly related (x,y) pairs, graph them, calculate the regression line, plot it on the same graph, and then measure the vertical distance between the regression line and each of the points. Those vertical distances, taken as positive for points above the line and negative for points below it, are the residuals.
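As a rough illustration of the calculation described above, here is a minimal Python sketch. The data values are made up for the example, and the names m_hat, b_hat and residuals are just labels chosen here; NumPy's polyfit is used as one convenient way to get the least-squares line, not the only one.

```python
import numpy as np

# Hypothetical data points, invented purely for illustration.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])

# Least-squares estimates of the slope and intercept.
# For degree 1, polyfit returns the coefficients as [slope, intercept].
m_hat, b_hat = np.polyfit(x, y, deg=1)

# Residual for each point: observed y minus the value on the fitted line.
residuals = y - (m_hat * x + b_hat)

print("slope estimate:", m_hat)
print("intercept estimate:", b_hat)
print("residuals:", residuals)
```

When the fitted line includes an intercept, the least-squares residuals add up to zero (apart from rounding), which is a quick sanity check on the calculation.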
One of the main reasons for examining the residuals is to check whether the assumption that the errors are independent and identically distributed holds. If it does not, then simple linear regression is not an appropriate model.
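To see what a failed check can look like, here is a small Python sketch, assuming deliberately curved data (y = x squared) that a straight line cannot describe. The data and variable names are invented for the example. The residuals come out in a clear U shape rather than scattered at random, which is the kind of pattern that signals the straight-line model is a poor choice.

```python
import numpy as np

# Hypothetical data that follow a curve, not a line: y = x^2.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = x ** 2

# Fit a straight line anyway.
m_hat, b_hat = np.polyfit(x, y, deg=1)

# Residuals from the straight-line fit.
residuals = y - (m_hat * x + b_hat)

# The residuals are approximately 2, -1, -2, -1, 2:
# positive at both ends and negative in the middle,
# a systematic pattern rather than random scatter.
print(np.round(residuals, 6))
```

In practice the same check is usually done by eye, by plotting the residuals against x or against the fitted values.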
a random pattern
Linear regression is used in statistics to model the relationship between a scalar dependent variable and one or more explanatory variables. It has applications in finance, economics and environmental science.
I want to develop a regression model for predicting YardsAllowed as a function of Takeaways, and I need to explain the statistical significance of the model.
The value depends on the slope of the line.