EME 210
Data Analytics for Energy Systems

Interpreting the Output from Multiple Linear Regression



Read It: Interpreting Output from Multiple Linear Regression

After implementing multiple linear regression in Python, the primary source of interpretation will be the output of the model summary, an example of which is shown below.

[Figure: Example output of implementing multiple linear regression in Python. This output is critical for interpreting your multiple linear regression model.]
Credit: © Penn State is licensed under CC BY-NC-SA 4.0

There are three key areas to focus on in this output. The first is the F-statistic and its associated p-value. As in Lesson 8, these values can be used to assess overall model effectiveness with the F-statistic hypothesis test. For multiple linear regression, however, the alternative hypothesis is slightly different, as shown below.

F-statistic and p-value for the whole model:

H₀: the model is ineffective

Hₐ: at least one predictor is effective

Notice how the alternative hypothesis tells us that at least one explanatory variable (predictor) is effective, rather than focusing on the model as a whole. To figure out which predictors are effective, you need to look at the lower half of the output, where the coefficients, t-statistics, and associated p-values are listed. These p-values tell you how significant a given explanatory variable is, with values < 0.05 (or your chosen significance level) indicating significance. You can also use these p-values to test the significance of each coefficient using the hypothesis tests below; notice that they are similar to those you learned in Lesson 8 for the hypothesis test for slope.

t-statistic and p-value for individual predictors:

H₀: βᵢ = 0

Hₐ: βᵢ ≠ 0

Finally, the last piece of critical information in the model summary is the Adjusted R² value. This is the value you will use to determine the "goodness of fit" for any multiple linear regression model. In particular, the Adjusted R² value accounts for model complexity, as well as the difference between the predicted and actual values. In this sense, adding explanatory variables that don't contribute to model accuracy can actually reduce your Adjusted R². Mathematically, the Adjusted R² is given below. You may notice in the above output that the Adjusted R² and regular R² appear the same (to the reported precision); this happens when there are no insignificant explanatory variables in your model. That being said, it is much more common to have a regular R² that is greater than your Adjusted R².

R²adj = 1 − (1 − R²) × (n − 1)/(n − p)

where n is the number of observations and p is the number of fitted parameters (including the intercept).


 Assess It: Check Your Knowledge

Knowledge Check