Hello, and welcome back to another video in our linear regression lesson. In this video, I'm going to talk about how we can conduct a hypothesis test to test for the significance of our slope value. And so, this is another great way to add sort of a statistical inference aspect to your linear regression analysis. So you can do correlation, you can look at slope, and later we'll get into additional tests that you can do.
So the first thing that we do with our hypothesis test is state our hypotheses, and I have them stated here. Our h naught is beta 1 equals zero. Like always, it's got an equal sign, and in this case we're always assuming that the slope is zero. Then we have our alternative, where beta 1 is not equal to zero. Now alternatively, you could have beta 1 less than zero, or beta 1 greater than zero. But here I'm going to do the two-sided alternative, where we say that beta 1 is simply not equal to zero.

Now, unlike some of the previous hypothesis testing that we have done, there's no randomization procedure for the slope. Instead, we're going to use the linregress function that we used for our one-line linear regression in a previous video. So we say stats dot linregress, and we give one variable, then the next variable, and then we state our alternative. This time we're doing two-sided. Alternatively, we could do greater or less, as long as it matches our alternative hypothesis. Then we can print the output. And this output is what we have seen before: slope, intercept, r-value, p-value, etc.

When we're trying to figure out what conclusion to draw from our hypothesis test, we need the p-value, which is given here. We can print that in particular by saying output dot pvalue. We can do it in a print statement like we have before, so print 'p-value', then output dot pvalue, and run that. And so, we get this p-value of 0.001519, which is below the standard significance level of 0.05. Therefore, we reject the null hypothesis in favor of the alternative: the slope is statistically significantly different from zero. And this is interesting, because if you look up here, our slope is barely above zero, 0.313.
But that is enough for it to be statistically significantly different from zero, because these values are so strongly correlated that even a small slope is significantly different from no slope. And so, this is an example of when you might see that small value and think that it can't be that different from zero. But in reality, because of units and because of the correlation between these two variables, it is very much statistically significantly different from zero.
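The steps described in this video can be sketched in Python roughly as follows. This is only a minimal sketch: the data set used in the video is not part of the transcript, so the `x` and `y` arrays below are made-up illustration data, and the names `alpha` and `reject_null` are hypothetical. It also assumes a SciPy version in which `stats.linregress` accepts the `alternative` argument.

```python
import numpy as np
from scipy import stats

# Made-up illustration data (NOT the data from the video):
# a small positive slope plus some random noise.
rng = np.random.default_rng(0)
x = np.linspace(0, 10, 30)
y = 0.3 * x + rng.normal(scale=0.5, size=x.size)

# H0: beta_1 = 0, Ha: beta_1 != 0, so we ask for the two-sided test.
# For a one-sided alternative, use alternative='less' or 'greater'.
output = stats.linregress(x, y, alternative='two-sided')

# The full result holds slope, intercept, rvalue, pvalue, stderr, etc.
print(output)

# Pull out just the p-value for the test on the slope.
print('p-value:', output.pvalue)

# Compare against the significance level to draw a conclusion.
alpha = 0.05  # hypothetical name for the significance level
reject_null = output.pvalue < alpha
print('Reject the null hypothesis:', reject_null)
```

With real data you would pass your own two variables in place of `x` and `y`, and pick the `alternative` value that matches the alternative hypothesis you stated.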