CHAPTER 06.03: LINEAR REGRESSION: Linear Regression with Zero Intercept: Derivation
In this segment, we're going to talk about linear regression, but for the special case where the model has no intercept. What we are trying to do here is this: given the data points (x1, y1), all the way up to (xn, yn), somebody is telling you to best fit y is equal to a1 x, and that's it. There is no a0 quantity in your regression model, which means that the regression line goes right through the origin, because when x is equal to 0, y is going to be 0 as well.
A good example of this is stress versus strain data. Let's suppose you're taking stress versus strain data for a piece of steel; then you may have to regress it to sigma is equal to E epsilon. There's no intercept in this straight line; this is the special case of a straight line where the intercept is 0. With stress on the vertical axis and strain on the horizontal axis, the slope of whatever regression line you draw will simply be the Young's modulus. So this is a practical application of why we would want to use a straight-line regression model without the intercept.
Now, let's go back to something we did previously: given (x1, y1), all the way up to (xn, yn), somebody says, hey, best fit y is equal to a0, plus a1 x. When we had the intercept, we developed these formulas. We got a1 equal to n times summation, xi yi, i is equal to 1 to n, minus summation of xi times summation of yi, divided by n times summation, xi squared, i is equal to 1 to n, minus the square of summation, xi, i is equal to 1 to n. And then a0 we got as y-bar, minus a1 x-bar.
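Written out, the formulas recapped above for the model with an intercept look like this. This is only a transcription of the spoken formulas into LaTeX; the overbars stand for the means of the x and y data, following the spoken "x-bar" and "y-bar".

```latex
% Model WITH an intercept, fitted to (x_1, y_1), ..., (x_n, y_n)
\begin{align*}
y   &= a_0 + a_1 x \\
a_1 &= \frac{n \sum_{i=1}^{n} x_i y_i - \sum_{i=1}^{n} x_i \sum_{i=1}^{n} y_i}
            {n \sum_{i=1}^{n} x_i^2 - \left( \sum_{i=1}^{n} x_i \right)^2} \\
a_0 &= \bar{y} - a_1 \bar{x}
\end{align*}
```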
So this is what we obtained when we had y is equal to a0, plus a1 x; I'm just recapitulating what we have done. This is what we got for a1, and this is what we got for a0. Now, somebody might suggest: since a0 is equal to 0 for our current model, in which we have no intercept, can I simply use the above expression for a1 and say that, hey, this is my value of a1 for the regression model y is equal to a1 x? And somebody else might say, no, why don't you just go ahead and take the other one? Since a0 is equal to 0, 0 is equal to y-bar, minus a1 x-bar, so this looks nice: a1 is then y-bar divided by x-bar.
But what is wrong with this? You cannot choose either of these expressions as the value of a1 for your y is equal to a1 x regression model. The reason is that a0 and a1 for the linear regression model which had both an intercept and a slope were found by minimizing the sum of the square of the residuals with respect to both a0 and a1. That's how you got these two expressions. But in this simpler linear regression model with no intercept, a0 is fixed at 0, and you cannot minimize with respect to something which is already fixed. You can only minimize with respect to a1, so the formulas given here no longer apply. That's why using these formulas is not valid, so I'm going to put a big cross right here.
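To see numerically why neither shortcut works, here is a minimal sketch in Python. The four data points are made up for illustration and are not from the lecture, and the names sum_sq_residuals, a1_slope, a1_ratio, and a1_scan are hypothetical. It compares the sum of the square of the residuals for the two shortcut values of a1 against a brute-force scan over a grid of a1 values.

```python
# Hypothetical data, made up for illustration only (not from the lecture).
x = [1.0, 2.0, 3.0, 4.0]
y = [3.0, 4.0, 6.0, 7.0]
n = len(x)


def sum_sq_residuals(a1):
    """Sum of the square of the residuals for the model y = a1 * x."""
    return sum((yi - a1 * xi) ** 2 for xi, yi in zip(x, y))


sum_x = sum(x)
sum_y = sum(y)
sum_xy = sum(xi * yi for xi, yi in zip(x, y))
sum_x2 = sum(xi ** 2 for xi in x)

# Shortcut 1: the slope formula from the model WITH an intercept.
a1_slope = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)

# Shortcut 2: forcing a0 = 0 in a0 = y_bar - a1 * x_bar.
a1_ratio = (sum_y / n) / (sum_x / n)

# Reference: brute-force scan of a1 from 0 to 4 in steps of 0.001,
# keeping the grid value with the smallest sum of squared residuals.
grid = [i * 0.001 for i in range(4001)]
a1_scan = min(grid, key=sum_sq_residuals)

for label, a1 in [("slope formula", a1_slope),
                  ("y_bar / x_bar", a1_ratio),
                  ("grid scan", a1_scan)]:
    print(f"{label:>14}: a1 = {a1:.3f}, Sr = {sum_sq_residuals(a1):.3f}")
```

On this made-up data, both shortcut values give a larger sum of the square of the residuals than the scanned value, so neither one is the minimizer for the no-intercept model.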
So both of these are wrong formulas to use if your model is y is equal to a1 x. So the question arises: how will I find a1? Well, you go back to the drawing board by calculating the sum of the square of the residuals, and finding out what a1 is from there. The residual at each point is the observed value, which is yi, minus the predicted value, which is a1 times xi, because the value of x at that particular point is xi. So that is the residual. Now I square the residual, and then I sum these squared residuals over all the data points, i is equal to 1 to n, and that's what my Sr is. So for this model, Sr is equal to summation, i is equal to 1 to n, of yi minus a1 xi, squared, just like I had for the regular linear regression model, except that there is no a0 term.
Now, in order to find a1, all I have to do is take the derivative of Sr with respect to a1 and put that equal to 0. The derivative turns out to be summation, i is equal to 1 to n, of 2 times yi minus a1 xi, times minus xi. That is 2 times the quantity inside the parentheses, times the derivative of that quantity with respect to a1, which is minus xi, and that I have to put equal to 0. If I expand this, I get minus 2 summation, i is equal to 1 to n, xi yi, plus 2 summation, i is equal to 1 to n, a1 xi squared, is equal to 0. The 2s cancel, and then I can take the first summation to the right-hand side, because it's a known quantity, and I can take a1 outside the second summation, because it's a constant. So I get a1 times summation, i is equal to 1 to n, xi squared is equal to summation, i is equal to 1 to n, xi yi.
This particular equation gives me the value for a1, which is summation, xi yi, i is equal to 1 to n, divided by summation, xi squared, i is equal to 1 to n. So that is the way you're going to find the value of a1 for the regression model y is equal to a1 x.
Now, as part of your homework: we took the first derivative with respect to a1 and put that equal to 0, but that only tells me that we have a possible minimum or maximum. It does not tell me whether the a1 expression which we just found corresponds to a minimum or a maximum. What I would like you to do is use the second derivative test to show that the value of a1 you got corresponds to a minimum. So you'll have to take the second derivative of Sr with respect to a1, evaluate it at the value of a1 which we just obtained, and see whether that quantity is positive; if it is, that corresponds to a minimum. So that's part of your homework. And that's the end of this segment.
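The result of the derivation can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names fit_through_origin and sum_sq_residuals are hypothetical, and the stress-strain readings are made up rather than taken from the lecture or from a real test. The sketch computes a1 as summation of xi yi divided by summation of xi squared, and then checks that nudging a1 up or down by 1 percent makes Sr larger. That nudge check is a numerical sanity check only, not a substitute for the second derivative test.

```python
def fit_through_origin(x, y):
    """Return a1 for the model y = a1 * x, as sum(xi*yi) / sum(xi^2)."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    sum_x2 = sum(xi * xi for xi in x)
    if sum_x2 == 0:
        raise ValueError("at least one x value must be nonzero")
    return sum_xy / sum_x2


def sum_sq_residuals(a1, x, y):
    """Sum of the square of the residuals for the model y = a1 * x."""
    return sum((yi - a1 * xi) ** 2 for xi, yi in zip(x, y))


# Hypothetical stress-strain readings (made up; not from the lecture).
strain = [0.001, 0.002, 0.003, 0.004]   # dimensionless
stress = [205.0, 395.0, 610.0, 790.0]   # MPa

# For sigma = E * epsilon, the fitted slope a1 plays the role of E (in MPa).
youngs_modulus = fit_through_origin(strain, stress)

# Sanity check: Sr at the fitted slope versus slopes 1 percent lower and higher.
sr_best = sum_sq_residuals(youngs_modulus, strain, stress)
sr_lower = sum_sq_residuals(0.99 * youngs_modulus, strain, stress)
sr_higher = sum_sq_residuals(1.01 * youngs_modulus, strain, stress)

print(f"Fitted slope (E): {youngs_modulus:.1f} MPa")
print(f"Sr at fit: {sr_best:.2f}, at 0.99*E: {sr_lower:.2f}, at 1.01*E: {sr_higher:.2f}")
```

Plain Python lists are used here so that each summation in the formula maps directly onto one line of code.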