CHAPTER 01.03: SOURCES OF ERROR : Effect of Carrying Significant Digits

 

We're going to talk about the effect of carrying significant digits in calculations, and we'll take a pragmatic problem to do that. When people talk about carrying significant digits, they usually show you examples such as the subtraction of two close numbers, or the quadratic equation, where subtracting two close numbers gives you inaccurate results. We wanted to show you a pragmatic application of the same phenomenon.

In this example, we want to figure out the contraction in the diameter of a steel shaft. Somebody gave us this steel shaft, a trunnion, and wanted to shrink-fit it into a hub. To do that, they have to shrink the diameter by about 0.015 inches. To calculate the change in the diameter, we use this formula: the change in the diameter is the original diameter, D, multiplied by the integral of the thermal expansion coefficient, alpha, with respect to temperature. You integrate from the room temperature, which is 80 degrees Fahrenheit, to the temperature of the liquid medium in which you place the trunnion to contract it. The temperature you see here, -108 degrees Fahrenheit, is the temperature of a dry ice and alcohol mixture. To do the problem, we said, hey, let's go ahead and regress alpha as a function of temperature.
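
Written out, the formula described in words above would look roughly like this. The subscripts "room" and "fluid" are labels chosen here for the two temperatures, not notation from the slide.

```latex
\Delta D = D \int_{T_{\text{room}}}^{T_{\text{fluid}}} \alpha(T)\, dT,
\qquad T_{\text{room}} = 80\,^{\circ}\mathrm{F},
\quad T_{\text{fluid}} = -108\,^{\circ}\mathrm{F}
```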

     

We are given data for alpha versus temperature from the handbook, and we want to regress it to a second-order polynomial like this one. The reason is that once we have the second-order polynomial, we can simply substitute it in here, do the integration by using integral calculus, and find out what the change in the diameter is. So we went into Excel to find the second-order polynomial by regression. Here I'm showing you the data we were given in the handbook for the material of the shaft, which was cast steel. The data runs from -340 to 80 degrees Fahrenheit, and these are the data points.

How do we go about doing the regression? We went into an Excel spreadsheet and used the trendline command. Once you develop the scatter plot in Excel, you right-click on the data and it gives you the option of a trendline. Trendline gives you the option of regressing to a first-order polynomial, a second-order polynomial, all the way up to a sixth-order polynomial, and it also gives you the option of regressing to exponential and other nonlinear models. In this case we chose a second-order polynomial, and this is the regression curve we got, in Excel's default format. You can see that this polynomial is most probably doing a very good job of approximating the data: visually, the curve is very close to the data points themselves.
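
The substitute-and-integrate step can be sketched in a few lines of Python. This is only a sketch under assumptions: the function name `contraction` and the coefficient names `a0`, `a1`, `a2` are made up here, the polynomial is taken as alpha(T) = a0 + a1*T + a2*T^2, and the caller has to keep the units of the diameter and of alpha consistent. No handbook data or fitted coefficients are filled in.

```python
def contraction(diameter, a0, a1, a2, t_start=80.0, t_end=-108.0):
    """Return D * integral of (a0 + a1*T + a2*T**2) dT from t_start to t_end.

    t_start and t_end default to the two temperatures quoted in the lecture
    (80 and -108 degrees Fahrenheit). Everything else is supplied by the caller.
    """
    def antiderivative(t):
        # Term-by-term integral of the second-order polynomial.
        return a0 * t + a1 * t**2 / 2.0 + a2 * t**3 / 3.0

    return diameter * (antiderivative(t_end) - antiderivative(t_start))


# Sanity check with a made-up constant alpha of 1 and unit diameter:
# the integral is just the temperature span, -108 - 80 = -188.
print(contraction(1.0, 1.0, 0.0, 0.0))
```

A negative result means the diameter shrinks, which is what you expect when the shaft is cooled below room temperature.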
So we said, hey, let's go ahead and see how well it predicts the values. Let's take the data points which were given to us. For example, at -340, the data was given as 2.45, and when I took the default-format regression formula given by Excel, I got a predicted value of 2.76. Now some people might say, hey, we can dismiss the difference between the given value and the predicted value, because we're doing regression, and we don't expect the predicted values to be the same as the given values. But what you saw on the previous slide was that the curve is very close to the given values on the graph itself. So how can you account for a difference of 0.31 between the predicted and the given value? Visually, I didn't see that. So let's go ahead and see what the reason is. Why are we getting a difference of 0.31 when we plug one of the given temperatures into the formula? What we found out is that we should not have used Excel's default format for the second-order polynomial. We should have chosen the scientific format. So here I chose the scientific format with five significant digits, which is the same as asking for four digits after the decimal point in Excel's scientific format. I get the same alpha. There's no difference between the second-order polynomial which I've obtained now and the one I had before. It's just that the format in which it is written is different. Now let's see what happens when I use the scientific format to compare my given values and predicted values. For the same data point at -340, 2.45 was the given value, and using this as my second-order polynomial regression curve, I'm getting a predicted value of 2.46. There's only a difference of 0.01 between the two.
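
To see how a display format alone can produce this kind of gap, here is a small Python sketch. The coefficients are an assumption: the transcript does not quote the polynomial from the slide, so the numbers below are illustrative values picked because they reproduce the 2.46 and 2.76 quoted above. The "default" set is the same polynomial with each coefficient cut to what a short fixed-decimal display might show.

```python
# ASSUMPTION: illustrative coefficients, not quoted in the transcript.
# Order is (a2, a1, a0) for alpha(T) = a2*T**2 + a1*T + a0.
SCIENTIFIC = (-1.2360e-05, 6.2714e-03, 6.0234)  # five significant digits each
DEFAULT = (-1e-05, 0.0062, 6.0234)              # same curve, fewer digits shown


def alpha(coeffs, t):
    """Evaluate the second-order polynomial at temperature t."""
    a2, a1, a0 = coeffs
    return a2 * t**2 + a1 * t + a0


T = -340.0    # temperature of the data point discussed in the lecture
GIVEN = 2.45  # given value at that temperature, as quoted in the lecture

pred_sci = alpha(SCIENTIFIC, T)
pred_default = alpha(DEFAULT, T)

print(f"scientific format: {pred_sci:.2f}, off by {abs(pred_sci - GIVEN):.2f}")
print(f"default format:    {pred_default:.2f}, off by {abs(pred_default - GIVEN):.2f}")

# Decimal digits versus significant digits for the smallest coefficient:
four_decimals = f"{SCIENTIFIC[0]:.4f}"     # keeps no significant digits at all
five_significant = f"{SCIENTIFIC[0]:.4e}"  # keeps five significant digits
print(four_decimals, five_significant)
```

The quadratic coefficient is the small one, and it gets multiplied by the square of -340, which is 115600, so the digits dropped from it are the ones that matter most at the low-temperature end of the data.
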
So just because I chose more significant digits here, five significant digits as opposed to the default format which Excel gave me, I am finding that the difference between the predicted and the given values is smaller, as I expected from looking at the graph. In the next slide, I took both expressions: this is the scientific format one with five significant digits, and this is the default format one which Excel gave. The reason you should be cautious is that when you do your Excel regression, you're always going to get it in the default format unless you have changed that default to some other format. Many people, students included, just take the default format and run with it. You can very well see that at -340, 2.45 is the given value, 2.46 is what I predict by using the scientific format second-order polynomial, and 2.76 is what I predict by using the default format one. That's the difference between the predicted values from the two regression lines, which are the same regression line. You're just showing more significant digits in one than in the other. So it gives you a pragmatic view of why you need to carry significant digits in all of your calculations, and why you should be relying on significant digits, not decimal digits. And that's the end of this segment.