Outliers Identify any of the differences found from Exercise 1...

Question

Outliers Identify any of the differences found from Exercise 1 that appear to be outliers. For any outliers, how much of an effect do they have on the mean, median, and standard deviation?

Accepted Answer

Welcome back, everyone. In this problem, the self-reported weights of adults aged 25 and over, along with their corresponding measured weights and the differences between reported and measured weights, are provided in the table below, where all weights are in pounds. Using the differences between reported and measured weights, we want to identify any outliers and determine the effect of the outliers on the mean, median, and standard deviation.
For our answer choices: A says the outlier is 5, and it increases the mean, median, and standard deviation. B says it's 5, and it increases the mean and standard deviation while having no effect on the median. C says it's 2.5, and it increases the mean, median, and standard deviation. And D says there's no outlier; the mean, median, and standard deviation remain unchanged.
Now let's focus on the first part of our problem. Let's identify if any of our differences are outliers. How can we tell? Well, we can use the 1.5 × IQR rule. Basically, it says that if a value X is greater than Q3, the third quartile, plus 1.5 multiplied by the interquartile range, or if it's less than Q1, the first quartile, minus 1.5 multiplied by the interquartile range, then X is an outlier. So if we arrange our data to find the first and third quartiles, and then use this idea to find all of the values that satisfy these conditions, we'll know which values are outliers. Then we can work with those values to figure out the effect on the mean, median, and standard deviation.
So first we need to arrange our data. Let's arrange it in ascending order. If you look at our differences, 1.5 is the least value, and we have two values of 1.5. Then we have 2 twice. Then we have 2.5 three times. We have 3 twice, and then we have 5. That gives us 1.5, 1.5, 2, 2, 2.5, 2.5, 2.5, 3, 3, 5. So now, using this, we can figure out our first and third quartiles.
By using the median-of-the-halves method, our first quartile is going to be the median of the lower half of our data. Now, because we have 10 values, we can split our data evenly into two parts of five values each, and the median value of the lower half is 2, thus Q1 is 2. On the other hand, for the upper half of our data, 3 is the median value, thus Q3 equals 3.
Now that we have Q1 and Q3, we can use them to figure out what our outliers are going to be. But before we do that, we also need the interquartile range. How do we find that? Well, the interquartile range is the difference between the third and first quartiles, so IQR = Q3 - Q1 = 3 - 2 = 1.
In this case, our lower bound is going to be Q1 - 1.5 × IQR. That's 2 - 1.5 × 1, which equals 0.5. On the other hand, our upper bound is going to be Q3 + 1.5 × IQR, and that equals 3 + 1.5 × 1, which is 4.5.
Now that we have the lower and upper bounds, we can tell what the outliers are, because an outlier is any value less than 0.5 or greater than 4.5. When we check the data, 5 is the only value greater than 4.5; it's outside the range between 0.5 and 4.5. So that means 5 is an outlier.
Now that we know the outlier, we need to ask ourselves: what effect does this outlier have on the mean, median, and standard deviation? Well, with the outlier, our mean is higher because 5 pulls it up, and without the outlier, the mean will decrease as 5 is removed. So the outlier increases the mean.
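The quartile and fence arithmetic above can be checked with a short Python sketch. The function name iqr_fences is hypothetical, and the list is the set of ten differences as read out in the walkthrough; the quartiles are computed with the same median-of-the-halves method, which can differ from the default quartile definition in some software.

```python
from statistics import median

# The ten differences as listed in the walkthrough, in ascending order.
differences = [1.5, 1.5, 2, 2, 2.5, 2.5, 2.5, 3, 3, 5]


def iqr_fences(values):
    """Return (q1, q3, lower, upper) using the median-of-the-halves method.

    Hypothetical helper for illustration only.
    """
    data = sorted(values)
    half = len(data) // 2
    lower_half = data[:half]
    # For an odd count, the middle value is left out of both halves.
    upper_half = data[half + len(data) % 2:]
    q1, q3 = median(lower_half), median(upper_half)
    iqr = q3 - q1
    return q1, q3, q1 - 1.5 * iqr, q3 + 1.5 * iqr


q1, q3, lower, upper = iqr_fences(differences)
# A value is flagged when it falls strictly outside the two fences.
outliers = [x for x in differences if x < lower or x > upper]
print(q1, q3, lower, upper, outliers)
```

For these values the sketch should reproduce the numbers worked out by hand: Q1 = 2, Q3 = 3, bounds of 0.5 and 4.5, and 5 as the only flagged value.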
When we think about the median, remember that to find the median we ordered our data, and the median remains unchanged because it is resistant to extreme values. So the median will stay at 2.5. On the other hand, the standard deviation will increase because 5 is far from the mean. So with the outlier it increases, and without the outlier it would decrease because there is less spread in the data.
Thus, the outlier is 5; the mean and the standard deviation increase, but it has no effect on the median. If we look at our answer choices, we can tell that B is the correct answer. Thanks a lot for watching, everyone. I hope this video helped.
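To put numbers on these effects, here is a minimal Python sketch that computes the three measures with and without the value 5. It assumes the sample standard deviation (dividing by n - 1), which is what statistics.stdev computes; the walkthrough does not say which form it has in mind. The names without_outlier and summary are illustrative.

```python
from statistics import mean, median, stdev

# The ten differences as listed in the walkthrough.
differences = [1.5, 1.5, 2, 2, 2.5, 2.5, 2.5, 3, 3, 5]
# Same data with the flagged value 5 removed.
without_outlier = [x for x in differences if x != 5]

summary = {}
for label, data in (("with 5", differences), ("without 5", without_outlier)):
    # stdev is the sample standard deviation (n - 1 in the denominator).
    summary[label] = (mean(data), median(data), stdev(data))
    m, med, sd = summary[label]
    print(f"{label}: mean={m:.3f}, median={med}, sd={sd:.3f}")
```

Under that assumption, the mean drops from 2.55 to about 2.28 when 5 is removed, the standard deviation drops from about 1.01 to about 0.57, and the median stays at 2.5, which agrees with choice B.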

Key Concepts

Outliers

Mean, Median, and Standard Deviation

Effect of Outliers on Statistical Measures
