# Explain why the standard deviation would likely not be a reliable measure of variability for a distribution of data that includes at least one extreme outlier.

Discussion 1 – Explain why the standard deviation would likely not be a reliable measure of variability for a distribution of data that includes at least one extreme outlier.
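
As a starting point for Discussion 1, the following is a minimal sketch of how a single extreme value affects the standard deviation. The sample values are made up for illustration, and the interquartile range (IQR) is included only as an assumed point of comparison; it is not part of the question.

```python
import statistics

# Hypothetical sample: ten values clustered near 50.
base = [48, 49, 50, 50, 51, 51, 52, 49, 50, 50]
# The same sample with one extreme outlier appended.
with_outlier = base + [500]


def iqr(values):
    """Interquartile range using statistics.quantiles (default 'exclusive' method)."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    return q3 - q1


sd_base = statistics.stdev(base)
sd_out = statistics.stdev(with_outlier)
iqr_base = iqr(base)
iqr_out = iqr(with_outlier)

print(f"std dev without outlier: {sd_base:.2f}")
print(f"std dev with outlier:    {sd_out:.2f}")
print(f"IQR without outlier:     {iqr_base:.2f}")
print(f"IQR with outlier:        {iqr_out:.2f}")
```

Because the standard deviation squares each deviation from the mean, one distant value can dominate the result, while a quartile-based measure barely moves.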

Discussion 2 – Suppose that you collect a random sample of 250 salaries for the salespersons employed by a large PC manufacturer. Furthermore, assume that you find that two of these salaries are considerably higher than the others in the sample. Before analyzing this data set, should you delete the unusual observations? Explain why or why not.

Discussion 3 – A researcher is interested in determining whether there is a relationship between the number of room air-conditioning units sold each week and the time of year. What type of descriptive chart would be most useful in performing this analysis? Explain your choice.

Discussion 4 – Suppose that the histogram of a given income distribution is positively skewed. What does this fact imply about the relationship between the mean and median of this distribution?
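
For Discussion 4, a small numeric check can make the mean/median relationship concrete. This is only a sketch: the income figures (in thousands) are invented, and a single data set illustrates the typical pattern rather than proving a general rule.

```python
import statistics

# Hypothetical incomes in thousands; a few large values create a long right tail.
incomes = [20, 25, 30, 35, 40, 45, 50, 60, 150, 400]

mean_income = statistics.mean(incomes)
median_income = statistics.median(incomes)

print(f"mean:   {mean_income}")
print(f"median: {median_income}")
print("mean exceeds median:", mean_income > median_income)
```

The large values in the right tail pull the mean upward, while the median depends only on the middle of the ordered data.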

Discussion 5 – “The midpoint of the line segment joining the first quartile and third quartile of any distribution is the median.” Is this statement true or false? Explain your answer.
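
One way to probe the statement in Discussion 5 is to test it on two small data sets, one symmetric and one skewed. The data below are hypothetical, and the quartiles use the default method of Python's `statistics.quantiles`; other quartile conventions can give slightly different values.

```python
import statistics


def quartile_midpoint_and_median(values):
    """Return (midpoint of Q1 and Q3, median) for a list of numbers."""
    q1, q2, q3 = statistics.quantiles(values, n=4)
    return (q1 + q3) / 2, q2


# Hypothetical symmetric data set.
symmetric = [1, 2, 3, 4, 5, 6, 7]
# Hypothetical right-skewed data set.
skewed = [1, 2, 3, 4, 20, 40, 80]

sym_mid, sym_median = quartile_midpoint_and_median(symmetric)
skew_mid, skew_median = quartile_midpoint_and_median(skewed)

print("symmetric: midpoint =", sym_mid, "median =", sym_median)
print("skewed:    midpoint =", skew_mid, "median =", skew_median)
```

A single data set where the two numbers differ is enough to show whether the claim can hold for any distribution.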

Discussion 6 – If two variables are highly correlated, does this imply that changes in one cause changes in the other? If not, give at least one example from the real world that illustrates what else could cause a high correlation.

Discussion 7 – Suppose you have data on student achievement in high school for each of many school districts. In spreadsheet format, the school district is in column A, and various student achievement measures are in columns B, C, and so on. If you find fairly low correlations (magnitudes from 0 to 0.4, say) between the variables in these achievement columns, what exactly does this mean?

Discussion 8 – Suppose you have data on whether each customer bought your product in a given time period, along with various demographics for each customer. Explain how you could use pivot tables to see which demographics are the primary drivers of the “yes/no” buying behavior.
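
The pivot-table idea in Discussion 8 can be sketched outside a spreadsheet as well. The example below assumes a made-up customer list with hypothetical fields `age_group`, `region`, and `bought`, and a hypothetical helper `purchase_rate_by`; it computes the share of "yes" buyers in each category, which is what a pivot table with the demographic in the rows and the purchase rate in the values area would display.

```python
from collections import defaultdict

# Hypothetical customer records; field names and values are invented.
customers = [
    {"age_group": "18-34", "region": "East", "bought": "yes"},
    {"age_group": "18-34", "region": "West", "bought": "yes"},
    {"age_group": "18-34", "region": "East", "bought": "yes"},
    {"age_group": "18-34", "region": "West", "bought": "no"},
    {"age_group": "35+", "region": "East", "bought": "no"},
    {"age_group": "35+", "region": "West", "bought": "no"},
    {"age_group": "35+", "region": "East", "bought": "no"},
    {"age_group": "35+", "region": "West", "bought": "yes"},
]


def purchase_rate_by(records, field):
    """Share of 'yes' buyers within each category of the given field."""
    totals = defaultdict(int)
    buyers = defaultdict(int)
    for record in records:
        totals[record[field]] += 1
        if record["bought"] == "yes":
            buyers[record[field]] += 1
    return {category: buyers[category] / totals[category] for category in totals}


rate_by_age = purchase_rate_by(customers, "age_group")
rate_by_region = purchase_rate_by(customers, "region")

print("purchase rate by age group:", rate_by_age)
print("purchase rate by region:   ", rate_by_region)
```

In this invented data the purchase rate varies widely across age groups but not across regions, so age group would be the more promising candidate driver to examine further.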