1. Which of the following would indicate that a dataset is not bell-shaped?

(a) The range is equal to 5 standard deviations.

(b) The range is larger than the interquartile range.

(c) The mean is much smaller than the median.

(d) There are no outliers.

2. Which statement is not true about confidence intervals?

(a) A confidence interval is an interval of values computed from sample data that is likely to include the true population value.

(b) An approximate formula for a 95% confidence interval is sample estimate ± margin of error.

(c) A confidence interval between 20% and 40% means that the population proportion lies between 20% and 40%.

(d) A 99% confidence interval procedure has a higher probability of producing intervals that will include the population parameter than a 95% confidence interval procedure.
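
The confidence level describes the long-run behavior of the interval procedure. A minimal simulation sketch of that idea follows; the normal population, sample size, trial count, and the helper name `coverage` are all assumptions chosen for illustration, and the interval uses the normal approximation (sample mean ± z × standard error).

```python
import random
import statistics

def coverage(z, n=50, trials=2000, seed=1):
    """Fraction of simulated intervals (mean +/- z * SE) that contain the true mean."""
    rng = random.Random(seed)   # fixed seed so both levels see the same samples
    true_mean = 10.0            # assumed population mean (illustrative)
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, 2.0) for _ in range(n)]
        m = statistics.fmean(sample)
        se = statistics.stdev(sample) / n ** 0.5
        if m - z * se <= true_mean <= m + z * se:
            hits += 1
    return hits / trials

cov95 = coverage(1.96)    # z multiplier for an approximate 95% interval
cov99 = coverage(2.576)   # z multiplier for an approximate 99% interval
print(cov95, cov99)
```

Because both calls reuse the same simulated samples, every 95% interval sits inside the corresponding 99% interval, so the second coverage figure can never be smaller than the first.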

3. Which of the following denotes the expected value of a random variable?

(a) It is a value that has the highest probability of occurring.

(b) It is the mean value over an infinite number of observations of the variable.

(c) It is the largest value that will ever occur.

(d) It is the most common value over a finite number of observations of the variable.
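
To make the notion of expected value concrete, the sketch below uses a fair six-sided die (an assumed example, not part of the question) and compares the probability-weighted sum of the outcomes with the average of many simulated rolls.

```python
import random

# Assumed example: a fair six-sided die.
values = [1, 2, 3, 4, 5, 6]
probs = [1 / 6] * 6

# Expected value: probability-weighted sum of the possible outcomes.
expected = sum(v * p for v, p in zip(values, probs))

# Empirical mean over many simulated rolls (seeded for repeatability).
rng = random.Random(0)
rolls = [rng.choice(values) for _ in range(100_000)]
empirical = sum(rolls) / len(rolls)

print(expected, empirical)
```

Note that the expected value of the die is not itself a face the die can show, and no single face is more probable than any other.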

4. Which of the following is INCORRECT about the requirements of any distance metric?

(a) The distance must never be negative.

(b) The distance between two identical vectors, x and y, is a non-zero value.

(c) The distance from x to y is the same as the distance from y to x.

(d) The metric must satisfy the triangle inequality.
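
The standard metric axioms can be checked numerically on a handful of points. The sketch below does this for Euclidean distance; the helper `check_metric` and the sample points are hypothetical, and passing on a finite set of points is only a sanity check, not a proof.

```python
import math
from itertools import product

def euclidean(x, y):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def check_metric(dist, pts, tol=1e-12):
    """Return True if dist satisfies the metric axioms on the given points."""
    for x, y in product(pts, repeat=2):
        if dist(x, y) < 0:                        # non-negativity
            return False
        if x == y and abs(dist(x, y)) > tol:      # identical points: distance 0
            return False
        if abs(dist(x, y) - dist(y, x)) > tol:    # symmetry
            return False
    for x, y, z in product(pts, repeat=3):
        if dist(x, z) > dist(x, y) + dist(y, z) + tol:   # triangle inequality
            return False
    return True

# Illustrative points only.
points = [(0.0, 0.0), (3.0, 4.0), (1.0, -2.0), (-5.0, 2.5)]
print(check_metric(euclidean, points))
```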

5. Which of the following distance measures calculates the distance between two binary vectors?

(a) Euclidean distance

(b) Manhattan distance

(c) Minkowski distance

(d) Hamming distance
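
For comparison, the four measures can be evaluated on the same pair of binary vectors. This is a minimal sketch with made-up vectors `u` and `v`; Manhattan and Euclidean distance are obtained as the p = 1 and p = 2 cases of Minkowski distance, and Hamming distance is taken as the count of positions at which the vectors differ.

```python
def minkowski(x, y, p):
    """Minkowski distance of order p (p=1: Manhattan, p=2: Euclidean)."""
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1 / p)

def hamming(x, y):
    """Number of positions at which two equal-length vectors differ."""
    return sum(a != b for a, b in zip(x, y))

# Illustrative binary vectors (assumed, not from the quiz).
u = [1, 0, 1, 1, 0]
v = [0, 0, 1, 0, 1]

print("Hamming:  ", hamming(u, v))
print("Manhattan:", minkowski(u, v, 1))
print("Euclidean:", minkowski(u, v, 2))
```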

