
# Top 5 Machine Learning Quiz Questions with Answers Explained

## Machine learning MCQ - Set 09

### 1. The number of test examples needed to get statistically significant results should be _________

a) Larger if the error rate is larger.
b) Larger if the error rate is smaller.
c) Smaller if the error rate is smaller.
d) It does not matter.

Answer: (b) Larger if the error rate is smaller

Tests for statistical significance tell us the probability that the relationship we think we have found is due only to random chance, i.e., the probability that we would be making an error if we conclude a relationship exists. Your significance level reflects your risk tolerance: if you run an A/B test at a 95% significance level and determine a winner, you can be 95% confident that the observed result is real, with a 5% chance of being wrong.

The test-set size matters because the standard error of an error rate p estimated from n examples is roughly sqrt(p(1-p)/n). When the error rate is small, errors are rare events, so many more test examples are needed before enough errors are observed to estimate the rate with the same relative precision.
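As a rough sketch of why (b) holds: under the normal approximation to the binomial, the test-set size needed to estimate an error rate p within a fixed *relative* margin grows as p shrinks. The function name and the 20% relative-precision default below are illustrative choices, not part of the original question.

```python
import math

def n_required(error_rate, rel_precision=0.2, z=1.96):
    """Test-set size so the 95% confidence-interval half-width is at most
    rel_precision * error_rate (normal approximation to the binomial)."""
    half_width = rel_precision * error_rate
    return math.ceil(z**2 * error_rate * (1 - error_rate) / half_width**2)

# A 1% error rate needs roughly ten times more test examples than a 10%
# error rate to be pinned down with the same relative precision.
print(n_required(0.10))  # 865
print(n_required(0.01))  # 9508
```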

### 2. Neural networks:

a) Optimize a convex objective function
b) Can only be trained with stochastic gradient descent
c) Can use a mix of different activation functions
d) None of the above

Answer: (c) Can use a mix of different activation functions

Neural networks can use a mix of different activation functions, such as sigmoid, tanh, and ReLU. Option (a) is false because the objective function of a multi-layer network is non-convex, and (b) is false because stochastic gradient descent is only one of many applicable optimizers (batch gradient descent, Adam, and others also work).

In a neural network, numeric data points, called inputs, are fed into the neurons in the input layer. Each connection has a weight; multiplying the inputs by the weights and summing them gives the neuron's output, which is passed to the next layer. The activation function is a mathematical "gate" between the input feeding the current neuron and its output going to the next layer. It can be as simple as a step function that turns the neuron's output on or off at a threshold, or a transformation that maps the input signals into the output signals the network needs.
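A minimal sketch (using NumPy, with made-up weights and layer sizes) of a forward pass that mixes ReLU, tanh, and sigmoid activations in one network:

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)
# Random weights for a 3 -> 4 -> 4 -> 1 network (illustrative only).
W1 = rng.normal(size=(3, 4))
W2 = rng.normal(size=(4, 4))
W3 = rng.normal(size=(4, 1))

def forward(x):
    h1 = relu(x @ W1)        # hidden layer 1: ReLU
    h2 = np.tanh(h1 @ W2)    # hidden layer 2: tanh
    return sigmoid(h2 @ W3)  # output layer: sigmoid

y = forward(rng.normal(size=(5, 3)))
print(y.shape)  # (5, 1)
```

Each layer uses a different activation, and the sigmoid output layer squashes every prediction into (0, 1).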

### 3. Consider a Bayesian network with four binary variables: root nodes H and W, node P with parent W, and node U with parents H and P. How many independent parameters are needed for this Bayesian network?

a) 2
b) 4
c) 8
d) 16

Answer: (c) 8

The given Bayesian network needs 8 independent parameters. A binary node with m binary parents needs 2^m independent parameters (one probability per parent configuration):

- P(H): 1
- P(W): 1
- P(P|W): 2
- P(U|H, P): 4

Total: 1 + 1 + 2 + 4 = 8.
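The count generalizes: a node with cardinality k and parents of cardinalities k1, ..., km needs (k − 1) · k1 · ... · km independent parameters. A small sketch, assuming all four variables are binary as implied by the counts above (the variable names mirror the question; the helper function is ours):

```python
from math import prod

def cpt_params(card, parent_cards):
    """Independent parameters in one conditional probability table:
    (card - 1) free probabilities per configuration of the parents."""
    return (card - 1) * prod(parent_cards)

cards = {"H": 2, "W": 2, "P": 2, "U": 2}
parents = {"H": [], "W": [], "P": ["W"], "U": ["H", "P"]}

total = sum(cpt_params(cards[v], [cards[p] for p in parents[v]])
            for v in cards)
print(total)  # 8
```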

### 4. For Kernel Regression, which one of these structural assumptions is the one that most affects the trade-off between underfitting and overfitting:

a) Whether kernel function is Gaussian versus triangular versus box-shaped
b) Whether we use Euclidean versus L1 versus L∞ metrics
c) The kernel width
d) The maximum height of the kernel function

Answer: (c) The kernel width

A small kernel width means that only training points very close to a test point influence the prediction at that point, which can result in overfitting. If the kernel width is too large, distant training points also contribute to every prediction, which can result in underfitting.
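To illustrate the trade-off, here is a sketch of Nadaraya-Watson kernel regression (a standard form of kernel regression; the noisy sine data and the three widths are made up) comparing a too-small, a moderate, and a too-large Gaussian kernel width:

```python
import numpy as np

def nw_predict(x_train, y_train, x_test, h):
    """Nadaraya-Watson estimate with a Gaussian kernel of width h."""
    w = np.exp(-0.5 * ((x_test[:, None] - x_train[None, :]) / h) ** 2)
    return (w @ y_train) / w.sum(axis=1)

rng = np.random.default_rng(1)
x = np.sort(rng.uniform(0, 2 * np.pi, 40))
y = np.sin(x) + rng.normal(scale=0.3, size=40)   # noisy observations
xg = np.linspace(0, 2 * np.pi, 100)

mses = {}
for h in (0.05, 0.5, 5.0):  # too small, reasonable, too large
    mses[h] = np.mean((nw_predict(x, y, xg, h) - np.sin(xg)) ** 2)
    print(f"h={h}: MSE vs true curve = {mses[h]:.3f}")
```

The tiny width chases the noise (overfitting), while the huge width averages nearly all points into a flat prediction (underfitting); the moderate width tracks the underlying sine best.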

### 5. Which one of the following is the main reason for pruning a Decision Tree?

a) To save computing time during testing
b) To save space for storing the Decision Tree
c) To make the training set error smaller
d) To avoid overfitting the training set

Answer: (d) To avoid overfitting the training set

Pruning is needed because the trees produced by the base algorithm can grow incredibly large and complex, and so become prone to overfitting. Pruning is a technique in machine learning and search algorithms that reduces the size of a decision tree by removing sections of the tree that provide little power to classify instances. It reduces the complexity of the final classifier and hence improves predictive accuracy by reducing overfitting. [Wikipedia]
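As an illustration (assuming scikit-learn is available; the synthetic dataset and the `ccp_alpha` value are arbitrary choices), cost-complexity pruning shrinks the tree on noisy data: the full tree memorizes the training set perfectly, while the pruned tree is far smaller and gives up some training accuracy to generalize better.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# 20% label noise (flip_y) makes an unpruned tree memorize mistakes.
X, y = make_classification(n_samples=400, n_informative=5,
                           flip_y=0.2, random_state=0)
Xtr, Xte, ytr, yte = train_test_split(X, y, random_state=0)

full = DecisionTreeClassifier(random_state=0).fit(Xtr, ytr)
pruned = DecisionTreeClassifier(ccp_alpha=0.02, random_state=0).fit(Xtr, ytr)

print("full:   train", full.score(Xtr, ytr), "test", full.score(Xte, yte))
print("pruned: train", pruned.score(Xtr, ytr), "test", pruned.score(Xte, yte))
print("nodes:", full.tree_.node_count, "->", pruned.tree_.node_count)
```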

**********************