Machine learning MCQ - Set 15
1. In the general case, imagine that we have d binary features, and we want to count the number of features with value 1. How many leaf nodes would a decision tree need to represent this function?
a) 2^{1} leaf nodes
b) 2^{d} leaf nodes
c) 2^{d-1} leaf nodes
d) 2^{d}-1 leaf nodes
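Answer: (b) 2^{d} leaf nodes. The count changes whenever any single feature flips, so every root-to-leaf path has to test all d features. The result is a full binary tree of depth d with 2^{d} leaves. For example, one feature needs two leaf nodes (value 0 or 1), two features need four leaf nodes, and so on.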
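The counting argument in question 1 can be checked by brute force for small d. The sketch below is only an illustration: the helper name `count_leaves` is made up here, and it assumes the tree tests features in a fixed order and stops as soon as the function is constant on the remaining inputs. For a symmetric function such as the count, the test order does not change the result.

```python
from itertools import product

def count_leaves(f, d, prefix=()):
    """Count leaves of a tree that tests x1, x2, ... in order and stops
    as soon as f is constant over every completion of the current prefix."""
    remaining = d - len(prefix)
    outputs = {f(prefix + rest) for rest in product((0, 1), repeat=remaining)}
    if len(outputs) == 1:
        return 1  # f no longer depends on the untested features: make a leaf
    # Otherwise split on the next feature and count both subtrees.
    return count_leaves(f, d, prefix + (0,)) + count_leaves(f, d, prefix + (1,))

# sum(x) is the number of features with value 1.
for d in range(1, 6):
    print(d, count_leaves(sum, d))
```

The leaf count doubles with every added feature. By contrast, a function that depends on one feature only, such as `lambda x: x[0]`, needs just two leaves under the same procedure.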
2. The measures developed for selecting the best split are often based on the degree of impurity of the child nodes. Which of the following is NOT an impurity measure?
a) Gini
b) Entropy
c) Pruning
d) Classification error
Answer: (c) Pruning. Pruning is not an impurity measure. It is a technique for reducing the size of a decision tree after the tree has been built. Decision trees that are too large are susceptible to overfitting. Pruning trims branches of the initial tree in a way that improves the generalization capability of the decision tree.
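For reference, the three impurity measures named in the options can be written in a few lines. This is a minimal sketch using the standard textbook definitions; the function names are chosen here for illustration, and `p` is assumed to be a list of class proportions at a node that sums to 1.

```python
import math

def gini(p):
    """Gini index: 1 - sum of squared class proportions."""
    return 1.0 - sum(q * q for q in p)

def entropy(p):
    """Entropy in bits; classes with zero proportion contribute nothing."""
    return sum(q * math.log2(1.0 / q) for q in p if q > 0)

def classification_error(p):
    """Misclassification rate when predicting the majority class."""
    return 1.0 - max(p)

# From a pure node to a maximally mixed two-class node.
for p in ([1.0, 0.0], [0.9, 0.1], [0.5, 0.5]):
    print(p, round(gini(p), 3), round(entropy(p), 3),
          round(classification_error(p), 3))
```

All three are zero for a pure node and reach their maximum when the classes are evenly mixed.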
3. We are dealing with samples x where x is a single value. We would like to test two alternative regression models:
1) y = ax + e
2) y = ax + bx^{2} + e
Which of the two models is more likely to fit the test data better?
a) model 1
b) model 2
c) both will equally fit
d) impossible to decide
Answer: (d) impossible to decide. It depends on the underlying model of the data and the amount of data available for training. If the data indeed comes from a linear model and we do not have a lot of data to train on, model 2 will tend to overfit and model 1 would do better. On the other hand, if the data comes from an underlying quadratic model, model 2 would be better.
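The dependence on the true model and on the training-set size can be explored with a small simulation. The sketch below is an illustration under assumptions that are not in the question: x is drawn uniformly from [-2, 2], the noise e is Gaussian, and both models are fitted by ordinary least squares. The helper names and the coefficients 2 and 3 are arbitrary.

```python
import random

def fit_linear(xs, ys):
    """Least-squares a for y = a*x (model 1, no intercept)."""
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)

def fit_quadratic(xs, ys):
    """Least-squares (a, b) for y = a*x + b*x^2 via the 2x2 normal equations."""
    s2 = sum(x ** 2 for x in xs)
    s3 = sum(x ** 3 for x in xs)
    s4 = sum(x ** 4 for x in xs)
    t1 = sum(x * y for x, y in zip(xs, ys))
    t2 = sum(x * x * y for x, y in zip(xs, ys))
    det = s2 * s4 - s3 * s3
    return (t1 * s4 - t2 * s3) / det, (s2 * t2 - s3 * t1) / det

def mse(predict, xs, ys):
    return sum((predict(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def compare(true_fn, n_train, noise=1.0, n_test=1000, seed=0):
    """Return (test MSE of model 1, test MSE of model 2)."""
    rng = random.Random(seed)

    def sample(n):
        xs = [rng.uniform(-2, 2) for _ in range(n)]
        ys = [true_fn(x) + rng.gauss(0, noise) for x in xs]
        return xs, ys

    train_x, train_y = sample(n_train)
    test_x, test_y = sample(n_test)
    a1 = fit_linear(train_x, train_y)
    a2, b2 = fit_quadratic(train_x, train_y)
    return (mse(lambda x: a1 * x, test_x, test_y),
            mse(lambda x: a2 * x + b2 * x * x, test_x, test_y))

print("linear truth, 3 training points:    ", compare(lambda x: 2 * x, 3))
print("quadratic truth, 50 training points:", compare(lambda x: 2 * x + 3 * x * x, 50))
```

The numbers vary with the seed, so it is worth running `compare` over several seeds and training sizes before drawing conclusions. That variability is the point of answer (d).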
4. What is the biggest weakness of decision trees compared to logistic regression classifiers?
a) Decision trees are more likely to overfit the data
b) Decision trees are more likely to underfit the data
c) Decision trees do not assume independence of the input features
d) None of the mentioned
Answer: (a) Decision trees are more likely to overfit the data. A decision tree can split on many different combinations of features, whereas logistic regression associates only one parameter with each feature.
5. Which of the following classifiers can generate a linear decision boundary?
a) Linear SVM
b) Random forest
c) Logistic regression
d) k-NN
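Answer: (a) Linear SVM and (c) Logistic regression. Both are linear classifiers: each predicts from the sign of a linear function of the features, so the decision boundary is a hyperplane. Random forest and k-NN are non-linear classifiers. Their decision boundaries are in general piecewise and not a single hyperplane.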
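The difference can be illustrated on the four XOR points, a standard example that no single line separates. The sketch below is a toy illustration, not a proof: `linear_predict` stands for the decision rule shared by logistic regression and a linear SVM, `one_nn_predict` is a hand-written 1-nearest-neighbour rule, and the weight grid is an arbitrary choice.

```python
def linear_predict(w, b, x):
    """Decision rule of a linear classifier: class 1 iff w.x + b >= 0.
    For logistic regression this is the same as sigmoid(w.x + b) >= 0.5."""
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return int(score >= 0)

def one_nn_predict(train, x):
    """Label of the closest training point (squared Euclidean distance)."""
    nearest = min(train,
                  key=lambda item: sum((a - c) ** 2 for a, c in zip(item[0], x)))
    return nearest[1]

# XOR: label is 1 when exactly one coordinate is 1.
xor = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

# Try every linear rule with weights and bias on a coarse grid.
grid = [i / 2 for i in range(-4, 5)]
best_linear = max(
    sum(linear_predict((w1, w2), b, x) == y for x, y in xor)
    for w1 in grid for w2 in grid for b in grid
)
nn_correct = sum(one_nn_predict(xor, x) == y for x, y in xor)

print("best linear rule on XOR:", best_linear, "of 4 correct")
print("1-NN on XOR:", nn_correct, "of 4 correct")
```

No linear rule on the grid classifies all four points, while the 1-NN rule does, because its boundary is not restricted to a single line.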