Sunday, September 20, 2020

Machine Learning Multiple Choice Questions and Answers 16

Top 5 Machine Learning Quiz Questions with Answers explanation, Interview questions on machine learning, quiz questions for data scientist answers explained, machine learning exam questions, SVM, VC dimension, decision tree, knn


Machine learning MCQ - Set 16

 

1. A project team performed a feature selection procedure on the full data set and reduced their large feature set to a smaller set. Then they split the data into training and test portions. They built their model on the training data using several different model settings and reported the best test error they achieved. Which of the following is TRUE about the given experimental setup?

a) Best setup

b) Problematic setup

c) Invalid setup

d) Cannot be decided

Answer: (b) Problematic setup

There are two problems here. First, using the full data set for feature selection leaks information from the test examples into the model; feature selection should be done exclusively on the training (and validation) data, never on the test data.

Second, the best parameter setting should not be chosen based on the test error, as this risks overfitting to the test data. They should have selected the settings on validation data and used the test data only in the final evaluation step.
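For illustration, here is a minimal sketch of a leakage-free version of this experiment using scikit-learn (the synthetic data, parameter grid, and classifier are placeholder choices, not part of the question):

```python
# Leakage-free setup: split first, then do feature selection and model
# selection inside cross-validation, touching the test set only once.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.svm import SVC

X, y = make_classification(n_samples=500, n_features=100, random_state=0)

# 1. Split FIRST, so the test examples never influence any modeling choice.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)

# 2. Feature selection lives inside the pipeline, so it is re-fit on each
#    cross-validation training fold and never sees held-out data.
pipe = Pipeline([("select", SelectKBest(f_classif)), ("clf", SVC())])

# 3. Choose the best settings by validation (cross-validated) error.
grid = GridSearchCV(
    pipe, {"select__k": [10, 20, 50], "clf__C": [0.1, 1, 10]}, cv=5)
grid.fit(X_train, y_train)

# 4. Use the test set exactly once, for the final evaluation.
print("final test accuracy:", grid.score(X_test, y_test))
```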

 

2. If we increase the k value in k-nearest neighbor, the model will _____ the bias and ______ the variance.

a) Decrease, Decrease

b) Increase, Decrease

c) Decrease, Increase

d) Increase, Increase

Answer: (b) Increase, Decrease

When K increases to a large value, the model becomes the simplest possible: every test data point is assigned to the same class, the majority class. This is underfitting, that is, high bias and low variance, as the sweep below illustrates.
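A quick way to see this effect is to sweep k on a held-out split; a minimal sketch with scikit-learn (the synthetic data is an assumption for illustration):

```python
# As k grows, k-NN gets simpler: training accuracy falls (bias rises)
# while predictions become more stable (variance falls).
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = make_classification(n_samples=400, n_features=5, random_state=1)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.5, random_state=1)

for k in (1, 5, 25, 100, len(X_tr)):
    knn = KNeighborsClassifier(n_neighbors=k).fit(X_tr, y_tr)
    print(f"k={k:>3}  train acc={knn.score(X_tr, y_tr):.2f}"
          f"  test acc={knn.score(X_te, y_te):.2f}")
# k=1 memorizes the training set (low bias, high variance);
# k=N predicts the majority class everywhere (high bias, low variance).
```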

 

Bias-Variance tradeoff

Bias is an error from erroneous assumptions in the learning algorithm. High bias can cause an algorithm to miss the relevant relations between features and target outputs. In other words, a model with high bias pays very little attention to the training data and oversimplifies the problem.

Variance is an error from sensitivity to small fluctuations in the training set. High variance can cause an algorithm to model the random noise in the training data rather than the intended outputs. In other words, a model with high variance pays too much attention to the training data and does not generalize to data it hasn't seen before. [Source: Wikipedia]

 

3. For a large k value the k-nearest neighbor model becomes _____ and ______ .

a) Complex model, Overfit

b) Complex model, Underfit

c) Simple model, Underfit

d) Simple model, Overfit

Answer: (c) Simple model, Underfit

As K grows toward N (the size of the training set), the model becomes the simplest possible: every test data point is assigned to the same class, the majority class. This is underfitting, that is, high bias and low variance.

kNN classification is an averaging operation: to reach a decision, the labels of the K nearest neighbour samples are averaged. The standard deviation (and hence the variance) of an average decreases as the number of samples being averaged increases. In the extreme case K == N (K as large as the whole training set), every query averages the same labels, so the variance of the output becomes zero.

Underfitting means the model does not fit, in other words does not predict, even the (training) data very well.

Overfitting means the model predicts the (training) data too well; it has memorized it, noise included. When a new data point comes in, the prediction may be wrong.
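The averaging argument above can be checked numerically; a minimal numpy sketch (the random 0/1 labels stand in for neighbour labels):

```python
# The spread (std) of an average over K samples shrinks as K grows,
# which is why large-K k-NN predictions have low variance.
import numpy as np

rng = np.random.default_rng(0)
N = 1000
labels = rng.integers(0, 2, size=N)  # stand-in for 0/1 neighbour labels

for K in (1, 5, 25, 200, N):
    # Average K labels drawn without replacement, many times over.
    means = [rng.choice(labels, size=K, replace=False).mean()
             for _ in range(2000)]
    print(f"K={K:>4}  std of averaged label = {np.std(means):.3f}")
# The spread shrinks roughly like 1/sqrt(K); at K == N it is exactly zero,
# since every query averages the same full set of labels.
```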

 

4. When we have a real-valued input attribute during decision-tree learning, what would be the impact of a multi-way split with one branch for each distinct value of the attribute?

a) It is too computationally expensive.

b) It would probably result in a decision tree that scores badly on the training set and a test set.

c) It would probably result in a decision tree that scores well on the training set but badly on a test set.

d) It would probably result in a decision tree that scores well on a test set but badly on a training set.

Answer: (c) It would probably result in a decision tree that scores well on the training set but badly on a test set

It is usual to make only binary splits, because multi-way splits break the data into small subsets too quickly. This creates a bias towards splitting on predictors with many distinct values, since they are more likely to produce relatively pure child nodes, and that results in overfitting. [For more, refer here]
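A small demonstration (synthetic data assumed): with a real-valued attribute, a one-branch-per-value split is effectively a lookup table over the training values, so it memorizes the training set and fails on unseen values:

```python
# A multi-way split on a real-valued attribute, one branch per distinct
# training value, behaves like a lookup table: perfect on training data,
# near-chance on test data.
import numpy as np

rng = np.random.default_rng(0)
x_train = rng.uniform(size=200)          # real-valued attribute: all values distinct
y_train = rng.integers(0, 2, size=200)   # labels unrelated to x (pure noise)

# "Tree" with one leaf per distinct training value.
leaves = dict(zip(x_train, y_train))

train_acc = np.mean([leaves[x] == y for x, y in zip(x_train, y_train)])

x_test = rng.uniform(size=200)
y_test = rng.integers(0, 2, size=200)
majority = int(y_train.mean() > 0.5)
# Unseen real values never match a branch; fall back to the majority class.
test_acc = np.mean([leaves.get(x, majority) == y
                    for x, y in zip(x_test, y_test)])

print(f"train accuracy = {train_acc:.2f}")  # 1.00: every training point memorized
print(f"test accuracy  = {test_acc:.2f}")   # ~0.50: no generalization
```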

 

5. The VC dimension of a Perceptron is _____ the VC dimension of a simple linear SVM.

a) Larger than

b) Smaller than

c) Same as

d) Not at all related

Answer: (c) Same as

Both the Perceptron and the linear SVM are linear discriminators (i.e., a line in 2D space, a plane in 3D space, a hyperplane in general), so they have the same VC dimension. For linear classifiers in d dimensions, the VC dimension is d + 1.

VC dimension

The Vapnik–Chervonenkis (VC) dimension is a measure of the capacity (complexity, expressive power, richness, or flexibility) of a space of functions that can be learned by a statistical binary classification algorithm. It is defined as the cardinality of the largest set of points that the algorithm can shatter. [Wikipedia] 
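For intuition, here is a minimal sketch (the point sets are illustrative choices) checking that a linear classifier shatters 3 points in 2D but not these 4, consistent with a VC dimension of d + 1 = 3:

```python
# A linear classifier (Perceptron) can realize every labeling of 3
# non-collinear points in 2D, but not of these 4 points (VC dim = d+1 = 3).
from itertools import product
import numpy as np
from sklearn.linear_model import Perceptron

def shatters(points):
    """True if a Perceptron perfectly fits every labeling of the points."""
    for labels in product([0, 1], repeat=len(points)):
        if len(set(labels)) < 2:
            continue  # one-class labelings are trivially separable
        clf = Perceptron(max_iter=1000, tol=None).fit(points, labels)
        if clf.score(points, labels) < 1.0:
            return False
    return True

three = np.array([[0, 0], [1, 0], [0, 1]])          # non-collinear triple
four = np.array([[0, 0], [1, 0], [0, 1], [1, 1]])   # square corners

print("3 points shattered:", shatters(three))  # True
print("4 points shattered:", shatters(four))   # False (the XOR labeling fails)
```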

 

 

**********************

Related links:

top 3 questions in machine learning

quiz questions for data scientists

data science quiz online

online quiz questions on machine learning

MCQs on machine learning and data science

define bias variance trade-off

What is overfitting what is underfitting

knn machine learning multiple choice questions
