Thursday, June 10, 2021

Machine Learning TRUE or FALSE Questions with Answers 20

Machine learning exam questions, ML solved quiz questions, Machine Learning TRUE or FALSE questions, TOP 5 machine learning quiz questions with answers

Machine Learning TRUE / FALSE Questions - SET 20

1. Training neural networks has the potential problem of over-fitting the training data.

(a) TRUE                                                   (b) FALSE

Answer: TRUE

Over-fitting occurs when a model fits the noise in the training data rather than the underlying trend, typically because the model is overly complex and has too many parameters. An over-fit model is inaccurate on new data because the pattern it has learned does not reflect the true relationship present in the data.

Many of the modern advances in neural networks have come from stacking many hidden layers. This deep stacking allows the network to learn more complex relationships in the data. However, because it increases the complexity of the model, it also makes the network more prone to over-fitting the training data.
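As a quick illustration (a minimal sketch, not part of the original answer; the dataset, layer sizes, and the use of scikit-learn's MLPClassifier are my own assumptions), the snippet below trains an over-sized network on a small, noisy dataset with a weak and then a stronger L2 penalty (alpha), so the gap between training and test accuracy can be compared:

```python
# Illustrative sketch: an over-sized MLP on a small noisy dataset.
# Dataset and hyper-parameters are assumptions, not from the original post.
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

X, y = make_moons(n_samples=200, noise=0.35, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

for alpha in (1e-5, 1.0):  # tiny vs. stronger L2 weight penalty
    mlp = MLPClassifier(hidden_layer_sizes=(100, 100), alpha=alpha,
                        max_iter=2000, random_state=0)
    mlp.fit(X_train, y_train)
    print(f"alpha={alpha}: train={mlp.score(X_train, y_train):.2f}, "
          f"test={mlp.score(X_test, y_test):.2f}")
```

A large gap between training and test accuracy is the usual symptom of over-fitting; stronger regularization typically narrows it.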

 

2. Given a linearly-separable dataset for a 2-class classification problem, a Linear SVM is better to use than a Perceptron because the SVM will often be able to achieve better classification accuracy on the testing set.

(a) TRUE                                                   (b) FALSE

Answer: TRUE

The Perceptron stops as soon as it finds any hyperplane that classifies the training data correctly, whereas the SVM searches for the separating hyperplane with the maximum margin, i.e. the largest distance from the boundary to the nearest data points of either class. Maximizing the margin provides some reinforcement so that future data points can be classified with more confidence. The decision boundary learned by the SVM will therefore often be farther away from more of the training examples, leading to better performance on testing examples that fall close to the decision boundary.
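The sketch below (illustrative only; the blob dataset and the scikit-learn Perceptron and LinearSVC estimators are assumptions, not part of the original answer) fits both classifiers on a linearly separable dataset and reports their test accuracy:

```python
# Illustrative comparison: Perceptron (any separating hyperplane)
# vs. linear SVM (maximum-margin hyperplane). Data is synthetic.
from sklearn.datasets import make_blobs
from sklearn.linear_model import Perceptron
from sklearn.model_selection import train_test_split
from sklearn.svm import LinearSVC

# Two well-separated blobs -> a linearly separable 2-class problem.
X, y = make_blobs(n_samples=200, centers=2, cluster_std=1.0, random_state=7)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=7)

perceptron = Perceptron(random_state=7).fit(X_train, y_train)
svm = LinearSVC(C=1.0, random_state=7).fit(X_train, y_train)

print("Perceptron test accuracy:", perceptron.score(X_test, y_test))
print("Linear SVM test accuracy:", svm.score(X_test, y_test))
```

On an easy dataset both may score perfectly; the SVM's advantage tends to show up on test points that lie near the decision boundary.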

 

3. With a non-linearly-separable dataset that contains some extra noise, using an SVM with slack variables to create a soft margin classifier, and a small value for the penalty parameter, C, that controls how much to penalize misclassified points, will often reduce over-fitting the training data.

(a) TRUE                                                   (b) FALSE

Answer: TRUE

A small C means the penalty for misclassifying a few points is small, so the optimizer prefers a wider margin around most of the points even at the cost of misclassifying a few of them, including the noise points. This keeps the decision boundary simpler and therefore reduces over-fitting of the training data.
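For example, the following sketch (illustrative values; the moons dataset and the RBF kernel are my own choices, not from the original answer) compares a small and a very large C on noisy data and prints training versus test accuracy:

```python
# Illustrative sketch: effect of the penalty parameter C on a soft-margin SVM.
# A small C tolerates a few misclassified noise points; a very large C tries
# to fit every training point and tends to over-fit.
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = make_moons(n_samples=300, noise=0.3, random_state=1)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=1)

for C in (0.1, 1000.0):  # small (soft margin) vs. large (nearly hard margin)
    clf = SVC(kernel="rbf", C=C, gamma="scale").fit(X_train, y_train)
    print(f"C={C}: train={clf.score(X_train, y_train):.2f}, "
          f"test={clf.score(X_test, y_test):.2f}")
```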

 

4. Hierarchical Agglomerative Clustering requires the user to specify the number of clusters, k, when building the dendrogram for a set of data.

(a) TRUE                                                   (b) FALSE

Answer: FALSE

Hierarchical agglomerative clustering uses a “bottom-up” approach. It begins with each object in its own cluster, and at each step the two most similar clusters are merged into a single new cluster; once fused, objects are never separated. Building the dendrogram therefore does not require specifying the number of clusters in advance. Instead, the dendrogram is cut at a chosen distance threshold, and that cut determines how many clusters result.
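A short SciPy sketch (illustrative; the synthetic data and the 2.0 distance threshold are assumptions, not from the original answer) shows that the tree is built without ever specifying k, and clusters are obtained afterwards by cutting at a distance:

```python
# Illustrative sketch: agglomerative clustering without specifying k up front.
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

rng = np.random.default_rng(0)
# Three synthetic groups of 2-D points (for illustration only).
X = np.vstack([rng.normal(loc, 0.3, size=(20, 2)) for loc in (0.0, 3.0, 6.0)])

Z = linkage(X, method="average")                   # bottom-up merging; no k needed
labels = fcluster(Z, t=2.0, criterion="distance")  # cut the tree at distance 2.0
print("clusters found:", len(set(labels)))
```

The number of clusters falls out of the chosen cut height rather than being an input to the algorithm.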

 

5. A good way to pick the number of clusters, k, used for k-Means clustering is to try multiple values of k and choose the value that minimizes the distortion measure.

(a) TRUE                                                   (b) FALSE

Answer: TRUE

Choosing an optimal value of k is a hard problem. The usual approach is to run k-means for several candidate values of k and compare the resulting distortion (the within-cluster sum of squared distances). Because the distortion always decreases as k grows, the value is normally chosen at the “elbow” of the distortion curve, where adding further clusters yields little improvement, rather than at the absolute minimum. Alternatives include silhouette analysis and the gap statistic, but the elbow method is the most popular.
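The sketch below (illustrative; the synthetic blobs and the range of k values are assumptions, not part of the original answer) computes the distortion, reported by scikit-learn as inertia_, for several values of k; the elbow of this curve is the usual choice:

```python
# Illustrative sketch of the elbow method for choosing k in k-means.
# The distortion (inertia_) always decreases with k, so look for the point
# where the improvement levels off rather than the absolute minimum.
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

X, _ = make_blobs(n_samples=300, centers=4, cluster_std=0.8, random_state=3)

for k in range(1, 9):
    km = KMeans(n_clusters=k, n_init=10, random_state=3).fit(X)
    print(f"k={k}: distortion (inertia) = {km.inertia_:.1f}")
```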

 

*********************

Related links:

 

k-means clustering

Hierarchical agglomerative clustering

Overfitting in neural networks - how to solve?

Linear SVM vs Perceptron. Which is best?

Support vector machine

