
Monday, January 11, 2021

Machine Learning TRUE or FALSE Questions with Answers 19

Machine learning exam questions, ML solved quiz questions, Machine Learning TRUE or FALSE questions, TOP 5 machine learning quiz questions with answers

Machine Learning TRUE / FALSE Questions - SET 19

1. Solving a non-linear separation problem with a hard-margin kernelized SVM (Gaussian RBF kernel) might lead to overfitting.

(a) TRUE                                                   (b) FALSE

Answer: TRUE

When there are outliers, a hard-margin SVM with a Gaussian RBF kernel produces an unnecessarily complicated decision boundary that overfits the training noise.

In SVM, to avoid overfitting, we choose a soft margin instead of a hard margin, i.e. we intentionally let some data points enter the margin so that the classifier doesn't overfit the training sample.

SVMs are generally less prone to overfitting than many other methods, but a hard margin combined with a flexible kernel removes that safeguard.

[Refer here for more]
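A minimal sketch of this effect, using scikit-learn (which has no explicit hard-margin mode, so a very large C stands in for it; the dataset and parameters below are illustrative assumptions):

```python
# Sketch: approximating a hard-margin RBF SVM with a very large C.
from sklearn.svm import SVC
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split

# Non-linear, noisy data: the noise plays the role of outliers.
X, y = make_moons(n_samples=300, noise=0.3, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

for C in (1e10, 1.0):   # "hard" margin vs. soft margin
    clf = SVC(kernel="rbf", gamma=5.0, C=C).fit(X_tr, y_tr)
    print(f"C={C:g}: train={clf.score(X_tr, y_tr):.2f}, "
          f"test={clf.score(X_te, y_te):.2f}")
```

The near-hard-margin fit typically scores higher on the training set but lower on the test set, which is the overfitting the question describes.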

 

2. Random forests can be used to classify infinite-dimensional data.

(a) TRUE                                                   (b) FALSE

Answer: TRUE

Random forests handle high-dimensional data well because each split considers only a random subset of the features. With random forests there is little harm in keeping columns whose importance is uncertain, and little harm in adding more columns.

That said, random forests do not always perform well on very-high-dimensional data, because the informative features become harder to sample at each split.
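A small illustration (a scikit-learn sketch; the 1000-feature dataset with only 10 informative columns is an illustrative assumption):

```python
# Sketch: a random forest on high-dimensional data with many irrelevant columns.
from sklearn.ensemble import RandomForestClassifier
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score

# 1000 features, only 10 of them informative (the rest are noise).
X, y = make_classification(n_samples=500, n_features=1000,
                           n_informative=10, random_state=0)
rf = RandomForestClassifier(n_estimators=200, random_state=0)
print("CV accuracy:", cross_val_score(rf, X, y, cv=5).mean())
```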

 

3. The training accuracy increases as the size of the tree grows (assuming no noise).

(a) TRUE                                                   (b) FALSE

Answer: TRUE

The training accuracy increases as the size of the tree grows until the tree fits all the training data.

A decision tree overfits the training data when its accuracy on the training data goes up but its accuracy on unseen data goes down.
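A quick sketch of this behavior (using scikit-learn's breast-cancer dataset purely as an example):

```python
# Sketch: training accuracy of a decision tree grows with tree size
# until the tree fits the training data exactly.
from sklearn.tree import DecisionTreeClassifier
from sklearn.datasets import load_breast_cancer

X, y = load_breast_cancer(return_X_y=True)
for depth in (1, 2, 4, 8, None):   # None = grow until leaves are pure
    tree = DecisionTreeClassifier(max_depth=depth, random_state=0).fit(X, y)
    print(f"max_depth={depth}: training accuracy = {tree.score(X, y):.3f}")
```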

 

4. Hierarchical clustering methods require a predefined number of clusters, much like k-means.

(a) TRUE                                                   (b) FALSE

Answer: FALSE

We do not need to predefine the number of clusters in hierarchical clustering as we do in k-means. Hierarchical clustering starts with each data point as an individual cluster and repeatedly merges the most similar clusters; the number of clusters is decided afterwards by where the resulting dendrogram is cut.
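A minimal sketch, assuming scikit-learn's agglomerative implementation and an illustrative distance threshold:

```python
# Sketch: agglomerative (hierarchical) clustering without a predefined
# number of clusters -- the dendrogram is cut by a distance threshold instead.
from sklearn.cluster import AgglomerativeClustering
from sklearn.datasets import make_blobs

X, _ = make_blobs(n_samples=100, centers=3, random_state=0)
hc = AgglomerativeClustering(n_clusters=None, distance_threshold=10.0).fit(X)
print("clusters found:", hc.n_clusters_)  # falls out of the cut, not fixed in advance
```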

 

5. Suppose that X1, X2, ..., Xm are categorical input attributes and Y is a categorical output attribute. Suppose we plan to learn a decision tree without pruning, using the standard algorithm. The maximum depth of the decision tree must be less than m+1.

(a) TRUE                                                   (b) FALSE

Answer: TRUE

Because the attributes are categorical, each attribute can be tested at most once along any root-to-leaf path. With m attributes, a path therefore contains at most m tests, so the depth of the tree is at most m, which is less than m+1.

 

*********************

Related links:

 

Decision tree

Overfitting in decision tree

Random forest

Support vector machine

Wednesday, December 30, 2020

Machine Learning Multiple Choice Questions and Answers 24

Top 5 Machine Learning Quiz Questions with Answers explanation, Interview questions on machine learning, quiz questions for data scientist answers explained, machine learning exam questions, question bank in machine learning, classification, ridge regression, lasso regression, statistics


Machine learning Quiz Questions - Set 24

1. The classifier’s behavior is determined by the coefficients. These coefficients are usually referred to as ________.

a) Weights

b) Tasks

c) Values

d) Behaviors

Answer: (a) Weights

The classifier’s behavior is determined by the coefficients wi. These are usually called weights.
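A toy sketch of the mechanics (all numbers are hypothetical):

```python
# Sketch: a linear classifier's prediction is determined by its weights w_i.
import numpy as np

w = np.array([0.8, -0.5])   # learned weights (coefficients)
b = 0.1                     # bias / intercept
x = np.array([1.0, 2.0])    # one input example

score = w @ x + b           # weighted sum of the inputs
print("predicted class:", 1 if score >= 0 else 0)
```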

 

2. Null and alternative hypotheses are statements about:

a) population parameters.

b) sample parameters.

c) sample statistics.

d) it depends - sometimes population parameters and sometimes sample statistics.

Answer: (a) Population parameters

The null and alternative hypotheses are two mutually exclusive statements about a population. A hypothesis test uses sample data to determine whether to reject the null hypothesis.

Null hypothesis (H0) - The null hypothesis states that a population parameter (such as the mean, the standard deviation, and so on) is equal to a hypothesized value.

Alternative hypothesis (H1) - The alternative hypothesis states that a population parameter is smaller than, greater than, or different from the hypothesized value in the null hypothesis. [Refer for more]
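A minimal sketch with SciPy (the simulated sample and the hypothesized mean of 5 are illustrative assumptions):

```python
# Sketch: a one-sample t-test. H0 and H1 are statements about the
# population mean mu (H0: mu = 5 vs. H1: mu != 5), not about the sample.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
sample = rng.normal(loc=5.3, scale=1.0, size=40)   # observed sample

t_stat, p_value = stats.ttest_1samp(sample, popmean=5.0)
print(f"t = {t_stat:.2f}, p = {p_value:.3f}")      # small p => evidence against H0
```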

 

3. In hypothesis testing, a Type 2 error occurs when

a) The null hypothesis is not rejected when the null hypothesis is true.

b) The null hypothesis is rejected when the null hypothesis is true.

c) The null hypothesis is not rejected when the alternative hypothesis is true.

d) The null hypothesis is rejected when the alternative hypothesis is true.

Answer: (c) The null hypothesis is not rejected when the alternative hypothesis is true

A Type 2 error occurs when the null hypothesis is false and we fail to reject it.
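A small simulation sketch of this (the effect size and sample size are illustrative assumptions):

```python
# Sketch: estimating the Type 2 error rate by simulation. Here the
# alternative is true (mu = 5.3, not 5.0), and we count how often the
# test still fails to reject H0 at the 0.05 level.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
trials, misses = 2000, 0
for _ in range(trials):
    sample = rng.normal(loc=5.3, scale=1.0, size=20)
    _, p = stats.ttest_1samp(sample, popmean=5.0)
    if p >= 0.05:              # fail to reject H0 although H0 is false
        misses += 1
print("estimated Type 2 error rate:", misses / trials)
```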

 

4. What type of penalty is used on regression weights in Ridge regression?

a) L0

b) L1

c) L2

d) None of the above

Answer: (c) L2

Ridge regression adds the “squared magnitude” of the coefficients as a penalty term to the loss function; this is the L2 penalty.

Ridge regression shrinks the regression coefficients, so that variables with a minor contribution to the outcome have their coefficients close to zero.

The shrinkage of the coefficients is achieved by penalizing the regression model with the L2-norm penalty, the sum of the squared coefficients. L2 regularization is used to avoid overfitting.

When do we use L2 regularization?

L2 regularization is best used when we do not expect a sparse solution, i.e. when no feature selection needs to be done and all features should be kept with shrunken coefficients.
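A minimal scikit-learn sketch of the shrinkage (the dataset and alpha are illustrative):

```python
# Sketch: ridge regression shrinks the coefficient vector compared with
# ordinary least squares, but does not zero coefficients out.
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.datasets import make_regression

X, y = make_regression(n_samples=100, n_features=20, n_informative=5,
                       noise=10.0, random_state=0)
ols = LinearRegression().fit(X, y)
ridge = Ridge(alpha=10.0).fit(X, y)   # alpha sets the L2 penalty strength

print("OLS   ||w||:", round(float(np.linalg.norm(ols.coef_)), 2))
print("Ridge ||w||:", round(float(np.linalg.norm(ridge.coef_)), 2))
print("Ridge coefficients exactly zero:", int((ridge.coef_ == 0).sum()))
```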

 

5. Which property of the coefficients is added as the penalty term to the loss function in Lasso regression?

a) Squared magnitude

b) Absolute value of magnitude

c) Number of non-zero entries

d) None of the above

Answer: (b) Absolute value of magnitude

Lasso regression adds the “absolute value of magnitude” of the coefficients as a penalty term to the loss function.

Lasso regression shrinks the regression coefficients toward zero by penalizing the regression model with the L1-norm penalty, the sum of the absolute values of the coefficients. This penalty can set some coefficients exactly to zero, so lasso also performs feature selection.
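A matching sketch for the L1 penalty (same illustrative dataset as the ridge example above):

```python
# Sketch: the L1 penalty in lasso drives many coefficients to exactly zero.
from sklearn.linear_model import Lasso
from sklearn.datasets import make_regression

X, y = make_regression(n_samples=100, n_features=20, n_informative=5,
                       noise=10.0, random_state=0)
lasso = Lasso(alpha=1.0).fit(X, y)    # alpha sets the L1 penalty strength
print("zero coefficients:", int((lasso.coef_ == 0).sum()), "of", len(lasso.coef_))
```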

 

**********************

Related links:

Multiple choice quiz questions in machine learning

Differentiate between Lasso and ridge regression

When do we need L2 regularization technique

Define Type 2 error

What does null and alternative hypotheses state about population parameters


Saturday, December 5, 2020

Machine Learning TRUE or FALSE Questions with Answers 18

Machine learning exam questions, ML solved quiz questions, Machine Learning TRUE or FALSE questions, TOP 5 machine learning quiz questions with answers

Machine Learning TRUE / FALSE Questions - SET 18

1. For linearly separable data, a small slack penalty (“C”) can hurt the training accuracy when using a linear SVM without a kernel.

(a) TRUE                                                   (b) FALSE

Answer: TRUE

If the optimal values of the α's (in the dual formulation) are greater than C, we may end up with a sub-optimal decision boundary with respect to the training examples. Alternatively, a small C can allow large slacks, so the resulting classifier will have a small ǁwǁ2 but can have non-zero training error.

 

C is a regularization parameter that controls the trade-off between achieving a low training error and a low testing error, that is, the ability to generalize the classifier to unseen data. If C is too small, the objective is dominated by margin maximization: the optimizer keeps ǁwǁ small and tolerates large slacks, which can lead to a large training error.

The C parameter controls the tolerance for margin violations: a low C allows more violations (outliers inside or across the margin), a high C allows fewer.
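A minimal sketch of a small C hurting training accuracy (the imbalanced blobs are an illustrative assumption that makes the effect easy to see):

```python
# Sketch: on linearly separable (here, deliberately imbalanced) data,
# a tiny C lets the optimizer accept slack rather than fit every point.
from sklearn.datasets import make_blobs
from sklearn.svm import LinearSVC

# Two well-separated blobs, 80 vs. 20 points: perfectly separable.
X, y = make_blobs(n_samples=[80, 20], cluster_std=0.5, random_state=0)

for C in (1e-4, 1.0):
    clf = LinearSVC(C=C, max_iter=10000).fit(X, y)
    print(f"C={C:g}: training accuracy = {clf.score(X, y):.2f}")
```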

 

2. Ridge regression, weight decay, and Gaussian processes use the same regularizer.

(a) TRUE                                                   (b) FALSE

Answer: TRUE

Ridge regression, weight decay, and Gaussian processes use the same regularizer ǁwǁ2.

Regularization

In the context of machine learning, regularization is the process of shrinking the coefficients towards zero. In simple words, regularization discourages learning a more complex or flexible model, to prevent overfitting. [For more, refer here]

Regularization may be defined as any change we make to the training algorithm in order to reduce the generalization error but not the training error.

Ridge regression is like least-square regression with an additional penalty term ǁwǁ2.

Weight decay means decreasing the weights at every learning step.

A Gaussian process is a generative model in which the weights of the target function are drawn according to a Gaussian distribution (for a linear model).
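A one-step numerical sketch of why weight decay and the L2 penalty coincide for plain gradient descent (the learning rate and lambda are illustrative):

```python
# Sketch: one gradient step on an L2-regularized loss equals weight decay.
# Adding (lambda/2)*||w||^2 to the loss contributes lambda*w to the gradient,
# which is the same as multiplying w by (1 - lr*lambda) before the plain step.
import numpy as np

rng = np.random.default_rng(0)
w = rng.normal(size=5)
grad = rng.normal(size=5)          # gradient of the unregularized loss at w
lr, lam = 0.1, 0.01

w_l2 = w - lr * (grad + lam * w)            # step on the regularized loss
w_decay = (1 - lr * lam) * w - lr * grad    # decay the weights, then step

print(np.allclose(w_l2, w_decay))           # True: the two updates coincide
```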

 

3. Linear soft-margin SVM can only be used when training data are linearly separable.

(a) TRUE                                                   (b) FALSE

Answer: FALSE

A hard-margin SVM works only when the data are completely linearly separable without any errors (noise or outliers); in the presence of errors, either the margin shrinks or the hard-margin SVM fails altogether. The soft-margin SVM was proposed to solve this problem by introducing slack variables; it is an extended version of the hard-margin SVM that can also handle non-separable data.
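For reference, the soft-margin primal objective with slack variables ξi, in standard notation:

```latex
\min_{w,\,b,\,\xi}\ \frac{1}{2}\lVert w\rVert^{2} + C\sum_{i=1}^{n}\xi_{i}
\qquad \text{s.t.}\qquad
y_{i}\,(w^{\top}x_{i}+b) \ge 1-\xi_{i},\quad \xi_{i}\ge 0 .
```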

 

4. In linear regression, using an L2 regularization penalty term results in sparser solutions than using an L1 regularization penalty term.

(a) TRUE                                                   (b) FALSE

Answer: FALSE

In linear regression, using an L1 regularization penalty term results in sparser solutions than using an L2 regularization penalty term.

 

L1 regularization adds an L1 penalty equal to the absolute value of the magnitude of coefficients. In other words, it limits the size of the coefficients. L1 can yield sparse models (i.e. models with few coefficients).

L2 regularization adds an L2 penalty equal to the square of the magnitude of coefficients. L2 will not yield sparse models and all coefficients are shrunk by the same factor. [For more, please refer here] 
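A minimal sketch counting exactly-zero coefficients under each penalty (the dataset and alpha are illustrative):

```python
# Sketch: comparing sparsity of L1 (lasso) vs. L2 (ridge) solutions.
from sklearn.linear_model import Lasso, Ridge
from sklearn.datasets import make_regression

X, y = make_regression(n_samples=100, n_features=50, n_informative=5,
                       noise=5.0, random_state=0)
for model in (Lasso(alpha=1.0), Ridge(alpha=1.0)):
    coef = model.fit(X, y).coef_
    print(type(model).__name__, "zero coefficients:", int((coef == 0).sum()))
```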

 

5. Maximum likelihood estimation gives us not only a point estimate, but a distribution over the parameters that we are estimating.

(a) TRUE                                                   (b) FALSE

Answer: FALSE

In statistics, maximum likelihood estimation (MLE) is a method of estimating the parameters of a probability distribution by maximizing a likelihood function, so that under the assumed statistical model the observed data is most probable. The point in the parameter space that maximizes the likelihood function is called the maximum likelihood estimate. [Refer here]

MLE is a method of estimating the parameters of a statistical model by picking the parameters that maximize the likelihood function. It returns only this single point estimate; obtaining a full distribution over the parameters would require a Bayesian treatment instead.
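A minimal sketch for a Gaussian model (the data are simulated for illustration):

```python
# Sketch: the Gaussian MLE returns point estimates (single numbers),
# not a distribution over the parameters.
import numpy as np

rng = np.random.default_rng(0)
data = rng.normal(loc=2.0, scale=1.5, size=1000)

mu_hat = data.mean()                        # MLE of the mean
var_hat = ((data - mu_hat) ** 2).mean()     # MLE of the variance (the 1/n form)
print(f"mu_hat = {mu_hat:.3f}, var_hat = {var_hat:.3f}")
# A Bayesian approach, by contrast, would yield a posterior distribution
# over (mu, sigma^2) rather than these two numbers.
```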

 

*********************

Related links:

 

Maximum Likelihood Estimation

L1 and L2 regularization

Difference between hard-margin and soft-margin SVM

Regularization in ridge regression

What is slack variable

Differentiate between L1 and L2 regularization 
