Machine learning exam questions, ML solved quiz questions, Machine Learning TRUE or FALSE questions, TOP 5 machine learning quiz questions with answers
Machine Learning TRUE / FALSE Questions - SET 18
1. For linearly separable data, a small slack penalty ("C") can hurt the training accuracy when using a linear SVM (no kernel).
(a) TRUE (b) FALSE
Answer: TRUE If the optimal values of the α's (say, in the dual formulation of the hard-margin problem) are greater than C, the box constraint 0 ≤ α ≤ C forces them down, and we may end up with a sub-optimal decision boundary with respect to the training examples. Alternatively, a small C makes large slacks cheap, so the resulting classifier will have a small value of ǁwǁ^{2} (a wide margin) but can have non-zero training error.
C is a regularization parameter that controls the trade-off between achieving a low training error and keeping ǁwǁ small (a wide margin), which is what helps the classifier generalize to unseen data. If C is too small, the objective is free to shrink ǁwǁ a lot at the expense of margin violations, which can lead to a large training error. In other words, C controls how strongly outliers and margin violations are penalized: a low C tolerates more of them, a high C tolerates fewer.
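The effect of a small C can be checked on a toy problem. The following is a minimal sketch, not a real SVM solver: it assumes a made-up 1-D dataset and finds the minimiser of the primal soft-margin objective by brute-force grid search. The data, the grid ranges, and the names `objective`, `fit_by_grid`, and `training_accuracy` are all illustrative assumptions.

```python
# Toy 1-D data (made up): one negative point and four positive points.
# The data are linearly separable, e.g. by the threshold x = 0.
X = [-1.0, 1.0, 2.0, 3.0, 4.0]
y = [-1, 1, 1, 1, 1]

def objective(w, b, C):
    """Primal soft-margin objective: 0.5 * w^2 + C * (sum of hinge slacks)."""
    slack = sum(max(0.0, 1.0 - yi * (w * xi + b)) for xi, yi in zip(X, y))
    return 0.5 * w * w + C * slack

def fit_by_grid(C, steps_per_unit=20):
    """Brute-force the minimiser over w in [0, 2], b in [-2, 2] (illustration only)."""
    candidates = (
        (i / steps_per_unit, j / steps_per_unit)
        for i in range(0, 2 * steps_per_unit + 1)
        for j in range(-2 * steps_per_unit, 2 * steps_per_unit + 1)
    )
    return min(candidates, key=lambda wb: objective(wb[0], wb[1], C))

def training_accuracy(w, b):
    """Fraction of training points on the correct side of the boundary."""
    correct = sum(1 for xi, yi in zip(X, y) if yi * (w * xi + b) > 0)
    return correct / len(X)

w_small, b_small = fit_by_grid(C=0.1)
w_large, b_large = fit_by_grid(C=100.0)
acc_small = training_accuracy(w_small, b_small)
acc_large = training_accuracy(w_large, b_large)

print(f"C=0.1: w={w_small:.2f}, b={b_small:.2f}, training accuracy={acc_small:.2f}")
print(f"C=100: w={w_large:.2f}, b={b_large:.2f}, training accuracy={acc_large:.2f}")
```

On this toy data the small-C fit prefers a small w and gives up on the lone negative point, while the large-C fit recovers the separating boundary with w = 1, b = 0.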
2. Ridge regression, weight decay, and Gaussian processes use the same regularizer.
(a) TRUE (b) FALSE
Answer: TRUE Ridge regression, weight decay, and Gaussian processes use the same regularizer, ǁwǁ^{2}.
Regularization: in the context of machine learning, regularization is the process that shrinks the coefficients towards zero. In simple words, regularization discourages learning a more complex or flexible model, to prevent overfitting. More generally, regularization may be defined as any change we make to the training algorithm in order to reduce the generalization error but not the training error.
Ridge regression is least-squares regression with an additional penalty term ǁwǁ^{2}. Weight decay means shrinking the weights a little at every learning step, which for plain gradient descent is exactly what the gradient of a ǁwǁ^{2} penalty does. A Gaussian process (for a linear model) is a generative model in which the weights of the target function are drawn from a Gaussian distribution; the negative log of a zero-mean Gaussian prior on w is again proportional to ǁwǁ^{2}.
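The link between ridge regression and weight decay can be verified numerically. The sketch below assumes a made-up 1-D regression problem with a single weight, plain gradient descent, and the penalty written as 0.5 * lam * w^2; the data and the names `lam`, `lr`, `w_ridge`, and `w_decay` are illustrative assumptions.

```python
# Toy 1-D regression data (made up for illustration).
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.5]
lam = 1.0   # strength of the squared-norm penalty
lr = 0.01   # learning rate

# Ridge regression, closed form for a single weight:
# minimise 0.5 * sum((y - w*x)^2) + 0.5 * lam * w^2.
w_ridge = sum(x * t for x, t in zip(xs, ys)) / (sum(x * x for x in xs) + lam)

# Gradient descent on the unpenalised squared error, with weight decay:
# each step shrinks w by the factor (1 - lr * lam), then applies the data gradient.
w_decay = 0.0
for _ in range(2000):
    data_grad = -sum(x * (t - w_decay * x) for x, t in zip(xs, ys))
    w_decay = (1.0 - lr * lam) * w_decay - lr * data_grad

print(f"ridge closed form: {w_ridge:.6f}")
print(f"weight decay:      {w_decay:.6f}")
```

Under these assumptions both procedures arrive at the same weight, because the decay factor is just the gradient step of the squared-norm penalty. With adaptive optimizers the two are not generally identical.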
3. Linear soft-margin SVM can only be used when training data are linearly separable.
(a) TRUE (b) FALSE
Answer: FALSE Hard-margin SVM works only when the data are completely linearly separable, without any errors (noise or outliers). When such errors are present, either the margin becomes smaller or hard-margin SVM fails to find a solution at all. Soft-margin SVM was proposed to solve this problem by introducing slack variables, which let individual points violate the margin at a cost. It is an extended version of hard-margin SVM.
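To make the difference concrete, here is a minimal sketch on a made-up 1-D dataset that is not linearly separable. It scans a coarse grid of candidate hyperplanes rather than calling a real solver; the data, the grid, C = 1, and the names `margins` and `soft_objective` are illustrative assumptions.

```python
# Toy 1-D data that is NOT linearly separable: the negative point at 1.5
# sits between the positive points at 1.0 and 2.0.
X = [-2.0, -1.0, 1.0, 2.0, 1.5]
y = [-1, -1, 1, 1, -1]

# Candidate hyperplanes (w, b) on a coarse grid, with w and b in [-2, 2].
GRID = [(i / 20, j / 20) for i in range(-40, 41) for j in range(-40, 41)]

def margins(w, b):
    """Functional margins y * (w*x + b) of all training points."""
    return [yi * (w * xi + b) for xi, yi in zip(X, y)]

# Hard margin: every point must satisfy y * (w*x + b) >= 1.
n_hard_feasible = sum(1 for w, b in GRID if min(margins(w, b)) >= 1.0)

def soft_objective(w, b, C=1.0):
    """Soft margin: slack variables absorb violations, so any (w, b) is feasible."""
    slacks = [max(0.0, 1.0 - m) for m in margins(w, b)]
    return 0.5 * w * w + C * sum(slacks)

w_soft, b_soft = min(GRID, key=lambda wb: soft_objective(wb[0], wb[1]))
slacks = [max(0.0, 1.0 - m) for m in margins(w_soft, b_soft)]
accuracy = sum(1 for m in margins(w_soft, b_soft) if m > 0) / len(X)

print("hard-margin feasible grid points:", n_hard_feasible)
print(f"soft-margin fit: w={w_soft:.2f}, b={b_soft:.2f}, accuracy={accuracy:.2f}")
```

No grid point satisfies the hard-margin constraints, whereas the soft-margin objective still has a minimiser; it simply pays slack for the point it cannot classify correctly.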
4. In linear regression, using an L2 regularization penalty term results in sparser solutions than using an L1 regularization penalty term.
(a) TRUE (b) FALSE
Answer: FALSE In linear regression, using an L1 regularization penalty term results in sparser solutions than using an L2 regularization penalty term.
L1 regularization adds a penalty equal to the sum of the absolute values of the coefficients. In other words, it limits the size of the coefficients. L1 can yield sparse models (i.e. models with few non-zero coefficients), because it can drive some coefficients exactly to zero. L2 regularization adds a penalty equal to the sum of the squared coefficients. L2 shrinks coefficients towards zero but generally does not make them exactly zero, so it will not yield sparse models; with orthonormal features, all coefficients are shrunk by the same factor.
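The sparsity difference is easiest to see in the special case of orthonormal features, where both penalised problems decouple per coefficient and have closed forms. The sketch below assumes that special case, with objectives 0.5*(z - w)^2 + lam*|w| for L1 and 0.5*(z - w)^2 + 0.5*lam*w^2 for L2, where z is the unpenalised least-squares coefficient. The coefficient values and the names `lasso_coef` and `ridge_coef` are made up for illustration.

```python
import math

def lasso_coef(z, lam):
    """Soft-thresholding: minimiser of 0.5*(z - w)^2 + lam*|w|."""
    shrunk = abs(z) - lam
    return math.copysign(shrunk, z) if shrunk > 0 else 0.0

def ridge_coef(z, lam):
    """Minimiser of 0.5*(z - w)^2 + 0.5*lam*w^2: uniform shrinkage by 1/(1 + lam)."""
    return z / (1.0 + lam)

# Hypothetical least-squares coefficients (made up for illustration).
ols = [3.0, -0.4, 0.2, -1.5, 0.05]
lam = 0.5

lasso = [lasso_coef(z, lam) for z in ols]
ridge = [ridge_coef(z, lam) for z in ols]
n_zero_lasso = sum(1 for w in lasso if w == 0.0)
n_zero_ridge = sum(1 for w in ridge if w == 0.0)

print("L1:", lasso, "zeros:", n_zero_lasso)
print("L2:", ridge, "zeros:", n_zero_ridge)
```

Every coefficient whose magnitude is below lam is set exactly to zero by the L1 rule, while the L2 rule only rescales, so small coefficients stay small but non-zero.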
5. Maximum likelihood estimation gives us not only a point estimate, but a distribution over the parameters that we are estimating.
(a) TRUE (b) FALSE
Answer: FALSE In statistics, maximum likelihood estimation (MLE) is a method of estimating the parameters of a probability distribution by maximizing a likelihood function, so that under the assumed statistical model the observed data are most probable. The point in the parameter space that maximizes the likelihood function is called the maximum likelihood estimate. MLE therefore returns a single point estimate; a distribution over the parameters is what Bayesian inference produces, in the form of a posterior.
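A coin-flip example shows the contrast. This is a minimal sketch with made-up data; the Bayesian part assumes a uniform Beta(1, 1) prior, which is one common choice and not something the question prescribes.

```python
# Hypothetical coin-flip data: 7 heads in 10 tosses.
heads, tosses = 7, 10
tails = tosses - heads

# MLE for a Bernoulli parameter: a single number, with no spread attached.
p_mle = heads / tosses

# Bayesian alternative (not MLE): with a uniform Beta(1, 1) prior, the
# posterior over p is Beta(1 + heads, 1 + tails), i.e. a full distribution.
a, b = 1 + heads, 1 + tails
post_mean = a / (a + b)
post_var = a * b / ((a + b) ** 2 * (a + b + 1))

print(f"MLE point estimate: {p_mle}")
print(f"Posterior: Beta({a}, {b}), mean={post_mean:.3f}, variance={post_var:.4f}")
```

The MLE output is one value of p, whereas the posterior carries a mean and a variance that describe the remaining uncertainty about p.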