
# Top 3 Machine Learning Quiz Questions with Answers and Explanations: interview questions on machine learning, quiz questions for data scientists, machine learning exam questions

## Machine learning MCQ - Set 14

1. The error function most suited for gradient descent using logistic regression is

a) The entropy function.

b) The squared error.

c) The cross-entropy function.

d) The number of mistakes.

 Answer: (c) The cross-entropy function

For logistic regression, the cross-entropy function (the loss or cost function) is convex. A convex function has no local minima other than the global one, so gradient descent with a suitable step size cannot get stuck in a poor local minimum, whatever the starting point.

Because the cross-entropy cost is convex, a variety of local optimization schemes can be used to minimize it reliably. For this reason it is used more often in practice for logistic regression than the least squares cost applied to the logistic output, which is non-convex in general.

The cost function returns a value that represents how well the model performs; it measures the amount of error. To find the model with the minimum cost, we use gradient descent.
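As a rough illustration of the answer to question 1, the sketch below runs plain gradient descent on the mean cross-entropy of a one-feature logistic regression. The toy data, the learning rate and the function names (`sigmoid`, `cross_entropy`, `gradient_descent`) are assumptions made for this example only; they do not come from the quiz.

```python
import math

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def cross_entropy(w, b, xs, ys):
    # Mean cross-entropy (log loss) of the model p = sigmoid(w * x + b).
    eps = 1e-12  # guards against log(0)
    total = 0.0
    for x, y in zip(xs, ys):
        p = sigmoid(w * x + b)
        total -= y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps)
    return total / len(xs)

def gradient_descent(xs, ys, lr=0.1, steps=500):
    # Start from w = b = 0 and record the loss before every update.
    w, b = 0.0, 0.0
    n = len(xs)
    losses = []
    for _ in range(steps):
        losses.append(cross_entropy(w, b, xs, ys))
        # Gradient of the mean cross-entropy: averages of (p - y) * x and (p - y).
        errors = [sigmoid(w * x + b) - y for x, y in zip(xs, ys)]
        grad_w = sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = sum(errors) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b, losses

# Hypothetical toy data: the labels overlap, so the minimum is finite.
xs = [-2.0, -1.0, -0.5, 0.5, 1.0, 2.0]
ys = [0, 0, 1, 0, 1, 1]
w, b, losses = gradient_descent(xs, ys)
print(f"first loss: {losses[0]:.4f}, last loss: {losses[-1]:.4f}")
```

Because the loss is convex and the step size is small, the recorded loss should fall steadily from its starting value of log 2 rather than oscillate or stall at a poor local minimum.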

2. You are given a labeled binary classification data set with N data points and D features. Suppose that N < D. In training an SVM on this data set, which of the following kernels is likely to be most appropriate?

a) Linear kernel

c) Higher-order polynomial kernel

d) RBF kernel

 Answer: (a) Linear kernel

A linear kernel is appropriate when the data is linearly separable, that is, when the classes can be separated by a single hyperplane (a line in two dimensions). It is one of the most commonly used kernels, and it is mostly used when a data set has a large number of features.

When the number of examples is small compared with the number of features, there is not enough data to fit a non-linear SVM, i.e. an SVM with a non-linear kernel, without overfitting. With N < D, points in general position can already be separated by a hyperplane, so an SVM with a linear kernel (or without a kernel) is one way to go.
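To make the N < D argument concrete, here is a minimal sketch that uses a perceptron as a stand-in for a linear classifier (it is not an SVM, and the random data, the seed and the names `train_perceptron` and `accuracy` are all assumptions for illustration). Even with arbitrary labels, a linear boundary fits the training set perfectly when there are fewer points than features, which suggests that a more flexible kernel would mainly add room to overfit.

```python
import random

def train_perceptron(X, y, max_epochs=1000):
    # Classic perceptron: learns a linear rule sign(w . x + b).
    d = len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(max_epochs):
        mistakes = 0
        for xi, yi in zip(X, y):
            score = sum(wj * xj for wj, xj in zip(w, xi)) + b
            if yi * score <= 0:  # misclassified (or on the boundary)
                w = [wj + yi * xj for wj, xj in zip(w, xi)]
                b += yi
                mistakes += 1
        if mistakes == 0:  # every training point is on the correct side
            break
    return w, b

def accuracy(w, b, X, y):
    # Fraction of points whose predicted sign matches the label.
    correct = 0
    for xi, yi in zip(X, y):
        score = sum(wj * xj for wj, xj in zip(w, xi)) + b
        correct += (1 if score > 0 else -1) == yi
    return correct / len(X)

rng = random.Random(0)
N, D = 10, 50  # fewer data points than features
X = [[rng.gauss(0.0, 1.0) for _ in range(D)] for _ in range(N)]
y = [rng.choice([-1, 1]) for _ in range(N)]  # arbitrary labels

w, b = train_perceptron(X, y)
train_acc = accuracy(w, b, X, y)
print("training accuracy with a linear boundary:", train_acc)
```

The perceptron is used here only because it needs no external library; the separability it demonstrates is a property of the data, so the same reasoning applies to a linear-kernel SVM.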

3. You are increasing the size of the layers (more hidden units per layer) in your neural network. What impact will this have on bias and variance?

a) increases, increases

b) increases, decreases

c) decreases, increases

d) decreases, decreases

 Answer: (c) decreases, increases

Increasing the size of the layers decreases bias and increases variance. Larger layers increase the complexity of the model. Low bias means the model fits the training data well. High variance means the model performs well on the training data but poorly on test data.
