Top 10 New MCQs on SVM Concepts (2025 Edition)
1. Which of the following best describes the margin in an SVM classifier?
A. Distance between two closest support vectors
B. Distance between support vectors of opposite classes
C. Distance between decision boundary and the nearest data point of any class
D. Width of the separating hyperplane
Answer: C
Explanation: The margin is the perpendicular distance from the decision boundary to the closest data point of any class (such a point is called a support vector). SVM aims to maximize this margin.
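A minimal sketch of this idea, assuming scikit-learn and a tiny hand-made dataset chosen only for illustration: for a linear SVM, the width between the two margin boundaries equals 2/||w||, so it can be read off the fitted weight vector.

```python
# Sketch (assumes scikit-learn): the margin of a linear SVM from its weights.
import numpy as np
from sklearn.svm import SVC

# Two well-separated toy clusters.
X = np.array([[1, 1], [2, 1], [1, 2], [6, 6], [7, 6], [6, 7]])
y = np.array([0, 0, 0, 1, 1, 1])

clf = SVC(kernel="linear", C=1e6)   # very large C ~ hard margin
clf.fit(X, y)

w = clf.coef_[0]
margin_width = 2 / np.linalg.norm(w)   # distance between the two margin hyperplanes
print("weight vector:", w)
print("margin width (2/||w||):", margin_width)
print("support vectors:", clf.support_vectors_)
```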
2. In soft-margin SVM, the penalty parameter C controls what?
A. The kernel function complexity
B. The balance between margin width and classification errors
C. The learning rate during optimization
D. The dimensionality of transformed space
Answer: B
Explanation: Parameter C determines how much misclassification is tolerated. A large C → fewer violations, smaller margin; a small C → allows more violations, larger margin.
A Soft Margin SVM is a type of Support Vector Machine that allows some misclassification or margin violations in order to achieve better generalization when data is not perfectly linearly separable. In simple words, Soft Margin SVM finds the best possible separating hyperplane that balances maximum margin and minimum classification error.
Overfitting vs Underfitting: Large C - risk of overfitting, small C - risk of underfitting.
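A small sketch of this trade-off, assuming scikit-learn and a synthetic overlapping dataset: as C grows, fewer violations are tolerated and the margin shrinks.

```python
# Sketch (assumes scikit-learn): effect of C on margin width and support vectors.
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.svm import SVC

X, y = make_blobs(n_samples=200, centers=2, cluster_std=2.5, random_state=0)

for C in (0.01, 1, 100):
    clf = SVC(kernel="linear", C=C).fit(X, y)
    margin = 2 / np.linalg.norm(clf.coef_[0])
    print(f"C={C:>6}: margin width={margin:.3f}, "
          f"support vectors={clf.n_support_.sum()}")
```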
3. Which of the following statements about the kernel trick in SVM is true?
A. It explicitly computes higher-dimensional feature mappings
B. It avoids computing transformations by using inner products in the feature space
C. It can only be applied to linear SVMs
D. It reduces the number of support vectors required
Answer: B
Explanation: The kernel trick enables SVMs to work in high-dimensional spaces without explicitly computing the transformed features. It uses kernel functions to calculate inner products in that space, making non-linear separation computationally efficient.
What is kernel trick?
The kernel trick is a technique that lets an SVM operate in a high-dimensional (even infinite-dimensional) feature space by computing only kernel values K(x, z) = φ(x)·φ(z) between pairs of points, so the mapping φ never has to be computed explicitly. In simple words: the kernel trick replaces expensive explicit feature transformations with cheap inner-product computations in the original input space.
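A concrete, self-contained illustration (pure NumPy, toy vectors chosen for the example): for the homogeneous degree-2 polynomial kernel K(x, z) = (x·z)² in two dimensions, the explicit feature map is φ(x) = (x₁², √2·x₁x₂, x₂²), and the kernel gives the same inner product without ever building φ.

```python
# Sketch: the polynomial kernel equals an inner product in an explicit feature space.
import numpy as np

def poly_kernel(x, z):
    return np.dot(x, z) ** 2                 # computed directly in input space

def phi(x):                                   # explicit 3-D feature map
    return np.array([x[0] ** 2, np.sqrt(2) * x[0] * x[1], x[1] ** 2])

x = np.array([1.0, 2.0])
z = np.array([3.0, -1.0])

print(poly_kernel(x, z))                      # 1.0
print(np.dot(phi(x), phi(z)))                 # 1.0 -- same value via explicit mapping
```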
4. Which step is unique to non-linear SVMs?
A. Feature normalization
B. Slack variable introduction
C. Kernel trick application
D. Margin maximization
Answer: C
Explanation: The kernel trick allows mapping non-linearly separable data into a higher-dimensional space without explicitly computing transformations — used only in non-linear SVMs.
Why kernel trick is unique to non-linear SVMs only?
The kernel trick is used only in non-linear SVMs because a linear SVM already finds a separating hyperplane directly in the input space; non-linearly separable data, in contrast, needs an implicit mapping to a higher-dimensional space where a linear boundary exists.
In other words:
- Linear SVM → data is already separable by a straight hyperplane → no mapping needed.
- Non-linear SVM → data is not separable in the original space → kernel trick is applied to find a linear boundary in a higher-dimensional feature space.
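A minimal comparison, assuming scikit-learn and its make_circles toy dataset (parameters chosen only for illustration): a linear kernel fails on concentric-circle data, while an RBF kernel separates it easily.

```python
# Sketch (assumes scikit-learn): linear vs RBF kernel on non-linearly separable data.
from sklearn.datasets import make_circles
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = make_circles(n_samples=400, factor=0.3, noise=0.1, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

linear_svm = SVC(kernel="linear").fit(X_train, y_train)
rbf_svm = SVC(kernel="rbf").fit(X_train, y_train)

print("linear kernel accuracy:", linear_svm.score(X_test, y_test))  # near chance
print("RBF kernel accuracy:   ", rbf_svm.score(X_test, y_test))     # near perfect
```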
5. If the data is perfectly linearly separable, what is the ideal value of C?
A. Very small (close to 0)
B. Moderate (around 1)
C. Very large (→ ∞)
D. Exactly equal to margin value
Answer: C
Explanation: A large C ensures no margin violations (hard-margin SVM), which is suitable when data is perfectly separable.
What are the problems with very large C value?
An SVM with a very large C value will try to classify every training point correctly, no matter how narrow or overfitted the margin becomes. This leads to:
- Very small margin (The model sacrifices margin width to perfectly fit all training points.)
- Overfitting risk (It fits even noisy or outlier points, harming generalization to new data.)
- Hard margin behavior (the Soft Margin SVM effectively becomes a Hard Margin SVM, demanding perfect separation.)
- Unstable model (Small changes in data may cause large shifts in the decision boundary)
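A rough sketch of the overfitting risk listed above, assuming scikit-learn and a noisy synthetic dataset (with label noise via flip_y): with a huge C, training accuracy tends to stay high while test accuracy can drop.

```python
# Sketch (assumes scikit-learn): a very large C can overfit noisy data.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = make_classification(n_samples=400, n_features=2, n_informative=2,
                           n_redundant=0, flip_y=0.15, class_sep=1.0,
                           random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

for C in (1, 1e6):
    clf = SVC(kernel="rbf", C=C).fit(X_train, y_train)
    print(f"C={C:g}: train={clf.score(X_train, y_train):.2f}, "
          f"test={clf.score(X_test, y_test):.2f}")
```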
6. Which optimization problem does SVM solve during training?
A. Minimization of loss function via gradient descent
B. Maximization of likelihood function
C. Quadratic optimization with linear constraints
D. Linear programming without constraints
Answer: C
Explanation: SVM training is a quadratic optimization problem where a convex quadratic function is minimized under linear constraints.
Why the optimization problem is "Quadratic optimization with linear constraints"?
During SVM training, the goal is to find the best separating hyperplane between two classes. That means we need to find w and b such that the classifier f(x) = w⋅x + b correctly separates the classes with the largest possible margin.
The objective is to maximize the margin 2/||w||. Maximizing this margin is equivalent to minimizing the squared L2 norm of the weight vector (i.e., ||w||²). So the optimization objective becomes a quadratic function of w (namely ||w||²/2), hence the name quadratic optimization.
To ensure correct classification, every data point (xi, yi) must satisfy yi(w·xi + b) ≥ 1. This means:
- Points of the positive class lie on one side of the hyperplane.
- Points of the negative class lie on the other side, at least one margin unit away.
These constraints are linear in terms of w and b, hence the name linear constraints.
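Putting the pieces together, the standard textbook primal formulations look like this (hard margin first, then the soft-margin variant with slack variables ξi penalized by C):

```latex
% Hard-margin primal: convex quadratic objective, linear constraints.
\begin{aligned}
\min_{\mathbf{w},\, b} \quad & \tfrac{1}{2}\,\lVert \mathbf{w} \rVert^{2} \\
\text{subject to} \quad & y_i\,(\mathbf{w}\cdot\mathbf{x}_i + b) \ge 1,
\qquad i = 1, \dots, n.
\end{aligned}

% Soft-margin primal: slack variables relax the constraints; C weights violations.
\begin{aligned}
\min_{\mathbf{w},\, b,\, \boldsymbol{\xi}} \quad & \tfrac{1}{2}\,\lVert \mathbf{w} \rVert^{2}
 + C \sum_{i=1}^{n} \xi_i \\
\text{subject to} \quad & y_i\,(\mathbf{w}\cdot\mathbf{x}_i + b) \ge 1 - \xi_i,
\qquad \xi_i \ge 0.
\end{aligned}
```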
7. What is the primary reason for using a kernel function in SVM?
A. To increase training speed
B. To handle non-linear relationships efficiently
C. To reduce the number of features
D. To minimize overfitting automatically
Answer: B
Explanation: Kernels implicitly project input data into a higher-dimensional space where linear separation becomes possible.
8. In SVM, support vectors are:
A. All training samples
B. Only samples lying on the margin boundaries
C. Samples inside the margin or misclassified
D. Both B and C
Answer: D
Explanation: Support vectors are data points that either lie on the margin boundaries or violate the margin (inside it or misclassified).
Some points lie exactly on the margin boundary; these are support vectors on the margin. Other points may lie inside the margin or even be misclassified (in soft-margin SVM); these are support vectors violating the margin.
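A hedged sketch of how to tell the two kinds apart, assuming scikit-learn: SVC stores αi·yi for each support vector in dual_coef_, and (from the KKT conditions) |αi| < C corresponds to a point exactly on the margin, while |αi| = C corresponds to a point inside the margin or misclassified.

```python
# Sketch (assumes scikit-learn): margin vs margin-violating support vectors.
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.svm import SVC

X, y = make_blobs(n_samples=200, centers=2, cluster_std=2.0, random_state=1)
C = 1.0
clf = SVC(kernel="linear", C=C).fit(X, y)

alphas = np.abs(clf.dual_coef_[0])              # one row for a binary problem
on_margin = np.sum(alphas < C - 1e-8)           # |alpha| < C  -> on the margin
violating = np.sum(np.isclose(alphas, C))       # |alpha| == C -> inside / misclassified
print("support vectors on the margin:       ", on_margin)
print("support vectors violating the margin:", violating)
```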
9. When the gamma (γ) parameter of an RBF kernel is too high, what typically happens?
A. The decision boundary becomes smoother
B. Model generalizes better
C. Model overfits by focusing on nearby points
D. Model underfits with large bias
Answer: C
Explanation: High γ makes each point’s influence very localized, leading to an overly complex boundary and overfitting.
What is gamma (γ) parameter?
γ (gamma) is a hyperparameter that controls how quickly similarity decreases with distance.
- Small γ → large influence radius (each data point affects a wide region, leading to smoother, more general decision boundaries).
- Large γ → small influence radius (each data point affects only nearby points, leading to tighter, more complex decision boundaries).
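A small sketch of this trade-off, assuming scikit-learn and the make_moons toy dataset (gamma values chosen only for illustration): with a very large γ, training accuracy tends to be near perfect while test accuracy drops.

```python
# Sketch (assumes scikit-learn): gamma controls the RBF kernel's locality.
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = make_moons(n_samples=400, noise=0.3, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

for gamma in (0.01, 1, 100):
    clf = SVC(kernel="rbf", gamma=gamma).fit(X_train, y_train)
    print(f"gamma={gamma:>6}: train={clf.score(X_train, y_train):.2f}, "
          f"test={clf.score(X_test, y_test):.2f}")
```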
10. Which of the following metrics is most relevant for evaluating SVM on imbalanced datasets?
A. Accuracy
B. Precision and Recall
C. Log-loss
D. Margin width
Answer: B
Explanation: On imbalanced data, accuracy can be misleading. Precision and recall (or F1-score) better reflect SVM’s real performance.
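A minimal sketch of this point, assuming scikit-learn and a synthetic 95/5 imbalanced dataset: overall accuracy looks fine either way, while the per-class precision and recall reveal how well the minority class is actually handled. class_weight="balanced" is one common remedy shown here.

```python
# Sketch (assumes scikit-learn): precision/recall on an imbalanced dataset.
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score, classification_report
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = make_classification(n_samples=1000, weights=[0.95, 0.05],
                           n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y,
                                                    random_state=0)

clf = SVC(kernel="rbf", class_weight="balanced").fit(X_train, y_train)
y_pred = clf.predict(X_test)

print("accuracy:", accuracy_score(y_test, y_pred))
print(classification_report(y_test, y_pred))   # per-class precision and recall
```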