Choosing the Right Machine Learning Algorithm – Real-World MCQs

Selecting the right machine learning algorithm is a critical step in solving real-world data science problems. The choice depends on factors such as the data type, the problem objective, whether the data is labeled or unlabeled, and the nature of the output (continuous value, class label, group, or ranking).

In this quiz, you will explore scenario-based MCQs using real-life datasets from domains such as real estate, e-commerce, banking, healthcare, recommendation systems, and time-series forecasting. These questions are commonly asked in university exams, ML interviews, and competitive tests.

Topics covered include:

  • Regression vs Classification problems
  • Supervised vs Unsupervised learning
  • Clustering and Customer Segmentation
  • Recommendation Systems
  • Time-Series Forecasting
  • Dimensionality Reduction

Each question lists its difficulty level and data type, followed by a short explanation of why a particular ML algorithm is the best choice.

House Price Prediction (Bangalore Real Estate)
1. A real-estate company wants to predict house prices in Bangalore using features such as area (sq.ft), number of bedrooms, location, and age of the building. The target value is continuous.






Correct Answer: B
Difficulty: Easy
Data Type: Labeled, Continuous Target

Linear Regression is ideal for predicting continuous numerical values.

Why not others? Logistic Regression is for classification, K-Means is unsupervised, and Apriori is for association rules.
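
To make the idea of a continuous target concrete, here is a minimal sketch of ordinary least squares with a single feature (area). The data, the function name `fit_line`, and the price unit are made up for illustration; a real model would use all the features from the question.

```python
# Hypothetical toy data: area in sq.ft and price in lakh INR (invented for illustration).
areas = [600.0, 800.0, 1000.0, 1200.0, 1500.0]
prices = [30.0, 40.0, 50.0, 60.0, 75.0]

def fit_line(xs, ys):
    """Ordinary least squares for one feature: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept

slope, intercept = fit_line(areas, prices)
# The prediction is a continuous number, not a class label.
predicted = slope * 1100.0 + intercept
print(round(predicted, 2))
```

The output is a price on a continuous scale, which is what separates this task from the classification and clustering options.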

Email Spam Detection (Gmail-like System)
2. An email service like Gmail wants to classify emails as Spam or Not Spam using word frequencies and sender information.






Correct Answer: D
Difficulty: Easy
Data Type: Labeled, Text Data

Naive Bayes works well for probabilistic text classification problems.

Why not others? K-Means is unsupervised and PCA is for dimensionality reduction.
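
The probabilistic reasoning behind Naive Bayes can be sketched in a few lines. This is a simplified multinomial variant with Laplace smoothing on four invented emails; the names `train_nb` and `predict_nb` are hypothetical, and sender information is left out to keep the sketch short.

```python
import math
from collections import Counter

# Hypothetical labeled training emails (tokenized by whitespace).
train = [
    ("win free prize now", "spam"),
    ("free money win", "spam"),
    ("meeting schedule tomorrow", "ham"),
    ("project meeting notes", "ham"),
]

def train_nb(examples):
    """Collect class counts, per-class word counts, and the vocabulary."""
    class_counts = Counter()
    word_counts = {}
    vocab = set()
    for text, label in examples:
        class_counts[label] += 1
        counts = word_counts.setdefault(label, Counter())
        for word in text.split():
            counts[word] += 1
            vocab.add(word)
    return class_counts, word_counts, vocab

def predict_nb(text, model):
    """Return the class with the highest log-posterior (Laplace smoothing)."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, doc_count in class_counts.items():
        score = math.log(doc_count / total_docs)  # log prior
        total_words = sum(word_counts[label].values())
        for word in text.split():
            if word not in vocab:
                continue  # ignore words never seen in training
            likelihood = (word_counts[label][word] + 1) / (total_words + len(vocab))
            score += math.log(likelihood)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

model = train_nb(train)
print(predict_nb("free prize", model))
print(predict_nb("meeting notes", model))
```

Each word contributes independently to the score, which is the "naive" assumption that makes the method fast on word-frequency features.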

Customer Segmentation for Amazon
3. Amazon wants to group customers based on purchase history, spending behavior, and browsing activity for marketing purposes.






Correct Answer: B
Difficulty: Medium
Data Type: Unlabeled, Numerical Features

K-Means clusters similar customers without requiring labeled data.

Why not others? Classification algorithms require predefined labels.
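
As a rough illustration of clustering without labels, the sketch below runs a bare-bones K-Means on invented customer data (orders per month, average spend). The starting centroids are chosen by hand to keep the result deterministic; real implementations use random or k-means++ initialization.

```python
def kmeans(points, centroids, iterations=10):
    """Alternate between assigning points to the nearest centroid and recomputing centroids."""
    clusters = []
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            distances = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids]
            clusters[distances.index(min(distances))].append(p)
        # New centroid = mean of its cluster; keep the old one if the cluster is empty.
        centroids = [
            tuple(sum(dim) / len(cluster) for dim in zip(*cluster)) if cluster else c
            for cluster, c in zip(clusters, centroids)
        ]
    return centroids, clusters

# Hypothetical customers: (orders per month, average spend per order).
customers = [(1, 20), (2, 25), (1, 30), (10, 200), (12, 220), (11, 210)]
centroids, clusters = kmeans(customers, centroids=[customers[0], customers[3]])
print(centroids)
print(clusters)
```

No target column appears anywhere in the code: the groups emerge from distances between feature vectors alone.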

Credit Card Fraud Detection
4. A bank wants to detect fraudulent credit-card transactions where fraud cases are rare compared to normal transactions.






Correct Answer: B
Difficulty: Interview-level
Data Type: Labeled, Imbalanced Dataset

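Random Forest captures non-linear patterns and, when combined with class weighting or resampling, copes well with imbalanced data.

Why not others? Linear Regression predicts unbounded continuous values, so it is not suited to separating fraud from normal transactions.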
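
Because fraud cases are rare, a model that never flags fraud can still look accurate. The sketch below uses invented labels to show that effect, together with the common "balanced" class-weight heuristic, n_samples / (n_classes * class_count). The function names are hypothetical, and no actual Random Forest is trained here.

```python
from collections import Counter

# Hypothetical labels: 1 = fraud, 0 = normal (990 normal, 10 fraud).
labels = [0] * 990 + [1] * 10

def balanced_class_weights(y):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(y)
    n_samples, n_classes = len(y), len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}

def accuracy_and_recall(y_true, y_pred, positive=1):
    """Overall accuracy, and recall on the positive (fraud) class."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    true_pos = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    actual_pos = sum(t == positive for t in y_true)
    return correct / len(y_true), true_pos / actual_pos

weights = balanced_class_weights(labels)
always_normal = [0] * len(labels)  # baseline that never flags fraud
accuracy, recall = accuracy_and_recall(labels, always_normal)
print(weights)
print(accuracy, recall)
```

The baseline scores high on accuracy while catching no fraud at all, which is why recall, precision, or similar metrics are usually preferred for this kind of problem.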

Movie Recommendation System (Netflix-Style)
5. Netflix wants to recommend movies based on users’ viewing history and ratings from similar users.






Correct Answer: B
Difficulty: Medium
Data Type: Labeled User–Item Interactions

Collaborative Filtering leverages similarities among users or items.

Why not others? Regression models do not capture preference similarity.
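
A minimal sketch of user-based collaborative filtering, assuming a tiny invented ratings dictionary: unseen movies are scored by the ratings of other users, weighted by how similar those users are. The names `cosine_similarity` and `recommend` are hypothetical, and production systems use far more refined similarity and normalization schemes.

```python
import math

# Hypothetical ratings (1-5); a missing key means the user has not rated that movie.
ratings = {
    "alice": {"Inception": 5, "Interstellar": 4, "Titanic": 1},
    "bob": {"Inception": 5, "Interstellar": 5, "Titanic": 1, "Tenet": 5},
    "carol": {"Inception": 1, "Titanic": 5, "Notebook": 5},
}

def cosine_similarity(a, b):
    """Cosine similarity over the movies both users have rated."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[m] * b[m] for m in shared)
    norm_a = math.sqrt(sum(a[m] ** 2 for m in shared))
    norm_b = math.sqrt(sum(b[m] ** 2 for m in shared))
    return dot / (norm_a * norm_b)

def recommend(user, all_ratings):
    """Rank movies the user has not seen by similarity-weighted ratings of other users."""
    scores = {}
    for other, other_ratings in all_ratings.items():
        if other == user:
            continue
        sim = cosine_similarity(all_ratings[user], other_ratings)
        for movie, rating in other_ratings.items():
            if movie not in all_ratings[user]:
                scores[movie] = scores.get(movie, 0.0) + sim * rating
    return sorted(scores, key=scores.get, reverse=True)

recommendations = recommend("alice", ratings)
print(recommendations)
```

Alice's tastes match Bob's much more closely than Carol's, so Bob's favorite ranks first.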

Predicting Customer Churn (Telecom Dataset)
6. A telecom company wants to predict whether a customer will churn based on usage patterns and complaint history.






Correct Answer: A
Difficulty: Easy
Data Type: Labeled, Binary Target

Logistic Regression is designed for binary classification problems.

Why not others? PCA reduces features but does not classify.
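
The sketch below fits a one-feature logistic regression with plain gradient descent, to show how the model turns a numeric input into a churn probability. The data, learning rate, and epoch count are arbitrary choices for illustration, and only complaint counts are used.

```python
import math

# Hypothetical data: complaints per customer and whether they churned (1) or stayed (0).
complaints = [0, 1, 1, 2, 4, 5, 6, 7]
churned = [0, 0, 0, 0, 1, 1, 1, 1]

def sigmoid(z):
    """Map any real number to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(xs, ys, learning_rate=0.1, epochs=2000):
    """Batch gradient descent on the log-loss for a single feature."""
    weight, bias = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        errors = [sigmoid(weight * x + bias) - y for x, y in zip(xs, ys)]
        weight -= learning_rate * sum(e * x for e, x in zip(errors, xs)) / n
        bias -= learning_rate * sum(errors) / n
    return weight, bias

weight, bias = fit_logistic(complaints, churned)

def churn_probability(n_complaints):
    """Predicted probability that a customer with this many complaints churns."""
    return sigmoid(weight * n_complaints + bias)

print(round(churn_probability(0), 3), round(churn_probability(7), 3))
```

Thresholding the probability at 0.5 gives the binary churn / no-churn decision.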

Handwritten Digit Recognition (MNIST Dataset)
7. A system must recognize handwritten digits (0–9) from the MNIST image dataset.






Correct Answer: C
Difficulty: Medium
Data Type: Labeled Image Data

CNNs learn spatial features crucial for image recognition.

Why not others? Traditional ML models that treat each pixel as an independent feature do not exploit the spatial structure of an image directly.
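
The spatial feature extraction at the heart of a CNN is the convolution step. The sketch below applies a hand-picked vertical-edge kernel to a tiny invented image; in a real CNN the kernel weights are learned from data, and this shows only a single filter, not a trained network. As in most deep-learning libraries, the operation is implemented as cross-correlation.

```python
def conv2d(image, kernel):
    """Valid cross-correlation of a 2D image with a 2D kernel (no padding, stride 1)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(
                image[i + di][j + dj] * kernel[di][dj]
                for di in range(kh)
                for dj in range(kw)
            )
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]

# Hypothetical 4x4 "image": dark left half, bright right half.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
# Hand-picked vertical-edge kernel; a CNN would learn such weights during training.
kernel = [
    [-1, 1],
    [-1, 1],
]
feature_map = conv2d(image, kernel)
print(feature_map)
```

The feature map is non-zero only where the dark-to-bright edge sits, so the filter responds to a local pattern wherever it appears in the image.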

Product Demand Forecasting (Walmart Sales Data)
8. Walmart wants to forecast next month’s product sales using historical daily sales data.






Correct Answer: B
Difficulty: Medium
Data Type: Time-Dependent Numerical Data

ARIMA models temporal dependencies in sequential data.

Why not others? K-Means ignores time ordering.
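
A full ARIMA implementation is beyond a short example, so the sketch below shows only two of its ingredients on invented sales figures: differencing the series (the "I" part) and fitting a first-order autoregression to the differences by least squares (the "AR" part). There is no moving-average term or intercept, and the function names are hypothetical.

```python
# Hypothetical sales history (units sold per period).
sales = [100.0, 108.0, 112.0, 114.0, 115.0, 115.5]

def difference(series):
    """First differences: change from one period to the next."""
    return [b - a for a, b in zip(series, series[1:])]

def fit_ar1(values):
    """Least-squares estimate of phi in values[t] = phi * values[t-1] (no intercept)."""
    numerator = sum(cur * prev for prev, cur in zip(values, values[1:]))
    denominator = sum(prev ** 2 for prev in values[:-1])
    return numerator / denominator

changes = difference(sales)
phi = fit_ar1(changes)
# Forecast the next change from the last one, then add it back to the last level.
next_change = phi * changes[-1]
forecast = sales[-1] + next_change
print(phi, forecast)
```

The forecast depends on the order of the observations: shuffling `sales` would change `phi`, which is exactly the time dependence that clustering or plain classification would ignore.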

Identifying Frequent Product Bundles (Market Basket Analysis)
9. A supermarket wants to identify products that are frequently purchased together.






Correct Answer: B
Difficulty: Easy
Data Type: Transactional Data

Apriori discovers association rules from transaction records.

Why not others? Classification models do not find item associations.
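
The core Apriori idea is that a pair can only be frequent if each of its items is frequent. The sketch below applies that pruning step to invented shopping baskets and stops at pairs; a full Apriori run would continue to larger itemsets and derive rules with confidence scores. The function name and the `min_count` threshold are illustrative choices.

```python
from collections import Counter
from itertools import combinations

# Hypothetical shopping baskets.
transactions = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"bread", "milk"},
    {"butter", "milk", "eggs"},
    {"bread", "butter", "eggs"},
]

def frequent_pairs(baskets, min_count):
    """Count item pairs, considering only items that are frequent on their own."""
    item_counts = Counter(item for basket in baskets for item in basket)
    frequent_items = {item for item, count in item_counts.items() if count >= min_count}
    pair_counts = Counter()
    for basket in baskets:
        # Apriori pruning: skip any pair containing an infrequent item.
        for pair in combinations(sorted(basket & frequent_items), 2):
            pair_counts[pair] += 1
    return {pair: count for pair, count in pair_counts.items() if count >= min_count}

pairs = frequent_pairs(transactions, min_count=3)
print(pairs)
```

With a threshold of three baskets out of five, only bread and butter qualify as a frequently co-purchased pair.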

Reducing Features in a High-Dimensional Dataset
10. A dataset contains 1,000 features, and the goal is to reduce dimensionality before training a model.






Correct Answer: B
Difficulty: Easy
Data Type: High-Dimensional Numerical Data

PCA projects the data onto a smaller set of components that retain as much of the original variance as possible.

Why not others? K-Means clusters data but does not reduce dimensions.
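
To illustrate the projection step, the sketch below finds the first principal component of a tiny invented dataset using power iteration on the covariance matrix, then reduces two features to one. Power iteration is a stand-in for the eigendecomposition or SVD that libraries normally use; it assumes the covariance matrix is non-zero and the starting vector is not orthogonal to the top component.

```python
import math

def first_principal_component(rows, iterations=100):
    """Return the top principal direction and the data projected onto it."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix of the centered data.
    cov = [
        [sum(c[a] * c[b] for c in centered) / (n - 1) for b in range(d)]
        for a in range(d)
    ]
    # Power iteration: repeatedly multiply by cov and renormalize.
    vec = [1.0] * d
    for _ in range(iterations):
        vec = [sum(cov[a][b] * vec[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(v * v for v in vec))
        vec = [v / norm for v in vec]
    projected = [sum(c[j] * vec[j] for j in range(d)) for c in centered]
    return vec, projected

# Hypothetical data: two perfectly correlated features (second = 2 * first).
data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]
component, reduced = first_principal_component(data)
print(component)
print(reduced)
```

Each row is now described by a single number instead of two, and because the features were perfectly correlated, no variance is lost in this toy case.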