
# Data warehousing and data mining: solved multiple choice questions (MCQ) with answers and explanations

## Data Warehousing and Data Mining - MCQ Questions and Answers SET 02

1. In non-parametric models

a) There are no parameters

b) The parameters are fixed in advance

c) A type of probability distribution is assumed, then its parameters are inferred

d) The parameters are flexible

Answer: (d) The parameters are flexible

Non-parametric models differ from parametric models in that the model structure is not specified a priori but is instead determined from data. The term non-parametric is not meant to imply that such models completely lack parameters but that the number and nature of the parameters are flexible and not fixed in advance.

In non-parametric models, neither a fixed set of parameters nor a particular probability distribution is assumed. Their parameters are flexible.
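The contrast can be sketched with a small example. This is only an illustration under assumed data: the sample values, the function names, and the bandwidth of 0.5 are hypothetical and not part of the quiz. The parametric fit assumes a normal distribution and infers its two parameters, while the kernel density estimate keeps every sample, so its effective number of parameters grows with the data.

```python
import math
import statistics


def gaussian_pdf(x, mean, std):
    """Density of a normal distribution with the given mean and std."""
    return math.exp(-0.5 * ((x - mean) / std) ** 2) / (std * math.sqrt(2 * math.pi))


def fit_gaussian(samples):
    """Parametric: a normal distribution is assumed, then its two
    parameters (mean, standard deviation) are inferred from the data."""
    return statistics.mean(samples), statistics.pstdev(samples)


def kde_pdf(x, samples, bandwidth=0.5):
    """Non-parametric: one Gaussian kernel per stored sample, so the
    model's structure is determined by the data itself."""
    return sum(gaussian_pdf(x, s, bandwidth) for s in samples) / len(samples)


# Illustrative bimodal data (an assumption for this sketch).
samples = [1.0, 1.2, 0.8, 5.0, 5.1, 4.9]
mean, std = fit_gaussian(samples)

# The single Gaussian places its peak between the two groups,
# while the kernel estimate follows the two groups in the data.
print(gaussian_pdf(3.0, mean, std))
print(kde_pdf(3.0, samples))
print(kde_pdf(1.0, samples))
```

Adding more samples leaves the parametric model with two numbers, whereas the kernel estimate gains one more term per sample.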

2. The goal of clustering analysis is to:

a) Maximize the inter-cluster similarity

b) Maximize the intra-cluster similarity

c) Maximize the number of clusters

d) Minimize the intra-cluster similarity

Answer: (b) Maximize the intra-cluster similarity

One of the goals of a clustering algorithm is to maximize the intra-cluster similarity.

A clustering algorithm is considered good if it produces clusters with small intra-cluster distance (high intra-cluster similarity) and large inter-cluster distance (low inter-cluster similarity).

Cluster analysis is a technique for grouping similar observations into clusters based on the values of multiple variables recorded for each observation. It is a form of unsupervised classification.

Inter-cluster distance – the distance between two objects from two different clusters.

Intra-cluster distance – the distance between two objects from the same cluster.
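The two distances can be computed directly from their definitions. The sketch below is a minimal illustration: the two clusters, the use of Euclidean distance, and the averaging over all pairs are assumptions made for the example, not something the quiz prescribes.

```python
import math
from itertools import combinations, product


def mean_intra_distance(cluster):
    """Average distance between pairs of objects from the same cluster."""
    pairs = list(combinations(cluster, 2))
    return sum(math.dist(p, q) for p, q in pairs) / len(pairs)


def mean_inter_distance(cluster_a, cluster_b):
    """Average distance between objects taken from two different clusters."""
    pairs = list(product(cluster_a, cluster_b))
    return sum(math.dist(p, q) for p, q in pairs) / len(pairs)


# Hypothetical 2-D points forming two well-separated groups.
cluster_a = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0)]
cluster_b = [(10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]

print(mean_intra_distance(cluster_a))
print(mean_intra_distance(cluster_b))
print(mean_inter_distance(cluster_a, cluster_b))
```

For a good clustering, the first two values should be small relative to the third.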

3. In decision tree algorithms, attribute selection measures are used to

a) Reduce the dimensionality

b) Select the splitting criteria which best separate the data

c) Reduce the error rate

d) Rank attributes

Answer: (b) Select the splitting criteria which best separate the data

Attribute selection measures in decision tree algorithms are mainly used to select the splitting criterion that best separates the given data partition.

During the induction phase of the decision tree, the attribute selection measure is used to choose the attribute that best separates the remaining samples in the node's partition into individual classes.

The data set is partitioned into subsets according to a splitting criterion. This procedure is repeated recursively for each subset until each subset contains only members belonging to the same class or is sufficiently small.
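Information gain is one commonly used attribute selection measure; the quiz does not name a specific one, so treat the choice here as an assumption. The sketch below scores each candidate attribute on a tiny, made-up data set (the attribute names `outlook`, `windy`, and `play` are hypothetical) and picks the attribute whose split best separates the classes.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(rows, attribute, target):
    """Entropy of the partition minus the weighted entropy after
    splitting the rows on `attribute`."""
    base = entropy([row[target] for row in rows])
    partitions = {}
    for row in rows:
        partitions.setdefault(row[attribute], []).append(row[target])
    remainder = sum(len(part) / len(rows) * entropy(part) for part in partitions.values())
    return base - remainder


# Made-up training partition: `outlook` separates the classes, `windy` does not.
rows = [
    {"outlook": "sunny", "windy": "no", "play": "no"},
    {"outlook": "sunny", "windy": "yes", "play": "no"},
    {"outlook": "rainy", "windy": "no", "play": "yes"},
    {"outlook": "rainy", "windy": "yes", "play": "yes"},
]

# The splitting attribute is the one with the highest score.
best = max(["outlook", "windy"], key=lambda a: information_gain(rows, a, "play"))
print(best)
```

A full tree builder would apply the same selection recursively to each resulting subset until the subsets are pure or sufficiently small.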

4. Pruning a decision tree always

a) Increases the error rate

b) Reduces the size of the tree

c) Provides the partitions with lower entropy

d) Reduces classification accuracy

Answer: (b) Reduces the size of the tree

Pruning simplifies and compresses a decision tree by removing sections that are non-critical or redundant for classifying instances. This reduces the size of the decision tree, often significantly.
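As a rough illustration, the sketch below applies reduced-error pruning, one of several pruning strategies and chosen here only as an assumption. The nested-dictionary tree layout, the `majority` field, and the data rows are all hypothetical. A subtree is replaced by a leaf whenever doing so does not increase the number of errors on the given rows.

```python
def count_nodes(tree):
    """Count internal nodes and leaves."""
    if "label" in tree:
        return 1
    return 1 + sum(count_nodes(child) for child in tree["children"].values())


def predict(tree, row):
    """Follow the branches matching the row until a leaf is reached."""
    while "label" not in tree:
        tree = tree["children"][row[tree["attr"]]]
    return tree["label"]


def errors(tree, rows, target="play"):
    """Number of rows the tree misclassifies."""
    return sum(predict(tree, row) != row[target] for row in rows)


def prune(tree, rows):
    """Bottom-up reduced-error pruning: collapse a subtree into a leaf
    with its majority label if that does not increase the error count."""
    if "label" in tree:
        return tree
    children = {
        value: prune(child, [r for r in rows if r[tree["attr"]] == value])
        for value, child in tree["children"].items()
    }
    kept = {"attr": tree["attr"], "majority": tree["majority"], "children": children}
    leaf = {"label": tree["majority"]}
    return leaf if errors(leaf, rows) <= errors(kept, rows) else kept


# Hypothetical tree: the split on `windy` is redundant (both leaves say "yes").
tree = {
    "attr": "outlook", "majority": "yes",
    "children": {
        "sunny": {"label": "no"},
        "rainy": {
            "attr": "windy", "majority": "yes",
            "children": {"no": {"label": "yes"}, "yes": {"label": "yes"}},
        },
    },
}
rows = [
    {"outlook": "sunny", "windy": "no", "play": "no"},
    {"outlook": "sunny", "windy": "yes", "play": "no"},
    {"outlook": "rainy", "windy": "no", "play": "yes"},
    {"outlook": "rainy", "windy": "yes", "play": "yes"},
]

pruned = prune(tree, rows)
print(count_nodes(tree), count_nodes(pruned))  # prints "5 3"
```

In this example the redundant `windy` subtree collapses into a single leaf, so the tree shrinks from five nodes to three without misclassifying any of the rows.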

5. Which of the following classifiers fall in the category of lazy learners?

a) Decision trees

b) Bayesian classifiers

c) k-NN classifiers

d) Rule-based classifiers

Answer: (c) k-NN classifiers

The k-nearest neighbor (k-NN) classifier is a lazy learner because it doesn’t learn a discriminative function from the training data but “memorizes” the training dataset instead.

Lazy learning (e.g., instance-based learning) simply stores the training data, with little or no processing, and waits until it is given a test tuple. Only then does it classify the tuple, based on the stored training data most similar to it.

Lazy learning is also referred to as “just-in-time learning”.

The other category of classifiers is “eager learners”.
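The lazy behaviour of k-NN can be seen in a minimal sketch. The class name `LazyKNN`, the choice of Euclidean distance, majority voting, and the sample points are assumptions for illustration. Note that `fit` only stores the data, and all computation is deferred to `predict`.

```python
import math
from collections import Counter


class LazyKNN:
    """Minimal k-NN classifier: training memorizes, prediction does the work."""

    def __init__(self, k=3):
        self.k = k
        self.points = []
        self.labels = []

    def fit(self, points, labels):
        # "Training" just stores the tuples; no model is built here.
        self.points = list(points)
        self.labels = list(labels)
        return self

    def predict(self, query):
        # At query time, rank the stored tuples by distance to the query
        # and take a majority vote among the k nearest ones.
        ranked = sorted(
            zip(self.points, self.labels),
            key=lambda pair: math.dist(pair[0], query),
        )
        votes = Counter(label for _, label in ranked[: self.k])
        return votes.most_common(1)[0][0]


# Hypothetical training tuples in two groups.
model = LazyKNN(k=3).fit(
    [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)],
    ["a", "a", "a", "b", "b", "b"],
)
print(model.predict((0.5, 0.5)))
print(model.predict((5.5, 5.5)))
```

An eager learner, by contrast, would spend its effort in `fit` building a model such as a decision tree, and could then discard the training data.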
