
Friday, February 11, 2022

Machine learning MCQ - Hierarchical agglomerative clustering - single linkage and complete linkage

Multiple choice questions in machine learning. Interview questions on machine learning, quiz questions for data scientists with answers explained, machine learning exam questions, question bank in machine learning, hierarchical agglomerative clustering, how to calculate the pairwise distance using single linkage, complete linkage agglomerative clustering.


1. After three iterations of hierarchical agglomerative clustering using Euclidean distance between points, we get three clusters: C1 = {2, 4}, C2 = {7, 8} and C3 = {12, 14}. What is the distance between clusters C1 and C2 using single linkage and complete linkage?

a) 2, 2

b) 3, 4

c) 3, 6

d) 5, 6

Answer: (c) 3, 6

 

Single linkage

In single linkage, we define the distance between two clusters as the minimum distance between any single data point in the first cluster and any single data point in the second cluster. On the basis of this definition of distance between clusters, at each stage of the process we combine the two clusters with the smallest single linkage distance.

As per single linkage, d(C1, C2) = d({2, 4}, {7, 8}) = min(|2-7|, |2-8|, |4-7|, |4-8|)
                                                     = min(5, 6, 3, 4) = 3

Single linkage hierarchical clustering

In single-link (or single linkage) hierarchical clustering, we merge in each step the two clusters whose two closest members have the smallest distance (or: the two clusters with the smallest minimum pairwise distance).
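
To make the computation concrete, here is a minimal Python sketch (the function name single_linkage is our own, not a library call) that evaluates the minimum pairwise distance for the clusters in the question:

def single_linkage(c1, c2):
    """Minimum pairwise distance between any point of c1 and any point of c2."""
    return min(abs(a - b) for a in c1 for b in c2)

C1, C2 = [2, 4], [7, 8]
print(single_linkage(C1, C2))   # prints 3, matching min(5, 6, 3, 4)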

 

Complete linkage

In complete linkage, we define the distance between two clusters to be the maximum distance between any single data point in the first cluster and any single data point in the second cluster. On the basis of this definition of distance between clusters, at each stage of the process we combine the two clusters that have the smallest complete linkage distance.

As per complete linkage, d(C1, C2) = d({2, 4}, {7, 8}) = max(|2-7|, |2-8|, |4-7|, |4-8|)
                                                       = max(5, 6, 3, 4) = 6

Complete linkage hierarchical clustering

In complete-link (or complete linkage) hierarchical clustering, we merge in each step the two clusters whose merger has the smallest diameter (or: the two clusters with the smallest maximum pairwise distance).
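
The single-linkage sketch above adapts directly to complete linkage by replacing min with max (again, complete_linkage is an illustrative name):

def complete_linkage(c1, c2):
    """Maximum pairwise distance between any point of c1 and any point of c2."""
    return max(abs(a - b) for a in c1 for b in c2)

C1, C2 = [2, 4], [7, 8]
print(complete_linkage(C1, C2))   # prints 6, matching max(5, 6, 3, 4)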

 

 

 


************************

Related links:

Hierarchical agglomerative clustering

How is single link distance calculated?

How is complete linkage distance calculated?

What is single linkage method?

What is complete linkage method?

Machine learning solved MCQ

 

 

Friday, February 4, 2022

Machine learning MCQ - Calculate the entropy of a decision tree given the dataset

Multiple choice questions in machine learning. Interview questions on machine learning, quiz questions for data scientists with answers explained, machine learning exam questions, question bank in machine learning, what is entropy, why entropy is important in decision trees, how to calculate entropy for a dataset.


1. For the dataset given in Table 1 below, used to learn a decision tree, find the approximate entropy H(Passed). The decision tree predicts whether students passed (T) or not (F), based on their past CGPA scores (H for high, M for medium, and L for low) and whether or not they prepared (T or F).

Table 1

CGPA    Prepared    Passed
H       F           T
H       T           T
M       F           F
M       T           T
L       F           F
L       T           T

 

a) 0.92

b) 0.66

c) 1.92

d) 1.32

Answer: (a) 0.92

 

How to calculate entropy?

The formula for calculating the entropy is as follows:

H = -Σi pi log2(pi)

Here, pi is the probability of getting class i when one observation is selected at random from the dataset. In our example, we need to find the entropy of 'Passed', which has only two classes, F and T.

 

H(Passed) = -(p(Passed = F) log2 p(Passed = F) + p(Passed = T) log2 p(Passed = T))

          = -(2/6 log2(2/6) + 4/6 log2(4/6))

          = -(1/3 log2(1/3) + 2/3 log2(2/3))

          = -(-0.52832 - 0.38997)

          ≈ 0.92
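
A short Python sketch (the helper entropy is our own, not a library call) reproduces this number from the 'Passed' column of Table 1:

from math import log2

def entropy(labels):
    """H = -sum(p_i * log2(p_i)) over the distinct classes in labels."""
    n = len(labels)
    return -sum(labels.count(c) / n * log2(labels.count(c) / n)
                for c in set(labels))

passed = ['T', 'T', 'F', 'T', 'F', 'T']   # Passed column of Table 1
print(round(entropy(passed), 2))          # prints 0.92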

 

What is entropy?

Entropy is an information theory metric that measures the impurity or uncertainty in a group of observations. 

 

How does entropy help in building a decision tree?

Entropy guides how a decision tree chooses its splits: at each node, the tree picks the split that reduces the impurity of the target column as much as possible, so that the leaf (end-outcome) nodes are as pure as possible. In other words, the objective is to decrease the impurity (i.e., the uncertainty or surprise) of the target variable, or equivalently to increase its homogeneity, at every split, as the sketch below illustrates.
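
For the dataset in Table 1, the following sketch (info_gain is an illustrative helper, not a standard API) computes the information gain of each attribute; a decision tree learner would choose the attribute with the larger gain for the first split:

from math import log2

def entropy(labels):
    n = len(labels)
    return -sum(labels.count(c) / n * log2(labels.count(c) / n)
                for c in set(labels))

# Table 1 rows as (CGPA, Prepared, Passed)
rows = [('H', 'F', 'T'), ('H', 'T', 'T'), ('M', 'F', 'F'),
        ('M', 'T', 'T'), ('L', 'F', 'F'), ('L', 'T', 'T')]

def info_gain(rows, attr):
    """H(Passed) minus the weighted entropy of Passed after splitting on attr."""
    passed = [r[2] for r in rows]
    gain = entropy(passed)
    for value in set(r[attr] for r in rows):
        subset = [r[2] for r in rows if r[attr] == value]
        gain -= len(subset) / len(rows) * entropy(subset)
    return gain

print(round(info_gain(rows, 0), 3))   # CGPA:     0.252
print(round(info_gain(rows, 1), 3))   # Prepared: 0.459 -> chosen first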

 

 


************************

Related links:

What is entropy?

How does entropy help in decision trees?

How to calculate entropy for a given dataset?

Machine learning solved MCQ
