# Multiple choice questions in machine learning. Interview questions on machine learning, quiz questions for data scientists with answers explained, machine learning exam questions, question bank in machine learning. What is stratification in cross validation? Why do we stratify in cross validation? What do we use stratification for?

## 1. Which of the following cross validation strategies cannot be stratified?

a) k-fold cross validation

b) hold out cross validation

c) leave one out cross validation

d) shuffle split cross validation

Answer: (c) leave one out cross validation (LOOCV)

Leave one out cross validation (LOOCV) does not permit stratification.

### What is stratification?

Stratification is the process of rearranging the data so as to ensure that each fold is a good representative of all strata (groups of data sharing a characteristic) of the data. Generally this is done in a supervised way for classification, and it aims to ensure that each class is (approximately) equally represented across the test folds (which are, of course, combined in a complementary way to form the training folds). For example, in a binary classification problem where each class comprises 50% of the data, it is best to arrange the data such that in every fold each class comprises around half the instances.
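The 50/50 example above can be sketched in a few lines of plain Python. This is a minimal illustration, not a library implementation: the function name `stratified_folds`, the toy labels, and the round-robin dealing strategy are all assumptions made for the example.

```python
from collections import Counter, defaultdict

def stratified_folds(labels, k):
    """Assign sample indices to k test folds so each fold mirrors the class mix.

    Hypothetical helper: indices of each class are dealt round-robin
    across the folds, which is one simple way to stratify.
    """
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        for position, idx in enumerate(indices):
            folds[position % k].append(idx)
    return folds

# Toy binary data set: each class makes up 50% of the instances.
labels = ["pos"] * 10 + ["neg"] * 10
folds = stratified_folds(labels, k=5)

# Count the classes inside every test fold.
fold_counts = [Counter(labels[i] for i in fold) for fold in folds]
for number, counts in enumerate(fold_counts):
    print(number, dict(counts))
```

With 10 instances per class and 5 folds, every test fold receives two instances of each class, so each fold keeps the 50/50 proportion of the full data set.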

### Why do we need stratification in cross validation?

Classification problems can exhibit a large imbalance in the distribution of the target classes: for instance, there could be several times more negative samples than positive samples. In such cases it is recommended to use stratified sampling to ensure that the relative class frequencies are approximately preserved in each train and validation fold.

Stratification also tends to reduce the variance of the estimate slightly, and thus stratified cross validation seems to be uniformly better than plain (unstratified) cross validation in terms of both bias and variance.
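For the imbalanced case, a stratified hold-out split might look like the sketch below. The helper `stratified_holdout`, the 90/10 toy data, and the 20% validation fraction are assumptions chosen for illustration; the idea is simply to sample the same fraction from every class.

```python
import random
from collections import defaultdict

def stratified_holdout(labels, test_fraction, seed=0):
    """Split indices into train/validation while keeping class frequencies.

    Hypothetical helper: the same fraction of every class is moved
    to the validation set.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train, test = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_test = round(len(indices) * test_fraction)
        test.extend(indices[:n_test])
        train.extend(indices[n_test:])
    return train, test

# Imbalanced toy data: nine times more negatives (0) than positives (1).
labels = [0] * 90 + [1] * 10
train_idx, test_idx = stratified_holdout(labels, test_fraction=0.2)

train_positive_rate = sum(labels[i] for i in train_idx) / len(train_idx)
test_positive_rate = sum(labels[i] for i in test_idx) / len(test_idx)
print(train_positive_rate, test_positive_rate)
```

Both the training part and the validation part keep the 10% positive rate of the full data set, whereas a purely random split of such a small sample could leave the validation part with very few positives, or none.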

### How does LOOCV work?

Leave-one-out cross-validation is a special case of cross-validation where the number of folds equals the number of instances in the data set. Thus, the learning algorithm is applied once for each instance, using all other instances as a training set and using the selected instance as a single-item test set.
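The procedure can be sketched as a small index generator. The name `leave_one_out` and the four-instance example are assumptions for illustration only; fitting and scoring a model on each split is left out.

```python
def leave_one_out(n):
    """Yield (train_indices, test_indices) pairs for a data set of n instances.

    Hypothetical helper: each instance is held out exactly once.
    """
    for held_out in range(n):
        train = [i for i in range(n) if i != held_out]
        yield train, [held_out]

# A data set with 4 instances produces 4 folds.
splits = list(leave_one_out(4))
for train, test in splits:
    print("train:", train, "test:", test)
```

Each of the 4 splits trains on 3 instances and tests on the single remaining one, so the number of folds equals the number of instances.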

### Why does leave one out CV not permit stratification?

LOOCV uses 1 instance of an n-instance data set as the test set and the remaining n-1 instances as the training set, and it repeats this process n times. Because each test fold holds a single instance, it contains exactly one class and cannot be arranged to represent the class proportions of the whole data set. Hence, stratification cannot be done.
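A short check makes the point concrete. The toy labels below are an assumption; the sketch only measures the class mix of every single-instance test fold on a balanced data set.

```python
from collections import Counter

# Balanced toy data set: 50% "pos", 50% "neg".
labels = ["pos"] * 5 + ["neg"] * 5

test_fold_mixes = []
for held_out in range(len(labels)):
    test_labels = [labels[held_out]]  # LOOCV test fold: one instance
    counts = Counter(test_labels)
    mix = {c: counts[c] / len(test_labels) for c in ("pos", "neg")}
    test_fold_mixes.append(mix)

for mix in test_fold_mixes:
    print(mix)
```

Every test fold is 100% one class and 0% the other, so no rearrangement of the data can make a LOOCV test fold reflect the 50/50 distribution.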

All the other given strategies can be stratified.
