Please visit, subscribe and share 10 Minutes Lectures in Computer Science
Showing posts with label Indexing. Show all posts
Showing posts with label Indexing. Show all posts

## Indexing and Tuning - solved exercises - Indexing examples - Tuning examples - How to choose an appropriate index?

Question:

Consider a relation Customer of a large super market chain with the relational schema Customer (CID, CName, City, PointsCollected). Here CID is the customer identification number that stores unique values.
The following query is the most frequent one;
SELECT * FROM Customer WHERE CID = v1;
Assume that the relation Customer does not have any indexes. We need to create necessary indexes. Which of the following indexes is best suited for executing the above query efficiently? Why?
(a) A sparse primary index on CID           (b) A dense primary index on CID

Solution:

Definition: Primary index can be sparse (only few values of indexing column of data file is stored sparsely in index table along with pointer in sorted order) or dense (all the values of indexing column of the data file is stored in index table along with pointer in sorted order).

The query given above is called point queries. Point query is searching for records that satisfies a condition of the form “attribute_name = value”.

To explain further, let us use the given query with a value as follows;

SELECT * FROM Customer WHERE CID = 123;

This query, according to the information provided in the question, will give exactly one record.

Among sparse index and dense index, the dense index is most suitable index for CID attribute. Let us discuss this with the above query.

The following will happen,

If we use sparse index – the system will look for the matching CID value in the index file. [As index file is always sorted, we can skip further scan once we reach the required record]. If we find the matching record,
2. locate the actual record (or the block where the record is stored) in the data file directly using the address.
If the matching value is not found in the index file (because it is sparse),
1. we have to find the largest CID value that is smaller than 123,
2. use the pointer of that largest CID value, and
3. locate the record (or the block where the record is stored) of largest CID in the data file directly using the address, and
4. perform linear scan until you reach the required value/record.

If we use dense index - the system will look for the matching CID value in the index file. [As index file is always sorted, we can skip further scan once we reach the required record]. If we find the matching record,
2. locate the actual record (or the block where the record is stored) in the data file directly using the address.

The difference between these two is, dense index stores all the values in index file so linear scan in index file is enough to find any record, whereas sparse index stores few records than dense index in the index file so linear scan is required both in index file and in data file (in many cases).

Hence, dense primary index is the best suited for executing the above query.

c

## A sparse primary index, A dense primary index -  Which of the indexes is best suited for executing the given SQL query efficiently? Why?

Question / Exercise:

Consider a relation Faculty with the relational schema Faculty (FID, FName, Dept, Salary). Here FID stores the values 1, 2, 3, … , n as faculty identification numbers.
The following query is the most frequent one;
SELECT * FROM Faculty WHERE FID > v1 AND FID < v2;
Assume that Faculty does not have any indexes. We need to create necessary indexes. Which of the following indexes is best suited for executing the above query efficiently?
(a) A sparse primary index on FID            (b) A dense primary index on FID

Solution:

Sparse primary index – an index with few FIDs listed sparsely in the index table. You can reach the record in the data file by choosing the record’s pointer whose value is the greatest below the searching value. Sparse index is smaller than dense index.
Dense primary index – an index with each FIDs listed densely in the index table for each record in the data file. It has one to one relationship between index table and data file. Dense index is larger than sparse index.
Primary index - The data file is sequentially ordered by an ordering key field, and the indexing field is built on the ordering key field.

Sparse primary index is best suited for handling the above query. When executing the query we search for v1 in the index table, and find an index entry with the smallest value that is greater than v1. Then we use the pointer value in that record to find the location (address) of the record in the data file. We then do a simple sequential scan in the data file from that point (address) till we reach a record with FID value greater than v2. Refer the example given below.

In simple terms, sparse index would be helpful when compared with dense index in the case of range queries.

Dense index occupies all five entries in the index table and makes index table bigger. Dense index is good for point queries.

## Indexing and Tuning - List of solved exercises and examples

### Indexing and Tuning

• Indexing and Tuning exercise - solved exercise - 1
• Indexing and Tuning exercise - solved exercise - 2
• Indexing and Tuning exercise - solved exercise - 3
• Indexing and Tuning exercise - solved exercise - 4
• Indexing and Tuning exercise - solved exercise - 5
• Indexing and Tuning exercise - solved exercise - 6
• Indexing and Tuning exercise - solved exercise - 7
• Indexing and Tuning exercise - solved exercise - 8
• Indexing and Tuning exercise - solved exercise - 9
• Indexing and Tuning exercise - solved exercise - 10

## Dense index VS Sparse index

 Dense Index Sparse Index Index size Larger Smaller Records in data file Need not be clustered Need to be Clustered Time to locate data Less More Computing time in RAM Less More Overhead for insertions and deletions More Less Data pointers pointing to Each record in the data file Fewer records in the data file

## Featured Content

### Multiple choice questions in Natural Language Processing Home

MCQ in Natural Language Processing, Quiz questions with answers in NLP, Top interview questions in NLP with answers Multiple Choice Que...

data recovery