Sunday, December 20, 2020

Query processing and optimization exercise in DBMS

DBMS solved exercises, database query processing and optimization solved exercises, how to find the query IO cost for natural join


Question:

Consider doing a natural join operation on two relations R(A, B) and S(B, C). Assume that the tuples of R are stored contiguously on 400 disk blocks,  while the tuples of S are stored contiguously on 1000 blocks. Each block holds 20 tuples (same for R as for S). There are 51 memory blocks available.

Calculate the I/O cost for iteration join (assume that R and S are not sorted), and sort-merge join (assume that R and S are sorted on attribute B) algorithms. Ignore the I/O cost of writing the final output to disk.

 

Solution:

Given,

Total disk blocks for R, B(R) = 400

Total disk block for S, B(S) = 1000

Iteration join:

We have only 51 blocks are available in main memory. The iteration join algorithm can read 50 R blocks into memory, and all blocks of S (1 block at a time) into memory at a time, to join S with R. The process is repeated until all outputs are generated.

Number of IOs to read R into memory = 400, (ie., 50 * 8)

Number of IOs to read S into memory = for each of 50 R blocks we need 1000 IOs to read S

Total number of IOs required = 400 + 1000 * 8 = 8400

Merge join:

Given,

          Both R and S are sorted on the join attribute B.

Merge join: Both sorted files are scanned concurrently in order of the join attributes, matching the records that have the same values for A and B.

Cost of merging R and S = B(R) + B(S) = 400 + 1000 = 1400

 

**************

Go to Normalization - Solved Exercises page

Go to Advanced DBMS Concepts page

Go to Solved exercises in DBMS page

Go to Multiple Choice Questions (MCQ) in DBMS home

 


No comments:

Post a Comment

Featured Content

Multiple choice questions in Natural Language Processing Home

MCQ in Natural Language Processing, Quiz questions with answers in NLP, Top interview questions in NLP with answers Multiple Choice Que...

All time most popular contents

data recovery