Wednesday, 28 March 2018

Buffer pages required for external sorting of a file in database

Buffer pages required for external sorting of a file in database


Question:
Let us suppose that you have a file with 8500 pages and 3 available buffer pages. If you use external sorting to sort the above said file, answer the following questions;
a) How many runs will you produce in the first pass?
b) To sort the entire file, how many passes do you need?
c) How much is involved in terms of I/O cost to sort the file?
d) If you like to sort the file in just two passes, then what is the number of buffer pages you need?

Solution:
a) If Nf is the number of file pages and Nb is the number of available buffer pages then the first pass requires Nf/Nb [ceiling(Nf/Nb)] sorted runs.
For our case,
Number of file pages, Nf = 8500
Number of available buffer pages, Nb = 3
The number of sorted runs in the first pass = ⌈8500/3⌉ = 2834 runs

b) The number of passes required to sort the entire file can be calculated using the equation ⌈logNb-1Nf/Nb⌉⌉+1.
Number of passes to sort complete file = ⌈log2⌈8500/3⌉⌉+1
                                                                   = ⌈log22834⌉+1 = 9 passes

c) Since each page is read and written once per pass, the total number of page I/Os for sorting the file is 2 * Nf * (#passes):
Total I/O cost for sorting the file = 2 * 8500 * 9 = 1,53,000

d) In Pass 0, Nf/Nbruns are produced. In Pass 1, we must be able to merge this many runs; i.e., Nb − 1 ≥ Nf/Nb. This implies that Nb must at least be large enough to satisfy Nb * (Nb − 1) ≥ Nf; this can be used to guess at Nb, and the guess must be validated by checking the first inequality. Thus;
with 8500 pages in the file, Nb = 93 satisfies both inequalities, but Nb = 92 does not. So we need 93 buffer pages to sort the entire file in just two passes.

[Note: to find the value of Nb, factorize the equation Nb * (Nb-1) = Nf which can be written as a quadratic equation Nb2-Nb-Nf = 0. The equation to be solved is Nb2 - Nb - 8500 = 0]

*********

 


buffer pages required for external sorting
number of passes required to complete sorting of a file using external sorting
external sorting technique in database 
number of runs required for each pass in external sorting
I/O cost involved in sorting the entire file








No comments:

Post a Comment

Find candidate key and normalize the relation into 2nf and 3nf

Find candidate key and normalize the relation into 2nf and 3nf Question: A relation R is defined as follows. R = (name, stre...