Buffer pages required for external sorting of a file in database
Question:
Let us suppose that you have a file
with 8500 pages and 3 available buffer pages. If you use external sorting to
sort the above said file, answer the following questions;
a) How many runs will you produce in
the first pass?
b) To sort the entire file, how many
passes do you need?
c) How much is involved in terms of I/O
cost to sort the file?
d) If you like to sort the file in
just two passes, then what is the number of buffer pages you need?
Solution:
a) If Nf is the number of file pages
and Nb is the number of available buffer pages then the first pass requires ⌈Nf/Nb⌉ [ceiling(Nf/Nb)]
sorted runs.
For our case,
Number of file pages, Nf =
8500
Number of available buffer pages, Nb
= 3
The number of sorted runs in the first
pass = ⌈8500/3⌉
= 2834 runs
b) The number of passes required to
sort the entire file can be calculated using the equation ⌈logNb-1⌈Nf/Nb⌉⌉+1.
Number of passes to sort complete file
= ⌈log2⌈8500/3⌉⌉+1
=
⌈log22834⌉+1 = 9 passes
c)
Since each page is read and written once per pass, the total number of page
I/Os for sorting the file is 2 * Nf * (#passes):
Total
I/O cost for sorting the file = 2 * 8500
* 9 = 1,53,000
d) In Pass 0, ⌈Nf/Nb⌉ runs
are produced. In Pass 1, we must be able to merge this many runs; i.e., Nb − 1 ≥ ⌈Nf/Nb⌉.
This implies that Nb must
at least be large enough to satisfy Nb
* (Nb − 1) ≥ Nf; this can be used to
guess at Nb, and the
guess must be validated by checking the first inequality. Thus;
with 8500 pages in the file, Nb = 93 satisfies both
inequalities, but Nb = 92
does not. So we need 93 buffer pages to sort the entire file in just two
passes.
[Note:
to find the value of Nb, factorize the equation Nb * (Nb-1)
= Nf which can be written as a quadratic equation Nb2-Nb-Nf
= 0. The equation to be solved is Nb2 - Nb - 8500
= 0]
*********
Go to Solved Exercises in DBMS page
buffer pages required for external sorting
number of passes required to complete sorting of a file using external sorting
external sorting technique in database
number of runs required for each pass in external sorting
I/O cost involved in sorting the entire file