Please visit, subscribe and share 10 Minutes Lectures in Computer Science
Showing posts with label Database Performance. Show all posts
Showing posts with label Database Performance. Show all posts

## Sunday, December 20, 2020

### Query processing and optimization exercise in DBMS

DBMS solved exercises, database query processing and optimization solved exercises, how to find the query IO cost for natural join

Question:

Consider doing a natural join operation on two relations R(A, B) and S(B, C). Assume that the tuples of R are stored contiguously on 400 disk blocks,  while the tuples of S are stored contiguously on 1000 blocks. Each block holds 20 tuples (same for R as for S). There are 51 memory blocks available.

Calculate the I/O cost for iteration join (assume that R and S are not sorted), and sort-merge join (assume that R and S are sorted on attribute B) algorithms. Ignore the I/O cost of writing the final output to disk.

Solution:

Given,

Total disk blocks for R, B(R) = 400

Total disk block for S, B(S) = 1000

Iteration join:

We have only 51 blocks are available in main memory. The iteration join algorithm can read 50 R blocks into memory, and all blocks of S (1 block at a time) into memory at a time, to join S with R. The process is repeated until all outputs are generated.

Number of IOs to read R into memory = 400, (ie., 50 * 8)

Number of IOs to read S into memory = for each of 50 R blocks we need 1000 IOs to read S

Total number of IOs required = 400 + 1000 * 8 = 8400

Merge join:

Given,

Both R and S are sorted on the join attribute B.

Merge join: Both sorted files are scanned concurrently in order of the join attributes, matching the records that have the same values for A and B.

Cost of merging R and S = B(R) + B(S) = 400 + 1000 = 1400

**************

Go to Normalization - Solved Exercises page

## Harddisk drive vs RAM - a comparison

 Hard Drive RAM Size Large sizes Internal drives – Max 8 TB for desktops External drives – Max 8 TB Small sizes Max 32 GB (as per the year 2015 for desktop computers) Storage Permanent (Non-volatile) Do not lose data due to power loss (Switched off). Temporary (Volatile) Lose all data due to power loss (Switched off). Time taken to read files Vary for files. Depends on the size and location of a file. Fixed for files. We are able to pick the data directly as per the size. Units of storage Bytes Bytes Requires power to retain content No Hard drives retain contents as the writing surfaces are magnetic Yes RAM cannot retain the content due to power loss is because of RAM is made up of circuits. Made up of Magnetic components IC (Integrated Circuits) components Storage capacity vs performance No performance improvement. Increase in the size of the hard disk, say, from 500 GB to 1 TB would not show any performance improvements. Performance changes. Increase in the size of RAM, say, from 4 GB to 8 GB will definitely show some improvements in performance. How the files are stored? Magnetically Electronically Flexibility in space Flexible. We can delete unwanted files if we need more space (on the same disk) Not flexible. If any program does not function due to less memory size, you need to upgrade your memory to make that program work. Hardware dependency Works on 32 bit and 64 bit computers Upto 4 GB – supported by 32 bit computers Upto and above 4 GB – supported by 64 bit computers

## What are the performance measures of hard disks?

### Function of a hard disk

First, let us see the function and internals of a hard disk drive. See the figure given below for the essential things inside the hard disk arrangement for our discussion.
 Figure 1 - internal component of a hard disk drive
Platter - flat circular disk (like CDs or DVDs) that hold data
Track - the logical division of a side of a platter
Read-write head - a small part of hard disk that move above the disk platter to read or write data from or to disk. [read-write head transform the platter's magnetic field into electrical current - reading, read-write head transform the electrical current into magnetic field - writing]
Spindle - fixed setup which rotates the platters

### How does a hard disk drive function?

If a request is initiated by the Operating System to read a file, the information is passed on to the Disk-controller, and the disk-controller initiates necessary actions.

1. Read-write head is positioned over the right track where the required data might be stored.
2. Platters are rotated to bring the right sector under the read-write head.
3. The data transfer begins at this point.

Watch the video to see the movements of the internal components. [video is taken from Wikipedia.org].

### Various performance measures of hard disk

They are,

• Disk capacity,
• Access time,
• Data-transfer rate
• Reliability,
• Power consumption and Shock resistance.

Disk Capacity – the amount of data which can be stored (measured in terms of GBs / TBs today).
Access time – the time consumed between the request initiation and the beginning of data transfer.
For example, if you would like to open a file, you need to click/double click the file. That moment, the request is generated. This is the request initiation time. To access the data on a particular sector, the disk arm is positioned the read/write head over the correct track of the required sector. Then the disk platter is rotated to position the correct sector under the read-write head. At this moment the transfer begins. This is the data transfer beginning time. The actual time consumed between these two times is called the Access time.
The key components that are considered in calculating the Access time are;

• Seek time – the time for repositioning the arm so that the read-write head can be positioned over the correct track. [Average seek time for Seagate hard disk drive Barracuda 7200.9 is 11 ms approximately]

• Rotational latency time – the time spent for the required sector to appear under the read-write head. [Typical rotational latency for hard disk drive with the speed of 7200 rpm is 4.17 ms approximately]

Hence,
Access time = Seek time + Rotational latency time
Data transfer rate – the rate (speed) at which data can be read from / written to a disk. [Maximum external transfer rate for Seagate hard disk drive Barracuda 7200.9 is 300 MB/s approximately].

Data transfer rate = minimum (the speed at which the data can be moved between disk surface and disk controller, the speed at which the data can be moved between disk controller and the host system)
For example,
Data transfer rate = minimum (100 MB/s, 150 MB/s) = 100 MB/s.
Reliability – How far a disk is reliable, ie., free from failure is one of the major measures. Mean-Time-To-Failure (MTTF) is the measure used to calculate the reliability of the disk. [Typical MTTF value ranges from 30,000 hours to 1,200,000 hours today]
Power consumption and Shock resistance– they are the other issues which are very much analyzed in today’s scenario for the cases mobile devices such as laptops. To get rid of waste heat pollution, these measures should be considered for even desktop computers also.