Partitioned Parallel Hash Join

Sunday, June 22, 2014

Partitioned Parallel Hash Join

Partitioned Parallel Hash Join in Parallel Database / Parallel Hash Join Technique

Partitioned Parallel Hash Join

The Sequential Hash Join technique discussed in the post Hash Join technique in DBMS can be parallelized. .

The Technique

Please go through the post Hash Join technique in DBMS for better understanding of Parallel Hash-Join.

Assume that we have,

· n processors, P0, P1, …, Pn-1

· two tables, r and s (r and s are already partitioned into disks of n processors),

· s is the smaller table

Parallel Hash-Join Algorithm:

Step 1: Choose a hash function h1. h1 should be able to take the join attribute value of each tuple of relations r and s, and maps them to one of the n processors.

Step 2: Smaller relations s is taken as the Build relation.

Step 3: Each processor Pi reads the tuples of smaller relation s in it’s disk Di, and sends them to different processors based on the hash function h1.

Step 4: On receiving the records of s_i at every destination processor Pi, the Pi further partitions the tuples using another hash function h2. This hash function is used for joining the tuples locally at every processor. This step is independent for any processor.

Step 5: On completion of the distribution of all the records of smaller relation s, it is now the turn for the larger relation r. Step 3 is done for all the records of relation r.

Step 6: On receiving the records of r_i at every destination processor Pi, the Pi further partitions the tuples using another hash function h2. This phase is called the Probe Phase.

Step 7: At last, all the records which can be joined will be in same processors’ same partitions. Each processor Pi executes the build and probe phases of the hash–join algorithm on the local partitions r_i and s_i of r and s to produce a partition of the final result of the hash–join.

Points to remember:

Hash Join at each processor is independent of the other.

It works for equi-join and natural join conditions

Major links

Quicklinks

Sunday, June 22, 2014

Partitioned Parallel Hash Join

Partitioned Parallel Hash Join in Parallel Database / Parallel Hash Join Technique

Partitioned Parallel Hash Join

The Technique

Assume that we have,

Parallel Hash-Join Algorithm:

Points to remember:

Featured Content

Multiple choice questions in Natural Language Processing Home

All time most popular contents

Report Abuse