Sunday, March 23, 2014

Parallel Join Technique - Partitioned Join

Partitioned Join

The relational tables that are to be joined gets partitioned on the joining attributes of both tables using same partitioning function to perform Join operation in parallel.

How does Partitioned Join work?

Assume that relational tables r and s need to be joined using attributes r.A and s.B. The system partitions both r and s using same partitioning technique into n partitions. In this process A and B attributes (joining attributes) to be used as partitioning attributes as well for r and s respectively. r is partitioned into r₀, r₁, r₂, …, r_n-1and s is partitioned into s₀, s₁, s₂, …, s_n-1. Then, the system sends partitions r_i and s_i into processor P_i, where the join is performed locally.

What type of joins can we perform in parallel using Partitioned Join?

Only joins such as Equi-Joins and Natural Joins can be performed using Partitioned join technique.

Why Equi-Jion and Natural Jion?

Equi-Join or Natural Join is done between two tables using an equality condition such as r.A = s.B. The tuples which are satisfying this condition, i.e, same value for both A and B, are joined together. Others are discarded. Hence, if we partition the relations r and s on applying certain partitioning technique on both A and B, then the tuples having same value for A and B will end up in the same partition. Let us analyze this using simple example;

*RegNo*	*SName*	*Gen*	*Phone*
1	Sundar	M	9898786756
3	Karthik	M	8798987867
4	John	M	7898886756
2	Ram	M	9897786776

Table 1 - STUDENT

*RegNo*	*Courses*
4	Database
2	Database
3	Data Structures
1	Multimedia

Table 2 – COURSES_REGD

Let us assume the following;

The RegNo attributes of tables STUDENT and COURSES_REGD are used for joining.

Observe the order of tuples in both tables. They are not in particular order. They are stored in random order on RegNo.

Partition the tables on RegNo attribute using Hash Partition. We have 2 disks and we need to partition the relational tables into two partitions (possibly equal). Hence, n is 2.

The hash function is, h(RegNo) = (RegNo mod n) = (RegNo mod 2). And, if we apply the hash function we shall get the tables STUDENT and COURSES_REGD partitioned into Disk₀ and Disk₁ as stated below.

Partition 0

Partition 1

*RegNo*	*SName*	*Gen*	*Phone*
4	John	M	7898886756
2	Ram	M	9897786776

STUDENT_0

*RegNo*	*SName*	*Gen*	*Phone*
1	Sundar	M	9898786756
3	Karthik	M	8798987867

STUDENT_1

*RegNo*	*Courses*
4	Database
2	Database

COURSES_REGD_0

*RegNo*	*Courses*
3	Data Structures
1	Multimedia

COURSES_REGD_1

From the above table, it is very clear that the same RegNo values of both tables STUDENT and COURSES_REGD are sent to same partitions. Now, join can be performed locally at every processor in parallel.

One more interesting fact about this join is, only 4 (2 Student records X 2 Courses_regd records) comparisons need to be done in every partition for our example. Hence, we need total of 8 comparisons in partitioned join against 16 (4 X 4) in conventional join.

The above discussed process is shown in Figure 1.

Figure 1 - Partitioned Join

Points to note:

1. There are only two ways of partitioning the relations,

Range partitioning on the join attributes or

Hash partitioning on the join attributes.

2. Only equi-joins and natural joins can be performed in parallel using Partitioned Join technique.
3. Non-equi-joins cannot be performed with this method.

4. After successful partitioning, the records at every processor can be joined locally using any of the joining techniques hash join, merge join, or nested loop join.

5. If Range partitioning technique is used to partition the relations into n processors, Skew may present a special problem. That is, for some partitions, we may get fewer records (tuples) for one relation for a given range and many records for other relation for the same range.

6. With Hash partitioning, if there are many tuples with same value in one relation then the difference both relations is possible in one partition. Otherwise, skew has minimal effect.

7. The number of comparisons between relations are well reduced in partitioned join parallel technique.

"The more I help others to succeed, the more I succeed."

- Ray Kroc

By Saravanakumar Kandasamy - March 23, 2014 2 comments:

Email This BlogThis!Share to X Share to Facebook Share to Pinterest

Labels: Intraoperation Parallelism, Parallel Database

Natural Join

It is a join technique which compares the common attributes of the tables to join the tables.
We have to check for the common attributes in the tables well before executing the join query.
NATURAL JOIN is the keyword used in the query to perform join.
Natural join cause problems when new columns are added or columns are renamed or more than one set of common attributes are available. Hence, Natural join is not recommended.

Example:

SELECT * FROM Employee NATURAL JOIN Course_Registration;

The result generated by the above query is slightly different in structure from that of Equi-join. See the post on Equi-join for table instances STUDENT and COURSE_REGISTRATION and to compare the result with Natural join. The result of the above SQL query is given in Table 1;

*RegNo*	*SName*	*Gen*	*Phone*	*Courses*
R1	Sundar	M	9898786756	Database
R2	Ram	M	9897786776	Database
R3	Karthik	M	8798987867	Data Structures
R4	John	M	7898886756	Multimedia

Table 1

*********

Go to Equi-join in SQL page

Go to Multiple choice questions in SQL/Relational algebra page

By Saravanakumar Kandasamy - March 23, 2014 No comments:

Email This BlogThis!Share to X Share to Facebook Share to Pinterest

Functional Dependency Quiz 1

Relational Database Design - Functional Dependency Quiz

1. __________ refers to a attribute or group of attributes mentioned in the left hand side of the arrow in a FD.

2. In a functional dependency X --> Y, if Y is functionally dependent on X, but not on X's proper subsets, then we would call the functional dependency as

3. Which of the following is the result of bad database design?

4. If X is {E, G, H, M} and Y is {G, M} then X --> Y is

5. If X --> YZ then X --> Y and X --> Z is

6. Consider F1 and F2 as two sets of functional dependencies. If every functional dependency in F2 can be inferred from the functional dependencies of F1 using inference rules, then F1 is _________ of F2

7. If X --> Y is a functional dependency and X and Y are sets of attributes, what is the relationship between X and Y?

8. For a functional dependency X --> Y, it is said to be _________ if Y is the subset or equal to X.

9. To check whether X (a set of one or more attributes) is a candidate key of relation R, we need to find ______ of X.

10. (a) If every functional dependency in F1 can be inferred from F2 on application of inference rules, and
(b) removal of any one attribute from any functional dependency of F2 violates (a).
Then F2 is called ______ for F1.

Score =

Correct answers:

By Saravanakumar Kandasamy - March 23, 2014 No comments:

Email This BlogThis!Share to X Share to Facebook Share to Pinterest

Labels: Database Quizzes

Friday, March 21, 2014

ER Model Quiz 3

DBMS Basics and Entity-Relationship Model - Quiz 3

1. Degree of relational table is

2. Cardinality of the relational table is

3. Which of the following would be the appropriate term for a key which consists of more than one attribute?

4. Degree of the relationship set is

5. Cardinality of an attribute in a relational table is

6. A Multivalued attribute in an ER diagram

7. The set of candidate keys that are not selected as the Primary key is called

8. "CREATE TABLE Emp (Emp_No CHAR(5), EName VARCHAR(25), SALARY NUMBER(8))". This DDL statment will

9. An attribute or set of attributes of one relation that matches an attribute or set of attributes of other relation is

TOPICS (Click to Navigate)

Sunday, March 23, 2014

Parallel Join Technique - Partitioned Join

Partitioned Join

How does Partitioned Join work?

What type of joins can we perform in parallel using Partitioned Join?

Why Equi-Jion and Natural Jion?

Points to note:

Natural Join

Natural Join

Example:

Functional Dependency Quiz 1

Relational Database Design - Functional Dependency Quiz

Friday, March 21, 2014

ER Model Quiz 3

DBMS Basics and Entity-Relationship Model - Quiz 3

Featured Content

Multiple choice questions in Natural Language Processing Home

All time most popular contents

Report Abuse