Showing posts with label Fragmentation. Show all posts
Showing posts with label Fragmentation. Show all posts

Advantages of Data Fragmentation in Distributed Database


Advantages of Data Fragmentation / What are the advantages of fragmentation in distributed database management systems?


Advantages of Fragmentation


  • Parallelized execution of queries by different sites is possible. A query can be divided into several sub-queries and can be executed simultaneously at several sites.
  • A database is fragmented and fragments are made available at different sites. It reduces disk space utilization considerably.
  • Data management is easy as fragments are smaller compare to the complete database.
  • Increased availability of data to the users/queries that are local to the site in which the data stored.
  • Most of the applications work with views rather than entire relations; hence in terms of usage, fragmentation is advantageous.
  • As the data is available close to the place where it is mostly frequently used, the efficiency of the system in terms of query processing, transaction processing is increased.
  • Data that are not required by local applications are not stored locally. It leads to reduced data transfer between sites, and increased security.



Distributed Database - Fragmentation continued


Fragmentation in Distributed Database System / Horizontal Fragmentation in Distributed Database / Derived Horizontal Fragmentation Example / Derived Horizontal Fragmentation Explained


Derived Horizontal Fragmentation


The process of creating horizontal fragments of a table in question based on the already created horizontal fragments of another relation (for example, base table) is called Derived Horizontal Fragmentation.

In the previous post, we have seen about Primary Horizontal Fragmentation. We use the primary horizontal technique when we would like to horizontally fragment a table which is not dependent on any other table, or without considering any other table. That is, a table fragmented based on set of conditions where all the conditional attributes are part of that table only. This type of fragmentation is simple and straight forward. But in most of the cases, we need to fragment a database as a whole. For example, consider a relation which is connected with another relation using foreign key concept. That is, whenever a record is inserted into the child table, the foreign key column value of the inserted record must be verified for its availability in its parent table. In such condition, we cannot fragment the parent table (Table with primary key) and the child table (table with foreign key). If we fragment the tables separately, then for every insertion of records the table must verify the existence of one such value in the parent table. Hence, for this case the Primary Horizontal Fragmentation would not work.

Consider an example, where an organization maintains the information about its customers.They store information about the customer in CUSTOMER table and the customer addresses in C_ADDRESS table as follows;
CUSTOMER(CId, CName, Prod_Purchased, Shop_Location)
C_ADDRESS(CId, C_Address)
The table CUSTOMER stores information about the customer, the product purchased from their shop, and the shop location where the product is purchased. C_Address stores information about permanent and present addresses of the customer. Here, CUSTOMER is the owner relation and C_ADDRESS is the member relation.


Figure 1: CUSTOMER table
CID
CNAME
PROD_PURCHASED
SHOP_LOCATION
C001
Ram
Air Conditioner
Mumbai
C002
Guru
Television
Chennai
C010
Murugan
Television
Coimbatore
C003
Yuvraj
DVD Player
Pune
C004
Gopinath
Washing machine
Coimbatore

Figure 2: C_ADDRESS table
CID
C_ADDRESS
C001
Bandra, Mumbai
C001
XYZ, Pune
C002
T.Nagar, Chennai
C002
Kovil street, Madurai
C003
ABX, Pune
C004
Gandhipuram, Ooty
C004
North street, Erode
C010
Peelamedu, Coimbatore

If the organization would go for fragmenting the relation CUSTOMER on the shop_location attribute, it needs to create 4 fragments using horizontal fragmentation technique as given in Figure 3 below.

Figure 3: Horizontal fragments of Figure 1 on Shop_Location attribute
CUSTOMER1
CID
CNAME
PROD_PURCHASED
SHOP_LOCATION
C001
Ram
Air Conditioner
Mumbai

CUSTOMER2
CID
CNAME
PROD_PURCHASED
SHOP_LOCATION
C002
Guru
Television
Chennai

CUSTOMER3
CID
CNAME
PROD_PURCHASED
SHOP_LOCATION
C010
Murugan
Television
Coimbatore
C004
Gopinath
Washing machine
Coimbatore

CUSTOMER4
CID
CNAME
PROD_PURCHASED
SHOP_LOCATION
C003
Yuvraj
DVD Player
Pune

Now, it is necessary to fragment the second relation C_ADDRESS based on the fragment created on CUSTOMER relation. Because, in any other way, if we fragment the relation C_ADDRESS, then it may end in different location for different data. For example, if C_ADDRESS is fragmented on the last digit of the CID attribute, it will end up with more number of fragments and the data may not be stored in the same location where customer information are stored. That is, customer ‘Ram’ information is stored in Mumbai and his address information might be stored somewhere else. To avoid such confusion, the table C_ADDRESS which is actually a member table of CUSTOMER, must be fragmented into four fragments and based on the CUSTOMER table fragments given in Figure 3. This type of fragmentation based on owner relation is called Derived Horizontal Fragmentation. This will work for relations where an equi-join is required for joining two relations. Because, an equi-join can be represented as set of semi-joins.

The fragmentation of C_ADDRESS is done as follow as set of semi-joins as follows.
C_ADDRESS1 = C_ADDRESS CUSTOMER1
C_ADDRESS2 = C_ADDRESS CUSTOMER2
C_ADDRESS3 = C_ADDRESS CUSTOMER3
C_ADDRESS4 = C_ADDRESS CUSTOMER4

This will result in four fragments of C_ADDRESS where the customer address of all customers of fragment CUSTOMER1 will go into C_ADDRESS1, and the customer address of all customers of fragment CUSTOMER2 will go into C_ADDRESS2, and so on. The resultant fragment of C_ADDRESS will be the following.

Figure 4: Derived Horizontal fragments of Figure 2 as a member relation of the owner relation’s fragments from Figure 3
C_ADDRESS1
CID
C_ADDRESS
C001
Bandra, Mumbai
C001
XYZ, Pune

C_ADDRESS2
CID
C_ADDRESS
C002
T.Nagar, Chennai
C002
Kovil street, Madurai

C_ADDRESS3
CID
C_ADDRESS
C004
Gandhipuram, Ooty
C004
North street, Erode
C010
Peelamedu, Coimbatore

C_ADDRESS4
CID
C_ADDRESS
C003
ABX, Pune

 

Checking for correctness
Completeness: The completeness of a derived horizontal fragmentation is more difficult than primary horizontal fragmentation. Because, the predicates used are determining the fragmentation of two relations. Formally, for fragmentation of two relations R and S, such as {R1, R2, …, R3} and {S1, S2, …, S3}, there should be one common attribute such as A. Then, for each tuple t of Ri, there should be a tuple Si which have a common value for A. This is known as referential integrity.
The derived fragmentation of C_ADDRESS is complete. Because, the value of the common attributes CID for the fragments CUSTOMERi and C_ADDRESSi are the same. For example, the value present in CID of CUSTOMER1 is also and only present in C_ADDRESS1,  etc.

Reconstruction: Reconstruction of a relation from its fragments is performed by the union operator in both the primary and the derived horizontal fragmentation.

Disjointness: If the minterm predicates are mutually exclusive then the disjointness rule is satisfied for Primary Horizontal Fragmentation. For derived horizontal fragmentation, we state that the fragments are disjoint if the fragments were created using the mutually exclusive simple predicates of the base relation. Hence, in our example, as the simple predicates Shop_Location=’Mumbai’, etc are mutually exclusive, the derived fragments are also disjoint.

*************

Go to Distributed Database page
 
Go to Distributed Database - Fragmentation page

Go to Primary Horizontal Fragmentation page



Wikipedia

Search results