Important considerations in a distributed database compared with a centralized database
Compared with a centralized database system, a distributed database system must address the following:
- Data allocation - We need to answer the following questions: What to store? Where to store it? How to store it?
- Data fragmentation - This concerns how the data should be organized, that is, how a table is split into pieces.
- Distributed queries and transactions - We must find a way to query the data and to handle transactions that span multiple distributed sites (here, a site means a server).
1. Data allocation
Data allocation deals with establishing servers and maintaining data at the various locations. A data allocation strategy should keep the following in mind:
- The data should be available at or near the site where it is needed most.
- Storing data at a site should increase the availability and reliability of that data.
- The strategy chosen for data allocation should improve performance. That is, it should avoid drawbacks of the central-server approach, such as the server becoming a bottleneck or the data having limited usability.
- The strategy should reduce the cost of storing and manipulating data.
- Network traffic should be much reduced. In particular, the network should never be used unnecessarily when the data is available nearby.
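The first guideline above (keep data near the site that needs it most) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the access log format, the site names, the fragment names, and the function choose_site are all hypothetical, and a real allocation strategy would also weigh storage cost, replication, and update traffic.

```python
from collections import Counter

def choose_site(access_log, fragment):
    """Return the site that accessed `fragment` most often, or None.

    access_log is assumed to be an iterable of (site, fragment) pairs,
    one pair per recorded access (a hypothetical log format).
    """
    counts = Counter(site for site, frag in access_log if frag == fragment)
    if not counts:
        return None  # no recorded access, so no preferred site
    return counts.most_common(1)[0][0]

# Hypothetical access log: Mumbai reads its own fragment twice, Delhi once.
access_log = [
    ("mumbai", "student_mumbai"),
    ("mumbai", "student_mumbai"),
    ("delhi", "student_mumbai"),
    ("delhi", "student_delhi"),
]

print(choose_site(access_log, "student_mumbai"))
```

Placing each fragment at its most frequent reader keeps most accesses local, which is one simple way to cut network traffic.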
2. Data fragmentation
Data fragmentation is about how to break a table into fragments and how many fragments to create. A table can be fragmented based on (a) which applications access the data frequently, (b) which conditions are frequently used to access the data, and (c) what is the simplest way to maintain the table schema at each location. Questions (a) and (b) refer to the attributes, and their values, that are frequently used to access a table. For example, in the query "SELECT * FROM student WHERE campus='Mumbai'", campus='Mumbai' is such an attribute name and value combination.
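As a rough illustration of questions (a) and (b), the sketch below counts how often each attribute and value combination appears in a workload. The workload list is made-up sample data, and the helper names rank_predicates and candidate_attribute are hypothetical; the idea is only that frequently used conditions suggest where to split a table.

```python
from collections import Counter

# Hypothetical workload: one (attribute, value) pair per WHERE condition seen.
workload = [
    ("campus", "Mumbai"),
    ("campus", "Mumbai"),
    ("campus", "Delhi"),
    ("dept", "CS"),
]

def rank_predicates(workload):
    """List (predicate, count) pairs, most frequently used first."""
    return Counter(workload).most_common()

def candidate_attribute(workload):
    """Return the attribute that appears in the most conditions."""
    by_attribute = Counter(attribute for attribute, _ in workload)
    return by_attribute.most_common(1)[0][0]

print(rank_predicates(workload)[0])
print(candidate_attribute(workload))
```

With this sample workload, campus is the attribute used most often, so fragmenting the student table by campus would be a reasonable first candidate.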
Fragmentation is of two major types:
- Horizontal fragmentation
  - Primary horizontal fragmentation
  - Derived horizontal fragmentation
- Vertical fragmentation
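The fragmentation types listed above can be illustrated on small in-memory tables. This is a minimal sketch, not a definitive implementation: the students and enrolments rows are invented sample data, and the functions horizontal, derived_horizontal, and vertical are hypothetical helpers. It assumes that primary horizontal fragmentation selects rows by a condition on the table itself, that derived horizontal fragmentation selects rows of a related table by matching them to an existing fragment, and that vertical fragmentation selects columns while repeating the key.

```python
# Invented sample tables, represented as lists of row dictionaries.
students = [
    {"id": 1, "name": "Asha", "campus": "Mumbai"},
    {"id": 2, "name": "Ravi", "campus": "Delhi"},
    {"id": 3, "name": "Meena", "campus": "Mumbai"},
]
enrolments = [
    {"student_id": 1, "course": "DBMS"},
    {"student_id": 2, "course": "OS"},
    {"student_id": 3, "course": "DBMS"},
]

def horizontal(rows, predicate):
    """Primary horizontal fragment: the rows that satisfy a condition."""
    return [row for row in rows if predicate(row)]

def derived_horizontal(member_rows, owner_fragment, foreign_key, primary_key):
    """Derived horizontal fragment: rows that reference the owner fragment."""
    keys = {row[primary_key] for row in owner_fragment}
    return [row for row in member_rows if row[foreign_key] in keys]

def vertical(rows, columns, key="id"):
    """Vertical fragment: chosen columns, plus the key for reconstruction."""
    kept = [key] + [c for c in columns if c != key]
    return [{c: row[c] for c in kept} for row in rows]

mumbai = horizontal(students, lambda row: row["campus"] == "Mumbai")
mumbai_enrolments = derived_horizontal(enrolments, mumbai, "student_id", "id")
names = vertical(students, ["name"])
```

Keeping the key in every vertical fragment is what allows the original rows to be rebuilt later by joining the fragments on that key.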
3. Distributed queries and transactions
When data is fragmented or replicated and distributed over many sites in the network, retrieving it involves the following:
- Identifying the location of the requested data,
- A protocol to fetch the data, and
- A way to combine the data if it is spread over multiple sites.
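The three steps above (locate, fetch, combine) can be sketched as follows. Everything here is an assumption made for illustration: CATALOG and SITES are hypothetical in-memory stand-ins for a distributed catalog and for remote servers, and fetch simply reads a list where a real system would make a network call.

```python
# Hypothetical catalog: campus value -> site that stores that fragment.
CATALOG = {"Mumbai": "site_a", "Delhi": "site_b"}

# Hypothetical sites, each holding one horizontal fragment of student.
SITES = {
    "site_a": [{"id": 1, "campus": "Mumbai"}, {"id": 3, "campus": "Mumbai"}],
    "site_b": [{"id": 2, "campus": "Delhi"}],
}

def locate(campus=None):
    """Step 1: find which sites hold the requested data."""
    if campus is not None:
        return [CATALOG[campus]] if campus in CATALOG else []
    return sorted(set(CATALOG.values()))  # no condition: ask every site

def fetch(site, campus=None):
    """Step 2: fetch matching rows from one site (stand-in for a network call)."""
    return [row for row in SITES[site] if campus is None or row["campus"] == campus]

def query_students(campus=None):
    """Step 3: combine the partial results from all relevant sites."""
    rows = []
    for site in locate(campus):
        rows.extend(fetch(site, campus))
    return sorted(rows, key=lambda row: row["id"])

print(query_students("Mumbai"))
```

A query with a condition on campus touches only one site, while a query without it has to contact every site and merge the results.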
Hence, a distributed database system must be able to handle data over the network. Compared with a conventional centralized database, it needs a special way to handle queries and transactions. That is, the system must understand the query and its components and must be able to locate the data over the network.
Further discussion of these considerations will follow soon.