Thursday, February 5, 2015

Components of Distributed DBMS



The components of Distributed DBMS architecture / List the components of peer-to-peer distributed database management systems / What are the components and sub-components of Distributed DBMS? / List the major and minor components of distributed dbms architecture


Components of Distributed DBMS Architecture


The user processor and the data processor are the two major components of a Distributed DBMS architecture. In a peer-to-peer Distributed DBMS, each of them handles user requests through several sub-components, listed below.

User Processor

  • User interface handler – interprets user commands as they arrive and formats the result sets returned to the user.
  • Semantic data controller – uses the Global Conceptual Schema to check the integrity constraints defined on database elements and the authorizations for accessing the requested data.
  • Global query optimizer and decomposer – devises the best execution strategy for a user request at minimal cost (in terms of time, processor, and memory). It resembles the query optimizer of a centralized database system, except that the strategy it devises must be globally optimal.
  • Distributed execution monitor – acts as the transaction manager. The transaction managers of the sites that participate in a query's execution communicate with each other as part of execution monitoring.

Data Processor

  • Local query optimizer – optimizes data access by choosing the best access path. For example, it decides which index to use to execute a given query efficiently.
  • Local recovery manager – keeps the local database consistent. After a failure, it is responsible for restoring the local database to a consistent state.
  • Run-time support processor – it accesses the database physically according to the strategy suggested by the local query optimizer. The run-time support processor is the interface to the operating system and contains the database buffer (or cache) manager, which is responsible for maintaining the main memory buffers and managing the data accesses.


Figure 1 depicts the major and minor components of a Distributed DBMS and their communication links. [Image taken from the book Principles of Distributed Database Systems, Third Edition, by M. Tamer Özsu and Patrick Valduriez.]

Wednesday, February 4, 2015

Keywords and Definitions in Distributed Database


Keywords in Distributed Database / Keywords defined in Distributed Database / Two Marks Questions with Answers in Distributed Database


Keywords and Definitions


Replication (in Database)

Storing identical copies of the same relation (table) at different sites of a distributed database is called replication.
For example, suppose a relation R is copied to several sites. After replication to n different sites, the following holds:

R in Site 1 = R in Site 2 = R in Site 3 = ... = R in Site n

Fragmentation (in Database)

Partitioning a relation (table) vertically or horizontally into several fragments and storing each fragment at a different site is called fragmentation.
For example, suppose a relation R is horizontally fragmented into n fragments R1, R2, ..., Rn, and each fragment is sent to a different site. The original relation is then the union of the fragments:

R = R1 U R2 U R3 U ... U Rn

Alternatively, suppose R is vertically fragmented into n fragments, each sent to a different site. Every fragment retains the key of R, so the original relation is the natural join of the fragments:

R = R1 ⋈ R2 ⋈ R3 ⋈ ... ⋈ Rn
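
The replication and fragmentation conditions above can be checked on a toy relation. The following is only a minimal sketch: the relation R(id, name, city), its rows, and the site names are made up for illustration, and Python sets stand in for tables.

```python
# Hypothetical relation R(id, name, city); id is the key. Rows are illustrative only.
R = {
    (1, "Asha", "Chennai"),
    (2, "Ben", "Delhi"),
    (3, "Chen", "Chennai"),
}

# Replication: every site stores an identical copy of R.
sites = {f"site{i}": set(R) for i in range(1, 4)}
replicas_identical = all(copy == R for copy in sites.values())

# Horizontal fragmentation: split the rows by a predicate on city.
h1 = {row for row in R if row[2] == "Chennai"}
h2 = {row for row in R if row[2] != "Chennai"}
horizontal_ok = (h1 | h2) == R  # R = R1 U R2

# Vertical fragmentation: split the columns; each fragment keeps the key id.
v1 = {(rid, name) for rid, name, _ in R}
v2 = {(rid, city) for rid, _, city in R}

# Natural join of the two vertical fragments on the shared key id.
joined = {
    (rid, name, city)
    for rid, name in v1
    for rid2, city in v2
    if rid == rid2
}
vertical_ok = joined == R

print(replicas_identical, horizontal_ok, vertical_ok)
```

If the vertical fragments did not both carry the key, the join could produce rows that were never in R, which is why the key is repeated in each fragment.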

In-doubt Transaction

When a site recovers from a failure, its first job is to examine its own log file and resolve every transaction that was executing at the time of the failure. For each transaction, the recovering site looks for control records such as <commit T>, <abort T>, or <ready T>.

If the log contains a <ready T> record for a transaction but no <commit T> or <abort T> record, the recovering site cannot decide the fate of that transaction on its own. It has to contact other sites to find out whether the transaction committed or aborted. Such a transaction is called an in-doubt transaction.
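
The log scan described above can be sketched as follows. This is an illustration under assumptions: the log is modelled as a list of (record, transaction) pairs, and the function name classify_transactions and the status labels are hypothetical, not taken from any real DBMS.

```python
def classify_transactions(log):
    """Scan a site's log in order and classify each transaction.

    `log` is a list of (record, txn) pairs, an assumed stand-in for
    control records such as <ready T>, <commit T>, <abort T>.
    """
    status = {}
    for record, txn in log:
        if record == "commit":
            status[txn] = "committed"
        elif record == "abort":
            status[txn] = "aborted"
        elif record == "ready" and status.get(txn) not in ("committed", "aborted"):
            # Ready was logged but no final decision has been seen so far.
            status[txn] = "in-doubt"
    return status


# Hypothetical log contents at the recovering site.
log = [
    ("ready", "T1"), ("commit", "T1"),
    ("ready", "T2"),
    ("ready", "T3"), ("abort", "T3"),
]
status = classify_transactions(log)
in_doubt = [txn for txn, state in status.items() if state == "in-doubt"]
print(in_doubt)
```

Here T2 has a ready record but no commit or abort record, so the site would have to ask other sites about it.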

The Purpose of Commit Protocol in Distributed Database

A commit protocol ensures atomicity. A distributed transaction is a single transaction that executes at several sites. When it finishes, all participating sites must reach the same decision: either all of them commit or all of them abort. Otherwise the database would be left in an inconsistent state.
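
The all-or-nothing rule can be illustrated with a simple voting scheme in the style of two-phase commit. The text above does not name a specific protocol, so treat this as an assumed example; the function decide and the site names are hypothetical, and failures and timeouts are ignored.

```python
def decide(votes):
    """Return the decision each site must apply.

    `votes` maps a site name to True (ready to commit) or False.
    The transaction commits only if every site voted ready.
    """
    decision = "commit" if all(votes.values()) else "abort"
    # Every participant receives the same decision, which gives atomicity.
    return {site: decision for site in votes}


all_ready = decide({"site1": True, "site2": True, "site3": True})
one_refuses = decide({"site1": True, "site2": False, "site3": True})
print(all_ready)
print(one_refuses)
```

A single negative vote forces every site to abort, so no site can end up committed while another has aborted.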


***********


Saturday, January 31, 2015

Dependency Preservation


Define dependency preservation / What is dependency preservation? / Why do we need dependency preserving decomposition? / The need for dependency preserving decomposition / dependency preserving decomposition example / What is dependency Preservation property for decomposition? / Why dependency preservation is important? / Dependency preservation property of normalization process


Dependency Preservation


A decomposition of a relation R into R1, R2, R3, …, Rn is a dependency-preserving decomposition with respect to the set of functional dependencies F that hold on R if the following holds:

(F1 U F2 U F3 U … U Fn)+ = F+
where,
F1, F2, F3, …, Fn - Sets of functional dependencies of relations R1, R2, R3, …, Rn (each Fi contains the dependencies in F+ that involve only attributes of Ri).

(F1 U F2 U F3 U … U Fn)+ - Closure of Union of all sets of functional dependencies.

F+ - Closure of set of functional dependency F of R.

In other words, if the closure of the union of the functional dependencies of the individual relations R1, R2, R3, …, Rn equals the closure of the functional dependencies of the original relation R (before decomposition), then the decomposition is dependency preserving. Note that this is a separate property from the lossless-join property.

Discussion with Example of Non-dependency preserving decomposition:


Dependency preservation is a concept that is very much related to Normalization process. Remember that the solution for converting a relation into a higher normal form is to decompose the relation into two or more relations. This is done using the set of functional dependencies identified in the lower normal form state. 

For example, let us assume a relation R (A, B, C, D) with the set of functional dependencies F = {A → B, B → C, C → D}. The only candidate key is A, a single attribute, so no partial dependency is possible. Hence, this relation is in 2NF.

Is R (A, B, C, D) in 3NF? No. The reason is the transitive functional dependencies: for example, A → B and B → C make C transitively dependent on the key A. How do we convert R into 3NF? The solution is decomposition.

Assume that we decompose R(A, B, C, D) into R1(A, C, D) and R2(B, C).
In R1(A, C, D), the FD C → D holds. In R2(B, C), the FD B → C holds. But there is no trace of the FD A → B. Hence, this decomposition does not preserve dependencies.

What problem does the above decomposition cause?

In R, the following held:

  • value of B depends on value of A,
  • value of C depends on value of B,
  • value of D depends on value of C.
After decomposition, the relations R1 and R2 enforce only the following:

  • value of C depends on value of B,
  • value of D depends on value of C.

The dependency A → B is missing. Attribute A is stored only in R1 and attribute B only in R2, so neither relation can check A → B on its own. R2 therefore accepts any value for B, including values that violate A → B. To enforce the dependency on every insert or update, we would have to join R1 and R2, which is a costly operation. Hence, we require the decomposition to be dependency preserving.

A few key points:

  • We would like to check easily that updates to the database do not result in illegal relations being created.
  • It would be nice if our design allowed us to check updates without having to compute natural joins.
  • We can permit a non-dependency preserving decomposition if the database is static. That is, if there is no new insertion or update in the future.
Example:

Assume R(A, B, C, D) with FDs A → B, B → C, C → D.
Let us decompose R into R1 and R2 as follows:
R1(A, B, C)
R2(C, D)
The FDs A → B and B → C hold in R1.
The FD C → D holds in R2.
Every functional dependency in F is preserved. Hence, this decomposition is dependency preserving.
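
Both decompositions discussed above can be checked mechanically. The sketch below uses the standard textbook test that avoids computing F+ explicitly: for each FD X → Y, grow X using only what each Ri can see, and check that Y is reached. The function names closure and preserves_dependencies are illustrative choices.

```python
def closure(attrs, fds):
    """Attribute closure of `attrs` under the FDs in `fds`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def preserves_dependencies(fds, decomposition):
    """True if every FD in `fds` follows from the FDs projected onto the fragments."""
    for lhs, rhs in fds:
        result = set(lhs)
        changed = True
        while changed:
            changed = False
            for ri in decomposition:
                # Attributes derivable from the part of `result` visible in Ri.
                gained = closure(result & ri, fds) & ri
                if not gained <= result:
                    result |= gained
                    changed = True
        if not rhs <= result:
            return False
    return True


# F = {A -> B, B -> C, C -> D} on R(A, B, C, D)
F = [({"A"}, {"B"}), ({"B"}, {"C"}), ({"C"}, {"D"})]

first = preserves_dependencies(F, [{"A", "C", "D"}, {"B", "C"}])   # R1(A,C,D), R2(B,C)
second = preserves_dependencies(F, [{"A", "B", "C"}, {"C", "D"}])  # R1(A,B,C), R2(C,D)
print(first, second)
```

For the first decomposition the test fails on A → B, because B never becomes reachable from A through either fragment. For the second, every FD is recovered.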


Friday, January 30, 2015

Operating Systems Concepts, Questions with Answers


All Operating Systems Questions with Clear Answers / Operating Systems Exercises Solved / Solutions for Exercises in Operating Systems / Operating Systems selected Questions with Answers / Operating Systems Notes / Index for various online materials on Operating Systems



1. Processes and Threads

2. Inter-process Communication

3. Inter-process Communication in Linux

4. Process Scheduling

5. Process Scheduling in Linux

6. Process Synchronization

7. Deadlocks

8. Memory Management

9. Memory Management in Linux

10. File System Concepts

11. File System Implementation

12. File System in Linux

13. File System in Windows

Advanced Concepts in Operating System

1. Distributed OS

2. Distributed Deadlock Handling

3. Distributed Shared Memory

4. Multiprocessor OS


1. Peterson’s algorithm


2. Inter Process Communication (IPC) through Shared Memory, Race Conditions, Mutual Exclusion, Peterson's Algorithms etc.


3. Scheduling Algorithms


4. Paging and TLB


5. Deadlock




8. Memory Management


9. Distributed Operating Systems
With the advent of computer networks, in which many computers are linked together and are able to communicate with one another, distributed computing became feasible. A distributed computation is one that is carried out on more than one machine in a cooperative manner. A group of linked computers working cooperatively on tasks, referred to as a distributed system, often requires a distributed operating system to manage the distributed resources. Distributed operating systems must handle all the usual problems of operating systems, such as deadlock. Distributed deadlock is very difficult to prevent; it is not feasible to number all the resources in a distributed system. Hence, deadlock must be detected by some scheme that incorporates substantial communication among network sites and careful synchronization, lest network delays cause deadlocks to be falsely detected and processes aborted unnecessarily. Interprocess communication must be extended to processes residing on different network hosts, since the loosely coupled architecture of computer networks requires that all communication be done by message passing. Important systems concerns unique to the distributed case are workload sharing, which attempts to take advantage of access to multiple computers to complete jobs faster; task migration, which supports workload sharing by efficiently moving jobs among machines; and automatic task replication at different sites for greater reliability. 
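
One way to picture distributed deadlock detection is to collect each site's local wait-for graph at one place and look for a cycle in the merged graph. The sketch below assumes this centralised approach, which is only one of several schemes and is not named in the text above. The process names and the functions merge_graphs and has_cycle are hypothetical, and the message delays that can cause false detections are not modelled.

```python
def merge_graphs(local_graphs):
    """Combine per-site wait-for graphs (process -> set of processes it waits for)."""
    merged = {}
    for graph in local_graphs:
        for process, waits_for in graph.items():
            merged.setdefault(process, set()).update(waits_for)
    return merged


def has_cycle(graph):
    """Depth-first search for a cycle in a wait-for graph."""
    WHITE, GREY, BLACK = 0, 1, 2
    colour = {}

    def visit(node):
        colour[node] = GREY
        for nxt in graph.get(node, ()):
            state = colour.get(nxt, WHITE)
            if state == GREY:
                return True  # back edge: a cycle of waiting processes
            if state == WHITE and visit(nxt):
                return True
        colour[node] = BLACK
        return False

    return any(colour.get(n, WHITE) == WHITE and visit(n) for n in list(graph))


# Site 1 sees P1 waiting for P2; site 2 sees P2 waiting for P1.
site1 = {"P1": {"P2"}}
site2 = {"P2": {"P1"}}

local_deadlock = has_cycle(site1) or has_cycle(site2)
global_deadlock = has_cycle(merge_graphs([site1, site2]))
print(local_deadlock, global_deadlock)
```

Neither site sees a cycle locally; the deadlock only becomes visible once the graphs are combined, which is why detection requires communication among sites.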



