Showing posts with label Distributed Database Question Bank. Show all posts

Monday, May 4, 2020

Distributed Database Question Bank with Answers 12

Distributed Database Question Bank with Answers



1. In the Two-Phase Commit Protocol (2PC), why can blocking never be completely eliminated, even when the participants elect a new coordinator?

Electing a new coordinator does not remove the problem, because the new coordinator may crash as well. The remaining participants are then again unable to reach a final decision, since that decision must come from the newly elected coordinator, just as before.
The greatest disadvantage of the two-phase commit protocol is that it is a blocking protocol. If the coordinator fails permanently, some participants will never resolve their transactions: after a participant has sent an agreement message to the coordinator, it blocks until a commit or rollback is received.
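The blocking behaviour can be illustrated with a minimal Python sketch. The names used here (Participant, on_prepare, on_decision, on_coordinator_failure) are hypothetical and model only the participant's state; no real messages are exchanged.

```python
from enum import Enum


class State(Enum):
    INIT = "init"
    READY = "ready"          # voted yes, waiting for the decision
    COMMITTED = "committed"
    ABORTED = "aborted"


class Participant:
    """Hypothetical model of one 2PC participant (state only)."""

    def __init__(self, name):
        self.name = name
        self.state = State.INIT

    def on_prepare(self, can_commit=True):
        # Phase 1: vote. After voting yes, the participant can no longer
        # commit or abort on its own.
        if can_commit:
            self.state = State.READY
            return "yes"
        self.state = State.ABORTED
        return "no"

    def on_decision(self, decision):
        # Phase 2: apply the coordinator's decision.
        self.state = State.COMMITTED if decision == "commit" else State.ABORTED

    def on_coordinator_failure(self):
        # A participant that has not voted yet may abort unilaterally.
        # In READY it cannot decide by itself, so it stays blocked.
        if self.state == State.INIT:
            self.state = State.ABORTED
        return self.state == State.READY


voted = Participant("voted")
voted.on_prepare()                       # sent its agreement message
not_voted = Participant("not_voted")     # never reached phase 1

blocked_voted = voted.on_coordinator_failure()
blocked_not_voted = not_voted.on_coordinator_failure()
print(blocked_voted, voted.state.value)          # True ready
print(blocked_not_voted, not_voted.state.value)  # False aborted
```

The participant that already voted stays in READY until some coordinator delivers a decision, which is exactly the window in which a second coordinator crash leaves it stuck.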

2. How should a small/large relation be partitioned?

It is better to assign a small relation to a single disk (single partition). Spreading it across several disks or sites rarely improves performance, because the overhead of starting and coordinating the parallel work outweighs the gain.
A large relation should be partitioned across all the available disks, to make the most of parallelism. For large relations, partitioning helps considerably.
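A minimal sketch of this placement rule, assuming round-robin partitioning and a made-up threshold (tuples_per_block) that stands in for "the relation fits in one disk block". The function names are hypothetical.

```python
def round_robin_partition(tuples, n_disks):
    """Assign the i-th tuple to disk i mod n_disks."""
    disks = [[] for _ in range(n_disks)]
    for i, t in enumerate(tuples):
        disks[i % n_disks].append(t)
    return disks


def place_relation(tuples, n_disks, tuples_per_block=100):
    # Assumed rule of thumb: a relation that fits in one block goes to a
    # single disk; anything larger is spread over all available disks.
    if len(tuples) <= tuples_per_block:
        return [list(tuples)] + [[] for _ in range(n_disks - 1)]
    return round_robin_partition(tuples, n_disks)


small = place_relation(list(range(50)), n_disks=4)
large = place_relation(list(range(1000)), n_disks=4)
print([len(d) for d in small])   # [50, 0, 0, 0]
print([len(d) for d in large])   # [250, 250, 250, 250]
```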

3. Which among the three architectures (shared disk, shared memory, shared nothing) supports the failure of one processor?

Shared disk architecture is fault-tolerant to the extent that if a processor fails, other processors can take over its tasks as all processors can access the database resident on the disks.

4. Which form of parallelism (inter-query, intra-query, inter-operation, intra-operation) is most likely to advance the following goals and why?

i) Improving the throughput and response time of the system when there are lots of smaller queries.
Inter-query parallelism is about executing multiple queries simultaneously. When the queries are small, this form of parallelism increases the throughput, that is, the number of queries executed per unit time. The other types of parallelism do not help much for small queries.

ii) Improving the throughput and response time of the system when there are a few large queries and the system consists of several disks and processors.
Intra-query parallelism is likely to reduce response times most effectively when there are a few large queries. With a large number of processors, intra-operation parallelism is the best fit: such queries typically contain only a few operations (with few queries, the number of simultaneous operations may even be smaller than the number of processors), but each operation usually processes a large number of tuples.
An example is parallel sort: each processor executes the same operation, in this case sorting, on its own partition of the data at the same time as the other processors. During this local sort phase there is no interaction between processors.
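The parallel sort example can be sketched as a range-partitioning sort. This is an illustration under assumptions: the boundary values are chosen by hand, and Python threads are used only to show the structure (they do not give real CPU parallelism for this workload).

```python
from concurrent.futures import ThreadPoolExecutor


def range_partition(values, boundaries):
    """Split values into len(boundaries) + 1 ranges (boundaries ascending)."""
    parts = [[] for _ in range(len(boundaries) + 1)]
    for v in values:
        idx = sum(v >= b for b in boundaries)  # index of the range v falls in
        parts[idx].append(v)
    return parts


def parallel_sort(values, boundaries):
    parts = range_partition(values, boundaries)
    # Each worker sorts its own partition without talking to the others.
    with ThreadPoolExecutor(max_workers=len(parts)) as pool:
        sorted_parts = list(pool.map(sorted, parts))
    # The ranges are disjoint and ordered, so concatenation is the result.
    return [v for part in sorted_parts for v in part]


data = [42, 7, 19, 88, 3, 56, 71, 25]
result = parallel_sort(data, boundaries=[20, 60])
print(result)  # [3, 7, 19, 25, 42, 56, 71, 88]
```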

5. Suppose a company which processes transactions is growing rapidly and needs to choose a new, parallel computer. What would be more important, speedup or transaction scale-up? Why?

As the scale of operations increases, one can expect the number of transactions submitted per unit time to increase. The individual transactions do not become larger and do not need to run faster, so transaction scale-up is the more important measure.
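The two measures can be written down as simple ratios. The sketch below uses the usual textbook definitions; the function names and the timing figures are made up for illustration.

```python
def speedup(time_small_system, time_large_system):
    """Same workload run on a larger system: ratio of elapsed times."""
    return time_small_system / time_large_system


def scaleup(time_small_job_small_system, time_large_job_large_system):
    """N-times-larger workload on an N-times-larger system.
    A value of 1.0 means linear scale-up."""
    return time_small_job_small_system / time_large_job_large_system


# Made-up measurements, for illustration only.
print(speedup(100.0, 25.0))    # 4.0
print(scaleup(100.0, 125.0))   # 0.8
```

For the growing company, a scale-up value close to 1.0 means that four times as many transactions on four times the hardware still finish in about the same time.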



Distributed Database Question Bank with Answers 13

Distributed Database Question Bank with Answers



1. Is pipelined parallelism part of interoperation parallelism or intra-operation parallelism?

Pipelining is a basic form of interoperation parallelism since a single query’s different operations are executed in parallel: there are many processors that are each performing one step in a multi-step process.
(The other form of interoperation parallelism is independent parallelism where operations of a query that don’t depend on one another are run in parallel.)
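A minimal sketch of pipelining, assuming two operators (a scan and a select) connected by queues and run in separate threads. The operator functions and the sample rows are hypothetical; the point is only that the select starts consuming rows while the scan is still producing them.

```python
import queue
import threading

DONE = object()  # end-of-stream marker


def scan(rows, out):
    # First operator: emit rows one at a time.
    for row in rows:
        out.put(row)
    out.put(DONE)


def select(pred, inp, out):
    # Second operator: filter rows as soon as they arrive.
    while (row := inp.get()) is not DONE:
        if pred(row):
            out.put(row)
    out.put(DONE)


rows = [{"id": i, "salary": 1000 * i} for i in range(1, 6)]
q1, q2 = queue.Queue(), queue.Queue()
stages = [
    threading.Thread(target=scan, args=(rows, q1)),
    threading.Thread(target=select, args=(lambda r: r["salary"] > 2000, q1, q2)),
]
for t in stages:
    t.start()

result = []
while (row := q2.get()) is not DONE:
    result.append(row["id"])
for t in stages:
    t.join()
print(result)  # [3, 4, 5]
```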

2. List a few relational operators that could be run in parallel.

Most common are:
Scan (combination of SQL’s select and where-clauses, known as project and select in relational terms),
Sort,
Join, and
Aggregate operators (max, min, sum, count).
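As an illustration of a parallel aggregate, the sketch below computes count, sum, min and max per partition and then combines the partial results. The function names are hypothetical, the partitions are assumed to be non-empty, and threads are used only to show the structure.

```python
from concurrent.futures import ThreadPoolExecutor


def partial_aggregate(partition):
    # Each worker computes local count, sum, min and max.
    return (len(partition), sum(partition), min(partition), max(partition))


def combine(partials):
    # Counts and sums add up; min and max are taken over the partials.
    counts, sums, mins, maxs = zip(*partials)
    return {"count": sum(counts), "sum": sum(sums),
            "min": min(mins), "max": max(maxs)}


partitions = [[5, 1, 9], [4, 8], [7, 2, 6, 3]]
with ThreadPoolExecutor() as pool:
    partials = list(pool.map(partial_aggregate, partitions))
totals = combine(partials)
print(totals)  # {'count': 9, 'sum': 45, 'min': 1, 'max': 9}
```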

3. Why are parallel databases usually homogeneous?

Because a different database schema on each node would make it difficult to process the same queries over different processors.
A parallel database architecture aims to improve the speed of query and transaction execution. Homogeneous means identical software and an identical database design on every node.
If a query is executed on multiple machines, the machines must be identical for performance to improve. If the database schema differs between machines, converting the divided work and merging the partial results becomes an overhead.

4. In what way does shared nothing architecture resemble a distributed database?

Each node has its own processor, disk(s), memory and runs its own database management software and operating system. Communication with other nodes is made through the interconnection network.
Most solutions for distributed databases such as distributed transactions (2PC) and fragmentation can be used in a shared-nothing architecture.

5. Why is inter-query parallelism in shared memory architecture very straightforward to implement? What problem can occur if inter-query parallelism is implemented in shared disk architecture?

With inter-query parallelism, different queries or transactions execute in parallel. A shared memory system can therefore reuse the techniques of a traditional DBMS, which is already designed to run several queries concurrently (for example with multiple threads).
In shared disk architecture, there is the problem of cache coherence since each processor can access any part of the database. The system must ensure that each processor has the latest version of the data in its memory.





Sunday, May 3, 2020

Distributed Database Question Bank with Answers 11

Distributed Database Question Bank with Answers



1. What is a transaction? What about a distributed transaction?
A transaction is a logical unit of work that consists of read/write operations on the database and must obey the ACID properties. A distributed transaction is a transaction that operates at two or more different geographical sites (servers).

2. What is the use of location transparency?
Location transparency means that the user and/or the application need not know at which site each relation is stored. The user can behave as if the data were stored at his/her own site. This not only simplifies application programming; it also makes it easier to move data as new ways of using the database come up.

3. Why can fragmenting a relation be often more useful than duplicating (replicating) the whole relation?

Fragmenting a relation is often more useful than replicating the whole relation because queries on the fragmented relation can be processed efficiently in parallel (assuming several processors). Fragmentation can also reduce the traffic load over the network and help to make many operations local.
Replicating a relation adds overhead to every data manipulation operation, because each update has to be applied to all copies.

4. A global lock table stored at the Server contains what information?
The global lock table at the server stores the locks held by each client as a tuple (c, x, m, d), where c is an identifier for the client, x is the name of the lock, m is the lock type (e.g. write or read lock) and d is its duration (e.g. commit duration or short duration).
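A minimal sketch of such a table, using the (c, x, m, d) tuple from the answer. The class GlobalLockTable, its methods and the simple read/write compatibility rule are assumptions made for illustration.

```python
from collections import namedtuple

# c = client id, x = lock name, m = lock type, d = duration
Lock = namedtuple("Lock", ["c", "x", "m", "d"])


class GlobalLockTable:
    """Hypothetical server-side lock table."""

    def __init__(self):
        self.locks = []

    def compatible(self, c, x, m):
        # Assumed rule: locks of different clients on the same name
        # conflict unless both are read locks.
        for lock in self.locks:
            if lock.x == x and lock.c != c and "write" in (lock.m, m):
                return False
        return True

    def acquire(self, c, x, m, d="commit"):
        if not self.compatible(c, x, m):
            return False
        self.locks.append(Lock(c, x, m, d))
        return True

    def release_at_commit(self, c):
        # Drop all locks held by client c.
        self.locks = [lock for lock in self.locks if lock.c != c]


table = GlobalLockTable()
r1 = table.acquire("client1", "page7", "read")
r2 = table.acquire("client2", "page7", "read")
w2 = table.acquire("client2", "page7", "write")   # client1 still reads
table.release_at_commit("client1")
w2_retry = table.acquire("client2", "page7", "write")
print(r1, r2, w2, w2_retry)  # True True False True
```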

5. What are the two main tasks of buffer coherency control in shared-disk system?
i) It must detect when a page on disk becomes invalid (i.e. the disk copy is out of date because the page has been updated in the buffer of one node).
ii) It is responsible for providing a node with the most up-to-date version of the page: when a node needs a new version of a given page, the buffer coherency control must know where to get it from. The owner of the page is the only node authorized to write a dirty page back to the disk and, if requested, must provide other nodes with the up-to-date version of that page.
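The two tasks can be sketched with a small directory that tracks, per page, the latest version and its owner. The class and method names are hypothetical, and version counters are an assumed way of detecting stale copies.

```python
class CoherencyControl:
    """Hypothetical directory: latest version and owner of each page."""

    def __init__(self):
        self.latest = {}   # page -> latest version number
        self.owner = {}    # page -> node holding the newest copy

    def note_update(self, page, node):
        # Task 1: an update in one node's buffer makes the disk copy
        # and all other cached copies out of date.
        self.latest[page] = self.latest.get(page, 0) + 1
        self.owner[page] = node
        return self.latest[page]

    def is_stale(self, page, cached_version):
        return cached_version < self.latest.get(page, 0)

    def source_of(self, page):
        # Task 2: tell a requester where the current version lives.
        return self.owner.get(page, "disk")


cc = CoherencyControl()
version = cc.note_update("P1", "nodeA")   # nodeA updates P1 in its buffer
print(cc.is_stale("P1", 0))               # True: the disk copy is old
print(cc.source_of("P1"))                 # nodeA
print(cc.source_of("P2"))                 # disk
```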



Friday, May 1, 2020

Distributed Database Question Bank with Answers 01

Distributed and parallel database questions and answers for University exams


Question:

What is a distributed database? What is it not? If all sites use a DBMS from the same vendor, what do we call it?


Answer:
Distributed Database System (DDBS)
A DDBS is made of several sites, each with a DBMS. Each site has one or more users, and each site can access data from any other site: each site can generate queries and acts as a provider of data for other sites. A desirable feature is that each site maintains local autonomy without any reliance on a central site. So site A should be able to complete a database operation (assuming it has the necessary data) even if site B is down.

What is not a DDBS?
A distributed database is NOT just a loose collection of database files that are spread at different locations and accessed from a specific DBMS.

Homogeneous DDBS
When all sites use a DBMS from the same vendor, it is a homogeneous DDBS. That means all sites have identical database management system software, are aware of one another, and agree to cooperate in processing users’ requests.
