A distributed database system distributes data across multiple sites connected by a network. This provides benefits like increased reliability, local data access, modular scalability, and improved performance through parallelism. However, distributed databases also introduce greater complexity for software, transaction processing, data integrity, security, and failure handling compared to a centralized database.
Distributed Database I
Introduction
A distributed database is a database that is distributed over some form of network to reduce the cost or difficulty of accessing data and to increase the efficiency of the whole system.
Homogeneous Distributed Database
Identical software is used at all sites.
A site is a server that is part of the distributed database system.
Software here means the OS, the DBMS software, and even the structure of the database used.
In some cases, identical hardware is also used.
All sites are well known to one another, because they are similar in terms of the DBMS software and hardware used.
Partial control over the data of other sites is possible, because the structure of the databases, the software, and the hardware used at the other sites are known.
The system looks like a single central database system.
Heterogeneous Distributed Database
Different sites use different database software.
The structure of the databases residing at different sites may differ (because of data partitions).
Co-operation between sites is limited. That is, it is not easy to alter the structure of the database or any other software used.
Some reasons for using distributed databases are:
Data are always available to end users, i.e., they are easily accessible. This availability makes the whole system reliable.
A distributed database increases the performance of the overall system, because the servers are located near the places where they are needed most.
It supports organizational growth, because extending a distributed database does not require stopping all ongoing services. Only a new distributed server may need to be set up to handle the new data. Adding a server or modifying existing modules is easy.
Distributed data handling increases parallelism. That is, a number of queries can be handled simultaneously across multiple distributed servers, in contrast to the central server approach.
Consider the scenario of XYZ bank, which is headquartered in Islamabad. Assume that the bank maintains its server at its head office. All bank transactions made at all branches of XYZ bank must then reach the central server to access the data. For example, consider a customer who is trying to withdraw money from his account through an ATM located in Lahore. His withdrawal request must be sent to the central server and processed there before the money is disbursed at the ATM.
The following image shows the central server approach, which applies to any database of any organization.
The requests are shown as yellow lines.
Now assume that XYZ bank has established several servers distributed throughout the country, say 6 different servers. Any request generated from an ATM in any part of the country is forwarded to the server available in that part of the country. If, for any reason, the requested data is not available at the local server, that server finds the actual location of the requested data, forwards the request to the server holding it, and routes the answer back to the initiator.
The image on the next slide shows the distributed server concept. It shows a set of DSs (Distributed Servers), a set of nodes (not all are labeled), and a set of links that show the requests generated from a node to a DS. The dashed lines show requests for data that is local to some other DS: the DS that received the request forwarded it to the DS where the intended data is available.
Here, the main advantage is that the consumption of network bandwidth is controlled, i.e., network traffic is reduced. The availability of the data and the servers also increases, as they are close and accessible.
The following image shows the distributed server approach for the scenario given above.
Advantages of Distributed Database System
1. Increased reliability and availability: A distributed database system is robust to failure to some extent. Hence, it is more reliable than a centralized database system.
2. Local control: The data is distributed in such a way that every portion of it is local to some site (server). The site at which a portion of the data is stored is the owner of that data.
3. Modular growth (resilient): Growth is easier. We do not need to interrupt any of the functioning sites to add a new site. Hence, expanding the whole system is easier. Removing a site also does not cause many problems.
4. Lower communication costs (more economical): Data are distributed in such a way that they are available near the location where they are needed most. This reduces the communication cost considerably compared to a centralized system.
5. Faster response: Most of the data are local and in close proximity to where they are needed. Hence, requests can be answered more quickly than in a centralized system.
6. Secured management of distributed data: Various transparencies, such as network transparency, fragmentation transparency, and replication transparency, are implemented to hide the actual implementation details of the whole distributed system. In this way, a distributed database provides security for data.
7. Reflects the organizational structure: Normally, the database is fragmented across the various locations where the organization has control.
8. Robust: The system continues to work in case of failures. For example, a replicated distributed database keeps operating in spite of the failure of other sites.
9. Compliance with ACID properties: Distributed transactions demand Atomicity, Consistency, Isolation, and Durability.
10. Improved performance: Parallelism in executing transactions can be achieved. Parallel transaction processing saves a lot of time, so overall performance increases.
Disadvantages of Distributed Database System
1. Complex software: The implementation is complex and costs more in terms of software than a centralized system. In most cases, additional software is needed beyond what a centralized system requires.
2. Increased processing overhead: Many messages must be exchanged between sites to complete a distributed transaction.
3. Data integrity: Maintaining data integrity becomes complex and may consume a large amount of network resources.
4. Different data formats might be used: Converting between them may cost time.
5. Deadlock is more difficult to handle than in a centralized system.
6. Write operations may cause much more network traffic in a replicated distributed database.
7. An operating system that supports distributed systems is required to implement a distributed database system.
8. The data shared between sites over networks is vulnerable to attack. Hence, network-oriented security protocols must be used, chosen according to the sensitivity of the data shared.
9. More complex database design: Depending on the application, we may need to fragment a database, replicate a database, or both.
10. Handling failures is a difficult task. In some cases, we may not be able to distinguish between site failure, network partition, and link failure.
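The request forwarding described in the XYZ bank scenario can be illustrated with a minimal Python sketch. It is not taken from the slides: the class names `DistributedServer` and `Directory`, the `withdraw` function, the account ids, the balances, and the second city (Karachi) are all assumptions made for illustration, and a real system would also need networking, concurrency control, and failure handling.

```python
from dataclasses import dataclass, field


@dataclass
class DistributedServer:
    """One site (DS) holding the accounts that are local to its region."""
    name: str
    accounts: dict = field(default_factory=dict)  # account id -> balance


class Directory:
    """Hypothetical catalog that knows which server owns each account."""

    def __init__(self, servers):
        self.servers = {s.name: s for s in servers}

    def owner_of(self, account_id):
        for server in self.servers.values():
            if account_id in server.accounts:
                return server
        raise KeyError(f"unknown account {account_id}")


def withdraw(local, directory, account_id, amount):
    """Serve the request locally if possible, otherwise forward it to the owner.

    Returns (new_balance, path), where path lists the servers the request visited.
    """
    path = [local.name]
    target = local
    if account_id not in local.accounts:
        # Data is not local: locate the owning server and forward the request.
        target = directory.owner_of(account_id)
        path.append(target.name)
    balance = target.accounts[account_id]
    if amount > balance:
        raise ValueError("insufficient funds")
    target.accounts[account_id] = balance - amount
    # The answer is routed back to the initiator through the local server.
    return target.accounts[account_id], path


# Illustrative data only: two of the distributed servers and one account each.
lahore = DistributedServer("Lahore", {"A-100": 500})
karachi = DistributedServer("Karachi", {"B-200": 900})
directory = Directory([lahore, karachi])
```

With this setup, `withdraw(lahore, directory, "A-100", 100)` is answered by the Lahore server alone, while `withdraw(lahore, directory, "B-200", 200)` is forwarded once to Karachi, which corresponds to the dashed lines in the distributed server figure. The longer path for non-local data also hints at why processing overhead appears in the list of disadvantages.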