Distributed Databases

A distributed database is one that is stored at multiple locations on a network but usually appears to users and applications as a single database. In a distributed database system a user, usually via the application interface, can seamlessly access and modify data held at any network location, subject to permissions and locks.
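As a rough illustration of the "single database" view, the sketch below puts a thin facade in front of two sites and routes each request to the site that holds the data. It is a minimal sketch, not a real product's API: the class names `Site` and `DistributedDatabase`, the `locate_by_prefix` function and the `"<business function>:<id>"` key convention are all assumptions made for the example, and permissions and locks are left out.

```python
class Site:
    """One network location holding part of the data (hypothetical)."""

    def __init__(self, name):
        self.name = name
        self.rows = {}  # key -> record stored at this site


class DistributedDatabase:
    """Presents several sites to the caller as one logical database."""

    def __init__(self, sites, locate):
        self.sites = {site.name: site for site in sites}
        self.locate = locate  # function mapping a key to a site name

    def put(self, key, record):
        # The caller never names a site; the facade picks it.
        self.sites[self.locate(key)].rows[key] = record

    def get(self, key):
        return self.sites[self.locate(key)].rows.get(key)


def locate_by_prefix(key):
    # Assumed key convention: "<business function>:<id>"
    return key.split(":", 1)[0]


accounts = Site("accounts")
sales = Site("sales")
db = DistributedDatabase([accounts, sales], locate_by_prefix)

db.put("sales:1001", {"customer": "Acme", "total": 250})
db.put("accounts:77", {"balance": 1200})

print(db.get("sales:1001"))   # the application sees one database
print(sorted(sales.rows))     # the record physically lives at the sales site
```

The application code only calls `put` and `get`; where a record is stored is decided by the routing function, which is the part a real system would hide behind its data dictionary or catalogue.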

In my experience, distributed database systems are used in large, decentralised organisations that have the financial resources to put everything in place, including the expensive technical and management expertise needed for the benefits to outweigh the costs.

The advantages of a distributed database system accrue largely to decentralised organisations. Such organisations are generally structured by business function (e.g. accounts, sales, etc.), so the database can be designed and distributed accordingly. This improves performance, as data can be located closer to where it is used (in terms of communication speed and bandwidth, not geography). It also gives the business units autonomy: they can manage their own requirements without impinging on other business units (unless they use the same sections of data), and they can continue to work if unrelated data systems or communication lines fail. In addition, because data can be replicated across a distributed system and adding new modules is relatively simple, both reliability and business continuity improve.
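The reliability point can be sketched in a few lines: if the same data is copied to more than one site, a read can still be served when one site is unreachable. This is a simplified illustration under stated assumptions, not how any particular product works. The names `Replica` and `ReplicatedTable` are hypothetical, a site failure is simulated with an `online` flag, and a replica that comes back online is not resynchronised.

```python
class Replica:
    """A copy of the table held at one site (hypothetical)."""

    def __init__(self, name):
        self.name = name
        self.data = {}
        self.online = True  # set to False to simulate a site or line failure


class ReplicatedTable:
    """Writes go to every reachable replica; reads use the first reachable one."""

    def __init__(self, replicas):
        self.replicas = replicas

    def write(self, key, value):
        written = 0
        for replica in self.replicas:
            if replica.online:
                replica.data[key] = value
                written += 1
        if written == 0:
            raise RuntimeError("no replica available")
        return written  # number of copies updated

    def read(self, key):
        for replica in self.replicas:
            if replica.online:
                # Raises KeyError if this replica missed the write;
                # a real system would resynchronise replicas first.
                return replica.data[key]
        raise RuntimeError("no replica available")


head_office = Replica("head_office")
branch = Replica("branch")
orders = ReplicatedTable([head_office, branch])

orders.write("order-1", "shipped")

head_office.online = False       # head office site fails
print(orders.read("order-1"))    # still answered, from the branch copy
```

The cost of this resilience is that every write has to reach several sites and the copies have to be kept consistent, which is part of the complexity discussed next.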

The disadvantages of a distributed database system are largely associated with complexity and hence cost. As distributed systems are largely bespoke (i.e. they are not standards-based designs), success can rest on the experience of the designer and their ability to understand the organisation, fragment the data and implement an efficient solution. Such experience is rare and therefore expensive. In addition, as a distributed database system usually relies on external carriers (communication lines, virtualisation, outsourcing, etc.), security becomes an issue and performance can suffer, although not to the extent that it negates the performance advantage of having data located near demand. Per-user and local computing costs are generally lower, as the organisation does not have to provide a single powerful system, but the resulting implementation and systems management costs are often a barrier to implementation for all but the largest installations.

In conclusion, although the industry has yet to see widespread adoption of a general-purpose distributed database system, in my opinion and experience the advantages of a distributed database system vastly outweigh the disadvantages in a large-scale, multi-site, complex organisation, provided the installation is given the financial resources and expertise necessary to make it a success.
