Data Warehouse v Data Mart

The data warehouse versus data mart debate is driven by benefit to the organisation and is often evaluated on the basis of scale. A data mart represents a subset of data, whereas a data warehouse represents the organisational or enterprise view as a whole. Essentially, the data warehouse holds information on every subject area in the organisation's databases (e.g. customers, vendors, employees, sales, etc.), and a data mart is a subset of this data that is used for a particular purpose (e.g. the marketing department would only use data for customers and sales). This implies that, from an enterprise point of view, a data warehouse is designed from a top-down perspective and a data mart from a bottom-up perspective. Top-down design requires a great deal of time and resources to be put into business modelling, design and build, whereas data marts can deliver fast returns at departmental level.
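The subset relationship described above can be sketched in a few lines of Python using the standard-library sqlite3 module. This is a minimal illustration only: the table names, column names, sample rows and the marketing_mart view are all hypothetical, and a real mart is often a separately loaded store rather than a view over the warehouse.

```python
import sqlite3


def build_warehouse() -> sqlite3.Connection:
    """Create a toy in-memory 'warehouse' covering several subject areas.

    All table names, columns and rows are illustrative assumptions.
    """
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE vendors   (vendor_id   INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE employees (employee_id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE sales (
            sale_id     INTEGER PRIMARY KEY,
            customer_id INTEGER REFERENCES customers(customer_id),
            amount      REAL
        );
        INSERT INTO customers VALUES (1, 'Acme'), (2, 'Globex');
        INSERT INTO vendors   VALUES (1, 'Initech');
        INSERT INTO employees VALUES (1, 'J. Smith');
        INSERT INTO sales     VALUES (1, 1, 100.0), (2, 1, 50.0), (3, 2, 75.0);
        """
    )
    return conn


def create_marketing_mart(conn: sqlite3.Connection) -> None:
    """Expose only the customer and sales subjects, as marketing would use them."""
    conn.execute(
        """
        CREATE VIEW marketing_mart AS
        SELECT c.customer_id, c.name, SUM(s.amount) AS total_sales
        FROM customers AS c
        JOIN sales AS s ON s.customer_id = c.customer_id
        GROUP BY c.customer_id, c.name
        """
    )


conn = build_warehouse()
create_marketing_mart(conn)

# The mart sees customers and sales only; vendors and employees stay
# in the warehouse and are not part of the departmental view.
rows = conn.execute(
    "SELECT name, total_sales FROM marketing_mart ORDER BY name"
).fetchall()
print(rows)
```

Here the mart is derived from the enterprise-wide tables, which corresponds to the top-down direction; a bottom-up build would create the departmental tables first and integrate them later.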

Bill Inmon and Ralph Kimball present the two different paradigms of data warehousing, with Inmon stipulating that “An enterprise has one data warehouse, and data marts source their information from the data warehouse”, whereas Kimball states that “Data warehouse is the conglomerate of all data marts within the enterprise”.
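The difference in direction between the two quoted definitions can be shown with a toy model. The sketch below is not how either author prescribes building a system; the function names derive_mart and combine_marts, the dictionary-of-lists representation and the sample data are all assumptions made purely for illustration.

```python
from typing import Dict, List

Row = Dict[str, object]
Store = Dict[str, List[Row]]  # subject name -> rows for that subject


def derive_mart(warehouse: Store, subjects: List[str]) -> Store:
    """Inmon-style direction: a mart sources its subjects from the warehouse."""
    return {subject: list(warehouse[subject]) for subject in subjects}


def combine_marts(marts: List[Store]) -> Store:
    """Kimball-style direction: the warehouse is the collection of all marts."""
    warehouse: Store = {}
    for mart in marts:
        for subject, rows in mart.items():
            existing = warehouse.setdefault(subject, [])
            for row in rows:
                # Naive de-duplication; real integration is where the
                # standardisation effort lies.
                if row not in existing:
                    existing.append(row)
    return warehouse


# Hypothetical enterprise data.
enterprise: Store = {
    "customers": [{"id": 1, "name": "Acme"}],
    "sales": [{"id": 1, "customer_id": 1, "amount": 100.0}],
    "employees": [{"id": 1, "name": "J. Smith"}],
}

# Top-down: marts are carved out of the single warehouse.
marketing = derive_mart(enterprise, ["customers", "sales"])
hr = derive_mart(enterprise, ["employees"])

# Bottom-up: the same marts, taken together, form the warehouse.
combined = combine_marts([marketing, hr])
print(sorted(combined))
```

In this toy case the two directions arrive at the same content because the marts were cut from a single consistent source. Marts built independently offer no such guarantee.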

Kimball’s paradigm is often cited as the one that more closely reflects the reality of data warehouses in industry: many data warehouses began as data marts that used subsets of data to address a departmental need, and as more departments did this and the enterprise developed a collection of data marts, the system became a data warehouse. However, Kimball’s paradigm implies a bottom-up approach to database design and therefore inherits the problems associated with this approach. Have the subsets been designed to address organisational needs as a whole, and is it possible to use the data marts developed to achieve these higher goals? This is the greatest problem with data marts that grow into enterprise-level data warehouses, and it presents computing professionals with further challenges:

“IT must wrestle with and overcome the time and standardization issues if they are to build a flexible, expansive, data warehousing architecture that meets the future needs of the organization.” Griffin (1998)

The choice between a data warehouse and a data mart is therefore a question of future scalability (and of having the experience on hand to deliver the solution), weighed against the resources and time required to provide cost benefit to the business. This is an extremely difficult choice for large-scale database installations.

References

1keydata.com (2010) Bill Inmon vs. Ralph Kimball [Online]. Available at http://www.1keydata.com/datawarehousing/inmon-kimball.html (Accessed 23 May 2010).

Coronel, Morris & Rob (2009) Database Systems: Design, Implementation, and Management (9th Edition). Cengage Learning.

Griffin, J (1998) Information Management Magazine: Data Mart vs. Data Warehouse [Online]. Available at http://www.information-management.com/issues/19980201/815-1.html (Accessed 23 May 2010).