Star Schemas

Star schemas are a common data model used in data marts and warehouse databases where a central table of “facts” is used which are viewed in terms of dimensions (a perspective on a given fact, such as time) and each dimension table contains the attributes of that dimension. For example, if a fact was a transaction to sell a product, then dimensions could be the product description, the customer details, the date of the sale, etc. where a primary key exists for each dimension and queries are conducted by joining the fact and the dimension tables on these primary and any foreign keys.

There are four usual ways to improve performance in star schemas (Coronel et al, 2009):

  1. Normalisation of Dimensional Tables – by simplifying dimensional relationships to third normal form, end-user navigation through the system is simpler and the system has to present fewer records to the user thereby increasing performance; however this can increase the complexity of queries by having to join multiple tables which, if such as query is often run, may not be a performance compromise the designer is prepared to make.
  2. Multiple Fact Tables – in order to save processor cycles at run time, tables with levels of aggregation (e.g. city, county, country) can be split by these levels of aggregation, ensuring that the data is loaded before run time. The designer of the database needs to carefully consider the data usage in this respect to ensure that only those aggregates that are commonly used are maintained separately to achieve this performance enhancement.
  3. De-Normalising Fact Tables – improves data access performance and saves data storage space by allowing faster access in reporting conditions.
  4. Partitioning and Replication of Tables – this applies when working with geographically disperse tables as partitioning splits table into subsets of rows or columns and allows the designer to place the subset of data locally to the client machine intending to use the data, thereby removing system bottlenecks which improves access time.

References

Coronel, Morris & Rob (2009) Database Systems: Design, Implementation, and Management (9th Edition). Cengage Learning.