Fundamental -- <Three> The Star Schema

Dimension Tables
In a star schema, a dimension table contains columns representing dimensions. As discussed, these columns will provide context for facts. The dimension tables serve to provide the rich context needed for the study of facts. In queries and reports, the dimensions will be used to specify how facts will be rolled up—their level of aggregation. Dimension values may be used to filter reports. They will be used to provide context for each measurement, usually in the form of textual labels that precede facts on each row of a report. They may also be used to drive master-detail relationships, subtotaling, cross-tabulation, or sorts.

The dimension tables are not in third normal form. Remember, though, that a dimensional model serves a different purpose from an ER model. It is not necessary to isolate repeating values in an environment that does not support transaction processing.

 

Keys and history
In a star schema, each dimension table is given a surrogate key. This column is a unique identifier, created exclusively for the data warehouse. Surrogate keys are assigned and maintained as part of the process that loads the star schema. The surrogate key has no intrinsic meaning; it is typically an integer. Surrogate keys are sometimes referred to as warehouse keys. The surrogate key is the primary key of the dimension table.

 

Dimension tables also contain key columns that uniquely identify something in an operational system.

The separation of surrogate keys and natural keys allows the data warehouse to track changes, even if the originating operational system does not. For example, suppose that customer ABC Wholesalers is identified by customer_id 10711 in an operational system. If the customer changes its headquarters location, the operational system may simply overwrite the address for customer_id 10711. For analytic purposes, however, it may be useful to track the history of ABC Wholesalers. Since the star schema does not rely on customer_id to identify a unique row in the customer dimension, it is possible to store multiple versions of ABC Wholesalers, even though both have the same customer_id—10711. The two versions
can be distinguished by different surrogate key values.

=============================================================================================

Fact Tables
At the core of a star schema is the fact table. In addition to presenting the facts, the fact table includes surrogate keys that refer to each of the associated dimension tables.

Each row in the fact table stores facts at a specific level of detail. This level of detail is known as the fact table’s grain.

 

 

posted on 2012-02-27 15:40  Jia  阅读(209)  评论(0编辑  收藏  举报