Star Schema Article Index for
Star
Shopping
Schema
Website Links For
Star
 

Information About

Star Schema




The star schema (sometimes referenced as star join schema) is the simplest style of Data Warehouse Schema , consisting of a few " Fact Table s" (possibly only one, justifying the name) referencing any number of " Dimension Tables ". The "facts" that the data warehouse helps analyze are classified along different "dimensions": the fact tables hold the main data, while the usually smaller dimension tables describe each value of a dimension and can be joined to fact tables as needed.

Dimension tables have a simple Primary Key , while fact tables have a compound Primary Key consisting of the aggregate of relevant dimension keys.

It is common for dimension tables to consolidate redundant data and be in Second Normal Form , while fact tables are usually in Third Normal Form because all data depend on either one dimension or all of them, not on combinations of a few dimensions.

The star schema is a way to implement multi-dimensional database ( MDDB ) functionality using a mainstream Relational Database : given the typical commitment to relational databases of most organization, a specialized multidimensional DBMS is likely to be both expensive and inconvenient.

Another reason for using a star schema is its simplicity from the users' point of view: queries are never complex because the only joins and conditions involve a fact table and a single level of dimension tables, without the indirect dependencies to other tables that are possible in a better normalized Snowflake Schema .


EXAMPLE

A database of sales, perhaps from a store chain, classified by date, store and product: f_sales is the fact table and there are three dimension tables d_date, d_store and d_product.

Each dimension table has a primary key called id, corresponding to a three-column primary key (date_id, store_id, product_id) in f_sales.

Data columns include f_sales.units_sold (and sale price, discounts etc.); d_store.country (and other store address components); d_date.year (and other date components); d_product.category and d_product.brand (and product name etc.).

The following very straightforward query extracts how many TV sets have been sold, for each brand and country, in 1997.

SELECT
d_product.brand, d_store.country, sum (f_sales.units_sold)
FROM
f_sales, d_date, d_store, d_product
WHERE
f_sales.date_id = d_date.id AND f_sales.store_id = d_store.id AND f_sales.product_id = d_product.id
AND d_date.year = 1997
AND d_product.category = 'tv'
GROUP BY
d_product.brand, d_store.country



SEE ALSO



EXTERNAL LINKS