Star Schema vs. Snowflake Schema: Data Warehouse Modeling, Study notes of Data Warehousing

An overview of two common ways to organize data marts or entire data warehouses using relational databases: the star schema and the snowflake schema. Both schemas use dimension tables to describe data aggregated in a fact table. The sales model is used as an example to illustrate the differences between the two schemas.

Typology: Study notes

2021/2022

Uploaded on 09/27/2022

anarghya
anarghya 🇺🇸

4.2

(21)

254 documents

1 / 3

Toggle sidebar

This page cannot be seen from the preview

Don't miss anything!

bg1
Data Warehouse Modeling
Star Schema vs. Snowflake Schema
The star schema and the snowflake schema are ways to organize data marts or entire data warehouses using
relational databases. Both of them use dimension tables to describe data aggregated in a fact table.
Everyone sells something, be it knowledge, a product, or a service. Storing this information, either in an
operational system or in a reporting system, is also a need. So we can expect to find some type of sales
model inside the data warehouse of nearly every company.
Let’s take one more look at the sales model in both the star and snowflake schemas.
The Star Schema
pf3

Partial preview of the text

Download Star Schema vs. Snowflake Schema: Data Warehouse Modeling and more Study notes Data Warehousing in PDF only on Docsity!

Data Warehouse Modeling

Star Schema vs. Snowflake Schema

The star schema and the snowflake schema are ways to organize data marts or entire data warehouses using relational databases. Both of them use dimension tables to describe data aggregated in a fact table. Everyone sells something, be it knowledge, a product, or a service. Storing this information, either in an operational system or in a reporting system, is also a need. So we can expect to find some type of sales model inside the data warehouse of nearly every company. Let’s take one more look at the sales model in both the star and snowflake schemas.

The Star Schema

The most obvious characteristic of the star schema is that dimension tables are not normalized. In the model above, the pink fact_sales table stores aggregated data created from our operational database(s). The light blue tables are dimension tables. We decided to use these five dimensions because we need to create reports using them as parameters. The granulation inside each dimension is also determined by our reporting needs. From this model, we can easily see why this schema is called the ‘star schema’: it looks like a star, with the dimension tables surrounding the central fact table.

The Snowflake Schema