Prepare for your exams
Get points
Guidelines and tips
Sell on Docsity
Docsity AI

Prepare for your exams

Study with the several resources on Docsity

Earn points to download

Earn points by helping other students or get them with a premium plan

Guidelines and tips

Sell on Docsity

Docsity AI

Prepare for your exams

Study with the several resources on Docsity

Find documents

Prepare for your exams with the study notes shared by other students like you on Docsity

Search for your university

Find the specific documents for your university's exams

Docsity AINEW

Summarize your documents, ask them questions, convert them into quizzes and concept maps

Explore questions

Clear up your doubts by reading the answers to questions asked by your fellow students

Earn points to download

Earn points by helping other students or get them with a premium plan

Share documents

20 Points

For each uploaded document

Answer questions

5 Points

For each given answer (max 1 per day)

All the ways to get free points

Get points immediately

Choose a premium plan with all the points you need

Study Opportunities

Choose your next study program

Get in touch with the best universities in the world. Search through thousands of universities and official partners

Community

Ask the community

Ask the community for help and clear up your study doubts

Free resources

Our save-the-student-ebooks!

Download our free guides on studying techniques, anxiety management strategies, and thesis advice from Docsity tutors

Denormalization - Data Warehousing - Lecture Handouts, Lecture notes of Data Warehousing

Quaid-i-Azam University (QAU)Data Warehousing

Denormalization, Striking a balance between normalization and denormalization, Denormalization in DSS, Guidelines for Denormalization, Areas for Applying Denormalization Techniques are discusses points of document. This handout is for data warehousing course from Virtual University of Pakistan.

Typology: Lecture notes

2011/2012

Uploaded on 11/06/2012

ahsen 🇵🇰

4.6

(88)

84 documents

1 / 9

This page cannot be seen from the preview

Don't miss anything!

Data Warehousing Course Code: CS614

[email protected]

Virtual University of Pakistan

Lecture Handout

Data Warehousing

Lecture No. 07

De-Normalization

Striking a balance between “good” & “evil”

Figure-7.1: Striking a balance between normalization and de-normalization

There should be a balance between normalized and de-normalized forms. In a fully

normalized form, too many joins are required and in a totally de-normalized form, we

have a big, wide single table. Database should be aligned someplace in between so as to

strike a balance, especially in the context of the queries and the application domain.

A “pure” normalized design is a good starting point for a data model and is a great thing

to do and achieve in academia. However, as briefly mentioned in the previous lecture, in

Discover Lecture notes of Data Warehousing Quaid-i-Azam University (QAU)

Partial preview of the text

Download Denormalization - Data Warehousing - Lecture Handouts and more Lecture notes Data Warehousing in PDF only on Docsity!

[email protected]

Virtual University of Pakistan

Lecture Handout

Data Warehousing

Lecture No. 07

De-Normalization

Striking a balance between “good” & “evil”

Figure-7.1: Striking a balance between normalization and de-normalization

There should be a balance between normalized and de-normalized forms. In a fully normalized form, too many joins are required and in a totally de-normalized form, we have a big, wide single table. Database should be aligned someplace in between so as to strike a balance, especially in the context of the queries and the application domain.

A “pure” normalized design is a good starting point for a data model and is a great thing to do and achieve in academia. However, as briefly mentioned in the previous lecture, in

[email protected]

Virtual University of Pakistan

the reality of the “real world”, the enhancement in performance delivered by some selective de-normalization technique can be a very valuable tool. The key to success is to undertake de-normalization as a design technique very cautiously and consciously. Do not let proliferation of the technique take over your data warehouse or you will end up with a single big flat file!

What is De-Normalization?

It is not chaos, more like a “controlled crash” with the aim of performance enhancement without loss of information.

Normalization is a rule of thumb in DBMS, but in DSS ease of use is achieved by way of denormalization.

De-normalization comes in many flavors, such as combining tables, splitting tables, adding data etc., but all done very carefully.

‘Denormalization’ does not mean that anything and everything goes. Denormalization does not mean chaos or disorder or indiscipline. The development of properly denormalized data structures follows software engineering principles, which insure that information will not be lost. De-normalization is the process of selectively transforming normalized relations into un-normalized physical record specifications, with the aim of reducing query processing time. Another fundamental purpose of denormalization is to reduce the number of physical tables, that must be accessed to retrieve the desired data by reducing the number of joins required to answer a query. Some people tend to confuse dimensional modeling with de-normalization. This will become clear when we will cover dimensional modeling, where indeed tables are collapsed together.

Why De-Normalization in DSS?

[email protected]

Virtual University of Pakistan

consider whether the physical records should exactly match the normalized relations for a DSS or not?

How De-Normalization improves performance?

De-normalization specifically improves performance by either:

Reducing the number of tables and hence the reliance on joins, which consequently speeds up performance.

Reducing the number of joins required during query execution, or

Reducing the number of rows to be retrieved from the Primary Data Table.

The higher the level of normalization, the greater will be the number of tables in the DSS as the depth of snowflake schema would increase. The greater the number of tables in the DSS, the more joins are necessary for data manipulation. Joins slow performance, especially for very large tables for large data extractions, which is a norm in DSS not an exception. De-normalization reduces the number of tables and hence the reliance on joins, which consequently speeds up performance.

De-normalization can help minimize joins and foreign keys and help resolve aggregates. By storing values that would otherwise need to be retrieved (repeatedly), one may be able to reduce the number of indexes and even tables required to process queries.

4 Guidelines for De-normalization

Carefully do a cost-benefit analysis (frequency of use, additional storage, join time).
1. Do a data requirement and storage analysis.

[email protected]

Virtual University of Pakistan

Weigh against the maintenance issue of the redundant data (triggers used).
When in doubt, don’t denormalize. Guidelines for Denormalization:- Following are some of the basic guidelines to help determine whether it's time to denormalize the DSS design or not:

1. Balance the frequency of use of the data items in question, the cost of additional storage to duplicate the data, and the acquisition time of the join.

2. Understand how much data is involved in the typical query; the amount of data affects the amount of redundancy and additional storage requirements.

3. Remember that redundant data is a performance benefit at query time, but is a performance liability at update time because each copy of the data needs to be kept up to date. Typically triggers are written to maintain the integrity of the duplicated data.

4. De-normalization usually speeds up data retrieval, but it can slow the data modification processes. It may be noted that both on-line and batch system performance is adversely affected by a high degree of de-normalization. Hence the golden rule is: When in doubt, don’t denormalize.

Areas for Applying De-Normalization Techniques

Dealing with the abundance of star schemas.

Fast access of time series data for analysis.

Fast aggregate (sum, average etc.) results and complicated calculations.

[email protected]

Virtual University of Pakistan

Splitting Tables (Horizontal/Vertical Splitting).
Adding Redundant Columns (Reference Data).
Derived Attributes (Summary, Total, Balance etc).

Now we will discuss de-normalization techniques that have been commonly adopted by experienced database designers. These techniques can be classified into four prevalent strategies for denormalization which are:

Collapsing Tables.

Two entities with a One-to-One relationship.
Two entities with a Many-to-Many relationship.

Pre-joining.
Splitting Tables (Horizontal/Vertical Splitting).
Adding Redundant Columns (Reference Data).
Derived Attributes (Summary, Total, Balance etc).

Collapsing Tables

[email protected]

Virtual University of Pakistan

Figure-7.1: Collapsing Tables

1. Collapsing Tables One of the most common and safe denormalization techniques is combining of One-to- One relationships. This situation occurs when for each row of entity A, there is only one related row in entity B. While the key attributes for the entities may or may not be the same, their equal participation in a relationship indicates that they can be treated as a single unit. For example, if users frequently need to see COLA, COLB, and COLC together and the data from the two tables are in a One-to-One relationship, the solution is to collapse the two tables into one. For example, SID and gender in one table, and SID and degree in the other table.

In general, collapsing tables in One-to-One relationship has fewer drawbacks than others. There are several advantages of this technique, some of the obvious ones being reduced storage space, reduced amount of time for data update, some of the other not so apparent advantages are reduced number of foreign keys on tables, reduced number of indexes (since most indexes are created based on primary/foreign keys). Furthermore, combining

Denormalization - Data Warehousing - Lecture Handouts, Lecture notes of Data Warehousing

Related documents

Partial preview of the text

Download Denormalization - Data Warehousing - Lecture Handouts and more Lecture notes Data Warehousing in PDF only on Docsity!

Lecture Handout

Data Warehousing

Lecture No. 07

De-Normalization