












Study with the several resources on Docsity
Earn points by helping other students or get them with a premium plan
Prepare for your exams
Study with the several resources on Docsity
Earn points to download
Earn points by helping other students or get them with a premium plan
its all about data warehousing
Typology: Slides
1 / 20
This page cannot be seen from the preview
Don't miss anything!













Roll-up. Also known as consolidation, this operation summarizes the data along the dimension. Drill-down. This allows analysts to navigate deeper among the dimensions of data, for example drilling down from "time period" to "years" and "months" to chart sales growth for a product. Slice. This enables an analyst to take one level of information for display, such as "sales in 2017." Dice. This allows an analyst to select data from multiple dimensions to analyze, such as "sales of modem in New Delhi in Q1." Pivot. Analysts can gain a new view of data by rotating the data axes of the cube.
Location Medals New York 2 Patiala 3 LosAngels 5 New Delhi 5 roll upon Location from cities to countries Location Medals India 8 America 7
LosAngels 5 New Delhi 5
Location Medals Patiala 3 New York 2
How data is prepared for OLAP?
Data Warehouse Implementation: Efficient Data cube computation (^) Data cube computation is an essential task in data warehouse implementation. (^) The pre computation of all or part of a data cube can greatly reduce the response time and enhance the performance of on-line analytical processing. (^) At the core of multidimensional data analysis is the efficient computation of aggregations across many sets of dimensions. (^) In SQL terms, these aggregations are referred to as group-by’s. (^) Each group-by can be represented by a cuboid, where the set of group-by’s forms a lattice of cuboids defining a data cube.
The Compute Cube Operator and the Curse of Dimensionality:
One approach to cube computation extends SQL so as to include a compute cube operator. The compute cube operator aggregates over all subsets of the dimensions specified in the operation. (^) This can require excessive storage space, especially for large number of dimensions.
Data Cube Computation A data cube is a lattice of cuboids. Suppose that you want to create a data cube for a company sales that contains the following: city, item, year, and sales in dollars. Query “Compute the sum of sales, grouping by city and item.” “Compute the sum of sales, grouping by city.” “Compute the sum of sales, grouping by item.” What is the total number of cuboids, or
All possible groupby’s
Cube Lattice for a 3-dimensional data cube
Data Cube Computation
The apex cuboid is the most generalized (least specific) of the cuboids, and is often denoted “as all”.
If we start at the apex cuboid and explore downward in the lattice, this is equivalent to drilling down within the data cube.
If we start at the base cuboid and
Data Cube Computation The storage requirements are even more excessive when many of the dimensions have associated concept hierarchies, each with multiple levels. This problem is referred to as the curse of dimensionality. “How many cuboids are there in an n- dimensional data cube?”
19 Efficient Data Cube Computation Data cube can be viewed as a lattice of cuboids ◦ (^) The bottom-most cuboid is the base cuboid ◦ (^) The top-most cuboid (apex) contains only one cell ◦ (^) How many cuboids in an n-dimensional cube with L levels? (^) Materialization of data cube ◦ (^) Materialize every (cuboid) ( full materialization ), none ( no materialization ), or some ( partial materialization ) 1 ) 1 ^ (^ n i i T L
Data Cube Computation
10
6