Aggregate Functions - Data Warehousing - Lecture Slide, Slides of Data Warehousing

Some concept of Data Warehousing are Aggregate Functions, Applications and Trends in Data Mining, Classification and Prediction, Cluster Analysis, Data Mining Primitives, Data Warehousing Design. Main points of this lecture are: Aggregate Functions , Values InAttribute, Department, Foreign Key References, Salary DNO, Maximum Salary, Average Salary, Minimum Salary, Employees Working, Database

Typology: Slides

2012/2013

Uploaded on 04/25/2013

khushia
khushia 🇮🇳

4.1

(10)

110 documents

1 / 19

Toggle sidebar

This page cannot be seen from the preview

Don't miss anything!

bg1
Data Warehousing/Mining
Docsity.com
pf3
pf4
pf5
pf8
pf9
pfa
pfd
pfe
pff
pf12
pf13

Partial preview of the text

Download Aggregate Functions - Data Warehousing - Lecture Slide and more Slides Data Warehousing in PDF only on Docsity!

Data Warehousing/Mining

Aggregate Functions in SQL

Aggregation is an operation that computes a single

value from all the values of an attribute.

 SQL provides five functions that apply to an attribute

of a relation and produce some aggregation of that column.

  • SUM : Computes the sum of values in a attribute.
  • AVG : Computes the average of values in a attribute.
  • MIN : Computes the least value in a attribute.
  • MAX : Computes the greatest value in a attribute.
  • COUNT : Computes the number of values in a attribute (including duplicates unless they are explicitly eliminated with DISTINCT ).

Example Database

  •  id1 idJohnlc 45,000  ENUMBER NAME SALARY DNO
  •  id2 idMarylc 50,000
  •  id3 idNicklc 42,000
  •  id4 idPaullc 43,000
  •  id5 idLaurale 55,000
  •  id6 idAndreals 31,000
  •  id7 idBrianll 25,000
  •  id8 idAlonll 26,000

Aggregate Functions in SQL

(cont)

 Query: Find the sum of the salaries of all

employees, the maximum salary, the

minimum salary and the average salary.

 This query can be expressed in SQL as

follows:

– SELECT SUM (SALARY), MAX (SALARY), MIN

(SALARY), AVG (SALARY) FROM EMPLOYEE;

 This query will return the following relation:

SUM(SALARY) MAX(SALARY) MIN(SALARY) AVG(SALARY)

317,000 55,000 25,000 39,

Aggregate Functions in SQL

(cont)

 Query: Retrieve the total number of

employees in the “Research” department.

 This query can be expressed in SQL as

follows:

– SELECT COUNT (*) FROM EMPLOYEE,

DEPARTMENT WHERE DNO=DNUMBER AND

DNAME='Research';

 Warning: Only the aggregate function

COUNT is allowed to apply to whole tuples. It

does not make sense to apply any other

aggregate functions to more than a single

attribute. Docsity.com

Aggregate Functions in SQL

(cont)

 Query: Count the number of distinct salary

values in the database.

 This query can be expressed in SQL as

follows:

– SELECT COUNT ( DISTINCT SALARY) FROM

EMPLOYEE;

 What would the effect of COUNT (SALARY)

in the above query be?

The GROUP BY Clause (cont)

Query: For each department, retrieve the department

number, the number of employees in the department and their average salary.

 This query can be expressed in SQL as follows:

  • SELECT DNO, COUNT (*), AVG (SALARY) FROM EMPLOYEE GROUP BY DNO;

 The result of this query will be:

 DNO COUNT(*) AVG(SALARY)

 5 4 36,

 4 3 39,

 1 1 55,

The GROUP BY Clause (cont)

 The following query shows how to use a

GROUP BY in conjunction with JOIN.

 Query: For each project, retrieve the project

number, the project name and the number of

employees who work on the project.

 This query can be expressed in SQL as

follows:

– SELECT PNUMBER, PNAME, COUNT (*) FROM

PROJECT, WORKS ON WHERE

PNUMBER=PNO GROUP BY PNUMBER,

PNAME

 The grouping and aggregation are applied

after joining the relations. Docsity.com

The GROUP BY Clause (cont)

 It is possible to use a GROUP BY clause in

conjunction with a SELECT clause that does

not use any aggregation function:

– SELECT SALARY FROM EMPLOYEE GROUP

BY SALARY

 Has the same effect as:

– SELECT DISTINCT SALARY FROM EMPLOYEE

The HAVING Clause

 Sometimes we want to choose groups of

tuples based on some aggregate property of

the group itself. In this case we have to use

the HAVING clause together with the GROUP

BY clause.

 The syntax of the HAVING clause is:

  • HAVING < condition >  where < condition > is a Boolean expression formed by comparison conditions as in the WHERE clause.

The HAVING Clause (cont)

 The result of this query is:

 PNUMBER PNAME COUNT(*)

 2 ProductY 3

 10 Computerization 3

 20 Reorganization 3

 30 Newbenets 3

Interpreting SQL Queries

 The result of an SQL query involving aggregate functions, GROUP BY and HAVING can be computed as follows:

 1. Evaluate the relation R implied by the FROM and WHERE clauses. R is the Cartesian product of the relations specified in the FROM clause, to which the selection of the WHERE clause is applied.

 2. Group the tuples of R according to the attributes in the GROUP BY clause.

 3. Filter out the tuples of R not satisfying the condition of the HAVING clause to compute a new relation R.

 4. Apply to R the projections and aggregations specified in the SELECT clause to compute the final result. Docsity.com

Interpreting SQL Queries

 The correct formulation of the query can be

expressed in SQL as follows:

– SELECT DNAME, COUNT (*) FROM

DEPARTMENT, EMPLOYEE WHERE

DNUMBER=DNO AND SALARY > 40000 AND

DNO IN ( SELECT DNO FROM EMPLOYEE

GROUP BY DNO HAVING COUNT (*) > 2 )

GROUP BY DNAME;