Aggregating Data Using Group Functions in SQL, Exams of Pre-Calculus

It discusses how to group rows in a table into smaller sets and how to specify search criteria for groups of rows. Unlike single-row functions, group functions ...

Typology: Exams

2022/2023

Uploaded on 03/01/2023

karthur
karthur 🇺🇸

4.8

(8)

230 documents

1 / 26

Toggle sidebar

This page cannot be seen from the preview

Don't miss anything!

bg1
5
5
Aggregating Data
Using Group Functions
pf3
pf4
pf5
pf8
pf9
pfa
pfd
pfe
pff
pf12
pf13
pf14
pf15
pf16
pf17
pf18
pf19
pf1a

Partial preview of the text

Download Aggregating Data Using Group Functions in SQL and more Exams Pre-Calculus in PDF only on Docsity!

Aggregating Data

Using Group Functions

Lesson Aim

This lesson further addresses functions. It focuses on obtaining summary information, such as averages, for groups of rows. It discusses how to group rows in a table into smaller sets and how to specify search criteria for groups of rows.

5-

Objectives

After completing this lesson, you shouldAfter completing this lesson, you should

be able to do the following:be able to do the following:

  • Identify the available group functions
  • Describe the use of group functions
  • Group data using the GROUP BY clause
  • Include or exclude grouped rows by

using the HAVING clause

Group Functions (continued)

Each of the functions accepts an argument. The following table identifies the options that you can use in the syntax:

5-

Types of Group Functions

  • AVG
  • COUNT
  • MAX
  • MIN
  • STDDEV
  • SUM
  • VARIANCE

Function Description AVG([DISTINCT|ALL] n ) Average value of n , ignoring null values COUNT({*|[DISTINCT|ALL] expr }) Number of rows, where expr evaluates to something other than null (Count all selected rows using *, including duplicates and rows with nulls.) MAX([DISTINCT|ALL] expr ) Maximum value of^ expr , ignoring null values

MIN([DISTINCT|ALL] expr ) Minimum value of expr , ignoring null values STDDEV([DISTINCT|ALL] x ) Standard deviation of n , ignoring null values SUM([DISTINCT|ALL] n ) Sum values of n , ignoring null values VARIANCE([DISTINCT|ALL] x ) Variance of n , ignoring null values

Guidelines for Using Group Functions

  • DISTINCT makes the function consider only nonduplicate values; ALL makes it consider every value including duplicates. The default is ALL and therefore does not need to be specified.
  • The datatypes for the arguments may be CHAR, VARCHAR2, NUMBER, or DATE where expr is listed.
  • All group functions except COUNT(*) ignore null values. To substitute a value for null values, use the NVL function.
  • The Oracle Server implicitly sorts the result set in ascending order when using a GROUP BY clause. To override this default ordering, DESC can be used in an ORDER BY clause.

5-

Using Group Functions

SELECT [ column ,] group_function(column) FROM table [WHERE condition ] [GROUP BY column ] [ORDER BY column ];

5-

Using MIN and MAX Functions

You can use MIN and MAX for anyYou can use MIN and MAX for any datatypedatatype..

SQL> SELECT MIN(hiredate), MAX(hiredate) 2 FROM emp ;

MIN(HIRED MAX(HIRED
17-DEC-80 12-JAN-

Group Functions (continued)

You can use MAX and MIN functions for any datatype. The slide example displays the most junior and most senior employee. The following example displays the employee name that is first and the employee name that is the last in an alphabetized list of all employees.

Note: AVG, SUM, VARIANCE, and STDDEV functions can be used only with numeric datatypes.

SQL> SELECT MIN(ename), MAX(ename) 2 FROM emp;

MIN(ENAME) MAX(ENAME)
ADAMS WARD

The COUNT Function

The COUNT function has two formats:

  • COUNT(*)
  • COUNT( expr ) COUNT() returns the number of rows in a table, including duplicate rows and rows containing null values in any of the columns. If a WHERE clause is included in the SELECT statement, COUNT() returns the number of rows that satisfies the condition in the WHERE clause. In contrast, COUNT( expr ) returns the number of nonnull rows in the column identified by expr. The slide example displays the number of employees in department 30.

5-

Using the COUNT Function

COUNT(*)
SQL> SELECT COUNT(*)

2 FROM emp 3 WHERE deptno = 30;

COUNT() returns the number of rows in aCOUNT() returns the number of rows in a

table.table.

5-

Group Functions and Null Values

Group functions ignore null values in theGroup functions ignore null values in the

column.column.

SQL> SELECT AVG(comm) 2 FROM emp;

AVG(COMM)

Group Functions and Null Values

All group functions except COUNT (*) ignore null values in the column. In the slide example, the average is calculated based only on the rows in the table where a valid value is stored in the COMM column. The average is calculated as total commission being paid to all employees divided by the number of employees receiving commission (4).

5-

Using the NVL Function

with Group Functions

The NVL function forces group functionsThe NVL function forces group functions

to include null values.to include null values.

SQL> SELECT AVG(NVL(comm,0)) 2 FROM emp;

AVG(NVL(COMM,0))

Group Functions and Null Values (continued)

The NVL function forces group functions to include null values. In the slide example, the average is calculated based on all rows in the table regardless of whether null values are stored in the COMM column. The average is calculated as total commission being paid to all employees divided by the total number of employees in the company (14).

The GROUP BY Clause

You can use the GROUP BY clause to divide the rows in a table into groups. You can then use the group functions to return summary information for each group. In the syntax: group_by_expression specifies columns whose values determine the basis for grouping rows

Guidelines

  • If you include a group function in a SELECT clause, you cannot select individual results as well unless the individual column appears in the GROUP BY clause. You will receive an error message if you fail to include the column list.
  • Using a WHERE clause, you can preexclude rows before dividing them into groups.
  • You must include the columns in the GROUP BY clause.
  • You cannot use the column alias in the GROUP BY clause.
  • By default, rows are sorted by ascending order of the columns included in the GROUP BY list. You can override this by using the ORDER BY clause.

5-

Creating Groups of Data:

GROUP BY Clause

SELECT column , group_function(column) FROM table [WHERE condition ] [GROUP BY group_by_expression ] [ORDER BY column ];

Divide rows in a table into smaller groups Divide rows in a table into smaller groups

by using the GROUP BY clause.by using the GROUP BY clause.

The GROUP BY Clause (continued)

When using the GROUP BY clause, make sure that all columns in the SELECT list that are not in the group functions are included in the GROUP BY clause. The example on the slide displays the department number and the average salary for each department. Here is how this SELECT statement, containing a GROUP BY clause, is evaluated:

  • The SELECT clause specifies the columns to be retrieved:
    • Department number column in the EMP table
    • The average of all the salaries in the group you specified in the GROUP BY clause
  • The FROM clause specifies the tables that the database must access: the EMP table.
  • The WHERE clause specifies the rows to be retrieved. Since there is no WHERE clause, by default all rows are retrieved.
  • The GROUP BY clause specifies how the rows should be grouped. The rows are being grouped by department number, so the AVG function that is being applied to the salary column will calculate the average salary for each department.

5-

Using the GROUP BY Clause

All columns in the SELECT list that are notAll columns in the SELECT list that are not

in group functions must be in the GROUPin group functions must be in the GROUP

BY clause.BY clause.

SQL> SELECT deptno, AVG(sal) 2 FROM emp 3 GROUP BY deptno;

DEPTNO AVG(SAL)

5-

Grouping by More

EMPEMP Than One Column

““sum salaries insum salaries in the EMP tablethe EMP table for each job,for each job, grouped bygrouped by department”department”

DEPTNO JOB SAL


10 MANAGER 2450 10 PRESIDENT 5000 10 CLERK 1300 20 CLERK 800 20 CLERK 1100 20 ANALYST 3000 20 ANALYST 3000 20 MANAGER 2975 30 SALESMAN 1600 30 MANAGER 2850 30 SALESMAN 1250 30 CLERK 950 30 SALESMAN 1500 30 SALESMAN 1250

JOB SUM(SAL)


CLERK 1300 MANAGER 2450 PRESIDENT 5000 ANALYST 6000 CLERK 1900 MANAGER 2975 CLERK 950 MANAGER 2850 SALESMAN 5600

DEPTNO

10 10 10 20 20 20 30 30 30

Groups Within Groups

Sometimes there is a need to see results for groups within groups. The slide shows a report that displays the total salary being paid to each job title, within each department. The EMP table is grouped first by department number, and within that grouping, it is grouped by job title. For example, the two clerks in department 20 are grouped together and a single result (total salary) is produced for all salespeople within the group.

5-

Using the GROUP BY Clause

on Multiple Columns

SQL> SELECT deptno, job, sum(sal) 2 FROM emp 3 GROUP BY deptno, job;

DEPTNO JOB SUM(SAL)
10 CLERK 1300
10 MANAGER 2450
10 PRESIDENT 5000
20 ANALYST 6000
20 CLERK 1900

9 rows selected.

Groups Within Groups (continued)

You can return summary results for groups and subgroups by listing more than one GROUP BY column. You can determine the default sort order of the results by the order of the columns in the GROUP BY clause. Here is how the SELECT statement on the slide, containing a GROUP BY clause, is evaluated:

  • The SELECT clause specifies the column to be retrieved:
    • Department number in the EMP table
    • Job title in the EMP table
    • The sum of all the salaries in the group that you specified in the GROUP BY clause
  • The FROM clause specifies the tables that the database must access: the EMP table.
  • The GROUP BY clause specifies how you must group the rows:
    • First, the rows are grouped by department number.
    • Second, within the department number groups, the rows are grouped by job title. So the SUM function is being applied to the salary column for all job titles within each department number group.

5-

Illegal Queries

Using Group Functions

  • You cannot use the WHERE clause to restrict

groups.

  • You use the HAVING clause to restrict groups.

SQL> SELECT deptno, AVG(sal) 2 FROM emp 3 WHERE AVG(sal) > 2000 4 GROUP BY deptno;

WHERE AVG(sal) > 2000

ERROR at line 3: ORA-00934: group function is not allowed here

Cannot use the WHERE clauseCannot use the WHERE clauseto restrict groupsto restrict groups

Illegal Queries Using Group Functions (continued)

The WHERE clause cannot be used to restrict groups. The SELECT statement on the slide results in an error because it uses the WHERE clause to restrict the display of average salaries of those departments that have an average salary greater than $2000. You can correct the slide error by using the HAVING clause to restrict groups. SQL> SELECT deptno, AVG(sal) 2 FROM emp 3 GROUP BY deptno 4 HAVING AVG(sal) > 2000;

DEPTNO AVG(SAL)


10 2916. 20 2175

5-

Excluding Group Results

““maximummaximum salarysalary per departmentper department greater thangreater than $2900”$2900”

EMPEMP

DEPTNO SAL


10 2450 10 5000 10 1300 20 800 20 1100 20 3000 20 3000 20 2975 30 1600 30 2850 30 1250 30 950 30 1500 30 1250

DEPTNO MAX(SAL)


10 5000 20 3000

Restricting Group Results

In the same way that you use the WHERE clause to restrict the rows that you select, you use the HAVING clause to restrict groups. To find the maximum salary of each department, but show only the departments that have a maximum salary of more than $2900, you need to do the following:

  • Find the average salary for each department by grouping by department number.
  • Restrict the groups to those departments with a maximum salary greater than $2900.