0

# Group Functions

January 1, 2022
Share

Welcome to this section of group functions. Unlike single-row functions, group functions or sometimes referred to as aggregate functions work on multiple rows and return a single value. Don’t worry if it still sounds awkward ? . Lets use an example. Assuming we need to know the total salary spent on employees in an organisation, what we do is to add every salary amount and we get the total. That’s what maths gives us at least. We had to add values from all rows of salary and we get one single value (total). This is exactly how group functions operate. They operate on multiple rows and return a single value. Don’t get surprised when we say they are used to get stuff like the average, total, maximum, minimum, etcetera as we are going to see.

There are many group functions in Oracle but we are going to look at the most common ones. Just like the single-row functions, they also take parameters or arguments. Most of them take a single parameter. Group functions do not consider NULL values except for the COUNT function. This could sometime be a big trouble when misleading results are obtained ? . To solve this, we can use the NVL function to substitute NULL values with some value like 0. Group functions can be used in the ORDER BY clause as we shall be seeing. For now, let have a look at them.

#### SUM

Its syntax is SUM(x). This returns the total of x. Lets use the employees table to get the total salary spent on employees.

``SELECT SUM(salary) AS "TOTAL SALARY" FROM employees;``

#### COUNT

Its syntax is COUNT(x). This returns the number of rows in a query. Lets count the number of departments using the departments table. In the query below, we are using the * to count every row in the departments table.

``SELECT COUNT(*) AS "DEPARTMENTS" FROM departments;``

The COUNT function counts every row including the NULLS and duplicate values. We need to use the DISTINCT keyword to remove the duplicate values from our counting. Lets use our previous example on the employees table to count the number of departments (using the department_id column).

``SELECT COUNT(department_id) AS "DEPARTMENTS" FROM employees;``

In this example, 107 is the answer because COUNT considered every row in the employees table. (There are 107 employees). So it counted all the department ids for each employee but some employees belong to the same department. That means, it didn’t give us the actual number of departments. To fix this, lets modify our query to include the keyword DISTINCT to remove the duplicate values. You can read more about DISTINCT on this page.

``SELECT COUNT(DISTINCT department_id) AS "DEPARTMENTS" FROM employees;``

You can see that the answer changed to 11 because the COUNT function now considered only unique department ids.

#### AVG

Its syntax is AVG(x). This returns the average value in the range. Lets find the average salary of employees using the employees table.

``SELECT AVG(salary) AS "AVERAGE SALARY" FROM employees;``

#### MAX

Its syntax is MAX(x). This returns the highest value in the range. It works with numbers to return the largest number, dates to return the latest date and also with string characters (A-Z) to return the last value if values are sorted in alphabetical order. Lets find the highest salary, latest date and the last name in the employees table.

``SELECT MAX(salary) AS "HIGHEST SALARY" FROM employees;``
``SELECT MAX(hire_date) AS "LATEST DATE" FROM employees;``
``SELECT MAX(first_name) AS "LAST NAME" FROM employees;``

#### MIN

Its syntax is MIN(x). This returns the lowest value in the range. It works with numbers to return the smallest number, dates to return the earliest date and also with string characters (A-Z) to return the first value if values are sorted in alphabetical order. Lets find the lowest salary, earliest date and the first name in the employees table.

``SELECT MIN(salary) AS "LOWEST SALARY" FROM employees;``
``SELECT MIN(hire_date) AS "EARLIEST DATE" FROM employees;``
``SELECT MAX(first_name) AS "FIST NAME" FROM employees;``

So far so good. If you’re wondering how we can get the total salary spent on each department or even the average salary per department or even the minimum salary per department. This is where the “parry” starts. This introduces us to data grouping which is one the most exciting facts about group functions.

## GROUPING DATA—GROUP BY

Whenever we use group functions to group data, the keyword GROUP BY is used. It allows us to inform the database on how data should be grouped. For example, we need the total salary spent per department. Oracle groups the total salary returned for each department. Therefore, GROUP BY helps in grouping the results of a query using a certain criteria. The GROUP BY comes after the table name and the WHERE clause (if any) and before the ORDER BY clause of the SELECT statement.

``````SELECT column_name FROM table_name
{WHERE condition}
GROUP BY columns
{ORDER BY columns}``````

Let us get the following examples to understand how to group data using GROUP BY keyword.

Example: Assuming we need to know total salary spent on employees per department. Remember the SUM function we’ve just learnt about? We’re going to use that alongside the department id column in the employees table to get the result.

``SELECT department_id, SUM(salary) AS TOTAL_SALARY FROM employees GROUP BY department_id;``

From the output above, we can see that results are grouped per each department. This is the power of GROUP BY.

Another example could be from the employees table again. We need to know the minimum and maximum salary for the clerk job.

``SELECT job_id, MIN(salary), MAX(salary) FROM employees WHERE job_id LIKE '%CLERK%' GROUP BY job_id;``

We can sort the results using group function columns.

For example, lets sort our previous query using the total number of employees.

``SELECT department_id, SUM(salary) AS TOTAL_SALARY FROM employees GROUP BY department_id ORDER BY SUM(salary);``

GROUP BY keyword doesn’t support the use of column alias or column positions. Therefore, the query below returns an error.

Example: We need to the occurrence of each first name in the database. We can still use the count for this.

``SELECT first_name, COUNT(*) "Number of Occurrences" FROM employees GROUP BY 1;``
``SELECT first_name, COUNT(*) "Number of Occurrences" FROM employees GROUP BY first_name;``

Whenever we use group functions in a SELECT statement along with non-grouping columns (those that don’t have group functions), you MUST place all non-grouping columns in the GROUP BY clause.

Example: We need to how many employees in each job position per department. We are going to use job_id and department_id columns of the employee table.

``SELECT department_id, job_id, COUNT(*) FROM employees GROUP BY department_id;``

This returns an error. Why? You guessed it. We have two non-grouping columns in the SELECT clause (department_id and job_id) but in the GROUP BY clause, we’re only including department_id. So how do we fix this? Very good of you ? . We MUST place all non-grouping columns in the GROUP BY clause.

``SELECT department_id, job_id, COUNT(*) FROM employees GROUP BY department_id, job_id;``

Good. Let us also get the average salary for each job position for departments 80, 100 and 50.

``SELECT department_id, job_id, AVG(salary) AS AVERAGE_SALARY FROM employees WHERE department_id IN (80, 50, 100) GROUP BY department_id, job_id;``

We can also use single row functions within group functions. For example, let us round off the average salary in the previous example.

``SELECT department_id, job_id, ROUND(AVG(salary)) AS AVERAGE_SALARY FROM employees WHERE department_id IN (80, 50, 100) GROUP BY department_id, job_id;``

## Filtering Grouped Data—HAVING

Assuming we need to know the departments whose total salary is above 100,000. Let us provide the solution for this.

``SELECT department_id, SUM(salary) FROM employees WHERE SUM(salary) > 100000 GROUP BY department_id;``

This returns an ORA-00934 error. Why? The WHERE clause doesn’t filter grouped data. In brief, the WHERE clause doesn’t work with group functions. To solve this, Oracle offers another keyword called HAVING.

HAVING clause is used to filter grouped datasets. It comes after the GROUP BY clause and before the ORDER BY clause. So, let us modify our previous query.

``SELECT department_id, SUM(salary) FROM employees GROUP BY department_id HAVING SUM(salary) > 100000;``

From the output above, we can see that whenever we need to filter grouped data, we use HAVING clause and not the WHERE clause. However, it doesn’t mean we can’t use WHERE clause in the same statement. We can use the WHERE clause to filter other non-grouped conditions.

For example, Let us also get the job positions whose average salary is between 2000 and 3000 in departments 80, 50 and 100.

``SELECT department_id, job_id, AVG(salary) AS AVERAGE_SALARY FROM employees WHERE department_id IN (80, 50, 100) GROUP BY department_id, job_id HAVING AVG(salary) BETWEEN 1000 AND 4000;``

We can’t look at all functions in SQL (Oracle) because they’re too many. We have managed to look at a few of them we think are so important. In case you need more, you can look at all functions in Oracle here.

## Summary

1. Group functions return a single value from multiple rows.

2. Group functions can be in the SELECT, ORDER BY, GROUP BY and HAVING clauses of the SELECT statement.

3. They work with GROUP BY clause for grouping data sets.

4. All non-grouping columns MUST be included in the GROUP BY clause.

5. HAVING clause is used for filtering grouped data.