**SQL Subquery, also known as inner query or nested query, are used to query data from one or more tables and then used in another SQL statement. They are an integral part of SQL since they allow complex database queries to be handled in a simpler way.**

A subquery can be used in the `SELECT`

, `INSERT`

, `UPDATE`

, or `DELETE`

statement or inside another subquery. They are usually placed within parentheses and located in the `WHERE`

or `HAVING`

clause.

Let’s take an example to illustrate the concept of subqueries.

**Also Read | SQL Tutorial – A Simple and Intuitive Guide**

In this lesson we will cover:

- Understanding the need for SQL Subquery with an example
- Solution Approach
- Sub Query – Tougher example 1
- Solution using a separate table (without subquery)
- Solution using Sub Query Approach
- Sub Query – Tougher Example 2
- Solve by applying subquery

## 1. Understanding the need for SQL Subqueries

There are certain situations where you will absolutely need to write a Subquery. This will be clear if you understand with an example.

Consider the following three tables, the contents of which is shown below in code blocks.

**Employees****Departments****Sales**

Question: **Find out the employees who earn more than the overall average salary.**

These tables can be created in SQLite DB using this schema.

**Let’s print the contents of the tables**

The `Employees table`

```
SELECT * from Employees;
```

ID | Name | Salary | DepartmentID |
---|---|---|---|

1 | Alice | 35000 | 1 |

2 | Bob | 40000 | 2 |

3 | Carol | 55000 | 3 |

4 | David | 60000 | 2 |

5 | Ethan | 70000 | 1 |

The `Departments`

table

```
SELECT * from Departments;
```

ID | Name |
---|---|

1 | HR |

2 | Finance |

3 | Marketing |

The `Sales`

table.

```
SELECT * from Sales;
```

EmployeeID | Product | Quantity |
---|---|---|

1 | A | 10 |

1 | B | 5 |

2 | A | 7 |

2 | C | 3 |

3 | B | 8 |

3 | C | 2 |

4 | A | 9 |

4 | B | 6 |

5 | C | 4 |

5 | A | 3 |

Remember, we want to find out the employees who earn *more than* the overall average salary.

## 2. Solution Approach

To solve this, we select the `name`

and `salary`

from `Employees`

table. But we need only the employees salaries that are more than the average.

So the plan is to use the `WHERE`

condition, to filter those observations where salary is more than average.

Let’s try it out.

```
SELECT Name, Salary
FROM Employees E1
WHERE Salary > AVG(Salary);
```

But this does NOT work!

This operation errors out giving an operational error – “Misuse of AVG(Salary) function”.

**Why?**

Since AVG() is a group function, **you can only use it in a SELECT statement or in a HAVING clause.**

This is quite important to understand.

So, how do we tackle this?

We need to use Subquery.

This can be done using two approaches.

**Approach 1:** Create query to compute the average and store in a new table. Then, reference it as a subquery.

**Approach 2:** Direct sub query

Let’s see both approaches.

**APPROACH 1:**

<--PAUSE AND THINK ABOUT HOW YOU WOULD DO APPROACH 1 -->

Use this link to solve hands-on.

- Create a new table that will contain the average salary.

```
DROP TABLE IF EXISTS AVGSALARY;
CREATE TABLE AVGSALARY AS
SELECT AVG(salary) as AVG_SALARY
FROM EMPLOYEES;
SELECT * FROM AVGSALARY;
```

AVG_SALARY |
---|

52000.0 |

Now, let’s write the query in the WHERE condition to check if the `SALARY > AVG(SALARY)`

.

```
SELECT Name, Salary
FROM Employees E1
WHERE Salary > (SELECT AVG_SALARY from AVGSALARY);
```

Name | Salary |
---|---|

Carol | 55000 |

David | 60000 |

Ethan | 70000 |

**APPROACH 2: Directly subquery from the employees table**

Alternately, you can directly subquery it instead of storing the average value in a separate table.

```
SELECT Name, Salary
FROM Employees E1
WHERE Salary > (SELECT AVG(SALARY) from Employees);
```

Name | Salary |
---|---|

Carol | 55000 |

David | 60000 |

Ethan | 70000 |

Hope you got the idea behind the need for Subqueries. Now, let’s look at a slightly tougher example.

In above example, we selected employees whose salary was above overall average. *What if you want to select employees whose salary is above their respective department average?*

## 3. Sub Query – Tougher example 1

**Task:** Find out the employees who earn more than (or equal to) the average salary *in their respective department*.

To do this, we select the name and salary from Employees table. But we need only the employees salaries that are more than the average *in their respective department..*

So in WHERE condition, we need to filter those observations where salary is more than the department average and NOT the total average.

Let’s first find the department average.

## 4. Solution using a separate table (without subquery)

Let’s first try to solve this without a subquery first. Then we will attempt to create the subquery.

That way, you will be clear how subquery works.

In the beginning, it is a good idea to create intermediate temp tables to store the result, just to make sure your query does what it is meant to do. You can create such temp tables, verify the values and drop the tables when they are no longer needed.

**Approach 1**:

1. Step 1: Create a new table that will hold the average department salaries.

2. Step 2: Merge this table with the Employees table

3. Step 3: Do the filtering.

**First, Print the Employees table**

```
SELECT * FROM EMPLOYEES;
```

ID | Name | Salary | DepartmentID |
---|---|---|---|

1 | Alice | 35000 | 1 |

2 | Bob | 40000 | 2 |

3 | Carol | 55000 | 3 |

4 | David | 60000 | 2 |

5 | Ethan | 70000 | 1 |

**Compute the Average Salary by Department (in a new table)**

```
DROP TABLE IF EXISTS DEPARTMENTSALARY;
CREATE TABLE DEPARTMENTSALARY AS
SELECT DEPARTMENTID,
AVG(SALARY) as AVG_SALARY
FROM EMPLOYEES
GROUP BY DepartmentID;
```

**Show DepartmentSalary Table**

```
SELECT * FROM DEPARTMENTSALARY;
```

DepartmentID | AVG_SALARY |
---|---|

1 | 52500.0 |

2 | 50000.0 |

3 | 55000.0 |

**Merge this with the Employees Table**

```
SELECT A.*, B.AVG_SALARY
FROM EMPLOYEES as A
LEFT JOIN
DEPARTMENTSALARY as B
ON A.DepartmentID = B.DepartmentID;
```

ID | Name | Salary | DepartmentID | AVG_SALARY |
---|---|---|---|---|

1 | Alice | 35000 | 1 | 52500.0 |

2 | Bob | 40000 | 2 | 50000.0 |

3 | Carol | 55000 | 3 | 55000.0 |

4 | David | 60000 | 2 | 50000.0 |

5 | Ethan | 70000 | 1 | 52500.0 |

**Now, Let’s filter out salaries less than dept salaries**

```
SELECT A.NAME, A.SALARY
FROM EMPLOYEES as A
LEFT JOIN
DEPARTMENTSALARY as B
ON A.DepartmentID = B.DepartmentID
WHERE SALARY >= AVG_SALARY;
```

Name | Salary |
---|---|

Carol | 55000 |

David | 60000 |

Ethan | 70000 |

So, that’s the step by step logic.

How do we do it without having to create a new table as seen in 1 Step?

## 5. Solution using Sub Query Approach

Observe the `WHERE`

condition here. Let’s try to make sense of it.

```
SELECT E1.Name, E1.Salary
FROM Employees E1
WHERE Salary >= (
SELECT AVG(E2.Salary)
FROM Employees E2
WHERE E1.DepartmentID = E2.DepartmentID
);
```

Name | Salary |
---|---|

Carol | 55000 |

David | 60000 |

Ethan | 70000 |

**How to understand this query?**

We want to select those salaries that are greater than department average. So, we need to apply this in this WHERE clause.

Think of it like this:

The filter condition applied in the WHERE clause gets applied to every row of the table. So, the query gets evaluated for every row to compute the department average.

The subquery `SELECT AVG(Salary) FROM Employees GROUP BY DepartmentID`

will first calculate the average salary for each department. Then the outer query will compare each employee’s salary with the average salary of their respective department and return those who earn more than the average.

## 6. Sub Query – Tougher Example 2

**Task:** Find out the employees who sold Product A AND sold more than the average quantity sold by all employees.

How to solve this?

To do this, this query requires information about both product as well as employees. This information does not lie in one table.

So, first step is to **“Figure out what tables the data you need resides in.”**

We have the following Tables:

- Sales (EmployeeID, Product, Quantity)
- Departments (ID, Name)
- Employees (ID, Name, Salary, DepartmentID)

Sales contains the `EmployeeID`

but the `EmployeeName`

has to come from the `Employees`

table.

<---PAUSE AND THINK HOW TO SOLVE FOR NEXT 120 seconds.--->

SOLVE HandsOn here.

## 7. Solve by applying subquery

**Steps:**

- First, find the total quantity sold by each employee, for product A.
- Then Filter those records greater than avg quantity sold.

**First, Compute average total quantity sold by each employee for product A**

```
SELECT E.Name,
S.Product,
SUM(S.Quantity) as TotalQuantity
FROM Employees E
JOIN Sales S ON E.ID = S.EmployeeID
WHERE S.PRODUCT = "A"
GROUP BY E.Name, S.Product;
```

Name | Product | TotalQuantity |
---|---|---|

Alice | A | 10 |

Bob | A | 7 |

David | A | 9 |

Ethan | A | 3 |

**Then, filter those records where the Total Quantity (of product A) sold by an employees is greater than total average (of product A) sold by an employee?**

```
SELECT E.Name,
S.Product,
SUM(S.Quantity) as TotalQuantity
FROM Employees E
JOIN Sales S ON E.ID = S.EmployeeID
WHERE S.PRODUCT = "A"
GROUP BY E.Name, S.Product
HAVING TotalQuantity > ("Average of total qty of prdt A sold by an employee")
);
```

Now, We need to compute “Average of total qty of prdt A sold by an employee”.

So, first compute total qty of product A sold by each employee. Then compute the average.

```
SELECT SUM(QUANTITY)
FROM SALES
WHERE PRODUCT="A"
GROUP BY EMPLOYEEID;
```

SUM(QUANTITY) |
---|

10 |

7 |

9 |

3 |

But we want the average of these. So, put a query above this.

```
SELECT AVG(SUMQTY) as AVG_A
FROM (SELECT SUM(QUANTITY) as SUMQTY
FROM SALES
WHERE PRODUCT="A"
GROUP BY EMPLOYEEID)
```

AVG_A |
---|

7.25 |

**Now, apply this subquery inside the main query**

```
SELECT E.Name,
S.Product,
SUM(S.Quantity) as TotalQuantity
FROM Employees E
JOIN Sales S ON E.ID = S.EmployeeID
WHERE S.PRODUCT = "A"
GROUP BY E.Name, S.Product
HAVING TotalQuantity > (SELECT AVG(SUMQTY) as AVG_A
FROM (SELECT SUM(QUANTITY) as SUMQTY
FROM SALES
WHERE PRODUCT="A"
GROUP BY EMPLOYEEID) as subquery
);
```

Name | Product | TotalQuantity |
---|---|---|

Alice | A | 10 |

David | A | 9 |

Here, the subquery calculates the average quantity sold of product ‘A’ by all employees. Then the outer query sums up the quantity of product ‘A’ sold by each employee and compares it with the average quantity calculated by the subquery.