Problem
You have a table with multiple rows of data for each group, and you want to select the first row from each group.
From the orders
table below, select the first row in each group.
Input
order_id | customer_id | order_date | product_name | quantity |
---|---|---|---|---|
1 | 1 | 2023-01-01 | Laptop | 1 |
2 | 1 | 2023-01-02 | Mouse | 2 |
3 | 2 | 2023-01-03 | Keyboard | 1 |
4 | 2 | 2023-01-04 | Monitor | 1 |
5 | 3 | 2023-01-05 | Mouse Pad | 5 |
6 | 3 | 2023-01-06 | USB Cable | 3 |
Try Hands-On: Fiddle
Create Input Table: Gist
Desired Output
We want only the first record for each customer_id.
order_id | customer_id | order_date | product_name | quantity |
---|---|---|---|---|
1 | 1 | 2023-01-01 | Laptop | 1 |
3 | 2 | 2023-01-03 | Keyboard | 1 |
5 | 3 | 2023-01-05 | Mouse Pad | 5 |
There are multiple ways to do this. Let’s look at some of them.
Solution 1:
Using GROUP BY and JOIN
To achieve this, we can use a combination of JOIN and subquery. The idea is to find the earliest order date for each customer and then join that result with the main table.
SELECT o1.*
FROM orders o1
JOIN (
SELECT customer_id, MIN(order_date) as first_order_date
FROM orders
GROUP BY customer_id
) o2 ON o1.customer_id = o2.customer_id AND o1.order_date = o2.first_order_date;
Explanation:
This query first finds the earliest order_date for each customer_id in the subquery. Then, it joins this result with the main orders table to retrieve the details of the first order for each customer.
Solution 2:
Using ROW_NUMBER() window functions
MySQL does not have ROW_NUMBER()
function. So, switch to a different DB like SQLite before running the below code.
The idea is to assign a row number to each row within a partition (in this case, for each customer_id) based on order_date, and then filter out the rows with row number 1.
WITH NumberedOrders AS (
SELECT *,
ROW_NUMBER() OVER(PARTITION BY customer_id ORDER BY order_date) AS rn
FROM orders
)
SELECT *
FROM NumberedOrders
WHERE rn = 1;
Explanation:
The Common Table Expression (CTE) NumberedOrders
assigns row numbers to each row in the orders table, partitioned by customer_id and ordered by order_date.
The main query then filters the results to only select rows where the row number rn
is 1, which would be the first order for each customer.
This approach takes advantage of the capabilities of MySQL 8.0+, where window functions like ROW_NUMBER() were introduced.
Solution 3:
Using Correlated Subquery.
SELECT o.*
FROM orders o
WHERE order_date = (
SELECT MIN(order_date)
FROM orders o_inner
WHERE o_inner.customer_id = o.customer_id
);
Explanation:
For each row in the orders table (represented by the alias o), we perform a subquery.
This subquery (using the alias o_inner) retrieves the smallest (or “first” in terms of time) order_date for that specific customer_id.
If the order_date of the current row matches the result of this subquery, the row is included in the output.