Menu

How to get the rows which have the max value for a column for each group in another column in SQL?

Problem

You want to find the rows that have the maximum value for a specific column within each group in another column.

For example, you have a table with data about products and their prices, and you want to find the most expensive product in each category.

Input

product_id product_name category_id price
1 Product A 1 10.99
2 Product B 1 15.99
3 Product C 2 8.50
4 Product D 2 12.75
5 Product E 3 25.99
6 Product F 3 22.49

Try Hands-On: Fiddle

Create Input Table: Gist

Desired Output

product_id product_name category_id price
2 Product B 1 15.99
4 Product D 2 12.75
5 Product E 3 25.99

There are multiple ways to do this. Let’s look at some of them.

Solution 1:

Using INNER JOIN and GROUP BY

    SELECT p.*
    FROM products p
    INNER JOIN (
        SELECT category_id, MAX(price) AS max_price
        FROM products
        GROUP BY category_id
    ) subquery
    ON p.category_id = subquery.category_id AND p.price = subquery.max_price;

Explanation:

This query first creates a subquery that calculates the maximum price for each category.

Then, it joins the original table with this subquery on both the category_id and the maximum price to retrieve the rows that match the criteria of having the maximum price within each category.

Solution 2:

By using a correlated subquery

SELECT p.*
FROM products p
WHERE p.price = (
    SELECT MAX(price)
    FROM products
    WHERE category_id = p.category_id
);

Explanation:

This query uses a correlated subquery in the WHERE clause.

For each row in the outer query (aliased as “p”), the subquery calculates the maximum price for products in the same category as the current row.

If the price of the current row matches the maximum price for that category, it will be included in the result set.

Solution 3:

Using RANK()

Note: Use MySQL 8 or above since this uses RANK() function.

SELECT product_id, product_name, category_id, price
FROM (
    SELECT 
        product_id, 
        product_name, 
        category_id, 
        price, 
        RANK() OVER (PARTITION BY category_id ORDER BY price DESC) AS price_rank
    FROM products
) ranked_products
WHERE price_rank = 1;

Explanation:

This approach leverages the power of window functions to rank products within categories based on the price column and select only those with a rank of 1, which corresponds to the maximum price within each category.

We create a subquery that includes the product details along with a calculated rank for each product within its category based on the price. The RANK() function assigns a rank of 1 to the product with the highest price within each category.

In the outer query, we select only those rows where the price_rank is equal to 1, which corresponds to the products with the maximum price in their respective categories.

Recommended Courses

  1. SQL for Data Science – Level 1
  2. SQL for Data Science – Level 2
  3. SQL for Data Science – Level 3

Recommended Tutorial

  1. Introduction to SQL
  2. SQL Window Functons – Made Simple and Easy
  3. SQL Subquery

More SQL Questions

  1. How to select only rows with max value on a column?
  2. How to transpose columns to rows in SQL?
  3. How to select first row in each GROUP BY group?

Course Preview

Machine Learning A-Z™: Hands-On Python & R In Data Science

Free Sample Videos:

Machine Learning A-Z™: Hands-On Python & R In Data Science

Machine Learning A-Z™: Hands-On Python & R In Data Science

Machine Learning A-Z™: Hands-On Python & R In Data Science

Machine Learning A-Z™: Hands-On Python & R In Data Science

Machine Learning A-Z™: Hands-On Python & R In Data Science