Menu

How to select only rows with max value on a column?

Problem

How to select only rows with max value on a column for each distinct value in another categorical column in MySQL?

From Items table, select rows with max value on the Price column for each distinct value in the Category column.

Input

id Category Price
1 Fruit 2.50
2 Fruit 3.00
3 Veg 1.00
4 Veg 1.50
5 Grain 0.50
6 Grain 0.60

Try Hands-Om: Fiddle

Create Input Table: Gist

Desired Output

We want only those records containing the maximum value of Price for each category.

Category Price
Fruit 3.00
Veg 1.50
Grain 0.60

There are multiple ways to do this. Let’s look at some of them.

Solution 1:

Using Left Join

    SELECT i1.Category, i1.Price
    FROM Items i1
    LEFT JOIN Items i2
    ON i1.Category = i2.Category AND i1.Price < i2.Price
    WHERE i2.Price IS NULL;

Explanation:

This solution uses a self join. For each row in the table Items (aliased as i1), we join it with the same table (aliased as i2) on the Category column, but only for rows where i1’s price is less than i2’s price.

The WHERE clause then filters out rows where we found a higher price in i2, leaving only the rows with the maximum price for each category.

Solution 2:

Using window functions

The following solution will not work on MySQL but works on other DB’s like SQLite, because MySQL does not support ranking functions like row_number.

SELECT Category, Price
FROM (
    SELECT 
        Category, 
        Price, 
        ROW_NUMBER() OVER (PARTITION BY Category ORDER BY Price DESC) as rnum
    FROM Items
) AS temp
WHERE rnum = 1;

Explanation:

The ROW_NUMBER() function assigns a unique sequential integer to rows within a partition of a result set. The PARTITION BY Category partitions data by Category. The ORDER BY Price DESC orders the records in each partition based on the Price in descending order.

The outer query then filters the results to include only rows where the row number is 1, i.e., the rows with the highest price in each category.

Solution 3:

Using Group By

SELECT i.Category, i.Price 
FROM Items i
INNER JOIN (
    SELECT Category, MAX(Price) as MaxPrice
    FROM Items
    GROUP BY Category
) grouped 
ON i.Category = grouped.Category AND i.Price = grouped.MaxPrice;

Explanation

  1. The subquery (aliased as grouped) groups the items by Category and selects the maximum price for each category.

  2. The main query then joins the original Items table with this subquery on both Category and Price columns (Max Price). This ensures that only the rows with the maximum price for each category are selected.

Recommended Courses

  1. SQL for Data Science – Level 1
  2. SQL for Data Science – Level 2
  3. SQL for Data Science – Level 3

Recommended Tutorial

  1. Introduction to SQL
  2. SQL Window Functons – Made Simple and Easy
  3. SQL Subquery

Course Preview

Machine Learning A-Z™: Hands-On Python & R In Data Science

Free Sample Videos:

Machine Learning A-Z™: Hands-On Python & R In Data Science

Machine Learning A-Z™: Hands-On Python & R In Data Science

Machine Learning A-Z™: Hands-On Python & R In Data Science

Machine Learning A-Z™: Hands-On Python & R In Data Science

Machine Learning A-Z™: Hands-On Python & R In Data Science