How to get the top 1 row of each group in SQL?

Problem

From the test_scores table below, select the top 1 row in each group.

For example, you have a table of students' test scores, and you want to find the highest-scoring student in each subject.

Input

student_id subject score
1 Math 90
2 Math 85
3 Math 95
1 Science 88
2 Science 92
3 Science 89
1 History 78
2 History 88
3 History 92
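
To experiment locally, the input table can be rebuilt in an in-memory SQLite database. This is only a sketch: the column types (INTEGER, TEXT, INTEGER) are assumed from the sample values, and the helper name `create_test_scores` is hypothetical.

```python
import sqlite3

# Rows copied from the Input table above.
ROWS = [
    (1, "Math", 90), (2, "Math", 85), (3, "Math", 95),
    (1, "Science", 88), (2, "Science", 92), (3, "Science", 89),
    (1, "History", 78), (2, "History", 88), (3, "History", 92),
]


def create_test_scores(conn):
    """Create and populate test_scores (column types are an assumption)."""
    conn.execute(
        "CREATE TABLE test_scores ("
        "student_id INTEGER, subject TEXT, score INTEGER)"
    )
    conn.executemany("INSERT INTO test_scores VALUES (?, ?, ?)", ROWS)


conn = sqlite3.connect(":memory:")
create_test_scores(conn)

# Sanity check: the table should hold all nine input rows.
row_count = conn.execute("SELECT COUNT(*) FROM test_scores").fetchone()[0]
print(row_count)
```

The same CREATE TABLE and INSERT statements should carry over to other databases with little or no change.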

Desired Output

student_id subject score
3 History 92
3 Math 95
2 Science 92

There are multiple ways to do this. Let’s look at some of them.

Solution 1:

Using ROW_NUMBER()

In this approach, we consider the top 1 row to be the first row in each group when the rows are ordered by score in descending order.

You can use a common table expression (CTE) along with the ROW_NUMBER() function to achieve this. Here’s the SQL query to retrieve the top 1 row from each subject group.

Note: In MySQL, ROW_NUMBER() and other window functions are available only from version 8.0 onwards.

    WITH RankedScores AS (
        SELECT
            student_id,
            subject,
            score,
            ROW_NUMBER() OVER (PARTITION BY subject ORDER BY score DESC) AS row_num
        FROM
            test_scores
    )
    SELECT
        student_id,
        subject,
        score
    FROM
        RankedScores
    WHERE
        row_num = 1;

Explanation:

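This query first assigns a row number to each row within each subject group, based on the score in descending order. Then, it selects only the rows where the row number is 1, which represents the top-scoring student in each subject. Because ROW_NUMBER() gives every row in a group a distinct number, exactly one row is returned per subject even when two students tie for the top score; which of the tied rows gets number 1 is not defined unless the ORDER BY includes a tie-breaker.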
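
The sketch below runs the ROW_NUMBER() approach against an in-memory SQLite database (SQLite needs version 3.25 or later for window functions). Two things are assumptions added for illustration and are not part of the original query: a hypothetical fourth student who ties for the top Math score, and `student_id` as a tie-breaker in the ORDER BY so that the result is deterministic.

```python
import sqlite3

ROWS = [
    (1, "Math", 90), (2, "Math", 85), (3, "Math", 95),
    (1, "Science", 88), (2, "Science", 92), (3, "Science", 89),
    (1, "History", 78), (2, "History", 88), (3, "History", 92),
    (4, "Math", 95),  # hypothetical extra row: ties with student 3 in Math
]

TOP_ROW_SQL = """
WITH RankedScores AS (
    SELECT
        student_id,
        subject,
        score,
        -- student_id is an assumed tie-breaker, not in the original query
        ROW_NUMBER() OVER (
            PARTITION BY subject
            ORDER BY score DESC, student_id
        ) AS row_num
    FROM test_scores
)
SELECT student_id, subject, score
FROM RankedScores
WHERE row_num = 1
ORDER BY subject;
"""


def top_row_per_subject(rows):
    """Load rows into an in-memory table and return one top row per subject."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE test_scores ("
        "student_id INTEGER, subject TEXT, score INTEGER)"
    )
    conn.executemany("INSERT INTO test_scores VALUES (?, ?, ?)", rows)
    result = conn.execute(TOP_ROW_SQL).fetchall()
    conn.close()
    return result


result = top_row_per_subject(ROWS)
print(result)
# [(3, 'History', 92), (3, 'Math', 95), (2, 'Science', 92)]
```

Even with the tie in Math, only one row per subject comes back. With the tie-breaker, the student with the lower `student_id` wins.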

Solution 2:

Using a Subquery with a JOIN

In this approach, we consider the top 1 row to be the row with the maximum score.

    SELECT
        t1.student_id,
        t1.subject,
        t1.score
    FROM
        test_scores t1
    JOIN (
        SELECT
            subject,
            MAX(score) AS max_score
        FROM
            test_scores
        GROUP BY
            subject
    ) t2
        ON t1.subject = t2.subject
        AND t1.score = t2.max_score;

Explanation:

This query first creates a subquery (aliased as t2) that calculates the maximum score for each subject group using the MAX() function and GROUP BY.

Then, it joins the original table test_scores with this subquery based on both subject and score.

This way, it retrieves the rows where the score matches the maximum score for each subject group. If several students tie for the maximum score in a subject, all of them are returned.
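
To see how the JOIN approach behaves when scores tie, here is a small sketch using an in-memory SQLite database. The fourth student is a hypothetical row added for illustration, and the ORDER BY is an addition that only makes the output order predictable.

```python
import sqlite3

ROWS = [
    (1, "Math", 90), (2, "Math", 85), (3, "Math", 95),
    (1, "Science", 88), (2, "Science", 92), (3, "Science", 89),
    (1, "History", 78), (2, "History", 88), (3, "History", 92),
    (4, "Math", 95),  # hypothetical extra row: ties with student 3 in Math
]

MAX_JOIN_SQL = """
SELECT t1.student_id, t1.subject, t1.score
FROM test_scores t1
JOIN (
    SELECT subject, MAX(score) AS max_score
    FROM test_scores
    GROUP BY subject
) t2
    ON t1.subject = t2.subject
    AND t1.score = t2.max_score
ORDER BY t1.subject, t1.student_id;
"""


def max_rows_per_subject(rows):
    """Return every row whose score equals its subject's maximum."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE test_scores ("
        "student_id INTEGER, subject TEXT, score INTEGER)"
    )
    conn.executemany("INSERT INTO test_scores VALUES (?, ?, ?)", rows)
    result = conn.execute(MAX_JOIN_SQL).fetchall()
    conn.close()
    return result


tied_result = max_rows_per_subject(ROWS)
print(tied_result)
# [(3, 'History', 92), (3, 'Math', 95), (4, 'Math', 95), (2, 'Science', 92)]
```

Math now contributes two rows, because both students match the maximum score.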

Solution 3:

Using a Correlated Subquery

    SELECT
        ts.student_id,
        ts.subject,
        ts.score
    FROM
        test_scores ts
    WHERE
        ts.score = (
            SELECT MAX(score)
            FROM test_scores
            WHERE subject = ts.subject
        );

Explanation:

In this query, we use a correlated subquery in the WHERE clause.

For each row in the main query (aliased as ts), the subquery finds the maximum score for the same subject in the test_scores table.

If the score of the current row matches the maximum score for that subject, the row is included in the result. If several rows tie for the maximum, each of them is included.
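
As a quick check, the correlated subquery can be run against the sample data in an in-memory SQLite database. This is a minimal sketch: the column types are assumed from the sample values, and the ORDER BY is added only so that the rows come back in the same order as the Desired Output.

```python
import sqlite3

ROWS = [
    (1, "Math", 90), (2, "Math", 85), (3, "Math", 95),
    (1, "Science", 88), (2, "Science", 92), (3, "Science", 89),
    (1, "History", 78), (2, "History", 88), (3, "History", 92),
]

CORRELATED_SQL = """
SELECT ts.student_id, ts.subject, ts.score
FROM test_scores ts
WHERE ts.score = (
    -- re-evaluated for the subject of each outer row
    SELECT MAX(score)
    FROM test_scores
    WHERE subject = ts.subject
)
ORDER BY ts.subject;
"""

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE test_scores ("
    "student_id INTEGER, subject TEXT, score INTEGER)"
)
conn.executemany("INSERT INTO test_scores VALUES (?, ?, ?)", ROWS)

top_rows = conn.execute(CORRELATED_SQL).fetchall()
print(top_rows)
# [(3, 'History', 92), (3, 'Math', 95), (2, 'Science', 92)]
```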
