Problem
From the test_scores table below, select the top 1 row in each group.
For example, you have a table of students' test scores, and you want to find the highest-scoring student in each subject.
Input
student_id | subject | score |
---|---|---|
1 | Math | 90 |
2 | Math | 85 |
3 | Math | 95 |
1 | Science | 88 |
2 | Science | 92 |
3 | Science | 89 |
1 | History | 78 |
2 | History | 88 |
3 | History | 92 |
Try Hands-On: Fiddle
Create Input Table: Gist
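If you would rather build the input table locally, here is a minimal sketch using Python's built-in sqlite3 module with an in-memory database. The column types (INTEGER, TEXT, INTEGER) are an assumption, since the table above only shows the values.

```python
import sqlite3

# In-memory SQLite database; nothing is written to disk.
conn = sqlite3.connect(":memory:")

# Column types are assumed from the sample values shown in the Input table.
conn.execute("""
    CREATE TABLE test_scores (
        student_id INTEGER,
        subject    TEXT,
        score      INTEGER
    )
""")

# The nine rows from the Input table.
rows = [
    (1, "Math", 90), (2, "Math", 85), (3, "Math", 95),
    (1, "Science", 88), (2, "Science", 92), (3, "Science", 89),
    (1, "History", 78), (2, "History", 88), (3, "History", 92),
]
conn.executemany("INSERT INTO test_scores VALUES (?, ?, ?)", rows)

# Quick sanity check: row count and the distinct groups.
row_count = conn.execute("SELECT COUNT(*) FROM test_scores").fetchone()[0]
subjects = [r[0] for r in conn.execute(
    "SELECT DISTINCT subject FROM test_scores ORDER BY subject")]
print(row_count, subjects)
```

The same CREATE TABLE and INSERT statements should work with little or no change on MySQL or PostgreSQL.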
Desired Output
student_id | subject | score |
---|---|---|
3 | History | 92 |
3 | Math | 95 |
2 | Science | 92 |
There are multiple ways to do this. Let’s look at some of them.
Solution 1:
Using ROW_NUMBER()
In this approach, we consider the top 1 row to be the first row within each subject group when the rows are ordered by score, highest first.
You can use a common table expression (CTE) along with the ROW_NUMBER() function to achieve this. Here’s the SQL query to retrieve the top 1 row from each subject group.
Note: In MySQL, both ROW_NUMBER() and the WITH clause (CTEs) are available only from version 8.0 onwards.
WITH RankedScores AS (
    SELECT
        student_id,
        subject,
        score,
        ROW_NUMBER() OVER (PARTITION BY subject ORDER BY score DESC) AS row_num
    FROM
        test_scores
)
SELECT
    student_id,
    subject,
    score
FROM
    RankedScores
WHERE
    row_num = 1;
Explanation:
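This query first assigns a row number to each row within each subject group, based on the score in descending order. Then, it selects only the rows where the row number is 1, which represents the top-scoring student in each subject. If two students tie for the top score in a subject, ROW_NUMBER() still returns only one of them, and which one is not guaranteed unless the ORDER BY includes a tie-breaker such as student_id.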
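To see the two steps separately, the sketch below runs the ranking query on its own for the Math group and then runs the full CTE. It uses Python's built-in sqlite3 module and assumes the bundled SQLite is version 3.25 or newer, which is when window functions were added. The alias ranked and the ORDER BY clauses are additions for readable, stable output and are not part of the original query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE test_scores (student_id INTEGER, subject TEXT, score INTEGER)"
)
conn.executemany(
    "INSERT INTO test_scores VALUES (?, ?, ?)",
    [
        (1, "Math", 90), (2, "Math", 85), (3, "Math", 95),
        (1, "Science", 88), (2, "Science", 92), (3, "Science", 89),
        (1, "History", 78), (2, "History", 88), (3, "History", 92),
    ],
)

# The body of the CTE: number rows within each subject, highest score first.
ranked_sql = """
    SELECT student_id, subject, score,
           ROW_NUMBER() OVER (PARTITION BY subject ORDER BY score DESC) AS row_num
    FROM test_scores
"""

# Step 1: the row numbers assigned inside the Math group.
math_ranked = conn.execute(
    f"SELECT student_id, score, row_num FROM ({ranked_sql}) AS ranked "
    "WHERE subject = 'Math' ORDER BY row_num"
).fetchall()

# Step 2: the full query, keeping only row_num = 1 in each group.
top_rows = conn.execute(
    f"WITH RankedScores AS ({ranked_sql}) "
    "SELECT student_id, subject, score FROM RankedScores "
    "WHERE row_num = 1 ORDER BY subject"
).fetchall()

print(math_ranked)  # [(3, 95, 1), (1, 90, 2), (2, 85, 3)]
print(top_rows)     # [(3, 'History', 92), (3, 'Math', 95), (2, 'Science', 92)]
```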
Solution 2:
By using a subquery with a JOIN
In this approach, we consider the top 1 row to be the row with the maximum score.
SELECT t1.student_id, t1.subject, t1.score
FROM test_scores t1
JOIN (
    SELECT subject, MAX(score) AS max_score
    FROM test_scores
    GROUP BY subject
) t2
    ON t1.subject = t2.subject AND t1.score = t2.max_score;
Explanation:
This query first creates a subquery (aliased as t2) that calculates the maximum score for each subject group using the MAX() function and GROUP BY.
Then, it joins the original table test_scores with this subquery based on both subject and score.
This way, it retrieves the rows where the score matches the maximum score for each subject group.
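Because the join matches on the score value, every row that shares the maximum score in a group is returned. The sketch below illustrates this with Python's built-in sqlite3 module. The extra row (student 4 scoring 95 in Math) is hypothetical and is not part of the original input. The ORDER BY is added only to make the output order stable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE test_scores (student_id INTEGER, subject TEXT, score INTEGER)"
)
conn.executemany(
    "INSERT INTO test_scores VALUES (?, ?, ?)",
    [
        (1, "Math", 90), (2, "Math", 85), (3, "Math", 95),
        (1, "Science", 88), (2, "Science", 92), (3, "Science", 89),
        (1, "History", 78), (2, "History", 88), (3, "History", 92),
        (4, "Math", 95),  # hypothetical row: ties with student 3 in Math
    ],
)

# Solution 2, with an ORDER BY appended for deterministic output.
join_sql = """
    SELECT t1.student_id, t1.subject, t1.score
    FROM test_scores t1
    JOIN (
        SELECT subject, MAX(score) AS max_score
        FROM test_scores
        GROUP BY subject
    ) t2
        ON t1.subject = t2.subject AND t1.score = t2.max_score
    ORDER BY t1.subject, t1.student_id
"""
tied_rows = conn.execute(join_sql).fetchall()

print(tied_rows)
# [(3, 'History', 92), (3, 'Math', 95), (4, 'Math', 95), (2, 'Science', 92)]
```

If you need exactly one row per group even when scores tie, the ROW_NUMBER() approach with a tie-breaker in its ORDER BY is likely the better fit.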
Solution 3:
Using a Correlated Subquery
SELECT ts.student_id, ts.subject, ts.score
FROM test_scores ts
WHERE ts.score = (
    SELECT MAX(score)
    FROM test_scores
    WHERE subject = ts.subject
);
Explanation:
In this query, we use a correlated subquery in the WHERE clause.
For each row in the main query (aliased as ts), the subquery finds the maximum score for the same subject in the test_scores table.
If the score of the current row matches the maximum score for that subject, the row is included in the result.
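As a quick check, the sketch below runs the correlated subquery against the sample data with Python's built-in sqlite3 module and compares the result with the join-based query. The ORDER BY clauses are additions for stable output. On this data the two queries return the same rows, because both select by maximum score.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE test_scores (student_id INTEGER, subject TEXT, score INTEGER)"
)
conn.executemany(
    "INSERT INTO test_scores VALUES (?, ?, ?)",
    [
        (1, "Math", 90), (2, "Math", 85), (3, "Math", 95),
        (1, "Science", 88), (2, "Science", 92), (3, "Science", 89),
        (1, "History", 78), (2, "History", 88), (3, "History", 92),
    ],
)

# Solution 3: the inner query is re-evaluated for each outer row's subject.
correlated_sql = """
    SELECT ts.student_id, ts.subject, ts.score
    FROM test_scores ts
    WHERE ts.score = (
        SELECT MAX(score)
        FROM test_scores
        WHERE subject = ts.subject
    )
    ORDER BY ts.subject
"""

# Solution 2, for comparison.
join_sql = """
    SELECT t1.student_id, t1.subject, t1.score
    FROM test_scores t1
    JOIN (
        SELECT subject, MAX(score) AS max_score
        FROM test_scores
        GROUP BY subject
    ) t2
        ON t1.subject = t2.subject AND t1.score = t2.max_score
    ORDER BY t1.subject
"""

correlated_rows = conn.execute(correlated_sql).fetchall()
join_rows = conn.execute(join_sql).fetchall()

print(correlated_rows)  # [(3, 'History', 92), (3, 'Math', 95), (2, 'Science', 92)]
print(correlated_rows == join_rows)  # True
```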