The following query:
SELECT
year, id, rate
FROM h
WHERE year BETWEEN 2000 AND 2009
AND id IN (SELECT rid FROM table2)
GROUP BY id, year
ORDER BY id, rate DESC
yields:
year id rate
2006 p01 8
2003 p01 7.4
2008 p01 6.8
2001 p01 5.9
2007 p01 5.3
2009 p01 4.4
2002 p01 3.9
2004 p01 3.5
2005 p01 2.1
2000 p01 0.8
2001 p02 12.5
2004 p02 12.4
2002 p02 12.2
2003 p02 10.3
2000 p02 8.7
2006 p02 4.6
2007 p02 3.3
What I’d like is only the top 5 results for each id:
2006 p01 8
2003 p01 7.4
2008 p01 6.8
2001 p01 5.9
2007 p01 5.3
2001 p02 12.5
2004 p02 12.4
2002 p02 12.2
2003 p02 10.3
2000 p02 8.7
Is there a way to do this using some kind of LIMIT like modifier that works within the GROUP BY?
You want to find top n rows per group. This answer provides a generic solution using example data that is different from OP.
In MySQL 8 or later you can use the ROW_NUMBER
, RANK
or DENSE_RANK
function depending on the exact definition of top 5. Below are the numbers generated by these functions based on value
sorted descending. Notice how ties are handled:
pkid |
catid |
value |
row_number |
rank |
dense_rank |
1 |
p01 |
100 |
*1 |
*1 |
*1 |
2 |
p01 |
90 |
*2 |
*2 |
*2 |
3 |
p01 |
90 |
*3 |
*2 |
*2 |
4 |
p01 |
80 |
*4 |
*4 |
*3 |
5 |
p01 |
80 |
*5 |
*4 |
*3 |
6 |
p01 |
80 |
6 |
*4 |
*3 |
7 |
p01 |
70 |
7 |
7 |
*4 |
8 |
p01 |
60 |
8 |
8 |
*5 |
9 |
p01 |
50 |
9 |
9 |
6 |
10 |
p01 |
40 |
10 |
10 |
7 |
Once you have chosen the function, use it like so:
SELECT *
FROM (
SELECT *, ROW_NUMBER() OVER (PARTITION BY id ORDER BY value DESC) AS n
FROM t
) AS x
WHERE n <= 5
DB<>Fiddle
In MySQL 5.x you can use poor man’s rank over partition to achieve desired result: outer join the table with itself and for each row, count the number of rows before it (e.g. the before row could be the one with higher value).
The following will produce results similar to RANK
function:
SELECT t.pkid, t.catid, t.value, COUNT(b.value) + 1 AS rank
FROM t
LEFT JOIN t AS b ON b.catid = t.catid AND b.value > t.value
GROUP BY t.pkid, t.catid, t.value
HAVING COUNT(b.value) + 1 <= 5
ORDER BY t.catid, t.value DESC, t.pkid
Make the following change to produce results similar to DENSE_RANK
function:
COUNT(DISTINCT b.value)
Or make the following change to produce results similar to ROW_NUMBER
function:
ON b.catid = t.catid AND (b.value > t.value OR b.value = t.value AND b.pkid < t.pkid)
DB<>Fiddle