Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
605 views
in Technique[技术] by (71.8m points)

sqlite - sql query - how to apply limit within group by

I have a table named t1 with following fields: ROWID, CID, PID, Score, SortKey

it has the following data:

1, C1, P1, 10, 1
2, C1, P2, 20, 2
3, C1, P3, 30, 3

4, C2, P4, 20, 3
5, C2, P5, 30, 2

6, C3, P6, 10, 1
7, C3, P7, 20, 2

what query do I write so that it applies group by on CID, but instead of returning me 1 single result per group, it returns me a max of 2 results per group. also where condition is score >= 20 and I want the results ordered by CID and SortKey.

If I had to run my query on above data, I would expect the following result:

RESULTS FOR C1 - note: ROWID 1 is not considered as its score < 20

C1, P2, 20, 2
C1, P3, 30, 3

RESULTS FOR C2 - note: ROWID 5 appears before ROWID 4 as ROWID 5 has lesser value SortKey

C2, P5, 30, 2
C2, P4, 20, 3

RESULTS FOR C3 - note: ROWID 6 does not appear as its score is less than 20 so only 1 record returned here

C3, P7, 20, 2

IN SHORT, I WANT A LIMIT WITHIN A GROUP BY. I want the simplest solution and want to avoid temp tables. sub queries are fine. Also note I am using SQLite for this.

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

Here's a fairly portable query to do what you want:

SELECT *
FROM table1 a 
WHERE a."ROWID" IN (
    SELECT b."ROWID" 
    FROM table1 b 
    WHERE b."Score" >= 20 
      AND b."ROWID" IS NOT NULL 
      AND a."CID" = b."CID" 
    ORDER BY b."CID", b."SortKey" 
    LIMIT 2
)
ORDER BY a."CID", a."SortKey";

The query uses a correlated subquery with a sort and limit to produce a list of ROWIDs that should appear in the final result. Because the correlated subquery is executed for every row, whether or not it's included in the result, it may not be as efficient as the window function version given below - but unlike that version it'll work on SQLite3, which doesn't support window functions.

This query requires that ROWID is unique (can be used as a primary key).

I tested the above in PostgreSQL 9.2 and in SQLite3 3.7.11 ; it works fine in both. It won't work on MySQL 5.5 or the latest 5.6 milestone because MySQL doesn't support LIMIT in a subquery used with IN.

SQLFiddle demos:

SQLite demo showing it works just fine on the SQLite3 command line: http://pastebin.com/26n4NiUC

Output (PostgreSQL):

 ROWID | CID | PID | Score | SortKey 
-------+-----+-----+-------+---------
     2 | C1  | P2  |    20 |       2
     3 | C1  | P3  |    30 |       3
     5 | C2  | P5  |    30 |       2
     4 | C2  | P4  |    20 |       3
     7 | C3  | P7  |    20 |       2
(5 rows)

If you want to filter for a particular CID, just add AND "CID" = 'C1' or whatever to the outer WHERE clause.

Here's a closely related answer with more detailed examples: https://stackoverflow.com/a/13411138/398670


Since this was originally tagged just SQL (no SQLite)... just for completeness, in PostgreSQL or other DBs with SQL-standard window function support I'd probably do this:

SELECT "ROWID", "CID", "PID", "Score", "SortKey"
FROM (
  SELECT *, row_number() OVER (PARTITION BY "CID" ORDER BY "SortKey") AS n
  FROM table1
  WHERE "Score" >= 20
) x
WHERE n < 3
ORDER BY "CID", "SortKey";

which produces the same result. SQLFiddle, including extra C1 row to demonstrate that the limiting filter actually works: http://sqlfiddle.com/#!12/22829/1

If you want to filter for a particular CID, just add AND "CID" = 'C1' or whatever to the inner WHERE clause.


BTW, your test data is insufficient, since it can never have more than two rows for any CID with score > 20 anyway.


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...