This is my solution using WINDOW functions
. I used the lag
and lead
functions. Both returns a value from a column from a row in offset from the current row. lag
goes back and lead
goes next in the offset.
SELECT tokcat.text
FROM (
SELECT text, category, chartype, lag(category,1) OVER w as previousCategory, lead(category,1) OVER w as nextCategory
FROM token t, textBlockHasToken tb
WHERE tb.tokenId = t.id
WINDOW w AS (
PARTITION BY textBlockId, sentence
ORDER BY textBlockId, sentence, position
)
) tokcat
WHERE 'NAME' = ANY(previousCategory)
AND 'NAME' = ANY(nextCategory)
AND 'NAME' <> ANY(category)
Simplified version:
SELECT text
FROM (
SELECT text
,category
,lag(category) OVER w as previous_cat
,lead(category) OVER w as next_cat
FROM token t
JOIN textblockhastoken tb ON tb.tokenid = t.id
WINDOW w AS (PARTITION BY textblockid, sentence ORDER BY position)
) tokcat
WHERE category <> 'NAME'
AND previous_cat = 'NAME'
AND next_cat = 'NAME';
Major points
= ANY()
is not needed, the window function returns a single value
- some redundant fields in the subquery
- no need to order by columns, that you
PARTITION BY
- the ORDER BY applies within partitions
- Don't use mixed case identifiers without quoting, it only leads to confusion. (Better yet: don't use mixed case identifiers in PostgreSQL ever)
与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…