I know this is a rather old question, but I just went through a small example in my head which helped me understand why Postgres has this seemingly odd restriction on SELECT DISTINCT / ORDER BY columns.
Imagine you have the following data in your Rsvp table:
event_id | start_time
----------+------------------------
0 | Mar 17, 2013 12:00:00
1 | Jan 1, 1970 00:00:00
1 | Aug 21, 2013 16:30:00
2 | Jun 9, 2012 08:45:00
Now you want to grab a list of distinct event_ids, ordered by their respective start_times. But where should 1
go? Should it come first, because the one tuple starts on Jan 1, 1970, or should it go last because of the Aug 21, 2013?
As the database system can't make that decision for you and the syntax of the query can't depend on the actual data it might be operating on (assuming event_id
is unique), we are restricted to ordering only by columns from the SELECT
clause.
As for the actual question - an alternative to Matthew's answer is using an aggregate function like MIN
or MAX
for the sorting:
SELECT event_id
FROM Rsvp
GROUP BY event_id
ORDER BY MIN(start_time)
The explicit grouping and aggregation on start_time
permit the database to come up with a unambiguous ordering of the result tuples. Note however, that readability is definitely an issue in this case ;)
与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…