The following answers all use date_trunc('day',date_time)
or just cast to date
to truncate a timestamp to a date. There's no need to jump through hoops with date formatting and strings. See Date/time functions in the manual.
This SQLFiddle shows three possible answers: http://sqlfiddle.com/#!12/0fd51/14, all of which produce the same result for the input data (but not necessarily the same result if date_time
can have duplicates in it).
To solve your problem you could use a correlated subquery with a limit to generate an IN-list to filter on:
SELECT a.date_time, a.other_column
FROM table1 a
WHERE a.date_time IN (
SELECT b.date_time
FROM table1 b
WHERE b.date_time IS NOT NULL
AND a.date_time::date = b.date_time::date
ORDER BY b.date_time
LIMIT 3
)
AND a.date_time::date BETWEEN '2012-11-02' AND '2012-11-05';
This should be the most portable approach - though it won't work with MySQL (at least as of 5.5) because MySQL doesn't support LIMIT
in a subquery used in an IN
clause. It works in SQLite3 and PostgreSQL, though, and should work in most other DBs.
Another option would be to select the range of dates you wanted, annotate the rows within the range with a row number using a window function, then filter the output to exclude excess rows:
SELECT date_time, other_column
FROM (
SELECT
date_time,
other_column,
rank() OVER (PARTITION BY date_trunc('day',date_time) ORDER BY date_time) AS n
FROM Table1
WHERE date_trunc('day',date_time) BETWEEN '2012-11-02' AND '2012-11-05'
ORDER BY date_time
) numbered_rows
WHERE n < 4;
If ties are a possibility, ie if date_time
is not unique, then consider using either the rank
or dense_rank
window functions instead of row_number
to get deterministic results, or add an additional clause to the ORDER BY
in row_number
to break the tie.
If you use rank
then it'll include none of the rows if it can't fit all of them in; if you use dense_rank
it'll include all of them even if it has to go over the 3-row-per-day limit to do so.
All sorts of other processing are possible this way too, using the window specification.
Here's yet another formulation that uses array aggregation and slicing, which is completely PostgreSQL specific but fun.
SELECT b.date_time, b.other_column
FROM (
SELECT array_agg(a.date_time ORDER BY a.date_time)
FROM table1 a
WHERE a.date_time::date BETWEEN '2012-11-02'
AND '2012-11-05'
GROUP BY a.date_time::date
) x(arr)
INNER JOIN table1 b ON (b.date_time = ANY (arr[1:3]));