I have data like this:
table1
_____________
id way time
1 1 00:01
2 1 00:02
3 2 00:03
4 2 00:04
5 2 00:05
6 3 00:06
7 3 00:07
8 1 00:08
9 1 00:09
I would like to know in which time interval I was on which way:
desired output
_________________
id way from to
1 1 00:01 00:02
3 2 00:03 00:05
6 3 00:06 00:07
8 1 00:08 00:09
I tried to use a window function:
SELECT DISTINCT
       first_value(id)   OVER w AS id,
       first_value(way)  OVER w AS way,
       first_value(time) OVER w AS from,
       last_value(time)  OVER w AS to
FROM   table1
WINDOW w AS (
   PARTITION BY way ORDER BY id
   RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING);
What I get is:
ID way from to
1 1 00:01 00:09
3 2 00:03 00:05
6 3 00:06 00:07
And this is not correct, because I was not on way 1 from 00:01 to 00:09. Is it possible to partition according to the order, that is, to group only consecutive rows that have the same value?
If your case is as simple as the example values suggest, @Giorgos' answer serves nicely.
However, that's typically not the case. If the id column is a serial, you cannot rely on the assumption that a row with an earlier time also has a smaller id.

Also, time values (or timestamp values, like you probably have) can easily be duplicates, so you need to make the sort order unambiguous.

Assuming both can happen, and you want the id from the row with the earliest time per time slice (actually, the smallest id for the earliest time, since there could be ties), a query built along these lines would deal with the situation properly:

ORDER BY time, id: to be unambiguous. Assuming time is not unique, add the (assumed unique) id to avoid arbitrary results that could change between queries in sneaky ways.

max(time) OVER (PARTITION BY way, grp): without ORDER BY, the window frame spans all rows of the partition, so we get the absolute maximum per time slice.

The outer query layer is only necessary to produce the desired sort order in the result, since we are bound to a different ORDER BY in the subquery sub by using DISTINCT ON.

SQL Fiddle demonstrating the use case.
If you are looking to optimize performance, a plpgsql function could be faster in such a case.
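As a rough illustration of the procedural route, the sketch below walks the rows once in sorted order and emits a row whenever the way changes. The function name f_way_intervals is hypothetical, the column types (integer id and way, a time column) are assumptions since the question does not state them, and id and way are assumed to be NOT NULL. Whether it actually beats the window-function query depends on the data and would need to be measured.

```sql
-- Hypothetical function; column types are assumed, not taken from the question.
CREATE OR REPLACE FUNCTION f_way_intervals()
  RETURNS TABLE (id int, way int, time_from time, time_to time)
  LANGUAGE plpgsql AS
$func$
DECLARE
   r record;
BEGIN
   FOR r IN
      -- Table-qualify the columns to avoid clashes with the output parameters.
      SELECT t.id, t.way, t."time" AS ts
      FROM   table1 t
      ORDER  BY t."time", t.id             -- id as tie breaker
   LOOP
      IF r.way IS DISTINCT FROM way THEN   -- the way changed: a new slice starts
         IF id IS NOT NULL THEN            -- emit the finished slice, if any
            RETURN NEXT;
         END IF;
         id        := r.id;
         way       := r.way;
         time_from := r.ts;
      END IF;
      time_to := r.ts;                     -- extend the current slice
   END LOOP;

   IF id IS NOT NULL THEN                  -- emit the last slice
      RETURN NEXT;
   END IF;
END
$func$;
```

It would be called like a table: SELECT * FROM f_way_intervals();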
Aside: don't use the basic type name time as an identifier (it is also a reserved word in standard SQL).