The itertools pairwise recipe is one way to solve the problem. Combined with itertools.groupby, it groups consecutive pairs whose differences are equal. For a group with several pairs, the first and last items are selected; for a singleton group, only the last item of the pair is kept:
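# Python 2 code; on Python 3, drop izip from the import and use the built-in zip
from itertools import groupby, tee, izip

def pairwise(iterable):
    "s -> (s0,s1), (s1,s2), (s2, s3), ..."
    a, b = tee(iterable)
    next(b, None)  # advance the second iterator by one item
    return izip(a, b)

def grouper(lst):
    result = []
    # group consecutive pairs by their difference k
    for k, g in groupby(pairwise(lst), key=lambda x: x[1] - x[0]):
        g = list(g)
        if len(g) > 1:
            try:
                if g[0][0] == result[-1]:
                    # the group's start was already added as a single entry
                    del result[-1]
                elif g[0][0] == result[-1][1]:
                    g = g[1:]  # patch for duplicate start and/or end
            except (IndexError, TypeError):
                # result is empty, or its last entry is not a tuple
                pass
            result.append((g[0][0], g[-1][-1], k))
        else:
            # singleton group: keep the pair's last item (the whole pair if result is empty)
            result.append(g[0][-1] if result else g[0])
    return result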
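To make the grouping step concrete, here is a minimal Python 3 sketch that prints what groupby yields before grouper post-processes it. The helper name difference_runs is hypothetical, and zip(values, values[1:]) stands in for the pairwise recipe, which only works here because the input is a list.

```python
from itertools import groupby

def difference_runs(values):
    """Return (difference, pairs) for each run of equal consecutive differences."""
    # For a list, zip(values, values[1:]) yields the same pairs as the pairwise recipe.
    pairs = zip(values, values[1:])
    return [(diff, list(run))
            for diff, run in groupby(pairs, key=lambda p: p[1] - p[0])]

for diff, run in difference_runs([2, 3, 4, 5, 12, 13, 14, 20]):
    print(diff, run)
# 1 [(2, 3), (3, 4), (4, 5)]
# 7 [(5, 12)]
# 1 [(12, 13), (13, 14)]
# 6 [(14, 20)]
```

Runs with several pairs become (start, end, step) tuples in grouper, while one-pair runs such as the jump from 5 to 12 are the singleton groups.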
Trial: input -> grouper(lst) -> output
Input: [2, 3, 4, 5, 12, 13, 14, 15, 16, 17, 20]
Output: [(2, 5, 1), (12, 17, 1), 20]
Input: [2, 4, 6, 8, 12, 13, 14, 15, 16, 17, 20]
Output: [(2, 8, 2), (12, 17, 1), 20]
Input: [2, 4, 6, 8, 12, 12.4, 12.9, 13, 14, 15, 16, 17, 20]
Output: [(2, 8, 2), 12, 12.4, 12.9, (13, 17, 1), 20] # 12 does not appear in the second group
Update: (patch for duplicate start and/or end values)
s1 = [i + 10 for i in xrange(0, 11, 2)]; s2 = [30]; s3 = [i + 40 for i in xrange(45)]
Input: s1+s2+s3
Output: [(10, 20, 2), (30, 40, 10), (41, 84, 1)]
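# To make 30 appear as a single entry instead of a group, change the main if condition to len(g) > 2.
# The last group then starts at 40, because 40 is no longer the end of a preceding group.
Input: s1+s2+s3
Output: [(10, 20, 2), 30, (40, 84, 1)]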
Input: [2, 4, 6, 8, 10, 12, 13, 14, 15, 16, 17, 20]
Output: [(2, 12, 2), (13, 17, 1), 20]
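The code above targets Python 2 (izip, xrange). A possible Python 3 port is sketched below, assuming the built-in zip and range as replacements. The min_pairs parameter is a hypothetical addition, not part of the original function; it only exposes the choice between len(g) > 1 and len(g) > 2.

```python
from itertools import groupby, tee

def pairwise(iterable):
    "s -> (s0,s1), (s1,s2), (s2, s3), ..."
    a, b = tee(iterable)
    next(b, None)
    return zip(a, b)  # Python 3: the built-in zip is lazy, so izip is not needed

def grouper(lst, min_pairs=2):
    # min_pairs is hypothetical: 2 corresponds to `len(g) > 1`, 3 to `len(g) > 2`
    result = []
    for k, g in groupby(pairwise(lst), key=lambda x: x[1] - x[0]):
        g = list(g)
        if len(g) >= min_pairs:
            try:
                if g[0][0] == result[-1]:
                    del result[-1]  # start was already added as a single entry
                elif g[0][0] == result[-1][1]:
                    g = g[1:]  # start duplicates the end of the previous group
            except (IndexError, TypeError):
                pass  # result is empty, or its last entry is not a tuple
            result.append((g[0][0], g[-1][-1], k))
        else:
            result.append(g[0][-1] if result else g[0])
    return result

s1 = list(range(10, 21, 2)); s2 = [30]; s3 = list(range(40, 85))
print(grouper([2, 3, 4, 5, 12, 13, 14, 15, 16, 17, 20]))  # [(2, 5, 1), (12, 17, 1), 20]
print(grouper(s1 + s2 + s3))  # [(10, 20, 2), (30, 40, 10), (41, 84, 1)]
print(grouper(s1 + s2 + s3, min_pairs=3))  # 30 is now a plain entry, not a group
```

On Python 3.10 and later, itertools.pairwise could replace the recipe function.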