Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share

python - Identify groups of varying continuous numbers in a list

In this other SO post, a Python user asked how to group continuous numbers so that each sequence is represented by just its start and end, with any stragglers displayed as single items. The accepted answer works brilliantly for continuous sequences.

I need to adapt a similar solution to a sequence of numbers whose increments may (but do not always) vary. Ideally, the representation would also include the increment, so a reader can tell whether a run steps by 3, 4, 5, or any n.

In the original question, the user asked for the following input/output:

[2, 3, 4, 5, 12, 13, 14, 15, 16, 17, 20]  # input
[(2,5), (12,17), 20]

What I would like is the following (Note: I wrote a tuple as the output for clarity but xrange would be preferred using its step variable):

[2, 3, 4, 5, 12, 13, 14, 15, 16, 17, 20]  # input
[(2,5,1), (12,17,1), 20]  # note, the last element in the tuple would be the step value

It should also handle the following input:

[2, 4, 6, 8, 12, 13, 14, 15, 16, 17, 20]  # input
[(2,8,2), (12,17,1), 20]  # note, the last element in the tuple would be the increment

I know that xrange() supports a step so it may be possible to even use a variant of the other user's answer. I tried making some edits based on what they wrote in the explanation but I wasn't able to get the result I was looking for.
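As a minimal sketch of the representation asked for above, assuming Python 3 (where `range` plays the role of `xrange`) and assuming the end value in each tuple is inclusive, a `(start, end, step)` tuple could be turned into a range object roughly like this. The helper name `to_range` is hypothetical and not part of the original post.

```python
def to_range(group):
    """Turn an inclusive (start, end, step) tuple into a range object.

    Hypothetical helper; assumes integer values and a positive step.
    """
    start, end, step = group
    # range() excludes its stop value, so go one step past the inclusive end.
    return range(start, end + step, step)


r = to_range((2, 8, 2))
print(list(r))  # [2, 4, 6, 8]
print(r.step)   # 2
```

The step stays available on the range object itself, so no separate tuple element is needed once the conversion is done.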

For anyone that doesn't want to click the original link, the code that was originally posted by Nadia Alramli is:

from itertools import groupby
from operator import itemgetter

ranges = []
for key, group in groupby(enumerate(data), lambda (index, item): index - item):
    group = map(itemgetter(1), group)
    if len(group) > 1:
        ranges.append(xrange(group[0], group[-1]))
    else:
        ranges.append(group[0])
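The snippet above is Python 2 only (tuple-unpacking lambdas and `xrange` are gone in Python 3). A rough Python 3 equivalent might look like the sketch below. The function name `consecutive_ranges` is an assumption for illustration, and this version adds 1 to the stop value so that the range includes the last item of the run, which differs from the snippet as quoted.

```python
from itertools import groupby


def consecutive_ranges(data):
    """Group runs of consecutive integers into range objects; keep loners as-is."""
    ranges = []
    # Within a run of consecutive integers, index - item stays constant.
    for _, group in groupby(enumerate(data), key=lambda pair: pair[0] - pair[1]):
        items = [item for _, item in group]
        if len(items) > 1:
            # + 1 because range() excludes its stop value.
            ranges.append(range(items[0], items[-1] + 1))
        else:
            ranges.append(items[0])
    return ranges


print(consecutive_ranges([2, 3, 4, 5, 12, 13, 14, 15, 16, 17, 20]))
# [range(2, 6), range(12, 18), 20]
```

Because the grouping key is `index - item`, this only detects runs with a step of exactly 1, which is the limitation the question is about.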

1 Reply

The itertools pairwise recipe is one way to solve the problem. Combined with itertools.groupby, it groups consecutive pairs whose differences are equal. For a multi-item group, the first and last items are kept together with the step; for a singleton group, only the last item of the pair is kept:

from itertools import groupby, tee, izip


def pairwise(iterable):
    "s -> (s0,s1), (s1,s2), (s2, s3), ..."
    a, b = tee(iterable)
    next(b, None)
    return izip(a, b)

def grouper(lst):
    result = []
    # k is the shared difference; g holds the consecutive pairs that have it
    for k, g in groupby(pairwise(lst), key=lambda x: x[1] - x[0]):
        g = list(g)
        if len(g) > 1:
            try:
                if g[0][0] == result[-1]:
                    # the run starts at the previous stand-alone item: fold it in
                    del result[-1]
                elif g[0][0] == result[-1][1]:
                    g = g[1:] # patch for duplicate start and/or end
            except (IndexError, TypeError):
                # result is empty, or its last entry is a plain number
                pass
            result.append((g[0][0], g[-1][-1], k))
        elif result:
            result.append(g[0][-1])
        else:
            result.append(g[0])
    return result
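To make the grouping step easier to follow, here is a small sketch of what groupby receives from pairwise, written for Python 3 (so `zip` stands in for `izip`). The helper name `diff_groups` is hypothetical and only exists to expose the intermediate result.

```python
from itertools import groupby, tee


def pairwise(iterable):
    """s -> (s0, s1), (s1, s2), (s2, s3), ..."""
    a, b = tee(iterable)
    next(b, None)     # advance the second iterator by one item
    return zip(a, b)  # Python 3: zip is lazy, like izip in Python 2


def diff_groups(data):
    """Return (difference, pairs) for each run of equal differences."""
    return [
        (diff, list(pairs))
        for diff, pairs in groupby(pairwise(data), key=lambda p: p[1] - p[0])
    ]


for diff, pairs in diff_groups([2, 4, 6, 8, 12, 13, 14]):
    print(diff, pairs)
# 2 [(2, 4), (4, 6), (6, 8)]
# 4 [(8, 12)]
# 1 [(12, 13), (13, 14)]
```

The lone `(8, 12)` pair bridges two runs, which is why grouper treats single-pair groups separately and then removes or trims the shared boundary value.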

Trial: input -> grouper(lst) -> output

Input: [2, 3, 4, 5, 12, 13, 14, 15, 16, 17, 20]
Output: [(2, 5, 1), (12, 17, 1), 20]

Input: [2, 4, 6, 8, 12, 13, 14, 15, 16, 17, 20]
Output: [(2, 8, 2), (12, 17, 1), 20]

Input: [2, 4, 6, 8, 12, 12.4, 12.9, 13, 14, 15, 16, 17, 20]
Output: [(2, 8, 2), 12, 12.4, 12.9, (13, 17, 1), 20] # 12 does not appear in the second group

Update: (patch for duplicate start and/or end values)

s1 = [i + 10 for i in xrange(0, 11, 2)]  # 10, 12, ..., 20
s2 = [30]
s3 = [i + 40 for i in xrange(45)]        # 40, 41, ..., 84

Input: s1+s2+s3
Output: [(10, 20, 2), (30, 40, 10), (41, 84, 1)]

# to make 30 appear as an entry instead of a group, change the main if condition to len(g) > 2
Input: s1+s2+s3
Output: [(10, 20, 2), 30, (40, 84, 1)]  # 30 no longer consumes 40, so the last run starts at 40

Input: [2, 4, 6, 8, 10, 12, 13, 14, 15, 16, 17, 20]
Output: [(2, 12, 2), (13, 17, 1), 20]
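For readers on Python 3, a possible port of the answer's code is sketched below. It keeps the same logic and swaps `izip` for the built-in `zip`. The `min_pairs` parameter is an assumption added here, not part of the original answer: the default of 2 corresponds to `len(g) > 1`, and 3 corresponds to the `len(g) > 2` variant mentioned above. Values below 2 are not supported by this sketch.

```python
from itertools import groupby, tee


def pairwise(iterable):
    """s -> (s0, s1), (s1, s2), (s2, s3), ..."""
    a, b = tee(iterable)
    next(b, None)
    return zip(a, b)


def grouper(lst, min_pairs=2):
    """Collapse runs with a constant step into (start, end, step) tuples.

    min_pairs is a hypothetical knob: the fewest equal-step pairs that
    count as a run. It is assumed to be at least 2.
    """
    result = []
    for step, pairs in groupby(pairwise(lst), key=lambda p: p[1] - p[0]):
        pairs = list(pairs)
        if len(pairs) >= min_pairs:
            try:
                if pairs[0][0] == result[-1]:
                    # The run starts at the previous stand-alone item: fold it in.
                    del result[-1]
                elif pairs[0][0] == result[-1][1]:
                    # The run starts at the end of the previous tuple: skip the duplicate.
                    pairs = pairs[1:]
            except (IndexError, TypeError):
                # result is empty, or its last entry is a plain number.
                pass
            result.append((pairs[0][0], pairs[-1][-1], step))
        elif result:
            result.append(pairs[0][-1])
        else:
            result.append(pairs[0])
    return result


print(grouper([2, 4, 6, 8, 12, 13, 14, 15, 16, 17, 20]))
# [(2, 8, 2), (12, 17, 1), 20]
print(grouper([2, 4, 6, 8, 10, 12, 13, 14, 15, 16, 17, 20]))
# [(2, 12, 2), (13, 17, 1), 20]
```

Exposing the threshold as a parameter avoids editing the function body each time the minimum run length changes.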
