Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

python - I want to find the most frequent sequence of symbols in a list of lists

I want to find the most frequent sequence of symbols decoded from several lists of lists.

CATEGORIES = ["0","1","2","3","4","5","6","7","8","9",
              "A","B","C","D","E","F","G","H","I","J",
              "K","L","M","N","O","P","R","S","T","U",
              "V","W","X","Y","Z"]

KR8877J = [[0.002,0.006,0.004,0.045,0.002,0.017,0.006,0.077,0.001,0.035,0.042,0.005,0.004,0.039,0.001,0.002,0.001,0.008,0.058,0.352,0.002,0.007,0.017,0.004,0.007,0.007,0.007,0.004,0.005,0.009,0.089,0.036,0.053,0.041,0.004],[0.003,0.007,0.005,0.075,0.001,0.020,0.006,0.044,0.002,0.035,0.026,0.004,0.004,0.033,0.001,0.001,0.003,0.008,0.049,0.360,0.002,0.007,0.021,0.005,0.009,0.003,0.008,0.007,0.003,0.014,0.092,0.048,0.058,0.031,0.004],[0.002,0.000,0.025,0.012,0.006,0.002,0.001,0.627,0.006,0.021,0.022,0.008,0.004,0.006,0.004,0.033,0.000,0.006,0.011,0.009,0.002,0.002,0.009,0.000,0.002,0.040,0.007,0.005,0.015,0.000,0.035,0.001,0.008,0.015,0.053],[0.056,0.008,0.023,0.038,0.015,0.007,0.050,0.006,0.412,0.004,0.005,0.027,0.011,0.005,0.021,0.007,0.073,0.024,0.012,0.005,0.013,0.005,0.027,0.003,0.015,0.001,0.005,0.074,0.002,0.022,0.005,0.011,0.002,0.001,0.006],[0.025,0.011,0.025,0.034,0.018,0.027,0.090,0.008,0.258,0.006,0.007,0.026,0.016,0.008,0.026,0.011,0.079,0.030,0.026,0.008,0.018,0.011,0.033,0.003,0.016,0.001,0.003,0.106,0.004,0.021,0.012,0.013,0.003,0.005,0.014],[0.048,0.027,0.019,0.002,0.028,0.002,0.008,0.017,0.041,0.014,0.012,0.022,0.031,0.005,0.045,0.100,0.004,0.031,0.033,0.002,0.029,0.006,0.021,0.032,0.008,0.038,0.317,0.007,0.017,0.004,0.018,0.005,0.003,0.004,0.002],[0.013,0.002,0.002,0.000,0.164,0.001,0.060,0.004,0.006,0.002,0.018,0.003,0.035,0.002,0.008,0.008,0.001,0.008,0.028,0.005,0.383,0.013,0.063,0.010,0.004,0.002,0.014,0.016,0.002,0.005,0.048,0.011,0.028,0.017,0.012]]

KR8877J_1 = [[0.004,some data]]
KR8877J_2 = [[0.002,some data]]
KR8877J_3 = [[some data]]
KR8877J_4 = [[0.006,some data,0.008]]
KR8877J_5 = [[some data]]
KR8877J_6 = [[some data]]

def readable(x):
    tag = []
    for lst in x:
        # index of the highest score in this row
        index = max(enumerate(lst), key=lambda pair: pair[1])[0]
        tag.append(CATEGORIES[index])
    # rows are stored last symbol first, so reverse before printing
    tag.reverse()
    print(tag)


for i in (KR8877J, KR8877J_1,KR8877J_2,KR8877J_3,KR8877J_4,KR8877J_5,KR8877J_6): 
    readable(i)
    

def compare_bitwise(a, b):
    # True if a and b share at least one element
    return bool(set(a) & set(b))


for i in (KR8877J, KR8877J_1,KR8877J_2,KR8877J_3,KR8877J_4,KR8877J_5,KR8877J_6):
    print(compare_bitwise(i, i+=1)) 

The problem with the iteration is here: print(compare_bitwise(i, i+=1)). I give example data only for the first list because all the others decode to the same sequence, except one, which outputs ['K', 'R', '8', '8', 'J', '7', 'J'] instead of ['K', 'R', '8', '8', '7', 'J', 'J'].

question from:https://stackoverflow.com/questions/66059786/i-want-to-find-the-most-often-sequence-of-symbols-from-the-list-of-lists

1 Reply

Without seeing what the elided "some data" looks like, it's hard to tell what output is expected. In any case, print(compare_bitwise(i, i+=1)) is a syntax error: i+=1 is a statement, not an expression, and i here is one of the lists, not an index. If you're trying to compare each KR8877J_... sequence with the next one (KR8877J_1 with KR8877J_2, KR8877J_2 with KR8877J_3, and so on), the smallest change is to put the sequences in a list or tuple and index into it:

seq = (KR8877J_1,KR8877J_2,KR8877J_3,KR8877J_4,KR8877J_5,KR8877J_6)
for i in range(len(seq) - 1):
    print(compare_bitwise(seq[i], seq[i+1]))

But indexing with range() and len() isn't idiomatic Python. It's cleaner to pair each element with its successor by using zip() with seq and seq[1:]:

seq = (KR8877J_1,KR8877J_2,KR8877J_3,KR8877J_4,KR8877J_5,KR8877J_6)
for i, j in zip(seq, seq[1:]):
    print(compare_bitwise(i, j))
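
One caveat, assuming each KR8877J_... has the same shape as KR8877J (a list of lists): set() on such a value raises TypeError, because lists are not hashable. A minimal sketch of one way around that is to decode each sequence to a tuple of symbols first and compare neighbors for equality. The shortened CATEGORIES, the decode() helper, and the scores below are hypothetical stand-ins, not the real data.

```python
# Hypothetical, shortened alphabet; the real CATEGORIES has 35 symbols.
CATEGORIES = ["0", "1", "2", "3"]

def decode(rows):
    """Pick the top-scoring symbol of each row, reversed as in readable()."""
    tags = [CATEGORIES[max(range(len(row)), key=row.__getitem__)] for row in rows]
    return tuple(reversed(tags))  # a tuple is hashable and comparable

# Made-up scores standing in for KR8877J_1, KR8877J_2, ...
seq = (
    [[0.1, 0.7, 0.1, 0.1], [0.6, 0.2, 0.1, 0.1]],
    [[0.2, 0.5, 0.2, 0.1], [0.7, 0.1, 0.1, 0.1]],
    [[0.1, 0.1, 0.7, 0.1], [0.6, 0.2, 0.1, 0.1]],
)

decoded = [decode(rows) for rows in seq]

# Compare each decoded sequence with its successor.
matches = [a == b for a, b in zip(decoded, decoded[1:])]
print(matches)
```

Comparing decoded tuples with == checks that two readings agree symbol by symbol, which is stricter than the "share at least one element" test in compare_bitwise().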

Edit: I don't fully understand what you mean, but you can return the value from readable() instead of just printing it, and convert the returned tag list to a tuple with tag_tup = tuple(tag) so that it is hashable. Then use that tuple as a key for collections.Counter: create counter = collections.Counter() before the loop that calls readable(i), and call counter.update({tag_tup: 1}) inside the loop.

def readable(x):
    tag = []
    for lst in x:
        index = max(enumerate(lst), key=lambda pair: pair[1])[0]
        tag.append(CATEGORIES[index])
    tag.reverse()
    print(tag)
    return tag


import collections
counter = collections.Counter()
for i in (KR8877J, KR8877J_1,KR8877J_2,KR8877J_3,KR8877J_4,KR8877J_5,KR8877J_6):
    tag_tup = tuple(readable(i))
    counter.update({tag_tup: 1})

print(counter)

See the collections.Counter docs for most_common(), which returns the most frequent combinations together with their counts.
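
As a rough illustration of that last step, here is a small sketch using most_common(1). The decoded list is hypothetical and stands in for the tuples returned by readable(); only the two sequences quoted in the question are used.

```python
from collections import Counter

# Hypothetical decoded sequences standing in for tuple(readable(i)).
decoded = [
    ("K", "R", "8", "8", "7", "J", "J"),
    ("K", "R", "8", "8", "7", "J", "J"),
    ("K", "R", "8", "8", "J", "7", "J"),
    ("K", "R", "8", "8", "7", "J", "J"),
]

# Counter accepts any iterable of hashable items and tallies them.
counter = Counter(decoded)

# most_common(1) returns a list with one (item, count) pair.
best, count = counter.most_common(1)[0]
print("".join(best), count)
```

Passing the whole list to Counter() does the same tallying as calling counter.update({tag_tup: 1}) once per sequence inside the loop.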

