Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


python - how to split a list by 2nd recurrence

Sorry if the title is confusing; I hope my explanation is clearer. I have a working method, but I hope it can be improved, particularly because it has a flaw. I searched for this problem as thoroughly as I could and didn't find anything that matched closely.

I have a list of dicts that looks like this:

names = [
  {"name": "1231 GROUP LLC,.", "address": ""},
  {"name": "Brick Pizza", "address": ""},
  {"name": "Zone Fitness", "address": ""},
  {"name": "Alderson, Kevin", "address": ""},
  {"name": "Alderson, Joanne", "address": ""},
  {"name": "Ave, John", "address": ""},
  {"name": "Zow, Peter", "address": ""}
]

The first three entries are businesses, and the last four are individuals. I'm trying to split the list into those two groups (businesses vs. individuals) based on the name key.

It's known that the dataset starts with businesses, followed by individuals, and that each group is sorted alphanumerically. Individuals can't have numbers in their names, so the business portion is alphanumeric while the individuals are strictly alphabetic. I would still prefer to treat both as alphanumeric, particularly because of the flaw explained below.

Second, note that the first character of a name may repeat: three individuals have a last name starting with A, but taking the full name into account, Alderson comes before Ave. Also, two individuals share the exact same last name. The same could happen in the business section. The list shouldn't be split at the second occurrence of 'A' within the businesses, but at the first point where the ordering starts over after cycling through the alphabet.
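To make that rule concrete: Python compares strings character by character (by code point), so a repeated first letter is not a reset as long as the whole name still sorts after the previous one. A small sketch using names from the list above, plus one hypothetical business ("Acme Corp") added only for illustration:

```python
# Pairs of (previous name, next name) as they might appear consecutively.
pairs = [
    ("Alderson, Kevin", "Ave, John"),     # same first letter, still ascending
    ("Acme Corp", "Brick Pizza"),         # "Acme Corp" is a hypothetical name
    ("Zone Fitness", "Alderson, Kevin"),  # business -> individual boundary
]

# A reset happens only when the next name sorts before the previous one.
resets = [nxt < prev for prev, nxt in pairs]
print(resets)  # [False, False, True]
```

Only the last pair counts as a reset, which is the point where the list should be split.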

As I said, I do have a working method, but I view it as inefficient and a hack. The parameter response is a list of dicts that all have the same keys, one of which is 'name'.

def sort_politics(response):
    # names is also equal to the list-dict format provided.
    names = [v for dic in response for k, v in dic.items() if k == 'name']
    first_name = names[0]
    second_name = ""
    for name in names:
        if name == first_name:
            continue
        if name >= second_name:
            second_name = name
        elif name <= second_name:
            print("start's over with {0}".format(name))
            second_name = name
            break
        continue
    businesses = response[0:names.index(second_name)]
    individuals = response[names.index(second_name):]
    print(businesses)
    print(individuals)

The reason I view it as a hack is that it doesn't actually work through the list of dicts; it pulls each name out of the dicts and then walks through those names with a for-loop and if conditions. Afterwards, I have to look up the index to split at. Further, there is the flaw that it only splits once. If, theoretically, the alphanumeric ordering were to reset again after the individuals, that next group would be included in individuals as per line #16 (or #17 if you count the note).
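The "only splits once" flaw could be avoided by starting a new group at every reset instead of stopping at the first one. The following is a minimal sketch, not the original method: the function name split_sorted_runs, its key parameter, and the third group ("Apex Trust") are hypothetical and exist only for illustration.

```python
def split_sorted_runs(records, key="name"):
    """Split records into consecutive runs, starting a new run at each reset."""
    runs = []
    for record in records:
        # Start a new run for the first record, or when this key sorts
        # before the last key of the current run.
        if not runs or record[key] < runs[-1][-1][key]:
            runs.append([record])
        else:
            runs[-1].append(record)
    return runs


records = [
    {"name": "1231 GROUP LLC,.", "address": ""},
    {"name": "Zone Fitness", "address": ""},
    {"name": "Alderson, Joanne", "address": ""},
    {"name": "Zow, Peter", "address": ""},
    {"name": "Apex Trust", "address": ""},  # hypothetical third group
]

runs = split_sorted_runs(records)
print([len(run) for run in runs])  # [2, 2, 1]
```

This assumes every group really is sorted. In the sample list as given, "Alderson, Joanne" follows "Alderson, Kevin", which a strict comparison like this one would treat as one more reset.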

question from:https://stackoverflow.com/questions/65867537/how-to-split-a-list-by-2nd-recurrence


1 Reply


I believe this should do what you want it to:

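def sort_politics(response):
    # Assumption: if the ordering never resets, everything counts as a business.
    businesses = list(response)
    individuals = []

    # Nothing to compare in an empty list.
    if not response:
        return businesses, individuals

    current_name = response[0]['name']

    for index, item in enumerate(response):
        # A name that sorts before its predecessor starts the second group.
        if item['name'] < current_name:
            businesses = response[:index]
            individuals = response[index:]
            break
        else:
            current_name = item['name']

    return businesses, individuals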

So instead of making a whole new list of names, I just go through the list, dictionary by dictionary, and compare the name attribute.

If the name is less than current_name (which is initially the very first dictionary's name), we set the businesses and individuals lists based on the index of the item and break out of the for loop. Otherwise, current_name is updated to that item's name.
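The comparison described above can be checked against the sample data from the question. This is a minimal sketch that restates the same idea as a pairwise comparison; the name split_at_first_reset is hypothetical, and returning the whole list as the first group when no reset occurs is an assumption rather than something the question specifies.

```python
def split_at_first_reset(records, key="name"):
    """Split records at the first item whose key sorts before its predecessor."""
    for index in range(1, len(records)):
        if records[index][key] < records[index - 1][key]:
            return records[:index], records[index:]
    # Assumption: with no reset, everything belongs to the first group.
    return list(records), []


names = [
    {"name": "1231 GROUP LLC,.", "address": ""},
    {"name": "Brick Pizza", "address": ""},
    {"name": "Zone Fitness", "address": ""},
    {"name": "Alderson, Kevin", "address": ""},
    {"name": "Alderson, Joanne", "address": ""},
    {"name": "Ave, John", "address": ""},
    {"name": "Zow, Peter", "address": ""},
]

businesses, individuals = split_at_first_reset(names)
print([d["name"] for d in businesses])
print([d["name"] for d in individuals])
```

With this data the split falls between "Zone Fitness" and "Alderson, Kevin", giving three businesses and four individuals.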
