I can currently remove duplicates from a list of dicts when the dicts are flat, i.e. there is no key wrapping a nested dictionary. An example of a list of dicts that works with this function is:
[{'asndb_prefix': '164.39.xxx.0/17',
'cidr': '164.39.xxx.0/17',
'cymru_asn': 'XXX',
'cymru_country': 'GB',
'cymru_owner': 'XXX , GB',
'cymru_prefix': '164.39.xxx.0/17',
'ips': ['164.39.xxx.xxx'],
'network_id': '164.39.xxx.xxx/24',},
{'asndb_prefix': '54.192.xxx.xxx/16',
'cidr': '54.192.0.0/16',
'cymru_asn': '16509',
'cymru_country': 'US',
'cymru_owner': 'AMAZON-02 - Amazon.com, Inc., US',
'cymru_prefix': '54.192.144.0/22',
'ips': ['54.192.xxx.xxx', '54.192.xxx.xxx'],
'network_id': '54.192.xxx.xxx/24',
}]
def remove_dict_duplicates(list_of_dicts):
    """
    Remove duplicates in dict
    """
    list_of_dicts = [dict(t) for t in set([tuple(d.items()) for d in list_of_dicts])]
    # remove the {} before and after - not sure why these are placed as
    # the first and last element
    return list_of_dicts[1:-1]
However, I would like to remove duplicates based on both the key and all of the values in the nested dictionary. If the same key appears with different values inside, I do not want to remove it; if an entry is a complete copy of another, it should be removed. For example:
[{'50.16.xxx.0/24': {'asndb_prefix': '50.16.0.0/16',
'cidr': '50.16.0.0/14',
'cymru_asn': 'xxxx',
'cymru_country': 'US',
'cymru_owner': 'AMAZON-AES - Amazon.com, Inc., US',
'cymru_prefix': '50.16.0.0/16',
'ip': '50.16.221.xxx',
'network_id': '50.16.xxx.0/24',
'pyasn_asn': xxxx,
'whois_asn': 'xxxx'}},
# This would be removed
{'50.16.xxx.0/24': {'asndb_prefix': '50.16.0.0/16',
'cidr': '50.16.0.0/14',
'cymru_asn': 'xxxxx',
'cymru_country': 'US',
'cymru_owner': 'AMAZON-AES - Amazon.com, Inc., US',
'cymru_prefix': '50.16.0.0/16',
'ip': '50.16.221.xxx',
'network_id': '50.16.xxx.0/24',
'pyasn_asn': xxxx,
'whois_asn': 'xxxx'}},
# This would NOT be removed
{'50.16.xxx.0/24': {'asndb_prefix': '50.999.0.0/16',
'cidr': '50.999.0.0/14',
'cymru_asn': 'xxxx',
'cymru_country': 'US',
'cymru_owner': 'AMAZON-AES - Amazon.com, Inc., US',
'cymru_prefix': '50.16.0.0/16',
'ip': '50.16.221.xxx',
'network_id': '50.16.xxx.0/24',
'pyasn_asn': xxxx,
'whois_asn': 'xxxx'}}]
How do I go about doing this? Thank you.
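For reference, here is a minimal sketch of one way this could be approached. It is not taken from any answer; the names `freeze` and `remove_nested_dict_duplicates` are made up for illustration. The idea is to turn each dict, including any nested dicts and lists, into a hashable "fingerprint" and keep only the first entry seen for each fingerprint. It assumes that every leaf value (strings, numbers) is itself hashable, and it treats lists and tuples with the same items as equal. The sample data is shortened from the example above.

```python
def freeze(value):
    """Recursively convert dicts, lists, and sets into hashable equivalents."""
    if isinstance(value, dict):
        # frozenset ignores key order; the 'dict' tag keeps a dict from
        # colliding with a set that happens to hold the same pairs
        return ('dict', frozenset((key, freeze(val)) for key, val in value.items()))
    if isinstance(value, (list, tuple)):
        return tuple(freeze(item) for item in value)
    if isinstance(value, set):
        return frozenset(freeze(item) for item in value)
    # assumed to be hashable already (str, int, None, ...)
    return value


def remove_nested_dict_duplicates(list_of_dicts):
    """Return a new list without exact duplicates, keeping first occurrences in order."""
    seen = set()
    unique = []
    for d in list_of_dicts:
        fingerprint = freeze(d)
        if fingerprint not in seen:
            seen.add(fingerprint)
            unique.append(d)
    return unique


records = [
    {'50.16.xxx.0/24': {'cidr': '50.16.0.0/14', 'ip': '50.16.221.xxx'}},
    {'50.16.xxx.0/24': {'cidr': '50.16.0.0/14', 'ip': '50.16.221.xxx'}},   # exact copy
    {'50.16.xxx.0/24': {'cidr': '50.999.0.0/14', 'ip': '50.16.221.xxx'}},  # same key, different values
]
result = remove_nested_dict_duplicates(records)
```

With this sketch the exact copy is dropped and the entry with the same key but different values is kept. Because the fingerprint handles lists, the same function should also cope with flat dicts whose values are lists (such as `'ips'`), which `tuple(d.items())` inside a `set` cannot hash.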