I'm parsing a JSON file that kinda looks like this:
[{"acc":"P1","Lenght":855,...,"MBDB-1":{"source_id":"2btp_A","regions":[[70,73],[231,234]],"content_fraction":0.033,"content_count":8},"MBDB-2":{...},"MDB-2":{...}},
{"acc":"P2","Lenght":145,...,"MBDB-14":{...},...}]
And I'm trying to generate a dictionary with only the information that I want (i.e., "acc" and "Lenght") plus all the information INSIDE the keys that start with "MBDB", no matter what comes after that (the actual file is huge, with a lot of information that I don't really need).
For the first two items, it's fairly easy. This is what I got:
import json

my_dict = dict.fromkeys(['ID', 'MISSING', 'LENGHT'])
with open("...mypathJson1.json") as f:
    data = json.loads(f.read())
for i in data:
    if "acc" in i:
        my_dict["ID"] = i["acc"]
But I'm really lost on how to append each of the values of "MBDB-something" to the MISSING key. As far as I understand, I can't use startswith(), because I'm working with a dict (generated by json.loads()).
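On the startswith() point: the keys of a dict produced by json.loads() are ordinary Python strings, so str.startswith can be applied to each key while iterating over the dict. Below is a minimal sketch, assuming a record shaped like the sample at the top. The values in `record` (apart from those shown in the sample) and the name `mbdb_entries` are made up for illustration.

```python
# Hypothetical record modelled on the sample above; "example_B" and the
# numbers in the second and third entries are placeholders, not real data.
record = {
    "acc": "P1",
    "Lenght": 855,
    "MBDB-1": {"source_id": "2btp_A", "regions": [[70, 73], [231, 234]],
               "content_fraction": 0.033, "content_count": 8},
    "MBDB-2": {"source_id": "example_B", "regions": [[5, 9]],
               "content_fraction": 0.01, "content_count": 5},
    "MDB-2": {"source_id": "ignored", "regions": [],
              "content_fraction": 0.0, "content_count": 0},
}

# Each key is a str, so startswith() works on the key, not on the dict itself.
mbdb_entries = {
    key: value for key, value in record.items() if key.startswith("MBDB")
}

print(list(mbdb_entries))  # ['MBDB-1', 'MBDB-2']
```

The "MDB-2" key is skipped because it does not begin with the "MBDB" prefix.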
This is what the result should look like:
   ID  LENGHT  source_id  regions              content_count
0  P1  855     2btp_A     [[70,73],[231,234]]  8
1  P1  855     ...        [...]                #
2  P2  145     ...        [...]                #
So I can later use .explode and perform different operations on some of the information that these keys hold.
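One possible way to get rows in that shape is to emit one flat dict per "MBDB" key of each record. This is only a sketch under assumptions: the helper name `flatten_records`, its `prefix` parameter, and the contents of `sample` (beyond the values shown in the JSON excerpt) are hypothetical, and the real file may contain fields this ignores.

```python
def flatten_records(data, prefix="MBDB"):
    """Return one flat row per key starting with `prefix` in each record."""
    rows = []
    for record in data:
        for key, entry in record.items():
            # Skip "acc", "Lenght", "MDB-..." and anything else without the prefix.
            if not key.startswith(prefix):
                continue
            rows.append({
                "ID": record.get("acc"),
                "LENGHT": record.get("Lenght"),  # key spelled as in the source JSON
                "source_id": entry.get("source_id"),
                "regions": entry.get("regions"),
                "content_count": entry.get("content_count"),
            })
    return rows


# Hypothetical data in the same shape as the excerpt; only the "MBDB-1"
# values for P1 come from the question, the rest are placeholders.
sample = [
    {"acc": "P1", "Lenght": 855,
     "MBDB-1": {"source_id": "2btp_A", "regions": [[70, 73], [231, 234]],
                "content_fraction": 0.033, "content_count": 8},
     "MBDB-2": {"source_id": "example_B", "regions": [[5, 9]],
                "content_fraction": 0.01, "content_count": 5},
     "MDB-2": {"source_id": "ignored", "regions": [],
               "content_fraction": 0.0, "content_count": 0}},
    {"acc": "P2", "Lenght": 145,
     "MBDB-14": {"source_id": "example_C", "regions": [[1, 4]],
                 "content_fraction": 0.02, "content_count": 4}},
]

rows = flatten_records(sample)
for row in rows:
    print(row)
```

If pandas is available, passing `rows` to `pandas.DataFrame` should give a table with these columns, and `DataFrame.explode("regions")` can then split each list of regions into separate rows.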
I feel that I'm out of my league to solve this issue, so any advice is welcome!
EDIT: I've edited the desired output to be the content of the different keys INSIDE all the "MBDB" keys.
question from:
https://stackoverflow.com/questions/65890539/trying-to-group-different-values-that-have-some-similarities-in-a-dictionary