Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

python - how to split my string into a nested dict with delimiter exceptions?

I need to split a string with .split and build a nested dictionary from it. I have been splitting on ',', but as the data below shows, commas also appear inside the 'Reviewed' field, so Python ends up treating values as keys. The 'Reviewed' field is meant to be a list within the dict.

A sample of my data can be seen below:

{"Username": "bkpn1412", "DOB": "31.07.1983", "State": "Oregon", "Reviewed": ["cea76118f6a9110a893de2b7654319c0"]}
{"Username": "gqjs4414", "DOB": "27.07.1998", "State": "Massachusetts", "Reviewed": ["fa04fe6c0dd5189f54fe600838da43d3"]}
{"Username": "eehe1434", "DOB": "08.08.1950", "State": "Idaho", "Reviewed": []}
{"Username": "hkxj1334", "DOB": "03.08.1969", "State": "Florida", "Reviewed": ["f129b1803f447c2b1ce43508fb822810", "3b0c9bc0be65a3461893488314236116"]}
{"Username": "jjbd1412", "DOB": "26.07.2001", "State": "Georgia", "Reviewed": []}

My current code:

#converting list to string using list comprehension
pdict = ' '.join([str(item) for item in products_list]) 
print(type(pdict))

rdict = ' '.join([str(item) for item in reviewers_list]) 
print(type(rdict))

#converting string to list of string
plist  = pdict.split(',')
rlist = rdict.split(',')
print(type(plist))
print(type(rlist))

#list of string to dict
products_dicts = {}
for item in plist:
    t = products_dicts
    for part in item.split(':'):
        t = t.setdefault(part, {})
print(type(products_dicts))

reviewers_dicts = {}
for item in rlist:
    t = reviewers_dicts
    for part in item.split(':'):
        t = t.setdefault(part, {})
print(type(reviewers_dicts))

I've tried different delimiters, but none of them worked. How can I get around this issue, preferably without going through a large data set and removing the extra commas by hand?

Expected output should be similar to this:

{"Username": "bkpn1412",
"DOB": "31.07.1983",
"State": "Oregon",
"Reviewed": ["cea76118f6a9110a893de2b7654319c0"]}

{"Username": "hkxj1334",
"DOB": "03.08.1969",
"State": "Florida" ,
"Reviewed": ["f129b1803f447c2b1ce43508fb822810", "3b0c9bc0be65a3461893488314236116"]}

1 Reply

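Each line of your data is already a valid JSON object, so you do not need to split on commas yourself. You can parse each line with json.loads, a function from the json module in Python's standard library.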
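As a minimal sketch of why this works where splitting on commas does not, the snippet below parses one of the sample lines directly. The variable names line, pieces and record are only illustrative.

```python
import json

# One record from the sample data, held as a plain string.
line = '{"Username": "hkxj1334", "DOB": "03.08.1969", "State": "Florida", "Reviewed": ["f129b1803f447c2b1ce43508fb822810", "3b0c9bc0be65a3461893488314236116"]}'

# Splitting on ',' also cuts inside the "Reviewed" list,
# so there are more pieces than there are fields.
pieces = line.split(',')
print(len(pieces))

# json.loads understands the nesting and returns a dict
# whose "Reviewed" value is a real Python list.
record = json.loads(line)
print(type(record))
print(record["Reviewed"])
```

The record has four fields, but split(',') produces five pieces, because the comma between the two review IDs is treated like any other.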

Suppose you have a file with your input data:

inputdata.txt

{"Username": "bkpn1412", "DOB": "31.07.1983", "State": "Oregon", "Reviewed": ["cea76118f6a9110a893de2b7654319c0"]}
{"Username": "gqjs4414", "DOB": "27.07.1998", "State": "Massachusetts", "Reviewed": ["fa04fe6c0dd5189f54fe600838da43d3"]}
{"Username": "eehe1434", "DOB": "08.08.1950", "State": "Idaho", "Reviewed": []}
{"Username": "hkxj1334", "DOB": "03.08.1969", "State": "Florida", "Reviewed": ["f129b1803f447c2b1ce43508fb822810", "3b0c9bc0be65a3461893488314236116"]}
{"Username": "jjbd1412", "DOB": "26.07.2001", "State": "Georgia", "Reviewed": []}

A parser for this data could look like this:

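import json

filename = "inputdata.txt"
with open(filename) as f:
    # readlines() reads the whole file into a list of lines
    for line in f.readlines():
        # each line is one JSON object, so it parses to a dict
        parsed_data = json.loads(line)
        print(parsed_data)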
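If the goal is one nested dictionary rather than a series of separate dicts, the parsed records can be stored under a key of your choice. The sketch below assumes that usernames are unique and uses them as the outer keys. The names lines and reviewers are hypothetical, and the list stands in for the lines of inputdata.txt.

```python
import json

# Hypothetical stand-in for the lines of inputdata.txt.
lines = [
    '{"Username": "bkpn1412", "DOB": "31.07.1983", "State": "Oregon", "Reviewed": ["cea76118f6a9110a893de2b7654319c0"]}',
    '{"Username": "eehe1434", "DOB": "08.08.1950", "State": "Idaho", "Reviewed": []}',
]

# Assumption: usernames are unique, so they can serve as outer keys.
reviewers = {}
for line in lines:
    record = json.loads(line)
    reviewers[record["Username"]] = record

print(reviewers["bkpn1412"]["State"])
print(reviewers["eehe1434"]["Reviewed"])
```

If usernames can repeat, a later record would overwrite an earlier one here, so a plain list of dicts may be the safer structure.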

Working with one line at a time (without loading the whole file into memory).

If you would rather not load the whole file into memory before processing it, you can change the logic to use readline, a method of the file object returned by open.

import json

filename = "inputdata.txt"
with open(filename) as f:
    # readline() returns one line per call and '' at end of file
    line = f.readline()
    while line:
        parsed_data = json.loads(line)
        print(parsed_data)
        line = f.readline()
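A file object is itself an iterator over its lines, so the same one-line-at-a-time behaviour can be written as a plain for loop. The sketch below wraps that in a small generator. iter_records is a hypothetical helper name, and io.StringIO stands in for a real open file so that the example runs on its own. It also skips blank lines, which json.loads would otherwise reject.

```python
import io
import json


def iter_records(lines):
    """Yield one parsed dict per non-blank line (hypothetical helper)."""
    for line in lines:
        line = line.strip()
        if not line:
            # json.loads raises an error on an empty string, so skip blanks.
            continue
        yield json.loads(line)


# io.StringIO stands in for an open file. With a real file you would write:
#     with open("inputdata.txt") as f:
#         for record in iter_records(f):
#             ...
sample = io.StringIO(
    '{"Username": "jjbd1412", "DOB": "26.07.2001", "State": "Georgia", "Reviewed": []}\n'
    '\n'
    '{"Username": "gqjs4414", "DOB": "27.07.1998", "State": "Massachusetts", "Reviewed": ["fa04fe6c0dd5189f54fe600838da43d3"]}\n'
)

records = list(iter_records(sample))
for record in records:
    print(record["Username"])
```

Because iter_records is a generator, only one line is parsed at a time. The call to list() is there only to make the result easy to inspect.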

These examples open the file with a with statement, which acts as a context manager and closes the file automatically when the block exits, even if an exception is raised. If you do not use with, you have to call close() explicitly once you are done with the file, to avoid leaking the resource.
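For comparison, here is a rough sketch of the version without with. The temporary file is created only so that the snippet runs on its own, and the names tmp, f and records are illustrative.

```python
import json
import os
import tempfile

# Write a throwaway sample file so the sketch runs on its own.
tmp = tempfile.NamedTemporaryFile(mode="w", suffix=".txt", delete=False)
tmp.write('{"Username": "eehe1434", "DOB": "08.08.1950", "State": "Idaho", "Reviewed": []}\n')
tmp.close()

# Without "with": open the file, and close it in "finally" so that
# close() runs even if json.loads raises on a malformed line.
f = open(tmp.name)
try:
    records = [json.loads(line) for line in f]
finally:
    f.close()

print(f.closed)
print(records[0]["State"])

os.remove(tmp.name)  # clean up the sample file
```

The try/finally pair is roughly what the with statement does for you, which is why the with form is usually preferred.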

To learn more about file handling, see the official Python documentation for the built-in open function.

