I'm parsing files that contain JSON objects. The problem is that some files have multiple objects on one line, with no separator between them, e.g.:
{"data1": {"data1_inside": "bla{bl\"a"}}{"data1": {"data1_inside": "blabla["}}{"data1": {"data1_inside": "bla{bla"}}{"data1": {"data1_inside": "bla["}}
I wrote a function that cuts out a substring whenever the count of open curly brackets drops back to zero and then tries to parse it, but values can contain curly brackets too. I tried skipping string values by tracking where quotes open and close, but some values also contain escaped quotes. Any ideas on how to deal with this?
My attempt:
def get_lines(data):
    lines = []
    open_brackets = 0
    start = 0
    is_comment = False
    for index, c in enumerate(data):
        if c == '"':
            is_comment = not is_comment
        elif not is_comment:
            if c == '{':
                if not open_brackets:
                    start = index
                open_brackets += 1
            if c == '}':
                open_brackets -= 1
                if not open_brackets:
                    lines.append(data[start: index+1])
    return lines
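For comparison, one possible way around hand-tracking quotes and escapes is to let the standard library's JSON parser find the end of each object. The sketch below assumes every line consists only of valid JSON values placed back to back; the function name `split_json_objects` is made up for illustration. It relies on `json.JSONDecoder.raw_decode`, which parses one value starting at a given index and returns it together with the index where parsing stopped.

```python
import json


def split_json_objects(data):
    """Yield each top-level JSON value found back to back in `data`.

    Hypothetical helper: assumes `data` holds only valid JSON values,
    optionally separated by whitespace.
    """
    decoder = json.JSONDecoder()
    pos = 0
    length = len(data)
    while pos < length:
        # raw_decode does not skip leading whitespace, so skip it by hand.
        while pos < length and data[pos].isspace():
            pos += 1
        if pos >= length:
            break
        # Returns the parsed value and the index just past its end.
        # Braces, brackets and escaped quotes inside strings are handled
        # by the parser itself.
        obj, pos = decoder.raw_decode(data, pos)
        yield obj


line = r'{"data1": {"data1_inside": "bla{bl\"a"}}{"data1": {"data1_inside": "blabla["}}'
for obj in split_json_objects(line):
    print(obj)
```

If the raw substrings are needed rather than the parsed objects, the position before each `raw_decode` call can be kept and `data[start:pos]` yielded instead. Malformed input raises `json.JSONDecodeError`, which the caller would have to handle.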