Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


python - How to transform JSON SList to pandas dataframe?

a = ['{"type": "book",', 
     '"title": "sometitle",', 
     '"author": [{"name": "somename"}],', 
     '"year": "2000",', 
     '"identifier": [{"type": "ISBN", "id": "1234567890"}],', 
     '"publisher": "somepublisher"}', '',
     '{"type": "book",', 
     '"title": "sometitle2",', 
     '"author": [{"name": "somename2"}],', 
     '"year": "2001",', 
     '"identifier": [{"type": "ISBN", "id": "1234567890"}],', 
     '"publisher": "somepublisher"}', '']

I have this convoluted SList and I would like to ultimately get it into a tidy pandas dataframe.

I have tried a number of things, for example:

i = iter(a)
b = dict(zip(i, i))

Unfortunately, this creates a dictionary that looks even worse:

{'{"type": "book",':
...

Where I had an SList of dictionaries, I now have a dictionary of dictionaries.

I also tried

pd.json_normalize(a)

but this throws an error: AttributeError: 'str' object has no attribute 'values'

I also tried

r = json.dumps(a.l)
loaded_r = json.loads(r)
print(loaded_r)

but this yields a list

['{"type": "book",',
...
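A short diagnostic may help explain why all three attempts fail: each element of the list is a fragment of JSON text, not a parsed object. The sketch below is only an illustration. It assumes a plain Python list as a shortened stand-in for the SList (the real data has more fields), and the variable names are made up for the example.

```python
import json

# Shortened stand-in for the SList in the question (assumption: same shape,
# fewer fields, plain list instead of an IPython SList).
a = ['{"type": "book",', '"title": "sometitle"}', '',
     '{"type": "book",', '"title": "sometitle2"}', '']

# Every element is a str holding part of a JSON document, never a dict.
all_strings = all(isinstance(item, str) for item in a)

# zip(i, i) pairs neighbouring fragments as key and value; nothing is parsed.
i = iter(a)
pairs = dict(zip(i, i))

# dumps followed by loads is a round trip, so the same list of strings returns.
round_trip = json.loads(json.dumps(a))

# A single fragment is not valid JSON on its own.
try:
    json.loads(a[0])
    fragment_parses = True
except json.JSONDecodeError:
    fragment_parses = False

print(all_strings, round_trip == a, fragment_parses)
```

So the fragments have to be joined back into complete JSON text before any pandas function can use them.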

Again, in the end I'd like to have a pandas dataframe like this

type   title        author      year ...

book   sometitle    somename    2000 ...
book   sometitle2   somename2   2001

Obviously, I haven't really gotten to the point where I can feed the data to a pandas function. Every time I did that, the functions screamed at me...

Question from: https://stackoverflow.com/questions/65713072/how-to-transform-json-slist-to-pandas-dataframe


1 Reply

a = ['{"type": "book",', 
     '"title": "sometitle",', 
     '"author": [{"name": "somename"}],', 
     '"year": "2000",', 
     '"identifier": [{"type": "ISBN", "id": "1234567890"}],', 
     '"publisher": "somepublisher"}', '',
     '{"type": "book",', 
     '"title": "sometitle2",', 
     '"author": [{"name": "somename2"}],', 
     '"year": "2001",', 
     '"identifier": [{"type": "ISBN", "id": "1234567890"}],', 
     '"publisher": "somepublisher"}', '']

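import json
import pandas as pd

# Empty strings separate the records: turn each one into a comma, join all
# fragments, drop the trailing comma, and wrap the result in [] so the whole
# thing is one JSON array.
b = "[%s]" % ''.join([',' if i == '' else i for i in a]).strip(',')
data = json.loads(b)      # list of dicts, one per record
df = pd.DataFrame(data)   # one row per record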

print(df)

   type       title                   author  year  
0  book   sometitle   [{'name': 'somename'}]  2000   
1  book  sometitle2  [{'name': 'somename2'}]  2001   

                               identifier      publisher  
0  [{'type': 'ISBN', 'id': '1234567890'}]  somepublisher  
1  [{'type': 'ISBN', 'id': '1234567890'}]  somepublisher
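The author and identifier columns above still hold lists of dicts, while the question asks for plain values such as somename. One possible way to get there is to flatten each record before building the frame. This is a minimal sketch, not part of the original answer: flatten_record is a hypothetical helper, and joining several authors with ", " and writing identifiers as "ISBN 1234567890" are assumptions about the desired layout.

```python
import pandas as pd

# Records as json.loads returns them in the answer (values from the question).
data = [
    {"type": "book", "title": "sometitle",
     "author": [{"name": "somename"}], "year": "2000",
     "identifier": [{"type": "ISBN", "id": "1234567890"}],
     "publisher": "somepublisher"},
    {"type": "book", "title": "sometitle2",
     "author": [{"name": "somename2"}], "year": "2001",
     "identifier": [{"type": "ISBN", "id": "1234567890"}],
     "publisher": "somepublisher"},
]


def flatten_record(rec):
    """Hypothetical helper: copy a record and replace nested lists with strings."""
    flat = dict(rec)
    # Assumption: several authors are joined into one comma-separated string.
    flat["author"] = ", ".join(a["name"] for a in rec.get("author", []))
    # Assumption: identifiers are rendered as "<type> <id>".
    flat["identifier"] = ", ".join(
        "%s %s" % (x["type"], x["id"]) for x in rec.get("identifier", [])
    )
    return flat


tidy = pd.DataFrame([flatten_record(r) for r in data])
print(tidy[["type", "title", "author", "year"]])
```

If the nested lists can hold more than one entry and each entry should become its own row, pd.json_normalize with record_path may be a better fit than a hand-written helper.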
