Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Python pandas, reorder a csv file based on a column and write to a csv file

df = pd.read_csv("file.csv")
sorted_df = df.sort_values(by = 'index', ascending = False)
sorted_df.to_csv("output.csv", index = False)

index is the name of the column by which I have to sort the csv file. However, I get a KeyError saying the index column cannot be found.

Before sorting:

index;name;result
1;John;Ok
2;Jacob;Ok
6;Philip;Nok
7;Joe;Nok
4;Stanley;Ok
5;Alfred;Ok
3;Jill;Nok

Expected result after sorting:

index;name;result
1;John;Ok
2;Jacob;Ok
3;Jill;Nok
4;Stanley;Ok
5;Alfred;Ok
6;Philip;Nok
7;Joe;Nok

Question from: https://stackoverflow.com/questions/66048621/python-pandas-reorder-a-csv-file-based-on-a-colum-and-write-to-a-csv-file

1 Reply

In pandas, "index" also refers to the row index of a DataFrame, so a column named index is easy to confuse with it. The KeyError itself most likely comes from the separator: your file is separated by semicolons, but read_csv splits on commas unless told otherwise, so the whole header is read as a single column called index;name;result and no column named index exists. Likewise, index = False in to_csv only tells pandas not to export the DataFrame index; it does not affect a column named index.
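A minimal sketch of how the separator can produce this KeyError. The data is held in a string (CSV_TEXT, a name made up for this demo) instead of a file, and only the first rows of the question's data are used:

```python
import io

import pandas as pd

# Same layout as the file in the question, kept in memory for the demo.
CSV_TEXT = "index;name;result\n1;John;Ok\n2;Jacob;Ok\n6;Philip;Nok\n"

# The default separator is a comma, so every line is read as one field.
df_default = pd.read_csv(io.StringIO(CSV_TEXT))
print(list(df_default.columns))

try:
    df_default.sort_values(by="index")
    sort_failed = False
except KeyError:
    # No column is literally called "index", so the lookup fails.
    sort_failed = True
print("sort failed:", sort_failed)

# With the separator given explicitly, the three columns appear.
df_semicolon = pd.read_csv(io.StringIO(CSV_TEXT), sep=";")
print(list(df_semicolon.columns))
```

Printing df.columns right after read_csv is a quick way to check whether the header was split as intended.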

Let's take your data in a csv file separated by semicolons:

index;name;result
1;John;Ok
2;Jacob;Ok
6;Philip;Nok
7;Joe;Nok
4;Stanley;Ok
5;Alfred;Ok

To show you the difference, I can read it directly with 3 data columns

import pandas as pd
df = pd.read_csv("/Users/aortner/Desktop/todelete.csv", delimiter=';')
print(df)

   index     name result
0      1     John     Ok
1      2    Jacob     Ok
2      6   Philip    Nok
3      7      Joe    Nok
4      4  Stanley     Ok
5      5   Alfred     Ok

Or I can use the first column of the csv file as the index of the pandas dataframe by specifying index_col=:

import pandas as pd
df = pd.read_csv("/Users/aortner/Desktop/todelete.csv", delimiter=';', index_col="index")
print(df)

          name result
index                
1         John     Ok
2        Jacob     Ok
6       Philip    Nok
7          Joe    Nok
4      Stanley     Ok
5       Alfred     Ok

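This can be sorted by the index. Note that ascending = False sorts in descending order; for the ascending order shown in your expected result, use ascending = True (the default) instead: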

sorted_df = df.sort_values(by = 'index', ascending = False)
print(sorted_df)

          name result
index                
7          Joe    Nok
6       Philip    Nok
5       Alfred     Ok
4      Stanley     Ok
2        Jacob     Ok
1         John     Ok
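When the column has been loaded as the DataFrame index, sort_index() is another way to get the same ordering without passing a name. A small sketch, again with the data in a string (CSV_TEXT is a made-up name) rather than a file on disk:

```python
import io

import pandas as pd

# The six rows used above, kept in memory for the demo.
CSV_TEXT = (
    "index;name;result\n"
    "1;John;Ok\n2;Jacob;Ok\n6;Philip;Nok\n7;Joe;Nok\n"
    "4;Stanley;Ok\n5;Alfred;Ok\n"
)

# Load the "index" column as the DataFrame index.
df_indexed = pd.read_csv(io.StringIO(CSV_TEXT), sep=";", index_col="index")

# sort_index() sorts by the row index; ascending order is the default.
ascending_df = df_indexed.sort_index()
descending_df = df_indexed.sort_index(ascending=False)

print(ascending_df)
print(descending_df)
```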

and exported without the index column (to_csv writes commas by default; pass sep=';' to keep semicolons):

sorted_df.to_csv("output.csv", index = False)
!cat output.csv

name,result
Joe,Nok
Philip,Nok
Alfred,Ok
Stanley,Ok
Jacob,Ok
John,Ok

or with the index column:

sorted_df.to_csv("output.csv", index = True)
!cat output.csv

index,name,result
7,Joe,Nok
6,Philip,Nok
5,Alfred,Ok
4,Stanley,Ok
2,Jacob,Ok
1,John,Ok
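Putting the pieces together for the seven-row file in the question, a minimal sketch could look like this. It assumes the goal is ascending order with semicolons kept in the output; sort_csv_text is a hypothetical helper name, and the data is held in a string so the example runs without files on disk:

```python
import io

import pandas as pd


def sort_csv_text(csv_text, column="index", sep=";"):
    """Sort delimited text by one column and return it in the same format."""
    # Read with the explicit separator so "index" becomes a regular column.
    df = pd.read_csv(io.StringIO(csv_text), sep=sep)
    sorted_df = df.sort_values(by=column, ascending=True)
    # index=False drops the automatic 0..n-1 row index, not the "index" column.
    return sorted_df.to_csv(sep=sep, index=False)


# The unsorted data from the question.
QUESTION_CSV = (
    "index;name;result\n"
    "1;John;Ok\n2;Jacob;Ok\n6;Philip;Nok\n7;Joe;Nok\n"
    "4;Stanley;Ok\n5;Alfred;Ok\n3;Jill;Nok\n"
)

result_text = sort_csv_text(QUESTION_CSV)
print(result_text)
```

With real files, the same sep=';' argument can be passed to pd.read_csv("file.csv", sep=';') and sorted_df.to_csv("output.csv", sep=';', index=False).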

Hope that solves your issue.

