Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


0 votes
97 views
in Technique[技术] by (71.8m points)

python - print the unique values in every column in a pandas dataframe

I have a dataframe (df) and want to print the unique values from each column in the dataframe.

I need to substitute the loop variable (the column name) into the print statement:

column_list = df.columns.values.tolist()
for column_name in column_list:
    print(df."[column_name]".unique()

Update

When I use the code below, I get "Unexpected EOF Parsing" with no extra details:

column_list = sorted_data.columns.values.tolist()
for column_name in column_list:
      print(sorted_data[column_name].unique()

What is the difference between your syntax, YS-L, and the version below?

for column_name in sorted_data:
    print(column_name)
    s = sorted_data[column_name].unique()
    for i in s:
        print(str(i))

Question from: https://stackoverflow.com/questions/27241253/print-the-unique-values-in-every-column-in-a-pandas-dataframe


1 Reply

0 votes
by (71.8m points)

It can be written more concisely like this:

for col in df:
    print(df[col].unique())
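
For anyone who wants to try this out, here is a minimal runnable sketch. The sample DataFrame and its column names are made up for illustration and are not from the question. It also shows that iterating over the DataFrame directly visits the same column labels as the df.columns.values.tolist() loop in the question, so the two forms differ only in verbosity. Note that both print( lines in the question are missing their closing parenthesis, which is what triggers the "unexpected EOF" syntax error.

```python
import pandas as pd

# Hypothetical sample data; the column names are invented for this sketch.
df = pd.DataFrame({
    "city": ["Oslo", "Rome", "Oslo"],
    "year": [2019, 2020, 2019],
})

# Iterating over a DataFrame yields its column labels, so this loop visits
# the same names as looping over df.columns.values.tolist().
unique_by_column = {}
for col in df:
    unique_by_column[col] = df[col].unique()  # note the balanced parentheses
    print(col, unique_by_column[col])
```

Collecting the results in a dict is optional; it is only there so the unique values can be reused after the loop.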

Generally, you can access a column of a DataFrame either by indexing with the [] operator (e.g. df['col']) or by attribute access (e.g. df.col).
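
A small sketch of the two access styles, assuming a one-column DataFrame invented for illustration. It also shows why the loop needs the [] form: only brackets accept a column name held in a variable, which is what the question's df."[column_name]" attempt was reaching for.

```python
import pandas as pd

# Hypothetical one-column frame; "col" is just a placeholder name.
df = pd.DataFrame({"col": [1, 2, 2, 3]})

by_brackets = df["col"]   # indexing with the [] operator
by_attribute = df.col     # attribute access

# Both forms return the same Series.
print(by_brackets.equals(by_attribute))

# Only the [] form works when the column name is stored in a variable.
column_name = "col"
print(df[column_name].unique())
```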

Attribute access makes the code a bit more concise when the target column name is known beforehand, but it has several caveats: for example, it does not work when the column name is not a valid Python identifier (e.g. df.123) or when it clashes with a built-in DataFrame attribute (e.g. df.index). The [] notation, on the other hand, should always work.
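
The caveats can be reproduced with a deliberately awkward DataFrame. The column names "index" and "123" below are chosen only to trigger the two problems mentioned and are not from the question.

```python
import pandas as pd

# Hypothetical frame whose column names are picked to cause trouble.
df = pd.DataFrame({
    "index": ["a", "b", "a"],
    "123": [1, 1, 2],
})

# df.index is the DataFrame's row index, not the column named "index".
print(type(df.index).__name__)

# Bracket indexing reaches the column regardless of its name.
print(df["index"].unique())

# df.123 would be a syntax error, so brackets are the only option here.
print(df["123"].unique())
```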

