I am trying to filter out the dataframe that contains a list of product. However, I am getting the pandas - 'dataframe' object has no attribute 'str' error whenever I run the code.
Here is the line of code:
include_clique = log_df.loc[log_df['Product'].str.contains("Product A")]
If anyone has any ideas of suggestions, please let me know. I've searched many times and I'm quite stuck.
Product is an object datatype.
EDIT:
import __future__
import os
import pandas as pd
import numpy as np
import tensorflow as tf
import math
data = pd.read_csv("FILE.csv", header = None)
headerName=["DRID","Product","M24","M23","M22","M21","M20","M19","M18","M17","M16","M15","M14","M13","M12","M11","M10","M9","M8","M7","M6","M5","M4","M3","M2","M1"]
cliques = [(Confidential)]
data.columns=[headerName]
log_df = data
log_df = np.log(1+data[["M24","M23","M22","M21","M20","M19","M18","M17","M16","M15","M14","M13","M12","M11","M10","M9","M8","M7","M6","M5","M4","M3","M2","M1"]])
copy = data[["DRID","Product"]].copy()
log_df = copy.join(log_df)
include_clique = log_df.loc[log_df['Product'].str.contains("Product A")]
Here is the head:
ID PRODUCT M24 M23 M22 M21
0 123421 A 0.000000 0.000000 1.098612 0.0
1 141840 A 0.693147 1.098612 0.000000 0.0
2 212006 A 0.693147 0.000000 0.000000 0.0
3 216097 A 1.098612 0.000000 0.000000 0.0
4 219517 A 1.098612 0.693147 1.098612 0.0
edit 2: here is print(data), A is the product. it looks like A is not under the category product when I print it out.
DRID Product M24 M23 M22 M21 M20
0 52250 A 0.0 0.0 2.0 0.0 0.0
1 141840 A 1.0 2.0 0.0 0.0 0.0
2 212006 A 1.0 0.0 0.0 0.0 0.0
3 216097 A 2.0 0.0 0.0 0.0 0.0
See Question&Answers more detail:
os