Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
471 views
in Technique[技术] by (71.8m points)

python - Balanced Random Forest Classifier Producing 100% accuracy/F1. Unsure what is wrong

I am working on the Kaggle dataset, accessible below.

https://www.kaggle.com/anmolkumar/health-insurance-cross-sell-prediction

Given the imbalance of the data, I am running a balanced random forest classifier. However, the below code is giving me 100% accuracy, recall and precision and so is certainly incorrect.

from pandas import pd
from imblearn.ensemble import BalancedRandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.metrics import f1_score
from sklearn.ensemble import RandomForestClassifier

# import data
path = 'raw Data/'
df_train = pd.read_csv(path + 'train.csv')
df_train.head(3)

# Seperate features from label

X = df_train.drop(columns=['Response'])

y = df_train['Response']

# Get dummy varialbes
X = pd.get_dummies(df_train,columns = ['Gender','Region_Code','Vehicle_Age','Vehicle_Damage','Policy_Sales_Channel'],drop_first=True)

# #Split data into 3: 60% train, 20% validation, 20% test
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, stratify=y)

# Run model
brfc = BalancedRandomForestClassifier(n_estimators=500,random_state=0).fit(X_train,y_train)
print("F1 Score for Balanced Random Forest Classifier is ", f1_score(y_test,brfc.predict(X_test)))
question from:https://stackoverflow.com/questions/66064046/balanced-random-forest-classifier-producing-100-accuracy-f1-unsure-what-is-wro

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)
Waitting for answers

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...