python - ValueError: Classification metrics can't handle a mix of multilabel-indicator and continuous-multioutput targets sklearn

Question

posted Jan 29, 2021 in Technique by 深蓝 (71.8m points)

I am using a random forest to predict which of 5 classes each of my samples belongs to. However, after making the predictions I cannot evaluate my model: the metric functions raise the error in the title. I saw in another post that I should use np.argmax(y_pred, axis=1), but I don't understand what it does, how to use it, or whether it is even needed in my case. Can you please help me?

import numpy as np
import pandas as pd
from sklearn import metrics
from keras.utils import to_categorical
import sklearn as sk
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report, confusion_matrix, accuracy_score

X = pd.read_csv('/Users/lottie/desktop/1.csv', header=None)
Y = pd.read_csv('/Users/lottie/desktop/2.csv', header=None)

X.drop([0,0], inplace=True)
Y.drop([0,0], inplace=True)
del X[0]
del Y[0]

Y_encoded = list()
for i in Y.loc[0:,1] :
    if i == 'BRCA' : Y_encoded.append(0)
    if i == 'KIRC' : Y_encoded.append(1)
    if i == 'COAD' : Y_encoded.append(2)
    if i == 'LUAD' : Y_encoded.append(3)
    if i == 'PRAD' : Y_encoded.append(4)
Y_bis = to_categorical(Y_encoded)


X_train, X_test, y_train, y_test = train_test_split(X, Y_bis, test_size=0.30, random_state=42)

regressor = RandomForestRegressor(n_estimators=20, random_state=0)
regressor.fit(X_train, y_train)
y_pred = regressor.predict(X_test)


print(confusion_matrix(y_test,y_pred))
print(classification_report(y_test,y_pred))
print(accuracy_score(y_test, y_pred))

1 Reply

深蓝 · Answer 1 · 2021-01-29T04:28:24+0000

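You are using RandomForestRegressor. That model is meant for continuous targets (such as the price of a house), and your target is not continuous: it is a set of classes. Because the regressor returns floating-point values for each of the one-hot columns, scikit-learn treats y_pred as continuous-multioutput and y_test as multilabel-indicator, and the classification metrics refuse to compare the two.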
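To see where the two names in the error message come from, you can ask scikit-learn how it categorises each array. This is a minimal sketch with small made-up arrays (not your data); it also shows what np.argmax(y_pred, axis=1) does, namely turning each row back into a single class index.

```python
import numpy as np
from sklearn.utils.multiclass import type_of_target

# Hypothetical stand-ins: one-hot true labels and float output from a regressor
y_test_onehot = np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1]])
y_pred_float = np.array([[0.9, 0.1, 0.0], [0.2, 0.7, 0.1], [0.05, 0.15, 0.8]])

true_kind = type_of_target(y_test_onehot)
pred_kind = type_of_target(y_pred_float)
print(true_kind)
print(pred_kind)

# argmax along axis 1 picks the column with the largest value in each row,
# so every sample ends up with one class index instead of a row of scores
y_test_idx = np.argmax(y_test_onehot, axis=1)
y_pred_idx = np.argmax(y_pred_float, axis=1)
print(type_of_target(y_pred_idx))
```

So argmax is a way to patch the output after the fact; it is not needed if the model predicts class numbers in the first place.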

If you have classes, use RandomForestClassifier instead. Encode the target as a single number per sample, one number for each class (the Y_encoded list from your code, without the to_categorical step). Then predict returns the class number directly, and the metrics work.
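Here is a rough sketch of what that could look like. Since I don't have your CSV files, the data below is randomly generated as a stand-in (the feature matrix, sample count, and the class_to_int mapping are my assumptions); only the five class names and the model settings come from your code.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the CSV data (assumption: 200 samples, 10 features)
rng = np.random.default_rng(0)
labels = np.array(["BRCA", "KIRC", "COAD", "LUAD", "PRAD"])
y_names = rng.choice(labels, size=200)

# One integer per class, no one-hot encoding
class_to_int = {name: i for i, name in enumerate(labels)}
y = np.array([class_to_int[name] for name in y_names])

# Shift the features by class index so the toy problem has some signal
X = rng.normal(size=(200, 10)) + y[:, None]

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, random_state=42
)

clf = RandomForestClassifier(n_estimators=20, random_state=0)
clf.fit(X_train, y_train)
y_pred = clf.predict(X_test)  # 1-D array of class numbers

cm = confusion_matrix(y_test, y_pred, labels=list(range(5)))
acc = accuracy_score(y_test, y_pred)
print(cm)
print(classification_report(y_test, y_pred, zero_division=0))
print(acc)
```

Both y_test and y_pred are now 1-D arrays of class numbers, so the three metric calls accept them without any argmax step.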
