python - Scikit-learn cross val score: too many indices for array

Question

Welcome To Ask or Share your Answers For Others

python - Scikit-learn cross val score: too many indices for array

posted Oct 24, 2021 in Technique[技术] by 深蓝 (71.8m points)

python - Scikit-learn cross val score: too many indices for array

I have the following code

 from sklearn.ensemble import ExtraTreesClassifier
 from sklearn.cross_validation import cross_val_score
 #split the dataset for train and test
 combnum['is_train'] = np.random.uniform(0, 1, len(combnum)) <= .75
 train, test = combnum[combnum['is_train']==True], combnum[combnum['is_train']==False]

 et = ExtraTreesClassifier(n_estimators=200, max_depth=None, min_samples_split=10, random_state=0)
 min_samples_split=10, random_state=0  )

 labels = train[list(label_columns)].values
 tlabels = test[list(label_columns)].values

 features = train[list(columns)].values
 tfeatures = test[list(columns)].values

 et_score = cross_val_score(et, features, labels, n_jobs=-1)
 print("{0} -> ET: {1})".format(label_columns, et_score))

Checking the shape of the arrays:

 features.shape
 Out[19]:(43069, 34)

And

labels.shape
Out[20]:(43069, 1)

and I'm getting:

IndexError: too many indices for array

and this relevant part of the traceback:

---> 22 et_score = cross_val_score(et, features, labels, n_jobs=-1)

I'm creating the data from Pandas dataframes and I searched here and saw some reference to possible errors via this method but can't figure out how to correct? What the data arrays look like: features

Out[21]:
array([[ 0.,  1.,  1., ...,  0.,  0.,  1.],
   [ 0.,  1.,  1., ...,  0.,  0.,  1.],
   [ 1.,  1.,  1., ...,  0.,  0.,  1.],
   ..., 
   [ 0.,  0.,  1., ...,  0.,  0.,  1.],
   [ 0.,  0.,  1., ...,  0.,  0.,  1.],
   [ 0.,  0.,  1., ...,  0.,  0.,  1.]])

labels

Out[22]:
array([[1],
   [1],
   [1],
   ..., 
   [1],
   [1],
   [1]])

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙；凝视深渊过久,深渊将回以凝视…

1 Reply

深蓝 · Answer 1 · 2021-10-23T19:02:54+0000

When we do cross validation in scikit-learn, the process requires an (R,) shape label instead of (R,1). Although they are the same thing to some extend, their indexing mechanisms are different. So in your case, just add:

c, r = labels.shape
labels = labels.reshape(c,)

before passing it to the cross-validation function.

Categories

python - Scikit-learn cross val score: too many indices for array

python - Scikit-learn cross val score: too many indices for array

Please log in or register to add a comment.

Please log in or register to reply this article.

1 Reply

Please log in or register to add a comment.

Just Browsing Browsing

Most popular tags