python - scikit-learn random state in splitting dataset

Question

posted Oct 17, 2021 in Technique[技术] by 深蓝 (71.8m points)

Can anyone tell me why we set random_state to zero when splitting the data into train and test sets?

from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, random_state=0)

I have also seen cases like this, where random_state is set to 1:

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, random_state=1)

What effect does random_state have in cross-validation as well?

1 Reply

深蓝 · Answer 1 · 2021-10-17T00:20:52+0000

It doesn't matter whether random_state is 0, 1, or any other integer. What matters is that it is set to the same value every time if you want to validate your processing over multiple runs of the code. By the way, I have seen random_state=42 used in many official scikit-learn examples and elsewhere.

random_state, as the name suggests, is used to initialize the internal random number generator, which in your case decides how the data is split into train and test indices. The documentation states:

If random_state is None or np.random, then a randomly-initialized RandomState object is returned.

If random_state is an integer, then it is used to seed a new RandomState object.

If random_state is a RandomState object, then it is passed through.
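The three cases quoted above can be checked directly. The following is a minimal sketch, assuming the quoted rules are the ones implemented by sklearn.utils.check_random_state (the answer does not name the function, so that mapping is an assumption); the variable names are illustrative only.

```python
import numpy as np
from sklearn.utils import check_random_state

# Integer: a new RandomState is seeded with it, so two generators built
# from the same integer produce the same sequence of draws.
rng_a = check_random_state(0)
rng_b = check_random_state(0)
same_sequence = np.array_equal(rng_a.permutation(10), rng_b.permutation(10))

# RandomState instance: it is passed through, i.e. the same object comes back.
existing = np.random.RandomState(1)
passed_through = check_random_state(existing) is existing

# None: a RandomState that this call does not seed with a fixed value,
# so draws from it are not pinned to any seed chosen here.
unseeded = check_random_state(None)
is_random_state = isinstance(unseeded, np.random.RandomState)

print(same_sequence, passed_through, is_random_state)
```

Only the integer case gives a split that can be repeated on demand, which is why examples pass a fixed number.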

The purpose is reproducibility when running the code multiple times. Setting random_state to a fixed value guarantees that the same sequence of random numbers is generated each time you run the code. Unless some other source of randomness is present in the process, the results will be the same on every run, which helps in verifying the output.
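As a rough illustration of that reproducibility, the sketch below splits a small toy dataset twice with the same seed and compares the results. The toy arrays and the helper name split are assumptions made for the example. The last part applies the same idea to a shuffled KFold splitter, since the question also asks about cross-validation.

```python
import numpy as np
from sklearn.model_selection import KFold, train_test_split

# Toy data, used only for illustration: 10 samples with 2 features each.
X = np.arange(20).reshape(10, 2)
y = np.arange(10)

def split(seed):
    """Hypothetical helper: split the toy data with the given seed."""
    return train_test_split(X, y, test_size=0.30, random_state=seed)

first = split(0)
second = split(0)

# The same seed gives identical train and test arrays on every call.
same_seed_matches = all(np.array_equal(a, b) for a, b in zip(first, second))

# Shuffled cross-validation folds are fixed by random_state in the same way.
def fold_test_indices(seed):
    cv = KFold(n_splits=5, shuffle=True, random_state=seed)
    return [test_idx.tolist() for _, test_idx in cv.split(X)]

same_folds = fold_test_indices(0) == fold_test_indices(0)

print(same_seed_matches, same_folds)
```

A different seed, such as 1, will generally produce a different but equally valid split. The specific number is arbitrary; keeping it fixed is what makes runs comparable.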
