python - Balancing a training dataset using Keras

Question


1 Reply

深蓝 · Answer 1 · 2021-03-06T04:20:58+0000

There are a number of approaches and best practices for dealing with so-called imbalanced datasets.

Upsample the minority class (drawback: the model may overfit to the minority class)
Downsample the majority class (drawback: training data is discarded, so information is lost)
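
The two options above can be sketched with plain NumPy. This is only a minimal illustration on made-up toy data; the helper `resample_to_size` and the 90/10 class split are assumptions for the example, not part of any library.

```python
import numpy as np

def resample_to_size(X, y, label, target_size, rng):
    """Return rows of class `label`, randomly resampled to `target_size` rows.

    Hypothetical helper written for this example.
    """
    idx = np.flatnonzero(y == label)
    # Sample with replacement only when more rows are needed than exist (upsampling).
    replace = target_size > idx.size
    chosen = rng.choice(idx, size=target_size, replace=replace)
    return X[chosen], y[chosen]

rng = np.random.default_rng(0)

# Toy imbalanced data: 90 samples of class 0, 10 samples of class 1.
X = rng.normal(size=(100, 3))
y = np.array([0] * 90 + [1] * 10)

# Option 1: upsample the minority class to the size of the majority class.
X_min_up, y_min_up = resample_to_size(X, y, label=1, target_size=90, rng=rng)
X_up = np.concatenate([X[y == 0], X_min_up])
y_up = np.concatenate([y[y == 0], y_min_up])

# Option 2: downsample the majority class to the size of the minority class.
X_maj_down, y_maj_down = resample_to_size(X, y, label=0, target_size=10, rng=rng)
X_down = np.concatenate([X_maj_down, X[y == 1]])
y_down = np.concatenate([y_maj_down, y[y == 1]])

print(np.bincount(y_up))    # [90 90]
print(np.bincount(y_down))  # [10 10]
```

Resampling should be applied to the training split only, so that validation and test data keep the original class distribution.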

Several techniques exist for both, and some are designed to mitigate these drawbacks (e.g. synthetic sampling, which generates new minority samples instead of duplicating existing ones).

Have a look at the imbalanced-learn package for an easy-to-use implementation.

Another option is to weight the loss of your model so that it "pays more attention" to specific classes.

This is easily done by passing the optional class_weight argument to the Keras fit function. It takes a dictionary that maps class indices to weights.
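
A minimal sketch of such a call is shown below. The tiny model, the random toy data and the hand-picked weights are all assumptions for illustration; only the `class_weight` argument itself comes from the answer.

```python
import numpy as np
from tensorflow import keras

rng = np.random.default_rng(0)

# Toy imbalanced data: 180 samples of class 0, 20 samples of class 1.
X = rng.normal(size=(200, 4)).astype("float32")
y = np.array([0] * 180 + [1] * 20)

# Deliberately tiny binary classifier, just enough to call fit().
model = keras.Sequential([
    keras.Input(shape=(4,)),
    keras.layers.Dense(8, activation="relu"),
    keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

# Hand-picked weights (assumption): an error on class 1 contributes
# nine times as much to the loss as an error on class 0.
class_weight = {0: 1.0, 1: 9.0}

history = model.fit(
    X, y,
    epochs=2,
    batch_size=32,
    class_weight=class_weight,
    verbose=0,
)
```

The keys of the dictionary are integer class indices, so the labels passed to `fit` need to use the same indices.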

The class weights can be computed with scikit-learn's compute_class_weight function.
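
For example, the following sketch computes "balanced" weights for a made-up label vector and converts them into the dictionary form that Keras expects. The label array `y_train` is an assumption for illustration.

```python
import numpy as np
from sklearn.utils.class_weight import compute_class_weight

# Toy integer labels: 90 samples of class 0, 10 samples of class 1.
y_train = np.array([0] * 90 + [1] * 10)
classes = np.unique(y_train)

# "balanced" uses n_samples / (n_classes * count_of_class) for each class.
weights = compute_class_weight(class_weight="balanced", classes=classes, y=y_train)

# Keras expects a dict that maps class index -> weight.
class_weight = {int(c): float(w) for c, w in zip(classes, weights)}
print(class_weight)  # class 0 gets 100 / (2 * 90), class 1 gets 100 / (2 * 10) = 5.0

# The dict can then be passed on, e.g. model.fit(X_train, y_train, class_weight=class_weight)
```

If the labels are one-hot encoded, they would first need to be converted to integer class indices, for example with `np.argmax(y, axis=1)`.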
