Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

R: Replace rare values by “others”

I have the following problem: I have a data frame df with many variables.

One variable is df$size (non-numeric).

Now I want to replace all sizes that have fewer than 20 observations with the term "other".

sort(table(df$size))

This gives me an overview of the values I want to replace.

But how do I replace them in my df?

df$size[sort(table(df$size))<20]="other"

That does not work.

Thank you!

Asked by Christopher (translated from Stack Overflow)


1 Reply

Something along these lines works:

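set.seed(123)
# Example data: 100 sizes drawn from "1".."5", stored as character (not factor)
df <- data.frame(size = as.character(sample(1:5, size = 100, replace = TRUE)),
                 stringsAsFactors = FALSE)
tabs <- sort(table(df$size))   # frequency of each distinct size
tab <- tabs[tabs < 20]         # keep only the sizes seen fewer than 20 times

# Overwrite every row whose size is one of the rare values
df$size[df$size %in% names(tab)] <- "other"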
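The attempt in the question most likely fails because sort(table(df$size)) < 20 has one element per distinct size rather than one per row, so it cannot be used directly as a row index. Also note that if df$size is a factor, assigning a label that is not among its levels produces NA with a warning. A minimal sketch that handles both cases is below; the helper name lump_rare, its arguments, and the example sizes and probabilities are hypothetical and only for illustration.

```r
# Hypothetical helper (not part of the original answer): replace every value
# that occurs fewer than `min_n` times with a single `other` label.
lump_rare <- function(x, min_n = 20, other = "other") {
  was_factor <- is.factor(x)
  x <- as.character(x)                 # work on strings so a new label can be assigned
  counts <- table(x)                   # frequency of each distinct value (NA excluded)
  rare <- names(counts)[counts < min_n]
  x[x %in% rare] <- other              # NA entries are left untouched
  if (was_factor) factor(x) else x     # restore factor type, dropping unused levels
}

# Made-up example data: a factor column with two frequent and two rarer sizes
set.seed(123)
df <- data.frame(size = factor(sample(c("S", "M", "L", "XL"), size = 100,
                                      replace = TRUE,
                                      prob = c(0.45, 0.40, 0.10, 0.05))))
df$size <- lump_rare(df$size, min_n = 20)
table(df$size)
```

Converting to character before assigning and back to a factor afterwards is one way to avoid the unknown-level problem; for a plain character column the function reduces to the same %in% replacement as above.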


...