Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
658 views
in Technique[技术] by (71.8m points)

if statement - Ifelse in R with multiple categorical conditions

I have a data set, dt.train2 with 1500 different observations and 130 variables. One of them is languages and it can be english, french, arabic...

I want to create a ifelse string that gives attributes 1 for english, 2 for french, 3 for spanish and 0 for anything else. I have no idea how to do it.

dt.train2[, language_string := ifelse(language == "english", 
                                      1, 
                                      ifelse(language == "french", 
                                             2, 
                                             ifelse(language == "spanish",3)]

I'm using this to run linear model about sales.

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

You're almost there with the ifelse(), you just need the final else result (and a couple missing closing parentheses).

dt.train2[, language_string := ifelse(
  language == "english", 1,
    ifelse(language == "french", 2,
      ifelse(language == "spanish", 3, 0)
    )
  )
]

A couple other ways you could do this:

Make a lookup table and join:

# sample data
dt = data.table(language = c("english", "french", "spanish", "arabic", "chinese", "pig latin"))


lookup = data.table(language = c("english", "french", "spanish"),
                    language_string = c(1, 2, 3))

dt2 = merge(dt, lookup, by = "language", all.x = TRUE)
dt2[is.na(language_string), language_string := 0]

The above lookup table method is probably the nicest for scalability. However, for such a small number of encodings, you could also just set each of them:

# start with the default, 0
dt[, language_string := 0 ]
# then do each of the exceptions
dt[lanuage == "english", language_string := 1]
dt[language == "french", language_string := 2]
dt[language == "spanish", language_string := 3]

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...