You're almost there with the ifelse(); you just need the final else result (and a couple of missing closing parentheses).
dt.train2[, language_string := ifelse(
  language == "english", 1,
  ifelse(language == "french", 2,
    ifelse(language == "spanish", 3, 0)
  )
)
]
A couple other ways you could do this:
Make a lookup table and join:
# sample data
dt = data.table(language = c("english", "french", "spanish", "arabic", "chinese", "pig latin"))
lookup = data.table(language = c("english", "french", "spanish"),
language_string = c(1, 2, 3))
dt2 = merge(dt, lookup, by = "language", all.x = TRUE)
dt2[is.na(language_string), language_string := 0]
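One thing to be aware of: merge() sorts the result by the key column by default, so dt2 will not be in the same row order as dt. If that matters, a data.table update join is a possible variant. This is a minimal sketch using the same sample data as above; it adds the column to dt by reference instead of creating dt2.

```r
library(data.table)

# same sample data as above
dt = data.table(language = c("english", "french", "spanish", "arabic", "chinese", "pig latin"))
lookup = data.table(language = c("english", "french", "spanish"),
                    language_string = c(1, 2, 3))

# update join: for rows of dt that match lookup on language,
# copy lookup's language_string (the i. prefix refers to lookup's column)
dt[lookup, on = "language", language_string := i.language_string]

# rows with no match are left as NA, so give them the default
dt[is.na(language_string), language_string := 0]
```

The row order of dt is untouched here, since nothing is merged into a new table.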
The above lookup table method is probably the nicest for scalability. However, for such a small number of encodings, you could also just set each of them:
# start with the default, 0
dt[, language_string := 0 ]
# then do each of the exceptions
dt[language == "english", language_string := 1]
dt[language == "french", language_string := 2]
dt[language == "spanish", language_string := 3]
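If you would rather keep it in one expression without the nesting, fcase() may be worth a look. It is available in data.table 1.13.0 and later, so this sketch assumes a recent enough version; dt3 is just a hypothetical name for the example.

```r
library(data.table)

# hypothetical sample data for this example
dt3 = data.table(language = c("english", "french", "spanish", "arabic"))

# fcase() takes condition/value pairs in order; default covers everything else
dt3[, language_string := fcase(
  language == "english", 1,
  language == "french", 2,
  language == "spanish", 3,
  default = 0
)]
```

All the values (including the default) need to be the same type, which is why everything here is a plain numeric.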