Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
516 views
in Technique[技术] by (71.8m points)

r - How to reference column names that start with a number, in data.table

If the column names in data.table are in the form of number + character, for example: 4PCS, 5Y etc, how could this be referenced as j in x[i,j] so that it is interpreted as an unquoted column name.

I assume this would solve mine original problem. I wanted to add several column in 'data.table' which were in the form number + character.

M <- data.table('4PCS'=1:4,'5Y'=4:1,X5Y=2:5)
> M[,4PCS+5Y]
Error: unexpected symbol in "M[,4PCS"

The new column should be a sum of 4PSC and 5Y.

Is there a way how to refer to them in data.table in no quoted form? If these columns are referred in data.table with the quoted "logic" of data.frame :

> M[,'5Y',with=FALSE]
     5Y
[1,]  4
[2,]  3
[3,]  2
[4,]  1

then there will be a limitation in functionality of such reference. The addition would not work as it does not work in data.frame:

> M[,'4PCS'+'5Y',with=FALSE]  
Error in "4PCS" + "5Y" : non-numeric argument to binary operator

The data.table functionality would allow to operate over the columns. I would like to find a solution in the new data.table logic hence I can use its ability to transform the columns by column name referencing.

The question is:
How to quote the column name which start with number so that the data.table logic would understand that it is a column name.

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

I think, this is what you're looking for, not sure. data.table is different from data.frame. Please have a look at the quick introduction, and then the FAQ (and also the reference manual if necessary).

require(data.table)
dt <- data.table("4PCS" = 1:3, y=3:1)
#  ? 4PCS y
# 1: ? ?1 3
# 2: ? ?2 2
# 3: ? ?3 1

# access column 4PCS
dt[, "4PCS"]

# returns a data.table
#    4PCS
# 1:    1
# 2:    2
# 3:    3

# to access multiple columns by name
dt[, c("4PCS", "y")]

Alternatively, if you need to access the column and not result in a data.table, rather a vector, then you can access using the $ notation:

dt$`4PCS` # notice the ` because the variable begins with a number
# [1] 1 2 3

# alternatively, as mnel mentioned under comments:
dt[, `4PCS`] 
# [1] 1 2 3

Or if you know the column number you can access using [[.]] as follows:

dt[[1]] # 4PCS is the first column here
# [1] 1 2 3

Edit:

Thanks @joran. I think you're looking for this:

dt[, `4PCS` + y]
# [1] 4 4 4

Fundamentally the issue is that 4CPS is not a valid variable name in R (try 4CPS <- 1, you'll get the same "Unexpected symbol" error). So to refer to it, we have to use backticks (compare`4CPS` <- 1)


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...