Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


r - Why are Xs added to data frame variable names when using read.csv?

When I use the read.csv() function in R to load data, I often find that an X has been added to variable names. I think I almost always see it in the first variable, but I could be wrong.

At first, I thought R might be doing this because I had a space at the beginning of the variable name - I don't.

Second, I had read somewhere that if you have a variable that starts with a number, or is a very short variable name, R would add the X. The variable name is all text and the length of the name of this variable is 12 characters, so it's not short.

Now, this is purely an annoyance. I can rename the column, but it does add a step, albeit a small one.

Is there a way to prevent this rogue X from infiltrating my data frame?

Here is my original code:

orders <- read.csv("/file/location.filecsv", header = TRUE, sep = ",")

Here is the variable in question:

str(orders)
'data.frame':   2620276 obs. of  26 variables:
 $ X.OrderDetailID    : Factor w/ 2620193 levels "(2620182 row(s) affected)",..: 105845

1 Reply


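read.table and read.csv have a check.names= argument, TRUE by default, which passes the column names through make.names() so that they are syntactically valid R names. A name that does not start with a letter or a dot gets an X prepended, and invalid characters are replaced with dots. Set check.names to FALSE to keep the names exactly as they appear in the file.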

For example, try it with this input consisting of just a header:

> read.csv(text = "a,1,b")
[1] a  X1 b 
<0 rows> (or 0-length row.names)

versus

> read.csv(text = "a,1,b", check.names = FALSE)
[1] a 1 b
<0 rows> (or 0-length row.names)
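
To see the same effect with a few rows of data, here is a minimal sketch. The CSV text and its column names (id, 2020, unit price) are made up for illustration and are not taken from the question's file. It calls make.names() directly, which is the function the read.table documentation names as doing the adjustment, and then reads the same text with and without check.names.

```r
# Hypothetical input: the second header starts with a digit and the
# third contains a space, so neither is a syntactically valid R name.
csv_text <- "id,2020,unit price\n1,5,9.5\n2,7,3.25"

# make.names() is what check.names = TRUE applies to the header.
adjusted <- make.names(c("id", "2020", "unit price"))
print(adjusted)
# "id" "X2020" "unit.price"

# Default behaviour: names are adjusted.
checked <- read.csv(text = csv_text)
print(names(checked))
# "id" "X2020" "unit.price"

# With check.names = FALSE the header is kept as written.
raw <- read.csv(text = csv_text, check.names = FALSE)
print(names(raw))
# "id" "2020" "unit price"

# Non-syntactic names need [[ ]] or backticks when you refer to them.
total_price <- sum(raw[["unit price"]])
same_total  <- sum(raw$`unit price`)
```

The trade-off is that with check.names = FALSE you keep the original header, but any column whose name is not a valid R identifier has to be quoted with backticks or accessed with [[ ]] from then on.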
