Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

r - Using ggplot2 with columns that have spaces in their names

I have the following data frame structure:

df <- as.data.frame(A)
colnames(df)<- c("Sum of MAE", "Company")
df <- na.omit(df)
df2 <- df[order(df[,1]),]
df2 <- head(df2, n=10)
ggplot(df2, aes_string("Sum of MAE", "Company", group=1) + geom_line())
print(df2)

This is what the data looks like:

 Sum of MAE Company
606   0.030156758080105    COCO
182  0.0600065426668421    APWC
836  0.0602272459239397     EDS
1043 0.0704327240953608    FREE
2722               0.09   VLYWW
1334 0.0900000000000001    IKAN
2420  0.104746328560384     SPU
860   0.106063964745531    ELON
2838  0.108373386847075    WTSL
1721  0.110086738825851    MTSL

The ggplot call doesn't seem to be working. After a litany of errors, the current one I'm getting is:

Error in parse(text = x) : <text>:1:5: unexpected symbol
1: Sum of

Can someone help me get ggplot2 working?

1 Reply

This is a good reason to always make sure you have valid column names. First, here's an easier-to-reproduce version of your dataset:

library(ggplot2)

# check.names = FALSE keeps the non-syntactic name "Sum of MAE" as-is
df2 <- data.frame(`Sum of MAE` = c(0.030156758080105, 0.0600065426668421, 
   0.0602272459239397, 0.0704327240953608, 0.09, 0.0900000000000001, 
   0.104746328560384, 0.106063964745531, 0.108373386847075, 0.110086738825851
   ), Company = c("COCO", "APWC", "EDS", "FREE", "VLYWW", "IKAN", "SPU", "ELON", 
   "WTSL", "MTSL"), check.names = FALSE)

ggplot(df2, aes_string("Sum of MAE", "Company", group=1) + geom_line())
# Error in parse(text = x) : <text>:1:5: unexpected symbol
# 1: Sum of
#         ^

The problem is that aes_string() uses parse() to turn each string into an R expression that can be evaluated within the data.frame. "Sum of MAE" is not valid R syntax: the parser sees three separate tokens rather than a single symbol name, hence the "unexpected symbol" error at column 5. (Separately, note that geom_line() has to be added outside the ggplot() call, not inside it.) If you use non-syntactic names like that, you can wrap them in back-ticks so that the whole name (spaces and all) is treated as one symbol. So you can do

ggplot(df2, aes_string("`Sum of MAE`", "Company", group=1)) + geom_line()
# or
ggplot(df2, aes(`Sum of MAE`, Company, group=1)) + geom_line()

but really it would be better to stick to using valid column names for your data.frame rather than bypassing the checks with colnames().
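
To see the parsing behaviour in isolation, here is a minimal base R sketch (no ggplot2 needed). The variable names `bad` and `good` are made up for illustration; the point is only that the bare string fails to parse while the back-ticked string parses to a single symbol.

```r
# Parsing the raw name fails: "Sum of MAE" is three tokens, not one symbol.
# tryCatch() captures the error message instead of stopping the script.
bad <- tryCatch(parse(text = "Sum of MAE"),
                error = function(e) conditionMessage(e))

# Wrapping the name in back-ticks makes it parse as a single symbol.
good <- parse(text = "`Sum of MAE`")[[1]]
is.name(good)        # TRUE
as.character(good)   # "Sum of MAE"

# make.names() shows the syntactically valid form of the same name.
make.names("Sum of MAE")   # "Sum.of.MAE"
```

The last line gives the same name that data.frame() produces when check.names is left at its default of TRUE.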

If you were changing the column names to get "nicer" axis labels, you should probably do that with xlab() instead. For example:

df3 <- data.frame(df2)
names(df3)
# [1] "Sum.of.MAE" "Company" 
ggplot(df3, aes(Sum.of.MAE, Company, group=1)) + 
    geom_line() + 
    xlab("Sum of MAE values")
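
If the column names really do arrive as strings, another option is worth knowing: in ggplot2 3.0.0 and later, aes_string() is soft-deprecated, and the `.data` pronoun can be used inside aes() instead. The following is a sketch under that assumption, not part of the original answer. The variables `x_col` and `y_col` are hypothetical, and the data is just the first three rows of the example above.

```r
library(ggplot2)

# First three rows of the example data, keeping the non-syntactic name
df2 <- data.frame(
  `Sum of MAE` = c(0.030156758080105, 0.0600065426668421, 0.0602272459239397),
  Company = c("COCO", "APWC", "EDS"),
  check.names = FALSE
)

# Hypothetical variables holding the column names as plain strings
x_col <- "Sum of MAE"
y_col <- "Company"

# .data[[...]] looks the column up by name, so spaces need no back-ticks
p <- ggplot(df2, aes(.data[[x_col]], .data[[y_col]], group = 1)) +
  geom_line() +
  xlab(x_col)
```

Because the lookup is by string rather than by parsed expression, nothing is passed through parse() and the space in the name causes no error.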
