Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
222 views
in Technique[技术] by (71.8m points)

r - Reordering factor gives different results, depending on which packages are loaded

I wanted to create a barplot in which the bars were ordered by height rather than alphabetically by category. This worked fine when the only package I loaded was ggplot2. However, when I loaded a few more packages and ran the same code that created, sorted, and plotted my data frame, the bars had reverted to being sorted alphabetically again.

I checked the data frame each time using str() and it turned out that the attributes of the data frame were now different, even though I'd run the same code each time.

My code and output are listed below. Can anyone explain the differing behavior? Why does loading a few apparently unrelated packages (unrelated in the sense that none of the functions I'm using seem to be masked by the newly loaded packages) change the result of running the transform() function?

Case 1: Just ggplot2 loaded

library(ggplot2)

group = c("C","F","D","B","A","E")
num = c(12,11,7,7,2,1)
data = data.frame(group,num)
data1 = transform(data, group=reorder(group,-num))

> str(data1)
'data.frame':   6 obs. of  2 variables:
 $ group: Factor w/ 6 levels "C","F","B","D",..: 1 2 4 3 5 6
  ..- attr(*, "scores")= num [1:6(1d)] -2 -7 -12 -7 -1 -11
  .. ..- attr(*, "dimnames")=List of 1
  .. .. ..$ : chr  "A" "B" "C" "D" ...
 $ num  : num  12 11 7 7 2 1

Case 2: Load several more packages, then run the same code again

library(plyr)
library(xtable)
library(Hmisc)
library(gmodels)
library(reshape2)
library(vcd)
library(lattice)

group = c("C","F","D","B","A","E")
num = c(12,11,7,7,2,1)
data = data.frame(group,num)
data1 = transform(data, group=reorder(group,-num))

> str(data1)
'data.frame':   6 obs. of  2 variables:
 $ group: Factor w/ 6 levels "A","B","C","D",..: 3 6 4 2 1 5
 $ num  : num  12 11 7 7 2 1

UPDATE: SessionInfo()

Case 1: Ran sessionInfo() after loading ggplot2

> sessionInfo()
R version 2.15.0 (2012-03-30)
Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)

locale:
  [1] C/en_US.UTF-8/C/C/C/C

attached base packages:
  [1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
  [1] ggplot2_0.9.1

loaded via a namespace (and not attached):
  [1] MASS_7.3-18        RColorBrewer_1.0-5 colorspace_1.1-1   dichromat_1.2-4    digest_0.5.2       grid_2.15.0       
[7] labeling_0.1       memoise_0.1        munsell_0.3        plyr_1.7.1         proto_0.3-9.2      reshape2_1.2.1    
[13] scales_0.2.1       stringr_0.6        tools_2.15.0

Case 2: Ran sessionInfo() after loading the additional packages

> sessionInfo()
R version 2.15.0 (2012-03-30)
Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)

locale:
  [1] C/en_US.UTF-8/C/C/C/C

attached base packages:
  [1] grid      splines   stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
  [1] lattice_0.20-6   vcd_1.2-13       colorspace_1.1-1 MASS_7.3-18      reshape2_1.2.1   gmodels_2.15.2  
[7] Hmisc_3.9-3      survival_2.36-14 xtable_1.7-0     plyr_1.7.1       ggplot2_0.9.1   

loaded via a namespace (and not attached):
  [1] RColorBrewer_1.0-5 cluster_1.14.2     dichromat_1.2-4    digest_0.5.2       gdata_2.8.2        gtools_2.6.2      
[7] labeling_0.1       memoise_0.1        munsell_0.3        proto_0.3-9.2      scales_0.2.1       stringr_0.6       
[13] tools_2.15.0
See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

This happens because:

  1. gmodels imports gdata
  2. gdata creates a new method for reorder.factor

Start a clean session. Then:

methods("reorder")
[1] reorder.default*    reorder.dendrogram*

Now load gdata (or load gmodels, which has the same effect):

library(gdata)
methods("reorder")
[1] reorder.default*    reorder.dendrogram* reorder.factor 

Notice there is no masking, since reorder.factor doesn't exist in base

Recreate the problem, but this time explicitly call the different packages:

group = c("C","F","D","B","A","E")
num = c(12,11,7,7,2,1)
data = data.frame(group,num)

The base R version (using reorder.default):

str(transform(data, group=stats:::reorder.default(group,-num)))
'data.frame':   6 obs. of  2 variables:
 $ group: Factor w/ 6 levels "C","F","B","D",..: 1 2 4 3 5 6
  ..- attr(*, "scores")= num [1:6(1d)] -2 -7 -12 -7 -1 -11
  .. ..- attr(*, "dimnames")=List of 1
  .. .. ..$ : chr  "A" "B" "C" "D" ...
 $ num  : num  12 11 7 7 2 1

The gdata version (using reorder.factor):

str(transform(data, group=gdata:::reorder.factor(group,-num)))
'data.frame':   6 obs. of  2 variables:
 $ group: Factor w/ 6 levels "A","B","C","D",..: 3 6 4 2 1 5
 $ num  : num  12 11 7 7 2 1

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...