Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
372 views
in Technique[技术] by (71.8m points)

r - Calculate correlation with cor(), only for numerical columns

I have a dataframe and would like to calculate the correlation (with Spearman, data is categorical and ranked) but only for a subset of columns. I tried with all, but R's cor() function only accepts numerical data (x must be numeric, says the error message), even if Spearman is used.

One brute approach is to delete the non-numerical columns from the dataframe. This is not as elegant, for speed I still don't want to calculate correlations between all columns.

I hope there is a way to simply say "calculate correlations for columns x, y, z". Column references could by number or by name. I suppose the flexible way to provide them would be through a vector.

Any suggestions are appreciated.

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

if you have a dataframe where some columns are numeric and some are other (character or factor) and you only want to do the correlations for the numeric columns, you could do the following:

set.seed(10)

x = as.data.frame(matrix(rnorm(100), ncol = 10))
x$L1 = letters[1:10]
x$L2 = letters[11:20]

cor(x)

Error in cor(x) : 'x' must be numeric

but

cor(x[sapply(x, is.numeric)])

             V1         V2          V3          V4          V5          V6          V7
V1   1.00000000  0.3025766 -0.22473884 -0.72468776  0.18890578  0.14466161  0.05325308
V2   0.30257657  1.0000000 -0.27871430 -0.29075170  0.16095258  0.10538468 -0.15008158
V3  -0.22473884 -0.2787143  1.00000000 -0.22644156  0.07276013 -0.35725182 -0.05859479
V4  -0.72468776 -0.2907517 -0.22644156  1.00000000 -0.19305921  0.16948333 -0.01025698
V5   0.18890578  0.1609526  0.07276013 -0.19305921  1.00000000  0.07339531 -0.31837954
V6   0.14466161  0.1053847 -0.35725182  0.16948333  0.07339531  1.00000000  0.02514081
V7   0.05325308 -0.1500816 -0.05859479 -0.01025698 -0.31837954  0.02514081  1.00000000
V8   0.44705527  0.1698571  0.39970105 -0.42461411  0.63951574  0.23065830 -0.28967977
V9   0.21006372 -0.4418132 -0.18623823 -0.25272860  0.15921890  0.36182579 -0.18437981
V10  0.02326108  0.4618036 -0.25205899 -0.05117037  0.02408278  0.47630138 -0.38592733
              V8           V9         V10
V1   0.447055266  0.210063724  0.02326108
V2   0.169857120 -0.441813231  0.46180357
V3   0.399701054 -0.186238233 -0.25205899
V4  -0.424614107 -0.252728595 -0.05117037
V5   0.639515737  0.159218895  0.02408278
V6   0.230658298  0.361825786  0.47630138
V7  -0.289679766 -0.184379813 -0.38592733
V8   1.000000000  0.001023392  0.11436143
V9   0.001023392  1.000000000  0.15301699
V10  0.114361431  0.153016985  1.00000000

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...