Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


r - How to use the 'sweep' function

When I look at the source of R packages, I see the function sweep used quite often. Sometimes it's used where a simpler function (e.g., apply) would have sufficed; other times, it's impossible to know exactly what it's doing without spending a fair amount of time stepping through the code block it's in.

The fact that I can reproduce sweep's effect using a simpler function suggests that I don't understand sweep's core use cases, and the fact that this function is used so often suggests that it's quite useful.

The context:

sweep is a function in R's standard library; its arguments are:

sweep(x, MARGIN, STATS, FUN = "-", check.margin = TRUE, ...)

# x is the data (an array or matrix)
# MARGIN is the dimension to sweep over: 1 for rows, 2 for columns
# STATS refers to the summary statistics which you wish to 'sweep out'
# FUN is the function used to carry out the sweep, "-" is the default

As you can see, the arguments are similar to those of apply, though sweep requires one more parameter, STATS.

Another key difference is that sweep returns an array of the same shape as the input array, whereas the result returned by apply depends on the function passed in.
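
To make the shape difference concrete, here is a minimal sketch using the same 4 x 3 matrix as in the example further down. The variable names (swept_rows, applied_rows, col_sums) are only for illustration.

```r
M <- matrix(1:12, ncol = 3)

# sweep always returns an array with the same dimensions as its input
swept_rows <- sweep(M, 1, rowMeans(M), FUN = "-")
dim(swept_rows)      # 4 3

# apply's result depends on what the function returns:
# a scalar per column collapses to a plain vector ...
col_sums <- apply(M, 2, sum)
length(col_sums)     # 3

# ... and a vector per row comes back with the rows stored as columns
applied_rows <- apply(M, 1, function(r) r - mean(r))
dim(applied_rows)    # 3 4

# the values match sweep's only after transposing
all.equal(t(applied_rows), swept_rows)
```

So apply can reproduce the row-wise case, but you have to remember to transpose the result, whereas sweep keeps the original layout.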

sweep in action:

# e.g., use 'sweep' to express a given matrix in terms of distance from 
# the respective column mean

# create some data:
M = matrix( 1:12, ncol=3)

# calculate column-wise mean for M
dx = colMeans(M)

# now 'sweep' that summary statistic from M
sweep(M, 2, dx, FUN="-")

     [,1] [,2] [,3]
[1,] -1.5 -1.5 -1.5
[2,] -0.5 -0.5 -0.5
[3,]  0.5  0.5  0.5
[4,]  1.5  1.5  1.5
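
As a possible follow-on (a sketch, not something from the package sources mentioned above), a second sweep with FUN = "/" divides each column by its standard deviation. Chained after the centering step, this should give the same numbers as base R's scale(). The names centered and standardized are illustrative.

```r
M <- matrix(1:12, ncol = 3)

# step 1: subtract each column's mean (same call as above)
centered <- sweep(M, 2, colMeans(M), FUN = "-")

# step 2: divide each column by its standard deviation
standardized <- sweep(centered, 2, apply(M, 2, sd), FUN = "/")

# compare with scale(), ignoring the extra attributes it attaches
all.equal(standardized, scale(M), check.attributes = FALSE)
```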

So in sum, what I'm looking for is an exemplary use case or two for sweep.

Please, do not recite or link to the R Documentation, mailing lists, or any of the 'primary' R sources--assume I've read them. What I'm interested in is how experienced R programmers/analysts use sweep in their own code.


1 Reply


sweep() is typically used when you operate on a matrix by row or by column, and the other input of the operation is a different value for each row / column. Whether you operate by row or by column is defined by MARGIN, as for apply(). The values used for what I called "the other input" are defined by STATS. So, for each row (or column), you take one value from STATS and use it in the operation defined by FUN.

For instance, if you want to add 1 to the 1st row, 2 to the 2nd, etc. of the matrix you defined, you would do:

sweep(M, 1, 1:4, "+")
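
Here is a small sketch of what that call returns, plus two related cases. The column offsets c(10, 20, 30) and the variable names are made up for illustration.

```r
M <- matrix(1:12, ncol = 3)

# add 1 to row 1, 2 to row 2, ... (one STATS value per row)
added_rows <- sweep(M, 1, 1:4, "+")
added_rows
#      [,1] [,2] [,3]
# [1,]    2    6   10
# [2,]    4    8   12
# [3,]    6   10   14
# [4,]    8   12   16

# row-wise, plain recycling happens to give the same result,
# because R recycles the vector down each column
all(added_rows == M + 1:4)

# column-wise it does not, so without sweep you need a double transpose
added_cols <- sweep(M, 2, c(10, 20, 30), "+")
all(added_cols == t(t(M) + c(10, 20, 30)))

# another common pattern: turn each row into proportions of its row total
row_props <- sweep(M, 1, rowSums(M), "/")
rowSums(row_props)   # each row now sums to 1
```

The column-wise case is where sweep tends to pay off in readability: MARGIN states the intent directly instead of relying on recycling order.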

Frankly, I did not understand the definition in the R documentation either; I just learned by looking up examples.

