r - Implementation of skyline query or efficient frontier

posted Oct 17, 2021 in Technique by 深蓝 (71.8m points)

I know there must be an easy answer to this but somehow I can't seem to find it...

I have a data frame with 2 numeric columns. I would like to remove every row for which at least one other row in the data frame has larger values in both columns.

So if I have

      Col1 Col2
    1    2    3
    2    4    7
    3    5    6

I would like to remove the first row, because the second row is larger in both columns, and keep only rows 2 and 3.

Thanks a lot!

1 Reply

深蓝 · Answer 1 · 2021-10-17T03:05:34+0000

That problem is called a "skyline query" by database administrators (they may have other algorithms) and an "efficient frontier" by economists. Plotting the data can make it clear what we are looking for.

n <- 40
d <- data.frame(
  x = rnorm(n),
  y = rnorm(n)
)
# We want the "extreme" points in the following plot
par(mar=c(1,1,1,1))
plot(d, axes=FALSE, xlab="", ylab="")
for(i in 1:n) {
  polygon( c(-10,d$x[i],d$x[i],-10), c(-10,-10,d$y[i],d$y[i]), 
  col=rgb(.9,.9,.9,.2))
}
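
For a small data frame, the condition from the question can also be checked directly, row against row. The following is only a minimal sketch under that assumption (the work grows with the square of the number of rows); `is_dominated` and `d0` are hypothetical names introduced here, and `d0` reproduces the three-row example from the question.

```r
# Hypothetical helper: TRUE for each row that some other row
# beats strictly in both columns (the definition in the question).
is_dominated <- function(x, y) {
  vapply(
    seq_along(x),
    function(i) any(x > x[i] & y > y[i]),
    logical(1)
  )
}

# The example data from the question
d0 <- data.frame(Col1 = c(2, 4, 5), Col2 = c(3, 7, 6))

# Drop the dominated rows
kept <- d0[!is_dominated(d0$Col1, d0$Col2), ]
print(kept)
```

On the question's data this drops row 1 and keeps rows 2 and 3. A row never dominates itself, because the comparisons are strict.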

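The algorithm is as follows: sort the points by the first coordinate in decreasing order, then go through them in that order and keep a point only if its second coordinate is larger than that of the last retained point. The retained points have increasing y, so the last retained one carries the largest y seen so far; a point that fails the test is therefore no better than an earlier point in either coordinate. Note that with tied values the strict > comparison also drops points that are only equalled, not beaten, in one coordinate.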

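# Sort by x, largest first, so every earlier row has an x at least as large
d <- d[ order(d$x, decreasing=TRUE), ]
result <- d[1,]   # the first row after sorting is always retained
for(i in seq_len(nrow(d))[-1] ) {
  # Keep row i only if its y exceeds the y of the last retained row
  if( d$y[i] > result$y[nrow(result)] ) {
    result <- rbind(result, d[i,])  # inefficient: copies result on every append
  } 
}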
points(result, cex=3, pch=15)
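
The repeated rbind() can probably be avoided. Since the last retained point always has the largest y seen so far, the same selection can be written with a running maximum. This is a sketch rather than a tested replacement: `skyline_cummax`, `xcol`, `ycol` and `d0` are hypothetical names, missing values are assumed absent, and ties are handled like the loop above (a point that only equals the running maximum is dropped).

```r
# Hypothetical helper: rows whose y exceeds every y that comes
# earlier when the rows are sorted by decreasing x.
skyline_cummax <- function(df, xcol, ycol) {
  # Sort by x descending; break ties in x by y descending
  df <- df[order(df[[xcol]], df[[ycol]], decreasing = TRUE), , drop = FALSE]
  y <- df[[ycol]]
  # Largest y among the rows sorted earlier (-Inf for the first row)
  best_before <- c(-Inf, cummax(y)[-length(y)])
  df[y > best_before, , drop = FALSE]
}

# The example data from the question
d0 <- data.frame(Col1 = c(2, 4, 5), Col2 = c(3, 7, 6))
res <- skyline_cummax(d0, "Col1", "Col2")
print(res)
```

On the question's data this keeps rows 3 and 2, in that order, because the result stays sorted by decreasing Col1. The cost is dominated by the sort, and no data frame is grown inside a loop.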

[Figure: Skyline]
