Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
574 views
in Technique[技术] by (71.8m points)

r - Collapse rows with overlapping ranges

I have a data.frame with start and end time:

ranges<- data.frame(start = c(65.72000,65.72187, 65.94312,73.75625,89.61625),stop = c(79.72187,79.72375,79.94312,87.75625,104.94062))

> ranges
     start      stop
1 65.72000  79.72187
2 65.72187  79.72375
3 65.94312  79.94312
4 73.75625  87.75625
5 89.61625 104.94062

In this example, the ranges in row 2 and 3 are entirely within the range between 'start' on row 1 and stop on row 4. Thus, the overlapping ranges 1-4 should be collapsed to one range:

> ranges
     start      stop
1 65.72000  87.75625
5 89.61625 104.94062

I tried this:

mdat <- outer(ranges$start, ranges$stop, function(x,y) y > x)
mdat[upper.tri(mdat)|col(mdat)==row(mdat)] <- NA
mdat

And now I just need to figure out how to combine all the true ones, but not sure if it's the best way to go

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

You can try this:

library(dplyr)
ranges %>% 
       arrange(start) %>% 
       group_by(g = cumsum(cummax(lag(stop, default = first(stop))) < start)) %>% 
       summarise(start = first(start), stop = max(stop))

# A tibble: 2 × 3
#      g    start      stop
#  <int>    <dbl>     <dbl>
#1     0 65.72000  87.75625
#2     1 89.61625 104.94062

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...