r - From a set of pairs, find all subsets s.t. no pair in the subset shares any element with a pair not in the subset

Question

Welcome To Ask or Share your Answers For Others

r - From a set of pairs, find all subsets s.t. no pair in the subset shares any element with a pair not in the subset

posted Oct 24, 2021 in Technique[技术] by 深蓝 (71.8m points)

r - From a set of pairs, find all subsets s.t. no pair in the subset shares any element with a pair not in the subset

I have a set of pairs. Each pair is represented as [i,1:2]. That is, the ith pair are the numbers in the first and second column in the ith row.

I need to sort these pairs into distinct groups, s.t. there is no element in any pair in the jth group that is in any group not j. For example:

EXAMPLE 1: DATA

> col1 <- c(3, 4, 6, 7, 10, 8)
> col2 <- c(6, 7, 3, 4, 3,  1)
> 
> dat <- cbind(col1, col2)
> rownames(dat) <- 1:nrow(dat)
> 
> dat
  col1 col2
1    3    6
2    4    7
3    6    3
4    7    4
5   10    3
6    8    1

For all pairs, it doesn't matter if the number is in column 1 or column 2, the pairs should be sorted into groups s.t. every number in every pair in every group exists only in one group. So the solved example would look like this.

  col1 col2 groups
1    3    6      1
2    4    7      2
3    6    3      1
4    7    4      2
5   10    3      1
6    8    1      3

Rows 1, 3, and 5 are grouped together because 1 and 3 contain the same numbers and 5 shares the number 3, so it must be grouped with them. 2 and 4 share the same distinct numbers so they are grouped together and 6 has unique numbers so it is left alone.

If we change the data slightly, note the following.

EXAMPLE 2: NEW DATA

Note what happens when we add a row that shares an element with row 6 and row 5.

  col1 col2 groups
1    3    6      1
2    4    7      2
3    6    3      1
4    7    4      2
5   10    3      1
6    8    1      1
7    1   10      1

The 10 in the 7th row connects it to the first group because it shares an elements with the 5th row. It also shares an element with the 6th row (the number 1), so the 6th row would instead be in group 1.

PROBLEM

Is there a simple way to form the groups? A vector operation? A sorting algorithm? It gets very nasty very quickly if you try to do it with a loop, since each subsequent row can change the membership of earlier rows, as demonstrated in the example.

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙；凝视深渊过久,深渊将回以凝视…

1 Reply

深蓝 · Answer 1 · 2021-10-23T21:17:16+0000

To draw on the old answer at: identify groups of linked episodes which chain together , which assigns a group to each individual value, you could try this to assign a group to each linked pair:

library(igraph)
g <- graph_from_data_frame(dat)
links <- data.frame(col1=V(g)$name,group=components(g)$membership)
merge(dat,links,by="col1",all.x=TRUE,sort=FALSE)

#  col1 col2 group
#1    3    6     1
#2    4    7     2
#3    6    3     1
#4    7    4     2
#5   10    3     1
#6    8    1     3

Categories

r - From a set of pairs, find all subsets s.t. no pair in the subset shares any element with a pair not in the subset

r - From a set of pairs, find all subsets s.t. no pair in the subset shares any element with a pair not in the subset

Please log in or register to add a comment.

Please log in or register to reply this article.

1 Reply

Please log in or register to add a comment.

Just Browsing Browsing

Most popular tags