I have a set of pairs. Each pair is represented as [i,1:2]
. That is, the ith
pair are the numbers in the first and second column in the ith
row.
I need to sort these pairs into distinct groups, s.t. there is no element in any pair in the jth
group that is in any group not j
. For example:
EXAMPLE 1: DATA
> col1 <- c(3, 4, 6, 7, 10, 8)
> col2 <- c(6, 7, 3, 4, 3, 1)
>
> dat <- cbind(col1, col2)
> rownames(dat) <- 1:nrow(dat)
>
> dat
col1 col2
1 3 6
2 4 7
3 6 3
4 7 4
5 10 3
6 8 1
For all pairs, it doesn't matter if the number is in column 1 or column 2, the pairs should be sorted into groups s.t. every number in every pair in every group exists only in one group. So the solved example would look like this.
col1 col2 groups
1 3 6 1
2 4 7 2
3 6 3 1
4 7 4 2
5 10 3 1
6 8 1 3
Rows 1, 3, and 5 are grouped together because 1 and 3 contain the same numbers and 5 shares the number 3, so it must be grouped with them. 2 and 4 share the same distinct numbers so they are grouped together and 6 has unique numbers so it is left alone.
If we change the data slightly, note the following.
EXAMPLE 2: NEW DATA
Note what happens when we add a row that shares an element with row 6 and row 5.
col1 col2 groups
1 3 6 1
2 4 7 2
3 6 3 1
4 7 4 2
5 10 3 1
6 8 1 1
7 1 10 1
The 10
in the 7th row connects it to the first group because it shares an elements with the 5th row. It also shares an element with the 6th row (the number 1
), so the 6th row would instead be in group 1.
PROBLEM
Is there a simple way to form the groups? A vector operation? A sorting algorithm? It gets very nasty very quickly if you try to do it with a loop, since each subsequent row can change the membership of earlier rows, as demonstrated in the example.
See Question&Answers more detail:
os