I have 2 datasets, one with around 4000 and the other with about 1200 rows. I wanna merge these two by their Column A to have a single file that shows the response column. But, in many rows of the 4000-row dataset, the names are a combination of several elements found in the other one and they are generally separated with and/or. For example:
Dataset 1:
(Column) A:
F123 and Y345
(Y2SD94 or X230S0) and S24KS
Y2300
.
.
.
Dataset 2:
(Column) A:
Y2SD94
X230S0
F123
Y345
Y308
Y239S
S24KS
.
.
.
(Column)Response:
23.3
25.3
22
21
25
26
35
.
.
.
I could merge these two files when there is no combination in the rows. But, I would like to know if there is any way to paste the values of these combinations, for all the elements in the final file? For example, for "(Y2SD94 or X230S0) and S24KS", I get three separate values as response next to its row in the destination file. I greatly appreciate any help with that.
Dataset 2 rows are here: https://gofile.io/d/KtEEQw
Outputs:
Dataset_1
structure(list(gene = c("(YDL174C and YEL039C) or (YDL174C and YJR048W) or (YEL039C and YEL071W) or (YEL071W and YJR048W)",
"(YDL178W and YEL039C) or (YDL178W and YJR048W)", "YAL060W",
"(YEL039C and YML054C) or (YJR048W and YML054C)", "YGR032W or YMR306W or (YLR342W and YCR034W)",
"YGR143W or YPR159W"), mean = c(NA_real_, NA_real_, NA_real_,
NA_real_, NA_real_, NA_real_)), row.names = c(NA, -6L), class = c("tbl_df",
"tbl", "data.frame"))
Dataset_2
structure(list(gene = c("YLR295C", "YLR299W", "YLR300W", "YLR303W",
"YLR304C", "YLR305C"), mean = c(11.04407422, 8.450484195, 10.55105282,
12.5135945, 11.02718148, 7.120348496)), row.names = c(NA, -6L
), class = c("tbl_df", "tbl", "data.frame"))