bash - unix: merge files based on column value

Posted Oct 24, 2021 in Technique by 深蓝

I have two files, that look like this:

File 1 (2 columns):

ID1 123
ID2 234
ID3 232
ID4 344
...

File 2 (>1 million columns)

ID2 A C ...
ID3 G T ...
ID1 C T ...
ID4 A C ... 
...

I want to add the values from column 2 of file 1 based on the ID to file 2 as the second column. So the merged file should look like this:

ID2 234 A C ...
ID3 232 G T ...
ID1 123 C T ...
ID4 344 A C ... 
...

So exactly the same as file 2 (same order of rows), but with the added 2nd column. The IDs are the values of the first column (present in both files). File 1 has more rows/IDs than file 2. All IDs from file 2 are in file 1, but not all IDs from file 1 are in file 2.

Does anyone know how to do this under unix/bash? Many thanks!

1 Reply

深蓝 · Answer 1 · 2021-10-23T21:29:39+0000

$ join <(sort file1) <(sort file2)
ID1 123 C T ...
ID2 234 A C ...
ID3 232 G T ...
ID4 344 A C ...

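If you want to keep the row order of file2, number its lines first, join on the ID, sort back by line number, and then cut the line number out: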

$ join -1 1 -2 2 <(sort file1) <(cat -n file2 | sort -k2,2) | sort -k3,3n | cut -d' ' -f1-2,4-
ID2 234 A C ...
ID3 232 G T ...
ID1 123 C T ...
ID4 344 A C ...
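
As an alternative to the join pipeline above (not part of the original answer), a single awk pass can do the lookup without sorting, so the order of file2 is preserved automatically. This is only a sketch under a few assumptions: fields are whitespace-separated, file1 is non-empty and small enough to hold in memory as a lookup table, and the helper name merge_by_id is hypothetical. Note that assigning to $1 makes awk rebuild each line with single spaces between fields.

```shell
#!/usr/bin/env bash
# Hypothetical helper (name invented for illustration): merge_by_id LOOKUP DATA
# Prints DATA in its original row order, with the matching value from
# LOOKUP inserted as the second column.
merge_by_id() {
    awk '
        NR == FNR { value[$1] = $2; next }   # first file: remember ID -> value
        { $1 = $1 OFS value[$1]; print }     # second file: append the value to the ID field
    ' "$1" "$2"
}

# Sample rows taken from the question, with the trailing columns cut short.
tmp=$(mktemp -d)
printf '%s\n' 'ID1 123' 'ID2 234' 'ID3 232' 'ID4 344' > "$tmp/file1"
printf '%s\n' 'ID2 A C' 'ID3 G T' 'ID1 C T' 'ID4 A C' > "$tmp/file2"

merge_by_id "$tmp/file1" "$tmp/file2"
rm -r "$tmp"
```

On the sample rows this should print the same four lines as the order-preserving join above. With more than a million columns per row, rebuilding every line may be noticeably slower than the join approach, so it is worth timing both on real data.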
