Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
780 views
in Technique[技术] by (71.8m points)

bash - Join lines with the same value in the first column

I have a tab-delimited file with three columns (excerpt):

AC147602.5_FG004    IPR000146   Fructose-1,6-bisphosphatase class 1/Sedoheputulose-1,7-bisphosphatase
AC147602.5_FG004    IPR023079   Sedoheptulose-1,7-bisphosphatase
AC148152.3_FG001    IPR002110   Ankyrin repeat
AC148152.3_FG001    IPR026961   PGG domain

and I'd like to get this using bash:

AC147602.5_FG004 IPR000146 Fructose-1,6-bisphosphatase class 1/Sedoheputulose-1,7-bisphosphatase IPR023079 Sedoheptulose-1,7-bisphosphatase
AC148152.3_FG001 IPR023079 Sedoheptulose-1,7-bisphosphatase IPR002110   Ankyrin repeat IPR026961    PGG domain

So if ID in the first column are the same in several lines, it should produce one line for each ID with all other parts of lines joined. In the example it will give two-row file.

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

give this one-liner a try:

 awk -F'' -v OFS='' '{x=$1;$1="";a[x]=a[x]$0}END{for(x in a)print x,a[x]}' file

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...