Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


r - Delete columns in text files with specific string

I would like to delete the columns whose names contain a specific string, "Gtype.", from a tab-delimited .txt file. I have already tried this command in R: df <- df[, -grep("Gtype.", colnames(df))]. However, my matrix is too big (more than 13 GB), and R cannot handle it. (Error: cannot allocate vector of size....)

My input file:

Log.NE122  Gtype.NE122  Log.NE144    Gtype.NE144
-0.33          AA          1.0           AB

My expected output:

   Log.NE122  Log.NE144  
    -0.33       1.0      

I am wondering whether this can be done in bash. If someone has other options, I would be glad to hear them.



1 Reply


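Using awk, which reads the file line by line instead of loading it into memory (this assumes the file really is tab-delimited):

awk 'BEGIN { FS = OFS = "\t" }
     # Header: remember the columns whose name does not contain "Gtype."
     NR == 1 { for (i = 1; i <= NF; i++) if ($i !~ /Gtype\./) keep[++n] = i }
     # Every line, header included: print only the remembered columns
     { for (j = 1; j <= n; j++) printf "%s%s", $(keep[j]), (j < n ? OFS : ORS) }' file

Output for the sample input:

Log.NE122  Log.NE144
-0.33      1.0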
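
Another option, in case awk is slow on a file this size, is to work out the wanted column numbers from the header once and hand them to cut. This is only a sketch: it assumes a tab-delimited file with a single header line and at least one column that does not contain "Gtype.", and drop_gtype is a hypothetical helper name, not an existing tool.

```shell
#!/bin/sh
# drop_gtype FILE: print FILE without the columns whose header contains "Gtype."
# (hypothetical helper; assumes FILE is tab-delimited with one header line)
drop_gtype() {
    file=$1
    # Read only the header, put one column name per line, number the lines,
    # keep the numbers of the names that do NOT contain the literal "Gtype.",
    # and join those numbers with commas (for the sample input: 1,3).
    keep=$(head -n 1 "$file" | tr '\t' '\n' | grep -n -v -F 'Gtype.' \
           | cut -d: -f1 | paste -s -d, -)
    # cut streams the file line by line; its default field delimiter is a tab.
    cut -f "$keep" "$file"
}
```

It could then be called as drop_gtype input.txt > output.txt, where both file names are placeholders. If every column matches "Gtype.", the list of columns is empty and cut reports an error instead of producing output.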
