Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


regex - How to delete newline if the line doesn't end with "

Sample data:

"data","123"
"data2","qwer"
"false","234
And i'm the culprit"
"data5","234567"

Output text should be

"data","123"
"data2","qwer"
"false","234And i'm the culprit"
"data5","234567"

In essence, I want to fix my CSV file, which is very large.

I'm using sed, so an answer in sed would help a lot :)


1 Reply


sed is a poor fit for problems that span multiple lines. Just use awk:

$ awk '{printf "%s%s", (prev~/"$/?RS:""), $0; prev=$0} END{print ""}' file
"data","123"
"data2","qwer"
"false","234And i'm the culprit"
"data5","234567"

The above checks whether the previous line ended with a ". If it did, it prints the default Record Separator RS, which is a newline (you could replace RS with ORS or a hard-coded "\n" if you prefer); if it didn't, it prints nothing. It then prints the current record without a newline after it. Finally, the END block prints one newline to terminate the last record.
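For anyone who wants to try this on the sample data, here is a minimal sketch of the same logic spelled out over several lines. The scratch file created with mktemp and the function name join_broken_lines are assumptions made for illustration, not part of the original answer. One small difference: this version prints nothing at all for an empty file, whereas the one-liner prints a single blank line.

```shell
# Recreate the sample data from the question in a scratch file
# (the temporary path is an assumption for this sketch).
input=$(mktemp)
printf '%s\n' \
  '"data","123"' \
  '"data2","qwer"' \
  '"false","234' \
  "And i'm the culprit\"" \
  '"data5","234567"' > "$input"

# Hypothetical helper: emit a newline before a record only when the
# previous record ended with a double quote.
join_broken_lines() {
  awk '
    NR > 1 && prev ~ /"$/ { printf "\n" }   # previous record was complete
    { printf "%s", $0; prev = $0 }          # print current record, no newline yet
    END { if (NR > 0) print "" }            # terminate the final record
  ' "$1"
}

join_broken_lines "$input"
```

Since the real file is very large, it is probably safest to redirect the output to a new file (for example join_broken_lines big.csv > fixed.csv, both names hypothetical) and compare before replacing the original. awk reads one record at a time, so memory use should stay small regardless of file size.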

