Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

bash - SED Replace multiple second occurrence of a character

I have .srt files which are in the following format:

0
1
00:00:01,830 --> 00:00:04,740
corresponding text
1

2
00:00:05,280 --> 00:00:10,280
corresponding text
2

3
00:00:10,740 --> 00:00:14,640
corresponding text
3

4
00:00:15,510 --> 00:00:19,260
corresponding text
4

The extra line repeating the subtitle number appears after every entry throughout the file (line 5, line 6 ... line 540). I tried the command sed '/^[0-9]/ s/.//' and, as expected, it changes every line that starts with a digit, but I don't know how to make it affect only the second occurrence of each number in the range.

The expected result is:

0
1
00:00:01,830 --> 00:00:04,740
corresponding text

2
00:00:05,280 --> 00:00:10,280
corresponding text

3
00:00:10,740 --> 00:00:14,640
corresponding text

4
00:00:15,510 --> 00:00:19,260
corresponding text

How can I achieve this with sed, awk, or any other tool that can process files in batches? Several files have the same problem.

Thanks!

Question from: https://stackoverflow.com/questions/65929229/sed-replace-multiple-second-occurrence-of-a-character


1 Reply

Using awk, you can remember the most recent single-field line in a variable. Whenever a line has exactly one field (NF == 1), compare it with the remembered value: if they match, skip the line; otherwise store it as the new value. Every other line is printed unchanged.

awk 'NF == 1 {if (num != "" && $0 == num) next; else num = $0} 1'
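Since the question also asks about handling several files, here is a minimal sketch of how the same idea could be applied in a batch. The function names strip_repeated_index and clean_srt_files are hypothetical, and so is the choice to rewrite each file in place through a temporary file. It also assumes a stricter test than NF == 1: only lines made up entirely of digits are treated as an index, so a one-word subtitle such as "Hello" does not overwrite the remembered number. A subtitle whose text is itself a bare number would still be mistaken for an index.

```shell
# Hypothetical helper: drop the repeated index line from .srt input.
# Reads the files given as arguments, or standard input if there are none.
strip_repeated_index() {
    awk '
        # Index lines are assumed to contain only digits; the optional \r
        # tolerates files saved with CRLF line endings.
        /^[0-9]+\r?$/ {
            if (num != "" && $0 == num) next   # same index again: skip it
            num = $0                           # new index: remember it
        }
        { print }                              # print every line not skipped
    ' "$@"
}

# Hypothetical batch wrapper: clean every .srt file in the current directory.
# Each result goes to a temporary file first, and the original is replaced
# only if awk succeeded.
clean_srt_files() {
    for f in *.srt; do
        [ -e "$f" ] || continue                # no .srt files matched the glob
        tmp="$f.tmp"
        strip_repeated_index "$f" > "$tmp" && mv "$tmp" "$f"
    done
}
```

Running strip_repeated_index on a single file prints the cleaned subtitles to standard output, which is a safe way to check the result before calling clean_srt_files in the directory that holds the files. Working on copies first is advisable, because the batch wrapper overwrites the originals.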
