Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

bash - Insert text at the beginning of each line from 2 to end - 5

I want to insert a word followed by a tab character at the start of each line in a file (in place), starting at line 2 and stopping before the last 5 lines.

So if a file has 10 lines, I want to insert on lines 2 through 5 and keep lines 1 and 6-10 intact.

The file can have millions of lines (currently up to 10 million).

sed -i "s/^/word	/" filename 

The above works, but it inserts on every line, and I want to skip the first line and the last 5 lines. Also, to use a line range I would first have to count the lines, which is another operation. Since the number of lines varies, this extra pass can become an overhead. I am looking for an efficient solution. Here is what I have tried so far:

COUNT=$((`wc -l test_csnap_delta.csv | cut -d ' ' -f 1` - 5))
sed -n -i '2,$COUNT s/^/word	/' 

However, the above deletes all of the data in the file.
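
For reference, a likely cause of the data loss is the -n option: it suppresses sed's automatic printing, the script never prints anything itself, and -i then writes that empty output back to the file. The single quotes also stop the shell from expanding $COUNT, and no filename is passed. A minimal sketch of a counting version, assuming GNU sed (for -i without a suffix and for \t in the replacement); the file name sample_count.txt is only a placeholder:

```shell
# Hypothetical 10-line sample file, used only for this demonstration.
file=sample_count.txt
seq 1 10 > "$file"

# Last line that should receive the prefix: total line count minus 5.
count=$(( $(wc -l < "$file") - 5 ))

# Double quotes let the shell expand ${count}; without -n every line is
# still printed. The guard matters because sed treats a range whose end
# is below its start (e.g. 2,1) as matching the start line alone.
if [ "$count" -ge 2 ]; then
    sed -i "2,${count}s/^/word\t/" "$file"
fi

cat "$file"
```

This still reads the file twice (once for wc, once for sed), which is the overhead the question is trying to avoid.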

Thanks in advance.

1 Reply

This works without precounting the number of lines in the file:

sed -ni '1{p;b}; 2{N;N;N;N}; $p; $!{N;s/^/word\t/;P;D}' filename

This keeps a sliding buffer of five lines. Each time another line is appended, the substitution is made on the oldest line in the buffer, which is then printed and deleted. When the last line in the file is read, the buffer is printed without doing any substitutions.

  • 1{p;b} - read the first line, print it unchanged and branch to the end of the script
  • 2{N;N;N;N} - when line 2 is read, append four more lines to create a five-line buffer
  • $p - when the last line of the file is read, print the lines that remain in the buffer unchanged
  • $! - when the current line is not the last line in the file...
  • N - append the next line to the buffer (pattern space)
  • s/^/word\t/ - make the substitution on the first line in the buffer (GNU sed reads \t as a tab)
  • P - print only the first line in the buffer
  • D - delete only the first line in the buffer and restart the script without reading a new line

Note that this won't work properly for files that consist of fewer than 6 lines: N runs out of input while the buffer is being filled, and because of -n GNU sed then exits without printing the buffered lines.
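
To see the sliding buffer in action, here is a small run of the command on a 10-line file. This is only a sketch assuming GNU sed; the file name sample_sed.txt is a placeholder, and the substitution is written with \t so the inserted separator is a tab.

```shell
# Hypothetical 10-line sample file containing the numbers 1..10.
file=sample_sed.txt
seq 1 10 > "$file"

# Line 1 is printed as is; lines 2-6 fill the buffer; from then on each
# newly appended line pushes the oldest buffered line out with the prefix.
sed -ni '1{p;b}; 2{N;N;N;N}; $p; $!{N;s/^/word\t/;P;D}' "$file"

# Lines 2-5 now start with "word<TAB>"; lines 1 and 6-10 are unchanged.
cat "$file"
```

The options are written as -ni on purpose: GNU sed would parse -in as -i with the backup suffix "n", which leaves -n switched off.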

This is the same idea using AWK:

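awk 'FNR == 1 {print; next} FNR == 2 {for (ptr = 0; ptr <= 4; ptr++) {buffer[ptr] = $0; getline}; ptr = 0} {sub(/^/, "word\t", buffer[ptr]); print buffer[ptr]; buffer[ptr] = $0; ptr = (ptr + 1) % 5} END {for (i = 0; i <= 4; i++) {print buffer[(ptr + i) % 5]}}' filename > outputfile
mv outputfile filename

Here it is broken out on multiple lines:

# Line 1 is printed unchanged.
FNR == 1 {
    print
    next
}
# At line 2, load lines 2-6 into the buffer. The final getline also reads
# line 7 into $0, so this version needs a file of at least 7 lines.
FNR == 2 {
    for (ptr = 0; ptr <= 4; ptr++) {
        buffer[ptr] = $0
        getline
    }
    ptr = 0
}
# For every further line: prefix and print the oldest buffered line,
# then store the current line in its slot.
{
    sub(/^/, "word\t", buffer[ptr])
    print buffer[ptr]
    buffer[ptr] = $0
    ptr = (ptr + 1) % 5
}
# At the end, print the five buffered lines unchanged, oldest first.
END {
    for (i = 0; i <= 4; i++) {
        print buffer[(ptr + i) % 5]
    }
}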
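
If short files have to be handled too, the same ring-buffer idea can be written without getline. The following is a sketch of one possible variant, not part of the answer above; the variable names word, keep, buf and slot, and the file name sample_awk.txt, are all made up for illustration.

```shell
# Hypothetical 10-line sample file.
file=sample_awk.txt
seq 1 10 > "$file"

awk -v word="word" -v keep=5 '
NR == 1 { print; next }            # the first line is always left untouched
{
    slot = (NR - 2) % keep         # ring-buffer slot for the current line
    if (NR - 2 >= keep)            # the slot holds a line more than keep lines from the end
        print word "\t" buf[slot]
    buf[slot] = $0
}
END {
    n = NR - 1                     # number of lines after the first one
    start = (n > keep) ? n - keep : 0
    for (i = start; i < n; i++)    # flush what is still buffered, oldest first
        print buf[i % keep]
}' "$file" > "$file.tmp" && mv "$file.tmp" "$file"

cat "$file"
```

Because the buffer is flushed according to the actual line count, a file with fewer than keep + 2 lines simply comes out unchanged. Passing keep with -v also makes the number of protected trailing lines adjustable.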
