You can do this by inserting a loop into your sed script:
sed -e '/<a href/{;:next;/<\/a>/!{N;b next;};s,<a href="\(.*\)">\(.*\)</a>,\2 - \1,g;}' yourfile
As-is, that will leave an embedded newline in the output, and it wasn't clear if you wanted it that way or not. If not, just substitute out the newline:
sed -e '/<a href/{;:next;/<\/a>/!{N;b next;};s/\n//g;s,<a href="\(.*\)">\(.*\)</a>,\2 - \1,g;}' yourfile
And maybe clean up extra spaces:
sed -e '/<a href/{;:next;/<\/a>/!{N;b next;};s/\n//g;s/\s\{2,\}/ /g;s,<a href="\(.*\)">\(.*\)</a>,\2 - \1,g;}' yourfile
Explanation: The /<a href/{...} block lets us ignore lines we don't care about. Once we find a line we like, we check whether it contains the end marker. If not (/<\/a>/!), we append the next line to the pattern space, separated by a newline (N), and branch (b) back to :next to see if we've found it yet. Once we find it, we continue on with the substitutions.
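To see the loop at work, here is a small sketch that runs the same commands on a made-up input, written one command per line so each step can carry a comment. The sample text and the variable names input and result are only for illustration, and it assumes GNU sed (\s and this spacing are not guaranteed to work elsewhere).

```shell
# Hypothetical input: the link text wraps onto an indented second line.
input='before
<a href="http://example.com">Example
    site</a>
after'

# Same commands as the one-liner, one per line (assumes GNU sed).
result=$(printf '%s\n' "$input" | sed -e '
/<a href/ {
    :next
    # No closing tag yet: append the next input line and test again.
    /<\/a>/! {
        N
        b next
    }
    # The whole link is now in the pattern space: join lines, squeeze spaces.
    s/\n//g
    s/\s\{2,\}/ /g
    # Swap the order: link text first, then the URL.
    s,<a href="\(.*\)">\(.*\)</a>,\2 - \1,g
}')

printf '%s\n' "$result"
# prints:
# before
# Example site - http://example.com
# after
```

Note that .* is greedy, so this pattern is only safe when each matched line holds a single link.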