Many solutions exist for splitting a file on a pattern, but my case is specific: I need to be able to split within a line, and the cut should occur just before the pattern. Example:
Infile:
<?xml 1><blabla1>
<blabla><blabla2><blabla>
<blabla><blabla>
<blabla><blabla3><blabla><blabla>
<blabla><blabla><blabla><?xml 4>
<blabla>
<blabla><blabla><blabla>
<blabla><?xml 2><blabla><blabla>
With the pattern <?xml, this should become:
Outfile1:
<?xml 1><blabla1>
<blabla><blabla2><blabla>
<blabla><blabla>
<blabla><blabla3><blabla><blabla>
<blabla><blabla><blabla>
Outfile2:
<?xml 4>
<blabla>
<blabla><blabla><blabla>
<blabla>
Outfile3:
<?xml 2><blabla><blabla>
Actually, the Perl script in the accepted answer here works fine for my small example, but it generates an error on my actual files, which are much bigger (about 6 GB). The error is:
panic: sv_setpvn called with negative strlen at /home/.../split.pl line 7, <> chunk 1.
I don't have permission to comment, which is why I started a new post.
Finally, a Python solution would be even more appreciated, as I understand it better.
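To make the requirement concrete, here is a minimal sketch of the kind of Python approach I have in mind. It is not a tested solution for my real data. It reads the input in fixed-size binary chunks rather than loading the whole file into one string (my guess is that the Perl script fails on exactly that at 6 GB), and it keeps a few bytes of each chunk back so that a pattern straddling two chunks is still found. The function name `split_before_pattern`, the `Outfile` prefix and the 1 MiB chunk size are my own placeholders.

```python
def split_before_pattern(in_path, pattern=b"<?xml", out_prefix="Outfile",
                         chunk_size=1 << 20):
    """Split in_path into out_prefix1, out_prefix2, ... cutting just before
    every (non-overlapping) occurrence of pattern. Returns the paths written."""
    if not pattern:
        raise ValueError("pattern must not be empty")

    keep = len(pattern) - 1   # tail bytes that might begin a match spanning two chunks
    out_paths = []            # names of the pieces created so far
    out = None                # currently open output file, opened lazily
    buf = b""

    def write(data):
        # Open the next piece only when there is something to put in it,
        # so an input that starts with the pattern does not yield an empty file.
        nonlocal out
        if not data:
            return
        if out is None:
            path = f"{out_prefix}{len(out_paths) + 1}"
            out_paths.append(path)
            out = open(path, "wb")
        out.write(data)

    def close():
        nonlocal out
        if out is not None:
            out.close()
            out = None

    with open(in_path, "rb") as src:
        while True:
            chunk = src.read(chunk_size)
            if not chunk:
                break
            buf += chunk
            pos = 0
            while True:
                i = buf.find(pattern, pos)
                if i == -1:
                    break
                write(buf[pos:i])   # everything before the pattern ends the current piece
                close()
                write(pattern)      # the pattern itself starts the next piece
                pos = i + len(pattern)
            # Flush what cannot be part of a future match; hold back the tail.
            safe = max(pos, len(buf) - keep)
            write(buf[pos:safe])
            buf = buf[safe:]

    write(buf)                      # whatever is left at end of file
    close()
    return out_paths
```

With my example, calling `split_before_pattern("Infile")` should produce Outfile1, Outfile2 and Outfile3 as shown above. Memory use stays around one chunk regardless of the input size, and the pieces concatenated in order give back the original file.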