`fread` does not (yet) have any capability for reading fixed-width files.
I, too, often come across files annoyingly stored like this. Feel free to add a feature request on the GitHub page.
It may not be so in your case, but your solution with `sed` would not work on a lot of the FWF I come across, because there's no space between columns; e.g. you'll see strings like 00010 that actually comprise 3 fields.
If that's the case, you'll need a field width dictionary, at which point you have several options:
- Use `read.fwf` within R
- Write a `fwf`->`csv` program (I use one I wrote in Python and it's pretty fast; I could share the code if you'd like). This is basically the beefed-up version of your initial approach, so that you never have to deal with the FWF again
- Open it in Excel / LibreOffice / etc; there's a native FWF reader that tries (usually poorly) to guess the widths of the columns, which at least does half the work of specifying the column widths for you. Then you can save it as .csv or whatever from there.
I personally stick with the second option most often. `read.fwf` is not optimized like `fread`, so it will probably be slow. And if you've got a lot (say 20+) of FWF to read, the 3rd option is pretty tedious.
But I agree it would be nice to have something like this built in to `fread`.
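For what it's worth, a converter along the lines of the second option might look roughly like the sketch below. This is not the Python program mentioned above, just a minimal illustration of the idea. The names `FIELD_WIDTHS`, `split_fixed_width` and `fwf_to_csv`, the column names, and the 2/2/1 widths are all made up for the example; real widths have to come from whatever layout documentation ships with your file.

```python
import csv
import io

# Hypothetical field-width dictionary: column name -> width in characters.
# These widths are an assumption for illustration only.
FIELD_WIDTHS = {"state": 2, "county": 2, "flag": 1}


def split_fixed_width(line, widths):
    """Slice one line into fields using consecutive character widths."""
    fields = []
    start = 0
    for width in widths:
        # Strip padding spaces; values stay strings, so leading zeros survive.
        fields.append(line[start:start + width].strip())
        start += width
    return fields


def fwf_to_csv(src, dst, field_widths):
    """Read fixed-width lines from src and write them to dst as CSV."""
    writer = csv.writer(dst)
    writer.writerow(field_widths.keys())  # header row from the dictionary
    widths = list(field_widths.values())
    for line in src:
        line = line.rstrip("\r\n")
        if line:  # skip blank lines
            writer.writerow(split_fixed_width(line, widths))


if __name__ == "__main__":
    # In-memory demo; no separator exists between the three fields.
    src = io.StringIO("00010\n12345\n")
    dst = io.StringIO()
    fwf_to_csv(src, dst, FIELD_WIDTHS)
    print(dst.getvalue())
```

For real files, pass open file handles instead of the `StringIO` objects, and open the output with `newline=""` as the `csv` module recommends. Once the file has been converted, `fread` can read the resulting `.csv` as usual.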