I was wondering if there is a simple way in Tcl to read a double-byte file (or so I think it is called). My problem is that I get files that look fine when opened in Notepad (I'm on Win7), but when I read them in Tcl, there are spaces (or rather, null characters) between each and every character.
My current workaround has been to first run a string map to remove all the nulls:
string map [list \x00 {}] $file
and then process the information normally, but is there a simpler way to do this, through fconfigure, encoding or another way?
I'm not familiar with encodings so I'm not sure what arguments I should use.
fconfigure $input -encoding double
of course fails because double is not a valid encoding. Same with 'doublebyte'.
I'm actually working on big text files (above 2 GB) and doing my 'workaround' on a line by line basis, so I believe that this slows the process down.
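For reference, the per-line workaround described above can be sketched roughly as follows. The proc name stripNulls is hypothetical, and the sketch assumes the text is plain ASCII stored as two bytes per character, so that dropping every NUL leaves the readable characters.

```tcl
# Hypothetical helper: remove every NUL character from one line.
# [list \x00 {}] builds the mapping "NUL -> empty string"; a braced
# {\x00 {}} would not work because braces suppress backslash substitution.
proc stripNulls {line} {
    return [string map [list \x00 {}] $line]
}

# Example: "hi" as it looks when UTF-16-LE bytes are read as single bytes.
set cleaned [stripNulls "h\x00i\x00"]
```

This only hides the encoding mismatch rather than decoding the file, so any character outside the ASCII range would still come out wrong.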
EDIT: As pointed out by @mhawke, the file is UTF-16-LE encoded and this apparently is not a supported encoding. Is there an elegant way to circumvent this shortcoming, maybe through a proc? Or would this make things more complex than using string map?
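One possible direction, offered as a sketch rather than a confirmed answer: in Tcl 8.5/8.6 the encoding named unicode decodes 16-bit text in the machine's native byte order, which on a little-endian Windows machine should match UTF-16-LE. Newer Tcl releases may list utf-16le explicitly, so the sketch checks encoding names first. The sample file name and the variable names are assumptions made only so the example is self-contained.

```tcl
# Create a tiny UTF-16-LE sample file (BOM + "hi\n") to read back.
set path [file join [pwd] utf16le_sample.txt]   ;# hypothetical file name
set out [open $path wb]
puts -nonewline $out "\xFF\xFE"                        ;# byte order mark
puts -nonewline $out [binary format s* {104 105 10}]   ;# h, i, newline as 16-bit LE
close $out

# Prefer an explicit utf-16le encoding if this Tcl build lists one;
# otherwise fall back to "unicode" (native byte order, assumed little-endian).
if {"utf-16le" in [encoding names]} {
    set enc utf-16le
} else {
    set enc unicode
}

set in [open $path r]
fconfigure $in -encoding $enc
set lines {}
while {[gets $in line] >= 0} {
    # The BOM arrives as the character U+FEFF on the first line; drop it.
    lappend lines [string trimleft $line \uFEFF]
}
close $in
file delete $path
```

Because the channel does the decoding, the loop still reads line by line and no per-line string map is needed, which may matter for files of several gigabytes.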