Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


python - Line reading chokes on 0x1A

I have the following file:

abcde
kwakwa
<0x1A>
line3
linllll

Here <0x1A> represents a single byte with the hex value 0x1A. When I try to read this file in Python with:

for line in open('t.txt'):
    print line,

The loop reads only the first two lines and then exits.

The solution seems to be to open the file in binary mode ('rb') or universal newline mode ('rU'). Can you explain this behavior?


1 Reply


0x1A is Ctrl-Z, and DOS historically used that byte as an end-of-file marker. For example, open a command prompt and "type" your file: it will only display the content up to the Ctrl-Z.

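Python uses the Windows CRT function _wfopen, which implements the "Ctrl-Z is EOF" semantics for files opened in text mode. Opening the file in binary mode ('rb') bypasses that text-mode handling, which is why the whole file is then read.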
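
As a rough illustration of the binary-mode workaround mentioned in the question, here is a minimal sketch. It writes a temporary file with the same content as the sample (a hypothetical stand-in for t.txt) and reads it back with 'rb'. The names SAMPLE, read_lines_binary and demo are made up for this example and are not part of any library.

```python
import os
import tempfile

# Sample data mirroring the file in the question; b"\x1a" is the Ctrl-Z byte.
SAMPLE = b"abcde\nkwakwa\n\x1a\nline3\nlinllll\n"


def read_lines_binary(path):
    """Return every line of the file as bytes, with line endings stripped."""
    # 'rb' asks for binary mode, so no text-mode EOF handling is applied.
    with open(path, "rb") as f:
        return [line.rstrip(b"\r\n") for line in f]


def demo():
    # Write the sample to a temporary file, read it back, then clean up.
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(SAMPLE)
        return read_lines_binary(path)
    finally:
        os.remove(path)


if __name__ == "__main__":
    for line in demo():
        print(repr(line))
```

Read this way, the loop should see all five lines, including the one that holds only the 0x1A byte. The lines come back as raw bytes, so decoding them to text is left to the caller.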

