Yes, extracting from an input stream will set the EOF bit if the extraction stops at the end-of-file, as demonstrated by the std::stringstream example. If it were this simple, the loop with !file.eof() as its condition would work just fine on a file like:
hello
world
The second extraction would eat world, stopping at the end-of-file, and consequently setting the EOF bit. The next iteration wouldn't occur.
However, many text editors have a dirty secret. They're lying to you when you save a text file even as simple as that. What they don't tell you is that there's a hidden \n at the end of the file. Every line in the file ends with a \n, including the last one. So the file actually contains:
hello\nworld\n
This is what causes the last line to be duplicated when using !file.eof() as the condition. Now that we know this, we can see that the second extraction will eat world, stopping at \n and not setting the EOF bit (because we haven't gotten there yet). The loop will iterate a third time, but the next extraction will fail because it doesn't find a string to extract, only whitespace. The string is left with its previous value still hanging around, and so we get the duplicated line.
You don't experience this with std::stringstream because what you stick in the stream is exactly what you get. There's no \n at the end of std::stringstream ss("hello"), unlike in the file. If you were to do std::stringstream ss("hello\n"), you'd experience the same duplicate line issue.
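To make the duplication concrete, here is a minimal sketch that runs the flawed loop over an in-memory stream. The function name read_with_eof_loop is made up for illustration, and std::istringstream stands in for the file, on the assumption that its contents match the file byte for byte.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: reads words using the flawed !eof() loop and
// returns every value the loop "processed", including stale ones.
std::vector<std::string> read_with_eof_loop(const std::string& contents) {
    std::istringstream in(contents);  // stands in for the file stream
    std::vector<std::string> words;
    std::string word;
    while (!in.eof()) {
        in >> word;             // may fail, leaving word unchanged
        words.push_back(word);  // recorded anyway: this is the bug
    }
    return words;
}
```

Calling read_with_eof_loop("hello\nworld\n") yields hello, world, world, while read_with_eof_loop("hello\nworld") yields only hello, world. The only difference between the two inputs is the trailing \n.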
So of course, we can see that we should never use !file.eof() as the condition when extracting from a text file - but what's the real issue here? Why should we really never use that as our condition, regardless of whether we're extracting from a file or not?
The real problem is that eof() gives us no idea whether the next read will fail or not. In the above case, we saw that even though eof() returned false, the next extraction failed because there was no string to extract. The same situation would happen if we didn't associate a file stream with any file or if the stream was empty. The EOF bit wouldn't be set but there's nothing to read. We can't just blindly go ahead and extract from the file just because the EOF bit isn't set.
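A small sketch of that point, again using std::istringstream in place of a file. The helper name eof_misses_failure is hypothetical: it reports whether eof() was false immediately before an extraction that then failed.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: returns true when eof() reported false just before
// a read, and that read failed anyway.
bool eof_misses_failure(const std::string& contents) {
    std::istringstream in(contents);
    const bool eof_before = in.eof();  // false on a freshly constructed stream
    std::string word;
    in >> word;                        // fails if there is no word to extract
    return !eof_before && in.fail();
}
```

For an empty stream, eof_misses_failure("") returns true, and so does eof_misses_failure(" \n"), where only whitespace remains. For eof_misses_failure("hello") it returns false, because the read succeeds.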
Using while (std::getline(...)) and related conditions works perfectly because just before the extraction starts, the input function checks if any of the bad, fail, or EOF bits are set. If any of them are, it immediately ends, setting the fail bit in the process. It will also fail if it finds the end-of-file before it finds what it wants to extract, setting both the EOF and fail bits. Either way, the stream then evaluates to false in the loop condition, so the loop stops without processing a value that was never read.
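For comparison, here is a sketch of the same kind of read with the extraction itself as the loop condition. The function name read_lines is illustrative, and the in-memory stream is an assumption standing in for the file.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: reads all lines, using the extraction as the condition.
std::vector<std::string> read_lines(const std::string& contents) {
    std::istringstream in(contents);  // stands in for the file stream
    std::vector<std::string> lines;
    std::string line;
    // The body runs only when std::getline actually produced a line.
    while (std::getline(in, line)) {
        lines.push_back(line);
    }
    return lines;
}
```

Both read_lines("hello\nworld\n") and read_lines("hello\nworld") give hello, world with no duplicate, and an empty input gives an empty result. The same pattern applies to while (in >> word) for whitespace-separated tokens.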
Note: You can save a file without the extra \n in vim if you do :set noeol and :set binary before saving.