
c - Using int for character types when comparing with EOF

Quoting from Kernighan and Ritchie's 'The C Programming Language', page 16:

#include <stdio.h>

int main(void)
{
    int c;                      /* int, not char: must also hold EOF */

    c = getchar();
    while (c != EOF) {          /* copy input to output until end of file */
        putchar(c);
        c = getchar();
    }
    return 0;
}

"The type char is specifically meant for storing such character data, but any integer type can be used. We used int for a subtle but important reason. The problem is distinguishing the end of the input from valid data. The solution is that getchar returns a distinctive value when there is no more input, a value that cannot be confused with any real character. This value is called EOF, for "end of file". We must declare c to be a type big enough to hold any value that getchar returns. We can't use char since c must be big enough to hold EOF in addition to any possible char. Therefore we use int.".

I looked it up in stdio.h, and it says: #define EOF (-1)

The book states conclusively that char cannot be used, yet this program appears to "work just fine" (see EDIT) with c declared as char as well. What is going on? Can anyone explain in terms of bits and signed values?

EDIT:
As Oli mentioned in the answer, the program cannot distinguish between EOF and 255, so it will not in fact work fine. What I want to know is what's actually happening: when we do the comparison c != EOF, does the EOF value get converted to the char value 255 (11111111 in binary, i.e. bits 0 through 7 of EOF written in two's complement notation)?



1 Reply


getchar's result is either the input character converted to unsigned char and then to int, or EOF. That is, it ranges over -1 to 255: 257 distinct values. You can't store those in an 8-bit char without merging two of them. In practice, either you will mistake EOF for a valid character (that happens if char is unsigned, because EOF truncates to 255), or you will mistake another character for EOF (that happens if char is signed, because the input byte 255 truncates to -1).

Note: I'm assuming an 8-bit char type. I know that assumption isn't guaranteed by the standard; it is just by far the most common implementation choice.
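
To make both failure modes concrete, here is a minimal sketch. It assumes, per the note above, an 8-bit char with two's-complement representation, and uses the byte value 0xFF to stand in for a real input character:

#include <stdio.h>

int main(void)
{
    /* Signed char: the input byte 0xFF (255) truncates to -1,
       which compares equal to EOF -- a real character is
       mistaken for end of input, so a copy loop stops early. */
    signed char sc = (signed char)0xFF;
    printf("signed:   sc == EOF? %d\n", sc == EOF);   /* prints 1 */

    /* Unsigned char: EOF (-1) truncates to 255, which can never
       compare equal to EOF -- end of input is mistaken for a
       valid character, so a copy loop never terminates. */
    unsigned char uc = (unsigned char)EOF;
    printf("unsigned: uc == EOF? %d\n", uc == EOF);   /* prints 0 */

    return 0;
}

In both comparisons the char operand is promoted to int before being compared with EOF; the damage is done earlier, when the 257 possible results of getchar were squeezed into 8 bits.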

