TL;DR
ASCII
Until PHP 5.4, the PHP interpreter didn't care at all about the charset of PHP files, as evidenced by the fact that the zend.script_encoding ini directive only appeared in that version. It basically always treated them as ASCII.
When PHP needs to identify, for example, a function name that happens to contain characters beyond 7-bit ASCII (well, any labeled entity with any label really, but you get my point...), it merely looks in the symbol table for a function with the same byte sequence - an umlaut (or whatever...) written one way is treated as a different symbol from an umlaut written another way. Try it. For backwards compatibility, if zend.script_encoding is not set, this is still the default behavior. Also take note of the regex in the manual that shows what a valid identifier is: it is charset neutral (well... except for the Latin letters, which are in the 7-bit ASCII range) and is expressed in bytes rather than characters.
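A minimal sketch of the "try it" above, assuming a UTF-8 source and no zend.script_encoding in effect. It uses variable variables rather than function names so that both spellings can be written as byte escapes; the variable names are made up for the illustration, and the pattern is the identifier regex as I remember it from the manual (older editions start the high range at \x7f).

```php
<?php
// Two spellings of "fü": precomposed U+00FC vs. "u" followed by combining U+0308.
$nfc = "f\xC3\xBC";   // bytes: 66 C3 BC
$nfd = "fu\xCC\x88";  // bytes: 66 75 CC 88

// Identifier pattern expressed in bytes, not characters.
$identifier = '/^[a-zA-Z_\x80-\xff][a-zA-Z0-9_\x80-\xff]*$/';

$nfcIsValid = preg_match($identifier, $nfc) === 1;
$nfdIsValid = preg_match($identifier, $nfd) === 1;

// Variable variables go through the same byte-wise symbol lookup.
${$nfc} = 'set';
$nfcDefined = isset(${$nfc});
$nfdDefined = isset(${$nfd});

var_dump($nfcIsValid, $nfdIsValid); // both labels are valid identifiers
var_dump($nfcDefined, $nfdDefined); // only the exact byte sequence is found
```

Both labels pass the pattern, yet only the one that was actually assigned is found, because the lookup compares bytes and knows nothing about Unicode equivalence.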
This also leads us to the declare(encoding) construct. If you see THAT in a file, that's the definitive charset to honor for that particular file (and ONLY that file). Use something else until you encounter one, and if you see more than one, honor the second one from its declare statement onward.
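For a static tool, spotting those declarations could look roughly like the sketch below. The function name findDeclaredEncodings is hypothetical, and the regex is a deliberate simplification: it will also match text inside strings and comments, so a real tool would more likely walk the output of token_get_all() instead.

```php
<?php
// Hypothetical helper: list every declare(encoding=...) with its byte offset,
// in the order they appear in the source.
function findDeclaredEncodings(string $source): array
{
    $pattern = '/\bdeclare\s*\(\s*encoding\s*=\s*([\'"])([^\'"]+)\1\s*\)/i';
    preg_match_all($pattern, $source, $matches, PREG_SET_ORDER | PREG_OFFSET_CAPTURE);

    $found = [];
    foreach ($matches as $m) {
        // With PREG_OFFSET_CAPTURE every group is a [text, byteOffset] pair.
        $found[] = ['encoding' => $m[2][0], 'offset' => $m[0][1]];
    }
    return $found;
}

$source = "<?php\ndeclare(encoding='ISO-8859-1');\necho 'hi';\n";
$declared = findDeclaredEncodings($source);
print_r($declared);
```

The offsets let the caller apply each encoding only from its declare statement onward, as described above.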
If there's none...
In a static context (i.e. when you don't know the effective ini settings), you'd need to fall back to something else (something that's user defined, ideally) when the charset is important, or otherwise just treat characters beyond 7-bit ASCII as pure binary, and display them in some uniform code-point-like fashion.
In a dynamic context (e.g. if you could rename the file for a moment, create a temporary file in its place with the same name, have it echo the value of zend.script_encoding, and then restore the original file), you should use the zend.script_encoding value if available, and fall back to something else (just as in a static context) otherwise.
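The order of precedence described so far can be sketched as a small fallback chain. The function name resolveScriptEncoding and its parameters are hypothetical; the only real API used is ini_get(), and the 'UTF-8' passed as the user fallback is just an example value.

```php
<?php
// Hypothetical helper: pick the first usable hint, in order of authority:
// declare(encoding) in the file, then zend.script_encoding, then a user setting.
function resolveScriptEncoding(?string $declared, ?string $iniValue, ?string $userFallback): ?string
{
    foreach ([$declared, $iniValue, $userFallback] as $candidate) {
        if ($candidate !== null && $candidate !== '') {
            return $candidate;
        }
    }
    // Nothing known: the caller should treat non-ASCII bytes as opaque binary.
    return null;
}

// Dynamic context: ask the running interpreter (empty string when not set).
$ini = ini_get('zend.script_encoding');
$encoding = resolveScriptEncoding(null, $ini === false ? null : $ini, 'UTF-8');
echo $encoding, "\n";
```

In a static context the second argument would simply be null, leaving only the declaration and the user-defined fallback.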
The same treatment applies to strings, HTML fragments and any other contents of a PHP file - it's all just read as a binary string, except for certain ASCII characters (i.e. bytes) that are important to the PHP lexer, such as the sequence "<?php" (notice that all of those are ASCII characters...), an apostrophe within a single quoted string, etc. The interpreter itself doesn't care about a string's charset, and if you must display a string's contents on screen, you should use the above means to figure out the best way to do so.
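When no charset can be determined, one way to get the "uniform code-point-like" display mentioned earlier is to show every byte outside 7-bit ASCII as a hex escape. This is only a sketch; escapeNonAscii is a hypothetical name and the \xNN notation is an arbitrary choice.

```php
<?php
// Hypothetical helper: render bytes 0x80-0xFF as \xNN, leave ASCII untouched.
function escapeNonAscii(string $bytes): string
{
    return preg_replace_callback(
        '/[\x80-\xFF]/',
        function (array $m): string {
            return sprintf('\\x%02X', ord($m[0]));
        },
        $bytes
    );
}

$shown = escapeNonAscii("caf\xC3\xA9");
echo $shown, "\n"; // prints caf\xC3\xA9
```

The pattern has no /u modifier on purpose, so it works byte by byte and never fails on input that isn't valid UTF-8.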
Edge cases (requested in comments):
1. Is there a restriction on which encodings are allowed?
There doesn't seem to be a list of allowed encodings anywhere, or at least I can't find one. Given that this is the successor of the --enable-zend-multibyte compile setting, UTF encodings of all flavors are sure to be on that list. Even if other (ANSI) encodings have no effect on PHP itself, that shouldn't deter you from using that value as a hint.
2. How does "declare(encoding)" work if the source file is UTF-16 (null 8-bit bytes between the 8-bit ASCII chars of the declaration)?
zend.script_encoding is used until a declare(encoding) is encountered. If it's not set, ASCII is assumed. I don't use UTF-16, so I can't say for sure, but this may well be a problem for PHP files encoded as UTF-16, since the declaration itself would then not be readable as ASCII. That said, I think it's fair to say the vast majority of developers just don't encode their scripts in UTF-16. Their data, sure, if the application calls for it. But not the script itself. Most PHP files in the wild are encoded in either an ANSI encoding or UTF-8.
3. If the .ini or the file setting is UTF-8 or otherwise, then identifiers are presumably taken only from code points in the range x41-xFF, but not from code points x100 and up?
I haven't tried supplying invalid UTF-8 bytes, so I can't tell you the answer to that one, and the manual doesn't state anything on the question either. I would assume that PHP execution will fail with a parse error on that. Or at least it should. As far as your tool is concerned, it should report the invalid UTF-8 sequence anyway, since even if PHP allows it, that's still a QA problem.
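As for how a tool might flag invalid UTF-8 in a file that claims to be UTF-8, here is one possible check. It leans on the fact that PCRE refuses a subject that isn't valid UTF-8 when the /u modifier is set; isValidUtf8 is a hypothetical name, and mb_check_encoding() would do the same job if the mbstring extension is available.

```php
<?php
// Hypothetical helper: true when $bytes is a well-formed UTF-8 sequence.
// preg_match() returns false (not 0) when /u is given an invalid subject.
function isValidUtf8(string $bytes): bool
{
    return preg_match('//u', $bytes) === 1;
}

$good = isValidUtf8("caf\xC3\xA9"); // "é" as the two-byte UTF-8 sequence
$bad  = isValidUtf8("caf\xE9");     // "é" as a lone ISO-8859-1 byte
var_dump($good, $bad);
```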
4. For UTF encodings, are characters in strings represented as their UTF code point (that makes no sense, since PHP strings seem to only have 8-bit characters)?
No. Characters in strings and non-PHP content are still treated as just a sequence of bytes, which you can confirm by looking at the output of strlen() and seeing how it differs from that of mb_strlen(), which is the one that respects encoding (well... it respects the mbstring.internal_encoding setting to be exact, but still).
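A quick way to see that difference, assuming the mbstring extension is loaded (the sketch skips the mb_strlen() call otherwise). The encoding is passed explicitly here so the result doesn't depend on any ini setting.

```php
<?php
// "ü" written as its UTF-8 byte sequence, so the file's own encoding is irrelevant.
$s = "\xC3\xBC";

$byteCount = strlen($s); // counts bytes
$charCount = function_exists('mb_strlen')
    ? mb_strlen($s, 'UTF-8') // counts characters in the stated encoding
    : null;

var_dump($byteCount, $charCount);
```

strlen() reports 2 because it only sees bytes, while mb_strlen() reports 1 once it is told the bytes are UTF-8.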
5. If not, what does it mean to set the encoding to UTF something?
AFAIK, it affects lookups in the symbol table. With UTF set, umlauts written in different ways, or in different UTF flavors that end up with the same UTF code points... they would all converge on the same symbol, as opposed to without declare(encoding), where a byte-by-byte comparison is done instead. And I say "AFAIK" here because, frankly, I've never run such experiments myself... I'm a goody-goody "everything-as-valid-UTF-8"-er.