Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

python - why is 'ord' seen as an unassigned variable here?

I hope this isn't a duplicate. It's hard to tell, given how many questions there are about this error, most of which come down to basic mistakes. But I don't understand what happens here.

def f():
    c = ord('a')

f()

This runs with no error (ord is a built-in that converts a character to its integer code point, which for 'a' is its ASCII code). Now:

if False:
    ord = None
def f():
    c = ord('a')

f()

This also runs with no error (ord isn't overwritten, because the condition is always false). Now:

def f():
    if False:
        ord = None
    c = ord('a')

f()

I get, at the line c = ord('a'):

UnboundLocalError: local variable 'ord' referenced before assignment

It seems that merely assigning to a name somewhere in a function makes it a local variable, even if that assignment is never run.

Obviously I can work around this, but I was very surprised, given that the dynamic nature of Python allows you to define a variable as an integer and, on the next line, redefine it as a string.

It seems related to "What's the scope of a variable initialized in an if statement?"

Apparently the interpreter still takes note of unreached branches when compiling to bytecode, but what happens exactly?

(tested on Python 2.7 and Python 3.4)

1 Reply

It's not about the compiler doing a static analysis based on unrelated branches when compiling to bytecode; it's much simpler.

Python has a rule for distinguishing global, closure, and local variables. All variables that are assigned to in the function (including parameters, which are assigned to implicitly) are local variables, unless they are named in a global or nonlocal statement. This is explained in "Naming and binding" and the subsequent sections of the reference documentation.

This isn't about keeping the interpreter simple; it's about keeping the rule simple enough that it's usually intuitive to human readers, and can easily be worked out by humans when it isn't intuitive. (That's especially important for cases like this: the behavior can't be intuitive everywhere, so Python keeps the rule simple enough that, once you learn it, cases like this are still obvious. But you definitely do have to learn the rule before that's true. And, of course, most people learn the rule by being surprised by it the first time…)

Even with an optimizer smart enough to completely remove any bytecode related to if False: ord=None, ord must still be a local variable by the rules of the language semantics.

So: there's an ord = in your function, therefore all references to ord in that function are references to a local variable, not to any global, nonlocal, or builtin that happens to have the same name, and therefore your code raises an UnboundLocalError.
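To illustrate that whether the binding ever runs is irrelevant, here is a small sketch with hypothetical function names. One function binds ord after a return, the other uses ord as a for loop target, which also counts as a binding:

```python
def dead_assignment():
    return ord('a')
    ord = None  # unreachable, yet it still makes `ord` local to this function

def loop_target():
    code = ord('a')   # `ord` is local here because of the for loop below
    for ord in ():    # a for target is a binding too, even over an empty iterable
        pass
    return code

def failure_name(func):
    """Call func and return the name of the NameError subclass it raises, if any."""
    try:
        func()
    except NameError as exc:  # UnboundLocalError is a subclass of NameError
        return type(exc).__name__
    return None

results = {func.__name__: failure_name(func) for func in (dead_assignment, loop_target)}
```

Both entries in results come out as 'UnboundLocalError', even though neither binding is ever executed before the read.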


Many people get by without knowing the actual rule, and instead use an even simpler rule: a variable is

  • Local if it possibly can be, otherwise
  • Enclosing if it possibly can be, otherwise
  • Global if it's in globals, otherwise
  • Builtin if it's in builtins, otherwise
  • an error

While this works for most cases, it can be a bit misleading in some cases, like this one. A language with LEGB scoping done Lisp-style would see that ord isn't in the local namespace, and therefore return the global, but Python doesn't do that. You could say that ord is in the local namespace but bound to a special "undefined" value, and that's actually close to what happens under the covers. But that's not what the rules of Python say, and, while it may be more intuitive for simple cases, it's harder to reason through.
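The "local but without a value" state can be observed from Python itself. The sketch below uses made-up function names: the first function checks that locals() does not list a local that has never been given a value, and the second shows that del puts a local back into that state rather than uncovering an outer name:

```python
def probe():
    if False:
        ord = None
    # locals() only lists locals that currently have a value, so the
    # never-assigned local `ord` is absent from it.
    return 'ord' in locals()

def unbind_again():
    value = 1
    del value      # `value` stays a local name, but no longer has a value
    try:
        return value
    except UnboundLocalError:
        return 'unbound'

ord_listed = probe()
after_del = unbind_again()
```

Here ord_listed is False and after_del is 'unbound'.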


If you're curious how this works under the covers:

In CPython, the compiler scans your function to find all assignments with an identifier as a target, and stores them in an array. It removes global and nonlocal variables. This array ends up as your code object's co_varnames, so let's say your ord is co_varnames[1]. Every use of that variable then gets compiled to a LOAD_FAST 1 or STORE_FAST 1, instead of a LOAD_GLOBAL, STORE_GLOBAL, or other name-based operation. That LOAD_FAST 1 just loads slot 1 of the frame's array of fast locals onto the stack when interpreted. (This is a C-level array inside the frame, not the f_locals mapping you can see from Python.) The array starts off full of NULL pointers instead of pointers to Python objects, and if a LOAD_FAST loads a NULL pointer, it raises UnboundLocalError. Newer CPython versions may emit a variant such as LOAD_FAST_CHECK for loads that can fail this way, but the mechanism is the same.
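You can check this on your own interpreter with the standard dis module. This is a sketch that assumes CPython 3.4 or later (for dis.get_instructions). The exact opcode names vary between versions, so it only looks at their prefixes, and the helper load_ops is a made-up name:

```python
import dis

def f():
    if False:
        ord = None
    c = ord('a')

def g():
    c = ord('a')

def load_ops(func, name):
    """Return the names of the LOAD_* opcodes in func that refer to `name`."""
    return [
        ins.opname
        for ins in dis.get_instructions(func)
        if ins.argval == name and ins.opname.startswith('LOAD')
    ]

f_locals_names = f.__code__.co_varnames   # the compiler's list of local names
f_loads = load_ops(f, 'ord')              # some LOAD_FAST variant
g_loads = load_ops(g, 'ord')              # LOAD_GLOBAL, which falls back to builtins
```

In f, ord appears in co_varnames and is read with a LOAD_FAST-style instruction; in g, which never assigns to ord, it is not in co_varnames and is read with LOAD_GLOBAL.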

