- The bug is now fixed in recent versions of python as explained in @jsf's answer
dictionary-displays
If a comma-separated sequence of key/datum pairs is given, they are evaluated from left to right to define the entries of the dictionary: each key object is used as a key into the dictionary to store the corresponding datum. This means that you can specify the same key multiple times in the key/datum list, and the final dictionary’s value for that key will be the last one given.
A dict comprehension, in contrast to list and set comprehensions, needs two expressions separated with a colon followed by the usual “for” and “if” clauses. When the comprehension is run, the resulting key and value elements are inserted in the new dictionary in the order they are produced.
set displays
A set display yields a new mutable set object, the contents being specified by either a sequence of expressions or a comprehension. When a comma-separated list of expressions is supplied, its elements are evaluated from left to right and added to the set object. When a comprehension is supplied, the set is constructed from the elements resulting from the comprehension.
There is a difference in calling the set constructor or using a comprehension and the plain literal.
def f1():
return {x for x in [True, 1]}
def f2():
return set([True, 1])
def f3():
return {True, 1}
print(f1())
print(f2())
print(f3())
import dis
print("f1")
dis.dis(f1)
print("f2")
dis.dis(f2)
print("f3")
dis.dis(f3)
Output:
{True}
{True}
{1}
How they are created influences the outcome:
605 0 LOAD_CONST 1 (<code object <setcomp> at 0x7fd17dc9a270, file "/home/padraic/Dropbox/python/test.py", line 605>)
3 LOAD_CONST 2 ('f1.<locals>.<setcomp>')
6 MAKE_FUNCTION 0
9 LOAD_CONST 3 (True)
12 LOAD_CONST 4 (1)
15 BUILD_LIST 2
18 GET_ITER
19 CALL_FUNCTION 1 (1 positional, 0 keyword pair)
22 RETURN_VALUE
f2
608 0 LOAD_GLOBAL 0 (set)
3 LOAD_CONST 1 (True)
6 LOAD_CONST 2 (1)
9 BUILD_LIST 2
12 CALL_FUNCTION 1 (1 positional, 0 keyword pair)
15 RETURN_VALUE
f3
611 0 LOAD_CONST 1 (True)
3 LOAD_CONST 2 (1)
6 BUILD_SET 2
9 RETURN_VALUE
Python only runs the BUILD_SET
bytecode when you pass a pure literal separated by commas as per:
When a comma-separated list of expressions is supplied, its elements are evaluated from left to right and added to the set object.
The line for the comprehension:
When a comprehension is supplied, the set is constructed from the elements resulting from the comprehension.
So thanks to Hamish filing a bug report it does indeed come down to the BUILD_SET
opcode as per Raymond Hettinger's comment in the link The culprit is the BUILD_SET opcode in Python/ceval.c which unnecessarily loops backwards, the implementation of which is below:
TARGET(BUILD_SET) {
PyObject *set = PySet_New(NULL);
int err = 0;
if (set == NULL)
goto error;
while (--oparg >= 0) {
PyObject *item = POP();
if (err == 0)
err = PySet_Add(set, item);
Py_DECREF(item);
}
if (err != 0) {
Py_DECREF(set);
goto error;
}
PUSH(set);
DISPATCH();
}