I'm using Python 2.7.3.
Consider a dummy class with custom (albeit bad) iteration and item-getting behavior:
class FooList(list):
    def __iter__(self):
        return iter(self)
    def next(self):
        return 3
    def __getitem__(self, idx):
        return 3
Make an example and see the weird behavior:
>>> zz = FooList([1,2,3])
>>> [x for x in zz]
# Hangs because of the self-reference in `__iter__`.
>>> zz[0]
3
>>> zz[1]
3
But now, let's make a function and then do argument unpacking on zz:
def add3(a, b, c):
    return a + b + c
>>> add3(*zz)
6
# I expected either 9 or for the interpreter to hang like the comprehension!
So argument unpacking is somehow getting the item data from zz, but neither by iterating over the object with its implemented iterator nor by doing a poor man's iteration and calling __getitem__ for as many items as the object has.
So the question is: how does the syntax add3(*zz) acquire the data members of zz if not by these methods? Am I just missing one other common pattern for getting data members from a type like this?
My goal is to see if I could write a class that implements iteration or item-getting in such a way that it changes what the argument unpacking syntax means for that class. After trying the two examples above, I'm now wondering how argument unpacking gets at the underlying data and whether the programmer can influence that behavior. Googling for this only gave back a sea of results explaining the basic usage of the *args syntax.
I don't have a use case for needing to do this and I am not claiming it is a good idea. I just want to see how to do it for the sake of curiosity.
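One way to probe which hooks a call like add3(*zz) actually touches is to record every special-method call. The following is only a minimal sketch, assuming a plain object subclass rather than a list subclass; the names Traced and calls are hypothetical, and the code avoids the Python 2-only next method so it should run on both Python 2.7 and Python 3.

```python
class Traced(object):
    """Wraps a list and records which special methods get invoked."""

    def __init__(self, lst):
        self.lst = lst
        self.calls = []

    def __iter__(self):
        self.calls.append('__iter__')
        return iter(self.lst)

    def __getitem__(self, idx):
        self.calls.append('__getitem__')
        return self.lst[idx]


def add3(a, b, c):
    return a + b + c


zz = Traced([1, 2, 3])
total = add3(*zz)
# zz.calls now lists the hooks that the unpacking went through.
```

Inspecting zz.calls afterwards shows whether unpacking went through __iter__ or __getitem__. For a class like this one, which derives from object, I would expect only __iter__ to appear.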
Added
Since the built-in types are treated specially, here's an example with object, where I just maintain a list object and implement my own get and set behavior to emulate list.
class FooList(object):
    def __init__(self, lst):
        self.lst = lst
    def __iter__(self): raise ValueError
    def next(self): return 3
    def __getitem__(self, idx): return self.lst.__getitem__(idx)
    def __setitem__(self, idx, itm): self.lst.__setitem__(idx, itm)
In this case,
In [234]: zz = FooList([1,2,3])
In [235]: [x for x in zz]
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-235-ad3bb7659c84> in <module>()
----> 1 [x for x in zz]
<ipython-input-233-dc9284300db1> in __iter__(self)
2 def __init__(self, lst):
3 self.lst = lst
----> 4 def __iter__(self): raise ValueError
5 def next(self): return 3
6 def __getitem__(self, idx): return self.lst.__getitem__(idx)
ValueError:
In [236]: add_3(*zz)
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-236-f9bbfdc2de5c> in <module>()
----> 1 add_3(*zz)
<ipython-input-233-dc9284300db1> in __iter__(self)
2 def __init__(self, lst):
3 self.lst = lst
----> 4 def __iter__(self): raise ValueError
5 def next(self): return 3
6 def __getitem__(self, idx): return self.lst.__getitem__(idx)
ValueError:
If I instead make iteration always return 3 and terminate properly, I get the behavior I was trying to play around with in the first case:
class FooList(object):
    def __init__(self, lst):
        self.lst = lst
        self.iter_loc = -1
    def __iter__(self): return self
    def next(self):
        if self.iter_loc < len(self.lst)-1:
            self.iter_loc += 1
            return 3
        else:
            self.iter_loc = -1
            raise StopIteration
    def __getitem__(self, idx): return self.lst.__getitem__(idx)
    def __setitem__(self, idx, itm): self.lst.__setitem__(idx, itm)
Then I see this, which is what I originally expected:
In [247]: zz = FooList([1,2,3])
In [248]: ix = iter(zz)
In [249]: ix.next()
Out[249]: 3
In [250]: ix.next()
Out[250]: 3
In [251]: ix.next()
Out[251]: 3
In [252]: ix.next()
---------------------------------------------------------------------------
StopIteration Traceback (most recent call last)
<ipython-input-252-29d4ae900c28> in <module>()
----> 1 ix.next()
<ipython-input-246-5479fdc9217b> in next(self)
10 else:
11 self.iter_loc = -1
---> 12 raise StopIteration
13 def __getitem__(self, idx): return self.lst.__getitem__(idx)
14 def __setitem__(self, idx, itm): self.lst.__setitem__(idx, itm)
StopIteration:
In [253]: ix = iter(zz)
In [254]: ix.next()
Out[254]: 3
In [255]: ix.next()
Out[255]: 3
In [256]: ix.next()
Out[256]: 3
In [257]: ix.next()
---------------------------------------------------------------------------
StopIteration Traceback (most recent call last)
<ipython-input-257-29d4ae900c28> in <module>()
----> 1 ix.next()
<ipython-input-246-5479fdc9217b> in next(self)
10 else:
11 self.iter_loc = -1
---> 12 raise StopIteration
13 def __getitem__(self, idx): return self.lst.__getitem__(idx)
14 def __setitem__(self, idx, itm): self.lst.__setitem__(idx, itm)
StopIteration:
In [258]: add_3(*zz)
Out[258]: 9
In [259]: zz[0]
Out[259]: 1
In [260]: zz[1]
Out[260]: 2
In [261]: zz[2]
Out[261]: 3
In [262]: [x for x in zz]
Out[262]: [3, 3, 3]
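The class above keeps its cursor (iter_loc) on the object itself, so two iterators over the same instance would share state. A possible alternative, sketched here under the assumption that a generator-based __iter__ is acceptable, gives each iteration its own cursor. FooListGen is a hypothetical name, and the code is written to run on both Python 2.7 and Python 3.

```python
class FooListGen(object):
    def __init__(self, lst):
        self.lst = lst

    def __iter__(self):
        # Each call builds a fresh generator: one 3 per stored item.
        for _ in self.lst:
            yield 3

    def __getitem__(self, idx):
        return self.lst[idx]

    def __setitem__(self, idx, itm):
        self.lst[idx] = itm


def add3(a, b, c):
    return a + b + c


zz = FooListGen([1, 2, 3])
unpacked_sum = add3(*zz)          # unpacking goes through __iter__
indexed = [zz[0], zz[1], zz[2]]   # indexing still reads the real data
```

Because every call to __iter__ returns a new generator, nested or repeated iteration over the same instance does not need the manual reset that iter_loc required.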
Summary
The syntax *args relies on iteration only. For built-in types (at least in the Python 2.7.3 I tested) this happens in a way that is not directly overridable in classes that inherit from the built-in type.
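"Iteration" here includes the legacy sequence fallback: when a class defines __getitem__ but no __iter__, Python iterates by calling __getitem__ with 0, 1, 2, ... until IndexError is raised, and argument unpacking follows the same path. A small sketch of that case, with the hypothetical names GetItemOnly and seen, written to run on both Python 2.7 and Python 3:

```python
class GetItemOnly(object):
    """Hypothetical class with __getitem__ but no __iter__."""

    def __init__(self, lst):
        self.lst = lst
        self.seen = []  # indices requested so far

    def __getitem__(self, idx):
        self.seen.append(idx)
        # Indexing past the end raises IndexError, which ends iteration.
        return self.lst[idx] * 10


def add3(a, b, c):
    return a + b + c


zz = GetItemOnly([1, 2, 3])
result = add3(*zz)
```

Here result is built from the values that __getitem__ returned, and zz.seen records the indices that were requested, including the final one that triggered IndexError.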
These two are functionally equivalent:
foo(*[x for x in args])
foo(*args)
These are not equivalent, even for finite data structures:
foo(*args)
foo(*[args[i] for i in range(len(args))])
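Both comparisons can be checked with a class whose __iter__ and __getitem__ deliberately disagree. This is only an illustrative sketch: Mismatch and this foo are hypothetical stand-ins, and the class derives from object so that its __iter__ is honored.

```python
class Mismatch(object):
    """Iterating yields 3s, while indexing returns the stored values."""

    def __init__(self, lst):
        self.lst = lst

    def __len__(self):
        return len(self.lst)

    def __iter__(self):
        return iter([3] * len(self.lst))

    def __getitem__(self, idx):
        return self.lst[idx]


def foo(*items):
    # Return the received positional arguments unchanged.
    return items


args = Mismatch([1, 2, 3])
via_star = foo(*args)
via_comprehension = foo(*[x for x in args])
via_index = foo(*[args[i] for i in range(len(args))])
```

The first two calls receive the iterated values, so they agree with each other. The third call receives the indexed values, so it differs whenever __iter__ and __getitem__ disagree.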