Why should we use generators? [Python]
Generators don’t hold the entire result in memory. They yield one result at a time.
Before we get into the idea of generators, we need to understand the difference between “iterables” and “iterators”.
Iterables and Iterators
Let me start by saying that lists, tuples, strings, dictionaries, etc. are iterables. Let us see an example:
testList = [1, 2, 3]
for val in testList:
    print(val)
This will output (as we can guess):
1
2
3
For an object to be an iterable, it needs to have an __iter__() method.
We can check if our list has this method using the built-in dir function:
print(dir(testList))
This outputs a list:
['__add__', '__class__', '__contains__', '__delattr__', '__delitem__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__getitem__', '__gt__', '__hash__', '__iadd__', '__imul__', '__init__', '__init_subclass__', '__iter__', '__le__', '__len__', '__lt__', '__mul__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__reversed__', '__rmul__', '__setattr__', '__setitem__', '__sizeof__', '__str__', '__subclasshook__', 'append', 'clear', 'copy', 'count', 'extend', 'index', 'insert', 'pop', 'remove', 'reverse', 'sort']
The output includes the __iter__ method, as we can see.
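Instead of scanning the dir output by eye, the same check can be done programmatically. The following is a minimal sketch: the helper name is_iterable is hypothetical (it is not part of the original post), and collections.abc.Iterable is the standard-library abstract class that performs a similar test.

```python
from collections.abc import Iterable


def is_iterable(obj):
    """Return True if obj defines __iter__ (hypothetical helper for illustration)."""
    return hasattr(obj, "__iter__")


testList = [1, 2, 3]

print(is_iterable(testList))           # True: lists define __iter__
print(is_iterable(42))                 # False: integers do not
print(isinstance(testList, Iterable))  # True
print(isinstance("abc", Iterable))     # True: strings are iterables too
```

Both checks only look for the __iter__ method, which is the requirement described above.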
Iterators, unlike iterables, have state: an iterator knows where it is during an iteration and knows how to get the next value.
The list object in the above example is an iterable but not an iterator, so if we ask it for its next value, it will not know.
print(next(testList))
This throws an error:
Traceback (most recent call last):
File "iterables.py", line 7, in <module>
print(next(testList))
TypeError: 'list' object is not an iterator
If we convert the list object into an iterator using the __iter__ method, then we can apply the next function to it:
testIter = testList.__iter__()
print(testIter)
or
testIter = iter(testList)
print(testIter)
This will print:
<list_iterator object at 0x7fd6d6f2de90>
If we apply the next function now, it will work:
print(next(testIter))
print(next(testIter))
print(next(testIter))
1
2
3
But if we call next a fourth time, it raises a StopIteration error.
print(next(testIter))
print(next(testIter))
print(next(testIter))
print(next(testIter))
1
2
3
Traceback (most recent call last):
File "iterables.py", line 15, in <module>
print(next(testIter))
StopIteration
This means that the iterator knows where to stop.
If we use a for loop to run this, then Python will automatically figure out where to stop:
testIter = iter(testList)  # start from a fresh iterator; the previous one is exhausted
for val in testIter:
    print(val)
1
2
3
To understand this, let us perform the same operation using a while loop.
testIter = iter(testList)  # fresh iterator
while True:
    try:
        item = next(testIter)
        print(item)
    except StopIteration:
        break
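Python has several built-ins that produce their values one at a time instead of storing them all, such as the range we most often use (strictly speaking, range returns an iterable, and its iterator does the stepping). We can create our own class for iterators. Let us create one similar to range: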
class rangeNew:
    def __init__(self, start, end, step=1):
        self.value = start
        self.end = end
        self.step = step

    def __iter__(self):
        return self

    def __next__(self):
        if self.value >= self.end:
            raise StopIteration
        current = self.value
        self.value += self.step
        return current

rangeIter = rangeNew(1, 10, 2)
for val in rangeIter:
    print(val)
1
3
5
7
9
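Because rangeNew returns itself from __iter__, it is an iterator and can be consumed only once. The sketch below illustrates this and contrasts it with a reusable iterable. The class RangeIterable is a hypothetical addition for illustration and is not part of the original post.

```python
class rangeNew:
    """Iterator from the text: it is its own iterator, so it can be consumed only once."""

    def __init__(self, start, end, step=1):
        self.value = start
        self.end = end
        self.step = step

    def __iter__(self):
        return self

    def __next__(self):
        if self.value >= self.end:
            raise StopIteration
        current = self.value
        self.value += self.step
        return current


class RangeIterable:
    """Hypothetical reusable wrapper: every __iter__ call returns a fresh rangeNew."""

    def __init__(self, start, end, step=1):
        self.start, self.end, self.step = start, end, step

    def __iter__(self):
        return rangeNew(self.start, self.end, self.step)


rangeIter = rangeNew(1, 10, 2)
first_pass = list(rangeIter)   # [1, 3, 5, 7, 9]
second_pass = list(rangeIter)  # [] because the iterator is already exhausted

reusable = RangeIterable(1, 10, 2)
pass_a = list(reusable)        # [1, 3, 5, 7, 9]
pass_b = list(reusable)        # [1, 3, 5, 7, 9] again

print(first_pass, second_pass)
print(pass_a, pass_b)
```

This is the same distinction as between testList (an iterable that can be looped over repeatedly) and testIter (an iterator that is used up after one pass).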
Generators
Generators don’t hold the entire result in memory. They yield one result at a time.
Ways of creating generators:
- Using a function
def squares_gen(num):
    for i in num:
        yield i**2

def squares(num):
    results = []
    for i in num:
        results.append(i**2)
    return results

Timings for num = np.arange(1,10000000):
- Elapsed time for list: 7.360722 seconds
- Elapsed time for generators: 5.999999999950489e-06 seconds
- Difference in time taken for the list and generators: 7.360716 seconds
- Like a list comprehension
resl = [i**2 for i in num]
resg = (i**2 for i in num)

Timings for num = np.arange(1,10000000):
- Elapsed time for list: 7.663468000000001 seconds
- Elapsed time for generators: 9.999999999621423e-06 seconds
- Difference in time taken: 7.663458000000001 seconds
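The memory side of the comparison can be sketched with sys.getsizeof. This is a rough illustration rather than the measurement used in the post: it uses a plain range instead of np.arange and a smaller input, both of which are assumptions made to keep the example quick and dependency-free.

```python
import sys

# Assumption: a plain range and a smaller size than the post's np.arange(1, 10000000)
num = range(1, 100_001)

resl = [i**2 for i in num]  # all squares are computed and stored right away
resg = (i**2 for i in num)  # nothing is computed yet

# getsizeof reports the container itself, not the integers the list refers to,
# so the real footprint of the list is even larger than shown here.
list_bytes = sys.getsizeof(resl)
gen_bytes = sys.getsizeof(resg)

print("list:", list_bytes, "bytes")
print("generator:", gen_bytes, "bytes")
```

The generator object stays small regardless of how many values it will eventually produce, while the list grows with the number of elements.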
Getting the results from the generator function:
- Using next
resg = squares_gen(num)
print('res of generators: ', next(resg))
print('res of generators: ', next(resg))
print('res of generators: ', next(resg))
- Using a loop
for n in resg:
    print(n)
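A generator is itself an iterator, so everything said earlier about next and StopIteration applies to it. As a minimal sketch, the rangeNew class from the previous section can be written as a generator function; the name range_gen is hypothetical and not from the original post.

```python
def range_gen(start, end, step=1):
    """Generator version of the rangeNew class (hypothetical name)."""
    value = start
    while value < end:
        yield value    # pause here and hand back one value
        value += step  # resume from here on the next request


gen = range_gen(1, 10, 2)

print(iter(gen) is gen)  # True: a generator is its own iterator
print(next(gen))         # 1
print(list(gen))         # [3, 5, 7, 9], the values that were left
print(list(gen))         # [] once the generator is exhausted
```

Python supplies __iter__ and __next__ automatically, and raises StopIteration when the function body finishes, which is why the generator version is so much shorter than the class.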
Advantages of using generators:
- Generator code is more readable.
- Generators are much faster to create and use little memory, because values are computed only when they are requested.
Results:
- In these timings, using a function was a slightly faster way of creating the values than using a comprehension, for both lists and generators.
- The difference between using a list and a generator is more pronounced when using a comprehension (though generators are still much faster).
- When we need the whole result at once, the amount of time (or memory) taken to create a list or list(generator) is almost the same.
Overall, generators give a performance boost not only in execution time but in memory use as well.
Appendix
How I calculated the time taken by the process
- Calculate the sum of the system and user CPU time of the current process. time.process_time provides the system and user CPU time of the current process in seconds.
- Use time.process_time_ns to get the result in nanoseconds.
NOTE: The “time taken” shown in this study depends on the computer and varies from run to run depending on the state of the CPU. But in every run, using generators was much faster.
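The measurement described above can be sketched as follows. This is not the original benchmark script: the helper cpu_time is a hypothetical name, and a plain range with a smaller size stands in for np.arange(1,10000000) so the sketch runs quickly. It also times consuming the generator with list(), to separate the cost of creating a generator from the cost of computing its values.

```python
import time


def squares(num):
    results = []
    for i in num:
        results.append(i**2)
    return results


def squares_gen(num):
    for i in num:
        yield i**2


def cpu_time(func, *args):
    """Return (result, elapsed CPU seconds) for one call; hypothetical helper."""
    start = time.process_time()
    result = func(*args)
    return result, time.process_time() - start


# Assumption: smaller input than in the post, and range instead of np.arange
num = range(1, 200_001)

resl, t_list = cpu_time(squares, num)       # builds every value
resg, t_gen = cpu_time(squares_gen, num)    # only creates the generator object
consumed, t_consume = cpu_time(list, resg)  # now the values are actually computed

print("list:", t_list, "s")
print("generator creation:", t_gen, "s")
print("generator consumed with list():", t_consume, "s")
```

The exact numbers will differ between machines and runs, so no expected output is shown.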