Python - A Look into Generators

Introduction

In this short article we will look into Python generators.

A generator is "a function that returns an object (iterator) which we can iterate over (one value at a time)".[1]

Unlike lists, generators are lazy: each value is computed only when it is requested. Because a generator produces only one item at a time, it can reduce memory overhead when dealing with large datasets.

For example, suppose you have a 2 million line log file and want to parse each line and perform a whois lookup on the source IP. Rather than loading all 2 million lines into memory and then parsing through them with some whois module, you can use a generator to yield a single log line at a time and perform the whois lookup before moving on to the next line. This reduces the memory burden of the processing job.
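
A minimal sketch of that workflow might look like the following. The log format (source IP as the first whitespace-separated field), the `read_source_ips` helper and the `fake_whois` stand-in are all assumptions made for illustration; a real job would call whichever whois module you use and read from a real file.

```python
import io


def read_source_ips(log_file):
    """Yield the source IP from each log line, one line at a time."""
    for line in log_file:          # file objects are read lazily, line by line
        line = line.strip()
        if line:                   # skip blank lines
            yield line.split()[0]  # assumption: the IP is the first field


def fake_whois(ip):
    """Hypothetical stand-in for a real whois lookup."""
    return f"whois record for {ip}"


# A small in-memory "file" standing in for the 2 million line log.
sample_log = io.StringIO(
    "10.0.0.1 GET /index.html\n"
    "10.0.0.2 GET /about.html\n"
)

# Only one line is held in memory at any point in this loop.
for ip in read_source_ips(sample_log):
    print(fake_whois(ip))
```

With a real log you would pass the object returned by `open()` in place of `sample_log`.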

Generator Functions

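To create a generator, define a function that uses the yield statement instead of return. Calling the function does not run its body; it returns a generator object.

We can then step through the generator using next(). Each call runs the function body up to the next yield and returns the yielded value. Once the function finishes, next() raises StopIteration.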

>>> def generator(num):
...     print('Begin Generator ...')
...     while num > 0:
...         yield num
...         num -= 1
... 
>>> number = generator(5)
>>> next(number)
Begin Generator ...
5
>>> next(number)
4
>>> next(number)
3
>>> next(number)
2
>>> next(number)
1
>>> next(number)
Traceback (most recent call last):
  File "<input>", line 1, in <module>
StopIteration
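
In practice you rarely need to call next() by hand or catch StopIteration yourself. The sketch below reuses the same countdown generator (without the print call) to show two common alternatives: a for loop, and next() with a default value.

```python
def generator(num):
    """Count down from num to 1, yielding each value."""
    while num > 0:
        yield num
        num -= 1


# A for loop calls next() for us and stops cleanly on StopIteration.
collected = []
for value in generator(3):
    collected.append(value)
print(collected)  # [3, 2, 1]

# next() accepts a default that is returned instead of raising StopIteration.
exhausted = generator(0)      # yields nothing
print(next(exhausted, None))  # None
```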

Generator Expressions

Generators can also be created with a syntax very similar to list comprehensions, known as a generator expression.

A typical list comprehension is shown below:

>>> [i for i in ['eth0','eth1','eth2']]
['eth0', 'eth1', 'eth2']

If we use parentheses instead of square brackets, a generator object is returned instead:

>>> (i for i in ['eth0','eth1','eth2'])
<generator object <genexpr> at 0x7fdb3d853af0>

As before, we can step through it using next():

>>> interfaces = (i for i in ['eth0','eth1','eth2'])
>>> next(interfaces)
'eth0'
>>> next(interfaces)
'eth1'
>>> next(interfaces)
'eth2'
>>> next(interfaces)
Traceback (most recent call last):
  File "<input>", line 1, in <module>
StopIteration
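
To make the memory difference concrete, the following sketch compares a list comprehension with the equivalent generator expression. The variable names and the size of 100,000 items are arbitrary choices for illustration. Note that sys.getsizeof() reports only the size of the container itself, not the objects it refers to, and the exact numbers vary between Python versions and platforms.

```python
import sys

numbers_list = [n * n for n in range(100_000)]  # every value is built up front
numbers_gen = (n * n for n in range(100_000))   # nothing is computed yet

# The list grows with the number of items; the generator stays small.
print(sys.getsizeof(numbers_list))
print(sys.getsizeof(numbers_gen))

# Functions that accept any iterable can consume the generator directly.
total = sum(numbers_gen)

# A generator can only be consumed once; a second pass yields nothing.
second_pass = sum(numbers_gen)
```

The trade-off is that a generator cannot be indexed or iterated a second time, so a list is still the better choice when the values need to be reused.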

References

  1. "Python yield, Generators and Generator Expressions - Programiz." https://www.programiz.com/python-programming/generator. Accessed 24 Apr. 2019.