Adding the yield keyword to a function makes the function return a generator object that can be iterated over.
- What does the yield keyword do?
- Approaches to overcome generator exhaustion
- How to materialize Generators?
- How yield works, step by step
- Exercise 1: Write a program to create a generator that generates cubes of numbers up to 1000 using yield
- Exercise 2: Write a program to return odd numbers by pipelining generators
- Difference between yield and return
What does the yield keyword do?
yield in Python can be used like the return statement in a function. When you do so, the function returns a generator that can be iterated upon instead of returning the output directly. You can then iterate through the generator to extract items. Iterating is done using a for loop or simply using the next() function.

But what exactly happens when you use yield? Each time you ask the generator for the next item, Python runs the function’s code until it encounters a yield statement. It then hands the yielded value to the caller and pauses the function in that state without exiting. The next time an item is requested, the state at which the function was last paused is remembered and execution continues from that point onwards. This continues until the generator is exhausted.

What does remembering the state mean? It means any local variable you created inside the function before the yield statement was reached will still be available when execution resumes. This is NOT the way a regular function usually behaves.

Now, how is it different from using the return keyword? Had you used return in place of yield, the function would have returned the respective value, all the local variables the function had computed would be cleared, and the next time the function is called, execution would start afresh.

Since yield enables the function to remember its ‘state’, the function can be used to generate values according to logic you define. So, the function becomes a ‘generator’.
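To make the contrast with return concrete, here is a minimal sketch that puts a regular function next to a generator function. The names count_with_return and count_with_yield are made up for illustration and are not part of the original example.

```python
def count_with_return():
    # Regular function: local state is discarded after every call.
    count = 0
    count += 1
    return count


def count_with_yield():
    # Generator function: 'count' survives between requests for the next item.
    count = 0
    while True:
        count += 1
        yield count


# Every call to the regular function starts fresh.
return_results = [count_with_return() for _ in range(3)]

# Every next() resumes the generator right after the last yield.
counter = count_with_yield()
yield_results = [next(counter) for _ in range(3)]

print(return_results)  # [1, 1, 1]
print(yield_results)   # [1, 2, 3]
```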
# A function that contains 'yield' returns a generator when it is called.
def simple_generator():
    x = 1
    yield x
    yield x + 1
    yield x + 2

generator_object = simple_generator()
generator_object  # only a generator is created. No code runs, no value gets returned
<generator object simple_generator at 0x000001603AC32930>
Now you can iterate through the generator object. But it works only once.
for i in generator_object:
    print(i)
1 2 3
Iterating over the generator a second time won’t give anything, because the generator object is already exhausted and has to be re-initialized.
# Iterating over the exhausted generator again won't print anything.
for i in generator_object:
    print(i)
If you call next() on this exhausted generator, a StopIteration error is raised.
next(generator_object)
#> StopIteration Error
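If you would rather not let StopIteration propagate, two common patterns are catching the exception or passing a default value as the second argument to the built-in next(). The sketch below redefines simple_generator() so it runs on its own; the "done" sentinel is an arbitrary choice for illustration.

```python
def simple_generator():
    x = 1
    yield x
    yield x + 1
    yield x + 2


gen = simple_generator()
consumed = list(gen)  # exhausts the generator: [1, 2, 3]

# Pattern 1: catch the exception explicitly.
try:
    next(gen)
    raised = False
except StopIteration:
    raised = True

# Pattern 2: give next() a default to return instead of raising.
fallback = next(gen, "done")

print(consumed, raised, fallback)  # [1, 2, 3] True done
```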
Approaches to overcome generator exhaustion
To overcome generator exhaustion, you can:
- Approach 1: Replenish the generator by recreating it and iterating over it again. You just saw how to create the generator.
- Approach 2: Iterate by calling the function that created the generator in the first place
- Approach 3 (best): Convert it to a class that implements an __iter__() method. This creates a new iterator every time, so you don’t have to worry about the generator getting exhausted.
We’ve seen the first approach already.
Approach 2: The second approach is to simply replace the generator with a call to the function that produced the generator, which is simple_generator() in this case. This will continue to work no matter how many times you iterate over it.
# Approach 2: Iterate by calling the function that returned the generator
for i in simple_generator():
    print(i)
1 2 3
Approach 3: Now, let’s try creating a class that implements an __iter__() method. It creates a new iterator object every time, so you don’t have to keep recreating the generator.
# Approach 3: Convert it to a class that implements an `__iter__()` method.
class Iterable(object):
    def __iter__(self):
        x = 1
        yield x
        yield x + 1
        yield x + 2

iterable = Iterable()

for i in iterable:  # iterator created here
    print(i)

for i in iterable:  # iterator created again here
    print(i)
1 2 3 1 2 3
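As a slightly more general sketch of the same idea, the class below takes a parameter and yields from inside __iter__(). The class name Countdown and its start argument are hypothetical, chosen only for illustration. Because every for loop asks for a fresh iterator, even nested loops over the same object do not interfere with each other.

```python
class Countdown:
    """Iterable that counts down from 'start' to 1 (illustrative only)."""

    def __init__(self, start):
        self.start = start

    def __iter__(self):
        # Each call builds a brand-new generator with its own local state.
        n = self.start
        while n > 0:
            yield n
            n -= 1


countdown = Countdown(3)

first_pass = list(countdown)
second_pass = list(countdown)  # works again, nothing is exhausted

# Nested iteration: the inner loop gets its own independent iterator.
pairs = [(a, b) for a in countdown for b in countdown]

print(first_pass)   # [3, 2, 1]
print(second_pass)  # [3, 2, 1]
print(len(pairs))   # 9
```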
How to materialize Generators?
We often store data in a list when we want it materialized. When you do so, the contents of the list occupy real memory, and the larger the list gets, the more memory it uses. But if there is a certain logic behind producing the items you want, you don’t have to store them in a list. Instead, simply write a generator that produces the items whenever you need them.

Let’s say you want to iterate through the squares of numbers from 1 to 10. There are at least two ways to go about it: create the list beforehand and iterate over it, or create a generator that produces these numbers.
# Print squares of numbers from 1 to 10, using LIST
my_list = [1, 4, 9, 16, 25, 36, 49, 64, 81, 100]
for i in my_list:
    print(i)
1 4 9 16 25 36 49 64 81 100
Let’s do the same with generators now.
# Print squares of numbers from 1 to 10, using GENERATOR
def squares(x=0):
    while x < 10:
        x = x + 1
        yield x*x

for i in squares():
    print(i)
1 4 9 16 25 36 49 64 81 100
Generators are memory efficient because the values are not materialized until they are requested. They can also start producing results sooner, since nothing is computed up front. You will want to use a generator especially if you know the logic to produce the next number (or any object) that you want to generate.

Can a generator be materialized to a list? Yes. You can do so easily using a list comprehension or by simply calling list().
# Materialise list from generator using list comprehension
materialised_list = [i for i in squares()]

# Materialise list from generator using list()
materialised_list = list(squares())
materialised_list
[1, 4, 9, 16, 25, 36, 49, 64, 81, 100]
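One rough way to see the memory difference is to compare the size of a materialized list with the size of a generator object using sys.getsizeof(). This is only a sketch: getsizeof() reports the size of the container object itself (not the integers it refers to), exact byte counts vary by Python version and platform, and the upper bound of 100,000 is an arbitrary choice.

```python
import sys


def squares(limit):
    # Yield the squares of 1..limit one at a time.
    x = 0
    while x < limit:
        x += 1
        yield x * x


n = 100_000  # arbitrary size, chosen only to make the gap obvious

squares_list = list(squares(n))  # every value exists in memory at once
squares_gen = squares(n)         # nothing has been computed yet

list_bytes = sys.getsizeof(squares_list)
gen_bytes = sys.getsizeof(squares_gen)

print(f"list object:      {list_bytes} bytes")
print(f"generator object: {gen_bytes} bytes")

# The generator still produces the same values when consumed.
print(sum(squares_gen) == sum(squares_list))  # True
```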
How yield works, step by step
yield is a keyword that returns from the function without destroying the state of its local variables. When you use yield in a function, it causes the function to hand back a generator object to its caller. In effect, yield pauses the function instead of letting it exit, until the next time next() is called. When called, it resumes executing from the point where it paused before.
def generator_func():
    num = 1
    print("First time execution of the function")
    yield num
    num = 10
    print("Second time execution of the function")
    yield num
    num = 100
    print("Third time execution of the function")
    yield num

obj = generator_func()
See that I have created a function using the yield keyword. Calling the function gave us an object, obj, which is an iterator (a generator). To get values from it, use the next() function. Each call runs the function until the next yield statement is reached.
print(next(obj))
print(next(obj))
print(next(obj))
First time execution of the function
1
Second time execution of the function
10
Third time execution of the function
100
Notice that each call to next() ran the function only up to the next yield. It did not start from the beginning each time; it resumed from where it left off. After all the yield statements in the function have been exhausted, calling next() again raises a StopIteration error. A generator object can be consumed only once. If you want to iterate again, you need to create the object again.
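The standard library’s inspect.getgeneratorstate() offers one way to watch this lifecycle from the outside. The sketch below uses a shortened two-step generator (the name two_steps is made up for illustration) and records the state before, during, and after iteration.

```python
import inspect


def two_steps():
    yield 1
    yield 10


obj = two_steps()
states = [inspect.getgeneratorstate(obj)]  # created, no code has run yet

first = next(obj)
states.append(inspect.getgeneratorstate(obj))  # paused at the first yield

second = next(obj)
states.append(inspect.getgeneratorstate(obj))  # paused at the second yield

# A third request runs past the last yield, so the generator finishes.
leftover = next(obj, None)
states.append(inspect.getgeneratorstate(obj))

print(first, second, leftover)  # 1 10 None
print(states)
# ['GEN_CREATED', 'GEN_SUSPENDED', 'GEN_SUSPENDED', 'GEN_CLOSED']
```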
Exercise 1: Write a program to create a generator that generates cubes of numbers up to 1000 using yield
I am going to create a generator function that returns the cubes of numbers, one at a time, using the yield keyword, and stop once the cube exceeds 1000. Memory is needed only for the value currently being produced; earlier values are not kept around once they have been consumed.
# Solution: Generate cubes of numbers
def cubicvar():
    i = 1
    while True:
        yield i*i*i
        i += 1

for num in cubicvar():
    if num > 1000:
        break
    print(num)
1 8 27 64 125 216 343 512 729 1000
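An alternative sketch moves the stopping condition out of the loop body by wrapping an infinite generator in itertools.takewhile(). The function name cubes and the variable limit are illustrative; the result is the same ten cubes.

```python
from itertools import takewhile


def cubes():
    # Infinite generator: 1, 8, 27, ...
    i = 1
    while True:
        yield i ** 3
        i += 1


limit = 1000

# takewhile() stops pulling values as soon as the condition fails.
cubes_up_to_limit = list(takewhile(lambda c: c <= limit, cubes()))

print(cubes_up_to_limit)
# [1, 8, 27, 64, 125, 216, 343, 512, 729, 1000]
```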
Exercise 2: Write a program to return odd numbers by pipelining generators
Multiple generators can be pipelined (one generator consuming another) as a series of operations in the same code. Pipelining can also make the code more efficient and easier to read. To pipeline generators, pass the call to one generator function, written with parentheses (), as the argument to another.
# Solution: Generate odd numbers by pipelining generators
def gen_int(n):
    for i in range(n):
        yield i

def gen_2(gen):
    for n in gen:
        if n % 2:
            yield n

for i in gen_2(gen_int(10)):
    print(i)
1 3 5 7 9
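The same pipelining idea extends to more stages, and short stages can be written as generator expressions. In the sketch below the stage names (integers, only_odd, squared) are made up for illustration. Nothing is computed until the final loop pulls values through the whole chain.

```python
def integers(n):
    # Stage 1: produce 0..n-1.
    for i in range(n):
        yield i


def only_odd(numbers):
    # Stage 2: keep odd numbers only.
    for n in numbers:
        if n % 2:
            yield n


# Stage 3 as a generator expression: square whatever stage 2 lets through.
squared = (n * n for n in only_odd(integers(10)))

results = []
for value in squared:  # values flow through all three stages one at a time
    results.append(value)

print(results)  # [1, 9, 25, 49, 81]
```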
Difference between yield and return
|return|yield|
|---|---|
|Returns the result to the caller|Used to convert a function to a generator. Suspends the function, preserving its state|
|Destroys the local variables once execution is complete|Does not destroy the function’s local variables. Preserves the state|
|There is usually one return statement per function|There can be one or more yield statements, which is quite common|
|If you execute the function again, it starts from the beginning|Execution resumes from where it was previously paused|