Iterable objects
import numpy as np
Suppose we are writing code to summarize some simple data, perhaps a group of teams that each have a score for a particular game.
team = ['green', 'red', 'blue']
score = [5, 9, 7]
Suppose we want to list the names of each team. One typical approach is to loop over each item in the list:
N_teams = len(team)
print('The team names are:')
for i in range(N_teams):
    print(f'{team[i]}')
However, this is not the most efficient way to do this, as we need to define two extra variables: N_teams and the iteration index i. There is a better way!
Python supports what are referred to as iterable objects: generally, these are objects that can be used directly by Python’s built-in and efficient iteration schemes. Examples are lists, tuples, dictionaries and strings. You may be interested in reading about all of the technical details here; however, for this assignment the explanations and examples below are sufficient. It is important to recognize that this type of object exists in Python, so that you can take advantage of it in your own code and recognize it in other authors’ code.
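As a small illustration of the types listed above, the sketch below passes a list, a tuple, a dictionary and a string to list(), which walks through each object and collects the values it yields. The variable names and values are made up for this example. Note that iterating over a dictionary yields its keys.

```python
# Hypothetical example data, not part of the assignment.
a_list = [1, 2, 3]
a_tuple = (4, 5, 6)
a_dict = {'a': 1, 'b': 2}
a_string = 'xyz'

collected = []
for obj in [a_list, a_tuple, a_dict, a_string]:
    # list(obj) iterates over obj and stores every value it yields
    values = list(obj)
    collected.append(values)
    print(f'{type(obj).__name__}: {values}')
```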
Let’s see an iterable object in action. First, an easy way to test if an object is iterable is to pass it as an argument to the built-in iter() function:
iter(team)
Other objects can also be made into an iterator, like numpy arrays:
iter(np.array([1, 5, 67]))
The cell above should return an iterator object, which indicates that the object passed as the argument is iterable. An integer, however, is not iterable.
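If you want to check iterability inside a program rather than by reading an error message, one possible approach is to wrap the call to iter() in a try/except block. The helper name is_iterable below is hypothetical and only meant as a sketch; it relies on iter() raising a TypeError for objects that cannot be iterated.

```python
def is_iterable(obj):
    """Return True if iter(obj) succeeds, False if it raises TypeError."""
    try:
        iter(obj)
    except TypeError:
        return False
    return True


print(is_iterable(['green', 'red', 'blue']))  # True
print(is_iterable('abc'))                     # True
print(is_iterable(5))                         # False
print(is_iterable(3.14))                      # False
```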
\(\text{Task 1.1:}\)
Try running the code below to confirm that an integer is not iterable (an error will occur), then fix it by converting the argument to an iterable object. Experiment by turning the integer into a list, string and np.array.
iter(5)
Great, you can make an iterator! But what do you do with it?
Most of the time we simply use it in a for loop, but it is worthwhile to understand this object a bit more deeply.
One simple way to understand an iterator is to imagine that it is a way of (efficiently!) turning your iterable object into something that can go sequentially through all its values. To do this it has two “abilities”:
the iterator knows its current value, and
the iterator knows its next value
Let’s try using the iterator by running the following cells:
letters = 'abcde'
letter_iterator = iter(letters)
Nothing new yet, but now watch what happens when we use the Python built-in function next().
Run the cell multiple times.
next(letter_iterator)
There are two key things to note here:
The iterator “knows” how to go through all of its values, and
Calls to next return the value itself, not the index!
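A natural follow-up question is what happens once the iterator runs out of values. The sketch below uses a short made-up string: after the last value has been returned, a further call to next() raises StopIteration, unless a default value is supplied as a second argument.

```python
vowels = 'aei'  # hypothetical short string for this example
vowel_iterator = iter(vowels)

first = next(vowel_iterator)
second = next(vowel_iterator)
third = next(vowel_iterator)
print(first, second, third)

# The iterator is now exhausted, so another call raises StopIteration
try:
    next(vowel_iterator)
    exhausted = False
except StopIteration:
    exhausted = True
print(f'Iterator exhausted: {exhausted}')

# With a second argument, next() returns that default instead of raising
fallback = next(vowel_iterator, 'no more values')
print(fallback)
```

In every call above, next() hands back the stored value directly.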
This is a very useful feature, as we can see in this simple example below:
for i in letters:
    print(i)
Obviously this is a simple for loop, but hopefully the explanation above indicates what is actually happening “under the hood” of the for loop:
identifies letters as an iterable object
converts it into an iterator, and then it
finds the next value in the sequence and assigns it to variable i
stops once there is no next value
This is a bit of a simplification of what happens in reality, but it gets the idea across.
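The steps above can be sketched by hand with iter(), next() and a while loop. This is only an approximation of what a for loop does, written to make the steps visible; the list collected is added here just to record the values.

```python
letters = 'abcde'
collected = []

# Convert the iterable into an iterator
letter_iterator = iter(letters)

while True:
    try:
        # Find the next value and assign it to i
        i = next(letter_iterator)
    except StopIteration:
        # Stop once there is no next value
        break
    collected.append(i)
    print(i)
```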
Now let’s take a look at something you have probably used a lot: the range() function:
print(type(range(5)))
print(range(5))
Although it looks like it produces a tuple, range is a special iterable object that simply counts. As we can see it works the same as our for loop above:
for i in range(5):
    print(i)
Hopefully this explains why range is such a useful feature for accessing sequentially indexed elements of arrays.
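A short sketch of two further properties of range may help here. The values are chosen arbitrarily: range accepts a start, a stop (which is excluded) and a step, and the range object itself can be looped over repeatedly, whereas an iterator made from it is consumed as it is used.

```python
counter = range(2, 11, 3)  # start=2, stop=11 (excluded), step=3
values = list(counter)
print(values)

# The range object can be looped over more than once
first_pass = [i for i in counter]
second_pass = [i for i in counter]
print(first_pass == second_pass)

# An iterator made from it is used up as values are taken
counter_iterator = iter(counter)
taken = next(counter_iterator)
remaining = list(counter_iterator)
print(taken, remaining)
```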
It turns out there are two more built-in Python functions that produce useful iterable objects: enumerate and zip. Let’s take a look at their docstrings.
\(\text{Task 1.2:}\)
Read the docstrings below for the two functions. Confirm that they both produce an iterable, and compare them to see how the input and output of each differ. Can you imagine why one could be more useful than the other in certain situations?
enumerate:
Init signature: enumerate(iterable, start=0)
Docstring:
Return an enumerate object.
iterable
an object supporting iteration
The enumerate object yields pairs containing a count (from start, which
defaults to zero) and a value yielded by the iterable argument.
enumerate is useful for obtaining an indexed list:
(0, seq[0]), (1, seq[1]), (2, seq[2]), ...
And zip:
Init signature: zip(self, /, *args, **kwargs)
Docstring:
zip(*iterables, strict=False) --> Yield tuples until an input is exhausted.
>>> list(zip('abcdefg', range(3), range(4)))
[('a', 0, 0), ('b', 1, 1), ('c', 2, 2)]
The zip object yields n-length tuples, where n is the number of iterables
passed as positional arguments to zip(). The i-th element in every tuple
comes from the i-th iterable argument to zip(). This continues until the
shortest argument is exhausted.
If strict is true and one of the arguments is exhausted before the others,
raise a ValueError.
The main takeaways should be as follows:
Yes, they both produce an iterator
enumerate takes one iterable object and returns indices along with the values
zip takes two (or more) iterable objects and returns their values
Let’s try them out by running the next cell. Does it behave as expected?
thing_1 = 'roberts_string'
thing_2 = [2, 3, 36, 3., 1., 's', 7, '3']
test_1 = enumerate(thing_1)
print(f'We created: {test_1}')
print(next(test_1), next(test_1), next(test_1))
test_2 = zip(thing_1, thing_2)
print(f'We created: {test_2}')
print(next(test_2), next(test_2), next(test_2))
Can you see the difference?
Looking at them in a for loop will also illustrate what’s going on:
print('First, enumerate:')
for i, j in enumerate(thing_1):
    print(i, j)
print('\nThen, zip:')
for i, j in zip(thing_1, thing_2):
    print(i, j)
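Two details from the docstrings are easy to miss: enumerate can start counting from a value other than zero, and zip stops as soon as its shortest input is exhausted. The sketch below illustrates both with made-up lists (colors and codes are hypothetical names, not part of the assignment).

```python
colors = ['cyan', 'magenta', 'yellow']  # hypothetical data
codes = [10, 20, 30, 40, 50]            # longer than colors on purpose

# start=1 makes the count begin at 1 instead of 0
numbered = list(enumerate(colors, start=1))
print(numbered)

# zip stops after three pairs because colors has only three items
paired = list(zip(colors, codes))
print(paired)

# zip also accepts more than two iterables
tripled = list(zip(colors, codes, 'xyz'))
print(tripled)
```

The extra values 40 and 50 in codes are silently dropped, which is worth keeping in mind when the inputs are expected to have equal length.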
Now let’s return to our teams from the beginning of this assignment: let’s apply our knowledge of enumerate and zip to see if we can print out the points per team in an efficient way.
\(\text{Task 1.3:}\)
Use enumerate to print out the summary of points per team according to the print statement.
Hint: you only need to use one of the lists as an argument.
team = ['green', 'red', 'blue']
score = [5, 9, 7]
for # YOUR_CODE_LINE_WITH_enumerate_HERE:
    print(f'Team {} has {} points.')
You may have noticed that enumerate is a bit awkward for this case, since we still need to define an iteration index to access the values in the other list. Let’s see if zip makes things easier:
\(\text{Task 1.4:}\)
Use zip to print out the summary of points per team according to the print statement.
team = ['green', 'red', 'blue']
score = [5, 9, 7]
for # YOUR_CODE_LINE_WITH_zip_HERE:
    print(f'Team {} has {} points.')
That’s really compact!
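As one possible follow-up (not required for the tasks), the pairs produced by zip can also be passed to dict() to build a lookup table. The names points_per_team and best_team are introduced here purely for illustration.

```python
team = ['green', 'red', 'blue']
score = [5, 9, 7]

# Each (team, score) pair from zip becomes one key-value entry
points_per_team = dict(zip(team, score))
print(points_per_team)

# Find the key with the highest value
best_team = max(points_per_team, key=points_per_team.get)
print(f'Highest score: {best_team}')
```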
By Robert Lanzafame, Delft University of Technology. CC BY 4.0, more info on the Credits page of Workbook.