Comprehensions and Generators
One of the features that make Python code look clean and readable is the comprehension syntax. Comprehensions let you build collections from existing iterables in a single, expressive line. Generators take this further by producing values lazily, one at a time, without building the entire result in memory. Used well, these tools make your code shorter and can sharply reduce memory use on large inputs.
List Comprehensions
The general form of a list comprehension is:
[expression for item in iterable if condition]
The if condition part is optional. Python evaluates expression for every item in iterable (that satisfies condition) and builds a new list.
# Squares of numbers 1 through 10
squares = [n ** 2 for n in range(1, 11)]
print(squares)
# [1, 4, 9, 16, 25, 36, 49, 64, 81, 100]
# Compare to the loop equivalent:
squares_loop = []
for n in range(1, 11):
    squares_loop.append(n ** 2)
# Same result, but less expressive
# With a filter condition
even_squares = [n ** 2 for n in range(1, 11) if n % 2 == 0]
print(even_squares) # [4, 16, 36, 64, 100]
# Transform strings
words = [" Hello ", " Python ", " World "]
clean = [word.strip().lower() for word in words]
print(clean) # ['hello', 'python', 'world']
# Filter and transform together
numbers = [-5, -3, -1, 0, 1, 3, 5, 7, 9]
positive_doubles = [n * 2 for n in numbers if n > 0]
print(positive_doubles) # [2, 6, 10, 14, 18]
Nested Comprehensions
You can nest for clauses to flatten nested structures:
# Flatten a 2D matrix
matrix = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
flat = [num for row in matrix for num in row]
print(flat) # [1, 2, 3, 4, 5, 6, 7, 8, 9]
# All (x, y) pairs where x != y
pairs = [(x, y) for x in range(1, 4) for y in range(1, 4) if x != y]
print(pairs)
# [(1, 2), (1, 3), (2, 1), (2, 3), (3, 1), (3, 2)]
# Transpose a matrix
matrix_3x3 = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
transposed = [[row[i] for row in matrix_3x3] for i in range(3)]
print(transposed)
# [[1, 4, 7], [2, 5, 8], [3, 6, 9]]
Warning: Keep comprehensions simple
If your comprehension needs more than two for clauses or the logic is hard to read at a glance, switch to a regular loop. Comprehensions are a readability tool; when they make code harder to understand, they defeat their own purpose.
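As a rough illustration of that advice, the sketch below writes the same logic twice: once as a dense comprehension that combines two for clauses, tuple unpacking, and a filter, and once as a regular loop. The `orders` data and its field names are invented for this example.

```python
# Hypothetical data, invented for illustration only
orders = [
    {"customer": "Alice", "items": [("pen", 2, 1.50), ("ink", 1, 7.00)]},
    {"customer": "Bob", "items": [("pad", 3, 4.00)]},
    {"customer": "Carol", "items": [("pen", 10, 1.50), ("pad", 1, 4.00)]},
]

# Hard to read: two for clauses, unpacking, and a filter in one expression
dense = [(o["customer"], name, qty * price) for o in orders for name, qty, price in o["items"] if qty * price >= 5]

# Same logic as a regular loop, with room for a named intermediate value
readable = []
for order in orders:
    for name, qty, price in order["items"]:
        line_total = qty * price
        if line_total >= 5:
            readable.append((order["customer"], name, line_total))

print(dense == readable)  # True
print(readable)
# [('Alice', 'ink', 7.0), ('Bob', 'pad', 12.0), ('Carol', 'pen', 15.0)]
```

Both versions produce the same list; the loop simply gives each step its own line, which is usually easier to review and debug.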
Dictionary Comprehensions
Same idea, but produces a dict. The expression must be a key: value pair:
# Map names to their lengths
names = ["Alice", "Bob", "Carol", "Dave"]
name_lengths = {name: len(name) for name in names}
print(name_lengths) # {'Alice': 5, 'Bob': 3, 'Carol': 5, 'Dave': 4}
# Celsius to Fahrenheit lookup table
temps_celsius = {"freezing": 0, "body": 37, "boiling": 100}
temps_fahrenheit = {label: c * 9/5 + 32 for label, c in temps_celsius.items()}
print(temps_fahrenheit)
# {'freezing': 32.0, 'body': 98.6, 'boiling': 212.0}
# Filter + transform: keep only passing grades
grades = {"Alice": 92, "Bob": 54, "Carol": 78, "Dave": 45, "Eve": 88}
passing = {name: grade for name, grade in grades.items() if grade >= 60}
print(passing) # {'Alice': 92, 'Carol': 78, 'Eve': 88}
# Invert keys and values
inverted = {v: k for k, v in {"a": 1, "b": 2, "c": 3}.items()}
print(inverted) # {1: 'a', 2: 'b', 3: 'c'}
Set Comprehensions
Identical to list comprehensions but use {} and produce a set (no duplicates):
# Unique word lengths in a sentence
sentence = "the quick brown fox jumps over the lazy dog"
unique_lengths = {len(word) for word in sentence.split()}
print(sorted(unique_lengths)) # [3, 4, 5]
# Unique absolute values (removes duplicates naturally)
numbers = [-4, -3, -2, -1, 0, 1, 2, 3, 4]
abs_values = {abs(n) for n in numbers}
print(sorted(abs_values)) # [0, 1, 2, 3, 4]
# Vowels in a word
vowels = {char for char in "programming" if char in "aeiou"}
print(vowels) # {'a', 'i', 'o'} (set order may vary)
Generator Expressions
A generator expression looks like a list comprehension but uses () instead of []. Instead of building the entire list in memory, it returns a generator object that produces values one at a time (lazily):
# List comprehension — builds the entire list in memory immediately
squares_list = [n ** 2 for n in range(1_000_000)] # ~8 MB for the list alone, plus the int objects it holds
# Generator expression — produces values on demand
squares_gen = (n ** 2 for n in range(1_000_000)) # a few hundred bytes at most, whatever the range size
# Both work the same way when iterated
for sq in squares_gen:
    if sq > 100:
        break
    print(sq, end=" ")
# 0 1 4 9 16 25 36 49 64 81 100
# Generators work with built-in functions
total = sum(n ** 2 for n in range(101)) # no extra list needed
print(total) # 338350
# Check if any/all conditions are met (short-circuits — very efficient)
numbers = [2, 4, 6, 7, 8, 10]
print(all(n % 2 == 0 for n in numbers)) # False (7 is odd)
print(any(n > 9 for n in numbers)) # True (10 > 9)
Generator Functions with `yield`
A generator function uses yield instead of return. When called, it returns a generator object. Each time you iterate, execution resumes from where it last yielded:
def count_up(start: int, stop: int, step: int = 1):
    """Yield integers from start up to (but not including) stop."""
    current = start
    while current < stop:
        yield current
        current += step

for n in count_up(0, 10, 2):
    print(n, end=" ")
# 0 2 4 6 8
# Generators are iterators — you can call next() manually
gen = count_up(1, 4)
print(next(gen)) # 1
print(next(gen)) # 2
print(next(gen)) # 3
try:
    print(next(gen)) # raises StopIteration
except StopIteration:
    print("Generator exhausted")
Infinite Generators
Because generators are lazy, they can represent infinite sequences:
def fibonacci():
    """Yield the Fibonacci sequence indefinitely."""
    a, b = 0, 1
    while True:
        yield a
        a, b = b, a + b

def take(n: int, iterable):
    """Yield the first n items from an iterable."""
    for i, item in enumerate(iterable):
        if i >= n:
            return
        yield item

# Print the first 15 Fibonacci numbers
fib_nums = list(take(15, fibonacci()))
print(fib_nums)
# [0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, 233, 377]
Pipelines with Generators
Generators compose naturally into data pipelines — each stage pulls from the previous one lazily:
def read_numbers(count: int):
    """Simulate reading numbers from a data source."""
    for i in range(1, count + 1):
        yield i

def filter_even(numbers):
    """Keep only even numbers."""
    for n in numbers:
        if n % 2 == 0:
            yield n

def square(numbers):
    """Square each number."""
    for n in numbers:
        yield n ** 2

# Build the pipeline — nothing is computed yet!
pipeline = square(filter_even(read_numbers(20)))
# Values are computed only when consumed
result = list(pipeline)
print(result)
# [4, 16, 36, 64, 100, 144, 196, 256, 324, 400]
Each stage of this pipeline handles one element at a time, so the pipeline itself needs only O(1) memory regardless of the input size; only the final list(pipeline) call stores every result. For large datasets (files, API streams, database cursors), this is a significant advantage.
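To make the lazy, one-element-at-a-time behavior visible, here is a minimal sketch that records each step in a list. The `numbers` and `keep_even` functions and the `log` list are illustrative names for this example, not part of the pipeline above.

```python
def numbers(count, log):
    """Yield 1..count, recording each value as it is produced."""
    for i in range(1, count + 1):
        log.append(f"read {i}")
        yield i

def keep_even(source, log):
    """Pass through even values, recording each one that is kept."""
    for n in source:
        if n % 2 == 0:
            log.append(f"keep {n}")
            yield n

log = []
stage = keep_even(numbers(4, log), log)
print(log)  # [] because building the pipeline runs no generator code

first = next(stage)
print(first)  # 2
print(log)    # ['read 1', 'read 2', 'keep 2']

rest = list(stage)
print(rest)  # [4]
print(log)
# ['read 1', 'read 2', 'keep 2', 'read 3', 'read 4', 'keep 4']
```

The log shows the stages interleaving: the source is read only as far as needed to produce the next value the consumer asks for.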
`yield from` — Delegating to Sub-Generators
yield from lets a generator delegate to another iterable:
def flatten(nested):
    """Recursively flatten a nested list structure."""
    for item in nested:
        if isinstance(item, list):
            yield from flatten(item) # delegate to recursive call
        else:
            yield item

data = [1, [2, 3], [4, [5, 6]], 7, [8, [9, [10]]]]
print(list(flatten(data)))
# [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

def chain(*iterables):
    """Yield elements from each iterable in sequence."""
    for it in iterables:
        yield from it

result = list(chain([1, 2], [3, 4], [5, 6]))
print(result) # [1, 2, 3, 4, 5, 6]
Info: Generators are single-pass
Unlike lists, a generator can only be iterated once. After it is exhausted, iterating it again yields nothing. If you need to iterate a generator multiple times, either convert it to a list first (if it fits in memory) or recreate the generator. This is the main trade-off for the memory savings generators provide.
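The single-pass behavior and the two workarounds can be sketched as follows. The `make_squares` factory is a hypothetical helper introduced only for this example.

```python
squares = (n * n for n in range(4))
first_pass = list(squares)
second_pass = list(squares)
print(first_pass)   # [0, 1, 4, 9]
print(second_pass)  # [] because the generator is already exhausted

# Option 1: materialize once, then reuse the list (if it fits in memory)
cached = list(n * n for n in range(4))
print(sum(cached), max(cached))  # 14 9

# Option 2: recreate the generator each time via a small factory function
def make_squares():
    return (n * n for n in range(4))

print(sum(make_squares()), max(make_squares()))  # 14 9
```

Note that the second pass fails silently: it produces an empty result rather than raising an error, which is why this mistake can be easy to miss.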
`itertools` — The Generator Toolkit
The itertools module in the standard library provides powerful tools for working with iterables:
from itertools import islice, chain, product, combinations, permutations, accumulate
# islice — slice a generator (like list slicing, but lazy)
evens = (n for n in range(0, 1_000_000, 2))
first_five_evens = list(islice(evens, 5))
print(first_five_evens) # [0, 2, 4, 6, 8]
# product — Cartesian product
suits = ["♠", "♥", "♦", "♣"]
ranks = ["A", "K", "Q", "J"]
face_cards = list(product(ranks, suits))
print(len(face_cards)) # 16
print(face_cards[:4]) # [('A', '♠'), ('A', '♥'), ('A', '♦'), ('A', '♣')]
# combinations — unique groupings
items = ["a", "b", "c", "d"]
pairs = list(combinations(items, 2))
print(pairs)
# [('a', 'b'), ('a', 'c'), ('a', 'd'), ('b', 'c'), ('b', 'd'), ('c', 'd')]
# accumulate — running totals
sales = [100, 150, 200, 120, 180]
running_total = list(accumulate(sales))
print(running_total) # [100, 250, 450, 570, 750]
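The import line in this section also brings in chain and permutations, which the examples above do not exercise. A short sketch of both follows; itertools.chain is the standard library counterpart of the hand-written chain generator shown earlier, and the sample values here are made up for illustration.

```python
from itertools import chain, permutations

# chain: iterate several iterables back to back without copying them first
merged = list(chain([1, 2], (3, 4), "ab"))
print(merged)  # [1, 2, 3, 4, 'a', 'b']

# chain.from_iterable: the same idea for an iterable of iterables
rows = [[1, 2], [3], [4, 5]]
flat = list(chain.from_iterable(rows))
print(flat)  # [1, 2, 3, 4, 5]

# permutations: ordered arrangements, so ('a', 'b') and ('b', 'a') both appear
orderings = list(permutations(["a", "b", "c"], 2))
print(orderings)
# [('a', 'b'), ('a', 'c'), ('b', 'a'), ('b', 'c'), ('c', 'a'), ('c', 'b')]
```

Compared with combinations, which treats ('a', 'b') and ('b', 'a') as the same grouping, permutations counts each ordering separately.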