Dictionaries and Sets
Python provides two hash-based collection types that are essential for writing efficient programs: dictionaries and sets. Both are built on hash tables, which gives them O(1) average-time lookups: finding a key or testing membership takes roughly constant time regardless of how many elements the collection holds. Knowing when and how to use them is critical for writing performant Python code.
Dictionaries
A dictionary (dict) is a mutable mapping of keys to values that preserves insertion order (guaranteed since Python 3.7). Keys must be hashable (typically strings, numbers, or tuples of hashable values). Values can be any object.
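To make the hashability requirement concrete, here is a minimal sketch. The variable names (grid, numbers) are illustrative only. It shows a tuple working as a key, a list being rejected with a TypeError, and two keys that compare equal sharing a single entry.

```python
# Tuples of hashable values are valid dictionary keys.
grid: dict[tuple[int, int], str] = {}
grid[(0, 0)] = "origin"
grid[(2, 3)] = "treasure"
print(grid[(2, 3)])  # treasure

# Lists are mutable and therefore unhashable, so they cannot be keys.
try:
    grid[[4, 5]] = "invalid"
except TypeError as exc:
    error_message = str(exc)
    print(f"TypeError: {error_message}")

# Keys that compare equal and hash equal refer to the same entry.
numbers = {1: "int"}
numbers[1.0] = "float"  # 1 == 1.0 and hash(1) == hash(1.0)
print(numbers)  # {1: 'float'}
```

Note that the second assignment replaces the value but keeps the original key object, which is why the output shows 1 rather than 1.0.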
Creating Dictionaries
# Literal syntax
empty: dict[str, int] = {}
person = {"name": "Alice", "age": 30, "city": "Berlin"}
# dict() constructor with keyword arguments
config = dict(host="localhost", port=8000, debug=True)
# dict() from a list of (key, value) pairs
items = [("a", 1), ("b", 2), ("c", 3)]
from_pairs = dict(items)
# dict.fromkeys() — create with default values
keys = ["x", "y", "z"]
defaults = dict.fromkeys(keys, 0)
print(defaults) # {'x': 0, 'y': 0, 'z': 0}
print(person)
print(config)
Accessing and Modifying
user = {"name": "Bob", "email": "[email protected]", "age": 25, "active": True}
# Access by key — raises KeyError if missing
print(user["name"]) # Bob
# .get() — returns None (or a default) if key is missing, never raises
print(user.get("email")) # [email protected]
print(user.get("phone")) # None
print(user.get("phone", "N/A")) # N/A
# Check if a key exists
print("age" in user) # True
print("address" in user) # False
# Add or update a key
user["phone"] = "+49-555-1234"
user["age"] = 26 # update existing
# Delete a key
del user["phone"]
# .pop() removes and returns the value
age = user.pop("age")
print(f"Removed age: {age}")
# .setdefault() — return the value if key exists, otherwise insert and return default
user.setdefault("role", "user") # inserts role="user" since it doesn't exist
user.setdefault("name", "nobody") # does NOT change name since it already exists
print(user["role"]) # user
print(user["name"]) # Bob
print(user)
Iterating Over Dictionaries
scores = {"Alice": 92, "Bob": 78, "Carol": 85, "Dave": 90}
# Iterate over keys (default)
for name in scores:
    print(name)
# Iterate over values
for score in scores.values():
    print(score)
# Iterate over key-value pairs
for name, score in scores.items():
    print(f"{name}: {score}")
# Sort by value
for name, score in sorted(scores.items(), key=lambda x: x[1], reverse=True):
    print(f"{name}: {score}")
# Alice: 92
# Dave: 90
# Carol: 85
# Bob: 78
Merging Dictionaries
defaults = {"color": "blue", "size": "medium", "font": "Arial"}
overrides = {"color": "red", "size": "large"}
# Python 3.9+: | operator (merge, right side wins)
merged = defaults | overrides
print(merged) # {'color': 'red', 'size': 'large', 'font': 'Arial'}
# Python 3.9+: |= updates in place
defaults |= overrides
print(defaults)
# Unpacking approach (works in Python 3.5+)
merged2 = {**defaults, **overrides}
# .update() modifies in place
base = {"a": 1, "b": 2}
base.update({"b": 99, "c": 3})
print(base) # {'a': 1, 'b': 99, 'c': 3}
Dictionary Comprehensions
# Square each number
squares = {n: n ** 2 for n in range(1, 6)}
print(squares) # {1: 1, 2: 4, 3: 9, 4: 16, 5: 25}
# Keep only the scores of 85 or higher
names = ["Alice", "Bob", "Carol", "Dave"]
scores_raw = [92, 78, 85, 90]
score_dict = {name: score for name, score in zip(names, scores_raw) if score >= 85}
print(score_dict) # {'Alice': 92, 'Carol': 85, 'Dave': 90}
# Invert a dictionary (swap keys and values)
original = {"a": 1, "b": 2, "c": 3}
inverted = {v: k for k, v in original.items()}
print(inverted) # {1: 'a', 2: 'b', 3: 'c'}
Tip: Use .get() to avoid KeyError
Accessing a missing key with dict[key] raises a KeyError. Use dict.get(key, default) when you are not sure the key exists. For building nested structures or counters, also consider collections.defaultdict, which automatically creates a default value for missing keys.
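As a rough illustration of this tip, the sketch below counts items with dict.get() and then builds a nested structure with defaultdict. The sample data and the names (orders, totals, by_city) are made up for the example.

```python
from collections import defaultdict

# Hypothetical sample data: (city, product) pairs.
orders = [("Berlin", "tea"), ("Paris", "coffee"), ("Berlin", "tea"), ("Berlin", "coffee")]

# Counting with .get(): a missing key falls back to 0 instead of raising KeyError.
totals: dict[str, int] = {}
for _city, product in orders:
    totals[product] = totals.get(product, 0) + 1
print(totals)  # {'tea': 2, 'coffee': 2}

# Nested counting with defaultdict: inner dicts and counters are created on first access.
by_city: defaultdict[str, defaultdict[str, int]] = defaultdict(lambda: defaultdict(int))
for city, product in orders:
    by_city[city][product] += 1

# Convert to plain dicts for readable output.
plain = {city: dict(products) for city, products in by_city.items()}
print(plain)  # {'Berlin': {'tea': 2, 'coffee': 1}, 'Paris': {'coffee': 1}}
```

One caveat: merely reading a missing key from a defaultdict inserts it, so use .get() or the in operator when a lookup should not modify the mapping.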
`collections.defaultdict` and `Counter`
from collections import defaultdict, Counter
# defaultdict — provides a default factory for missing keys
word_lengths: defaultdict[int, list[str]] = defaultdict(list)
words = ["cat", "dog", "ant", "fox", "elephant", "bear"]
for word in words:
    word_lengths[len(word)].append(word)
print(dict(word_lengths))
# {3: ['cat', 'dog', 'ant', 'fox'], 8: ['elephant'], 4: ['bear']}
# Counter — specialized dict for counting
text = "the quick brown fox jumps over the lazy dog"
word_counts = Counter(text.split())
print(word_counts.most_common(3))
# [('the', 2), ('quick', 1), ('brown', 1)]
# Counter supports arithmetic
c1 = Counter({"a": 3, "b": 2})
c2 = Counter({"a": 1, "b": 4, "c": 2})
print(c1 + c2) # Counter({'b': 6, 'a': 4, 'c': 2})
print(c1 - c2) # Counter({'a': 2})
Sets
A set is an unordered collection of unique hashable elements. Sets do not allow duplicates and have no indexing. Their main strengths are fast membership testing and set algebra operations.
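These properties can be checked directly. The short sketch below uses made-up values to show that duplicates collapse, that indexing is not supported, and that unhashable elements such as lists are rejected.

```python
# Duplicates collapse: only unique elements are stored.
readings = {3, 1, 3, 2, 1}
print(len(readings))  # 3

# Sets have no positions, so indexing raises TypeError.
try:
    readings[0]
except TypeError:
    indexing_supported = False
else:
    indexing_supported = True
print(indexing_supported)  # False

# Elements must be hashable: a list is rejected, a tuple is accepted.
try:
    bad = {[1, 2]}
except TypeError:
    bad = None
points = {(1, 2), (1, 2), (3, 4)}
print(sorted(points))  # [(1, 2), (3, 4)]
```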
Creating Sets
# Literal syntax — note: {} creates an empty DICT, not a set!
empty_set: set[int] = set()
fruits = {"apple", "banana", "cherry", "apple", "banana"}
print(fruits) # {'cherry', 'banana', 'apple'} — duplicates removed, order not guaranteed
# set() constructor
from_list = set([1, 2, 2, 3, 3, 3])
print(from_list) # {1, 2, 3}
from_string = set("mississippi")
print(from_string) # {'m', 'i', 's', 'p'} (unique characters)
Set Operations
python_devs = {"Alice", "Bob", "Carol", "Dave", "Eve"}
js_devs = {"Bob", "Frank", "Carol", "Grace"}
# Union — all members of either set
print(python_devs | js_devs)
# {'Alice', 'Bob', 'Carol', 'Dave', 'Eve', 'Frank', 'Grace'}
# Intersection — members in BOTH sets
print(python_devs & js_devs)
# {'Bob', 'Carol'}
# Difference — in python_devs but NOT in js_devs
print(python_devs - js_devs)
# {'Alice', 'Dave', 'Eve'}
# Symmetric difference — in one but NOT in both
print(python_devs ^ js_devs)
# {'Alice', 'Dave', 'Eve', 'Frank', 'Grace'}
# Subset and superset
small = {"Bob", "Carol"}
print(small.issubset(python_devs)) # True
print(python_devs.issuperset(small)) # True
print(small.isdisjoint({"Alice", "Dave"})) # True (no common elements)
Modifying Sets
tags = {"python", "web", "backend"}
# Add a single element
tags.add("api")
# Add multiple elements
tags.update(["devops", "cloud"])
# Remove — raises KeyError if not found
tags.remove("web")
# Discard — silently ignores if not found (preferred when unsure)
tags.discard("nonexistent")
# Pop — removes and returns an arbitrary element
removed = tags.pop()
print(f"Removed: {removed}")
print(tags)
Frozensets: Immutable Sets
# frozenset is immutable — can be used as a dictionary key or set element
permissions = frozenset({"read", "write"})
admin_permissions = frozenset({"read", "write", "delete", "admin"})
# Set operations work the same
print(permissions.issubset(admin_permissions)) # True
# Frozensets can be dictionary keys
role_map: dict[frozenset[str], str] = {
    frozenset({"read"}): "viewer",
    frozenset({"read", "write"}): "editor",
    frozenset({"read", "write", "delete", "admin"}): "admin",
}
user_perms = frozenset({"read", "write"})
print(role_map.get(user_perms, "unknown")) # editor
Practical Patterns
Remove Duplicates While Preserving Order
# set() loses order — use dict.fromkeys() to preserve it
items = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5]
unique_ordered = list(dict.fromkeys(items))
print(unique_ordered) # [3, 1, 4, 5, 9, 2, 6]
Fast Membership Testing
# Using a list — O(n) lookup
valid_extensions_list = [".jpg", ".png", ".gif", ".webp", ".svg"]
# Using a set — O(1) lookup (much faster for large collections)
valid_extensions_set = {".jpg", ".png", ".gif", ".webp", ".svg"}
filename = "photo.jpg"
ext = "." + filename.split(".")[-1]
if ext in valid_extensions_set:
    print("Valid image file")
Info: Sets for large membership checks
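If you frequently check whether a value exists in a collection (e.g., allowed values, seen IDs, blacklisted words), store the values in a set rather than a list. A list lookup scans elements one by one, so its cost grows with the length of the list, while a set lookup takes roughly constant time. For a collection of 10,000 elements, a set lookup can be on the order of 1,000 times faster, though the exact ratio depends on the data and on where the match sits in the list.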
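To get a feel for the difference on your own machine, a rough benchmark along the following lines can help. It is only a sketch: the collection size, the repeat counts, and the worst-case target (the last element) are assumptions chosen for illustration, and the measured ratio will vary with hardware and Python version.

```python
import timeit

# Assumed setup: 10,000 integer IDs stored both ways.
ids_list = list(range(10_000))
ids_set = set(ids_list)
target = 9_999  # worst case for the list: the match is the last element

# Time 200 lookups per run and keep the best of 3 runs to reduce noise.
list_seconds = min(timeit.repeat(lambda: target in ids_list, number=200, repeat=3))
set_seconds = min(timeit.repeat(lambda: target in ids_set, number=200, repeat=3))

print(f"list: {list_seconds:.6f}s for 200 lookups")
print(f"set:  {set_seconds:.6f}s for 200 lookups")
if set_seconds > 0:
    print(f"ratio: about {list_seconds / set_seconds:.0f}x")
```

Searching for the last element exaggerates the gap. A match near the front of the list would narrow it considerably, which is why the figure should be read as an upper-end estimate.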
Next Steps
- comprehensions-and-generators