[Python Series 13] Work With Data Using Comprehensions

Work with data using comprehensions

Now that you understand loops and conditions, let's consolidate that knowledge and learn how to write cleaner, shorter code using Python comprehensions. A comprehension combines "iteration + condition + expression" into one line that builds a new collection. Stick to the patterns that stay readable and your data transformations become easier to follow.

Key terms

  1. Comprehension: a compact syntax that merges loops and conditionals to build lists, dictionaries, or sets
  2. Set: an unordered collection type that forbids duplicates
  3. Generator expression: the parenthesized version that computes values lazily to save memory

Study notes

  • Time: 45–55 minutes
  • Prereqs: comfort with lists, dictionaries, conditions, and loops
  • Goal: refactor list/dict/set logic into comprehensions without losing readability

Core ideas

  • Comprehensions compress loops and filters into one expression.
  • Sets eliminate duplicates and ignore ordering.
  • Generator expressions delay computation and conserve memory.

Code walkthrough

Basic list comprehension

numbers = [1, 2, 3, 4, 5]
squared = [n * n for n in numbers]
even_squared = [n * n for n in numbers if n % 2 == 0]
  • for n in numbers iterates through the original list.
  • n * n defines what to store.
  • if n % 2 == 0 is an optional filter.
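To see what the comprehension replaces, the sketch below builds the same list with an explicit loop. The name `even_squared_loop` is introduced here only for the comparison.

```python
numbers = [1, 2, 3, 4, 5]

# Loop version: iterate, filter, then append the transformed value.
even_squared_loop = []
for n in numbers:
    if n % 2 == 0:
        even_squared_loop.append(n * n)

# Comprehension version: the same three parts in one expression.
even_squared = [n * n for n in numbers if n % 2 == 0]

print(even_squared_loop)                  # [4, 16]
print(even_squared == even_squared_loop)  # True
```

Keeping both versions side by side makes it easy to check that a refactor did not change the result.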

Long conditions hurt readability, so keep complex branching in plain loops or helper functions.
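As a minimal sketch of the helper-function approach, the example below moves a multi-part condition into a named function. The helper `is_valid_score`, its validation rule, and the sample data are all assumptions made for illustration.

```python
def is_valid_score(value):
    """Hypothetical rule: keep integers from 0 to 100, excluding booleans."""
    return (
        isinstance(value, int)
        and not isinstance(value, bool)
        and 0 <= value <= 100
    )

raw_values = [85, -3, "92", 100, None, 250, True]  # made-up sample data

# The filter is a single readable call instead of a long inline condition.
valid_scores = [v for v in raw_values if is_valid_score(v)]

print(valid_scores)  # [85, 100]
```

The comprehension still reads as "iterate, filter, keep", while the details of the rule live in one place that can be tested on its own.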

Dictionary and set comprehensions

people = [
    {"name": "민지", "score": 85},
    {"name": "준호", "score": 92},
]

score_map = {p["name"]: p["score"] for p in people}
passed = {p["name"] for p in people if p["score"] >= 90}
  • {key: value for ...} produces a dictionary.
  • {expression for ...} produces a set. Sets store each value only once, so they are a good fit when a filter should return every match without duplicates.
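The sketch below shows how duplicates behave in each collection type. The third record, which repeats a name, is a hypothetical addition to the sample data.

```python
people = [
    {"name": "민지", "score": 85},
    {"name": "준호", "score": 92},
    {"name": "준호", "score": 95},  # hypothetical duplicate name
]

# Set comprehension: the repeated name is stored only once.
passed = {p["name"] for p in people if p["score"] >= 90}

# Dict comprehension: a repeated key keeps the value seen last.
score_map = {p["name"]: p["score"] for p in people}

print(passed)     # {'준호'}
print(score_map)  # {'민지': 85, '준호': 95}
```

If silently overwriting a repeated key is not what you want, a plain loop that checks for existing keys is the clearer choice.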

Handle nested loops

matrix = [[1, 2], [3, 4], [5, 6]]
flattened = [item for row in matrix for item in row]

Read from left to right: for row in matrix then for item in row. More than two levels quickly become hard to read, so consider regular loops or itertools after that.
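The reading order can be checked against the equivalent nested loop, as in this sketch. The names `flattened_loop` and `flattened_chain` are introduced only for the comparison, and `itertools.chain.from_iterable` is one standard-library option for a single level of flattening.

```python
from itertools import chain

matrix = [[1, 2], [3, 4], [5, 6]]

# Loop version: the for clauses appear in the same order as in the comprehension.
flattened_loop = []
for row in matrix:
    for item in row:
        flattened_loop.append(item)

# Comprehension version: outer loop first, inner loop second.
flattened = [item for row in matrix for item in row]

# itertools alternative for flattening one level of nesting.
flattened_chain = list(chain.from_iterable(matrix))

print(flattened)  # [1, 2, 3, 4, 5, 6]
print(flattened == flattened_loop == flattened_chain)  # True
```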

Conditional expressions and layered transforms

def normalize(score):
    # Clamp the score into the 0–100 range.
    return 0 if score < 0 else min(score, 100)

raw_scores = [-5, 42, 75, 130]  # illustrative sample input

normalized = [normalize(s) for s in raw_scores]
labels = ["pass" if s >= 60 else "fail" for s in normalized]

Inline ternaries (A if condition else B) are fine for short branches. As soon as the logic needs more than one branch or mixes several conditions, extract a helper so the intent stays clear.
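To illustrate where the line falls, the sketch below compares a chained ternary with a helper. The `grade` function, the "excellent" label, and its 90-point cutoff are assumptions made for this example, as are the sample scores.

```python
def grade(score):
    """Hypothetical three-way label; the 90 cutoff is an assumption."""
    if score >= 90:
        return "excellent"
    if score >= 60:
        return "pass"
    return "fail"

normalized = [0, 42, 75, 100]  # made-up, already-normalized scores

# Chained ternary: valid Python, but the intent is harder to scan.
labels_inline = [
    "excellent" if s >= 90 else "pass" if s >= 60 else "fail"
    for s in normalized
]

# Helper version: the comprehension only says "label every score".
labels = [grade(s) for s in normalized]

print(labels)                   # ['fail', 'fail', 'pass', 'excellent']
print(labels == labels_inline)  # True
```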

Performance and memory

List, dict, and set comprehensions build the entire result in memory right away. For millions of items that you only need to iterate over once, prefer a generator expression.

lazy_numbers = (n * n for n in range(10_000_000))

Switching to parentheses builds a generator that yields values on demand. We will dive deeper into iterators and generators in a later chapter; for now remember that comprehensions are eager while generator expressions are lazy.
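The difference between eager and lazy can be observed directly, as in this rough sketch. It uses a smaller range than the text (`n_items` is an assumed value chosen to keep the demo quick), and `sys.getsizeof` measures only the container object itself, not the integers it refers to.

```python
import sys

n_items = 100_000  # assumed size, smaller than in the text so the demo runs fast

eager_squares = [n * n for n in range(n_items)]  # every value is built now
lazy_squares = (n * n for n in range(n_items))   # nothing is computed yet

# The list holds a slot for each item; the generator is a small fixed-size object.
print(sys.getsizeof(eager_squares) > sys.getsizeof(lazy_squares))  # True

# Values are produced one at a time while sum() consumes the generator.
total = sum(lazy_squares)
print(total == sum(eager_squares))  # True

# A generator can be consumed only once; a second pass yields nothing.
print(sum(lazy_squares))  # 0
```

The single-use behavior is the main trade-off: choose a list when the values will be read more than once, and a generator expression when they are consumed in one pass.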

Why it matters

  • Comprehensions combine iteration, filtering, and transformation in one readable block.
  • List, dictionary, and set comprehensions share the same structure; in all three, avoid excessive nesting.
  • Large datasets benefit from generator expressions to trim memory use.

Practice

  • Follow along: recreate each list and dict example, and keep both the loop and comprehension versions side by side.
  • Extend: refactor CSV parsing logic into a comprehension, then produce a set to remove duplicates.
  • Debug: intentionally misplace parentheses in a nested comprehension until you hit SyntaxError, then correct the order.
  • Definition of done: you have applied at least one list, dict, and set comprehension to real code and can explain the readability trade-offs.

Wrap-up

Sharper code clarifies intent. Next we will decorate and compose functions with decorators and higher-order techniques.
