[Python Series 15] Save Memory with Iterators and Generators


Compute lazily with iterators and generators

When processing large amounts of data, running out of memory is a common problem. You can solve this by generating values one at a time, exactly when you need them. Three terms show up at once, but they all share the same goal: pull items one at a time. We will start from familiar lists and gradually generalize to iterators and generators.

Key terms

  1. Iterable: anything you can loop over with for, e.g., lists, dicts, strings
  2. Iterator: an object that implements __iter__ and __next__, returning the next value each call
  3. Generator: an iterator created by a function that contains yield, or by a generator expression; Python supplies __iter__ and __next__ for you
  4. Lazy evaluation: a strategy that defers computation until the moment the value is requested

Study notes

  • Time: 60 minutes
  • Prereqs: loops/comprehensions, defining functions
  • Goal: build a custom iterator class, then re-create it with generator functions and expressions
  • Feel free to take only the Core path if you are short on time.

Core ideas

  • An iterable is any object you can traverse.
  • An iterator responds to next() and raises StopIteration to finish.
  • Generators use yield to create iterators in one step.
  • Lazy evaluation reduces memory pressure by delaying work.

Code walkthrough

Terminology (Core)

  • Iterable: works in a for loop by exposing __iter__ or indexed access via __getitem__.
  • Iterator: exposes both __iter__ and __next__; each next() returns the next value.
  • Generator: the iterator you get from calling a function that contains yield, or from a generator expression; Python builds the iterator machinery automatically.
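To make the iterable/iterator distinction concrete, here is a minimal sketch that uses only the built-in iter() and next(). The variable names are illustrative and not part of the original lesson.

```python
numbers = [10, 20, 30]      # a list is an iterable, but not an iterator

it = iter(numbers)          # ask the iterable for a fresh iterator
first = next(it)            # 10
second = next(it)           # 20

# An iterator returns itself from iter(); a list hands out a new iterator each time.
iterator_is_self = iter(it) is it
list_gives_new = iter(numbers) is not iter(numbers)

# Lists do not define __next__, so next(numbers) would raise TypeError.
list_has_next = hasattr(numbers, "__next__")

print(first, second, iterator_is_self, list_gives_new, list_has_next)
```

The list can be traversed again and again because every iter(numbers) call starts over, while the iterator `it` only moves forward.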

Build an iterator class (Core)

class Countdown:
    def __init__(self, start):
        self.current = start

    def __iter__(self):
        return self

    def __next__(self):
        if self.current <= 0:
            raise StopIteration
        value = self.current
        self.current -= 1
        return value


for number in Countdown(3):
    print(number)  # 3 2 1

When StopIteration is raised, the loop exits. Custom iterators give you precise control over state, but they require boilerplate.
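It can help to see roughly what a for loop does with any iterator. The sketch below spells those steps out by hand; manual_for is a hypothetical helper written for illustration, not something Python provides.

```python
def manual_for(iterable):
    """Collect items the way a for loop would, using iter() and next() explicitly."""
    collected = []
    iterator = iter(iterable)        # step 1: obtain an iterator
    while True:
        try:
            item = next(iterator)    # step 2: ask for the next value
        except StopIteration:        # step 3: the iterator signals it is finished
            break
        collected.append(item)       # the "loop body"
    return collected


print(manual_for("abc"))  # ['a', 'b', 'c']
```

Because Countdown.__iter__ returns self, a single Countdown instance can be traversed only once; a second loop over the same object ends immediately.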

Generator functions and yield (Core)

Generator functions remove most of the ceremony. Think of yield as "hand over a value and pause."

def countdown(start):
    current = start
    while current > 0:
        yield current
        current -= 1


for number in countdown(3):
    print(number)

Any function whose body contains yield is a generator function. Calling it does not run the body; it returns a generator object. The body runs up to the next yield each time next() is called, then pauses there with its local state preserved automatically.
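To watch the pause-and-resume behavior directly, the following sketch logs when the generator body actually runs. countdown_logged and the events list are illustrative additions, assumed here only for the demonstration.

```python
events = []  # records when the generator body actually runs


def countdown_logged(start):
    events.append("body started")
    current = start
    while current > 0:
        yield current                          # hand over a value and pause here
        events.append(f"resumed after {current}")
        current -= 1


gen = countdown_logged(2)           # creates the generator; the body has not run yet
events_after_call = list(events)    # still empty

first = next(gen)                   # runs the body up to the first yield
events_after_first = list(events)

second = next(gen)                  # resumes after the yield and runs to the next one
rest = list(gen)                    # resumes once more; the loop ends, so nothing is left

print(events_after_call, first, events_after_first, second, rest, events)
```

The log stays empty until the first next() call, which is the practical meaning of lazy evaluation for generator functions.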

Generator expressions (Core → Plus)



import itertools

squares = (n * n for n in range(1, 1_000_001))  # nothing is computed yet
first_ten = list(itertools.islice(squares, 10))  # only the first ten squares are computed

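Parentheses create a generator expression. The generator object itself stays small no matter how long the range is, because each value is computed only when a consumer asks for it. A consumer such as sum handles one value at a time, while list stores everything it pulls, so slice first (as above) if you only need a few items.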
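A rough way to see the memory difference is to compare the container sizes with sys.getsizeof. This is a sketch under the assumption that the shallow object size is a good enough proxy; getsizeof does not count the integers a list refers to, so the real gap is larger.

```python
import sys

n = 100_000
squares_list = [i * i for i in range(n)]   # every value is stored at once
squares_gen = (i * i for i in range(n))    # values are produced on demand

list_bytes = sys.getsizeof(squares_list)   # grows with n
gen_bytes = sys.getsizeof(squares_gen)     # small and independent of n

total = sum(squares_gen)        # consumes the generator one value at a time
leftover = list(squares_gen)    # the generator is now exhausted

print(list_bytes, gen_bytes, total, leftover)
```

Note that a generator is single-use: after sum has consumed it, there is nothing left to iterate over.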

Iterator toolbox (Optional)

itertools combines iterators like LEGO bricks.

  • itertools.count(start=0, step=1): infinite increasing sequence
  • itertools.cycle(iterable): repeat items forever
  • itertools.chain(a, b, ...): concatenate multiple iterables
  • itertools.groupby(iterable, key): group consecutive items that share a key, so sort the data by the same key first
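The four tools above can be tried out in a few lines. The sample data below is made up for illustration; islice is used to cut the infinite sequences down to a finite size.

```python
import itertools

# count and cycle never end on their own, so take a finite slice.
evens = list(itertools.islice(itertools.count(start=0, step=2), 5))
lights = list(itertools.islice(itertools.cycle(["red", "green"]), 5))

# chain walks through several iterables as if they were one.
merged = list(itertools.chain([1, 2], (3,), "ab"))

# groupby only groups neighbors, so sort by the same key first.
words = sorted(["banana", "apple", "cherry", "avocado", "blueberry"],
               key=lambda w: w[0])
grouped = {letter: list(group)
           for letter, group in itertools.groupby(words, key=lambda w: w[0])}

print(evens)    # [0, 2, 4, 6, 8]
print(lights)   # ['red', 'green', 'red', 'green', 'red']
print(merged)   # [1, 2, 3, 'a', 'b']
print(grouped)
```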

Memory and performance strategies (Optional)

  • Stream large CSV files line by line with generators instead of loading them at once.
  • Wrap slow sources such as network responses in generators to let consumers pace themselves.
  • Use itertools.islice to grab only the portion you need.
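These strategies combine naturally into a small pipeline. The sketch below is one possible shape, not the only one: read_rows is a hypothetical helper, the column names are invented, and io.StringIO stands in for a large file you would normally open with open(path, newline="").

```python
import csv
import io
import itertools


def read_rows(lines):
    """Yield one parsed CSV row (a dict) at a time instead of loading them all."""
    for row in csv.DictReader(lines):
        yield row


# Stand-in for a large CSV file; the data is made up for illustration.
fake_file = io.StringIO("name,amount\nkim,10\nlee,25\npark,5\nchoi,40\n")

rows = read_rows(fake_file)                                 # stage 1: stream rows
large = (row for row in rows if int(row["amount"]) >= 10)   # stage 2: filter lazily
first_two = list(itertools.islice(large, 2))                # stage 3: stop after two matches

print([row["name"] for row in first_two])  # ['kim', 'lee']
```

Only as many lines are read as the final consumer needs: after the two matches are found, the remaining rows are still waiting unread in the source.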

Why it matters

  • Iterables can be looped over; iterators answer next() calls; generators make writing iterators effortless.
  • Generator functions remember state while producing one value at a time.
  • With generator expressions and itertools, you build pipelines that avoid wasting memory.

Practice

  • Follow along: implement both the Countdown class and the countdown generator to compare outputs.
  • Extend: mimic a large CSV list and process it with a generator expression plus itertools.islice.
  • Debug: in Countdown.__next__, replace raise StopIteration with a bare return, watch the loop keep receiving None without ever ending, then restore the exception.
  • Definition of done: you have a class-based iterator, a yield function, and a generator expression running in one notebook and can justify when to pick each.

Wrap-up

Next we will manage resources safely with context managers and the with statement.
