[FastAPI Series 19] Logging, Monitoring, and Error Handling


During operations you need logging and monitoring to answer “what just happened?” Logging records events inside the code. Monitoring turns those records and metrics into dashboards. OpenTelemetry sends logs, metrics, and traces in a standard format. In this episode we configure structured logs, OpenTelemetry-driven metrics/traces, and user-friendly error responses.

Primer: observability in plain language

  • Logs are “diaries that record events in chronological order.”
  • Monitoring is the “bulletin board that shows the diary and numbers as graphs.”
  • Observability means “designing the system so you can find the cause using only the diary and the board.”

Keep that framing and the tooling roles become clear.

Key terms

  1. Structured log: Logs written in a consistent format such as JSON so analysis tools can search and aggregate them easily.
  2. OpenTelemetry (OTel): An open standard for collecting and shipping logs, metrics, and traces to backends like Grafana Tempo or Jaeger.
  3. structlog: A Python helper that makes structured logging easy by adding timestamps, levels, and fields automatically.
  4. JSONResponse: FastAPI’s JSON response class, useful for returning consistent, user-friendly error payloads.
  5. Prometheus/OTLP: Prometheus scrapes metrics from an HTTP endpoint, while OTLP (OpenTelemetry Protocol) transports traces, metrics, and logs to external collectors.
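To make "search and aggregate" concrete, here is a minimal sketch using only the standard library. The log lines are hypothetical samples shaped like JSON-rendered output, and the helper names `count_by` and `select` are invented for this illustration; real setups would run equivalent queries inside Loki or ELK.

```python
import json
from collections import Counter

# Hypothetical log lines, shaped like the output of a JSON renderer.
LOG_LINES = [
    '{"event": "request_finished", "path": "/health", "status": 200, "level": "info"}',
    '{"event": "request_finished", "path": "/items", "status": 500, "level": "error"}',
    '{"event": "request_finished", "path": "/items", "status": 200, "level": "info"}',
    '{"event": "validation_failed", "path": "/items", "status": 422, "level": "warning"}',
]


def count_by(lines, field):
    """Count log records per value of `field`, skipping records that lack it."""
    counts = Counter()
    for line in lines:
        record = json.loads(line)
        if field in record:
            counts[record[field]] += 1
    return counts


def select(lines, **filters):
    """Return parsed records whose fields equal every given filter value."""
    records = (json.loads(line) for line in lines)
    return [r for r in records if all(r.get(k) == v for k, v in filters.items())]


if __name__ == "__main__":
    print(count_by(LOG_LINES, "status"))
    print(select(LOG_LINES, path="/items", level="error"))
```

Because every line is a JSON object with stable keys, both helpers need no regular expressions. That is the practical payoff of structured logs over free-form text.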

Practice card

  • Estimated time: 55 minutes (core) / +30 minutes (optional)
  • Prereqs: Episode 18 performance code, basic logging familiarity
  • Goal: Apply structured logs and request/error handlers, then extend into OpenTelemetry, metrics, or alerts when time allows

Core practice: structured logs + error responses

The required scope covers JSON logs, a request-logging middleware, and friendly error responses. The rest of the observability stack lives in the optional section.

Structured logs with structlog

With JSONRenderer as the final processor, structlog prints each log entry as one JSON line, so ELK, Loki, or similar collectors can parse it.


import structlog
from fastapi import FastAPI

app = FastAPI()

# Processors run in order; the last one renders the event dict as a JSON line.
structlog.configure(
    processors=[
        structlog.processors.TimeStamper(fmt="iso"),
        structlog.processors.add_log_level,
        structlog.processors.dict_tracebacks,
        structlog.processors.JSONRenderer(),
    ]
)

logger = structlog.get_logger()

@app.get("/health")
async def health():
    logger.info("health_check", status="ok")
    return {"status": "ok"}

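dict_tracebacks converts exception info into a JSON-serializable structure, so a call such as logger.exception(...) carries the stack trace as data rather than as a multi-line string.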
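To show what "an exception as JSON" means, here is a standard-library sketch of the idea. It is not structlog's implementation, and the field names (`exc_type`, `exc_value`, `frames`) are illustrative rather than a guaranteed match for structlog's exact schema. The helper `exception_to_dict` is hypothetical.

```python
import json
import traceback


def exception_to_dict(exc):
    """Turn an exception into a JSON-serializable dict (illustrative field names)."""
    frames = [
        {"filename": f.filename, "lineno": f.lineno, "name": f.name}
        for f in traceback.extract_tb(exc.__traceback__)
    ]
    return {
        "exc_type": type(exc).__name__,
        "exc_value": str(exc),
        "frames": frames,
    }


def divide(a, b):
    return a / b


def run():
    """Trigger a failure on purpose and return its dict form."""
    try:
        divide(1, 0)
    except ZeroDivisionError as exc:
        return exception_to_dict(exc)


if __name__ == "__main__":
    print(json.dumps(run(), indent=2))
```

Each frame becomes a small object, so a log backend can filter on the function name or file instead of grepping a text traceback.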

Request/response logging middleware

from fastapi import Request

@app.middleware("http")
async def log_requests(request: Request, call_next):
    logger.info("request_started", method=request.method, path=request.url.path)
    response = await call_next(request)
    logger.info(
        "request_finished",
        method=request.method,
        path=request.url.path,
        status=response.status_code,
    )
    return response

Avoid dumping request or response bodies, which often contain sensitive data. Log only the metadata you actually need.
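The middleware above logs method, path, and status. To correlate lines that belong to one request, a request ID and a duration help. The following is a sketch written as a pure ASGI middleware using only the standard library, so it runs without a server. The class name `RequestIdMiddleware`, the `x-request-id` header, and the `access` logger name are assumptions for this example, not part of the series code.

```python
import asyncio
import json
import logging
import time
import uuid

log = logging.getLogger("access")


class RequestIdMiddleware:
    """Pure ASGI middleware: tag each HTTP request with an ID and log metadata only."""

    def __init__(self, app):
        self.app = app

    async def __call__(self, scope, receive, send):
        if scope["type"] != "http":
            # Pass lifespan and websocket traffic through untouched.
            await self.app(scope, receive, send)
            return

        request_id = uuid.uuid4().hex
        started = time.perf_counter()
        status = {"code": None}

        async def send_wrapper(message):
            if message["type"] == "http.response.start":
                status["code"] = message["status"]
                # Echo the ID so clients can quote it in bug reports.
                headers = list(message.get("headers", []))
                headers.append((b"x-request-id", request_id.encode()))
                message = {**message, "headers": headers}
            await send(message)

        try:
            await self.app(scope, receive, send_wrapper)
        finally:
            # Metadata only: no headers, query strings, or bodies are logged.
            log.info(json.dumps({
                "event": "request_finished",
                "request_id": request_id,
                "method": scope["method"],
                "path": scope["path"],
                "status": status["code"],
                "duration_ms": round((time.perf_counter() - started) * 1000, 2),
            }))


async def demo():
    """Drive the middleware with a stand-in ASGI app, no server required."""
    sent = []

    async def hello_app(scope, receive, send):
        await send({"type": "http.response.start", "status": 200, "headers": []})
        await send({"type": "http.response.body", "body": b"ok"})

    async def receive():
        return {"type": "http.request", "body": b"", "more_body": False}

    async def send(message):
        sent.append(message)

    scope = {"type": "http", "method": "GET", "path": "/health"}
    await RequestIdMiddleware(hello_app)(scope, receive, send)
    return sent


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO, format="%(message)s")
    asyncio.run(demo())
```

In a FastAPI app this kind of class would typically be registered with app.add_middleware(RequestIdMiddleware). The log call could equally go through the structlog logger configured earlier.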

Unified error responses

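from fastapi import Request
from fastapi.encoders import jsonable_encoder
from fastapi.exceptions import RequestValidationError
from fastapi.responses import JSONResponse

@app.exception_handler(RequestValidationError)
async def validation_exception_handler(request: Request, exc: RequestValidationError):
    # exc.errors() can hold values that are not JSON-serializable, so encode first.
    errors = jsonable_encoder(exc.errors())
    logger.warning("validation_failed", path=request.url.path, errors=errors)
    return JSONResponse(
        status_code=422,
        content={
            "detail": "The request format is invalid.",
            "errors": errors,
        },
    )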

Users see a friendly message while logs keep the full details.
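The same split between a friendly message and detailed logs can apply to unexpected failures. Below is a minimal, framework-free sketch: the helper `build_error_response`, the `error_id` field, and the message wording are hypothetical choices for illustration, not something defined earlier in the series.

```python
import json
import uuid


def build_error_response(exc, status_code=500):
    """Return (public_body, log_record) for an unhandled exception.

    The public body never includes the exception text; the shared error_id
    lets support staff find the matching log record.
    """
    error_id = uuid.uuid4().hex
    public_body = {
        "detail": "Something went wrong on our side. Please try again later.",
        "error_id": error_id,
    }
    log_record = {
        "event": "unhandled_error",
        "error_id": error_id,
        "status": status_code,
        "exc_type": type(exc).__name__,
        "exc_message": str(exc),
    }
    return public_body, log_record


if __name__ == "__main__":
    body, record = build_error_response(RuntimeError("db connection refused"))
    print(json.dumps(body))
    print(json.dumps(record))
```

In FastAPI, a handler registered with @app.exception_handler(Exception) could call this helper, log the record, and return JSONResponse(status_code=500, content=public_body).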

Optional: full observability stack

Once the core work is done, add OpenTelemetry, Prometheus, dashboards, and alerts. If you only need this in production, schedule it for a future session.

Optional: OpenTelemetry wiring

pip install opentelemetry-sdk opentelemetry-instrumentation-fastapi opentelemetry-exporter-otlp

from fastapi import FastAPI
from opentelemetry import trace
from opentelemetry.exporter.otlp.proto.http.trace_exporter import OTLPSpanExporter
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor
from opentelemetry.instrumentation.fastapi import FastAPIInstrumentor

# `settings` is the application's configuration object (not defined in this episode).
# For the HTTP exporter, the endpoint should be the full traces URL,
# e.g. http://localhost:4318/v1/traces.
provider = TracerProvider()
processor = BatchSpanProcessor(OTLPSpanExporter(endpoint=settings.otel_endpoint))
provider.add_span_processor(processor)
trace.set_tracer_provider(provider)

app = FastAPI()
# Creates a span for every incoming request handled by the app.
FastAPIInstrumentor.instrument_app(app)

OTLP lets you send traces to Tempo, Jaeger, Datadog, and more.

Optional: expose metrics

prometheus-fastapi-instrumentator surfaces request counts and latency distributions within minutes.

from prometheus_fastapi_instrumentator import Instrumentator

Instrumentator().instrument(app).expose(app)

This adds a /metrics endpoint in Prometheus format.
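For readers who have not seen the Prometheus text format, here is a small standard-library sketch that renders a counter the way a /metrics response looks. The class `RequestCounter` and the metric name `demo_http_requests_total` are invented for this example; the instrumentator library exposes its own metric names and also adds histograms.

```python
from collections import Counter


class RequestCounter:
    """In-memory counter rendered in the Prometheus text exposition format."""

    def __init__(self, name, help_text):
        self.name = name
        self.help_text = help_text
        self._counts = Counter()

    def inc(self, method, status):
        """Increment the series identified by the (method, status) labels."""
        self._counts[(method, str(status))] += 1

    def render(self):
        """Return HELP and TYPE headers followed by one sample per label set."""
        lines = [
            f"# HELP {self.name} {self.help_text}",
            f"# TYPE {self.name} counter",
        ]
        for (method, status), value in sorted(self._counts.items()):
            lines.append(
                f'{self.name}{{method="{method}",status="{status}"}} {value}'
            )
        return "\n".join(lines) + "\n"


if __name__ == "__main__":
    counter = RequestCounter("demo_http_requests_total", "Total HTTP requests.")
    counter.inc("GET", 200)
    counter.inc("GET", 200)
    counter.inc("POST", 422)
    print(counter.render(), end="")
```

Prometheus scrapes this text on an interval and stores each labeled line as a separate time series, which is what dashboards later query.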

Optional: dashboard D2

[Diagram: FastAPI sends logs to Loki, metrics to Prometheus, and traces to Tempo; Grafana reads from all three and presents them to the on-call engineer.]

Bundling logs, metrics, and traces inside Grafana shortens incident response time.

Optional: error alerts

  • SaaS options like Sentry, Honeybadger, or Airbrake send stack traces straight to your inbox.
  • Slack/Teams webhooks can forward critical logs as messages.
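As a sketch of the webhook option, the code below filters log records by level and builds the JSON body that Slack incoming webhooks accept (a `text` field). The function names, the record shape, and the error threshold are assumptions for this example; the webhook URL must come from your own workspace, and Teams expects its own payload format.

```python
import json
import logging
import urllib.request

# Map structlog-style level names to numeric levels from the logging module.
LEVELS = {
    "debug": logging.DEBUG,
    "info": logging.INFO,
    "warning": logging.WARNING,
    "error": logging.ERROR,
    "critical": logging.CRITICAL,
}


def should_alert(record, threshold=logging.ERROR):
    """Return True when the record's level is at or above the threshold."""
    return LEVELS.get(record.get("level", "").lower(), 0) >= threshold


def build_alert_payload(record):
    """Format a log record dict as a Slack incoming-webhook message body."""
    request_id = record.get("request_id", "n/a")
    text = f"[{record['level'].upper()}] {record['event']} (request_id={request_id})"
    return {"text": text}


def send_alert(webhook_url, record, timeout=5):
    """POST the alert to the webhook and return the HTTP status code."""
    body = json.dumps(build_alert_payload(record)).encode("utf-8")
    request = urllib.request.Request(
        webhook_url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status


if __name__ == "__main__":
    sample = {"level": "error", "event": "payment_failed", "request_id": "abc123"}
    if should_alert(sample):
        # Printing instead of sending keeps the demo free of network calls.
        print(json.dumps(build_alert_payload(sample)))
```

Filtering before sending matters: forwarding every warning to a chat channel quickly trains people to ignore it.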

With observability in place, mean time to recovery shrinks. The final episode ties every component together into a mini service.

Exercises

  • Follow along: emit JSON logs via structlog, register the request middleware, and add the validation exception handler.
  • Extend: wire OpenTelemetry or Prometheus and sketch a simple dashboard flow.
  • Debug: trigger intentional errors to inspect logged fields and trim any sensitive data.
  • Definition of done: a single log line lets you trace request IDs and statuses, plus you have tested at least one observability extension.

Wrap-up

Design observability alongside your code and incidents become less stressful. Treat JSON logs, middleware, and friendly error responses as the defaults, then layer OpenTelemetry when needed.
