# Span

## Question: What is a span in relation to traces, logs and metrics?

A **span** is the foundational unit of **distributed tracing**, representing a
single logical operation (e.g., an HTTP request, a database call, or a function
invocation) with timing and contextual metadata [7]. To understand spans
deeply, and how they relate to **traces**, **logs**, and **metrics**, we must
examine their structural, semantic, and operational relationships.

---

### **1. Span as the Atomic Unit of a Trace**

A **trace** is a directed acyclic graph (DAG) of spans that captures the
end-to-end journey of a request across services [1]. In practice the graph is
usually a tree, since each span has at most one parent. Each span:

- Has **start and end timestamps**, from which its **duration** follows,
- Contains **attributes** (key-value metadata, e.g., HTTP status, user ID),
- May include **events** (timestamped annotations like “query started”),
- Has a **parent-child relationship** with other spans (e.g., a gateway span may
  have child spans for auth and DB calls) [[3], [8]].

Example trace structure [8]:

```
Trace
├── Span (API Gateway)
│   ├── Span (Auth Service)
│   └── Span (User Service)
│       └── Span (Database Query)
└── Span (Response Formatting)
```
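
The fields listed above can be pictured as a small record type. This is a
minimal sketch, not the type of any real tracing SDK: the class names, field
names, IDs, and timestamps below are all assumptions chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class SpanEvent:
    """A timestamped annotation recorded while a span is open."""
    name: str
    timestamp_ns: int


@dataclass
class Span:
    """Illustrative span record (hypothetical, not a real SDK class)."""
    trace_id: str
    span_id: str
    parent_span_id: Optional[str]  # None marks a root span
    name: str
    start_ns: int
    end_ns: int
    attributes: dict = field(default_factory=dict)
    events: list = field(default_factory=list)

    @property
    def duration_ns(self) -> int:
        # Duration is derived from the two timestamps, not stored separately.
        return self.end_ns - self.start_ns


# Three spans from the tree above; all values are made up.
gateway = Span("t1", "a", None, "API Gateway", start_ns=0, end_ns=9_000_000,
               attributes={"http.response.status_code": 200})
user = Span("t1", "b", "a", "User Service", start_ns=1_000_000, end_ns=6_000_000)
db = Span("t1", "c", "b", "Database Query", start_ns=2_000_000, end_ns=5_000_000,
          events=[SpanEvent("query started", 2_100_000)])

print(db.name, db.duration_ns)  # Database Query 3000000
```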

---

### **2. Relationship to Logs**

- **Logs** are discrete, timestamped records of events (e.g., “error: connection
  timeout”), often unstructured or semi-structured.
- **Spans can embed logs**: When instrumentation libraries (e.g., OpenTelemetry)
  integrate with logging frameworks, log statements can be attached to spans as
  **structured events** or emitted as **log records** that carry the trace
  context (trace ID, span ID) [10].
- This enables **correlation**: You can view logs _within the context_ of a
  specific span, e.g., see all logs from a database query span during a failed
  request [10].

> “When adding OpenTelemetry instrumentation on top of your existing log
> libraries, the log becomes a dot on a trace span” [10].
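
The correlation described above depends on every log line carrying the IDs of
the span that was active when it was written. Here is a minimal sketch using
only the Python standard library. The `current_span` variable stands in for
whatever a real tracing SDK uses to track the active span, and the filter
name, logger name, and ID values are assumptions.

```python
import contextvars
import io
import logging

# Hypothetical holder for the active span's IDs; a real SDK manages this itself.
current_span = contextvars.ContextVar("current_span", default=None)


class TraceContextFilter(logging.Filter):
    """Copy the active trace and span IDs onto every log record."""

    def filter(self, record: logging.LogRecord) -> bool:
        ids = current_span.get() or {}
        record.trace_id = ids.get("trace_id", "-")
        record.span_id = ids.get("span_id", "-")
        return True  # never drop the record, only enrich it


stream = io.StringIO()  # captured in memory so the output is easy to inspect
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter(
    "%(levelname)s trace=%(trace_id)s span=%(span_id)s %(message)s"))
handler.addFilter(TraceContextFilter())

logger = logging.getLogger("db")
logger.addHandler(handler)
logger.setLevel(logging.INFO)
logger.propagate = False

# Simulate "a span is active" while the log statement runs.
token = current_span.set({"trace_id": "4bf92f3577b34da6a3ce929d0e0e4736",
                          "span_id": "00f067aa0ba902b7"})
try:
    logger.error("connection timeout")
finally:
    current_span.reset(token)

print(stream.getvalue(), end="")
# ERROR trace=4bf92f3577b34da6a3ce929d0e0e4736 span=00f067aa0ba902b7 connection timeout
```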

---

### **3. Relationship to Metrics**

- **Metrics** are aggregated numerical measurements over time (e.g., request
  rate, latency percentiles, error counts).
- **Spans feed into metrics indirectly**:
  - Span durations can be used to compute **latency histograms** (e.g.,
    `http.server.request.duration`).
  - Span attributes (e.g., `http.response.status_code`, formerly
    `http.status_code`) can be aggregated into **counters** (e.g.,
    `http_requests_total{status="500"}`).
- Spans are _individual_, _context-rich_ records, whereas metrics are
  _aggregated_ summaries; both are essential for the **RED method** (Rate,
  Errors, Duration) [7].

> “Developers can acquire a comprehensive perspective of their software
> environment by combining distributed traces, metrics, events, and logs” [7].
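
As a rough illustration of how span data rolls up into RED-style metrics, the
sketch below aggregates a handful of finished spans into a rate, an error
count, and a latency histogram. The span data, the 60-second window, and the
bucket bounds are made up for the example; real pipelines do this in a
collector or metrics backend.

```python
from bisect import bisect_left
from collections import Counter

# Hypothetical finished spans: (duration in ms, HTTP status code).
spans = [(12, 200), (48, 200), (95, 500), (230, 200), (7, 404), (510, 500)]
window_seconds = 60

# Rate: requests per second over the window.
rate = len(spans) / window_seconds

# Errors: a counter keyed by status, in the spirit of
# http_requests_total{status="500"}.
requests_total = Counter(str(status) for _, status in spans)
errors = sum(n for status, n in requests_total.items() if status.startswith("5"))

# Duration: histogram with inclusive upper bounds in ms; the extra last
# bucket catches everything above the largest bound.
bounds = [10, 50, 100, 250]
buckets = [0] * (len(bounds) + 1)
for duration_ms, _ in spans:
    buckets[bisect_left(bounds, duration_ms)] += 1

print(rate, errors, buckets)  # 0.1 2 [1, 2, 1, 1, 1]
```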

---

### **4. Relationship to Traces (Recap & Nuance)**

- A **trace** is a _collection of spans_ that together represent a single
  request’s path through a distributed system [3].
- Spans in a trace are linked via:
  - **Trace ID** (identifies the full trace),
  - **Span ID** (identifies the span),
  - **Parent Span ID** (enables tree-like nesting) [[1], [8]].
- Spans may also have **links** to spans in _other traces_ (e.g., for batch
  processing or async workflows) [1].
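
Because spans usually arrive at a backend as a flat list, the three IDs above
are what allow the tree to be rebuilt. A minimal sketch, assuming a
hypothetical tuple layout of `(trace_id, span_id, parent_span_id, name)` and
made-up IDs (span links are left out):

```python
from collections import defaultdict

# Hypothetical flat export: (trace_id, span_id, parent_span_id, name).
flat_spans = [
    ("t1", "a", None, "API Gateway"),
    ("t1", "b", "a", "Auth Service"),
    ("t1", "c", "a", "User Service"),
    ("t1", "d", "c", "Database Query"),
    ("t2", "x", None, "Health Check"),
]


def render_trace(spans, trace_id):
    """Return indented span names for one trace, parents before children."""
    # Trace ID selects the spans; parent span ID groups them into siblings.
    children = defaultdict(list)
    for tid, span_id, parent_id, name in spans:
        if tid == trace_id:
            children[parent_id].append((span_id, name))

    lines = []

    def walk(parent_id, depth):
        for span_id, name in children[parent_id]:
            lines.append("  " * depth + name)
            walk(span_id, depth + 1)

    walk(None, 0)  # root spans are the ones without a parent
    return lines


print("\n".join(render_trace(flat_spans, "t1")))
# API Gateway
#   Auth Service
#   User Service
#     Database Query
```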

---

### **5. Practical Implications**

- **Troubleshooting**: A trace gives you a _map_; logs give you _narrative
  detail_; metrics give you _signal-level trends_. For example:
  - A metric alert (e.g., high error rate) → drill into traces to find failing
    spans → inspect embedded logs for root cause [14].
- **Context propagation**: Spans carry trace context (trace ID, span ID,
  sampling flags) across service boundaries, enabling distributed correlation
  [9].
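
One common wire format for this context is the W3C Trace Context `traceparent`
HTTP header, which packs a version, the trace ID, the caller's span ID, and
the flags into a single value. The sketch below parses such a header and
builds the one an outgoing call would carry. The function names and the new
span ID are hypothetical, and the parser is simplified to the four-field
version `00` layout.

```python
import re

# traceparent: version-traceid-parentid-flags, all lowercase hex.
TRACEPARENT = re.compile(r"([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})")


def parse_traceparent(header: str):
    """Split a traceparent header into its fields, or return None if malformed."""
    match = TRACEPARENT.fullmatch(header)
    if match is None:
        return None
    version, trace_id, parent_span_id, flags = match.groups()
    # Version ff and all-zero IDs are invalid.
    if version == "ff" or set(trace_id) == {"0"} or set(parent_span_id) == {"0"}:
        return None
    return {
        "trace_id": trace_id,
        "parent_span_id": parent_span_id,
        "sampled": bool(int(flags, 16) & 0x01),  # lowest bit is the sampled flag
    }


def child_traceparent(parent: dict, new_span_id: str) -> str:
    """Header for an outgoing call: same trace ID, the caller's own span ID."""
    flags = "01" if parent["sampled"] else "00"
    return f"00-{parent['trace_id']}-{new_span_id}-{flags}"


incoming = "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01"
ctx = parse_traceparent(incoming)
outgoing = child_traceparent(ctx, "b7ad6b7169203331")
print(outgoing)
# 00-4bf92f3577b34da6a3ce929d0e0e4736-b7ad6b7169203331-01
```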

---

### **Summary**

| Concept     | Role                                       | Relationship to Span                                                |
| ----------- | ------------------------------------------ | ------------------------------------------------------------------- |
| **Span**    | Smallest unit of work in a trace           | N/A                                                                 |
| **Trace**   | Collection of spans forming a request path | Spans are its building blocks [3]                                   |
| **Logs**    | Event records with timestamps              | Logs can be attached to spans as events or structured metadata [10] |
| **Metrics** | Aggregated numerical signals               | Span data (duration, status) is used to derive metrics [7]          |

In essence, **spans unify the three pillars of observability**: through the
trace and span IDs they carry, they are the _contextual glue_ that lets you
correlate logs (what happened), metrics (how often/long), and traces (how it
flows) into actionable insights [[4], [14]].

## References

1. [Traces | OpenTelemetry](https://opentelemetry.io/docs/concepts/signals/traces/)
   _(brave)_
2. [OpenTelemetry - Understanding Traces vs. Spans | SigNoz](https://signoz.io/comparisons/opentelemetry-trace-vs-span/)
   _(brave)_
3. [Logs vs Metrics vs Traces - Engineering Fundamentals Playbook](https://microsoft.github.io/code-with-engineering-playbook/observability/log-vs-metric-vs-trace/)
   _(google)_
4. [Observability primer | OpenTelemetry](https://opentelemetry.io/docs/concepts/observability-primer/)
   _(brave)_
5. [Unpacking Observability: Understanding Logs, Events, Spans, and Traces | Dzero Labs](https://medium.com/dzerolabs/observability-journey-understanding-logs-events-traces-and-spans-836524d63172)
   _(google)_
6. [OpenTelemetry demystified: a deep dive into distributed tracing | CNCF](https://www.cncf.io/blog/2023/05/03/opentelemetry-demystified-a-deep-dive-into-distributed-tracing/)
   _(google)_
7. [What Are Spans in Distributed Tracing? - LogicMonitor](https://www.logicmonitor.com/blog/what-are-spans-in-distributed-tracing)
   _(startpage)_
8. [Traces & Spans: Observability Basics You Should Know - Last9](https://last9.io/blog/traces-spans-observability-basics/)
   _(startpage)_
9. [software-skills/skills/system-design/references/key-concepts ...](https://github.com/itzcull/software-skills/blob/master/skills/system-design/references/key-concepts/distributed-tracing.md)
   _(aol)_
10. [Tracing the Line: Understanding Logs vs. Traces - Honeycomb](https://www.honeycomb.io/blog/understanding-logs-vs-traces)
    _(google)_
11. [A Deep Dive into OpenTelemetry. Part 1 - AWS in Plain English](https://aws.plainenglish.io/opentelemetry-deep-dive-part-1-6ebbd2362bd3)
    _(google)_
12. [Deep Dive into OpenTelemetry in Saleor](https://saleor.io/blog/otel-deep-dive)
    _(google)_
13. [Logging Observability - OpenClaw AI Agent Skill | LLMBase](https://llmbase.ai/openclaw/logging-observability/)
    _(aol)_
14. [Learning Observability from Scratch: Logs, Metrics, and Traces | by Milind Nair | Mar, 2026 | Medium](https://medium.com/@nairmilind3/learning-observability-from-scratch-c36d9771003b)
    _(brave)_
15. [A Deep Dive Into OpenTelemetry Metrics | Tiger Data](https://www.tigerdata.com/blog/a-deep-dive-into-open-telemetry-metrics)
    _(aol)_
16. [GitHub - tokio-rs/tracing: Application level tracing for Rust.](https://github.com/tokio-rs/tracing)
    _(aol)_