OpenTelemetry Architecture Deep Dive

Introduction

OpenTelemetry (OTel) provides a unified architecture for metrics, traces, and logs. Understanding its internal layers helps advanced teams design scalable observability pipelines and avoid hidden costs.

Core Components

OpenTelemetry is composed of four major building blocks:

  • API: Stable interfaces for instrumentation.
  • SDK: Implementation that handles batching, sampling, and processing.
  • Collector: Vendor-neutral pipeline for receiving, processing, and exporting telemetry.
  • Exporters: Output adapters to backends like Prometheus, Tempo, or Elastic.
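
The split between API and SDK is easier to see in code. The sketch below is a toy model in plain Java, not OpenTelemetry code: `Tracer`, `NOOP`, `RecordingTracer`, and `chargeCustomer` are hypothetical names. It illustrates one property of the design, namely that code instrumented against the API keeps working when no SDK is installed, because the API then does nothing.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of the API/SDK split. These types are hypothetical stand-ins,
// not OpenTelemetry classes.
class ApiSdkSketch {

    /** Plays the role of the API: the only type instrumented code depends on. */
    interface Tracer {
        void recordSpan(String name);
    }

    /** What callers get when no SDK is installed: calls succeed and do nothing. */
    static final Tracer NOOP = name -> { };

    /** Plays the role of an SDK-backed tracer: it actually keeps the data. */
    static final class RecordingTracer implements Tracer {
        final List<String> spans = new ArrayList<>();

        @Override
        public void recordSpan(String name) {
            spans.add(name);
        }
    }

    /** Business code instrumented against the API only. */
    static int chargeCustomer(Tracer tracer, int amountCents) {
        tracer.recordSpan("chargeCustomer");
        return amountCents;
    }

    public static void main(String[] args) {
        chargeCustomer(NOOP, 1999);  // no SDK: instrumentation is a no-op
        RecordingTracer sdk = new RecordingTracer();
        chargeCustomer(sdk, 1999);   // SDK present: the span is recorded
        System.out.println(sdk.spans);
    }
}
```

Because `chargeCustomer` only sees the interface, a library can ship with instrumentation and leave the choice of SDK, sampler, and exporter to the application that embeds it.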

Data Flow

The instrumentation API emits signals into the SDK, which attaches resource attributes, applies sampling, and runs processors that filter attributes and batch data for export. From there, data is either exported directly to a backend or routed through the Collector. The Collector is usually preferred in production because it centralizes authentication, load shedding, and buffering in one place instead of in every service.
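
To make the flow concrete, here is a minimal sketch in plain Java of the enrich, batch, and export stages. It is a simplified model under stated assumptions, not the SDK's implementation: spans are reduced to attribute maps, and `PipelineSketch`, `BatchingProcessor`, `enrich`, and the `user.email` attribute are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

// Toy model of the SDK data flow. The class and method names are hypothetical;
// spans are reduced to flat attribute maps.
class PipelineSketch {

    /** Buffers finished spans and hands them to the exporter in batches. */
    static final class BatchingProcessor {
        private final int maxBatchSize;
        private final Consumer<List<Map<String, String>>> exporter;
        private final List<Map<String, String>> buffer = new ArrayList<>();

        BatchingProcessor(int maxBatchSize, Consumer<List<Map<String, String>>> exporter) {
            this.maxBatchSize = maxBatchSize;
            this.exporter = exporter;
        }

        void onEnd(Map<String, String> span) {
            buffer.add(span);
            if (buffer.size() >= maxBatchSize) {
                flush();
            }
        }

        void flush() {
            if (!buffer.isEmpty()) {
                exporter.accept(new ArrayList<>(buffer));
                buffer.clear();
            }
        }
    }

    /** Merges resource attributes into a span and removes one filtered key. */
    static Map<String, String> enrich(Map<String, String> span,
                                      Map<String, String> resource,
                                      String filteredKey) {
        Map<String, String> merged = new HashMap<>(resource);
        merged.putAll(span);
        merged.remove(filteredKey);
        return merged;
    }

    /** Runs three spans through enrich -> batch -> export and returns the batches. */
    static List<List<Map<String, String>>> runDemo() {
        List<List<Map<String, String>>> exported = new ArrayList<>();
        BatchingProcessor processor = new BatchingProcessor(2, exported::add);
        Map<String, String> resource = Map.of("service.name", "billing-api");

        for (String name : List.of("GET /invoices", "POST /invoices", "GET /health")) {
            Map<String, String> span = Map.of("span.name", name, "user.email", "a@example.com");
            processor.onEnd(enrich(span, resource, "user.email"));
        }
        processor.flush(); // drain the partial batch, as an SDK does on shutdown
        return exported;
    }

    public static void main(String[] args) {
        System.out.println(runDemo().size() + " batches exported");
    }
}
```

In this model the exporter is just a callback, so the same pipeline can send batches straight to a backend or to a Collector endpoint without the earlier stages changing.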

Signal Correlation

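Metrics and traces can be linked using exemplars, which attach the trace and span identifiers of a sampled measurement to a metric data point, while logs can carry the trace and span identifiers of the request that produced them. This correlation is critical when you need to jump from a high-level SLO breach to a specific trace that contributed to it.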
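
The jump from a metric to a trace to its logs comes down to a shared trace ID. The following sketch models that lookup in plain Java. The `Exemplar` and `LogLine` types, the latency values, and the log messages are hypothetical and only illustrate the idea; real backends perform this join for you.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of signal correlation. The types are hypothetical; only the idea
// of sharing a trace ID across signals comes from OpenTelemetry.
class CorrelationSketch {

    /** One measurement attached to a metric, plus the trace it was taken in. */
    static final class Exemplar {
        final double latencyMillis;
        final String traceId;

        Exemplar(double latencyMillis, String traceId) {
            this.latencyMillis = latencyMillis;
            this.traceId = traceId;
        }
    }

    /** A log line stamped with the active trace and span IDs. */
    static final class LogLine {
        final String traceId;
        final String spanId;
        final String body;

        LogLine(String traceId, String spanId, String body) {
            this.traceId = traceId;
            this.spanId = spanId;
            this.body = body;
        }
    }

    /** Returns the exemplar with the highest latency. */
    static Exemplar slowest(List<Exemplar> exemplars) {
        Exemplar candidate = exemplars.get(0);
        for (Exemplar e : exemplars) {
            if (e.latencyMillis > candidate.latencyMillis) {
                candidate = e;
            }
        }
        return candidate;
    }

    /** Returns the bodies of all log lines that belong to the given trace. */
    static List<String> logsFor(String traceId, List<LogLine> logs) {
        List<String> bodies = new ArrayList<>();
        for (LogLine line : logs) {
            if (line.traceId.equals(traceId)) {
                bodies.add(line.body);
            }
        }
        return bodies;
    }

    /** Starts at the slowest exemplar and follows its trace ID to the logs. */
    static List<String> runDemo() {
        String fastTrace = "0af7651916cd43dd8448eb211c80319c";
        String slowTrace = "4bf92f3577b34da6a3ce929d0e0e4736";
        List<Exemplar> exemplars = List.of(
            new Exemplar(42.0, fastTrace),
            new Exemplar(2310.0, slowTrace));
        List<LogLine> logs = List.of(
            new LogLine(fastTrace, "b7ad6b7169203331", "invoice rendered"),
            new LogLine(slowTrace, "00f067aa0ba902b7", "payment gateway timeout, retrying"));
        return logsFor(slowest(exemplars).traceId, logs);
    }

    public static void main(String[] args) {
        System.out.println(runDemo());
    }
}
```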

Java Example: Custom Tracer Provider

The following snippet shows a manual setup for a Spring Boot service where you control sampling and resource attributes.

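// Assumption: spans are exported over OTLP/gRPC with the exporter's default settings.
SpanExporter otlpSpanExporter = OtlpGrpcSpanExporter.builder().build();

SdkTracerProvider tracerProvider = SdkTracerProvider.builder()
    // Merge the SDK's default resource with an explicit service name.
    .setResource(Resource.getDefault().merge(
        Resource.create(Attributes.of(ResourceAttributes.SERVICE_NAME, "billing-api"))
    ))
    // Keep roughly 20% of traces; wrap in Sampler.parentBased(...) to honor upstream decisions.
    .setSampler(Sampler.traceIdRatioBased(0.2))
    // Export finished spans in batches instead of one call per span.
    .addSpanProcessor(BatchSpanProcessor.builder(otlpSpanExporter).build())
    .build();

// Wire the provider into the SDK instance that application code obtains tracers from.
OpenTelemetry openTelemetry = OpenTelemetrySdk.builder()
    .setTracerProvider(tracerProvider)
    .build();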
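
The ratio sampler in the snippet decides from the trace ID rather than from a fresh random number, so every service that sees the same trace reaches the same decision. The sketch below shows that idea in plain Java. It is an illustration, not the SDK's source: `RatioSamplerSketch` is a hypothetical class, and the real implementation may differ in detail.

```java
// Sketch of ratio-based head sampling keyed on the trace ID. This is an
// illustration, not the SDK's source; the class name is hypothetical.
class RatioSamplerSketch {
    private final double ratio;
    private final long idUpperBound;

    RatioSamplerSketch(double ratio) {
        if (!(ratio >= 0.0 && ratio <= 1.0)) {
            throw new IllegalArgumentException("ratio must be in [0.0, 1.0]");
        }
        this.ratio = ratio;
        // Share of the non-negative 63-bit range that counts as "sampled".
        this.idUpperBound = (long) (ratio * Long.MAX_VALUE);
    }

    /** Decides from the low 64 bits of a 32-character hex trace ID. */
    boolean shouldSample(String traceIdHex) {
        if (ratio >= 1.0) {
            return true;
        }
        long low = Long.parseUnsignedLong(traceIdHex.substring(16, 32), 16);
        long nonNegative = low & Long.MAX_VALUE; // clear the sign bit
        return nonNegative < idUpperBound;
    }

    public static void main(String[] args) {
        RatioSamplerSketch sampler = new RatioSamplerSketch(0.2);
        String traceId = "4bf92f3577b34da6a3ce929d0e0e4736";
        // The same trace ID always yields the same decision.
        System.out.println(sampler.shouldSample(traceId) == sampler.shouldSample(traceId));
    }
}
```

A decision made this way is a head-based decision: it is taken when the trace starts, before anyone knows whether the request will be slow or fail.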

Collector Pipelines for Production

Use the collector to apply tail-based sampling, attribute scrubbing, and rate limiting. This keeps SDKs lightweight and moves heavy processing to a central, scalable component.
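
Tail-based sampling differs from SDK-side sampling in that the decision is made after the spans of a trace have arrived. The sketch below models that decision in plain Java, assuming a "keep if any span failed or was slow" policy. `TailSamplerSketch`, its threshold, and the policy itself are hypothetical and do not mirror the Collector's configuration or code.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model of tail-based sampling. Names and thresholds are hypothetical and
// do not mirror the Collector's configuration or code.
class TailSamplerSketch {

    /** A finished span reduced to the fields the policy looks at. */
    static final class Span {
        final String traceId;
        final long durationMillis;
        final boolean error;

        Span(String traceId, long durationMillis, boolean error) {
            this.traceId = traceId;
            this.durationMillis = durationMillis;
            this.error = error;
        }
    }

    private final long slowThresholdMillis;
    private final Map<String, List<Span>> pending = new HashMap<>();

    TailSamplerSketch(long slowThresholdMillis) {
        this.slowThresholdMillis = slowThresholdMillis;
    }

    /** Buffers a span until its trace is considered complete. */
    void onSpan(Span span) {
        pending.computeIfAbsent(span.traceId, id -> new ArrayList<>()).add(span);
    }

    /** Keeps the whole trace if any span failed or was slow; otherwise drops it. */
    List<Span> decide(String traceId) {
        List<Span> spans = pending.remove(traceId);
        if (spans == null) {
            return new ArrayList<>();
        }
        for (Span span : spans) {
            if (span.error || span.durationMillis >= slowThresholdMillis) {
                return spans;
            }
        }
        return new ArrayList<>();
    }

    public static void main(String[] args) {
        TailSamplerSketch sampler = new TailSamplerSketch(1000);
        sampler.onSpan(new Span("trace-a", 12, false));
        sampler.onSpan(new Span("trace-b", 15, false));
        sampler.onSpan(new Span("trace-b", 40, true));
        System.out.println("trace-a kept spans: " + sampler.decide("trace-a").size());
        System.out.println("trace-b kept spans: " + sampler.decide("trace-b").size());
    }
}
```

The buffering in `pending` is the cost of this approach: every span must be held in memory until the decision, which is one reason the work fits a central component better than each service's SDK.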

Conclusion

OpenTelemetry is more than a library. It is an architecture for telemetry pipelines. Mastering its components allows you to tune reliability, cost, and data fidelity with precision.

This post is licensed under CC BY 4.0 by the author.