OpenTelemetry Metrics Roadmap
Following the release of the OpenTelemetry Specification v1.0, we are now putting more energy into the metrics specification. Here is an update on the progress so far and on the work lined up for completion over the next few months.
Project Scope
Given there are many well-established metrics solutions that exist today, it is important to understand the goals of OpenTelemetry’s metrics effort:
Being able to connect metrics to other signals. For example, metrics and traces can be correlated via exemplars, and metrics dimensions can be enriched via Baggage and Context. Additionally, Resource can be applied to logs/metrics/traces in a consistent way.
Providing a path for OpenCensus customers to migrate to OpenTelemetry. Converging OpenCensus and OpenTracing was the original goal of OpenTelemetry. We will focus on providing the semantics and capabilities, instead of doing a one-to-one mapping of the APIs.
Working with existing metrics instrumentation standards. The minimum goal is to provide full support for Prometheus and StatsD — users should be able to use OpenTelemetry clients and Collector to collect and export metrics, with the ability to achieve the same functionality as their native clients.
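To make the first goal more concrete, here is a minimal sketch of how an exemplar can tie a metric data point back to the trace that produced it. This is not the OpenTelemetry API: the names `Exemplar`, `LatencyHistogram`, and `record`, as well as the bucket bounds and the sample trace and span IDs, are assumptions made up for illustration only.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Exemplar:
    """A sample measurement that remembers which trace produced it."""
    value: float
    trace_id: str
    span_id: str


@dataclass
class LatencyHistogram:
    """Hypothetical histogram: bucket counts plus the latest exemplar per bucket."""
    bounds: List[float]  # upper bucket bounds, in ascending order
    counts: List[int] = field(default_factory=list)
    exemplars: Dict[int, Exemplar] = field(default_factory=dict)

    def __post_init__(self) -> None:
        # One extra bucket catches values above the last bound.
        self.counts = [0] * (len(self.bounds) + 1)

    def record(self, value: float,
               trace_id: Optional[str] = None,
               span_id: Optional[str] = None) -> int:
        """Count the value and, if a trace context is given, keep it as an exemplar."""
        index = next(
            (i for i, bound in enumerate(self.bounds) if value <= bound),
            len(self.bounds),
        )
        self.counts[index] += 1
        if trace_id is not None and span_id is not None:
            self.exemplars[index] = Exemplar(value, trace_id, span_id)
        return index


# Illustrative usage with made-up IDs.
hist = LatencyHistogram(bounds=[0.1, 0.5, 1.0])
hist.record(0.05)  # no active trace, so no exemplar is stored
hist.record(0.72,
            trace_id="4bf92f3577b34da6a3ce929d0e0e4736",
            span_id="00f067aa0ba902b7")

# A slow bucket can now be followed back to a concrete trace.
slow = hist.exemplars[2]
print(slow.trace_id, slow.value)
```

The aggregated counts stay cheap to store and query, while the exemplar attached to a bucket gives a direct pointer from "latency is high" to one trace that shows why.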
To accelerate the metrics work, we have three workstreams running in parallel:
1: Metrics API/SDK. This group will initially focus on the metrics API design, working closely with the Prometheus team and the Metrics Data Model and Protocol team (see below). The group is working toward a stable API set (ready for API feature freeze) while evaluating its effectiveness with various experimental SDK designs. When a stable API set is available, maintainers can finish their experimental SDKs for preview.
2: Metrics Data Model and Protocol. This group is working to specify protocol details that ensure correct integration between push- and pull-based metrics systems, including support for high availability. It has already validated support for OpenCensus Views. The group is also working with the Prometheus team to specify operational details for handling OpenMetrics targets in the OpenTelemetry Collector.
3: Prometheus Metrics Support. This group is working on the design and development of end-to-end support for Prometheus metrics, using a phased approach. Phase 1 deliverables include discovery, auto-sharding for scraping, support for the “up” metric, and labeling of incoming samples in the OpenTelemetry Collector, with enhancements to the Prometheus receiver, the remote-write exporter, and the specification. Future phases will ensure full support for Prometheus metrics in the API/SDKs. You can track progress for this workstream here.
Timeline
Now: we are taking an iterative approach with several small milestones. Currently a small set of language clients (.NET, Java, Python) are working closely with us on the prototype. If you are interested in participating or providing requirements/suggestions, please reach out and say hi on the otel-metrics channel on CNCF Slack (click here to join CNCF Workspace for the first time).
3/31/2021: Get the metrics data model and protocol (OTLP) to “Stable”. This means people can use OTLP as an exchange format for metrics.
5/31/2021: Release an “Experimental” metrics API/SDK specification that language client owners can use to implement a metrics preview release. Starting from 6/1/2021, we will recommend it to client maintainers for implementation. We may introduce additional features later, but we will raise the bar for changes to avoid changing or expanding the scope at this stage.
5/31/2021: Full support for Prometheus in the OpenTelemetry Collector will be complete for all Phase 1 items, ensuring that both the specification and the implementation are stable. The Prometheus receiver and remote-write exporter will be fully functional at this stage.
9/30/2021: Metrics API/SDK specification reaches “Feature-freeze”. This means starting from 10/1/2021, we will focus on bug fixes or editorial changes. Depending on the actual progress, the API specification might reach Feature-freeze earlier than the SDK.
11/30/2021: Metrics API/SDK specification reaches “Stable”. Together with the stable version of the specification, we should expect release candidates from multiple language clients, similar to what we had for tracing. Depending on the actual progress, the API specification might reach Stable earlier than the SDK.
Note: each milestone depends on the ones that precede it, and there are interdependencies across the workstreams, so slippage in one milestone may cause all subsequent milestones to slip. Each workstream will explicitly call out risks as early as possible to give the OpenTelemetry community an opportunity to mitigate schedule impacts.
The latest timelines for these workstreams can be found in OpenTelemetry projects on GitHub.
Path to Acceleration
The current timelines are based on the availability of our current project maintainers and contributors. They can be accelerated with additional engineering resources and project participation. If you are a project or workstream stakeholder and are interested in adding engineering resources to help accelerate timelines, please reach out to me (@reyang) or Alolita (@alolita).