Telemetry Subscribers

The telemetry-subscribers crate is used for logging and tracing the node. It provides a tracing subscriber with flexible options to monitor the nodes performance and behavior. It is currently used in different main functions of the node, including tests and is typically initialized at the beginning of a main function like:

// initialize tracing
let _guard = telemetry_subscribers::TelemetryConfig::new()
    .with_env()
    .init();

Configuration

The TelemetryConfig allows setting various configuration options, such as tracing with the OpenTelemetry protocol (OTLP), outputting JSON logs, writing logs to files, setting log levels, defining span levels, setting panic hooks, or specifying Prometheus registries.

/// Configuration for different logging/tracing options
/// ===
/// - json_log_output: Output JSON logs to stdout only.
/// - log_file: If defined, write output to a file starting with this name, ex
///   app.log
/// - log_level: error/warn/info/debug/trace, defaults to info
#[derive(Default, Clone, Debug)]
pub struct TelemetryConfig {
    /// Enable OpenTelemetry tracing
    pub enable_otlp_tracing: bool,
    /// Enables Tokio Console debugging on port 6669
    pub tokio_console: bool,
    /// Output JSON logs.
    pub json_log_output: bool,
    /// If defined, write output to a file starting with this name, ex app.log
    pub log_file: Option<String>,
    /// Log level to set, defaults to info
    pub log_string: Option<String>,
    /// Span level - what level of spans should be created.  Note this is not
    /// same as logging level If set to None, then defaults to INFO
    pub span_level: Option<Level>,
    /// Set a panic hook
    pub panic_hook: bool,
    /// Crash on panic
    pub crash_on_panic: bool,
    /// Optional Prometheus registry - if present, all enabled span latencies
    /// are measured
    pub prom_registry: Option<prometheus::Registry>,
    pub sample_rate: f64,
    /// Add directive to include trace logs with provided target
    pub trace_target: Option<Vec<String>>,
}

It implements a with_env function in order to set the config fields based on environment variables. After a TelemetryConfig instance is created and values are set, the init function will enable it.

For that, the init first sets up a EnvFilter to manage which log messages are shown, based on the log level. Per default, the log level is set to info, but it can be adjusted by setting the log_string variable. Then, another filter is created for span levels.

A span is a single unit of work and represents a specific operation, like a database query, and has a start and end time. Spans can be linked together to show the flow of operations in a system. Each span can include:

A name describing the operation
Timing information (start and end)
Attributes (key-value pairs) for extra context or relationships to other spans (parent-child).

Spans help understand the flow of requests through a system better. The span filter, created by the init function ensures that only relevant spans are processed, which helps manage performance and logging noise.

After the span filter is set up, a collection of layers is initialized. These layers send data to tokio-console for debugging or integrate with Prometheus for measuring span latencies. Each layer will be enabled or disabled based on the configuration. If OTLP tracing is enabled, an OpenTelemetryLayer will be set up for tracing to either a file or an OTLP endpoint based on environment settings.

After setting up all layers, a tracing subscriber is created with the configured layers and set as the global default. Ultimately, the function creates and returns TelemetryGuards and TracingHandle structs, which manage the tracing subscriber. They should be kept active in the main function to ensure logging and tracing throughout the application's lifecycle.

Prometheus Layer

The telemetry-subscriber allows to measure Tokio-tracing span latencies that will be recorded into Prometheus histograms directly. For that, a prometheus::Registry must be passed to the TelemetryConfig using the with_prom_registry function. The name of the Prometheus histogram is tracing_span_latencies(_sum/count/bucket). For example, in the iota-node it is set up as follows:

let runtimes = IotaRuntimes::new( & config);
let metrics_rt = runtimes.metrics.enter();
let registry_service = iota_metrics::start_prometheus_server(config.metrics_address);
let prometheus_registry = registry_service.default_registry();

// Initialize logging
let (_guard, filter_handle) = telemetry_subscribers::TelemetryConfig::new()
    .with_env()
    .with_prom_registry(&prometheus_registry)
    .init();

drop(metrics_rt);

In order to set up the subscriber, it enters the metrics runtime first and creates a RegistryService from the iota-metrics crate. It initializes the default Prometheus registry in the RegistryService and returns it. The metrics_address is the address where the Prometheus metrics will be exposed and can be specified in the node's configuration file. Per default, it is set to 0.0.0.0:9400. After the default registry is created, it is passed to the TelemetryConfig with the with_prom_registry function.

What is the primary purpose of the `telemetry-subscribers` crate in an IOTA node?

Feedback Form

Configuration​

Prometheus Layer​

Question 1/3

Configuration

Prometheus Layer