Telemetry Subscribers
The telemetry-subscribers
crate is used for logging and tracing the node.
It provides a tracing
subscriber with flexible options to monitor the nodes performance and behavior.
It is currently used in different main functions of the node, including tests and is typically initialized at the
beginning of a main function like:
// initialize tracing
let _guard = telemetry_subscribers::TelemetryConfig::new()
.with_env()
.init();
Configuration
The TelemetryConfig
allows setting various configuration options, such as tracing with the OpenTelemetry protocol (OTLP), outputting JSON logs,
writing logs to files, setting log levels, defining span levels, setting panic hooks, or specifying Prometheus
registries.
/// Configuration for different logging/tracing options
/// ===
/// - json_log_output: Output JSON logs to stdout only.
/// - log_file: If defined, write output to a file starting with this name, ex
/// app.log
/// - log_level: error/warn/info/debug/trace, defaults to info
#[derive(Default, Clone, Debug)]
pub struct TelemetryConfig {
/// Enable OpenTelemetry tracing
pub enable_otlp_tracing: bool,
/// Enables Tokio Console debugging on port 6669
pub tokio_console: bool,
/// Output JSON logs.
pub json_log_output: bool,
/// If defined, write output to a file starting with this name, ex app.log
pub log_file: Option<String>,
/// Log level to set, defaults to info
pub log_string: Option<String>,
/// Span level - what level of spans should be created. Note this is not
/// same as logging level If set to None, then defaults to INFO
pub span_level: Option<Level>,
/// Set a panic hook
pub panic_hook: bool,
/// Crash on panic
pub crash_on_panic: bool,
/// Optional Prometheus registry - if present, all enabled span latencies
/// are measured
pub prom_registry: Option<prometheus::Registry>,
pub sample_rate: f64,
/// Add directive to include trace logs with provided target
pub trace_target: Option<Vec<String>>,
}
It implements a with_env
function in order to set the config fields based on environment variables.
After a TelemetryConfig
instance is created and values are set, the init
function will enable it.
For that, the init
first sets up a EnvFilter
to manage which log messages are shown, based on the log level.
Per default, the log level is set to info
, but it can be adjusted by setting the log_string
variable.
Then, another filter is created for span levels.
A span is a single unit of work and represents a specific operation, like a database query, and has a start and end time. Spans can be linked together to show the flow of operations in a system. Each span can include:
- A name describing the operation
- Timing information (start and end)
- Attributes (key-value pairs) for extra context or relationships to other spans (parent-child).
Spans help understand the flow of requests through a system better.
The span filter, created by the init
function ensures that only relevant spans are processed, which helps manage
performance and logging noise.
After the span filter is set up, a collection of layers is initialized. These layers send data to tokio-console for
debugging or integrate with Prometheus for measuring span latencies. Each layer will be enabled or disabled based
on the configuration.
If OTLP tracing is enabled, an OpenTelemetryLayer
will be set up for tracing to either a file or an OTLP endpoint
based on environment settings.
After setting up all layers, a tracing subscriber is created with the configured layers and set as the global default.
Ultimately, the function creates and returns TelemetryGuards
and TracingHandle
structs, which manage the tracing
subscriber.
They should be kept active in the main function to ensure logging and tracing throughout the application's lifecycle.
Prometheus Layer
The telemetry-subscriber
allows to measure Tokio-tracing span latencies that will be recorded into Prometheus histograms directly.
For that, a prometheus::Registry
must be passed to the TelemetryConfig
using the with_prom_registry
function.
The name of the Prometheus histogram is tracing_span_latencies(_sum/count/bucket)
. For example, in the iota-node
it is set up as follows:
let runtimes = IotaRuntimes::new( & config);
let metrics_rt = runtimes.metrics.enter();
let registry_service = iota_metrics::start_prometheus_server(config.metrics_address);
let prometheus_registry = registry_service.default_registry();
// Initialize logging
let (_guard, filter_handle) = telemetry_subscribers::TelemetryConfig::new()
.with_env()
.with_prom_registry(&prometheus_registry)
.init();
drop(metrics_rt);
In order to set up the subscriber, it enters the metrics runtime first and creates a RegistryService
from
the iota-metrics
crate.
It initializes the default Prometheus registry in the RegistryService
and returns it.
The metrics_address
is the address where the Prometheus metrics will be exposed and can be specified in the node's
configuration file.
Per default, it is set to 0.0.0.0:9400
.
After the default registry is created, it is passed to the TelemetryConfig
with the with_prom_registry
function.