In the last few posts, we have seen how to use the OTel Collector to collect system and K8s metrics, along with metadata. That baseline configuration provides simple health metrics across diverse environments. From there, the range of metrics practitioners can gather is vast: there are various receivers, the Prometheus ecosystem, StatsD, custom OTLP sources, and more. One often-overlooked category is metrics for the processes running on a host.
Configuration
receivers:
  hostmetrics:
    collection_interval: 10s
    scrapers:
      process:
        metrics:
          process.cpu.utilization:
            enabled: true
          process.disk.operations:
            enabled: true
          process.memory.utilization:
            enabled: true
          process.threads:
            enabled: true
The process scraper can easily be added as part of the hostmetrics receiver. The example configuration above is a sufficient starting point, enabling gauge-like metrics such as CPU and memory utilization.
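On its own, the receiver block does nothing until it is referenced in a metrics pipeline. A minimal sketch of the wiring might look like the following, where the otlp exporter and its endpoint are placeholders for whatever backend is in use:

exporters:
  otlp:
    # Placeholder endpoint; point this at your own OTLP-compatible backend.
    endpoint: otel-backend.example.com:4317

service:
  pipelines:
    metrics:
      receivers: [hostmetrics]
      exporters: [otlp]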
Resource Attributes
Valuable resource attributes are populated alongside the metrics. Of these, process.pid, process.owner, process.command, and process.command_line are particularly interesting. Process identifiers and owners can help highlight resource contention within a host, and changes in command line arguments can explain why a process's metrics suddenly look different.
Sensitive Data Scrubbing
One thing to note is the potential to leak sensitive command line arguments as part of data collection. It is imperative to add the transform processor to scrub such values before export.
processors:
  transform:
    metric_statements:
      - context: metric
        statements:
          - replace_pattern(resource.attributes["process.command_line"], "password\\=[^\\s]*(\\s?)", "password=***")
In this example, a password argument within the process.command_line resource attribute is redacted via OTTL (the OpenTelemetry Transformation Language).
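For the redaction to take effect, the transform processor has to sit in the metrics pipeline ahead of the exporter. Assuming the same illustrative otlp exporter as in the earlier sketch, the wiring might look like this:

service:
  pipelines:
    metrics:
      receivers: [hostmetrics]
      # Scrub sensitive command line arguments before the data leaves the collector.
      processors: [transform]
      exporters: [otlp]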
Conclusion
Process monitoring is an excellent way to isolate long-running processes that are exhausting resources or ephemeral processes that cause sudden anomalies. OTel provides a clean and concise way to do just this.