33 breaking releases
0.34.0 | Aug 5, 2024 |
---|---|
0.33.0 | Apr 9, 2024 |
0.32.0 | Feb 20, 2024 |
0.30.0 | Nov 29, 2023 |
0.13.0 | Nov 26, 2020 |
#420 in Database interfaces
130KB
2.5K
SLoC
Elasticsearch exporter
Prometheus Elasticsearch exporter capable of working with large clusters. Caution you may overload Prometheus server by enabling all metrics, exporter is capable to export over near 1 million metrics. To avoid overloading Prometheus server run multiple Elasticsearch exporters that target just few specific metrics.
$ curl -s http://127.0.0.1:9222/metrics | wc
940272 1887011 153668390
Try it out with docker
$ docker run --network=host -it vinted/elasticsearch_exporter --elasticsearch_url=http://IP:PORT
Features
- Metric collection is decoupled from serving
/metrics
page - Skips zero/empty metrics (controlled with flag
exporter_allow_zero_metrics
) - Elasticsearch "millis" converted to seconds
- Elasticsearch "kilobytes" converted to bytes
- All time based metrics are converted as f64 seconds, keywords
millis
replaced withseconds
- Added
_bytes
and_seconds
postfix - Preserves metrics tree namespace up to last leaf
- elasticsearch_cat_indices_pri_warmer_total_time_seconds_bucket
- elasticsearch_cat_health_unassign
- elasticsearch_nodes_info_jvm_mem_heap_max_in_bytes
- Custom namespace labels
vin_cluster_version
for convenient comparison of metrics between cluster versions - Automatic cluster metadata updates every 5 minutes
- /_nodes/* API additional labels injection by mapping node ID to fetched cluster metadata
- name (map from node ID to name) namespaced ->
name
- version (Elasticsearch node version) namespaced to
vin_cluster_version
- IP namespaced ->
ip
- name (map from node ID to name) namespaced ->
- Automatic metrics deletion based on lifetime settings, by default metric by value will be deleted after 600s since last occurrence.
- Metric names are normalized to snake case, colon is replaced with underscore, brackets are replaced with colon (:)
- "transport_actions_cluster:monitor/nodes/info[n]_requests_count" -> "transport_actions_cluster_monitor_nodes_info:n:_requests_count"
- "transport_actions_internal:cluster/coordination/join/ping_requests_count" -> "transport_actions_internal_cluster_coordination_join_ping_requests_count"
Options
- Configurable labels "skip" and/or "include" (flags:
exporter_include_labels
,exporter_skip_labels
) - Configurable skip metrics (controlled with (flag
exporter_skip_metrics
) - Configurable global timeout (flag
elasticsearch_global_timeout
) - Configurable global polling interval (flag
exporter_poll_default_interval
) - Configurable per metric polling interval (flag
exporter_poll_intervals
) - Configurable metrics collection (flag
exporter_metrics_enabled
) - Configurable metrics namespace (flag
exporter_metrics_namespace
): metrics will be prefixed with custom namespace instead ofelasticsearch
- Configurable metadata collection (flag
exporter_metadata_refresh_interval
)
TLS validation
The certificate path is defined via flag --elasticsearch_certificate_path=CERTIFICATE_PATH
.
By default TLS validation is enabled, you don't have to configure anything. Default validation of the certificate, which validates that the certificate provided by the server is signed by a trusted Certificate Authority (CA) and also verifies that the server’s hostname (or IP address) matches the names identified by the CommonName (CN) or Subject Alternative Name (SAN) within the certificate.
No validation
No validation is performed on the certificate provided by the server.
Provide a flag --elasticsearch_certificate_validation=none
Full validation
Full validation of the certificate, which validates that the certificate provided by the server is signed by a trusted Certificate Authority (CA) and also verifies that the server’s hostname (or IP address) matches the names identified by the CommonName (CN) or Subject Alternative Name (SAN) within the certificate.
Provide a flag --elasticsearch_certificate_validation=full
Partial validation
Validates that the certificate provided by the server is signed by a trusted Certificate Authority (CA), but does not perform hostname verification.
Provide a flag --elasticsearch_certificate_validation=partial
Usage cheat sheet
Scraping /_nodes/stats
subsystem thread_pool path metric
$ docker run --network=host -it vinted/elasticsearch_exporter --elasticsearch_url=http://IP:PORT --exporter_metrics_enabled="nodes_stats=true" --elasticsearch_path_parameters="nodes_stats=thread_pool"
Scraping /_nodes/stats
subsystem thread_pool + fs paths metric
$ docker run --network=host -it vinted/elasticsearch_exporter --elasticsearch_url=http://IP:PORT --exporter_metrics_enabled="nodes_stats=true" --elasticsearch_path_parameters="nodes_stats=thread_pool,fs"
Scraping /stats
for total.indexing
and total.search
metrics only
$ docker run --network=host -it vinted/elasticsearch_exporter --elasticsearch_url=http://IP:PORT --exporter_metrics_enabled="stats=true" --elasticsearch_query_filter_path="stats=indices.*.total.indexing,indices.*.total.search"
Scraping /_cat/shards
for search.fetch*
metrics only. In this case elasticsearch_query_filter_path
must always include index,shard
, and dotted format is not supported. Example:
$ docker run --network=host -it vinted/elasticsearch_exporter --elasticsearch_url=http://IP:PORT --exporter_metrics_enabled="cat_shards=true" --elasticsearch_query_filter_path="cat_shards=index,shard,search*fetch*"
$ curl -s http://127.0.0.1:9222
Proper Elasticsearch exporter
Available /_cat subsystems:
- cat_allocation
- cat_shards
- cat_indices
- cat_segments
- cat_nodes
- cat_recovery
- cat_health
- cat_pending_tasks
- cat_aliases
- cat_thread_pool
- cat_plugins
- cat_fielddata
- cat_nodeattrs
- cat_repositories
- cat_templates
- cat_transforms
Available /_cluster subsystems:
- cluster_health
Available /_nodes subsystems:
- nodes_usage
- nodes_stats
- nodes_info
Available /_stats subsystems:
- stats
Exporter settings:
elasticsearch_url: http://127.0.0.1:9200
elasticsearch_global_timeout: 30s
elasticsearch_query_fields:
elasticsearch_subsystem_timeouts:
- nodes_stats: 15s
elasticsearch_path_parameters:
- nodes_info: http,jvm,thread_pool
- nodes_stats: breaker,indices,jvm,os,process,transport,thread_pool
exporter_skip_labels:
- cat_allocation: health,status
- cat_fielddata: id
- cat_indices: health,status
- cat_nodeattrs: id
- cat_nodes: health,status,pid
- cat_plugins: id,description
- cat_segments: health,status,checkpoint,prirep
- cat_shards: health,status,checkpoint,prirep
- cat_templates: composed_of
- cat_thread_pool: node_id,ephemeral_node_id,pid
- cat_transforms: health,status
- cluster_stats: segment,patterns
exporter_include_labels:
- cat_aliases: index,alias
- cat_allocation: node
- cat_fielddata: node,field
- cat_health: shards
- cat_indices: index
- cat_nodeattrs: node,attr
- cat_nodes: ip,name,node_role
- cat_pending_tasks: index
- cat_plugins: name
- cat_recovery: index,shard,stage,type
- cat_repositories: index
- cat_segments: index,shard
- cat_shards: index,node,shard
- cat_templates: name,index_patterns
- cat_thread_pool: node_name,name,type
- cat_transforms: index
- cluster_health: status
- nodes_info: name
- nodes_stats: name
- nodes_usage: name
- stats: index
exporter_skip_metrics:
- cat_aliases: filter,routing_index,routing_search,is_write_index
- cat_nodeattrs: pid
- cat_recovery: start_time,start_time_millis,stop_time,stop_time_millis
- cat_templates: order
- nodes_usage: _nodes_total,_nodes_successful,since
exporter_poll_default_interval: 15s
exporter_poll_intervals:
- cluster_health: 5s
exporter_skip_zero_metrics: true
exporter_metrics_enabled:
- cat_health: true
- cat_indices: true
- nodes_info: true
- nodes_stats: true
exporter_metrics_namespace: elasticsearch
exporter_metadata_refresh_interval: 180s
exporter_metrics_lifetime_default_interval: 15s
exporter_metrics_lifetime_interval:
- cat_indices: 180s
- cat_nodes: 60s
- cat_recovery: 60s
Self exporter metrics
# HELP elasticsearch_subsystem_request_duration_seconds The Elasticsearch subsystem request latencies in seconds.
# TYPE elasticsearch_subsystem_request_duration_seconds histogram
elasticsearch_subsystem_request_duration_seconds_bucket{cluster="devnull",subsystem="/_nodes/os",le="0.005"} 0
elasticsearch_subsystem_request_duration_seconds_sum{cluster="devnull",subsystem="/nodes_stats"} 0.130069193
elasticsearch_subsystem_request_duration_seconds_count{cluster="devnull",subsystem="/nodes_stats"} 1
# HELP http_request_duration_seconds The HTTP request latencies in seconds.
# TYPE http_request_duration_seconds histogram
http_request_duration_seconds_bucket{handler="/metrics",le="0.005"} 1
http_request_duration_seconds_sum{handler="/metrics"} 0.004372555
http_request_duration_seconds_count{handler="/metrics"} 1
# HELP process_cpu_seconds_total Total user and system CPU time spent in seconds.
# TYPE process_cpu_seconds_total counter
process_cpu_seconds_total 0.24
# HELP process_max_fds Maximum number of open file descriptors.
# TYPE process_max_fds gauge
process_max_fds 1024
# HELP process_open_fds Number of open file descriptors.
# TYPE process_open_fds gauge
process_open_fds 16
# HELP process_resident_memory_bytes Resident memory size in bytes.
# TYPE process_resident_memory_bytes gauge
process_resident_memory_bytes 25006080
# HELP process_start_time_seconds Start time of the process since unix epoch in seconds.
# TYPE process_start_time_seconds gauge
process_start_time_seconds 1605894185.46
# HELP process_virtual_memory_bytes Virtual memory size in bytes.
# TYPE process_virtual_memory_bytes gauge
process_virtual_memory_bytes 1345773568
Debug
Levels: info,warn,error,debug,trace
To debug HTTP requests
export RUST_LOG=info,reqwest=debug
To trace everything
export RUST_LOG=trace
Development
To start:
cargo run --bin elasticsearch_exporter
To test:
cargo test
License
MIT
Dependencies
~19–35MB
~580K SLoC