Concepts
Prometheus Server
The Prometheus server is the brain of the metrics-based monitoring system. Its main job is to collect metrics from various targets using a pull model.
A target can be a server, a pod, an endpoint, and so on; targets are covered in detail in the next topic.
Collecting metrics from targets with Prometheus is called scraping.
Targets
A target is a source from which Prometheus scrapes metrics. A target could be a server, a service, a Kubernetes pod, an application endpoint, etc.
By default, Prometheus looks for metrics under the /metrics path of the target. The default path can be changed in the target configuration; if you don't specify a custom metrics path, Prometheus uses /metrics.
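To make the default path concrete, here is a minimal sketch in Python that uses only the standard library. It starts a toy target that serves a hard-coded payload on /metrics and fetches it the way a scraper would. The metric name demo_requests_total, the helper names start_target and scrape, and the payload itself are assumptions made up for this illustration; they are not part of Prometheus.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical sample payload in the Prometheus text exposition format.
SAMPLE_METRICS = (
    "# HELP demo_requests_total Requests handled by the demo target.\n"
    "# TYPE demo_requests_total counter\n"
    "demo_requests_total 3\n"
)


class MetricsHandler(BaseHTTPRequestHandler):
    """Toy target: serves metrics on /metrics and 404 everywhere else."""

    def do_GET(self):
        if self.path == "/metrics":
            body = SAMPLE_METRICS.encode("utf-8")
            self.send_response(200)
            self.send_header("Content-Type", "text/plain; version=0.0.4")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_error(404)

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def start_target(port=0):
    """Start the toy target in a background thread (port 0 = any free port)."""
    server = HTTPServer(("127.0.0.1", port), MetricsHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def scrape(host, port, metrics_path="/metrics"):
    """Fetch metrics over HTTP; the path defaults to /metrics."""
    conn = http.client.HTTPConnection(host, port, timeout=5)
    try:
        conn.request("GET", metrics_path)
        response = conn.getresponse()
        return response.status, response.read().decode("utf-8")
    finally:
        conn.close()
```

Calling scrape("127.0.0.1", server.server_address[1]) on a server returned by start_target() returns the payload, while passing a different metrics_path returns a 404 from this toy target.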
Service Discovery
Prometheus uses two methods to find the targets it scrapes.
Static configs: When the targets have a static IP or DNS endpoint, we can use those endpoints as targets.
Service Discovery: In most autoscaling and distributed systems such as Kubernetes, targets do not have a static endpoint. In this case, the target endpoints are discovered using Prometheus service discovery and are added automatically to the list of scrape targets.
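One of the simpler discovery mechanisms is file-based discovery, where Prometheus watches a JSON file that lists target groups. The sketch below shows how a script might generate such a file from whatever instances are currently alive. The function names and the sample addresses are hypothetical; only the JSON shape (a list of objects with "targets" and "labels") follows the file-based discovery format.

```python
import json


def build_file_sd(instances, job):
    """Build one target group from (host, port) pairs that are currently alive."""
    targets = sorted(f"{host}:{port}" for host, port in instances)
    return [{"targets": targets, "labels": {"job": job}}]


def render_file_sd(instances, job):
    """Serialize the target groups as JSON, ready to be written to a file."""
    return json.dumps(build_file_sd(instances, job), indent=2)
```

A process that knows which instances exist would rewrite this file whenever they change, so the static list in the main configuration never has to be edited by hand.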
TSDB
The metric data that Prometheus collects (CPU, memory, network I/O, etc.) changes over time. Such data is called time-series data, so Prometheus uses a Time Series Database (TSDB) to store it.
By default, Prometheus stores all its data in an efficient format (chunks) on the local disk. Over time, it compacts older data to save space. It also has a retention policy to get rid of old data.
The TSDB has built-in mechanisms to manage data kept for a long time. You can use the following data retention policies.
Time-based retention: Data is kept for the specified number of days. The default retention is 15 days.
Size-based retention: You can specify the maximum amount of data the TSDB can hold. Once this limit is reached, Prometheus removes the oldest data to make room for new data.
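The two retention policies can be illustrated with a toy model. This is not how the TSDB is implemented; it is a simplified sketch that assumes data is stored in blocks, each with a newest timestamp and a size. The Block class and apply_retention function are names invented for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Block:
    max_time: int    # timestamp of the newest sample in the block, in seconds
    size_bytes: int  # space the block takes on disk


def apply_retention(blocks, now, retention_seconds=None, max_bytes=None):
    """Return the blocks that survive, oldest first."""
    kept = sorted(blocks, key=lambda b: b.max_time, reverse=True)  # newest first

    # Time-based retention: drop blocks older than the retention window.
    if retention_seconds is not None:
        kept = [b for b in kept if now - b.max_time <= retention_seconds]

    # Size-based retention: keep the newest blocks that fit in the budget,
    # which is the same as deleting the oldest blocks first.
    if max_bytes is not None:
        total = 0
        within_budget = []
        for block in kept:
            if total + block.size_bytes > max_bytes:
                break
            total += block.size_bytes
            within_budget.append(block)
        kept = within_budget

    return sorted(kept, key=lambda b: b.max_time)
```

With a 15-day window, a block whose newest sample is 19 days old is dropped while blocks that are 10 days and 1 day old are kept.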
Exporters
Exporters are like agents that run on the targets. An exporter converts metrics from a specific system into a format that Prometheus understands.
These could be system metrics such as CPU and memory, Java JMX metrics, MySQL metrics, etc.
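The conversion step an exporter performs can be sketched as a small function that turns readings into the Prometheus text exposition format. This is an illustrative sketch, not the code of any real exporter; the function names and the metric names used in the usage note are hypothetical.

```python
def escape_label_value(value):
    """Escape backslashes, double quotes and newlines in a label value."""
    return str(value).replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def to_exposition(name, help_text, metric_type, samples):
    """Render one metric family as text.

    samples is a list of (labels_dict, value) pairs.
    """
    lines = [f"# HELP {name} {help_text}", f"# TYPE {name} {metric_type}"]
    for labels, value in samples:
        if labels:
            rendered = ",".join(
                f'{key}="{escape_label_value(val)}"'
                for key, val in sorted(labels.items())
            )
            lines.append(f"{name}{{{rendered}}} {value}")
        else:
            lines.append(f"{name} {value}")
    # The text format expects the last line to end with a newline.
    return "\n".join(lines) + "\n"
```

For example, to_exposition("demo_cpu_seconds_total", "CPU time.", "counter", [({"mode": "idle", "cpu": "0"}, 12.5)]) produces a HELP line, a TYPE line, and one sample line with the labels sorted by name.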
Alertmanager
Alertmanager is a key part of the Prometheus monitoring system. Its primary job is to send notifications for alerts that fire based on the metric thresholds set in the Prometheus alert configuration.
Alerts are triggered by Prometheus and sent to Alertmanager, which in turn sends them to the notification systems/receivers (email, Slack, etc.) configured in the Alertmanager configuration.
Alertmanager also handles the following:
Deduplication: Suppressing duplicate alerts.
Grouping: Grouping related alerts together.
Silencing: Muting alerts during maintenance or for false positives.
Routing: Sending alerts to the appropriate receivers based on severity.
Inhibition: Suppressing a low-severity alert when a medium- or high-severity alert is already firing.
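Three of these ideas (deduplication, grouping, and routing) can be sketched as plain functions over a list of alerts. This is a toy model rather than Alertmanager's implementation: alerts are assumed to be dictionaries with a "labels" entry, and the function names, label names, and receiver names are hypothetical.

```python
from collections import defaultdict


def deduplicate(alerts):
    """Keep one alert per distinct label set, preserving first-seen order."""
    seen = set()
    unique = []
    for alert in alerts:
        identity = frozenset(alert["labels"].items())
        if identity not in seen:
            seen.add(identity)
            unique.append(alert)
    return unique


def group_alerts(alerts, group_by):
    """Bucket alerts that share the same values for the group_by labels."""
    groups = defaultdict(list)
    for alert in alerts:
        key = tuple(alert["labels"].get(label, "") for label in group_by)
        groups[key].append(alert)
    return dict(groups)


def pick_receiver(alert, routes, default_receiver):
    """Choose a receiver by severity, falling back to a default."""
    return routes.get(alert["labels"].get("severity"), default_receiver)
```

Grouping by a label such as alertname means that many instances firing the same alert end up in one bucket, which can then be sent as a single notification.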
Pushgateway
By default, Prometheus uses a pull mechanism to scrape metrics.
However, there are scenarios where metrics need to be pushed to Prometheus, for example short-lived batch jobs that may finish before Prometheus gets a chance to scrape them.
So instead of waiting for Prometheus to pull the metrics, we push them. For this, Prometheus offers a solution called Pushgateway, which acts as a kind of intermediate gateway.
Pushgateway needs to run as a standalone component. Batch jobs push their metrics to the Pushgateway using its HTTP API. Pushgateway then exposes those metrics on its /metrics endpoint, and Prometheus scrapes them from there.
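A push is an HTTP request whose URL names the job (and optionally more grouping labels) and whose body carries metrics in the text format. The sketch below only builds the request so it can be inspected without a running gateway. The gateway address, job name, and metric name in the usage note are assumptions for illustration.

```python
import urllib.request
from urllib.parse import quote


def build_push_request(gateway_url, job, payload, grouping=None):
    """Build a PUT request for /metrics/job/<job>[/<label>/<value>...].

    payload is metrics text that ends with a newline.
    """
    path = "/metrics/job/" + quote(job, safe="")
    for label, value in (grouping or {}).items():
        path += "/" + quote(label, safe="") + "/" + quote(value, safe="")
    return urllib.request.Request(
        gateway_url.rstrip("/") + path,
        data=payload.encode("utf-8"),
        method="PUT",  # PUT replaces all metrics previously pushed for this group
        headers={"Content-Type": "text/plain; version=0.0.4"},
    )
```

A batch job could build a request with build_push_request("http://pushgateway.example:9091", "nightly_backup", "backup_last_success_timestamp_seconds 1700000000\n") and send it with urllib.request.urlopen as its last step.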
PromQL
PromQL is a flexible query language used to query time-series metrics from Prometheus.
We can run queries directly from the Prometheus user interface, or use the curl command to run a query from the command line.
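Command-line queries go through the Prometheus HTTP API, where an instant query is a GET on /api/v1/query with the expression in the query parameter. The sketch below builds such a URL and reads an instant-vector response. The helper names and the server address in the usage note are assumptions, and error handling is kept to the bare minimum.

```python
import json
from urllib.parse import urlencode


def build_query_url(base_url, promql):
    """Return the instant-query URL for a PromQL expression."""
    params = urlencode({"query": promql})
    return base_url.rstrip("/") + "/api/v1/query?" + params


def extract_vector(response_text):
    """Turn an instant-vector JSON response into (labels, value) pairs."""
    payload = json.loads(response_text)
    if payload.get("status") != "success":
        raise ValueError("query failed: " + str(payload.get("error")))
    data = payload["data"]
    if data["resultType"] != "vector":
        raise ValueError("expected a vector, got " + data["resultType"])
    # Each sample value is a [timestamp, "value-as-string"] pair.
    return [(sample["metric"], float(sample["value"][1])) for sample in data["result"]]
```

For instance, build_query_url("http://localhost:9090", "up") gives a URL that can be passed to curl or fetched with urllib.request.urlopen.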
Client Libraries
Prometheus client libraries are software libraries used to instrument application code so that it exposes metrics in a format Prometheus understands.
In cases where you need custom instrumentation or you want to create your own exporters, you can use the client libraries.
A very good use case is batch jobs that need to push metrics to the Pushgateway. The batch job needs to be instrumented with a client library to expose the required metrics in Prometheus format.
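To show what instrumentation means without depending on any particular library, here is a hand-rolled stand-in for a counter. It is not the API of a real Prometheus client library, which would provide ready-made metric types; the Counter class, the metric name, and the process function are all hypothetical.

```python
import threading


class Counter:
    """Toy counter: a value that only goes up, rendered in text format."""

    def __init__(self, name, documentation):
        self.name = name
        self.documentation = documentation
        self._value = 0.0
        self._lock = threading.Lock()

    def inc(self, amount=1.0):
        if amount < 0:
            raise ValueError("counters can only increase")
        with self._lock:
            self._value += amount

    def render(self):
        with self._lock:
            value = self._value
        return (
            f"# HELP {self.name} {self.documentation}\n"
            f"# TYPE {self.name} counter\n"
            f"{self.name} {value}\n"
        )


# Hypothetical metric for a batch job.
JOBS_PROCESSED = Counter("batch_jobs_processed_total", "Items processed by the batch run.")


def process(items):
    """Instrumented work loop: count every item handled."""
    for _ in items:
        JOBS_PROCESSED.inc()
```

After the work loop finishes, the text returned by JOBS_PROCESSED.render() is what a batch job would push to the Pushgateway.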