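1 Background

Monitoring a service involves, broadly, five pieces: a collection SDK, reporting, storage, graphical display, and alerting. Two tools come up throughout this article. Prometheus is a mature and popular monitoring system for systems and services that covers almost all of these capabilities. Grafana, by comparison, focuses on visualization and offers a powerful, flexible dashboard system; we will use the data collected by Prometheus as a data source in Grafana. See their official sites for more detail.

As web service developers we do not need to build the whole stack ourselves; a dedicated team usually maintains the monitoring and alerting system. We do need to understand what the collection-side SDK does and how to configure monitoring. That means understanding the metric types in Prometheus and knowing how to write PromQL, Prometheus's own query language. Let's get started!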

2 Environment Setup

2.1 Initialize a Node project

First initialize a Node project to see what the collection side produces. The Prometheus SDK for Node is prom-client.

mkdir node-demo
cd node-demo
npm init
yarn add express @types/express prom-client

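Put the following code in index.ts and start the service with ts-node index.ts (this assumes ts-node and typescript are available on your machine):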

import express from 'express'
import {register, collectDefaultMetrics} from 'prom-client'
const app = express()
const port = 8080
collectDefaultMetrics()
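app.get('/metrics', async (req, res) => {
  // Send the content type Prometheus expects for the text exposition format
  res.set('Content-Type', register.contentType)
  res.send(await register.metrics())
})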
app.get('/', (req, res) => {
  res.send('Hello World!')
})
app.listen(port, () => {
  console.log(`Example app listening on port ${port}`)
})

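Visit http://localhost:8080/metrics. If the page returns plain-text metrics, the collection side is working.

Although we do not have to set up Prometheus and Grafana ourselves in production, running a local copy helps a lot when learning this stack. The production Prometheus and Grafana will not scrape a service running on your machine, so a local setup lets you collect metrics from it and experiment. If you do not plan to learn PromQL, this part is optional.

2.2 Install Prometheus and Grafana with Docker

Docker is the most convenient way to install Prometheus and Grafana.

1 Log in to your Docker account; register first if you do not have one.
2 Pull the images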

docker pull prom/prometheus
docker pull grafana/grafana

2.3 Create the prometheus.yml config file

Create a prometheus.yml file wherever you like; it configures Prometheus. Mine is at ~/prometheus/prometheus.yml, with the following content:

# my global config
global:
  scrape_interval: 15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).

scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: "node-demo"
    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.
    static_configs:
      - targets: ["192.168.1.42:8080"]

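Here I define a job named node-demo; targets holds my host's IP and the port of the Node service I am going to start. Replace the IP with your own host's IP.

2.4 Start the containers

Start the containers. Note that the prometheus.yml path must point to your own file.

docker run -d --name prometheus -p 9090:9090 -v ~/prometheus/prometheus.yml:/etc/prometheus/prometheus.yml prom/prometheus
docker run -d --name grafana -p 3000:3000 grafana/grafana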

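2.5 A first look at the UI

Once the containers are running, open Prometheus at http://localhost:9090, enter up and click Execute. If the query returns a result, everything started correctly. The up expression here is written in PromQL, Prometheus's own query language.

The Status menu at the top shows the contents of prometheus.yml, and its Targets page shows the health of the node-demo job we configured.

Next, connect Grafana. Open http://localhost:3000; the default username and password are both admin. After logging in, click DATA SOURCES.

Select Prometheus.

Fill in the URL and the name.

Click Save & test. If you see the message Data source is working, the connection succeeded.

Click Explore to open the Explore page. The data source you just created is selected automatically. Enter up and click Run query; if a result comes back, everything works.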

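3 Metrics in Prometheus

3.1 A first look at metrics

Now back to the service metrics; by the end of this section the text output will make sense. First, look at the form a metric takes: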

# HELP nodejs_heap_space_size_total_bytes Process heap space size total from Node.js in bytes.
# TYPE nodejs_heap_space_size_total_bytes gauge
nodejs_heap_space_size_total_bytes{space="read_only"} 176128
nodejs_heap_space_size_total_bytes{space="old"} 54087680 
nodejs_heap_space_size_total_bytes{space="code"} 2990080 
nodejs_heap_space_size_total_bytes{space="map"} 1056768 
nodejs_heap_space_size_total_bytes{space="large_object"} 21299200 
nodejs_heap_space_size_total_bytes{space="code_large_object"} 0 
nodejs_heap_space_size_total_bytes{space="new_large_object"} 0 
nodejs_heap_space_size_total_bytes{space="new"}

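The # HELP line is a description of the metric and explains what it measures. The # TYPE line declares the metric's type: the first token, nodejs_heap_space_size_total_bytes, is the metric name, and the token after the space, gauge, is its type. Lines without # are the actual samples, in the form metric name plus value. The content inside {} is the metric's labels, which split a metric into further dimensions: this metric is the size of the Node.js heap, and the space label breaks it down into read_only, old, and so on. Every metric follows this format.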
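To make the line format concrete, here is a minimal sketch of a parser for a single sample line. The helper parseSampleLine and the ParsedSample type are hypothetical names made up for this illustration; they are not part of prom-client, and the parser ignores edge cases such as escaped quotes, commas inside label values, and special values like +Inf.

```typescript
// Hypothetical types and helper for illustration; not part of prom-client.
interface ParsedSample {
  metricName: string
  labels: Record<string, string>
  value: number
}

// Parses one sample line such as: metric_name{label="value"} 123
// Simplified: assumes label values contain no escaped quotes or commas.
function parseSampleLine(line: string): ParsedSample | null {
  const match = /^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)\s*$/.exec(line.trim())
  if (match === null) return null // comment lines (# HELP, # TYPE) end up here
  const labels: Record<string, string> = {}
  const labelText = match[2]
  if (labelText) {
    const pairs = labelText.split(',')
    for (let i = 0; i < pairs.length; i++) {
      const pair = /^\s*([a-zA-Z_][a-zA-Z0-9_]*)="(.*)"\s*$/.exec(pairs[i])
      if (pair !== null) labels[pair[1]] = pair[2]
    }
  }
  return { metricName: match[1], labels: labels, value: Number(match[3]) }
}

const parsedSample = parseSampleLine('nodejs_heap_space_size_total_bytes{space="old"} 54087680')
```

For the line above, parsedSample holds the metric name, a labels object with space set to old, and the numeric value 54087680.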

3.2 Metric types

Prometheus has four metric types: Counter, Gauge, Histogram, and Summary.

Counter

A Counter is a metric that only ever increases. By convention, Counter names end with the _total suffix. For example, among the default metrics:

process_cpu_user_seconds_total
process_cpu_system_seconds_total
process_cpu_seconds_total

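These record the CPU time the Node service has consumed, which grows cumulatively. To get CPU usage, divide the increase in CPU time over one minute by the length of that minute (60 seconds). Request QPS is computed the same way from a request counter.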
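The arithmetic can be sketched as follows. The sample values are made up for illustration, and CounterPoint and perSecondIncrease are hypothetical names; Prometheus does this calculation for you in PromQL.

```typescript
// One scraped value of a counter: a timestamp in seconds plus the cumulative value.
interface CounterPoint {
  timestampSeconds: number
  value: number
}

// Average per-second increase between two samples of an ever-increasing counter.
function perSecondIncrease(earlier: CounterPoint, later: CounterPoint): number {
  const elapsed = later.timestampSeconds - earlier.timestampSeconds
  if (elapsed <= 0) throw new Error('samples must be in chronological order')
  return (later.value - earlier.value) / elapsed
}

// Made-up samples of process_cpu_seconds_total taken one minute apart:
// 3 CPU-seconds were consumed in 60 wall-clock seconds.
const cpuUsage = perSecondIncrease(
  { timestampSeconds: 0, value: 12 },
  { timestampSeconds: 60, value: 15 }
)

// The same calculation on a made-up request counter gives QPS:
// 600 requests in 60 seconds.
const qps = perSecondIncrease(
  { timestampSeconds: 0, value: 1000 },
  { timestampSeconds: 60, value: 1600 }
)
```

Here cpuUsage comes out at 0.05, meaning the process kept about 5% of one CPU core busy, and qps comes out at 10.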

Gauge

A Gauge is a metric that can go up or down; it reflects the current state of a value. For example, among the default metrics:

nodejs_heap_size_total_bytes 81444864
nodejs_heap_size_used_bytes 79206776

These record the heap memory size in Node, which obviously does not grow forever; it goes up and down.

Histogram

A Histogram is not a single-valued metric but a composite one: it records how observed values are distributed across a set of intervals (buckets). A demo:

import { Histogram, register } from 'prom-client'

const h = new Histogram({
	name: 'test_histogram',
	help: 'Example of a histogram',
	labelNames: ['code'],
	buckets: [0.1, 0.2, 0.3, 0.4, 0.5, 1],
});

h.labels('200').observe(0.4);
h.labels('200').observe(0.6);
h.observe({ code: '200' }, 0.4);

register.metrics().then(str => console.log(str));

Output:

# HELP test_histogram Example of a histogram
# TYPE test_histogram histogram
test_histogram_bucket{le="0.1",code="200"} 0
test_histogram_bucket{le="0.2",code="200"} 0
test_histogram_bucket{le="0.3",code="200"} 0
test_histogram_bucket{le="0.4",code="200"} 2
test_histogram_bucket{le="0.5",code="200"} 2
test_histogram_bucket{le="1",code="200"} 3
test_histogram_bucket{le="+Inf",code="200"} 3
test_histogram_sum{code="200"} 1.4
test_histogram_count{code="200"} 3

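The result contains the total number of observations, test_histogram_count, their total sum, test_histogram_sum, and a value for each configured bucket. Each bucket value is the number of observations less than or equal to the bucket's upper bound (le). For example, test_histogram_bucket{le="0.5",code="200"} 2 means two observations were at most 0.5, so the number of observations between 0.5 and 1 is 3 - 2 = 1.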
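The cumulative counting can be reproduced with a few lines of plain TypeScript. This is only a sketch of the bookkeeping, not prom-client's implementation; cumulativeBuckets and perIntervalCounts are hypothetical helper names.

```typescript
// Counts, for each upper bound, how many observations are <= that bound.
// The final entry is the implicit le="+Inf" bucket, which counts everything.
function cumulativeBuckets(upperBounds: number[], observations: number[]): number[] {
  const bounds = upperBounds.concat([Infinity])
  return bounds.map((bound) => observations.filter((value) => value <= bound).length)
}

// Converts cumulative counts into per-interval counts by subtracting neighbours.
function perIntervalCounts(cumulative: number[]): number[] {
  return cumulative.map((count, i) => (i === 0 ? count : count - cumulative[i - 1]))
}

// Same buckets and observations as the test_histogram demo.
const cumulativeCounts = cumulativeBuckets([0.1, 0.2, 0.3, 0.4, 0.5, 1], [0.4, 0.6, 0.4])
// cumulativeCounts: [0, 0, 0, 2, 2, 3, 3]

const intervalCounts = perIntervalCounts(cumulativeCounts)
// intervalCounts: [0, 0, 0, 2, 0, 1, 0]
```

The cumulative counts match the test_histogram_bucket lines in the output, and the per-interval counts show the single observation that fell between 0.5 and 1.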

The garbage-collection metric among the default metrics is a histogram:

# TYPE nodejs_gc_duration_seconds histogram
nodejs_gc_duration_seconds_bucket{le="0.001",kind="major"} 0 
nodejs_gc_duration_seconds_bucket{le="0.01",kind="major"} 2 
nodejs_gc_duration_seconds_bucket{le="0.1",kind="major"} 2 
nodejs_gc_duration_seconds_bucket{le="1",kind="major"} 2 
nodejs_gc_duration_seconds_bucket{le="2",kind="major"} 2 
nodejs_gc_duration_seconds_bucket{le="5",kind="major"} 2 
nodejs_gc_duration_seconds_bucket{le="+Inf",kind="major"} 2 
nodejs_gc_duration_seconds_sum{kind="major"} 0.008220480993390084 
nodejs_gc_duration_seconds_count{kind="major"} 2

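Summary

A Summary is similar to a Histogram and is also a composite metric, but it is used less often. None of the default metrics uses Summary, possibly because it does not support aggregation and can only describe a single instance. An example: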

import { Summary, register } from 'prom-client'

const h = new Summary({
	name: 'test_summary',
	help: 'Example of a summary',
	labelNames: ['code'],
	percentiles: [0.1, 0.3, 0.4, 0.5, 1],
});
h.labels('200').observe(0.2);
h.labels('200').observe(0.4);

h.labels('200').observe(0.5);
h.labels('200').observe(1);

register.metrics().then(str => console.log(str));

Output:

# HELP test_summary Example of a summary
# TYPE test_summary summary
test_summary{quantile="0.1",code="200"} 0.2
test_summary{quantile="0.3",code="200"} 0.33999999999999997
test_summary{quantile="0.4",code="200"} 0.41000000000000003
test_summary{quantile="0.5",code="200"} 0.45
test_summary{quantile="1",code="200"} 1
test_summary_sum{code="200"} 2.1
test_summary_count{code="200"} 4

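It also records the count and the sum, but unlike a histogram it reports quantiles: quantile="0.4" means 40% of the observed values are below 0.41000000000000003. The quantiles are computed on the client rather than by simple incrementing, so they cannot be aggregated across instances. Histogram buckets, by contrast, can be aggregated across instances.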
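A small worked example shows why. It uses a simple nearest-rank quantile over raw values, which is an assumption made for illustration and not the streaming algorithm a client library uses; nearestRankQuantile and mergeBuckets are hypothetical names, and the request durations are made up.

```typescript
// Nearest-rank quantile over raw values (an illustrative definition only).
function nearestRankQuantile(q: number, values: number[]): number {
  const sorted = values.slice().sort((a, b) => a - b)
  const rank = Math.max(1, Math.ceil(q * sorted.length))
  return sorted[rank - 1]
}

// Cumulative histogram buckets from several instances merge by plain addition.
function mergeBuckets(a: number[], b: number[]): number[] {
  return a.map((count, i) => count + b[i])
}

const instanceA = [1, 1, 1, 1, 1, 1, 1, 1, 1] // nine fast requests
const instanceB = [100] // one slow request

// Averaging per-instance medians gives a misleading number...
const averagedMedian =
  (nearestRankQuantile(0.5, instanceA) + nearestRankQuantile(0.5, instanceB)) / 2
// ...while the median over all ten requests is still fast.
const trueMedian = nearestRankQuantile(0.5, instanceA.concat(instanceB))

// Buckets for the bounds [le=10, le=+Inf] on each instance simply add up.
const mergedBuckets = mergeBuckets([9, 9], [0, 1])
```

The averaged median is 50.5 while the true median is 1, whereas the merged buckets [9, 10] are exactly what a single histogram over all ten requests would contain.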

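4 Using prom-client

4.1 Basic usage

The Prometheus client for Node.js is prom-client, which was installed during environment setup.

A Registry is a container for multiple metrics, and it is what finally exposes the data. prom-client exports the Registry constructor and also a built-in default registry named register, so we can import them like this:

import {register, Registry} from 'prom-client';

Every metric can be given a registry when it is created; if none is given, the built-in default registry is used. The relevant source is in lib/metric.js.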

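prom-client has a built-in collectDefaultMetrics method that collects a recommended set of metrics.

collectDefaultMetrics optionally accepts a config object with the following entries:

prefix: an optional prefix for metric names. Default: no prefix.
register: the registry the metrics should be registered to. Default: the global default registry.
gcDurationBuckets: custom buckets for the GC duration histogram. The default buckets are [0.001, 0.01, 0.1, 1, 2, 5] (in seconds).
eventLoopMonitoringPrecision: the sampling rate in milliseconds. Must be greater than zero. Default: 10.

Make sure collectDefaultMetrics is called only once.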

import {collectDefaultMetrics} from 'prom-client';
collectDefaultMetrics()

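The metrics we saw during environment setup were all produced by this method.

Next is how to get the metric data out. The registry object provides two async methods that return everything: metrics returns the text format and getMetricsAsJSON returns JSON. Return either one from whichever route you need.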

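4.2 Custom metrics

The built-in metrics cannot cover all of our monitoring needs, so we need to define our own, following the metric naming conventions. All four metric types described above are available from prom-client:

import {Counter, Gauge, Histogram, Summary} from 'prom-client'

A Counter can only increase, so its value is changed only through the inc method, which records an increment. For example, the built-in CPU time statistics: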

 const cpuUserUsageCounter = new Counter({
        name: namePrefix + PROCESS_CPU_USER_SECONDS,
        help: 'Total user CPU time spent in seconds.',
        registers,
        labelNames,
        // Use this one metric's `collect` to set all metrics' values.
        collect() {
                const cpuUsage = process.cpuUsage();
                const userUsageMicros = cpuUsage.user - lastCpuUsage.user;
                const systemUsageMicros = cpuUsage.system - lastCpuUsage.system;
                lastCpuUsage = cpuUsage;
                cpuUserUsageCounter.inc(labels, userUsageMicros / 1e6);
                cpuSystemUsageCounter.inc(labels, systemUsageMicros / 1e6);
                cpuUsageCounter.inc(labels, (userUsageMicros + systemUsageMicros) / 1e6);
        },
});
const cpuSystemUsageCounter = new Counter({
        name: namePrefix + PROCESS_CPU_SYSTEM_SECONDS,
        help: 'Total system CPU time spent in seconds.',
        registers,
        labelNames,
});
const cpuUsageCounter = new Counter({
        name: namePrefix + PROCESS_CPU_SECONDS,
        help: 'Total user and system CPU time spent in seconds.',
        registers,
        labelNames,
});

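A Gauge can go up and down, so it provides inc to increase, dec to decrease, and set to assign a specific value. startTimer makes it easy to measure a duration without computing it by hand. The source: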

/**
 * Start a timer
 * @param {object} labels - Object with labels where key is the label key and value is label value. Can only be one level deep
 * @returns {function} - Invoke this function to set the duration in seconds since you started the timer.
 * @example
 * var done = gauge.startTimer();
 * makeXHRRequest(function(err, response) {
 *	done(); //Duration of the request will be saved
 * });
 */
startTimer(labels) {
        const start = process.hrtime();
        return endLabels => {
                const delta = process.hrtime(start);
                const value = delta[0] + delta[1] / 1e9;
                this.set(Object.assign({}, labels, endLabels), value);
                return value;
        };
}

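Every metric accepts a collect function, which runs whenever the metric's data is collected, so we can read the current value inside it and assign it. Memory, for example, is collected inside collect: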

const collect = () => {
        const memUsage = safeMemoryUsage();
        if (memUsage) {
                heapSizeTotal.set(labels, memUsage.heapTotal);
                heapSizeUsed.set(labels, memUsage.heapUsed);
                if (memUsage.external !== undefined) {
                        externalMemUsed.set(labels, memUsage.external);
                }
        }
};

The code that runs collect is in each metric's get method:

async get() {
    if (this.collect) {
            const v = this.collect();
            if (v instanceof Promise) await v;
    }
...
}

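Histogram also has startTimer; its recording method is observe, and it takes an additional buckets option for customizing the intervals. Every metric type accepts labelNames and registers. Make sure the code that defines a metric runs exactly once at service startup; defining the same metric a second time throws an error.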
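The "define once" rule follows from the registry being a container keyed by metric name. The toy model below is a sketch under that assumption; ToyRegistry is a hypothetical class written for this article, and prom-client's real Registry does considerably more.

```typescript
// A toy model of a registry: a container of metrics keyed by name.
// Hypothetical class for illustration; not prom-client's implementation.
class ToyRegistry {
  private metrics: Record<string, { help: string }> = {}

  registerMetric(metricName: string, help: string): void {
    if (Object.prototype.hasOwnProperty.call(this.metrics, metricName)) {
      throw new Error('A metric with the name ' + metricName + ' has already been registered.')
    }
    this.metrics[metricName] = { help: help }
  }

  metricNames(): string[] {
    return Object.keys(this.metrics)
  }
}

const toyRegistry = new ToyRegistry()
toyRegistry.registerMetric('http_requests_total', 'Total HTTP requests')

let duplicateRejected = false
try {
  // Simulates the module that defines the metric being executed a second time.
  toyRegistry.registerMetric('http_requests_total', 'Total HTTP requests')
} catch (err) {
  duplicateRejected = true
}
```

In practice this is why metric definitions usually live at module top level, where they run once on import, rather than inside a request handler.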

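5 PromQL

Prometheus provides a functional query language called PromQL (Prometheus Query Language) that lets users select and aggregate time series data in real time.

5.1 Understanding time series

The metrics we define are scraped by Prometheus at a fixed interval, so they are stored as data along a time axis. Entering a metric name in Prometheus returns its most recent value.

Adding a time range such as [1m] returns the data from the last minute. Each series then returns four values, because we scrape every 15 seconds.

You can think of this as the difference between point-in-time data and time-range data.

With point-in-time data, switching to the Graph panel plots time on the horizontal axis and the value on the vertical axis.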
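The difference between the two kinds of selection can be modelled on a plain array of samples. This is a simplified sketch: ScrapedPoint, instantValue and rangeValues are hypothetical names, the sample values are made up, and details such as how Prometheus treats stale series are ignored.

```typescript
interface ScrapedPoint {
  t: number // timestamp in seconds
  v: number // sample value
}

// Instant selection: the most recent sample at or before the evaluation time.
function instantValue(series: ScrapedPoint[], evalTime: number): ScrapedPoint | null {
  let latest: ScrapedPoint | null = null
  for (const p of series) {
    if (p.t <= evalTime && (latest === null || p.t > latest.t)) latest = p
  }
  return latest
}

// Range selection: every sample inside the window that ends at evalTime.
function rangeValues(series: ScrapedPoint[], evalTime: number, windowSeconds: number): ScrapedPoint[] {
  return series.filter((p) => p.t > evalTime - windowSeconds && p.t <= evalTime)
}

// Samples scraped every 15 seconds, as configured in prometheus.yml.
const scraped: ScrapedPoint[] = [0, 15, 30, 45, 60, 75, 90].map((t) => ({ t: t, v: 1000 + t }))

const newest = instantValue(scraped, 95) // the sample at t = 90
const lastMinute = rangeValues(scraped, 95, 60) // the samples at t = 45, 60, 75, 90
```

With a 15-second scrape interval, a one-minute window evaluated at t = 95 contains four samples, which is the "four values per series" behaviour described above.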

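5.2 Instant vector selectors

A metric may define labels, and the metric name alone cannot distinguish between them; instant vector selectors filter by label. Label matchers go inside {}, with the following operators:

=: select labels that are exactly equal to the provided string.
!=: select labels that are not equal to the provided string.
=~: select labels that regex-match the provided string.
!~: select labels that do not regex-match the provided string.

For example: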

nodejs_heap_space_size_total_bytes{space="large_object"}
nodejs_heap_space_size_total_bytes{space!="large_object"}
nodejs_heap_space_size_total_bytes{space=~"new.*"}

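Regex matches are fully anchored, as if the pattern were wrapped in /^...$/. For instance, space=~"new" matches only the value new, not new_large_object.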
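The four matchers and the anchoring rule can be sketched like this. One assumption to note: JavaScript's RegExp stands in for the RE2 syntax Prometheus uses, which is close enough for simple patterns. LabelMatcher, matchesLabel and selectSpaces are hypothetical names for this illustration.

```typescript
type MatchOp = '=' | '!=' | '=~' | '!~'

interface LabelMatcher {
  label: string
  op: MatchOp
  value: string
}

// Regex matchers are fully anchored: the pattern must match the whole label value.
// Assumption: JavaScript RegExp approximates the RE2 syntax Prometheus uses.
function matchesLabel(labels: Record<string, string>, m: LabelMatcher): boolean {
  // A missing label behaves like a label with an empty value.
  const actual = labels[m.label] === undefined ? '' : labels[m.label]
  if (m.op === '=') return actual === m.value
  if (m.op === '!=') return actual !== m.value
  const fullMatch = new RegExp('^(?:' + m.value + ')$').test(actual)
  return m.op === '=~' ? fullMatch : !fullMatch
}

const spaces = ['new', 'old', 'new_large_object', 'large_object']

// Returns the space values that a matcher on the space label would select.
function selectSpaces(op: MatchOp, value: string): string[] {
  return spaces.filter((space) =>
    matchesLabel({ space: space }, { label: 'space', op: op, value: value })
  )
}

const withWildcard = selectSpaces('=~', 'new.*') // new, new_large_object
const withoutWildcard = selectSpaces('=~', 'new') // new only, because of anchoring
const notLargeObject = selectSpaces('!=', 'large_object') // new, old, new_large_object
```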

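5.3 Range vector selectors

These select multiple samples over a range of time. The duration goes inside [], and the supported units are:

ms - milliseconds
s - seconds
m - minutes
h - hours
d - days, assuming a day always has 24h
w - weeks, assuming a week always has 7d
y - years, assuming a year always has 365d

For example: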

nodejs_heap_space_size_total_bytes{space=~"new.*"}[1m]
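The unit table translates directly into a small duration parser. The sketch below is simplified: parseDuration is a hypothetical helper, and it accepts units in any order, which is looser than what Prometheus itself allows.

```typescript
// Milliseconds per unit, using the fixed-length assumptions from the unit list.
const UNIT_MS: Record<string, number> = {
  ms: 1,
  s: 1000,
  m: 60 * 1000,
  h: 60 * 60 * 1000,
  d: 24 * 60 * 60 * 1000,
  w: 7 * 24 * 60 * 60 * 1000,
  y: 365 * 24 * 60 * 60 * 1000
}

// Parses durations such as "1m" or "1h30m" into milliseconds.
// Simplified: unit order and repetition are not validated.
function parseDuration(text: string): number {
  if (!/^(\d+(ms|s|m|h|d|w|y))+$/.test(text)) {
    throw new Error('invalid duration: ' + text)
  }
  const part = /(\d+)(ms|s|m|h|d|w|y)/g
  let total = 0
  let match = part.exec(text)
  while (match !== null) {
    total += Number(match[1]) * UNIT_MS[match[2]]
    match = part.exec(text)
  }
  return total
}

const oneMinute = parseDuration('1m') // 60000
const ninetyMinutes = parseDuration('1h30m') // 5400000
```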

5.4 Offset modifier (rarely used)

The offset modifier shifts the evaluation time of individual instant and range vectors in a query. For example

nodejs_heap_space_size_total_bytes[1m]

returns samples with the timestamp 1656819991.26, while

nodejs_heap_space_size_total_bytes[1m] offset 1m

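returns samples with the timestamp 1656819931.333, shifted one minute into the past.

@ modifier (rarely used)

By default a query returns values as of the current time; the @ modifier specifies the point in time to evaluate at. For example: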

nodejs_heap_space_size_total_bytes @1656820184

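5.5 Operators

See the documentation for the full list of operators; here are the commonly used ones.

Arithmetic operators

Prometheus has the following binary arithmetic operators:

+ (addition)
- (subtraction)
* (multiplication)
/ (division)
% (modulo)
^ (power/exponentiation)

For example, memory is reported in bytes by default, and we can display it in MB: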

nodejs_heap_size_total_bytes/1024/1024

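5.6 Common aggregation operators

Services usually run as multiple instances, so statistics need to aggregate the data from all instances.

sum (calculate sum over dimensions)
min (select minimum over dimensions)
max (select maximum over dimensions)
avg (calculate the average over dimensions)
group (all values in the resulting vector are 1)
stddev (calculate population standard deviation over dimensions)
stdvar (calculate population standard variance over dimensions)
count (count number of elements in the vector)
count_values (count number of elements with the same value)
bottomk (smallest k elements by sample value)
topk (largest k elements by sample value)
quantile (calculate φ-quantile (0 ≤ φ ≤ 1) over dimensions)

For example: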

sum(nodejs_heap_space_size_total_bytes) by (space)
topk(2, nodejs_heap_space_size_total_bytes)
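What these two queries compute can be sketched over an in-memory list of series. The data is made up (two hypothetical instances, a and b), and sumBy and topk here are illustrative reimplementations, not how Prometheus evaluates queries internally.

```typescript
interface SeriesValue {
  labels: Record<string, string>
  value: number
}

// sum(...) by (label): add up the values that share the same value for `label`.
function sumBy(series: SeriesValue[], label: string): Record<string, number> {
  const totals: Record<string, number> = {}
  for (const s of series) {
    const key = s.labels[label] === undefined ? '' : s.labels[label]
    totals[key] = (totals[key] === undefined ? 0 : totals[key]) + s.value
  }
  return totals
}

// topk(k, ...): the k series with the largest values.
function topk(k: number, series: SeriesValue[]): SeriesValue[] {
  return series.slice().sort((a, b) => b.value - a.value).slice(0, k)
}

// Two hypothetical instances reporting the same metric with made-up values.
const heapSpaces: SeriesValue[] = [
  { labels: { instance: 'a', space: 'old' }, value: 50 },
  { labels: { instance: 'b', space: 'old' }, value: 70 },
  { labels: { instance: 'a', space: 'new' }, value: 10 },
  { labels: { instance: 'b', space: 'new' }, value: 20 }
]

const totalsBySpace = sumBy(heapSpaces, 'space') // { old: 120, new: 30 }
const largestTwo = topk(2, heapSpaces) // instance b old (70), instance a old (50)
```

Note that sum ... by (space) collapses the instance label, which is exactly the cross-instance aggregation this section is about, while topk keeps each selected series with all of its labels.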

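5.7 Common functions

See the documentation for the full list of functions. This section covers three functions used to compute growth rate and how they differ.

increase() computes the increase of a time series over a range vector. It returns only the increase, so to get a growth rate you have to divide by the time yourself.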

increase(http_requests_total[1m])/60

rate() computes the average per-second growth rate over the time range of a range vector.

rate(http_requests_total[1m])

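It computes the rate roughly as the difference between the last and first samples in the range divided by the elapsed time. A short spike inside the range can therefore go unnoticed because it is averaged over the whole range. One remedy is to use a shorter range; the other is to use irate.

irate() also computes a growth rate over a range vector, but an instantaneous one, based on the last two data points in the range. In short, irate is more responsive, while rate reflects the trend over the period.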
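The difference between the three functions shows up clearly on a small made-up series. The functions below are deliberately simplified sketches: they handle counter resets but skip the extrapolation to the window boundaries that Prometheus applies, so real query results will differ slightly. RequestSample and the sample values are assumptions for illustration.

```typescript
interface RequestSample {
  t: number // timestamp in seconds
  v: number // cumulative counter value
}

// Total increase across the samples, treating a drop in value as a counter reset.
function increase(samples: RequestSample[]): number {
  let total = 0
  for (let i = 1; i < samples.length; i++) {
    const delta = samples[i].v - samples[i - 1].v
    total += delta >= 0 ? delta : samples[i].v
  }
  return total
}

// Average per-second growth over the time spanned by the samples.
function rate(samples: RequestSample[]): number {
  const span = samples[samples.length - 1].t - samples[0].t
  return increase(samples) / span
}

// Per-second growth between the last two samples only.
function irate(samples: RequestSample[]): number {
  return rate(samples.slice(-2))
}

// Made-up http_requests_total samples: steady traffic, then a burst in the last 15 s.
const requestSamples: RequestSample[] = [
  { t: 0, v: 100 },
  { t: 15, v: 115 },
  { t: 30, v: 130 },
  { t: 45, v: 145 },
  { t: 60, v: 445 }
]

const totalIncrease = increase(requestSamples) // 345 requests in the window
const averageRate = rate(requestSamples) // 5.75 requests per second
const instantRate = irate(requestSamples) // 20 requests per second
```

The burst of 300 requests in the final 15 seconds is diluted to 5.75 per second by rate but shows up as 20 per second with irate, and totalIncrease / 60 gives the same 5.75 as rate.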

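6 Summary

This article covered the work involved in monitoring as a whole, introduced what Prometheus and Grafana do and how to install them with Docker, described the four metric types in Prometheus, showed the basic usage of prom-client and how to define custom metrics, and finished with commonly used PromQL syntax.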


Node service monitoring based on Prometheus