eBPF Talk: CPU and NUMA

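In modern servers, CPUs almost always use a multi-core NUMA architecture. For networking, a packet is best handled on a single CPU all the way from the physical NIC driver to the corresponding application-layer socket, because that gives the best performance (no penalty from memory cache misses).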

eBPF provides two helper functions for getting the CPU ID and the NUMA node ID.

Both are documented in the bpf-helpers(7) man page.

CPU: bpf_get_smp_processor_id

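Returns the ID of the current processor, i.e. the processor number shown in /proc/cpuinfo. However, the function returns a stable CPU ID only while kernel preemption is disabled; otherwise the thread running the eBPF program can be rescheduled onto another CPU.

The function's documentation: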

       u32 bpf_get_smp_processor_id(void)

              Description
                     Get the SMP (symmetric multiprocessing) processor
                     id. Note that all programs run with preemption
                     disabled, which means that the SMP processor id is
                     stable during all the execution of the program.

              Return The SMP id of the processor running the program.

Reading the function's source code reveals some more interesting details.

// kernel/bpf/helpers.c

BPF_CALL_0(bpf_get_smp_processor_id)
{
       return smp_processor_id();
}

// include/linux/smp.h

/**
 * raw_processor_id() - get the current (unstable) CPU id
 *
 * For then you know what you are doing and need an unstable
 * CPU id.
 */

/**
 * smp_processor_id() - get the current (stable) CPU id
 *
 * This is the normal accessor to the CPU id and should be used
 * whenever possible.
 *
 * The CPU id is stable when:
 *
 *  - IRQs are disabled;
 *  - preemption is disabled;
 *  - the task is CPU affine.
 *
 * When CONFIG_DEBUG_PREEMPT; we verify these assumption and WARN
 * when smp_processor_id() is used when the CPU id is not stable.
 */

/*
 * Allow the architecture to differentiate between a stable and unstable read.
 * For example, x86 uses an IRQ-safe asm-volatile read for the unstable but a
 * regular asm read for the stable.
 */
#ifndef __smp_processor_id
#define __smp_processor_id(x) raw_smp_processor_id(x)
#endif

#ifdef CONFIG_DEBUG_PREEMPT
  extern unsigned int debug_smp_processor_id(void);
# define smp_processor_id() debug_smp_processor_id()
#else
# define smp_processor_id() __smp_processor_id()
#endif

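As the source shows, bpf_get_smp_processor_id() by itself does not guarantee a stable CPU ID: it simply calls smp_processor_id(), so stability depends on the calling context meeting one of the conditions listed in the comment.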
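One of the conditions in the kernel comment quoted above is "the task is CPU affine". The same idea can be tried from userspace. The following is only a minimal sketch of that analogy, assuming Linux with glibc; it is not how eBPF programs obtain a stable ID. pin_to_current_cpu() is a hypothetical helper name introduced here, while sched_getcpu() and sched_setaffinity() are the real glibc calls.

```c
#define _GNU_SOURCE             /* needed for sched_getcpu() and the CPU_* macros */
#include <sched.h>

/* Hypothetical helper: pin the calling thread to the CPU it is currently
 * running on. Returns that CPU id on success, or -1 on failure. */
int pin_to_current_cpu(void)
{
    /* Unstable read: the scheduler may migrate us right after this call. */
    int cpu = sched_getcpu();
    if (cpu < 0 || cpu >= CPU_SETSIZE)
        return -1;

    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(cpu, &set);

    /* pid 0 means "the calling thread". Once this succeeds the thread is
     * CPU affine, so later reads of the CPU id stay the same. */
    if (sched_setaffinity(0, sizeof(set), &set) != 0)
        return -1;

    return cpu;
}
```

After a successful call, sched_getcpu() should keep returning the value that pin_to_current_cpu() returned, because the thread can no longer run anywhere else.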

NUMA: bpf_get_numa_node_id

Returns the ID of the current NUMA node.

The function's documentation:

       long bpf_get_numa_node_id(void)

              Description
                     Return the id of the current NUMA node. The primary
                     use case for this helper is the selection of
                     sockets for the local NUMA node, when the program
                     is attached to sockets using the
                     SO_ATTACH_REUSEPORT_EBPF option (see also
                     socket(7)), but the helper is also available to
                     other eBPF program types, similarly to
                     bpf_get_smp_processor_id().

              Return The id of current NUMA node.

The function's source code:

// kernel/bpf/helpers.c

BPF_CALL_0(bpf_get_numa_node_id)
{
       return numa_node_id();
}

// include/linux/topology.h

#ifdef CONFIG_USE_PERCPU_NUMA_NODE_ID
DECLARE_PER_CPU(int, numa_node);

#ifndef numa_node_id
/* Returns the number of the current Node. */
static inline int numa_node_id(void)
{
       return raw_cpu_read(numa_node);
}
#endif

#else  /* !CONFIG_USE_PERCPU_NUMA_NODE_ID */

/* Returns the number of the current Node. */
#ifndef numa_node_id
static inline int numa_node_id(void)
{
       return cpu_to_node(raw_smp_processor_id());
}
#endif

#endif /* [!]CONFIG_USE_PERCPU_NUMA_NODE_ID */

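Running grep CONFIG_USE_PERCPU_NUMA_NODE_ID /boot/config-$(uname -r) shows whether the running kernel was built with the CONFIG_USE_PERCPU_NUMA_NODE_ID option.

If the option is enabled, numa_node_id() reads the per-CPU variable numa_node directly; otherwise it takes the current CPU ID and looks up the CPU-to-node mapping.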
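For comparison, here is roughly what a CPU-to-node lookup could look like if it had to be done in userspace. This is a sketch under assumptions, not the kernel's cpu_to_node(): it assumes the per-node CPU lists exposed at /sys/devices/system/node/nodeN/cpulist, in the kernel's usual list format such as "0-3,8-11". The function names cpu_in_cpulist() and cpu_to_node_sysfs() are hypothetical.

```c
#include <stdio.h>
#include <stdlib.h>

/* Return 1 if `cpu` appears in a kernel-style CPU list such as "0-3,8-11",
 * 0 if it does not, and -1 if a malformed entry is hit before a match. */
int cpu_in_cpulist(const char *list, int cpu)
{
    const char *p = list;

    while (*p != '\0' && *p != '\n') {
        char *end;
        long lo = strtol(p, &end, 10);
        if (end == p)
            return -1;          /* expected a number */
        long hi = lo;           /* a single CPU is the range lo-lo */
        p = end;

        if (*p == '-') {        /* a range "lo-hi" */
            p++;
            hi = strtol(p, &end, 10);
            if (end == p)
                return -1;
            p = end;
        }

        if (cpu >= lo && cpu <= hi)
            return 1;

        if (*p == ',')
            p++;
        else if (*p != '\0' && *p != '\n')
            return -1;          /* unexpected separator */
    }
    return 0;
}

/* Scan node0..node(max_nodes-1) in sysfs and return the first node whose
 * cpulist contains `cpu`, or -1 if none does (for example when the NUMA
 * sysfs files are not present). */
int cpu_to_node_sysfs(int cpu, int max_nodes)
{
    for (int node = 0; node < max_nodes; node++) {
        char path[64], buf[4096];
        snprintf(path, sizeof(path),
                 "/sys/devices/system/node/node%d/cpulist", node);

        FILE *f = fopen(path, "r");
        if (!f)
            continue;           /* node ids may be sparse */
        char *line = fgets(buf, sizeof(buf), f);
        fclose(f);

        if (line && cpu_in_cpulist(line, cpu) == 1)
            return node;
    }
    return -1;
}
```

For example, cpu_in_cpulist("0-3,8-11", 9) returns 1 and cpu_in_cpulist("0-3,8-11", 5) returns 0. Inside an eBPF program none of this is needed, since bpf_get_numa_node_id() returns the node directly.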

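Summary

Before writing this post, I did not know that the bpf_get_numa_node_id() helper existed. I assumed I would have to take the result of bpf_get_smp_processor_id() and look up the CPU-to-NUMA-node mapping in the userspace application. With bpf_get_numa_node_id(), there is no need to parse that pile of /proc/xxx files in userspace.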
