Here, "trace tracepoint" means tracing a tracepoint bpf program, not tracing a tracepoint event.

Demo of tracing a tracepoint program

The demo output looks like this:

# ./fentry_fexit-tracepoint
2023/07/22 14:31:02 Attached fentry(netlink_extack)
2023/07/22 14:31:03 Attached fexit(netlink_extack)
2023/07/22 14:31:03 Attached tracepoint(netlink_extack)
2023/07/22 14:31:03 Keep setting tc filter...
2023/07/22 14:31:04 Listening events...
2023/07/22 14:31:04 Errmsg: Parent Qdisc doesn't exists (fentry)
2023/07/22 14:31:04 Errmsg: Parent Qdisc doesn't exists (tracepoint)
2023/07/22 14:31:04 Errmsg: Parent Qdisc doesn't exists (fexit: 72)
2023/07/22 14:31:05 set tc filter: filter replace: netlink receive: invalid argument, offset: 0, message: "Parent Qdisc doesn't exists"

The tracing mechanisms used here are fentry and fexit.

The fentry/fexit bpf code used in the demo is as follows:

SEC("fentry/netlink_extack")
int BPF_PROG(fentry_netlink_extack, struct netlink_extack_error_ctx *nl_ctx)
{
    bpf_printk("tcpconn, fentry_netlink_extack\n");

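    /*
     * BPF_CORE_READ() is not meant for user-defined structs such as
     * netlink_extack_error_ctx, so read the field with bpf_probe_read().
     */

    __u32 msg;
    bpf_probe_read(&msg, sizeof(msg), &nl_ctx->msg);
    /* The low 16 bits of msg are the string's offset from the start of nl_ctx. */
    char *c = (void *)(__u64) ((void *) nl_ctx + (__u16) msg);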

    __output_msg(ctx, c, PROBE_TYPE_FENTRY, 0);

    return 0;
}

SEC("fexit/netlink_extack")
int BPF_PROG(fexit_netlink_extack, struct netlink_extack_error_ctx *nl_ctx, int retval)
{
    bpf_printk("tcpconn, fexit_netlink_extack\n");

    __u32 msg;
    bpf_probe_read(&msg, sizeof(msg), &nl_ctx->msg);
    char *c = (void *)(__u64) ((void *) nl_ctx + (__u16) msg);

    __output_msg(ctx, c, PROBE_TYPE_FEXIT, retval);

    return 0;
}

The tracepoint bpf code used in the demo is as follows:

SEC("tp/netlink/netlink_extack")
int tp__netlink_extack(struct netlink_extack_error_ctx *ctx)
{
    char *msg = (void *)(__u64) ((void *) ctx + (__u16) ctx->msg);

    __output_msg(ctx, msg, PROBE_TYPE_DEFAULT, 0);

    return 0;
}

This demo attaches fentry/fexit to the tp__netlink_extack() function of the tracepoint bpf program.

The user-space Go code needs to do the following:

    spec, err := loadFf()
    if err != nil {
        log.Printf("Failed to load bpf obj: %v", err)
        return
    }

    tpFentry := spec.Programs["fentry_netlink_extack"]
    tpFentry.AttachTarget = obj.TpNetlinkExtack
    tpFentry.AttachTo = "tp__netlink_extack"
    tpExit := spec.Programs["fexit_netlink_extack"]
    tpExit.AttachTarget = obj.TpNetlinkExtack
    tpExit.AttachTo = "tp__netlink_extack"
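  1. First, create the tracepoint program.
  2. Second, set AttachTarget and AttachTo on the fentry and fexit programs.
  3. Here, AttachTarget is the tracepoint program, and AttachTo is the name of the function inside the tracepoint program.
  4. In other words, the fentry and fexit programs are attached to the tp__netlink_extack function of the tracepoint program.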

P.S. Demo source code: GitHub Asphaltt/learn-by-example/ebpf/fentry_fexit-tracepoint

Function arguments of fentry/fexit

Compare the fentry/fexit function definitions above with the function definition of the tracepoint program:

SEC("fentry/netlink_extack")
int BPF_PROG(fentry_netlink_extack, struct netlink_extack_error_ctx *nl_ctx);

SEC("fexit/netlink_extack")
int BPF_PROG(fexit_netlink_extack, struct netlink_extack_error_ctx *nl_ctx, int retval);

SEC("tp/netlink/netlink_extack")
int tp__netlink_extack(struct netlink_extack_error_ctx *ctx);

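Because the tracepoint program takes exactly one argument, struct netlink_extack_error_ctx *ctx, the fentry/fexit functions take one corresponding argument, struct netlink_extack_error_ctx *nl_ctx (the fexit function additionally receives the return value as retval). The argument just cannot be named ctx.

struct netlink_extack_error_ctx is defined as follows: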

struct netlink_extack_error_ctx {
    unsigned long unused;

    /*
     * bpf does not support tracepoint __data_loc directly.
     *
     * Actually, this field is a 32 bit integer whose value encodes
     * information on where to find the actual data. The first 2 bytes is
     * the size of the data. The last 2 bytes is the offset from the start
     * of the tracepoint struct where the data begins.
     * -- https://github.com/iovisor/bpftrace/pull/1542
     */
    __u32 msg; // __data_loc char[] msg;
};
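
To make the comment in the struct concrete, here is a minimal user-space sketch of decoding a __data_loc value. It is not part of the demo: struct fake_extack_record, make_record() and the data_loc_* helpers are hypothetical names introduced for illustration, and the layout assumes the low 16 bits hold the offset and the high 16 bits hold the length, which matches the (__u16) msg cast in the demo code.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical user-space mirror of the demo's struct netlink_extack_error_ctx. */
struct fake_extack_record {
    unsigned long unused;   /* stands in for the leading field of the record */
    uint32_t msg;           /* plays the role of: __data_loc char[] msg */
    char payload[64];       /* dynamic data stored after the fixed fields */
};

/* Low 16 bits: offset of the data from the start of the record. */
static uint16_t data_loc_offset(uint32_t loc)
{
    return (uint16_t)(loc & 0xffff);
}

/* High 16 bits: length of the data in bytes (assumed layout). */
static uint16_t data_loc_length(uint32_t loc)
{
    return (uint16_t)(loc >> 16);
}

/* Same pointer arithmetic as the demo: record base + offset. */
static const char *data_loc_str(const void *record, uint32_t loc)
{
    return (const char *)record + data_loc_offset(loc);
}

/* Build a fake record carrying `text`, truncated to fit (illustration only). */
static struct fake_extack_record make_record(const char *text)
{
    struct fake_extack_record rec;
    size_t len = strlen(text) + 1;              /* include the trailing NUL */

    memset(&rec, 0, sizeof(rec));
    if (len > sizeof(rec.payload))
        len = sizeof(rec.payload);
    memcpy(rec.payload, text, len - 1);
    rec.payload[len - 1] = '\0';

    /* Encode length in the high half and offset in the low half. */
    rec.msg = ((uint32_t)len << 16)
            | (uint32_t)offsetof(struct fake_extack_record, payload);
    return rec;
}
```

Calling data_loc_str(&rec, rec.msg) on a record built by make_record() yields a pointer to the stored string, which is the same step the bpf code performs with nl_ctx and msg.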

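The name ctx cannot be reused in fentry/fexit function arguments

This is because the BPF_PROG() macro already declares a ctx parameter by default, so the name ctx cannot be used for another argument.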

#define BPF_PROG(name, args...)                                     \
name(unsigned long long *ctx);                                      \
static __always_inline typeof(name(0))                              \
____##name(unsigned long long *ctx, ##args);                        \
typeof(name(0)) name(unsigned long long *ctx)                       \
{                                                                   \
    _Pragma("GCC diagnostic push")                                  \
    _Pragma("GCC diagnostic ignored \"-Wint-conversion\"")          \
    return ____##name(___bpf_ctx_cast(args));                       \
    _Pragma("GCC diagnostic pop")                                   \
}                                                                   \
static __always_inline typeof(name(0))                              \
____##name(unsigned long long *ctx, ##args)
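
The effect of the macro can be sketched in plain user-space C. MINI_PROG below is a hypothetical one-argument simplification of BPF_PROG, not part of libbpf, and struct fake_ctx and demo_prog are illustrative names. It shows that the generated inner function already has a parameter called ctx, so the typed argument needs a different name; naming it ctx as well would declare two parameters with the same name, which a C compiler rejects.

```c
#include <stdint.h>

/*
 * MINI_PROG: hypothetical, simplified stand-in for BPF_PROG that supports
 * exactly one typed argument. The outer function takes the raw argument
 * array `ctx`; the inner function receives `ctx` plus the typed argument.
 */
#define MINI_PROG(name, type, arg)                                      \
    int name(unsigned long long *ctx);                                  \
    static inline int ____##name(unsigned long long *ctx, type arg);    \
    int name(unsigned long long *ctx)                                   \
    {                                                                   \
        return ____##name(ctx, (type)(uintptr_t)ctx[0]);                \
    }                                                                   \
    static inline int ____##name(unsigned long long *ctx, type arg)

/* Illustrative stand-in for the traced program's context struct. */
struct fake_ctx {
    int id;
};

/* The typed argument is named `rec` because `ctx` is already taken. */
MINI_PROG(demo_prog, struct fake_ctx *, rec)
{
    (void)ctx;          /* raw argument array injected by the macro */
    return rec->id;     /* typed view of ctx[0] */
}
```

Passing an array whose first slot holds the address of a struct fake_ctx to demo_prog() returns that struct's id, mirroring how the fentry/fexit programs receive nl_ctx while ctx stays available for helpers such as __output_msg().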

Summary

Tracing a tracepoint program works much like tracing a kprobe program.

However, the tracepoint chosen here is netlink_extack, which is special in that its tracepoint argument is:

# bpftrace -lv 'tracepoint:netlink:*'
tracepoint:netlink:netlink_extack
    __data_loc char[] msg

It would normally be const char *msg, but the tracepoint uses __data_loc char[] msg; with the __data_loc annotation, it is no longer equivalent to char *msg.

So, when __data_loc is involved, why does the bpf code have to use (void *)(__u64) ((void *) nl_ctx + (__u16) msg) to get the value of msg?

See the next article: eBPF Talk: tracepoint __data_loc