A thought suddenly struck me: why not follow pwru's example and trace every function that takes a struct sock *sk parameter?

Since the idea grew out of pwru, I reused pwru's code directly and quickly put together a tool called socketrace.

$ ./socketrace --output-limit-lines 10
2024/01/28 14:30:11 Attaching kprobes (via kprobe-multi) to 1090 functions
1090 / 1090 [----------------------------------------------------------------------------------------------------------------------------------] 100.00% ? p/s
2024/01/28 14:30:11 Attached kprobes (via kprobe-multi) to 1090 functions
2024/01/28 14:30:11 Press Ctrl+C to stop
CPU PROCESS                          FUNC
5   926(sshd)                        aa_sk_perm                          192.168.64.2:22 -> 192.168.64.1:55856 netns=4026531840 family=AF_INET6 protocol=IPPROTO_TCP
5   926(sshd)                        inet_send_prepare                   192.168.64.2:22 -> 192.168.64.1:55856 netns=4026531840 family=AF_INET6 protocol=IPPROTO_TCP
5   926(sshd)                        tcp_sendmsg                         192.168.64.2:22 -> 192.168.64.1:55856 netns=4026531840 family=AF_INET6 protocol=IPPROTO_TCP
5   926(sshd)                        lock_sock_nested                    192.168.64.2:22 -> 192.168.64.1:55856 netns=4026531840 family=AF_INET6 protocol=IPPROTO_TCP
5   926(sshd)                        tcp_sendmsg_locked                  192.168.64.2:22 -> 192.168.64.1:55856 netns=4026531840 family=AF_INET6 protocol=IPPROTO_TCP
5   926(sshd)                        tcp_rate_check_app_limited          192.168.64.2:22 -> 192.168.64.1:55856 netns=4026531840 family=AF_INET6 protocol=IPPROTO_TCP
5   926(sshd)                        tcp_send_mss                        192.168.64.2:22 -> 192.168.64.1:55856 netns=4026531840 family=AF_INET6 protocol=IPPROTO_TCP
5   926(sshd)                        tcp_current_mss                     192.168.64.2:22 -> 192.168.64.1:55856 netns=4026531840 family=AF_INET6 protocol=IPPROTO_TCP
5   926(sshd)                        tcp_established_options             192.168.64.2:22 -> 192.168.64.1:55856 netns=4026531840 family=AF_INET6 protocol=IPPROTO_TCP
5   926(sshd)                        tcp_stream_alloc_skb                192.168.64.2:22 -> 192.168.64.1:55856 netns=4026531840 family=AF_INET6 protocol=IPPROTO_TCP
2024/01/28 14:30:11 Detaching kprobes (via kprobe-multi) from 5 bpf links
5 / 5 [---------------------------------------------------------------------------------------------------------------------------------------] 100.00% 13 p/s
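
The selection step behind "trace every function that takes a struct sock *sk" can be sketched roughly as follows. This is only an illustration over a hand-written table: the `funcProto` type and both helper functions are hypothetical names, and a real tool would presumably read the prototypes from kernel BTF rather than from a Go slice.

```go
package main

import (
	"fmt"
	"sort"
)

// funcProto is a hypothetical, simplified stand-in for a kernel function
// prototype. A real tool would read prototypes from kernel BTF instead.
type funcProto struct {
	Name   string
	Params []string // parameter types in declaration order
}

// sockArgPos returns the 1-based position of the first `struct sock *`
// parameter, or 0 when the function takes none.
func sockArgPos(f funcProto) int {
	for i, p := range f.Params {
		if p == "struct sock *" {
			return i + 1
		}
	}
	return 0
}

// selectSockFuncs keeps only the functions that take a `struct sock *`
// and records which argument carries it.
func selectSockFuncs(protos []funcProto) map[string]int {
	targets := make(map[string]int)
	for _, f := range protos {
		if pos := sockArgPos(f); pos > 0 {
			targets[f.Name] = pos
		}
	}
	return targets
}

func main() {
	protos := []funcProto{
		{"tcp_sendmsg", []string{"struct sock *", "struct msghdr *", "size_t"}},
		{"sock_init_data", []string{"struct socket *", "struct sock *"}},
		{"kfree_skb", []string{"struct sk_buff *"}},
	}
	targets := selectSockFuncs(protos)

	names := make([]string, 0, len(targets))
	for name := range targets {
		names = append(names, name)
	}
	sort.Strings(names) // map iteration order is random; sort for stable output
	for _, name := range names {
		fmt.Printf("%-16s sk is argument %d\n", name, targets[name])
	}
}
```

Recording the argument position matters because the probe has to know which register holds the sk pointer; sock_init_data, for instance, takes it as its second argument.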

Because struct sock *sk is not limited to TCP/UDP sockets, socketrace can also trace sockets of other protocol families, such as AF_NETLINK and AF_UNIX.

For example:

$  ./socketrace | grep AF_NETLINK
2024/01/28 15:11:38 Attaching kprobes (via kprobe-multi) to 1090 functions
1090 / 1090 [----------------------------------------------------------------------------------------------------------------------------------] 100.00% ? p/s
2024/01/28 15:11:38 Attached kprobes (via kprobe-multi) to 1090 functions
2024/01/28 15:11:38 Press Ctrl+C to stop
1   0(swapper/1)                     nlmsg_notify                        0.0.0.0:0 -> 0.0.0.0:0 netns=4026531840 family=AF_NETLINK protocol=NETLINK_ROUTE
1   0(swapper/1)                     netlink_broadcast_filtered          0.0.0.0:0 -> 0.0.0.0:0 netns=4026531840 family=AF_NETLINK protocol=NETLINK_ROUTE
1   0(swapper/1)                     sk_filter_trim_cap                  0.0.0.0:0 -> 0.0.0.0:0 netns=4026531840 family=AF_NETLINK protocol=NETLINK_ROUTE
1   0(swapper/1)                     security_sock_rcv_skb               0.0.0.0:0 -> 0.0.0.0:0 netns=4026531840 family=AF_NETLINK protocol=NETLINK_ROUTE
1   0(swapper/1)                     apparmor_socket_sock_rcv_skb        0.0.0.0:0 -> 0.0.0.0:0 netns=4026531840 family=AF_NETLINK protocol=NETLINK_ROUTE

$ ./socketrace | grep AF_UNIX
2024/01/28 15:10:49 Attaching kprobes (via kprobe-multi) to 1090 functions
1090 / 1090 [----------------------------------------------------------------------------------------------------------------------------------] 100.00% ? p/s
2024/01/28 15:10:49 Attached kprobes (via kprobe-multi) to 1090 functions
2024/01/28 15:10:49 Press Ctrl+C to stop
4   1313(systemd-networkd)           mem_cgroup_sk_alloc                 0.0.0.0:0 -> 0.0.0.0:0 netns=4026531840 family=AF_UNIX protocol=0
4   1313(systemd-networkd)           sock_init_data                      0.0.0.0:0 -> 0.0.0.0:0 netns=4026531840 family=AF_UNIX protocol=0
4   1313(systemd-networkd)           sock_init_data_uid                  0.0.0.0:0 -> 0.0.0.0:0 netns=4026531840 family=AF_UNIX protocol=0
4   1313(systemd-networkd)           aa_sk_perm                          0.0.0.0:0 -> 0.0.0.0:0 netns=4026531840 family=AF_UNIX protocol=0
4   1313(systemd-networkd)           sk_getsockopt                       0.0.0.0:0 -> 0.0.0.0:0 netns=4026531840 family=AF_UNIX protocol=0
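
The family= and protocol= columns above suggest that the protocol number is interpreted relative to the socket family. A minimal sketch of such a rendering, assuming the standard Linux numeric values, is shown below; the function names are hypothetical and this is not taken from socketrace's source.

```go
package main

import "fmt"

// Linux address-family numbers (see <linux/socket.h>).
const (
	afUnix    = 1
	afInet    = 2
	afInet6   = 10
	afNetlink = 16
)

// familyName maps an address-family number to its symbolic name and
// falls back to the raw number for families not listed here.
func familyName(family uint16) string {
	switch family {
	case afUnix:
		return "AF_UNIX"
	case afInet:
		return "AF_INET"
	case afInet6:
		return "AF_INET6"
	case afNetlink:
		return "AF_NETLINK"
	default:
		return fmt.Sprintf("%d", family)
	}
}

// protocolName interprets the protocol number in the context of the
// family: 0 means NETLINK_ROUTE for AF_NETLINK but has no symbolic name
// for AF_UNIX, so it is printed as a plain number there.
func protocolName(family, protocol uint16) string {
	switch family {
	case afInet, afInet6:
		switch protocol {
		case 1:
			return "IPPROTO_ICMP"
		case 6:
			return "IPPROTO_TCP"
		case 17:
			return "IPPROTO_UDP"
		}
	case afNetlink:
		if protocol == 0 {
			return "NETLINK_ROUTE"
		}
	}
	return fmt.Sprintf("%d", protocol)
}

// describe formats both fields the way they appear in the trace output.
func describe(family, protocol uint16) string {
	return fmt.Sprintf("family=%s protocol=%s",
		familyName(family), protocolName(family, protocol))
}

func main() {
	fmt.Println(describe(afInet6, 6))
	fmt.Println(describe(afNetlink, 0))
	fmt.Println(describe(afUnix, 0))
}
```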

The information that can be extracted from struct sock *sk is also fairly rich; see socketrace's usage:

$ ./socketrace -h
Usage of ./socketrace:
      --filter-addr string        filter IPv4 address
      --filter-funcs string       filter functions with Go regexp, empty means all
      --filter-mark uint          filter sock mark
      --filter-netns string       filter network namespace
      --filter-pid uint           filter process id
      --filter-port uint16        filter TCP/UDP port
      --filter-protocol string    filter protocol, tcp, udp, icmp, empty means all
      --kprobe-way string         specify kprobe way, kprobe or kprobe-multi, empty means auto detect
      --output-file string        output file, empty means stdout
      --output-limit-lines uint   limit output lines, 0 means no limit
      --output-sock-common        output common socket information
      --output-sock-info          output sock information
      --output-socket-info        output socket information
      --output-stack              output stack information

As shown, socketrace supports a number of filter options, along with output options that let you choose which information to print.
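
According to the help text, --filter-funcs takes a Go regexp and an empty value means all functions. A minimal sketch of that behaviour using the standard regexp package follows; `filterFuncs` is a hypothetical helper, and the unanchored `MatchString` semantics are an assumption rather than something confirmed from socketrace's source.

```go
package main

import (
	"fmt"
	"regexp"
)

// filterFuncs returns the names matching pattern. An empty pattern keeps
// every name, mirroring "empty means all" in the help text.
func filterFuncs(names []string, pattern string) ([]string, error) {
	if pattern == "" {
		return names, nil
	}
	re, err := regexp.Compile(pattern)
	if err != nil {
		return nil, fmt.Errorf("invalid function filter %q: %w", pattern, err)
	}
	var matched []string
	for _, name := range names {
		// Assumption: unanchored match, so "tcp" alone would match
		// anywhere in the name; use ^ and $ to anchor explicitly.
		if re.MatchString(name) {
			matched = append(matched, name)
		}
	}
	return matched, nil
}

func main() {
	names := []string{"aa_sk_perm", "tcp_sendmsg", "lock_sock_nested", "tcp_current_mss"}
	matched, err := filterFuncs(names, "^tcp_")
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println(matched)
}
```

Using only flags listed in the help text, an invocation such as `./socketrace --filter-funcs '^tcp_' --output-limit-lines 10` would presumably limit tracing to functions whose names start with tcp_.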

In fact, socketrace does not have much source code:

$ cloc *.go ./**/*.go ./bpf/sock_trace.c
      15 text files.
      11 unique files.
       4 files ignored.

github.com/AlDanial/cloc v 1.98  T=0.02 s (637.9 files/s, 110060.5 lines/s)
-------------------------------------------------------------------------------
Language                     files          blank        comment           code
-------------------------------------------------------------------------------
Go                              10            194             99           1321
C                                1             51              2            231
-------------------------------------------------------------------------------
SUM:                            11            245            101           1552
-------------------------------------------------------------------------------

Here, bpf/sock_trace.c is the eBPF code and the *.go files are the Go code.

I recommend reading the socketrace source code directly; quite a few parts of it still need further work.

Source code: socketrace