Finding the Real Origin of Block I/O (submit_bio) Using eBPF


As system administrators and developers, we’ve all been there – stuck in a debugging nightmare, trying to figure out where those pesky block I/O requests are coming from. It’s like searching for a needle in a haystack, except the haystack is on fire, and the needle is a sneaky little devil hiding behind a million lines of code. But fear not, dear reader, for we have a solution that will make your life easier – enter eBPF, the superhero of debugging tools!

What is eBPF?

eBPF (extended Berkeley Packet Filter) is a revolutionary technology that allows you to attach custom programs to various points in the Linux kernel, giving you unparalleled visibility into system internals. With eBPF, you can write custom programs in a safe, sandboxed environment, and attach them to specific kernel events, like system calls, network packets, or – you guessed it – block I/O requests!

Why Use eBPF for Block I/O Tracing?

Traditional tools each cover only part of the picture: strace shows system calls but not the block I/O they cause, and blktrace shows block-layer events but offers limited ways to tie them back to the code that produced them. eBPF, on the other hand, lets you hook the kernel's internal events directly and decide what to record at each one. With eBPF, you can:

  • Track the origin of block I/O requests, including the process, thread, and syscall that triggered them
  • Obtain detailed information about the I/O request, such as the block device, sector, and size
  • Correlate block I/O requests with other system events, like network packets or system calls
  • Write custom programs to analyze and filter block I/O requests in real-time

Getting Started with eBPF and Block I/O Tracing

Before we dive into the nitty-gritty, make sure you have the following installed:

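  • A Linux kernel version 4.18 or later; a newer kernel with BTF support (check that `/sys/kernel/btf/vmlinux` exists) makes the examples much easier to build
  • The libbpf development headers and `bpftool`; the examples use libbpf-style `SEC()` annotations rather than the BPF Compiler Collection (bcc) Python front end
  • Clang/LLVM to compile your eBPF programs, since the commands below use clang's `bpf` target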

Writing Your First eBPF Program

Let’s create a simple eBPF program that traces block I/O requests. Create a new file called `bio_trace.c` with the following contents:


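#include "vmlinux.h"            /* kernel type definitions generated from BTF */
#include <bpf/bpf_helpers.h>    /* SEC(), bpf_printk(), helper declarations */

#define TASK_COMM_LEN 16        /* defined after vmlinux.h, which carries types, not macros */

char LICENSE[] SEC("license") = "GPL";  /* bpf_printk needs a GPL-compatible license */

SEC("tracepoint/block/block_rq_issue")
int trace_block_rq_issue(struct trace_event_raw_block_rq *ctx)
{
    char comm[TASK_COMM_LEN];
    __u32 pid = bpf_get_current_pid_tgid() >> 32;  /* upper 32 bits hold the process ID */
    __u32 dev = ctx->dev;                          /* kernel dev_t: major << 20 | minor */

    bpf_get_current_comm(&comm, sizeof(comm));

    /* More than three format arguments needs kernel 5.16+ and a recent libbpf */
    bpf_printk("Block I/O request from %s (%d): dev %d,%d, sector %llu, size %u",
               comm, pid, dev >> 20, dev & 0xfffff, ctx->sector, ctx->bytes);

    return 0;
}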

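This program attaches to the `block_rq_issue` tracepoint, which fires whenever a block I/O request is issued to the device driver. The program reports the name and PID of the current task together with the block device, sector, and size of the request, and writes them to the kernel trace buffer using `bpf_printk`. Note that the current task at issue time is not always the task that originated the I/O: buffered writes, for example, are usually issued later by a kernel writeback worker.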
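The `block_rq_issue` tracepoint reports the device as a single kernel-internal device number and the position in 512-byte sectors. As a rough sketch of how those raw values map to the "dev major,minor" and sector figures in the output, the userspace helpers below decode them. The helper names are made up for this illustration, and the 12/20-bit split is the kernel's internal layout, which differs from the `dev_t` encoding that userspace sees.

```c
#include <stdint.h>

/* Kernel-internal device number: upper 12 bits major, lower 20 bits minor. */
#define KDEV_MINOR_BITS 20u
#define KDEV_MINOR_MASK ((1u << KDEV_MINOR_BITS) - 1u)

/* Hypothetical helper names, not part of libbpf or the kernel headers. */
static inline uint32_t kdev_major(uint32_t dev) { return dev >> KDEV_MINOR_BITS; }
static inline uint32_t kdev_minor(uint32_t dev) { return dev & KDEV_MINOR_MASK; }

/* Block-layer sectors are 512 bytes, whatever the device's physical
 * sector size is, so the byte offset is simply sector * 512. */
static inline uint64_t sector_to_byte_offset(uint64_t sector)
{
    return sector << 9;
}
```

For example, a raw device value of `(8 << 20) | 16` decodes to major 8, minor 16, and sector 1024 starts at byte offset 524288.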

Compiling and Loading the eBPF Program

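If your kernel exposes BTF, first generate `vmlinux.h`, which provides the kernel type definitions, and then compile the program with clang. The `-g` flag emits the BTF type information that libbpf-based loaders expect:

bpftool btf dump file /sys/kernel/btf/vmlinux format c > vmlinux.h
clang -g -O2 -Wall -target bpf -c bio_trace.c -o bio_trace.o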

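Load the program into the kernel and pin it under the BPF filesystem using the following command:

bpftool prog load bio_trace.o /sys/fs/bpf/bio_trace autoattach

Attaching the eBPF Program to the Kernel

The `autoattach` keyword, available in recent bpftool releases, attaches the program to the tracepoint named in its `SEC()` annotation and pins the resulting link at the given path. Without it, `bpftool prog load` only loads and pins the program, and `bpftool prog attach` does not handle tracepoints, so on older bpftool versions you need a small libbpf-based loader to do the attachment.

You can confirm that the `block_rq_issue` tracepoint exists on your system by listing `/sys/kernel/debug/tracing/events/block/`, and check the attachment afterwards with `bpftool link show`.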

Running the eBPF Program and Analyzing the Results

Now that the program is loaded and attached, you can run it by triggering some block I/O requests on your system. You can do this by running a disk-intensive workload, like `dd` or `fio`, or by running a simple script that performs some disk I/O.

The program writes one line per block I/O request to the kernel trace buffer, not to the kernel log, so `dmesg` will not show it. Read the output with `cat /sys/kernel/debug/tracing/trace_pipe` or `bpftool prog tracelog`.

Example Output


[  123.456789] Block I/O request from dd (1234): dev 8,0, sector 1024, size 4096
[  123.456795] Block I/O request from dd (1234): dev 8,0, sector 2048, size 8192
[  123.456802] Block I/O request from bash (1235): dev 8,0, sector 4096, size 512

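Each line reports the process name, PID, block device, sector, and size of one request; the prefix in front of the message depends on how you read the trace buffer. Keep in mind that the process shown is the task that was running when the request was issued, which for buffered writes is often a kernel worker rather than the application that wrote the data. With that caveat, you can use this information to narrow down where the I/O comes from and tune your system accordingly.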
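Once the lines start scrolling, it helps to total them per process. The following is a minimal userspace sketch, assuming the message format shown in the example output; `parse_bio_line`, `account`, and `struct io_total` are hypothetical names introduced here, and the parser assumes process names without spaces.

```c
#include <stdio.h>
#include <string.h>

#define MAX_PROCS 64

struct io_total {
    char comm[16];              /* process name */
    int pid;
    unsigned long long bytes;   /* sum of request sizes */
    unsigned long requests;     /* number of requests seen */
};

/* Parse the message part of one output line, ignoring whatever prefix
 * precedes it. Returns 1 on success, 0 if the line does not match. */
static int parse_bio_line(const char *line, char comm[16], int *pid,
                          unsigned long long *sector, unsigned *size)
{
    unsigned maj, min;
    const char *msg = strstr(line, "Block I/O request from ");

    if (!msg)
        return 0;
    return sscanf(msg,
                  "Block I/O request from %15s (%d): dev %u,%u, sector %llu, size %u",
                  comm, pid, &maj, &min, sector, size) == 6;
}

/* Add one request to the per-process totals and return the new number
 * of entries. Requests beyond MAX_PROCS distinct PIDs are dropped. */
static int account(struct io_total *t, int n, const char *comm, int pid,
                   unsigned size)
{
    for (int i = 0; i < n; i++) {
        if (t[i].pid == pid) {
            t[i].bytes += size;
            t[i].requests++;
            return n;
        }
    }
    if (n < MAX_PROCS) {
        snprintf(t[n].comm, sizeof(t[n].comm), "%s", comm);
        t[n].pid = pid;
        t[n].bytes = size;
        t[n].requests = 1;
        n++;
    }
    return n;
}
```

A caller would read lines from the trace output, pass each one to `parse_bio_line`, and feed the successful parses into `account`. For heavier workloads, aggregating inside the eBPF program with a map avoids the cost of printing every request.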

Advanced eBPF Techniques for Block I/O Tracing

Now that you’ve got the basics down, let’s explore some advanced eBPF techniques for block I/O tracing.

Filtering Block I/O Requests

You can use eBPF to filter block I/O requests based on various criteria, like block device, sector range, or process ID. For example, you can modify the `bio_trace` program to only print requests from a specific block device:


/* Kernel dev_t is major << 20 | minor; 8,0 is typically /dev/sda */
#define TARGET_DEV ((8 << 20) | 0)

SEC("tracepoint/block/block_rq_issue")
int trace_block_rq_issue(struct trace_event_raw_block_rq *ctx)
{
    /* Skip requests for any other device */
    if (ctx->dev != TARGET_DEV)
        return 0;

    /* Print request details */
    bpf_printk("sector %llu, size %u", ctx->sector, ctx->bytes);
    return 0;
}
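Hard-coding major 8, minor 0 only works if you already know the numbers. As a sketch of one way to derive the value to compare against, the userspace helper below looks up a device node and builds the kernel-style device number (major << 20 | minor) that the block tracepoints report. The function names are hypothetical; `stat()`, `major()`, and `minor()` are the standard Linux interfaces.

```c
#include <stdint.h>
#include <sys/stat.h>
#include <sys/sysmacros.h>

/* Compose the kernel-internal device number from major and minor.
 * Hypothetical helper name, used only for this illustration. */
static uint32_t kdev_from(uint32_t maj, uint32_t min)
{
    return (maj << 20) | (min & 0xfffffu);
}

/* Look up a device node such as /dev/sda and return its kernel-style
 * device number, or 0 if the path is missing or not a block device. */
static uint32_t kdev_for_path(const char *path)
{
    struct stat st;

    if (stat(path, &st) != 0 || !S_ISBLK(st.st_mode))
        return 0;
    return kdev_from(major(st.st_rdev), minor(st.st_rdev));
}
```

A loader could call `kdev_for_path("/dev/sda")` and pass the result to the eBPF program, for example through a map or a global variable, instead of compiling the constant in.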

Correlating Block I/O Requests with Other System Events

eBPF allows you to correlate block I/O requests with other system events, like system calls or network packets. For example, you can use eBPF to track the system calls that triggered the block I/O requests:


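struct read_info {
    __u64 fd;       /* file descriptor passed to read() */
    __u64 count;    /* number of bytes requested */
};

struct {
    __uint(type, BPF_MAP_TYPE_HASH);
    __uint(max_entries, 10240);
    __type(key, __u64);                 /* pid_tgid of the calling thread */
    __type(value, struct read_info);
} syscall_map SEC(".maps");

SEC("tracepoint/syscalls/sys_enter_read")
int trace_sys_enter_read(struct trace_event_raw_sys_enter *ctx)
{
    __u64 id = bpf_get_current_pid_tgid();
    struct read_info info = { .fd = ctx->args[0], .count = ctx->args[2] };

    /* Store the syscall metadata in a map, keyed by the calling thread */
    bpf_map_update_elem(&syscall_map, &id, &info, BPF_ANY);
    return 0;
}

SEC("tracepoint/block/block_rq_issue")
int trace_block_rq_issue(struct trace_event_raw_block_rq *ctx)
{
    __u64 id = bpf_get_current_pid_tgid();
    /* Look up the syscall metadata for the thread issuing the request */
    struct read_info *info = bpf_map_lookup_elem(&syscall_map, &id);

    if (info)
        bpf_printk("read(fd=%llu, count=%llu) issued sector %llu",
                   info->fd, info->count, ctx->sector);
    return 0;
}

This example uses two eBPF programs that share a map: one records the arguments of each `read()` call on entry, keyed by the calling thread, and the other looks up the current thread when a request is issued and prints the syscall arguments along with the block I/O request details. The lookup only succeeds when the request is issued in the context of the thread that is inside `read()`; I/O issued later by a kernel worker finds no entry. In a longer-running tool you would also delete the entry on `sys_exit_read` so that stale syscalls are not matched.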
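A natural key for a per-thread map like this is the 64-bit value returned by `bpf_get_current_pid_tgid()`, which packs two IDs into one number. As a small illustration of that layout, the helpers below split such a value in userspace, for instance when printing map contents; the function names are made up for this sketch.

```c
#include <stdint.h>

/* bpf_get_current_pid_tgid() returns tgid << 32 | pid, where the kernel's
 * "tgid" is what userspace calls the process ID and the kernel's "pid"
 * is the thread ID. Hypothetical helper names. */
static inline uint32_t tgid_of(uint64_t pid_tgid)
{
    return (uint32_t)(pid_tgid >> 32);
}

static inline uint32_t tid_of(uint64_t pid_tgid)
{
    return (uint32_t)pid_tgid;
}
```

Keying on the full 64-bit value keeps threads of the same process apart, which matters here because each thread can be inside its own `read()` call at the same time.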

Conclusion

Finding the real origin of block I/O (submit_bio) using eBPF is a powerful technique that can help you debug and optimize your system. With eBPF, you can write custom programs that tap into the kernel’s internal events, giving you unparalleled visibility into system internals. By following the instructions in this article, you can write your own eBPF programs to trace block I/O requests and identify the processes, threads, and system calls that trigger them.

Remember to experiment with different eBPF techniques, like filtering and correlating block I/O requests, to get the most out of this powerful debugging tool. Happy debugging!


Frequently Asked Questions

Get answers to the most asked questions about finding the real origin of block I/O (submit_bio) using eBPF!

What is the main challenge of tracing block I/O operations in Linux?

The main challenge of tracing block I/O operations in Linux is identifying the origin of the I/O operation, as the submit_bio function is called by various kernel modules and subsystems, making it difficult to pinpoint the exact source of the I/O request.

How does eBPF help in tracing block I/O operations?

eBPF (extended Berkeley Packet Filter) allows us to attach a probe to the submit_bio function and trace the call stack, which enables us to identify the origin of the I/O operation and follow the execution path to the root cause of the I/O request.

What is the role of the BPF program in tracing block I/O operations?

The BPF program acts as a tracing agent that runs in the kernel and collects information about the block I/O operations, including the call stack and arguments passed to the submit_bio function, allowing us to reconstruct the entire I/O operation flow.

Can eBPF be used to trace other types of I/O operations besides block I/O?

Yes, eBPF can be used to trace other types of I/O operations, such as network I/O, character device I/O, and pipe I/O, by attaching probes to the corresponding kernel functions and tracing the call stack to identify the origin of the I/O request.

What are the benefits of using eBPF for tracing block I/O operations?

The main benefits of using eBPF for tracing block I/O operations are low overhead, because events can be filtered and summarized inside the kernel, and access to detailed information about the I/O operation flow, which makes it well suited to debugging and optimizing I/O-intensive workloads.