CS503: Operating Systems, Fall 23 -- Lab 3

Lab 3: Dynamic system call filtering and tracepoints [120 pts]

Due date: 11/17/2023 (Friday), 11:59 PM

The primary aim of this project is to equip you with hands-on experience in implementing security and tracing mechanisms within an operating system.

Recommended: As you implement the different components of the lab, make sure you implement your own test cases. Try to create simple test cases so that you can predict the expected result and reason about your code when it fails. Think hard about the test cases you need to create to ensure that you do not forget to test any aspect of the new functions.

Reminder: You are required to commit and push your changes frequently (e.g., at least every couple of hours of work or more often) to the GitHub Classroom git repository assigned to you. This will be important to help you recover from bad changes, communicate with your TA, and enforce the course policies. Failure to comply with this requirement in any assignment will result in lost points or a zero grade in the assignment.

We will be focusing on two key implementations:

Secure Computing Mode (Seccomp): used to selectively block/restrict system calls on a per-application basis

Dynamic Tracepoints: used for kernel analysis

The major tasks involve implementing Seccomp (Secure Computing Mode) and Dynamic Tracepoints, similar to those in Linux, within the Xinu environment. Also, in this lab, you need to write a report. Let's start by understanding these concepts.

Readings

Read the following documents and follow the instructions in the setup guide:

1. XINU setup.

2. Linux manual page for Seccomp

3. Linux kernel documentation for tracing

4. Linux manual page for bpf-helpers

5. eBPF specification

6. Lecture slides about coordination

0. eBPF Background

Note: This section intends to provide only a very brief overview of eBPF. You are expected to read the references provided above to understand eBPF in more detail.

eBPF stands for Extended Berkeley Packet Filter. The eBPF virtual machine is the component that runs eBPF programs in the Linux kernel. It can be used to introduce custom programmatic behavior without having to change the kernel source code or load kernel modules. For example, using eBPF, an application can "push" a program into the kernel that specifies complex firewall rules. At a high level, eBPF allows safe, efficient, and programmable modifications of the kernel's behavior. There are two variants of BPF:

(classic) Berkeley Packet Filter (BPF): It started off as a system within UNIX-like operating systems for filtering network packets at the kernel level and sending them to user-level programs. BPF was designed to efficiently deliver packets from the OS network stack to a user process.

Extended BPF (eBPF): To expand its use beyond packet filtering, BPF evolved into eBPF, a more expressive version capable of running sandboxed programs in an interpreter built into the OS kernel. Nowadays, eBPF can be used for non-network-related tasks such as performance analysis, tracing, and security enforcement.

For this lab we will use eBPF exclusively. Linux eBPF programs have several important properties:

Efficiency: eBPF programs are loaded into the kernel and executed there, avoiding costly context switches and system calls.

Safety: eBPF programs are verified by the kernel to ensure that they don't contain functionality that could crash or otherwise harm the kernel. The eBPF language is more constrained than that of typical programs and designed to be easy to verify.

Versatility: eBPF programs can hook into various parts of the kernel via BPF helper functions - API-like interfaces between the kernel and the eBPF program.

Observability: With eBPF programs, you can monitor the inner workings of the kernel, applications, or even hardware events, leading to high visibility into system behavior.

1. Setup

Make sure you clone the git repository for this specific lab. Follow the link provided in the Ed post announcing the lab and the instructions in the XINU setup guide to set up your development environment. You are required to regularly commit and push your changes to your private git repository in GitHub Classroom.

Do NOT use the modified XINU source in which you implemented your previous lab. You should start this lab from the fresh repository provided for this specific lab.

Part I: Secure Computing Mode (Seccomp) (50 pts)

In Linux, Seccomp is a mechanism for limiting the system calls that a process can invoke. It has multiple modes, one of which is Seccomp-bpf. When Seccomp-bpf is enabled, before every system call, the Linux kernel will execute a small BPF program that uses the system call number and arguments as the input, and either allow or block that system call based on the result of that BPF program. For more information, please read the materials. In this part, you will implement a similar mechanism that filters Xinu's system calls using eBPF, which is a superset of BPF. To make it easier to follow, we break down the task into sub-tasks:

Task 1.1: Intercepting System Calls in Xinu

Unlike in Linux, system calls in Xinu are simply function calls. There is no single entry point for system calls, so to intercept them you must manually insert a checkpoint into each one. Please read open.c for reference and apply similar filtering to all system calls. As in Linux, we assign a number to each system call. Please refer to trace.h for more details.
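As an illustration of a manually inserted checkpoint, here is a minimal sketch. It is not the code in open.c: the names demo_open, demo_syscall_allowed, struct sc_info, and the DEMO_ constants are hypothetical stand-ins, and the real system call numbers are the ones defined in trace.h. The stand-in policy simply denies one configurable system call number, where your implementation would run the attached eBPF filters.

```c
#include <stdint.h>

/* Hypothetical stand-ins; use the definitions from your Xinu tree instead. */
#define DEMO_SYSERR   (-1)
#define DEMO_OK         1
#define DEMO_NR_OPEN    3   /* assumed number; real values live in trace.h */

/* Assumed record describing one system call invocation. */
struct sc_info {
    int      nr;        /* system call number */
    intptr_t args[3];   /* raw arguments */
};

/* Stand-in policy: deny exactly one system call number (-1 = deny nothing).
 * A real checkpoint would run the process's filter programs here. */
static int demo_denied_nr = -1;

static int demo_syscall_allowed(const struct sc_info *info)
{
    return info->nr != demo_denied_nr;
}

/* A system call with a checkpoint inserted at its entry. */
int demo_open(int descrp, const char *name, const char *mode)
{
    struct sc_info info = {
        DEMO_NR_OPEN,
        { (intptr_t)descrp, (intptr_t)name, (intptr_t)mode }
    };

    if (!demo_syscall_allowed(&info)) {
        return DEMO_SYSERR;   /* blocked before any real work happens */
    }

    /* ... the original body of the system call would run here ... */
    return DEMO_OK;
}
```

Placing the check first means a blocked call has no side effects. The same pattern repeats in every system call, so a small macro or shared helper may keep the checkpoints consistent.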

Task 1.2: Embedding eBPF Virtual Machine

An implementation of an eBPF virtual machine has been added to the Xinu source code. Please use this eBPF VM to run filter programs. Its API is defined in ubpf.h. An eBPF program accepts a memory buffer as input and returns an integer. Please refer to the example code in trace.c for loading and executing eBPF programs. We also added a tool (ebpf-tools/compile.sh) that compiles a C program to eBPF bytecode. Please do your own research on how to use it. The example program for it (ebpf-tools/example.c) contains all the features that may be used in this lab.
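To make the "memory buffer in, integer out" contract concrete, the following sketch shows what a filter program might look like when written in C. Everything specific here is an assumption: the input layout (struct sc_input), the return convention (FILTER_ALLOW and FILTER_DENY), the system call number DEMO_NR_WRITE, and the function name demo_filter. The entry-point conventions expected by ebpf-tools/compile.sh may differ, so check ebpf-tools/example.c before reusing any of this.

```c
#include <stdint.h>

/* Assumed input layout; the real layout is whatever your kernel passes in. */
struct sc_input {
    uint32_t nr;        /* system call number */
    uint32_t args[4];   /* raw arguments */
};

/* Assumed return convention for a filter program. */
#define FILTER_DENY   0
#define FILTER_ALLOW  1

#define DEMO_NR_WRITE 7   /* hypothetical number for write */

/* Deny write calls that request more than 64 bytes; allow everything else. */
uint64_t demo_filter(const void *mem)
{
    const struct sc_input *in = mem;

    if (in->nr == DEMO_NR_WRITE && in->args[2] > 64) {
        return FILTER_DENY;
    }
    return FILTER_ALLOW;
}
```

The function only reads from its input buffer and contains no loops, which is the style of program an eBPF toolchain handles most easily.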

Task 1.3: Filtering System Calls

For this step, you need to implement the system call filtering mechanism with eBPF. First implement a new system call attach_filter with the following prototype:

syscall attach_filter(const void *code, int len);

which attaches a filter, given as the BPF program stored in code (whose length is len), to the current process. From then on, every system call made by the process goes through filtering before it executes. A process can have more than one filter (at most five), and a system call executes only when all of the filters allow it. A process created by a Seccomp-enabled process inherits the filters, i.e., a system call in the new process must also pass its creator's filters. Once set up, filters cannot be detached. The BPF program should be released when the last process that uses it exits. To achieve that, you may need to implement a reference count. Test your implementation with filters that you write, and include in your report the code of each filter as well as what you observe when system calls are blocked.
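The reference-counting idea can be sketched as follows. This is one possible design, not a required one: struct filter, struct proc_filters, and the filters_* functions are hypothetical names, and malloc/free stand in for whatever allocator you use inside Xinu (for example getmem/freemem).

```c
#include <stdlib.h>
#include <string.h>

#define MAX_FILTERS 5

struct filter {
    void *code;     /* private copy of the bytecode */
    int   len;
    int   refcnt;   /* number of processes that reference this filter */
};

/* Per-process filter list; would live in or next to the process table entry. */
struct proc_filters {
    struct filter *f[MAX_FILTERS];
    int            n;
};

/* Attach a new filter to a process; fails when five are already attached. */
int filters_attach(struct proc_filters *p, const void *code, int len)
{
    struct filter *flt;

    if (code == NULL || len <= 0 || p->n >= MAX_FILTERS) {
        return -1;
    }
    flt = malloc(sizeof *flt);
    if (flt == NULL) {
        return -1;
    }
    flt->code = malloc((size_t)len);
    if (flt->code == NULL) {
        free(flt);
        return -1;
    }
    memcpy(flt->code, code, (size_t)len);
    flt->len = len;
    flt->refcnt = 1;
    p->f[p->n++] = flt;
    return 0;
}

/* Called at process creation: the child shares the creator's filters. */
void filters_inherit(struct proc_filters *child, const struct proc_filters *parent)
{
    int i;

    for (i = 0; i < parent->n; i++) {
        child->f[i] = parent->f[i];
        child->f[i]->refcnt++;
    }
    child->n = parent->n;
}

/* Called at process exit: free a filter once its last user is gone. */
void filters_release(struct proc_filters *p)
{
    int i;

    for (i = 0; i < p->n; i++) {
        if (--p->f[i]->refcnt == 0) {
            free(p->f[i]->code);
            free(p->f[i]);
        }
        p->f[i] = NULL;
    }
    p->n = 0;
}
```

Sharing the filter objects between creator and child keeps inheritance cheap, and the count guarantees the bytecode outlives every process that still depends on it. In a kernel, updates to refcnt would also need to be protected, e.g. by disabling interrupts.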

Part II: Dynamic Tracepoints (70 pts)

Linux provides many tracepoints that allow the user to trace kernel events, e.g., system calls, context switches, etc. A popular approach to writing tracing programs is to use eBPF. An eBPF program can be attached to a tracepoint in the kernel. Then, when the corresponding event happens, the eBPF program executes. In this part, you will implement a similar mechanism in Xinu.

Task 2.1: Understanding and Identifying Tracepoints

Read the materials to understand where Linux's tracepoints are placed in its kernel. Try to add similar tracepoints to Xinu. We split the tracepoints into two types, per-process tracepoints and global tracepoints. Per-process tracepoints are only effective for one process while global tracepoints are for the whole system. Please propose a list of at least three tracepoints of each type and write it to the report. Also, explain why such tracepoints are useful. Then, please implement the tracepoints you proposed.

There are two required tasks, and you should design your tracepoints to support them: 1. collecting scheduler statistics, including the number of context switches and the average number of waiting processes; 2. collecting I/O statistics for one process, including the throughput of the read and write system calls. For each task, create a shell command that can start tracing, stop tracing, or retrieve the statistics and print them. For the other tracepoints you proposed, also show examples of their usage and explain them in the report.
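For the scheduler statistics, one way to obtain an average without floating point is to accumulate two counters at every context switch and divide only when the statistics are printed. The sketch below assumes that the tracepoint can observe the number of waiting (ready) processes at each switch; struct sched_stats and both function names are hypothetical.

```c
#include <stdint.h>

/* Accumulators updated at every context switch. */
struct sched_stats {
    uint64_t nswitch;     /* number of context switches observed */
    uint64_t ready_sum;   /* sum of ready-list lengths over all switches */
};

/* Record one context switch; nready is the number of waiting processes. */
void sched_stats_record(struct sched_stats *s, uint32_t nready)
{
    s->nswitch++;
    s->ready_sum += nready;
}

/* Average number of waiting processes, scaled by 100 (two decimal places). */
uint64_t sched_stats_avg_x100(const struct sched_stats *s)
{
    if (s->nswitch == 0) {
        return 0;   /* nothing recorded yet; avoid dividing by zero */
    }
    return s->ready_sum * 100 / s->nswitch;
}
```

For example, recording ready-list lengths 1, 2, and 4 yields a scaled average of 233, i.e., about 2.33 waiting processes per context switch. The I/O throughput task can follow the same pattern with a byte counter and a pair of timestamps.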

Task 2.2: Designing Insertion of Tracepoints

For the tracepoints you proposed, design what the input of the eBPF programs should be for each tracepoint. For example, for a tracepoint of context switches, the input may be the pids of the current and the next process. Add the following system call for attaching a tracepoint:

syscall attach_tracepoint(int tracepoint, const void *code, int len, void *data);

Here, tracepoint is the number of the tracepoint, which should be defined in trace.h, and code stores the bytecode of the tracing program, whose length is len. data is any data required by the specific tracepoint. For example, for the tracepoint that collects scheduler statistics, it may be the memory address where the result is stored. Note that this is different from the input provided to the tracing programs. If tracepoint represents a per-process tracepoint, this system call attaches the tracing program to the current process. This system call should return a tracepoint id if successful. Also, add the following system call for detaching a tracepoint:

syscall detach_tracepoint(int tracepoint_id);

For each tracepoint, there can be at most 5 tracing programs attached. Per-process tracepoints are NOT inherited and should be released when the process exits. Make sure all tracepoints can be dynamically attached and detached without restarting the system.
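A minimal sketch of the bookkeeping behind attaching and detaching is shown below. It covers only a global table with five slots per tracepoint; per-process tracepoints would additionally need an owner pid (or a per-process table) so they can be released at exit. The names tp_attach, tp_detach, struct tp_prog, and the constant NTRACEPOINTS are hypothetical, and the id encoding is just one possible choice.

```c
#include <stddef.h>

#define NTRACEPOINTS 8   /* hypothetical number of tracepoints */
#define MAX_PROGS    5   /* at most five programs per tracepoint */

struct tp_prog {
    int         in_use;
    const void *code;    /* bytecode of the tracing program */
    int         len;
    void       *data;    /* tracepoint-specific data */
};

static struct tp_prog tp_table[NTRACEPOINTS][MAX_PROGS];

/* Attach a program; the returned id encodes tracepoint and slot. */
int tp_attach(int tracepoint, const void *code, int len, void *data)
{
    int slot;

    if (tracepoint < 0 || tracepoint >= NTRACEPOINTS || code == NULL || len <= 0) {
        return -1;
    }
    for (slot = 0; slot < MAX_PROGS; slot++) {
        struct tp_prog *p = &tp_table[tracepoint][slot];
        if (!p->in_use) {
            p->in_use = 1;
            p->code = code;
            p->len = len;
            p->data = data;
            return tracepoint * MAX_PROGS + slot;
        }
    }
    return -1;   /* all five slots of this tracepoint are taken */
}

/* Detach by id; fails if the id is invalid or the slot is already free. */
int tp_detach(int id)
{
    struct tp_prog *p;

    if (id < 0 || id >= NTRACEPOINTS * MAX_PROGS) {
        return -1;
    }
    p = &tp_table[id / MAX_PROGS][id % MAX_PROGS];
    if (!p->in_use) {
        return -1;
    }
    p->in_use = 0;
    p->code = NULL;
    p->len = 0;
    p->data = NULL;
    return 0;
}
```

Encoding the tracepoint and slot in the id makes detaching a constant-time lookup. A real implementation would also load the bytecode into a VM at attach time and destroy that VM on detach.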

Task 2.3: Designing External Helper Functions

Linux provides many helper functions that allow eBPF programs to interact with the kernel, e.g., accessing a hash map for storage, reading from user/kernel memory, or printing to the kernel ring buffer. Please read ubpf.h to understand how external helper functions work and implement some in Xinu. Please include the following ones:

long bpf_probe_read(void *dst, u32 size, const void *ptr);

long bpf_probe_write(void *dst, const void *src, u32 len);

u64 bpf_ktime_get_ns(void);

long bpf_trace_printk(const char *fmt, u32 fmt_size, ...);

Please refer to Linux's bpf-helpers manual page for their behavior. Also, implement at least one more helper function that you define yourself and that is useful for the tracepoints you proposed.
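As a rough illustration of what a read helper has to do, the sketch below copies memory only after checking that the source range is readable, and returns 0 on success or a negative value on failure, which is the return convention the bpf-helpers manual page describes for bpf_probe_read. The notion of a single "readable range" and the names demo_set_readable and demo_probe_read are assumptions made for this example; how you decide which addresses are valid in Xinu is a design decision for your report.

```c
#include <stdint.h>
#include <string.h>

typedef uint32_t u32;

/* Hypothetical readable range [demo_lo, demo_hi), set by the kernel. */
static uintptr_t demo_lo, demo_hi;

void demo_set_readable(const void *base, size_t len)
{
    demo_lo = (uintptr_t)base;
    demo_hi = demo_lo + len;
}

/* Copy size bytes from ptr to dst if the whole source range is readable. */
long demo_probe_read(void *dst, u32 size, const void *ptr)
{
    uintptr_t p = (uintptr_t)ptr;

    /* Reject a missing destination and any range outside [demo_lo, demo_hi). */
    if (dst == NULL || p < demo_lo || p > demo_hi || size > demo_hi - p) {
        return -1;
    }
    memcpy(dst, ptr, size);
    return 0;
}
```

The comparison size > demo_hi - p is written so that it cannot overflow, unlike the more obvious p + size > demo_hi.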

Task 2.4: Writing and Testing Your Tracing Programs

Implement the tasks described above using your tracepoints and helper functions. Also, for each tracepoint you proposed, write an eBPF tracing program to test it. Please explain in the report how they are implemented and tested.

Turn-in instructions

Early Submission Bonus: Projects delivered 5 full days before the official lab due date receive 10 bonus points as long as they score at least 50% of the assignment total points (without any bonus). It is not necessary to inform the teaching staff ahead of time; students only need to submit before the bonus cutoff date to be considered for this bonus.

Late Submission Policy: You can use a total of 3 late days throughout the course. Refer to the course syllabus on Brightspace for the details on late submission policy.

1. Format for submitting:

For problems where you are asked to print values using kprintf() or printf(), use conditional compilation (C preprocessor directives #if and #define) with macro XTEST set to 1 or 0 (in include/process.h) to effect print/no print. I.e., XTEST==1: print assignment messages, XTEST==0: do not print assignment messages.

For your own debug statements, do the same with macro XDEBUG. I.e., XDEBUG==1: print your debug messages, XDEBUG==0: do not print your debug messages.

2. Before submitting your work, make sure to double-check the TA Notes in Ed to ensure that additional requirements and instructions have been followed.

3. Please make sure to disable all debugging output before submitting your code.

4. Electronic turn-in instructions:

1. Go to the xinu-cs503-fall2023/compile directory and run make clean.

2. Go to the parent directory of your xinu-cs503-fall2023 directory. E.g., if /homes/bob/xinu-cs503-fall2023 is your directory structure, go to /homes/bob. (Note: please do not rename xinu-cs503-fall2023 or any of its subdirectories.)

3. Type the following command to turn in the files:

turnin -c cs503 -p lab2 xinu-cs503-fall2023

4. List and check the submitted files with the following command:

turnin -c cs503 -p lab2 -v
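
For the XTEST and XDEBUG convention described under "Format for submitting", a minimal sketch of the conditional compilation looks like the following. The function demo_report and its messages are hypothetical, the macros are defined locally only to keep the example self-contained (in the lab they belong in include/process.h), and printf stands in for kprintf.

```c
#include <stdio.h>

/* In the lab these two macros are set in include/process.h. */
#define XTEST  1   /* 1: print assignment messages, 0: do not */
#define XDEBUG 0   /* 1: print your own debug messages, 0: do not */

/* Prints according to the macros and returns how many lines were printed. */
int demo_report(int nswitch)
{
    int printed = 0;

#if XTEST
    printf("context switches: %d\n", nswitch);
    printed++;
#endif

#if XDEBUG
    printf("[debug] demo_report called with %d\n", nswitch);
    printed++;
#endif

    return printed;
}
```

With XDEBUG set to 0 the debug line is removed at compile time, so disabling debugging output before submission only requires changing one macro.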