Assignment out of 45 marks total

For this assignment you will work individually.


You should submit a tar file (yes, even if it contains only a single PDF file; see the General Page for instructions on how to create one) with the following contents:

  1. A PDF of your assignment report, including a cover page with the assignment details and your name with student number.
  2. A directory containing any code or scripts that you wrote. Any included files must be referenced in the PDF report, else they will not be reviewed.

For each question, include the following (where applicable or otherwise unspecified):

  1. What you did (i.e., your answer to the question -- e.g., a command, action on a webpage, code run, etc.).
  2. How you did it (i.e., the exact command you ran to find the answer or complete an action if required -- e.g., Linux command line, area on a website that you went to, configuration, etc.).
  3. Your explanation of what you did and why you did it.
  4. The output or result of what you did, trimmed to include only the relevant output (i.e., cut out previous commands' text, un-needed text, etc.). This is often in the form of a screenshot.

Note that, depending on the question, some of the above requirements may not apply. Also, #1 and #2 from above can often be combined into a single screenshot. Some questions explicitly state what to submit. This is to ensure you submit the information we're looking for; the other requirements in the list above still apply.

These requirements may seem tedious and unnecessary; however, they are useful for markers to see that you completed each question, explained that you understood the question, and provided proof that the task was successfully completed.

Attention:

When in doubt, explain as much as you can. I need to see that you understand the answer and the process you used to get the answer. Not including an explanation or providing too little explanation may result in lost marks.

When answering the questions in Parts B and C, please provide a detailed discussion explaining the problem and what steps you took for the solution. Provide enough detail to demonstrate that you understand the problems and how you addressed them. SUBMITTING CODE WITHOUT EXPLANATION WILL RESULT IN SEVERE PENALTY!
 Please prefix your PDF and TAR file names with your MyCarletonOne ID: e.g., johnsmith-assignment3.pdf

 Attention: 

Any code included in the tar archive is for the TA to test, and should therefore be submitted in a format that is easy to run/compile, e.g., with an appropriate directory structure (if needed), required Makefiles, etc.

Code that you include in your report is for the TA to read, and may therefore be formatted to best explain what you did, e.g., by sectioning the code, highlighting lines of interest and including additional comments, etc.

SUBMITTING CODE WITHOUT EXPLANATION WILL RESULT IN SEVERE PENALTY!

 Hint: 
Instructions on how to copy files to and from your local machine and your VM are available on the General page. You may find this useful for submitting code/scripts/log files/etc.

This assignment will be done on the SCS OpenStack platform (please consult the General Page for further details and instructions). The name of the VM snapshot for this assignment is COMP4108-F19 Assignment 3, under the COMP4108B-F19 project. To protect your VM against abuse, you will automatically be prompted to choose a new password upon your first login—please do so, especially if you enable SSH access.

At the end of Assignment 2 you saw how you could exploit a setuid binary with a race condition in order to gain root privilege on the box. Now what? If you were a real attacker it could be a matter of minutes before the system admin patches the vulnerable program and gives you the boot. How do you hide your tracks? How do you install a backdoor that ensures you aren't a one-trick pony?

 Hint: 
Familiarize yourself with the history command. You can use it to refresh your memory on commands you may have entered.
You may prefer to use the script command, which records input and output to a file automatically. See man script for more information.

Linux Kernel Modules:

The answer to both questions (at least as far as this assignment is concerned) is a rootkit. A garden variety Linux rootkit is generally written as a Loadable Kernel Module, or LKM. LKMs allow the system administrator to load new code to extend the kernel's functionality while the machine is running. Many device drivers are implemented as LKMs.

Only root can insert and remove modules (using insmod and rmmod respectively), but it just so happens you're root today. Lucky you. There is a free guide to programming Linux Kernel Modules available online. This guide explains benign kernel module functionality, and you'll likely want to read (or at least skim) Sections 1, 2, 3, and 8.


 Hint: 
This assignment assumes familiarity with the C programming language. If your C kung-fu is rusty you'll probably want to Google some reference material. The classic choice is the K&R book. The Carleton library has at least one copy. I also enjoy Learn C the Hard Way by Zed Shaw, available free online. 

You'll want to make sure you understand pointers, memory management, arrays, and structures. You're a kernel programmer now.

Rootkits:

Rootkit LKMs alter the state of the system so that processes interacting with the kernel see sanitized information, or they add new functionality convenient for an attacker. A classic way to do this is by hooking system calls. For an idea of which system calls a process invokes, revisit your use of the strace command in Assignment #2. The guide to programming Linux Kernel Modules introduces syscall hooking briefly.

To hide files from directory listings, for instance, you would find the syscall that ls uses to get filesystem directory entries and hook it. By hooking syscalls related to the filesystem, a malicious rootkit might hide all files with a $sys$ prefix, allowing it to stash its own files out of sight of the system's users. Rootkits also frequently serve as backdoors that allow a user to elevate their privileges, or get a remote shell without logging in.

Back to the Future:

You no doubt saw the grave warnings affixed to both the getdents man page and the LKM guide section on hooking syscalls. Not only are these warnings correct, things are worse than you might imagine. These techniques are dangerous, and often unreliable. No sane engineer would design their device driver in this fashion; we're hacking in the true sense of the word.

The details in the LKM guide are unfortunately specific to Linux Kernel versions < 2.6.x and a 32bit architecture. We're living in 2019 and running Linux Kernel version 3.2.x on a 64bit architecture. This affects us in two major ways:

  1. The sys_call_table symbol is no longer exported by the kernel to LKMs. This is to prevent developers from doing stupid things with it. We're going to have to find the address manually so that we can do stupid things with it.
  2. The page of memory where the sys_call_table lives is now marked read only to prevent things from going wrong. We can't write a new hook into the table without first making the page writable, thereby allowing things to go wrong.

Writing a rootkit from scratch is going to be a grueling endeavour. Thankfully your connections in the underground have hooked you up with some super eleet warez. With their C code you should be able to write a respectable piece of kernel malware without losing your mind. Unfortunately your hookup only got you so far. The code's author must have uploaded it before it was completely finished. It looks like you'll have to pick up where they left off...

Important Notes:

You're writing code that runs in kernel space, with full privileges. The slightest mistake in your code is going to lead to legitimately weird things happening including (but not limited to):

  • All of the binaries on your system segfaulting. Including ssh.
  • Data being lost.
  • Kernel modules being stuck loaded.
  • Full kernel panics, leaving the box frozen.

Don't keep anything on your VM you aren't ready to lose! Keep your code on your own machine and copy it over to compile/test.

You're going to want to work in very small, verifiable steps. Do not attempt to sit down and program the whole assignment. Instead, start with very small steps in mind and progress further only when you get that step working.

For example, in the file hiding task: start by figuring out what to hook, then try hooking it while keeping the original behavior intact. Once you can do that without crashing your VM, try printing all filenames in a directory to syslog from your hook. Once that's working, start writing code to identify the files you want to hide among those being printed to syslog. Finally, attempt to remove those entries from the results.

 Hint: 
When in doubt, read the source code! The Linux kernel is open source. The kernel installed on your VM is version 3.2. The corresponding source code is available in /usr/src/linux-3.2.0. Another great online resource is the Linux Cross Reference (please note the ?v=3.2 appended to the URL).

Part A:

  1. Download the rootkit framework code for this assignment, available here, to your VM using the wget command. THE USERNAME AND PASSWORD ARE FOUND IN THE ASSIGNMENT SUBMISSION LINK IN CULEARN. IT'S THE SAME USERNAME/PASS COMBO AS THE PREVIOUS TWO ASSIGNMENTS.
  2. Run sudo bash to give yourself a bash shell with root privileges. We'll pretend that you got this from the race condition in A2. For most of this assignment you're going to be switching between a root user and a normal user, so I recommend you keep two windows open (the gurus might want to try the screen tool, or a terminal multiplexer with a somewhat steep learning curve).
  3. 1 Mark Find the address of the sys_call_table symbol using the System.map.
  4. 0.5 Marks Edit the insert.sh script to provide the right memory address for the table_addr parameter in the insmod command. It should be equal to the address you found in the System.map.
  5. 0.5 Marks Confirm you can build the rootkit framework by running make. You can safely ignore the warning about defined but not used variables, as you will be fixing that as you complete the assignment.
  6. 0.5 Marks Confirm you can insert the rootkit module by running ./insert.sh as root. Ensure it was inserted by running lsmod and by checking the syslog.
  7. 0.5 Marks Confirm you can remove the rootkit module by running ./eject.sh as root. Ensure it was ejected by running lsmod and by checking the syslog.
  8. 2 Marks Finish the rootkit code so that the example open() hook works. Look for the TODO markers. Show a snippet of the syslog output it generates once loaded.
 Hint: 
There is a bit of logging in the incomplete framework. Run tail /var/log/syslog to display the last few lines of the syslog. You may also want to try tail -f /var/log/syslog to follow the syslog file interactively. In this mode your terminal updates immediately as new lines are written to the log. Press Ctrl-C (that is, hold Ctrl and press C) to end the tail command and get back to the shell.
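
To illustrate the System.map lookup from Part A, here is a minimal userland sketch in C (it is not part of the rootkit framework). It assumes the usual System.map line layout of a hexadecimal address, a one-letter type, and a symbol name. The function names match_symbol and find_symbol are made up for this example, and any address you feed it for testing is a placeholder, not the value on your VM.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Parse one System.map-style line ("<hex address> <type> <symbol>") and,
 * if the symbol is exactly `wanted`, store its address in *addr.
 * Returns 1 on a match, 0 otherwise.
 */
int match_symbol(const char *line, const char *wanted,
                 unsigned long long *addr)
{
    unsigned long long value;
    char type;
    char name[128];

    assert(line != NULL && wanted != NULL && addr != NULL);

    if (sscanf(line, "%llx %c %127s", &value, &type, name) != 3)
        return 0;               /* not a well-formed map line */
    if (strcmp(name, wanted) != 0)
        return 0;               /* exact match only, no substrings */

    *addr = value;
    return 1;
}

/*
 * Scan a System.map-style file for `wanted`.
 * Returns 1 and fills *addr if found, 0 if not found or unreadable.
 */
int find_symbol(const char *path, const char *wanted,
                unsigned long long *addr)
{
    char line[256];
    int found = 0;
    FILE *fp = fopen(path, "r");

    if (fp == NULL)
        return 0;
    while (!found && fgets(line, sizeof line, fp) != NULL)
        found = match_symbol(line, wanted, addr);
    fclose(fp);
    return found;
}
```

The exact-match comparison matters because a map file may contain several symbols whose names merely contain sys_call_table. On most distributions the map for the running kernel lives under /boot, but check your own VM rather than relying on that.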

Part B:

For this part of the assignment you will be creating a backdoor for gaining root privileges on the machine. Using this backdoor you can come back to the system at any time and quietly become the root user. From a kernel module most anything inside the kernel is fair game to be edited and messed with. In general you just have to find it, understand it, and subvert it reliably. For your backdoor you'll be subverting the system call that is used to invoke executables like commands, daemons, and scripts.

  1. 5 Marks Write a new hook for the execve syscall using the framework code from Part A. Consult the execve man page to learn the details and function signature of execve(). You will need to know which __NR_X define is used to find the offset in sys_call_table to hook for execve (where X will vary syscall to syscall). You might find /usr/src/linux-3.2.0/include/asm-generic/unistd.h useful in this regard. 

    The hook should print the name of all files being executed, and the effective UID of the user executing the file to syslog using printk. Example output: 

    Jan 28 20:49:17 COMP4108-A3 kernel: [81423.749198] Executing /usr/bin/tail
    Jan 28 20:49:17 COMP4108-A3 kernel: [81423.749200] Effective UID 0
    Jan 28 20:49:19 COMP4108-A3 kernel: [81425.950497] Executing /bin/ls
    Jan 28 20:49:19 COMP4108-A3 kernel: [81425.950499] Effective UID 1000
    

    The current_* macros defined in the header /usr/src/linux-3.2.0/include/linux/cred.h will help you get the information you need to include in your printk message.
  2. 10 Marks Modify your hook code so that when the effective UID of the user executing an executable is equal to the value of the root_uid parameter, they are given uid/euid 0 (i.e. root privs). The root_uid parameter must be provided via the insmod command in insert.sh like the sys_call_table address, and not hard coded. Note that the root_uid parameter should be set to your user's UID to get root, not root's UID. You will need to add this behaviour.

     Hint: 
    The header file /usr/src/linux-3.2.0/include/linux/cred.h and the corresponding code in /usr/src/linux-3.2.0/kernel/cred.c are likely of interest. In particular, look at the prepare_kernel_cred() and commit_creds() functions.

    In order to get full marks you must demonstrate the module working. Set the root_uid param in insert.sh equal to your user's UID, and provide the input/output from:
    1. Building the module code
    2. Running whoami as a normal user in one terminal
    3. Inserting the module as a root user by running ./insert.sh in a second terminal.
    4. In your normal user terminal running whoami again and being told you are root.
    Example output (from normal user term): 

    comp4108@NodeX:/A3/code/rootkit_framework$ whoami
    comp4108
    comp4108@NodeX:/A3/code/rootkit_framework$ whoami
    root
    

     Remember: 
    Provide enough detail in your report to demonstrate that you understand the problem and solution.
    SUBMITTING CODE WITHOUT EXPLANATION WILL RESULT IN SEVERE PENALTY!
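
To make the role of the __NR_X defines from Part B more concrete, here is a small Linux-only userland sketch. It is not kernel code and not part of the framework. It invokes a harmless syscall (getpid) by number instead of through its libc wrapper, which is the same number the kernel uses to select an entry in sys_call_table. The function names raw_getpid and execve_syscall_number are invented for this illustration.

```c
#define _GNU_SOURCE            /* expose the syscall() declaration */
#include <assert.h>
#include <sys/syscall.h>       /* __NR_* and SYS_* syscall numbers */
#include <sys/types.h>
#include <unistd.h>            /* syscall(), getpid() */

/* Invoke getpid by its syscall number, bypassing the libc wrapper. */
long raw_getpid(void)
{
    long pid = syscall(__NR_getpid);

    assert(pid > 0);           /* getpid cannot fail */
    return pid;
}

/*
 * The number that selects execve on the architecture this file was
 * compiled for. The value differs between architectures, so it is
 * read from the headers here rather than written down.
 */
long execve_syscall_number(void)
{
    return __NR_execve;
}
```

Calling raw_getpid() should return the same value as getpid(). Inside the module, the same __NR_execve constant would be the index to look at in the table, assuming the module is built against headers that match the running kernel.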

Part C:

With your handy new backdoor from Part B you could come back to the system at any time and act as the root user without needing to exploit your treasured race condition privilege escalation. From a kernel module most anything inside the kernel is fair game to be edited and messed with. In general you just have to find it, understand it, and modify it for your own purposes, without causing the system to crash when your modified code is executed in place of the original. In this part you will be subverting the interaction between binaries like ls and the OS-provided directory abstraction.

  1. 10 Marks Write a hook for the getdents system call (man page here). Once again this will require finding the __NR_* define for the syscall number. 

    You will want to familiarize yourself with the linux_dirent structure. Your hook code should print the name of all directory entries returned by a call to getdents() to syslog using printk. Sample output:

    Oct  1 11:44:36 COMP4108-A3 kernel: [ 2266.441674] getdents() hook invoked.
    Oct  1 11:44:36 COMP4108-A3 kernel: [ 2266.441704] entry: rootkit.o
    Oct  1 11:44:36 COMP4108-A3 kernel: [ 2266.441706] entry: .rootkit.mod.o.cmd
    Oct  1 11:44:36 COMP4108-A3 kernel: [ 2266.441708] entry: ..
    Oct  1 11:44:36 COMP4108-A3 kernel: [ 2266.441710] entry: insert.sh
    Oct  1 11:44:36 COMP4108-A3 kernel: [ 2266.441711] entry: rootkit.c
    Oct  1 11:44:36 COMP4108-A3 kernel: [ 2266.441712] entry: rootkit.mod.c
    Oct  1 11:44:36 COMP4108-A3 kernel: [ 2266.441714] entry: rootkit.ko
    
    
     Note: 
    Although your VM is an x86_64 architecture, the ls command still uses the legacy getdents() system call instead of getdents64(). Make sure your getdents() hook is coded with this in mind. Most notably, your hook should accept a buffer of type struct linux_dirent * (not struct linux_dirent64 *) as its 2nd argument. You will want to examine the linux_dirent64 structure if you also plan to hook the getdents64() system call.
  2. 15 Marks Modify your hook such that the struct linux_dirent * buffer you return to the calling process does not include any dirents for filenames that start with magic_prefix. The magic_prefix character array should be provided as a kernel module parameter given to insmod in the insert.sh script. You will need to implement this parameter yourself. 

    After coding your getdents hook and implementing the magic_prefix parameter you'll want to test it in action:
    1. Edit the insert.sh script and set the magic_prefix parameter to $sys$
    2. Compile your module by running make
    3. Create a file called $sys$_lol_hidden.txt in your current directory.
    4. Perform a ls -l to see if your $sys$_lol_hidden.txt file was created.
    5. Insert the kernel module by running the insert script ./insert.sh as root.
    6. Run the same ls -l command to validate the $sys$_lol_hidden.txt file is no longer included. It shouldn't be in ls -la either (i.e. isn't just a regular 'hidden' dotfile).
     Hint: 
    If you use $sys$ as your magic_prefix value you must remember to escape the $'s in the bash shell. The easiest way is to use \$ instead of $ when trying to create, edit, delete, or otherwise interact with one of your hidden files.
    Example output (from normal user term): 

    comp4108@COMP4108-A3:/A3/code/rootkit_framework/test$ touch \$sys\$_lol_hidden.txt
    comp4108@COMP4108-A3:/A3/code/rootkit_framework/test$ ls -la
    total 8
    -rw-rw-r-- 1 comp4108 comp4108 0 Oct  1 11:59 bar.txt
    -rw-rw-r-- 1 comp4108 comp4108 0 Oct  1 11:59 baz.txt
    -rw-rw-r-- 1 comp4108 comp4108 0 Oct  1 11:59 foo.txt
    -rw-rw-r-- 1 comp4108 comp4108 0 Oct  1 12:00 $sys$_lol_hidden.txt
    comp4108@COMP4108-A3:/A3/code/rootkit_framework/test$ ls -la
    total 8
    drwxrwxr-x 2 comp4108 comp4108 4096 Oct  1 12:00 .
    drwxrwxr-x 5 comp4108 comp4108 4096 Oct  1 11:59 ..
    -rw-rw-r-- 1 comp4108 comp4108    0 Oct  1 11:59 bar.txt
    -rw-rw-r-- 1 comp4108 comp4108    0 Oct  1 11:59 baz.txt
    -rw-rw-r-- 1 comp4108 comp4108    0 Oct  1 11:59 foo.txt
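
The "starts with magic_prefix" test used in this task can be sketched in plain userland C as follows. The helper name has_prefix is made up for this example, and treating an empty prefix as "match nothing" is a design assumption of the sketch, not a requirement stated in the assignment.

```c
#include <assert.h>
#include <string.h>

/*
 * Returns 1 if `name` begins with `prefix`, 0 otherwise.
 * An empty prefix never matches, so an unset prefix hides nothing
 * instead of hiding everything.
 */
int has_prefix(const char *name, const char *prefix)
{
    size_t plen;

    assert(name != NULL && prefix != NULL);

    plen = strlen(prefix);
    if (plen == 0)
        return 0;
    /* Compare only the first plen bytes; a shorter name cannot match. */
    return strncmp(name, prefix, plen) == 0;
}
```

Note that inside a C string literal the $ characters need no escaping; the \$ escaping discussed in the hint is only a concern when typing the name in the bash shell.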
    

     Hint: 
    Modifying the buffer of dirents to hide files is the trickiest bit of the assignment. Luckily it is no more difficult than a typical data structure question (with a few twists). 

    Your objective is to sanitize the struct linux_dirent *dirp buffer provided as the 2nd argument to the getdents syscall. This buffer is allocated by the calling process (i.e., the caller makes sure there is enough memory malloc'd for the struct linux_dirents that the syscall puts into the buffer).

    The most important thing to know is that the dirp buffer is not an array of struct linux_dirent's of equal size. To save memory each dirent struct is only as big as it needs to be. In order to allow iterating through the dirent structs in the buffer each dirent struct stores its length to be used as an offset to the next dirent in the buffer (see the figure). You will need to use this knowledge to determine how you can remove a dirent from the buffer. The man page for getdents() has example code for iterating the buffer.
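
The variable-length layout described above can be explored safely in userland before touching the kernel. The sketch below declares struct linux_dirent as documented in the getdents man page, builds a small packed buffer by hand, and walks it by following d_reclen. The helpers append_entry and walk_entries are hypothetical names for this illustration, and the 8-byte rounding is an assumption made to keep the fake records aligned. Real records returned by the kernel also carry padding and a trailing d_type byte, as the man page describes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Layout as documented in getdents(2); userland must declare it itself. */
struct linux_dirent {
    unsigned long  d_ino;     /* inode number */
    unsigned long  d_off;     /* offset to the next dirent */
    unsigned short d_reclen;  /* length of this record in bytes */
    char           d_name[];  /* NUL-terminated filename */
};

/*
 * Hypothetical test helper: write a record for `name` at byte offset
 * `pos` in `buf` and return the offset just past it. `buf` must be
 * suitably aligned and large enough for the record.
 */
size_t append_entry(char *buf, size_t pos, unsigned long ino,
                    const char *name)
{
    struct linux_dirent *d = (struct linux_dirent *)(buf + pos);
    size_t len = offsetof(struct linux_dirent, d_name) + strlen(name) + 1;

    len = (len + 7) & ~(size_t)7;      /* round up to a multiple of 8 */
    memset(d, 0, len);
    d->d_ino = ino;
    d->d_off = pos + len;
    d->d_reclen = (unsigned short)len;
    strcpy(d->d_name, name);
    return pos + len;
}

/*
 * Walk `nread` bytes of packed records, stepping by each record's own
 * d_reclen. Returns the number of records seen; if record number
 * `index` exists and name_out is non-NULL, *name_out points at its name.
 */
int walk_entries(const char *buf, size_t nread, int index,
                 const char **name_out)
{
    size_t pos = 0;
    int count = 0;

    while (pos < nread) {
        const struct linux_dirent *d =
            (const struct linux_dirent *)(buf + pos);

        assert(d->d_reclen > 0);       /* zero would loop forever */
        if (count == index && name_out != NULL)
            *name_out = d->d_name;
        pos += d->d_reclen;
        count++;
    }
    return count;
}
```

Because each record has its own length, indexing the buffer like an array of equal-sized structs would land in the middle of a record; the only safe way to reach record N is to step through the N records before it.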

    The second most important thing to know is that the dirp buffer is userland memory. You cannot edit it directly or bad things will happen. Instead you must first allocate a kernel memory buffer of equivalent size. To do this you must use kmalloc and kfree, not their userland counterparts malloc and free. Once you have a kernel buffer of the right size you can use the copy_from_user and copy_to_user functions to copy the userland buffer into your kernel buffer and vice versa.

    So, the steps are:
    1. Call the original getdents() syscall with the dirp buffer your hook receives to have it populated with dirent structs.
    2. Allocate a kernel buffer of the correct size using kmalloc
    3. Copy the userland buffer into your kernel buffer using