r0ops was a reversing challenge, worth only 150 points. Based on the amount of points, I expected it to be relatively easy, but I was in for a ride down the rabbit hole…
The binary opens binds to a port and waits for incoming connections. Upon connecting with nc, nothing much happens. While trying to run it in gdb, I encountered the first anti-debugger trick:
gdb opens more file descriptors. The binary rightly expects the socket handle to be file descriptor 3; if it encounter anything else, it must be because gdb is running.
Examing the output of objdump, I quickly learned that the main program is just a stub to load a ROP chain:
I set a breakpoint at the ret instruction, just before the ROP chain kicks off. I then dumped and copied the ROP chain and used some shell-fu to clean up the output:
Now, the ropchain contained only bare addresses. This is were the second obfuscation step came into place: each ROP gadget starts with a jmp which jumps in the middle of another instruction. Because of this, the disassembly cannot be trusted. Instead, I manually looked up all the ROP gadgets and pieced them together. The ROP chain is quite ingenious, although it also contains tons of redundant instruction:
1234567891011121314151617181920
000000000dead1f4 pop rcx
0000000000000008
000000000dead271 pop r9 ;r9= 1337deadbeef0095
1337deadbeef0095
000000000dead123 mov rax,QWORD PTR [rdi];[rdi+0] first QWORD of input
000000000dead0ed add rsi,0x8 ;rsi=8
000000000dead204 mov QWORD PTR [rsi],rax ;rsi=8
000000000dead267 mov r8,QWORD PTR [rsi];rsi=8r8= first QWORD of input
000000000dead0f8 sub rsi,0x8 ;rsi=0
000000000dead103 add rdi,0x8 ;rdi=8
000000000dead0ed add rsi,0x8 ;rsi=8(no-op)000000000dead27a mov QWORD PTR [rsi],r9 ;rsi=8[rsi]= 1337deadbeef0095
000000000dead20e mov rax,QWORD PTR [rsi];rsi=8rax= 1337deadbeef0095
000000000dead0f8 sub rsi,0x8 ;rsi=0
000000000dead1ec pop rbx ;rbx= cafe
000000000000cafe
000000000dead141 imul rax,rbx ;rax==> 0x2724090c079825d6
000000000dead0ed add rsi,0x8 ;rsi=8
000000000dead204 mov QWORD PTR [rsi],rax ;rax= 0x1337deadbeef0095*0xcafe
...continues...
It basically takes the first QWORD of the input, sent over the socket, and then proceeds to generate a special constant. This is used later to compare against. I followed the rest of the ROP chain, and it basically does the following: it repeatedly multiplies the QWORD of our input with itself. At set intervals, it will multiply this value with the original QWORD. After a fixed number of iterations, it compares the resulting QWORD to the generated magic constant. The ROP chain uses a clever mechanism to implement conditional looping:
1234567
000000000dead1ec pop rbx ;rbx=0
0000000000000000
000000000dead1fc pop rdx ;rdx=1d8; adjustment for rsp!
00000000000001d8 ;000000000dead19b 0xdead19f: cmp rax,rbx ; rax contains a counter used to iterate; 0xdead1a2: jne 0xdead1a7 ; -> ret;if rax != rbx, continue 0xdead1a4: add rsp,rdx ; when it reaches zero, control is passed to the next gadget, located at rsp+0x1d8
Clever stuff, but horrible to trace. There were a lot of jumps and no ops to throw me off. For instance, a gadget would add rsi, 8 and the next one would sub rsi, 8, effectively doing nothing (except annoying me and wearing out my Enter key).
Breaking the chain
The ROP chain repeats this process eight times, so we need to send eight QWORDS over the socket. For each QWORD, a new magic constant is generated (taking the former value, multiplying by 0xcafe and adding 0xbeef). To inspect what was going on, I set breakpoints on two very important ROP gadgets:
This allowed me to dump each value that was generated, and finally see which values are being compared by the binary (one of which was the magic constant).
I briefly considered bruteforcing the entire 64-bit range, but this was way too slow, even in C. I focussed on creating a function that emulates what is done with the first QWORD. After squashing a bug, I ended up with the following python code:
1234567891011121314
defp(x,n):whilen:x=(x*x)&0xffffffffffffffffn-=1returnxdefc(i):x=(p(i,3)*i)&0xffffffffffffffffx=(p(i,4)*x)&0xffffffffffffffffx=(p(i,10)*x)&0xffffffffffffffffx=(p(i,12)*x)&0xffffffffffffffffreturn(p(i,13)*x)&0xffffffffffffffffprinthex(c(0x4242424241414141))# remember, little endian ;)
Then I noticed something crucial. As I entered variations of 0x4242424241414141, the last byte of the generated value was only dependent on the last byte of the input (by chance it was also 0x41)! This gave me an idea…
Byte-by-byte
I found I could bruteforce the correct value for each QWORD, going one byte at a time! After a while (and squashing the aforementioned bug by careful tracing of the ROP chain) I came up with the following python code:
importstructdefq(x):returnstruct.pack('<Q',x)defp(x,n):whilen:x=(x*x)&0xffffffffffffffffn-=1returnxdefc(i):x=(p(i,3)*i)&0xffffffffffffffffx=(p(i,4)*x)&0xffffffffffffffffx=(p(i,10)*x)&0xffffffffffffffffx=(p(i,12)*x)&0xffffffffffffffffreturn(p(i,13)*x)&0xffffffffffffffffkey_list=[]check=0x1337deadbeef0095foruinrange(8):check=((check*0xcafe)+0xbeef)&0xffffffffffffffffkey=0foriinrange(8):forzinxrange(1,0xff):# ugly, but works: it basically only compares the output of the c() function# up to the byte it's bruteforcingif(c(key|(z<<(i*8)))&(0xff<<i*8))==(check&(0xff<<i*8)):key+=(z<<(i*8))breakprint"[+] key {}: {} -> {} == {}".format(u,hex(key),hex(c(key)),hex(check))key_list.append(key)# send all the generated keys as little-endian QWORDS to the binaryfromsocketimport*globalss=socket(AF_INET,SOCK_STREAM)s.connect(('localhost',13337))payload=''forkeyinkey_list:payload+=q(key)s.send(payload+'\n')prints.recv(1000)s.close()
The ROP chain went through its hoops and landed here, dumping the flag!