What is code injection?
Code injection, also called Remote Code Execution (RCE), occurs when an attacker exploits an input validation flaw in software to introduce and execute malicious code. Code is injected in the language of the targeted application and executed by the server-side interpreter for that language – PHP, Python, Java, Perl, Ruby, etc. Any application that directly evaluates unvalidated input is vulnerable to code injection, and web applications are a prime target for attackers. This article shows how code injection vulnerabilities arise and how you can protect your web applications from injection.
PHP Code Injection Example
Let’s start with a quick example of vulnerable PHP code. The PHP eval() function provides a quick and convenient way of executing string values as PHP code, especially in the initial phases of development or for debugging. However, when used with unknown inputs, it can leave your application vulnerable to code injection. Here’s a typical example of quick-and-dirty query string processing – just a simple echo command, like you might use for debugging parameters:
<?php eval ("echo ".$_REQUEST["user_name"].";"); ?>
The PHP interpreter will attempt to evaluate whatever is passed in the user_name parameter. As the parameter name implies, the developer expects the query string to contain a valid user name, for example:
http://www.example.com/index.php?user_name=admin
However, an attacker might supply the following query string to exploit the vulnerable construct and inject PHP code into the application:
http://www.example.com/index.php?user_name=admin;phpinfo();
If successful, this injection will cause the PHP interpreter to echo admin, but then execute phpinfo(), providing the attacker with information about the operating system, PHP version, and other configuration details.
Unless the system() function is disabled in PHP interpreter settings, a successful code injection can use this function to execute operating system commands, in effect performing command injection (see note below). Working with the vulnerable code shown above, an attacker might supply the following URL to a Linux-based server:
http://www.example.com/index.php?user_name=admin;system('ls -l');
Again, this will echo admin and then execute code injected after the semicolon. In this example, system('ls -l') runs the ls -l command to list the contents of the PHP interpreter’s working directory, including permissions.
How to prevent code injection in PHP
Secure development begins in the code itself, with the adoption of a secure coding convention. Be aware of user-facing inputs and how they are handled. Sanitise everything coming in and be aware of how it’s stored within your application.
As a fundamental rule, dynamic code execution should never be allowed in any application. For example, avoid core PHP functions like shell_exec() and exec(), which execute commands at the OS level, wherever possible.
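The debugging echo from the earlier example needs no eval() at all. A minimal sketch, assuming user names are limited to letters, digits and underscores; the function name renderUserName() and the 32-character limit are illustrative assumptions, not part of the original code:

```php
<?php
// Hypothetical helper: returns an HTML-safe user name, or null when the input
// does not match the allowlist. Nothing is ever passed to eval().
function renderUserName($raw): ?string
{
    // Reject non-strings (e.g. ?user_name[]=x) and anything outside the allowlist.
    if (!is_string($raw) || preg_match('/\A[A-Za-z0-9_]{1,32}\z/', $raw) !== 1) {
        return null;
    }
    // Escape on output so the value stays inert even if the allowlist is loosened later.
    return htmlspecialchars($raw, ENT_QUOTES, 'UTF-8');
}
```

In place of the eval() one-liner, a page could call renderUserName($_REQUEST['user_name'] ?? '') and echo the result, or an error message when it is null. With this approach the query string admin;phpinfo(); is simply rejected as an invalid name.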
It’s highly recommended to routinely use a security scanning tool like Netsparker, which can help identify potential vulnerabilities in library and dependency use. Something I’ve also been doing daily is checking the Composer vulnerability database to explore the techniques being used. The detailed report pages allow you to trace back to repositories at the code level to explore how the vulnerabilities were identified.
5 ways to prevent code injection in PHP app development
- Avoid using exec(), shell_exec(), system() or passthru()
- Avoid using strip_tags() for sanitisation
- Code serialization
- Use a PHP security linter
- Utilise a SAST tool to identify code injection issues
1. Avoid using exec(), shell_exec(), system() or passthru()
As the saying goes, “here be dragons.” As a rule, avoid the use of anything that can directly call the operating environment from PHP when possible. From an attack vector point of view, this creates plenty of opportunities to open all sorts of issues directly into your stack.
Traditionally, functions like exec(), shell_exec(), system() and passthru() would be utilised to perform tasks like compressing or decompressing files, creating cron jobs and even navigating OS files and folders. These all run without any built-in input sanitisation, which is where the trouble begins when using them, particularly with unvalidated or unsanitised direct user input.
PHP does provide some functions for escaping shell input, namely escapeshellcmd() and escapeshellarg(), which escape a command string or a single argument before it is passed to one of the functions above. There are, however, a few more secure ways to approach the same functionality.
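Where a shell call genuinely cannot be avoided, escapeshellarg() at least keeps a user-supplied value confined to a single argument. A minimal sketch, assuming a Unix-like server; buildListCommand() is a hypothetical helper that only builds the command string and does not run it:

```php
<?php
// Hypothetical helper: builds (but does not execute) an `ls -l` command line.
function buildListCommand(string $directory): string
{
    // escapeshellarg() wraps the value in single quotes and escapes any
    // embedded single quotes, so the shell sees one literal argument.
    // The "--" tells ls that no options follow, so a value starting with "-"
    // cannot be read as a flag.
    return 'ls -l -- ' . escapeshellarg($directory);
}
```

Given the input uploads; rm -rf /, the whole value, semicolon included, ends up inside one quoted argument to ls instead of starting a second command. Escaping reduces the risk but does not remove it, which is why the alternatives that follow avoid the shell altogether.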
Archiving can be handled by the ZipArchive class, which as of PHP 7.4 is enabled with the --with-zip flag when compiling PHP. Extra care still needs to be taken, however, as extracting an archive could enable a path traversal attack.
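One way to guard against traversal is to check every entry name before extracting anything. This is a sketch under stated assumptions rather than a complete defence: isSafeEntryName() and extractZipSafely() are hypothetical helpers, and the check only rejects absolute paths and ".." segments.

```php
<?php
// Hypothetical helper: true only for relative entry names without a ".." segment.
function isSafeEntryName(string $name): bool
{
    $normalised = str_replace('\\', '/', $name);
    // Reject empty names, absolute paths and Windows drive prefixes such as "C:".
    if ($normalised === '' || $normalised[0] === '/' || preg_match('/\A[a-zA-Z]:/', $normalised) === 1) {
        return false;
    }
    return !in_array('..', explode('/', $normalised), true);
}

// Hypothetical helper: extracts the archive only if every entry name is safe.
function extractZipSafely(string $zipPath, string $destination): bool
{
    $zip = new ZipArchive();
    if ($zip->open($zipPath) !== true) {
        return false; // open() returns an error code on failure
    }
    for ($i = 0; $i < $zip->numFiles; $i++) {
        $name = $zip->getNameIndex($i);
        if ($name === false || !isSafeEntryName($name)) {
            $zip->close();
            return false; // refuse the whole archive, not just the bad entry
        }
    }
    $extracted = $zip->extractTo($destination);
    $zip->close();
    return $extracted;
}
```

Rejecting the whole archive on the first suspicious entry is a deliberate choice here: an archive that contains one traversal attempt is unlikely to be trustworthy in its other entries.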
Approach file system interaction programmatically by minimising the ways that it can be interacted with directly via input. PHP has file functions that can be utilised without needing to call the OS directly. Getting a list of files in a specified directory, for example, can be done via the scandir() function, which returns an array of the files in a directory that can then be opened or referenced within the code itself. Take care not to access files created from exposed input, and be careful with the functions that change file permissions and ownership, like chmod() and chown(). It’s also important not to trust the Content-Type header (which can be faked) and to validate the file type yourself.
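The directory listing and file type check described above can be sketched without touching the shell. Both function names are hypothetical, and the sketch assumes the directory path is chosen by the application, not taken from user input, and that the fileinfo extension is available:

```php
<?php
// Hypothetical helper: lists the regular files in an application-chosen directory.
function listRegularFiles(string $directory): array
{
    $entries = scandir($directory); // sorted alphabetically by default
    if ($entries === false) {
        return [];
    }
    $files = [];
    foreach ($entries as $entry) {
        // is_file() filters out ".", ".." and subdirectories.
        if (is_file($directory . DIRECTORY_SEPARATOR . $entry)) {
            $files[] = $entry;
        }
    }
    return $files;
}

// Hypothetical helper: detects the MIME type from the file contents instead
// of trusting the Content-Type header sent by the client.
function detectMimeType(string $path): ?string
{
    $type = (new finfo(FILEINFO_MIME_TYPE))->file($path);
    return $type === false ? null : $type;
}
```

An upload handler could compare the result of detectMimeType() against a short allowlist of expected types before storing the file.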
Create cron jobs dynamically using a Composer library like cron. The library is itself run by a single cron entry and then runs its own jobs as part of that run. Care still needs to be taken with what gets run in it, but this approaches process creation more programmatically without exposing access to the core OS.
2. Avoid using strip_tags() for sanitisation
Extra care should always be taken with user input sanitisation and handling. The goal here is to accept valid user input but then to store and handle it in the right way so that your application does not become vulnerable. Remember, inputs are an open attack vector for malicious actors to be able to interact with your application. So make sure to take the time to always handle the input data properly.
The strip_tags() function is designed only to remove HTML and PHP tags from supplied input. This means things like JavaScript, SQL and other potentially malicious input will be deemed valid by the function. htmlentities() is another option that gets used for sanitising input and also lets you define the character set; keep in mind that it still does not sanitise input completely.
It’s also possible to strip data down using a regex function like preg_replace('/[^a-zA-Z0-9]+/', '', $text). This returns only the ASCII letters and digits from the input; everything else, including accented and other multibyte UTF-8 characters, is removed. Care should also be taken with functions like mb_strtolower() when cleaning user input. We’ve seen instances in the past where vulnerabilities have surfaced in some of the multibyte string functions (mb_), like CVE-2020-7065, an out-of-bounds write vulnerability triggered by UTF-32LE encoding. This can overwrite a stack buffer, causing a crash and potentially allowing code execution.
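To make the behaviour of that pattern concrete, here is a small sketch; keepAsciiAlphanumerics() is a hypothetical wrapper name:

```php
<?php
// Hypothetical helper: keeps only ASCII letters and digits.
function keepAsciiAlphanumerics(string $text): string
{
    // preg_replace() returns null on a PCRE error, so fall back to an empty string.
    return preg_replace('/[^a-zA-Z0-9]+/', '', $text) ?? '';
}
```

Because the pattern has no u modifier it works byte by byte, so quotes, spaces and punctuation disappear, and so do accented letters. That makes it suitable for identifiers such as slugs or account codes, but too aggressive for people's names or free text.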
Opt instead for something like filter_var(), which will validate and sanitise based on the filters and options you define. For example, passing FILTER_SANITIZE_STRING with FILTER_FLAG_STRIP_HIGH into filter_var() along with the user input will remove all HTML tags and all characters with an ASCII value > 127. Note that FILTER_SANITIZE_STRING is deprecated as of PHP 8.1.
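Since FILTER_SANITIZE_STRING is deprecated as of PHP 8.1, the sketch below leans on filter_var()'s validation filters and uses FILTER_UNSAFE_RAW with FILTER_FLAG_STRIP_HIGH for the high-byte stripping. The helper names and the 1 to 120 age range are illustrative assumptions:

```php
<?php
// Hypothetical helper: returns the age as an int, or null unless it is a
// whole number between 1 and 120.
function parseAge($raw): ?int
{
    $age = filter_var($raw, FILTER_VALIDATE_INT, [
        'options' => ['min_range' => 1, 'max_range' => 120],
    ]);
    return $age === false ? null : $age;
}

// Hypothetical helper: returns the address if it is syntactically valid, else null.
function parseEmail($raw): ?string
{
    $email = filter_var($raw, FILTER_VALIDATE_EMAIL);
    return $email === false ? null : $email;
}

// Hypothetical helper: removes every byte with a value above 127.
function stripHighBytes(string $raw): string
{
    return filter_var($raw, FILTER_UNSAFE_RAW, ['flags' => FILTER_FLAG_STRIP_HIGH]);
}
```

Validating into a typed value (an int, or null on failure) tends to be easier to reason about than trying to clean a string after the fact, because downstream code never sees the rejected input.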
There are also a few Composer libraries which are commonly used for input sanitisation. A library like HTML Purifier offers standards-compliant HTML sanitisation with fairly good allow listing that lets you customise for the type of input data you need to let into your application.
3. Avoid unserialize() in PHP
This section could quite easily be a whole blog post on its own, and it has been a hotly debated topic over the years, even among the developers working on PHP. The PHP manual itself highlights the dangers of the unserialize() function, particularly that it should “not be passed untrusted user input”, which can cause “code to be loaded and executed”.
unserialize() is the counterpart to serialize(), which converts a value such as an object to a string that can be stored, passed to other functions or cached for later use; unserialize() turns that string back into a PHP value. On its own, it sounds relatively harmless until you start to understand how the underlying C code works when rebuilding unserialize() data in memory, which is where most of its issues can happen.
It is highly recommended to use a standard data format such as JSON via json_decode() and json_encode(). This provides a safer transport format for sending and receiving serialized data to and from the user, because decoding JSON does not instantiate objects of arbitrary classes.
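A minimal sketch of the JSON round trip, assuming PHP 7.3 or later for JSON_THROW_ON_ERROR; encodeProfile(), decodeProfile() and the depth limit of 16 are illustrative choices:

```php
<?php
// Hypothetical helper: encodes a plain array for storage or transport.
function encodeProfile(array $profile): string
{
    return json_encode($profile, JSON_THROW_ON_ERROR);
}

// Hypothetical helper: decodes untrusted JSON into a plain array, or returns
// null when the payload is malformed or is not a JSON object or array.
function decodeProfile(string $json): ?array
{
    try {
        // "true" requests associative arrays; 16 caps the nesting depth.
        $data = json_decode($json, true, 16, JSON_THROW_ON_ERROR);
    } catch (JsonException $e) {
        return null;
    }
    return is_array($data) ? $data : null;
}
```

The decoded result only ever contains arrays, strings, numbers, booleans and nulls, so the fields still need validating, but no class constructor or magic method runs during decoding. A payload in PHP's serialize() format is not valid JSON and is rejected.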
4. Use a PHP security linter
Having the right tools in place as part of your development workflow helps create secure and more functional code right from the start. Linters are a great way to reduce errors and potential issues as part of PHP application development and also reduce vulnerabilities in source code.
It’s also strongly advised to make sure display_errors is turned off by default in php.ini configurations. Turning it off, and limiting what error_reporting covers (for example E_ALL & ~E_NOTICE & ~E_WARNING), removes error output which could potentially be used to identify environment and configuration information within your application.
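The same settings can also be applied at runtime from a bootstrap file when php.ini is out of reach. This is a sketch, not a replacement for configuring php.ini: configureProductionErrors() is a hypothetical helper, and settings applied this way cannot hide parse errors in the script that sets them.

```php
<?php
// Hypothetical bootstrap helper: hide errors from visitors but keep logging them.
function configureProductionErrors(): void
{
    ini_set('display_errors', '0'); // never print errors into the response
    ini_set('log_errors', '1');     // write them to the server-side error log
    error_reporting(E_ALL);         // report everything to the log
}
```

This sketch reports everything to the log, which is a choice of its own; the mask can be narrowed, for example to E_ALL & ~E_NOTICE & ~E_WARNING, if the log becomes too noisy.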
The PHP language itself has a linter built in, which will display very verbose error messages as part of its validation. It can be called at the CLI (or as part of testing frameworks) by running php -l followed by the file to check. The downside to using this as a linter is that it can only check one file at a time, although it can be used as part of a loop to iterate over multiple files in a single run.
PHPLint is a robust, more widely known option for checking multiple files quickly. It can be used at the CLI or as a Composer-installed library. Alternatively, it can be run from a Docker image fairly easily. PHPLint can be used to check PHP 7 and PHP 8, and while it has fairly verbose output, it can identify issues using several linting processes at once.
Another popular linter is PHP-Parallel-Lint, which supports PHP 5.3 to PHP 8.0 and is also able to run multiple linter processes quicker than you can type echo "hello world". PHP-Parallel-Lint has more detailed output than the aforementioned options, although it does not yet support a CLI option or an out-of-the-box Docker solution.