Trends, Techniques, Tips & Tricks for the PHP Scripting Language

PHP Security Basics: The Golden Rules

February 24, 2009
Filter input. Escape output. You've heard it before, and you'll certainly hear it again. The reason is that rigorous application of these two rules can eliminate 80% of PHP security issues. There is a bit of a culture clash here. One of the early attractions of PHP was features like register_globals, which did away with masses of parsing code. But our collective innocence has been lost. The register_globals directive has been set to Off by default since PHP version 4.2.0 and should remain that way. There is a lesson here. Explicit filtering of input is a habit that serious web application developers must adopt. If your application provides any value at all, attackers will try to exploit any holes you leave.

So what is input and what is filtering? Input is any incoming data that may be manipulated by an attacker. Obviously data coming from the client qualifies as input, e.g., data accessible via the $_GET or $_POST superglobals. (You cannot rely on client-side JavaScript validation, since it is trivially bypassed.) Less obviously, several elements of the $_SERVER superglobal can be set by the client. In fact, it is a good idea to consider the entire $_SERVER superglobal array as input requiring filtering. Files read from the filesystem should certainly be considered input. Should data coming from the database be considered input? While the rigorous answer is "yes", particularly if the database server is located remotely from the web server, for many applications it is not unreasonable to tie application security to database security. In this case, a judgment call is required. Such a judgment should be made carefully, since it may limit the long-term potential of an application.
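As an illustration of treating $_SERVER as input, here is a minimal sketch that checks the client-supplied Host header against a list of known names before using it. The function name is_allowed_host and the host names are hypothetical, chosen only for this example.

```php
<?php
// Hypothetical helper: the name and the host list are assumptions for illustration.
// $_SERVER['HTTP_HOST'] comes from the request's Host header, so the client controls it.
function is_allowed_host($host, array $allowed_hosts) {
    if (!is_string($host)) {
        return false;
    }
    // Host names are case-insensitive, so compare in lower case.
    return in_array(strtolower($host), $allowed_hosts, true);
}

$allowed_hosts = array('www.example.com', 'example.com');
$host = isset($_SERVER['HTTP_HOST']) ? $_SERVER['HTTP_HOST'] : null;

if (!is_allowed_host($host, $allowed_hosts)) {
    // Fall back to a known-good value instead of trusting the header.
    $host = 'www.example.com';
}
```

The same idea applies to other client-influenced elements such as $_SERVER['HTTP_USER_AGENT'] or $_SERVER['HTTP_REFERER']: check them against what the application expects before they are used.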

Input becomes a PHP variable value. Input filtering is about ensuring that the variable value conforms to programmatic expectations. For some types of variable (integers, dates, phone numbers, credit card numbers, email addresses, URLs, etc.) these expectations are well defined. However, part of what makes each application unique is the variable types it defines, implicitly or explicitly, so input filtering is not as straightforward as it might seem. In addition, user expectations play a role. It is safest to require that usernames consist entirely of alphanumeric characters, but many systems also allow underscores, periods and dashes. More and more systems allow spaces, and disallowing single quotes is sure to annoy the O'Reillys of the world. It is possible to cater for all these cases. The point is to set programmatic expectations for variable values, and then ensure that those expectations are met before the variables are used.
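To make "programmatic expectations" concrete, here is a minimal sketch of three whitelist-style checks. The function names, the length limits and the exact set of characters allowed in a username are assumptions for illustration; each application has to choose its own rules.

```php
<?php
// Hypothetical validators: names, ranges and character sets are illustrative assumptions.

// Expect a string of digits that represents an integer between $min and $max.
function is_valid_quantity($value, $min = 1, $max = 100) {
    // ctype_digit() rejects empty strings, signs, spaces and decimal points.
    if (!is_string($value) || !ctype_digit($value) || strlen($value) > 9) {
        return false;
    }
    $n = (int) $value;
    return $n >= $min && $n <= $max;
}

// Expect 3 to 32 characters: letters, digits, underscore, period, dash, single quote.
function is_valid_username($value) {
    return is_string($value)
        && preg_match('/\A[A-Za-z0-9_.\'-]{3,32}\z/', $value) === 1;
}

// Expect something shaped like an email address, using PHP's filter extension.
function is_valid_email($value) {
    return is_string($value)
        && filter_var($value, FILTER_VALIDATE_EMAIL) !== false;
}
```

With these rules, is_valid_username("O'Reilly") passes, while a value containing a space or a semicolon does not. Note that the checks only answer yes or no; they never alter the value.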

By the way, don't try to modify input so that it conforms to expectations. This just introduces a layer of complexity that can itself easily result in new security vulnerabilities. Provide feedback to the application user about input expectations and simply require that they comply. Again, it is perfectly reasonable to extend the application so that it accepts a wider range of sensible input, but once expectations have been set, simply require compliance.
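A minimal sketch of the "require compliance" approach: the handler below checks each field, leaves the submitted values untouched, and returns messages that can be shown to the user. The function name validate_signup, the field names and the message wording are all hypothetical.

```php
<?php
// Hypothetical form check: field names, rules and messages are assumptions.
// Returns an array of error messages keyed by field; an empty array means all is well.
function validate_signup(array $input) {
    $errors = array();

    $username = isset($input['username']) ? $input['username'] : null;
    if (!is_string($username)
        || preg_match('/\A[A-Za-z0-9]{3,32}\z/', $username) !== 1) {
        // Reject and explain; do not strip or rewrite the offending characters.
        $errors['username'] = 'Usernames must be 3 to 32 letters or digits.';
    }

    $age = isset($input['age']) ? $input['age'] : null;
    if (!is_string($age) || !ctype_digit($age) || strlen($age) > 3) {
        $errors['age'] = 'Age must be a whole number.';
    }

    return $errors;
}
```

A caller might run $errors = validate_signup($_POST); and redisplay the form with those messages whenever the array is not empty, proceeding only when every expectation has been met.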

Escaping output is typically much more straightforward. Some characters or character sequences have special meaning for the systems to which you send output. The standard examples are HTML markup for the client and SQL syntax for the database. Of course you'll want to send markup to the client and commands to the database, but you'll almost never want the "active" parts of those sequences to come from inside your application variables. When you do, it should be very explicit, and even then carefully controlled. For the standard examples, use htmlentities (with the ENT_QUOTES flag, so that single quotes are escaped as well as double quotes) for variables you're sending to the client and the equivalent of mysql_real_escape_string for your database. If your database doesn't have a vendor-specific string escape function, you should write one. Since writing such a function can require considerable research, you can use addslashes as a fallback, but be aware that you will remain vulnerable to vendor-specific attacks.
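For the HTML half of this advice, a minimal sketch follows. The wrapper name h and the choice of UTF-8 as the character set are assumptions; the database half is left out because it depends on the vendor and on an open connection.

```php
<?php
// Hypothetical wrapper: the name h() and the UTF-8 charset are assumptions.
// ENT_QUOTES makes htmlentities() escape single quotes as well as double quotes.
function h($value) {
    return htmlentities((string) $value, ENT_QUOTES, 'UTF-8');
}

// Illustrative value standing in for something that arrived as input.
$comment = '<script>alert("x")</script>';

// The markup comes from the application; the variable is escaped on the way out.
$html = '<p>' . h($comment) . '</p>';
```

Here the "active" characters in $comment reach the browser as entities such as &lt; and &quot;, so they are displayed as text rather than interpreted as markup.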

Image credit: Pieter Musterd