Create a Simple Input Sanitation Function With PHP.
Saturday, May 2nd, 2009
I don't care what anybody tells me: PHP security is the number one thing I'm concerned about when writing a script. If you insert data from $_POST or $_GET directly into a MySQL database, you could (and probably will) be in for a world of trouble. Today I'll walk you through the steps of creating a very easy-to-use input sanitizing function in PHP.
Structure, Your Function Needs It
To start things off, open up some PHP tags and create a new function named clean with two parameters: $str and $html. $str should default to an empty string (I use two single quotes) and $html should default to false because, by default, all HTML should be stripped.
We'll also want to check if the user passed in an empty string. If they did, return false.
<?php
function clean($str = '', $html = false) {
    if (empty($str)) return false;
    // The rest of the code goes here...
}
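One thing worth checking before relying on that guard: empty() treats more than the empty string as empty. Here is a small sketch, using made-up sample values, of which inputs would make clean() bail out early:

```php
<?php
// Made-up sample values; each one is tested with empty().
$samples = array('', '0', 0, null, array(), ' ', 'hello');

$results = array();
foreach ($samples as $value) {
    // json_encode() gives a short readable label, e.g. "0" vs 0
    $results[json_encode($value)] = empty($value);
}

foreach ($results as $label => $isEmpty) {
    echo str_pad($label, 8), ' => ', ($isEmpty ? 'empty' : 'not empty'), "\n";
}
```

Only ' ' and 'hello' pass the check. A form field holding the string '0' counts as empty, so clean() would return false for it rather than '0'.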
Is the string actually an array?
Sometimes it's nice to be able to pass an array into a function and still get the results you wanted without writing a loop outside the function. A simple if/else statement can check whether the user passed in an array.
if (is_array($str)) {
    foreach($str as $key => $value) $str[$key] = clean($value, $html);
} else {
    // The code for cleaning $str goes here...
}
The foreach loop walks through the given array and applies the clean function to each item. Note that we're passing in $value and not $str. This is because $str is the array and $value is the value of a specific item in the array.
From here on out, the rest of the function's code will go in the else section... besides the very last line.
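To see the recursion on its own, here is a minimal sketch with a stand-in cleaner. clean_demo is a hypothetical name, and it only calls strip_tags and trim rather than the full function built in this article:

```php
<?php
// Hypothetical stand-in for clean(): same array handling, simplified cleaning.
function clean_demo($str) {
    if (is_array($str)) {
        // Recurse into every item, so nested arrays are handled too
        foreach ($str as $key => $value) {
            $str[$key] = clean_demo($value);
        }
    } else {
        $str = trim(strip_tags($str));
    }
    return $str;
}

// Made-up input with one nested array
$input = array(
    'name' => '  <b>Vasili</b> ',
    'tags' => array('<i>php</i>', ' security '),
);

$output = clean_demo($input);
print_r($output);
```

The keys survive because the loop writes each cleaned value back to $str[$key], and the nested 'tags' array is cleaned by the same call one level down.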
De-escape the String
Sometimes PHP automatically escapes strings for us (magic quotes). Sometimes that's helpful; this time it's not. We'll want to strip the added slashes if get_magic_quotes_gpc() returns 1, or true.
if (get_magic_quotes_gpc()) $str = stripslashes($str);
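A note for newer PHP versions: magic quotes were removed in PHP 5.4, and get_magic_quotes_gpc() itself was removed in PHP 8.0, so calling it unguarded there is a fatal error. Below is a minimal sketch of a guarded check plus what stripslashes actually undoes. magic_quotes_active is a hypothetical helper name, not part of the original function:

```php
<?php
// Hypothetical helper: true only when the runtime still has magic quotes on.
function magic_quotes_active() {
    // function_exists() avoids a fatal error where the function has been removed.
    // The @ silences the deprecation notice PHP 7.4 raises for this call.
    return function_exists('get_magic_quotes_gpc') && @get_magic_quotes_gpc();
}

// What magic quotes would have produced for the input O'Reilly:
// a backslash in front of the single quote.
$escaped = 'O\\\'Reilly';

// stripslashes() removes that backslash again
$unescaped = stripslashes($escaped);

echo $escaped, ' -> ', $unescaped, "\n";
var_dump(magic_quotes_active());
```

With a helper like this, the line above could be written as if (magic_quotes_active()) $str = stripslashes($str); and would keep working on both old and new installs.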
Strip HTML
For this section of the function, an if statement with two elseif branches is required. Is $html an array? Is $html a valid HTML tag? Or is $html not set to true?
if (is_array($html)) $str = strip_tags($str, implode('', $html));
elseif (preg_match('|<([a-z]+)>|i', $html)) $str = strip_tags($str, $html);
elseif ($html !== true) $str = strip_tags($str);
The three lines break down like this:
- Is $html an array? If so, use strip_tags to remove every HTML tag not in the array. implode is used to turn the array into a string with nothing between each tag.
- Is $html a valid HTML tag? If so, strip every tag except that one. Since $html is already a string, implode isn't necessary.
- Is $html not equal to true? I use not equal to true instead of equal to false ($html === false) because it's better to say what it can't be than what it should be in this case. If $html is not equal to true, then remove every HTML tag, no exceptions.
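To make the three branches concrete, here is a sketch of what strip_tags returns for each kind of $html value. The sample markup is made up for illustration:

```php
<?php
$str = '<p>Hello <b>big</b> <a href="#">world</a></p>';

// $html is an array: the tags are joined into '<b><a>' and kept
$allowed  = array('<b>', '<a>');
$keepSome = strip_tags($str, implode('', $allowed));

// $html is a single tag: only that tag is kept
$keepOne = strip_tags($str, '<b>');

// $html is false (the default): every tag is removed
$keepNone = strip_tags($str);

echo $keepSome, "\n";
echo $keepOne, "\n";
echo $keepNone, "\n";
```

Notice that the allowed a tag keeps its href attribute. strip_tags leaves attributes on allowed tags untouched, so anything you allow through should be a tag you trust.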
Finishing Up
Before returning the string, if there is any leading or trailing white space we want to strip it.
$str = trim($str);
Finally, return $str so the user can do whatever they want with their newly cleaned data. Make sure this is placed outside the if/else statement.
Conclusion
As you can see, with only 15 lines (19 if you count PHP tags and the defining of the function) of code you can make your website a safer place for you and your visitors. I have to admit that this is the exact same function I use on my website for pretty much everything. Do you guys have your own clean/sanitize function you use?
Final Code
<?php
function clean($str = '', $html = false) {
    if (empty($str)) return false;
    if (is_array($str)) {
        foreach($str as $key => $value) $str[$key] = clean($value, $html);
    } else {
        if (get_magic_quotes_gpc()) $str = stripslashes($str);
        if (is_array($html)) $str = strip_tags($str, implode('', $html));
        elseif (preg_match('|<([a-z]+)>|i', $html)) $str = strip_tags($str, $html);
        elseif ($html !== true) $str = strip_tags($str);
        $str = trim($str);
    }
    return $str;
}
?>
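Here is a short usage sketch, assuming a made-up form submission in place of $_POST. Because get_magic_quotes_gpc() no longer exists on PHP 8, the copy below is named clean_compat (a hypothetical name) and guards that call with function_exists. Otherwise it follows the function built in this article:

```php
<?php
// Hypothetical PHP 8-friendly copy of clean(); only the magic quotes line differs.
function clean_compat($str = '', $html = false) {
    if (empty($str)) return false;
    if (is_array($str)) {
        foreach ($str as $key => $value) $str[$key] = clean_compat($value, $html);
    } else {
        // Guarded so the sketch also runs where the function has been removed
        if (function_exists('get_magic_quotes_gpc') && @get_magic_quotes_gpc()) $str = stripslashes($str);
        if (is_array($html)) $str = strip_tags($str, implode('', $html));
        elseif (preg_match('|<([a-z]+)>|i', $html)) $str = strip_tags($str, $html);
        elseif ($html !== true) $str = strip_tags($str);
        $str = trim($str);
    }
    return $str;
}

// Made-up stand-in for $_POST
$post = array(
    'username' => '  <b>vasili</b>  ',
    'bio'      => '<p>I write <b>PHP</b>.</p>',
);

$plain    = clean_compat($post);         // every tag stripped
$withBold = clean_compat($post, '<b>');  // only <b> is kept

print_r($plain);
print_r($withBold);
```

The same call works for a single string or a whole array, which is the point of the is_array branch.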

Great article Vasili, simple and effective.
I’m personally a fan of always using curly brackets, but in this situation it’s just a matter of preference, I suppose.
Hope to see more like this from you.
The only problem I’m noticing is your regex use with strip_tags: the regex fails if there are attributes with the tag, and strip_tags has some issues with said attributes as well. Also this will not protect you against XSS if any of the values you’re using wind up inside of a . Otherwise good work.
Well, usually if you’re trying to strip links you aren’t going to strip some depending on attributes – at least I don’t. As for XSS, I have a completely different function to check for XSS.
Thanks for the comment.
Does this work for the items between the tags?
I don’t understand what you mean. If you mean stripping the content between the tags, then no; it just removes the tags (and whatever is in them, i.e. attributes).
Right. First off, empty() is a really bad choice here, since the function returns false if empty() evaluates to true, and since $num = 0; will make empty($num) return true, well, you get the point. You’re better off using $str != "".
Also, you should be consistent with the syntax of the if and else-clauses by always using {} for increased readability.
Regarding arrays, I recommend taking a look at array_map, which is a really handy function and will probably save some work when writing array-compatible code.