
Question

If I'm sanitizing my DB inserts and also escaping the HTML I write with htmlentities($text, ENT_COMPAT, 'UTF-8'), is there any point in also filtering the inputs with xss_clean? What other benefits does it give?

edited Mar 17 at 17:49

jondavidjohn

asked Mar 17 at 9:26

Dan Searle

htmlentities($text, ENT_COMPAT, 'UTF-8') is not a good method of stopping XSS; no one should be using this. – Rook Mar 18 at 5:55

htmlentities is absolutely proof against HTML injection, though ENT_QUOTES is needed instead of ENT_COMPAT if you ever use single-quote attribute delimiters. htmlspecialchars is generally preferable to htmlentities, though, as it has less chance of messing up the charset. CodeIgniter's xss_clean is a worthless cargo-cult-programming disaster area full of wrongheaded misunderstandings of what constitutes string handling. – bobince Aug 20 at 10:32


Rook · Answer 1 · 2011-08-20 18:38:49Z

xss_clean() is extensive, and also silly. 90% of this function does nothing to prevent XSS; for example, it looks for the word alert but not document.cookie. No hacker is going to use alert in their exploit. They are going to hijack the cookie with XSS, or read a CSRF token to make an XHR.

However, running htmlentities() or htmlspecialchars() together with it is redundant. A case where xss_clean() fixes the issue and htmlentities($text, ENT_COMPAT, 'UTF-8') fails is the following:

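<?php
// ENT_COMPAT leaves single quotes unescaped, so a ' in the input closes the attribute.
$var = htmlentities($_GET['var'], ENT_COMPAT, 'UTF-8');
print "<img src='$var'>";
?>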

A simple PoC is:

http://localhost/xss.php?var=http://domain/some_image.gif'%20onload=alert(/xss/)

This will add the onload= event handler to the image tag. One way of stopping this form of XSS is htmlspecialchars($var, ENT_QUOTES); in this case xss_clean() will also prevent it.
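To make the difference between the two flags concrete, here is a minimal sketch (not part of the original answer) that runs the PoC payload through both settings. The helper name render_img and the variable names are made up for illustration; only htmlspecialchars() and its flags are real PHP.

```php
<?php
// Hypothetical helper: builds the same single-quoted <img> tag as the example above.
function render_img(string $src, int $flags): string
{
    return "<img src='" . htmlspecialchars($src, $flags, 'UTF-8') . "'>";
}

// The payload from the PoC URL, after URL decoding.
$payload = "http://domain/some_image.gif' onload=alert(/xss/)";

// ENT_COMPAT escapes double quotes only, so the single quote still closes the attribute.
$unsafe = render_img($payload, ENT_COMPAT);

// ENT_QUOTES also encodes the single quote, so the payload stays inside the src value.
$safe = render_img($payload, ENT_QUOTES);

echo $unsafe, "\n", $safe, "\n";
```

In the ENT_COMPAT output the injected quote survives and onload= becomes a separate attribute; in the ENT_QUOTES output the only literal single quotes left are the two attribute delimiters.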

However, quoting from the xss_clean() documentation:

Nothing is ever 100% foolproof, of course, but I haven't been able to get anything passed the filter.

That being said, XSS is an output problem, not an input problem. For instance, this function cannot take into account that the variable is already within a <script> tag or an event handler. It also doesn't stop DOM-based XSS. You need to consider how you are using the data in order to pick the right function. Filtering all data on input is bad practice: not only is it insecure, it also corrupts data, which can make comparisons difficult.
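As a rough illustration of "escape for the context you are writing into", the sketch below stores a value unmodified and encodes it differently for an HTML body and for an inline <script> block. The variable names and the sample value are assumptions for the example, and it only covers these two contexts.

```php
<?php
// Hypothetical value, stored unmodified and escaped only at output time.
$comment = "</script><script>alert('xss')</script>";

// HTML body context: escape HTML metacharacters.
$html = '<div>' . htmlspecialchars($comment, ENT_QUOTES, 'UTF-8') . '</div>';

// Inline <script> context: HTML escaping is the wrong tool here. Emit a JSON
// string literal and hex-escape <, >, &, ' and " so the value cannot close the block.
$js = '<script>var comment = '
    . json_encode($comment, JSON_HEX_TAG | JSON_HEX_AMP | JSON_HEX_APOS | JSON_HEX_QUOT)
    . ';</script>';

echo $html, "\n", $js, "\n";
```

The same stored string ends up encoded two different ways, which is something a single filter applied at input time cannot do, since it has no idea where the value will later be printed.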

Thanks. I guess the important point is really "what are you doing with the data". The job I was working on when this came up was an editable block of text, where I didn't want any active HTML tags at all, so my solution works in that case. The output is dumped into a DIV, and with all HTML tags encoded I don't see how anything malicious could be inserted there. Of course if I wanted to allow some HTML in the input that would complicate things. Still, I'm not too happy with the idea of encoding everything on input, I'd rather handle it as needed on output (while protecting the db of course).
@Dan Searle if you want a safe subset of HTML then you should check out htmlpurifier.org
XSS is an output problem and not an input problem, which is why something like xss_clean() can never be a reliable approach to solving XSS problems (and xss_clean() itself is a laughably terrible implementation even by the low, low standards of anti-XSS tools). I'm very surprised you appear to be endorsing it as a preferable alternative to output-level escaping in your first sentence.
@bobince you're totally correct. In fact 90% of the checks made in xss_clean() do nothing for security, such as looking for alert() but not document.cookie. But later in my post I did say XSS is an output problem, and that there is only one way that I know of where XSS is possible. But I'll update this post.
Thanks, that's better! Actually the current version of xss_clean does appear to look for document.cookie... although obviously this is still completely useless as it wouldn't catch document . cookie, document['cookie'] or any of the other thousand ways you could refer to it.
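The point about blacklists is easy to reproduce. The filter below is a deliberately naive, hypothetical one written for this example; it is not CodeIgniter's xss_clean() and makes no claim about how that function is implemented. It only shows why matching a fixed spelling misses trivially different ones.

```php
<?php
// Hypothetical blacklist filter for illustration only; NOT CodeIgniter's xss_clean().
function naive_blacklist(string $input): string
{
    return str_ireplace('document.cookie', '[blocked]', $input);
}

$variants = [
    "document.cookie",      // exact spelling: caught
    "document . cookie",    // whitespace around the dot: missed
    "document['cookie']",   // bracket notation: missed
];

foreach ($variants as $v) {
    $out = naive_blacklist($v);
    echo $v, ' => ', ($out === $v ? 'passes through unchanged' : 'rewritten'), "\n";
}
```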

jondavidjohn · Answer 2 · 2011-03-17 13:37:18Z

Yes, you should still be using it. I generally make it a rule to use it at least on public-facing input, meaning any input that anyone can access and submit to.

Generally sanitizing the input for DB queries seems like a side-effect as the true purpose of the function is to prevent Cross-site Scripting Attacks.

I'm not going to get into the nitty-gritty details of every step xss_clean takes, but I will tell you it does more than the few steps you mentioned. I've pastied the source of the xss_clean function so you can look for yourself; it is fully commented.

