Hi.
I've noticed a strange behavior of NoScript's Anti-XSS subsystem.
I have a locally saved html file with the following code:
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html>
<head><meta content="text/html; charset=windows-1251" http-equiv="content-type">
<title>Title</title>
</head>
<body>
<form action="http://yandex.ru/yandsearch">
<input name=text size=55>
<input type=submit value="Search">
</form>
</body>
</html>
It generally works fine (it submits a search query to the yandex.ru search engine), except in some cases:
I've noticed that whenever my search query contains "урав", the Anti-XSS system alerts me that it has triggered and sanitized the request, removing "урав" (or %F3%F0%E0%E2) from the URL.
Why is that happening? Other search queries do not trigger such behavior.
Mozilla/5.0 (Windows NT 6.1; WOW64; rv:15.0) Gecko/20100101 Firefox/15.0.1
[NoScript XSS] Sanitized suspicious request. Original URL [http://yandex.ru/yandsearch?text=%F3%F0%E0%E2+gena01] requested from [file:///C:/TMP/SEA/yandex.html]. Sanitized URL: [http://yandex.ru/yandsearch?text=%20+gena01#9875335212563178277].
Search term: урав gena01
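For anyone following along, here is a minimal Python sketch of where the %F3%F0%E0%E2 in the log comes from. It only illustrates how the same search term is percent-encoded under the two page charsets discussed in this thread; it is not NoScript or browser code, and the variable names are made up for the example.

```python
from urllib.parse import quote_plus

query = "урав gena01"

# Form submitted from a windows-1251 page: one byte per Cyrillic letter.
legacy = quote_plus(query, encoding="cp1251")

# Same form submitted from a UTF-8 page: two bytes per Cyrillic letter.
modern = quote_plus(query, encoding="utf-8")

print(legacy)  # %F3%F0%E0%E2+gena01
print(modern)  # %D1%83%D1%80%D0%B0%D0%B2+gena01
```

The first output matches the query string in the original URL reported by NoScript above.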
And there's more to it than that.
If the local file (in my case, "yandex.html") has its Character Encoding set to Cyrillic (Windows-1251) and I have allowed yandex.ru, then I get the XSS warning and I am also returned results.
If yandex.html is set to Unicode (UTF-8) and yandex.ru is allowed, then there is no XSS warning and the page states (and effectively shows) "nothing found".
Is there any reason why you have to use the legacy charset on your local page, when even yandex.ru uses UTF-8 (which is the current standard for internationalized pages)?
The problem is that the page's charset causes the query string to be encoded in an obsolete way. With the combination of characters used in your query, the resulting byte sequence is one that could be used in an attack exploiting a buggy behavior of the PHP utf8_decode() function.
Notice that sending query strings that are not UTF-8 encoded across different domains (which is what may trigger this false positive) is extremely rare nowadays, and it doesn't justify a work-around that could be used to circumvent the filter's protection against the utf8_decode() bug.
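To illustrate the point about non-UTF-8 query strings, here is a rough Python sketch that checks whether a percent-encoded query decodes as valid UTF-8. It is not NoScript's actual filter logic and says nothing about what utf8_decode() does with these bytes; is_utf8_query is a hypothetical helper written only for this example. In UTF-8, the byte 0xF3 announces a four-byte sequence and must be followed by continuation bytes in the range 0x80 to 0xBF, which 0xF0, 0xE0 and 0xE2 are not.

```python
from urllib.parse import unquote_to_bytes


def is_utf8_query(raw_query: str) -> bool:
    """Hypothetical helper: True if the percent-decoded bytes are valid UTF-8."""
    data = unquote_to_bytes(raw_query)
    try:
        data.decode("utf-8")  # strict decoding rejects malformed sequences
        return True
    except UnicodeDecodeError:
        return False


legacy = "text=%F3%F0%E0%E2+gena01"               # sent from the windows-1251 page
modern = "text=%D1%83%D1%80%D0%B0%D0%B2+gena01"   # same term sent as UTF-8

print(is_utf8_query(legacy))  # False
print(is_utf8_query(modern))  # True

# The legacy bytes only make sense when read as windows-1251.
print(unquote_to_bytes(legacy)[5:9].decode("cp1251"))  # урав
```

So the windows-1251 request is legitimate text in its own charset but a malformed sequence when read as UTF-8, which is the ambiguity described above.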
iDrugoy wrote: The problem is not just with my homepage.
Visiting this link will also cause an Anti-XSS false positive:
... and how did you create that link, exactly?
It doesn't matter how I got this URL. The fact is that it is a valid URL and that it causes a false-positive Anti-XSS warning in your extension.
It does matter, because it happens to be indistinguishable from an attack against a known PHP multibyte decoding weakness which can be exploited to bypass XSS filters.
Therefore, knowing whether this legitimate false positive is a common occurrence (which I believe it is not, because it appears to be caused by an artificial character encoding mismatch) is important for making a cost-benefit assessment of a (not necessarily possible) work-around.