Shy char in ABE filter
Posted: Fri Oct 08, 2010 9:04 pm
by Guest
I've been using NoScript, and appreciate the product very much. I've written a couple of ABE filters as well. An issue came to my attention concerning the "shy" (soft hyphen) character, which displays like a hyphen, so I made the filter shown here.
Because of possible rendering issues, I'm using the ordinary hyphen (character 45) in this post in place of the soft hyphen:
# A soft hyphen - chr 173
Site ^.*-*
Deny
I tested the filter with a hyphen, and it works. In Notepad, I can copy a soft hyphen "", which doesn't render on the web page but does in Notepad. When I pasted it into the filter and saved the user filter, ABE no longer showed that I have a USER filter. Right now, those double quotes don't show anything in the middle, but if I copy and paste them into Notepad, something that looks like a hyphen, but isn't, appears. I went to my profile, removed the soft hyphen, and my USER profile magically reappeared.
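For anyone hitting the same confusion, one way to tell the two characters apart is to look at their code points instead of how they render. This is a plain JavaScript sketch, not part of NoScript or ABE; the helper name `revealInvisible` is made up for illustration.

```javascript
// Hypothetical helper (not part of NoScript/ABE): replaces each soft hyphen
// with a visible "\u00ad" escape so it can be spotted in a rule.
function revealInvisible(text) {
  return text.replace(/\u00ad/g, "\\u00ad");
}

const hyphen = "-";          // hyphen-minus, chr 45
const softHyphen = "\u00ad"; // soft hyphen (shy), chr 173

console.log(hyphen.codePointAt(0));     // 45
console.log(softHyphen.codePointAt(0)); // 173
console.log(revealInvisible("Site ^.*" + softHyphen)); // Site ^.*\u00ad
```

Running a pasted rule through something like this makes it obvious whether it contains chr 45 or chr 173.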
The issue with shy characters is discussed here:
http://threatpost.com/en_us/blogs/spamm ... rls-100710
Re: Shy char in ABE filter
Posted: Fri Oct 08, 2010 9:42 pm
by Giorgio Maone
The rule disappearing seems to be a bug in Firefox's textarea rendering.
You can work around it by using regexp Unicode escapes:
Code:
# A soft hyphen - chr 173 (0xad)
Site ^.*\u00ad
Deny
However, I did not manage to reproduce a shy character phishing attempt in Firefox 3.6 and above: the shy characters seem to be "eaten" when used in a URL (i.e. the string length is the same as if the characters didn't exist at all).
You can easily verify by entering the following in your URL bar:
Code:
javascript:location.href = "http://go\u00ad\u00adogle.com";void(0)
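A similar check can be sketched outside the browser. The snippet assumes a JavaScript runtime with the WHATWG `URL` class (for example a recent Node.js), which is not the Firefox 3.6 code path discussed in this thread, so treat it as an illustration and not a reproduction. It applies the regular expression from the rule to the raw string and to the parsed host name.

```javascript
// Assumption: a runtime with the WHATWG URL class (e.g. a recent Node.js).
// This is not the Firefox 3.6 code path, only an illustration.
const raw = "http://go\u00ad\u00adogle.com";
const shyRule = /^.*\u00ad/; // same pattern as the ABE Site rule

const host = new URL(raw).hostname;

console.log(shyRule.test(raw));  // true: the raw string still has soft hyphens
console.log(host);               // google.com
console.log(shyRule.test(host)); // false: nothing left for the rule to match
```

If the URL is cleaned up like this before a filter sees it, a soft hyphen rule would have little left to match against.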
Re: Shy char in ABE filter
Posted: Fri Oct 08, 2010 10:12 pm
by Guest
Thanks for the filter. My home page is on my C drive, so I put some soft hyphens in some of the URLs, and it seems that Firefox cleans up the URL and gets rid of the soft hyphens before sending it to NoScript, which was kinda what I expected. But with that disappearing USER filter, I wasn't sure.
Re: Shy char in ABE filter
Posted: Fri Oct 08, 2010 11:00 pm
by al_9x
The URIs are encoded (IDN hostnames are punycoded, other parts are utf-8 % encoded). To match an illegal char(s), you'd have to match their encoded representation, and therefore not need unicode escapes in ABE. Is that not the case, Giorgio?
Re: Shy char in ABE filter
Posted: Sat Oct 09, 2010 7:30 am
by Giorgio Maone
al_9x wrote:The URIs are encoded (IDN hostnames are punycoded, other parts are utf-8 % encoded). To match an illegal char(s), you'd have to match their encoded representation, and therefore not need unicode escapes in ABE. Is that not the case, Giorgio?
Yes it is, but while matching against the percent-encoding thing is OK (even though XSS exceptions match the decoded form for flexibility), I consider matching IDN-encoded host names (only) a bug: since IDN encoding is positional, for a partial filter to be effective it should match both the IDN-encoded (as it does now) and the "externalized" (unicode) representation of host names or, if this double check is too much a performance hit, at least the latter.
This will be fixed in a future release.
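To illustrate why a partial filter needs both forms, here is a minimal sketch using Node.js's `url.domainToASCII` and `url.domainToUnicode`. It is not NoScript's implementation; `matchesEitherForm` is a hypothetical helper, and the host name is only an example.

```javascript
// Sketch only: not NoScript's code. Uses Node.js's built-in url module.
const { domainToASCII, domainToUnicode } = require("url");

// Hypothetical helper: test a pattern against both the IDN-encoded
// (Punycode) host name and its Unicode form.
function matchesEitherForm(pattern, hostName) {
  return pattern.test(domainToASCII(hostName)) ||
         pattern.test(domainToUnicode(hostName));
}

const encoded = domainToASCII("espa\u00f1ol.com"); // "español.com"
console.log(encoded); // xn--espaol-zwa.com

const partial = /\u00f1/; // partial filter on the single character "ñ"
console.log(partial.test(encoded));               // false: "ñ" is absent from the encoded form
console.log(matchesEitherForm(partial, encoded)); // true: found in the Unicode form
```

Checking both forms costs two regexp tests per host name, which is the performance trade-off mentioned in the post.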
Re: Shy char in ABE filter
Posted: Sat Oct 09, 2010 5:05 pm
by Giorgio Maone
Giorgio Maone wrote:since IDN encoding is positional, for a partial filter to be effective it should match both the IDN-encoded (as it does now) and the "externalized" (unicode) representation of host names or, if this double check is too much a performance hit, at least the latter.
This will be fixed in a future release.
The latest development build matches both.