[WEB SECURITY] MultiByte Attacks
Chris Weber
chris at casabasecurity.com
Wed Mar 24 01:00:58 EDT 2010
You could use ICU’s conversion tools at http://demo.icu-project.org/icu-bin/convexp. One caveat is that a charset can be implemented differently on different platforms and by different vendors, so the data that ICU uses may or may not be accurate for your platform. In most cases I’d guess it would be the same, but don’t count on it. Charset aliases have also been ambiguous and have varied across implementations.
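As a rough local alternative to browsing converter tables, the sketch below asks Python's bundled codecs which lead bytes combine with a given trail byte to form one valid character. The function name is made up for illustration, and Python's tables carry the same caveat as ICU's: they may not match the target platform.

```python
def lead_bytes_absorbing(trail: int, codec: str) -> list[int]:
    """Return lead bytes that pair with `trail` to form one valid character."""
    leads = []
    for lead in range(0x80, 0x100):
        try:
            text = bytes([lead, trail]).decode(codec)  # strict decoding
        except UnicodeDecodeError:
            continue  # not a legal two-byte sequence in this codec
        if len(text) == 1:  # both bytes were consumed as a single character
            leads.append(lead)
    return leads


if __name__ == "__main__":
    # 0x5c is the ASCII backslash that an escaping function would insert.
    for codec in ("gbk", "big5", "shift_jis", "utf-8"):
        leads = lead_bytes_absorbing(0x5C, codec)
        print(codec, len(leads), [hex(b) for b in leads[:5]])
```

An empty list for a codec suggests that 0x5c can never be swallowed as a trail byte there, at least according to Python's tables.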
A good bet in general is to do what you did whenever you’re dealing with CJK charsets. The trick of injecting 0xbf27 so that addslashes() ‘sanitizes’ it into 0xbf5c27 has the same effect in many of the multi-byte character sets: the 0xbf5c pair together represents a valid character, and the 0x27 is left hanging out solo.
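To make the byte shuffling concrete, here is a minimal Python sketch. The addslashes() below is only a byte-level imitation of the PHP function written for this example, not PHP itself, and the payload string is a made-up sample.

```python
def addslashes(data: bytes) -> bytes:
    """Byte-level imitation of PHP addslashes(): escape ' " \\ and NUL."""
    out = bytearray()
    for b in data:
        if b in (0x27, 0x22, 0x5C):
            out += bytes([0x5C, b])  # prefix the byte with a backslash
        elif b == 0x00:
            out += b"\\0"            # NUL becomes backslash + "0"
        else:
            out.append(b)
    return bytes(out)


payload = b"\xbf\x27 OR 1=1 -- "   # hypothetical injection attempt
escaped = addslashes(payload)       # 0xbf 0x27 -> 0xbf 0x5c 0x27
decoded = escaped.decode("gbk")     # what a GBK-aware consumer would see

print(escaped[:3].hex())            # the three bytes after escaping
print(repr(decoded[1:]))            # everything after the first character
```

The escaping works on bytes and knows nothing about GBK, so the backslash it inserts becomes the trail byte of a two-byte character and the quote is left unescaped.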
Many of the DBCS and MBCS use a lead byte in the range [\x80-\xff] followed by a trailing byte in the range [\x40-\xff]. That’s an over-generalized view, though: individual charsets have exceptions, illegal bytes, more specific allowed ranges, etc. For example, GB18030 actually allows bytes in the range \x30-\x39 to be used as the second byte in a multi-byte sequence, e.g. a sequence beginning 81 31 81.
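A quick way to see the GB18030 exception is to decode a four-byte sequence with Python's gb18030 codec. This is only an illustration; the sequence 81 30 81 30 was picked for the example, and Python's tables may differ from those on your platform.

```python
# A four-byte GB18030 sequence: bytes 2 and 4 fall in the ASCII digit range.
four_byte = bytes.fromhex("81308130")
decoded = four_byte.decode("gb18030")

# Collect the bytes that a byte-level filter would take for ASCII digits.
digit_bytes = [b for b in four_byte if 0x30 <= b <= 0x39]

print(len(four_byte), "bytes ->", len(decoded), "character")
print("bytes in the ASCII digit range:", [hex(b) for b in digit_bytes])
```

So a filter that works on raw bytes can see ASCII digits where the decoder sees only part of a single character.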
Note that 0xbf27 is an illegal byte sequence in just about any variable-width encoding. So in your GBK example, if a charset converter were to get ahold of the input before a function like addslashes() did, the sequence would be handled as an error and you’d get a different result. There are also charsets where 0x5c represents a Won ₩ or Yen ¥ sign and doesn’t map to the ASCII backslash. And if you encounter an EBCDIC encoding, almost nothing maps directly to ASCII, and you can have about as much fun as you would if UTF-7 were allowed.
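The ordering point can be checked with a short Python sketch. It assumes a strict GBK converter behaves like Python's gbk codec, which may not hold for every implementation, and it uses cp037 as one sample EBCDIC code page.

```python
payload = b"\xbf\x27"  # lead byte followed by an ASCII single quote

# A strict converter that sees the raw input first rejects the sequence.
try:
    payload.decode("gbk")
    rejected = False
except UnicodeDecodeError:
    rejected = True

# In an EBCDIC code page the quote is not byte 0x27 at all.
ebcdic_quote = "'".encode("cp037")

print("strict GBK decode rejected the input:", rejected)
print("single quote in cp037:", ebcdic_quote.hex())
```

What a converter does after detecting the error (drop, replace, or abort) varies by implementation, so the result after escaping depends on where in the pipeline the conversion happens.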
-Chris Weber
From: NeZa [mailto:neza0x at gmail.com]
Sent: Tuesday, March 23, 2010 2:09 PM
To: websecurity at webappsec.org
Subject: [WEB SECURITY] MultiByte Attacks
Recently during a pentest I exploited a SQL injection vulnerability by taking advantage of multibyte encoding. Specifically, I bypassed the addslashes() function from PHP by injecting GBK charset values. It is a fairly old vulnerability, explained here: http://shiflett.org/blog/2006/jan/addslashes-versus-mysql-real-escape-string
As you know, by playing with the database charset we can bypass multiple kinds of filters for databases (SQL injection) and web servers (XSS).
My question is: where can we get the multibyte codes for each charset, especially those that support multibyte encoding (GBK, AL16UTF16, UTF-8)?
That way, once we have identified the DB charset, we can get the matching multibyte values and start trying to bypass the filters.
Let me know your thoughts.
--
NeZa
Hacker Wanna Be from Nezahualcoyotl