[WEB SECURITY] Twitter XSS worms

Arian J. Evans arian.evans at anachronic.com
Tue Apr 14 19:42:47 EDT 2009


I'll make this response a 2-in-1. My initial response was to agree with
Chris and expand context for Mitre. Not "dis" encoding. :)

I have been a huge proponent of output encoding for a long time and was
recommending mitigating XSS attacks using unicode encoding in JS space
before it was cool and before I even understood how the interpreters really
interpreted it. We are on the same page on encoding.
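
For readers who have not seen it, here is a minimal sketch of what "unicode encoding in JS space" can look like: every character that is not an ASCII letter or digit is written as a \uXXXX escape before it is placed inside a JavaScript string literal. The function name escape_for_js_string is made up for illustration and is not taken from ESAPI or any other library named in this thread; it also assumes the input is well-formed text.

```python
def escape_for_js_string(untrusted: str) -> str:
    """Escape everything except ASCII letters and digits as \\uXXXX."""
    pieces = []
    for ch in untrusted:
        if ch.isascii() and ch.isalnum():
            pieces.append(ch)
            continue
        # Encode as UTF-16 so characters outside the BMP become surrogate
        # pairs, which is how JavaScript \uXXXX escapes represent them.
        units = ch.encode("utf-16-be")
        for i in range(0, len(units), 2):
            pieces.append("\\u%04x" % int.from_bytes(units[i:i + 2], "big"))
    return "".join(pieces)


payload = "';alert(1)//"
print(escape_for_js_string(payload))
# \u0027\u003balert\u00281\u0029\u002f\u002f
```

The quote, semicolon, parentheses and slashes all come out as escapes, so the value stays inside the string literal it was written into. This only covers the JS-string context; other contexts need their own encoders.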


On Tue, Apr 14, 2009 at 2:17 PM, Jim Manico <jim at manico.net> wrote:

>  > It solves most XSS in plain-vanilla use-cases.
>
> That is a terribly unfair comment. Encoding solves XSS in most uses,
> period. NOT in plain-vanilla use cases only.
>

Fair enough. I used rhetoric loosely. How about "not most web 2.0 social
network use-cases I see"?

We are debating definition at this point more than anything so let's call it
a day.

I am not debating "what is right" here. I am simply saying "what I see on
thousands of websites" and what appears to be their business-case implied
from the practices they follow.


> Just because you encode, doesn't mean you should skip normalization.
> Normalization is a part of validation at the input boundary, not encoding
> the output boundary. The unicode issues you describe below are all easily
> solved through proper character set normalization.
>

Again, debating definition. "Proper".

Unicode Consortium's "proper" leads to XSS, and even SQLi in at least one
web application I've seen. And I hear from folks in Europe that the problem
is growing.
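
A minimal sketch of one way a standards-conformant normalization step can reintroduce markup. This is an illustration, not a reconstruction of the application mentioned above: it assumes a hypothetical input check (naive_filter) that runs before the data is normalized with NFKC, which folds fullwidth punctuation into its ASCII equivalents.

```python
import unicodedata


def naive_filter(text: str) -> bool:
    """Hypothetical input check: accept only if no ASCII angle brackets."""
    return "<" not in text and ">" not in text


# U+FF1C and U+FF1E are the fullwidth forms of < and >.
submitted = "\uff1cscript\uff1ealert(1)\uff1c/script\uff1e"

print(naive_filter(submitted))  # True: the check sees no ASCII brackets

# A later "normalize for consistency" step, e.g. at the database layer.
stored = unicodedata.normalize("NFKC", submitted)
print(stored)  # <script>alert(1)</script>
```

The order of operations is the whole problem here: the check and the normalization each do what they are specified to do, and the result is still live markup unless the output is encoded afterwards.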


Facts are:

1) I have given up trying to get people to "do it right". I'll let you
smarter, passionate guys fight that battle. :)


2) Some folks simply refuse to whitelist or use libraries like Anti-Samy,
and insist on blacklists. (Not WAFs -- I mean their own home-grown
*blacklists* -- I wanted to keep this discussion away from "products"). I
see more blacklists in use IRL than whitelists, or at least, that's what I
infer from my test data.

Blacklists are dangerous. We all know that. As a simple example we regularly
find Mozilla/Firefox interpreter bugs where we can escape out of a given
context in a manner that should not work, and create a new context to
exploit. Despite this, folks still use blacklists and try to tune them
accordingly.

There is at least one social network I have found that blacklists some of my
favorite made-up HTML tags I use for testing! Yikes.


3) You are a highly competent purist developer. If more developers were like
you, why, then we wouldn't have this list or these witty discussions.
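
To make the blacklist point in 2) concrete, here is a small sketch of a classic bypass. The filter below (naive_blacklist) is a hypothetical home-grown one that strips script tags in a single pass; it is not taken from any real site or product.

```python
import html
import re


def naive_blacklist(text: str) -> str:
    """Hypothetical home-grown filter: strip <script> tags in one pass."""
    return re.sub(r"</?script>", "", text, flags=re.IGNORECASE)


# Nest the blacklisted token inside itself; removing the inner copy
# reassembles the outer one.
payload = "<scr<script>ipt>alert(1)</scr</script>ipt>"

filtered = naive_blacklist(payload)
print(filtered)               # <script>alert(1)</script>

# Encoding for the HTML context neutralizes it regardless of the filter.
print(html.escape(filtered))  # &lt;script&gt;alert(1)&lt;/script&gt;
```

Looping the filter until the string stops changing fixes this one case, but that is exactly the "tune the blacklist" treadmill described above.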


At the end of the day my goal was to point out to Mitre that these social
widgets like Twitter are moving in a direction that is fundamentally
different from how we treat binary syntax exploits in unmanaged code. And
that this is driven by explicit use-case in many situations. Not a simple
"failure to validate length of buffer" or "unsigned integer" etc.

So now that I've done my monthly "stick my hand into the webappsec tarpit"
I'm off and I'll see you all in a month or two when another nice sticky
subject comes up. Please feel free to clean up any messes while I am gone,

-- 
Arian Evans
*
When the Cambrian measures were forming, They promised perpetual peace.
They swore, if we gave them our weapons, that the wars of the tribes would
cease.
But when we disarmed They sold us and delivered us bound to our foe,
And the Gods of the Copybook Headings said: *"Stick to the Devil you know."*
*
Gods of the Copybook Headings, Kipling, 1919


On Tue, Apr 14, 2009 at 1:30 PM, Jim Manico <jim at manico.net> wrote:

> Regardless,
>
> If you are displaying user data, you need to encode.
>
> One exception, if you accept HTML from users, you need to do strong
> policy-based validation with an Anti-Samy like library, and forgo encoding
> (or validate again before display).
>
>  And websites that have a business-case for  allowing uploaded user HTML
>> often have to support the use of broken/ malformed HTML, just like browsers
>> do in general, which could  unfortunately impact the ability to deploy
>> AntiSamy.
>>
>
> Sure, if you want to accept "malformed html" from a user, and then
> redisplay that same HTML, then you will need to use a validation scheme
> other than AntiSamy. AntiSamy either cleans up your html for you or rejects
> it. But really, if you have a strong business case for accepting malformed
> HTML from a user that you need to display to other users.... you might want
> to re-think that business case - or maybe push that logic to a  different
> domain.
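
A rough sketch of the "clean up or reject" idea behind whitelist validation, using only the Python standard library. This is not AntiSamy and does not use its policy format; the ALLOWED_TAGS set and the sanitize function are assumptions made up for illustration. Allowed tags are re-emitted without attributes, all other tags are dropped, and text is entity-encoded.

```python
from html import escape
from html.parser import HTMLParser

# Hypothetical policy: tags allowed through, always without attributes.
ALLOWED_TAGS = {"b", "i", "em", "strong", "p"}


class AllowlistSanitizer(HTMLParser):
    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.pieces = []

    def handle_starttag(self, tag, attrs):
        if tag in ALLOWED_TAGS:
            self.pieces.append("<%s>" % tag)  # attributes are discarded

    def handle_endtag(self, tag):
        if tag in ALLOWED_TAGS:
            self.pieces.append("</%s>" % tag)

    def handle_data(self, data):
        self.pieces.append(escape(data))  # text is always encoded


def sanitize(markup: str) -> str:
    parser = AllowlistSanitizer()
    parser.feed(markup)
    parser.close()
    return "".join(parser.pieces)


print(sanitize('<b onclick="evil()">hi</b><script>alert(1)</script>'))
```

The event handler is dropped and the script tags are removed, leaving their body as inert text. Note that a cleaner like this rebuilds the markup rather than preserving it, which is the tension with the "must support malformed HTML" business case discussed above.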
>
> I admit that if a software team is not currently doing encoding, to move in
> that direction requires a new library (ESAPI) or new code - something that
> always takes time to integrate. That's why I think of encoding as a long
> term strategic fix. But I tell you if I'm bleeding today, I'd slap up a WAF
> faster than you can say "XSS sucks". In fact, even if I'm not bleeding
> today, I'd consider WAF-like features like you WHS boys offer.
>
> Please do not "dis" encoding as a smart technique for developers (and
> please keep pushing the edges of where encoding is more difficult or
> impossible). And I will not dis your offerings. =D
>
> Aloha Gents,
> Jim
>
> ----- Original Message ----- From: "Jeremiah Grossman" <
> jeremiah at whitehatsec.com>
> To: "Jim Manico" <jim at manico.net>
> Cc: <websecurity at webappsec.org>
> Sent: Monday, April 13, 2009 6:55 PM
>
> Subject: Re: [WEB SECURITY] Twitter XSS worms
>
>
>  Hey Jim,
>>
>> "does not always work" refers to the fact that we are continually at  the
>> mercy of whatever crazy functionality the browser vendors choose  to support
>> when it comes to XSS. Whatever functionality a specific  HTML tag/attribute
>> combo might enable today, could be different (and  insecure) tomorrow. And
>> websites that have a business-case for  allowing uploaded user HTML often
>> have to support the use of broken/ malformed HTML, just like browsers do in
>> general, which could  unfortunately impact the ability to deploy AntiSamy.
>>
>> Then of course there are Web 2.0 sites that support third-party  created
>> HTML/JavaScript widgets (Facebook, MySpace, Google, etc), with  no browser
>> security framework to speak of able to control their behavior.
>>
>> Regards,
>>
>> Jeremiah Grossman
>> Chief Technology Officer
>> WhiteHat Security, Inc.
>> http://www.whitehatsec.com/
>>
>>
>>
>>
>> On Apr 13, 2009, at 8:48 PM, Jim Manico wrote:
>>
>>> > Output Encoding is an "average" practice and does not always work
>>> > or solve for modern XSS weaknesses that result from "web 2.0" use-cases.
>>>
>>> > I think you will find that promoting "output encoding" as a "best
>>> > practice" in "Web 2.0" is a challenge... it breaks more and more business
>>> > cases I see.
>>>
>>> Encoding only breaks stuff when you do it wrong. Sure, silly stuff  like
>>> "HTML Entity Encode everything" is an "average" but "flat out  wrong"
>>> practice. The best practice is to encode all user output  within the proper
>>> HTML context. Please take a look at
>>> https://www.owasp.org/index.php?title=XSS_(Cross_Site_Scripting)_Prevention_Cheat_Sheet
>>> for a complete discourse of defensive encoding.
>>>
>>> If you need to accept rich HTML content as user input, use a library like
>>> AntiSamy to provide whitelist HTML validation based on a  specific policy.
>>> http://www.owasp.org/index.php/Category:OWASP_AntiSamy_Project
>>>
>>> As a very active web-2.0 heavy-ajax web application developer, I do  not
>>> understand the comment that encoding (or anti-samy) does not  work in the
>>> web 2.0 world.
>>>
>>> Blacklisting will never work as a long term robust solution to XSS.
>>> Attackers can, will and have bypassed all kinds of blacklist  filtering
>>> technology. Encoding when done correctly stops XSS dead -  but it's not easy
>>> and requires programmers to do things differently.
>>>
>>> What I think is best is doing both - blacklist/waf-like tech for tactical
>>> purposes - alongside deeper contextual encoding within the codebase for
>>> strategic purposes.
>>>
>>> - Jim
>>>
>>>
>>>
>>> ----- Original Message -----
>>> From: Arian J. Evans
>>> To: Steven M. Christey
>>> Cc: Hoffman, Billy ; Chris Eng ; robert at webappsec.org ;
>>> websecurity at webappsec.org
>>> Sent: Monday, April 13, 2009 2:46 PM
>>> Subject: Re: [WEB SECURITY] Twitter XSS worms
>>>
>>> Steve,
>>>
>>> 1. No novelty here. Ajax attack == non-novel.
>>>
>>> "ajax-as-an-attack-vector-novelty" is a separate question from "what  is
>>> the key XSS fundamental weakness?" Or are there more than one?
>>>
>>>
>>> 2. What Chris said. :) This twitter example is textbook <XSS>.
>>>
>>> Output Encoding is an "average" practice and does not always work or
>>> solve for modern XSS weaknesses that result from "web 2.0" use-cases.
>>>
>>> As for "best practice" CWE mapping though -- wait! There's more. :)
>>>
>>> "Best" and "Average" programming practices for modern webapps need a tree
>>> of options depending on the data, the use-case, and the various  data
>>> transformations. Web app issues (at least syntax attacks like  XSS) are not
>>> as black and white as buffer protections are.  "Blacklist" and "Blacklist
>>> Escaping" seem to be becoming more and more common as practices.
>>>
>>>
>>>
>>> Why?
>>>
>>> Fundamentally XSS is a data/function boundary problem. It seems identical
>>> to SQL Injection and Buffer/heap overflows. Except we have  no "stack
>>> canaries" or "parameterized values" in the web world for  most intents. (I
>>> am ignoring checksumming of values that shouldn't  be user-tainted like .NET
>>> ViewState hashes).
>>>
>>> Robust output encoding normalizes metacharacters used to escape data/
>>> function boundaries to escaping safe for the target interpreter (browser).
>>>
>>>
>>> I think you will find that promoting "output encoding" as a "best
>>> practice" in "Web 2.0" is a challenge... it breaks more and more business
>>> cases I see.
>>>
>>> Web "2.0" applications today handle user-tainted data in ways  *unique*
>>> to the web world vs. unmanaged code because of:
>>>
>>>
>>> 2.1. Limitations of modern implementation level languages. Limited
>>> escaping and encoding libraries in most languages; no  parameterization or
>>> "safe sandboxing" in web code.
>>>
>>> 2.2. Limitations of interpreters to "sandbox" data and memory  management
>>> -nx ability with the user agents (browser also has no separate data/control
>>> channel at the protocol level, let alone sandboxing of functions at the
>>> document level)
>>>
>>> 2.3. Unique goals of extensibility in web code -- webapp "2.0" businesses
>>> *want* users to *extend* the code. Awesomeness! :)
>>>
>>>
>>>
>>> Web 2.0 apps, by business-goal and design, take user-tainted  "function"
>>> as "data" and try to use this data as "limited but  extensible function"...
>>>
>>> This is the reverse kind of problem we normally see in unmanaged  code,
>>> and complicates simple solutions.
>>>
>>> You will see this all over social networks.
>>>
>>> They go down the slippery slope of allowing "user-tainted function" which
>>> completely invalidates the "best practice" game.
>>>
>>> Once you add "limited" markup, Javascript, and RIA widgets to the  user's
>>> control...at this point mitigation becomes tactical at best,  and often
>>> implementation-specific.
>>>
>>> Output Encoding only solves the "weakness" of allowing user-tainted  data
>>> to cross the data-function boundary.
>>>
>>> Once you jump that boundary in "Web 2.0 land"...What "Best Practice"
>>> solves?
>>>
>>> - Input Validation (type, length)?
>>>
>>> - Whitelist "allow safe functions"?
>>>
>>> - Blacklist "known dangerous functions"?
>>>
>>> (these two approaches are often combined in Drupal/PostNuke style  CMS
>>> systems, with radio-button and menu-option combo filters)
>>>
>>> - Escape the dangerous metacharacters?
>>>
>>> And to heap on the pile -- not all data types are used equally.
>>>
>>> Example string: userIdDisplayArea=<SAFETAG:attribute>
>>>
>>> 1. In the URI, Hex-escaped for protocol type-safety:
>>>
>>> %3CSAFETAG%3Aattribute%3E
>>>
>>> 2. The app server also un-escapes Hex-URI; you could have a
>>> double-decode canonicalization issue, but is it a weakness?:
>>>
>>> %25%33%43%53%41%46%45%54%41%47%25%33%41%61%74%74%72%69%62%75%74%65%25%33%45
>>>
>>> 3. Base64 in the personalization cookie:
>>>
>>> PFNBRkVUQUc6YXR0cmlidXRlPg==
>>>
>>> 4. Unicode escaped string literals in javascript space:
>>>
>>> \x27\u0027\r\u0085 etc.
>>>
>>> 5. And then various unicode and transcoded interpretations passed  around
>>> internal to the application. You see the output on  international software
>>> that normalizes things like usernames at the  database for visual
>>> consistency.
>>>
>>> These different encodings and representations make it challenging to
>>> specify a universally safe type of "output" in a webapp. "what  output
>>> where?"
>>>
>>> Which goes back to a "best practice" tree to select use-case, and
>>> data-type, and how you want it to behave in the document. :)
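
The representations in items 1 to 4 above can be reproduced with the Python standard library. This is only a sketch to show that they all carry the same underlying string; the variable names are made up for illustration, and the every-byte double encoding is written by hand to match the form shown in item 2.

```python
import base64
from urllib.parse import quote, unquote

raw = "<SAFETAG:attribute>"

# 1. Hex-escaped for the URI.
uri_escaped = quote(raw, safe="")
print(uri_escaped)  # %3CSAFETAG%3Aattribute%3E

# 2. The same value with every byte hex-escaped a second time.
double_encoded = "".join("%%%02X" % b for b in uri_escaped.encode("ascii"))
print(double_encoded)

# One decode yields the still-escaped form; a second decode yields markup.
print(unquote(double_encoded))           # %3CSAFETAG%3Aattribute%3E
print(unquote(unquote(double_encoded)))  # <SAFETAG:attribute>

# 3. Base64, as in the personalization cookie.
cookie_value = base64.b64encode(raw.encode("utf-8")).decode("ascii")
print(cookie_value)  # PFNBRkVUQUc6YXR0cmlidXRlPg==

# 4. Two escape spellings of the same single-quote character.
print("\x27" == "\u0027" == "'")  # True
```

A filter that inspects only one of these layers sees harmless text in the others, which is why "what output where?" has to be answered per context.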
>>>
>>>
>>> As an aside -- I've found a good way to measure approaches of "web 2.0"
>>> sites is to type in something like (alert) and 'alert' and \x27alert\x27,
>>> and throw in some international unicode transcodings,  and see how they
>>> handle the input and output. If they blacklist  those, it gives you an idea
>>> what they are doing, and NOT doing.
>>>
>>> But that's another story for another worm.
>>>
>>> --
>>> Arian Evans
>>>
>>>
>>>
>>> On Mon, Apr 13, 2009 at 12:59 PM, Steven M. Christey <
>>> coley at linus.mitre.org
>>> > wrote:
>>>
>>> For those who speak fluent XSS, how obscure was the attack vector  and
>>> the
>>> attack technique? Actually, what I'm really wondering is, would "best
>>> practices" or even "average practices" have prevented this attack from
>>> succeeding?  either for the XSS or the CSRF angles.  Is
>>> Ajax-as-an-XSS-attack-vector still novel?
>>>
>>> - Steve
>>>
>>>
>>> ----------------------------------------------------------------------------
>>> Join us on IRC: irc.freenode.net #webappsec
>>>
>>> Have a question? Search The Web Security Mailing List Archives:
>>> http://www.webappsec.org/lists/websecurity/archive/
>>>
>>> Subscribe via RSS:
>>> http://www.webappsec.org/rss/websecurity.rss [RSS Feed]
>>>
>>> Join WASC on LinkedIn
>>> http://www.linkedin.com/e/gis/83336/4B20E4374DBA
>>>
>>>
>>>
>>>
>>
>>
>>
>
>
>
>