[WEB SECURITY] Twitter XSS worms
jim at manico.net
Tue Apr 14 16:30:31 EDT 2009
If you are displaying user data, you need to encode.
One exception: if you accept HTML from users, you need to do strong
policy-based validation with an AntiSamy-like library, and forgo encoding
(or validate again before display).
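In Python terms, the encode-on-display rule looks roughly like this (html.escape here is just a stand-in for whatever encoder your platform provides):

```python
import html

def render_comment(user_input: str) -> str:
    # Encode user data at the point of display, so any markup in the
    # input is rendered as inert text instead of being executed.
    return "<p>" + html.escape(user_input, quote=True) + "</p>"

payload = "<script>alert(1)</script>"
print(render_comment(payload))
# The angle brackets come out as &lt; and &gt;, so the script never runs.
```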
> And websites that have a business-case for allowing uploaded user HTML
> often have to support the use of broken/ malformed HTML, just like
> browsers do in general, which could unfortunately impact the ability to
> deploy AntiSamy.
Sure, if you want to accept "malformed html" from a user, and then redisplay
that same HTML, then you will need to use a validation scheme other than
AntiSamy. AntiSamy either cleans up your html for you or rejects it. But
really, if you have a strong business case for accepting malformed HTML from
a user that you need to display to other users.... you might want to
re-think that business case - or maybe push that logic to a different layer.
I admit that if a software team is not currently doing encoding, to move in
that direction requires a new library (ESAPI) or new code - something that
always takes time to integrate. That's why I think of encoding as a long
term strategic fix. But I tell you if I'm bleeding today, I'd slap up a WAF
faster than you can say "XSS sucks". In fact, even if I'm not bleeding
today, I'd consider WAF-like features like those you WHS boys offer.
Please do not "dis" encoding as a smart technique for developers (and please
keep pushing the edges of where encoding is more difficult or impossible).
And I will not dis your offerings. =D
----- Original Message -----
From: "Jeremiah Grossman" <jeremiah at whitehatsec.com>
To: "Jim Manico" <jim at manico.net>
Cc: <websecurity at webappsec.org>
Sent: Monday, April 13, 2009 6:55 PM
Subject: Re: [WEB SECURITY] Twitter XSS worms
> Hey Jim,
> "does not always work" refers to the fact that we are continually at the
> mercy of whatever crazy functionality the browser vendors choose to
> support when it comes to XSS. Whatever functionality a specific HTML
> tag/attribute combo might enable today, could be different (and insecure)
> tomorrow. And websites that have a business-case for allowing uploaded
> user HTML often have to support the use of broken/ malformed HTML, just
> like browsers do in general, which could unfortunately impact the ability
> to deploy AntiSamy.
> Then of course there are Web 2.0 sites that support third-party created
> code, with no security framework to speak of able to control its behavior.
> Jeremiah Grossman
> Chief Technology Officer
> WhiteHat Security, Inc.
> On Apr 13, 2009, at 8:48 PM, Jim Manico wrote:
>> > Output Encoding is an "average" practice and does not always work
>> or solve for modern XSS weaknesses that result from "web 2.0" use-cases.
>> > I think you will find that promoting "output encoding" as a "best
>> practice" in "Web 2.0" is a challenge... it breaks more and more
>> business cases I see.
>> Encoding only breaks stuff when you do it wrong. Sure, silly stuff like
>> "HTML Entity Encode everything" is an "average" but "flat out wrong"
>> practice. The best practice is to encode all user output within the
>> proper HTML context. Please take a look at the OWASP XSS prevention
>> material for a complete discourse on defensive encoding.
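A rough illustration of "encode within the proper HTML context", using Python stdlib encoders as stand-ins (ESAPI and similar libraries ship per-context encoders with the same intent; the sample payload is illustrative):

```python
import html
import json
from urllib.parse import quote

user = '" onmouseover="alert(1)'

# HTML body context: entity-encode markup metacharacters
# (html.escape encodes quotes too by default).
body = "<span>%s</span>" % html.escape(user)

# HTML attribute context: quote=True guarantees the quote
# characters are encoded, so the value cannot close the attribute.
attr = '<input value="%s">' % html.escape(user, quote=True)

# URL parameter context: percent-encode everything non-alphanumeric.
url = "/profile?name=%s" % quote(user, safe="")

# JavaScript string context: JSON-encode so quotes cannot break out
# (a fuller encoder would also escape "/" to block </script> breakouts).
script = "<script>var name = %s;</script>" % json.dumps(user)
```

Each context needs its own encoder; applying the HTML-body encoder everywhere is exactly the "flat out wrong" practice mentioned above.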
>> If you need to accept rich HTML content as user input, use a library
>> like AntiSamy to provide whitelist HTML validation based on a specific
>> policy. http://www.owasp.org/index.php/Category:OWASP_AntiSamy_Project
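Not the AntiSamy API itself, but a toy sketch of the allowlist idea it implements: parse the HTML and keep only the tags and attributes an explicit policy permits (the tag/attribute set below is an illustrative policy, not AntiSamy's defaults):

```python
import html
from html.parser import HTMLParser

# Illustrative policy: tag -> attributes permitted on that tag.
ALLOWED = {"b": set(), "i": set(), "p": set(), "a": {"href"}}

class AllowlistSanitizer(HTMLParser):
    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []

    def handle_starttag(self, tag, attrs):
        if tag not in ALLOWED:
            return  # drop tags outside the policy, e.g. <script>
        kept = [(k, v) for k, v in attrs
                if k in ALLOWED[tag]
                and not (v or "").lower().startswith("javascript:")]
        attr_text = "".join(' %s="%s"' % (k, html.escape(v or "", quote=True))
                            for k, v in kept)
        self.out.append("<%s%s>" % (tag, attr_text))

    def handle_endtag(self, tag):
        if tag in ALLOWED:
            self.out.append("</%s>" % tag)

    def handle_data(self, data):
        self.out.append(html.escape(data))  # text nodes are always encoded

def sanitize(src: str) -> str:
    p = AllowlistSanitizer()
    p.feed(src)
    p.close()
    return "".join(p.out)
```

A real policy engine like AntiSamy does far more (nesting rules, CSS validation, malformed-markup handling), but the shape is the same: everything not explicitly allowed is removed or encoded.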
>> As a very active web-2.0 heavy-ajax web application developer, I do not
>> understand the comment that encoding (or anti-samy) does not work in the
>> web 2.0 world.
>> Blacklisting will never work as a long term robust solution to XSS.
>> Attackers can, will and have bypassed all kinds of blacklist filtering
>> technology. Encoding when done correctly stops XSS dead - but it's not
>> easy and requires programmers to do things differently.
>> What I think is best is doing both - blacklist/waf-like tech for
>> tactical purposes - alongside deeper contextual encoding within the
>> codebase for strategic purposes.
>> - Jim
>> ----- Original Message -----
>> From: Arian J. Evans
>> To: Steven M. Christey
>> Cc: Hoffman, Billy ; Chris Eng ; robert at webappsec.org ;
>> websecurity at webappsec.org
>> Sent: Monday, April 13, 2009 2:46 PM
>> Subject: Re: [WEB SECURITY] Twitter XSS worms
>> 1. No novelty here. Ajax attack == non-novel.
>> "ajax-as-an-attack-vector-novelty" is a separate question from "what is
>> the key fundamental XSS weakness, or is there more than one?"
>> 2. What Chris said. :) This twitter example is textbook <XSS>.
>> Output Encoding is an "average" practice and does not always work or
>> solve for modern XSS weaknesses that result from "web 2.0" use-cases.
>> As for "best practice" CWE mapping though -- wait! There's more. :)
>> "Best" and "Average" programming practices for modern webapps need tree
>> of options depending on the data, the use-case, and the various data
>> transformations. Web app issues (at least syntax attacks like XSS) are
>> not as black and white as buffer protections are. "Blacklist" and
>> "Blacklist Escaping" seem to be becoming more and more common practices.
>> Fundamentally XSS is a data/function boundary problem. It seems
>> identical to SQL Injection and Buffer/heap overflows. Except we have no
>> "stack canaries" or "parameterized values" in the web world for most
>> intents. (I am ignoring checksumming of values that shouldn't be
>> user-tainted like .NET ViewState hashes).
>> Robust output encoding normalizes the metacharacters used to escape
>> data/function boundaries into a form that is safe for the target
>> interpreter.
>> I think you will find that promoting "output encoding" as a "best
>> practice" in "Web 2.0" is a challenge... it breaks more and more
>> business cases I see.
>> Web "2.0" applications today handle user-tainted data in ways *unique*
>> to the web world vs. unmanaged code because of:
>> 2.1. Limitations of modern implementation level languages. Limited
>> escaping and encoding libraries in most languages; no parameterization
>> or "safe sandboxing" in web code.
>> 2.2. Limitations of interpreters to "sandbox" data, and limited memory-
>> management ability within the user agents (the browser also has no
>> separate data/control channel at the protocol level, let alone
>> sandboxing of functions at the document level)
>> 2.3. Unique goals of extensibility in web code -- webapp "2.0"
>> businesses *want* users to *extend* the code. Awesomeness! :)
>> Web 2.0 apps, by business-goal and design, take user-tainted "function"
>> as "data" and try to use this data as "limited but extensible" function.
>> This is the reverse kind of problem we normally see in unmanaged code,
>> and complicates simple solutions.
>> You will see this all over social networks.
>> They go down the slippery slope of allowing "user-tainted function",
>> which completely invalidates the "best practice" game. User-supplied
>> function is hard to control; at this point mitigation becomes tactical
>> at best, and often ineffective.
>> Output Encoding only solves the "weakness" of allowing user-tainted data
>> to cross the data-function boundary.
>> Once you jump that boundary in "Web 2.0 land"... what "Best Practice"
>> applies?
>> - Input Validation (type, length)?
>> - Whitelist "allow safe functions"?
>> - Blacklist "known dangerous functions"?
>> (these two approaches are often combined in Drupal/PostNuke style CMS
>> systems, with radio-button and menu-option combo filters)
>> - Escape the dangerous metacharacters?
>> And to heap on the pile -- not all data types are used equally.
>> Example string: userIdDisplayArea=<SAFETAG:attribute>
>> 1. In the URI, hex-escaped for protocol type-safety.
>> 2. The app server also un-escapes hex-URI; you could have a double-
>> decode canonicalization issue (e.g. %25 %33), but is it a weakness?
>> 3. Base64 in the personalization cookie.
>> 4. Script-level string escapes: \x27 \u0027 \r \u0085 etc.
>> 5. And then various Unicode and transcoded interpretations passed around
>> internal to the application. You see the output on international
>> software that normalizes things like usernames at the database for
>> visual consistency.
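The double-decode hazard in step 2 is easy to demonstrate: a payload percent-encoded twice looks harmless after one decode, and turns into live markup after a second, unexpected decode:

```python
from urllib.parse import unquote

doubly_encoded = "%253Cscript%253E"  # "%3Cscript%3E" encoded again (%25 = "%")

once = unquote(doubly_encoded)   # "%3Cscript%3E" - still inert text
twice = unquote(once)            # "<script>"     - now live markup

# If validation runs after the first decode but some downstream layer
# decodes again, the filter never sees the dangerous form.
print(once, twice)
```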
>> These different encodings and representations make it challenging to
>> specify a universally safe type of "output" in a webapp; "what output
>> encoding?" has no single answer.
>> Which goes back to a "best practice" tree to select by use-case, and
>> data-type, and how you want it to behave in the document. :)
>> As an aside -- I've found a good way to measure the approaches of "web
>> 2.0" sites is to type in something like (alert) and 'alert' and
>> \x27alert\x27, and throw in some international unicode transcodings, and
>> see how they handle the input and output. If they blacklist those, it
>> gives you an idea what they are doing, and NOT doing.
>> But that's another story for another worm.
>> Arian Evans
>> On Mon, Apr 13, 2009 at 12:59 PM, Steven M. Christey
>> <coley at linus.mitre.org
>> > wrote:
>> For those who speak fluent XSS, how obscure was the attack vector and
>> attack technique? Actually, what I'm really wondering is, would "best
>> practices" or even "average practices" have prevented this attack from
>> succeeding, either for the XSS or the CSRF angles? Is
>> Ajax-as-an-XSS-attack-vector still novel?
>> - Steve
>> Join us on IRC: irc.freenode.net #webappsec
>> Have a question? Search The Web Security Mailing List Archives:
>> Subscribe via RSS:
>> http://www.webappsec.org/rss/websecurity.rss [RSS Feed]
>> Join WASC on LinkedIn