websecurity@lists.webappsec.org

The Web Security Mailing List

Artificial Intelligence vs. Human Intelligence on finite amounts of possible outcomes

Tasos Laskos
Tue, Feb 1, 2011 6:49 PM

Hi guys,

This isn't a paper or benchmark or some implemented feature; it's
something that just hit me and I'd appreciate some input.

It came to me while auditing a very slow server which produced a lot of
false positives on blind attacks that used time delays.
The server at some point just died and all modules that used that attack
type thought that their payloads had been executed successfully due to
timeouts.

However, don't focus on this particular situation (I already know the
solution); it was merely the trigger that prompted my
question/suggestion/RFC, which I think will make for an interesting
conversation.

Lots of people on this list would like to see our tools implement some
sort of AI (I know for a fact that at least Michal does) to make
educated guesses/decisions about a scan's results
and adjust the report accordingly.
Training an expert system would take a lot of effort and time though,
and until convergence has been reached false results will be reported as
legitimate (not counting SaaS solutions).

I'd like to note at this point that I'm a strong proponent of fixing the
root of the problem instead of adding filtering layers on top of it but
let's ignore this for argument's sake as well.

Premises:

  • We have a finite number of entities that assert the existence of
    issues -- we'll call these "modules".
  • We have a finite number of outcomes for each of the modules; usually
    a binary result (either true/vulnerable or false/safe)
    but in some cases with a twist about the certainty of a result
    (i.e. a notice that an issue may require manual verification).

And here comes my point:
Do we really need AI? Wouldn't simple rules that check for unusual
results and give appropriate notice suffice and be a better and more
efficient way to handle this?

A possible implementation I have in mind is to pre-tag each module when
it's added to the system.
The tags would specify key elements of a module's behavior and would
later be used in the decision-making process (based on rules).

For instance, in the example I mentioned at the beginning of this
e-mail, the system would check how many of the results carry the
"timing_attack" tag and, if that number was above a preset threshold,
remove those results from the scan report or flag them accordingly.
It could also take environment statistics (like average response times)
into account to make a more well-rounded decision.
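
Roughly, such a rule could look like the sketch below (Python, purely
illustrative; the result structure, the tag name and the threshold are
assumptions of mine, not an actual Arachni interface):

TIMING_ATTACK_THRESHOLD = 10  # assumed cut-off for "too many" timing findings

def review_timing_results(results, avg_response_time, request_timeout):
    """Flag timing-attack findings for manual verification when too many fire at once."""
    timing = [r for r in results if "timing_attack" in r["tags"]]
    if len(timing) < TIMING_ATTACK_THRESHOLD:
        return results  # nothing unusual, keep the report as-is

    # Many time-delay payloads "succeeded"; if the server was also crawling
    # overall, the delays probably came from stress rather than the payloads.
    if avg_response_time > 0.8 * request_timeout:
        for r in timing:
            r["requires_manual_verification"] = True
            r["remark"] = ("The server was extremely slow during the scan, so "
                           "the observed delay may not have been caused by the "
                           "payload.")
    return results

A real rule would want smarter statistics than a hard-coded threshold,
but that's the level of complexity I have in mind.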

What do you guys think?

Cheers,
Tasos L.

PS. I guess this could be perceived as a pre-trained expert system, but
not really.

Andres Riancho
Tue, Feb 1, 2011 7:20 PM

Tasos,

On Tue, Feb 1, 2011 at 3:49 PM, Tasos Laskos <tasos.laskos@gmail.com> wrote:

Hi guys,

This isn't a paper or benchmark or some implemented feature, this is
something that just hit me and I'd appreciate some input.

It came to me while auditing a very slow server which produced a lot of
false positives on blind attacks that used time delays.
The server at some point just died and all modules that used that attack
type thought that their payloads had been executed successfully due to
timeouts.

Been there :)

However, don't focus on this particular situation (I already know the
solution), this was merely the trigger that prompted my
question/suggestion/RFC
which I think will make for an interesting conversation.

Ok,

Lots of people in this list would like to see our tools implement some sort
of AI (I know for a fact that at least Michal does) to make educated
guesses/decisions about a scan's results
and adjust the report accordingly.
Training an expert system would take a lot of effort/time though and until
convergence has been reached the false results will be reported as
legitimate (not counting SaaS solutions).

Agreed.

I'd like to note at this point that I'm a strong proponent of fixing the
root of the problem instead of adding filtering layers on top of it but
let's ignore this for argument's sake as well.

Premises:
 * We have a finite number of entities that assert the existence of issues
-- we'll call these "modules".
 * We have a finite number of outcomes for each of the modules; usually a
binary result (either true/vulnerable or false/safe)
   but in some cases with a twist about the certainty of a result (i.e. a
notice that an issue may require manual verification).

And here comes my point:
   Do we really need AI? Wouldn't simple rules that check for unusual
results and give appropriate notice suffice and be a better and more
efficient way to handle this?

In some cases you need AI, or lots of signatures (either works).
For example, if you're trying to find SQL injections based on error
messages, a module that has 10 signatures is worse than one that has
100. But I'm sure that even the module with 100 signatures doesn't cover
all possible DBMS errors. On the other hand, a web application
penetration tester who sees a rendered HTML response can identify a
SQL error even if it's not something he has seen in the past (it's not
in the expert's signature DB). That's where AI might be handy.
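
Just to make that limitation concrete, signature-based detection boils
down to something like this (a deliberately tiny and incomplete sample
of patterns -- which is exactly the problem):

import re

# Deliberately small signature set -- real modules carry far more, and still miss some.
SQL_ERROR_SIGNATURES = [
    re.compile(r"You have an error in your SQL syntax", re.I),                # MySQL
    re.compile(r"unclosed quotation mark after the character string", re.I),  # MSSQL
    re.compile(r"ORA-\d{5}", re.I),                                           # Oracle
    re.compile(r"pg_query\(\): Query failed", re.I),                          # PostgreSQL
]

def looks_like_sql_error(response_body):
    """True only if the response matches a known signature; anything novel slips through."""
    return any(sig.search(response_body) for sig in SQL_ERROR_SIGNATURES)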

A possible implementation I have in mind is to pre-tag a module when it's
added to the system.
The tags would specify key elements of the behavior of a module and will
later be used in the decision making process (based on rules).

For instance, in the example I mentioned at the beginning of this e-mail,
the system would check how many of the results have the "timing_attack" tag
and if that number was above a preset threshold it would remove the results
from the scan report or flag them accordingly.
And possibly take into account environment statistics to make a more
well-rounded decision (like average response times etc).

That makes sense... somehow... but I would rather fix the cause of the
timing attack bug.

What do you guys think?

AI for web application scanning has been on my mind since I started
with w3af, but I really haven't found a problem for which I would say:
"The best / fastest / easiest-to-develop way to solve this is AI". Maybe
if we hit our heads hard enough, we can find something where AI is
applied and then state: "w3af/arachni, the only web app scanner with
AI"? :)

Cheers,
Tasos L.

PS. I guess that this could be perceived as pre-trained expert system but
not really.



--
Andrés Riancho
Director of Web Security at Rapid7 LLC
Founder at Bonsai Information Security
Project Leader at w3af

Tasos Laskos
Tue, Feb 1, 2011 7:46 PM

On 01/02/11 19:20, Andres Riancho wrote:

Tasos,
[...]

I'd like to note at this point that I'm a strong proponent of fixing the
root of the problem instead of adding filtering layers on top of it but
let's ignore this for argument's sake as well.

Premises:

  • We have a finite number of entities that assert the existence of issues
    -- we'll call these "modules".
  • We have a finite number of outcomes for each of the modules; usually a
    binary result (either true/vulnerable or false/safe)
    but in some cases with a twist about the certainty of a result (i.e. a
    notice that an issue may require manual verification).

And here comes my point:
Do we really need AI? Wouldn't simple rules that check for unusual
results and give appropriate notice suffice and be a better and more
efficient way to handle this?

In some cases you need AI, or need lots of signatures (either works).
For example, if you're trying to find SQL injections based on error
messages, a module that has 10 signatures is worse than one that has
100. But I'm sure that the module with 100 signatures doesn't cover
all possible DBMS errors. On the other side, a web application
penetration tester that sees a rendered HTML response can identify a
SQL error even if its not something he has seen in the past (its not
in the expert's signature DB). That's where AI might be handy.

That's not exactly what I meant.
My thoughts were mostly about interpreting the results after the scan.

Something akin to adding the following to the report:

Judging by the results of the scan and the request timeouts, the site
seems to have been stressed to its limits.
This shouldn't have happened; it means that you are quite easily
susceptible to a DoS attack.

Or:

The application's cookies are uniformly vulnerable across the web
application.
Consider adding a centralized point of sanitization.

I know that this comes close to taking the user by the hand (which I've
never really liked), but I really think that such a system could work
and save us time while we're performing a pentest, by incorporating an
expert's maverick experience, insights and interpretation into an
otherwise soulless process.

Something far superior to any AI.
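
As a purely hypothetical sketch (the scan/report objects and their
fields below are invented for the example and don't correspond to any
real scanner API), a check like that could be as small as:

def cookie_sanitization_remark(scan, report):
    """If every cookie the scan audited turned out to be vulnerable, say so in the report."""
    audited = set(scan.audited_cookies)                      # cookies the scan exercised
    vulnerable = {issue.vector_name for issue in scan.issues
                  if issue.vector_type == "cookie"}          # cookies with confirmed issues
    if audited and audited <= vulnerable:
        report.add_remark(
            "The application's cookies are uniformly vulnerable. "
            "Consider adding a centralized point of sanitization."
        )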

A possible implementation I have in mind is to pre-tag a module when it's
added to the system.
The tags would specify key elements of the behavior of a module and will
later be used in the decision making process (based on rules).

For instance, in the example I mentioned at the beginning of this e-mail,
the system would check how many of the results have the "timing_attack" tag
and if that number was above a preset threshold it would remove the results
from the scan report or flag them accordingly.
And possibly take into account environment statistics to make a more
well-rounded decision (like average response times etc).

That makes sense... somehow... but I would rather fix the cause of the
timing attack bug.

That's why I said not to focus on that particular scenario: I'm not
talking about avoiding false positives or (just) improving the accuracy
of modules, but about using that information to our advantage.

What do you guys think?

AI for web application scanning has been on my mind since I started
with w3af, but I really haven't found a problem for which I would say:
"The best / faster / easier to develop way to solve this is AI". Maybe
if we hit our heads hard enough, we can find something where AI is
applied and then state: "w3af/arachni , the only web app scanner with
AI" ? :)

Same here, and I'd rather avoid it too; that's why I presented this
thought of mine as a more fitting alternative for such situations.
I usually try to avoid unnecessary complexity like the plague.

Cheers,
Tasos L.

PS. I guess that this could be perceived as pre-trained expert system but
not really.



Michal Zalewski
Tue, Feb 1, 2011 8:56 PM

Lots of people in this list would like to see our tools implement some sort
of AI (I know for a fact that at least Michal does)

Lies! ;-)

To clarify, I don't think that the current "AI toolset" (ANN, genetic
algorithms, expert systems) is going to make a substantial difference.
These tools simply offer you a glorified framework for brute-forcing
quasi-optimal decision algorithms in some not-too-complicated cases.
Sometimes they may arrive at results better than what would be possible
with, ahem, a man-made algorithm; other times, they work just as well or
worse, and just introduce a layer of indirection.

There's a cost to that layer, too: when your "dumb" scanner incorrectly
labels a particular response as XSRF, you just tweak several lines of
code. If the same determination is made by a complex ANN with hundreds
of inputs, there is no simple fix. You may retrain it with new data,
which may or may not help; and even if it helps, it may decrease
performance in other areas. Then you have to change the topology or
inputs, or the learning algorithm... and none of this guarantees
success.

Web scanners do lack certain cognitive abilities of humans, which
makes these tools fairly sucky - but I don't think we know how to
approximate these abilities with a computer yet; they're mostly
related to language comprehension and abstract reasoning.

Do we really need AI?

The term is sort of meaningless. Scanners will require a lot of human
assistance until they can perform certain analytic tasks that
computers currently suck at; calling it "AI" is probably just a
distraction.

/mz

Tasos Laskos
Tue, Feb 1, 2011 9:53 PM

My bad; the statement I had read about your thoughts on AI wasn't
accompanied by such an elaborate explanation.
(Anyway, let's forget the term AI for now and focus on my proposed system.)

My point was the interpretation of results, as I mentioned in my
previous reply (list latency can play tricks on us).

In very simple terms, you as a person can draw/extrapolate conclusions
from the results of a scan based on a combination of factors.
Such as:

  • if a specific page takes an abnormally long time to respond, then it
    must be doing some heavy-duty processing (and can thus be the perfect
    DoS point)
  • if a lot of requests timed out from some point on, then the server died

These are very simple examples and anyone worth his salt will be able to
come to the same conclusion by himself.

However, the reason scanners exist is to make relatively simple
scenarios easy and quick to find,
so why not incorporate our insights and conclusions in there too?

Each module, when run, does not yet have a wide view of the whole
process, because the process itself is still in progress.
Separate entities that are called after the scan has finished, and given
the whole picture, will be able to draw well-rounded conclusions based
on a combination of factors.

And since they will be written by people, they will be as good as the
people who write them.

These entities will just check for the existence (or lack) of available
data, very straightforward stuff, and be developed and incorporated into
the whole system on a case-by-case basis.
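
Shape-wise I'm picturing something like the following (the class and
attribute names are invented, it's only meant to show how small each
rule would be):

class PostScanRule:
    """A rule receives the finished scan (the whole picture) and may add remarks."""
    def run(self, scan, report):
        raise NotImplementedError

class ServerDiedRule(PostScanRule):
    """If most requests timed out from some point on, conclude the server gave up."""
    TIMEOUT_RATIO = 0.5  # assumed threshold

    def run(self, scan, report):
        responses = scan.responses
        if not responses:
            return
        timed_out = sum(1 for r in responses if r.timed_out)
        if timed_out / len(responses) > self.TIMEOUT_RATIO:
            report.add_remark(
                "A large portion of requests timed out; the server appears to "
                "have been stressed to its limits. Findings gathered after that "
                "point should be verified manually."
            )

# Contributed rules get registered here and replayed after every scan.
RULES = [ServerDiedRule()]

def post_scan_analysis(scan, report):
    for rule in RULES:
        rule.run(scan, report)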

  1. I run a scan and see weird behavior
  2. I look into the situation
  3. I find that the root of that behavior is something very valuable to
    my security assessment
  4. I create a "rule" that will check for the same behavior during future
    scans and replay the knowledge I had previously gained
  5. I send my rule back to the project for everyone to use

Up to now we've only been concerned about identifying issues and (to my
knowledge) have neglected to include interpretation of combinations of
available issues.

Does that make any sense?

-Tasos

Lots of people in this list would like to see our tools implement some sort
of AI (I know for a fact that at least Michal does)

Lies!;-)

To clarify, I don't think that the current "AI toolset" (ANN, genetic
algorithms, expert systems) is going to make a substantial difference.
These tools simply offer you a glorified framework for brute-forcing
quasi-optimal decision algorithms in some not-too-complicated cases.
One time, they may arrive at results better than what would be
possible with, ahem, a man-made algorithm; other times, they work just
as well or worse, and just introduce a layer of indirection.

There's a cost to that layer, too: when your "dumb" scanner
incorrectly labels a particular response as XSRF, you just tweak
several lines of code. If the same determination is made by a complex
ANN with hundreds of inputs, there is no simple fix. You may retrain
it with new data, which may or may not help; and even if it helps, it
may decrease performance in other areas. Then you have to change the
topology or inputs, or the learning algorithm... and nothing of this
guarantees success.

Web scanners do lack certain cognitive abilities of humans, which
makes these tools fairly sucky - but I don't think we know how to
approximate these abilities with a computer yet; they're mostly
related to language comprehension and abstract reasoning.

Do we really need AI?

The term is sort of meaningless. Scanners will require a lot of human
assistance until they can perform certain analytic tasks that
computers currently suck at; calling it "AI" is probably just a
distraction.

/mz

Michal Zalewski
Tue, Feb 1, 2011 11:33 PM

Up to now we've only been concerned about identifying issues and (to my
knowledge) have neglected to include interpretation of combinations of
available issues.

Well, there's usually some more sophisticated logic in place (several
examples come to mind), but also definitely a lot of room for
improvement when it comes to detecting and correcting anomalies mid-
and post-scan. I don't think you're going to get any negative feedback
on plugging this hole with a modular solution :-)

/mz

Tasos Laskos
Wed, Feb 2, 2011 12:03 AM

Up to now we've only been concerned about identifying issues and (to my
knowledge) have neglected to include interpretation of combinations of
available issues.

Well, there's usually some more sophisticated logic in place (several
examples come to mind), but also definitely a lot of room for
improvement when it comes to detecting and correcting anomalies mid-
and post-scan. I don't think you're going to get any negative feedback
on plugging this hole with a modular solution :-)

/mz

C'mon man, don't keep them to yourself, share the examples.

The good thing about Arachni is that you can add this sort of thing in
a jiffy, as a simple plugin with sub-modules.
So I'll probably whip up a demo tomorrow to give you guys a better idea
of what I'm talking about.

This is either going to become incredibly cool or completely useless;
either way, it's worth a shot.

- Tasos
Michal Zalewski
Wed, Feb 2, 2011 12:16 AM

C'mon man don't keep to yourself share the examples.

Well, they're pretty trivial, but for example, skipfish does a
postprocessing round to eliminate duplicates and other loops; and
generally looks at a variety of information collected in earlier
checks to make decisions later on (e.g., the outcome of 404 probes,
individual and cumulative - if there are too many 404 signatures,
something has obviously gone wrong; etc). Nothing of this is special,
and it does not prevent it from being extremely dumb at times, but
it's probably unfair to say that absolutely no high-level
meta-analysis is being done.
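
The gist of that kind of cumulative check, in throwaway pseudocode
(made-up names and limit, not skipfish's actual code):

MAX_404_SIGNATURES = 8  # assumed limit; past this, 404 detection is unreliable

def review_404_signatures(signatures):
    """Warn if the crawl collected too many distinct 404 fingerprints."""
    if len(set(signatures)) > MAX_404_SIGNATURES:
        return ("Too many distinct 404 signatures observed; the site's "
                "not-found handling is inconsistent, so results that depend "
                "on 404 detection should be treated with suspicion.")
    return None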

/mz

Tasos Laskos
Wed, Feb 2, 2011 12:35 AM

C'mon man don't keep to yourself share the examples.

Well, they're pretty trivial, but for example, skipfish does a
postprocessing round to eliminate duplicates and other loops; and
generally looks at a variety of information collected in earlier
checks to make decisions later on (e.g., the outcome of 404 probes,
individual and cumulative - if there are too many 404 signatures,
something has obviously gone wrong; etc). Nothing of this is special,
and it does not prevent it from being extremely dumb at times, but
it's probably unfair to say that absolutely no high-level
meta-analysis is being done.

/mz

Of course not; I didn't mean to say that all scanners just blindly spew
out their findings, but they seem to be sticking with the bare minimum,
just enough for their results to make sense and reduce some noise.
(Myself included, obviously.)

Like you said, there's much room for improvement and we need to start
from somewhere.

Erik Peterson
Wed, Feb 2, 2011 2:59 AM

The real problem is that this is a deep dark hole.

Automated web application testing would be easy(tm) if it weren't for
the edge cases, of which there are legion. Oh, and because web
technology reinvents itself every six months, anything you create that
worked six months ago will break sooner or later; thus the cycle of edge
cases is, for all practical purposes, infinite.

Humans handle edge cases almost subconsciously; scanners, on the other
hand, tend to go sideways. But like MZ says, the (better) scanners do a
lot of processing before, during and after the scan to handle as much as
they can gracefully. WebInspect and AppScan (just as an example) both do
pre-processing along with some black magic regarding 404 checking, site
availability, response times and login/logout detection, to name a few
things, with some more interesting post-processing tricks coming in the
future, I'm sure.

But again, the problem is that this is a bottomless pit unless you draw
the line somewhere. You will never run out of edge cases to deal with,
and even if you make it modular, at what point is it smarter to just use
your brain and notice that the scan has gone wrong, versus trying to
write code to deal with all the edge cases?

Unless, of course, you are trying to build a tool that will be used by
people who know nothing about web applications or web security, to which
I say "Good luck with that" :)

++EJP

On 2/1/11 7:35 PM, "Tasos Laskos" <tasos.laskos@gmail.com> wrote:

C'mon man don't keep to yourself share the examples.

Well, they're pretty trivial, but for example, skipfish does a
postprocessing round to eliminate duplicates and other loops; and
generally looks at a variety of information collected in earlier
checks to make decisions later on (e.g., the outcome of 404 probes,
individual and cumulative - if there are too many 404 signatures,
something has obviously gone wrong; etc). Nothing of this is special,
and it does not prevent it from being extremely dumb at times, but
it's probably unfair to say that absolutely no high-level
meta-analysis is being done.

/mz

Of course not, I didn't mean to say that all scanners just blindly spew
out their findings,
but they seem to be sticking with the bare minimum, just enough for
their results to make sense and reduce some noise.
(Myself included obviously.)

Like you said, there's much room for improvement and we need to start
from somewhere.


