websecurity@lists.webappsec.org

The Web Security Mailing List

View all threads

Repository of site URL structures?

RA
Robert A.
Tue, Jun 21, 2011 5:36 PM

Hello everyone,
Is anyone aware of a site that contains a list of funky url structures
used by production sites? I am not looking for a reply telling me I should
look at the RFC guidelines because not everyone may be following them.

Regards,

- Robert
http://www.qasec.com/
http://www.webappsec.org
http://www.cgisecurity.com
CW
Chris Weber
Tue, Jun 21, 2011 5:49 PM

What are you trying to do Robert?  I've been amassing a list of URIs and
IRIs for testing purposes, you can check it out here:

https://github.com/cweb/iri-tests/blob/master/tests.xml

Webkit also has a testing suite at
http://trac.webkit.org/browser/trunk/LayoutTests/fast/url/ Note: I'm in
process of incorporating all of these tests into my test.xml above.

Everyone is definitely not following the RFC guidelines consistently.  I
built a test harness that correlates the DOM parsing of these URIs with the
HTTP request and the DNS queries.  The differences are dramatic in some
cases.

Thanks,
-Chris

The Web Security Mailing List

WebSecurity RSS Feed
http://www.webappsec.org/rss/websecurity.rss

Join WASC on LinkedIn http://www.linkedin.com/e/gis/83336/4B20E4374DBA

WASC on Twitter
http://twitter.com/wascupdates

websecurity@lists.webappsec.org
http://lists.webappsec.org/mailman/listinfo/websecurity_lists.webappsec.org

RA
Robert A.
Tue, Jun 21, 2011 5:58 PM

> What are you trying to do Robert?  I've been amassing a list of URIs and
> IRIs for testing purposes, you can check it out here:

There have been multiple situations where I've needed examples of ! and ;
as URL delimiters (which I've seen before but lack URLs for), or @ within
a URL (not in the context of user@domain.com auth), or URLs using commas
such as http://site/foo?12,12,12 .

I am just looking for a central repository that I can point people to.

> https://github.com/cweb/iri-tests/blob/master/tests.xml
>
> Webkit also has a testing suite at
> http://trac.webkit.org/browser/trunk/LayoutTests/fast/url/ Note: I'm in
> process of incorporating all of these tests into my test.xml above.

Cool this is helpful thanks.

> Everyone is definitely not following the RFC guidelines consistently.  I
> built a test harness that correlates the DOM parsing of these URIs with the
> HTTP request and the DNS queries.  The differences are dramatic in some
> cases.

So how come we haven't seen more advisories/bugs from you? Surely there
are tons to be found :)

Thanks Chris,


CW
Chris Weber
Tue, Jun 21, 2011 6:36 PM

> There have been multiple situations where I've needed examples of ! and ;
> as URL delimiters (which I've seen before but lack URLs for), or @ within
> a URL (not in the context of user@domain.com auth), or URLs using commas
> such as http://site/foo?12,12,12 .
>
> I am just looking for a central repository that I can point people to.
>
> > https://github.com/cweb/iri-tests/blob/master/tests.xml
> >
> > Webkit also has a testing suite at
> > http://trac.webkit.org/browser/trunk/LayoutTests/fast/url/ Note: I'm in
> > process of incorporating all of these tests into my test.xml above.
>
> Cool this is helpful thanks.

Those test cases are included in both of these repositories.  Webkit's is at
least stable, but it is spread out as arrays across a bunch of JS files.  The
test suite itself is only concerned with the DOM parsing.  It's nice and
portable so you can easily run it in any browser.

My tests.xml will be changing a lot over the next few weeks to include as
many of Webkit's tests as I can plus others.  Contributions are welcome!
Some goals of mine are to include all of the tests in one XML file with a
unique id per test, plus an expected result.  The reason you see a domain
name like iris.test.ing is because I use a custom DNS zone in my test
harness.

> > Everyone is definitely not following the RFC guidelines consistently.  I
> > built a test harness that correlates the DOM parsing of these URIs with the
> > HTTP request and the DNS queries.  The differences are dramatic in some
> > cases.
>
> So how come we haven't seen more advisories/bugs from you? Surely there
> are tons to be found :)

I expect the same :) But there's a lot of work to do still.  Right now it
seems like interop bugs mostly, and the exploit scenarios seem more
distributed, depending on how apps or security WAFs/filters/what-have-you
handle the strings.

Consider the following test cases.  Can you think of any 'general-purpose'
exploits?

=======================================
Test Case: http://0028.iris.test.ing;g

The DOM parsing is different in each of FF, IE, and Opera, while both Safari
and Chrome error.  FF drops the ";g", Opera uses it in the path, and IE uses
it in the hostname...

Scheme  Hostname              Path  Browser
:                                   Chrome/12.0.742.100
http:   0028.iris.test.ing    /     Firefox/4.0.1
http:   0028.iris.test.ing    /;g   Opera/9.80
:                                   Safari/5.0.5
http:   0028.iris.test.ing;g        MSIE 7.0

But the raw HTTP request is interesting because Firefox does it differently
than its DOM parsing.  None of Chrome, Safari, or IE even makes an HTTP
request.

Path  Browser
/;g   Firefox/4.0.1
/;g   Opera/9.80

=======================================
Test Case: http://0029.iris.test.ing;./g

In this slightly different case Firefox has changed its handling of the ";"
trailing the hostname, and treats it instead as part of the path.

Scheme  Hostname              Path   Browser
:                                    Chrome/12.0.742.100
http:   0029.iris.test.ing    /;./g  Firefox/4.0.1
http:   0029.iris.test.ing    /;./g  Opera/9.80
http:   0029.iris.test.ing;.  g      MSIE 7.0
:                                    Safari/5.0.5

Similar results as above for the HTTP request.

Path   Browser
/;./g  Firefox/4.0.1
/;./g  Opera/9.80
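
For comparison with a non-browser parser, here is a minimal sketch that runs the two test URIs above through Python 3's standard `urllib.parse.urlsplit`. It is not part of the test harness described in this thread, and the helper name `summarize` is made up for illustration. Assuming a recent CPython, `urlsplit` ends the authority only at `/`, `?`, or `#`, so the `;` stays in the hostname, which is closest to the MSIE 7.0 rows rather than to Firefox or Opera.

```python
from urllib.parse import urlsplit


def summarize(uri):
    """Return (scheme, hostname, path) as urlsplit() sees the URI."""
    parts = urlsplit(uri)
    return parts.scheme, parts.hostname, parts.path


# The two test URIs from the cases above.
for uri in ("http://0028.iris.test.ing;g", "http://0029.iris.test.ing;./g"):
    print(uri, "->", summarize(uri))
```

A server-side filter built on this parser would therefore see a different hostname than Firefox or Opera actually contact, which is the kind of distributed disagreement the message above is asking about.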

There are many more edge cases.  Check out the DOM parsing results of the
following test case:

http://0152.iris.test.ing/foo|bar/

Path         Browser
/foo%7Cbar/  Chrome/12
foo%7Cbar/   MSIE 7.0
/foo|bar/    Opera/9.80
/foo|bar/    Safari/5.0.5
/foo|bar/    Firefox/4.0.1

But the more interesting thing here is that the raw HTTP request doesn't
match for Safari:

Path         Browser
/foo%7Cbar/  Chrome/12
/foo%7Cbar/  MSIE 7.0
/foo|bar/    Opera/9.80
/foo%7Cbar/  Safari/5.0.5
/foo|bar/    Firefox/4.0.1

In this case Safari's DOM 'path' property is different from the raw HTTP
request 'path' it generates to fetch the resource.
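
The DOM-versus-request comparison can be sketched in a few lines of Python. This is only an illustration, not the actual harness: the dictionary and function names are made up, and the values are copied from the two tables for test 0152. Note that a strict string comparison flags Safari and also MSIE 7.0, whose DOM value in the first table has no leading slash.

```python
# DOM 'path' values and raw HTTP request paths for
# http://0152.iris.test.ing/foo|bar/, copied from the two tables above.
DOM_PATH = {
    "Chrome/12": "/foo%7Cbar/",
    "MSIE 7.0": "foo%7Cbar/",
    "Opera/9.80": "/foo|bar/",
    "Safari/5.0.5": "/foo|bar/",
    "Firefox/4.0.1": "/foo|bar/",
}
HTTP_PATH = {
    "Chrome/12": "/foo%7Cbar/",
    "MSIE 7.0": "/foo%7Cbar/",
    "Opera/9.80": "/foo|bar/",
    "Safari/5.0.5": "/foo%7Cbar/",
    "Firefox/4.0.1": "/foo|bar/",
}


def find_mismatches(dom, http):
    """Return {browser: (dom_path, http_path)} where the two values differ."""
    return {
        browser: (dom[browser], http[browser])
        for browser in dom
        if browser in http and dom[browser] != http[browser]
    }


for browser, (dom_path, http_path) in find_mismatches(DOM_PATH, HTTP_PATH).items():
    print(f"{browser}: DOM {dom_path!r} vs HTTP {http_path!r}")
```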

-Chris

AR
Andres Riancho
Tue, Jun 21, 2011 6:58 PM

Chris,

On Tue, Jun 21, 2011 at 2:49 PM, Chris Weber <chris@casabasecurity.com> wrote:
> What are you trying to do Robert?  I've been amassing a list of URIs and
> IRIs for testing purposes, you can check it out here:
>
> https://github.com/cweb/iri-tests/blob/master/tests.xml

Awesome stuff :) Quick question, how do you know what's the real
expected result? For example in:

<test>
  <id>0022</id>
  <uri>http://0022.iris.test.ing/a-umlaut/?x=ä&#x0023;#ä#ä</uri>
  <expected>
    <protocol>http:</protocol>
    <host>0022.iris.test.ing</host>
    <hostname>0022.iris.test.ing</hostname>
    <port></port>
    <pathname>/a-umlaut</pathname>
    <search>?x=ä&#x0023;</search>
    <hash>#ä#ä</hash>
  </expected>
  <comment>non-ASCII character in query string and fragment</comment>
</test>

Where did all the stuff in <expected></expected> come from? Have you
tested all these in IE, Firefox, Safari, Opera and extracted expected
results from there?

--
Andrés Riancho
Director of Web Security at Rapid7 LLC
Founder at Bonsai Information Security
Project Leader at w3af

CW
Chris Weber
Tue, Jun 21, 2011 7:09 PM

Hi Andres,

> -----Original Message-----
> From: Andres Riancho [mailto:andres.riancho@gmail.com]
> Sent: Tuesday, June 21, 2011 11:59 AM
> To: Chris Weber
> Cc: websecurity@lists.webappsec.org
> Subject: Re: [WEB SECURITY] Repository of site URL structures?
>
> Chris,
>
> On Tue, Jun 21, 2011 at 2:49 PM, Chris Weber <chris@casabasecurity.com>
> wrote:
> > What are you trying to do Robert?  I've been amassing a list of URIs
> > and IRIs for testing purposes, you can check it out here:
> >
> > https://github.com/cweb/iri-tests/blob/master/tests.xml
>
> Awesome stuff :) Quick question, how do you know what's the real expected
> result? For example in:

That's an important question isn't it :) Please ignore the <expected> stuff
for now, it's in flux.  Webkit has its own idea of what's expected, so some
of it comes from there, and the rest comes from the RFCs.  But it's still
questionable why Webkit chose its expected results.  I'm planning to keep
Webkit's expected results for now, and I'm considering basing the expected
result on the majority browser implementation, which means more testing and
data collection first.
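
One way to read "majority browser implementation" is a simple vote over the values each browser reports. The sketch below is an assumption about how that could work, not a description of how tests.xml is built; `majority_result` is a hypothetical name, and the sample data is the set of DOM path results for test 0152 reported earlier in this thread.

```python
from collections import Counter


def majority_result(observed):
    """Return (value, votes) for the most common observed value.

    `observed` maps browser name -> parsed value. Ties are not resolved
    here; Counter.most_common() simply returns one of the tied values.
    """
    counts = Counter(observed.values())
    value, votes = counts.most_common(1)[0]
    return value, votes


# DOM path results for http://0152.iris.test.ing/foo|bar/ from this thread.
dom_paths = {
    "Chrome/12": "/foo%7Cbar/",
    "MSIE 7.0": "foo%7Cbar/",
    "Opera/9.80": "/foo|bar/",
    "Safari/5.0.5": "/foo|bar/",
    "Firefox/4.0.1": "/foo|bar/",
}

print(majority_result(dom_paths))
```

A vote like this records what most browsers do, which is not necessarily what the RFCs say, so a real suite would probably want to keep both values side by side.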

-Chris

AR
Andres Riancho
Tue, Jun 21, 2011 7:40 PM

Chris,

On Tue, Jun 21, 2011 at 4:09 PM, Chris Weber <chris@casabasecurity.com> wrote:

> That's an important question isn't it :) Please ignore the <expected> stuff
> for now, it's in flux.  Webkit has its own idea of what's expected, so some
> of it comes from there, and the rest comes from the RFCs.  But it's still
> questionable why Webkit chose its expected results.  I'm planning to keep
> Webkit's expected results for now, and I'm considering basing the expected
> result on the majority browser implementation, which means more testing and
> data collection first.

Cool, I'm eager to see more work done on tests.xml; it will be a
perfect thing for testing w3af's URL parsing module! We already have
lots of doctests [0] but I would be really happy to integrate
tests.xml into a unit test that reads each <test>, makes
url_object parse the <uri>, and compares the object with <expected>.

Just noticed that you might be missing the test where you have a
param:    http://www.w3af.com/foo/bar?spam;eggs=1    where (eggs=1) is
the param.

[0] https://sourceforge.net/apps/trac/w3af/browser/trunk/core/data/parsers/urlParser.py
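
The integration described above could look roughly like the following sketch. It uses Python 3's `urllib.parse.urlsplit` as a stand-in for w3af's url_object, whose API is not shown in this thread. The element names follow the <test> entry quoted earlier; the `<tests>` root element, the id 9999, and the expected values in `SAMPLE_XML` are hypothetical and only there to make the example self-contained.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlsplit

# Hypothetical sample. Element names follow the <test> entry quoted in this
# thread; the <tests> root, the id, and the expected values are made up.
SAMPLE_XML = """\
<tests>
  <test>
    <id>9999</id>
    <uri>http://www.w3af.com/foo/bar?spam;eggs=1</uri>
    <expected>
      <protocol>http:</protocol>
      <host>www.w3af.com</host>
      <hostname>www.w3af.com</hostname>
      <port></port>
      <pathname>/foo/bar</pathname>
      <search>?spam;eggs=1</search>
      <hash></hash>
    </expected>
  </test>
</tests>
"""


def dom_style_parts(uri):
    """Map urlsplit() output onto DOM-style location field names."""
    parts = urlsplit(uri)
    return {
        "protocol": parts.scheme + ":",
        "host": parts.netloc,
        "hostname": parts.hostname or "",
        "port": "" if parts.port is None else str(parts.port),
        "pathname": parts.path,
        "search": "?" + parts.query if parts.query else "",
        "hash": "#" + parts.fragment if parts.fragment else "",
    }


def run_tests(xml_text):
    """Return {test id: {field: (expected, actual)}} for every mismatch."""
    failures = {}
    for test in ET.fromstring(xml_text).iter("test"):
        actual = dom_style_parts(test.findtext("uri"))
        expected = {el.tag: (el.text or "") for el in test.find("expected")}
        diff = {
            field: (value, actual.get(field))
            for field, value in expected.items()
            if actual.get(field) != value
        }
        if diff:
            failures[test.findtext("id")] = diff
    return failures


print(run_tests(SAMPLE_XML))  # an empty dict means every field matched
```

Returning the differences instead of raising on the first failure makes it easier to see which fields a parser disagrees on while the expected values are still in flux.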

Regards,

--
Andrés Riancho
Director of Web Security at Rapid7 LLC
Founder at Bonsai Information Security
Project Leader at w3af

CW
Chris Weber
Tue, Jun 21, 2011 10:19 PM

> -----Original Message-----
> From: Andres Riancho [mailto:andres.riancho@gmail.com]
> Sent: Tuesday, June 21, 2011 12:41 PM
> To: Chris Weber
> Cc: websecurity@lists.webappsec.org
> Subject: Re: [WEB SECURITY] Repository of site URL structures?
>
> Cool, I'm eager to see more work done on tests.xml , it will be a perfect
> thing for testing w3af's url parsing module! We already have lots of
> doctests [0] but I would be really happy to integrate tests.xml into a
> unit-test that reads each of the <test> , makes url_object parse the <uri>
> and compares the object with <expected>

Stay in touch, I'll get the Webkit tests integrated next, then clean up the
expected results.

> Just noticed that you might be missing the test where you have a
> param:    http://www.w3af.com/foo/bar?spam;eggs=1    (eggs=1) is the
> param.

Test cases are always welcome!

AH
Achim Hoffmann
Wed, Jun 22, 2011 8:02 PM

Hi Andres,

> Just noticed that you might be missing the test where you have a
> param:    http://www.w3af.com/foo/bar?spam;eggs=1    (eggs=1) is the
> param.

Not sure what your question is here, but according to RFC 1738 you have a
"searchpart" (aka query string), which in your example is
       spam;eggs=1

For those tools/frameworks/whatever which believe that a query string
consists of key=value pairs which must be separated by &, the key here
would be
       spam;eggs
and the value
       1

The ; in the path of a URL is the delimiter for parameters; it should
not be a special character in the searchpart. Example:
       http://f.q.d.n//path/to/file;parameter=2;par=3?search&key=val;ue

Therefore you have to URL-encode ; in the path, 'cause it separates path
from parameters, but it's not necessary in the searchpart.

All RFCs are vague about URL-encoding of special characters like / ; = | @

IIRC the same applies to |, but I haven't seen examples of that for
a very long time (maybe back when Netscape servers dominated the Internet :)

Sorry for being a bit off-topic, but hope it helps. At least Robert's
examples with the ; behind the FQDN are subject to it too, somehow.
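
As a rough illustration of the split described above, the sketch below feeds the example URL to Python 3's `urllib.parse`. This is one library's interpretation, not a statement of what the RFCs require. It assumes Python 3.10 or later, where `parse_qsl` accepts a `separator` argument; passing `"&"` makes the "pairs are separated by &" reading explicit.

```python
from urllib.parse import parse_qsl, urlparse

url = "http://f.q.d.n//path/to/file;parameter=2;par=3?search&key=val;ue"
parts = urlparse(url)

# urlparse() reports the ';' parameters of the last path segment separately.
print(parts.path)    # //path/to/file
print(parts.params)  # parameter=2;par=3
print(parts.query)   # search&key=val;ue

# Treat only '&' as the pair separator, so ';' stays inside keys and values.
pairs = parse_qsl(parts.query, keep_blank_values=True, separator="&")
print(pairs)         # [('search', ''), ('key', 'val;ue')]

# The query string from the earlier example: one key containing a ';'.
earlier_pairs = parse_qsl("spam;eggs=1", separator="&")
print(earlier_pairs)  # [('spam;eggs', '1')]
```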

Ciao,
Achim

AR
Andres Riancho
Thu, Jun 23, 2011 2:04 PM

Achim,

On Wed, Jun 22, 2011 at 5:02 PM, Achim Hoffmann <websec10@sic-sec.org> wrote:

> Hi Andres,
>
>> Just noticed that you might be missing the test where you have a
>> param:    http://www.w3af.com/foo/bar?spam;eggs=1    (eggs=1) is the
>> param.
>
> not sure what's your question here, but according RFC1738 you have a
> "searchpart" (aka query string) which is in your example
>        spam;eggs=1

Actually, what I meant was this:

    >>> import urlparse
    >>> urlparse.urlparse('http://www.w3af.com/filename.py;SESSION=321?id=1')
    ParseResult(scheme='http', netloc='www.w3af.com', path='/filename.py',
    params='SESSION=321', query='id=1', fragment='')

And I called it "param" not because that's the name in the RFC (AFAIK)
but because that's how python shows it to me :)
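
For anyone on Python 3, where the urlparse module became urllib.parse, a rough equivalent of the session above might look like the sketch below. It also shows that urlsplit() from the same module does not separate the params and leaves them in the path.

```python
from urllib.parse import urlparse, urlsplit

url = "http://www.w3af.com/filename.py;SESSION=321?id=1"

# urlparse() splits the ";..." part of the last path segment into .params
parsed = urlparse(url)
print(parsed.path)    # /filename.py
print(parsed.params)  # SESSION=321
print(parsed.query)   # id=1

# urlsplit() has no params field, so the ";SESSION=321" stays in the path
split = urlsplit(url)
print(split.path)     # /filename.py;SESSION=321
```

So whether the ";" section shows up separately depends on which of the two functions a tool happens to use.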

> For those tools/frameworks/whatever which believe that a query string
> consists of key=value pairs which must be separated by &, the key here
> would be
>        spam;eggs
> and the value
>        1
>
> The ; in the path of a URL is the delimiter for parameters; it should
> not be a special character in the searchpart. Example:
>        http://f.q.d.n//path/to/file;parameter=2;par=3?search&key=val;ue

Not sure if we're saying the same thing or not. What I'm trying to say
is that URLs can have a "special" section that starts with a ";" after
the filename, and tests.xml (as far as I could see) did not cover that
case.

> Therefore you have to URL-encode ; in the path, because it separates the
> path from the parameters, but that's not necessary in the searchpart.
>
> All RFCs are vague about URL-encoding of special characters like / ; = | @
>
> IIRC the same applies to |, but I haven't seen examples of that for
> a very long time (maybe back when Netscape servers dominated the Internet :)
>
> Sorry for being a bit off-topic, but I hope it helps. At least Robert's
> examples with the ; right behind the FQDN are subject to it too, somehow.
>
> Ciao,
> Achim

--
Andrés Riancho
Director of Web Security at Rapid7 LLC
Founder at Bonsai Information Security
Project Leader at w3af
