<html>

  <head>

    <meta content="text/html; charset=windows-1252"

      http-equiv="Content-Type">

  </head>

  <body text="#000000" bgcolor="#FFFFFF">

    <br>

    <div class="moz-cite-prefix">On 06.01.2017 15:27, Amos Jeffries

      wrote:<br>

    </div>

    <blockquote

      cite="mid:23c730b91294cecb5d78b59c2120fa27@treenet.co.nz"

      type="cite">As a result, the code responsible for lower-case

      <br>

      <blockquote type="cite">transformation was not executed.

        <br>

      </blockquote>

      <br>

      That is intentional behaviour for several reasons;

      <br>

      <br>

      1) it improves transparency and reduces risks from proxy

      fingerprinting by systems probing the URI scheme handling by the

      transport agents (ie, fingerprinting Squid).

      <br>

      <br>

      2) unknown URI schemes are not necessarily handled properly as

      case-insensitive by the experimental agents sending and receiving

      the messages.

      <br>

      <br>

      also, (and more importantly);

      <br>

    </blockquote>

    <br>

    The patch does not change this, i.e., "unknown" images are still

    stored without<br>

    down-casing.<br>

    <br>

    <blockquote

      cite="mid:23c730b91294cecb5d78b59c2120fa27@treenet.co.nz"

      type="cite">

      <br>

      3) the transport protocol label and URI scheme label are still

      conflated. The scheme down-casing procedure is _only_ applicable

      when translating from ProtocolType_str labels (upper case) to

      scheme label (lower case).

      <br>

    </blockquote>

    <br>

    To avoid any misunderstanding: please note that the unpatched
    Squid did not<br>
    down-case at all (i.e., not even for known ProtocolType_str schemes). In
    other words, when<br>
    receiving


    <a class="moz-txt-link-freetext" href="HTTP://example.com">HTTP://example.com</a>, "HTTP" was not down-cased. Since URI schemes are case-insensitive, this alone violates
    HTTP<br>

    caching rules: two different cache entries were created for

    <a class="moz-txt-link-freetext" href="HTTP://example.com">HTTP://example.com</a><br>

    and <a class="moz-txt-link-freetext" href="http://example.com">http://example.com</a> requests.<br>

    <br>
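    To illustrate the caching effect, here is a minimal sketch in Python. It is not Squid code:<br>
    normalize_scheme() and cache_key() are hypothetical helpers, and the cache key is assumed<br>
    to be simply the URL string.<br>
    <pre>
```python
def normalize_scheme(url: str) -> str:
    """Lower-case only the scheme of an absolute URL (hypothetical helper)."""
    scheme, sep, rest = url.partition("://")
    if not sep:
        return url  # no scheme delimiter: leave the URL untouched
    return scheme.lower() + sep + rest


def cache_key(url: str, normalize: bool) -> str:
    """Toy cache key: the URL itself, optionally with a down-cased scheme."""
    return normalize_scheme(url) if normalize else url


urls = ["HTTP://example.com", "http://example.com"]

# Without down-casing, the two spellings yield two distinct keys.
unpatched_keys = {cache_key(u, normalize=False) for u in urls}

# With a down-cased scheme, both requests map to a single key.
patched_keys = {cache_key(u, normalize=True) for u in urls}
```
    </pre>
    Only the scheme is down-cased in this sketch; the rest of the URL is left as received.<br>
    <br>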

    <blockquote

      cite="mid:23c730b91294cecb5d78b59c2120fa27@treenet.co.nz"

      type="cite"><br>

      <br>

      4) storing the down-cased string for registered protocols of each

      URI avoids many explicit down-casing operations on use/display of

      the URI scheme. Note that is specific to the known protocols.

      <br>

      <br>

       - There are many more points of code displaying the scheme than

      setting it. So this is a significant performance gain despite the

      overhead of allocating and down-casing a new SBuf per UriScheme

      object your patch notes with an XXX.

      <br>

    </blockquote>

    <br>

    I am not against allocating and storing a down-cased SBuf "image_"
    (for performance's sake).<br>
    The related XXX is about the SBuf allocation itself, which we can probably
    avoid in a future optimization.<br>
    For example, we could convert ProtocolType_str to a
    const array of SBufs, thus<br>
    avoiding image_ member allocation when dealing with known protocols.<br>

    <br>
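    A rough sketch of that idea, written in Python only to show the shape of the lookup. The<br>
    PROTO_* constants, KNOWN_IMAGES and this UriScheme class are hypothetical stand-ins, not<br>
    Squid's actual ProtocolType, ProtocolType_str, or UriScheme definitions.<br>
    <pre>
```python
# Hypothetical stand-ins for Squid's protocol enum values.
PROTO_NONE, PROTO_HTTP, PROTO_HTTPS, PROTO_FTP, PROTO_UNKNOWN = range(5)

# Lower-case images for known protocols, built once and shared by all schemes.
KNOWN_IMAGES = ("", "http", "https", "ftp")


class UriScheme:
    """Toy scheme: stores its own image only for unknown protocols."""

    def __init__(self, proto: int, image: str = "") -> None:
        self.proto = proto
        # Unknown schemes keep the received spelling; known ones store nothing.
        self._image = image if proto == PROTO_UNKNOWN else None

    def image(self) -> str:
        if self.proto == PROTO_UNKNOWN:
            return self._image
        # Known protocols reuse the shared, already down-cased entry.
        return KNOWN_IMAGES[self.proto]
```
    </pre>
    In this sketch, UriScheme(PROTO_HTTP, "HTTP").image() returns the shared "http" entry, while<br>
    UriScheme(PROTO_UNKNOWN, "Foo").image() returns "Foo" unchanged.<br>
    <br>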

    <blockquote

      cite="mid:23c730b91294cecb5d78b59c2120fa27@treenet.co.nz"

      type="cite">

      <br>

      <blockquote type="cite">

        <br>

        There are a couple of XXX and one of them (about "broken caller"

        for

        <br>

        PROTO_NONE) needs clarification. This caller uses PROTO_NONE

        <br>

        (i.e., absent scheme) together with "http" scheme image inside

        <br>

        clientReplyContext::createStoreEntry() which looks inconsistent

        and

        <br>

        probably is a bug. Am I missing anything there?

        <br>

      </blockquote>

      <br>

      That's an odd case. I'm not exactly sure what we should be doing

      there. Thus the unusual parameter combo. It was done that way to

      minimize breakage with old code - so the request can be detected

      as missing scheme (avoid confusing with a real HTTP URL) if

      anything tries to compare it, but still generates a reasonably

      correct URL if it's dumped to store files.

      <br>

      <br>

    </blockquote>

    <br>

    Since you agree that it is an "odd" case (and we are not going to
    fix it right now), I would leave<br>
    the "broken caller" XXX in place.<br>

    <br>

    Please note that the major problem here is probably the caching
    problem described above.<br>
    I have not fully investigated whether it also has security implications, e.g.,
    whether it affects URL-based ACLs<br>
    that compare against stored URLs.<br>

    <br>

    <br>

    Eduard.<br>

    <br>

  </body>

</html>