[squid-dev] [PATCH] Case-insensitive URI schemes

Eduard Bagdasaryan eduard.bagdasaryan at measurement-factory.com
Fri Jan 6 16:41:41 UTC 2017


On 06.01.2017 15:27, Amos Jeffries wrote:
> As a result, the code responsible for lower-case
>> transformation was not executed.
>
> That is intentional behaviour for several reasons;
>
> 1) it improves transparency and reduces risks from proxy 
> fingerprinting by systems probing the URI scheme handling by the 
> transport agents (ie, fingerprinting Squid).
>
> 2) unknown URI schemes are not necessarily handled properly as 
> case-insensitive by the experimental agents sending and receiving the 
> messages.
>
> also, (and more importantly);

The patch does not change this, i.e., "unknown" images are still stored 
without
down-casing.

>
> 3) the transport protocol label and URI scheme label are still 
> conflated. The scheme down-casing procedure is _only_ applicable when 
> translating from ProtocolType_str labels (upper case) to scheme label 
> (lower case).

To avoid misunderstanding I pay your attention that the unpatched Squid 
did not
down-case at all (i.e. for known ProtocolType_str schemes too). In other 
words, when
receiving HTTP://example.com "HTTP" was not down-cased. Just this 
violates HTTP
caching rules: two different cache entries were created for 
HTTP://example.com
and http://example.com requests.

>
>
> 4) storing the down-cased string for registered protocols of each URI 
> avoids many explicit down-casing operations on use/display of the URI 
> scheme. Note that is specific to the known protocols.
>
>  - There are many more points of code displaying the scheme than 
> setting it. So this is a significant performance gain despite the 
> overhead of allocating and own-casing a new SBuf per UriScheme object 
> your patch notes with an XXX.

I am not against allocating and storing down-cased SBuf "image_" (for 
performance sake).
The related XXX is about  allocating SBuf which we probably can avoid in 
future optimization.
For example, we could do this by converting ProtocolType_str to a const 
array of SBufs, thus
avoiding image_ member allocation when dealing with known protocols.

>
>>
>> There are a couple of XXX and one of them (about "broken caller" for
>> PROTO_NONE) needs clarification. This caller uses PROTO_NONE
>> (i.e., absent scheme) together with "http" scheme image inside
>> clientReplyContext::createStoreEntry() which looks inconsistent and
>> probably is a bug. Am I missing anything there?
>
> Thats an odd case. I'm not exactly sure what we should be doing there. 
> Thus the unusual parameter combo. It was done that way to minimize 
> breakage with old code - so the request can be detected as missing 
> scheme (avoid confusing with a real HTTP URL) if anything tries to 
> compare it, but still generates a reasonably correct URL if its dumped 
> to store files.
>

Since you agree that it is an "odd" case (and we are not going to fix it 
right now) I would leave
the "broken caller" XXX.

Please note that the major problem here is probably the caching problem 
as I noted above.
I have not fully investigated whether it has security aspects, i.e., 
affects some URL based ACLs
while comparing with stored URLs.


Eduard.

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.squid-cache.org/pipermail/squid-dev/attachments/20170106/b948b1b4/attachment.html>


More information about the squid-dev mailing list