[squid-dev] Digest related question.

Amos Jeffries squid3 at treenet.co.nz
Sun Feb 22 00:46:15 UTC 2015


On 22/02/2015 12:31 p.m., Eliezer Croitoru wrote:
> From what I understand, the HTTP protocol and some RFC docs that were
> mentioned on the list allow or provide a way to utilize the Digest header
> and/or Link headers, which might contain some digest data.
> 
> So the first question is about the current md5 hash which is being used
> by the internal index hashing.
> Assuming we want to allow the admin to change the default md5 hash into
> a sha1 or sha256 hash, how complicated will it be? Can it be considered a
> wanted feature?

Not complicated at all. There is even a UFS cache meta tag (TLV) already
assigned from when it was added years ago.

The problem back then, and still today, is that SHA1 is 300% slower than
MD5.

But there is no real need for anything else in the context of a cache
hash ID. The MD5 issues are specifically in how the rare-but-possible
collisions can be generated to discover keys when it's used as part of an
encryption cipher. A cache lookup hash has no secret info to decrypt, and
when sent over the wire represents a token *intended* to be replayable
to fetch public data anyway.


I've been keeping an eye on the AES hash speeds instead of SHA, since
AES is only 60% slower than MD5 today and has fewer collisions than even
SHA1. If there gets to be much more hardware acceleration, AES might
become a real contender for better cache hashing. But until it does,
it's not worth the trouble of coding.
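
A rough way to check the relative hash speeds on a given machine is a
small timing sketch like the one below. It is not taken from Squid: the
function names and the sample key are assumptions made for illustration,
and it only covers the algorithms exposed by Python's hashlib (MD5, SHA1,
SHA256), not an AES-based hash. The ratios will vary with hardware and
library build.

```python
import hashlib
import timeit


def time_hash(name, data, rounds=2000):
    """Return the seconds needed to hash `data` `rounds` times with `name`."""
    return timeit.timeit(lambda: hashlib.new(name, data).digest(), number=rounds)


def compare(names=("md5", "sha1", "sha256"),
            key=b"GET http://example.com/some/object"):
    """Time each algorithm on a short, URL-like key (an assumed sample input)."""
    return {name: time_hash(name, key) for name in names}


if __name__ == "__main__":
    results = compare()
    for name, secs in results.items():
        # Print each timing and its ratio relative to MD5.
        print(f"{name:8s} {secs:.4f}s  ({secs / results['md5']:.2f}x md5)")
```

Short keys are used on purpose, since a cache index hashes request keys
rather than object bodies.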



> 
> And the second question, about metalinks related integration with squid.
> In any scenario I see possible a digest of cache objects from the server
> side would require digest update of any of the in transit objects and
> another index or maybe two.
> Another aspect of this thing is the integrity of the src server.
> If the origin-server is indeed a hostile one we must not rely on it.
> So there is a policy which needs to be implemented in some way to allow
> an origin server which we rely on.
> 
> In order to prove this is indeed possible and applicable for whatever
> system there is out there I was thinking about writing a proof of
> concept of the idea.
> 
> I would like to not touch squid code at all in the first steps while
> implementing the proof of concept.
> 
> I need your help with the right point of view and ideas about how to
> prove the idea.
> What API or what options squid gives that can be used to implement the
> idea?
> What available programming resources are there that might help me with
> the task? (Assuming I am not a C/C++ programmer.)

I expect it should be workable to use a StoreID helper and some
request/reply_header_replace/add config options to set the Link headers
from a kv-pair note.
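
As a starting point for the helper side of that idea, here is a minimal
sketch of a StoreID helper in Python. It assumes the line-based helper
protocol in which Squid writes "[channel-ID] URL [extras]" on stdin and
expects "[channel-ID] OK store-id=..." or "[channel-ID] ERR" on stdout.
The CANONICAL table, the example URLs and the --serve flag are
hypothetical, and the squid.conf side (turning a returned kv-pair into a
Link header) is not shown.

```python
import sys

# Hypothetical mapping from mirror URLs to one canonical store ID.
CANONICAL = {
    "http://mirror1.example.com/file.iso": "http://files.example.internal/file.iso",
    "http://mirror2.example.com/file.iso": "http://files.example.internal/file.iso",
}


def handle_line(line, table=CANONICAL):
    """Build the helper reply for one request line from Squid."""
    parts = line.split()
    channel = ""
    # With helper concurrency enabled, the first token is a numeric channel ID.
    if parts and parts[0].isdigit():
        channel = parts[0] + " "
        parts = parts[1:]
    if not parts:
        return channel + "ERR"
    store_id = table.get(parts[0])
    if store_id is None:
        # ERR tells Squid to keep using the original URL as the store ID.
        return channel + "ERR"
    return f"{channel}OK store-id={store_id}"


def main():
    """Answer requests line by line, flushing so Squid is never left waiting."""
    for line in sys.stdin:
        sys.stdout.write(handle_line(line) + "\n")
        sys.stdout.flush()


# --serve is a made-up flag so that importing or testing never blocks on stdin.
if __name__ == "__main__" and "--serve" in sys.argv:
    main()
```

A real proof of concept would replace the static table with a lookup
keyed on whatever digest information is available for the URL.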

> 
> Another pointer is that I do not have an option(from an outside
> software) to run a lookup at the cache index for cached objects.

Not on the in-memory one, but store.log (and swap.state) should be
written to constantly with journal records of what is going on. So for a
proof of concept, tailing those files should be sufficient to
approximate the current memory state.
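
The tailing approach could be sketched roughly as follows. The field
layout is an assumption: the code expects whitespace-separated store.log
records with the action (SWAPOUT, SWAPIN, RELEASE, ...) in the second
field and the method and URL in the last two fields. The function names
are hypothetical, and log rotation is not handled.

```python
import time


def apply_record(index, line):
    """Update `index` (a dict keyed by (method, url)) from one store.log line."""
    fields = line.split()
    if len(fields) < 4:
        return  # Not a complete record; ignore it.
    action, method, url = fields[1], fields[-2], fields[-1]
    key = (method, url)
    if action in ("SWAPOUT", "SWAPIN"):
        index[key] = fields[0]  # Remember the timestamp of the last sighting.
    elif action == "RELEASE":
        index.pop(key, None)


def replay(lines):
    """Rebuild an approximate index from an iterable of journal lines."""
    index = {}
    for line in lines:
        apply_record(index, line)
    return index


def follow(path, poll=0.5):
    """Yield lines from `path` as they are appended (a simple 'tail -f')."""
    with open(path, "r", errors="replace") as fh:
        while True:
            line = fh.readline()
            if not line:
                time.sleep(poll)  # Nothing new yet; wait and retry.
                continue
            yield line
```

Feeding follow() into apply_record() keeps the dictionary close to what
Squid has swapped out, with objects that only live in memory not
represented.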


> The way things are now, when I am trying to access the object with a GET
> request I can get a result which will tell me if the object is in the
> cache using the headers but will force me\UA to fetch the object or ABORT.
> If I would use the HEAD method to request an object I will get a
> HIT\MISS but it will not be related at all to the GET object state since
> it's another object.

The response to a HEAD request is supposed to be exactly identical to a
response to the GET, but with the body/payload/entity cropped off. Even
the Content-Length headers etc should be present saying what size the
body would have been.
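
To observe this from outside Squid, a HEAD request can be sent through
the proxy and its headers inspected. The sketch below uses Python's
http.client. The function names are hypothetical, and it assumes the
proxy reports its cache status in an X-Cache header of the form
"HIT from <host>" or "MISS from <host>". Whether that status matches the
state of the corresponding GET object is exactly the point under
discussion here.

```python
import http.client


def head_via_proxy(proxy_host, proxy_port, url, timeout=10):
    """Send HEAD for `url` through an HTTP proxy; return (status, headers)."""
    conn = http.client.HTTPConnection(proxy_host, proxy_port, timeout=timeout)
    try:
        # A proxy expects the absolute URL as the request target.
        conn.request("HEAD", url)
        resp = conn.getresponse()
        resp.read()  # A HEAD response has no body, so this returns at once.
        return resp.status, {k.lower(): v for k, v in resp.getheaders()}
    finally:
        conn.close()


def cache_status(headers):
    """Return the first word of X-Cache (e.g. 'HIT' or 'MISS'), or None."""
    value = headers.get("x-cache")
    if not value:
        return None
    return value.split()[0]
```

Comparing the Content-Length and validator headers from this call with
those of a GET for the same URL shows whether the two responses agree.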


> So, assuming I will need some http interface\API that will allow me to
> run a query on squid index DB, will it make sense to write one?(If I
> missed it and there is a way to do so already..)
> 
> I had some time in the past to learn this document:
> https://cwiki.apache.org/confluence/display/TS/Metalink
> 
> which actually describes one approach to the subject.


Amos


