[squid-dev] Digest related question.

Sun Feb 22 01:31:47 UTC 2015

On 22/02/2015 02:46, Amos Jeffries wrote:
> The response to a HEAD request is supposed to be exactly identical to a
> response to the GET, but with the body/payload/entity cropped off. Even
> the Content-Length headers etc should be present saying what size the
> body would have been.

So just to make sure I understand the reality:
If I start squid with 0 objects and run only a single GET transaction,
would it create two objects?
From what I have seen in the code in the past (very long ago), I remember
that it might not be this way.
So I cannot just run a HEAD request and expect it to reflect much
information about the cached data.

About the store log, I have one issue with it: I would not be able to
"know" whether an object is in the cache unless I follow everything in
the store log, including cache removals.

So one process that follows the store log and records it would be
enough to supply the list of currently "cached" objects per squid
instance, from startup until squid shuts down.
Trying to "save" that list so it survives a shutdown would be more
difficult.
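
A rough sketch of what such a follower could look like. It assumes the
native store.log layout (time, action, dir number, file number, hash,
status, date, last-modified, expires, type, sizes, method, URL) and
assumes that a RELEASE line carries the same dir/file number as the
earlier SWAPOUT line. The function names are made up for illustration
and this is not tested against a real squid:

```python
# Sketch only: replay store.log lines to know which URLs are on disk.
# Assumed fields: time action dir fileno hash status date lastmod
#                 expires type sizes method url

def apply_store_log_line(on_disk, line):
    """Update `on_disk`, a dict of (dir, fileno) -> URL, from one line."""
    fields = line.split()
    if len(fields) < 13:
        return  # not in the assumed format, ignore it
    action = fields[1]
    slot = (fields[2], fields[3])  # swap dir number + file number
    url = fields[-1]
    if action == "SWAPOUT":
        on_disk[slot] = url
    elif action == "RELEASE":
        # Memory-only objects have no matching slot, so pop() is a no-op.
        on_disk.pop(slot, None)


def cached_urls(lines):
    """Return the set of URLs still on disk after replaying all lines."""
    on_disk = {}
    for line in lines:
        apply_store_log_line(on_disk, line)
    return set(on_disk.values())
```

Since the state is rebuilt only from lines seen since startup, this
covers a single squid run and nothing before it.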

From another point of view, which is the actual content digest, I would
need some way to receive the file/response content, and I was thinking
about an ICAP-based solution which might be combined with the store log
"follower".

I have a scenario in mind for this kind of setup:
Fetch a URL, which will be digested as we go and then "followed", and
if the associated URL is purged from the cache, the digest will also be
erased from the DB.
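
To make the scenario a bit more concrete, here is a minimal sketch of
the bookkeeping. The class and method names are hypothetical, SHA-256
is just an assumed choice of hash, and a plain dict stands in for the
DB. The chunks would come from whatever delivers the response body
(for example the ICAP service), and purge() would be called by the
store log "follower":

```python
import hashlib


class DigestStore:
    """In-memory stand-in for the digest DB: URL -> hex SHA-256."""

    def __init__(self):
        self._in_progress = {}  # URL -> running hashlib object
        self._finished = {}     # URL -> final hex digest

    def feed(self, url, chunk):
        """Digest one body chunk (bytes) as it passes through."""
        hasher = self._in_progress.get(url)
        if hasher is None:
            hasher = hashlib.sha256()
            self._in_progress[url] = hasher
        hasher.update(chunk)

    def finish(self, url):
        """Call at end of body; stores and returns the final digest."""
        hasher = self._in_progress.pop(url, None)
        if hasher is not None:
            self._finished[url] = hasher.hexdigest()
        return self._finished.get(url)

    def purge(self, url):
        """Call when the URL is released from the cache."""
        self._in_progress.pop(url, None)
        self._finished.pop(url, None)

    def lookup(self, url):
        return self._finished.get(url)
```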

Now the next part is the right way to "know" whether the digest for a
request can be found in advance from the origin server.
It seems like a HEAD request to the origin server might be enough.
So the origin server must be publicly accessible, so that the "follower"
can fetch a HEAD of the file and look up any digest data in the
response.
In turn, a metalink file link in the HEAD response headers might be
usable for more complicated options.
There is another issue, digest "validation", which could be handled as
an option to black/white list a server for digest/metalink credibility.
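
A sketch of the HEAD lookup, assuming the origin server sends an
RFC 3230 style "Digest:" header and/or an RFC 6249 style
"Link: <...>; rel=describedby" header pointing at a metalink file.
The function names and the trusted_hosts whitelist are made up for
illustration, and the Link parsing is deliberately crude (it would
break on URLs containing commas):

```python
import re
import urllib.request
from urllib.parse import urlsplit


def parse_digest_header(value):
    """Turn 'SHA-256=...,MD5=...' into {'SHA-256': '...', 'MD5': '...'}."""
    digests = {}
    for item in value.split(","):
        # Split on the first '=' only, base64 padding may contain more.
        algo, sep, encoded = item.strip().partition("=")
        if sep:
            digests[algo.upper()] = encoded
    return digests


def find_metalink_links(link_values):
    """Return URLs from Link header values marked rel=describedby."""
    urls = []
    for value in link_values:
        for part in value.split(","):
            m = re.match(r"\s*<([^>]*)>(.*)", part)
            if not m:
                continue
            params = m.group(2).lower().replace('"', "")
            if "rel=describedby" in params and "metalink" in params:
                urls.append(m.group(1))
    return urls


def head_digest_info(url, trusted_hosts, timeout=10):
    """HEAD the origin; return digest data, or None if host is untrusted."""
    if urlsplit(url).hostname not in trusted_hosts:
        return None  # whitelist check happens before any network access
    req = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return {
            "digests": parse_digest_header(resp.headers.get("Digest", "")),
            "metalinks": find_metalink_links(
                resp.headers.get_all("Link") or []),
        }
```

The whitelist here only decides whose headers to believe. It does not
verify that the advertised digest matches the content.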

Do you think this idea looks a bit sane?

Eliezer