[squid-users] cache object with vary

Amos Jeffries squid3 at treenet.co.nz
Sun Aug 28 16:48:16 UTC 2016


On 28/08/2016 12:56 p.m., joe wrote:
> is this a bug, or is it made to work like that?
> let's say we have an object in the cache named 000000A5
> url.com/some.js
> vary=accept-encoding="gzip"
> 
> if some browser get the same object
> url.com/some.js
> vary=accept-encoding="deflate"
> 
> the md5 key won't match

Correct.

> and it deletes the old cached object with
> accept-encoding="gzip" and replaces it with the
> new one with vary=accept-encoding="deflate" and processes it as TCP_MISS

Incorrect.

The first thing Squid does is look up the URL (only). That finds a 'vary
marker' object which tells Squid the Vary header pattern to append to the
hash key, and Squid then does another lookup with that amended key.

The amended hash key for the second query finds no object ==> a MISS.
Period.

The existence or absence of the "gzip" object is unrelated, and that
object is not touched.
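
As an illustration only, the two-step lookup described above can be
sketched in Python. This is not Squid's code: the dict-based store, the
names `store_key`, `vary_mark` and `lookup`, and the `name=value` suffix
format are assumptions made for this sketch, following the simplified
key notation used elsewhere in this message.

```python
import hashlib


def store_key(url, mark=""):
    """MD5 over the URL plus an optional variant suffix (simplified)."""
    return hashlib.md5((url + mark).encode("utf-8")).hexdigest()


def vary_mark(vary_headers, request_headers):
    """Build e.g. 'accept-encoding=gzip' from the request headers.

    Header names are assumed to be lower-cased already.
    """
    return ", ".join(
        f"{name}={request_headers.get(name, '')}" for name in vary_headers
    )


def lookup(store, url, request_headers):
    """Return the cached object, or None for a MISS."""
    first = store.get(store_key(url))       # step 1: URL-only lookup
    if first is None:
        return None                          # nothing known for this URL
    if first.get("kind") != "vary-marker":
        return first                         # plain object, no variants
    mark = vary_mark(first["vary"], request_headers)
    return store.get(store_key(url, mark))   # step 2: amended key


url = "http://url.com/some.js"
store = {
    store_key(url): {"kind": "vary-marker", "vary": ["accept-encoding"]},
    store_key(url, "accept-encoding=gzip"): {
        "kind": "object",
        "body": "gzip variant",
    },
}

gzip_hit = lookup(store, url, {"accept-encoding": "gzip"})
deflate_miss = lookup(store, url, {"accept-encoding": "deflate"})
```

In this sketch the "deflate" request comes back as a MISS while the
"gzip" entry stays in the store under its own key.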

> 
> that will result in "varyEvaluateMatch: Oops. Not a Vary match on second
> attempt
> no match and the code in client_side.cc
> return VARY_CANCEL

IF:
 * the second lookup with the amended hash key *did* find an object, and
 * it was for the same URL, and
 * it has no Vary header;
then a warning message (the above?) is output and the found object will
be replaced with whatever comes back from the MISS resolving actions.
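
A rough Python sketch of those three conditions, for illustration only.
The function names, the outcome labels and the dict layout are
hypothetical; the real varyEvaluateMatch has more outcomes than the ones
modelled here, and it may behave differently in current code.

```python
def classify_second_lookup(found, request_url):
    """Classify the result of the second (amended-key) lookup."""
    if found is None:
        return "miss"                 # no object under the amended key
    same_url = found.get("url") == request_url
    has_vary = bool(found.get("vary"))
    if same_url and not has_vary:
        return "vary_cancel"          # the three conditions listed above
    return "other"                    # every other case is not modelled


def resolve(store, key, request_url, fetch_from_server):
    """On a miss or cancel, replace the entry with the fetched response."""
    outcome = classify_second_lookup(store.get(key), request_url)
    if outcome == "vary_cancel":
        print("warning: not a Vary match on second attempt")
    if outcome in ("miss", "vary_cancel"):
        store[key] = fetch_from_server()
    return outcome


url = "http://url.com/some.js"
# Hypothetical store holding an object without Vary under a variant key.
store = {"variant-key": {"url": url, "vary": None, "body": "no-Vary object"}}
fresh = {"url": url, "vary": "Accept-Encoding", "body": "deflate variant"}

first_outcome = resolve(store, "variant-key", url, lambda: fresh)
second_outcome = resolve(store, "variant-key", url, lambda: fresh)
```

The first call takes the cancel path and overwrites the entry; the
second call then finds an object that carries Vary and leaves it alone.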


I think you can get yourself into this type of situation when using
Store-ID in ways prohibited by the Store-ID design.

 Requirement #1 for Store-ID is that all objects found by the custom ID
key are identical.

 Variants are non-identical by definition. So at least one variant of an
object with Vary is not going to be identical to an object that lacks Vary!


You can also encounter it with SMP workers at times. Since the workers
are processing more traffic than ever before, the churn and the rate of
key hash collisions are potentially greater.


> 
> and in client_side_reply.cc
> 
>     case VARY_CANCEL:
>         /* varyEvaluateMatch found a object loop. Process as miss */
>         debugs(88, DBG_IMPORTANT, "clientProcessHit: Vary object loop!");

NP: the above statements may or may not be true. The code was written a
long time ago and things around it have changed a lot in the meantime.

>         http->logType = LOG_TCP_MISS; // we lack a more precise LOG_*_MISS code
>         processMiss();
>         return;
> 
> the way it should work, instead of replacing the existing object, is to
> store another object with the
> new vary.
> there should be 2 files, 000000A5
> and 000000A6 for example, each one with a different vary to match the
> correct object, whether it is gzip or identity or deflate or with user-agent.
> when the vary does not match, a new object file should be saved under a
> different cache name, 000000A7,
> so it matches the correct object name with its vary
> 

In case the above wasn't clear enough, this is how Squid does it:

 key: MD5("http://url.com/some.js")
 data: vary-marker object ("Vary:Accept-Encoding", ...)

 key: MD5("http://url.com/some.js" + "accept-encoding=")
 data: variant response for requests with no Accept-Encoding

 key: MD5("http://url.com/some.js" + "accept-encoding=identity")
 data: "identity" variant response

 key: MD5("http://url.com/some.js" + "accept-encoding=gzip")
 data: "gzip" variant response

 key: MD5("http://url.com/some.js" + "accept-encoding=deflate")
 data: "deflate" variant response

 key: MD5("http://url.com/some.js" + "accept-encoding=deflate,gzip")
 data: "deflate,gzip" variant response

 key: MD5("http://url.com/some.js" + "accept-encoding=gzip,deflate")
 data: "gzip,deflate" variant response

 ... and so on for all possible unique strings that could be sent in
Accept-Encoding.
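
To make the point about unique strings concrete, here is a small Python
sketch that computes keys in the simplified notation of the list above.
The `variant_key` helper is hypothetical, and the exact bytes Squid feeds
into its hash are not reproduced here.

```python
import hashlib

URL = "http://url.com/some.js"


def variant_key(url, accept_encoding):
    """MD5 of the URL plus 'accept-encoding=<value>' (simplified notation)."""
    text = f"{url}accept-encoding={accept_encoding}"
    return hashlib.md5(text.encode("utf-8")).hexdigest()


values = ["", "identity", "gzip", "deflate", "deflate,gzip", "gzip,deflate"]
keys = {value: variant_key(URL, value) for value in values}

for value, key in keys.items():
    print(f"{value!r:16} -> {key}")
```

Each distinct string produces its own key, so "gzip,deflate" and
"deflate,gzip" end up as two separate entries in this sketch.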


If one of those 'data' objects contains a 'wrong' response object, the
transaction encountering it is handled as a MISS (VARY_CANCEL), and that
store location gets updated with the correct content resulting from the
server fetch.

Amos


