[squid-users] Vary object loop returns

Tue Jun 7 11:00:12 UTC 2016

On 7/06/2016 9:12 p.m., Yuri Voinov wrote:
> 
> 
> 
> 07.06.2016 5:13, Amos Jeffries wrote:
>> On 7/06/2016 6:20 a.m., Yuri Voinov wrote:
>>>
>>> By the way, we have another problem. Caching is greatly reduced by the
>>> presence of User-Agent in the Vary header. Although I know what Amos
>>> says: we should break the Internet and cache all types of user agents
>>> separately.
> 
>> Please do not put your own beliefs into my mouth. I am very much in
> Sorry, Amos. This was sarcasm. Maybe not very relevant.
> 
>> favour of denying Vary:User-Agent from being cached *at all* until the
>> UAs start sending sensible contents in the header. That would lower your
>> precious HIT ratio a tiny amount.
> Maybe. However, this greatly increases the content duplication. Agree?

I'm not sure what you are asking me to agree with there.

The Vary:User-Agent has three cases that can happen:

1) caching it properly results in a huge number of probably duplicated
variants in the cache. Each response, though, is guaranteed to be correct
for that UA when a HIT happens.
 I think we agree that situation is bad. Currently the store_miss
directive is the best way to avoid it, which makes case #2 happen
instead of case #1.

2) not caching it at all will lower the HIT ratio from sites using it.
However, a potentially large amount of cache space will become available
for other, better-caching sites to use and raise their HIT ratio.
Again, each response is guaranteed to be correct for that UA.

3) caching one random copy from the server and delivering it to all
clients regardless of their UA will produce a high HIT ratio. Something
like the HIT ratio from #1 plus the ratio gained by #2.
 However, each response will randomly break a) the client's expectations,
b) the server's expectations, and c) the site author's behaviour
expectations (i.e. the security model).

I believe #2 is the best tradeoff. You seem to be arguing for #3 solely
on the HIT ratio numbers, without considering the actual nasty side
effects.
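
As a rough illustration of the three cases, here is a toy Python sketch.
It is not how Squid implements Vary handling; ToyCache, variant_key,
toy_origin, demo and the policy names are all made up for this example,
and request header names are assumed to be lowercase already.

```python
from typing import Callable, Dict, List, Tuple

Headers = Dict[str, str]
Origin = Callable[[Headers], Tuple[str, str]]  # returns (body, Vary value)


def variant_key(url: str, request_headers: Headers, vary: str) -> tuple:
    """Build a cache key from the URL plus every request header named in Vary."""
    names = sorted(h.strip().lower() for h in vary.split(",") if h.strip())
    return (url,) + tuple((n, request_headers.get(n, "")) for n in names)


class ToyCache:
    """Hypothetical cache with one of three policies:
    "per_variant" (case #1), "no_store" (case #2), "ignore_vary" (case #3)."""

    def __init__(self, policy: str) -> None:
        self.policy = policy
        self.vary_for_url: Dict[str, str] = {}  # url -> last Vary value seen
        self.objects: Dict[tuple, str] = {}     # cache key -> stored body

    def _key(self, url: str, req: Headers) -> tuple:
        if self.policy == "ignore_vary":
            return (url,)  # case #3: one object for every client
        return variant_key(url, req, self.vary_for_url.get(url, ""))

    def fetch(self, url: str, req: Headers, origin: Origin) -> Tuple[str, str]:
        key = self._key(url, req)
        if key in self.objects:
            return self.objects[key], "HIT"
        body, vary = origin(req)
        varies_on_ua = "user-agent" in [h.strip().lower() for h in vary.split(",")]
        if self.policy == "no_store" and varies_on_ua:
            return body, "MISS"  # case #2: never store Vary:User-Agent replies
        self.vary_for_url[url] = vary
        self.objects[self._key(url, req)] = body
        return body, "MISS"


def toy_origin(req: Headers) -> Tuple[str, str]:
    """Pretend server that tailors the page to the User-Agent."""
    return "page for " + req.get("user-agent", "?"), "User-Agent"


def demo(policy: str) -> List[Tuple[str, str]]:
    """Send requests from UA "A", then "B", then "A" again."""
    cache = ToyCache(policy)
    url = "http://example.com/"
    return [cache.fetch(url, {"user-agent": ua}, toy_origin) for ua in ("A", "B", "A")]
```

With demo("per_variant") only the repeat request from "A" is a HIT, and
two variants sit in the cache. With demo("no_store") every request is a
MISS but each body is correct. With demo("ignore_vary") the request from
"B" is a HIT that returns "page for A", which is the breakage in #3.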

> 
>> BTW: Do you know what "breaking the Internet" actually means? It's a bit
>> ironic that you would throw that insult at me while praising what these
>> patches currently do.
> :) Really sorry.
> 
> Sometimes our experiments wreck the hell out of the displayed content
> in browsers. That's what I called "break the Internet". :) But
> seriously, we're trying to find a compromise between a high cache hit
> ratio and the desire to satisfy customers. These seem to me to be
> mutually exclusive things.

Ah. What I have been trying to get across is that they are not mutually
exclusive like that.

You have seen the display breakage. That is caused directly by things
like a client requesting identity encoding but being delivered a cached
gzip object. Or worse, the garbage output from trying to decompress an
identity object with a gzip decompressor. The display gets F*'d.

Note that the reason the wrong content came back in all the above
display problems was that it was a cache HIT and the proxy ignored part
of the Vary header when producing its response.
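
A small Python sketch of those two mismatches, using only the standard
gzip module. The decode_body helper and the sample page are hypothetical
and stand in for what a client does with a body after assuming a given
content encoding.

```python
import gzip


def decode_body(body: bytes, assumed_encoding: str) -> bytes:
    """Decode a response body the way a client would for the encoding it expects."""
    if assumed_encoding == "gzip":
        return gzip.decompress(body)
    return body  # identity: use the bytes as they are


page = b"<html>hello</html>"
gzipped = gzip.compress(page)

# Correct variant: a gzip-capable client gets the gzip object.
ok = decode_body(gzipped, "gzip")

# Wrong variant 1: an identity-only client is handed the cached gzip object
# and renders the compressed bytes as if they were the page.
garbage = decode_body(gzipped, "identity")

# Wrong variant 2: a client that expects gzip is handed the identity object;
# the decompressor rejects it (an OSError subclass in current Python).
try:
    decode_body(page, "gzip")
    failed = False
except OSError:
    failed = True
```

Here ok equals the original page, garbage does not, and failed ends up
True. Both wrong outcomes come from serving a HIT without honouring the
Accept-Encoding part of Vary.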

> 
> Meanwhile, Amos, I still believe that the team should now, in 2016,
> seriously reconsider its attitude to content compression and possibly
> revise the Vary processing algorithms in favor of a higher cache hit
> ratio. Under the conditions of high-speed Internet, only a high hit
> ratio justifies the use of a caching proxy.

What attitude? For my part the response (to every submission) is that a
patch doing it right will go in, not another hacked up job.

Who is this "the team"? It is all of us, including you.

We cannot revise the Vary algorithm; it is RFC-defined. The best we can
do for the foreseeable future is hacked-up tricks like filtering what
headers get sent through to the server. I am working part-time with the
Apache and Chrome devs in the IETF to design a replacement Key header
that works far better.

Amos