[squid-users] Vary object loop returns

Yuri Voinov yvoinov at gmail.com
Tue Jun 7 11:17:34 UTC 2016


07.06.2016 17:00, Amos Jeffries wrote:
> On 7/06/2016 9:12 p.m., Yuri Voinov wrote:
>> 07.06.2016 5:13, Amos Jeffries wrote:
>>> On 7/06/2016 6:20 a.m., Yuri Voinov wrote:
>>>> By the way, we have another problem. Caching is greatly reduced by the
>>>> presence of User-Agent in the Vary header. Although I know what Amos
>>>> says: he says we should break the Internet and cache all types of
>>>> user agents separately.
>>> Please do not put your own beliefs into my mouth. I am very much in
>> Sorry, Amos. This was sarcasm. Maybe not too relevant.
>>> favour of denying Vary:User-Agent from being cached *at all* until the
>>> UAs start sending sensible contents in the header. That would lower your
>>> precious HIT ratio a tiny amount.
>> Maybe. However, this greatly increases content duplication. Agree?
> I'm not sure what you are asking me to agree with there.
> The Vary:User-Agent has three cases that can happen:
> 1) caching it properly results in a huge number of probably duplicated
> variants in the cache. Each response though is guaranteed to be correct
> for that UA when a HIT happens.
>  I think we agree that situation is bad. Currently the store_miss
> directive is the best way to avoid that. Which makes #2 happen instead
> of this #1 case.
> 2) not caching it at all will lower the HIT ratio from sites using it.
> However a potentially large amount of cache space will become available
> for other better caching sites to use and raise their HIT ratio.
> Again each response is guaranteed to be correct for that UA.
> 3) caching one random copy from the server and delivering it to all
> clients regardless of their UA will produce a high HIT ratio. Something
> like the HIT ratio from #1 plus ratio gained by #2.
>  However, each response will randomly break a) the client's expectations,
> b) the server's expectations, and c) the site author's behaviour
> expectations (ie security model).
> I believe #2 is the best tradeoff. You seem to be arguing for #3 solely
> on the HIT ratio numbers, without considering the actual nasty side effects.
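
The three cases above can be made concrete with a small simulation. This is only a sketch under stated assumptions, not how Squid implements Vary: the origin() function, the policy names and the request list are hypothetical, and the cache is a plain dict with no size limit, so the space that case #2 frees for other sites is not modelled.

```python
from dataclasses import dataclass


@dataclass
class Stats:
    hits: int = 0    # responses served from the cache
    wrong: int = 0   # hits that delivered a body built for a different UA
    stored: int = 0  # number of objects held in the cache at the end


def origin(url, ua):
    """Hypothetical origin server whose body depends on the User-Agent."""
    return f"{url} rendered for {ua}"


def simulate(requests, policy):
    """Replay (url, ua) requests under one of three caching policies.

    "per-variant" : case #1, one cached copy per (URL, User-Agent)
    "no-store"    : case #2, Vary:User-Agent responses are never cached
    "ignore-vary" : case #3, one copy per URL served to every UA
    """
    cache = {}
    stats = Stats()
    for url, ua in requests:
        if policy == "no-store":
            continue  # always fetched from the origin, always correct
        key = (url, ua) if policy == "per-variant" else url
        if key in cache:
            stats.hits += 1
            if cache[key] != origin(url, ua):
                stats.wrong += 1
        else:
            cache[key] = origin(url, ua)
    stats.stored = len(cache)
    return stats


REQUESTS = [
    ("/index", "UA-1"),
    ("/index", "UA-2"),
    ("/index", "UA-1"),
    ("/index", "UA-3"),
    ("/index", "UA-2"),
]

for policy in ("per-variant", "no-store", "ignore-vary"):
    print(policy, simulate(REQUESTS, policy))
```

With this request list, "per-variant" gives 2 hits from 3 stored objects, "no-store" gives no hits and stores nothing, and "ignore-vary" gives 4 hits from a single stored object, 3 of which deliver a body meant for another UA.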

Your reasons are understandable, Amos. We have discussed them before. You
proceed from the presumption that webmasters act in good faith. However,
experience has shown that they do not. Many of them, Google especially,
actively oppose caching using every possible mechanism: adding a hash
to the URL (and this is the most harmless one!), URL encryption, and
header modifications, including the UA. Accordingly, I want to have a
mechanism for combating such unscrupulous webmasters, if possible
without a huge amount of manual work at night. I have other things to do
besides reverse engineering sites every night.

>>> BTW: Do you know what "breaking the Internet" actually means? It's a bit
>>> ironic that you would throw that insult at me while praising what these
>>> patches currently do.
>> :) Really sorry.
>> Sometimes our experiments scratch the hell out of the content displayed
>> in browsers. That's what I called "breaking the Internet". :) But
>> seriously, we're trying to find a compromise between a high cache hit
>> ratio and the desire to satisfy customers. It seems to me these are
>> mutually exclusive
> Ah. What I have been trying to get across is that they are not mutually
> exclusive like that.
> You have seen the display breakage. That is caused directly by things
> like a client requesting identity encoding but being delivered a cached
> gzip object. Or worse, the garbage output from trying to decompress an
> identity object with a gzip decompressor. The display gets F*'d.
> Note that the reason the wrong content came back in all the above
> display problems was that it was a cache HIT and the proxy ignored part
> of the Vary header when producing its response.
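
The breakage described above can be reproduced with a toy cache. This is a minimal sketch, assuming a hypothetical origin() that varies only on Accept-Encoding and a client that decodes strictly according to what it asked for (a simplification: real clients follow the Content-Encoding response header). None of the names below come from Squid.

```python
import gzip

BODY = b"<html>hello</html>"


def origin(accept_encoding):
    """Hypothetical origin: gzip body if the client asked for gzip, else identity."""
    return gzip.compress(BODY) if accept_encoding == "gzip" else BODY


def fetch(cache, url, accept_encoding, honour_vary):
    """Return the stored body; Accept-Encoding is part of the key only if Vary is honoured."""
    key = (url, accept_encoding) if honour_vary else url
    if key not in cache:
        cache[key] = origin(accept_encoding)
    return cache[key]


def client_view(accept_encoding, body):
    """Assumed client behaviour: decode according to the encoding it requested."""
    if accept_encoding == "gzip":
        try:
            return gzip.decompress(body)
        except OSError:  # gzip.BadGzipFile is a subclass of OSError
            return b"<undecodable>"
    return body


def demo(first, second, honour_vary):
    """Warm the cache with the `first` client, then show what the `second` client sees."""
    cache = {}
    fetch(cache, "/page", first, honour_vary)
    return client_view(second, fetch(cache, "/page", second, honour_vary))


# Vary ignored: the identity client is handed raw gzip bytes.
print(demo("gzip", "identity", honour_vary=False) == BODY)
# Vary ignored: the gzip client fails to decompress an identity body.
print(demo("identity", "gzip", honour_vary=False))
# Vary honoured: each client gets its own variant.
print(demo("gzip", "identity", honour_vary=True) == BODY)
```

In both failing runs the wrong body comes from a cache HIT on a key that left out the header named in Vary; once the header is part of the key, each client gets a body it can display.
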
Side effects can, in some cases, be ignored. (By the way, I have too much
reason to doubt users' good faith towards me and my work to worry
seriously about invisible side effects. I get notified about serious
breakage on sites. Everything else is on my conscience.)

In the case of a number of sites that create caching problems for me,
I am ready to accept some inaccuracy in operation; for example, this
applies to ad networks or web sites that oppose caching for selfish
purposes. I have often wondered why on earth Google so strongly resists
the caching of terabytes of YouTube content, when, to count views, it
would be enough to have a tiny non-cached JS that records the viewing
on the client side.
>> Meanwhile, Amos, I still believe that the team should now, in 2016,
>> seriously reconsider its attitude to content compression and possibly
>> revise the Vary processing algorithms in favor of a higher cache hit
>> ratio. With high-speed Internet, only a high hit ratio justifies the
>> use of a caching proxy.
> What attitude? For my part the response (to every submission) is that a
> patch doing it right will go in, not another hacked up job.
> Who is this "the team"? It is all of us including you.
> We cannot revise the Vary algorithm, it is RFC defined. The best we can
> do for the foreseeable future is hacked up tricks like filtering what
> headers get sent through to the server. I am part-time working with the
> Apache and Chrome devs in IETF to design a replacement Key header that
> works far better.
> Amos


