[squid-users] Squid4 has extremely low hit ratio due to lack of ignore-no-cache

Yuri Voinov yvoinov at gmail.com
Mon Oct 26 10:41:45 UTC 2015


I understand perfectly. However, all I changed was the version of the 
proxy; did every site change its headers at the same moment? If I go 
back to version 3.4, everything will be as it was. I have already done so.


That is why I ask the question: what has changed so much that with the 
same configuration I get a cache hit ratio ten times smaller?

26.10.15 16:29, Eliezer Croitoru wrote:
> Hey Yuri,
>
> What have you tried so far to understand the issue?
> From your question I assumed that you had run some tests on some 
> well-defined objects.
> To assess the state of squid you would need some static objects and 
> some changing objects.
> You would also need to test using a simple UA like a 
> script\wget\curl and using a fully-fledged UA such as 
> chrome\firefox\explorer\opera\safari. A quick scripted test is 
> sketched below.
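>
> For example, in Python it could look like the following (a minimal 
> sketch of mine, not a verbatim tool; the proxy address 127.0.0.1:3128 
> is an assumption, adjust it to your setup):
>
>   #!/usr/bin/env python3
>   # Fetch the same URL twice through the proxy and print Squid's
>   # X-Cache header: the second fetch should say "HIT from <host>"
>   # if the object was cached.
>   import urllib.request
>
>   PROXY = {"http": "http://127.0.0.1:3128"}  # assumed proxy address
>   URL = "http://djmaza.info/"                # any static test object
>
>   opener = urllib.request.build_opener(urllib.request.ProxyHandler(PROXY))
>   for attempt in (1, 2):
>       with opener.open(URL) as resp:
>           resp.read()
>           print(attempt, resp.headers.get("X-Cache"))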
>
> I can try to give you a couple of cache-friendly sites which will show 
> you the basic status of squid.
> In the past someone here asked how to cache the site: http://djmaza.info/
> which by default is built from lots of static content such as html 
> pages, pictures and good css, and also implements cache headers nicely.
> I still do not know why wordpress main pages do not have valid cache 
> headers, and what I mean by that is: is it not possible to have a page 
> cached for 60 seconds? How many updates can occur in 60 seconds? 
> Would a "must-revalidate" cost that much compared to what there is 
> now? (Maybe it does.)
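>
> A response header like the following (an illustration of mine, not 
> what wordpress actually sends) would allow exactly that:
>
>   Cache-Control: public, max-age=60, must-revalidate
>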
> Another one would be https://www.yahoo.com/ which is mostly 
> cache-friendly yet defines its main page with "Cache-Control: no-store, 
> no-cache, private, max-age=0", since it is a news page which updates 
> too often.
>
> I was looking for cache testing subjects and I am still missing a 
> couple of examples\options.
>
> Until now, what I have mainly used for basic tests was redbot.org and 
> a local copy of it.
> And while writing to you and looking for test subjects, I noticed that 
> my own web server poses a question for squid.
> I have a case which gives a great example of how to "break" the 
> squid cache.
> So I have my website and the main page:
> http://ngtech.co.il/
> https://ngtech.co.il/ (self-signed certificate)
>
> which is cached properly by squid!
> If for any reason I removed the Last-Modified header from the page, it 
> would become uncachable in squid (3.5.10) with default settings.
> I accidentally turned ON the apache option to treat html as php, using 
> the apache configuration line:
> AddType application/x-httpd-php .html .htm
>
> which is recommended by the first google result for "html 
> as php".
> Once you try the next page (with this setting on):
> http://ngtech.co.il/squidblocker/
> https://ngtech.co.il/squidblocker/ (self-signed certificate)
>
> you will see that it is not being cached at all.
> Redbot.org claims that the cache is allowed to assign its own freshness 
> to the object, but squid (3.5.10) will not cache it whatsoever, no 
> matter what I do.
> When I remove the "html as php" tweak, the page responds with a 
> Last-Modified header and can be cached again.
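>
> The difference is easy to see in the response headers (an illustration 
> of mine, not a verbatim capture): served as static html the response 
> carries a validator such as
>
>   Last-Modified: Sun, 25 Oct 2015 12:00:00 GMT
>
> while through the php handler no Last-Modified (and no other validator) 
> is emitted, so squid has nothing to base its freshness heuristic on.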
>
> I am unsure what the culprit for the issue is, but I will ask about it 
> in a separate thread *if* I get no response here. (Sorry for 
> partially top-posting.)
>
> Eliezer
>
>
> On 25/10/2015 21:29, Yuri Voinov wrote:
>> In a nutshell: I do not need possible explanations. I want to know
>> whether it is a bug or by design.
>>
>> 26.10.15 1:17, Eliezer Croitoru wrote:
>>> Hey Yuri,
>>>
>>> I am not sure whether you think that Squid version 4 with an extremely
>>> low hit ratio is bad or not, but I can understand your view of things.
>>> Usually I redirect people to this page:
>>> http://wiki.squid-cache.org/Features/StoreID/CollisionRisks#Several_real_world_examples
>>>
>>> But this time I can proudly say that the squid project is doing things
>>> the right way, even if it might not be understood by some.
>>> Before you or anyone declares that there is a low hit ratio due to
>>> something that is missing, I will try to put some sense into how things
>>> look in the real world.
>>> A small thing from a nice day of mine:
>>> I was sitting and talking with a friend of mine, an MD to be exact,
>>> and while we were talking I was comforting him about the wonders of
>>> computers.
>>> He was complaining about how the software in the office moves so
>>> slowly and how he needs to wait for it to respond with results. So I
>>> hesitated a bit, but then I asked him: "What would happen if some MD
>>> here in the office received the wrong content\results on a patient
>>> from the software?" He answered, terrified by the question: "He could
>>> make the wrong decision!" And then I described to him how he is in
>>> such a good place, since he does not need to fear such scenarios.
>>> In this same office Squid is used for many things, and it is
>>> crucial that, besides the option to cache content, the ability to
>>> validate the cache properly is set up right.
>>>
>>> I do understand that there is a need for caches, and sometimes it is
>>> crucial in order to give the application more CPU cycles or more RAM,
>>> but sometimes the hunger for cache can override the actual requirement
>>> for content integrity, and content must be re-validated from time to
>>> time.
>>>
>>> I have seen a couple of times how a cache in a DB or at other levels
>>> leads to very bad and unwanted results, while I do understand some of
>>> the complexity and the caution that programmers apply when building
>>> all sorts of systems with caches in them.
>>>
>>> If you do want to understand more about the subject, pick your
>>> favorite scripting language and just try to implement a simple object
>>> cache; a tiny sketch follows below.
>>> You would then see how complex the task can be, and maybe then you can
>>> understand why caches are not such a simple thing, and especially why
>>> ignore-no-cache should not be used in any environment if avoidable.
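>>>
>>> For example, in Python (a tiny sketch of mine with one fixed TTL;
>>> real caches must answer much harder questions):
>>>
>>>   import time
>>>
>>>   class ObjectCache:
>>>       """Simplest possible object cache with one fixed TTL."""
>>>       def __init__(self, ttl_seconds=60):
>>>           self.ttl = ttl_seconds
>>>           self.store = {}  # key -> (stored_at, value)
>>>
>>>       def get(self, key):
>>>           entry = self.store.get(key)
>>>           if entry is None:
>>>               return None              # MISS
>>>           stored_at, value = entry
>>>           if time.time() - stored_at > self.ttl:
>>>               # Stale: a real cache would revalidate with the origin
>>>               # (If-Modified-Since etc.), not just drop the object.
>>>               del self.store[key]
>>>               return None
>>>           return value                 # HIT
>>>
>>>       def put(self, key, value):
>>>           self.store[key] = (time.time(), value)
>>>
>>> Even this toy immediately raises the questions squid must answer per
>>> the HTTP RFCs: who decides the TTL, who revalidates, and what to do
>>> when the origin says "no-cache".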
>>>
>>> While I advise you not to use it, I will hint to you and others at
>>> another approach to the subject.
>>> If you are greedy, hunger for cache for specific sites\traffic, and
>>> would like to benefit from over-caching, there is a solution for that!
>>> - You can alter\hack the squid code to meet your needs.
>>> - You can write an ICAP service that alters the response headers so
>>> squid would think the response is cachable by default (see the sketch
>>> after this list).
>>> - You can write an ECAP module that alters the response headers in
>>> the same way.
>>> - Write your own cache service with your own algorithms in it.
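>>>
>>> To illustrate the ICAP\ECAP idea, here is only the header-rewriting
>>> logic such a service would perform (a sketch of mine; a real ICAP
>>> service additionally has to speak the ICAP protocol itself, RFC 3507):
>>>
>>>   def make_cachable(headers, max_age=3600):
>>>       """Strip anti-caching response headers and claim cachability.
>>>       `headers` is a plain dict of response header names to values."""
>>>       for name in ("Cache-Control", "Pragma", "Expires"):
>>>           headers.pop(name, None)
>>>       # Pretend the origin allowed caching for max_age seconds.
>>>       headers["Cache-Control"] = "public, max-age=%d" % max_age
>>>       return headers
>>>
>>>   print(make_cachable({"Cache-Control": "no-store, no-cache",
>>>                        "Content-Type": "text/html"}))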
>>>
>>> Take into account that the squid project tries to be as fault-tolerant
>>> as possible, due to it being a very sensitive piece of software in very
>>> big production systems.
>>> Squid does not try to meet a requirement of "Maximum Cache", and it is
>>> not squid that, as a caching proxy, reduces any cache percentage!
>>> The reason the content is not cachable is all these applications
>>> that declare their content as not cachable!
>>> For a second of sanity from the squid project, try to contact
>>> google\youtube admins\support\operators\forces\what-ever to understand
>>> how you would be able to benefit from a local cache.
>>> If and when you do manage to contact them, let them know I was looking
>>> for a contact and never managed to find one of them available to me by
>>> phone or email. You cannot say anything like that about the squid
>>> project: the squid project can be contacted using an email, and if
>>> required you can get hold of the man behind the software (while he is
>>> a human).
>>>
>>> And I will try to write it in a geeky way:
>>> deny_info 302:https://support.google.com/youtube/ big_system_that_doesnt_want_to_be_cached
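>>>
>>> (For that line to work it needs a matching ACL; a sketch, where the
>>> dstdomain definition is my assumption:)
>>>
>>>   acl big_system_that_doesnt_want_to_be_cached dstdomain .youtube.com
>>>   http_access deny big_system_that_doesnt_want_to_be_cached
>>>   deny_info 302:https://support.google.com/youtube/ big_system_that_doesnt_want_to_be_cached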
>>>
>>> Eliezer
>>>
>>> * P.S. If you do want to write an ICAP service or an ECAP module to
>>> replace "ignore-no-cache", I can give you some code that might help
>>> you as a starter.
>>>
>>>
>>> On 25/10/2015 17:17, Yuri Voinov wrote:
>>>>
>>> Hi gents,
>>>
>>> Has anyone who has tested SQUID 4 noticed the extremely low rate of
>>> cache hits with the new version? Particularly with respect to HTTPS
>>> sites using the "no-cache" directive? After replacing Squid 3.4 with 4,
>>> the squid cache hit ratio collapsed from 85 percent or more to the
>>> level of 5-15 percent. I believe this is due to the removal of support
>>> for the ignore-no-cache directive, which eliminates the possibility of
>>> aggressive caching and reduces the value of a caching proxy to almost
>>> zero.
>>>
>>> Plain HTTP caches normally. However, due to the widespread trend
>>> toward HTTPS, caching has dramatically decreased to unacceptable
>>> levels.
>>>
>>> Has anyone else noticed this effect? And what is the state of caching
>>> now?
>>>
>>>>
> _______________________________________________
> squid-users mailing list
> squid-users at lists.squid-cache.org
> http://lists.squid-cache.org/listinfo/squid-users



More information about the squid-users mailing list