[squid-users] Squid4 has extremely low hit ratio due to lacks of ignore-no-cache

Eliezer Croitoru eliezer at ngtech.co.il
Mon Oct 26 10:29:07 UTC 2015


Hey Yuri,

What have you tried so far to understand the issue?
From your basic question I assumed that you ran some tests on some
well-defined objects.
To assess the state of squid you would need some static objects and some
changing objects.
You would also need to test using a simple UA like a
script/wget/curl and using a full-fledged UA such as
chrome/firefox/explorer/opera/safari.
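
For the "simple UA" side, a minimal Python sketch of such a test script
could look like the one below. It assumes a squid listening on
127.0.0.1:3128 (adjust to your setup) and relies on the X-Cache response
header that squid normally adds; the function names are only
illustrative.

```python
import urllib.request

# Assumption: a local squid on its default port; change as needed.
PROXY = "http://127.0.0.1:3128"


def cache_status(headers):
    """Return 'HIT', 'MISS' or 'UNKNOWN' based on the X-Cache header.

    squid usually reports something like 'HIT from <hostname>'.
    """
    words = (headers.get("X-Cache") or "").split()
    first = words[0].upper() if words else ""
    return first if first in ("HIT", "MISS") else "UNKNOWN"


def fetch_via_proxy(url, proxy=PROXY):
    """Fetch url through the proxy and return the response headers."""
    opener = urllib.request.build_opener(
        urllib.request.ProxyHandler({"http": proxy, "https": proxy})
    )
    with opener.open(url, timeout=10) as resp:
        resp.read()  # drain the body so the proxy sees a complete transfer
        return resp.headers


def run_twice(url):
    """Request the same URL twice; the second request should be a HIT
    if the object is cachable and was stored by the first one."""
    for attempt in (1, 2):
        headers = fetch_via_proxy(url)
        print(attempt, cache_status(headers), headers.get("Cache-Control"))
```

Calling run_twice("http://ngtech.co.il/") against a working proxy would
be the static-object case; a browser test is still needed since browsers
send their own Cache-Control request headers.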

I can try to give you a couple of cache-friendly sites which will show
you the basic status of squid.
In the past someone here asked how to cache the site: http://djmaza.info/
which by default is built using lots of static content such as html
pages, pictures and good css, and also implements cache headers nicely.
I still do not know why wordpress main pages do not have valid cache
headers, and what I mean by that is: is it not possible to have a page
cached for 60 seconds? How many updates can occur in 60 seconds? Would
a "must-revalidate" cost that much compared to what there is now? (Maybe
it does.)
Another one would be https://www.yahoo.com/ which is mostly fine with
caching and defines its main page with "Cache-Control: no-store,
no-cache, private, max-age=0" since it's a news page which updates too
often.
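
To make it clear what such a header tells a shared cache, here is a
small Python sketch that splits a Cache-Control value into directives
and checks whether a shared cache may store the response at all. It is a
simplification (the function names are mine, and it does not handle
quoted arguments that contain commas), not what squid does internally.

```python
def parse_cache_control(value):
    """Split a Cache-Control header value into {directive: argument}.

    Directives without an argument map to None.
    """
    directives = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        name, _, arg = part.partition("=")
        directives[name.strip().lower()] = arg.strip().strip('"') or None
    return directives


def shared_cache_may_store(directives):
    """A shared cache must not store 'no-store' or 'private' responses."""
    return "no-store" not in directives and "private" not in directives


yahoo = parse_cache_control("no-store, no-cache, private, max-age=0")
print(yahoo)
print(shared_cache_may_store(yahoo))
```

For the yahoo header both "no-store" and "private" are present, so a
proxy that follows the headers has nothing to keep.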

I was looking for cache testing subjects and I am still missing a couple
of examples/options.

Until now what I have mainly used for basic tests was redbot.org and a
local copy of it for testing purposes.
And while writing to you and looking for test subjects I noticed that my
own web server has a question for squid.
I have an example case which shows nicely how to "break" the squid
cache.
So I have my website and the main page:
http://ngtech.co.il/
https://ngtech.co.il/ (self signed certificate)

that would be cached properly by squid!
If for any reason I removed the Last-Modified header from the page
it would become uncachable in squid (3.5.10) with default settings.
Accidentally I turned ON the apache option to treat html as php using 
the apache configuration line:
AddType application/x-httpd-php .html .htm

This is what the first google result for "html as php" recommends.
Once you try the next page (with this setting on):
http://ngtech.co.il/squidblocker/
https://ngtech.co.il/squidblocker/ (self signed certificate)

You will see that it is not being cached at all.
Redbot.org claims that the cache is allowed to assign its own freshness
to the object, but squid (3.5.10) will not cache it whatsoever, no
matter what I do.
When I remove the "html as php" tweak the page responds with a
Last-Modified header and can be cached again.
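
One possible way to look at it, as a rough Python sketch: the generic
HTTP freshness model (RFC 7234) uses max-age or Expires when present and
otherwise falls back to a heuristic based on Last-Modified. Without any
of these there is nothing to derive a lifetime from. The 10 percent
factor is the value the RFC suggests as typical; squid takes its own
decision from its refresh_pattern rules, which this sketch does not
reproduce, and s-maxage and malformed dates are ignored here.

```python
from email.utils import parsedate_to_datetime


def _seconds_between(later, earlier):
    """Whole seconds between two HTTP-date strings."""
    delta = parsedate_to_datetime(later) - parsedate_to_datetime(earlier)
    return int(delta.total_seconds())


def freshness_lifetime(headers, heuristic_percent=10):
    """Return the freshness lifetime in seconds for a response.

    headers is a plain dict of response headers. The result is 0 when
    neither explicit expiry information nor Last-Modified is available.
    """
    # 1. Explicit max-age wins.
    for part in headers.get("Cache-Control", "").split(","):
        name, _, arg = part.strip().partition("=")
        if name.lower() == "max-age" and arg.isdigit():
            return int(arg)
    date = headers.get("Date")
    # 2. Expires relative to Date.
    if date and headers.get("Expires"):
        return max(0, _seconds_between(headers["Expires"], date))
    # 3. Heuristic: a fraction of the time since the last modification.
    if date and headers.get("Last-Modified"):
        age = _seconds_between(date, headers["Last-Modified"])
        return max(0, age * heuristic_percent // 100)
    # 4. Nothing to work with.
    return 0


static_page = {
    "Date": "Mon, 26 Oct 2015 10:00:00 GMT",
    "Last-Modified": "Sun, 25 Oct 2015 10:00:00 GMT",
}
php_page = {"Date": "Mon, 26 Oct 2015 10:00:00 GMT"}
print(freshness_lifetime(static_page), freshness_lifetime(php_page))
```

In this model the page served as static html gets a heuristic lifetime,
while the same page served through php, with no Last-Modified and no
explicit expiry, gets none.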

I am unsure who is the culprit for the issue but I will ask about it in
a separate thread *if* I get no response here. (Sorry for partially
top-posting.)

Eliezer


On 25/10/2015 21:29, Yuri Voinov wrote:
> In a nutshell - I do not need possible explanations. I want to know -
> is it a bug or is it by design?
>
> 26.10.15 1:17, Eliezer Croitoru wrote:
>> Hey Yuri,
>>
>> I am not sure if you think that Squid version 4 with an extremely low
>> hit ratio is bad or not but I can understand your view of things.
>> Usually I am redirecting to this page:
>> http://wiki.squid-cache.org/Features/StoreID/CollisionRisks#Several_real_world_examples
>>
>> But this time I can proudly say that the squid project is doing things
>> the right way, even though it might not be understood by some.
>> Before you or anyone declares that there is a low hit ratio due to
>> something that is missing I will try to put some sense into how things
>> look in the real world.
>> A small thing from a nice day of mine:
>> I was sitting and talking with a friend of mine, an MD to be exact, and
>> while we were talking I was comforting him about the wonders of
>> computers.
>> He was complaining about how slowly the software in the office runs and
>> how he needs to wait for the software to respond with results. So I
>> hesitated a bit but then I asked him "What would happen if some MD
>> here in the office received the wrong content/results on a patient
>> from the software?" Terrified by the question, he answered 'He
>> can make the wrong decision!' and then I described to him how he is in
>> such a good place when he doesn't need to fear such scenarios.
>> In this same office Squid is being used for many things and it's
>> crucial that besides the option to cache content, the ability to
>> validate the cache properly is set up right.
>>
>> I do understand that there is a need for caches and sometimes it is
>> crucial in order to give the application more CPU cycles or more RAM,
>> but sometimes the hunger for cache can override the actual requirement
>> for content integrity, and the content must be re-validated from time
>> to time.
>>
>> I have seen a couple of times how a cache in a DB or at other levels
>> produces a very bad and unwanted result, and I do understand some of
>> the complexity and caution that programmers take when building all
>> sorts of systems with a cache in them.
>>
>> If you do want to understand more about the subject pick your favorite
>> scripting language and just try to implement simple object caching.
>> You would then see how complex the task can be and you may then
>> understand why caches are not such a simple thing and especially why
>> ignore-no-cache should not be used in any environment if it can be
>> avoided.
>>
>> While I do advise you not to use it I would point you and others to
>> another approach to the subject.
>> If you are greedy and hungry for cache for specific sites/traffic and
>> you would like to benefit from over-caching there is a solution for
>> that!
>> - You can alter/hack squid code to meet your needs
>> - You can write an ICAP service that will alter the
>>   response headers so squid would think it is cachable by default.
>> - You can write an eCAP module that will alter the response
>>   headers ...
>> - Write your own cache service with your algorithms in it.
>>
>> Take into account that the squid project tries to be as fault tolerant
>> as possible due to it being a very sensitive piece of software in very
>> big production systems.
>> Squid doesn't try to meet the requirement of "Maximum Cache" and it is
>> not squid, as a caching proxy, that reduces any cache percentage!
>> The reason that the content is not cachable is all these
>> applications that describe their content as not cachable!
>> For a second of sanity from the squid project, try to contact
>> google/youtube admins/support/operators/forces/what-ever to understand
>> how you would be able to benefit from a local cache.
>> If and when you do manage to contact them let them know I was looking
>> for a contact and I never managed to find one of these available to me
>> by phone or email. You cannot say anything like that about the squid
>> project: the squid project can be contacted by email and if
>> required you can get hold of the man behind the software (while he is
>> a human).
>>
>> And I will try to write it in a geeky way:
>> deny_info 302:https://support.google.com/youtube/ big_system_that_doesnt_want_to_be_cached
>>
>> Eliezer
>>
>> * P.S. If you do want to write an ICAP service or an eCAP module to
>> replace "ignore-no-cache" I can give you some code that might
>> help you as a starter.
>>
>>
>> On 25/10/2015 17:17, Yuri Voinov wrote:
>>>
>> Hi gents,
>>
>> Has anyone testing Squid 4 noticed an extremely low cache hit ratio
>> with the new version? Particularly for HTTPS sites that send the
>> "no cache" directive? After replacing Squid 3.4 with Squid 4 the
>> cache hit ratio collapsed from 85 percent or more to the level of 5-15
>> percent. I believe this is due to the removal of support for
>> ignore-no-cache, which eliminates the possibility of aggressive caching
>> and reduces the value of a caching proxy to almost zero.
>>
>> Plain HTTP is cached normally. However, due to the widespread move to
>> HTTPS, caching has dropped dramatically to unacceptable levels.
>>
>> Has anyone else noticed this effect? And what is the state of caching
>> now?
>>
>>>

