[squid-dev] What is your opinion: Debugging and cache enhancement tool, what is the preferred implementation?

Thu Nov 19 20:20:08 UTC 2015

On 19/11/2015 18:35, Alex Rousskov wrote:
> ICAP and eCAP services do not get enough information to triage most
> caching problems. IMO, to be generally useful, this should be done in Squid.

Well I do understand that they do not get enough information on the 
decision itself... and I do believe that it is better to implement it 
inside squid if possible.

I have already written code, so I cannot just take it all back, but I 
will try to explain more about the relevance of triaging low caching.
(I had planned to write a wiki article about a couple of things, but it 
will take me more time to do that.)

A couple of things led me to think about the issue.
I have seen something in squid which I am unsure about.
I have tried a couple of times to understand the logic behind things 
from the client side and the server side.
There are two separate request headers in HTTP that matter here:
If-Modified-Since and
Cache-Control

When I send a request with only If-Modified-Since, squid just sends me 
the cached result, since the object has not expired yet and is still 
considered "non-stale". I also get the Age header, which is fine for a 
response served from cache.

But Firefox, for example, sends both If-Modified-Since and 
Cache-Control: max-age=0, which for some reason unknown to me forces 
squid to re-fetch the object in specific cases (maybe an RFC compliance 
thing).
So Firefox is nice: it obeys its user's order to refresh and demands a 
non-cached object from the origin.
But as a reverse proxy I am also obeying the request, so I forward a 
combined request with IMS and max-age=0.
In my specific case the file is treated as an application (php/html) and 
therefore the service does not support IMS requests, even though there 
is not a single line of PHP in the file.
Apache disables the default IMS feature for all of the html files in a 
whole tree of directories.
The issue is that squid's refresh_pattern gives me two options to try: 
ignore-reload or reload-into-ims.
ignore-reload will always return the originally cached object, while 
reload-into-ims will send an IMS to the origin service, which in this 
specific case answers with the full object.
So stopping for a sec and thinking: as a reverse proxy I do not have any 
option in squid that allows me to look at this weird request and decide 
that the client has a split personality issue (from one side of the 
picture).

This led me to the real issue: there is probably a reason why cache 
HITs have dropped over the releases.
From the logging point of view we have "refresh", but on the 
refresh_pattern side we treat the request as a "reload".

By default squid logs this request as "refresh", but it also carries an 
IMS, which means that it is not a 100% "refresh" or "reload" request but 
also an IMS request.

And back to the reverse proxy situation, there is an issue!
Even if I force reload-into-ims, the IMS is sent to the origin server 
and not answered from the cache.
This got a bit weird when I saw that there is a config directive, 
"refresh_all_ims", and it is off by default.
This left me confused, since we have an option to force an IMS in the 
case of a "refresh", yet browsers send different things. Using F5 refresh:
Internet Explorer uses: pragma = no-cache
Firefox uses: both IMS + max-age=0
Chrome uses: both IMS + max-age=0
When using Shift F5:
Internet Explorer(ctrl F5) uses: pragma = no-cache
Firefox uses: cache-control = no-cache + IMS
Chrome uses: cache-control = no-cache + IMS
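
The header combinations listed above can be told apart mechanically. 
The following is a minimal Python sketch, not squid code: the function 
name classify_refresh and the category labels are made up for 
illustration, and the parsing only covers the exact header forms 
mentioned in this mail.

```python
def classify_refresh(headers):
    """Classify a request by the reload signals it carries.

    `headers` maps header names to values; names are matched
    case-insensitively. The returned labels are hypothetical and
    exist only for this sketch.
    """
    h = {k.lower(): v.lower() for k, v in headers.items()}
    directives = [d.strip() for d in h.get("cache-control", "").split(",")
                  if d.strip()]
    has_ims = "if-modified-since" in h
    no_cache = "no-cache" in directives or "no-cache" in h.get("pragma", "")
    max_age_0 = "max-age=0" in directives

    if no_cache:
        # Shift/Ctrl F5 style: the client refuses a cached copy
        return "reload+ims" if has_ims else "reload"
    if max_age_0:
        # F5 style in Firefox/Chrome: the "split personality" request
        return "revalidate+ims" if has_ims else "revalidate"
    return "ims-only" if has_ims else "plain"
```

With this, the Firefox/Chrome F5 request (IMS + max-age=0) comes out as 
"revalidate+ims", while an Internet Explorer F5 with only 
"Pragma: no-cache" comes out as "reload".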

And the truth is that I am really OK in general with the current state 
of squid, since it is about 100% reliable these days.
But there is still one case which I do not like, and this is the reverse 
proxy situation.
Thinking about a solution to the issue led me to one of two 
ICAP services:
- strip the "Cache-Control: max-age=0" from the request
- answer any split personality request that carries an IMS with a 
304 Not Modified based on something, maybe even holding a list of 
Last-Modified/Expires values in RAM.
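
The core logic of the two ideas can be sketched in a few lines of 
Python. This is only an illustration of the decisions involved, not an 
ICAP service: the function names, the dict-based header representation 
and the in-RAM table last_modified_by_url are all assumptions made for 
this sketch.

```python
from email.utils import parsedate_to_datetime


def strip_max_age_zero(headers):
    """Idea 1: return a copy of the request headers without max-age=0."""
    out = {}
    for name, value in headers.items():
        if name.lower() != "cache-control":
            out[name] = value
            continue
        kept = [d.strip() for d in value.split(",")
                if d.strip() and d.strip().lower() != "max-age=0"]
        if kept:  # drop the header entirely if nothing else remains
            out[name] = ", ".join(kept)
    return out


def can_answer_304(url, headers, last_modified_by_url):
    """Idea 2: True if the in-RAM table says the client copy is current."""
    ims = next((v for k, v in headers.items()
                if k.lower() == "if-modified-since"), None)
    known = last_modified_by_url.get(url)
    if ims is None or known is None:
        return False
    try:
        return parsedate_to_datetime(known) <= parsedate_to_datetime(ims)
    except (TypeError, ValueError):
        return False  # unparsable date: let the request go to the origin
```

The second function deliberately answers False whenever it is unsure, 
so that a missing table entry or a malformed date falls back to the 
normal path to the origin.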

I have spoken with Amos about a couple of things and I am really unsure 
what would be the right thing to do.
On the one hand I feel much better since ignore-no-cache was removed, 
and more than that, I must admit that the Internet got better this way!!
I know of a couple of ISPs that were using squid with it, and their 
clients (including me) had a very hard time while it was present.

As a default behavior I think most would want to be the modest and 
honest citizen while some would be greedy for caching.

I think that this is the result of a tiny "triage" which I have carried 
out over the last year and more.

I would like to know what you and others think about this issue, i.e. 
leaving the RFC aside for a sec, did squid and other caching solutions 
cause this situation, in which max-age=0 is sent instead of a plain 
If-Modified-Since on each F5 key press?
Also, specifically, what should reverse proxies do, and what do you 
think about the two ideas I had in mind?

Thanks,
Eliezer

* Later I will try to file a bug for the wanted feature, since I am 
pretty sure I know which are the main suspects in the low caching issue.