[squid-users] css not loading

Amos Jeffries squid3 at treenet.co.nz
Fri Sep 7 14:54:02 UTC 2018


On 8/09/18 12:47 AM, Alex Gutiérrez Martínez wrote:
> Hello community, 3 days ago my bosses asked me to optimize access to
> several pages prioritized in the company where i work, luckily for me
> these pages are accessible via http. The problem is that after i put the
> rules to cache the pages do not load the CSS and look in pure html.
> After reversing the changes and doing a "squid -z" the problem persists.

squid -z only creates and repairs the *structure* of the cache
directories. It does not affect the content stored there.

If (and only if) the content was forced to stay in cache past its normal
expiry time by your rules, then reverting the config would normally
cause it to be dropped from cache on next use.

You can expedite that (for ufs, aufs, and diskd caches only) by finding
the cache file containing the object and deleting it, then restarting
Squid to clear the in-memory copies.

NP: all of the above assumes there actually is an object in cache
causing the problem. There are a number of other things which can lead
to identical "missing CSS" symptoms.


> Here i leave the configuration that i used to keep the pages in cache.
> 

What rules exactly are the new ones you added?

There are several problems and mistakes in the config below. I have
pointed them out, but it is not clear which rules you already had
working (or *appearing* to work) and which ones you added that "broke"
things - so a direct answer to your question is not possible with only
this config to work from.


> 
> #########################################################################
> #Cache #
> #########################################################################
> delay_initial_bucket_level 75
> maximum_object_size 32 MB
> #cache_dir aufs /var/cache/squid 10240 16 256
> cache_dir aufs /opt/squid 1024 16 256

> cache_mem 256 MB

This is the default cache_mem value for Squid-3. No need to configure it.

> cache_store_log /opt/squid/cache_store.log

Only useful if you are debugging storage problems (i.e. right now,
perhaps - if you can interpret what is being logged). In regular proxy
usage it is usually just a waste of I/O. Keep that in mind for later
optimization.

> coredump_dir /opt/squid/dump
> minimum_expiry_time 550 seconds

If you check the docs for this directive, notice the text talks about
situations where things are improved by making this timeout *shorter*
than the default.

Making it _long_ increases the chances that things will *not* be cached.
It is used in the event that server revalidation failed (some upstream
error occurred). Only objects in cache whose TTL is _over_ this *minimum*
value will be honoured (i.e. treated as still valid). Objects with
shorter TTLs will be treated as stale (already expired) - and probably
discarded.

So you want this to be relatively short, but not 0, for a regular proxy.
IIRC the default is 60s, to allow quick detection of server errors when
they provide objects with very short (or zero) lifetimes, while allowing
other servers slightly slower auto-recovery from failures.
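As an illustration (60 seconds is the stock default, not something
tuned for your setup), the directive would normally just look like:

```
# keep this close to the default; a *longer* value makes more
# objects be treated as already-stale after a failed revalidation
minimum_expiry_time 60 seconds
```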


> ############################
> #uso cache
> ###########################
> client_db off
> offline_mode off

NP: offline_mode is not what its often thought to be. Since Squid-3.2
the HTTP/1.1 proxies pretty much do by default what this directive
caused the older HTTP/1.0-only Squid versions to do. What is left are
some very narrow and specific use-cases (eg migrating cache between two
proxies without downtime, or an origin server upgrade scenario).

OFF is also its default value, so best to just completely remove the
above directive from your config.


> cache_swap_low 93
> cache_swap_high 97
> cache_replacement_policy heap LUDFA
> memory_replacement_policy heap GDSF
> maximum_object_size_in_memory 512 KB
> half_closed_clients off
> ###############################################################################
> # establecemos los archivos de volcado en /var/cache/squid/
> coredump_dir /opt/squid/
> ###############################################################################
> #Establecemos los patrones de refrescamiento de la cache #
> #patron de refrescamiento -- tipo de archivo -- tiempo del objeto -- %de
> refrescamiento -- tiempo #
> #1440 minutos equivalen a 24 horas #
> ################################
> #Refrescamiento de la cache
> ################################
> refresh_pattern ^ftp: 1440 20% 4320
> refresh_pattern ^gopher: 1440 0% 4320
> refresh_pattern -i \.(gif|png|jpg|jpeg|ico)$ 1440 90% 4320 override-expire ignore-no-store ignore-private
> refresh_pattern -i \.(iso|avi|wav|mp3|mp4|mpeg|swf|flv|x-flv)$ 1440 90% 43200 override-expire ignore-no-store ignore-private
> refresh_pattern -i \.(deb|exe|zip|tar|tgz|ram|rar|bin|ppt|doc|tiff|pdf|xls|docx|xlsx|pptx)$ 1440 90% 4320 override-expire ignore-no-store ignore-private
> refresh_pattern -i \.index.(html|htm)$ 1440 10% 4320
> refresh_pattern -i \.(html|htm|css|js)$ 1440 10% 4320
> ################################
> #prioritized in the company

NP: by placing these lines underneath the above file type lines any
requests which match the above rules are *not* affected in any way by
the below.
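For illustration only (the domain names are placeholders copied from
your config): Squid stops at the first refresh_pattern that matches, so
the site-specific lines need to come *before* the generic file-type
lines:

```
# company sites first - first match wins
refresh_pattern \.web1\.org 1440 99% 14400
# generic file-type rules only after the site-specific ones
refresh_pattern -i \.(gif|png|jpg|jpeg|ico)$ 1440 90% 4320
```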


> ################################
> refresh_pattern \.web1\.org\/? 1440 99% 14400 override-expire override-lastmod ignore-reload ignore-private
> refresh_pattern \.web2\.org\\/? 1440 99% 14400 override-expire override-lastmod ignore-reload ignore-private
> refresh_pattern \.web3\.org\\/? 1440 99% 14400 override-expire override-lastmod ignore-reload ignore-private
> 

The '\\/?' in those regexes means the path section of these domains'
URLs must always start with a '\' character - which is not a valid
first character for URL paths. I think you probably wanted '/?' instead,
just like the web1.org pattern has.

.. BUT, since the '/?' makes the '/' optional, there is no point in
having it in the pattern at all - anything which is not a '/' is already
allowed to follow the 'g' in '.org' when the optional part is missing.

So ... all the above lines can be reduced to just one :
 refresh_pattern \.web[123]\.org 1440 99% 14400 ...

If you want that to match only at the end of the domain name, keep the
'/' but drop the '?', since Squid always matches the regex against the
effective-URL, where at least one path '/' exists.


Also, since your filename-extension patterns all end with just the "$"
anchor, they cannot match any URLs where a query-string follows the
filename.
 Things like "http://example.com/file.css?hello" will not have the .css
rules applied to them. If your website uses a CMS like Wordpress or
similar, it adds a session ID as a query-string to all page
sub-resources. That may be part of your problem.

To avoid that problem use patterns like:
  \.(css|js|html)(\?.*)?$
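Applied to your existing file-type line, that would look something like
this (illustrative only; the times and percentage are copied straight
from your config):

```
# (\?.*)?$ allows an optional query-string after the extension
refresh_pattern -i \.(html|htm|css|js)(\?.*)?$ 1440 10% 4320
```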



I don't see the default "(/cgi-bin/|\?)" and "." patterns. Having those
at the end of your refresh_pattern rules can avoid a lot of weird
caching problems, especially with dynamic content coming from broken
servers.
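For reference, these are the stock rules shipped in the default
squid.conf; they belong at the very end of your refresh_pattern list:

```
refresh_pattern ^ftp:             1440  20%  10080
refresh_pattern ^gopher:          1440  0%   1440
refresh_pattern -i (/cgi-bin/|\?) 0     0%   0
refresh_pattern .                 0     20%  4320
```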



Since you say these are company websites you probably have a better
contact with the developers managing them than anyone here. The absolute
best thing to do is encourage them to design the site with caching in
mind and send HTTP headers that optimize storage. That way your company
can leverage cost savings from HTTP caches all around the world for free
on your public traffic to those sites.
 If you just optimize your own proxy to store a badly designed website,
you give people internally a false sense of the site "working", while
external visitors are actually seeing bad things happen and having a
terrible experience visiting your site(s).



Some things about these ignore-* and override-* options which many
people are not aware of:


1) ignore-private - does not make objects with CC:private headers become
HITs. It does allow those objects to be cached "privately", but Squid
revalidates them on every use to check that they are valid for the next
user wanting it.

While this might save (some) *bandwidth* expenses, it is definitely not
a speed boost. It trades that unlikely gain against storage space needed
by other cacheable content that might deliver better bandwidth savings
and faster speeds.


2) ignore-reload - not as useful as rumour has it. All it does is stop
client software (or users) from being able to tell the proxy that the
content in cache is broken/outdated and needs a reload.

I expect that while looking into this problem, one of the first things
you did on seeing the broken CSS was hit the browser refresh/reload
button? Well, yeah.


3) override-lastmod - this one is a bit odd. It pretty much does the
exact *opposite* of causing things to stay in cache.

The header tells the proxy cache how much _older_ the object was at the
time it was received by the cache than the Date header indicates. In
heuristics LM-factor consideration the older an object was at time of
arrival the longer it is likely to still be usable as a HIT.

So by ignoring the header, Squid actually thinks the object is younger
and more likely to need replacing (MISS / REFRESH) than it normally
would be.

NP: if the Last-Modified being presented by your company server(s) is
*SO* bad that it has to be overridden to get a good experience, think
about how that appears to other, external visitors.



Sorry this is so long. Cache optimization is not a simple topic.

HTH
Amos

