[squid-users] Squid not caching some files

Amos Jeffries squid3 at treenet.co.nz
Thu Jul 28 10:30:03 UTC 2016


On 28/07/2016 1:33 p.m., John Pearson wrote:
> Hi,
> 
> main problem: different squid configurations are not caching certain files.
> 
> These are my conf files `1_squid.conf` and `2_squid.conf` both can be found
> here:
> 
> https://gist.github.com/ironpillow/e6b86354f4ac3941f74db86d893008f1
> 
> I am using http://www.thinkbroadband.com/download/ to download the 5MB zip
> file but it's always a tcp_miss UNLESS I uncomment (use) lines 57 and 58 in
> 1_squid.conf. dmg files are being cached.
> 
> But when using 2_squid.conf, the above zip file is cached (tcp_hit) but dmg
> files (https://support.apple.com/kb/dl1870?locale=en_US) are not being
> cached.
> 
> Any advice?

Quite a lot.

Firstly, the two configs are designed quite differently in what they do
when caching. Some of the details below about config #1 should explain
why config #2 behaves differently; the rest of the changes apply to
both configs.

Specifics:

1) there are no such things as "files" in HTTP. "file" is a disk storage
concept. Network transfer protocols are about resources and where they
are located (URL). Any relationship between URL and a filename is a
coincidence of that domain's designer having made it so, and certainly
not reliable in the general case. That affects the (3) behaviour below.

2) in HTTP the relationship between "site" and URL is tenuous at best.
Just because one URL is displayed as being where to fetch an object does
not mean that's where the object resides. Redirects can happen after the
initial fetch, and your Store-ID helper will also be having effects on
what URL the refresh_pattern sees as representing the object.

3) the regex patterns you have for URLs *ending* with specific 4-letter
sequences between lines 54-70:
 a) are specifically bound to individual domain names (that's good
because of #1 above), and
 b) do not include the domains you mention having trouble with (which
explains why they do not do what you expect to those domains).

4) due to the way you have configured the "cache" directives, only
responses for domain names listed in /etc/squid/updatesites.txt will
ever be stored by Squid. This affects the behaviours created by (2) and
(3) - refresh_pattern is only relevant for stored content.

5) Squid *will not* store responses for intercepted traffic unless it
can verify the server being contacted is actually the authoritative
origin server for that URL domain.
 * The DNS servers behind "8.8.8.8" are explicitly configured to rotate
the IP addresses on every single lookup, which makes it almost
guaranteed that Squid and the client being intercepted will be seeing
different sets of origin servers when they look up the domain.

6) configuring "dns_defnames" to pass *single label* domain names out to
the global 8.8.8.8 service is plain wrong. Remove that line.

7) "logformat squid" - do not redefine Squid's built-in log formats. It
will *not* record the values you think it records.

8) remove the comment from line 84 of 1_squid.conf. That line defines
the proper way to deal with URLs when they have query strings.

9) remove the "regex_pattern -i cgi-bin" lines at 86-87. It's an old and
wrong config setting.

10) you can remove the "always_direct allow all" line. It is about
whether to use cache_peers, and is pointless in your configuration,
which doesn't use any peers.
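
To make points (1) and (3) concrete, here is a minimal Python sketch of how
a pattern bound to one domain and to a URL *ending* behaves. It is only an
illustration of the matching logic, not Squid itself: the pattern and the
example.com domain are hypothetical (not copied from your config), the
thinkbroadband download URL is an assumption about where the 5MB zip is
actually served from, and Squid uses its own regex library rather than
Python's `re`.

```python
import re

# Hypothetical pattern in the style described in (3): bound to one domain
# and to URLs that *end* in a given extension. Not taken from the real config.
pattern = re.compile(
    r"^https?://([a-z0-9.-]+\.)?example\.com/.*\.(zip|dmg)$",
    re.IGNORECASE,
)

urls = [
    # Domain covered by the pattern, URL ends in .zip.
    "http://updates.example.com/pkg/tool.zip",
    # Assumed real download URL; the domain is not covered by the pattern.
    "http://download.thinkbroadband.com/5MB.zip",
    # URL from the original post; it has a query string and no extension.
    "https://support.apple.com/kb/dl1870?locale=en_US",
]

# Map each URL to whether the pattern matches it.
results = {url: bool(pattern.search(url)) for url in urls}

for url, matched in results.items():
    print(f"{'match' if matched else 'no match':8}  {url}")
```

Only the first URL matches. The second fails on the domain, and the third
would fail for any pattern that expects the URL to end in ".dmg", whatever
domain it is bound to, because the URL ends in a query string.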

Amos


