[squid-users] Squid config is failing to cache data
Amos Jeffries
squid3 at treenet.co.nz
Wed Jan 13 18:19:33 UTC 2016
On 14/01/2016 6:16 a.m., Hardik Dangar wrote:
> Hi all,
>
> I handle small network and we have 40 systems ( most having Ubuntu 14.04
> and couple of system have windows ). We use squid to cache. Due to the
> country where i live there is huge data charges so i am using squid to
> cache things like Ubuntu updates and certain applications.
>
> Issue i have is, My squid configuration is either failing to cache Ubuntu
> updates mostly Debian packages. I see following status codes frequently in
> my squid log file.
> TCP_REFRESH_UNMODIFIED/304
> TCP_REFRESH_UNMODIFIED/200
> TCP_REFRESH_MODIFIED/200
>
> which confuses me. as i know only few options like
> TCP_HIT/TCP_MEM_HIT/TCP_MISS.
Please start by reading this
<http://wiki.squid-cache.org/SquidFaq/SquidLogs#Squid_result_codes>
> and searching about them explains that i
> might have my squid_patterns wrong or data might be changing but the
> problem is i have setup schedule so two system update on tuesday and then
> on wednesday all system updates. yet i see lots of data with status
> TCP_REFRESH_UNMODIFIED/200 or TCP_REFRESH_UNMODIFIED/304. Most clients
> get about 40% to 50% cache. I could totally understand the updates are
> there but i get stumped when TCP_REFRESH_UNMODIFIED/200 happens.
Welcome to HTTP/1.1. Those are all HTTP/1.1 revalidation requests
updating the cached content before delivery to the client, while saving
bandwidth in ways that HIT and MISS cannot.
Squid is a cache, not an archive. It self-updates the cache content as
needed.
- UNMODIFIED means the copy Squid already has cached has not changed.
No payload object is fetched from the server.
- MODIFIED means the object Squid has cached is outdated. A
replacement object is delivered by the server.
- 304 means the client's copy has not changed, so no payload is
delivered from Squid to the client.
- 200 means the client's copy is outdated or absent. A replacement
object is delivered by Squid.
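As a rough illustration of the list above, the two halves of a log entry can be read independently: the tag describes the Squid-to-server side, the status describes the Squid-to-client side. This is only a sketch in Python; the dictionary names and the `explain` helper are made up for this example, and the explanation strings paraphrase this message rather than anything Squid prints.

```python
# Explanations paraphrase the reply above; they are not text produced by Squid.
SQUID_TO_SERVER = {
    "TCP_REFRESH_UNMODIFIED": "Squid's cached copy was still current; no payload fetched from the server",
    "TCP_REFRESH_MODIFIED": "Squid's cached copy was outdated; the server delivered a replacement",
}
SQUID_TO_CLIENT = {
    "304": "client's copy was still current; no payload sent to the client",
    "200": "client needed the object; Squid sent the full payload",
}


def explain(entry):
    """Split a 'TAG/STATUS' log field and describe each half separately."""
    tag, _, status = entry.partition("/")
    server_side = SQUID_TO_SERVER.get(tag, "not a revalidation tag covered here")
    client_side = SQUID_TO_CLIENT.get(status, "status not covered here")
    return server_side, client_side


if __name__ == "__main__":
    for entry in ("TCP_REFRESH_UNMODIFIED/304",
                  "TCP_REFRESH_UNMODIFIED/200",
                  "TCP_REFRESH_MODIFIED/200"):
        server_side, client_side = explain(entry)
        print(entry)
        print("  server side:", server_side)
        print("  client side:", client_side)
```

Read this way, TCP_REFRESH_UNMODIFIED/200 is not a contradiction: nothing large came from the server, but the client still received a full object from the cache.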
>
> I also noticed TCP_REFRESH_UNMODIFIED/200 happens for google chrome debian
> package reguarly even though same file is downloaded previous day by some
> clients. i see entries of TCP_REFRESH_UNMODIFIED/200 or TCP_MISS/200 or
> TCP_REFRESH_MODIFIED/200. i have the entry in my configuration file for
> deb.google url like "refresh_pattern dl.google.com/.*\.(deb) 129600 100% 129600
> reload-into-ims ignore-reload override-expire override-lastmod ignore-no-store
> ignore-private ignore-must-revalidate".
Blindly turning off performance and bandwidth saving mechanisms simply
because you cannot understand the log is not a great idea.
FYI: almost all tutorials you will find online are from people working
with old HTTP/1.0 Squid versions or only understanding HTTP/1.0
behaviour (like you and your HIT/MISS focus).
* reload-into-ims is fine. It changes the forced MISS that Chrome is
trying to make happen into these nicer refresh/revalidation requests,
which save some bandwidth but still deliver the full-sized 200 status
reply Chrome demands.
* ignore-reload cancels the effect of the above (if it works).
* override-expire is also fine. It only forces content to stay in cache
longer than the Expires/max-age header says it should. Old content cached
by this will lead to MISS/200 becoming MODIFIED/200 in your logs. If you
are lucky it might become UNMODIFIED.
The other options do more harm than good.
You heard about how Steam recently had a big issue about showing gamers
each other's account details? That was a cache somewhere in their system
doing its equivalent of "ignore-no-store ignore-private
ignore-must-revalidate".
>
> Can any one help me with this issue? is this normal? Or there is an issue
> in my squid config? I have attached my squid config and some sample log
> which confuses me.
So far as I can tell from your description, what is happening is both
normal and Good. So don't panic.
>
> My squid version is : 3.3.8 for detail options and squid config file i have
> pasted the content of both at, ( Operating system is Ubuntu 14.04.3 LTS
> "trusty" )
> http://pastebin.com/raw/mEjZ24KT
It is fine to just add "refresh-into-ims" to the end of the
"refresh_pattern ." line. Then you can remove all those special
PackagesSources/Release/Translations patterns.
The udeb$ pattern is not doing anything because the deb$ pattern already
matches all those URLs. So you can remove the "refresh_pattern udeb$" line.
I suspect the "dl.google.com/.*\.(deb)" line is not doing anything for
the same reason. But it does not require the end of the URL to be "deb",
so it may be randomly matching URLs with query-strings.
I'm not sure what will change if you remove the deb$ pattern as well. I
have found that the HTTP headers repository servers produce vary by
mirror software, which affects whether Squid's default patterns cache
them nicely or MISS a lot.
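To see why the later patterns never get a chance, here is a small sketch of first-match-wins ordering, which is how Squid picks a refresh_pattern line. It assumes "deb$" is listed before the other two patterns (as the comments above imply), uses Python's `re.search` as a stand-in for Squid's regex matching, and the URLs are made up for illustration.

```python
import re

# Assumed rule order: "deb$" first, then the two patterns discussed above.
# The real order lives in the poster's squid.conf, which is not reproduced here.
PATTERNS = [
    r"deb$",
    r"udeb$",
    r"dl.google.com/.*\.(deb)",
]


def first_match(url, patterns=PATTERNS):
    """Return the first pattern that matches anywhere in the URL, or None."""
    for pattern in patterns:
        if re.search(pattern, url):
            return pattern
    return None


if __name__ == "__main__":
    # Hypothetical URLs, chosen only to exercise each pattern.
    for url in (
        "http://archive.example/pool/main/f/foo/foo_1.0_amd64.udeb",
        "http://dl.google.com/linux/direct/chrome_current_amd64.deb",
        "http://dl.google.com/linux/direct/chrome_current_amd64.deb?x=1",
    ):
        print(url, "->", first_match(url))
```

Under these assumptions the first two URLs are claimed by "deb$", and only the query-string URL falls through to the dl.google.com pattern.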
Up to you, the rule as-is should be harmless at worst.
>
>
> My squid access.log file sample is available at,
> http://pastebin.com/raw/A6kyksY8
If you look closely, all the UNMODIFIED/304 are very tiny 0.33 KB
transfers, even though they are equivalent to a full HIT. Those Packages
files can be megabytes in size.
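If you want to check this against your own access.log, a short script can total the bytes delivered per result code. This is a sketch only: it assumes Squid's default native log format (result code in the fourth field, bytes sent to the client in the fifth), and the sample lines, addresses and sizes below are invented for illustration, not taken from your log.

```python
from collections import defaultdict

# Invented sample lines in the assumed native format:
# time elapsed client code/status bytes method URL user hierarchy type
SAMPLE = """\
1452709000.123     45 192.0.2.10 TCP_REFRESH_UNMODIFIED/304 338 GET http://archive.example/ubuntu/dists/trusty/main/binary-amd64/Packages.gz - HIER_DIRECT/198.51.100.5 -
1452709001.456     40 192.0.2.11 TCP_REFRESH_UNMODIFIED/304 340 GET http://archive.example/ubuntu/dists/trusty/main/binary-amd64/Packages.gz - HIER_DIRECT/198.51.100.5 -
1452709002.789   2100 192.0.2.12 TCP_MISS/200 1500000 GET http://archive.example/ubuntu/pool/main/f/foo/foo_1.0_amd64.deb - HIER_DIRECT/198.51.100.5 application/x-debian-package
1452709003.012     60 192.0.2.13 TCP_REFRESH_UNMODIFIED/200 1500350 GET http://archive.example/ubuntu/pool/main/f/foo/foo_1.0_amd64.deb - HIER_DIRECT/198.51.100.5 application/x-debian-package
"""


def bytes_by_result(log_text):
    """Map each 'TAG/STATUS' value to (request count, total bytes sent to clients)."""
    totals = defaultdict(lambda: [0, 0])
    for line in log_text.splitlines():
        fields = line.split()
        if len(fields) < 5:
            continue  # skip blank or truncated lines
        result = fields[3]
        totals[result][0] += 1
        totals[result][1] += int(fields[4])
    return {result: tuple(counts) for result, counts in totals.items()}


if __name__ == "__main__":
    for result, (count, size) in sorted(bytes_by_result(SAMPLE).items()):
        print(f"{result:32s} requests={count:3d} bytes={size}")
```

To run it on a real log, pass the file contents to `bytes_by_result` instead of `SAMPLE`. Note the byte count is what Squid sent to the client, so it shows how small the 304 replies are rather than how much was fetched upstream.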
Amos