[squid-users] Squid3: 100 % CPU load during object caching

Amos Jeffries squid3 at treenet.co.nz
Tue Jul 21 11:42:29 UTC 2015


On 21/07/2015 7:59 p.m., Jens Offenbach wrote:
> I am running Squid3 3.3.8 on Ubuntu 14.04. Squid3 has been installed from the Ubuntu package repository. In my scenario, Squid has to cache big files >= 1 GB. At the moment, I am getting very bad transfer rates lower than 1 MB/sec. I have checked the connectivity using iperf3. It gives me a bandwidth of 853 Mbits/sec between the nodes.
> 
> I have tried to investigate the problem and noticed that when there is no cache hit for a requested object, the Squid process reaches 100 % of one CPU core shortly after startup. The download rate drops to 1 MB/sec. When I have a cache hit, I only get 30 MB/sec in my download.
> 
> Is there something wrong with my config? I have already tried Squid 3.3.14 and get the same result. Unfortunately, I was not able to build Squid 3.5.5 and 3.5.6.
> 

Squid-3 is better able to cope with large objects than Squid-2 was. But
there are still significant problems.


Firstly, you only have space in memory for 4x 1GB objects. Total. If you
are dealing with such large objects at any regular frequency, you need a
much larger cache_mem setting at the very least.


Secondly, consider that Squid-3.3 places *all* active transactions into
cache_mem. 4GB of memory cache can store ~4 million x 1KB transactions,
or only 4 x 1GB transactions.

If you have a cache full of small objects happily sitting in memory and
a request for a 1GB object comes in, a huge number of those small
objects need to be pushed out of memory cache onto disk, the memory
reallocated for use by the big one, and possibly a 1GB object loaded
from disk into memory cache.

Then consider that GB-sized object sitting in cache as it gets near to
being the oldest in memory. The next request is probably a puny little
0-1KB object, and Squid may have to repeat all the GB-sized shuffling to
and from disk just to make memory space for that single KB.

As you can imagine, any one part of that process takes a lot of work and
time when a big object is involved, compared to only small objects. The
whole set of actions can be excruciatingly painful if the proxy is busy.


Thirdly, you also only have 88GB of disk cache total. Neither that nor
the memory cache is sufficient for caching GB-sized objects. The
tradeoff is whether one GB-sized object will get HITs often enough to be
worth not caching the million or so smaller objects that could be taking
its place. For most uses the tradeoff only makes sense with high traffic
on the large objects and/or TB of disk space.


My rule-of-thumb advice for caching is to size things so that you can
store at least a few thousand maximum-sized objects at once in the
allocated space.

So a 4GB memory cache is reasonable for 1MB-sized objects, and an 80GB
disk cache is reasonable for ~100MB-sized objects.

That keeps almost all web page traffic in memory, and the bigger but
popular media/video objects on disk. The really big things like Windows
Service Packs or whole DVD downloads get slower network fetches as
needed. If those latter are actually a problem for you, get a bigger
disk cache; you *will* need it.
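Applying that rule of thumb to your config below, a sizing along these
lines (the exact values are illustrative, not prescriptive) would look
like:

```
# MEMORY CACHE OPTIONS
#  rule of thumb: a few thousand maximum-sized objects per cache.
#  4 GB / 1 MB ~= 4000 objects in memory
  cache_mem 4 GB
  maximum_object_size_in_memory 1 MB

# DISK CACHE OPTIONS
#  88 GB / 100 MB ~= 900 objects on disk; still on the small side
#  by the rule of thumb, but far saner than 10 GB objects
  maximum_object_size 100 MB
  cache_dir aufs /var/cache/squid3 88894 16 256
```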


And a free audit for your config...


> Here is my squid.conf:
> # ACCESS CONTROLS
> # ----------------------------------------------------------------------------
>   acl intranet    src 139.2.0.0/16
>   acl intranet    src 193.96.112.0/21
>   acl intranet    src 192.109.216.0/24
>   acl intranet    src 100.1.4.0/22
>   acl localnet    src 10.0.0.0/8
>   acl localnet    src 172.16.0.0/12
>   acl localnet    src 192.168.0.0/16
>   acl localnet    src fc00::/7
>   acl localnet    src fe80::/10
>   acl to_intranet dst 139.2.0.0/16
>   acl to_intranet dst 193.96.112.0/21
>   acl to_intranet dst 192.109.216.0/24
>   acl to_intranet dst 100.1.4.0/22
>   acl to_localnet dst 10.0.0.0/8
>   acl to_localnet dst 172.16.0.0/12
>   acl to_localnet dst 192.168.0.0/16
>   acl to_localnet dst fc00::/7
>   acl to_localnet dst fe80::/10

The intended purpose behind the localnet and to_localnet ACLs is that
they match your intranet / LAN / local network ranges.

The ones we distribute are just common standard ranges. You can simplify
your config by adding the intranet ranges to localnet and dropping all
the 'intranet' ACLs.

... BUT ...


>   http_access allow manager localhost
>   http_access deny  manager
>   http_access allow localnet
>   http_access allow localhost
>   http_access deny all

... noting how the intranet ACLs are not used to permit access through
the proxy. Maybe just dropping them entirely is better; if this is a
working proxy, they are not being used.
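If the intranet ranges do need access through the proxy, a sketch of the
simplification (folding them into the standard ACLs and dropping the
'intranet' names entirely) would be:

```
  acl localnet    src 139.2.0.0/16
  acl localnet    src 193.96.112.0/21
  acl localnet    src 192.109.216.0/24
  acl localnet    src 100.1.4.0/22
  # ... the standard 10/8, 172.16/12, 192.168/16, fc00::/7,
  # fe80::/10 entries as before, and the same ranges added to
  # to_localnet with 'dst' instead of 'src'
```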


> 
> # NETWORK OPTIONS
> # ----------------------------------------------------------------------------
>   http_port 0.0.0.0:3128
> 
> # OPTIONS WHICH AFFECT THE NEIGHBOR SELECTION ALGORITHM
> # ----------------------------------------------------------------------------
>   cache_peer proxy.mycompany.de parent 8080 0 no-query no-digest
> 
> # MEMORY CACHE OPTIONS
> # ----------------------------------------------------------------------------
>   maximum_object_size_in_memory 1 GB
>   memory_replacement_policy heap LFUDA
>   cache_mem 4 GB
> 
> # DISK CACHE OPTIONS
> # ----------------------------------------------------------------------------
>   maximum_object_size 10 GB
>   cache_replacement_policy heap GDSF
>   cache_dir aufs /var/cache/squid3 88894 16 256 max-size=10737418240
> 
> # LOGFILE OPTIONS
> # ----------------------------------------------------------------------------
>   access_log daemon:/var/log/squid3/access.log squid
>   cache_store_log daemon:/var/log/squid3/store.log
> 
> # OPTIONS FOR TROUBLESHOOTING
> # ----------------------------------------------------------------------------
>   cache_log /var/log/squid3/cache.log
>   coredump_dir /var/log/squid3
> 
> # OPTIONS FOR TUNING THE CACHE
> # ----------------------------------------------------------------------------
>   cache allow localnet
>   cache allow localhost
>   cache allow intranet
>   cache deny  all

I think that does not do what you think.

The only traffic allowed to use this proxy (by http_access) is localnet
or localhost traffic.

Only traffic fetched for localnet or localhost can be cached.
Therefore, all traffic allowed to go through this proxy can be cached.


You can simplify and speed up Squid a bit by removing the "cache ..."
lines entirely from your config. The default is "allow all".
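That is, the whole "OPTIONS FOR TUNING THE CACHE" block above reduces to
nothing, or if you prefer to keep the default explicit:

```
# OPTIONS FOR TUNING THE CACHE
# ----------------------------------------------------------------------------
#  (the implicit default; no "cache" lines are needed at all)
  cache allow all
```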



>   refresh_pattern ^ftp:              1440    20%    10080
>   refresh_pattern ^gopher:           1440     0%     1440
>   refresh_pattern -i (/cgi-bin/|\?)     0     0%        0
>   refresh_pattern .                     0    20%     4320
> 
> # HTTP OPTIONS
> # ----------------------------------------------------------------------------
>   via off
> 
> # ADMINISTRATIVE PARAMETERS
> # ----------------------------------------------------------------------------
>   cache_effective_user proxy
>   cache_effective_group proxy
> 

The above should all be unnecessary for 99.99% of Squid installations,
in which case you would not have explicitly added the proxy user to
unnecessary system groups anyway.


> # ICP OPTIONS
> # ----------------------------------------------------------------------------
>   icp_port 0
> 
> # OPTIONS INFLUENCING REQUEST FORWARDING 
> # ----------------------------------------------------------------------------
>   nonhierarchical_direct on
>   prefer_direct off
>   always_direct allow to_localnet
>   always_direct allow to_localhost
>   always_direct allow to_intranet
>   never_direct  allow all
> 
> # MISCELLANEOUS
> # ----------------------------------------------------------------------------
>   memory_pools off
>   forwarded_for off

<http://www.squid-cache.org/Versions/v3/3.3/cfgman/forwarded_for.html>

Most Squid installations (and tutorials for very old Squid) using
"forwarded_for off" actually wanted the behaviour of "truncate". The
remaining few cases wanted "delete".

Ironically, revealing your clients' LAN IPs is the *desirable*
behaviour, since the purpose of the XFF header is to allow sites like
Wikipedia to block individual malicious clients without casting your
entire network into a traffic black hole. Client software with an
interest in privacy can alter the details it presents for the proxy to
use in that header - obfuscating its existence better than aggregating
the proxy traffic ready for trackers to use.
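A sketch of the suggested change:

```
# MISCELLANEOUS
#  "truncate" drops any inherited XFF chain and sends only
#  "X-Forwarded-For: unknown", instead of the "off" behaviour
  forwarded_for truncate
```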

Amos



More information about the squid-users mailing list