[squid-users] squid on openwrt: RAM usage and header forgery

Amos Jeffries squid3 at treenet.co.nz
Wed Oct 10 13:15:53 UTC 2018


On 10/10/18 8:18 PM, reinerotto wrote:
> 
> Using squid 4.0.24 on openwrt,

Please upgrade. All 4.0.z versions are beta releases and are no longer
supported.

> I see it grabbing significant amount of
> additional RAM after short period of activity, although I tried to downsize
> squid as much as possible. Any suggestion for further significant reduction
> of mem requirements after startup, or why is there such a growth (> 10MB)
> after short period of time ?

Your machine's /proc details below show that the large numbers are
virtual, not real, memory usage.


> 
> Now mem requirements for kid-1, shortly after boot:
> cat /proc/1447/status
> Name:   squid
...
> VmPeak:    15836 kB <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
> VmSize:    15836 kB
> VmLck:         0 kB
> VmPin:         0 kB
> VmHWM:     11324 kB
> VmRSS:     11324 kB
> RssAnon:            4596 kB
> RssFile:            6660 kB
> RssShmem:             68 kB
> VmData:     5708 kB <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
> VmStk:       132 kB
> VmExe:      3744 kB
> VmLib:      4196 kB
> VmPTE:        28 kB
> VmPMD:         0 kB
> VmSwap:        0 kB

> 
> #1h later, after some usage:
>  cat /proc/1447/status
> Name:   squid
...
> VmPeak:    28844 kB <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
> VmSize:    28844 kB
> VmLck:         0 kB
> VmPin:         0 kB
> VmHWM:     23064 kB
> VmRSS:     23064 kB
> RssAnon:           15856 kB
> RssFile:            7140 kB
> RssShmem:             68 kB
> VmData:    18716 kB <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
> VmStk:       132 kB
> VmExe:      3744 kB
> VmLib:      4196 kB
> VmPTE:        40 kB
> VmPMD:         0 kB
> VmSwap:        0 kB


Note that the first value you are pointing at in the above report is
*peak* virtual memory usage. In other words, it is the highest amount of
memory *ever* allocated during all of that "previous usage" traffic
going through the proxy.

The second value is lower, so there is no memory leak nor any problem
similar to a leak.


I expect it is the natural outcome of using helpers. The way fork()
operates on most systems is to allocate a block of virtual memory equal
in size to the real memory used at that time by the Squid worker process
starting the helper. That memory is never actually used by the
child/helper process, so as long as your machine can cope with the size
existing as virtual memory it can be ignored.

Your helpers are external ACL helpers, for which the default behaviour
is not to start any of them until Squid is active and the first HTTP
message has been received.
 The second VmData value is just over 3x the initial VmData value. I
would not be surprised to see 3 helper processes running when that /proc
listing was taken.
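
If you want to confirm that, a quick sanity check (assuming the BusyBox
ps on OpenWrt and the helper script names from your config below) is
something like:

  # count the external ACL helper processes currently running
  ps | grep -c '[c]heck_test'
  # show PID and VSZ for the Squid workers and their helpers
  ps | grep -E '[s]quid|[c]heck_test'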


There have been a few bug fixes and code-polish changes since that old
beta version aimed at reducing and improving memory use. But I would not
expect any major difference, just more streamlined use under load.



The other thing you can do to improve memory usage by Squid is to leave
the memory pools feature *enabled*. That lets you limit how much unused
memory Squid keeps reserved in its pools
(<http://www.squid-cache.org/Doc/config/memory_pools_limit/>). It also
lets you look at the cachemgr "mem" report to see what Squid is using
memory for (showing allocation rates, active amounts and peak values).
Without memory pooling one can only guess at these numbers.
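
As a rough sketch of what that looks like in squid.conf (the 8 MB limit
here is only an illustrative value, not a recommendation for your box):

  # keep pooling enabled and cap the idle (unused) pool memory
  memory_pools on
  memory_pools_limit 8 MB

The pool details can then be read from the cachemgr "mem" report, e.g.
with "squidclient mgr:mem" if you have squidclient available.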

FYI: the documentation for that pools limit directive mentions pools as
a way to avoid memory thrashing. Squid allocates and deallocates memory
in small packet-sized blocks at a rate approximately 4x that of the
traffic being handled, e.g. a 1MB/sec traffic rate will be allocating
and then deallocating around 4MB of memory in that same second.
 The way operating systems tend to cope with such high turnover from a
process like Squid is to leave free()'d memory reserved, in a
virtual-memory state, for the process that last used it to re-grab. This
is also what Squid's pools do, but in a way optimized to also prevent
fragmentation issues accumulating due to each HTTP message having ever
so slightly different memory needs.


> Initial mem requirements OK, but then the huge increase in size afterwards
> is not appreciated.
> (Don't need caching at all. Compiled without IPv6)

You do, however, need to process traffic. Simply receiving that traffic
requires memory allocations, processing the messages requires more, and
so on. All this processing accumulates small bits of information about
the traffic, and memory is used to store that data both to produce mgr
reports and to optimize the handling of later traffic. Without it your
proxy would be very, very slow.



> 
> First the (anon) squid.conf:
> acl localnet src 192.168.182.0/24
> acl ssl_ports port 443
> acl safe_ports port 80
> acl safe_ports port 443
> acl safe_ports port 3128
> acl connect method connect
> 
> http_access deny !safe_ports
> http_access deny connect !ssl_ports
> 
> acl acl1 url_regex -i .*/string1$
> acl acl2 url_regex -i .*/string2$
> acl acl3 url_regex -i .*/string3$
> 
> external_acl_type check_test ttl=0 cache=0 %SRC /etc/squid/check_test.sh
> external_acl_type check_test_2 ttl=30 negative_ttl=3 cache=32 %SRC
> /etc/squid/check_test_2.sh
> acl check_2 check_test_2
> acl check  external check_test
> 
> http_access deny acl1 check
> http_access deny acl2 check
> http_access deny acl3 check
> 
> http_access allow localnet
> http_access allow localhost
> http_access deny all
> 
> cache deny all
> access_log none
> cache_log /var/log/squid/cache.log
> cache_store_log stdio:/dev/null

Don't do that. Just remove the cache_store_log line entirely from your
config. The store.log has not been enabled by default since Squid-3.0.


> logfile_rotate 0
> logfile_daemon /dev/null

The above logfile_daemon line can be removed once you upgrade. It was
only needed to work around bug 4171, which is fixed in the more recent
Squid-4 releases.


> 
> http_port 3128
> http_port 8888 intercept
> 
> https_port 4443  intercept ssl-bump cert=/etc/squid/ssl_cert/myCA.pem \
>   generate-host-certificates=off dynamic_cert_mem_cache_size=1MB
> sslflags=NO_DEFAULT_CA

Squid-4 release notes:
 "New option tls-default-ca replaces sslflags=NO_DEFAULT_CA, the default
is also changed to OFF."

So you can remove the above sslflags= option. It does nothing but waste
memory during loading of the config file.
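
In other words that https_port line can be reduced to something along
these lines (an untested sketch, keeping your other options as-is):

  https_port 4443 intercept ssl-bump cert=/etc/squid/ssl_cert/myCA.pem \
    generate-host-certificates=off dynamic_cert_mem_cache_size=1MB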



> acl step1 at_step SslBump1
> ssl_bump peek step1 all
> 
> acl sni_block ssl::server_name .a.com
> acl sni_block ssl::server_name .b.com
> acl sni_block ssl::server_name .c.com
> ssl_bump terminate !check_2 sni_block check
> ssl_bump splice all
> 
> 
> cache_mem 0 MB
> shutdown_lifetime 10 seconds
> httpd_suppress_version_string on
> dns_v4_first on
> forwarded_for delete
> via off
> reply_header_access Cache deny all

Do you actually see "Cache:" headers in any traffic? If so, what do
they mean?

I ask because that header name has recently been proposed for
standardization by the HTTPbis working group at IETF as a formal
mechanism to replace the X-Cache and X-Cache-* headers. There should not
be any existing use of that "Cache:" name and I/we need to make the IETF
working group aware of any existing uses that may clash with the proposal.
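
If you want to check, something like the following on one of the
clients (assuming curl is available; the URL is just a placeholder)
prints any "Cache:" response header an origin server sends:

  # fetch only the headers and filter for a Cache: header
  curl -s -D - -o /dev/null http://example.com/ | grep -i '^cache:'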



> client_idle_pconn_timeout 1 minute
> server_idle_pconn_timeout 5 minute
> memory_pools off
> ipcache_size 128
> fqdncache_size 128
> reply_header_access Alternate-Protocol deny all
> reply_header_access alternate-protocol deny all

Header names are case-insensitive. The above is a duplicate definition
and just wastes memory, both at config parsing time and in the permanent
in-memory representation of these header mangling rules.
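
A single line per header name is enough, e.g. just:

  reply_header_access Alternate-Protocol deny all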


> reply_header_access alt-svc deny all
> pinger_enable off
> digest_generation off
> netdb_filename none
> dns_nameservers 127.0.0.1
> reply_body_max_size 4 MB
> 
> 
...

> 
> I get quite a lot of messages in cache.log:
> 2018/10/09 12:38:49 kid1| ALE missing adapted HttpRequest object
> 2018/10/09 12:38:49 kid1| ALE missing URL
> 2018/10/09 12:38:49 kid1| ALE missing adapted HttpRequest object
> 2018/10/09 12:40:18 kid1| SECURITY ALERT: Host header forgery detected on
> local=212.95.165.32:443 remote=192.168.182.3:51304 FD 36 flags=33 (local IP
> does not match any domain IP)
> 2018/10/09 12:40:18 kid1| SECURITY ALERT: on URL:
> b.scorecardresearch.com:443
> 2018/10/09 12:40:28 kid1| SECURITY ALERT: Host header forgery detected on
> local=104.193.83.156:443 remote=192.168.182.3:51400 FD 183 flags=33 (local
> IP does not match any domain IP)
> 2018/10/09 12:40:28 kid1| SECURITY ALERT: on URL:
> csm2waycm-atl.netmng.com:443
> 2018/10/09 12:40:28 kid1| SECURITY ALERT: Host header forgery detected on
> local=104.193.83.156:443 remote=192.168.182.3:51402 FD 226 flags=33 (local
> IP does not match any domain IP)
> 
> My guess is, that the "header forgery" might be caused by inconsistency
> between browsers DNS-cache, my clients DNS-cache (Win 7) and the DNS-cache
> on the device, running squid. As practically all these "header forgeries"
> are for ad-networks, I consider it only an annoyance. Or is it _not_ ?
> 

As the warning message says: "local IP does not match any domain IP".
The client is trying to fetch data from a server which apparently is not
related to the origin server for the domain of the URL being fetched.

With your non-caching configuration it is just an annoyance. At worst it
means one of your clients is infected or being hijacked by a web-bug
script. So it is something to keep an eye on (which is why the notices
do not go away), but not something to worry overly much about.

The issue of frequent false positives is usually only seen when very
different recursive resolver chains are being used to fetch the
information, or when a parent resolver in that chain has non-standard
behaviour (as is the case with the 8.8.8.8 DNS services).
 Some things you can do to reduce this type of annoying warning are
listed at <https://wiki.squid-cache.org/KnowledgeBase/HostHeaderForgery>.
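
One quick check along those lines (reusing a hostname from your log
above) is to compare what the resolver on the Squid box and the resolver
the client uses return for the same name at roughly the same time:

  # on the OpenWrt box running Squid (127.0.0.1 per your dns_nameservers)
  nslookup b.scorecardresearch.com 127.0.0.1
  # on the Windows client
  nslookup b.scorecardresearch.com

If the answers differ, having the clients and Squid share the same DNS
resolver usually reduces these warnings.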

Amos

