[squid-users] Crash: every 19 hours: kernel: Out of memory: Kill process (squid)

Cherukuri, Naresh ncherukuri at partycity.com
Fri Aug 11 13:13:24 UTC 2017


Amos,



Please find below my squid.conf, access logs, and memory output (in MB). I would appreciate any help.



Memory Info:

[root@******prod ~]# free -m

             total       used       free     shared    buffers     cached

Mem:         11845       4194       7651         41        190       1418

-/+ buffers/cache:       2585       9260

Swap:        25551        408      25143
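
As a cross-check on the figures above, here is a minimal Python sketch that parses the older six-column `free -m` layout and recomputes the memory used by applications (used minus buffers and cached). The function name `parse_free_m` is hypothetical and introduced only for illustration; the sample text is copied from the output above.

```python
def parse_free_m(text):
    """Parse the older six-column `free -m` layout into nested dicts (values in MB)."""
    stats = {}
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "Mem:":
            keys = ["total", "used", "free", "shared", "buffers", "cached"]
            stats["mem"] = dict(zip(keys, map(int, fields[1:7])))
        elif fields[0] == "Swap:":
            keys = ["total", "used", "free"]
            stats["swap"] = dict(zip(keys, map(int, fields[1:4])))
    return stats


SAMPLE = """\
             total       used       free     shared    buffers     cached
Mem:         11845       4194       7651         41        190       1418
-/+ buffers/cache:       2585       9260
Swap:        25551        408      25143
"""

stats = parse_free_m(SAMPLE)
mem = stats["mem"]
# Memory actually held by processes: "used" minus buffers and page cache.
app_used = mem["used"] - mem["buffers"] - mem["cached"]
print(f"used by applications: ~{app_used} MB of {mem['total']} MB")
print(f"swap used: {stats['swap']['used']} MB of {stats['swap']['total']} MB")
```

The recomputed value can differ by about 1 MB from the `-/+ buffers/cache` row, presumably because `free` rounds each column separately.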



Squidconf:

[root@******prod squid]# more squid.conf

#

# Recommended minimum configuration:

#

max_filedesc 4096

acl manager proto cache_object

visible_hostname ******prod

logfile_rotate 10



access_log /cache/access.log



acl localnet src 172.16.0.0/16

acl backoffice_users src 10.136.0.0/13

acl h****_backoffice_users src 10.142.0.0/15

acl re****_users src 10.128.0.0/13

acl hcity_r*****_users src 10.134.0.0/15

acl par**** url_regex par****



acl SSL_ports port 443

acl Safe_ports port 80          # http

#acl Safe_ports port 21         # ftp

acl Safe_ports port 443         # https

#acl Safe_ports port 70         # gopher

#acl Safe_ports port 210                # wais

#acl Safe_ports port 1025-65535 # unregistered ports

#acl Safe_ports port 280                # http-mgmt

#acl Safe_ports port 488                # gss-http

#acl Safe_ports port 591                # filemaker

#acl Safe_ports port 777                # multiling http

acl CONNECT method CONNECT

acl backoffice_allowed_sites url_regex "/etc/squid/backoffice_allowed_sites"

acl h***_backoffice_allowed_sites url_regex "/etc/squid/backoffice_allowed_sites"

acl backoffice_blocked_sites url_regex "/etc/squid/backoffice_blocklist"

acl h***_backoffice_blocked_sites url_regex "/etc/squid/backoffice_blocklist"

acl re****_allowed_sites url_regex "/etc/squid/re****_allowed_sites"

acl h****_reg****_allowed_sites url_regex "/etc/squid/h***_reg*****_allowed_sites"

#



http_access allow localnet reg***_allowed_sites

http_access deny backoffice_users backoffice_blocked_sites

http_access deny h***_backoffice_users backoffice_blocked_sites

http_access allow backoffice_users backoffice_allowed_sites

http_access allow h***_backoffice_users backoffice_allowed_sites

http_access allow reg****_users reg****_allowed_sites

http_access allow h***_reg****_users h***_reg****_allowed_sites

no_cache deny par****

http_access deny all



#http_access allow manager localhost

#http_access deny manager



# Deny requests to certain unsafe ports

http_access deny !Safe_ports



# Deny CONNECT to other than secure SSL ports

#http_access deny CONNECT !SSL_ports

http_access  allow CONNECT SSL_ports

# We strongly recommend the following be uncommented to protect innocent

# web applications running on the proxy server who think the only

# one who can access services on "localhost" is a local user

http_access deny to_localhost



# Example rule allowing access from your local networks.

# Adapt localnet in the ACL section to list your (internal) IP networks

# from where browsing should be allowed

#http_access allow localnet

http_access allow localhost



# And finally deny all other access to this proxy

http_access deny all



# Squid normally listens to port 3128

http_port 3128 ssl-bump \

key=/etc/squid/pc****sslcerts/pc*****prod.pkey \

cert=/etc/squid/pc******sslcerts/pc*****prod.crt \

generate-host-certificates=on dynamic_cert_mem_cache_size=4MB



acl step1 at_step SslBump1

ssl_bump peek step1

#ssl_bump bump all

ssl_bump bump backoffice_users !localnet !h***_backoffice_users !reg****_users !h***_reg***_users !par***



#sslproxy_capath /etc/ssl/certs

sslproxy_cert_error allow all

always_direct allow all

sslproxy_flags DONT_VERIFY_PEER



sslcrtd_program /usr/lib64/squid/ssl_crtd -s /var/lib/ssl_db -M 4MB sslcrtd_children 8 startup=1 idle=1



# Uncomment and adjust the following to add a disk cache directory.

#cache_dir ufs /cache/squid 10000 16 256



# Leave coredumps in the first cache dir

#rdescoredump_dir /var/spool/squid

#coredump_dir /var/log/squid/squid

coredump_dir /cache/squid



# Add any of your own refresh_pattern entries above these.

refresh_pattern ^ftp:           1440    20%     10080

refresh_pattern ^gopher:        1440    0%      1440

refresh_pattern -i (/cgi-bin/|\?) 0     0%      0

refresh_pattern .               0       20%     4320



#url_rewrite_access allow all

#url_rewrite_program /usr/bin/squidGuard -c /etc/squid/squidguard.conf



Accesslogs:

1502424001.504      0 10.138.142.6 TCP_DENIED/403 4175 GET http://update.scansoft.com/GetCertificate.asp? - HIER_NONE/- text/html

1502424001.533    329 10.140.230.6 TAG_NONE/200 0 CONNECT watson.telemetry.microsoft.com:443 - HIER_DIRECT/65.55.252.202 -

1502424001.543      0 10.141.80.6 TCP_DENIED/403 4167 GET http://update.scansoft.com/Version.asp? - HIER_NONE/- text/html

1502424001.546    331 10.140.230.6 TAG_NONE/200 0 CONNECT watson.telemetry.microsoft.com:443 - HIER_DIRECT/65.55.252.202 -

1502424001.551  29923 10.130.27.24 TCP_MISS_ABORTED/000 0 GET http://pc-sep.pcwhq.par****.net:8014/secars/secars.dll? - HIER_DIRECT/10.1.2.35 -

1502424001.571      0 10.141.108.6 TCP_DENIED/403 4269 GET http://update.scansoft.com/GetMessages.asp? - HIER_NONE/- text/html

1502424001.572      0 10.138.142.6 TCP_DENIED/403 4175 GET http://update.scansoft.com/GetCertificate.asp? - HIER_NONE/- text/html

1502424001.579      0 10.140.167.6 TCP_DENIED/403 4168 GET http://update.scansoft.com/Version.asp? - HIER_NONE/- text/html

1502424001.590  27992 10.140.248.7 TCP_MISS_ABORTED/000 0 GET http://pc-sep.pcwhq.par*****.net:8014/secars/secars.dll? - HIER_DIRECT/10.1.2.35 -

1502424001.631      0 10.141.108.6 TCP_DENIED/403 4269 GET http://update.scansoft.com/GetMessages.asp? - HIER_NONE/- text/html

1502424001.643      0 10.138.142.6 TCP_DENIED/403 4175 GET http://update.scansoft.com/GetCertificate.asp? - HIER_NONE/- text/html

1502424001.646      0 10.136.76.7 TCP_MISS_ABORTED/000 0 POST https://watson.telemetry.microsoft.com/Telemetry.Request - HIER_DIRECT/65.55.252.202 -

1502424001.654  29864 10.133.222.25 TCP_MISS_ABORTED/000 0 GET http://10.1.2.35:8014/secars/secars.dll? - HIER_DIRECT/10.1.2.35 -

1502424001.670      1 10.140.167.6 TCP_DENIED/403 4168 GET http://update.scansoft.com/Version.asp? - HIER_NONE/- text/html

1502424001.676      0 10.141.215.6 TCP_DENIED/403 3998 OPTIONS http://172.16.4.19/PrintQueue/completed/ - HIER_NONE/- text/html

1502424001.678  29927 10.132.157.21 TCP_MISS_ABORTED/000 0 GET http://10.1.2.35:8014/secars/secars.dll? - HIER_DIRECT/10.1.2.35 -

1502424001.688      0 10.141.108.6 TCP_DENIED/403 4269 GET http://update.scansoft.com/GetMessages.asp? - HIER_NONE/- text/html

1502424001.700    363 10.136.171.6 TAG_NONE/200 0 CONNECT watson.telemetry.microsoft.com:443 - HIER_DIRECT/65.55.252.202 -

1502424001.702    365 10.138.31.10 TAG_NONE/200 0 CONNECT ent-shasta-rrs.symantec.com:443 - HIER_DIRECT/104.40.50.196 -

1502424001.716      0 10.138.142.6 TCP_DENIED/403 4175 GET http://update.scansoft.com/GetCertificate.asp? - HIER_NONE/- text/html

1502424001.756      0 10.141.108.6 TCP_DENIED/403 4269 GET http://update.scansoft.com/GetMessages.asp? - HIER_NONE/- text/html

1502424001.782      0 10.140.230.6 TCP_MISS_ABORTED/000 0 POST https://watson.telemetry.microsoft.com/Telemetry.Request - HIER_DIRECT/65.55.252.202 -

1502424001.782      0 10.140.230.6 TCP_MISS_ABORTED/000 0 POST https://watson.telemetry.microsoft.com/Telemetry.Request - HIER_DIRECT/65.55.252.202 -

1502424001.787  29983 10.132.141.21 TCP_MISS_ABORTED/000 0 GET http://pc-sep.pcwhq.par***.net:8014/secars/secars.dll? - HIER_DIRECT/10.1.2.35 -

1502424001.792      2 10.138.142.6 TCP_DENIED/403 4175 GET http://update.scansoft.com/GetCertificate.asp? - HIER_NONE/- text/html

1502424001.792  29928 10.128.101.24 TCP_MISS_ABORTED/000 0 GET http://pc-sep.pcwhq.par****.net:8014/secars/secars.dll? - HIER_DIRECT/10.1.2.35 -

1502424001.815      0 10.141.108.6 TCP_DENIED/403 4269 GET http://update.scansoft.com/GetMessages.asp? - HIER_NONE/- text/html

1502424001.841      0 10.141.160.6 TCP_DENIED/403 4168 GET http://update.scansoft.com/Version.asp? - HIER_NONE/- text/html

1502424001.843      0 10.141.215.6 TCP_DENIED/403 3998 OPTIONS http://172.16.4.19/PrintQueue/completed/ - HIER_NONE/- text/html

1502424001.873      0 10.141.108.6 TCP_DENIED/403 4269 GET http://update.scansoft.com/GetMessages.asp? - HIER_NONE/- text/html

1502424001.892  29805 10.141.82.6 TCP_MISS_ABORTED/000 0 GET http://pc-sep.pcwhq.par****.net:8014/secars/secars.dll? - HIER_DIRECT/10.1.2.35 -

1502424001.907      0 10.141.160.6 TCP_DENIED/403 4168 GET http://update.scansoft.com/Version.asp? - HIER_NONE/- text/html

1502424001.912  29990 10.128.10.24 TCP_MISS_ABORTED/000 0 GET http://10.1.2.35:8014/secars/secars.dll? - HIER_DIRECT/10.1.2.35 -

1502424001.927      0 10.136.147.6 TCP_DENIED/403 4168 GET http://update.scansoft.com/Version.asp? - HIER_NONE/- text/html

1502424001.938     77 10.138.31.10 TCP_MISS/200 514 POST https://ent-shasta-rrs.symantec.com/mrclean? - HIER_DIRECT/104.40.50.196 -

1502424001.946  29949 10.130.45.23 TCP_MISS_ABORTED/000 0 GET http://pc-sep.pcwhq.par*****.net:8014/secars/secars.dll? - HIER_DIRECT/10.1.2.35 -

1502424001.974      0 10.141.160.6 TCP_DENIED/403 4168 GET http://update.scansoft.com/Version.asp? - HIER_NONE/- text/html

1502424001.976  28002 10.136.84.6 TCP_MISS_ABORTED/000 0 GET http://pc-sep.pcwhq.par****.net:8014/secars/secars.dll? - HIER_DIRECT/10.1.2.35 -

1502424001.997      0 10.136.171.6 TCP_MISS_ABORTED/000 0 POST https://watson.telemetry.microsoft.com/Telemetry.Request - HIER_DIRECT/65.55.252.202 -

1502424002.003      1 10.136.147.6 TCP_DENIED/403 4168 GET http://update.scansoft.com/Version.asp? - HIER_NONE/- text/html

1502424002.013      0 10.138.33.6 TCP_DENIED/403 4268 GET http://update.scansoft.com/GetMessages.asp? - HIER_NONE/- text/html
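
To summarise a log excerpt like the one above, a small Python sketch can split each line of Squid's native access.log format into fields and tally the result codes. The function name `parse_access_line` and the 25-second "slow" threshold are assumptions made for illustration (the aborted requests in the sample cluster around 28 to 30 seconds); the sample lines are copied from the excerpt.

```python
from collections import Counter


def parse_access_line(line):
    """Split one line of Squid's native access.log format into named fields."""
    parts = line.split()
    if len(parts) < 10:
        return None  # not a complete native-format line
    result_code, _, http_status = parts[3].partition("/")
    hierarchy, _, peer = parts[8].partition("/")
    return {
        "timestamp": float(parts[0]),
        "elapsed_ms": int(parts[1]),
        "client": parts[2],
        "result_code": result_code,
        "http_status": http_status,
        "bytes": int(parts[4]),
        "method": parts[5],
        "url": parts[6],
        "user": parts[7],
        "hierarchy": hierarchy,
        "peer": peer,
        "content_type": parts[9],
    }


SAMPLE = [
    "1502424001.504      0 10.138.142.6 TCP_DENIED/403 4175 GET http://update.scansoft.com/GetCertificate.asp? - HIER_NONE/- text/html",
    "1502424001.533    329 10.140.230.6 TAG_NONE/200 0 CONNECT watson.telemetry.microsoft.com:443 - HIER_DIRECT/65.55.252.202 -",
    "1502424001.654  29864 10.133.222.25 TCP_MISS_ABORTED/000 0 GET http://10.1.2.35:8014/secars/secars.dll? - HIER_DIRECT/10.1.2.35 -",
    "1502424001.938     77 10.138.31.10 TCP_MISS/200 514 POST https://ent-shasta-rrs.symantec.com/mrclean? - HIER_DIRECT/104.40.50.196 -",
]

entries = [e for e in map(parse_access_line, SAMPLE) if e is not None]
by_result = Counter(e["result_code"] for e in entries)
# Assumed threshold: treat anything held open for 25 s or more as "slow".
slow = [e for e in entries if e["elapsed_ms"] >= 25000]

for code, count in by_result.most_common():
    print(f"{code:20s} {count}")
for e in slow:
    print(f"slow: {e['client']} {e['method']} {e['url']} ({e['elapsed_ms']} ms)")
```

Run against the full access.log, the same tally would show what share of the traffic is denied or aborted requests versus completed ones.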



Cachelog errors I am seeing daily:



Error negotiating SSL connection on FD 26: error:140A1175:SSL routines:SSL_BYTES_TO_CIPHER_LIST:inappropriate fallback (1/-1)

Error negotiating SSL connection on FD 1175: error:14094416:SSL routines:SSL3_READ_BYTES:sslv3 alert certificate unknown (1/0)

2017/08/02 09:01:02 kid1| Error negotiating SSL on FD 989: error:00000000:lib(0):func(0):reason(0) (5/-1/104) ## very rare; I found only a few

2017/08/02 09:01:43 kid1| Queue overload, rejecting # too many times

2017/08/02 09:01:45 kid1| Error negotiating SSL connection on FD 1749: (104) Connection reset by peer ## too many times

2017/08/02 10:12:58 kid1| WARNING: Closing client connection due to lifetime timeout ## only one

2017/08/07 22:37:56 kid1| comm_open: socket failure: (24) Too many open files

2017/08/07 22:39:37 kid1| WARNING: Error Pages Missing Language: en

2017/08/07 22:39:37 kid1| '/usr/share/squid/errors/en-us/ERR_DNS_FAIL': (24) Too many open files

2017/08/07 22:39:37 kid1| WARNING: Error Pages Missing Language: en-us

2017/08/07 22:01:42 kid1| WARNING: All 32/32 ssl_crtd processes are busy.

2017/08/07 22:01:42 kid1| WARNING: 32 pending requests queued

2017/08/07 22:01:42 kid1| WARNING: Consider increasing the number of ssl_crtd processes in your config file.

2017/08/11 00:58:56 kid1| WARNING: Closing client connection due to lifetime timeout

2017/08/09 12:55:45 kid1| WARNING! Your cache is running out of filedescriptors
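
To put numbers on "too many times", a minimal Python sketch can normalise cache.log lines and count how often each kind of message occurs. It assumes the trailing `#` notes in the excerpt above were added by hand and are not part of the log, and the helper name `normalize` is hypothetical. The sample lines are copied from the excerpt.

```python
import re
from collections import Counter

# Leading "YYYY/MM/DD HH:MM:SS kidN| " prefix written by Squid workers.
PREFIX = re.compile(r"^\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2} kid\d+\| ")
# Assumption: anything from whitespace plus '#' onward is a hand-written note.
ANNOTATION = re.compile(r"\s+#.*$")
# File descriptor numbers vary per connection, so collapse them.
FD_NUMBER = re.compile(r"\bFD \d+")


def normalize(line):
    """Reduce a cache.log line to its message class."""
    line = PREFIX.sub("", line.strip())
    line = ANNOTATION.sub("", line)
    return FD_NUMBER.sub("FD N", line)


SAMPLE = [
    "2017/08/02 09:01:43 kid1| Queue overload, rejecting # too many times",
    "2017/08/02 09:01:45 kid1| Error negotiating SSL connection on FD 1749: (104) Connection reset by peer ## too many times",
    "2017/08/02 10:12:58 kid1| WARNING: Closing client connection due to lifetime timeout ## only one",
    "2017/08/07 22:37:56 kid1| comm_open: socket failure: (24) Too many open files",
    "2017/08/11 00:58:56 kid1| WARNING: Closing client connection due to lifetime timeout",
]

counts = Counter(normalize(line) for line in SAMPLE)
for message, count in counts.most_common():
    print(f"{count:6d}  {message}")
```

Applied to the real cache.log, the counts would show whether the file descriptor and ssl_crtd warnings are the dominant messages.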



Thanks,

Naresh

-----Original Message-----
From: squid-users [mailto:squid-users-bounces at lists.squid-cache.org] On Behalf Of Amos Jeffries
Sent: Friday, August 11, 2017 12:51 AM
To: squid-users at lists.squid-cache.org
Subject: Re: [squid-users] Crash: every 19 hours: kernel: Out of memory: Kill process (squid)



On 11/08/17 06:36, Cherukuri, Naresh wrote:

> Eliezer,

>

> I cannot say by client or network. But, for sure I can say we have around 7000 computers using squid as a proxy.

>



FYI: that number means peak load may be as high as 70K RPS (~100 req/client).
A quad-core machine might be able to handle that, but the 2.13 GHz CPU speed
makes me doubtful. I'd plan for up to 4 machines of this type to be built in
the medium-to-long term.





>

> From: Cherukuri, Naresh

> Sent: Thursday, August 10, 2017 16:27

>

> No, this is a physical box and we use it only for Squid. We have 4 CPUs
> and 16 cores. Please find the details below for reference. Access logs
> are redirected to /cache/squid.

>



NP: I don't see any access log details in your post, just CPU specs and disk FS mappings.





> -----Original Message-----

> From: Cherukuri, Naresh

> Sent: Thursday, August 10, 2017 16:03

>

> Hello Eliezer,

>

> We are using OS "Redhat 7" and squid version 3.5.20.

>

> As of now we are not using any cache; we have already commented it out. Do you want me to try using a cache by uncommenting the following line?

>

> # Uncomment and adjust the following to add a disk cache directory.

> #cache_dir ufs /cache/squid 10000 16 256

>



No, the suggestion was for the default memory-only cache, which means no
cache_mem or cache_dir entries in your squid.conf (letting Squid use its
defaults). It should still be caching, just much less.





> From: squid-users On Behalf Of Cherukuri, Naresh

> Sent: Tuesday, August 8, 2017 16:28

>

> Hello,

>

> I am new to Squid. I have a problem: every 19 hours Squid takes all of the RAM and then starts using swap, and within 20 minutes my swap is full. The kernel's OOM killer then activates, killing all the Squid children and finally the Squid parent. Can someone help me address this problem?

>



FYI: My brief understanding of the OOM killer is that when the sum total of
all processes on the machine starts consuming too much RAM, it kills off the
largest user. So it may be that Squid is using some large (but reasonable)
amount of RAM and something else entirely pushes it over the edge, with the
OOM killer just killing Squid because it has the most memory at that time.



So, if you have any record of the machine's memory usage by process over
time, it would be good to know for certain whether it is Squid alone, or
Squid plus something else, that is the problem. Something along the lines of
a log processor that kicks in once a day and briefly uses lots of RAM
may exist.
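
If no such record exists yet, one way to start collecting it is a small sampler run periodically (for example from cron). The following is a minimal Python sketch, assuming a Linux /proc filesystem; the function names and the log path passed to `log_top` are hypothetical and chosen only for illustration.

```python
import os
import time


def parse_vmrss_kb(status_text):
    """Return (process name, VmRSS in kB) from the text of /proc/<pid>/status."""
    name, rss_kb = None, None
    for line in status_text.splitlines():
        if line.startswith("Name:"):
            name = line.split(None, 1)[1].strip()
        elif line.startswith("VmRSS:"):
            rss_kb = int(line.split()[1])
    return name, rss_kb


def snapshot(proc_root="/proc"):
    """List (rss_kb, pid, name) for every readable process, largest first."""
    if not os.path.isdir(proc_root):
        return []  # not a Linux-style /proc; nothing to sample
    rows = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "status")) as fh:
                name, rss_kb = parse_vmrss_kb(fh.read())
        except OSError:
            continue  # process exited between listdir and open, or is unreadable
        if rss_kb is not None:  # kernel threads have no VmRSS line
            rows.append((rss_kb, int(entry), name))
    return sorted(rows, reverse=True)


def log_top(path, top_n=10):
    """Append the top_n processes by resident memory to a log file."""
    stamp = time.strftime("%Y-%m-%d %H:%M:%S")
    with open(path, "a") as out:
        for rss_kb, pid, name in snapshot()[:top_n]:
            out.write(f"{stamp} pid={pid} name={name} rss_mb={rss_kb // 1024}\n")
```

Calling `log_top("/var/log/mem-by-process.log")` every few minutes would show whether Squid grows steadily over the 19 hours or whether another process spikes shortly before the OOM kill.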





> Why does my memory fill up every 19 hours?

>

> How much RAM do I need for the following Squid version?

>

> Squid Cache: Version 3.5.20

>

>               total       used       free     shared    buffers     cached

> Mem:         11845       2713       9132         14         71       1641

> -/+ buffers/cache:       1000      10845

> Swap:        25551        421      25130





The biggest problem I see here is that there are no units. These could

be KB or GB for all we know.





Please clarify that missing info.





FWIW: Squid with some minimal tuning runs fine on embedded devices with
32MB of RAM and has not had a memory leak in a long time. So it is
usually a matter of some feature being misconfigured. That said, there is an
issue with OpenSSL objects that can look like a memory leak in 3.x if
one is not careful.



To see if anything is misconfigured please post your squid.conf (without

the #commented out lines) so we can review it for problems.







Cheers

Amos
