[squid-users] squid hangs and dies and can not be killed - needs system reboot
Alex Rousskov
rousskov at measurement-factory.com
Tue Dec 19 15:16:17 UTC 2023
On 2023-12-18 22:29, Amish wrote:
> On 19/12/23 01:14, Alex Rousskov wrote:
>> On 2023-12-18 09:35, Amish wrote:
>>
>>> I use Arch Linux and today I updated squid from squid 5.7 to squid 6.6.
>>
>> > Dec 18 13:01:24 mumbai squid[604]: kick abandoning conn199
>>
>> I do not know whether the above problem is the primary problem in your
>> setup, but it is a red flag. Transactions on the same connection may
>> get stuck after that message; it is essentially a Squid bug.
>>
>> I am not sure at all, but this bug might be related to Bug 5187
>> workaround that went into Squid v6.2 (commit c44cfe7):
>> https://bugs.squid-cache.org/show_bug.cgi?id=5187
>>
>> Does Squid accept new TCP connections after it enters what you
>> describe as a dead state? For example, does "telnet 127.0.0.1 8080"
>> establishes a connection if executed on the same machine as Squid?
>>
> Yes it establishes connection. But I do not know what to do next.
This tells us that your Squid is still listening for incoming
connections. Most likely, it is not "dead" but running and just unable
to make progress with those connections (for yet unknown reasons). That
information is helpful but not sufficient (for me) to solve the problem
you are describing.
The next step that I would recommend is to collect debugging information
from the running process and share a pointer to the corresponding
compressed cache.log file:
* Ideally, start collection when Squid starts and reproduce the problem
while collecting full debugging information:
http://wiki.squid-cache.org/SquidFaq/BugReporting#full-debug-output
* If you have to, start collection after Squid is already in bad state
and just before you use telnet or browser to tickle Squid:
http://wiki.squid-cache.org/SquidFaq/BugReporting#debugging-a-single-transaction
Do not use any secret information (e.g., production certificate keys)
for these tests (unless you are going to share the logs privately with
those you trust).
Do not downgrade to v5 for these tests.
HTH,
Alex.
> Browser showed "Connection timed out" message. But I believe browser's
> also connected but nothing happened afterwards.
>
>>
>> > kill -9 does nothing
>>
>> Is it possible that you are trying to kill the wrong process? You
>> should be killing this process AFAICT:
>>
>> > root 601 0.0 0.2 73816 22528 ? Ss 12:59 0:02
>> > /usr/bin/squid -f /etc/squid/btnet/squid.btnet.conf --foreground -sYC
>
> I did not clarify but all processes needed SIGKILL and vanished except
> the Dead squid process which still remained.
>
> # systemctl stop squid
>
> Dec 19 08:46:38 mumbai systemd[1]: squid.service: State 'stop-sigterm'
> timed out. Killing.
> Dec 19 08:46:38 mumbai systemd[1]: squid.service: Killing process 601
> (squid) with signal SIGKILL.
> Dec 19 08:46:38 mumbai systemd[1]: squid.service: Killing process 604
> (squid) with signal SIGKILL.
> Dec 19 08:46:38 mumbai systemd[1]: squid.service: Killing process 607
> (security_file_c) with signal SIGKILL.
> Dec 19 08:46:38 mumbai systemd[1]: squid.service: Killing process 608
> (security_file_c) with signal SIGKILL.
> Dec 19 08:46:38 mumbai systemd[1]: squid.service: Killing process 609
> (security_file_c) with signal SIGKILL.
> Dec 19 08:46:38 mumbai systemd[1]: squid.service: Killing process 610
> (security_file_c) with signal SIGKILL.
> Dec 19 08:46:38 mumbai systemd[1]: squid.service: Killing process 611
> (security_file_c) with signal SIGKILL.
> Dec 19 08:46:38 mumbai systemd[1]: squid.service: Killing process 622
> (log_file_daemon) with signal SIGKILL.
> Dec 19 08:46:38 mumbai systemd[1]: squid.service: Main process exited,
> code=killed, status=9/KILL
> Dec 19 08:46:38 mumbai systemd[1]: squid.service: Killing process 604
> (squid) with signal SIGKILL.
>
> Waited for 2 minutes for squid to stop then pressed ctrl-c to systemctl
> stop squid command.
>
> As you can see in last line shows that attempt was made to kill DEAD
> process with PID 604.
>
> # ps aux |grep squid
> proxy 604 0.0 0.0 0 0 ? D Dec18 0:03 [squid]
>
> Now only DEAD squid process remains.
>
> What next? Should I downgrade to 5.9 and check?
>
> Regards
>
> Amish
>
>> Alex.
>>
>>
>>> After the update from 5.7 to 6.6, squid starts but then reaches Dead
>>> state in a minute or two.
>>>
>>> # ps aux | grep squid
>>> root 601 0.0 0.2 73816 22528 ? Ss 12:59 0:02
>>> /usr/bin/squid -f /etc/squid/btnet/squid.btnet.conf --foreground -sYC
>>> proxy 604 0.0 0.0 0 0 ? D 12:59 0:03 [squid]
>>> proxy 607 0.0 0.0 11976 7424 ? S 12:59 0:00
>>> (security_file_certgen) -s /var/cache/squid/ssl_db -M 4MB
>>> proxy 608 0.0 0.0 11976 7168 ? S 12:59 0:00
>>> (security_file_certgen) -s /var/cache/squid/ssl_db -M 4MB
>>> proxy 609 0.0 0.0 11712 5632 ? S 12:59 0:00
>>> (security_file_certgen) -s /var/cache/squid/ssl_db -M 4MB
>>> proxy 610 0.0 0.0 11712 5376 ? S 12:59 0:00
>>> (security_file_certgen) -s /var/cache/squid/ssl_db -M 4MB
>>> proxy 611 0.0 0.0 11712 5504 ? S 12:59 0:00
>>> (security_file_certgen) -s /var/cache/squid/ssl_db -M 4MB
>>> proxy 622 0.0 0.0 6116 3200 ? S 12:59 0:00
>>> (logfile-daemon) /var/log/squid/access.log
>>>
>>> And then all requests get stuck. Notice the D (dead) state of squid.
>>>
>>> I use multiple ports for multiple purposes. (It all worked fine in
>>> squid 5.7)
>>>
>>> Dec 18 12:59:10 mumbai squid[601]: Starting Authentication on port
>>> [::]:3128
>>> Dec 18 12:59:10 mumbai squid[601]: Disabling Authentication on port
>>> [::]:3128 (interception enabled)
>>> Dec 18 12:59:10 mumbai squid[601]: Starting Authentication on port
>>> [::]:8081
>>> Dec 18 12:59:10 mumbai squid[601]: Disabling Authentication on port
>>> [::]:8081 (interception enabled)
>>> Dec 18 12:59:12 mumbai squid[601]: Starting Authentication on port
>>> [::]:8082
>>> Dec 18 12:59:12 mumbai squid[601]: Disabling Authentication on port
>>> [::]:8082 (interception enabled)
>>> Dec 18 12:59:12 mumbai squid[601]: Starting Authentication on port
>>> [::]:8083
>>> Dec 18 12:59:12 mumbai squid[601]: Disabling Authentication on port
>>> [::]:8083 (interception enabled)
>>> Dec 18 12:59:13 mumbai squid[601]: Starting Authentication on port
>>> [::]:8084
>>> Dec 18 12:59:13 mumbai squid[601]: Disabling Authentication on port
>>> [::]:8084 (interception enabled)
>>> Dec 18 12:59:13 mumbai squid[601]: Starting Authentication on port
>>> [::]:3136
>>> Dec 18 12:59:13 mumbai squid[601]: Disabling Authentication on port
>>> [::]:3136 (interception enabled)
>>> Dec 18 12:59:13 mumbai squid[601]: Starting Authentication on port
>>> [::]:3137
>>> Dec 18 12:59:13 mumbai squid[601]: Disabling Authentication on port
>>> [::]:3137 (interception enabled)
>>> ...
>>> Dec 18 12:59:29 mumbai squid[604]: Adaptation support is on
>>> Dec 18 12:59:29 mumbai squid[604]: Accepting NAT intercepted HTTP
>>> Socket connections at conn19 local=[::]:3128 remote=[::] FD 27 flags=41
>>> listening port: 3128
>>> Dec 18 12:59:29 mumbai squid[604]: Accepting SSL bumped HTTP Socket
>>> connections at conn21 local=[::]:8080 remote=[::] FD 28 flags=9
>>> listening port: 8080
>>> Dec 18 12:59:29 mumbai squid[604]: Accepting NAT intercepted SSL
>>> bumped HTTPS Socket connections at conn23 local=[::]:8081 remote=[::]
>>> FD 29 flags=41
>>> listening port: 8081
>>> Dec 18 12:59:29 mumbai squid[604]: Accepting SSL bumped HTTP Socket
>>> connections at conn25 local=[::]:8092 remote=[::] FD 30 flags=9
>>> listening port: 8092
>>> Dec 18 12:59:29 mumbai systemd[1]: Started Squid Web Proxy Server.
>>> Dec 18 12:59:29 mumbai squid[604]: Accepting SSL bumped HTTP Socket
>>> connections at conn27 local=[::]:8093 remote=[::] FD 31 flags=9
>>> listening port: 8093
>>> Dec 18 12:59:29 mumbai squid[604]: Accepting SSL bumped HTTP Socket
>>> connections at conn29 local=[::]:8094 remote=[::] FD 32 flags=9
>>> listening port: 8094
>>> Dec 18 12:59:29 mumbai squid[604]: Accepting NAT intercepted SSL
>>> bumped HTTPS Socket connections at conn31 local=[::]:8082 remote=[::]
>>> FD 33 flags=41
>>> listening port: 8082
>>> Dec 18 12:59:29 mumbai squid[604]: Accepting NAT intercepted SSL
>>> bumped HTTPS Socket connections at conn33 local=[::]:8083 remote=[::]
>>> FD 34 flags=41
>>> listening port: 8083
>>> Dec 18 12:59:29 mumbai squid[604]: Accepting NAT intercepted SSL
>>> bumped HTTPS Socket connections at conn35 local=[::]:8084 remote=[::]
>>> FD 35 flags=41
>>> listening port: 8084
>>> Dec 18 12:59:29 mumbai squid[604]: Accepting NAT intercepted HTTP
>>> Socket connections at conn37 local=[::]:3136 remote=[::] FD 36 flags=41
>>> listening port: 3136
>>> Dec 18 12:59:29 mumbai squid[604]: Accepting NAT intercepted HTTP
>>> Socket connections at conn39 local=[::]:3137 remote=[::] FD 37 flags=41
>>> listening port: 3137
>>>
>>> And then following errors came:
>>>
>>>
>>> Dec 18 12:59:45 mumbai squid[604]: ERROR: failure while accepting a
>>> TLS connection on conn41 local=192.168.0.1:8080
>>> remote=192.168.0.111:53867 FD 12 flags=1: SQUID_TLS
>>> _ERR_ACCEPT+TLS_LIB_ERR=A000416+TLS_IO_ERR=1
>>> current master transaction:
>>> master53
>>> Dec 18 12:59:45 mumbai squid[604]: ERROR: failure while accepting a
>>> TLS connection on conn42 local=192.168.0.1:8080
>>> remote=192.168.0.111:53868 FD 14 flags=1: SQUID_TLS
>>> _ERR_ACCEPT+TLS_LIB_ERR=A000416+TLS_IO_ERR=1
>>> current master transaction:
>>> master53
>>> Dec 18 12:59:45 mumbai squid[604]: ERROR: failure while accepting a
>>> TLS connection on conn43 local=192.168.0.1:8080
>>> remote=192.168.0.111:53869 FD 16 flags=1: SQUID_TLS
>>> _ERR_ACCEPT+TLS_LIB_ERR=A000416+TLS_IO_ERR=1
>>> current master transaction:
>>> master57
>>> Dec 18 12:59:45 mumbai squid[604]: ERROR: failure while accepting a
>>> TLS connection on conn44 local=192.168.0.1:8080
>>> remote=192.168.0.111:53870 FD 12 flags=1: SQUID_TLS
>>> _ERR_ACCEPT+TLS_LIB_ERR=A000416+TLS_IO_ERR=1
>>> current master transaction:
>>> master57
>>> Dec 18 12:59:56 mumbai squid[604]: ERROR: failure while accepting a
>>> TLS connection on conn62 local=192.168.0.1:8080
>>> remote=192.168.0.111:53887 FD 12 flags=1: SQUID_TLS
>>> _ERR_ACCEPT+TLS_LIB_ERR=A000416+TLS_IO_ERR=1
>>> current master transaction:
>>> master95
>>> Dec 18 12:59:59 mumbai squid[604]: ERROR: failure while accepting a
>>> TLS connection on conn64 local=192.168.0.1:8080
>>> remote=192.168.0.111:53888 FD 12 flags=1: SQUID_TLS
>>> _ERR_ACCEPT+TLS_LIB_ERR=A000416+TLS_IO_ERR=1
>>> current master transaction:
>>> master99
>>> Dec 18 13:00:02 mumbai squid[604]: ERROR: failure while accepting a
>>> TLS connection on conn65 local=192.168.0.1:8080
>>> remote=192.168.0.178:56115 FD 12 flags=1: SQUID_TLS
>>> _ERR_ACCEPT+TLS_LIB_ERR=A000418+TLS_IO_ERR=1
>>> current master transaction:
>>> master53
>>> Dec 18 13:01:24 mumbai squid[604]: kick abandoning conn199
>>> local=192.168.0.1:8093 remote=192.168.0.101:52211 FD 52 flags=1
>>> connection: conn199
>>> local=192.168.0.1:8093 remote=192.168.0.101:52211 FD 52 flags=1
>>> Dec 18 13:01:45 mumbai squid[604]: ERROR: failure while accepting a
>>> TLS connection on conn240 local=192.168.0.1:8093
>>> remote=192.168.0.111:53931 FD 48 flags=1: SQUID_TL
>>> S_ERR_ACCEPT+TLS_LIB_ERR=A000416+TLS_IO_ERR=1
>>> current master transaction:
>>> master314
>>>
>>>
>>> After this point there is nothing in systemd journal (via: journalctl
>>> -f -u squid) and same lines are in cache.log.
>>>
>>> Squid got stuck (DEAD state) at 13:01 and right now it 19:26 (6 hours
>>> passed) and squid is still in dead state.
>>>
>>> kill -9 or kill -ALRM or -HUP also does nothing.
>>>
>>> So to restart squid - I will need to restart whole system.
>>>
>>> I have sslbump directives but it is not really applied.
>>>
>>> #NOTE: nosslbump_ips below contains 192.168.0.0/24 (whole LAN) so
>>> effectively there is no SSL bump after step1.
>>>
>>> acl nosslbump_ips src 192.168.0.0/24
>>> ssl_bump splice ssl_step1 nosslbump_ips
>>> ssl_bump peek ssl_step1
>>> ssl_bump splice nosslbump_domains
>>> ssl_bump stare sslbump_domains
>>> ssl_bump splice ssl_step2
>>> ssl_bump bump all
>>>
>>>
>>> Any idea? If anything changed from 5.7 to 6.6 that may cause this
>>> behaviour?
>>>
>>> Looking at changelog:
>>>
>>> Bug 5256: Intercepting port fails to accept
>>> https://bugs.squid-cache.org/show_bug.cgi?id=5256
>>>
>>> Bug 5154: Do not open IPv6 sockets when IPv6 is disabled
>>> https://bugs.squid-cache.org/show_bug.cgi?id=5154
>>>
>>> Not sure if above two bug FIXES (in between v5.7 to v6.6) are related
>>> to my issue.
>>>
>>> I ran netstat:
>>>
>>> # netstat -ntlp
>>> Active Internet connections (only servers)
>>> Proto Recv-Q Send-Q Local Address Foreign Address
>>> State PID/Program name
>>> ...
>>> tcp6 33 0 :::3137 :::* LISTEN -
>>> tcp6 0 0 :::3136 :::* LISTEN -
>>> tcp6 4 0 :::3128 :::* LISTEN -
>>> tcp6 0 0 :::8081 :::* LISTEN -
>>> tcp6 0 0 :::8080 :::* LISTEN -
>>> tcp6 0 0 :::8083 :::* LISTEN -
>>> tcp6 0 0 :::8082 :::* LISTEN -
>>> tcp6 0 0 :::8084 :::* LISTEN -
>>> tcp6 4097 0 :::8093 :::* LISTEN -
>>> tcp6 0 0 :::8092 :::* LISTEN -
>>> tcp6 0 0 :::8094 :::* LISTEN -
>>> ...
>>>
>>> I do not have IPv6 enabled, yet there are 33 and 4097 numbers in
>>> Recv-Q and also no process/PID owns these ports.
>>>
>>> Same IPv4 ports are not shown in use by netstat, so only IPv6 ports
>>> remain open, that too orphaned!
>>>
>>> So what is happening?
>>>
>>> Any idea to solve or any workaround?
>>>
>>> Thank you,
>>>
>>> Amish.
> _______________________________________________
> squid-users mailing list
> squid-users at lists.squid-cache.org
> https://lists.squid-cache.org/listinfo/squid-users
More information about the squid-users
mailing list