[squid-users] squid.service with Type=Notify is not always reliable (Arch Linux)
Amish
anon.amish at gmail.com
Tue Sep 1 14:27:08 UTC 2020
On 01/09/20 7:17 pm, Alex Rousskov wrote:
> On 9/1/20 2:32 AM, Amish wrote:
>
>> I have frequently observed an issue with squid.service but I am not able
>> to detect the real cause. As mostly it works but sometimes does not.
>> What happens is squid starts correctly, but systemd does not seem to be
>> getting the notification from squid that it has started.
>> Sep 01 06:40:04 foo systemd[1]: Starting Squid Web Proxy Server...
>> Sep 01 06:41:34 foo systemd[1]: squid.service: start operation timed out. Terminating.
>> Sep 01 06:42:06 foo systemd[1]: squid.service: Failed with result 'timeout'.
>> Sep 01 06:42:06 foo systemd[1]: Failed to start Squid Web Proxy Server.
> When did your Squid workers print their "Accepting ... connections at
> ..." message(s)? Those messages will give you the approximate time of
> sd_notify() calls made by Squid workers.
Accepting ... connections at ... message came almost immediately (in 1
second).
Sep 01 06:40:05 foo squid[8446]: Set Current Directory to /var/cache/squid
Sep 01 06:40:05 foo squid[8446]: Starting Squid Cache version 4.13 for
x86_64-pc-linux-gnu...
Sep 01 06:40:05 foo squid[8446]: Service Name: squid
Sep 01 06:40:05 foo squid[8446]: Process ID 8446
Sep 01 06:40:05 foo squid[8446]: Process Roles: worker
Sep 01 06:40:05 foo squid[8446]: With 16384 file descriptors available
Sep 01 06:40:05 foo squid[8446]: Initializing IP Cache...
Sep 01 06:40:05 foo squid[8446]: DNS Socket created at [::], FD 5
Sep 01 06:40:05 foo squid[8446]: DNS Socket created at 0.0.0.0, FD 10
Sep 01 06:40:05 foo squid[8446]: Adding domain localdomain from
/etc/resolv.conf
Sep 01 06:40:05 foo squid[8446]: Adding nameserver 127.0.0.1 from
/etc/resolv.conf
Sep 01 06:40:05 foo squid[8446]: helperOpenServers: Starting 5/32
'security_file_certgen' processes
Sep 01 06:40:05 foo squid[8446]: helperOpenServers: Starting 1/1 'prrdr'
processes
Sep 01 06:40:05 foo squid[8446]: helperOpenServers: Starting 1/1 'prusr'
processes
Sep 01 06:40:05 foo squid[8446]: Logfile: opening log
daemon:/var/log/squid/access.log
Sep 01 06:40:05 foo squid[8446]: Logfile Daemon: opening log
/var/log/squid/access.log
Sep 01 06:40:05 foo squid[8446]: Local cache digest enabled;
rebuild/rewrite every 3600/3600 sec
Sep 01 06:40:05 foo squid[8446]: Store logging disabled
Sep 01 06:40:05 foo squid[8446]: Swap maxSize 0 + 262144 KB, estimated
20164 objects
Sep 01 06:40:05 foo squid[8446]: Target number of buckets: 1008
Sep 01 06:40:05 foo squid[8446]: Using 8192 Store buckets
Sep 01 06:40:05 foo squid[8446]: Max Mem size: 262144 KB
Sep 01 06:40:05 foo squid[8446]: Max Swap size: 0 KB
Sep 01 06:40:05 foo squid[8446]: Using Least Load store dir selection
Sep 01 06:40:05 foo squid[8446]: Set Current Directory to /var/cache/squid
Sep 01 06:40:05 foo squid[8446]: Finished loading MIME types and icons.
Sep 01 06:40:05 foo squid[8446]: HTCP Disabled.
Sep 01 06:40:05 foo squid[8446]: Squid plugin modules loaded: 0
Sep 01 06:40:05 foo squid[8446]: Adaptation support is on
Sep 01 06:40:05 foo squid[8446]: Accepting SSL bumped HTTP Socket
connections at local=[::]:3128 remote=[::] FD 27 flags=9
Sep 01 06:40:06 foo squid[8446]: storeLateRelease: released 0 objects
Sep 01 06:40:06 foo squid[8446]: ERROR: negotiating TLS on FD 23:
error:1416F086:SSL routines:tls_process_server_certificate:certificate
verify failed (1/-1/0)
Sep 01 06:40:06 foo squid[8446]: ERROR: negotiating TLS on FD 26:
error:1416F086:SSL routines:tls_process_server_certificate:certificate
verify failed (1/-1/0)
Browser was able to access the websites after this point. Many such
lines and then
Sep 01 06:41:26 foo squid[8446]: parse URL too large (11871 bytes)
Sep 01 06:41:26 foo squid[8446]: ERROR: negotiating TLS on FD 96:
error:1416F086:SSL routines:tls_process_server_certificate:certificate
verify failed (1/-1/0)
Sep 01 06:41:27 foo squid[8446]: ERROR: negotiating TLS on FD 76:
error:1416F086:SSL routines:tls_process_server_certificate:certificate
verify failed (1/-1/0)
Sep 01 06:41:30 foo squid[8446]: Error negotiating SSL connection on FD
54: error:00000001:lib(0):func(0):reason(1) (1/-1)
Sep 01 06:41:32 foo squid[8446]: ERROR: negotiating TLS on FD 78:
error:1416F086:SSL routines:tls_process_server_certificate:certificate
verify failed (1/-1/0)
Sep 01 06:41:33 foo squid[8446]: ERROR: negotiating TLS on FD 90:
error:1416F086:SSL routines:tls_process_server_certificate:certificate
verify failed (1/-1/0)
Sep 01 06:41:33 foo squid[8446]: ERROR: negotiating TLS on FD 95:
error:1416F086:SSL routines:tls_process_server_certificate:certificate
verify failed (1/-1/0)
Sep 01 06:41:34 foo systemd[1]: squid.service: start operation timed
out. Terminating.
Sep 01 06:41:34 foo squid[8446]: Preparing for shutdown after 692 requests
Sep 01 06:41:34 foo squid[8446]: Waiting 30 seconds for active
connections to finish
Sep 01 06:41:34 foo squid[8446]: Closing HTTP(S) port [::]:3128
Sep 01 06:42:05 foo squid[8446]: Shutdown: NTLM authentication.
Sep 01 06:42:05 foo squid[8446]: Shutdown: Negotiate authentication.
Sep 01 06:42:05 foo squid[8446]: Shutdown: Digest authentication.
Sep 01 06:42:05 foo squid[8446]: Shutdown: Basic authentication.
Sep 01 06:42:06 foo squid[8446]: Shutting down...
Sep 01 06:42:06 foo squid[8446]: storeDirWriteCleanLogs: Starting...
Sep 01 06:42:06 foo squid[8446]: Finished. Wrote 0 entries.
Sep 01 06:42:06 foo squid[8446]: Took 0.00 seconds ( 0.00 entries/sec).
Sep 01 06:42:06 foo squid[8446]: Logfile: closing log
daemon:/var/log/squid/access.log
Sep 01 06:42:06 foo squid[8446]: Logfile Daemon: closing log
daemon:/var/log/squid/access.log
Sep 01 06:42:06 foo squid[8446]: Open FD READ/WRITE 11
security_file_certgen #1
Sep 01 06:42:06 foo squid[8446]: Open FD READ/WRITE 13
security_file_certgen #2
Sep 01 06:42:06 foo squid[8446]: Open FD UNSTARTED 15
security_file_certgen #3
Sep 01 06:42:06 foo squid[8446]: Open FD UNSTARTED 17
security_file_certgen #4
Sep 01 06:42:06 foo squid[8446]: Open FD UNSTARTED 19
security_file_certgen #5
Sep 01 06:42:06 foo squid[8446]: Open FD READ/WRITE 21 prrdr #1
Sep 01 06:42:06 foo squid[8446]: Open FD READ/WRITE 24 prusr #1
Sep 01 06:42:06 foo squid[8446]: Open FD UNSTARTED 25 IPC UNIX STREAM
Parent
Sep 01 06:42:06 foo squid[8446]: Squid Cache (Version 4.13): Exiting
normally.
Sep 01 06:42:06 foo squid[8444]: Squid Parent: squid-1 process 8446
exited with status 0
Sep 01 06:42:06 foo squid[8444]: Removing PID file (/run/squid.pid)
Sep 01 06:42:06 foo systemd[1]: squid.service: Failed with result 'timeout'.
Sep 01 06:42:06 foo systemd[1]: Failed to start Squid Web Proxy Server.
Not sure that parse URL line has anything to do with this bug as
sd_notify() was expected to happen long before that.
There are 3-4 other servers (all Arch linux and squid 4.13) which I
control, as of now they are working fine with Type=Notify.
>> Squid cache.log shows no error w.r.t. notify and seems to start correctly.
> AFAICT, Squid code does not report sd_notify() failure to notify iff
> that failure was due to an unset $NOTIFY_SOCKET variable. The first
> sd_notify() call in Squid also unsets $NOTIFY_SOCKET variable (in the
> calling worker). However, I cannot explain why that variable would be
> unset during the first sd_notify() call _sometimes_.
>
> Perhaps systemd is confused by concurrent notifications coming from
> multiple workers?
>
> Can you enable some kind of sd_notify() debugging that would show us
> what the first sd_notify() call was doing and when/whether systemd
> received the notification from Squid?
I have no idea how to do that. If you know then please do let me know.
I will try if I can. Server is production / live server, and without
knowing how to replicate this bug, its difficult for me to test. As I
can not have too much downtime for testing.
Thank you for your response,
Amish.
More information about the squid-users
mailing list