[squid-users] squid.service with Type=Notify is not always reliable (Arch Linux)

Amish anon.amish at gmail.com
Thu Sep 3 00:38:19 UTC 2020


On 02/09/20 8:05 pm, Amos Jeffries wrote:
> On 2/09/20 7:01 pm, Amish wrote:
>> On 01/09/20 8:31 pm, Alex Rousskov wrote:
>>> On 9/1/20 10:27 AM, Amish wrote:
>>>
>>>> Accepting ... connections at ...  message came almost immediately (in 1
>>>> second).
>>>> Sep 01 06:40:05 foo squid[8446]: Accepting SSL bumped HTTP Socket
>>>> connections at local=[::]:3128 remote=[::] FD 27 flags=9
>>> OK, so you are not using SMP Squid and, assuming your Squid build
>>> supports calling sd_notify(), sd_notify() was called. We need to figure
>>> out why it had no effect. Suggested next steps:
>>>
>>> 1. Adjust Squid to print the value of $NOTIFY_SOCKET together with the
>>> "Accepting..." line. This will confirm that the variable is set. It
>>> should be set. This debugging can be added without fear of producing too
>>> much info but it is unlikely to be insightful.
>>>
>>> 2. Add some kind of sd_notify() debugging that would show us what the
>>> first sd_notify() call was doing and when/whether systemd received the
>>> notification from Squid. I have not researched how to do that, but I am
>>> sure it is possible. I bet there are not enough notifications happening
>>> on your production server to cause problems, but you should practice on
>>> a lab server first, of course.
>> Ok. I will try above. But here is a note from "man sd_notify" about race
>> condition that MAY occur.
> You are guessing now. Please don't do that until every possible check
> has been done and narrowed the vast range of possibilities down.
>
> The purpose of the checks suggested by Alex was to identify whether a
> race is occuring or not.

For initial testing, I had already added debug message for step 1. And I 
tried replicate the problem at my end.

But I simply could not replicate the problem at my end. Debug output was 
on expected lines. (on my own machine)

Sep 03 05:11:54 amish squid[1086]: Accepting NAT intercepted HTTP Socket 
connections at local=[::]:3128 remote=[::] FD 33 flags=41
Sep 03 05:11:54 amish squid[1086]: NOTIFY_SOCKET before sd_notify() is 
/run/systemd/notify
Sep 03 05:11:54 amish systemd[1]: Started Squid Web Proxy Server.
Sep 03 05:11:54 amish squid[1086]: NOTIFY_SOCKET after sd_notify() is
Sep 03 05:11:54 amish squid[1086]: Accepting SSL bumped HTTP Socket 
connections at local=[::]:8080 remote=[::] FD 34 flags=9
Sep 03 05:11:54 amish squid[1086]: NOTIFY_SOCKET before sd_notify() is
Sep 03 05:11:54 amish squid[1086]: NOTIFY_SOCKET after sd_notify() is
Sep 03 05:11:54 amish squid[1086]: Accepting NAT intercepted SSL bumped 
HTTPS Socket connections at local=[::]:8081 remote=[::] FD 35 flags=41
Sep 03 05:11:54 amish squid[1086]: NOTIFY_SOCKET before sd_notify() is
Sep 03 05:11:54 amish squid[1086]: NOTIFY_SOCKET after sd_notify() is

For step 2, I first need to read systemd documentation about sd_notify() 
and thats when I came across mention of above race condition.

Race conditions are not easy to reproduce. That's when you look at the 
code and guess where the race can be, if there can be a race. Since 
sd_notify() documentation kind of insists on using sd_notify_barrier() 
and that does not exist in squid code. I thought that could be missing 
piece.

What is strange is I had observed that when there was this issue, no 
matter how many times I restart the squid, the problem came every single 
time. And then changing Type=forking worked. May be there is no race. 
But something else?

But now based on your and Alex replies, I will go ahead and find a way 
to add more debug to code for tracking systemd and squid notify 
communication. And then hope that issue occurs in my system.

Now I will post back about this, if at all I can do something and find 
something. :)

Thank you to both and regards,

Amish.



More information about the squid-users mailing list