[squid-users] IPv4 addresses go missing - markAsBad wrong?
Stephen Borrill
squid at borrill.org.uk
Mon Feb 19 13:48:59 UTC 2024
On 12/02/2024 12:02, Stephen Borrill wrote:
> On 12/02/2024 11:48, NgTech LTD wrote:
>> What distro are you using?
>
> NetBSD 9.2_STABLE and building with gcc 8.5.0
I will stick with 6.6 for the debugging due to the missing files in 6.7
(I'm one of the maintainers of the pkgsrc port for squid and we're not
going to adjust the port just for this one broken release tarball).
>> בתאריך יום ב׳, 12 בפבר׳ 2024, 13:47, מאת Stephen Borrill
>> <squid at borrill.org.uk <mailto:squid at borrill.org.uk>>:
>>
>> On 16/01/2024 14:37, Alex Rousskov wrote:
>> > On 2024-01-16 06:01, Stephen Borrill wrote:
>> >> The problem is no different with 6.6. Is there any more
>> debugging I
>> >> can provide, Alex?
>> >
>> > Yes, but I need to give you a patch that adds that (temporary)
>> debugging
>> > first (assuming I fail to reproduce the problem in the lab). The
>> ball is
>> > on my side (unless somebody else steps in). Unfortunately, I do
>> not have
>> > any free time for any of that right now. If you do not hear
>> from me
>> > sooner, please ping me again on or after February 8, 2024.
>>
>> PING!
>>
>> I will get 6.7 compiled up so we can add debugging to it quickly. It
>> would be good if we could get something in place this week as it is
>> school holidays next week in the UK and so there will be little
>> opportunity to test until afterwards.
>>
>> >> On 10/01/2024 12:40, Stephen Borrill wrote:
>> >>> On 09/01/2024 15:42, Alex Rousskov wrote:
>> >>>> On 2024-01-09 05:56, Stephen Borrill wrote:
>> >>>>> On 09/01/2024 09:51, Stephen Borrill wrote:
>> >>>>>> On 09/01/2024 03:41, Alex Rousskov wrote:
>> >>>>>>> On 2024-01-08 08:31, Stephen Borrill wrote:
>> >>>>>>>> I'm trying to determine why squid 6.x (seen with 6.5)
>> connected
>> >>>>>>>> via IPv4-only periodically fails to connect to the
>> destination
>> >>>>>>>> and then requires a restart to fix it (reload is not
>> sufficient).
>> >>>>>>>>
>> >>>>>>>> The problem appears to be that a host that has one address
>> each
>> >>>>>>>> of IPv4 and IPv6 occasionally has its IPv4 address go
>> missing as
>> >>>>>>>> a destination. On closer inspection, this appears to
>> happen when
>> >>>>>>>> the IPv6 address (not the IPv4) address is marked as bad.
>> >>>>
>> >>>>> ipcache.cc(990) have: [2001:4860:4802:32::78]:443 at 0 in
>> >>>>> 216.239.38.120 #1/2-0
>> >>>>
>> >>>>
>> >>>> Thank you for sharing more debugging info!
>> >>>
>> >>> The following seemed odd to. It finds an IPv4 address (this
>> host does
>> >>> not have IPv6), puts it in the cache and then says "No DNS
>> records":
>> >>>
>> >>> 2024/01/09 12:31:24.020 kid1| 14,4| ipcache.cc(617)
>> nbgethostbyname:
>> >>> schoolbase.online
>> >>> 2024/01/09 12:31:24.020 kid1| 14,3| ipcache.cc(313)
>> ipcacheRelease:
>> >>> ipcacheRelease: Releasing entry for 'schoolbase.online'
>> >>> 2024/01/09 12:31:24.020 kid1| 14,5| ipcache.cc(670)
>> >>> ipcache_nbgethostbyname_: ipcache_nbgethostbyname: MISS for
>> >>> 'schoolbase.online'
>> >>> 2024/01/09 12:31:24.020 kid1| 14,3| ipcache.cc(480)
>> ipcacheParse: 1
>> >>> answers for schoolbase.online
>> >>> 2024/01/09 12:31:24.020 kid1| 14,7| ipcache.cc(995) have: no
>> >>> 20.54.32.34 in [no cached IPs]
>> >>> 2024/01/09 12:31:24.020 kid1| 14,7| ipcache.cc(995) have: no
>> >>> 20.54.32.34 in [no cached IPs]
>> >>> 2024/01/09 12:31:24.020 kid1| 14,5| ipcache.cc(549)
>> updateTtl: use
>> >>> first 69 from RR TTL 69
>> >>> 2024/01/09 12:31:24.020 kid1| 14,3| ipcache.cc(535) addGood:
>> >>> schoolbase.online #1 20.54.32.34
>> >>> 2024/01/09 12:31:24.020 kid1| 14,7| ipcache.cc(253) forwardIp:
>> >>> 20.54.32.34
>> >>> 2024/01/09 12:31:24.020 kid1| 44,2| peer_select.cc(1174)
>> handlePath:
>> >>> PeerSelector72389 found conn564274 local=0.0.0.0
>> >>> remote=20.54.32.34:443 <http://20.54.32.34:443> HIER_DIRECT
>> flags=1, destination #1 for
>> >>> schoolbase.online:443
>> >>> 2024/01/09 12:31:24.020 kid1| 14,3| ipcache.cc(459) latestError:
>> >>> ERROR: DNS failure while resolving schoolbase.online: No DNS
>> records
>> >>> 2024/01/09 12:31:24.020 kid1| 14,3| ipcache.cc(586)
>> >>> ipcacheHandleReply: done with schoolbase.online: 20.54.32.34
>> #1/1-0
>> >>> 2024/01/09 12:31:24.020 kid1| 14,7| ipcache.cc(236)
>> finalCallback:
>> >>> 0x1b7381f38 lookup_err=No DNS records
>> >>>
>> >>> It seemed to happen about the same time as the other failure, so
>> >>> perhaps another symptom of the same.
>> >>>
>> >>>> The above log line is self-contradictory AFAICT: It says
>> that the
>> >>>> cache has both IPv6-looking and IPv4-looking address at the
>> same
>> >>>> cache position (0) and, judging by the corresponding code,
>> those two
>> >>>> IP addresses are equal. This is not possible (for those
>> specific IP
>> >>>> address values). The subsequent Squid behavior can be
>> explained by
>> >>>> this (unexplained) conflict.
>> >>>>
>> >>>> I assume you are running official Squid v6.5 code.
>> >>>
>> >>> Yes, compiled from source on NetBSD. I have the patch I refer
>> to here
>> >>> applied too:
>> >>>
>>
>> https://lists.squid-cache.org/pipermail/squid-users/2023-November/026279.html <https://lists.squid-cache.org/pipermail/squid-users/2023-November/026279.html>
>> >>>
>> >>>> I can suggest the following two steps for going forward:
>> >>>>
>> >>>> 1. Upgrade to the latest Squid v6 in hope that the problem
>> goes away.
>> >>>
>> >>> I have just upgraded to 6.6.
>> >>>
>> >>>> 2. If the problem is still there, patch the latest Squid v6 to
>> add
>> >>>> more debugging in hope to explain what is going on. This may
>> take a
>> >>>> few iterations, and it will take me some time to produce the
>> >>>> necessary debugging patch.
>> >>>
>> >>> Unfortunately, I don't have a test case that will cause the
>> problem
>> >>> so I need to run this at a customer's production site that is
>> >>> particularly affected by it. Luckily, the problem recurs pretty
>> quickly.
>> >>>
>> >>> Here's a run with 6.6 where the number of destinations drops
>> from 2
>> >>> to 1 before reverting. Not seen this before - usually once it
>> has
>> >>> dropped to 1 (the IPv6 address), it stays there until a restart
>> (and
>> >>> this did happen about a minute after this log fragment).
>> Happy to
>> >>> test out any debugging patch.
>> >>>
>> >>> 2024/01/10 11:55:49.849 kid1| 14,4| ipcache.cc(617)
>> nbgethostbyname:
>> >>> forcesafesearch.google.com <http://forcesafesearch.google.com>
>> >>> 2024/01/10 11:55:49.849 kid1| 14,3| Address.cc(389)
>> lookupHostIP:
>> >>> Given Non-IP 'forcesafesearch.google.com
>> <http://forcesafesearch.google.com>': hostname or servname not
>> >>> provided or not known
>> >>> 2024/01/10 11:55:49.849 kid1| 14,4| ipcache.cc(657)
>> >>> ipcache_nbgethostbyname_: ipcache_nbgethostbyname: HIT for
>> >>> 'forcesafesearch.google.com <http://forcesafesearch.google.com>'
>> >>> 2024/01/10 11:55:49.849 kid1| 14,7| ipcache.cc(253) forwardIp:
>> >>> [2001:4860:4802:32::78]
>> >>> 2024/01/10 11:55:49.849 kid1| 44,2| peer_select.cc(1174)
>> handlePath:
>> >>> PeerSelector300176 found conn2388484 local=[::]
>> >>> remote=[2001:4860:4802:32::78]:443 HIER_DIRECT flags=1,
>> destination
>> >>> #1 for forcesafesearch.google.com:443
>> <http://forcesafesearch.google.com:443>
>> >>> 2024/01/10 11:55:49.849 kid1| 44,2| peer_select.cc(1180)
>> handlePath:
>> >>> always_direct = ALLOWED
>> >>> 2024/01/10 11:55:49.849 kid1| 44,2| peer_select.cc(1181)
>> handlePath:
>> >>> never_direct = DENIED
>> >>> 2024/01/10 11:55:49.849 kid1| 44,2| peer_select.cc(1182)
>> handlePath:
>> >>> timedout = 0
>> >>> 2024/01/10 11:55:49.849 kid1| 14,7| ipcache.cc(253) forwardIp:
>> >>> 216.239.38.120
>> >>> 2024/01/10 11:55:49.849 kid1| 44,2| peer_select.cc(1174)
>> handlePath:
>> >>> PeerSelector300176 found conn2388485 local=0.0.0.0
>> >>> remote=216.239.38.120:443 <http://216.239.38.120:443>
>> HIER_DIRECT flags=1, destination #2 for
>> >>> forcesafesearch.google.com:443
>> <http://forcesafesearch.google.com:443>
>> >>> 2024/01/10 11:55:49.849 kid1| 44,2| peer_select.cc(1180)
>> handlePath:
>> >>> always_direct = ALLOWED
>> >>> 2024/01/10 11:55:49.849 kid1| 44,2| peer_select.cc(1181)
>> handlePath:
>> >>> never_direct = DENIED
>> >>> 2024/01/10 11:55:49.849 kid1| 44,2| peer_select.cc(1182)
>> handlePath:
>> >>> timedout = 0
>> >>> 2024/01/10 11:55:49.849 kid1| 14,7| ipcache.cc(236)
>> finalCallback:
>> >>> 0x12208e038
>> >>> 2024/01/10 11:55:49.849 kid1| 44,2| peer_select.cc(479)
>> >>> resolveSelected: PeerSelector300176 found all 2 destinations for
>> >>> forcesafesearch.google.com:443
>> <http://forcesafesearch.google.com:443>
>> >>> 2024/01/10 11:55:49.849 kid1| 44,2| peer_select.cc(480)
>> >>> resolveSelected: always_direct = ALLOWED
>> >>> 2024/01/10 11:55:49.849 kid1| 44,2| peer_select.cc(481)
>> >>> resolveSelected: never_direct = DENIED
>> >>> 2024/01/10 11:55:49.849 kid1| 44,2| peer_select.cc(482)
>> >>> resolveSelected: timedout = 0
>> >>> 2024/01/10 11:55:49.849 kid1| 14,7| ipcache.cc(990) have:
>> >>> [2001:4860:4802:32::78]:443 at 0 in [2001:4860:4802:32::78]
>> #2/2-0
>> >>> 2024/01/10 11:55:49.849 kid1| 14,2| ipcache.cc(1031) markAsBad:
>> >>> [2001:4860:4802:32::78]:443 of forcesafesearch.google.com
>> <http://forcesafesearch.google.com>
>> >>> 2024/01/10 11:55:49.855 kid1| 14,7| ipcache.cc(990) have:
>> >>> 216.239.38.120:443 <http://216.239.38.120:443> at 0 in
>> [2001:4860:4802:32::78] #2/2-1
>> >>> 2024/01/10 11:55:49.855 kid1| 14,2| ipcache.cc(1055)
>> forgetMarking:
>> >>> 216.239.38.120:443 <http://216.239.38.120:443> of
>> forcesafesearch.google.com <http://forcesafesearch.google.com>
>> >>> 2024/01/10 11:55:49.877 kid1| 14,3| Address.cc(389)
>> lookupHostIP:
>> >>> Given Non-IP 'forcesafesearch.google.com
>> <http://forcesafesearch.google.com>': hostname or servname not
>> >>> provided or not known
>> >>> 2024/01/10 11:55:49.877 kid1| 44,2| peer_select.cc(460)
>> >>> resolveSelected: Find IP destination for:
>> >>> forcesafesearch.google.com:443
>> <http://forcesafesearch.google.com:443>' via
>> forcesafesearch.google.com <http://forcesafesearch.google.com>
>> >>> 2024/01/10 11:55:49.877 kid1| 14,4| ipcache.cc(617)
>> nbgethostbyname:
>> >>> forcesafesearch.google.com <http://forcesafesearch.google.com>
>> >>> 2024/01/10 11:55:49.877 kid1| 14,3| Address.cc(389)
>> lookupHostIP:
>> >>> Given Non-IP 'forcesafesearch.google.com
>> <http://forcesafesearch.google.com>': hostname or servname not
>> >>> provided or not known
>> >>> 2024/01/10 11:55:49.877 kid1| 14,4| ipcache.cc(657)
>> >>> ipcache_nbgethostbyname_: ipcache_nbgethostbyname: HIT for
>> >>> 'forcesafesearch.google.com <http://forcesafesearch.google.com>'
>> >>> 2024/01/10 11:55:49.877 kid1| 14,7| ipcache.cc(253) forwardIp:
>> >>> [2001:4860:4802:32::78]
>> >>> 2024/01/10 11:55:49.877 kid1| 44,2| peer_select.cc(1174)
>> handlePath:
>> >>> PeerSelector300177 found conn2388493 local=[::]
>> >>> remote=[2001:4860:4802:32::78]:443 HIER_DIRECT flags=1,
>> destination
>> >>> #1 for forcesafesearch.google.com:443
>> <http://forcesafesearch.google.com:443>
>> >>> 2024/01/10 11:55:49.877 kid1| 44,2| peer_select.cc(1180)
>> handlePath:
>> >>> always_direct = ALLOWED
>> >>> 2024/01/10 11:55:49.877 kid1| 44,2| peer_select.cc(1181)
>> handlePath:
>> >>> never_direct = DENIED
>> >>> 2024/01/10 11:55:49.877 kid1| 44,2| peer_select.cc(1182)
>> handlePath:
>> >>> timedout = 0
>> >>> 2024/01/10 11:55:49.877 kid1| 14,7| ipcache.cc(253) forwardIp:
>> >>> 216.239.38.120
>> >>> 2024/01/10 11:55:49.877 kid1| 44,2| peer_select.cc(1174)
>> handlePath:
>> >>> PeerSelector300177 found conn2388494 local=0.0.0.0
>> >>> remote=216.239.38.120:443 <http://216.239.38.120:443>
>> HIER_DIRECT flags=1, destination #2 for
>> >>> forcesafesearch.google.com:443
>> <http://forcesafesearch.google.com:443>
>> >>> 2024/01/10 11:55:49.877 kid1| 44,2| peer_select.cc(1180)
>> handlePath:
>> >>> always_direct = ALLOWED
>> >>> 2024/01/10 11:55:49.877 kid1| 44,2| peer_select.cc(1181)
>> handlePath:
>> >>> never_direct = DENIED
>> >>> 2024/01/10 11:55:49.877 kid1| 44,2| peer_select.cc(1182)
>> handlePath:
>> >>> timedout = 0
>> >>> 2024/01/10 11:55:49.877 kid1| 14,7| ipcache.cc(236)
>> finalCallback:
>> >>> 0x12208e038
>> >>> 2024/01/10 11:55:49.877 kid1| 44,2| peer_select.cc(479)
>> >>> resolveSelected: PeerSelector300177 found all 2 destinations for
>> >>> forcesafesearch.google.com:443
>> <http://forcesafesearch.google.com:443>
>> >>> 2024/01/10 11:55:49.877 kid1| 44,2| peer_select.cc(480)
>> >>> resolveSelected: always_direct = ALLOWED
>> >>> 2024/01/10 11:55:49.877 kid1| 44,2| peer_select.cc(481)
>> >>> resolveSelected: never_direct = DENIED
>> >>> 2024/01/10 11:55:49.877 kid1| 44,2| peer_select.cc(482)
>> >>> resolveSelected: timedout = 0
>> >>> 2024/01/10 11:55:49.877 kid1| 14,7| ipcache.cc(990) have:
>> >>> [2001:4860:4802:32::78]:443 at 0 in [2001:4860:4802:32::78]
>> #2/2-0
>> >>> 2024/01/10 11:55:49.877 kid1| 14,2| ipcache.cc(1031) markAsBad:
>> >>> [2001:4860:4802:32::78]:443 of forcesafesearch.google.com
>> <http://forcesafesearch.google.com>
>> >>> 2024/01/10 11:55:49.882 kid1| 14,7| ipcache.cc(990) have:
>> >>> 216.239.38.120:443 <http://216.239.38.120:443> at 0 in
>> [2001:4860:4802:32::78] #2/2-1
>> >>> 2024/01/10 11:55:49.882 kid1| 14,2| ipcache.cc(1055)
>> forgetMarking:
>> >>> 216.239.38.120:443 <http://216.239.38.120:443> of
>> forcesafesearch.google.com <http://forcesafesearch.google.com>
>> >>>
>> >>
>> >> _______________________________________________
>> >> squid-users mailing list
>> >> squid-users at lists.squid-cache.org
>> <mailto:squid-users at lists.squid-cache.org>
>> >> https://lists.squid-cache.org/listinfo/squid-users
>> <https://lists.squid-cache.org/listinfo/squid-users>
>> >
>> > _______________________________________________
>> > squid-users mailing list
>> > squid-users at lists.squid-cache.org
>> <mailto:squid-users at lists.squid-cache.org>
>> > https://lists.squid-cache.org/listinfo/squid-users
>> <https://lists.squid-cache.org/listinfo/squid-users>
>> >
>>
>> -- Dr. Stephen Borrill, Director and Solutions Architect
>> Precedence Technologies Ltd T: +44 (0) 1223 359 900
>> Technology House, 36a Union Lane E:
>> sborrill at precedence.co.uk <mailto:sborrill at precedence.co.uk>
>> Cambridge, CB4 1QB, United Kingdom W:
>> http://www.precedence.co.uk/ <http://www.precedence.co.uk/>
>> Limited company registered in England and Wales. Company number
>> 3725626
>>
>> _______________________________________________
>> squid-users mailing list
>> squid-users at lists.squid-cache.org
>> <mailto:squid-users at lists.squid-cache.org>
>> https://lists.squid-cache.org/listinfo/squid-users
>> <https://lists.squid-cache.org/listinfo/squid-users>
>>
>
More information about the squid-users
mailing list