[squid-users] IPv4 addresses go missing - markAsBad wrong?

Stephen Borrill squid at borrill.org.uk
Mon Feb 12 12:02:02 UTC 2024


On 12/02/2024 11:48, NgTech LTD wrote:
> What distro are you using?

NetBSD 9.2_STABLE and building with gcc 8.5.0

> בתאריך יום ב׳, 12 בפבר׳ 2024, 13:47, מאת Stephen Borrill 
> ‏<squid at borrill.org.uk <mailto:squid at borrill.org.uk>>:
> 
>     On 16/01/2024 14:37, Alex Rousskov wrote:
>      > On 2024-01-16 06:01, Stephen Borrill wrote:
>      >> The problem is no different with 6.6. Is there any more debugging I
>      >> can provide, Alex?
>      >
>      > Yes, but I need to give you a patch that adds that (temporary)
>     debugging
>      > first (assuming I fail to reproduce the problem in the lab). The
>     ball is
>      > on my side (unless somebody else steps in). Unfortunately, I do
>     not have
>      > any free time for any of that right now. If you do not hear from me
>      > sooner, please ping me again on or after February 8, 2024.
> 
>     PING!
> 
>     I will get 6.7 compiled up so we can add debugging to it quickly. It
>     would be good if we could get something in place this week as it is
>     school holidays next week in the UK and so there will be little
>     opportunity to test until afterwards.
> 
>      >> On 10/01/2024 12:40, Stephen Borrill wrote:
>      >>> On 09/01/2024 15:42, Alex Rousskov wrote:
>      >>>> On 2024-01-09 05:56, Stephen Borrill wrote:
>      >>>>> On 09/01/2024 09:51, Stephen Borrill wrote:
>      >>>>>> On 09/01/2024 03:41, Alex Rousskov wrote:
>      >>>>>>> On 2024-01-08 08:31, Stephen Borrill wrote:
>      >>>>>>>> I'm trying to determine why squid 6.x (seen with 6.5)
>     connected
>      >>>>>>>> via IPv4-only periodically fails to connect to the
>     destination
>      >>>>>>>> and then requires a restart to fix it (reload is not
>     sufficient).
>      >>>>>>>>
>      >>>>>>>> The problem appears to be that a host that has one address
>     each
>      >>>>>>>> of IPv4 and IPv6 occasionally has its IPv4 address go
>     missing as
>      >>>>>>>> a destination. On closer inspection, this appears to
>     happen when
>      >>>>>>>> the IPv6 address (not the IPv4) address is marked as bad.
>      >>>>
>      >>>>> ipcache.cc(990) have: [2001:4860:4802:32::78]:443 at 0 in
>      >>>>> 216.239.38.120 #1/2-0
>      >>>>
>      >>>>
>      >>>> Thank you for sharing more debugging info!
>      >>>
>      >>> The following seemed odd to. It finds an IPv4 address (this
>     host does
>      >>> not have IPv6), puts it in the cache and then says "No DNS
>     records":
>      >>>
>      >>> 2024/01/09 12:31:24.020 kid1| 14,4| ipcache.cc(617)
>     nbgethostbyname:
>      >>> schoolbase.online
>      >>> 2024/01/09 12:31:24.020 kid1| 14,3| ipcache.cc(313)
>     ipcacheRelease:
>      >>> ipcacheRelease: Releasing entry for 'schoolbase.online'
>      >>> 2024/01/09 12:31:24.020 kid1| 14,5| ipcache.cc(670)
>      >>> ipcache_nbgethostbyname_: ipcache_nbgethostbyname: MISS for
>      >>> 'schoolbase.online'
>      >>> 2024/01/09 12:31:24.020 kid1| 14,3| ipcache.cc(480)
>     ipcacheParse: 1
>      >>> answers for schoolbase.online
>      >>> 2024/01/09 12:31:24.020 kid1| 14,7| ipcache.cc(995) have:  no
>      >>> 20.54.32.34 in [no cached IPs]
>      >>> 2024/01/09 12:31:24.020 kid1| 14,7| ipcache.cc(995) have:  no
>      >>> 20.54.32.34 in [no cached IPs]
>      >>> 2024/01/09 12:31:24.020 kid1| 14,5| ipcache.cc(549) updateTtl: use
>      >>> first 69 from RR TTL 69
>      >>> 2024/01/09 12:31:24.020 kid1| 14,3| ipcache.cc(535) addGood:
>      >>> schoolbase.online #1 20.54.32.34
>      >>> 2024/01/09 12:31:24.020 kid1| 14,7| ipcache.cc(253) forwardIp:
>      >>> 20.54.32.34
>      >>> 2024/01/09 12:31:24.020 kid1| 44,2| peer_select.cc(1174)
>     handlePath:
>      >>> PeerSelector72389 found conn564274 local=0.0.0.0
>      >>> remote=20.54.32.34:443 <http://20.54.32.34:443> HIER_DIRECT
>     flags=1, destination #1 for
>      >>> schoolbase.online:443
>      >>> 2024/01/09 12:31:24.020 kid1| 14,3| ipcache.cc(459) latestError:
>      >>> ERROR: DNS failure while resolving schoolbase.online: No DNS
>     records
>      >>> 2024/01/09 12:31:24.020 kid1| 14,3| ipcache.cc(586)
>      >>> ipcacheHandleReply: done with schoolbase.online: 20.54.32.34 #1/1-0
>      >>> 2024/01/09 12:31:24.020 kid1| 14,7| ipcache.cc(236) finalCallback:
>      >>> 0x1b7381f38  lookup_err=No DNS records
>      >>>
>      >>> It seemed to happen about the same time as the other failure, so
>      >>> perhaps another symptom of the same.
>      >>>
>      >>>> The above log line is self-contradictory AFAICT: It says that the
>      >>>> cache has both IPv6-looking and IPv4-looking address at the same
>      >>>> cache position (0) and, judging by the corresponding code,
>     those two
>      >>>> IP addresses are equal. This is not possible (for those
>     specific IP
>      >>>> address values). The subsequent Squid behavior can be
>     explained by
>      >>>> this (unexplained) conflict.
>      >>>>
>      >>>> I assume you are running official Squid v6.5 code.
>      >>>
>      >>> Yes, compiled from source on NetBSD. I have the patch I refer
>     to here
>      >>> applied too:
>      >>>
>     https://lists.squid-cache.org/pipermail/squid-users/2023-November/026279.html <https://lists.squid-cache.org/pipermail/squid-users/2023-November/026279.html>
>      >>>
>      >>>> I can suggest the following two steps for going forward:
>      >>>>
>      >>>> 1. Upgrade to the latest Squid v6 in hope that the problem
>     goes away.
>      >>>
>      >>> I have just upgraded to 6.6.
>      >>>
>      >>>> 2. If the problem is still there, patch the latest Squid v6 to
>     add
>      >>>> more debugging in hope to explain what is going on. This may
>     take a
>      >>>> few iterations, and it will take me some time to produce the
>      >>>> necessary debugging patch.
>      >>>
>      >>> Unfortunately, I don't have a test case that will cause the
>     problem
>      >>> so I need to run this at a customer's production site that is
>      >>> particularly affected by it. Luckily, the problem recurs pretty
>     quickly.
>      >>>
>      >>> Here's a run with 6.6 where the number of destinations drops
>     from 2
>      >>> to 1 before reverting. Not seen this before - usually once it has
>      >>> dropped to 1 (the IPv6 address), it stays there until a restart
>     (and
>      >>> this did happen about a minute after this log fragment). Happy to
>      >>> test out any debugging patch.
>      >>>
>      >>> 2024/01/10 11:55:49.849 kid1| 14,4| ipcache.cc(617)
>     nbgethostbyname:
>      >>> forcesafesearch.google.com <http://forcesafesearch.google.com>
>      >>> 2024/01/10 11:55:49.849 kid1| 14,3| Address.cc(389) lookupHostIP:
>      >>> Given Non-IP 'forcesafesearch.google.com
>     <http://forcesafesearch.google.com>': hostname or servname not
>      >>> provided or not known
>      >>> 2024/01/10 11:55:49.849 kid1| 14,4| ipcache.cc(657)
>      >>> ipcache_nbgethostbyname_: ipcache_nbgethostbyname: HIT for
>      >>> 'forcesafesearch.google.com <http://forcesafesearch.google.com>'
>      >>> 2024/01/10 11:55:49.849 kid1| 14,7| ipcache.cc(253) forwardIp:
>      >>> [2001:4860:4802:32::78]
>      >>> 2024/01/10 11:55:49.849 kid1| 44,2| peer_select.cc(1174)
>     handlePath:
>      >>> PeerSelector300176 found conn2388484 local=[::]
>      >>> remote=[2001:4860:4802:32::78]:443 HIER_DIRECT flags=1,
>     destination
>      >>> #1 for forcesafesearch.google.com:443
>     <http://forcesafesearch.google.com:443>
>      >>> 2024/01/10 11:55:49.849 kid1| 44,2| peer_select.cc(1180)
>     handlePath:
>      >>> always_direct = ALLOWED
>      >>> 2024/01/10 11:55:49.849 kid1| 44,2| peer_select.cc(1181)
>     handlePath:
>      >>> never_direct = DENIED
>      >>> 2024/01/10 11:55:49.849 kid1| 44,2| peer_select.cc(1182)
>     handlePath:
>      >>> timedout = 0
>      >>> 2024/01/10 11:55:49.849 kid1| 14,7| ipcache.cc(253) forwardIp:
>      >>> 216.239.38.120
>      >>> 2024/01/10 11:55:49.849 kid1| 44,2| peer_select.cc(1174)
>     handlePath:
>      >>> PeerSelector300176 found conn2388485 local=0.0.0.0
>      >>> remote=216.239.38.120:443 <http://216.239.38.120:443>
>     HIER_DIRECT flags=1, destination #2 for
>      >>> forcesafesearch.google.com:443
>     <http://forcesafesearch.google.com:443>
>      >>> 2024/01/10 11:55:49.849 kid1| 44,2| peer_select.cc(1180)
>     handlePath:
>      >>> always_direct = ALLOWED
>      >>> 2024/01/10 11:55:49.849 kid1| 44,2| peer_select.cc(1181)
>     handlePath:
>      >>> never_direct = DENIED
>      >>> 2024/01/10 11:55:49.849 kid1| 44,2| peer_select.cc(1182)
>     handlePath:
>      >>> timedout = 0
>      >>> 2024/01/10 11:55:49.849 kid1| 14,7| ipcache.cc(236) finalCallback:
>      >>> 0x12208e038
>      >>> 2024/01/10 11:55:49.849 kid1| 44,2| peer_select.cc(479)
>      >>> resolveSelected: PeerSelector300176 found all 2 destinations for
>      >>> forcesafesearch.google.com:443
>     <http://forcesafesearch.google.com:443>
>      >>> 2024/01/10 11:55:49.849 kid1| 44,2| peer_select.cc(480)
>      >>> resolveSelected:    always_direct = ALLOWED
>      >>> 2024/01/10 11:55:49.849 kid1| 44,2| peer_select.cc(481)
>      >>> resolveSelected:     never_direct = DENIED
>      >>> 2024/01/10 11:55:49.849 kid1| 44,2| peer_select.cc(482)
>      >>> resolveSelected:         timedout = 0
>      >>> 2024/01/10 11:55:49.849 kid1| 14,7| ipcache.cc(990) have:
>      >>> [2001:4860:4802:32::78]:443 at 0 in [2001:4860:4802:32::78] #2/2-0
>      >>> 2024/01/10 11:55:49.849 kid1| 14,2| ipcache.cc(1031) markAsBad:
>      >>> [2001:4860:4802:32::78]:443 of forcesafesearch.google.com
>     <http://forcesafesearch.google.com>
>      >>> 2024/01/10 11:55:49.855 kid1| 14,7| ipcache.cc(990) have:
>      >>> 216.239.38.120:443 <http://216.239.38.120:443> at 0 in
>     [2001:4860:4802:32::78] #2/2-1
>      >>> 2024/01/10 11:55:49.855 kid1| 14,2| ipcache.cc(1055)
>     forgetMarking:
>      >>> 216.239.38.120:443 <http://216.239.38.120:443> of
>     forcesafesearch.google.com <http://forcesafesearch.google.com>
>      >>> 2024/01/10 11:55:49.877 kid1| 14,3| Address.cc(389) lookupHostIP:
>      >>> Given Non-IP 'forcesafesearch.google.com
>     <http://forcesafesearch.google.com>': hostname or servname not
>      >>> provided or not known
>      >>> 2024/01/10 11:55:49.877 kid1| 44,2| peer_select.cc(460)
>      >>> resolveSelected: Find IP destination for:
>      >>> forcesafesearch.google.com:443
>     <http://forcesafesearch.google.com:443>' via
>     forcesafesearch.google.com <http://forcesafesearch.google.com>
>      >>> 2024/01/10 11:55:49.877 kid1| 14,4| ipcache.cc(617)
>     nbgethostbyname:
>      >>> forcesafesearch.google.com <http://forcesafesearch.google.com>
>      >>> 2024/01/10 11:55:49.877 kid1| 14,3| Address.cc(389) lookupHostIP:
>      >>> Given Non-IP 'forcesafesearch.google.com
>     <http://forcesafesearch.google.com>': hostname or servname not
>      >>> provided or not known
>      >>> 2024/01/10 11:55:49.877 kid1| 14,4| ipcache.cc(657)
>      >>> ipcache_nbgethostbyname_: ipcache_nbgethostbyname: HIT for
>      >>> 'forcesafesearch.google.com <http://forcesafesearch.google.com>'
>      >>> 2024/01/10 11:55:49.877 kid1| 14,7| ipcache.cc(253) forwardIp:
>      >>> [2001:4860:4802:32::78]
>      >>> 2024/01/10 11:55:49.877 kid1| 44,2| peer_select.cc(1174)
>     handlePath:
>      >>> PeerSelector300177 found conn2388493 local=[::]
>      >>> remote=[2001:4860:4802:32::78]:443 HIER_DIRECT flags=1,
>     destination
>      >>> #1 for forcesafesearch.google.com:443
>     <http://forcesafesearch.google.com:443>
>      >>> 2024/01/10 11:55:49.877 kid1| 44,2| peer_select.cc(1180)
>     handlePath:
>      >>> always_direct = ALLOWED
>      >>> 2024/01/10 11:55:49.877 kid1| 44,2| peer_select.cc(1181)
>     handlePath:
>      >>> never_direct = DENIED
>      >>> 2024/01/10 11:55:49.877 kid1| 44,2| peer_select.cc(1182)
>     handlePath:
>      >>> timedout = 0
>      >>> 2024/01/10 11:55:49.877 kid1| 14,7| ipcache.cc(253) forwardIp:
>      >>> 216.239.38.120
>      >>> 2024/01/10 11:55:49.877 kid1| 44,2| peer_select.cc(1174)
>     handlePath:
>      >>> PeerSelector300177 found conn2388494 local=0.0.0.0
>      >>> remote=216.239.38.120:443 <http://216.239.38.120:443>
>     HIER_DIRECT flags=1, destination #2 for
>      >>> forcesafesearch.google.com:443
>     <http://forcesafesearch.google.com:443>
>      >>> 2024/01/10 11:55:49.877 kid1| 44,2| peer_select.cc(1180)
>     handlePath:
>      >>> always_direct = ALLOWED
>      >>> 2024/01/10 11:55:49.877 kid1| 44,2| peer_select.cc(1181)
>     handlePath:
>      >>> never_direct = DENIED
>      >>> 2024/01/10 11:55:49.877 kid1| 44,2| peer_select.cc(1182)
>     handlePath:
>      >>> timedout = 0
>      >>> 2024/01/10 11:55:49.877 kid1| 14,7| ipcache.cc(236) finalCallback:
>      >>> 0x12208e038
>      >>> 2024/01/10 11:55:49.877 kid1| 44,2| peer_select.cc(479)
>      >>> resolveSelected: PeerSelector300177 found all 2 destinations for
>      >>> forcesafesearch.google.com:443
>     <http://forcesafesearch.google.com:443>
>      >>> 2024/01/10 11:55:49.877 kid1| 44,2| peer_select.cc(480)
>      >>> resolveSelected:    always_direct = ALLOWED
>      >>> 2024/01/10 11:55:49.877 kid1| 44,2| peer_select.cc(481)
>      >>> resolveSelected:     never_direct = DENIED
>      >>> 2024/01/10 11:55:49.877 kid1| 44,2| peer_select.cc(482)
>      >>> resolveSelected:         timedout = 0
>      >>> 2024/01/10 11:55:49.877 kid1| 14,7| ipcache.cc(990) have:
>      >>> [2001:4860:4802:32::78]:443 at 0 in [2001:4860:4802:32::78] #2/2-0
>      >>> 2024/01/10 11:55:49.877 kid1| 14,2| ipcache.cc(1031) markAsBad:
>      >>> [2001:4860:4802:32::78]:443 of forcesafesearch.google.com
>     <http://forcesafesearch.google.com>
>      >>> 2024/01/10 11:55:49.882 kid1| 14,7| ipcache.cc(990) have:
>      >>> 216.239.38.120:443 <http://216.239.38.120:443> at 0 in
>     [2001:4860:4802:32::78] #2/2-1
>      >>> 2024/01/10 11:55:49.882 kid1| 14,2| ipcache.cc(1055)
>     forgetMarking:
>      >>> 216.239.38.120:443 <http://216.239.38.120:443> of
>     forcesafesearch.google.com <http://forcesafesearch.google.com>
>      >>>
>      >>
>      >> _______________________________________________
>      >> squid-users mailing list
>      >> squid-users at lists.squid-cache.org
>     <mailto:squid-users at lists.squid-cache.org>
>      >> https://lists.squid-cache.org/listinfo/squid-users
>     <https://lists.squid-cache.org/listinfo/squid-users>
>      >
>      > _______________________________________________
>      > squid-users mailing list
>      > squid-users at lists.squid-cache.org
>     <mailto:squid-users at lists.squid-cache.org>
>      > https://lists.squid-cache.org/listinfo/squid-users
>     <https://lists.squid-cache.org/listinfo/squid-users>
>      >
> 
>     -- 
>     Dr. Stephen Borrill, Director and Solutions Architect
>     Precedence Technologies Ltd              T: +44 (0) 1223 359 900
>     Technology House, 36a Union Lane         E:
>     sborrill at precedence.co.uk <mailto:sborrill at precedence.co.uk>
>     Cambridge, CB4 1QB, United Kingdom       W:
>     http://www.precedence.co.uk/ <http://www.precedence.co.uk/>
>     Limited company registered in England and Wales. Company number 3725626
> 
>     _______________________________________________
>     squid-users mailing list
>     squid-users at lists.squid-cache.org
>     <mailto:squid-users at lists.squid-cache.org>
>     https://lists.squid-cache.org/listinfo/squid-users
>     <https://lists.squid-cache.org/listinfo/squid-users>
> 



More information about the squid-users mailing list