[squid-users] IPv4 addresses go missing - markAsBad wrong?
Stephen Borrill
squid at borrill.org.uk
Tue Jan 16 14:43:03 UTC 2024
On 16/01/2024 14:37, Alex Rousskov wrote:
> On 2024-01-16 06:01, Stephen Borrill wrote:
>> The problem is no different with 6.6. Is there any more debugging I
>> can provide, Alex?
>
> Yes, but I need to give you a patch that adds that (temporary) debugging
> first (assuming I fail to reproduce the problem in the lab). The ball is
> on my side (unless somebody else steps in). Unfortunately, I do not have
> any free time for any of that right now. If you do not hear from me
> sooner, please ping me again on or after February 8, 2024.
Thanks. In the meantime, I have created a local DNS entry for
forcesafesearch.google.com that only returns the A record. I think that
should work around it (for that site, but not others).
>> On 10/01/2024 12:40, Stephen Borrill wrote:
>>> On 09/01/2024 15:42, Alex Rousskov wrote:
>>>> On 2024-01-09 05:56, Stephen Borrill wrote:
>>>>> On 09/01/2024 09:51, Stephen Borrill wrote:
>>>>>> On 09/01/2024 03:41, Alex Rousskov wrote:
>>>>>>> On 2024-01-08 08:31, Stephen Borrill wrote:
>>>>>>>> I'm trying to determine why squid 6.x (seen with 6.5) connected
>>>>>>>> via IPv4-only periodically fails to connect to the destination
>>>>>>>> and then requires a restart to fix it (reload is not sufficient).
>>>>>>>>
>>>>>>>> The problem appears to be that a host that has one address each
>>>>>>>> of IPv4 and IPv6 occasionally has its IPv4 address go missing as
>>>>>>>> a destination. On closer inspection, this appears to happen when
>>>>>>>> the IPv6 address (not the IPv4) address is marked as bad.
>>>>
>>>>> ipcache.cc(990) have: [2001:4860:4802:32::78]:443 at 0 in
>>>>> 216.239.38.120 #1/2-0
>>>>
>>>>
>>>> Thank you for sharing more debugging info!
>>>
>>> The following seemed odd to. It finds an IPv4 address (this host does
>>> not have IPv6), puts it in the cache and then says "No DNS records":
>>>
>>> 2024/01/09 12:31:24.020 kid1| 14,4| ipcache.cc(617) nbgethostbyname:
>>> schoolbase.online
>>> 2024/01/09 12:31:24.020 kid1| 14,3| ipcache.cc(313) ipcacheRelease:
>>> ipcacheRelease: Releasing entry for 'schoolbase.online'
>>> 2024/01/09 12:31:24.020 kid1| 14,5| ipcache.cc(670)
>>> ipcache_nbgethostbyname_: ipcache_nbgethostbyname: MISS for
>>> 'schoolbase.online'
>>> 2024/01/09 12:31:24.020 kid1| 14,3| ipcache.cc(480) ipcacheParse: 1
>>> answers for schoolbase.online
>>> 2024/01/09 12:31:24.020 kid1| 14,7| ipcache.cc(995) have: no
>>> 20.54.32.34 in [no cached IPs]
>>> 2024/01/09 12:31:24.020 kid1| 14,7| ipcache.cc(995) have: no
>>> 20.54.32.34 in [no cached IPs]
>>> 2024/01/09 12:31:24.020 kid1| 14,5| ipcache.cc(549) updateTtl: use
>>> first 69 from RR TTL 69
>>> 2024/01/09 12:31:24.020 kid1| 14,3| ipcache.cc(535) addGood:
>>> schoolbase.online #1 20.54.32.34
>>> 2024/01/09 12:31:24.020 kid1| 14,7| ipcache.cc(253) forwardIp:
>>> 20.54.32.34
>>> 2024/01/09 12:31:24.020 kid1| 44,2| peer_select.cc(1174) handlePath:
>>> PeerSelector72389 found conn564274 local=0.0.0.0
>>> remote=20.54.32.34:443 HIER_DIRECT flags=1, destination #1 for
>>> schoolbase.online:443
>>> 2024/01/09 12:31:24.020 kid1| 14,3| ipcache.cc(459) latestError:
>>> ERROR: DNS failure while resolving schoolbase.online: No DNS records
>>> 2024/01/09 12:31:24.020 kid1| 14,3| ipcache.cc(586)
>>> ipcacheHandleReply: done with schoolbase.online: 20.54.32.34 #1/1-0
>>> 2024/01/09 12:31:24.020 kid1| 14,7| ipcache.cc(236) finalCallback:
>>> 0x1b7381f38 lookup_err=No DNS records
>>>
>>> It seemed to happen about the same time as the other failure, so
>>> perhaps another symptom of the same.
>>>
>>>> The above log line is self-contradictory AFAICT: It says that the
>>>> cache has both IPv6-looking and IPv4-looking address at the same
>>>> cache position (0) and, judging by the corresponding code, those two
>>>> IP addresses are equal. This is not possible (for those specific IP
>>>> address values). The subsequent Squid behavior can be explained by
>>>> this (unexplained) conflict.
>>>>
>>>> I assume you are running official Squid v6.5 code.
>>>
>>> Yes, compiled from source on NetBSD. I have the patch I refer to here
>>> applied too:
>>> https://lists.squid-cache.org/pipermail/squid-users/2023-November/026279.html
>>>
>>>> I can suggest the following two steps for going forward:
>>>>
>>>> 1. Upgrade to the latest Squid v6 in hope that the problem goes away.
>>>
>>> I have just upgraded to 6.6.
>>>
>>>> 2. If the problem is still there, patch the latest Squid v6 to add
>>>> more debugging in hope to explain what is going on. This may take a
>>>> few iterations, and it will take me some time to produce the
>>>> necessary debugging patch.
>>>
>>> Unfortunately, I don't have a test case that will cause the problem
>>> so I need to run this at a customer's production site that is
>>> particularly affected by it. Luckily, the problem recurs pretty quickly.
>>>
>>> Here's a run with 6.6 where the number of destinations drops from 2
>>> to 1 before reverting. Not seen this before - usually once it has
>>> dropped to 1 (the IPv6 address), it stays there until a restart (and
>>> this did happen about a minute after this log fragment). Happy to
>>> test out any debugging patch.
>>>
>>> 2024/01/10 11:55:49.849 kid1| 14,4| ipcache.cc(617) nbgethostbyname:
>>> forcesafesearch.google.com
>>> 2024/01/10 11:55:49.849 kid1| 14,3| Address.cc(389) lookupHostIP:
>>> Given Non-IP 'forcesafesearch.google.com': hostname or servname not
>>> provided or not known
>>> 2024/01/10 11:55:49.849 kid1| 14,4| ipcache.cc(657)
>>> ipcache_nbgethostbyname_: ipcache_nbgethostbyname: HIT for
>>> 'forcesafesearch.google.com'
>>> 2024/01/10 11:55:49.849 kid1| 14,7| ipcache.cc(253) forwardIp:
>>> [2001:4860:4802:32::78]
>>> 2024/01/10 11:55:49.849 kid1| 44,2| peer_select.cc(1174) handlePath:
>>> PeerSelector300176 found conn2388484 local=[::]
>>> remote=[2001:4860:4802:32::78]:443 HIER_DIRECT flags=1, destination
>>> #1 for forcesafesearch.google.com:443
>>> 2024/01/10 11:55:49.849 kid1| 44,2| peer_select.cc(1180) handlePath:
>>> always_direct = ALLOWED
>>> 2024/01/10 11:55:49.849 kid1| 44,2| peer_select.cc(1181) handlePath:
>>> never_direct = DENIED
>>> 2024/01/10 11:55:49.849 kid1| 44,2| peer_select.cc(1182) handlePath:
>>> timedout = 0
>>> 2024/01/10 11:55:49.849 kid1| 14,7| ipcache.cc(253) forwardIp:
>>> 216.239.38.120
>>> 2024/01/10 11:55:49.849 kid1| 44,2| peer_select.cc(1174) handlePath:
>>> PeerSelector300176 found conn2388485 local=0.0.0.0
>>> remote=216.239.38.120:443 HIER_DIRECT flags=1, destination #2 for
>>> forcesafesearch.google.com:443
>>> 2024/01/10 11:55:49.849 kid1| 44,2| peer_select.cc(1180) handlePath:
>>> always_direct = ALLOWED
>>> 2024/01/10 11:55:49.849 kid1| 44,2| peer_select.cc(1181) handlePath:
>>> never_direct = DENIED
>>> 2024/01/10 11:55:49.849 kid1| 44,2| peer_select.cc(1182) handlePath:
>>> timedout = 0
>>> 2024/01/10 11:55:49.849 kid1| 14,7| ipcache.cc(236) finalCallback:
>>> 0x12208e038
>>> 2024/01/10 11:55:49.849 kid1| 44,2| peer_select.cc(479)
>>> resolveSelected: PeerSelector300176 found all 2 destinations for
>>> forcesafesearch.google.com:443
>>> 2024/01/10 11:55:49.849 kid1| 44,2| peer_select.cc(480)
>>> resolveSelected: always_direct = ALLOWED
>>> 2024/01/10 11:55:49.849 kid1| 44,2| peer_select.cc(481)
>>> resolveSelected: never_direct = DENIED
>>> 2024/01/10 11:55:49.849 kid1| 44,2| peer_select.cc(482)
>>> resolveSelected: timedout = 0
>>> 2024/01/10 11:55:49.849 kid1| 14,7| ipcache.cc(990) have:
>>> [2001:4860:4802:32::78]:443 at 0 in [2001:4860:4802:32::78] #2/2-0
>>> 2024/01/10 11:55:49.849 kid1| 14,2| ipcache.cc(1031) markAsBad:
>>> [2001:4860:4802:32::78]:443 of forcesafesearch.google.com
>>> 2024/01/10 11:55:49.855 kid1| 14,7| ipcache.cc(990) have:
>>> 216.239.38.120:443 at 0 in [2001:4860:4802:32::78] #2/2-1
>>> 2024/01/10 11:55:49.855 kid1| 14,2| ipcache.cc(1055) forgetMarking:
>>> 216.239.38.120:443 of forcesafesearch.google.com
>>> 2024/01/10 11:55:49.877 kid1| 14,3| Address.cc(389) lookupHostIP:
>>> Given Non-IP 'forcesafesearch.google.com': hostname or servname not
>>> provided or not known
>>> 2024/01/10 11:55:49.877 kid1| 44,2| peer_select.cc(460)
>>> resolveSelected: Find IP destination for:
>>> forcesafesearch.google.com:443' via forcesafesearch.google.com
>>> 2024/01/10 11:55:49.877 kid1| 14,4| ipcache.cc(617) nbgethostbyname:
>>> forcesafesearch.google.com
>>> 2024/01/10 11:55:49.877 kid1| 14,3| Address.cc(389) lookupHostIP:
>>> Given Non-IP 'forcesafesearch.google.com': hostname or servname not
>>> provided or not known
>>> 2024/01/10 11:55:49.877 kid1| 14,4| ipcache.cc(657)
>>> ipcache_nbgethostbyname_: ipcache_nbgethostbyname: HIT for
>>> 'forcesafesearch.google.com'
>>> 2024/01/10 11:55:49.877 kid1| 14,7| ipcache.cc(253) forwardIp:
>>> [2001:4860:4802:32::78]
>>> 2024/01/10 11:55:49.877 kid1| 44,2| peer_select.cc(1174) handlePath:
>>> PeerSelector300177 found conn2388493 local=[::]
>>> remote=[2001:4860:4802:32::78]:443 HIER_DIRECT flags=1, destination
>>> #1 for forcesafesearch.google.com:443
>>> 2024/01/10 11:55:49.877 kid1| 44,2| peer_select.cc(1180) handlePath:
>>> always_direct = ALLOWED
>>> 2024/01/10 11:55:49.877 kid1| 44,2| peer_select.cc(1181) handlePath:
>>> never_direct = DENIED
>>> 2024/01/10 11:55:49.877 kid1| 44,2| peer_select.cc(1182) handlePath:
>>> timedout = 0
>>> 2024/01/10 11:55:49.877 kid1| 14,7| ipcache.cc(253) forwardIp:
>>> 216.239.38.120
>>> 2024/01/10 11:55:49.877 kid1| 44,2| peer_select.cc(1174) handlePath:
>>> PeerSelector300177 found conn2388494 local=0.0.0.0
>>> remote=216.239.38.120:443 HIER_DIRECT flags=1, destination #2 for
>>> forcesafesearch.google.com:443
>>> 2024/01/10 11:55:49.877 kid1| 44,2| peer_select.cc(1180) handlePath:
>>> always_direct = ALLOWED
>>> 2024/01/10 11:55:49.877 kid1| 44,2| peer_select.cc(1181) handlePath:
>>> never_direct = DENIED
>>> 2024/01/10 11:55:49.877 kid1| 44,2| peer_select.cc(1182) handlePath:
>>> timedout = 0
>>> 2024/01/10 11:55:49.877 kid1| 14,7| ipcache.cc(236) finalCallback:
>>> 0x12208e038
>>> 2024/01/10 11:55:49.877 kid1| 44,2| peer_select.cc(479)
>>> resolveSelected: PeerSelector300177 found all 2 destinations for
>>> forcesafesearch.google.com:443
>>> 2024/01/10 11:55:49.877 kid1| 44,2| peer_select.cc(480)
>>> resolveSelected: always_direct = ALLOWED
>>> 2024/01/10 11:55:49.877 kid1| 44,2| peer_select.cc(481)
>>> resolveSelected: never_direct = DENIED
>>> 2024/01/10 11:55:49.877 kid1| 44,2| peer_select.cc(482)
>>> resolveSelected: timedout = 0
>>> 2024/01/10 11:55:49.877 kid1| 14,7| ipcache.cc(990) have:
>>> [2001:4860:4802:32::78]:443 at 0 in [2001:4860:4802:32::78] #2/2-0
>>> 2024/01/10 11:55:49.877 kid1| 14,2| ipcache.cc(1031) markAsBad:
>>> [2001:4860:4802:32::78]:443 of forcesafesearch.google.com
>>> 2024/01/10 11:55:49.882 kid1| 14,7| ipcache.cc(990) have:
>>> 216.239.38.120:443 at 0 in [2001:4860:4802:32::78] #2/2-1
>>> 2024/01/10 11:55:49.882 kid1| 14,2| ipcache.cc(1055) forgetMarking:
>>> 216.239.38.120:443 of forcesafesearch.google.com
>>>
>>
>> _______________________________________________
>> squid-users mailing list
>> squid-users at lists.squid-cache.org
>> https://lists.squid-cache.org/listinfo/squid-users
>
> _______________________________________________
> squid-users mailing list
> squid-users at lists.squid-cache.org
> https://lists.squid-cache.org/listinfo/squid-users
>
More information about the squid-users
mailing list