[squid-users] IPv4 addresses go missing - markAsBad wrong?
Stephen Borrill
squid at borrill.org.uk
Tue Jan 16 11:01:08 UTC 2024
The problem is no different with 6.6. Is there any more debugging I can
provide, Alex?
On 10/01/2024 12:40, Stephen Borrill wrote:
> On 09/01/2024 15:42, Alex Rousskov wrote:
>> On 2024-01-09 05:56, Stephen Borrill wrote:
>>> On 09/01/2024 09:51, Stephen Borrill wrote:
>>>> On 09/01/2024 03:41, Alex Rousskov wrote:
>>>>> On 2024-01-08 08:31, Stephen Borrill wrote:
>>>>>> I'm trying to determine why squid 6.x (seen with 6.5) connected
>>>>>> via IPv4-only periodically fails to connect to the destination and
>>>>>> then requires a restart to fix it (reload is not sufficient).
>>>>>>
>>>>>> The problem appears to be that a host that has one address each of
>>>>>> IPv4 and IPv6 occasionally has its IPv4 address go missing as a
>>>>>> destination. On closer inspection, this appears to happen when the
>>>>>> IPv6 address (not the IPv4) address is marked as bad.
>>
>>> ipcache.cc(990) have: [2001:4860:4802:32::78]:443 at 0 in
>>> 216.239.38.120 #1/2-0
>>
>>
>> Thank you for sharing more debugging info!
>
> The following seemed odd to. It finds an IPv4 address (this host does
> not have IPv6), puts it in the cache and then says "No DNS records":
>
> 2024/01/09 12:31:24.020 kid1| 14,4| ipcache.cc(617) nbgethostbyname:
> schoolbase.online
> 2024/01/09 12:31:24.020 kid1| 14,3| ipcache.cc(313) ipcacheRelease:
> ipcacheRelease: Releasing entry for 'schoolbase.online'
> 2024/01/09 12:31:24.020 kid1| 14,5| ipcache.cc(670)
> ipcache_nbgethostbyname_: ipcache_nbgethostbyname: MISS for
> 'schoolbase.online'
> 2024/01/09 12:31:24.020 kid1| 14,3| ipcache.cc(480) ipcacheParse: 1
> answers for schoolbase.online
> 2024/01/09 12:31:24.020 kid1| 14,7| ipcache.cc(995) have: no
> 20.54.32.34 in [no cached IPs]
> 2024/01/09 12:31:24.020 kid1| 14,7| ipcache.cc(995) have: no
> 20.54.32.34 in [no cached IPs]
> 2024/01/09 12:31:24.020 kid1| 14,5| ipcache.cc(549) updateTtl: use first
> 69 from RR TTL 69
> 2024/01/09 12:31:24.020 kid1| 14,3| ipcache.cc(535) addGood:
> schoolbase.online #1 20.54.32.34
> 2024/01/09 12:31:24.020 kid1| 14,7| ipcache.cc(253) forwardIp: 20.54.32.34
> 2024/01/09 12:31:24.020 kid1| 44,2| peer_select.cc(1174) handlePath:
> PeerSelector72389 found conn564274 local=0.0.0.0 remote=20.54.32.34:443
> HIER_DIRECT flags=1, destination #1 for schoolbase.online:443
> 2024/01/09 12:31:24.020 kid1| 14,3| ipcache.cc(459) latestError: ERROR:
> DNS failure while resolving schoolbase.online: No DNS records
> 2024/01/09 12:31:24.020 kid1| 14,3| ipcache.cc(586) ipcacheHandleReply:
> done with schoolbase.online: 20.54.32.34 #1/1-0
> 2024/01/09 12:31:24.020 kid1| 14,7| ipcache.cc(236) finalCallback:
> 0x1b7381f38 lookup_err=No DNS records
>
> It seemed to happen about the same time as the other failure, so perhaps
> another symptom of the same.
>
>> The above log line is self-contradictory AFAICT: It says that the
>> cache has both IPv6-looking and IPv4-looking address at the same cache
>> position (0) and, judging by the corresponding code, those two IP
>> addresses are equal. This is not possible (for those specific IP
>> address values). The subsequent Squid behavior can be explained by
>> this (unexplained) conflict.
>>
>> I assume you are running official Squid v6.5 code.
>
> Yes, compiled from source on NetBSD. I have the patch I refer to here
> applied too:
> https://lists.squid-cache.org/pipermail/squid-users/2023-November/026279.html
>
>> I can suggest the following two steps for going forward:
>>
>> 1. Upgrade to the latest Squid v6 in hope that the problem goes away.
>
> I have just upgraded to 6.6.
>
>> 2. If the problem is still there, patch the latest Squid v6 to add
>> more debugging in hope to explain what is going on. This may take a
>> few iterations, and it will take me some time to produce the necessary
>> debugging patch.
>
> Unfortunately, I don't have a test case that will cause the problem so I
> need to run this at a customer's production site that is particularly
> affected by it. Luckily, the problem recurs pretty quickly.
>
> Here's a run with 6.6 where the number of destinations drops from 2 to 1
> before reverting. Not seen this before - usually once it has dropped to
> 1 (the IPv6 address), it stays there until a restart (and this did
> happen about a minute after this log fragment). Happy to test out any
> debugging patch.
>
> 2024/01/10 11:55:49.849 kid1| 14,4| ipcache.cc(617) nbgethostbyname:
> forcesafesearch.google.com
> 2024/01/10 11:55:49.849 kid1| 14,3| Address.cc(389) lookupHostIP: Given
> Non-IP 'forcesafesearch.google.com': hostname or servname not provided
> or not known
> 2024/01/10 11:55:49.849 kid1| 14,4| ipcache.cc(657)
> ipcache_nbgethostbyname_: ipcache_nbgethostbyname: HIT for
> 'forcesafesearch.google.com'
> 2024/01/10 11:55:49.849 kid1| 14,7| ipcache.cc(253) forwardIp:
> [2001:4860:4802:32::78]
> 2024/01/10 11:55:49.849 kid1| 44,2| peer_select.cc(1174) handlePath:
> PeerSelector300176 found conn2388484 local=[::]
> remote=[2001:4860:4802:32::78]:443 HIER_DIRECT flags=1, destination #1
> for forcesafesearch.google.com:443
> 2024/01/10 11:55:49.849 kid1| 44,2| peer_select.cc(1180) handlePath:
> always_direct = ALLOWED
> 2024/01/10 11:55:49.849 kid1| 44,2| peer_select.cc(1181) handlePath:
> never_direct = DENIED
> 2024/01/10 11:55:49.849 kid1| 44,2| peer_select.cc(1182) handlePath:
> timedout = 0
> 2024/01/10 11:55:49.849 kid1| 14,7| ipcache.cc(253) forwardIp:
> 216.239.38.120
> 2024/01/10 11:55:49.849 kid1| 44,2| peer_select.cc(1174) handlePath:
> PeerSelector300176 found conn2388485 local=0.0.0.0
> remote=216.239.38.120:443 HIER_DIRECT flags=1, destination #2 for
> forcesafesearch.google.com:443
> 2024/01/10 11:55:49.849 kid1| 44,2| peer_select.cc(1180) handlePath:
> always_direct = ALLOWED
> 2024/01/10 11:55:49.849 kid1| 44,2| peer_select.cc(1181) handlePath:
> never_direct = DENIED
> 2024/01/10 11:55:49.849 kid1| 44,2| peer_select.cc(1182) handlePath:
> timedout = 0
> 2024/01/10 11:55:49.849 kid1| 14,7| ipcache.cc(236) finalCallback:
> 0x12208e038
> 2024/01/10 11:55:49.849 kid1| 44,2| peer_select.cc(479) resolveSelected:
> PeerSelector300176 found all 2 destinations for
> forcesafesearch.google.com:443
> 2024/01/10 11:55:49.849 kid1| 44,2| peer_select.cc(480) resolveSelected:
> always_direct = ALLOWED
> 2024/01/10 11:55:49.849 kid1| 44,2| peer_select.cc(481) resolveSelected:
> never_direct = DENIED
> 2024/01/10 11:55:49.849 kid1| 44,2| peer_select.cc(482) resolveSelected:
> timedout = 0
> 2024/01/10 11:55:49.849 kid1| 14,7| ipcache.cc(990) have:
> [2001:4860:4802:32::78]:443 at 0 in [2001:4860:4802:32::78] #2/2-0
> 2024/01/10 11:55:49.849 kid1| 14,2| ipcache.cc(1031) markAsBad:
> [2001:4860:4802:32::78]:443 of forcesafesearch.google.com
> 2024/01/10 11:55:49.855 kid1| 14,7| ipcache.cc(990) have:
> 216.239.38.120:443 at 0 in [2001:4860:4802:32::78] #2/2-1
> 2024/01/10 11:55:49.855 kid1| 14,2| ipcache.cc(1055) forgetMarking:
> 216.239.38.120:443 of forcesafesearch.google.com
> 2024/01/10 11:55:49.877 kid1| 14,3| Address.cc(389) lookupHostIP: Given
> Non-IP 'forcesafesearch.google.com': hostname or servname not provided
> or not known
> 2024/01/10 11:55:49.877 kid1| 44,2| peer_select.cc(460) resolveSelected:
> Find IP destination for: forcesafesearch.google.com:443' via
> forcesafesearch.google.com
> 2024/01/10 11:55:49.877 kid1| 14,4| ipcache.cc(617) nbgethostbyname:
> forcesafesearch.google.com
> 2024/01/10 11:55:49.877 kid1| 14,3| Address.cc(389) lookupHostIP: Given
> Non-IP 'forcesafesearch.google.com': hostname or servname not provided
> or not known
> 2024/01/10 11:55:49.877 kid1| 14,4| ipcache.cc(657)
> ipcache_nbgethostbyname_: ipcache_nbgethostbyname: HIT for
> 'forcesafesearch.google.com'
> 2024/01/10 11:55:49.877 kid1| 14,7| ipcache.cc(253) forwardIp:
> [2001:4860:4802:32::78]
> 2024/01/10 11:55:49.877 kid1| 44,2| peer_select.cc(1174) handlePath:
> PeerSelector300177 found conn2388493 local=[::]
> remote=[2001:4860:4802:32::78]:443 HIER_DIRECT flags=1, destination #1
> for forcesafesearch.google.com:443
> 2024/01/10 11:55:49.877 kid1| 44,2| peer_select.cc(1180) handlePath:
> always_direct = ALLOWED
> 2024/01/10 11:55:49.877 kid1| 44,2| peer_select.cc(1181) handlePath:
> never_direct = DENIED
> 2024/01/10 11:55:49.877 kid1| 44,2| peer_select.cc(1182) handlePath:
> timedout = 0
> 2024/01/10 11:55:49.877 kid1| 14,7| ipcache.cc(253) forwardIp:
> 216.239.38.120
> 2024/01/10 11:55:49.877 kid1| 44,2| peer_select.cc(1174) handlePath:
> PeerSelector300177 found conn2388494 local=0.0.0.0
> remote=216.239.38.120:443 HIER_DIRECT flags=1, destination #2 for
> forcesafesearch.google.com:443
> 2024/01/10 11:55:49.877 kid1| 44,2| peer_select.cc(1180) handlePath:
> always_direct = ALLOWED
> 2024/01/10 11:55:49.877 kid1| 44,2| peer_select.cc(1181) handlePath:
> never_direct = DENIED
> 2024/01/10 11:55:49.877 kid1| 44,2| peer_select.cc(1182) handlePath:
> timedout = 0
> 2024/01/10 11:55:49.877 kid1| 14,7| ipcache.cc(236) finalCallback:
> 0x12208e038
> 2024/01/10 11:55:49.877 kid1| 44,2| peer_select.cc(479) resolveSelected:
> PeerSelector300177 found all 2 destinations for
> forcesafesearch.google.com:443
> 2024/01/10 11:55:49.877 kid1| 44,2| peer_select.cc(480) resolveSelected:
> always_direct = ALLOWED
> 2024/01/10 11:55:49.877 kid1| 44,2| peer_select.cc(481) resolveSelected:
> never_direct = DENIED
> 2024/01/10 11:55:49.877 kid1| 44,2| peer_select.cc(482) resolveSelected:
> timedout = 0
> 2024/01/10 11:55:49.877 kid1| 14,7| ipcache.cc(990) have:
> [2001:4860:4802:32::78]:443 at 0 in [2001:4860:4802:32::78] #2/2-0
> 2024/01/10 11:55:49.877 kid1| 14,2| ipcache.cc(1031) markAsBad:
> [2001:4860:4802:32::78]:443 of forcesafesearch.google.com
> 2024/01/10 11:55:49.882 kid1| 14,7| ipcache.cc(990) have:
> 216.239.38.120:443 at 0 in [2001:4860:4802:32::78] #2/2-1
> 2024/01/10 11:55:49.882 kid1| 14,2| ipcache.cc(1055) forgetMarking:
> 216.239.38.120:443 of forcesafesearch.google.com
>
More information about the squid-users
mailing list