[squid-users] IPv4 addresses go missing - markAsBad wrong?

Stephen Borrill squid at borrill.org.uk
Tue Jan 16 14:43:03 UTC 2024


On 16/01/2024 14:37, Alex Rousskov wrote:
> On 2024-01-16 06:01, Stephen Borrill wrote:
>> The problem is no different with 6.6. Is there any more debugging I 
>> can provide, Alex?
> 
> Yes, but I need to give you a patch that adds that (temporary) debugging 
> first (assuming I fail to reproduce the problem in the lab). The ball is 
> on my side (unless somebody else steps in). Unfortunately, I do not have 
> any free time for any of that right now. If you do not hear from me 
> sooner, please ping me again on or after February 8, 2024.

Thanks. In the meantime, I have created a local DNS entry for 
forcesafesearch.google.com that only returns the A record. I think that 
should work around it (for that site, but not others).
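
For anyone wanting to confirm that such a local override really hides the AAAA record, here is a minimal Python sketch. It is only an illustration: the helper names `split_by_family` and `resolve` are hypothetical, and it queries the system resolver, which is not necessarily the resolver Squid uses internally, so treat the result as a sanity check rather than proof.

```python
import ipaddress
import socket


def split_by_family(addresses):
    """Split an iterable of IP address strings into (ipv4, ipv6) lists."""
    v4, v6 = [], []
    for text in addresses:
        ip = ipaddress.ip_address(text)
        # ip.version is 4 for A-style addresses and 6 for AAAA-style ones.
        (v4 if ip.version == 4 else v6).append(str(ip))
    return v4, v6


def resolve(host, port=443):
    """Ask the system resolver for every stream address of host.

    Assumption: the system resolver sees the same local DNS entry
    that the proxy does.
    """
    infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    # info[4] is the sockaddr tuple; its first element is the address text.
    return sorted({info[4][0] for info in infos})
```

Running `split_by_family(resolve("forcesafesearch.google.com"))` on the proxy host should give an empty second list if only the A record is being returned.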

>> On 10/01/2024 12:40, Stephen Borrill wrote:
>>> On 09/01/2024 15:42, Alex Rousskov wrote:
>>>> On 2024-01-09 05:56, Stephen Borrill wrote:
>>>>> On 09/01/2024 09:51, Stephen Borrill wrote:
>>>>>> On 09/01/2024 03:41, Alex Rousskov wrote:
>>>>>>> On 2024-01-08 08:31, Stephen Borrill wrote:
>>>>>>>> I'm trying to determine why squid 6.x (seen with 6.5) connected 
>>>>>>>> via IPv4-only periodically fails to connect to the destination 
>>>>>>>> and then requires a restart to fix it (reload is not sufficient).
>>>>>>>>
>>>>>>>> The problem appears to be that a host that has one address each 
>>>>>>>> of IPv4 and IPv6 occasionally has its IPv4 address go missing as 
>>>>>>>> a destination. On closer inspection, this appears to happen when 
>>>>>>>> the IPv6 address (not the IPv4 address) is marked as bad.
>>>>
>>>>> ipcache.cc(990) have: [2001:4860:4802:32::78]:443 at 0 in 
>>>>> 216.239.38.120 #1/2-0
>>>>
>>>>
>>>> Thank you for sharing more debugging info!
>>>
>>> The following seemed odd too. It finds an IPv4 address (this host does 
>>> not have IPv6), puts it in the cache and then says "No DNS records":
>>>
>>> 2024/01/09 12:31:24.020 kid1| 14,4| ipcache.cc(617) nbgethostbyname: 
>>> schoolbase.online
>>> 2024/01/09 12:31:24.020 kid1| 14,3| ipcache.cc(313) ipcacheRelease: 
>>> ipcacheRelease: Releasing entry for 'schoolbase.online'
>>> 2024/01/09 12:31:24.020 kid1| 14,5| ipcache.cc(670) 
>>> ipcache_nbgethostbyname_: ipcache_nbgethostbyname: MISS for 
>>> 'schoolbase.online'
>>> 2024/01/09 12:31:24.020 kid1| 14,3| ipcache.cc(480) ipcacheParse: 1 
>>> answers for schoolbase.online
>>> 2024/01/09 12:31:24.020 kid1| 14,7| ipcache.cc(995) have:  no 
>>> 20.54.32.34 in [no cached IPs]
>>> 2024/01/09 12:31:24.020 kid1| 14,7| ipcache.cc(995) have:  no 
>>> 20.54.32.34 in [no cached IPs]
>>> 2024/01/09 12:31:24.020 kid1| 14,5| ipcache.cc(549) updateTtl: use 
>>> first 69 from RR TTL 69
>>> 2024/01/09 12:31:24.020 kid1| 14,3| ipcache.cc(535) addGood: 
>>> schoolbase.online #1 20.54.32.34
>>> 2024/01/09 12:31:24.020 kid1| 14,7| ipcache.cc(253) forwardIp: 
>>> 20.54.32.34
>>> 2024/01/09 12:31:24.020 kid1| 44,2| peer_select.cc(1174) handlePath: 
>>> PeerSelector72389 found conn564274 local=0.0.0.0 
>>> remote=20.54.32.34:443 HIER_DIRECT flags=1, destination #1 for 
>>> schoolbase.online:443
>>> 2024/01/09 12:31:24.020 kid1| 14,3| ipcache.cc(459) latestError: 
>>> ERROR: DNS failure while resolving schoolbase.online: No DNS records
>>> 2024/01/09 12:31:24.020 kid1| 14,3| ipcache.cc(586) 
>>> ipcacheHandleReply: done with schoolbase.online: 20.54.32.34 #1/1-0
>>> 2024/01/09 12:31:24.020 kid1| 14,7| ipcache.cc(236) finalCallback: 
>>> 0x1b7381f38  lookup_err=No DNS records
>>>
>>> It seemed to happen about the same time as the other failure, so 
>>> perhaps another symptom of the same.
>>>
>>>> The above log line is self-contradictory AFAICT: It says that the 
>>>> cache has both IPv6-looking and IPv4-looking address at the same 
>>>> cache position (0) and, judging by the corresponding code, those two 
>>>> IP addresses are equal. This is not possible (for those specific IP 
>>>> address values). The subsequent Squid behavior can be explained by 
>>>> this (unexplained) conflict.
>>>>
>>>> I assume you are running official Squid v6.5 code.
>>>
>>> Yes, compiled from source on NetBSD. I have the patch I refer to here 
>>> applied too:
>>> https://lists.squid-cache.org/pipermail/squid-users/2023-November/026279.html
>>>
>>>> I can suggest the following two steps for going forward:
>>>>
>>>> 1. Upgrade to the latest Squid v6 in hope that the problem goes away.
>>>
>>> I have just upgraded to 6.6.
>>>
>>>> 2. If the problem is still there, patch the latest Squid v6 to add 
>>>> more debugging in the hope of explaining what is going on. This may take a 
>>>> few iterations, and it will take me some time to produce the 
>>>> necessary debugging patch.
>>>
>>> Unfortunately, I don't have a test case that will cause the problem 
>>> so I need to run this at a customer's production site that is 
>>> particularly affected by it. Luckily, the problem recurs pretty quickly.
>>>
>>> Here's a run with 6.6 where the number of destinations drops from 2 
>>> to 1 before reverting. Not seen this before - usually once it has 
>>> dropped to 1 (the IPv6 address), it stays there until a restart (and 
>>> this did happen about a minute after this log fragment). Happy to 
>>> test out any debugging patch.
>>>
>>> 2024/01/10 11:55:49.849 kid1| 14,4| ipcache.cc(617) nbgethostbyname: 
>>> forcesafesearch.google.com
>>> 2024/01/10 11:55:49.849 kid1| 14,3| Address.cc(389) lookupHostIP: 
>>> Given Non-IP 'forcesafesearch.google.com': hostname or servname not 
>>> provided or not known
>>> 2024/01/10 11:55:49.849 kid1| 14,4| ipcache.cc(657) 
>>> ipcache_nbgethostbyname_: ipcache_nbgethostbyname: HIT for 
>>> 'forcesafesearch.google.com'
>>> 2024/01/10 11:55:49.849 kid1| 14,7| ipcache.cc(253) forwardIp: 
>>> [2001:4860:4802:32::78]
>>> 2024/01/10 11:55:49.849 kid1| 44,2| peer_select.cc(1174) handlePath: 
>>> PeerSelector300176 found conn2388484 local=[::] 
>>> remote=[2001:4860:4802:32::78]:443 HIER_DIRECT flags=1, destination 
>>> #1 for forcesafesearch.google.com:443
>>> 2024/01/10 11:55:49.849 kid1| 44,2| peer_select.cc(1180) handlePath: 
>>> always_direct = ALLOWED
>>> 2024/01/10 11:55:49.849 kid1| 44,2| peer_select.cc(1181) handlePath: 
>>> never_direct = DENIED
>>> 2024/01/10 11:55:49.849 kid1| 44,2| peer_select.cc(1182) handlePath: 
>>> timedout = 0
>>> 2024/01/10 11:55:49.849 kid1| 14,7| ipcache.cc(253) forwardIp: 
>>> 216.239.38.120
>>> 2024/01/10 11:55:49.849 kid1| 44,2| peer_select.cc(1174) handlePath: 
>>> PeerSelector300176 found conn2388485 local=0.0.0.0 
>>> remote=216.239.38.120:443 HIER_DIRECT flags=1, destination #2 for 
>>> forcesafesearch.google.com:443
>>> 2024/01/10 11:55:49.849 kid1| 44,2| peer_select.cc(1180) handlePath: 
>>> always_direct = ALLOWED
>>> 2024/01/10 11:55:49.849 kid1| 44,2| peer_select.cc(1181) handlePath: 
>>> never_direct = DENIED
>>> 2024/01/10 11:55:49.849 kid1| 44,2| peer_select.cc(1182) handlePath: 
>>> timedout = 0
>>> 2024/01/10 11:55:49.849 kid1| 14,7| ipcache.cc(236) finalCallback: 
>>> 0x12208e038
>>> 2024/01/10 11:55:49.849 kid1| 44,2| peer_select.cc(479) 
>>> resolveSelected: PeerSelector300176 found all 2 destinations for 
>>> forcesafesearch.google.com:443
>>> 2024/01/10 11:55:49.849 kid1| 44,2| peer_select.cc(480) 
>>> resolveSelected:    always_direct = ALLOWED
>>> 2024/01/10 11:55:49.849 kid1| 44,2| peer_select.cc(481) 
>>> resolveSelected:     never_direct = DENIED
>>> 2024/01/10 11:55:49.849 kid1| 44,2| peer_select.cc(482) 
>>> resolveSelected:         timedout = 0
>>> 2024/01/10 11:55:49.849 kid1| 14,7| ipcache.cc(990) have: 
>>> [2001:4860:4802:32::78]:443 at 0 in [2001:4860:4802:32::78] #2/2-0
>>> 2024/01/10 11:55:49.849 kid1| 14,2| ipcache.cc(1031) markAsBad: 
>>> [2001:4860:4802:32::78]:443 of forcesafesearch.google.com
>>> 2024/01/10 11:55:49.855 kid1| 14,7| ipcache.cc(990) have: 
>>> 216.239.38.120:443 at 0 in [2001:4860:4802:32::78] #2/2-1
>>> 2024/01/10 11:55:49.855 kid1| 14,2| ipcache.cc(1055) forgetMarking: 
>>> 216.239.38.120:443 of forcesafesearch.google.com
>>> 2024/01/10 11:55:49.877 kid1| 14,3| Address.cc(389) lookupHostIP: 
>>> Given Non-IP 'forcesafesearch.google.com': hostname or servname not 
>>> provided or not known
>>> 2024/01/10 11:55:49.877 kid1| 44,2| peer_select.cc(460) 
>>> resolveSelected: Find IP destination for: 
>>> forcesafesearch.google.com:443' via forcesafesearch.google.com
>>> 2024/01/10 11:55:49.877 kid1| 14,4| ipcache.cc(617) nbgethostbyname: 
>>> forcesafesearch.google.com
>>> 2024/01/10 11:55:49.877 kid1| 14,3| Address.cc(389) lookupHostIP: 
>>> Given Non-IP 'forcesafesearch.google.com': hostname or servname not 
>>> provided or not known
>>> 2024/01/10 11:55:49.877 kid1| 14,4| ipcache.cc(657) 
>>> ipcache_nbgethostbyname_: ipcache_nbgethostbyname: HIT for 
>>> 'forcesafesearch.google.com'
>>> 2024/01/10 11:55:49.877 kid1| 14,7| ipcache.cc(253) forwardIp: 
>>> [2001:4860:4802:32::78]
>>> 2024/01/10 11:55:49.877 kid1| 44,2| peer_select.cc(1174) handlePath: 
>>> PeerSelector300177 found conn2388493 local=[::] 
>>> remote=[2001:4860:4802:32::78]:443 HIER_DIRECT flags=1, destination 
>>> #1 for forcesafesearch.google.com:443
>>> 2024/01/10 11:55:49.877 kid1| 44,2| peer_select.cc(1180) handlePath: 
>>> always_direct = ALLOWED
>>> 2024/01/10 11:55:49.877 kid1| 44,2| peer_select.cc(1181) handlePath: 
>>> never_direct = DENIED
>>> 2024/01/10 11:55:49.877 kid1| 44,2| peer_select.cc(1182) handlePath: 
>>> timedout = 0
>>> 2024/01/10 11:55:49.877 kid1| 14,7| ipcache.cc(253) forwardIp: 
>>> 216.239.38.120
>>> 2024/01/10 11:55:49.877 kid1| 44,2| peer_select.cc(1174) handlePath: 
>>> PeerSelector300177 found conn2388494 local=0.0.0.0 
>>> remote=216.239.38.120:443 HIER_DIRECT flags=1, destination #2 for 
>>> forcesafesearch.google.com:443
>>> 2024/01/10 11:55:49.877 kid1| 44,2| peer_select.cc(1180) handlePath: 
>>> always_direct = ALLOWED
>>> 2024/01/10 11:55:49.877 kid1| 44,2| peer_select.cc(1181) handlePath: 
>>> never_direct = DENIED
>>> 2024/01/10 11:55:49.877 kid1| 44,2| peer_select.cc(1182) handlePath: 
>>> timedout = 0
>>> 2024/01/10 11:55:49.877 kid1| 14,7| ipcache.cc(236) finalCallback: 
>>> 0x12208e038
>>> 2024/01/10 11:55:49.877 kid1| 44,2| peer_select.cc(479) 
>>> resolveSelected: PeerSelector300177 found all 2 destinations for 
>>> forcesafesearch.google.com:443
>>> 2024/01/10 11:55:49.877 kid1| 44,2| peer_select.cc(480) 
>>> resolveSelected:    always_direct = ALLOWED
>>> 2024/01/10 11:55:49.877 kid1| 44,2| peer_select.cc(481) 
>>> resolveSelected:     never_direct = DENIED
>>> 2024/01/10 11:55:49.877 kid1| 44,2| peer_select.cc(482) 
>>> resolveSelected:         timedout = 0
>>> 2024/01/10 11:55:49.877 kid1| 14,7| ipcache.cc(990) have: 
>>> [2001:4860:4802:32::78]:443 at 0 in [2001:4860:4802:32::78] #2/2-0
>>> 2024/01/10 11:55:49.877 kid1| 14,2| ipcache.cc(1031) markAsBad: 
>>> [2001:4860:4802:32::78]:443 of forcesafesearch.google.com
>>> 2024/01/10 11:55:49.882 kid1| 14,7| ipcache.cc(990) have: 
>>> 216.239.38.120:443 at 0 in [2001:4860:4802:32::78] #2/2-1
>>> 2024/01/10 11:55:49.882 kid1| 14,2| ipcache.cc(1055) forgetMarking: 
>>> 216.239.38.120:443 of forcesafesearch.google.com
>>>
>>
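
The `have:` lines in the quoted logs can be checked mechanically for the contradiction described above. This is a rough Python sketch, not anything taken from Squid itself. It assumes the line format is `have: <ip>:<port> at <pos> in <current> #<n>/<total>-<bad>`, that `<pos>` is a zero-based cache position and that `#<n>` is the one-based position of `<current>`. That reading is inferred from the discussion in this thread and may be wrong. The names `parse_have` and `is_contradictory` are hypothetical. Wrapped log lines need to be rejoined and the quote prefixes stripped before feeding them in.

```python
import re

# Assumed layout of a successful ipcache "have:" debug line.
HAVE_RE = re.compile(
    r"have:\s+(?P<ip>\[[0-9A-Fa-f:]+\]|[0-9.]+):(?P<port>\d+)"
    r"\s+at\s+(?P<pos>\d+)\s+in\s+"
    r"(?P<current>\[[0-9A-Fa-f:]+\]|[0-9.]+)\s+"
    r"#(?P<index>\d+)/(?P<total>\d+)-(?P<bad>\d+)"
)


def parse_have(line):
    """Return the fields of a 'have:' hit line, or None if it does not match."""
    m = HAVE_RE.search(line)
    if m is None:
        return None
    return {
        "ip": m["ip"].strip("[]"),            # address that was looked up
        "pos": int(m["pos"]),                 # assumed zero-based position
        "current": m["current"].strip("[]"),  # address printed after "in"
        "index": int(m["index"]),             # assumed one-based position
        "total": int(m["total"]),
        "bad": int(m["bad"]),
    }


def is_contradictory(entry):
    """True if two different addresses appear to share one cache position."""
    same_slot = entry["pos"] + 1 == entry["index"]
    return same_slot and entry["ip"] != entry["current"]
```

Under those assumptions, the line from 09/01 (`[2001:4860:4802:32::78]:443 at 0 in 216.239.38.120 #1/2-0`) is flagged, because an IPv6 address and an IPv4 address both claim position 0.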
>> _______________________________________________
>> squid-users mailing list
>> squid-users at lists.squid-cache.org
>> https://lists.squid-cache.org/listinfo/squid-users