[squid-users] IPv4 addresses go missing - markAsBad wrong?

Alex Rousskov rousskov at measurement-factory.com
Tue Feb 20 21:12:45 UTC 2024


On 2024-02-12 06:46, Stephen Borrill wrote:
> On 16/01/2024 14:37, Alex Rousskov wrote:
>> On 2024-01-16 06:01, Stephen Borrill wrote:
>>> The problem is no different with 6.6. Is there any more debugging I 
>>> can provide, Alex?
>>
>> Yes, but I need to give you a patch that adds that (temporary) 
>> debugging first (assuming I fail to reproduce the problem in the lab). 
>> The ball is on my side (unless somebody else steps in). Unfortunately, 
>> I do not have any free time for any of that right now. If you do not 
>> hear from me sooner, please ping me again on or after February 8, 2024.

> PING!

I reproduced this bug and posted a minimal master/v7 fix for the 
official review: https://github.com/squid-cache/squid/pull/1691

Please test the corresponding patch; it applies to Squid v5 and v6:

https://github.com/squid-cache/squid/commit/7d255a72131217d30af3653cec10452fa53289c3.patch


Thank you,

Alex.


> I will get 6.7 compiled up so we can add debugging to it quickly. It 
> would be good if we could get something in place this week as it is 
> school holidays next week in the UK and so there will be little 
> opportunity to test until afterwards.
> 
>>> On 10/01/2024 12:40, Stephen Borrill wrote:
>>>> On 09/01/2024 15:42, Alex Rousskov wrote:
>>>>> On 2024-01-09 05:56, Stephen Borrill wrote:
>>>>>> On 09/01/2024 09:51, Stephen Borrill wrote:
>>>>>>> On 09/01/2024 03:41, Alex Rousskov wrote:
>>>>>>>> On 2024-01-08 08:31, Stephen Borrill wrote:
>>>>>>>>> I'm trying to determine why squid 6.x (seen with 6.5) connected 
>>>>>>>>> via IPv4-only periodically fails to connect to the destination 
>>>>>>>>> and then requires a restart to fix it (reload is not sufficient).
>>>>>>>>>
>>>>>>>>> The problem appears to be that a host that has one address each 
>>>>>>>>> of IPv4 and IPv6 occasionally has its IPv4 address go missing 
>>>>>>>>> as a destination. On closer inspection, this appears to happen 
>>>>>>>>> when the IPv6 address (not the IPv4) address is marked as bad.
>>>>>
>>>>>> ipcache.cc(990) have: [2001:4860:4802:32::78]:443 at 0 in 
>>>>>> 216.239.38.120 #1/2-0
>>>>>
>>>>>
>>>>> Thank you for sharing more debugging info!
>>>>
>>>> The following seemed odd to. It finds an IPv4 address (this host 
>>>> does not have IPv6), puts it in the cache and then says "No DNS 
>>>> records":
>>>>
>>>> 2024/01/09 12:31:24.020 kid1| 14,4| ipcache.cc(617) nbgethostbyname: 
>>>> schoolbase.online
>>>> 2024/01/09 12:31:24.020 kid1| 14,3| ipcache.cc(313) ipcacheRelease: 
>>>> ipcacheRelease: Releasing entry for 'schoolbase.online'
>>>> 2024/01/09 12:31:24.020 kid1| 14,5| ipcache.cc(670) 
>>>> ipcache_nbgethostbyname_: ipcache_nbgethostbyname: MISS for 
>>>> 'schoolbase.online'
>>>> 2024/01/09 12:31:24.020 kid1| 14,3| ipcache.cc(480) ipcacheParse: 1 
>>>> answers for schoolbase.online
>>>> 2024/01/09 12:31:24.020 kid1| 14,7| ipcache.cc(995) have:  no 
>>>> 20.54.32.34 in [no cached IPs]
>>>> 2024/01/09 12:31:24.020 kid1| 14,7| ipcache.cc(995) have:  no 
>>>> 20.54.32.34 in [no cached IPs]
>>>> 2024/01/09 12:31:24.020 kid1| 14,5| ipcache.cc(549) updateTtl: use 
>>>> first 69 from RR TTL 69
>>>> 2024/01/09 12:31:24.020 kid1| 14,3| ipcache.cc(535) addGood: 
>>>> schoolbase.online #1 20.54.32.34
>>>> 2024/01/09 12:31:24.020 kid1| 14,7| ipcache.cc(253) forwardIp: 
>>>> 20.54.32.34
>>>> 2024/01/09 12:31:24.020 kid1| 44,2| peer_select.cc(1174) handlePath: 
>>>> PeerSelector72389 found conn564274 local=0.0.0.0 
>>>> remote=20.54.32.34:443 HIER_DIRECT flags=1, destination #1 for 
>>>> schoolbase.online:443
>>>> 2024/01/09 12:31:24.020 kid1| 14,3| ipcache.cc(459) latestError: 
>>>> ERROR: DNS failure while resolving schoolbase.online: No DNS records
>>>> 2024/01/09 12:31:24.020 kid1| 14,3| ipcache.cc(586) 
>>>> ipcacheHandleReply: done with schoolbase.online: 20.54.32.34 #1/1-0
>>>> 2024/01/09 12:31:24.020 kid1| 14,7| ipcache.cc(236) finalCallback: 
>>>> 0x1b7381f38  lookup_err=No DNS records
>>>>
>>>> It seemed to happen about the same time as the other failure, so 
>>>> perhaps another symptom of the same.
>>>>
>>>>> The above log line is self-contradictory AFAICT: It says that the 
>>>>> cache has both IPv6-looking and IPv4-looking address at the same 
>>>>> cache position (0) and, judging by the corresponding code, those 
>>>>> two IP addresses are equal. This is not possible (for those 
>>>>> specific IP address values). The subsequent Squid behavior can be 
>>>>> explained by this (unexplained) conflict.
>>>>>
>>>>> I assume you are running official Squid v6.5 code.
>>>>
>>>> Yes, compiled from source on NetBSD. I have the patch I refer to 
>>>> here applied too:
>>>> https://lists.squid-cache.org/pipermail/squid-users/2023-November/026279.html
>>>>
>>>>> I can suggest the following two steps for going forward:
>>>>>
>>>>> 1. Upgrade to the latest Squid v6 in hope that the problem goes away.
>>>>
>>>> I have just upgraded to 6.6.
>>>>
>>>>> 2. If the problem is still there, patch the latest Squid v6 to add 
>>>>> more debugging in hope to explain what is going on. This may take a 
>>>>> few iterations, and it will take me some time to produce the 
>>>>> necessary debugging patch.
>>>>
>>>> Unfortunately, I don't have a test case that will cause the problem 
>>>> so I need to run this at a customer's production site that is 
>>>> particularly affected by it. Luckily, the problem recurs pretty 
>>>> quickly.
>>>>
>>>> Here's a run with 6.6 where the number of destinations drops from 2 
>>>> to 1 before reverting. Not seen this before - usually once it has 
>>>> dropped to 1 (the IPv6 address), it stays there until a restart (and 
>>>> this did happen about a minute after this log fragment). Happy to 
>>>> test out any debugging patch.
>>>>
>>>> 2024/01/10 11:55:49.849 kid1| 14,4| ipcache.cc(617) nbgethostbyname: 
>>>> forcesafesearch.google.com
>>>> 2024/01/10 11:55:49.849 kid1| 14,3| Address.cc(389) lookupHostIP: 
>>>> Given Non-IP 'forcesafesearch.google.com': hostname or servname not 
>>>> provided or not known
>>>> 2024/01/10 11:55:49.849 kid1| 14,4| ipcache.cc(657) 
>>>> ipcache_nbgethostbyname_: ipcache_nbgethostbyname: HIT for 
>>>> 'forcesafesearch.google.com'
>>>> 2024/01/10 11:55:49.849 kid1| 14,7| ipcache.cc(253) forwardIp: 
>>>> [2001:4860:4802:32::78]
>>>> 2024/01/10 11:55:49.849 kid1| 44,2| peer_select.cc(1174) handlePath: 
>>>> PeerSelector300176 found conn2388484 local=[::] 
>>>> remote=[2001:4860:4802:32::78]:443 HIER_DIRECT flags=1, destination 
>>>> #1 for forcesafesearch.google.com:443
>>>> 2024/01/10 11:55:49.849 kid1| 44,2| peer_select.cc(1180) handlePath: 
>>>> always_direct = ALLOWED
>>>> 2024/01/10 11:55:49.849 kid1| 44,2| peer_select.cc(1181) handlePath: 
>>>> never_direct = DENIED
>>>> 2024/01/10 11:55:49.849 kid1| 44,2| peer_select.cc(1182) handlePath: 
>>>> timedout = 0
>>>> 2024/01/10 11:55:49.849 kid1| 14,7| ipcache.cc(253) forwardIp: 
>>>> 216.239.38.120
>>>> 2024/01/10 11:55:49.849 kid1| 44,2| peer_select.cc(1174) handlePath: 
>>>> PeerSelector300176 found conn2388485 local=0.0.0.0 
>>>> remote=216.239.38.120:443 HIER_DIRECT flags=1, destination #2 for 
>>>> forcesafesearch.google.com:443
>>>> 2024/01/10 11:55:49.849 kid1| 44,2| peer_select.cc(1180) handlePath: 
>>>> always_direct = ALLOWED
>>>> 2024/01/10 11:55:49.849 kid1| 44,2| peer_select.cc(1181) handlePath: 
>>>> never_direct = DENIED
>>>> 2024/01/10 11:55:49.849 kid1| 44,2| peer_select.cc(1182) handlePath: 
>>>> timedout = 0
>>>> 2024/01/10 11:55:49.849 kid1| 14,7| ipcache.cc(236) finalCallback: 
>>>> 0x12208e038
>>>> 2024/01/10 11:55:49.849 kid1| 44,2| peer_select.cc(479) 
>>>> resolveSelected: PeerSelector300176 found all 2 destinations for 
>>>> forcesafesearch.google.com:443
>>>> 2024/01/10 11:55:49.849 kid1| 44,2| peer_select.cc(480) 
>>>> resolveSelected:    always_direct = ALLOWED
>>>> 2024/01/10 11:55:49.849 kid1| 44,2| peer_select.cc(481) 
>>>> resolveSelected:     never_direct = DENIED
>>>> 2024/01/10 11:55:49.849 kid1| 44,2| peer_select.cc(482) 
>>>> resolveSelected:         timedout = 0
>>>> 2024/01/10 11:55:49.849 kid1| 14,7| ipcache.cc(990) have: 
>>>> [2001:4860:4802:32::78]:443 at 0 in [2001:4860:4802:32::78] #2/2-0
>>>> 2024/01/10 11:55:49.849 kid1| 14,2| ipcache.cc(1031) markAsBad: 
>>>> [2001:4860:4802:32::78]:443 of forcesafesearch.google.com
>>>> 2024/01/10 11:55:49.855 kid1| 14,7| ipcache.cc(990) have: 
>>>> 216.239.38.120:443 at 0 in [2001:4860:4802:32::78] #2/2-1
>>>> 2024/01/10 11:55:49.855 kid1| 14,2| ipcache.cc(1055) forgetMarking: 
>>>> 216.239.38.120:443 of forcesafesearch.google.com
>>>> 2024/01/10 11:55:49.877 kid1| 14,3| Address.cc(389) lookupHostIP: 
>>>> Given Non-IP 'forcesafesearch.google.com': hostname or servname not 
>>>> provided or not known
>>>> 2024/01/10 11:55:49.877 kid1| 44,2| peer_select.cc(460) 
>>>> resolveSelected: Find IP destination for: 
>>>> forcesafesearch.google.com:443' via forcesafesearch.google.com
>>>> 2024/01/10 11:55:49.877 kid1| 14,4| ipcache.cc(617) nbgethostbyname: 
>>>> forcesafesearch.google.com
>>>> 2024/01/10 11:55:49.877 kid1| 14,3| Address.cc(389) lookupHostIP: 
>>>> Given Non-IP 'forcesafesearch.google.com': hostname or servname not 
>>>> provided or not known
>>>> 2024/01/10 11:55:49.877 kid1| 14,4| ipcache.cc(657) 
>>>> ipcache_nbgethostbyname_: ipcache_nbgethostbyname: HIT for 
>>>> 'forcesafesearch.google.com'
>>>> 2024/01/10 11:55:49.877 kid1| 14,7| ipcache.cc(253) forwardIp: 
>>>> [2001:4860:4802:32::78]
>>>> 2024/01/10 11:55:49.877 kid1| 44,2| peer_select.cc(1174) handlePath: 
>>>> PeerSelector300177 found conn2388493 local=[::] 
>>>> remote=[2001:4860:4802:32::78]:443 HIER_DIRECT flags=1, destination 
>>>> #1 for forcesafesearch.google.com:443
>>>> 2024/01/10 11:55:49.877 kid1| 44,2| peer_select.cc(1180) handlePath: 
>>>> always_direct = ALLOWED
>>>> 2024/01/10 11:55:49.877 kid1| 44,2| peer_select.cc(1181) handlePath: 
>>>> never_direct = DENIED
>>>> 2024/01/10 11:55:49.877 kid1| 44,2| peer_select.cc(1182) handlePath: 
>>>> timedout = 0
>>>> 2024/01/10 11:55:49.877 kid1| 14,7| ipcache.cc(253) forwardIp: 
>>>> 216.239.38.120
>>>> 2024/01/10 11:55:49.877 kid1| 44,2| peer_select.cc(1174) handlePath: 
>>>> PeerSelector300177 found conn2388494 local=0.0.0.0 
>>>> remote=216.239.38.120:443 HIER_DIRECT flags=1, destination #2 for 
>>>> forcesafesearch.google.com:443
>>>> 2024/01/10 11:55:49.877 kid1| 44,2| peer_select.cc(1180) handlePath: 
>>>> always_direct = ALLOWED
>>>> 2024/01/10 11:55:49.877 kid1| 44,2| peer_select.cc(1181) handlePath: 
>>>> never_direct = DENIED
>>>> 2024/01/10 11:55:49.877 kid1| 44,2| peer_select.cc(1182) handlePath: 
>>>> timedout = 0
>>>> 2024/01/10 11:55:49.877 kid1| 14,7| ipcache.cc(236) finalCallback: 
>>>> 0x12208e038
>>>> 2024/01/10 11:55:49.877 kid1| 44,2| peer_select.cc(479) 
>>>> resolveSelected: PeerSelector300177 found all 2 destinations for 
>>>> forcesafesearch.google.com:443
>>>> 2024/01/10 11:55:49.877 kid1| 44,2| peer_select.cc(480) 
>>>> resolveSelected:    always_direct = ALLOWED
>>>> 2024/01/10 11:55:49.877 kid1| 44,2| peer_select.cc(481) 
>>>> resolveSelected:     never_direct = DENIED
>>>> 2024/01/10 11:55:49.877 kid1| 44,2| peer_select.cc(482) 
>>>> resolveSelected:         timedout = 0
>>>> 2024/01/10 11:55:49.877 kid1| 14,7| ipcache.cc(990) have: 
>>>> [2001:4860:4802:32::78]:443 at 0 in [2001:4860:4802:32::78] #2/2-0
>>>> 2024/01/10 11:55:49.877 kid1| 14,2| ipcache.cc(1031) markAsBad: 
>>>> [2001:4860:4802:32::78]:443 of forcesafesearch.google.com
>>>> 2024/01/10 11:55:49.882 kid1| 14,7| ipcache.cc(990) have: 
>>>> 216.239.38.120:443 at 0 in [2001:4860:4802:32::78] #2/2-1
>>>> 2024/01/10 11:55:49.882 kid1| 14,2| ipcache.cc(1055) forgetMarking: 
>>>> 216.239.38.120:443 of forcesafesearch.google.com
>>>>
>>>
>>> _______________________________________________
>>> squid-users mailing list
>>> squid-users at lists.squid-cache.org
>>> https://lists.squid-cache.org/listinfo/squid-users
>>
>> _______________________________________________
>> squid-users mailing list
>> squid-users at lists.squid-cache.org
>> https://lists.squid-cache.org/listinfo/squid-users
>>
> 



More information about the squid-users mailing list