[squid-users] IPv4 addresses go missing - markAsBad wrong?
Stephen Borrill
squid at borrill.org.uk
Tue Jan 9 10:56:16 UTC 2024
On 09/01/2024 09:51, Stephen Borrill wrote:
> On 09/01/2024 03:41, Alex Rousskov wrote:
>> On 2024-01-08 08:31, Stephen Borrill wrote:
>>> I'm trying to determine why squid 6.x (seen with 6.5) connected via
>>> IPv4-only periodically fails to connect to the destination and then
>>> requires a restart to fix it (reload is not sufficient).
>>>
>>> The problem appears to be that a host that has one address each of
>>> IPv4 and IPv6 occasionally has its IPv4 address go missing as a
>>> destination. On closer inspection, this appears to happen when the
>>> IPv6 address (not the IPv4) address is marked as bad. A log fragment
>>> is as follows:
>>>
>>> 2024/01/08 13:18:39.974 kid1| 44,2| peer_select.cc(460)
>>> resolveSelected: Find IP destination for:
>>> clientservices.googleapis.com:443' via clientservices.googleapis.com
>>> 2024/01/08 13:18:39.974 kid1| 44,2| peer_select.cc(1174) handlePath:
>>> PeerSelector82284 found conn696198 local=0.0.0.0
>>> remote=142.250.187.227:443 HIER_DIRECT flags=1, destination #1 for
>>> clientservices.googleapis.com:443
>>> 2024/01/08 13:18:39.974 kid1| 44,2| peer_select.cc(1174) handlePath:
>>> PeerSelector82284 found conn696199 local=[::]
>>> remote=[2a00:1450:4009:820::2003]:443 HIER_DIRECT flags=1,
>>> destination #2 for clientservices.googleapis.com:443
>>> 2024/01/08 13:18:39.974 kid1| 44,2| peer_select.cc(479)
>>> resolveSelected: PeerSelector82284 found all 2 destinations for
>>> clientservices.googleapis.com:443
>>> 2024/01/08 13:18:40.245 kid1| 14,2| ipcache.cc(1031) markAsBad:
>>> [2a00:1450:4009:820::2003]:443 of clientservices.googleapis.com
>>> 2024/01/08 13:18:40.245 kid1| 14,3| ipcache.cc(946) seekNewGood:
>>> succeeded for clientservices.googleapis.com:
>>> [2a00:1450:4009:820::2003] #2/2-1
>>> 2024/01/08 13:18:40.245 kid1| 14,3| ipcache.cc(978) restoreGoodness:
>>> cleared all IPs for clientservices.googleapis.com; now back to
>>> [2a00:1450:4009:820::2003] #2/2-1
>>> 2024/01/08 13:18:42.065 kid1| 14,3| Address.cc(389) lookupHostIP:
>>> Given Non-IP 'clientservices.googleapis.com': hostname or servname
>>> not provided or not known
>>> 2024/01/08 13:18:42.065 kid1| 44,2| peer_select.cc(460)
>>> resolveSelected: Find IP destination for:
>>> clientservices.googleapis.com:443' via clientservices.googleapis.com
>>> 2024/01/08 13:18:42.065 kid1| 14,3| Address.cc(389) lookupHostIP:
>>> Given Non-IP 'clientservices.googleapis.com': hostname or servname
>>> not provided or not known
>>> 2024/01/08 13:18:42.065 kid1| 44,2| peer_select.cc(1174) handlePath:
>>> PeerSelector82372 found conn697148 local=[::]
>>> remote=[2a00:1450:4009:820::2003]:443 HIER_DIRECT flags=1,
>>> destination #1 for clientservices.googleapis.com:443
>>> 2024/01/08 13:18:42.065 kid1| 44,2| peer_select.cc(479)
>>> resolveSelected: PeerSelector82372 found all 1 destinations for
>>> clientservices.googleapis.com:443
>>>
>>>
>>> This shows two subsequent connection attempts to
>>> clientservices.googleapis.com. The first one has both IPv4 and IPv6
>>> destinations. The IPv6 address is passed to markAsBad.
>>
>> Yes.
>>
>>
>>> After that the IPv4 address is not listed as a destination.
>>
>> I do not see that. I see IPv6 address being selected as the first
>> destination (instead of the IPv4 address).
>
> My reading of the the log files is that all possible destinations are
> listed, e.g. "found all 18 destinations for www.googleapis.com:443".
> Prior to markAsBad being called both IP addresses for
> clientservices.googleapis.com are shown, afterwards the IPv4 is not
> listed. It isn't just that IPv6 is being selected as the first
> destination, it is the ONLY destination as the IPv4 address is no longer
> a candidate.
>
>
>> I cannot explain why that happens though. Moreover, a combination of
>> certain lines in your debug output near "seekNewGood" do not make
>> sense to me -- I do not see how it is possible for Squid to display
>> those exact debugging details, but I am probably missing something.
>> Can you retest and repost similar lines with 14,9 (or at least 14,7)
>> added to your debug_options (or share those lines privately; the more
>> lines you can share, the better)?
>
> Thanks, I have started a new run with 14,7 and will pass it on privately
> unless I can distil out the relevant lines to be small enough for the list.
A distilled version is here. Let me know if you want the full log.
This looks suspicious to me:
have: [2001:4860:4802:32::78]:443 at 0 in 216.239.38.120 #1/2-0
2024/01/09 09:34:34.102 kid1| 14,4| ipcache.cc(617) nbgethostbyname:
forcesafesearch.google.com
2024/01/09 09:34:34.102 kid1| 14,4| ipcache.cc(657)
ipcache_nbgethostbyname_: ipcache_nbgethostbyname: HIT for
'forcesafesearch.google.com'
2024/01/09 09:34:34.102 kid1| 14,7| ipcache.cc(253) forwardIp:
216.239.38.120
2024/01/09 09:34:34.102 kid1| 44,2| peer_select.cc(1174) handlePath:
PeerSelector128957 found conn1010793 local=0.0.0.0
remote=216.239.38.120:443 HIER_DIRECT flags=1, destination #1 for
forcesafesearch.google.com:443
2024/01/09 09:34:34.102 kid1| 14,7| ipcache.cc(253) forwardIp:
[2001:4860:4802:32::78]
2024/01/09 09:34:34.102 kid1| 44,2| peer_select.cc(1174) handlePath:
PeerSelector128957 found conn1010794 local=[::]
remote=[2001:4860:4802:32::78]:443 HIER_DIRECT flags=1, destination #2
for forcesafesearch.google.com:443
2024/01/09 09:34:34.102 kid1| 14,7| ipcache.cc(236) finalCallback:
0x18f816d38
2024/01/09 09:34:34.102 kid1| 44,2| peer_select.cc(479) resolveSelected:
PeerSelector128957 found all 2 destinations for
forcesafesearch.google.com:443
2024/01/09 09:34:34.352 kid1| 14,7| ipcache.cc(990) have:
[2001:4860:4802:32::78]:443 at 0 in 216.239.38.120 #1/2-0
2024/01/09 09:34:34.352 kid1| 14,2| ipcache.cc(1031) markAsBad:
[2001:4860:4802:32::78]:443 of forcesafesearch.google.com
2024/01/09 09:34:34.352 kid1| 14,3| ipcache.cc(946) seekNewGood:
succeeded for forcesafesearch.google.com: [2001:4860:4802:32::78] #2/2-1
2024/01/09 09:34:34.352 kid1| 14,3| ipcache.cc(978) restoreGoodness:
cleared all IPs for forcesafesearch.google.com; now back to
[2001:4860:4802:32::78] #2/2-1
2024/01/09 09:34:35.683 kid1| 14,4| ipcache.cc(617) nbgethostbyname:
forcesafesearch.google.com
2024/01/09 09:34:35.683 kid1| 14,3| Address.cc(389) lookupHostIP: Given
Non-IP 'forcesafesearch.google.com': hostname or servname not provided
or not known
2024/01/09 09:34:35.683 kid1| 14,4| ipcache.cc(657)
ipcache_nbgethostbyname_: ipcache_nbgethostbyname: HIT for
'forcesafesearch.google.com'
2024/01/09 09:34:35.683 kid1| 14,7| ipcache.cc(253) forwardIp:
[2001:4860:4802:32::78]
2024/01/09 09:34:35.683 kid1| 44,2| peer_select.cc(1174) handlePath:
PeerSelector128974 found conn1010956 local=[::]
remote=[2001:4860:4802:32::78]:443 HIER_DIRECT flags=1, destination #1
for forcesafesearch.google.com:443
2024/01/09 09:34:35.683 kid1| 14,7| ipcache.cc(236) finalCallback:
0x18f816d38
2024/01/09 09:34:35.683 kid1| 44,2| peer_select.cc(479) resolveSelected:
PeerSelector128974 found all 1 destinations for
forcesafesearch.google.com:443
2024/01/09 09:34:35.683 kid1| 14,7| ipcache.cc(990) have:
[2001:4860:4802:32::78]:443 at 0 in [2001:4860:4802:32::78] #2/2-1
>>> Note that there have been many connections to
>>> clientservices.googleapis.com prior to this where markAsBad was not
>>> called, even though IPv6 connectivity was never available.
>>
>> No markAsBad() is probably normal if Squid did not try to establish an
>> IPv6 connection or did not wait long enough to know the result of that
>> attempt. However, that does not explain why Squid selected an IPv6
>> address as the next "good" address right after marking that IPv6
>> address as bad (at "restoreGoodness" line) when there was another good
>> IP address available. It is as if Squid stored two identical IPv6
>> addresses (and not IPv4 ones), but that should not happen either.
>
> This is tangentially related to this thread too:
> https://lists.squid-cache.org/pipermail/squid-users/2023-November/026266.html
>
> Once only the IPv6 address is being used, then it returns 503 for that
> host and thus can quickly get marked as dead by a downstream squid
> meaning it does not get used at all (and if it's the only peer all
> access stops).
>
More information about the squid-users
mailing list