[squid-users] 6.x gives frequent connection to peer failed - spurious?

Alex Rousskov rousskov at measurement-factory.com
Wed Nov 15 21:55:54 UTC 2023


On 2023-11-10 05:46, Stephen Borrill wrote:

> With 6.x (currently 6.5) there are very frequent (every 10 seconds or 
> so) messages like:
> 2023/11/10 10:25:43 kid1| ERROR: Connection to 127.0.0.1:8123 failed


 > why is this logged as a connection failure

The current error wording is too assuming and, in your case, evidently 
misleading. The phrase "Connection to X failed" should be changed to 
something more general like "Cannot contact cache_peer X" or "Cannot 
communicate with cache_peer X".

CachePeer::countFailure() patches welcome.


 > do I need to worry about it beyond it filing up the logs needlessly?

In short, "yes".

I cannot accurately assess your specific needs, but, in most 
environments, one should indeed worry that their cache_peer server names 
cannot be reliably resolved because failed resolution attempts waste 
Squid resources and increase transaction response time. Moreover, if 
these failures are frequent enough (relative to peer usage attempts), 
the affected cache_peer will be marked as DEAD (as you have mentioned):

 > 2023/11/09 08:55:22 kid1| Detected DEAD Parent: 127.0.0.1:8123


HTH,

Alex.



> 
> With 4.x there were no such messages.
> 
> By comparing to the peer squid logs, these seems to tally with DNS 
> failures:
> peer_select.cc(479) resolveSelected: PeerSelector1688 found all 0 
> destinations for bugzilla.tucasi.com:443
> 
> Full ALL,2 log at the time of the reported connection failure:
> 
> 2023/11/10 10:25:43.162 kid1| 5,2| TcpAcceptor.cc(214) doAccept: New 
> connection on FD 17
> 2023/11/10 10:25:43.162 kid1| 5,2| TcpAcceptor.cc(316) acceptNext: 
> connection on conn3 local=127.0.0.1:8123 remote=[::] FD 17 flags=9
> 2023/11/10 10:25:43.162 kid1| 11,2| client_side.cc(1332) 
> parseHttpRequest: HTTP Client conn13206 local=127.0.0.1:8123 
> remote=127.0.0.1:57843 FD 147 flags=1
> 2023/11/10 10:25:43.162 kid1| 11,2| client_side.cc(1336) 
> parseHttpRequest: HTTP Client REQUEST:
> 2023/11/10 10:25:43.162 kid1| 85,2| client_side_request.cc(707) 
> clientAccessCheckDone: The request CONNECT bugzilla.tucasi.com:443 is 
> ALLOWED; last ACL checked: localhost
> 2023/11/10 10:25:43.162 kid1| 85,2| client_side_request.cc(683) 
> clientAccessCheck2: No adapted_http_access configuration. default: ALLOW
> 2023/11/10 10:25:43.162 kid1| 85,2| client_side_request.cc(707) 
> clientAccessCheckDone: The request CONNECT bugzilla.tucasi.com:443 is 
> ALLOWED; last ACL checked: localhost
> 2023/11/10 10:25:43.162 kid1| 44,2| peer_select.cc(460) resolveSelected: 
> Find IP destination for: bugzilla.tucasi.com:443' via bugzilla.tucasi.com
> 2023/11/10 10:25:43.163 kid1| 44,2| peer_select.cc(479) resolveSelected: 
> PeerSelector1526 found all 0 destinations for bugzilla.tucasi.com:443
> 2023/11/10 10:25:43.163 kid1| 44,2| peer_select.cc(480) resolveSelected: 
>    always_direct = ALLOWED
> 2023/11/10 10:25:43.163 kid1| 44,2| peer_select.cc(481) resolveSelected: 
>     never_direct = DENIED
> 2023/11/10 10:25:43.163 kid1| 44,2| peer_select.cc(482) resolveSelected: 
>         timedout = 0
> 2023/11/10 10:25:43.163 kid1| 4,2| errorpage.cc(1397) buildBody: No 
> existing error page language negotiated for ERR_DNS_FAIL. Using default 
> error file.
> 2023/11/10 10:25:43.163 kid1| 33,2| client_side.cc(617) swanSong: 
> conn13206 local=127.0.0.1:8123 remote=127.0.0.1:57843 flags=1
> 
> If my analysis is correct why is this logged as a connection failure and 
> do I need to worry about it beyond it filing up the logs needlessly?
> 
> My concern is that this could lead to the parent being incorrectly 
> declared DEAD thus impacting other traffic:
> 
> 2023/11/09 08:55:22 kid1| Detected DEAD Parent: 127.0.0.1:8123
>      current master transaction: master4581234
> 



More information about the squid-users mailing list