[squid-users] logging: TCP connection to x.x.x.x/3128 failed
Alex Rousskov
rousskov at measurement-factory.com
Mon Apr 3 00:55:06 UTC 2023
On 3/30/23 07:58, Waldemar Brodkorb wrote:
> we recently updated one of our stage 1 proxies to Ubuntu 22.04 with
> Squid 5.8. The setup is like so:
> clients <-> loadbalancer <-> stage 1 proxies <-> stage 2 proxies <-> internet
>
> Now the cache.log on the stage 1 proxy is polluted with a lot of
> messages like: TCP connection to x.x.x.x/3128 failed
>
> The messages appear when a client tries to connect to a website which
> does not resolve via DNS. The parent stage 2 proxy then sends a
> 50x error and the stage 1 proxy logs messages like the above for each
> of the six stage 2 proxies.
> How can we suppress these messages
I do not think you can suppress these messages without changing Squid
source code.
Please note that when Squid considers the connection to a cache_peer
failed, it decrements the "up" counter for that peer. If that counter
reaches zero, the peer will be marked as dead. The counter starts at
connect-fail-limit=N level. The counter is restored to N when a
connection succeeds, so you may not normally see a dead peer for this
specific reason, but that is just luck.
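To make the counter behavior described above concrete, here is a minimal
Python sketch. It is a toy model, not Squid code: the PeerHealth class and
its method names are hypothetical, and reviving a dead peer on the next
successful connection is an assumption of the sketch.

```python
class PeerHealth:
    """Toy model of the cache_peer "up" counter (hypothetical, not Squid code)."""

    def __init__(self, connect_fail_limit):
        # The counter starts at the connect-fail-limit=N level.
        self.limit = connect_fail_limit
        self.up = connect_fail_limit
        self.alive = True

    def connection_failed(self):
        # Each connection that is considered failed decrements the counter.
        if self.up > 0:
            self.up -= 1
        # When the counter reaches zero, the peer is marked dead.
        if self.up == 0:
            self.alive = False

    def connection_succeeded(self):
        # A successful connection restores the counter to N.
        self.up = self.limit
        # Assumption of this sketch: success also marks the peer alive again.
        self.alive = True


peer = PeerHealth(connect_fail_limit=3)
peer.connection_failed()
peer.connection_failed()
peer.connection_succeeded()  # counter is back at 3

states = []
for _ in range(3):  # three consecutive failures with no success in between
    peer.connection_failed()
    states.append((peer.up, peer.alive))
print(states)
```

In this model, a peer is marked dead only after N failures in a row with no
successful connection in between, which is why interleaved successes tend to
hide the problem.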
> or can they be fixed to what is really happening?
I suspect any fixes would require changing Squid sources. Squid v6
commit 022dbab might be helpful here, at least as an inspiration (it
does not apply to v5 cleanly and it focuses on 4xx peer responses while
your messages are related to 5xx peer responses).
Squid v6 has cache_log_message directive that can be used to suppress
some level-1 errors, but the error you are writing about is not one of
them, and, more importantly, I would not recommend suppressing it (see
the "Please note..." paragraph above for the discussion of this error's
potential importance). That directive is not available in v5.
Ultimately, what you are describing sounds like a Squid bug to me: Squid
should not blame its cache_peer for request-target DNS resolution errors
outside that peer's control. Fixing that bug can be controversial because
an admin may want Squid to automatically start bypassing a cache_peer
that is, for example, misconfigured and cannot resolve _any_
request-targets. One could argue that detection (and bypass) of (such)
problematic cache_peers should be done differently, but it is unknown
whether others would agree with any specific logic changes in that area.
For example, the proposal to add an ACL-driven cache_peer_fault
directive[1] (to give the admin more control over alive/dead decisions)
was rejected as "overkill"[2], preserving the approach that relies on
hard-coding decisions (including commit 022dbab fixes mentioned above).
[1]
https://github.com/squid-cache/squid/blob/25431f18f2f5e796b8704c85fc51f93b6cc2a73d/src/cf.data.pre#L4019
[2] https://github.com/squid-cache/squid/pull/1166#issuecomment-1295806530
Going forward, one can try to argue that HTTP 502 cache_peer responses
should never be treated as that cache_peer's fault (as far as alive/dead
decisions are concerned) despite the fact that some 502 responses may be
related to cache_peer problems rather than basic DNS errors. That would
be a straightforward change, especially in v6. (If that attempt fails,)
one could also try to re-introduce the cache_peer_fault directive to let
the admin decide.
https://wiki.squid-cache.org/SquidFaq/AboutSquid#how-to-add-a-new-squid-feature-enhance-of-fix-something
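The 502 idea above could be pictured roughly as follows. This is only an
illustrative Python sketch, not Squid code: the counts_as_peer_fault function
and its exempt_502 flag are hypothetical, and treating every 5xx peer
response as a peer fault in the baseline case is a simplifying assumption,
not a description of what Squid actually does.

```python
def counts_as_peer_fault(status, exempt_502=False):
    """Hypothetical predicate: should a cache_peer response with this HTTP
    status count against the peer in alive/dead decisions?"""
    # Proposed change: never blame the peer for a 502 response.
    if exempt_502 and status == 502:
        return False
    # Simplifying assumption: any other 5xx peer response counts as a fault.
    return 500 <= status <= 599


for status in (200, 404, 502, 503):
    print(status,
          counts_as_peer_fault(status),
          counts_as_peer_fault(status, exempt_502=True))
```

The trade-off is the one mentioned above: with the exemption, a cache_peer
that answers 502 because of its own problems would no longer be bypassed
automatically.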
Cheers,
Alex.