[squid-users] Squid "suspending ICAP service for too many failures"

Alex Rousskov rousskov at measurement-factory.com
Fri Jan 29 19:38:25 UTC 2021


On 1/29/21 11:55 AM, Andrea Venturoli wrote:

> I see Squid connections to C-ICAP starting to time out:
> when the number of errors reach 10, Squid marks squidclamav service as
> "suspended".

> No big surprise.

IIRC, you did not disclose timeout suspicions before. This explanation
is news to me, and it eliminates several suspects.


> Still I don't get any more insight (Is C-ICAP choking?
> Why? What data triggers this?).

If you are talking about Squid timing out when attempting to establish a
TCP connection with the ICAP server, then this may be as much insight as
you can get from the Squid side. There is no ICAP "data" at that
connection establishment stage. It is a fairly low-level operation that
Squid and c-icap have little control over. The problem is probably
outside Squid.

I do not know much about c-icap, but I would check whether its
configuration or something like crontab results in hourly restarts and
associated loss of connectivity. The network interface or the routing
tables might also be reset hourly for some reason. The ICAP
server/service might be running out of descriptors or memory.

One potentially useful test is to try to connect to the ICAP server
_while the problem is happening_ using telnet or netcat. When Squid
cannot establish a connection, can you? If the ICAP service is not
running on the Squid box, then try this test both from the Squid box and
from the ICAP box.
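
If telnet or netcat is not handy, the same check can be scripted. Below
is a minimal sketch in Python, not something Squid or c-icap ships: the
function name probe_tcp and the 5-second default timeout are my own
choices for illustration. It makes one TCP connection attempt and
reports whether it succeeded, how long it took, and the error if any.

```python
import socket
import time


def probe_tcp(host, port, timeout=5.0):
    """Attempt one TCP connection.

    Returns (ok, seconds_elapsed, error_text). No ICAP bytes are sent;
    this only exercises connection establishment, the stage at which
    Squid is reportedly timing out.
    """
    start = time.monotonic()
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True, time.monotonic() - start, ""
    except OSError as exc:
        # Covers timeouts, refused connections, unreachable hosts, etc.
        elapsed = time.monotonic() - start
        return False, elapsed, f"{type(exc).__name__}: {exc}"
```

For example, probe_tcp("127.0.0.1", 1344) would test a service listening
on the standard ICAP port on the same box; substitute the host and port
from your own icap_service line. A refused connection and a timeout
point at different culprits, so the error text is worth recording.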

Packet captures can tell you whether other Squid-ICAP server connections
were active at the time, whether from-Squid SYN packets were able to
reach the ICAP server, etc.

In other words, basic network troubleshooting steps...


> Is it a really bad idea to raise icap_connect_timeout?

A higher timeout will delay HTTP client transactions for longer periods of
time, of course. If you want to go down the road of finding workarounds,
then check whether raising that timeout actually helps. It is not yet
clear (to me) whether the connections just need more time to be
established or are simply doomed.
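
One way to answer that question is to time connection attempts made with
a generous timeout during a bad period, and then count how many would
have completed under the current and the raised icap_connect_timeout.
The sketch below only does the counting. The function name and the
(ok, seconds) input format are assumptions of mine; the measurements
themselves have to come from your own probes.

```python
def summarize_attempts(attempts, current_timeout, raised_timeout):
    """Classify measured connection attempts against two timeouts.

    attempts: iterable of (ok, seconds) pairs, where ok says whether the
    connection was eventually established and seconds is how long the
    attempt took. Timeouts are in seconds.
    """
    summary = {"ok_with_current": 0, "ok_only_if_raised": 0, "doomed": 0}
    for ok, seconds in attempts:
        if ok and seconds <= current_timeout:
            # Would have succeeded without any configuration change.
            summary["ok_with_current"] += 1
        elif ok and seconds <= raised_timeout:
            # Slow, but a higher timeout would have rescued it.
            summary["ok_only_if_raised"] += 1
        else:
            # Failed outright, or too slow even for the raised timeout.
            summary["doomed"] += 1
    return summary
```

If nearly everything lands in "doomed", raising the timeout only makes
clients wait longer for the same error.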


> Same for disabling icap_service_failure_limit?

This is an essential ICAP service (icap_service bypass=off). I assume
there is no backup service -- no adaptation_service_set in play here. If
so, disabling the limit means that fewer HTTP transactions will be
inconvenienced in the long run than if the service were to be suspended.
Hence, fewer ICAP errors will be delivered to Squid clients.

You can also enable bypass.

Fixing the problem would be a much better solution, of course.


HTH,

Alex.

