[squid-users] Change of server hardware (?) resulted in massive increase of crashes

Tue Sep 22 13:52:57 UTC 2020

On 9/22/20 3:47 AM, Ralf Hildebrandt wrote:
> I'm getting (with the same hand-built squid versions!) a LOT
> (about once every 15 minutes) of crashes like this one:
> 
> 2020/09/22 09:34:07| FATAL: check failed: opening()
>     exception location: tunnel.cc(1305) noteDestinationsEnd
>     current master transaction: master359979

This is still bug #5055. I hope we will post an official pull request
properly addressing it soon.

In my environment, Squid v5 is hardly usable without those fixes but, as
you know, YMMV. Your OS upgrade could trigger different DNS resolution
timings, the new cluster may have a different IPv6 connectivity profile,
or there can be similar minor/innocent changes that result in slightly
different Squid state and more exceptions. I would not spend time trying
to pinpoint the exact trigger.

I updated bug #5055 with a patch that covers the tunneling case:
https://bugs.squid-cache.org/show_bug.cgi?id=5055#c5

> My infrastructure generates backtraces upon crash, but in this case I'm
> not getting any.

Unlike "assertion failed" FATAL messages, the "check failed" FATAL
messages are the result of an unhandled (for lack of a better word)
exception. Today, such exceptions do not generate core dumps because the
low-level stack is pretty much lost by the time the exception is caught
by the high-level code. Unhandled exception handling (yes, I know) may
change in the future, but that is a separate issue.
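
To illustrate why the low-level stack is gone by the time a high-level
handler runs, here is a minimal C++ sketch. It is not Squid code: Frame,
lowLevel(), midLevel(), and runAndReportDepthAtCatch() are hypothetical
names made up for this example, and the RAII counter only stands in for
real stack frames. It assumes nothing about how Squid actually catches or
reports these exceptions.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for a stack frame: its name is recorded while the
// enclosing function is still on the call stack.
struct Frame {
    static std::vector<std::string> &live() {
        static std::vector<std::string> frames;
        return frames;
    }
    explicit Frame(const std::string &name) { live().push_back(name); }
    ~Frame() { live().pop_back(); }
};

// Number of live frames at the moment the exception is thrown.
std::size_t depthAtThrow = 0;

// Hypothetical low-level code that detects a broken invariant and throws.
void lowLevel() {
    Frame f("lowLevel");
    depthAtThrow = Frame::live().size();
    throw std::runtime_error("check failed: opening()");
}

void midLevel() {
    Frame f("midLevel");
    lowLevel();
}

// Hypothetical high-level handler. Returns the number of frames still alive
// when the catch block runs; the lower frames have already been unwound.
std::size_t runAndReportDepthAtCatch() {
    Frame f("highLevel");
    try {
        midLevel();
    } catch (const std::exception &) {
        return Frame::live().size();
    }
    return 0;
}
```

Calling runAndReportDepthAtCatch() returns 1, while depthAtThrow is 3:
the two lower frames are destroyed before the handler gets control, so a
core dump taken from the handler would no longer show where the check
failed. An assertion that aborts at the point of failure keeps all three
frames on the stack.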

HTH,

Alex.