[squid-dev] Squid 3.5.23: crash in Comm::DoSelect
Alex Rousskov
rousskov at measurement-factory.com
Tue Oct 18 14:48:26 UTC 2016
On 10/18/2016 03:44 AM, oleg gv wrote:
> nfds=284, so loop ends on 283 and pfds[283] is buggy
> I/o module is src/comm/ModPoll.cc, method Comm::DoSelect(int msec)
> On stack we see that pfds[SQUID_MAXFD=256], so is less than nfds in loop.
> May be malloc nfds?
If your maxfd is bigger than SQUID_MAXFD, then the bug is elsewhere and
dynamically allocating pfds is not the right fix (even though it will
"work").
I suspect your Squid is creating or accepting a descriptor that
exceeds SQUID_MAXFD-1. Biggest_FD+1 cannot be allowed to exceed the
misnamed SQUID_MAXFD limit.
This combination looks like a big red flag to me:
    struct pollfd pfds[SQUID_MAXFD];
    ...
    maxfd = Biggest_FD + 1;
    for (int i = 0; i < maxfd; ++i) {
        ...
        pfds[nfds].fd = i;
That code is missing assert(maxfd <= SQUID_MAXFD) which will fail in
your case.
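
A minimal sketch of that check, outside of Squid. The names kMaxFd,
pollSetFits, and fillPollSet are hypothetical stand-ins (kMaxFd plays the
role of SQUID_MAXFD), and the real loop adds only descriptors that have
handlers. The sketch reports the overflow instead of aborting so that the
failing case is easy to see:

```cpp
#include <cassert>
#include <poll.h>

// Hypothetical stand-in for the compile-time SQUID_MAXFD limit.
constexpr int kMaxFd = 256;

// True if descriptors 0..biggestFd fit into an array of kMaxFd entries.
// Squid itself would express this as assert(maxfd <= SQUID_MAXFD).
bool pollSetFits(int biggestFd) {
    const int maxfd = biggestFd + 1;
    return maxfd <= kMaxFd;
}

// Fills pfds with descriptors 0..biggestFd and returns the entry count.
// Returns -1, without writing anything, if the array would overflow.
int fillPollSet(int biggestFd, struct pollfd (&pfds)[kMaxFd]) {
    if (!pollSetFits(biggestFd))
        return -1;

    int nfds = 0;
    for (int i = 0; i <= biggestFd; ++i) {
        pfds[nfds].fd = i;
        pfds[nfds].events = POLLIN;
        pfds[nfds].revents = 0;
        ++nfds;
    }
    assert(nfds <= kMaxFd); // invariant guaranteed by the check above
    return nfds;
}
```

With the values from this thread (Biggest_FD = 292 and a 256-entry pfds),
pollSetFits(292) is false, which is the condition the missing assert would
have caught before the loop wrote past the end of the array.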
If you want a workaround, try building Squid with a reasonable number of
maximum descriptors (e.g., 16K, 32K, or 64K). If that number is never
reached in your environment, the code will appear to work.
If you want to try a quick fix, replace SQUID_MAXFD with (Biggest_FD +
1) when declaring pfds. You may need to ignore/disable compiler warnings
about C++ arrays with dynamic sizes. Alternatively, you can allocate
pfds dynamically (as you suggested).
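
A rough sketch of the dynamic-allocation variant, assuming a std::vector
is acceptable in that code path. The function name makePollSet is
hypothetical, and the sketch polls every descriptor for POLLIN rather than
consulting per-descriptor handlers as Squid does. It only hides the
symptom; Biggest_FD can still exceed SQUID_MAXFD elsewhere:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>
#include <poll.h>

// Builds a poll set sized from the current biggest descriptor instead of
// a compile-time limit. Returns an empty set if biggestFd is negative.
std::vector<pollfd> makePollSet(int biggestFd) {
    std::vector<pollfd> pfds;
    if (biggestFd < 0)
        return pfds;

    const std::size_t maxfd = static_cast<std::size_t>(biggestFd) + 1;
    pfds.reserve(maxfd);
    for (int fd = 0; fd <= biggestFd; ++fd) {
        pollfd p{};          // zero-initialize, so revents starts at 0
        p.fd = fd;
        p.events = POLLIN;
        pfds.push_back(p);
    }
    assert(pfds.size() == maxfd); // one entry per descriptor
    return pfds;
}
```

The resulting vector can be handed to poll() as pfds.data() with
pfds.size() entries.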
If you want to fix the bug, audit all Biggest_FD- and
SQUID_MAXFD-related code to make sure the two are always in sync.
HTH,
Alex.
> 2016-10-18 8:29 GMT+03:00 Amos Jeffries:
>
> FYI: Squid-3.5.23 does not exist yet. What is the output of "squid -v" ?
>
> On 18/10/2016 5:01 a.m., oleg gv wrote:
> > I have big traffic (at least 100 computers) , and squid often crashed in
> > Comm::DoSelect(int msec) function.
> > I have interception mode and NAT redirect.
> >
> > In the core dump I saw that the bug is in the following fragment of code:
> >
> > 446│ for (size_t loopIndex = 0; loopIndex < nfds; ++loopIndex) {
> > 447│ fde *F;
> > 448│ int revents = pfds[loopIndex].revents;
> > 449│ fd = pfds[loopIndex].fd;
> > 450│
> > 451│ if (fd == -1)
> > 452│ continue;
> > 453│
> > 454├> if (fd_table[fd].flags.read_pending)
> > 455│ revents |= POLLIN;
> >
> > SIGSEGV occurred often (about once a minute) at line 454: fd=-66012128,
> > loopIndex=283
> >
> > (gdb) p pfds[282]
> > $17 = {fd = 291, events = 64, revents = 0} -- looks ok
> >
> > (gdb) p pfds[283]
> > $18 = {fd = -66012128, events = 32595, revents = 0} -- looks strange and
> > spoiled
> >
> > (gdb) p Biggest_FD
> > $19 = 292
> >
>
> What is the nfds value ?
>
> It looks to me like only 282 FD have operations to perform on this I/O
> cycle.
>
> What I/O module is being used?
>
> src/comm/ModDevPoll.cc:Comm::DoSelect(int msec)
> src/comm/ModPoll.cc:Comm::DoSelect(int msec)
>
>
> Amos
>
> _______________________________________________
> squid-dev mailing list
> squid-dev at lists.squid-cache.org
> http://lists.squid-cache.org/listinfo/squid-dev