[squid-users] Socket handle leak?

Fri Jul 12 12:05:11 UTC 2024

Hi,

In my setup (also Ubuntu) I have made these changes:

root at proxy: # cat /etc/security/limits.d/squid.conf
squid        soft    nofile  64000
squid        hard    nofile  65500

root at proxy: # cat /etc/squid/squid.conf | grep max_file
max_filedesc 64000

This forces the system limits for the squid process and tells squid how many FDs it can consume.
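
To confirm that the running process really got the limits configured above, one option is to read the "Max open files" row of /proc/<pid>/limits. Below is a minimal Python sketch, assuming a Linux /proc layout; the function name parse_max_open_files is made up for illustration and is not part of squid.

```python
def _to_int(value):
    # The kernel prints "unlimited" when no numeric limit is set.
    return None if value == "unlimited" else int(value)


def parse_max_open_files(limits_text):
    """Return (soft, hard) from the 'Max open files' row of /proc/<pid>/limits.

    Returns None if the row is not present in the given text.
    """
    for line in limits_text.splitlines():
        if line.startswith("Max open files"):
            fields = line.split()
            # Row layout: Max open files <soft> <hard> files
            return _to_int(fields[3]), _to_int(fields[4])
    return None
```

Feeding it the contents of /proc/<squid pid>/limits should give (64000, 65500) if the limits.d entries above are in effect for that process.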

Regards,

Yvain PAYEN

From: squid-users <squid-users-bounces at lists.squid-cache.org> On behalf of paolo.prinx at gmail.com
Sent: Friday, 12 July 2024 12:58
To: squid-users at lists.squid-cache.org
Subject: [squid-users] Socket handle leak?

Hello,
   apologies in advance for the silly question.

We are having some stability issues with our squid farms after a recent upgrade from CentOS/Squid 3.5.x to Ubuntu/Squid 5.7/6.9. I wonder if anyone here has seen something similar and might have some suggestions about what we are obviously missing?

In short, after running for a certain period the servers run out of file descriptors. We see a slowly growing number of TCP or TCPv6 socket handles that eventually hits the configured maximum. The handles do not get released until squid is restarted (-k restart).

It is somewhat similar to what is reported under https://access.redhat.com/solutions/3362211 . They state that

  *   If an application fails to close() its socket descriptors and continues to allocate new sockets then it can use up all the system memory on TCP(v6) slab objects.
  *   Note some of these sockets will not show up in /proc/net/sockstat(6). Sockets that still have a file descriptor but are in the TCP_CLOSE state will consume a slab object. But will not be accounted for in /proc/net/sockstat(6) or "ss" or "netstat".
  *   It can be determined whether this is an application sockets leak, by stopping the application processes that are consuming sockets. If the slab objects in /proc/slabinfo are freed then the application is responsible. As that means that destructor routines have found open file descriptors to sockets in the process.

"This is most likely to be a case of the application not handling error conditions correctly and not calling close() to free the FD and socket."
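
The second point quoted above (sockets that still have a file descriptor but are absent from the kernel's TCP tables) can be checked per process. The following is a rough Python sketch, assuming Linux, where each entry in /proc/<pid>/fd that refers to a socket is a symlink of the form socket:[inode], and the inode is the tenth whitespace-separated column of /proc/net/tcp and /proc/net/tcp6. All function names are illustrative. Note that the difference also includes UNIX, UDP and netlink sockets, so it is an upper bound rather than an exact count of leaked TCP sockets.

```python
import os
import re

# /proc/<pid>/fd entries that refer to sockets look like "socket:[5428173]".
SOCKET_LINK = re.compile(r"^socket:\[(\d+)\]$")


def socket_inodes_of(pid):
    """Inodes of all sockets the process holds a descriptor for.

    Needs permission to read /proc/<pid>/fd (same user or root).
    """
    inodes = set()
    fd_dir = f"/proc/{pid}/fd"
    for name in os.listdir(fd_dir):
        try:
            target = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            continue  # descriptor was closed while we were scanning
        match = SOCKET_LINK.match(target)
        if match:
            inodes.add(int(match.group(1)))
    return inodes


def listed_tcp_inodes(table_text):
    """Inodes found in the text of /proc/net/tcp or /proc/net/tcp6."""
    inodes = set()
    for line in table_text.splitlines()[1:]:  # first line is the header
        fields = line.split()
        if len(fields) > 9:
            inodes.add(int(fields[9]))
    return inodes


def unlisted_sockets(held, listed):
    """Socket inodes that have a descriptor but no TCP table entry."""
    return held - listed
```

One way to use it: build `held` from socket_inodes_of(<squid pid>), build `listed` from the union of listed_tcp_inodes() over the contents of /proc/net/tcp and /proc/net/tcp6, and watch whether len(unlisted_sockets(held, listed)) keeps growing over time.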

For example, on a server with squid 5.7, unmodified package:

List of open files:
lsof |wc -l
56963

of which 35K in TCPv6:
lsof |grep proxy |grep TCPv6 |wc -l
    35301

Under /proc I see fewer objects:
    cat  /proc/net/tcp6 |wc -l
    3095

But the number of objects in the slabs is high:
    cat /proc/slabinfo |grep TCPv6
    MPTCPv6                0      0   2048   16    8 : tunables    0    0    0 : slabdata      0      0      0
    tw_sock_TCPv6       1155   1155    248   33    2 : tunables    0    0    0 : slabdata     35     35      0
    request_sock_TCPv6      0      0    304   26    2 : tunables    0    0    0 : slabdata      0      0      0
    TCPv6              38519  38519   2432   13    8 : tunables    0    0    0 : slabdata   2963   2963      0
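
To put the slab numbers in terms of memory: the first three numeric columns of /proc/slabinfo are active objects, total objects and object size in bytes, so the TCPv6 row above corresponds to 38519 x 2432 bytes, roughly 89 MiB. A small Python sketch of that arithmetic follows; the function name slab_usage is hypothetical, and the figure ignores per-slab overhead.

```python
def slab_usage(slabinfo_text, prefix="TCP"):
    """Map cache name -> (active_objs, num_objs, bytes_for_all_objs).

    Only caches whose name starts with `prefix` are included.
    The byte count is num_objs * objsize and ignores slab overhead.
    """
    usage = {}
    for line in slabinfo_text.splitlines():
        fields = line.split()
        if len(fields) < 4 or not fields[0].startswith(prefix):
            continue
        try:
            active, total, objsize = (int(x) for x in fields[1:4])
        except ValueError:
            continue  # header or malformed row
        usage[fields[0]] = (active, total, total * objsize)
    return usage
```

Reading /proc/slabinfo usually requires root, so the text passed in would typically come from a privileged shell or a saved copy of the file.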

I have 35K lines like this:
    lsof |grep proxy |grep TCPv6 |more
    squid        1049              proxy   13u     sock                0,8        0t0    5428173 protocol: TCPv6
    squid        1049              proxy   14u     sock                0,8        0t0   27941608 protocol: TCPv6
    squid        1049              proxy   24u     sock                0,8        0t0   45124047 protocol: TCPv6
    squid        1049              proxy   25u     sock                0,8        0t0   50689821 protocol: TCPv6
...

We thought maybe this was a weird IPv6 thing, as we only route IPv4, so we compiled a more recent version of squid with no IPv6 support. The problem just moved to IPv4 TCP sockets:

lsof |wc -l
120313

cat /proc/slabinfo |grep TCP
MPTCPv6                0      0   2048   16    8 : tunables    0    0    0 : slabdata      0      0      0
tw_sock_TCPv6          0      0    248   33    2 : tunables    0    0    0 : slabdata      0      0      0
request_sock_TCPv6      0      0    304   26    2 : tunables    0    0    0 : slabdata      0      0      0
TCPv6                208    208   2432   13    8 : tunables    0    0    0 : slabdata     16     16      0
MPTCP                  0      0   1856   17    8 : tunables    0    0    0 : slabdata      0      0      0
tw_sock_TCP         5577   5577    248   33    2 : tunables    0    0    0 : slabdata    169    169      0
request_sock_TCP    1898   2002    304   26    2 : tunables    0    0    0 : slabdata     77     77      0
TCP               102452 113274   2240   14    8 : tunables    0    0    0 : slabdata   8091   8091      0

cat /proc/net/tcp |wc -l
255

After restarting squid the slab objects are released and the open file descriptors drop to a reasonable value. This further suggests it is squid hanging on to these FDs.
lsof |grep proxy |wc -l
1221

Any suggestions? I guess it's something blatantly obvious, but we have been looking at this for a couple of days and we're not getting anywhere...

Thanks again
