[squid-users] External helper consumes too many DB connections

Alex Rousskov rousskov at measurement-factory.com
Tue Feb 8 15:12:23 UTC 2022


On 2/8/22 09:50, roee klinger wrote:

> Alex: If there are a lot more requests than your users/TTLs should
>       generate, then you may be able to decrease db load by figuring out
>       where the extra requests are coming from.

> Actually, I don't think it matters much now that I think about it
> again, since, per my requirements, I need to reload the cache every
> 60 seconds, which means that even if caching were perfect, MariaDB
> would still get a high load. I think the second approach will be
> better suited.

Your call. Wiping out the entire authentication cache every 60 seconds 
feels odd, but I do not know enough about your environment to judge.


> Alex: aggregating helper-db connections (helpers can be written to
>       talk through a central connection aggregator)
> 

> That sounds like exactly what I am looking for, how would one go about 
> doing this?

You have at least two basic options:

A. Enhance Squid to let SMP workers share helpers. I assume that you 
have C SMP workers and N helpers per worker, with C and N significantly 
greater than 1. Instead of having N helpers per worker and C*N helpers 
total, you would have just one concurrent helper per worker and C 
helpers total. This would be a significant, generally useful improvement 
that should be officially accepted if implemented well. The enhancement 
requires serious modifications in a neglected, error-prone area of Squid 
code, but it is certainly doable -- Squid already shares rock diskers 
across workers, for example.

B. Convert your helper from a database client program to an Aggregator 
client program (and write the Aggregator). Depending on your needs and 
skill, you can use TCP or Unix Domain Sockets (UDS) for 
helper-Aggregator communication. The Aggregator may look very similar to 
the current helper, except it will not use stdin/stdout for 
receiving/sending helper queries/responses. This option also requires 
development, but it is much simpler than option A.
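
To make option B concrete, here is a rough sketch in Python (since your 
helpers already are). Everything here is an assumption, not an existing 
Squid or MariaDB interface: the socket path, the one-line-per-query 
protocol, and the check_credentials() stub standing in for your real 
MariaDB lookup. The Aggregator accepts helper connections on a Unix 
socket and funnels every lookup through a single shared database 
connection, so MariaDB sees one connection instead of C*N.

```python
import socket
import socketserver
import sys
import threading

# Hypothetical socket path; pick whatever fits your deployment.
SOCKET_PATH = "/run/squid/auth-aggregator.sock"

# One lock serializes access to the single shared DB connection.
db_lock = threading.Lock()


def check_credentials(username, password):
    """Stand-in for the real MariaDB lookup; replace with your query."""
    with db_lock:
        return username == "demo" and password == "secret"


class AuthHandler(socketserver.StreamRequestHandler):
    """Speaks the same line protocol a basic auth helper sees on stdin:
    one "username password" line in, one "OK" or "ERR" line out."""

    def handle(self):
        for line in self.rfile:
            try:
                user, pw = line.decode().split(maxsplit=1)
            except ValueError:
                self.wfile.write(b"ERR\n")
                continue
            ok = check_credentials(user, pw.strip())
            self.wfile.write(b"OK\n" if ok else b"ERR\n")


class Aggregator(socketserver.ThreadingUnixStreamServer):
    daemon_threads = True


def serve(path=SOCKET_PATH):
    """Run the Aggregator as its own long-lived process."""
    with Aggregator(path, AuthHandler) as server:
        server.serve_forever()


def helper_main(path=SOCKET_PATH):
    """The new helper: a thin shim that connects to the Aggregator once
    and relays each stdin query, echoing the reply back to Squid."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as conn:
        conn.connect(path)
        reply = conn.makefile("rb")
        for line in sys.stdin:
            conn.sendall(line.encode())
            sys.stdout.write(reply.readline().decode())
            sys.stdout.flush()
```

Each helper process thus holds one socket to the Aggregator instead of 
its own MariaDB connection, and the Aggregator is the only database 
client left to tune.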


HTH,

Alex.


> On Tue, Feb 8, 2022 at 4:41 PM Alex Rousskov wrote:
> 
>     On 2/8/22 09:13, roee klinger wrote:
> 
>      > I am running multiple instances of Squid in a K8S environment, each
>      > Squid instance has a helper that authenticates users based on their
>      > username and password, the scripts are written in Python.
>      >
>      > I have been facing an issue: when under load, the helpers (even
>      > with a 3600 sec TTL) swamp the MariaDB instance, causing it to
>      > reach 100% CPU, I believe because each helper opens up its own
>      > connection to MariaDB, which adds up to a lot of connections.
>      >
>      > My initial idea was to create a Redis DB next to each Squid
>      > instance and connect each Squid to its own dedicated Redis. I
>      > will sync Redis with MariaDB every minute, thus decreasing the
>      > connection count from a few hundred to just one per minute. This
>      > will also improve speeds, since Redis is much faster than
>      > MariaDB.
>      >
>      > The problem is, however, that there will still be many
>      > connections from Squid to Redis, and that will probably consume
>      > a lot of DB resources
>      > as well, which I don't actually know how to optimize, since it seems
>      > that Squid opens many processes, and there is no way to get them
>      > to talk to each other (except TTL values, which seem not to help
>      > in my case, and I don't understand why that is).
>      >
>      > What is the best practice to handle this, considering I have the
>      > following requirements:
>      >
>      >     1. Fast
>      >     2. Refresh data every minute
>      >     3. Consume the least amount of DB resources possible
> 
>     I would start from the beginning: Does the aggregate number of database
>     requests match your expectations? In other words, do you see lots of
>     database requests that should not be there given your user access
>     patterns and authentication TTLs? In yet other words, are there many
>     repeated authentication accesses that should have been authentication
>     cache hits?
> 
>     If there are a lot more requests than your users/TTLs should generate,
>     then you may be able to decrease db load by figuring out where the
>     extra requests are coming from. For example, it is possible that your
>     authentication cache key includes some noise that renders caching
>     ineffective (e.g., see comments about key_extras in
>     squid.conf.documented). Or maybe you need a bigger authentication cache.
> 
>     If the total stream of authentication requests during peak hours is
>     reasonable, with few unwarranted cache misses, then you can start
>     working on aggregating helper-db connections (helpers can be written to
>     talk through a central connection aggregator) and/or adding database
>     power (e.g., by introducing additional databases running on previously
>     unused hardware -- just like your MariaDB idea).
> 
> 
>     Cheers,
> 
>     Alex.
> 


