[squid-users] External helper consumes too many DB connections
Alex Rousskov
rousskov at measurement-factory.com
Tue Feb 8 15:12:23 UTC 2022
On 2/8/22 09:50, roee klinger wrote:
> Alex: If there are a lot more requests than your users/TTLs should
> generate, then you may be able to decrease db load by figuring out
> where the extra requests are coming from.
>
> Actually, I don't think it matters much now that I think about it
> again, since as per my requirements, I need to reload the cache every
> 60 seconds, which means that even if it is perfect, MariaDB will
> still get a high load. I think the second approach will be better
> suited.
Your call. Wiping out the entire authentication cache every 60 seconds
feels odd, but I do not know enough about your environment to judge.
> Alex: aggregating helper-db connections (helpers can be written to
> talk through a central connection aggregator)
>
> That sounds like exactly what I am looking for, how would one go about
> doing this?
You have at least two basic options:
A. Enhance Squid to let SMP workers share helpers. I assume that you
have C SMP workers and N helpers per worker, with C and N significantly
greater than 1. Instead of having N helpers per worker and C*N helpers
total, you will have just one concurrent helper per worker and C helpers
total. This will be a significant, generally useful improvement that
should be officially accepted if implemented well. This enhancement
requires serious Squid code modifications in a neglected error-prone
area, but it is certainly doable -- Squid already shares rock diskers
across workers, for example.
B. Convert your helper from a database client program to an Aggregator
client program (and write the Aggregator). Depending on your needs and
skill, you can use TCP or Unix Domain Sockets (UDS) for
helper-Aggregator communication. The Aggregator may look very similar to
the current helper, except it will not use stdin/stdout for
receiving/sending helper queries/responses. This option also requires
development, but it is much simpler than option A.
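
For illustration only, option B could be sketched in Python roughly as
follows. This is a minimal sketch under several assumptions, not a
finished design: the helper speaks the simple non-concurrent basic
authentication protocol (one "username password" line in, "OK" or "ERR"
out), the helper-Aggregator wire format is the same line-based exchange
over a UDS, and the database lookup is replaced by a hypothetical
in-memory dictionary guarded by a lock, standing in for the single
shared MariaDB connection. All names (CREDENTIALS, check_credentials,
AggregatorHandler, Aggregator, helper_loop, demo) are made up for this
example, and URL-decoding of credentials is omitted.

```python
import io
import os
import socket
import socketserver
import sys
import tempfile
import threading

# Hypothetical stand-in for the one shared MariaDB connection.
# A real Aggregator would run its query here instead of a dict lookup.
CREDENTIALS = {"alice": "secret"}
DB_LOCK = threading.Lock()  # serializes access to the single "connection"


def check_credentials(username, password):
    """Return True if the pair is valid; all helpers funnel through here."""
    with DB_LOCK:
        return CREDENTIALS.get(username) == password


class AggregatorHandler(socketserver.StreamRequestHandler):
    """Serves one helper connection: 'user pass' lines in, OK/ERR out."""

    def handle(self):
        for raw in self.rfile:
            parts = raw.decode().split()
            ok = len(parts) == 2 and check_credentials(parts[0], parts[1])
            self.wfile.write(b"OK\n" if ok else b"ERR\n")


class Aggregator(socketserver.ThreadingMixIn, socketserver.UnixStreamServer):
    """One thread per connected helper, one database behind all of them."""
    daemon_threads = True


def helper_loop(sock_path, infile=sys.stdin, outfile=sys.stdout):
    """Helper side: relay Squid's request lines to the Aggregator over UDS."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.connect(sock_path)
        with sock.makefile("r") as replies:
            for line in infile:
                sock.sendall(line.strip().encode() + b"\n")
                outfile.write(replies.readline())
                outfile.flush()  # Squid waits for each answer


def demo():
    """Start an Aggregator in a thread and push two requests through a helper."""
    sock_path = os.path.join(tempfile.mkdtemp(), "aggregator.sock")
    server = Aggregator(sock_path, AggregatorHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    out = io.StringIO()
    helper_loop(sock_path, io.StringIO("alice secret\nalice wrong\n"), out)
    server.shutdown()
    server.server_close()
    return out.getvalue().split()
```

In this sketch, every helper process started by Squid would call
helper_loop() with the Aggregator socket path, so the number of database
connections no longer grows with the number of helpers. Whether the
Aggregator keeps one connection or a small pool, and whether it caches
answers, is a separate design choice.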
HTH,
Alex.
> On Tue, Feb 8, 2022 at 4:41 PM Alex Rousskov wrote:
>
> On 2/8/22 09:13, roee klinger wrote:
>
> > I am running multiple instances of Squid in a K8S environment, each
> > Squid instance has a helper that authenticates users based on their
> > username and password, the scripts are written in Python.
> >
> > I have been facing an issue: when under load, the helpers (even
> > with a 3600 sec TTL) swamp the MariaDB instance, causing it to reach
> > 100% CPU. I believe this is because each helper opens its own
> > connection to MariaDB, which adds up to a lot of connections.
> >
> > My initial idea was to create a Redis DB next to each Squid instance
> > and connect each Squid to its own dedicated Redis. I will sync Redis
> > with MariaDB every minute, thus decreasing the connections count from
> > a few 100s to just 1 every minute. This will also improve speeds since
> > Redis is much faster than MariaDB.
> >
> > The problem is, however, that there will still be many connections
> > from Squid to Redis, and that will probably consume a lot of DB
> > resources as well, which I don't actually know how to optimize, since
> > it seems that Squid opens many processes, and there is no way to get
> > them to talk to each other (except TTL values, which do not seem to
> > help in my case, and I don't understand why).
> >
> > What is the best practice to handle this, considering that I have the
> > following requirements?
> >
> > 1. Fast
> > 2. Refresh data every minute
> > 3. Consume the least amount of DB resources possible
>
> I would start from the beginning: Does the aggregate number of database
> requests match your expectations? In other words, do you see lots of
> database requests that should not be there given your user access
> patterns and authentication TTLs? In yet other words, are there many
> repeated authentication accesses that should have been authentication
> cache hits?
>
> If there are a lot more requests than your users/TTLs should generate,
> then you may be able to decrease db load by figuring out where the
> extra requests are coming from. For example, it is possible that your
> authentication cache key includes some noise that renders caching
> ineffective (e.g., see comments about key_extras in
> squid.conf.documented). Or maybe you need a bigger authentication cache.
>
> If the total stream of authentication requests during peak hours is
> reasonable, with few unwarranted cache misses, then you can start
> working on aggregating helper-db connections (helpers can be written to
> talk through a central connection aggregator) and/or adding database
> power (e.g., by introducing additional databases running on previously
> unused hardware -- just like your Redis idea).
>
>
> Cheers,
>
> Alex.
>