<div dir="ltr"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">You have at least two basic options:</blockquote><br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">A. Enhance Squid to let SMP workers share helpers. I assume that you<br>have C SMP workers and N helpers per worker, with C and N significantly<br>greater than 1. Instead of having N helpers per worker and C*N helpers<br>total, you will have just one concurrent helper per worker and C helpers<br>total. This will be a significant, generally useful improvement that<br>should be officially accepted if implemented well. This enhancement<br>requires serious Squid code modifications in a neglected error-prone<br>area, but it is certainly doable -- Squid already shares rock diskers<br>across workers, for example.</blockquote><br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">B. Convert your helper from a database client program to an Aggregator<br>client program (and write the Aggregator). Depending on your needs and<br>skill, you can use TCP or Unix Domain Sockets (UDS) for<br>helper-Aggregator communication. The Aggregator may look very similar to<br>the current helper, except it will not use stdin/stdout for<br>receiving/sending helper queries/responses. This option also requires<br>development, but it is much simpler than option A.</blockquote><div><br></div><div>Thank you, Alex, I will keep these in mind.</div><div><br></div><div>I thought about the following approach:</div><div><br></div><div>1. Have only one Python helper; this helper fetches the data from the main DB every minute.</div><div>2. This helper has concurrency set for it.</div><div>3. 
The helper then spawns child processes using multithreading; each process reads requests from stdin, writes responses to stdout, and reads the data from the main process that spawned it.</div><div><br></div><div>What do you think about taking this route?</div><div><br></div><div>It will require no extra DBs and no tweaks to Squid, but maybe I am missing something.</div><div><br></div><div>Best regards,</div><div>Roee</div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Tue, Feb 8, 2022 at 5:12 PM Alex Rousskov &lt;<a href="mailto:rousskov@measurement-factory.com">rousskov@measurement-factory.com</a>&gt; wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">On 2/8/22 09:50, roee klinger wrote:<br>

<br>

> Alex: If there are a lot more requests than your users/TTLs should<br>

>       generate, then you may be able to decrease db load by figuring out<br>

>       where the extra requests are coming from.<br>

<br>

> Actually, I don't think it matters much now that I think about it<br>

> again, since as per my requirements, I need to reload the cache every<br>

> 60 seconds, which means that even if it is perfect, MariaDB will<br>

> still get a high load. I think the second approach will be better<br>

> suited.<br>

<br>

Your call. Wiping out the entire authentication cache every 60 seconds <br>

feels odd, but I do not know enough about your environment to judge.<br>

<br>

<br>

> Alex: aggregating helper-db connections (helpers can be written to<br>

>       talk through a central connection aggregator)<br>

> <br>

<br>

> That sounds like exactly what I am looking for, how would one go about <br>

> doing this?<br>

<br>

You have at least two basic options:<br>

<br>

A. Enhance Squid to let SMP workers share helpers. I assume that you <br>

have C SMP workers and N helpers per worker, with C and N significantly <br>

greater than 1. Instead of having N helpers per worker and C*N helpers <br>

total, you will have just one concurrent helper per worker and C helpers <br>

total. This will be a significant, generally useful improvement that <br>

should be officially accepted if implemented well. This enhancement <br>

requires serious Squid code modifications in a neglected error-prone <br>

area, but it is certainly doable -- Squid already shares rock diskers <br>

across workers, for example.<br>

<br>

B. Convert your helper from a database client program to an Aggregator <br>

client program (and write the Aggregator). Depending on your needs and <br>

skill, you can use TCP or Unix Domain Sockets (UDS) for <br>

helper-Aggregator communication. The Aggregator may look very similar to <br>

the current helper, except it will not use stdin/stdout for <br>

receiving/sending helper queries/responses. This option also requires <br>

development, but it is much simpler than option A.<br>

<br>

<br>

HTH,<br>

<br>

Alex.<br>

<br>

<br>

> On Tue, Feb 8, 2022 at 4:41 PM Alex Rousskov wrote:<br>

> <br>

>     On 2/8/22 09:13, roee klinger wrote:<br>

> <br>

>      > I am running multiple instances of Squid in a K8S environment, each<br>

>      > Squid instance has a helper that authenticates users based on their<br>

>      > username and password, the scripts are written in Python.<br>

>      ><br>

>      > I have been facing an issue, that when under load, the helpers (even<br>

>      > with 3600 sec TTL) swamp the MariaDB instance, causing it to<br>

>     reach 100%<br>

>      > CPU, basically I believe because each helper opens up its own<br>

>     connection<br>

>      > to MariaDB, which ends up as a lot of connections.<br>

>      ><br>

>      > My initial idea was to create a Redis DB next to each Squid<br>

>     instance and<br>

>      > connect each Squid to its own dedicated Redis. I will sync Redis<br>

>     with<br>

>      > MariaDB every minute, thus decreasing the connections count from<br>

>     a few<br>

>      > 100s to just 1 every minute. This will also improve speeds since<br>

>     Redis<br>

>      > is much faster than MariaDB.<br>

>      ><br>

>      > The problem is, however, that there will still be many<br>

>     connections from<br>

>      > Squid to Redis, and that will probably consume a lot of DB<br>

>     resources<br>

>      > as well, which I don't actually know how to optimize, since it seems<br>

>      > that Squid opens many processes, and there is no way to get them<br>

>     to talk<br>

>      > to each other (except TTL values, which seem not to help in my<br>

>     case,<br>

>      > which I also don't understand why that is).<br>

>      ><br>

>      > What is the best practice to handle this, considering I have the<br>

>      > following requirements:<br>

>      ><br>

>      >     1. Fast<br>

>      >     2. Refresh data every minute<br>

>      >     3. Consume the least amount of DB resources possible<br>

> <br>

>     I would start from the beginning: Does the aggregate number of database<br>

>     requests match your expectations? In other words, do you see lots of<br>

>     database requests that should not be there given your user access<br>

>     patterns and authentication TTLs? In yet other words, are there many<br>

>     repeated authentication accesses that should have been authentication<br>

>     cache hits?<br>

> <br>

>     If there are a lot more requests than your users/TTLs should generate,<br>

>     then you may be able to decrease db load by figuring out where the<br>

>     extra<br>

>     requests are coming from. For example, it is possible that your<br>

>     authentication cache key includes some noise that renders caching<br>

>     ineffective (e.g., see comments about key_extras in<br>

>     squid.conf.documented). Or maybe you need a bigger authentication cache.<br>

> <br>

>     If the total stream of authentication requests during peak hours is<br>

>     reasonable, with few unwarranted cache misses, then you can start<br>

>     working on aggregating helper-db connections (helpers can be written to<br>

>     talk through a central connection aggregator) and/or adding database<br>

>     power (e.g., by introducing additional databases running on previously<br>

>     unused hardware -- just like your Redis idea).<br>

> <br>

> <br>

>     Cheers,<br>

> <br>

>     Alex.<br>

> <br>

<br>

</blockquote></div>
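<div><br></div><div>A minimal sketch of the single-helper approach proposed at the top of this thread, assuming the helper is configured with a nonzero concurrency= value on its auth_param basic children line, so that Squid prefixes each request with a channel ID and expects the same ID at the start of the reply. The names CredentialCache, handle_line, main, and loader are hypothetical; loader stands for whatever function fetches a username-to-password mapping from the main DB, and comparing plain-text passwords is an assumption made only to keep the example short.</div>

```python
import sys
import threading
import time
from urllib.parse import unquote


class CredentialCache:
    """In-memory snapshot of credentials, refreshed by one background thread."""

    def __init__(self, loader, interval=60.0):
        self._loader = loader      # hypothetical callable returning a dict of user to password
        self._interval = interval  # seconds between refreshes (60 per the thread's requirement)
        self._lock = threading.Lock()
        self._data = {}

    def refresh(self):
        fresh = self._loader()     # the only place that talks to the main DB
        with self._lock:
            self._data = fresh

    def run_forever(self):
        while True:
            time.sleep(self._interval)
            try:
                self.refresh()
            except Exception:
                pass               # on failure, keep serving the last good snapshot

    def check(self, user, password):
        with self._lock:
            return self._data.get(user) == password


def handle_line(line, cache):
    """Turn one 'channel-ID user password' request into 'channel-ID OK' or 'channel-ID ERR'."""
    parts = line.split()
    if len(parts) != 3:
        return (parts[0] if parts else "0") + " ERR"
    # Squid URL-encodes the username and password before sending them.
    channel, user, password = parts[0], unquote(parts[1]), unquote(parts[2])
    return channel + (" OK" if cache.check(user, password) else " ERR")


def main(loader):
    cache = CredentialCache(loader)
    cache.refresh()                # load once before answering anything
    threading.Thread(target=cache.run_forever, daemon=True).start()
    for line in sys.stdin:
        sys.stdout.write(handle_line(line, cache) + "\n")
        sys.stdout.flush()         # Squid waits for each reply, so never buffer
```

<div>A real helper would call main() with its own DB-loading function. Because every lookup is served from memory, this sketch answers requests from a single loop and does not spawn child processes; that is a simplification of the proposal, not a statement about what Squid requires. With SMP workers, each worker would presumably still start its own copy of the helper, so the DB would see one refresh per worker per minute.</div>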