[squid-users] external helper development

Eliezer Croitoru ngtech1ltd at gmail.com
Fri Feb 4 06:06:31 UTC 2022


Hey David,

 

First, PHP is a good language; however, it doesn’t handle STDIN/STDOUT helpers well and has crashed more than once without any warning.

It’s documented in the PHP docs (I don’t remember exactly where).

Regarding PHP being faster than Python, it’s pretty simple to test and validate whether the speed gain is worth the cost of the crashes.

What Python and PHP code have you used in your tests? (I would be happy to try and test it.)

You can see this session helper written in python:

https://wiki.squid-cache.org/EliezerCroitoru/SessionHelper/Python

 

As for each helper having its own cache: the memory cost of a cache in a single helper is small compared to the cost of a network access.

Again, it’s possible to test and verify this on a loaded system to get results. The delay itself can be seen from the Squid side in the cache manager statistics.
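
The per-helper cache discussed above can be sketched roughly as follows. This is only an illustration, not code from any of the helpers mentioned in this thread: `slow_lookup` is a hypothetical stand-in for whatever network access the helper would otherwise make, and the `maxsize` value is an arbitrary assumption.

```python
from functools import lru_cache


def slow_lookup(domain: str) -> str:
    """Hypothetical stand-in for a network call (API, Redis, database)."""
    return "blocked" if domain.endswith(".example") else "allowed"


@lru_cache(maxsize=10_000)  # bounded, so memory use per helper stays limited
def cached_lookup(domain: str) -> str:
    """Answer from the in-process cache, falling back to slow_lookup on a miss."""
    return slow_lookup(domain)
```

On a loaded system, `cached_lookup.cache_info()` reports hits and misses, which gives a quick idea of how much network access the local cache actually saves.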

 

You can also compare it with the following Ruby helper:

https://wiki.squid-cache.org/EliezerCroitoru/SessionHelper

 

About a shared “base” which allows helpers to avoid recomputing a query: it’s a good argument, however it depends on the cost of
pulling from the cache compared to calculating the answer.

A very simple string comparison or regex matching would probably be faster than reaching a shared storage in many cases.

 

Also take into account the “concurrency” support from the helper side.

A helper that supports parallel processing of requests/lines can do better than many single helpers in more than one use case.

In any case I would suggest enabling request concurrency on the Squid side, since the STDIN buffer will emulate some level of concurrency
by itself and will allow Squid to keep moving forward faster.
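
A helper that handles concurrent requests could look roughly like the sketch below. It assumes the concurrent helper line format in which every request starts with a channel ID and the reply repeats that ID, so replies may be written out of order. The `decide` rule and the worker count are made up for illustration.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def decide(request: str) -> str:
    """Hypothetical rule: deny anything mentioning 'blocked', allow the rest."""
    return "ERR" if "blocked" in request else "OK"


def handle_line(line: str) -> str:
    """Turn 'channel-ID request...' into 'channel-ID result'."""
    channel, _, request = line.strip().partition(" ")
    return f"{channel} {decide(request)}"


def serve(stdin, stdout, workers: int = 8) -> None:
    """Answer each input line on a worker thread; replies may be out of order."""
    write_lock = threading.Lock()

    def work(line: str) -> None:
        reply = handle_line(line)
        with write_lock:  # keep every reply on its own, unbroken line
            stdout.write(reply + "\n")
            stdout.flush()

    # Leaving the 'with' block waits for all submitted lines to be answered.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for line in stdin:
            if line.strip():
                pool.submit(work, line)
```

To run it as a helper, call `serve(sys.stdin, sys.stdout)` at the bottom of the script. The helper definition in squid.conf would also need a matching `concurrency=` value, otherwise Squid does not send channel IDs.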

 

Just to mention that SquidGuard has used a per-helper cache for a very long time, i.e. every single SquidGuard helper has its own copy of the whole
configuration and database files in memory.

 

And again, if you have the option to implement a server/service model in which the helpers contact a main service, you will be able to implement
a much faster internal in-memory cache compared to a redis/memcached/other external daemon (this needs to be tested).

 

A good example of this is ufdbguard, which has helpers that are clients of the main service, which does all the heavy lifting and also holds
one copy of the DB.

 

I have implemented SquidBlocker this way and have seen that it out-performs any other service I have tried until now.
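
The service model described above can be sketched as follows. This is not how ufdbguard or SquidBlocker are implemented; it is a minimal illustration in which the line-based protocol, the `CATEGORIES` data and all class and function names are hypothetical. One process holds the only copy of the data, and each helper is a thin client that keeps a single connection open and re-uses it for every query.

```python
import socket
import socketserver
import threading

# Hypothetical data set, held once by the main service.
CATEGORIES = {"example.com": "news", "ads.example": "advertising"}


class LookupHandler(socketserver.StreamRequestHandler):
    """Answer one line per request for as long as the helper stays connected."""

    def handle(self):
        for raw in self.rfile:
            key = raw.decode("utf-8").strip()
            answer = CATEGORIES.get(key, "unknown")
            self.wfile.write((answer + "\n").encode("utf-8"))


class LookupServer(socketserver.ThreadingTCPServer):
    allow_reuse_address = True
    daemon_threads = True  # do not block shutdown on idle helper connections


def start_service(host="127.0.0.1", port=0):
    """Start the main service on a background thread (port 0 = any free port)."""
    server = LookupServer((host, port), LookupHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


class ServiceClient:
    """Helper-side client that re-uses one connection for all queries."""

    def __init__(self, address):
        self.sock = socket.create_connection(address)
        self.reader = self.sock.makefile("r", encoding="utf-8", newline="\n")

    def lookup(self, key: str) -> str:
        self.sock.sendall((key + "\n").encode("utf-8"))
        return self.reader.readline().strip()

    def close(self) -> None:
        self.reader.close()
        self.sock.close()
```

A helper would create one `ServiceClient(server.server_address)` at startup and call `lookup()` for each request line, so the heavy data stays in a single process.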

 

Eliezer

 

----

Eliezer Croitoru

NgTech, Tech Support

Mobile: +972-5-28704261

Email: ngtech1ltd at gmail.com

 

From: squid-users <squid-users-bounces at lists.squid-cache.org> On Behalf Of David Touzeau
Sent: Thursday, February 3, 2022 14:24
To: squid-users at lists.squid-cache.org
Subject: Re: [squid-users] external helper development

 

Hi Eliezer

You are right in a way but when squid loads multiple helpers, each helper will use its own cache.
Using a shared "base" allows helpers to avoid having to compute a query already found by another helper who already has the answer.

Concerning PHP, what we find strange is that in our tests (a simple loop and an "echo OK"), PHP runs about 1.5x faster than Python.

On 03/02/2022 at 07:09, Eliezer Croitoru wrote:

Hey Andre,
 
Every language has a "cost" for its qualities.
For example, Golang is a very nice language that offers a relatively simple way for concurrency support and cross hardware compilation/compatibility.
One cost of Golang is that the binary is about the size of an OS kernel.
In Python you must write everything at a specific position and indentation, and threading is not simple for a novice to implement.
However, when you look at what has been written in Python, you can see that most of OpenStack's APIs and systems are written in... Python, and that means something.
I like Ruby very much, but it doesn't support threading by nature; it does support "concurrency".
Squid doesn't implement threading but implements "concurrency".
 
Don't touch PHP as a helper!!! (+1 to Alex)
 
Also take into account that Redis or Memcached is less preferred in many cases if the library doesn't re-use the existing connection for multiple queries.
Squid also implements caching of helper answers, so it's possible to implement the helper and ACLs in such a way that Squid's caching will
help you reduce access to the external API and/or redis/memcached/DB.
I also have good experience with some libraries that implement a cache, which I have used inside a helper with a limited size as a "level 1" cache.
It's possible that if you implement both the helper and the server side of the solution, like ufdbguard, you will be able to optimize the system
to handle a very high load.
 
I hope the above will help you.
Eliezer
 
----
Eliezer Croitoru
NgTech, Tech Support
Mobile: +972-5-28704261
Email: ngtech1ltd at gmail.com
 
-----Original Message-----
From: squid-users <squid-users-bounces at lists.squid-cache.org> On Behalf Of André Bolinhas
Sent: Wednesday, February 2, 2022 00:09
To: 'Alex Rousskov' <rousskov at measurement-factory.com>; squid-users at lists.squid-cache.org
Subject: Re: [squid-users] external helper development
 
Hi
Thanks for the reply.
I will take a look at Rust as you recommend.
Also, between Python and Go, which is the best for multithreading and concurrency?
Does Rust support multithreading and concurrency?
Best regards
 
-----Original Message-----
From: squid-users <squid-users-bounces at lists.squid-cache.org> On Behalf Of Alex Rousskov
Sent: February 1, 2022 22:01
To: squid-users at lists.squid-cache.org
Subject: Re: [squid-users] external helper development
 
On 2/1/22 16:47, André Bolinhas wrote:

Hi
 
I’m building an external helper to get the categorization of a
website. I know how to build it, but I need your opinion about the best
language for the job in terms of performance, bottlenecks, I/O blocking...
 
The helper will work like this.
 
1. It will check the hot memory for a faster response (memcached or redis).

2. If the result does not exist in hot memory, it will check an external
API to fetch the category and save it in hot memory.
 
In which language do you recommend developing such a helper? PHP, Python, Go...

 
If this helper is for long-term production use, and you are willing to learn new things, then use Rust[1]. Otherwise, use whatever language you are the most comfortable with already (except PHP), especially if that language has good libraries/wrappers for the external APIs you will need to use.
 
Alex.
[1] https://www.rust-lang.org/
_______________________________________________
squid-users mailing list
squid-users at lists.squid-cache.org
http://lists.squid-cache.org/listinfo/squid-users
 

 
