[squid-users] external helper development

David Touzeau david at articatech.com
Mon Feb 7 00:41:58 UTC 2022


Sorry Eliezer,

It was a mistake... no, your code is clean.
Impressive for a first shot!
Many thanks for your example; we will run our stress tool to see the
difference...

Just one question:

Why did you add a 500-millisecond sleep in handle_stdout? Is it to let
Squid close the pipe?
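(Side note: the polling sleep can be avoided entirely by using a blocking
queue.Queue for the stdout hand-off. A minimal sketch, not taken from the
helper below, just to illustrate the idea:)

```python
import queue
import sys
import threading

out_queue = queue.Queue()  # thread-safe FIFO shared by workers and the writer

def writer_loop():
    # get() blocks until an item arrives, so no polling sleep is needed.
    while True:
        item = out_queue.get()
        if item is None:  # sentinel value: time to shut down
            break
        sys.stdout.write(item)
        sys.stdout.flush()

writer = threading.Thread(target=writer_loop)
writer.start()
out_queue.put("0 OK\n")  # what a worker thread would enqueue
out_queue.put(None)      # ask the writer to exit
writer.join()
```

With this shape the writer wakes up exactly when there is something to send,
instead of checking a shared list twice per second.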



On 06/02/2022 at 11:46, Eliezer Croitoru wrote:
>
> Hey David,
>
> Not a fully complete helper, but it seems to work pretty nicely and
> might be better than what already exists:
>
> https://gist.githubusercontent.com/elico/03938e3a796c53f7c925872bade78195/raw/21ff1bbc0cf3d91719db27d9d027652e8bd3de4e/threaded-helper-example.py
>
> #!/usr/bin/env python
> import sys
> import time
> import urllib.request
> import signal
> import threading
>
> # Set debug mode to True or False
> debug = False
> #debug = True
>
> queue = []
> threads = []
> RUNNING = True
> quit = 0
> rand_api_url = "https://cloud1.ngtech.co.il/api/test.php"
>
> def sig_handler(signum, frame):
>     sys.stderr.write("Signal received: " + str(signum) + "\n")
>     global quit
>     quit = 1
>     global RUNNING
>     RUNNING = False
>
> def handle_line(line):
>     if not RUNNING:
>         return
>     if not line:
>         return
>     if quit > 0:
>         return
>     arr = line.split()
>     response = urllib.request.urlopen(rand_api_url)
>     response_text = response.read()
>     # Helper replies must be newline-terminated for squid to parse them.
>     queue.append(arr[0] + " " + response_text.decode("utf-8").strip() + "\n")
>
> def handle_stdout(n):
>     while RUNNING:
>         if quit > 0:
>             return
>         while len(queue) > 0:
>             item = queue.pop(0)
>             sys.stdout.write(item)
>             sys.stdout.flush()
>         time.sleep(0.5)
>
> def handle_stdin(n):
>     while RUNNING:
>         line = sys.stdin.readline()
>         if not line:
>             break
>         if quit > 0:
>             break
>         line = line.strip()
>         thread = threading.Thread(target=handle_line, args=(line,))
>         thread.start()
>         threads.append(thread)
>
> signal.signal(signal.SIGUSR1, sig_handler)
> signal.signal(signal.SIGUSR2, sig_handler)
> signal.signal(signal.SIGALRM, sig_handler)
> signal.signal(signal.SIGINT, sig_handler)
> signal.signal(signal.SIGQUIT, sig_handler)
> signal.signal(signal.SIGTERM, sig_handler)
>
> stdout_thread = threading.Thread(target=handle_stdout, args=(1,))
> stdout_thread.start()
> threads.append(stdout_thread)
>
> stdin_thread = threading.Thread(target=handle_stdin, args=(2,))
> stdin_thread.start()
> threads.append(stdin_thread)
>
> while RUNNING:
>     time.sleep(3)
>
> print("Not RUNNING")
> for thread in threads:
>     thread.join()
> print("All threads stopped.")
>
> ## END
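> For context, a helper like this is wired into squid with an
> external_acl_type directive. A hypothetical squid.conf fragment
> (paths, ACL names and tuning values are illustrative only):

```
# Illustrative only: concurrency= enables channel-IDs, so one helper
# process can have many requests in flight and reply out of order
# using the "<channel> OK" format the helper emits.
external_acl_type ext_test ttl=60 children-max=5 concurrency=100 \
    %URI /usr/local/bin/threaded-helper-example.py
acl by_helper external ext_test
http_access allow by_helper
```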
>
> Eliezer
>
> ----
>
> Eliezer Croitoru
>
> NgTech, Tech Support
>
> Mobile: +972-5-28704261
>
> Email: ngtech1ltd at gmail.com
>
> From: squid-users <squid-users-bounces at lists.squid-cache.org> On Behalf Of David Touzeau
> Sent: Friday, February 4, 2022 16:29
> To: squid-users at lists.squid-cache.org
> Subject: Re: [squid-users] external helper development
>
> Eliezer,
>
> Thanks for all this advice. Indeed, your arguments are valid: opening
> a socket, sending data, receiving data and closing the socket costs
> more than direct access to a regex or a memory entry, even when the
> result has already been computed.
>
> But what surprises us the most is that we have produced a threaded
> Python plugin, for which I provide the code below.
> The PHP code is like the example you mentioned (no threads, just a
> loop that outputs OK).
>
> Results: after 6k requests, Squid freezes and no surfing is possible,
> whereas with the PHP code we can reach 10K requests and Squid is
> happy. We really do not understand why Python is so slow.
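> For comparison, a minimal non-threaded Python helper in the same
> style as the PHP loop might look like this (illustrative sketch, not
> our production code):

```python
#!/usr/bin/env python
# Minimal concurrent-protocol helper: one blocking loop, no threads.
import sys

def handle(line):
    # With concurrency enabled, the first token is the channel-ID.
    parts = line.split()
    if not parts:
        return None
    if parts[0].isdigit():
        return parts[0] + " OK\n"
    return "OK\n"

def main():
    for line in sys.stdin:  # blocking readline loop until EOF
        reply = handle(line)
        if reply:
            sys.stdout.write(reply)
            sys.stdout.flush()

if __name__ == "__main__":
    main()
```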
>
> Here is the Python code using threads:
>
> #!/usr/bin/env python
> import os
> import sys
> import time
> import signal
> import locale
> import traceback
> import threading
> import select
> import traceback as tb
>
> class ClienThread():
>
>     def __init__(self):
>         self._exiting = False
>         self._cache = {}
>
>     def exit(self):
>         self._exiting = True
>
>     def stdout(self, lineToSend):
>         try:
>             sys.stdout.write(lineToSend)
>             sys.stdout.flush()
>
>         except IOError as e:
>             if e.errno==32:
>                 # Broken pipe (EPIPE)
>                 pass
>         except:
>             # any other exception
>             pass
>
>     def run(self):
>         while not self._exiting:
>             if sys.stdin in select.select([sys.stdin], [], [], 0.5)[0]:
>                 line = sys.stdin.readline()
>                 LenOfline=len(line)
>
>                 if LenOfline==0:
>                     self._exiting=True
>                     break
>
>                 if line[-1] == '\n':line = line[:-1]
>                 channel = None
>                 options = line.split()
>
>                 try:
>                     if options[0].isdigit(): channel = options.pop(0)
>                 except IndexError:
>                     self.stdout("0 OK first=ERROR\n")
>                     continue
>
>                 # Processing here
>
>                 try:
>                     self.stdout("%s OK\n" % channel)
>                 except:
>                     self.stdout("%s ERROR first=ERROR\n" % channel)
>
>
>
>
> class Main(object):
>     def __init__(self):
>         self._threads = []
>         self._exiting = False
>         self._reload = False
>         self._config = ""
>
>         for sig, action in (
>             (signal.SIGINT, self.shutdown),
>             (signal.SIGQUIT, self.shutdown),
>             (signal.SIGTERM, self.shutdown),
>             (signal.SIGHUP, lambda s, f: setattr(self, '_reload', True)),
>             (signal.SIGPIPE, signal.SIG_IGN),
>         ):
>             try:
>                 signal.signal(sig, action)
>             except AttributeError:
>                 pass
>
>
>
>     def shutdown(self, sig = None, frame = None):
>         self._exiting = True
>         self.stop_threads()
>
>     def start_threads(self):
>
>         sThread = ClienThread()
>         t = threading.Thread(target = sThread.run)
>         t.start()
>         self._threads.append((sThread, t))
>
>
>
>     def stop_threads(self):
>         for p, t in self._threads:
>             p.exit()
>         for p, t in self._threads:
>             t.join(timeout =  1.0)
>         self._threads = []
>
>     def run(self):
>         """ main loop """
>         ret = 0
>         self.start_threads()
>         return ret
>
>
> if __name__ == '__main__':
>     # set C locale
>     locale.setlocale(locale.LC_ALL, 'C')
>     os.environ['LANG'] = 'C'
>     ret = 0
>     try:
>         main = Main()
>         ret = main.run()
>     except SystemExit:
>         pass
>     except KeyboardInterrupt:
>         ret = 4
>     except:
>         tb.print_exc()
>         ret = 1
>     sys.exit(ret)
>
> On 04/02/2022 at 07:06, Eliezer Croitoru wrote:
>
>     And about the cache of each helper: the memory cost of a
>     per-helper cache is small compared to the cost of network access.
>
>     Again, it's possible to test and verify this on a loaded system
>     to get results. The delay itself can be seen from the squid side
>     in the cache manager statistics.
>
>     You can also try to compare the next ruby helper:
>
>     https://wiki.squid-cache.org/EliezerCroitoru/SessionHelper
>
>     About a shared "base" which allows helpers to avoid recomputing
>     the query... it's a good argument; however, it depends on the
>     cost of pulling from the cache compared to calculating the answer.
>
>     A very simple string comparison or regex match would probably be
>     faster than reaching shared storage in many cases.
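>     That claim is easy to sanity-check in-process. A rough sketch
>     (the URL pattern is made up; absolute numbers vary by machine):

```python
import re
import timeit

# Hypothetical ACL-style pattern and a URL that should match it.
pattern = re.compile(r"^https?://([^/]*\.)?example\.com/")
url = "http://cdn.example.com/asset.js"

# Average cost of one compiled-regex match, amortized over many runs.
per_match = timeit.timeit(lambda: pattern.match(url), number=100000) / 100000
print("regex match: ~%.2f microseconds" % (per_match * 1e6))
# A compiled match is typically well under a microsecond, while even a
# local TCP round-trip to a cache daemon costs tens of microseconds.
```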
>
>     Also take into account the “concurrency” support from the helper side.
>
>     A helper that supports parallel processing of requests/lines can
>     do better than many single helpers in more than one use case.
>
>     In any case I would suggest enabling request concurrency on the
>     squid side, since the STDIN buffer will emulate some level of
>     concurrency by itself and will allow squid to keep going forward
>     faster.
>
>     Just to mention that SquidGuard has used a per-helper cache for a
>     very long time, i.e. every single SquidGuard helper has its own
>     copy of the whole configuration and database files in memory.
>
>     And again, if you have any option to implement a server/service
>     model in which the helpers contact a main service, you will be
>     able to implement a much faster internal in-memory cache compared
>     to a redis/memcache/other external daemon (needs to be tested).
>
>     A good example of this is ufdbguard, which has helpers that are
>     clients of the main service; the service does the whole heavy
>     lifting and also holds one copy of the DB.
>
>     I have implemented SquidBlocker this way and have seen that it
>     outperforms any other service I have tried so far.
>

