[squid-users] Upper limit on the number of regular expressions in url_regex?

Marcus Kool marcus.kool at urlfilterdb.com
Wed Aug 9 11:25:33 UTC 2017



On 09/08/17 05:15, Ralf Hildebrandt wrote:
> * Marcus Kool <marcus.kool at urlfilterdb.com>:
>> I have only seen regex failing with such short RE on AIX.
>> what is your OS, distro, CPU and lib version ?
> 
> Ubuntu Linux LTS 16.04 (xenial)
> x86_64 (amd64)
> 
> I guess you mean libc:
> ii  libc6:amd64                            2.23-0ubuntu9

I see no issues with the optimised RE, so my first guess is a libc bug.

The RE optimisation in Squid is inspired by the RE optimisation in ufdbGuard.
ufdbGuard optimises the RE a bit differently, and the result looks like this:
zizicamarda.com/7fg3g|zizzhaida.com/3m6ij|zizzhaida.com/98g4ubq|...
I have tested this optimised RE on Ubuntu 16.04 and it works, so maybe it is not a libc bug but a Squid bug.
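
To illustrate the idea of joining many fixed URL fragments into one alternation, here is a minimal sketch. It uses Python's re module, not the libc regcomp() that Squid calls, and it is not ufdbGuard's or Squid's actual optimisation code. The names FIXED_URLS, build_alternation and is_blocked are made up for this example, and unlike the RE shown above it escapes the dots so that they only match a literal dot.

```python
import re

# Hypothetical input list; the three fragments are the ones quoted above.
FIXED_URLS = [
    "zizicamarda.com/7fg3g",
    "zizzhaida.com/3m6ij",
    "zizzhaida.com/98g4ubq",
]


def build_alternation(fragments):
    """Join fixed fragments into a single alternation.

    re.escape() makes metacharacters such as '.' match literally.
    """
    return "|".join(re.escape(fragment) for fragment in fragments)


# Compile once and reuse the compiled pattern for every request.
COMBINED = re.compile(build_alternation(FIXED_URLS))


def is_blocked(url):
    # search() looks for a fragment anywhere in the URL (unanchored match).
    return COMBINED.search(url) is not None


if __name__ == "__main__":
    print(is_blocked("http://zizzhaida.com/3m6ij/payload"))
    print(is_blocked("http://example.com/"))
```

With 10000+ fragments the combined pattern becomes very long, which is exactly the kind of input that stresses a regex library.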

>> BTW: why use regular expressions for a list of 10000+ _fixed_ URLs ?
> 
> What is the alternative?

ufdbGuard is a URL filter that converts a file with 10000 URLs into a database file that is optimised for fast lookups.
So all you need to do is configure ufdbGuard as the URL rewriter for Squid, and you can filter those URLs as fixed URLs instead of REs.
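
As a rough sketch of why fixed-URL lookups scale better than REs: a hash-based set answers "is this URL listed?" in roughly constant time, no matter how many entries it holds. This is only an illustration in Python and says nothing about ufdbGuard's real database format. BLOCKED, normalise and is_listed are names invented for this example, and the rule that a listed URL also covers everything below its path is an assumption of the sketch.

```python
from urllib.parse import urlsplit

# Hypothetical block list holding fixed "host/path" entries.
BLOCKED = {
    "zizicamarda.com/7fg3g",
    "zizzhaida.com/3m6ij",
    "zizzhaida.com/98g4ubq",
}


def normalise(url):
    """Reduce a URL to lower-case host plus path, dropping scheme and query."""
    # urlsplit() only recognises the host when a "//" prefix is present.
    parts = urlsplit(url if "://" in url else "//" + url)
    host = (parts.hostname or "").lower()
    return host + parts.path


def is_listed(url):
    """Return True if the URL or one of its parent paths is in BLOCKED."""
    key = normalise(url)
    while key:
        if key in BLOCKED:  # set membership test, no regex involved
            return True
        if "/" not in key:
            break
        key = key.rsplit("/", 1)[0]  # strip the last path component
    return False


if __name__ == "__main__":
    print(is_listed("http://ZizzHaida.com/3m6ij/file.exe"))
    print(is_listed("http://example.com/"))
```

The cost per request depends on the number of path components in the URL, not on the size of the list.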

Marcus