[squid-users] Upper limit on the number of regular expressions in url_regex?
Marcus Kool
marcus.kool at urlfilterdb.com
Wed Aug 9 11:25:33 UTC 2017
On 09/08/17 05:15, Ralf Hildebrandt wrote:
> * Marcus Kool <marcus.kool at urlfilterdb.com>:
>> I have only seen regex failing with such short RE on AIX.
>> what is your OS, distro, CPU and lib version ?
>
> Ubuntu Linux LTS 16.04 (xenial)
> x86_64 (amd64)
>
> I guess you mean libc:
> ii libc6:amd64 2.23-0ubuntu9
I see no issues with the optimised RE so my first guess is a libc bug.
The RE optimisation in Squid is inspired by the RE optimisation in ufdbGuard.
ufdbGuard optimises the RE a bit differently, and the result looks like this:
zizicamarda.com/7fg3g|zizzhaida.com/3m6ij|zizzhaida.com/98g4ubq|...
I have tested this optimised RE on Ubuntu 16.04 and it works, so maybe it is not a libc bug but a Squid bug.
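
To illustrate the kind of optimisation discussed here, below is a minimal sketch that joins individual REs into one alternation and tests it. It uses Python's re module, not the POSIX regex functions in libc that Squid and ufdbGuard use, so it cannot reproduce the reported failure. The names combine() and is_blocked() are made up for this example, and only the three patterns quoted above are included.

```python
import re

# The three patterns shown above; the rest of the real list is elided
# in the original message and is not reproduced here.
patterns = [
    "zizicamarda.com/7fg3g",
    "zizzhaida.com/3m6ij",
    "zizzhaida.com/98g4ubq",
]


def combine(patterns):
    """Join individual REs into a single alternation (hypothetical helper)."""
    return "|".join(patterns)


# One compiled RE instead of one RE per list entry.
combined = re.compile(combine(patterns))


def is_blocked(url):
    """Return True if any of the original patterns occurs in the URL."""
    return combined.search(url) is not None


if __name__ == "__main__":
    print(combine(patterns))
    print(is_blocked("http://zizzhaida.com/3m6ij/index.html"))
    print(is_blocked("http://example.com/"))
```

Note that the dots in these patterns are unescaped, as in the example above, so each one matches any character and not only a literal dot.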
>> BTW: why use regular expressions for a list of 10000+ _fixed_ URLs ?
>
> What is the alternative?
ufdbGuard is a URL filter that converts a file with 10000 URLs to a database file that is optimised for fast lookups.
So all you need to do is configure a URL rewriter, and you can filter those URLs using fixed URLs instead of REs.
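
As a rough illustration of why fixed URLs are cheaper to look up than REs, here is a minimal sketch that uses a hash set. This is not ufdbGuard's database format or API, which is not described here; the functions load_blocklist(), candidates() and is_blocked() are hypothetical, and the sketch assumes that list entries have the form host/path with a lower-case host.

```python
from urllib.parse import urlsplit


def load_blocklist(lines):
    """Build a set of fixed 'host/path' entries (assumed input format)."""
    return {line.strip() for line in lines if line.strip()}


def candidates(url):
    """Yield the host and every host/path prefix of the URL."""
    # Add '//' when the scheme is missing so urlsplit finds the host.
    parts = urlsplit(url if "//" in url else "//" + url)
    host = (parts.hostname or "").lower()
    prefix = host
    yield prefix
    for segment in parts.path.split("/"):
        if segment:
            prefix += "/" + segment
            yield prefix


def is_blocked(url, blocklist):
    """A few set lookups per URL, independent of the size of the list."""
    return any(c in blocklist for c in candidates(url))


if __name__ == "__main__":
    blocklist = load_blocklist([
        "zizicamarda.com/7fg3g",
        "zizzhaida.com/3m6ij",
        "zizzhaida.com/98g4ubq",
    ])
    print(is_blocked("http://zizzhaida.com/3m6ij/file?x=1", blocklist))
    print(is_blocked("http://zizzhaida.com/other", blocklist))
```

The number of lookups depends only on the number of path segments in the requested URL, not on whether the list has 10 or 10000 entries.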
Marcus