[squid-users] Squid regex grammar

Alex Rousskov rousskov at measurement-factory.com
Fri Oct 27 14:55:26 UTC 2017


On 10/27/2017 08:32 AM, Amos Jeffries wrote:
> On 28/10/17 02:59, Yuri wrote:
>> the regular expression is simply silently ignored and it is extremely
>> difficult to detect.

> That sounds like a library problem. If Squid receives a regex error code
> from the library when compiling any regex from your squid.conf it logs
> the relevant error to cache.log.

When a regular expression uses extended features, the basic regular
expression compiler often (or even always?!) does not fail because it
treats those features as ordinary plain characters. Thus, Squid
cannot tell that something went wrong.

I cannot give a Squid-based example quickly, but here is a related
illustration using grep. It is not exactly what happens inside Squid,
but I suspect it is similar enough for illustration purposes:

> $ echo "foobar" | grep --basic-regexp    'foo|bar'
> $ echo "foobar" | grep --extended-regexp 'foo|bar'
> foobar

As you can see, the basic compiler is silent about the "|" character
that it does not support. Here is a similar example where a malformed
extended regular expression is silently accepted by the basic compiler:


> $ echo "foobar" | grep --basic-regexp 'foo(bar'
> $ echo "foobar" | grep --extended-regexp 'foo(bar'
> grep: Unmatched ( or \(


In theory, Squid itself could detect special characters unsupported by
the current regex library, but doing so correctly without breaking many
existing working configurations may be impossible. On the other hand,
this validation could become an optional feature that admins can control.

The best strategy for a Squid admin working with complex regex ACLs may
be to add external test cases that validate ACL matching expectations,
but doing so requires a significant amount of work and discipline.
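
A minimal sketch of such an external test follows. It assumes that
"grep -E" is a close enough stand-in for the regex library Squid is
built with, which may not hold on every platform. The function name
check_acl_regex, the sample patterns, and the example.com URLs are all
hypothetical.

```shell
# Hypothetical helper: checks that extended regex $1 does (expect=yes)
# or does not (expect=no) match the sample string $3.
# Returns 0 on the expected outcome, 1 on a mismatch, and 2 if grep
# rejects the pattern.
check_acl_regex() {
    car_pattern=$1
    car_expect=$2
    car_sample=$3

    car_status=0
    printf '%s\n' "$car_sample" | grep -E -q -e "$car_pattern" || car_status=$?

    case $car_status in
        0) car_actual=yes ;;
        1) car_actual=no ;;
        *) echo "ERROR: '$car_pattern' was rejected by grep -E"
           return 2 ;;
    esac

    if [ "$car_actual" = "$car_expect" ]; then
        echo "ok:   '$car_pattern' vs '$car_sample'"
    else
        echo "FAIL: '$car_pattern' vs '$car_sample' (expected $car_expect, got $car_actual)"
        return 1
    fi
}

# Each test case states a pattern, the expected outcome, and a sample.
check_acl_regex '^http://(ads|tracker)\.example\.com/' yes 'http://ads.example.com/banner.gif'
check_acl_regex '^http://(ads|tracker)\.example\.com/' no  'http://www.example.com/'
```

Listing both a sample that must match and one that must not is what
catches a silently misinterpreted pattern: an expression that was
compiled in the wrong mode typically fails the "yes" cases.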

Alex.