[squid-dev] [PATCH] Support for regex with \-escaped characters

Alex Rousskov rousskov at measurement-factory.com
Fri Nov 21 04:38:39 UTC 2014


On 11/17/2014 05:56 PM, Amos Jeffries wrote:

> For now the detection is only added during parsing of regex tokens
> or files.

And it should probably stay that way: We cannot easily add support for
\-based escaping in bare strings because in some of those strings \ is
meaningful on its own (unfortunately).


> +    while (ConfigParser::RecognizeQuotedPair_ && *(nextToken-1) == '\\') {

Are you sure that nextToken never points at the start of a buffer,
especially when parsing parameters loaded from external files? If you
are absolutely sure, this guarantee should be documented where
ConfigParser::TokenParse() is defined. Otherwise, the code has to be
revised (but still support regex parameter files that start with a
backslash).

The second problem with this code is that it incorrectly handles "\\X"
AFAICT: X should not be escaped, even though it is preceded by a
backslash (because that backslash is itself escaped).


> +	<tag>configuration_includes_quoted_values</tag>
> +	<p>Regex expression values can not be parsed when this directive is
> +	   configured to <em>ON</em>.


This is a little misleading because, IIRC, the
configuration_includes_quoted_values hack can be turned ON for just a
section of the configuration file. I suggest s/when this directive is
configured to ON/in parts of configuration where this directive is set
to ON/.


Please consider s/can not/cannot/ for improved consistency:

$ fgrep -i 'can not' src/*pre | wc -l
7
$ fgrep -i 'cannot' src/*pre | wc -l
19



> +	<tag>configuration_includes_quoted_values</tag>
> +	<p>Regex expression values can not be parsed when this directive is
> +	   configured to <em>ON</em>. Instead Squid now accepts regex \-escaped
> +	   characters including escaped whitespace.


We should clarify that \-escaped characters are supported when parsing
regular expressions, not just any token. I also suggest moving the
last sentence to its own paragraph as it has no relation to
configuration_includes_quoted_values:

<p>When parsing regular expressions, Squid now accepts \-escaped
characters, including escaped whitespace.</p>

If "escaped whitespace" means \s, I would say so explicitly. If it
means "\ ", I would say "including escaped space characters".


> +    static bool RecognizeQuotedPair_; ///< The next tokens may containing quoted-pair (\-escaped) characters

s/containing/contain/


Except for the nextToken-1 concerns, the above corrections are
cosmetic, of course.


HTH,

Alex.


More information about the squid-dev mailing list