[squid-dev] Use MAX_URL for first line limitation

Alex Rousskov rousskov at measurement-factory.com
Thu Jun 7 23:18:02 UTC 2018


On 06/07/2018 04:13 PM, Eduard Bagdasaryan wrote:

> in %>ru Squid logs large and small URLs differently.  For example,
> Squid strips whitespace from small URLs, while keeping it for
> large ones.

Is %ru logging consistent with regard to small and large URLs?

* If it is, should we use the same approach to make %>ru consistent?

* Otherwise, we should keep both %ru and %>ru in mind when deciding
whether (and how) to make logging consistent. For example, it would be
OK to ignore this inconsistency if making things consistent requires too
much out-of-scope work.


> 2. Adjust Http::One::RequestParser::parseRequestFirstLine(), immediately
>    rejecting requests with URLs exceeding MAX_URL (use it instead of
>    Config.maxRequestHeaderSize). As a result, access.log would get
>    "error:request-too-large" for such requests. This solution looks
>    better, since anyway, Squid eventually denies such requests and,
>    moreover, there are many contexts in Squid tied up to that MAX_URL
>    limitation.
> 
>   So, the question is: can we go on with (2) without breaking something?

Clarification: (2) is already known to "break something" -- after (2),
some refused transactions that were previously logged with the actual
URL (prefix) in %ru would be logged with a largely useless
"error:request-too-large" URL. Which is the lesser evil:

  a) hiding the actual %ru URL for requests with URLs between ~8K and ~64K, or
  b) inconsistent logging of small vs. large URLs?
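
To make the trade-off concrete, here is a toy Python model (not Squid code) of what %ru would record under the current first-line limit versus proposal (2). The constants, the function name logged_ru, and the three outcome labels are illustrative assumptions based only on the ~8K/~64K figures above; they are not taken from Squid sources.

```python
MAX_URL = 8 * 1024           # assumed ~8K URL limit
MAX_HEADER_SIZE = 64 * 1024  # assumed ~64K, standing in for Config.maxRequestHeaderSize

TOO_LARGE = "error:request-too-large"


def logged_ru(url_len: int, first_line_limit: int) -> str:
    """Describe what a toy %ru would record for a URL of url_len bytes."""
    if url_len > first_line_limit:
        return TOO_LARGE     # rejected while parsing the first line
    if url_len > MAX_URL:
        return "url-prefix"  # refused later, but the actual URL (prefix) is logged
    return "full-url"        # small URL, logged as usual


# Compare the current limit (first column) with proposal (2) (second column).
for size in (4 * 1024, 16 * 1024, 128 * 1024):
    print(size, logged_ru(size, MAX_HEADER_SIZE), logged_ru(size, MAX_URL))
```

In this model, only URLs in the ~8K to ~64K band change: they go from a logged URL prefix to "error:request-too-large", which is evil (a) above.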


>   Or are there perhaps other (better) alternative approaches?

Yes, that is the other important question. FWIW, I do not know of any
better solutions that do not require fixing the 8K/64K problem itself
(which is well outside the scope of fixing %>ru).


Thank you,

Alex.

P.S. Here, "inconsistent logging" means that two URLs with identical 7K
prefix may be logged differently if one of the URLs is longer than 8K.
Scripts and humans can be easily confused by such inconsistent logging,
and URL lengths are often at the mercy of highly variable parameters
such as search strings, session IDs, context descriptions, etc.
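
A toy Python illustration of that confusion, using the whitespace example quoted at the top of this message. This is not Squid code: toy_logged_url, the 8K threshold, and the "strip all whitespace" rule are assumptions made purely for illustration.

```python
MAX_URL = 8 * 1024  # assumed ~8K threshold between "small" and "large" URLs


def toy_logged_url(url: str) -> str:
    """Hypothetical stand-in for the reported %>ru behavior."""
    if len(url) < MAX_URL:
        return "".join(url.split())  # small URL: whitespace stripped
    return url                       # large URL: whitespace kept


# Two URLs sharing an identical ~7K prefix that contains spaces.
prefix = "http://example.com/search?q=" + "a b" * 2400
short_url = prefix + "&id=1"                    # stays under 8K
long_url = prefix + "&session=" + "x" * 2000    # grows past 8K

# The shared prefix is logged differently for the two requests.
print(toy_logged_url(short_url)[:40])
print(toy_logged_url(long_url)[:40])
```

A script that greps the log for the shared prefix would match only one of the two entries, even though both requests started with the same bytes.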

