[squid-dev] RFC: Categorize level-0/1 messages

Alex Rousskov rousskov at measurement-factory.com
Sun Dec 5 17:43:39 UTC 2021


On 12/5/21 8:06 AM, Amos Jeffries wrote:
> On 21/10/21 16:16, Alex Rousskov wrote:
>> On 10/20/21 3:14 PM, Amos Jeffries wrote:
>>> On 21/10/21 4:22 am, Alex Rousskov wrote:
>>>> To facilitate automatic monitoring of Squid cache.logs, I suggest to
>>>> adjust Squid code to divide all level-0/1 messages into two major
>>>> categories -- "problem messages" and "status messages"[0]:
>>
>>> We already have a published categorization design which (when/if used)
>>> solves the problem(s) you are describing. Unfortunately that design has
>>> not been followed by all authors and conversion of old code to it has
>>> not been done.
>>
>>> Please focus your project on making Squid actually use the system of
>>> debugs line labels. The labels are documented at:
>>>    https://wiki.squid-cache.org/SquidFaq/SquidLogs#Squid_Error_Messages
>>
>> AFAICT, the partial classification in that wiki table is an opinion on
>> how things could be designed, and that opinion does not reflect Project
>> consensus.

> The wiki was written from observation of how the message labels are/were
> being used in the code. As such it reflects the de facto consensus of
> everyone ever authoring code that used one of the labels.

[ N.B. I am worried that this (mostly irrelevant IMO) part of the
discussion risks ruining the shaky agreement we have reached on the
important parts of the RFC, but I am also worried about
misrepresentation of the wiki table status. I will respond here, but
please move any future discussion about that table status (if you decide
to continue it) to a different email thread. ]

AFAICT, the wiki table in question does not accurately reflect Squid
code and does not constitute Project consensus on how things could be
designed, regardless of what observations led to that table's creation.
The creative process of writing a classification table (based on code
observations) naturally allows for misinterpretations, mistakes, and
other problems. One cannot claim consensus on the _result_ on the
grounds that they have started with code observations.


>> FWIW, I cannot use that wiki table for labeling messages, but
>> I do not want to hijack this RFC thread for that table review.

> Your text below contradicts the "cannot" statement by describing how
> the two definitions fit together and offer to use the wiki table labels
> for problem category.

I cannot use the wiki table for deciding how to label a given message
(because of the problems with the table definitions that I would rather
not review here), but the primary labels we use (and should continue to
use!) are naturally found in that table. There is no contradiction here.


>> * The wiki table talks about FATAL, ERROR, and WARNING messages. These
>> labels match the RFC "problem messages" category. This match covers all
>> of the remaining cache.log messages except for 10 debugs() detailed
>> below. Thus, so far, there is a usable match on nearly all current
>> level-0/1 messages. Excellent!

> Thus my request that you use the wiki definitions to categorize the
> unlabeled and fix any detected labeling mistakes.

While I cannot use those wiki definitions (because of the problems with
the table that I would rather not review here), it is not a big deal as
far as this RFC is concerned because I do not have to use or violate
those definitions to implement the vast majority of the proposed changes
-- those changes are orthogonal to the wiki table and its definitions.

If somebody finds a table violation introduced by the RFC PR, then we
will either undo the corresponding PR change, change the label used by
the PR, or fix the table, but my goal is to minimize the number of such
cases because they are likely to waste a lot of time on difficult
discussions about poorly defined concepts.



>> * The wiki table also uses three "SECURITY ..." labels. The RFC does not
>> recognize those labels as special. I find their definitions in the wiki
>> table unusable/impractical, and you naturally think otherwise, but the
>> situation is not as bad as it may seem at the first glance:
>>
>> - "SECURITY ERROR" is used once to report a coding _bug_. That single
>> use case does not match the wiki table SECURITY ERROR description. We
>> should be able to rephrase that single message so that it does not
>> contradict the wiki table and the RFC.
>>
>> - "SECURITY ALERT" is used 6 times. Most or all of those cases are a
>> poor match for the SECURITY ALERT description in the wiki table IMHO. I
>> hope we can find a way to rephrase those 6 cases to avoid conflicts.
>>
>> - "SECURITY NOTICE" is used 3 times. Two of those use cases can be
>> simply removed by removing the long-deprecated and increasingly poorly
>> supported SslBump features. I do not see why we should keep the third
>> message/feature, but if it must be kept, we may be able to rephrase it.
>>
>> If we cannot reach an agreement regarding these 10 special messages, we
>> can leave them as is for now, and come back to them when we find a way
>> to agree on how/whether to assign additional labels to some messages.

> AFAICT, they were added as equivalent to ERROR/WARNING in CVE fixes,

Sorry, I do not understand the relationship between a CVE fix (i.e. a
code change) and a debugs() label that you are referring to above.
Clearly, CVE fixes may or may not alter or add debugs() statements, but
those statements very rarely get (or should get) a special label. I hope
this detail is not important for the RFC though.


> or to highlight a known security vulnerability being opened by admin settings.

We can and, IMO, should highlight those while still using just three
top-level labels.


>> Thus, there are no significant conflicts between the RFC and the table!
>> We strongly disagree how labels should be defined,

> Recall that the wiki is describing the observed pattern of label usage
> by all Squid contributors.

IMO, it does not.


> The options for any author are to comply with the existing
> consensus/pattern or to get agreement on changing the definitions.

Yes, of course. We only disagree on whether the wiki table represents
any existing agreement/consensus or Project definitions, but, again, I
hope we can avoid fighting about that as far as this RFC is concerned.


> Options like changing the labeling scheme are off the table because we
> already have significant amounts of community using those labels with
> third-party tools etc. i.e. the "automatic monitoring" cited as target
> use-case for this proposal are already using the wiki labels as their
> category types.

There is pretty much no official labeling scheme right now, so this RFC
does not _change_ it. The RFC proposes parts of such a scheme for future
use.

The RFC is compatible with what the community is doing/using already and
greatly reduces long-term admin overheads. Any "we cannot change
anything because admins are using these labels" assertion is false
because it is _admins_ that rightfully complain about existing labeling
(and lack thereof) and ask for these changes! Admins really want us
to change/improve things in this area. I would not post this RFC otherwise.


>> We only need to
>> agree that (those 10 SECURITY messages aside) the RFC-driven message
>> categorization projects should adjust (the easily adjustable) messages
>> about Squid problems to use three standard labels: FATAL, ERROR, and
>> WARNING. Can we do just that and set aside the other disagreements for
>> another time?

> Agreed.

Great! I will proceed with that agreement in mind.


> IMO you should not need a formal definition of "status message" and
> "problem message".

Hm... Perhaps there is a misunderstanding of how "problem messages" and
"status messages" categories are defined in the RFC.

For obvious convenience in discussions, documentation, and code, we need
a term that describes

* level-0/1 messages with such well-known prefixes as WARNING:, ERROR:,
and FATAL:.

The RFC calls these messages "problem messages".

All other level-0/1 messages are called "status messages":

* level-0/1 messages without well-known prefixes

That is it! These "definitions" are so
factual/trivial/pragmatic/convenient that (the choice of the words
"problem" and "status" aside) we should not be arguing about (the need
for) them IMO! If you think these categories are not needed, you do not
need to use them, of course, but I see no reason to argue about it. They
contradict/prevent/block nothing.
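
To illustrate how little machinery these two categories require, here is
a minimal sketch of a cache.log classifier that a monitoring script might
use. It is not Squid code. The function name categorize(), the constant
names, and the assumed line layout ("date time [kidN]| text") are
illustrative assumptions, and the sketch presumes the log was produced at
the default debugging level, so every line is a level-0/1 message. The
SECURITY labels are deliberately left out because this thread sets them
aside.

```python
import re

# Prefixes that mark "problem messages" in the RFC terminology.
PROBLEM_PREFIXES = ("FATAL:", "ERROR:", "WARNING:")

# Assumed cache.log line header: "YYYY/MM/DD HH:MM:SS[.mmm][ kidN]| ".
# Adjust this pattern if your logs use a different layout.
LINE_HEADER = re.compile(
    r"^\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}(?:\.\d+)?(?: kid\d+)?\| ?"
)


def categorize(line: str) -> str:
    """Return "problem" for lines with a well-known prefix, else "status"."""
    # Drop the timestamp/kid header (if present) to expose the message text.
    text = LINE_HEADER.sub("", line, count=1).lstrip()
    # str.startswith() accepts a tuple and checks each prefix in turn.
    return "problem" if text.startswith(PROBLEM_PREFIXES) else "status"
```

With consistent labels in place, a monitoring tool could alert on every
"problem" line and treat "status" lines as informational, without keeping
a per-message allow list.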


> All it needs to do is go straight to determine whether the message
> meets one of the wiki labels and apply it or the debugs to level2+.
> "status" vs "problem messages" are an emergent property of the correctly
> used labels.

Lots of existing level-0/1 messages satisfy both of these conditions:

a) They are not ERRORs/WARNINGs/FATALs/SECURITY*s by _any_ definition.
b) They should be logged at level 0 or 1.

In RFC terminology, these are "status messages", but you do not have to
call them that. If you are suggesting that their debugs() level should
be changed to 2+, then I disagree because they are useful for
understanding Squid state and triaging basic problems at the _default_
debugging level. Perhaps you are suggesting something else with regard
to the status messages?


> IIRC there are a large number that will need to change verbosity,
> but fine to do that later.

Agreed on all counts.


Thank you,

Alex.

