[squid-dev] [PATCH] HttpHeader migration (coverity-fixes branch)

Amos Jeffries squid3 at treenet.co.nz
Wed Aug 5 02:08:23 UTC 2015


On 5/08/2015 9:08 a.m., Kinkie wrote:
> On Tue, Aug 4, 2015 at 2:58 PM, Amos Jeffries wrote:
> 
>> On 4/08/2015 11:22 p.m., Kinkie wrote:
>>> Hi all,
>>>   the attached patch is a build- and run-tested merge proposal for next
>>> round of the coverity-fixes branch, currently focusing on more effective
>>> header name -> id lookups.
>>>
>>> Here's the status with the current todo checklist:
>>>
>>>  * DONE 1. shift HDR_BAD_HDR to end of enum
>>>  * DONE 2. shift headers data array to http/RegistredHeaders.cc
>>>  * DONE 3. creatign LookupTable object from teh enum and array
>>>  * (with HDR_BAD_HDR as invalid value)
>>>  * DONE 4. replacing httpHeaderIdByName() uses with the lookup table
>>>  * NOT POSSIBLE 5. merge HDR_BAD_HDR and HDR_ENUM_END into one thing -
>>> HDR_ENUM_END is overloaded meaning "All" headers in Manglers.
>>>  * DONE 6. replacing httpHeaderNameById with direct array lookups
>>>  * DONE 7. being looking at the other arrays removal
>>>
>>> In working on this I found out several instances of enum abuses -
>> tracking
>>> those down has been the hardest part of the effort.
>>> HttpHeader::parse is being used to parse error page templates - thus the
>>> relaxed any_registered_header() checks in some methods, e.g.
>>> HttpHdeader::addEntry().
>>>
>>> Next steps, if there is consensus:
>>> - moving LookupTable away from std::map to std::hash with custom
>>> gperf-derived Hashers for extra boost
>>> - investigating whether strongly-typed enums can be used instead of
>> C-style
>>> enums in more places.
>>> - moving away from homegrown bitfields (CBIT_TEST etc.) towards
>>> std::vector<bool> or std::vector<unsigned char>, possibly via a class
>>> bitfield or somesuch.
>>>
>>
>> audit results:
>>
>> * httpHdrCcCleanModule() and httpHdrScCleanModule() can both be fully
>> deleted now.
>>
> 
> Yes. I chose to let them in: they are NOPs, not in the critical path, and
> may be useful in the future. Let me know if you still want them removed.
> 

:-( more 'dead' code. This kind of thing is useful for
ctpr/dtor/functor/virtual definition. But C functions that are not even
used as functors is a waste of LOC and also compiler time dealing with
the symbols.
 Its minor but still technical debt. The long-term plan is also to use
RegisteredRunners for this type of thing not C functions.


>> * LookupTableRecord should be a class.
>>  - custom storage types can then inherit from it, with it as the first
>> parent
>>
> 
> Well, templatization can prevent that need, but sure. Should I implement
> that?
> Isn't nowadays a struct == all-public class though? It looks odd, but
> functionally is the same
> 

Avoid "looks odd" whenever possible :-) as you say, the compiler dont
care. But this is about us humans quick and easy reading it in 10 years
time.

I think its okay for useful code optimizations. Bt then the design
choice needs documenting so nobody breaks or undoes it casually.


> 
>> in src/HttpHeader.cc:
>>
>> * can you move the headerLookupTable to src/http/RegisteredHeaders.cc
>> and the Http:: namespace as well please ?
>>
> 
> Moved to RegisteredHeaders and renamed HDR_FOO to Http::HdrType::FOO .
> Moving to a strongly-typed enum is unfortunately not feasible as eCAP
> requires integer-enum equivalence; it may be that the whole change has to
> be reverted.
> 

I thought "enum class" with first parameter value assigned 0
("HDR_ACCEPT = 0,") should do that. ie. strongly typed to unsigned
integer values.


>> * any_registered_header() is wrong.
>>  - it matches for HDR_OTHER which by definition is a non-registered header
>>  - assert_eid() is equivalent to any_valid_header()
>>
> 
> Then we need to change name. We need two semantics:
> 1) any header which is valid and defined (including OTHER)
> 2) any header ID which will not go out-of-bounds (same as (1) + HDR_BAD_HDR)
> 
> any suggestion?
> 

(1) is already any_valid_header().

(2) any_HdrType_enum_value()


But where/why do we need (2) ?
 Any value input or received that is non-(1) needs to be represented as
BAD_HDR when parsing/validating the input to an enum value.

Are you looking at things HttpHeader.cc line ~524:

AA)

+  if (e->id >= Http::HdrType::ENUM_END) {
    debugs(55, DBG_CRITICAL, "BUG: invalid entry (" << e->id << ")...

That should probably be if(!any_valid_header(e->id)) since BAD is also
invalid entry value for a header.

If the logic there explicitly requires BAD handling, it should be:
  // some reason for why BAD is accepted as 'valid'.
  if(!any_valid_header(e->id) && e->id != BAD)

Although note that when parsing is fixed BAD will represent *any*
invalid value.


or BB) the header manglers "All" ?

For that I think we need a new enum value like "Other,"  ("All," ?)
which is outside registered headers, but valid only as an enum entry.
Currently END, but that needs a rename and never to be used by
non-mangler code.


>>
>>  - please consider removing the assert(assert(any_valid_header(id)))
>>    it could probably be replaced by:
>>       if (!assert(any_valid_header(id)))
>>           id = HDR_OTHER
>>
> 
> Huh? I can't find any nested assert
> 

copy-paste error on my part. It is the assert you are adding three lines
below what is now "got hdr-id=" in the chunk @1512.

I can't see the function name, but it looks like parsing some input.
Avoid assert in all parsers. Output BAD or OTHER as appropriate instead.


I been thinking a bit more (sorry dangerous).

It may be worth adjusting the src/mk-string-arrays.awk script to also be
capable of output LookupTable<X>::Record arrays. That script is designed
to guarantee invariance. Though it does leave us with another .cc.
Anyhow thats a followup to think about, not in this scope.


second round audit:

in src/HttpHdrCc.cc:

* valid_id in httpHdrCcStatDumper() can probably be a const bool like
the other.


in src/HttpHdrSc.cc:

* useless include of dlink.h since its done in the .h

* a duplicate include of HttpHdrSc.h since its pulled in by
HttpHdrScTarget.h.
 - but in this case I think replece it with:
  //#include "HttpHdrSc.h" // pulled in by HttpHdrScTarget.h


in src/HttpHdrSc.h:

* please use mem/forward.h instead of mem/AllocatorProxy.h

* please store the class pre-defines alphabetically.
 - Packable above StatHist.


in src/HttpHdrScTarget.cc:

* useless include of HttpHdrSc.h
 - thanks for fixing the missing HttpHdrScTarget.h


in src/HttpHdrScTarget.h:

* now has useless include of mem/forward.h (after above fix),
SquidString.h, dlink.h, typedefs.h


in src/HttpHeader.cc:

* please take the opportunity to remove useless includes overlapping
with HttpHdrCc.h and HttpHdrSc.h

* please replace include fof HttpHdrSc.h with HttpHdrScTarget.h in the
new include nesting.
 - also checking for duplicate includes between HttpHdrScTarget.h and
HttpHeader.cc

* chunk @562, s/NULL/nullptr/
 - maybe others in the touched lines

* HttpHeaderEntry dtor use of assert(any_valid_header(id)); at the top
is needless.
 - protect the stats accounting against BAD though.


* HttpHeaderEntry ctor accepts OTHER but also "anId != BAD" implies that
it accepts BAD as well despite the previous
assert(any_registered_header(anId)).
 - wrap the stats accounting in if (id != BAD).

 - fixing those ctor/dtor asserts will avoid any potential for assertion
from temporary default- or partially- constructed objects, or bad
emplacement destructions.


in src/acl/RequestHeaderStrategy.h:

 * please take the opportunity to remove that empty line between
'template' and 'class' at lines at ~14


in src/http/RegisteredHeaders.cc:

* please add empty line between namespace '{' and start of comment.


in src/http/RegisteredHeaders.h:

* please document on HeaderTable that lookup() will return BAD if the
header is unknown / unregistered. callers need to check and handle that
case properly.


I think that is all. And all polish :-)

+1 on this so far. Though you/we still need to sort the valid registered
ID tests before commit.


NOTE: these are about old bugs you are uncovering. I'd like to take the
opportunity to fix them. But this is already huge, so for now just XXX
mark it and followup patch.

* I'm still not clear on why insertEntry() checks for valid header enum
values but addEntry() needs to accept BAD.
 - if that need is true its worth a comment to document. otherwise
update top of addEntry() to match insertEntry().
 - NP: we should *never* be adding a BAD/invalid header entry to a
message object. Custom ones yes, but they are OTHER.


* like has() all the has/put/get*(ID, ...) methods need to exclude BAD
and OTHER as a valid inputs.
 - OTHER means we are accessing and using the Entry by string name, not
by ID value.
 - BAD means we should never have put it into object in the first place.
 - the put*(ID, ...) accepting OTHER may be where addEntry(ID,...) goes
wrong.
 - sorry these seem to have got caught in cleaning up one of my earlier
comments which was about another changed use of assert_eid().
assert_eid() test was wrong for these get/put/has*(ID,...) to begin with.


[[ httpHeaderFieldStatDumper() earning itself a place in the hall of
horrors :-(
  ..., double val, ...
  const int id = static_cast<int>(val);
  const bool valid_id = id < Http::HdrType::ENUM_END;
 // back away, quietly ...
]]

Amos


More information about the squid-dev mailing list