<div dir="ltr">Merged as revision 14210 with some minor added polish (ostream<< for HttpHdrCcType and Http::HdrType).</div><div class="gmail_extra"><br><div class="gmail_quote">On Wed, Aug 5, 2015 at 3:58 PM, Kinkie <span dir="ltr"><<a href="mailto:gkinkie@gmail.com" target="_blank">gkinkie@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><br><div class="gmail_extra"><br><div class="gmail_quote"><div><div class="h5">On Wed, Aug 5, 2015 at 4:08 AM, Amos Jeffries <span dir="ltr"><<a href="mailto:squid3@treenet.co.nz" target="_blank">squid3@treenet.co.nz</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">On 5/08/2015 9:08 a.m., Kinkie wrote:<br>
<div><div>> On Tue, Aug 4, 2015 at 2:58 PM, Amos Jeffries wrote:<br>
><br>
>> On 4/08/2015 11:22 p.m., Kinkie wrote:<br>
>>> Hi all,<br>
>>> the attached patch is a build- and run-tested merge proposal for next<br>
>>> round of the coverity-fixes branch, currently focusing on more effective<br>
>>> header name -> id lookups.<br>
>>><br>
>>> Here's the status with the current todo checklist:<br>
>>><br>
>>> * DONE 1. shift HDR_BAD_HDR to end of enum<br>
>>> * DONE 2. shift headers data array to http/RegistredHeaders.cc<br>
>>> * DONE 3. creatign LookupTable object from teh enum and array<br>
>>> * (with HDR_BAD_HDR as invalid value)<br>
>>> * DONE 4. replacing httpHeaderIdByName() uses with the lookup table<br>
>>> * NOT POSSIBLE 5. merge HDR_BAD_HDR and HDR_ENUM_END into one thing -<br>
>>> HDR_ENUM_END is overloaded meaning "All" headers in Manglers.<br>
>>> * DONE 6. replacing httpHeaderNameById with direct array lookups<br>
>>> * DONE 7. being looking at the other arrays removal<br>
>>><br>
>>> In working on this I found out several instances of enum abuses -<br>
>> tracking<br>
>>> those down has been the hardest part of the effort.<br>
>>> HttpHeader::parse is being used to parse error page templates - thus the<br>
>>> relaxed any_registered_header() checks in some methods, e.g.<br>
>>> HttpHdeader::addEntry().<br>
>>><br>
>>> Next steps, if there is consensus:<br>
>>> - moving LookupTable away from std::map to std::hash with custom<br>
>>> gperf-derived Hashers for extra boost<br>
>>> - investigating whether strongly-typed enums can be used instead of<br>
>> C-style<br>
>>> enums in more places.<br>
>>> - moving away from homegrown bitfields (CBIT_TEST etc.) towards<br>
>>> std::vector<bool> or std::vector<unsigned char>, possibly via a class<br>
>>> bitfield or somesuch.<br>
>>><br>
>><br>
>> audit results:<br>
>><br>
>> * httpHdrCcCleanModule() and httpHdrScCleanModule() can both be fully<br>
>> deleted now.<br>
>><br>
><br>
> Yes. I chose to let them in: they are NOPs, not in the critical path, and<br>
> may be useful in the future. Let me know if you still want them removed.<br>
><br>
<br>
</div></div>:-( more 'dead' code. This kind of thing is useful for<br>
ctpr/dtor/functor/virtual definition. But C functions that are not even<br>
used as functors is a waste of LOC and also compiler time dealing with<br>
the symbols.<br>
Its minor but still technical debt. The long-term plan is also to use<br>
RegisteredRunners for this type of thing not C functions.<br></blockquote><div><br></div></div></div><div>Ok. Removed all empty module cleanup functions - which are only invoked if LEAKCHECK is enabled anyway.<br></div><span class=""><div></div><div><br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><span>
>> * LookupTableRecord should be a class.<br>
>> - custom storage types can then inherit from it, with it as the first<br>
>> parent<br>
>><br>
><br>
> Well, templatization can prevent that need, but sure. Should I implement<br>
> that?<br>
> Isn't nowadays a struct == all-public class though? It looks odd, but<br>
> functionally is the same<br>
><br>
<br>
</span>Avoid "looks odd" whenever possible :-) as you say, the compiler dont<br>
care. But this is about us humans quick and easy reading it in 10 years<br>
time.<br>
<br>
I think its okay for useful code optimizations. Bt then the design<br>
choice needs documenting so nobody breaks or undoes it casually.<br></blockquote><div><br></div></span><div>Ok.<br></div><span class=""><div> <br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><span>
><br>
>> in src/HttpHeader.cc:<br>
>><br>
>> * can you move the headerLookupTable to src/http/RegisteredHeaders.cc<br>
>> and the Http:: namespace as well please ?<br>
>><br>
><br>
> Moved to RegisteredHeaders and renamed HDR_FOO to Http::HdrType::FOO .<br>
> Moving to a strongly-typed enum is unfortunately not feasible as eCAP<br>
> requires integer-enum equivalence; it may be that the whole change has to<br>
> be reverted.<br>
><br>
<br>
</span>I thought "enum class" with first parameter value assigned 0<br>
("HDR_ACCEPT = 0,") should do that. ie. strongly typed to unsigned<br>
integer values.<br></blockquote><div><br></div></span><div>No.<br></div><div>strongly-typed == doesn't automatically cast to int, if you use strongly-typed you have to static_cast, while for old-style enum casting is automatic.<br><br></div><div>in c++11 the full enum syntax (square brackets mean optional) is<br></div><div>enum [ class ] enumName [ : storage_specification ] {<br></div><div> enum_elem_1 [ = elem-1-representation ],<br> ...<br>};<br><br></div><div>where storage_specification is the underlying integral data type.<br></div><div>Using the optional "class" specification disables auto-cast-to-int.<br></div><div>Elements can then be referenced by enumName::enum_elem_name , where the enumName specification is mandatory for "enum class", optional for old-style enum.<br><br></div><div>BTW: I've tried changing the underlying storage to unsigned, and I get asserts in masks calculation.<br></div><span class=""><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><span>
>> * any_registered_header() is wrong.<br>
>> - it matches for HDR_OTHER which by definition is a non-registered header<br>
>> - assert_eid() is equivalent to any_valid_header()<br>
>><br>
><br>
> Then we need to change name. We need two semantics:<br>
> 1) any header which is valid and defined (including OTHER)<br>
> 2) any header ID which will not go out-of-bounds (same as (1) + HDR_BAD_HDR)<br>
><br>
> any suggestion?<br>
><br>
<br>
</span>(1) is already any_valid_header().<br>
<br>
(2) any_HdrType_enum_value()<br>
<br>
But where/why do we need (2) ?<br>
Any value input or received that is non-(1) needs to be represented as<br>
BAD_HDR when parsing/validating the input to an enum value.<br></blockquote><div><br></div></span><div>It's sprinkled all over the place, mostly meant to protect against int-related abuses which may end up causing out of bounds in arrays, vectors and masks. Moving to enum class would mean implicit safety in this regard, but would require accessor functions to ensure consistent checks in typecasts. Hello again HdrIdFromName and HdrDescFromId. I'm cool with it, if that's a wise design decision in your opinion. It'd save the need of a dozen asserts or so.<br></div><span class=""><div> <br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
Are you looking at things HttpHeader.cc line ~524:<br>
<br>
AA)<br>
<br>
+ if (e->id >= Http::HdrType::ENUM_END) {<br>
debugs(55, DBG_CRITICAL, "BUG: invalid entry (" << e->id << ")...<br>
<br>
That should probably be if(!any_valid_header(e->id)) since BAD is also<br>
invalid entry value for a header.<br></blockquote><div><br></div></span><div>aha, got it and turned in all places where ENUM_END is used to this end.<br></div><span class=""><div> <br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
If the logic there explicitly requires BAD handling, it should be:<br>
// some reason for why BAD is accepted as 'valid'.<br>
if(!any_valid_header(e->id) && e->id != BAD)<br>
<br>
Although note that when parsing is fixed BAD will represent *any*<br>
invalid value.<br>
<br>
<br>
or BB) the header manglers "All" ?<br>
<br>
For that I think we need a new enum value like "Other," ("All," ?)<br>
which is outside registered headers, but valid only as an enum entry.<br>
Currently END, but that needs a rename and never to be used by<br>
non-mangler code.<br></blockquote><div><br></div><div><br></div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<span><br>
<br>
>><br>
>> - please consider removing the assert(assert(any_valid_header(id)))<br>
>> it could probably be replaced by:<br>
>> if (!assert(any_valid_header(id)))<br>
>> id = HDR_OTHER<br>
>><br>
><br>
> Huh? I can't find any nested assert<br>
><br>
<br>
</span>copy-paste error on my part. It is the assert you are adding three lines<br>
below what is now "got hdr-id=" in the chunk @1512.<br>
<br>
I can't see the function name, but it looks like parsing some input.<br>
Avoid assert in all parsers. Output BAD or OTHER as appropriate instead.<br></blockquote><div><br></div></span><div>It's HttpHeader::parse.<br></div><div>That assert is actually useless: it's set by a HeaderLookupTable.lookup. It's guaranteed to be valid (if BAD, and the BAD case is handled by reassigning to OTHER.<br></div><div>Removing that assert full stop.<br></div><span class=""><div> <br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
I been thinking a bit more (sorry dangerous).<br>
<br>
It may be worth adjusting the src/mk-string-arrays.awk script to also be<br>
capable of output LookupTable<X>::Record arrays. That script is designed<br>
to guarantee invariance. Though it does leave us with another .cc.<br>
Anyhow thats a followup to think about, not in this scope.<br></blockquote><div><br></div></span><div>I agree on it not being in scope. Also because if we wish to use gperf to generate the perfect hashes, then gperf could (and thus should) do that for us (see <a href="http://www.gnu.org/software/gperf/manual/html_node/User_002dsupplied-Struct.html#User_002dsupplied-Struct" target="_blank">http://www.gnu.org/software/gperf/manual/html_node/User_002dsupplied-Struct.html#User_002dsupplied-Struct</a> )</div><span class=""><div> <br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
second round audit:<br>
<br>
in src/HttpHdrCc.cc:<br>
<br>
* valid_id in httpHdrCcStatDumper() can probably be a const bool like<br>
the other.<br></blockquote><div><br></div></span><div>Yes. Done. <br><br></div><span class=""><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
in src/HttpHdrSc.cc:<br>
<br>
* useless include of dlink.h since its done in the .h<br></blockquote></span><div><br>Ok.<br></div><span class=""><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<br>
* a duplicate include of HttpHdrSc.h since its pulled in by<br>
HttpHdrScTarget.h.<br>
- but in this case I think replece it with:<br>
//#include "HttpHdrSc.h" // pulled in by HttpHdrScTarget.h<br></blockquote><div><br></div></span><div>Done. Good idea, it'd look odd otherwise.<br></div><span class=""><div> <br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<br>
in src/HttpHdrSc.h:<br>
<br>
* please use mem/forward.h instead of mem/AllocatorProxy.h<br></blockquote><div><br></div></span><div>Ok<br></div><span class=""><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<br>
* please store the class pre-defines alphabetically.<br>
- Packable above StatHist.<br></blockquote><div><br></div></span><div>Done. <br></div><span class=""><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<br>
<br>
in src/HttpHdrScTarget.cc:<br>
<br>
* useless include of HttpHdrSc.h<br>
- thanks for fixing the missing HttpHdrScTarget.h<br></blockquote><div><br></div></span><div>Done. NP :)<br></div><span class=""><div> <br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
in src/HttpHdrScTarget.h:<br>
<br>
* now has useless include of mem/forward.h (after above fix),<br>
SquidString.h, dlink.h, typedefs.h<br>
<br></blockquote></span><div>Ok.<br></div><span class=""><div> <br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
in src/HttpHeader.cc:<br>
<br>
* please take the opportunity to remove useless includes overlapping<br>
with HttpHdrCc.h and HttpHdrSc.h<br></blockquote><div><br></div></span><div>Done. I've commented them and mentioned the indirect inclusion in a comment.<br> <br></div><span class=""><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
* please replace include fof HttpHdrSc.h with HttpHdrScTarget.h in the<br>
new include nesting.<br>
- also checking for duplicate includes between HttpHdrScTarget.h and<br>
HttpHeader.cc<br></blockquote><div><br></div></span><div>Ok.<br></div><span class=""><div> <br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
* chunk @562, s/NULL/nullptr/<br>
- maybe others in the touched lines<br></blockquote><div><br></div></span><div>Done.<br></div><span class=""><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<br>
* HttpHeaderEntry dtor use of assert(any_valid_header(id)); at the top<br>
is needless.<br>
- protect the stats accounting against BAD though<br></blockquote><div><br></div></span><div>Done.<br> <br></div><span class=""><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<br>
* HttpHeaderEntry ctor accepts OTHER but also "anId != BAD" implies that<br>
it accepts BAD as well despite the previous<br>
assert(any_registered_header(anId)).<br>
- wrap the stats accounting in if (id != BAD).<br></blockquote><div><br></div></span><div>done<br></div><span class=""><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<br>
- fixing those ctor/dtor asserts will avoid any potential for assertion<br>
from temporary default- or partially- constructed objects, or bad<br>
emplacement destructions.<br></blockquote><div><br></div></span><div>..which are not there, or we'd assert all over the place. But sure.<br></div><span class=""><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<br>
<br>
in src/acl/RequestHeaderStrategy.h:<br>
<br>
* please take the opportunity to remove that empty line between<br>
'template' and 'class' at lines at ~14<br></blockquote><div><br></div></span><div>Done <br></div><span class=""><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
in src/http/RegisteredHeaders.cc:<br>
<br>
* please add empty line between namespace '{' and start of comment.<br></blockquote><div><br></div></span><div>Ok<br> <br></div><span class=""><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<br>
<br>
in src/http/RegisteredHeaders.h:<br>
<br>
* please document on HeaderTable that lookup() will return BAD if the<br>
header is unknown / unregistered. callers need to check and handle that<br>
case properly.<br></blockquote><div><br></div></span><div>Ok.<br></div><span class=""><div> <br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
I think that is all. And all polish :-)<br>
<br>
+1 on this so far. Though you/we still need to sort the valid registered<br>
ID tests before commit.<br></blockquote><div><br></div></span><div>I think I have fixed those, although it's hard to be sure - layering violations all over the place.<br></div><span class=""><div> <br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
NOTE: these are about old bugs you are uncovering. I'd like to take the<br>
opportunity to fix them. But this is already huge, so for now just XXX<br>
mark it and followup patch.<br></blockquote><div><br><br></div></span><div>Like many other areas, there is (plenty of) room for improvement :)<br></div><div>If not performance-wise, certainly code-wise.<br></div><div>I'd leave them as a follow-up effort.<br></div><span class=""><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<br>
* I'm still not clear on why insertEntry() checks for valid header enum<br>
values but addEntry() needs to accept BAD.<br>
- if that need is true its worth a comment to document. otherwise<br>
update top of addEntry() to match insertEntry().<br>
- NP: we should *never* be adding a BAD/invalid header entry to a<br>
message object. Custom ones yes, but they are OTHER.<br>
<br>
<br>
* like has() all the has/put/get*(ID, ...) methods need to exclude BAD<br>
and OTHER as a valid inputs.<br>
- OTHER means we are accessing and using the Entry by string name, not<br>
by ID value.<br>
- BAD means we should never have put it into object in the first place.<br>
- the put*(ID, ...) accepting OTHER may be where addEntry(ID,...) goes<br>
wrong.<br>
- sorry these seem to have got caught in cleaning up one of my earlier<br>
comments which was about another changed use of assert_eid().<br>
assert_eid() test was wrong for these get/put/has*(ID,...) to begin with.<br>
<br>
<br>
[[ httpHeaderFieldStatDumper() earning itself a place in the hall of<br>
horrors :-(<br>
..., double val, ...<br>
const int id = static_cast<int>(val);<br>
const bool valid_id = id < Http::HdrType::ENUM_END;<br>
// back away, quietly ...<br>
]]<br>
<span></span><br></blockquote></span></div><span class="HOEnZb"><font color="#888888"><br clear="all"><br>-- <br><div> Francesco</div>
</font></span></div></div>
</blockquote></div><br><br clear="all"><br>-- <br><div class="gmail_signature"> Francesco</div>
</div>