[squid-dev] [PATCH] Coverity(-inspired) fixes part four, HttpHeader refactor

Amos Jeffries squid3 at treenet.co.nz
Fri Aug 21 12:35:05 UTC 2015


On 21/08/2015 10:37 p.m., Kinkie wrote:
> On Thu, Aug 20, 2015 at 8:46 AM, Amos Jeffries <squid3 at treenet.co.nz> wrote:
> 
>> On 16/08/2015 7:09 p.m., Kinkie wrote:
>>> Hi all,
>>>   the attached patch does:
>>> - EnumIterator, which now uses autoconf to detect std::underlying_type
>>> - Removes header masks in HttpHeader.cc and HttpReply in favor of..
>>> - gperf-generated header table with flags describing headers all
>> collected
>>> in one place
>>> - replaces HttpHeader::getEntry with explicit vector iteration
>>> - Implements and uses Http::HeaderTable.lookup(string-or-Http::HdrType)
>> to
>>> look up headers in the gperf-generated hash
>>> - small optimizationi in HttpHeader::getLastEntry thanks to reverse
>>> iterators
>>> - use erase-remove pattern in HttpHeader::compact in place of resize()
>>> - use std:: algorithms instead of explicit loop in HttpHeader::delById
>>> - gets rid of httpHeaderCalcMask - masks are still used elsewhere but
>>> generating them is now easier thanks to WholeEnum<type>
>>> - some readability improvements
>>> - refactor HeaderManglers::track to avoid overloading of meaning the
>>> enumEnd_ enumerator - please help me checking carefully I haven't altered
>>> functionality here
>>> - implements a case-insensitve hash function for SBuf
>>> - all known headers are now unconditionally parsed. May not be acted upon
>>> if some functions are not compiled in
>>> - LookupTable now uses an unordered_map instead of map for backing
>> storage.
>>> No user-visible change for SBuf and std::string users unless they wish to
>>> have a custom hasher
>>>
>>> Most code has been run-tested (most notable exception: HeaderManglers),
>>> farm-build-, coadvisor- and polygraph- tested. The main focus of this
>>> refactoring is correctness and readability, c++11-ification and
>> recovering
>>> from the performance regression that was introduced ~10 days ago.
>>>
>>
>>
>> Here goes. Audit:
>>
>>
>> * IMO the HeaderManglers really do need at least basic testing with
>> these changes. A regression there affects widely used features.
>>
>> basic tests are easy enough. A simple proxy altering the Server header
>> to a fixed value, adding some custom new header, and stripping something
>> common like Expires should be sufficient to trigger any code problems
>> and eyeball any major issues from the squidclient output to verify working.
>>
>>
> After some careful checking:
> *_header_access and *_header_add work fine.
> *_header_replace is broken, but it's been for a long time (I run-tested
> with code from february).
> 


<snipping the done requests>


>> in src/Makefile.am:
>>
>> * for testEnumIterator please add $(SQUID_CPPUNIT_LA) to the *_LDADD
>> list instead of a *_DEPENDENCIES entry
>>
> 
> hm.. it's applied inconsistently. Documentation (at line 957) says to have
> it in both places.
> testUfs, testRock : do not have it as a dependency
> testHttpReply, testBoilerplate, testCacheManager, testDiskIO, testEvent,
> testEventLoop, test_http_range, testHttp1Parser, testHttpRequest,
> testStore, testString, testUrl, testSBuf, testSBufList, testConfigParser,
> testStatHist (and so on..) have it as a dependency
> testACLMaxUserIP : used to have it as a dependency (but now it's commented
> out)
> 
> For now I'm following documentation and leaving it in both places; if you
> have an opinion about how it should be I'll gladly carpet-apply that to the
> whole file.


Documentation is outdated. I'm trying to work on _DEPENDENCIES removal
now for automake 1.15 support as another incremental fix-up polishing as
we add/change tests. Each test that has one does not check all its
_LDADD indirect dependencies properly, so can inconsistently link tests.

As far as SQUID_CPPUNIT_LA is concerned the issue seems to have been a
bug that was resolved in older automake and/or cppunit versions. It
works fine with just one _LDADD entry on the tests I've converted so far.


It looks like I'm going to have to do a sorting sweep on *_LDADD anyway.
So up to you.


> 
>> * also please place tests/stub_* and base/EnumIterator.h in the nodist_*
>> list from now on.
>>
> 
> Ok. Changed documentation at about line 951 to reflect that.
> 
> 
>>  - the test _SOURCE list should now only contain the tests/test*.{h,cc}
>> actually containing the unit-test code.
>>
> Does this also apply to the files actually being tested? (X.h, X.cc)
> I am fine either way, I would just like it to be reflected in the
> documentation. Not changed there for now, just in the testEnumIterator part.
> 

Yes. Especially them. The purpose is to ensure they are correctly listed
in a _SOURCES for one of the libraries or binaries in normal builds.

Sorry, I got a bit distracted on the _DEPENDENCIES removal work. Docs
update still coming.


> 
>>  - this helps prevent unit tests being the only reason files are dist
>> (found a few like that already), so we can ensure they are linked to
>> Squid main binary or libraries properly for automake dependency detection.
>>
>>
>> in src/SBufAlgos.cc:
>>
>> * whats cctype needed for?
>>
> 
> tolower(3).
> 

Ah. Thought so. Please use the xtolower() provided by libcompat.


>>
>>
>> in src/http.cc:
>>
>> * please mark use of name.c_str() with "XXX: performance regression,
>> c_str() reallocates"
>>
> 
> Actually, no in this case it doesn't unless ilen is exactly equal to one of
> the MemBlob MemPool sizes (possible but unlikely, I estimate in the 0.1%
> range).
> There are some performance deltas but they are IMO quite marginal:
> slowdowns: SBuf housekeeping, temp creation (of ilen+1 bytes) done in two
> steps instead of one
> speedups: SBuf uses mempools, "*" case is checked before lowering case
> 

Okay. If you are sure. I have thought that a few times then found SBuf
reallocating based on seemingly just the lock counter.


>>
>> in src/http/RegisteredHeadersHash.cci:
>>
>> * is there any way to add a comment blurb "/* AUTO-GENERATED DO NOT EDIT
>> */" to the top of this file ?
>>
> 
> It can and I did, but it's not possible at the very top unfortunately.
> 
>>
>> * also need to add the copyright boilerplate
>>
> 
> Same thing.
> 

Fine. At least they there now to silence the "CHECK COPYRIGHT"
maintenance warnings. I will see what else is needed by the scripts once
this is all merged.


+1. with the below documentation polish :-)

New bits:

in src/HttpHeader.cc
* HttpHeader::append now s/addEntry (/addEntry(/


in src/HttpReply.cc:
* one more for-loop needing brackets


in src/SBufAlgos.h:
* please use doxygen \code ... \endcode for code exammples.
 - definitely NOT using shell-eval quotes


in src/acl/HttpHeaderData.h:
* please re-aligning the comments after code shrinkage :-)


in src/base/EnumIterator.h:

* one empty line after the boilerplate please.
 - surprised the formatting run did not catch that, it does on master.
 - same in testEnumIterator.h

* s/or c++11 strongly-typed/or C++11 strongly-typed/
 - "C++11" is a noun/acronym.

* please do not quote around begin()/end()/rbegin()/rend() names
 - doxygen should highlight and link naturally
 - specialy not with shell-eval quotes.
 - might be best to look for more of those in this patch if you can. I
cant because my command-line tries to execute something just looking for
a string involving them.


* s/doesn't/does not/ and s/you'll/you will/
 - documentation is formal language.

* all this "prefix increment" / "postfix increment" operator documentation
 - if its worth writing them at all use doxygen "///".

* EnumRangeT
 - s/mote/note/ ?
 - and shell-eval quotes again

* WholeEnum
 - doxygen syntax is:  /** [brief] \n ... \n */
 - if you start a multi-line description on the first line after "/**"
things get weird by only using part of the text in some output douments.
 - so keep 1-liners for brief description, and omit anything on the same
line as "/**" if you need just need a good long description.
 - this is why I generally avoid the 1-liner brief prefixes.


in src/esi/Libxml2Parser.h
* revert to trunk version.


in src/http/RegisteredHeaders.h:
* HeaderTableRecord if its worth using '//' on the bools its worth using
'///< '

* HeaderLookupTable_t
 - s/header types' definitions/header definitions/
 - on BadHdr; use doxygen '///< '

in src/tests/testEnumIterator.cc
* in testBidirectionalIter and testUnsignedEnum
 - how about using &&count<=20 to terminate the loops if things go very
wrong?
 - I'd hope for something similar in the testRangeFor*() loops, but not
sure if/how its possible to enforce an exit with range-for being
hyper-threaded sometimes.


Amos


More information about the squid-dev mailing list