[squid-dev] [RFC] Refactor HttpHeader

Kinkie gkinkie at gmail.com
Thu Jul 30 07:44:32 UTC 2015


On Wed, Jul 29, 2015 at 11:31 PM, Amos Jeffries <squid3 at treenet.co.nz>
wrote:

> On 30/07/2015 9:10 a.m., Kinkie wrote:
> > Hi all,
> >    I'm starting to work on refactoring HttpHeader to use LookupTable, and
> > boy that code is a mess..
> >
> > Since it's going to take significant effort, I'd like to get feedback on
> > the changes I'd like to implement, so not to end in long discussions
> later.
> >
> > Current data structures:
> > HeadersAttrs: static declaration of header name, header ID, header type
> > (string/int/etc). Must be sorted by numeric value of ID
> > Headers: built at initialization time from HeadersAttrs, it's an array of
> > (name, id, type, stats) structs
> > ListHeadersArr, GeneralHeadersArr, EntityHeadersArr, RequestHeadersArr,
> > ReplyHeadersArr, etc: lists of headers ID (possibly overlapping) which
> are
> > used to generate..
> > ListHeadersMask, GeneralHeadersArr, EntityHeadersArr etc: bitmaps used
> > (only) by HttpHeaderStats to assemble different stats sets
>
> I seem to recall a hop-by-hop headers lists as well and the
> request/reply lists being used by message filtering logic to strip away
> invalid and hop-by-hop headers ?
>

That was a partial list. But yes, we need a quick way to understand some
spec characteristics of any given header, defined by ID. Current approach
is very effective at packing the information but very low-level, but very
low-level and probably not very much effective.


> > I would like to turn these into:
> > headerTable: LookupTable<>::Record mapping header name to header ID, used
> > to generate ...
> > headerLookupTable: a fast lookup table of header names to ids
> > headerStatsTable: a std::vector<HttpHeaderFieldStat> indexed by header ID
> > to collect the statistics currently in Headers[id].stats.
> > headerDescription: a std::vector keyed by header ID containing header
> type
> > (currently in Headers[id].type), possibly header name (if useful), a
> > bitfield noting if HTTP_HDR{LIST,GENERAL,REQUEST,REPLY,...}.
> >
> > There are some possible optimizations, but at a minimum this should help
> > keep information more organized while introducing no performance
> > regressions.
> > What do you think?
> >
>
> In my brief look the other day I thought a small struct {ID, type, group
> bitmask} could be used as the EnumType on LookupTable.
>

I don't like the idea, as it makes the header name the primary key. We want
the primary key to be header ID, and header name just an index. It really
is two different users: 1. given a header name, get the ID; 2. given an ID,
get the header's characteristics.

We can do it by changing LookupType to be parametric on Record

template <typename EnumType>
typedef struct LookupTableRecord
{
   const char *name;
   EnumType id;
}

template<typename Enumtype, typename RecordType =
LookupTableRecord<EnumType> >
class LookupTable
{
   //...
}

This would make it so that someone could define a custom Record type, and
as long as that record type matches the signature of LookupTableRecord,
LookupTable won't care.

This also has the advantate of not needing to fill in different data
structures from the static initializer table. As long as the id of each row
in the initializer table is equal to the index of that row (which is an
already-present requirement), then the initializer table will carry all the
read-only information we need.

Plan #2: change the LookupTable API to allow filling it in from the
outside, and initialize it from the module init code (just like now). This
way we can have a single big table with all the relevant information and
fill in different data structures during the initialization loop.


> LookupTable really only needs a POD type that can be copied cheaply for
> its stored type. Enum / int is the simplest of those but not mandatory.
>
> If that works then you can collapse headerLookupTable and
> headerDescription into one list that efficiently looks up all data for
> the header. Getting rid of the multiple sub-list type arrays.
>


The proposed change to LookupTable seems more elegant to me and is tested
working; what do you think?

-- 
    Kinkie
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.squid-cache.org/pipermail/squid-dev/attachments/20150730/0840aba0/attachment.html>


More information about the squid-dev mailing list