<div dir="ltr"><br><div class="gmail_extra"><br><div class="gmail_quote">On Wed, Jul 29, 2015 at 11:31 PM, Amos Jeffries <span dir="ltr"><<a href="mailto:squid3@treenet.co.nz" target="_blank">squid3@treenet.co.nz</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><span class="">On 30/07/2015 9:10 a.m., Kinkie wrote:<br>
> Hi all,<br>
> I'm starting to work on refactoring HttpHeader to use LookupTable, and<br>
> boy that code is a mess..<br>
><br>
> Since it's going to take significant effort, I'd like to get feedback on<br>
> the changes I'd like to implement, so not to end in long discussions later.<br>
><br>
> Current data structures:<br>
> HeadersAttrs: static declaration of header name, header ID, header type<br>
> (string/int/etc). Must be sorted by numeric value of ID<br>
> Headers: built at initialization time from HeadersAttrs, it's an array of<br>
> (name, id, type, stats) structs<br>
> ListHeadersArr, GeneralHeadersArr, EntityHeadersArr, RequestHeadersArr,<br>
> ReplyHeadersArr, etc: lists of headers ID (possibly overlapping) which are<br>
> used to generate..<br>
> ListHeadersMask, GeneralHeadersArr, EntityHeadersArr etc: bitmaps used<br>
> (only) by HttpHeaderStats to assemble different stats sets<br>
<br>
</span>I seem to recall a hop-by-hop headers lists as well and the<br>
request/reply lists being used by message filtering logic to strip away<br>
invalid and hop-by-hop headers ?<br></blockquote><div><br></div><div>That was a partial list. But yes, we need a quick way to look up some spec characteristics of any given header, identified by ID. The current approach is very effective at packing the information, but it is very low-level and probably not very effective in practice.<br></div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<span class="">> I would like to turn these into:<br>
> headerTable: LookupTable<>::Record mapping header name to header ID, used<br>
> to generate ...<br>
> headerLookupTable: a fast lookup table of header names to ids<br>
> headerStatsTable: a std::vector<HttpHeaderFieldStat> indexed by header ID<br>
> to collect the statistics currently in Headers[id].stats.<br>
> headerDescription: a std::vector keyed by header ID containing header type<br>
> (currently in Headers[id].type), possibly header name (if useful), a<br>
> bitfield noting if HTTP_HDR{LIST,GENERAL,REQUEST,REPLY,...}.<br>
><br>
> There are some possible optimizations, but at a minimum this should help<br>
> keep information more organized while introducing no performance<br>
> regressions.<br>
> What do you think?<br>
><br>
<br>
</span>In my brief look the other day I thought a small struct {ID, type, group<br>
bitmask} could be used as the EnumType on LookupTable.<br></blockquote><div><br></div><div>I don't like the idea, as it makes the header name the primary key. We want the primary key to be the header ID, and the header name just an index. There really are two different uses: 1. given a header name, get the ID; 2. given an ID, get the header's characteristics.<br></div><div><br></div><div>We can do it by making LookupTable parametric on the Record type:<br><br></div><div>template <typename EnumType><br></div><div>struct LookupTableRecord<br>{<br></div><div> const char *name;<br></div><div> EnumType id;<br></div><div>};<br><br></div><div>template <typename EnumType, typename RecordType = LookupTableRecord<EnumType> ><br></div><div>class LookupTable<br>{<br> //...<br>};<br><br></div><div>This way someone can define a custom Record type, and as long as that record type provides the same members as LookupTableRecord (name and id), LookupTable won't care.<br><br></div><div>This also has the advantage of not needing to fill in different data structures from the static initializer table. As long as the id of each row in the initializer table is equal to the index of that row (which is already a requirement), the initializer table will carry all the read-only information we need.<br></div><div><br></div><div>Plan #2: change the LookupTable API to allow filling it in from the outside, and initialize it from the module init code (just like now). This way we can have a single big table with all the relevant information and fill in the different data structures during the initialization loop.<br></div><div> <br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
LookupTable really only needs a POD type that can be copied cheaply for<br>
its stored type. Enum / int is the simplest of those but not mandatory.<br>
<br>
If that works then you can collapse headerLookupTable and<br>
headerDescription into one list that efficiently looks up all data for<br>
the header. Getting rid of the multiple sub-list type arrays.<br></blockquote><div><br><br></div><div>The proposed change to LookupTable seems more elegant to me, and it has been tested and works; what do you think?<br></div><div> <br></div></div>-- <br><div class="gmail_signature"> Kinkie</div>
</div></div>