[squid-dev] [PATCH] HTTP Response Parser upgrade

Alex Rousskov rousskov at measurement-factory.com
Fri Jan 23 17:37:35 UTC 2015


On 01/23/2015 01:00 AM, Amos Jeffries wrote:
> On 23/01/2015 6:47 a.m., Alex Rousskov wrote:
>> On 01/20/2015 06:58 AM, Amos Jeffries wrote:
>>> +    /// determine how much space the buffer needs to reserve
>>> +    size_t needBufferSpace(const SBuf &readBuf, const size_t minSpace) const;

>> need+something implies a notification or state-changing method,
>> which is not what this method is about. Based on the caller, I
>> suggest:

>> s/needBufferSpace/calcReadSize/ or s/needBufferSpace/readSize/


> While the current caller is only in http.cc, this is intended for use
> by all Client child classes in their own ways.

The suggested names were protocol-neutral AFAICT.


> I'm going with calcBufferSpaceToReserve() for now.

I doubt you can have a generic reservation algorithm without knowing
what the reservation is for (i.e. "reading" of some sort), but
calcBufferSpaceToReserve() is indeed better than the original
needBufferSpace() on other levels.
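For illustration, here is a minimal sketch of the kind of read-oriented reservation calculation being discussed. Only the name calcBufferSpaceToReserve() comes from the thread; the parameters and body below are assumptions, not the actual patch code:

```cpp
#include <algorithm>
#include <cstddef>

// Hypothetical sketch only: grow the read buffer enough to guarantee
// minSpace bytes of free room beyond what is already buffered, capped
// at a configured maximum. Parameter names are illustrative guesses.
static size_t calcBufferSpaceToReserve(size_t bufferedBytes, size_t minSpace,
                                       size_t maxCapacity)
{
    // ensure at least minSpace bytes of headroom past the buffered data
    const size_t wanted = bufferedBytes + minSpace;
    // never exceed the configured maximum buffer capacity
    return std::min(wanted, maxCapacity);
}
```

Such a helper stays const-friendly (pure calculation, no state change), which is the property the rename discussion is after.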


>>> +    /// parse scan to find the mime headers block for current message
>>> +    bool findMimeBlock(const char *which, size_t limit);
> 
>> Find is usually a const method. This method does a lot more than
>> just finding MIME headers. To fix that and to supply information
>> that the method name does not already reveal, I would do:
> 
>> /// returns whether the header was successfully parsed
>> bool parseMimeHeader(const char *headerKind, const size_t sizeLimit);
> 
> 
> I've adjusted the documentation to say what state changes happen on true.
> 
> 
> I do object to the use of "header" other than as a modifier to the
> named mime block type like I had it.

Please feel free to insert missing MIME augmentation as needed, of course.


> Would you accept "identify" instead of "find" ?
>  or maybe "grab" is better?

Out of those two, "grab" or "isolate" is better IMO. "Parse" would be
even better because parsing is exactly what this method is doing
(despite all the disclaimers to the contrary :-).


>> but there is more that needs to be done in the above code:
> 
>>> +    // Squid could handle these headers, but admin does not want to
>>> +    if (messageHeaderSize() >= limit) {
>>> +        debugs(33, 5, "Too large " << which);
>>> +        parseStatusCode = Http::scHeaderTooLarge;
>>> +        return false;
>>> +    }
> 
>> This should be rewritten to go before we initialize
>> mimeHeaderBlock_ IMO. Otherwise, the damage of keeping the
>> admin-prohibited header is already done.
> 
> 
> Done.
> 
> NP: This will prevent the ERR_TOO_BIG and cache.log level 11,2 output
> from displaying the mime block as part of the message when it's
> present but outside the desired size range.

I am sure there is a way to have ERR_TOO_BIG and cache.log level 11,2
messages on huge messages if we want that. The parser polishing you are
working on should not prevent that.
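To make the ordering point concrete, a hedged sketch of the agreed fix follows. The field name mimeHeaderBlock_ and the check shape come from the thread; the surrounding struct and everything else is hypothetical, not the actual patch:

```cpp
#include <cstddef>
#include <string>

// Illustrative sketch only: reject an over-limit mime header block
// *before* initializing mimeHeaderBlock_, so admin-prohibited bytes
// are never retained. The MiniParser type is invented for this example.
struct MiniParser {
    bool tooLarge = false;          // stand-in for Http::scHeaderTooLarge
    std::string mimeHeaderBlock_;   // field name from the thread

    bool grabMimeBlock(const std::string &input, size_t headerSize,
                       size_t limit) {
        if (headerSize >= limit) {  // check the limit first ...
            tooLarge = true;        // ... record the parse failure
            return false;           // ... and keep none of the block
        }
        mimeHeaderBlock_ = input.substr(0, headerSize);
        return true;
    }
};
```

With the check placed first, a rejected message leaves mimeHeaderBlock_ empty, which is exactly why the quoted ERR_TOO_BIG output can no longer show the over-limit block.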


>>> * This has now had several hours of regular web browsing by
>>> myself (YouTube, Facebook, some news sites, and the Squid sites)
>>> with nothing noticeably going wrong.
> 
>> Any ICAP tests? The HTTP-ICAP interaction has been a source of many
>> bugs before and your patch changes the relevant code. If you have
>> not tested it, please do (or ask others to do it).

> No, the ICAP-related code is not touched. It still uses the HttpMsg
> parser functions and its own original buffering. These changes only
> touch the server socket (virgin reply) buffer I/O and parse function.

The patch contains/changes adaptation-related code. I am not talking
about parsing ICAP messages, but about coordination between HTTP code
and ICAP/eCAP code. Anything inside USE_ADAPTATION.


>>> * Polygraph shows a 7 RPS performance loss (0.36%), with trunk
>>> having +/-2 RPS (0.1%) normal fluctuation. I believe this is from
>>> the transitional regression to data-copy into MemBuf for chunked
>>> coding.

>> Did Polygraph actually use chunked responses in these tests?

> Unsure, but I think so.

I suspect it did not. If I read the code correctly, Polygraph supports
dechunking of received responses but does not chunk the messages it
sends (because Polygraph knows their size). We will add that support!


Cheers,

Alex.

