[squid-dev] [PATCH] HTTP Response Parser upgrade

Thu Jan 22 17:47:47 UTC 2015

On 01/20/2015 06:58 AM, Amos Jeffries wrote:

> Updated patch attached resolving the issues found by coadvisor in the
> original.

> +size_t
> +Client::needBufferSpace(const SBuf &readBuf, const size_t minSpace) const
> +{
> +    size_t space = readBuf.spaceSize(); // available space w/o heroic measures

If the above line is the only readBuf usage in this method, pass
readBuf.spaceSize() as a parameter instead of the whole buffer.

> +    if (space < minSpace) {
> +        const size_t maxSpace = SBuf::maxSize; // absolute best
> +        space = min(minSpace, maxSpace); // do not promise more than asked
> +    }

Here and elsewhere in Client::needBufferSpace(), the returned space size
may be lower than minSpace. s/minSpace/wantedSpace/ or something like that.

> +         * If the pipe is totally full, don't register the read handler.
> +         * The BodyPipe will call our noteMoreBodySpaceAvailable() method
> +         * when it has free space again.

The above comment does not match the code. The code does not deal with
read handler registration, only size calculations.

> +        size_t adaptation_space =
> +            virginBodyDestination->buf().potentialSpaceSize();

Make const if possible.

> +    /// determine how much space the buffer needs to reserve
> +    size_t needBufferSpace(const SBuf &readBuf, const size_t minSpace) const;

need+something implies a notification or state changing method, which is
not what this method is about. Based on the caller, I suggest to

s/needBufferSpace/calcReadSize/ or
s/needBufferSpace/readSize/

>  void
>  HttpStateData::readReply(const CommIoCbParams &io)
>  {
...
> +    assert(!flags.do_next_read); // XXX: should have been set false by mayReadVirginBody()
...
> +    assert(Comm::IsConnOpen(serverConnection));
> +    assert(io.conn->fd == serverConnection->fd);

These are all new asserts in an async job callback. Should we use
Must()s instead to avoid crashing the whole Squid process when dealing
with what seems to be a transaction-specific problem?

> +    // how much we want to read
> +    const int read_size = needBufferSpace(inBuf, (limitBuffer - inBuf.length()));

Unsigned->signed conversion. Can you use size_t for read_size?

> +    if (read_size < 1) {

... and then convert this to !read_size or (read_size == 0) for clarity?

> +    /// amount of message payload/body received so far.
> +    int64_t payloadSeen;
> +    /// positive when we read more than we wanted
> +    int64_t payloadTruncated;

Have you considered making these unsigned?

> +    /// parse scan to find the mime headers block for current message
> +    bool findMimeBlock(const char *which, size_t limit);

Find is a usually a const method. This method does a lot more than just
finding MIME headers. To fix that and to supply information that the
method name does not already reveal, I would do:

/// returns whether the header was successfully parsed
bool parseMimeHeader(const char *headerKind, const size_t sizeLimit);

Note that the limit parameter is const.

> +        int64_t mimeHeaderBytes = 0;
> +        // XXX: c_str() reallocates. performance regression.
> +        if ((mimeHeaderBytes = headersEnd(buf_.c_str(), buf_.length())) == 0) {
> +            if (buf_.length()+firstLineSize() >= limit) {
> +                debugs(33, 5, "Too large " << which);
> +                parseStatusCode = Http::scHeaderTooLarge;
> +                parsingStage_ = HTTP_PARSE_DONE;
> +            } else
> +                debugs(33, 5, "Incomplete " << which << ", waiting for end of headers");
> +            return false;
> +        }
> +        mimeHeaderBlock_ = buf_.consume(mimeHeaderBytes);
> +        debugs(74, 5, "mime header (0-" << mimeHeaderBytes << ") {" << mimeHeaderBlock_ << "}");

The above is better written as

// XXX: c_str() reallocates. performance regression.
if (const int64_t mimeHeaderBytes = ...) {
   mimeHeaderBlock_ = buf_.consume(mimeHeaderBytes);
   debugs(...);
} else {
   ...
   return false;
}

but there is more that needs to be done in the above code:

> +    // Squid could handle these headers, but admin does not want to
> +    if (messageHeaderSize() >= limit) {
> +        debugs(33, 5, "Too large " << which);
> +        parseStatusCode = Http::scHeaderTooLarge;
> +        return false;
> +    }

This should be rewritten to go before we initialize mimeHeaderBlock_
IMO. Otherwise, the damage of keeping the admin-prohibited header is
already done.

> +const int
> +Http::One::ResponseParser::parseResponseStatusAndReason()
> +{
> +    if (buf_.isEmpty())
> +        return 0;
> +

Remove this check (and fix the tokenizer if it cannot handle empty buffers).

> +        // NP: search space is >3 to get terminator character)
> +        if(!tok.prefix(status, CharacterSet::DIGIT, 4))
> +            return -1; // invalid status
> +        // NOTE: multiple SP or non-SP bytes between version and status code are invalid.
> +        if (tok.atEnd())
> +            return 0; // need more to be sure we have it all
> +        if(!tok.skip(' '))
> +            return -1; // invalid status, a single SP terminator required
> +
> +        debugs(74, 6, "found string status-code=" << status);
> +
> +        // get the actual numeric value of the 0-3 digits we found
> +        ::Parser::Tokenizer t2(status);
> +        int64_t statusValue;
> +        if (!t2.int64(statusValue))
> +            return -1; // ouch. digits not forming a valid number?
> +        debugs(74, 6, "found int64 status-code=" << statusValue);
> +        if (statusValue < 0 || statusValue > 999)
> +            return -1; // ouch. digits not within valid status code range.

This sequence does not make sense to me: Incomplete status should be
treated the same as early atEnd(), but the former is treated as an
invalid status. 4-digit status should be considered as invalid but it is
not. Numeric value of 0 digits?! Etc., etc. It feels like you are still
parsing using low-level C function calls, not a Tokenizer that we worked
so hard to build!

The correct sequence should be similar to these lines:

    if (tok.prefix(status, CharacterSet::DIGIT, 3) && tok.skip(' '))
        debugs(74, 6, "found string status-code=" << status);
    else
    if (tok.atEnd())
        return 0; // good so far but need more to finish
    else
        return -1; // invalid status

    // get the actual numeric value of the digits we found
    ::Parser::Tokenizer t2(status);
    int64_t statusValue;
    if (!t2.int64(statusValue))
        return -1; // ouch. digits not forming a valid number?
    if (status < 0 || status > 999)
        return -1; // ouch. digits not within valid status code range.

However, if Tokenizer integer parsing interface were written correctly,
the above would have actually looked close to this:

    uint64_t status;
    if (tok.uint64(status, base10) && tok.skip(' '))
        debugs(74, 6, "found status-code=" << status);
    else
    if (tok.atEnd())
        return 0; // good so far but need more to finish
    else
        return -1; // invalid status

    if (status > 999)
        return -1; // status exceeds valid status code range

or even

    uint64_t status;
    if (tok.uint64(status, 3, base10) && tok.skip(' '))
        debugs(74, 6, "found status-code=" << status);
    else
    if (tok.atEnd())
        return 0; // good so far but need more to finish
    else
        return -1; // invalid status

The last form (with a size limit for uint64) allows the Tokenizer caller
to terminate parsing earlier, without waiting for the end of all-digit
input, but I am not sure that is a common need.

I urge you to fix Tokenizer integer parsing and rewrite the parsing code.

That is all I had time for today. I would have to come back to this
later. Sorry.

> * This has now had several hours of regular web browsing by myself
> (YouTube, Facebook, some news sites, and the Squid sites) with nothing
> noticably going wrong.

Any ICAP tests? The HTTP-ICAP interaction has been a source of many bugs
before and your patch changes the relevant code. If you have not tested
it, please do (or ask others to do it).

> * Polygraph shows a 7 RPS performance loss (0.36%), with trunk having
> +/-2 RPS (0.1%) normal fluctuation. I believe this is from the
> transitional regression to data-copy into MemBuf for chunked coding.

Did Polygraph actually use chunked responses in these tests?

Thank you,

Alex.