[squid-dev] Incremental parsing of chunked quoted extensions

Eduard Bagdasaryan eduard.bagdasaryan at measurement-factory.com
Tue Oct 16 11:49:23 UTC 2018


Since there have been no objections so far, I am going to start
implementing the incremental parsing approach outlined below.


Eduard.

On 05.10.2018 18:34, Alex Rousskov wrote:
> I doubt that writing code to explicitly buffer the whole line before
> parsing extensions is necessary. It is definitely not desirable because
> it slows things down in _common_ cases. I would adjust the code to
> follow this sketch instead:
>
>      while (!atEof()) {
>          if (skipLineTerminator())
>              return true; // reached the end of extensions (if any)
>          if (parseChunkExtension())
>              continue; // got one extension; there may be more
>          if (tok.skipAll(NotLF) && skip(LF))
>              throw "cannot parse chunk extension"; // <garbage> CR*LF
>          return false; // need more data to finish parsing extension
>      }
>      return false; // need data to start parsing extension or CRLF
>
> where skipLineTerminator() is, essentially,
>
>      return tok.skip(CRLF) || (relaxed_parser && tok.skipOne(LF));
>
>
> As you can see, the above sketch is optimized for the common cases and
> blindly seeks to the end of the line only in rare cases of partially
> received or malformed extensions.
>
> Please note that whether the above sketch is used incrementally is an
> _orthogonal_ question:
>
> * In incremental parsing, a successful parseChunkExtension() would
> update object state to remember the parsed extension and advance the
> parsing position to never come back to the same spot. The loop caller
> would advance the parsing position/state if (and only if) the loop
> returns true.
>
> * In non-incremental parsing, a successful parseChunkExtension() would
> update its parameter (not shown) to remember the parsed extension. The
> loop caller would update object state ("committing" that not shown
> parameter info) and advance the parsing position if (and only if) the
> loop returns true. [ Alternatively, a successful parseChunkExtension()
> would update object state directly, sometimes overwriting the previous
> (identical) value many times. The current useOriginBody code uses that
> simpler scheme. However, even with this simpler alternative, the
> advancement is controlled by the loop caller. ]
>
> Should we fix and continue to use incremental parsing here or abandon
> it? The answer probably depends on overall code simplicity. Since we
> have to use incremental parsing for parsing chunked encoding as a whole
> (as discussed in the beginning of this email), we may continue to use it
> for parsing extensions if (and only if) doing so does not complicate things.
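
For illustration only, here is a minimal self-contained sketch of how the
quoted loop might look in its incremental form. It is not Squid code: it
uses std::string_view instead of Parser::Tokenizer, and every name in it
(Extension, IncrementalExtensionsParser, parseOneExtension(), skipPrefix(),
completeTokenLength()) is hypothetical. Quoted values are stored without
decoding escapes. Each successfully parsed extension is committed
immediately and the parsing position never goes back over it; only a
partially received extension is re-examined when more data arrives.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <string_view>
#include <vector>

struct Extension { std::string name, value; };

// RFC 7230 "tchar" test (hypothetical helper, not a Squid API)
static bool isTokenChar(const char c) {
    return std::isalnum(static_cast<unsigned char>(c)) ||
        std::string_view("!#$%&'*+-.^_`|~").find(c) != std::string_view::npos;
}

// consumes s if the input starts with it
static bool skipPrefix(std::string_view &in, const std::string_view s) {
    if (in.substr(0, s.size()) != s)
        return false;
    in.remove_prefix(s.size());
    return true;
}

// length of the leading token; 0 if it is empty or may continue in unread data
static std::size_t completeTokenLength(const std::string_view in) {
    std::size_t n = 0;
    while (n < in.size() && isTokenChar(in[n]))
        ++n;
    return n < in.size() ? n : 0;
}

// parses one ";name[=value]"; consumes input only on success
static bool parseOneExtension(std::string_view &buf, std::vector<Extension> &out) {
    std::string_view in = buf; // work on a copy; commit at the end
    if (!skipPrefix(in, ";"))
        return false;
    const std::size_t nameLen = completeTokenLength(in);
    if (!nameLen)
        return false;
    Extension ext{std::string(in.substr(0, nameLen)), ""};
    in.remove_prefix(nameLen);
    assert(!in.empty()); // completeTokenLength() saw a delimiter
    if (skipPrefix(in, "=")) {
        if (skipPrefix(in, "\"")) {
            std::size_t i = 0;
            for (; i < in.size() && in[i] != '"'; ++i) {
                if (in[i] == '\\')
                    ++i; // skip the escaped character
            }
            if (i >= in.size())
                return false; // closing quote not received yet
            ext.value = std::string(in.substr(0, i)); // escapes left undecoded
            in.remove_prefix(i + 1);
        } else {
            const std::size_t valueLen = completeTokenLength(in);
            if (!valueLen)
                return false;
            ext.value = std::string(in.substr(0, valueLen));
            in.remove_prefix(valueLen);
        }
    }
    out.push_back(ext);
    buf = in; // advance past the parsed extension
    return true;
}

struct IncrementalExtensionsParser {
    std::string unparsed;              // received but not yet parsed bytes
    std::vector<Extension> extensions; // committed as soon as each one is parsed
    bool relaxed = false;              // accept bare LF as a line terminator

    // returns true after the line terminator; false if more data is needed
    bool parse(const std::string_view moreData) {
        unparsed.append(moreData);
        std::string_view in = unparsed;
        const bool done = parseLoop(in); // throws on a malformed line
        unparsed.erase(0, unparsed.size() - in.size());
        return done;
    }

private:
    bool parseLoop(std::string_view &in) {
        while (!in.empty()) {
            if (skipPrefix(in, "\r\n") || (relaxed && skipPrefix(in, "\n")))
                return true; // reached the end of extensions (if any)
            if (parseOneExtension(in, extensions))
                continue; // got one extension; there may be more
            if (in.find('\n') != std::string_view::npos)
                throw std::runtime_error("cannot parse chunk extension");
            return false; // need more data to finish parsing extension
        }
        return false; // need data to start parsing extension or CRLF
    }
};
```

In this sketch, a caller would invoke parse() each time more bytes arrive
and stop once it returns true. A non-incremental variant would instead
keep the whole line buffered and restart from the first ";" on every call.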


