[squid-dev] [PATCH] Simpler and more robust request line parser
Amos Jeffries
squid3 at treenet.co.nz
Wed Nov 4 10:45:14 UTC 2015
Updated copy of your original patch attached. I have not made any of the
requested edits. Just ported so it would apply on current trunk.
Amos
On 4/11/2015 8:10 a.m., Amos Jeffries wrote:
> On 4/11/2015 5:27 a.m., Alex Rousskov wrote:
>> On 11/03/2015 02:12 AM, Amos Jeffries wrote:
>>>> On 24/07/2015 11:13 a.m., Alex Rousskov wrote:
>>>>> "I am happy to restrict URIs to what older Squids accepted or more.
>>>>> Recent parser changes broke Squid. There is no need to go beyond what we
>>>>> were doing before those changes. I will gladly make a simple patch
>>>>> change to restrict URIs to either that old subset or to the current
>>>>> trunk subset, whichever is larger. Exactly what subset do you want me to
>>>>> use?"
>>
>>
>>>> Current trunk already accepts all ASCII printable characters, and a
>>>> bunch of whitespace / control codes.
>>>>
>>>> "What I would like" is:
>>>>
>>>> // RFC 3986 URI component character set(s)
>>>> // RFC 7230 request-line delimiter
>>>> // RFC 7230 HTTP-version character set
>>>> auto permit = uriValidCharacters() + CharacterSet::SP;
>>>>
>>>> if (relaxed_header_parser) {
>>>>
>>>> // RFC7230 section 3.5 alternative control-code whitespace
>>>> permit += CharacterSet::HTAB
>>>> + CharacterSet("VT,FF","\x0B\x0C")
>>>> + CharacterSet::CR;
>>>>
>>>> #if USE_HTTP_VIOLATIONS
>>>> // RFC 2396 unwise character set
>>>> // the "transport agents are known to sometimes modify or use as delimiter" set
>>>> // which must never be transmitted in un-escaped form
>>>> permit.add('\0');
>>>> permit += CharacterSet("RFC2396-unwise","\"\\|^<>`{}");
>>>> #endif
>>>> }
>>>>
>>>>
>>>> Which leaves some 27 CTL characters (0x01-0x1F) still forbidden. Things
>>>> like backspace, bell, escape, end-of-transmission, and delete (0x7F). I hope
>>>> you are not going to argue for re-enabling those inside this parser.
>>>>
>>>> Of course the bare UTF-8 extended unicode characters are omitted too.
>>>> Those are prohibited anywhere in HTTP, and are a sure sign that what is
>>>> being processed is not HTTP. So definitely exit this parser.
>>>>
>>>> IFF you have no objections to the above charsets then the only thing
>>>> this patch is waiting on in the technical sense is polygraph test results.
>>
>>
>>> Ping. Do we have agreement on those character sets?
>>
>> Maybe. I asked for the exact "relaxed parser" "HTTP violating" set (the
>> only important case in the real world). You gave me code constructing
>> various sets under various conditions. Until I start porting the
>> proposed patch to trunk, I would not know whether your code is
>> compatible with what I think we should do, and what sets it constructs.
>>
>
> Strict parsing:
> 0x20-0x21, 0x23-0x3B, 0x3D, 0x3F-0x5B, 0x5D, 0x5F, 0x61-0x7A, 0x7E
>
> Relaxed parsing:
> 0x09-0x0d, 0x20-0x21, 0x23-0x3B, 0x3D, 0x3F-0x5B, 0x5D, 0x5F,
> 0x61-0x7A, 0x7E
>
> Relaxed parsing with USE_HTTP_VIOLATIONS enabled:
> 0x09-0x0d, 0x20-0x7E
>
>
> NOTE 1: violations are enabled by default. To get non-violating
> behaviour, people have to explicitly configure strict parsing, or
> build the proxy without violations (reducing their relaxed parser to
> just the compliant tolerances).
>
> NOTE 2: HTTP is ASCII-only unless specifically documented. The
> request-line is not one of those special cases. The 0x80-0xFF octet
> range is not part of ASCII.
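The three permitted sets above can be sketched as simple octet-membership tables. This is an illustration only, using a minimal stand-in type rather than Squid's real CharacterSet class (OctetSet, StrictSet, RelaxedSet, and ViolationsSet are hypothetical names):

```cpp
#include <array>

// Minimal stand-in for a character set: a 256-entry membership table.
struct OctetSet {
    std::array<bool, 256> member{};
    void addRange(unsigned char lo, unsigned char hi) {
        for (int c = lo; c <= hi; ++c)
            member[c] = true;
    }
    bool has(unsigned char c) const { return member[c]; }
};

// Strict parsing: 0x20-0x21, 0x23-0x3B, 0x3D, 0x3F-0x5B, 0x5D, 0x5F, 0x61-0x7A, 0x7E
inline OctetSet StrictSet() {
    OctetSet s;
    s.addRange(0x20, 0x21); s.addRange(0x23, 0x3B);
    s.addRange(0x3D, 0x3D); s.addRange(0x3F, 0x5B);
    s.addRange(0x5D, 0x5D); s.addRange(0x5F, 0x5F);
    s.addRange(0x61, 0x7A); s.addRange(0x7E, 0x7E);
    return s;
}

// Relaxed parsing additionally tolerates 0x09-0x0D (HTAB, LF, VT, FF, CR).
inline OctetSet RelaxedSet() {
    OctetSet s = StrictSet();
    s.addRange(0x09, 0x0D);
    return s;
}

// Relaxed parsing with USE_HTTP_VIOLATIONS: all printable ASCII plus that whitespace.
inline OctetSet ViolationsSet() {
    OctetSet s;
    s.addRange(0x09, 0x0D);
    s.addRange(0x20, 0x7E);
    return s;
}
```

Note how each mode is a strict superset of the previous one, and even the violations mode stops at 0x7E, excluding DEL and the 0x80-0xFF range.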
>
>
> I cannot do a proper comparison to the old-old parser because there was
> not just one of them; there were three, plus bits of partial parsing in
> a few places.
> They mostly appeared to accept 0x00-0x09, 0x0B-0xFF, with various bits
> of the buffer having specific exact-string matches applied or passed
> directly to syscalls depending on what code path was being followed.
>
> The biggest problem is the main library call, isspace(), which is
> locale dependent. So some really weird attacks are possible by sending
> an HTTP message with locale-specific "whitespace" delimiters around the
> URL field, then the first actual SP (0x20) character on another line
> somewhere after the first \n character. Sometimes that second line gets
> parsed and used as the URL, not the one on the first line of the message.
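The locale hazard above is why the new parser classifies delimiter octets explicitly instead of via ctype. A hedged sketch of the contrast (isHttpDelimiter and localeSaysSpace are illustrative names, not Squid code; the delimiter set matches the relaxed SP/HTAB/VT/FF/CR set discussed in this thread):

```cpp
#include <cctype>

// Locale-independent delimiter check: exactly the octets the parser intends,
// regardless of what the process locale classifies as "space".
// Set: SP (0x20), HTAB (0x09), VT (0x0B), FF (0x0C), CR (0x0D).
inline bool isHttpDelimiter(unsigned char c) {
    return c == 0x20 || c == 0x09 || c == 0x0B || c == 0x0C || c == 0x0D;
}

// By contrast, isspace() consults the current locale. A parser built on it
// may split request-line fields on octets the administrator never intended
// to treat as whitespace.
inline bool localeSaysSpace(unsigned char c) {
    return std::isspace(c) != 0;
}
```

Notably LF is whitespace to isspace() but a line terminator, never a field delimiter, to the fixed-set check.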
>
>
> I can see where your temptation to accept 0x00-0xFF comes from. But I
> really think we need to do better on the validation side now,
> since this parser is replacing whole code paths of both parsing and
> validation.
>
> Rejecting the 0x01-0x08, 0x0E-0x1F, 0x7F set even in violations mode is
> a good start on that. Those characters are non-printable and can affect
> display in logs, terminals, and when debugging.
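That always-forbidden set can be expressed as a single predicate; a sketch under the ranges named above (isForbiddenCtl is an illustrative name, not a Squid function):

```cpp
// Octets rejected even in violations mode: 0x01-0x08, 0x0E-0x1F, and 0x7F (DEL).
// These are non-printable and can corrupt log, terminal, and debug output.
inline bool isForbiddenCtl(unsigned char c) {
    return (c >= 0x01 && c <= 0x08) ||
           (c >= 0x0E && c <= 0x1F) ||
           c == 0x7F;
}
```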
>
>
>
>> You also expressed hope that we will not re-enable support for CTL
>> characters. I do not know whether the old parser allowed [some of] them,
>> which is one of the primary acceptance conditions for me (by default, we
>> want to accept everything old Squid accepted and possibly more; any new
>> restrictions should be discussed).
>>
>> You also talked about Unicode which sounds completely irrelevant to me
>> because we are (or should be) dealing with octets and their 0-255
>> integer values, not any smart encoding on top of that.
>
> Unicode: octets 0x80-0xFF.
>
>>
>> In summary, I could not (and still cannot) respond with a quick
>> "agreed/disagreed". You did not give me enough information in a
>> digestible enough form to do that. You are certainly _not_ obligated to
>> provide the info I ask in a form I can quickly grok! I am just
>> explaining why I could not respond promptly.
>>
>>
>> Moreover, there is still some merging work required because you have
>> changed the parser between my review of your parser and my suggested
>> replacement patch. I could no longer simply adjust the sets and apply my
>> patch back then. That merging work is still required. If you have the
>> time to do that, please do it.
>
> The current parser accepts the same sets listed above that I am
> requesting from yours, in those same modes. It seems to be working okay
> so far as the character sets go.
>
> Trying to update the patch now.
>
> Thank you.
>
> Amos
> _______________________________________________
> squid-dev mailing list
> squid-dev at lists.squid-cache.org
> http://lists.squid-cache.org/listinfo/squid-dev
>
-------------- next part --------------
=== modified file 'src/http/one/RequestParser.cc'
--- src/http/one/RequestParser.cc 2015-07-16 05:09:30 +0000
+++ src/http/one/RequestParser.cc 2015-11-03 21:57:00 +0000
@@ -1,39 +1,44 @@
/*
* Copyright (C) 1996-2015 The Squid Software Foundation and contributors
*
* Squid software is distributed under GPLv2+ license and includes
* contributions from numerous individuals and organizations.
* Please see the COPYING and CONTRIBUTORS files for details.
*/
#include "squid.h"
#include "Debug.h"
#include "http/one/RequestParser.h"
#include "http/one/Tokenizer.h"
#include "http/ProtocolVersion.h"
#include "profiler/Profiler.h"
#include "SquidConfig.h"
+// the right debugs() level for parsing errors
+inline static int
+ErrorLevel() {
+ return Config.onoff.relaxed_header_parser < 0 ? DBG_IMPORTANT : 5;
+}
+
Http::One::RequestParser::RequestParser() :
- Parser(),
- firstLineGarbage_(0)
+ Parser()
{}
Http1::Parser::size_type
Http::One::RequestParser::firstLineSize() const
{
// RFC 7230 section 2.6
/* method SP request-target SP "HTTP/" DIGIT "." DIGIT CRLF */
return method_.image().length() + uri_.length() + 12;
}
/**
* Attempt to parse the first line of a new request message.
*
* Governed by RFC 7230 section 3.5
* "
* In the interest of robustness, a server that is expecting to receive
* and parse a request-line SHOULD ignore at least one empty line (CRLF)
* received prior to the request-line.
* "
*
@@ -45,404 +50,293 @@
{
if (Config.onoff.relaxed_header_parser) {
if (Config.onoff.relaxed_header_parser < 0 && (buf_[0] == '\r' || buf_[0] == '\n'))
debugs(74, DBG_IMPORTANT, "WARNING: Invalid HTTP Request: " <<
"CRLF bytes received ahead of request-line. " <<
"Ignored due to relaxed_header_parser.");
// Be tolerant of prefix empty lines
// ie any series of either \n or \r\n with no other characters and no repeated \r
while (!buf_.isEmpty() && (buf_[0] == '\n' || (buf_[0] == '\r' && buf_[1] == '\n'))) {
buf_.consume(1);
}
}
}
/**
* Attempt to parse the method field out of an HTTP message request-line.
*
* Governed by:
* RFC 1945 section 5.1
* RFC 7230 section 2.6, 3.1 and 3.5
- *
- * Parsing state is stored between calls. The current implementation uses
- * checkpoints after each successful request-line field.
- * The return value tells you whether the parsing is completed or not.
- *
- * \retval -1 an error occurred. parseStatusCode indicates HTTP status result.
- * \retval 1 successful parse. method_ is filled and buffer consumed including first delimiter.
- * \retval 0 more data is needed to complete the parse
*/
-int
-Http::One::RequestParser::parseMethodField(Http1::Tokenizer &tok, const CharacterSet &WspDelim)
+bool
+Http::One::RequestParser::parseMethodField(Http1::Tokenizer &tok)
{
- // scan for up to 16 valid method characters.
- static const size_t maxMethodLength = 16; // TODO: make this configurable?
-
// method field is a sequence of TCHAR.
SBuf methodFound;
- if (tok.prefix(methodFound, CharacterSet::TCHAR, maxMethodLength) && tok.skipOne(WspDelim)) {
-
- method_ = HttpRequestMethod(methodFound);
- buf_ = tok.remaining(); // incremental parse checkpoint
- return 1;
-
- } else if (tok.atEnd()) {
- debugs(74, 5, "Parser needs more data to find method");
- return 0;
-
- } // else error(s)
-
- // non-delimiter found after accepted method bytes means ...
- if (methodFound.length() == maxMethodLength) {
- // method longer than acceptible.
- // RFC 7230 section 3.1.1 mandatory (SHOULD) 501 response
- parseStatusCode = Http::scNotImplemented;
- debugs(33, 5, "invalid request-line. method too long");
- } else {
- // invalid character in the URL
- // RFC 7230 section 3.1.1 required (SHOULD) 400 response
+ if (!tok.prefix(methodFound, CharacterSet::TCHAR)) {
+ debugs(33, ErrorLevel(), "invalid request-line: missing or malformed method");
parseStatusCode = Http::scBadRequest;
- debugs(33, 5, "invalid request-line. missing method delimiter");
+ return false;
}
- return -1;
+ method_ = HttpRequestMethod(methodFound);
+ return true;
}
static CharacterSet
-uriValidCharacters()
+UriValidCharacters()
{
CharacterSet UriChars("URI-Chars","");
/* RFC 3986 section 2:
* "
* A URI is composed from a limited set of characters consisting of
* digits, letters, and a few graphic symbols.
* "
*/
// RFC 3986 section 2.1 - percent encoding "%" HEXDIG
UriChars.add('%');
UriChars += CharacterSet::HEXDIG;
// RFC 3986 section 2.2 - reserved characters
UriChars += CharacterSet("gen-delims", ":/?#[]@");
UriChars += CharacterSet("sub-delims", "!$&'()*+,;=");
// RFC 3986 section 2.3 - unreserved characters
UriChars += CharacterSet::ALPHA;
UriChars += CharacterSet::DIGIT;
UriChars += CharacterSet("unreserved", "-._~");
return UriChars;
}
-int
-Http::One::RequestParser::parseUriField(Http1::Tokenizer &tok)
+/// characters used to separate HTTP fields for tolerant parsers
+static const CharacterSet &
+RelaxedDelimiterCharacters()
+{
+ // RFC 7230 section 3.5
+ // tolerant parser MAY accept any of SP, HTAB, VT (%x0B), FF (%x0C), or bare CR
+ // as whitespace between request-line fields
+ static const CharacterSet RelaxedDels =
+ CharacterSet::SP +
+ CharacterSet::HTAB +
+ CharacterSet("VT,FF","\x0B\x0C") +
+ CharacterSet::CR;
+
+ return RelaxedDels;
+}
+
+/// characters used to separate HTTP fields
+const CharacterSet &
+Http::One::RequestParser::DelimiterCharacters()
{
- // URI field is a sequence of ... what? segments all have different valid charset
- // go with non-whitespace non-binary characters for now
- static CharacterSet UriChars = uriValidCharacters();
+ return Config.onoff.relaxed_header_parser ?
+ RelaxedDelimiterCharacters() : CharacterSet::SP;
+}
+
+const CharacterSet &
+Http::One::RequestParser::UriCharacters()
+{
+ static const CharacterSet StrictChars = UriValidCharacters();
+ static const CharacterSet StrictCharsPlusWhite = StrictChars + RelaxedDelimiterCharacters();
+#if USE_HTTP_VIOLATIONS
+ static const CharacterSet RelaxedChars = RelaxedDelimiterCharacters().complement("RelaxedUriChars");
+ static const CharacterSet RelaxedCharsPlusWhite = CharacterSet("Any", 0, 255);
+#endif
+
+ // no custom treatment of http0() for now, but it can go here
+
+ if (Config.onoff.relaxed_header_parser) {
+#if USE_HTTP_VIOLATIONS
+ return RelaxedCharsPlusWhite;
+#else
+ return StrictCharsPlusWhite;
+#endif
+ }
+
+#if USE_HTTP_VIOLATIONS
+ return RelaxedChars;
+#else
+ return StrictChars;
+#endif
+}
+bool
+Http::One::RequestParser::parseUriField(Http1::Tokenizer &tok)
+{
/* Arbitrary 64KB URI upper length limit.
*
* Not quite as arbitrary as it seems though. Old SquidString objects
* cannot store strings larger than 64KB, so we must limit until they
* have all been replaced with SBuf.
*
* Not that it matters but RFC 7230 section 3.1.1 requires (RECOMMENDED)
* at least 8000 octets for the whole line, including method and version.
*/
- const size_t maxUriLength = min(static_cast<size_t>(Config.maxRequestHeaderSize) - firstLineSize(),
- static_cast<size_t>((64*1024)-1));
+ const size_t maxUriLength = static_cast<size_t>((64*1024)-1);
SBuf uriFound;
-
- // RFC 7230 HTTP/1.x URI are followed by at least one whitespace delimiter
- if (tok.prefix(uriFound, UriChars, maxUriLength) && tok.skipOne(CharacterSet::SP)) {
- uri_ = uriFound;
- buf_ = tok.remaining(); // incremental parse checkpoint
- return 1;
-
- // RFC 1945 for GET the line terminator may follow URL instead of a delimiter
- } else if (method_ == Http::METHOD_GET && skipLineTerminator(tok)) {
- debugs(33, 5, "HTTP/0.9 syntax request-line detected");
- msgProtocol_ = Http::ProtocolVersion(0,9);
- uri_ = uriFound; // found by successful prefix() call earlier.
- parseStatusCode = Http::scOkay;
- buf_ = tok.remaining(); // incremental parse checkpoint
- return 1;
-
- } else if (tok.atEnd()) {
- debugs(74, 5, "Parser needs more data to find URI");
- return 0;
+ if (!tok.prefix(uriFound, UriCharacters())) {
+ parseStatusCode = Http::scBadRequest;
+ debugs(33, ErrorLevel(), "invalid request-line: missing or malformed URI");
+ return false;
}
- // else errors...
-
- if (uriFound.length() == maxUriLength) {
+ if (uriFound.length() > maxUriLength) {
// RFC 7230 section 3.1.1 mandatory (MUST) 414 response
parseStatusCode = Http::scUriTooLong;
- debugs(33, 5, "invalid request-line. URI longer than " << maxUriLength << " bytes");
- } else {
- // RFC 7230 section 3.1.1 required (SHOULD) 400 response
- parseStatusCode = Http::scBadRequest;
- debugs(33, 5, "invalid request-line. missing URI delimiter");
+ debugs(33, ErrorLevel(), "invalid request-line: " << uriFound.length() <<
+ "-byte URI exceeds " << maxUriLength << "-byte limit");
+ return false;
}
- return -1;
+
+ uri_ = uriFound;
+ return true;
}
-int
+bool
Http::One::RequestParser::parseHttpVersionField(Http1::Tokenizer &tok)
{
- // partial match of HTTP/1 magic prefix
- if (tok.remaining().length() < Http1magic.length() && Http1magic.startsWith(tok.remaining())) {
- debugs(74, 5, "Parser needs more data to find version");
- return 0;
- }
+ const auto savedTok = tok;
- if (!tok.skip(Http1magic)) {
- debugs(74, 5, "invalid request-line. not HTTP/1 protocol");
- parseStatusCode = Http::scHttpVersionNotSupported;
- return -1;
+ SBuf digit;
+ // Searching for Http1magic precludes detecting HTTP/2+ versions.
+ // Rewrite if we ever _need_ to return 505 (Version Not Supported) errors.
+ if (tok.suffix(digit, CharacterSet::DIGIT) && tok.skipSuffix(Http1magic)) {
+ msgProtocol_ = Http::ProtocolVersion(1, (*digit.rawContent() - '0'));
+ return true;
}
- if (tok.atEnd()) {
- debugs(74, 5, "Parser needs more data to find version");
- return 0;
+ // A GET request might use HTTP/0.9 syntax
+ if (method_ == Http::METHOD_GET) {
+ // RFC 1945 - no HTTP version field at all
+ tok = savedTok; // in case the URI ends with a digit
+ // report this assumption as an error if configured to triage parsing
+ debugs(33, ErrorLevel(), "assuming HTTP/0.9 request-line");
+ msgProtocol_ = Http::ProtocolVersion(0,9);
+ return true;
}
- // get the version minor DIGIT
- SBuf digit;
- if (tok.prefix(digit, CharacterSet::DIGIT, 1) && skipLineTerminator(tok)) {
+ debugs(33, ErrorLevel(), "invalid request-line: not HTTP");
+ parseStatusCode = Http::scBadRequest;
+ return false;
+}
- // found version fully AND terminator
- msgProtocol_ = Http::ProtocolVersion(1, (*digit.rawContent() - '0'));
- parseStatusCode = Http::scOkay;
- buf_ = tok.remaining(); // incremental parse checkpoint
- return 1;
+/**
+ * Skip characters separating request-line fields.
+ * To handle bidirectional parsing, the caller does the actual skipping and
+ * we just check how many characters the caller has skipped.
+ */
+bool
+Http::One::RequestParser::skipDelimiter(const size_t count)
+{
+ if (count == 0) { // count is unsigned; "<= 0" would be misleading
+ debugs(33, ErrorLevel(), "invalid request-line: missing delimiter");
+ parseStatusCode = Http::scBadRequest;
+ return false;
+ }
- } else if (tok.atEnd() || (tok.skip('\r') && tok.atEnd())) {
- debugs(74, 5, "Parser needs more data to find version");
- return 0;
+ // tolerant parser allows multiple whitespace characters between request-line fields
+ if (count > 1 && !Config.onoff.relaxed_header_parser) {
+ debugs(33, ErrorLevel(), "invalid request-line: too many delimiters");
+ parseStatusCode = Http::scBadRequest;
+ return false;
+ }
- } // else error ...
+ return true;
+}
- // non-DIGIT. invalid version number.
- parseStatusCode = Http::scHttpVersionNotSupported;
- debugs(33, 5, "invalid request-line. garbage before line terminator");
- return -1;
+/// Parse CRs at the end of request-line, just before the terminating LF.
+bool
+Http::One::RequestParser::skipTrailingCrs(Http1::Tokenizer &tok)
+{
+ if (Config.onoff.relaxed_header_parser) {
+ (void)tok.skipAllTrailing(CharacterSet::CR); // optional; multiple OK
+ } else {
+ if (!tok.skipOneTrailing(CharacterSet::CR)) {
+ debugs(33, ErrorLevel(), "invalid request-line: missing CR before LF");
+ parseStatusCode = Http::scBadRequest;
+ return false;
+ }
+ }
+ return true;
}
/**
* Attempt to parse the first line of a new request message.
*
* Governed by:
* RFC 1945 section 5.1
* RFC 7230 section 2.6, 3.1 and 3.5
*
- * Parsing state is stored between calls. The current implementation uses
- * checkpoints after each successful request-line field.
- * The return value tells you whether the parsing is completed or not.
- *
* \retval -1 an error occurred. parseStatusCode indicates HTTP status result.
* \retval 1 successful parse. member fields contain the request-line items
* \retval 0 more data is needed to complete the parse
*/
int
Http::One::RequestParser::parseRequestFirstLine()
{
- Http1::Tokenizer tok(buf_);
-
debugs(74, 5, "parsing possible request: buf.length=" << buf_.length());
debugs(74, DBG_DATA, buf_);
- // NP: would be static, except it need to change with reconfigure
- CharacterSet WspDelim = CharacterSet::SP; // strict parse only accepts SP
-
- if (Config.onoff.relaxed_header_parser) {
- // RFC 7230 section 3.5
- // tolerant parser MAY accept any of SP, HTAB, VT (%x0B), FF (%x0C), or bare CR
- // as whitespace between request-line fields
- WspDelim += CharacterSet::HTAB
- + CharacterSet("VT,FF","\x0B\x0C")
- + CharacterSet::CR;
- debugs(74, 5, "using Parser relaxed WSP characters");
- }
-
- // only search for method if we have not yet found one
- if (method_ == Http::METHOD_NONE) {
- const int res = parseMethodField(tok, WspDelim);
- if (res < 1)
- return res;
- // else keep going...
- }
+ SBuf line;
- // tolerant parser allows multiple whitespace characters between request-line fields
- if (Config.onoff.relaxed_header_parser) {
- const size_t garbage = tok.skipAll(WspDelim);
- if (garbage > 0) {
- firstLineGarbage_ += garbage;
- buf_ = tok.remaining(); // re-checkpoint after garbage
- }
- }
- if (tok.atEnd()) {
+ // Earlier, skipGarbageLines() took care of any leading LFs (if allowed).
+ // Now, the request line has to end at the first LF.
+ static const CharacterSet lineChars = CharacterSet::LF.complement("notLF");
+ ::Parser::Tokenizer lineTok(buf_);
+ if (!lineTok.prefix(line, lineChars) || !lineTok.skip('\n')) {
debugs(74, 5, "Parser needs more data");
return 0;
}
- // from here on, we have two possible parse paths: whitespace tolerant, and strict
- if (Config.onoff.relaxed_header_parser) {
- // whitespace tolerant
+ Http1::Tokenizer tok(line);
+ const CharacterSet &delimiters = DelimiterCharacters();
- int warnOnError = (Config.onoff.relaxed_header_parser <= 0 ? DBG_IMPORTANT : 2);
+ if (!parseMethodField(tok))
+ return -1;
- // NOTES:
- // * this would be static, except WspDelim changes with reconfigure
- // * HTTP-version charset is included by uriValidCharacters()
- // * terminal CR is included by WspDelim here in relaxed parsing
- CharacterSet LfDelim = uriValidCharacters() + WspDelim;
-
- // seek the LF character, then tokenize the line in reverse
- SBuf line;
- if (tok.prefix(line, LfDelim) && tok.skip('\n')) {
- Http1::Tokenizer rTok(line);
- SBuf nil;
- (void)rTok.suffix(nil,CharacterSet::CR); // optional CR in terminator
- SBuf digit;
- if (rTok.suffix(digit,CharacterSet::DIGIT) && rTok.skipSuffix(Http1magic) && rTok.suffix(nil,WspDelim)) {
- uri_ = rTok.remaining();
- msgProtocol_ = Http::ProtocolVersion(1, (*digit.rawContent() - '0'));
- if (uri_.isEmpty()) {
- debugs(33, warnOnError, "invalid request-line. missing URL");
- parseStatusCode = Http::scBadRequest;
- return -1;
- }
-
- parseStatusCode = Http::scOkay;
- buf_ = tok.remaining(); // incremental parse checkpoint
- return 1;
-
- } else if (method_ == Http::METHOD_GET) {
- // RFC 1945 - for GET the line terminator may follow URL instead of a delimiter
- debugs(33, 5, "HTTP/0.9 syntax request-line detected");
- msgProtocol_ = Http::ProtocolVersion(0,9);
- static const SBuf cr("\r",1);
- uri_ = line.trim(cr,false,true);
- parseStatusCode = Http::scOkay;
- buf_ = tok.remaining(); // incremental parse checkpoint
- return 1;
- }
+ if (!skipDelimiter(tok.skipAll(delimiters)))
+ return -1;
- debugs(33, warnOnError, "invalid request-line. not HTTP");
- parseStatusCode = Http::scBadRequest;
- return -1;
- }
+ /* now parse backwards, to leave just the URI */
+ if (!skipTrailingCrs(tok))
+ return -1;
- if (!tok.atEnd()) {
+ if (!parseHttpVersionField(tok))
+ return -1;
-#if USE_HTTP_VIOLATIONS
- /*
- * RFC 3986 explicitly lists the characters permitted in URI.
- * A non-permitted character was found somewhere in the request-line.
- * However, as long as we can find the LF, accept the characters
- * which we know are invalid in any URI but actively used.
- */
- LfDelim.add('\0'); // Java
- LfDelim.add(' '); // IIS
- LfDelim.add('\"'); // Bing
- LfDelim.add('\\'); // MSIE, Firefox
- LfDelim.add('|'); // Amazon
- LfDelim.add('^'); // Microsoft News
-
- // other ASCII characters for which RFC 2396 has explicitly disallowed use
- // since 1998 and which were not later permitted by RFC 3986 in 2005.
- LfDelim.add('<'); // HTML embedded in URL
- LfDelim.add('>'); // HTML embedded in URL
- LfDelim.add('`'); // Shell Script embedded in URL
- LfDelim.add('{'); // JSON or Javascript embedded in URL
- LfDelim.add('}'); // JSON or Javascript embedded in URL
-
- // reset the tokenizer from anything the above did, then seek the LF character.
- tok.reset(buf_);
-
- if (tok.prefix(line, LfDelim) && tok.skip('\n')) {
-
- Http1::Tokenizer rTok(line);
-
- // strip terminating CR (if any)
- SBuf nil;
- (void)rTok.suffix(nil,CharacterSet::CR); // optional CR in terminator
- line = rTok.remaining();
-
- // strip terminating 'WSP HTTP-version' (if any)
- if (rTok.suffix(nil,CharacterSet::DIGIT) && rTok.skipSuffix(Http1magic) && rTok.suffix(nil,WspDelim)) {
- hackExpectsMime_ = true; // client thinks its speaking HTTP, probably sent a mime block.
- uri_ = rTok.remaining();
- } else
- uri_ = line; // no HTTP/1.x label found. Use the whole line.
-
- if (uri_.isEmpty()) {
- debugs(33, warnOnError, "invalid request-line. missing URL");
- parseStatusCode = Http::scBadRequest;
- return -1;
- }
-
- debugs(33, warnOnError, "invalid request-line. treating as HTTP/0.9" << (hackExpectsMime_?" (with mime)":""));
- msgProtocol_ = Http::ProtocolVersion(0,9);
- parseStatusCode = Http::scOkay;
- buf_ = tok.remaining(); // incremental parse checkpoint
- return 1;
-
- } else if (tok.atEnd()) {
- debugs(74, 5, "Parser needs more data");
- return 0;
- }
- // else, drop back to invalid request-line handling
-#endif
- const SBuf t = tok.remaining();
- debugs(33, warnOnError, "invalid request-line characters." << Raw("data", t.rawContent(), t.length()));
- parseStatusCode = Http::scBadRequest;
- return -1;
- }
- debugs(74, 5, "Parser needs more data");
- return 0;
- }
- // else strict non-whitespace tolerant parse
+ if (!http0() && !skipDelimiter(tok.skipAllTrailing(delimiters)))
+ return -1;
- // only search for request-target (URL) if we have not yet found one
- if (uri_.isEmpty()) {
- const int res = parseUriField(tok);
- if (res < 1 || msgProtocol_.protocol == AnyP::PROTO_HTTP)
- return res;
- // else keep going...
- }
+ /* parsed everything before and after the URI */
- if (tok.atEnd()) {
- debugs(74, 5, "Parser needs more data");
- return 0;
- }
+ if (!parseUriField(tok))
+ return -1;
- // HTTP/1 version suffix (protocol magic) followed by CR*LF
- if (msgProtocol_.protocol == AnyP::PROTO_NONE) {
- return parseHttpVersionField(tok);
+ if (!tok.atEnd()) {
+ debugs(33, ErrorLevel(), "invalid request-line: garbage after URI");
+ parseStatusCode = Http::scBadRequest;
+ return -1;
}
- // If we got here this method has been called too many times
- parseStatusCode = Http::scInternalServerError;
- debugs(33, 5, "ERROR: Parser already processed request-line");
- return -1;
+ parseStatusCode = Http::scOkay;
+ buf_ = lineTok.remaining(); // incremental parse checkpoint
+ return 1;
}
bool
Http::One::RequestParser::parse(const SBuf &aBuf)
{
buf_ = aBuf;
debugs(74, DBG_DATA, "Parse buf={length=" << aBuf.length() << ", data='" << aBuf << "'}");
// stage 1: locate the request-line
if (parsingStage_ == HTTP_PARSE_NONE) {
skipGarbageLines();
// if we hit something before EOS treat it as a message
if (!buf_.isEmpty())
parsingStage_ = HTTP_PARSE_FIRST;
else
return false;
}
// stage 2: parse the request-line
=== modified file 'src/http/one/RequestParser.h'
--- src/http/one/RequestParser.h 2015-04-10 11:02:44 +0000
+++ src/http/one/RequestParser.h 2015-11-03 20:33:49 +0000
@@ -30,40 +30,44 @@
class RequestParser : public Http1::Parser
{
public:
RequestParser();
virtual ~RequestParser() {}
/* Http::One::Parser API */
virtual void clear() {*this = RequestParser();}
virtual Http1::Parser::size_type firstLineSize() const;
virtual bool parse(const SBuf &aBuf);
/// the HTTP method if this is a request message
const HttpRequestMethod & method() const {return method_;}
/// the request-line URI if this is a request message, or an empty string.
const SBuf &requestUri() const {return uri_;}
private:
void skipGarbageLines();
int parseRequestFirstLine();
- int parseMethodField(Http1::Tokenizer &, const CharacterSet &);
- int parseUriField(Http1::Tokenizer &);
- int parseHttpVersionField(Http1::Tokenizer &);
+
+ /* all these return false and set parseStatusCode on parsing failures */
+ bool parseMethodField(Http1::Tokenizer &);
+ bool parseUriField(Http1::Tokenizer &);
+ bool parseHttpVersionField(Http1::Tokenizer &);
+ bool skipDelimiter(const size_t count);
+ bool skipTrailingCrs(Http1::Tokenizer &tok);
+
+ bool http0() const {return !msgProtocol_.major;}
+ static const CharacterSet &DelimiterCharacters();
+ static const CharacterSet &UriCharacters();
/// what request method has been found on the first line
HttpRequestMethod method_;
/// raw copy of the original client request-line URI field
SBuf uri_;
-
- /// amount of garbage bytes tolerantly skipped inside the request-line
- /// may be -1 if sender only omitted CR on terminator
- int64_t firstLineGarbage_;
};
} // namespace One
} // namespace Http
#endif /* _SQUID_SRC_HTTP_ONE_REQUESTPARSER_H */
=== modified file 'src/tests/testHttp1Parser.cc'
--- src/tests/testHttp1Parser.cc 2015-02-20 03:25:12 +0000
+++ src/tests/testHttp1Parser.cc 2015-11-03 20:50:22 +0000
@@ -36,55 +36,99 @@
// default to strict parser. set for loose parsing specifically where behaviour differs.
Config.onoff.relaxed_header_parser = 0;
Config.maxRequestHeaderSize = 1024; // XXX: unit test the RequestParser handling of this limit
}
#if __cplusplus >= 201103L
struct resultSet {
bool parsed;
bool needsMore;
Http1::ParseState parserState;
Http::StatusCode status;
SBuf::size_type suffixSz;
HttpRequestMethod method;
const char *uri;
AnyP::ProtocolVersion version;
};
+// define SQUID_DEBUG_TESTS to see exactly which test sub-cases fail and where
+#ifdef SQUID_DEBUG_TESTS
+// not optimized for runtime use
+static void
+Replace(SBuf &where, const SBuf &what, const SBuf &with)
+{
+ // prevent infinite loops
+ if (!what.length() || with.find(what) != SBuf::npos)
+ return;
+
+ SBuf::size_type pos = 0;
+ while ((pos = where.find(what, pos)) != SBuf::npos) {
+ SBuf buf = where.substr(0, pos);
+ buf.append(with);
+ buf.append(where.substr(pos+what.length()));
+ where = buf;
+ pos += with.length();
+ }
+}
+
+static SBuf Pretty(SBuf raw)
+{
+ Replace(raw, SBuf("\r"), SBuf("\\r"));
+ Replace(raw, SBuf("\n"), SBuf("\\n"));
+ return raw;
+}
+#endif
+
static void
testResults(int line, const SBuf &input, Http1::RequestParser &output, struct resultSet &expect)
{
-#if WHEN_TEST_DEBUG_IS_NEEDED
- printf("TEST @%d, in=%u: " SQUIDSBUFPH "\n", line, input.length(), SQUIDSBUFPRINT(input));
+#ifdef SQUID_DEBUG_TESTS
+ std::cerr << "TEST @" << line << ", in=" << Pretty(input) << "\n";
+#endif
+
+ const bool parsed = output.parse(input);
+
+#ifdef SQUID_DEBUG_TESTS
+ if (expect.parsed != parsed)
+ std::cerr << "\tparse-FAILED: " << expect.parsed << "!=" << parsed << "\n";
+ else if (parsed && expect.method != output.method_)
+ std::cerr << "\tmethod-FAILED: " << expect.method << "!=" << output.method_ << "\n";
+ if (expect.status != output.parseStatusCode)
+ std::cerr << "\tscode-FAILED: " << expect.status << "!=" << output.parseStatusCode << "\n";
+ if (expect.suffixSz != output.buf_.length())
+ std::cerr << "\tsuffixSz-FAILED: " << expect.suffixSz << "!=" << output.buf_.length() << "\n";
#endif
// runs the parse
- CPPUNIT_ASSERT_EQUAL(expect.parsed, output.parse(input));
+ CPPUNIT_ASSERT_EQUAL(expect.parsed, parsed);
+
+ // if parsing was successful, check easily visible field outputs
+ if (parsed) {
+ CPPUNIT_ASSERT_EQUAL(expect.method, output.method_);
+ if (expect.uri != NULL)
+ CPPUNIT_ASSERT_EQUAL(0, output.uri_.cmp(expect.uri));
+ CPPUNIT_ASSERT_EQUAL(expect.version, output.msgProtocol_);
+ }
- // check easily visible field outputs
- CPPUNIT_ASSERT_EQUAL(expect.method, output.method_);
- if (expect.uri != NULL)
- CPPUNIT_ASSERT_EQUAL(0, output.uri_.cmp(expect.uri));
- CPPUNIT_ASSERT_EQUAL(expect.version, output.msgProtocol_);
CPPUNIT_ASSERT_EQUAL(expect.status, output.parseStatusCode);
// check more obscure states
CPPUNIT_ASSERT_EQUAL(expect.needsMore, output.needsMoreData());
if (output.needsMoreData())
CPPUNIT_ASSERT_EQUAL(expect.parserState, output.parsingStage_);
CPPUNIT_ASSERT_EQUAL(expect.suffixSz, output.buf_.length());
}
#endif /* __cplusplus >= 200103L */
void
testHttp1Parser::testParserConstruct()
{
// whether the constructor works
{
Http1::RequestParser output;
CPPUNIT_ASSERT_EQUAL(true, output.needsMoreData());
CPPUNIT_ASSERT_EQUAL(Http1::HTTP_PARSE_NONE, output.parsingStage_);
CPPUNIT_ASSERT_EQUAL(Http::scNone, output.parseStatusCode); // XXX: clear() not being called.
CPPUNIT_ASSERT(output.buf_.isEmpty());
@@ -129,41 +173,41 @@
.parserState = Http1::HTTP_PARSE_DONE,
.status = Http::scOkay,
.suffixSz = 0,
.method = HttpRequestMethod(Http::METHOD_GET),
.uri = "/",
.version = AnyP::ProtocolVersion(AnyP::PROTO_HTTP,0,9)
};
output.clear();
testResults(__LINE__, input, output, expect);
input.clear();
}
// RFC 1945 : invalid HTTP/0.9 simple-request (only GET is valid)
{
input.append("POST /\r\n", 8);
struct resultSet expect = {
.parsed = false,
.needsMore = false,
.parserState = Http1::HTTP_PARSE_DONE,
.status = Http::scBadRequest,
- .suffixSz = 3,
+ .suffixSz = input.length(),
.method = HttpRequestMethod(Http::METHOD_POST),
.uri = NULL,
.version = AnyP::ProtocolVersion()
};
output.clear();
testResults(__LINE__, input, output, expect);
input.clear();
}
// RFC 1945 and 7230 : HTTP/1.0 request
{
input.append("GET / HTTP/1.0\r\n", 16);
struct resultSet expect = {
.parsed = false,
.needsMore = true,
.parserState = Http1::HTTP_PARSE_MIME,
.status = Http::scOkay,
.suffixSz = 0,
.method = HttpRequestMethod(Http::METHOD_GET),
.uri = "/",
@@ -200,183 +244,183 @@
.needsMore = true,
.parserState = Http1::HTTP_PARSE_MIME,
.status = Http::scOkay,
.suffixSz = 0,
.method = HttpRequestMethod(Http::METHOD_GET),
.uri = "/",
.version = AnyP::ProtocolVersion(AnyP::PROTO_HTTP,1,2)
};
output.clear();
testResults(__LINE__, input, output, expect);
input.clear();
}
// RFC 7230 : future versions do not use request-line syntax
{
input.append("GET / HTTP/2.0\r\n", 16);
struct resultSet expectA = {
.parsed = false,
.needsMore = false,
.parserState = Http1::HTTP_PARSE_MIME,
- .status = Http::scHttpVersionNotSupported,
- .suffixSz = input.length()-6,
- .method = HttpRequestMethod(Http::METHOD_GET),
+ .status = Http::scBadRequest,
+ .suffixSz = input.length(),
+ .method = HttpRequestMethod(),
.uri = "/",
.version = AnyP::ProtocolVersion()
};
output.clear();
testResults(__LINE__, input, output, expectA);
input.clear();
input.append("GET / HTTP/10.12\r\n", 18);
struct resultSet expectB = {
.parsed = false,
.needsMore = false,
.parserState = Http1::HTTP_PARSE_MIME,
- .status = Http::scHttpVersionNotSupported,
- .suffixSz = input.length()-6,
+ .status = Http::scBadRequest,
+ .suffixSz = input.length(),
.method = HttpRequestMethod(Http::METHOD_GET),
.uri = "/",
.version = AnyP::ProtocolVersion()
};
output.clear();
testResults(__LINE__, input, output, expectB);
input.clear();
}
// unknown non-HTTP protocol names
{
input.append("GET / FOO/1.0\r\n", 15);
struct resultSet expect = {
.parsed = false,
.needsMore = false,
.parserState = Http1::HTTP_PARSE_DONE,
- .status = Http::scHttpVersionNotSupported,
- .suffixSz = input.length()-6,
+ .status = Http::scBadRequest,
+ .suffixSz = input.length(),
.method = HttpRequestMethod(Http::METHOD_GET),
.uri = "/",
.version = AnyP::ProtocolVersion()
};
output.clear();
testResults(__LINE__, input, output, expect);
input.clear();
}
// no version digits
{
input.append("GET / HTTP/\r\n", 13);
struct resultSet expect = {
.parsed = false,
.needsMore = false,
.parserState = Http1::HTTP_PARSE_DONE,
- .status = Http::scHttpVersionNotSupported,
- .suffixSz = input.length()-6,
+ .status = Http::scBadRequest,
+ .suffixSz = input.length(),
.method = HttpRequestMethod(Http::METHOD_GET),
.uri = "/",
.version = AnyP::ProtocolVersion()
};
output.clear();
testResults(__LINE__, input, output, expect);
input.clear();
}
// no major version
{
input.append("GET / HTTP/.1\r\n", 15);
struct resultSet expect = {
.parsed = false,
.needsMore = false,
.parserState = Http1::HTTP_PARSE_DONE,
- .status = Http::scHttpVersionNotSupported,
- .suffixSz = input.length()-6,
+ .status = Http::scBadRequest,
+ .suffixSz = input.length(),
.method = HttpRequestMethod(Http::METHOD_GET),
.uri = "/",
.version = AnyP::ProtocolVersion()
};
output.clear();
testResults(__LINE__, input, output, expect);
input.clear();
}
// no version dot
{
input.append("GET / HTTP/11\r\n", 15);
struct resultSet expect = {
.parsed = false,
.needsMore = false,
.parserState = Http1::HTTP_PARSE_DONE,
- .status = Http::scHttpVersionNotSupported,
- .suffixSz = input.length()-6,
+ .status = Http::scBadRequest,
+ .suffixSz = input.length(),
.method = HttpRequestMethod(Http::METHOD_GET),
.uri = "/",
.version = AnyP::ProtocolVersion()
};
output.clear();
testResults(__LINE__, input, output, expect);
input.clear();
}
// negative major version (bug 3062)
{
input.append("GET / HTTP/-999999.1\r\n", 22);
struct resultSet expect = {
.parsed = false,
.needsMore = false,
.parserState = Http1::HTTP_PARSE_DONE,
- .status = Http::scHttpVersionNotSupported,
- .suffixSz = input.length()-6,
+ .status = Http::scBadRequest,
+ .suffixSz = input.length(),
.method = HttpRequestMethod(Http::METHOD_GET),
.uri = "/",
.version = AnyP::ProtocolVersion()
};
output.clear();
testResults(__LINE__, input, output, expect);
input.clear();
}
// no minor version
{
input.append("GET / HTTP/1.\r\n", 15);
struct resultSet expect = {
.parsed = false,
.needsMore = false,
.parserState = Http1::HTTP_PARSE_DONE,
- .status = Http::scHttpVersionNotSupported,
- .suffixSz = input.length()-6,
+ .status = Http::scBadRequest,
+ .suffixSz = input.length(),
.method = HttpRequestMethod(Http::METHOD_GET),
.uri = "/",
.version = AnyP::ProtocolVersion()
};
output.clear();
testResults(__LINE__, input, output, expect);
input.clear();
}
// negative minor version (bug 3062 corollary)
{
input.append("GET / HTTP/1.-999999\r\n", 22);
struct resultSet expect = {
.parsed = false,
.needsMore = false,
.parserState = Http1::HTTP_PARSE_DONE,
- .status = Http::scHttpVersionNotSupported,
- .suffixSz = input.length()-6,
+ .status = Http::scBadRequest,
+ .suffixSz = input.length(),
.method = HttpRequestMethod(Http::METHOD_GET),
.uri = "/",
.version = AnyP::ProtocolVersion()
};
output.clear();
testResults(__LINE__, input, output, expect);
input.clear();
}
}
void
testHttp1Parser::testParseRequestLineStrange()
{
// ensure MemPools etc exist
globalSetup();
SBuf input;
Http1::RequestParser output;
// space padded URL
@@ -386,75 +430,75 @@
Config.onoff.relaxed_header_parser = 1;
struct resultSet expect = {
.parsed = false,
.needsMore = true,
.parserState = Http1::HTTP_PARSE_MIME,
.status = Http::scOkay,
.suffixSz = 0,
.method = HttpRequestMethod(Http::METHOD_GET),
.uri = "/",
.version = AnyP::ProtocolVersion(AnyP::PROTO_HTTP,1,1)
};
output.clear();
testResults(__LINE__, input, output, expect);
Config.onoff.relaxed_header_parser = 0;
struct resultSet expectStrict = {
.parsed = false,
.needsMore = false,
.parserState = Http1::HTTP_PARSE_DONE,
.status = Http::scBadRequest,
- .suffixSz = input.length()-4,
- .method = HttpRequestMethod(Http::METHOD_GET),
+ .suffixSz = input.length(),
+ .method = HttpRequestMethod(),
.uri = NULL,
.version = AnyP::ProtocolVersion()
};
output.clear();
testResults(__LINE__, input, output, expectStrict);
input.clear();
}
// whitespace inside URI (nasty, but it happens)
{
input.append("GET /fo o/ HTTP/1.1\r\n", 21);
Config.onoff.relaxed_header_parser = 1;
struct resultSet expect = {
.parsed = false,
.needsMore = true,
.parserState = Http1::HTTP_PARSE_MIME,
.status = Http::scOkay,
.suffixSz = 0,
.method = HttpRequestMethod(Http::METHOD_GET),
.uri = "/fo o/",
.version = AnyP::ProtocolVersion(AnyP::PROTO_HTTP,1,1)
};
output.clear();
testResults(__LINE__, input, output, expect);
Config.onoff.relaxed_header_parser = 0;
struct resultSet expectStrict = {
.parsed = false,
.needsMore = false,
.parserState = Http1::HTTP_PARSE_DONE,
- .status = Http::scHttpVersionNotSupported, // version being "o/ HTTP/1.1"
- .suffixSz = 13,
- .method = HttpRequestMethod(Http::METHOD_GET),
+ .status = Http::scBadRequest,
+ .suffixSz = input.length(),
+ .method = HttpRequestMethod(),
.uri = NULL,
.version = AnyP::ProtocolVersion()
};
output.clear();
testResults(__LINE__, input, output, expectStrict);
input.clear();
}
// additional data in buffer
{
input.append("GET / HTTP/1.1\r\nboo!", 20);
struct resultSet expect = {
.parsed = false,
.needsMore = true,
.parserState = Http1::HTTP_PARSE_MIME,
.status = Http::scOkay,
.suffixSz = 4, // strlen("boo!")
.method = HttpRequestMethod(Http::METHOD_GET),
.uri = "/",
.version = AnyP::ProtocolVersion(AnyP::PROTO_HTTP,1,1)
@@ -481,98 +525,98 @@
input.append("GET / HTTP/1.1\n", 15);
Config.onoff.relaxed_header_parser = 1;
struct resultSet expect = {
.parsed = false,
.needsMore = true,
.parserState = Http1::HTTP_PARSE_MIME,
.status = Http::scOkay,
.suffixSz = 0,
.method = HttpRequestMethod(Http::METHOD_GET),
.uri = "/",
.version = AnyP::ProtocolVersion(AnyP::PROTO_HTTP,1,1)
};
output.clear();
testResults(__LINE__, input, output, expect);
Config.onoff.relaxed_header_parser = 0;
struct resultSet expectStrict = {
.parsed = false,
.needsMore = false,
.parserState = Http1::HTTP_PARSE_DONE,
- .status = Http::scHttpVersionNotSupported,
- .suffixSz = 9,
- .method = HttpRequestMethod(Http::METHOD_GET),
- .uri = "/",
+ .status = Http::scBadRequest,
+ .suffixSz = input.length(),
+ .method = HttpRequestMethod(),
+ .uri = NULL,
.version = AnyP::ProtocolVersion()
};
output.clear();
testResults(__LINE__, input, output, expectStrict);
input.clear();
}
// alternative EOL sequence: double-NL-only
// RFC 7230 tolerance permits omitted CR
// NP: represents a request with no mime headers
{
input.append("GET / HTTP/1.1\n\n", 16);
Config.onoff.relaxed_header_parser = 1;
struct resultSet expect = {
.parsed = true,
.needsMore = false,
.parserState = Http1::HTTP_PARSE_DONE,
.status = Http::scOkay,
.suffixSz = 0,
.method = HttpRequestMethod(Http::METHOD_GET),
.uri = "/",
.version = AnyP::ProtocolVersion(AnyP::PROTO_HTTP,1,1)
};
output.clear();
testResults(__LINE__, input, output, expect);
Config.onoff.relaxed_header_parser = 0;
struct resultSet expectStrict = {
.parsed = false,
.needsMore = false,
.parserState = Http1::HTTP_PARSE_DONE,
- .status = Http::scHttpVersionNotSupported,
- .suffixSz = 10,
- .method = HttpRequestMethod(Http::METHOD_GET),
- .uri = "/",
+ .status = Http::scBadRequest,
+ .suffixSz = input.length(),
+ .method = HttpRequestMethod(),
+ .uri = NULL,
.version = AnyP::ProtocolVersion()
};
output.clear();
testResults(__LINE__, input, output, expectStrict);
input.clear();
}
// space padded version
{
// RFC 7230 specifies version is followed by CRLF. No intermediary bytes.
input.append("GET / HTTP/1.1 \r\n", 17);
struct resultSet expect = {
.parsed = false,
.needsMore = false,
.parserState = Http1::HTTP_PARSE_DONE,
- .status = Http::scHttpVersionNotSupported,
- .suffixSz = input.length()-6,
- .method = HttpRequestMethod(Http::METHOD_GET),
- .uri = "/",
+ .status = Http::scBadRequest,
+ .suffixSz = input.length(),
+ .method = HttpRequestMethod(),
+ .uri = NULL,
.version = AnyP::ProtocolVersion()
};
output.clear();
testResults(__LINE__, input, output, expect);
input.clear();
}
}
void
testHttp1Parser::testParseRequestLineMethods()
{
// ensure MemPools etc exist
globalSetup();
SBuf input;
Http1::RequestParser output;
// RFC 7230 : dot method
{
input.append(". / HTTP/1.1\r\n", 14);
@@ -628,57 +672,59 @@
}
// unknown method
{
input.append("HELLOWORLD / HTTP/1.1\r\n", 23);
struct resultSet expect = {
.parsed = false,
.needsMore = true,
.parserState = Http1::HTTP_PARSE_MIME,
.status = Http::scOkay,
.suffixSz = 0,
.method = HttpRequestMethod(SBuf("HELLOWORLD")),
.uri = "/",
.version = AnyP::ProtocolVersion(AnyP::PROTO_HTTP,1,1)
};
output.clear();
testResults(__LINE__, input, output, expect);
input.clear();
}
+#if 0
// too-long method (over 16 bytes)
{
input.append("HELLOSTRANGEWORLD / HTTP/1.1\r\n", 30);
struct resultSet expect = {
.parsed = false,
.needsMore = false,
.parserState = Http1::HTTP_PARSE_DONE,
.status = Http::scNotImplemented,
.suffixSz = input.length(),
.method = HttpRequestMethod(),
.uri = NULL,
.version = AnyP::ProtocolVersion()
};
output.clear();
testResults(__LINE__, input, output, expect);
input.clear();
}
+#endif
// method-only
{
input.append("A\n", 2);
struct resultSet expect = {
.parsed = false,
.needsMore = false,
.parserState = Http1::HTTP_PARSE_DONE,
.status = Http::scBadRequest,
.suffixSz = input.length(),
.method = HttpRequestMethod(),
.uri = NULL,
.version = AnyP::ProtocolVersion()
};
output.clear();
testResults(__LINE__, input, output, expect);
input.clear();
}
{
@@ -865,122 +911,123 @@
}
// no method (with method delimiter)
{
input.append(" / HTTP/1.0\n", 12);
struct resultSet expectStrict = {
.parsed = false,
.needsMore = false,
.parserState = Http1::HTTP_PARSE_DONE,
.status = Http::scBadRequest,
.suffixSz = input.length(),
.method = HttpRequestMethod(),
.uri = NULL,
.version = AnyP::ProtocolVersion()
};
output.clear();
testResults(__LINE__, input, output, expectStrict);
input.clear();
}
- // binary code in method (invalid)
+ // binary code after method (invalid)
{
- input.append("GET\x0A / HTTP/1.1\r\n", 17);
+ input.append("GET\x16 / HTTP/1.1\r\n", 17);
struct resultSet expect = {
.parsed = false,
.needsMore = false,
.parserState = Http1::HTTP_PARSE_DONE,
.status = Http::scBadRequest,
.suffixSz = input.length(),
.method = HttpRequestMethod(),
.uri = NULL,
.version = AnyP::ProtocolVersion()
};
output.clear();
testResults(__LINE__, input, output, expect);
input.clear();
}
- // binary code NUL! in method (always invalid)
+ // binary code NUL! after method (always invalid)
{
input.append("GET\0 / HTTP/1.1\r\n", 17);
struct resultSet expect = {
.parsed = false,
.needsMore = false,
.parserState = Http1::HTTP_PARSE_DONE,
.status = Http::scBadRequest,
.suffixSz = input.length(),
.method = HttpRequestMethod(),
.uri = NULL,
.version = AnyP::ProtocolVersion()
};
output.clear();
testResults(__LINE__, input, output, expect);
input.clear();
}
- // no URL (grammer invalid, ambiguous with RFC 1945 HTTP/0.9 simple-request)
+ // Either an RFC 1945 HTTP/0.9 simple-request for an "HTTP/1.1" URI or
+ // an invalid (no URI) HTTP/1.1 request. We treat it as the latter, naturally.
{
input.append("GET  HTTP/1.1\r\n", 15);
- // RFC 7230 tolerance allows sequence of SP to make this ambiguous
Config.onoff.relaxed_header_parser = 1;
struct resultSet expect = {
- .parsed = true,
+ .parsed = false,
.needsMore = false,
.parserState = Http1::HTTP_PARSE_DONE,
- .status = Http::scOkay,
- .suffixSz = 0,
- .method = HttpRequestMethod(Http::METHOD_GET),
- .uri = "HTTP/1.1",
- .version = AnyP::ProtocolVersion(AnyP::PROTO_HTTP,0,9)
+ .status = Http::scBadRequest,
+ .suffixSz = input.length(),
+ .method = HttpRequestMethod(),
+ .uri = NULL,
+ .version = AnyP::ProtocolVersion()
};
output.clear();
testResults(__LINE__, input, output, expect);
Config.onoff.relaxed_header_parser = 0;
struct resultSet expectStrict = {
.parsed = false,
.needsMore = false,
.parserState = Http1::HTTP_PARSE_DONE,
.status = Http::scBadRequest,
- .suffixSz = 11,
- .method = HttpRequestMethod(Http::METHOD_GET),
+ .suffixSz = input.length(),
+ .method = HttpRequestMethod(),
.uri = NULL,
.version = AnyP::ProtocolVersion()
};
output.clear();
testResults(__LINE__, input, output, expectStrict);
input.clear();
}
- // no URL (grammer invalid, ambiguous with RFC 1945 HTTP/0.9 simple-request)
+ // Either an RFC 1945 HTTP/0.9 simple-request for an "HTTP/1.1" URI or
+ // an invalid (no URI) HTTP/1.1 request. We treat it as the latter, naturally.
{
input.append("GET HTTP/1.1\r\n", 14);
struct resultSet expect = {
- .parsed = true,
+ .parsed = false,
.needsMore = false,
.parserState = Http1::HTTP_PARSE_DONE,
- .status = Http::scOkay,
- .suffixSz = 0,
- .method = HttpRequestMethod(Http::METHOD_GET),
- .uri = "HTTP/1.1",
- .version = AnyP::ProtocolVersion(AnyP::PROTO_HTTP,0,9)
+ .status = Http::scBadRequest,
+ .suffixSz = input.length(),
+ .method = HttpRequestMethod(),
+ .uri = NULL,
+ .version = AnyP::ProtocolVersion()
};
output.clear();
testResults(__LINE__, input, output, expect);
input.clear();
}
// binary line
{
input.append("\xB\xC\xE\xF\n", 5);
struct resultSet expect = {
.parsed = false,
.needsMore = false,
.parserState = Http1::HTTP_PARSE_DONE,
.status = Http::scBadRequest,
.suffixSz = input.length(),
.method = HttpRequestMethod(),
.uri = NULL,
.version = AnyP::ProtocolVersion()
};
output.clear();
@@ -1019,43 +1066,41 @@
.uri = NULL,
.version = AnyP::ProtocolVersion()
};
output.clear();
testResults(__LINE__, input, output, expect);
input.clear();
}
}
void
testHttp1Parser::testDripFeed()
{
// Simulate a client drip-feeding Squid a few bytes at a time.
// extend the size of the buffer from 0 bytes to full request length
// calling the parser repeatedly as visible data grows.
SBuf data;
data.append("\n\n\n\n\n\n\n\n\n\n\n\n", 12);
SBuf::size_type garbageEnd = data.length();
data.append("GET ", 4);
- SBuf::size_type methodEnd = data.length()-1;
data.append("http://example.com/ ", 20);
- SBuf::size_type uriEnd = data.length()-1;
data.append("HTTP/1.1\r\n", 10);
SBuf::size_type reqLineEnd = data.length() - 1;
data.append("Host: example.com\r\n\r\n", 21);
SBuf::size_type mimeEnd = data.length() - 1;
data.append("...", 3); // trailer to catch mime EOS errors.
SBuf ioBuf;
Http1::RequestParser hp;
// start with strict and move on to relaxed
Config.onoff.relaxed_header_parser = 2;
Config.maxRequestHeaderSize = 1024; // large enough to hold the test data.
do {
// state of things we expect right now
struct resultSet expect = {
.parsed = false,
.needsMore = true,
@@ -1074,53 +1119,40 @@
for (SBuf::size_type pos = 0; pos <= data.length(); ++pos) {
// simulate reading one more byte
ioBuf.append(data.substr(pos,1));
// strict does not permit the garbage prefix
if (pos < garbageEnd && !Config.onoff.relaxed_header_parser) {
ioBuf.clear();
continue;
}
// when the garbage is passed we expect to start seeing first-line bytes
if (pos == garbageEnd)
expect.parserState = Http1::HTTP_PARSE_FIRST;
// all points after garbage start to see accumulated bytes looking for end of current section
if (pos >= garbageEnd)
expect.suffixSz = ioBuf.length();
- // at end of request line expect to see method details
- if (pos == methodEnd) {
- expect.suffixSz = 0; // and a checkpoint buffer reset
- expect.method = HttpRequestMethod(Http::METHOD_GET);
- }
-
- // at end of URI strict expects to see method, URI details
- // relaxed must wait to end of line for whitespace tolerance
- if (pos == uriEnd && !Config.onoff.relaxed_header_parser) {
- expect.suffixSz = 0; // and a checkpoint buffer reset
- expect.uri = "http://example.com/";
- }
-
// at end of request line expect to see method, URI, version details
// and switch to seeking Mime header section
if (pos == reqLineEnd) {
expect.parserState = Http1::HTTP_PARSE_MIME;
expect.suffixSz = 0; // and a checkpoint buffer reset
expect.status = Http::scOkay;
expect.method = HttpRequestMethod(Http::METHOD_GET);
expect.uri = "http://example.com/";
expect.version = AnyP::ProtocolVersion(AnyP::PROTO_HTTP,1,1);
}
// once the mime header section is done we expect a new request
// parse results say true and initial data is all gone from the buffer
if (pos == mimeEnd) {
expect.parsed = true;
expect.needsMore = false;
expect.suffixSz = 0; // and a checkpoint buffer reset
}
testResults(__LINE__, ioBuf, hp, expect);