[squid-dev] [PATCH] HTTP Response Parser upgrade

Amos Jeffries squid3 at treenet.co.nz
Thu Nov 13 16:08:20 UTC 2014


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

This patch contains what I originally planned to be 2 steps:

1) convert the HTTP server read buffer to an SBuf using the same design
and Comm::Read API implemented previously for the client connections.

The buffer remains default initialized at 16KB per connection but is no
longer absolutely limited to 256KB. Instead it is limited by
configuration options controlling maximum server input sizes on
read_ahead_gap and response message headers.

The Client API has been extended with a new method to estimate size
requirements of an SBuf I/O buffer. Modelled on and deprecating the
existing MemBuf estimator.
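
As a rough illustration of what the estimator decides (a standalone sketch,
not the patch code): estimateReadSpace() and all of its parameters are
hypothetical names standing in for the SBuf free space, the SBuf size
ceiling, the "already overflowed" flag and the adaptation pipe space.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the estimator logic.
// freeSpace : bytes available in the buffer without growing it
// hardLimit : absolute most the buffer type could ever offer
// minSpace  : what the caller would like to have at minimum
// overflowed: true when data is already queued waiting for adaptation
// pipeSpace : free bytes in the adaptation pipe, or -1 when there is no pipe
std::size_t estimateReadSpace(std::size_t freeSpace, std::size_t hardLimit,
                              std::size_t minSpace, bool overflowed,
                              long pipeSpace)
{
    std::size_t space = freeSpace;
    if (space < minSpace)
        space = std::min(minSpace, hardLimit); // do not promise more than asked

    if (overflowed)
        return 0; // stop reading until the adaptation side catches up

    // never read more than the adaptation pipe can accept
    if (pipeSpace >= 0 && static_cast<std::size_t>(pipeSpace) < space)
        space = static_cast<std::size_t>(pipeSpace);

    return space;
}
```

The point of the shape is that the answer is the smallest of the limits
that apply, and zero when reading should pause entirely.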

The Comm::ReadNow() API is extended to allow limited-size read(2)
operations by setting the CommIoCbParams::size parameter.
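
The size limit amounts to a clamp applied before the read(2) call. A
minimal standalone sketch of that rule follows; readSizeFor() is a
hypothetical helper name, with "requested" playing the role of
CommIoCbParams::size and "freeSpace" the buffer's free space.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper mirroring the clamp: a requested size of zero means
// "no limit", otherwise the smaller of the two values wins.
std::size_t readSizeFor(std::size_t freeSpace, std::size_t requested)
{
    std::size_t sz = freeSpace;
    if (requested > 0 && requested < sz)
        sz = requested;
    return sz;
}
```

So a caller that leaves the size at zero keeps the old behaviour of filling
whatever free space the buffer has.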

The HttpStateData buffer is partially detached from
StoreEntry::delayAwareRead() API due to requirements of the
Comm::ReadNow() API. Instead StoreEntry::bytesWanted() is used directly
to determine read(2) size, and DeferredRead are generated only when
ReadNow() is actually and immediately to be deferred. Theoretically this
means fewer read operations get deferred in some high load cases.
Practically it means there is no longer an AsyncCall queue plus socket
wait delay between delay_pools promising a read size, doing the
read(2), and accounting for the bytes received - accuracy should be much
improved under load.

This introduces one temporary performance regression: the SBuf content is
converted to MemBuf for the chunked decoder to process.



2) add Http1::ResponseParser class for use parsing HTTP response messages.

Modelled on the same design as used for the HTTP RequestParser, and
inheriting from a mutual parent Http1::Parser.

The Parser is Tokeniser based, incremental and 'consumes' bytes out of
the buffer as they are parsed.
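
To show what "incremental" and "consumes" mean for the caller, here is a toy
sketch, assuming nothing about the real Tokeniser. LineParser is a
hypothetical class that handles only one CRLF-terminated line, but it
follows the same calling pattern: parse the buffer, then sync the buffer
from remaining().

```cpp
#include <cassert>
#include <string>

// Hypothetical toy parser: recognises a single CRLF-terminated line.
class LineParser
{
public:
    // Returns true once a whole line has been seen. Parsed bytes are
    // dropped from what remaining() reports; unparsed bytes are kept.
    bool parse(const std::string &input) {
        remaining_ = input;
        const std::string::size_type eol = remaining_.find("\r\n");
        if (eol == std::string::npos) {
            needsMore_ = true; // keep everything, wait for more input
            return false;
        }
        line_ = remaining_.substr(0, eol);
        remaining_.erase(0, eol + 2); // consume the line and its CRLF
        needsMore_ = false;
        return true;
    }

    bool needsMoreData() const { return needsMore_; }
    const std::string &line() const { return line_; }
    const std::string &remaining() const { return remaining_; }

private:
    std::string line_;
    std::string remaining_;
    bool needsMore_ = true;
};
```

A caller appends newly read bytes to its buffer, calls parse(), copies
remaining() back into the buffer, and repeats until needsMoreData() is
false.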

This Parser class recognises HTTP/1.x and ICY/1 syntax messages. Any
unknown syntax input is assumed to be HTTP "0.9" and it will
gateway/transform the response to HTTP/1.1.
 NOTE: these are all semantic actions performed by the code being
replaced in (3). Only the form and OO scoping have changed.

The mime header block detection operation is generalized into the
Http1::Parser for use by both RequestParser and ResponseParser. The
request_parse_status error code has also been adapted for shared use.



3) integrate the Http1::ResponseParser with HttpStateData server
response processing.

This is largely code shuffling. Though I have extended the EOF \r\n hack
such that it enables Squid to parse truncated response headers.
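
The idea behind the extended hack can be sketched in isolation. This is an
illustration only: headerBlockComplete() and parseOrPadAtEof() are
hypothetical names, and the "parser" here merely looks for the blank line
that ends a header block.

```cpp
#include <cassert>
#include <string>

// Toy stand-in for the real parser: true once a blank line terminates
// the header block.
bool headerBlockComplete(const std::string &buf)
{
    return buf.find("\r\n\r\n") != std::string::npos;
}

// Returns true when the headers can be processed. At EOF an incomplete
// buffer is padded with CRLF CRLF and checked again, so a truncated
// header block still terminates.
bool parseOrPadAtEof(std::string &buf, bool eof)
{
    if (headerBlockComplete(buf))
        return true;
    if (!eof)
        return false;       // more data may still arrive
    buf.append("\r\n\r\n"); // no more data coming: force end-of-headers
    return headerBlockComplete(buf);
}
```

Padding with two CRLF pairs rather than one covers both the "\r\n then EOF"
case and a header line cut off mid-way.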



Due to polygraph being out of service presently I'm unable to compare
performance to trunk. Coadvisor tests underway now.

Amos

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (MingW32)

iQEcBAEBAgAGBQJUZNdsAAoJELJo5wb/XPRjrtAIAKHjMfGST7RbUO7Dk63/xd/Y
9lhU2naIQpUmvn7LFLLz5B8VBoot13UCA+lhhRxunPjpapJuidZV0pq8lMlGNuTA
gxd46QEXkGX5t1mjIdks+lenIZ91vQDS4Jx5UEyG0Up+llJII4OGEwSX9h2nSssm
8e6PndL+r7XwMqcMIbVMwdTtIpZ0tPx5uGgGxzuXlperqVUy9ohf0i+9qPN7Ik9c
nlNzVWVNQ68YKcQp6PcqMy7HN3Z5j3kG5YxRfyH4bQlpKl2uNVCXhGI1F3AATAlS
HwyjwXFX+5avybi3pTFj7waEof4zI0LoCTqqBTPnn7lgIYS/CBU9SeYY7tqMgr8=
=g4Cf
-----END PGP SIGNATURE-----
-------------- next part --------------
=== modified file 'src/client_side.cc'
--- src/client_side.cc	2014-11-10 11:45:36 +0000
+++ src/client_side.cc	2014-11-13 12:15:31 +0000
@@ -2032,41 +2032,41 @@
              CharacterSet::HEXDIG + CharacterSet::ALPHA + CharacterSet::DIGIT;
         if (!tok.skipAll(authority))
             break;
 
         static const SBuf slashUri("/");
         const SBuf t = tok.remaining();
         if (t.isEmpty())
             url = slashUri;
         else if (t[0]=='/') // looks like path
             url = t;
         else if (t[0]=='?' || t[0]=='#') { // looks like query or fragment. fix '/'
             url = slashUri;
             url.append(t);
         } // else do nothing. invalid path
 
     } while(false);
 
 #if SHOULD_REJECT_UNKNOWN_URLS
     // reject URI which are not well-formed even after the processing above
     if (url.isEmpty() || url[0] != '/') {
-        hp->request_parse_status = Http::scBadRequest;
+        hp->parseStatusCode = Http::scBadRequest;
         return conn->abortRequestParsing("error:invalid-request");
     }
 #endif
 
     if (vport < 0)
         vport = http->getConn()->clientConnection->local.port();
 
     const bool switchedToHttps = conn->switchedToHttps();
     const bool tryHostHeader = vhost || switchedToHttps;
     char *host = NULL;
     if (tryHostHeader && (host = hp->getHeaderField("Host"))) {
         debugs(33, 5, "ACCEL VHOST REWRITE: vhost=" << host << " + vport=" << vport);
         char thost[256];
         if (vport > 0) {
             thost[0] = '\0';
             char *t = NULL;
             if (host[strlen(host)] != ']' && (t = strrchr(host,':')) != NULL) {
                 strncpy(thost, host, (t-host));
                 snprintf(thost+(t-host), sizeof(thost)-(t-host), ":%d", vport);
                 host = thost;
@@ -2146,65 +2146,65 @@
  *  \param[out] http_ver will be set as a side-effect of the parsing
  *  \return NULL on incomplete requests,
  *          a ClientSocketContext structure on success or failure.
  */
 ClientSocketContext *
 parseHttpRequest(ConnStateData *csd, const Http1::RequestParserPointer &hp)
 {
     /* Attempt to parse the first line; this will define where the method, url, version and header begin */
     {
         const bool parsedOk = hp->parse(csd->in.buf);
 
         // sync the buffers after parsing.
         csd->in.buf = hp->remaining();
 
         if (hp->needsMoreData()) {
             debugs(33, 5, "Incomplete request, waiting for end of request line");
             return NULL;
         }
 
         if (!parsedOk) {
-            if (hp->request_parse_status == Http::scRequestHeaderFieldsTooLarge || hp->request_parse_status == Http::scUriTooLong)
+            if (hp->parseStatusCode == Http::scRequestHeaderFieldsTooLarge || hp->parseStatusCode == Http::scUriTooLong)
                 return csd->abortRequestParsing("error:request-too-large");
 
             return csd->abortRequestParsing("error:invalid-request");
         }
     }
 
     /* We know the whole request is in parser now */
     debugs(11, 2, "HTTP Client " << csd->clientConnection);
     debugs(11, 2, "HTTP Client REQUEST:\n---------\n" <<
            hp->method() << " " << hp->requestUri() << " " << hp->messageProtocol() << "\n" <<
            hp->mimeHeader() <<
            "\n----------");
 
     /* deny CONNECT via accelerated ports */
     if (hp->method() == Http::METHOD_CONNECT && csd->port != NULL && csd->port->flags.accelSurrogate) {
         debugs(33, DBG_IMPORTANT, "WARNING: CONNECT method received on " << csd->transferProtocol << " Accelerator port " << csd->port->s.port());
         debugs(33, DBG_IMPORTANT, "WARNING: for request: " << hp->method() << " " << hp->requestUri() << " " << hp->messageProtocol());
-        hp->request_parse_status = Http::scMethodNotAllowed;
+        hp->parseStatusCode = Http::scMethodNotAllowed;
         return csd->abortRequestParsing("error:method-not-allowed");
     }
 
     if (hp->method() == Http::METHOD_NONE) {
         debugs(33, DBG_IMPORTANT, "WARNING: Unsupported method: " << hp->method() << " " << hp->requestUri() << " " << hp->messageProtocol());
-        hp->request_parse_status = Http::scMethodNotAllowed;
+        hp->parseStatusCode = Http::scMethodNotAllowed;
         return csd->abortRequestParsing("error:unsupported-request-method");
     }
 
     // Process headers after request line
     debugs(33, 3, "complete request received. " <<
            "prefix_sz = " << hp->messageHeaderSize() <<
            ", request-line-size=" << hp->firstLineSize() <<
            ", mime-header-size=" << hp->headerBlockSize() <<
            ", mime header block:\n" << hp->mimeHeader() << "\n----------");
 
     /* Ok, all headers are received */
     ClientHttpRequest *http = new ClientHttpRequest(csd);
 
     http->req_sz = hp->messageHeaderSize();
     ClientSocketContext *result = new ClientSocketContext(csd->clientConnection, http);
 
     StoreIOBuffer tempBuffer;
     tempBuffer.data = result->reqbuf;
     tempBuffer.length = HTTP_REQBUF_SZ;
 

=== modified file 'src/clients/Client.cc'
--- src/clients/Client.cc	2014-09-29 05:13:17 +0000
+++ src/clients/Client.cc	2014-11-09 15:59:42 +0000
@@ -956,42 +956,83 @@
 #if USE_ADAPTATION
     assert(!adaptationAccessCheckPending); // or would need to buffer while waiting
     if (startedAdaptation) {
         adaptVirginReplyBody(data, len);
         return;
     }
 #endif
     storeReplyBody(data, len);
 }
 
 // writes virgin or adapted reply body to store
 void
 Client::storeReplyBody(const char *data, ssize_t len)
 {
     // write even if len is zero to push headers towards the client side
     entry->write (StoreIOBuffer(len, currentOffset, (char*)data));
 
     currentOffset += len;
 }
 
-size_t Client::replyBodySpace(const MemBuf &readBuf,
-                                       const size_t minSpace) const
+size_t
+Client::needBufferSpace(const SBuf &readBuf, const size_t minSpace) const
+{
+    size_t space = readBuf.spaceSize(); // available space w/o heroic measures
+    if (space < minSpace) {
+        const size_t maxSpace = SBuf::maxSize; // absolute best
+        space = min(minSpace, maxSpace); // do not promise more than asked
+    }
+
+#if USE_ADAPTATION
+    if (responseBodyBuffer) {
+        return 0;	// Stop reading if already overflowed waiting for ICAP to catch up
+    }
+
+    if (virginBodyDestination != NULL) {
+        /*
+         * BodyPipe buffer has a finite size limit.  We
+         * should not read more data from the network than will fit
+         * into the pipe buffer or we _lose_ what did not fit if
+         * the response ends sooner that BodyPipe frees up space:
+         * There is no code to keep pumping data into the pipe once
+         * response ends and serverComplete() is called.
+         *
+         * If the pipe is totally full, don't register the read handler.
+         * The BodyPipe will call our noteMoreBodySpaceAvailable() method
+         * when it has free space again.
+         */
+        size_t adaptation_space =
+            virginBodyDestination->buf().potentialSpaceSize();
+
+        debugs(11,9, "Client may read up to min(" <<
+               adaptation_space << ", " << space << ") bytes");
+
+        if (adaptation_space < space)
+            space = adaptation_space;
+    }
+#endif
+
+    return space;
+}
+
+size_t
+Client::replyBodySpace(const MemBuf &readBuf, const size_t minSpace) const
 {
     size_t space = readBuf.spaceSize(); // available space w/o heroic measures
     if (space < minSpace) {
         const size_t maxSpace = readBuf.potentialSpaceSize(); // absolute best
         space = min(minSpace, maxSpace); // do not promise more than asked
     }
 
 #if USE_ADAPTATION
     if (responseBodyBuffer) {
         return 0;	// Stop reading if already overflowed waiting for ICAP to catch up
     }
 
     if (virginBodyDestination != NULL) {
         /*
          * BodyPipe buffer has a finite size limit.  We
          * should not read more data from the network than will fit
          * into the pipe buffer or we _lose_ what did not fit if
          * the response ends sooner that BodyPipe frees up space:
          * There is no code to keep pumping data into the pipe once
          * response ends and serverComplete() is called.

=== modified file 'src/clients/Client.h'
--- src/clients/Client.h	2014-09-22 19:06:19 +0000
+++ src/clients/Client.h	2014-11-11 13:39:43 +0000
@@ -126,41 +126,44 @@
     void handleAdaptationAborted(bool bypassable = false);
 
     /// called by StoreEntry when it has more buffer space available
     void resumeBodyStorage();
     /// called when the entire adapted response body is consumed
     void endAdaptedBodyConsumption();
 #endif
 
 protected:
     const HttpReply *virginReply() const;
     HttpReply *virginReply();
     HttpReply *setVirginReply(HttpReply *r);
 
     HttpReply *finalReply();
     HttpReply *setFinalReply(HttpReply *r);
 
     // Kids use these to stuff data into the response instead of messing with the entry directly
     void adaptOrFinalizeReply();
     void addVirginReplyBody(const char *buf, ssize_t len);
     void storeReplyBody(const char *buf, ssize_t len);
+    /// \deprecated use SBuf I/O API and needBufferSpace() instead
     size_t replyBodySpace(const MemBuf &readBuf, const size_t minSpace) const;
+    /// determine how much space the buffer needs to reserve
+    size_t needBufferSpace(const SBuf &readBuf, const size_t minSpace) const;
 
     void adjustBodyBytesRead(const int64_t delta);
 
     // These should be private
     int64_t currentOffset;	/**< Our current offset in the StoreEntry */
     MemBuf *responseBodyBuffer;	/**< Data temporarily buffered for ICAP */
 
 public: // should not be
     StoreEntry *entry;
     FwdState::Pointer fwd;
     HttpRequest *request;
 
 protected:
     BodyPipe::Pointer requestBodySource;  /**< to consume request body */
     AsyncCall::Pointer requestSender;     /**< set if we are expecting Comm::Write to call us back */
 
 #if USE_ADAPTATION
     BodyPipe::Pointer virginBodyDestination;  /**< to provide virgin response body */
     CbcPointer<Adaptation::Initiate> adaptedHeadSource;  /**< to get adapted response headers */
     BodyPipe::Pointer adaptedBodySource;      /**< to consume adated response body */

=== modified file 'src/comm/Read.cc'
--- src/comm/Read.cc	2014-09-13 13:59:43 +0000
+++ src/comm/Read.cc	2014-11-11 05:56:19 +0000
@@ -65,41 +65,43 @@
     // Make sure we are either not reading or just passively monitoring.
     // Active/passive conflicts are OK and simply cancel passive monitoring.
     if (ccb->active()) {
         // if the assertion below fails, we have an active comm_read conflict
         assert(fd_table[conn->fd].halfClosedReader != NULL);
         commStopHalfClosedMonitor(conn->fd);
         assert(!ccb->active());
     }
     ccb->conn = conn;
 
     /* Queue the read */
     ccb->setCallback(Comm::IOCB_READ, callback, (char *)buf, NULL, size);
     Comm::SetSelect(conn->fd, COMM_SELECT_READ, Comm::HandleRead, ccb, 0);
 }
 
 Comm::Flag
 Comm::ReadNow(CommIoCbParams &params, SBuf &buf)
 {
     /* Attempt a read */
     ++ statCounter.syscalls.sock.reads;
-    const SBuf::size_type sz = buf.spaceSize();
+    SBuf::size_type sz = buf.spaceSize();
+    if (params.size > 0 && params.size < sz)
+        sz = params.size;
     char *inbuf = buf.rawSpace(sz);
     errno = 0;
     const int retval = FD_READ_METHOD(params.conn->fd, inbuf, sz);
     params.xerrno = errno;
 
     debugs(5, 3, params.conn << ", size " << sz << ", retval " << retval << ", errno " << params.xerrno);
 
     if (retval > 0) { // data read most common case
         buf.append(inbuf, retval);
         fd_bytes(params.conn->fd, retval, FD_READ);
         params.flag = Comm::OK;
         params.size = retval;
 
     } else if (retval == 0) { // remote closure (somewhat less) common
         // Note - read 0 == socket EOF, which is a valid read.
         params.flag = Comm::ENDFILE;
 
     } else if (retval < 0) { // connection errors are worst-case
         debugs(5, 3, params.conn << " Comm::COMM_ERROR: " << xstrerr(params.xerrno));
         if (ignoreErrno(params.xerrno))

=== modified file 'src/comm/Read.h'
--- src/comm/Read.h	2014-09-13 13:59:43 +0000
+++ src/comm/Read.h	2014-11-11 05:58:21 +0000
@@ -15,40 +15,43 @@
 
 class SBuf;
 
 namespace Comm
 {
 
 /**
  * Start monitoring for read.
  *
  * callback is scheduled when the read is possible,
  * or on file descriptor close.
  */
 void Read(const Comm::ConnectionPointer &conn, AsyncCall::Pointer &callback);
 
 /// whether the FD socket is being monitored for read
 bool MonitorsRead(int fd);
 
 /**
  * Perform a read(2) on a connection immediately.
  *
+ * If params.size is non-zero will limit size of the read to either
+ * the buffer free space or params.size, whichever is smallest.
+ *
  * The returned flag is also placed in params.flag.
  *
  * \retval Comm::OK          data has been read and placed in buf, amount in params.size
  * \retval Comm::COMM_ERROR  an error occured, the code is placed in params.xerrno
  * \retval Comm::INPROGRESS  unable to read at this time, or a minor error occured
  * \retval Comm::ENDFILE     0-byte read has occured.
  *                           Usually indicates the remote end has disconnected.
  */
 Comm::Flag ReadNow(CommIoCbParams &params, SBuf &buf);
 
 /// Cancel the read pending on FD. No action if none pending.
 void ReadCancel(int fd, AsyncCall::Pointer &callback);
 
 /// callback handler to process an FD which is available for reading
 extern PF HandleRead;
 
 } // namespace Comm
 
 // Legacy API to be removed
 void comm_read_base(const Comm::ConnectionPointer &conn, char *buf, int len, AsyncCall::Pointer &callback);

=== modified file 'src/http.cc'
--- src/http.cc	2014-11-10 11:45:36 +0000
+++ src/http.cc	2014-11-13 12:13:27 +0000
@@ -5,60 +5,62 @@
  * contributions from numerous individuals and organizations.
  * Please see the COPYING and CONTRIBUTORS files for details.
  */
 
 /* DEBUG: section 11    Hypertext Transfer Protocol (HTTP) */
 
 /*
  * Anonymizing patch by lutz at as-node.jena.thur.de
  * have a look into http-anon.c to get more informations.
  */
 
 #include "squid.h"
 #include "acl/FilledChecklist.h"
 #include "base/AsyncJobCalls.h"
 #include "base/TextException.h"
 #include "base64.h"
 #include "CachePeer.h"
 #include "ChunkedCodingParser.h"
 #include "client_side.h"
 #include "comm/Connection.h"
+#include "comm/Read.h"
 #include "comm/Write.h"
+#include "CommRead.h"
 #include "err_detail_type.h"
 #include "errorpage.h"
 #include "fd.h"
 #include "fde.h"
 #include "globals.h"
 #include "http.h"
+#include "http/one/ResponseParser.h"
 #include "HttpControlMsg.h"
 #include "HttpHdrCc.h"
 #include "HttpHdrContRange.h"
 #include "HttpHdrSc.h"
 #include "HttpHdrScTarget.h"
 #include "HttpHeaderTools.h"
 #include "HttpReply.h"
 #include "HttpRequest.h"
 #include "HttpStateFlags.h"
 #include "log/access_log.h"
 #include "MemBuf.h"
 #include "MemObject.h"
-#include "mime_header.h"
 #include "neighbors.h"
 #include "peer_proxy_negotiate_auth.h"
 #include "profiler/Profiler.h"
 #include "refresh.h"
 #include "RefreshPattern.h"
 #include "rfc1738.h"
 #include "SquidConfig.h"
 #include "SquidTime.h"
 #include "StatCounters.h"
 #include "Store.h"
 #include "StrList.h"
 #include "tools.h"
 #include "URL.h"
 
 #if USE_AUTH
 #include "auth/UserRequest.h"
 #endif
 #if USE_DELAY_POOLS
 #include "DelayPools.h"
 #endif
@@ -73,84 +75,78 @@
     }
 
 CBDATA_CLASS_INIT(HttpStateData);
 
 static const char *const crlf = "\r\n";
 
 static void httpMaybeRemovePublic(StoreEntry *, Http::StatusCode);
 static void copyOneHeaderFromClientsideRequestToUpstreamRequest(const HttpHeaderEntry *e, const String strConnection, const HttpRequest * request,
         HttpHeader * hdr_out, const int we_do_ranges, const HttpStateFlags &);
 //Declared in HttpHeaderTools.cc
 void httpHdrAdd(HttpHeader *heads, HttpRequest *request, const AccessLogEntryPointer &al, HeaderWithAclList &headers_add);
 
 HttpStateData::HttpStateData(FwdState *theFwdState) : AsyncJob("HttpStateData"), Client(theFwdState),
         lastChunk(0), header_bytes_read(0), reply_bytes_read(0),
         body_bytes_truncated(0), httpChunkDecoder(NULL)
 {
     debugs(11,5,HERE << "HttpStateData " << this << " created");
     ignoreCacheControl = false;
     surrogateNoStore = false;
     serverConnection = fwd->serverConnection();
-    readBuf = new MemBuf;
-    readBuf->init(16*1024, 256*1024);
+    inBuf.reserveSpace(16*1024);
 
     // reset peer response time stats for %<pt
     request->hier.peer_http_request_sent.tv_sec = 0;
     request->hier.peer_http_request_sent.tv_usec = 0;
 
     if (fwd->serverConnection() != NULL)
         _peer = cbdataReference(fwd->serverConnection()->getPeer());         /* might be NULL */
 
     if (_peer) {
         request->flags.proxying = true;
         /*
          * This NEIGHBOR_PROXY_ONLY check probably shouldn't be here.
          * We might end up getting the object from somewhere else if,
          * for example, the request to this neighbor fails.
          */
         if (_peer->options.proxy_only)
             entry->releaseRequest();
 
 #if USE_DELAY_POOLS
         entry->setNoDelay(_peer->options.no_delay);
 #endif
     }
 
     /*
      * register the handler to free HTTP state data when the FD closes
      */
     typedef CommCbMemFunT<HttpStateData, CommCloseCbParams> Dialer;
     closeHandler = JobCallback(9, 5, Dialer, this, HttpStateData::httpStateConnClosed);
     comm_add_close_handler(serverConnection->fd, closeHandler);
 }
 
 HttpStateData::~HttpStateData()
 {
     /*
      * don't forget that ~Client() gets called automatically
      */
 
-    if (!readBuf->isNull())
-        readBuf->clean();
-
-    delete readBuf;
-
     if (httpChunkDecoder)
         delete httpChunkDecoder;
 
     cbdataReferenceDone(_peer);
 
     debugs(11,5, HERE << "HttpStateData " << this << " destroyed; " << serverConnection);
 }
 
 const Comm::ConnectionPointer &
 HttpStateData::dataConnection() const
 {
     return serverConnection;
 }
 
 void
 HttpStateData::httpStateConnClosed(const CommCloseCbParams &params)
 {
     debugs(11, 5, "httpStateFree: FD " << params.fd << ", httpState=" << params.data);
     mustStop("HttpStateData::httpStateConnClosed");
 }
@@ -677,87 +673,118 @@
 }
 
 /**
  * This creates the error page itself.. its likely
  * that the forward ported reply header max size patch
  * generates non http conformant error pages - in which
  * case the errors where should be 'BAD_GATEWAY' etc
  */
 void
 HttpStateData::processReplyHeader()
 {
     /** Creates a blank header. If this routine is made incremental, this will not do */
 
     /* NP: all exit points to this function MUST call ctx_exit(ctx) */
     Ctx ctx = ctx_enter(entry->mem_obj->urlXXX());
 
     debugs(11, 3, "processReplyHeader: key '" << entry->getMD5Text() << "'");
 
     assert(!flags.headers_parsed);
 
-    if (!readBuf->hasContent()) {
+    if (!inBuf.length()) {
         ctx_exit(ctx);
         return;
     }
 
-    Http::StatusCode error = Http::scNone;
+    /* Attempt to parse the first line; this will define where the protocol, status, reason-phrase and header begin */
+    {
+        if (hp == NULL)
+            hp = new Http1::ResponseParser;
+
+        bool parsedOk = hp->parse(inBuf);
+
+        // sync the buffers after parsing.
+        inBuf = hp->remaining();
+
+        if (hp->needsMoreData()) {
+            if (eof) { // no more data coming
+                /* Bug 2879: Replies may terminate with \r\n then EOF instead of \r\n\r\n.
+                 * We also may receive truncated responses.
+                 * Ensure here that we have at minimum two \r\n when EOF is seen.
+                 */
+                inBuf.append("\r\n\r\n", 4);
+                // retry the parse
+                parsedOk = hp->parse(inBuf);
+                // sync the buffers after parsing.
+                inBuf = hp->remaining();
+            } else {
+                debugs(33, 5, "Incomplete response, waiting for end of response headers");
+                ctx_exit(ctx);
+                return;
+            }
+        }
 
-    HttpReply *newrep = new HttpReply;
-    const bool parsed = newrep->parse(readBuf, eof, &error);
+        flags.headers_parsed = true;
 
-    if (!parsed && readBuf->contentSize() > 5 && strncmp(readBuf->content(), "HTTP/", 5) != 0 && strncmp(readBuf->content(), "ICY", 3) != 0) {
-        MemBuf *mb;
-        HttpReply *tmprep = new HttpReply;
-        tmprep->setHeaders(Http::scOkay, "Gatewaying", NULL, -1, -1, -1);
-        tmprep->header.putExt("X-Transformed-From", "HTTP/0.9");
-        mb = tmprep->pack();
-        newrep->parse(mb, eof, &error);
-        delete mb;
-        delete tmprep;
-    } else {
-        if (!parsed && error > 0) { // unrecoverable parsing error
-            debugs(11, 3, "processReplyHeader: Non-HTTP-compliant header: '" <<  readBuf->content() << "'");
-            flags.headers_parsed = true;
-            // XXX: when sanityCheck is gone and Http::StatusLine is used to parse,
-            //   the sline should be already set the appropriate values during that parser stage
-            newrep->sline.set(Http::ProtocolVersion(1,1), error);
+        if (!parsedOk) {
+            // unrecoverable parsing error
+            debugs(11, 3, "Non-HTTP-compliant header:\n---------\n" << inBuf << "\n----------");
+            HttpReply *newrep = new HttpReply;
+            newrep->sline.set(Http::ProtocolVersion(1,1), hp->messageStatus());
             HttpReply *vrep = setVirginReply(newrep);
             entry->replaceHttpReply(vrep);
+            // XXX: close the server connection ?
             ctx_exit(ctx);
             return;
         }
+    }
 
-        if (!parsed) { // need more data
-            assert(!error);
-            assert(!eof);
-            delete newrep;
-            ctx_exit(ctx);
-            return;
-        }
+    /* We know the whole response is in parser now */
+    debugs(11, 2, "HTTP Server " << serverConnection);
+    debugs(11, 2, "HTTP Server RESPONSE:\n---------\n" <<
+           hp->messageProtocol() << " " << hp->messageStatus() << " " << hp->reasonPhrase() << "\n" <<
+           hp->mimeHeader() <<
+           "\n----------");
 
-        debugs(11, 2, "HTTP Server " << serverConnection);
-        debugs(11, 2, "HTTP Server REPLY:\n---------\n" << readBuf->content() << "\n----------");
+    header_bytes_read = hp->messageHeaderSize();
 
-        header_bytes_read = headersEnd(readBuf->content(), readBuf->contentSize());
-        readBuf->consume(header_bytes_read);
+    HttpReply *newrep = new HttpReply;
+    // XXX: performance regression, c_str() reallocates.
+    newrep->setHeaders(hp->messageStatus(), hp->reasonPhrase().c_str(), NULL, -1, -1, -1);
+    newrep->sline.protocol = newrep->sline.version.protocol = hp->messageProtocol().protocol;
+    newrep->sline.version.major = hp->messageProtocol().major;
+    newrep->sline.version.minor = hp->messageProtocol().minor;
+
+    // parse headers
+    newrep->pstate = psReadyToParseHeaders;
+    if (newrep->httpMsgParseStep(hp->mimeHeader().rawContent(), hp->mimeHeader().length(), true) < 0) {
+        // XXX: when Http::ProtocolVersion is a function, remove this hack. just set with messageProtocol()
+        newrep->sline.set(Http::ProtocolVersion(), Http::scInvalidHeader);
+        newrep->sline.version.protocol = hp->messageProtocol().protocol;
+        newrep->sline.version.major = hp->messageProtocol().major;
+        newrep->sline.version.minor = hp->messageProtocol().minor;
+        debugs(11, 2, "error parsing response headers mime block");
     }
 
+    // done with Parser, now process using the HttpReply
+    hp = NULL;
+
     newrep->removeStaleWarnings();
 
     if (newrep->sline.protocol == AnyP::PROTO_HTTP && newrep->sline.status() >= 100 && newrep->sline.status() < 200) {
         handle1xx(newrep);
         ctx_exit(ctx);
         return;
     }
 
     flags.chunked = false;
     if (newrep->sline.protocol == AnyP::PROTO_HTTP && newrep->header.chunked()) {
         flags.chunked = true;
         httpChunkDecoder = new ChunkedCodingParser;
     }
 
     if (!peerSupportsConnectionPinning())
         request->flags.connectionAuthDisabled = true;
 
     HttpReply *vrep = setVirginReply(newrep);
     flags.headers_parsed = true;
 
@@ -1083,134 +1110,157 @@
      * If the body size is known, we must wait until we've gotten all of it. */
     if (clen > 0) {
         // old technique:
         // if (entry->mem_obj->endOffset() < vrep->content_length + vrep->hdr_sz)
         const int64_t body_bytes_read = reply_bytes_read - header_bytes_read;
         debugs(11,5, "persistentConnStatus: body_bytes_read=" <<
                body_bytes_read << " content_length=" << vrep->content_length);
 
         if (body_bytes_read < vrep->content_length)
             return INCOMPLETE_MSG;
 
         if (body_bytes_truncated > 0) // already read more than needed
             return COMPLETE_NONPERSISTENT_MSG; // disable pconns
     }
 
     /** \par
      * If there is no message body or we got it all, we can be persistent */
     return statusIfComplete();
 }
 
-/* XXX this function is too long! */
+#if USE_DELAY_POOLS
+static void
+readDelayed(void *context, CommRead const &)
+{
+    HttpStateData *state = static_cast<HttpStateData*>(context);
+    state->maybeReadVirginBody();
+}
+#endif
+
 void
 HttpStateData::readReply(const CommIoCbParams &io)
 {
-    int bin;
-    int clen;
-    int len = io.size;
-
     flags.do_next_read = false;
 
-    debugs(11, 5, HERE << io.conn << ": len " << len << ".");
+    debugs(11, 5, io.conn);
 
     // Bail out early on Comm::ERR_CLOSING - close handlers will tidy up for us
     if (io.flag == Comm::ERR_CLOSING) {
         debugs(11, 3, "http socket closing");
         return;
     }
 
     if (EBIT_TEST(entry->flags, ENTRY_ABORTED)) {
         abortTransaction("store entry aborted while reading reply");
         return;
     }
 
-    // handle I/O errors
-    if (io.flag != Comm::OK || len < 0) {
-        debugs(11, 2, HERE << io.conn << ": read failure: " << xstrerror() << ".");
+    assert(Comm::IsConnOpen(serverConnection));
+    assert(io.conn->fd == serverConnection->fd);
 
-        if (ignoreErrno(io.xerrno)) {
-            flags.do_next_read = true;
-        } else {
-            ErrorState *err = new ErrorState(ERR_READ_ERROR, Http::scBadGateway, fwd->request);
-            err->xerrno = io.xerrno;
-            fwd->fail(err);
-            flags.do_next_read = false;
-            serverConnection->close();
+    /*
+     * Don't reset the timeout value here. The value should be
+     * counting Config.Timeout.request and applies to the request
+     * as a whole, not individual read() calls.
+     * Plus, it breaks our lame *HalfClosed() detection
+     */
+
+    CommIoCbParams rd(this); // will be expanded with ReadNow results
+    rd.conn = io.conn;
+    rd.size = entry->bytesWanted(Range<size_t>(0, inBuf.spaceSize()));
+#if USE_DELAY_POOLS
+    if (rd.size < 1) {
+        assert(entry->mem_obj);
+
+        /* read ahead limit */
+        /* Perhaps these two calls should both live in MemObject */
+        AsyncCall::Pointer nilCall;
+        if (!entry->mem_obj->readAheadPolicyCanRead()) {
+            entry->mem_obj->delayRead(DeferredRead(readDelayed, this, CommRead(io.conn, NULL, 0, nilCall)));
+            return;
         }
 
+        /* delay id limit */
+        entry->mem_obj->mostBytesAllowed().delayRead(DeferredRead(readDelayed, this, CommRead(io.conn, NULL, 0, nilCall)));
         return;
     }
+#endif
 
-    // update I/O stats
-    if (len > 0) {
-        readBuf->appended(len);
-        reply_bytes_read += len;
+    switch (Comm::ReadNow(rd, inBuf)) {
+    case Comm::INPROGRESS:
+        if (inBuf.isEmpty())
+            debugs(33, 2, io.conn << ": no data to process, " << xstrerr(rd.xerrno));
+        maybeReadVirginBody();
+        return;
+
+    case Comm::OK:
+    {
+        reply_bytes_read += rd.size;
 #if USE_DELAY_POOLS
         DelayId delayId = entry->mem_obj->mostBytesAllowed();
-        delayId.bytesIn(len);
+        delayId.bytesIn(rd.size);
 #endif
 
-        kb_incr(&(statCounter.server.all.kbytes_in), len);
-        kb_incr(&(statCounter.server.http.kbytes_in), len);
+        kb_incr(&(statCounter.server.all.kbytes_in), rd.size);
+        kb_incr(&(statCounter.server.http.kbytes_in), rd.size);
         ++ IOStats.Http.reads;
 
-        for (clen = len - 1, bin = 0; clen; ++bin)
+        int bin = 0;
+        for (int clen = rd.size - 1; clen; ++bin)
             clen >>= 1;
 
         ++ IOStats.Http.read_hist[bin];
 
         // update peer response time stats (%<pt)
         const timeval &sent = request->hier.peer_http_request_sent;
         request->hier.peer_response_time =
             sent.tv_sec ? tvSubMsec(sent, current_time) : -1;
     }
 
-    /** \par
-     * Here the RFC says we should ignore whitespace between replies, but we can't as
-     * doing so breaks HTTP/0.9 replies beginning with witespace, and in addition
-     * the response splitting countermeasures is extremely likely to trigger on this,
-     * not allowing connection reuse in the first place.
-     *
-     * 2012-02-10: which RFC? not 2068 or 2616,
-     *     tolerance there is all about whitespace between requests and header tokens.
-     */
+        /* Continue to process previously read data */
+        break;
 
-    if (len == 0) { // reached EOF?
+    case Comm::ENDFILE: // close detected by 0-byte read
         eof = 1;
         flags.do_next_read = false;
 
-        /* Bug 2879: Replies may terminate with \r\n then EOF instead of \r\n\r\n
-         * Ensure here that we have at minimum two \r\n when EOF is seen.
-         * TODO: Add eof parameter to headersEnd() and move this hack there.
-         */
-        if (readBuf->contentSize() && !flags.headers_parsed) {
-            /*
-             * Yes Henrik, there is a point to doing this.  When we
-             * called httpProcessReplyHeader() before, we didn't find
-             * the end of headers, but now we are definately at EOF, so
-             * we want to process the reply headers.
-             */
-            /* Fake an "end-of-headers" to work around such broken servers */
-            readBuf->append("\r\n", 2);
+        /* Continue to process previously read data */
+        break;
+
+        // case Comm::COMM_ERROR:
+    default: // no other flags should ever occur
+        debugs(11, 2, io.conn << ": read failure: " << xstrerr(rd.xerrno));
+
+        if (ignoreErrno(rd.xerrno)) {
+            flags.do_next_read = true;
+        } else {
+            ErrorState *err = new ErrorState(ERR_READ_ERROR, Http::scBadGateway, fwd->request);
+            err->xerrno = rd.xerrno;
+            fwd->fail(err);
+            flags.do_next_read = false;
+            io.conn->close();
         }
+
+        return;
     }
 
+    /* Process next response from buffer */
     processReply();
 }
 
 /// processes the already read and buffered response data, possibly after
 /// waiting for asynchronous 1xx control message processing
 void
 HttpStateData::processReply()
 {
 
     if (flags.handling1xx) { // we came back after handling a 1xx response
         debugs(11, 5, HERE << "done with 1xx handling");
         flags.handling1xx = false;
         Must(!flags.headers_parsed);
     }
 
     if (!flags.headers_parsed) { // have not parsed headers yet?
         PROF_start(HttpStateData_processReplyHeader);
         processReplyHeader();
         PROF_stop(HttpStateData_processReplyHeader);
 
@@ -1222,145 +1272,151 @@
 
     // kick more reads if needed and/or process the response body, if any
     PROF_start(HttpStateData_processReplyBody);
     processReplyBody(); // may call serverComplete()
     PROF_stop(HttpStateData_processReplyBody);
 }
 
 /**
  \retval true    if we can continue with processing the body or doing ICAP.
  */
 bool
 HttpStateData::continueAfterParsingHeader()
 {
     if (flags.handling1xx) {
         debugs(11, 5, HERE << "wait for 1xx handling");
         Must(!flags.headers_parsed);
         return false;
     }
 
     if (!flags.headers_parsed && !eof) {
-        debugs(11, 9, HERE << "needs more at " << readBuf->contentSize());
+        debugs(11, 9, "needs more at " << inBuf.length());
         flags.do_next_read = true;
         /** \retval false If we have not finished parsing the headers and may get more data.
          *                Schedules more reads to retrieve the missing data.
          */
         maybeReadVirginBody(); // schedules all kinds of reads; TODO: rename
         return false;
     }
 
     /** If we are done with parsing, check for errors */
 
     err_type error = ERR_NONE;
 
     if (flags.headers_parsed) { // parsed headers, possibly with errors
         // check for header parsing errors
         if (HttpReply *vrep = virginReply()) {
             const Http::StatusCode s = vrep->sline.status();
             const Http::ProtocolVersion &v = vrep->sline.version;
             if (s == Http::scInvalidHeader && v != Http::ProtocolVersion(0,9)) {
                 debugs(11, DBG_IMPORTANT, "WARNING: HTTP: Invalid Response: Bad header encountered from " << entry->url() << " AKA " << request->GetHost() << request->urlpath.termedBuf() );
                 error = ERR_INVALID_RESP;
             } else if (s == Http::scHeaderTooLarge) {
                 fwd->dontRetry(true);
                 error = ERR_TOO_BIG;
             } else {
                 return true; // done parsing, got reply, and no error
             }
         } else {
             // parsed headers but got no reply
             debugs(11, DBG_IMPORTANT, "WARNING: HTTP: Invalid Response: No reply at all for " << entry->url() << " AKA " << request->GetHost() << request->urlpath.termedBuf() );
             error = ERR_INVALID_RESP;
         }
     } else {
         assert(eof);
-        if (readBuf->hasContent()) {
+        if (inBuf.length()) {
             error = ERR_INVALID_RESP;
             debugs(11, DBG_IMPORTANT, "WARNING: HTTP: Invalid Response: Headers did not parse at all for " << entry->url() << " AKA " << request->GetHost() << request->urlpath.termedBuf() );
         } else {
             error = ERR_ZERO_SIZE_OBJECT;
             debugs(11, (request->flags.accelerated?DBG_IMPORTANT:2), "WARNING: HTTP: Invalid Response: No object data received for " <<
                    entry->url() << " AKA " << request->GetHost() << request->urlpath.termedBuf() );
         }
     }
 
     assert(error != ERR_NONE);
     entry->reset();
     fwd->fail(new ErrorState(error, Http::scBadGateway, fwd->request));
     flags.do_next_read = false;
     serverConnection->close();
     return false; // quit on error
 }
 
 /** truncate what we read if we read too much so that writeReplyBody()
     writes no more than what we should have read */
 void
 HttpStateData::truncateVirginBody()
 {
     assert(flags.headers_parsed);
 
     HttpReply *vrep = virginReply();
     int64_t clen = -1;
     if (!vrep->expectingBody(request->method, clen) || clen < 0)
         return; // no body or a body of unknown size, including chunked
 
     const int64_t body_bytes_read = reply_bytes_read - header_bytes_read;
     if (body_bytes_read - body_bytes_truncated <= clen)
         return; // we did not read too much or already took care of the extras
 
     if (const int64_t extras = body_bytes_read - body_bytes_truncated - clen) {
         // server sent more that the advertised content length
         debugs(11,5, HERE << "body_bytes_read=" << body_bytes_read <<
                " clen=" << clen << '/' << vrep->content_length <<
                " body_bytes_truncated=" << body_bytes_truncated << '+' << extras);
 
-        readBuf->truncate(extras);
+        inBuf.chop(0, inBuf.length() - extras);
         body_bytes_truncated += extras;
     }
 }
 
 /**
  * Call this when there is data from the origin server
  * which should be sent to either StoreEntry, or to ICAP...
  */
 void
 HttpStateData::writeReplyBody()
 {
     truncateVirginBody(); // if needed
-    const char *data = readBuf->content();
-    int len = readBuf->contentSize();
+    const char *data = inBuf.rawContent();
+    int len = inBuf.length();
     addVirginReplyBody(data, len);
-    readBuf->consume(len);
+    inBuf.consume(len);
 }
 
 bool
 HttpStateData::decodeAndWriteReplyBody()
 {
     const char *data = NULL;
     int len;
     bool wasThereAnException = false;
     assert(flags.chunked);
     assert(httpChunkDecoder);
     SQUID_ENTER_THROWING_CODE();
     MemBuf decodedData;
     decodedData.init();
-    const bool doneParsing = httpChunkDecoder->parse(readBuf,&decodedData);
+    // XXX: performance regression. SBuf-convert (or Parser-convert?) the chunked decoder.
+    MemBuf encodedData;
+    encodedData.init();
+    // NP: we must do this instead of pointing encodedData at the SBuf::rawContent
+    // because chunked decoder uses MemBuf::consume, which shuffles buffer bytes around.
+    encodedData.append(inBuf.rawContent(), inBuf.length());
+    const bool doneParsing = httpChunkDecoder->parse(&encodedData,&decodedData);
     len = decodedData.contentSize();
     data=decodedData.content();
     addVirginReplyBody(data, len);
     if (doneParsing) {
         lastChunk = 1;
         flags.do_next_read = false;
     }
     SQUID_EXIT_THROWING_CODE(wasThereAnException);
     return wasThereAnException;
 }
 
 /**
  * processReplyBody has two purposes:
  *  1 - take the reply body data, if any, and put it into either
  *      the StoreEntry, or give it over to ICAP.
  *  2 - see if we made it to the end of the response (persistent
  *      connections and such)
  */
 void
 HttpStateData::processReplyBody()
@@ -1457,62 +1513,57 @@
 }
 
 bool
 HttpStateData::mayReadVirginReplyBody() const
 {
     // TODO: Be more precise here. For example, if/when reading trailer, we may
     // not be doneWithServer() yet, but we should return false. Similarly, we
     // could still be writing the request body after receiving the whole reply.
     return !doneWithServer();
 }
 
 void
 HttpStateData::maybeReadVirginBody()
 {
     // too late to read
     if (!Comm::IsConnOpen(serverConnection) || fd_table[serverConnection->fd].closing())
         return;
 
     // we may need to grow the buffer if headers do not fit
     const int minRead = flags.headers_parsed ? 0 :1024;
-    const int read_size = replyBodySpace(*readBuf, minRead);
+    const int read_size = needBufferSpace(inBuf, minRead);
 
-    debugs(11,9, HERE << (flags.do_next_read ? "may" : "wont") <<
+    debugs(11,9, (flags.do_next_read ? "may" : "wont") <<
            " read up to " << read_size << " bytes from " << serverConnection);
 
-    /*
-     * why <2? Because delayAwareRead() won't actually read if
-     * you ask it to read 1 byte.  The delayed read request
-     * just gets re-queued until the client side drains, then
-     * the I/O thread hangs.  Better to not register any read
-     * handler until we get a notification from someone that
-     * its okay to read again.
-     */
-    if (read_size < 2)
+    if (!flags.do_next_read)
         return;
 
-    if (flags.do_next_read) {
-        flags.do_next_read = false;
-        typedef CommCbMemFunT<HttpStateData, CommIoCbParams> Dialer;
-        entry->delayAwareRead(serverConnection, readBuf->space(read_size), read_size,
-                              JobCallback(11, 5, Dialer, this,  HttpStateData::readReply));
-    }
+    flags.do_next_read = false;
+
+    // must not already be waiting for read(2) ...
+    assert(!Comm::MonitorsRead(serverConnection->fd));
+
+    // wait for read(2) to be possible.
+    typedef CommCbMemFunT<HttpStateData, CommIoCbParams> Dialer;
+    AsyncCall::Pointer call = JobCallback(11, 5, Dialer, this, HttpStateData::readReply);
+    Comm::Read(serverConnection, call);
 }
 
 /// called after writing the very last request byte (body, last-chunk, etc)
 void
 HttpStateData::wroteLast(const CommIoCbParams &io)
 {
     debugs(11, 5, HERE << serverConnection << ": size " << io.size << ": errflag " << io.flag << ".");
 #if URL_CHECKSUM_DEBUG
 
     entry->mem_obj->checkUrlChecksum();
 #endif
 
     if (io.size > 0) {
         fd_bytes(io.fd, io.size, FD_WRITE);
         kb_incr(&(statCounter.server.all.kbytes_out), io.size);
         kb_incr(&(statCounter.server.http.kbytes_out), io.size);
     }
 
     if (io.flag == Comm::ERR_CLOSING)
         return;

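Reviewer note on the truncateVirginBody() hunk above: the switch from MemBuf::truncate(extras) to SBuf::chop(0, length - extras) keeps the same arithmetic, which drops surplus bytes from the tail of the buffer once more body bytes have arrived than Content-Length advertised. Below is a minimal standalone sketch of that accounting. It uses std::string as a stand-in for SBuf, and the BodyCounters struct and truncateExtras() helper are hypothetical names made up for illustration, not Squid API. The clamp against the buffer size is a sketch-only safety measure that the patch itself does not have.

```cpp
#include <algorithm>
#include <cstdint>
#include <string>

// Hypothetical stand-in for the relevant HttpStateData counters.
struct BodyCounters {
    int64_t replyBytesRead = 0;     // all bytes read from the server so far
    int64_t headerBytesRead = 0;    // bytes that belonged to the message header
    int64_t bodyBytesTruncated = 0; // surplus body bytes already dropped
};

// Drop bytes beyond the advertised content length from the tail of buf.
// Returns the number of bytes dropped by this call.
inline int64_t truncateExtras(std::string &buf, BodyCounters &c, int64_t contentLength)
{
    if (contentLength < 0)
        return 0; // body of unknown size (e.g. chunked): nothing to truncate

    const int64_t bodyBytesRead = c.replyBytesRead - c.headerBytesRead;
    const int64_t extras = bodyBytesRead - c.bodyBytesTruncated - contentLength;
    if (extras <= 0)
        return 0; // did not read too much, or already took care of the extras

    // Sketch-only clamp so the resize below can never underflow.
    const int64_t drop = std::min<int64_t>(extras, static_cast<int64_t>(buf.size()));

    // Equivalent of inBuf.chop(0, inBuf.length() - extras): keep the front part.
    buf.resize(buf.size() - static_cast<std::string::size_type>(drop));
    c.bodyBytesTruncated += drop;
    return drop;
}
```

With 12 buffered body bytes and a Content-Length of 10, the first call drops 2 bytes and a repeated call drops none, because body_bytes_truncated already accounts for them.
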
=== modified file 'src/http.h'
--- src/http.h	2014-11-02 00:10:01 +0000
+++ src/http.h	2014-11-12 11:40:19 +0000
@@ -33,41 +33,41 @@
 
     virtual const Comm::ConnectionPointer & dataConnection() const;
     /* should be private */
     bool sendRequest();
     void processReplyHeader();
     void processReplyBody();
     void readReply(const CommIoCbParams &io);
     virtual void maybeReadVirginBody(); // read response data from the network
 
     // Determine whether the response is a cacheable representation
     int cacheableReply();
 
     CachePeer *_peer;		/* CachePeer request made to */
     int eof;			/* reached end-of-object? */
     int lastChunk;		/* reached last chunk of a chunk-encoded reply */
     HttpStateFlags flags;
     size_t read_sz;
     int header_bytes_read;	// to find end of response,
     int64_t reply_bytes_read;	// without relying on StoreEntry
     int body_bytes_truncated; // positive when we read more than we wanted
-    MemBuf *readBuf;
+    SBuf inBuf;                ///< I/O buffer for receiving server responses
     bool ignoreCacheControl;
     bool surrogateNoStore;
 
     void processSurrogateControl(HttpReply *);
 
 protected:
     void processReply();
     void proceedAfter1xx();
     void handle1xx(HttpReply *msg);
 
 private:
     /**
      * The current server connection.
      * Maybe open, closed, or NULL.
      * Use doneWithServer() to check if the server is available for use.
      */
     Comm::ConnectionPointer serverConnection;
     AsyncCall::Pointer closeHandler;
     enum ConnectionStatus {
         INCOMPLETE_MSG,
@@ -93,28 +93,30 @@
     // consuming request body
     virtual void handleMoreRequestBodyAvailable();
     virtual void handleRequestBodyProducerAborted();
 
     void writeReplyBody();
     bool decodeAndWriteReplyBody();
     bool finishingBrokenPost();
     bool finishingChunkedRequest();
     void doneSendingRequestBody();
     void requestBodyHandler(MemBuf &);
     virtual void sentRequestBody(const CommIoCbParams &io);
     void wroteLast(const CommIoCbParams &io);
     void sendComplete();
     void httpStateConnClosed(const CommCloseCbParams &params);
     void httpTimeout(const CommTimeoutCbParams &params);
 
     mb_size_t buildRequestPrefix(MemBuf * mb);
     static bool decideIfWeDoRanges (HttpRequest * orig_request);
     bool peerSupportsConnectionPinning() const;
 
+    /// Parser being used at present to parse the HTTP/ICY server response.
+    Http1::ResponseParserPointer hp;
     ChunkedCodingParser *httpChunkDecoder;
 };
 
 int httpCachable(const HttpRequestMethod&);
 void httpStart(FwdState *);
 const char *httpMakeVaryMark(HttpRequest * request, HttpReply const * reply);
 
 #endif /* SQUID_HTTP_H */

=== modified file 'src/http/one/Makefile.am'
--- src/http/one/Makefile.am	2014-05-20 10:21:14 +0000
+++ src/http/one/Makefile.am	2014-11-09 03:52:44 +0000
@@ -1,11 +1,13 @@
 include $(top_srcdir)/src/Common.am
 include $(top_srcdir)/src/TestHeaders.am
 
 noinst_LTLIBRARIES = libhttp1.la
 
 libhttp1_la_SOURCES = \
 	forward.h \
 	Parser.cc \
 	Parser.h \
 	RequestParser.cc \
-	RequestParser.h
+	RequestParser.h \
+	ResponseParser.cc \
+	ResponseParser.h

=== modified file 'src/http/one/Parser.cc'
--- src/http/one/Parser.cc	2014-09-14 12:43:00 +0000
+++ src/http/one/Parser.cc	2014-11-09 09:51:48 +0000
@@ -1,45 +1,84 @@
 /*
  * Copyright (C) 1996-2014 The Squid Software Foundation and contributors
  *
  * Squid software is distributed under GPLv2+ license and includes
  * contributions from numerous individuals and organizations.
  * Please see the COPYING and CONTRIBUTORS files for details.
  */
 
 #include "squid.h"
 #include "Debug.h"
 #include "http/one/Parser.h"
+#include "mime_header.h"
 #include "parser/Tokenizer.h"
 
 /// RFC 7230 section 2.6 - 7 magic octets
 const SBuf Http::One::Parser::Http1magic("HTTP/1.");
 
 void
 Http::One::Parser::clear()
 {
     parsingStage_ = HTTP_PARSE_NONE;
     buf_ = NULL;
     msgProtocol_ = AnyP::ProtocolVersion();
     mimeHeaderBlock_.clear();
 }
 
+bool
+Http::One::Parser::findMimeBlock(const char *which, size_t limit)
+{
+    if (msgProtocol_.major == 1) {
+        /* NOTE: HTTP/0.9 messages do not have a mime header block.
+         *       So the rest of the code will need to deal with '0'-byte headers
+         *       (i.e. none, so do not try parsing them)
+         */
+        int64_t mimeHeaderBytes = 0;
+        // XXX: c_str() reallocates. performance regression.
+        if ((mimeHeaderBytes = headersEnd(buf_.c_str(), buf_.length())) == 0) {
+            if (buf_.length()+firstLineSize() >= limit) {
+                debugs(33, 5, "Too large " << which);
+                parseStatusCode = Http::scHeaderTooLarge;
+                parsingStage_ = HTTP_PARSE_DONE;
+            } else
+                debugs(33, 5, "Incomplete " << which << ", waiting for end of headers");
+            return false;
+        }
+        mimeHeaderBlock_ = buf_.consume(mimeHeaderBytes);
+        debugs(74, 5, "mime header (0-" << mimeHeaderBytes << ") {" << mimeHeaderBlock_ << "}");
+
+    } else
+        debugs(33, 3, "Missing HTTP/1.x identifier");
+
+    // NP: we do not do any further stages here yet so go straight to DONE
+    parsingStage_ = HTTP_PARSE_DONE;
+
+    // Squid could handle these headers, but admin does not want to
+    if (messageHeaderSize() >= limit) {
+        debugs(33, 5, "Too large " << which);
+        parseStatusCode = Http::scHeaderTooLarge;
+        return false;
+    }
+
+    return true;
+}
+
 // arbitrary maximum-length for headers which can be found by Http1Parser::getHeaderField()
 #define GET_HDR_SZ	1024
 
 // BUG: returns only the first header line with given name,
 //      ignores multi-line headers and obs-fold headers
 char *
 Http::One::Parser::getHeaderField(const char *name)
 {
     if (!headerBlockSize() || !name)
         return NULL;
 
     LOCAL_ARRAY(char, header, GET_HDR_SZ);
     const int namelen = name ? strlen(name) : 0;
 
     debugs(25, 5, "looking for " << name);
 
     // while we can find more LF in the SBuf
     static CharacterSet iso8859Line = CharacterSet("non-LF",'\0','\n'-1) + CharacterSet(NULL, '\n'+1, (unsigned char)0xFF);
     ::Parser::Tokenizer tok(mimeHeaderBlock_);
     SBuf p;

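Reviewer note on Parser::findMimeBlock() above: for the HTTP/1.x branch the function has three outcomes, which the sketch below separates into a pure decision function. This is only an illustration under stated assumptions. The MimeScan enum and classifyMimeScan() are hypothetical names, headerEnd stands for the value returned by headersEnd() (0 when the terminator has not been seen yet), and the HTTP/0.9 branch is not modelled.

```cpp
#include <cstddef>

// Hypothetical result type, not part of the patch.
enum class MimeScan { NeedMoreData, TooLarge, Found };

// headerEnd: size of the mime block including its terminator, or 0 if the
//            end of headers has not been found in the buffer yet.
// buffered:  bytes currently held in the parse buffer (buf_.length()).
// firstLine: bytes consumed by the first line (firstLineSize()).
// limit:     configured maximum message header size.
inline MimeScan classifyMimeScan(std::size_t headerEnd, std::size_t buffered,
                                 std::size_t firstLine, std::size_t limit)
{
    if (headerEnd == 0) {
        // No terminator yet: give up only once the limit is already reached.
        return (buffered + firstLine >= limit) ? MimeScan::TooLarge
                                               : MimeScan::NeedMoreData;
    }
    // Terminator found: the complete header must still be under the limit,
    // mirroring the messageHeaderSize() >= limit check.
    return (firstLine + headerEnd >= limit) ? MimeScan::TooLarge
                                            : MimeScan::Found;
}
```

In the patch, the NeedMoreData outcome corresponds to returning false with parsingStage_ left unchanged, while both other outcomes move the parser to HTTP_PARSE_DONE.
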
=== modified file 'src/http/one/Parser.h'
--- src/http/one/Parser.h	2014-10-15 14:09:32 +0000
+++ src/http/one/Parser.h	2014-11-09 09:43:38 +0000
@@ -1,110 +1,124 @@
 /*
  * Copyright (C) 1996-2014 The Squid Software Foundation and contributors
  *
  * Squid software is distributed under GPLv2+ license and includes
  * contributions from numerous individuals and organizations.
  * Please see the COPYING and CONTRIBUTORS files for details.
  */
 
 #ifndef _SQUID_SRC_HTTP_ONE_PARSER_H
 #define _SQUID_SRC_HTTP_ONE_PARSER_H
 
 #include "anyp/ProtocolVersion.h"
 #include "http/one/forward.h"
+#include "http/StatusCode.h"
 #include "SBuf.h"
 
 namespace Http {
 namespace One {
 
 // Parser states
 enum ParseState {
     HTTP_PARSE_NONE,     ///< initialized, but nothing usefully parsed yet
     HTTP_PARSE_FIRST,    ///< HTTP/1 message first-line
     HTTP_PARSE_MIME,     ///< HTTP/1 mime-header block
     HTTP_PARSE_DONE      ///< parsed a message header, or reached a terminal syntax error
 };
 
 /** HTTP/1.x protocol parser
  *
  * Works on a raw character I/O buffer and tokenizes the content into
  * the major CRLF delimited segments of an HTTP/1 procotol message:
  *
  * \item first-line (request-line / simple-request / status-line)
  * \item mime-header 0*( header-name ':' SP field-value CRLF)
  */
 class Parser : public RefCountable
 {
 public:
     typedef SBuf::size_type size_type;
 
-    Parser() : parsingStage_(HTTP_PARSE_NONE) {}
+    Parser() : parseStatusCode(Http::scNone), parsingStage_(HTTP_PARSE_NONE) {}
     virtual ~Parser() {}
 
     /// Set this parser back to a default state.
     /// Will DROP any reference to a buffer (does not free).
     virtual void clear() = 0;
 
     /// attempt to parse a message from the buffer
     /// \retval true if a full message was found and parsed
     /// \retval false if incomplete, invalid or no message was found
     virtual bool parse(const SBuf &aBuf) = 0;
 
     /** Whether the parser is waiting on more data to complete parsing a message.
      * Use to distinguish between incomplete data and error results
      * when parse() returns false.
      */
     bool needsMoreData() const {return parsingStage_!=HTTP_PARSE_DONE;}
 
     /// size in bytes of the first line including CRLF terminator
     virtual size_type firstLineSize() const = 0;
 
     /// size in bytes of the message headers including CRLF terminator(s)
     /// but excluding first-line bytes
     size_type headerBlockSize() const {return mimeHeaderBlock_.length();}
 
     /// size in bytes of HTTP message block, includes first-line and mime headers
     /// excludes any body/entity/payload bytes
     /// excludes any garbage prefix before the first-line
     size_type messageHeaderSize() const {return firstLineSize() + headerBlockSize();}
 
     /// buffer containing HTTP mime headers, excluding message first-line.
     SBuf mimeHeader() const {return mimeHeaderBlock_;}
 
     /// the protocol label for this message
     const AnyP::ProtocolVersion & messageProtocol() const {return msgProtocol_;}
 
     /**
-     * Scan the mime header block (badly) for a header with teh given name.
+     * Scan the mime header block (badly) for a header with the given name.
      *
      * BUG: omits lines when searching for headers with obs-fold or multiple entries.
      *
      * BUG: limits output to just 1KB when Squid accepts up to 64KB line length.
      *
      * \return A pointer to a field-value of the first matching field-name, or NULL.
      */
     char *getHeaderField(const char *name);
 
     /// the remaining unprocessed section of buffer
     const SBuf &remaining() const {return buf_;}
 
+    /**
+     * HTTP status code resulting from the parse process.
+     * To be used when handling invalid messages.
+     *
+     * Http::scNone indicates incomplete parse,
+     * Http::scOkay indicates no error,
+     * other codes represent a parse error.
+     */
+    Http::StatusCode parseStatusCode;
+
 protected:
+    /// parse scan to find the mime headers block for current message
+    bool findMimeBlock(const char *which, size_t limit);
+
     /// RFC 7230 section 2.6 - 7 magic octets
     static const SBuf Http1magic;
 
     /// bytes remaining to be parsed
     SBuf buf_;
 
     /// what stage the parser is currently up to
     ParseState parsingStage_;
 
     /// what protocol label has been found in the first line (if any)
     AnyP::ProtocolVersion msgProtocol_;
 
     /// buffer holding the mime headers (if any)
     SBuf mimeHeaderBlock_;
 };
 
 } // namespace One
 } // namespace Http
 
 #endif /*  _SQUID_SRC_HTTP_ONE_PARSER_H */

=== modified file 'src/http/one/RequestParser.cc'
--- src/http/one/RequestParser.cc	2014-10-15 14:09:32 +0000
+++ src/http/one/RequestParser.cc	2014-11-09 09:51:40 +0000
@@ -1,31 +1,29 @@
 #include "squid.h"
 #include "Debug.h"
 #include "http/one/RequestParser.h"
 #include "http/ProtocolVersion.h"
-#include "mime_header.h"
 #include "profiler/Profiler.h"
 #include "SquidConfig.h"
 
 Http::One::RequestParser::RequestParser() :
-        Parser(),
-        request_parse_status(Http::scNone)
+        Parser()
 {
     req.start = req.end = -1;
     req.m_start = req.m_end = -1;
     req.u_start = req.u_end = -1;
     req.v_start = req.v_end = -1;
 }
 
 /**
  * Attempt to parse the first line of a new request message.
  *
  * Governed by RFC 7230 section 3.5
  *  "
  *    In the interest of robustness, a server that is expecting to receive
  *    and parse a request-line SHOULD ignore at least one empty line (CRLF)
  *    received prior to the request-line.
  *  "
  *
  * Parsing state is stored between calls to avoid repeating buffer scans.
  * If garbage is found the parsing offset is incremented.
  */
@@ -58,41 +56,41 @@
                    "Ignored due to relaxed_header_parser.");
         // Be tolerant of prefix spaces (other bytes are valid method values)
         while (!buf_.isEmpty() && buf_[0] == ' ') {
             buf_.consume(1);
         }
     }
 #endif
 }
 
 /**
  * Attempt to parse the first line of a new request message.
  *
  * Governed by:
  *  RFC 1945 section 5.1
  *  RFC 7230 section 3.1 and 3.5
  *
  * Parsing state is stored between calls. However the current implementation
  * begins parsing from scratch on every call.
  * The return value tells you whether the parsing state fields are valid or not.
  *
- * \retval -1  an error occurred. request_parse_status indicates HTTP status result.
+ * \retval -1  an error occurred. parseStatusCode indicates HTTP status result.
  * \retval  1  successful parse. member fields contain the request-line items
  * \retval  0  more data is needed to complete the parse
  */
 int
 Http::One::RequestParser::parseRequestFirstLine()
 {
     int second_word = -1; // track the suspected URI start
     int first_whitespace = -1, last_whitespace = -1; // track the first and last SP byte
     int line_end = -1; // tracks the last byte BEFORE terminal \r\n or \n sequence
 
     debugs(74, 5, "parsing possible request: buf.length=" << buf_.length());
     debugs(74, DBG_DATA, buf_);
 
     // Single-pass parse: (provided we have the whole line anyways)
 
     req.start = 0;
     req.end = -1;
     for (SBuf::size_type i = 0; i < buf_.length(); ++i) {
         // track first and last whitespace (SP only)
         if (buf_[i] == ' ') {
@@ -124,169 +122,169 @@
                     line_end = i - 1;
                 while (i < buf_.length() - 1 && buf_[i + 1] == '\r')
                     ++i;
 
                 if (buf_[i + 1] == '\n') {
                     req.end = i + 1;
                     break;
                 }
             } else {
                 if (buf_[i + 1] == '\n') {
                     req.end = i + 1;
                     line_end = i - 1;
                     break;
                 }
             }
 
             // RFC 7230 section 3.1.1 does not prohibit embeded CR like RFC 2616 used to.
             // However it does explicitly state an exact syntax which omits un-encoded CR
             // and defines 400 (Bad Request) as the required action when
             // handed an invalid request-line.
-            request_parse_status = Http::scBadRequest;
+            parseStatusCode = Http::scBadRequest;
             return -1;
         }
     }
 
     if (req.end == -1) {
         // DoS protection against long first-line
         if ((size_t)buf_.length() >= Config.maxRequestHeaderSize) {
             debugs(33, 5, "Too large request-line");
             // RFC 7230 section 3.1.1 mandatory 414 response if URL longer than acceptible.
-            request_parse_status = Http::scUriTooLong;
+            parseStatusCode = Http::scUriTooLong;
             return -1;
         }
 
         debugs(74, 5, "Parser: retval 0: from " << req.start <<
                "->" << req.end << ": needs more data to complete first line.");
         return 0;
     }
 
     // NP: we have now seen EOL, more-data (0) cannot occur.
     //     From here on any failure is -1, success is 1
 
     // Input Validation:
 
     // DoS protection against long first-line
     if ((size_t)(req.end-req.start) >= Config.maxRequestHeaderSize) {
         debugs(33, 5, "Too large request-line");
-        request_parse_status = Http::scUriTooLong;
+        parseStatusCode = Http::scUriTooLong;
         return -1;
     }
 
     // Process what we now know about the line structure into field offsets
     // generating HTTP status for any aborts as we go.
 
     // First non-whitespace = beginning of method
     if (req.start > line_end) {
-        request_parse_status = Http::scBadRequest;
+        parseStatusCode = Http::scBadRequest;
         return -1;
     }
     req.m_start = req.start;
 
     // First whitespace = end of method
     if (first_whitespace > line_end || first_whitespace < req.start) {
-        request_parse_status = Http::scBadRequest; // no method
+        parseStatusCode = Http::scBadRequest; // no method
         return -1;
     }
     req.m_end = first_whitespace - 1;
     if (req.m_end < req.m_start) {
-        request_parse_status = Http::scBadRequest; // missing URI?
+        parseStatusCode = Http::scBadRequest; // missing URI?
         return -1;
     }
 
     /* Set method_ */
     const SBuf tmp = buf_.substr(req.m_start, req.m_end - req.m_start + 1);
     method_ = HttpRequestMethod(tmp);
 
     // First non-whitespace after first SP = beginning of URL+Version
     if (second_word > line_end || second_word < req.start) {
-        request_parse_status = Http::scBadRequest; // missing URI
+        parseStatusCode = Http::scBadRequest; // missing URI
         return -1;
     }
     req.u_start = second_word;
 
     // RFC 1945: SP and version following URI are optional, marking version 0.9
     // we identify this by the last whitespace being earlier than URI start
     if (last_whitespace < second_word && last_whitespace >= req.start) {
         msgProtocol_ = Http::ProtocolVersion(0,9);
         req.u_end = line_end;
         uri_ = buf_.substr(req.u_start, req.u_end - req.u_start + 1);
-        request_parse_status = Http::scOkay; // HTTP/0.9
+        parseStatusCode = Http::scOkay; // HTTP/0.9
         return 1;
     } else {
         // otherwise last whitespace is somewhere after end of URI.
         req.u_end = last_whitespace;
         // crop any trailing whitespace in the area we think of as URI
         for (; req.u_end >= req.u_start && xisspace(buf_[req.u_end]); --req.u_end);
     }
     if (req.u_end < req.u_start) {
-        request_parse_status = Http::scBadRequest; // missing URI
+        parseStatusCode = Http::scBadRequest; // missing URI
         return -1;
     }
     uri_ = buf_.substr(req.u_start, req.u_end - req.u_start + 1);
 
     // Last whitespace SP = before start of protocol/version
     if (last_whitespace >= line_end) {
-        request_parse_status = Http::scBadRequest; // missing version
+        parseStatusCode = Http::scBadRequest; // missing version
         return -1;
     }
     req.v_start = last_whitespace + 1;
     req.v_end = line_end;
 
     /* RFC 7230 section 2.6 : handle unsupported HTTP major versions cleanly. */
     if ((req.v_end - req.v_start +1) < (int)Http1magic.length() || !buf_.substr(req.v_start, SBuf::npos).startsWith(Http1magic)) {
         // non-HTTP/1 protocols not supported / implemented.
-        request_parse_status = Http::scHttpVersionNotSupported;
+        parseStatusCode = Http::scHttpVersionNotSupported;
         return -1;
     }
     // NP: magic octets include the protocol name and major version DIGIT.
     msgProtocol_.protocol = AnyP::PROTO_HTTP;
     msgProtocol_.major = 1;
 
     int i = req.v_start + Http1magic.length() -1;
 
     // catch missing minor part
     if (++i > line_end) {
-        request_parse_status = Http::scHttpVersionNotSupported;
+        parseStatusCode = Http::scHttpVersionNotSupported;
         return -1;
     }
     /* next should be one or more digits */
     if (!isdigit(buf_[i])) {
-        request_parse_status = Http::scHttpVersionNotSupported;
+        parseStatusCode = Http::scHttpVersionNotSupported;
         return -1;
     }
     int min = 0;
     for (; i <= line_end && (isdigit(buf_[i])) && min < 65536; ++i) {
         min = min * 10;
         min = min + (buf_[i]) - '0';
     }
     // catch too-big values or trailing garbage
     if (min >= 65536 || i < line_end) {
-        request_parse_status = Http::scHttpVersionNotSupported;
+        parseStatusCode = Http::scHttpVersionNotSupported;
         return -1;
     }
     msgProtocol_.minor = min;
 
     /*
      * Rightio - we have all the schtuff. Return true; we've got enough.
      */
-    request_parse_status = Http::scOkay;
+    parseStatusCode = Http::scOkay;
     return 1;
 }
 
 bool
 Http::One::RequestParser::parse(const SBuf &aBuf)
 {
     buf_ = aBuf;
     debugs(74, DBG_DATA, "Parse buf={length=" << aBuf.length() << ", data='" << aBuf << "'}");
 
     // stage 1: locate the request-line
     if (parsingStage_ == HTTP_PARSE_NONE) {
         skipGarbageLines();
 
         // if we hit something before EOS treat it as a message
         if (!buf_.isEmpty())
             parsingStage_ = HTTP_PARSE_FIRST;
         else
             return false;
     }
 
@@ -302,55 +300,29 @@
         }
 
         debugs(74, 5, "request-line: retval " << retcode << ": from " << req.start << "->" << req.end <<
                " line={" << aBuf.length() << ", data='" << aBuf << "'}");
         debugs(74, 5, "request-line: method " << req.m_start << "->" << req.m_end << " (" << method_ << ")");
         debugs(74, 5, "request-line: url " << req.u_start << "->" << req.u_end << " (" << uri_ << ")");
         debugs(74, 5, "request-line: proto " << req.v_start << "->" << req.v_end << " (" << msgProtocol_ << ")");
         debugs(74, 5, "Parser: bytes processed=" << (aBuf.length()-buf_.length()));
         PROF_stop(HttpParserParseReqLine);
 
         // syntax errors already
         if (retcode < 0) {
             parsingStage_ = HTTP_PARSE_DONE;
             return false;
         }
     }
 
     // stage 3: locate the mime header block
     if (parsingStage_ == HTTP_PARSE_MIME) {
         // HTTP/1.x request-line is valid and parsing completed.
-        if (msgProtocol_.major == 1) {
-            /* NOTE: HTTP/0.9 requests do not have a mime header block.
-             *       So the rest of the code will need to deal with '0'-byte headers
-             *       (ie, none, so don't try parsing em)
-             */
-            int64_t mimeHeaderBytes = 0;
-            // XXX: c_str() reallocates. performance regression.
-            if ((mimeHeaderBytes = headersEnd(buf_.c_str(), buf_.length())) == 0) {
-                if (buf_.length()+firstLineSize() >= Config.maxRequestHeaderSize) {
-                    debugs(33, 5, "Too large request");
-                    request_parse_status = Http::scRequestHeaderFieldsTooLarge;
-                    parsingStage_ = HTTP_PARSE_DONE;
-                } else
-                    debugs(33, 5, "Incomplete request, waiting for end of headers");
-                return false;
-            }
-            mimeHeaderBlock_ = buf_.consume(mimeHeaderBytes);
-            debugs(74, 5, "mime header (0-" << mimeHeaderBytes << ") {" << mimeHeaderBlock_ << "}");
-
-        } else
-            debugs(33, 3, "Missing HTTP/1.x identifier");
-
-        // NP: we do not do any further stages here yet so go straight to DONE
-        parsingStage_ = HTTP_PARSE_DONE;
-
-        // Squid could handle these headers, but admin does not want to
-        if (messageHeaderSize() >= Config.maxRequestHeaderSize) {
-            debugs(33, 5, "Too large request");
-            request_parse_status = Http::scRequestHeaderFieldsTooLarge;
+        if (!findMimeBlock("Request", Config.maxRequestHeaderSize)) {
+            if (parseStatusCode == Http::scHeaderTooLarge)
+                parseStatusCode = Http::scRequestHeaderFieldsTooLarge;
             return false;
         }
     }
 
     return !needsMoreData();
 }

=== modified file 'src/http/one/RequestParser.h'
--- src/http/one/RequestParser.h	2014-10-15 14:09:32 +0000
+++ src/http/one/RequestParser.h	2014-11-09 09:38:41 +0000
@@ -1,61 +1,54 @@
 #ifndef _SQUID_SRC_HTTP_ONE_REQUESTPARSER_H
 #define _SQUID_SRC_HTTP_ONE_REQUESTPARSER_H
 
 #include "http/one/Parser.h"
 #include "http/RequestMethod.h"
-#include "http/StatusCode.h"
 
 namespace Http {
 namespace One {
 
 /** HTTP/1.x protocol request parser
  *
  * Works on a raw character I/O buffer and tokenizes the content into
  * the major CRLF delimited segments of an HTTP/1 request message:
  *
  * \item request-line (method, URL, protocol, version)
  * \item mime-header (set of RFC2616 syntax header fields)
  */
 class RequestParser : public Http1::Parser
 {
 public:
     RequestParser();
     virtual ~RequestParser() {}
 
     /* Http::One::Parser API */
     virtual void clear() {*this = RequestParser();}
     virtual Http1::Parser::size_type firstLineSize() const {return req.end - req.start + 1;}
     virtual bool parse(const SBuf &aBuf);
 
     /// the HTTP method if this is a request message
     const HttpRequestMethod & method() const {return method_;}
 
     /// the request-line URI if this is a request message, or an empty string.
     const SBuf &requestUri() const {return uri_;}
 
-    /** HTTP status code to be used on the invalid-request error page.
-     * Http::scNone indicates incomplete parse,
-     * Http::scOkay indicates no error.
-     */
-    Http::StatusCode request_parse_status;
-
 private:
     void skipGarbageLines();
     int parseRequestFirstLine();
 
     /// Offsets for pieces of the (HTTP request) Request-Line as per RFC 7230 section 3.1.1.
     /// only valid before and during parse stage HTTP_PARSE_FIRST
     struct request_offsets {
         int start, end;
         int m_start, m_end; // method
         int u_start, u_end; // url
         int v_start, v_end; // version (full text)
     } req;
 
     /// what request method has been found on the first line
     HttpRequestMethod method_;
 
     /// raw copy of the original client reqeust-line URI field
     SBuf uri_;
 };
 

=== added file 'src/http/one/ResponseParser.cc'
--- src/http/one/ResponseParser.cc	1970-01-01 00:00:00 +0000
+++ src/http/one/ResponseParser.cc	2014-11-12 11:12:05 +0000
@@ -0,0 +1,234 @@
+#include "squid.h"
+#include "Debug.h"
+#include "http/one/ResponseParser.h"
+#include "http/ProtocolVersion.h"
+#include "parser/Tokenizer.h"
+#include "profiler/Profiler.h"
+#include "SquidConfig.h"
+
+const SBuf Http::One::ResponseParser::IcyMagic("ICY ");
+
+Http1::Parser::size_type
+Http::One::ResponseParser::firstLineSize() const
+{
+    Http1::Parser::size_type result = 0;
+
+    switch (msgProtocol_.protocol)
+    {
+    case AnyP::PROTO_HTTP:
+        result += Http1magic.length();
+        break;
+    case AnyP::PROTO_ICY:
+        result += IcyMagic.length();
+        break;
+    default: // no other protocols supported
+        return result;
+    }
+    // NP: the parser does not accept >2 DIGIT for version numbers
+    if (msgProtocol_.minor > 9)
+        result += 2;
+    else
+        result += 1;
+
+    result += 5; /* 5 octets in: SP status SP */
+    result += reasonPhrase_.length();
+    return result;
+}
+
+// NP: we found the protocol version and consumed it already.
+// just need the status code and reason phrase
+const int
+Http::One::ResponseParser::parseResponseStatusAndReason()
+{
+    if (buf_.isEmpty())
+        return 0;
+
+    ::Parser::Tokenizer tok(buf_);
+
+    if (!completedStatus_) {
+        debugs(74, 9, "seek status-code in: " << tok.remaining().substr(0,10) << "...");
+        SBuf status;
+        // status code is 3 DIGIT octets
+        // NP: search space is >3 to get the terminator character
+        if (!tok.prefix(status, CharacterSet::DIGIT, 4))
+            return -1; // invalid status
+        // NOTE: multiple SP or non-SP bytes between version and status code are invalid.
+        if (tok.atEnd())
+            return 0; // need more to be sure we have it all
+        if (!tok.skip(' '))
+            return -1; // invalid status, a single SP terminator required
+        // NOTE: any whitespace after the single SP is part of the reason phrase.
+
+        debugs(74, 6, "found string status-code=" << status);
+
+        // get the actual numeric value of the 1-4 digits we found
+        ::Parser::Tokenizer t2(status);
+        int64_t statusValue;
+        if (!t2.int64(statusValue))
+            return -1; // ouch. digits not forming a valid number?
+        debugs(74, 6, "found int64 status-code=" << statusValue);
+        if (statusValue < 0 || statusValue > 999)
+            return -1; // ouch. digits not within valid status code range.
+
+        statusCode_ = static_cast<Http::StatusCode>(statusValue);
+
+        buf_ = tok.remaining(); // resume checkpoint
+        completedStatus_ = true;
+    }
+
+    if (tok.atEnd())
+        return 0; // need more to be sure we have it all
+
+    /* RFC 7230 says we SHOULD ignore the reason phrase content
+     * but it has a definite valid vs invalid character set.
+     * We interpret the SHOULD as ignoring absence and syntax, but
+     * producing an error if it contains an invalid octet.
+     */
+
+    debugs(74, 9, "seek reason-phrase in: " << tok.remaining().substr(0,50) << "...");
+
+    // if we got here we are still looking for reason-phrase bytes
+    static const CharacterSet phraseChars = CharacterSet::WSP + CharacterSet::VCHAR + CharacterSet::OBSTEXT;
+    tok.prefix(reasonPhrase_, phraseChars); // optional, no error if missing
+    tok.skip('\r'); // optional trailing CR
+
+    if (tok.atEnd())
+        return 0; // need more to be sure we have it all
+
+    // LF existence matters
+    if (!tok.skip('\n')) {
+        reasonPhrase_.clear();
+        return -1; // found invalid characters in the phrase
+    }
+
+    debugs(74, DBG_DATA, "parse remaining buf={length=" << tok.remaining().length() << ", data='" << tok.remaining() << "'}");
+    buf_ = tok.remaining(); // resume checkpoint
+    return 1;
+}
+
+const int
+Http::One::ResponseParser::parseResponseFirstLine()
+{
+    ::Parser::Tokenizer tok(buf_);
+
+    if (msgProtocol_.protocol != AnyP::PROTO_NONE) {
+        debugs(74, 6, "continue incremental parse for " << msgProtocol_);
+        debugs(74, DBG_DATA, "parse remaining buf={length=" << tok.remaining().length() << ", data='" << tok.remaining() << "'}");
+        // we already found the magic, but not the full line. keep going.
+        return parseResponseStatusAndReason();
+
+    } else if (tok.skip(Http1magic)) {
+        debugs(74, 6, "found prefix magic " << Http1magic);
+        // HTTP Response status-line parse
+
+        // magic contains major version, still need to find minor
+        SBuf verMinor;
+        // NP: we limit to 2 digits for speed, there really is no limit
+        // XXX: the protocols we accept don't have valid versions > 10 anyway
+        if (!tok.prefix(verMinor, CharacterSet::DIGIT, 2))
+            return -1; // invalid version minor code
+        if (tok.atEnd())
+            return 0; // need more to be sure we have it all
+        if (!tok.skip(' '))
+            return -1; // invalid version, a single SP terminator required
+
+        debugs(74, 6, "found string version-minor=" << verMinor);
+
+        // get the actual numeric value of the 1-2 digits we found
+        ::Parser::Tokenizer t2(verMinor);
+        int64_t tvm = 0;
+        if (!t2.int64(tvm))
+            return -1; // ouch. digits not forming a valid number?
+        msgProtocol_.minor = static_cast<unsigned int>(tvm);
+
+        msgProtocol_.protocol = AnyP::PROTO_HTTP;
+        msgProtocol_.major = 1;
+
+        debugs(74, 6, "found version=" << msgProtocol_);
+
+        debugs(74, DBG_DATA, "parse remaining buf={length=" << tok.remaining().length() << ", data='" << tok.remaining() << "'}");
+        buf_ = tok.remaining(); // resume checkpoint
+        return parseResponseStatusAndReason();
+
+    } else if (tok.skip(IcyMagic)) {
+        debugs(74, 6, "found prefix magic " << IcyMagic);
+        // ICY Response status-line parse (same as HTTP/1 after the magic version)
+        msgProtocol_.protocol = AnyP::PROTO_ICY;
+        // NP: ICY has no /major.minor details
+        debugs(74, DBG_DATA, "parse remaining buf={length=" << tok.remaining().length() << ", data='" << tok.remaining() << "'}");
+        buf_ = tok.remaining(); // resume checkpoint
+        return parseResponseStatusAndReason();
+
+    } else if (buf_.length() > Http1magic.length() && buf_.length() > IcyMagic.length()) {
+        debugs(74, 2, "unknown/missing prefix magic. Interpreting as HTTP/0.9");
+        // found something that looks like an HTTP/0.9 response
+        // Gateway/Transform it into HTTP/1.1
+        msgProtocol_ = Http::ProtocolVersion(1,1);
+        // XXX: probably should use version 0.9 here and upgrade on output,
+        // but the old code already did the 1.1 transformation at this point.
+        statusCode_ = Http::scOkay;
+        static const SBuf gatewayPhrase("Gatewaying");
+        reasonPhrase_ = gatewayPhrase;
+        static const SBuf fakeHttpMimeBlock("X-Transformed-From: HTTP/0.9\r\n"
+                                            /* Server: visible_appname_string */
+                                            "Mime-Version: 1.0\r\n"
+                                            /* Date: squid_curtime */
+                                            "Expires: -1\r\n\r\n");
+        mimeHeaderBlock_ = fakeHttpMimeBlock;
+        parsingStage_ = HTTP_PARSE_DONE;
+        return 1; // no more parsing
+    }
+
+    return 0; // need more to parse anything.
+}
+
+bool
+Http::One::ResponseParser::parse(const SBuf &aBuf)
+{
+    buf_ = aBuf;
+    debugs(74, DBG_DATA, "Parse buf={length=" << aBuf.length() << ", data='" << aBuf << "'}");
+
+    // stage 1: locate the status-line
+    if (parsingStage_ == HTTP_PARSE_NONE) {
+        // RFC 7230 explicitly states whether garbage whitespace is to be handled
+        // at each point of the message framing boundaries.
+        // It omits mentioning garbage prior to HTTP Responses.
+        // Therefore, if we receive anything at all treat it as Response message.
+        if (!buf_.isEmpty())
+            parsingStage_ = HTTP_PARSE_FIRST;
+        else
+            return false;
+    }
+
+    // stage 2: parse the status-line
+    if (parsingStage_ == HTTP_PARSE_FIRST) {
+        PROF_start(HttpParserParseReplyLine);
+
+        int retcode = parseResponseFirstLine();
+
+        // first-line (or a look-alike) found successfully.
+        if (retcode > 0)
+            parsingStage_ = HTTP_PARSE_MIME;
+        debugs(74, 5, "status-line: retval " << retcode);
+        debugs(74, 5, "status-line: proto " << msgProtocol_);
+        debugs(74, 5, "status-line: status-code " << statusCode_);
+        debugs(74, 5, "status-line: reason-phrase " << reasonPhrase_);
+        debugs(74, 5, "Parser: bytes processed=" << (aBuf.length()-buf_.length()));
+        PROF_stop(HttpParserParseReplyLine);
+
+        // syntax errors already
+        if (retcode < 0) {
+            parsingStage_ = HTTP_PARSE_DONE;
+            statusCode_ = Http::scInvalidHeader;
+            return false;
+        }
+    }
+
+    // stage 3: locate the mime header block
+    if (parsingStage_ == HTTP_PARSE_MIME) {
+        if (!findMimeBlock("Response", Config.maxReplyHeaderSize))
+            return false;
+    }
+
+    return !needsMoreData();
+}

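The -1 / 0 / 1 return convention used by parseResponseFirstLine() and
parseResponseStatusAndReason() above can be sketched outside Squid as
follows. This is only an illustration under stated assumptions: it uses
std::string instead of SBuf and ::Parser::Tokenizer, it assumes the
Http1magic value is "HTTP/1.", it re-parses from the start of the input
on every call instead of keeping a resume checkpoint, and the names
StatusLine and parseStatusLine are hypothetical.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical stand-in for the parsed fields; not a Squid type.
struct StatusLine {
    int minor;
    int status;
    std::string reason;
};

// Returns 1 when a full status-line was parsed, 0 when more bytes are
// needed, and -1 on a syntax error.
int parseStatusLine(const std::string &in, StatusLine &out)
{
    static const std::string magic = "HTTP/1."; // assumed Http1magic value
    if (in.size() < magic.size())
        return magic.compare(0, in.size(), in) == 0 ? 0 : -1;
    if (in.compare(0, magic.size(), magic) != 0)
        return -1;
    std::size_t i = magic.size();

    // minor version: 1 or 2 DIGIT, then exactly one SP
    int minor = 0;
    std::size_t digits = 0;
    while (i < in.size() && digits < 2 && std::isdigit(static_cast<unsigned char>(in[i]))) {
        minor = minor * 10 + (in[i] - '0');
        ++i;
        ++digits;
    }
    if (i == in.size())
        return 0; // cannot tell yet whether the field is complete
    if (digits == 0 || in[i] != ' ')
        return -1;
    ++i;

    // status code: up to 3 DIGIT, then exactly one SP
    int status = 0;
    digits = 0;
    while (i < in.size() && digits < 3 && std::isdigit(static_cast<unsigned char>(in[i]))) {
        status = status * 10 + (in[i] - '0');
        ++i;
        ++digits;
    }
    if (i == in.size())
        return 0;
    if (digits == 0 || in[i] != ' ')
        return -1;
    ++i;

    // reason-phrase: WSP / VCHAR / obs-text octets, optional CR, required LF
    const std::size_t start = i;
    while (i < in.size()) {
        const unsigned char c = static_cast<unsigned char>(in[i]);
        const bool ok = c == ' ' || c == '\t' || (c >= 0x21 && c <= 0x7E) || c >= 0x80;
        if (!ok)
            break;
        ++i;
    }
    const std::size_t end = i;
    if (i < in.size() && in[i] == '\r')
        ++i;
    if (i == in.size())
        return 0;
    if (in[i] != '\n')
        return -1; // invalid octet inside the phrase

    out.minor = minor;
    out.status = status;
    out.reason = in.substr(start, end - start);
    return 1;
}
```

Feeding "HTTP/1.1 20" yields 0 (wait for more input), while a second SP
before the status code yields -1, matching the single-SP rule noted in
the patch comments.
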
=== added file 'src/http/one/ResponseParser.h'
--- src/http/one/ResponseParser.h	1970-01-01 00:00:00 +0000
+++ src/http/one/ResponseParser.h	2014-11-09 05:51:25 +0000
@@ -0,0 +1,57 @@
+#ifndef _SQUID_SRC_HTTP_ONE_RESPONSEPARSER_H
+#define _SQUID_SRC_HTTP_ONE_RESPONSEPARSER_H
+
+#include "http/one/Parser.h"
+#include "http/StatusCode.h"
+
+namespace Http {
+namespace One {
+
+/** HTTP/1.x protocol response parser
+ *
+ * Also capable of parsing unexpected ICY responses and
+ * upgrading HTTP/0.9 syntax responses to HTTP/1.1
+ *
+ * Works on a raw character I/O buffer and tokenizes the content into
+ * the major CRLF delimited segments of an HTTP/1 response message:
+ *
+ * \item status-line (version SP status SP reason-phrase)
+ * \item mime-header (set of RFC2616 syntax header fields)
+ */
+class ResponseParser : public Http1::Parser
+{
+public:
+    ResponseParser() : Parser(), completedStatus_(false), statusCode_(Http::scNone) {}
+    virtual ~ResponseParser() {}
+
+    /* Http::One::Parser API */
+    virtual void clear() {*this=ResponseParser();}
+    virtual Http1::Parser::size_type firstLineSize() const;
+    virtual bool parse(const SBuf &aBuf);
+
+    /* response-specific fields, read-only */
+    Http::StatusCode messageStatus() const { return statusCode_;}
+    SBuf reasonPhrase() const { return reasonPhrase_;}
+
+private:
+    const int parseResponseFirstLine();
+    const int parseResponseStatusAndReason();
+
+    /// magic prefix for identifying ICY response messages
+    static const SBuf IcyMagic;
+
+    /// Whether we found the status code yet.
+    /// We cannot rely on status value because server may send "000".
+    bool completedStatus_;
+
+    /// HTTP/1 status-line status code
+    Http::StatusCode statusCode_;
+
+    /// HTTP/1 status-line reason phrase
+    SBuf reasonPhrase_;
+};
+
+} // namespace One
+} // namespace Http
+
+#endif /* _SQUID_SRC_HTTP_ONE_RESPONSEPARSER_H */

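The firstLineSize() override declared in this header rebuilds the
status-line length from the parsed fields rather than from saved
offsets. A minimal sketch of that arithmetic for the HTTP/1.x case,
assuming the Http1magic value is "HTTP/1." and a 3-DIGIT status code
(statusLineSize is a hypothetical free function, not part of the patch):

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: octets in an HTTP/1.x status-line, excluding the
// line terminator, computed from already-parsed fields.
std::size_t statusLineSize(unsigned int minor, const std::string &reason)
{
    static const std::string magic = "HTTP/1."; // assumed Http1magic value
    std::size_t result = magic.size();
    result += (minor > 9) ? 2 : 1; // a minor of 10 or more needs two octets
    result += 5;                   // SP, 3-DIGIT status, SP
    result += reason.size();
    return result;
}
```

For example, statusLineSize(1, "OK") equals the length of
"HTTP/1.1 200 OK", and statusLineSize(10, "Not Found") equals the
length of "HTTP/1.10 404 Not Found".
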
=== modified file 'src/http/one/forward.h'
--- src/http/one/forward.h	2014-06-05 16:30:24 +0000
+++ src/http/one/forward.h	2014-11-11 14:46:20 +0000
@@ -1,20 +1,23 @@
 #ifndef SQUID_SRC_HTTP_ONE_FORWARD_H
 #define SQUID_SRC_HTTP_ONE_FORWARD_H
 
 #include "base/RefCount.h"
 
 namespace Http {
 namespace One {
 
 class Parser;
 typedef RefCount<Http::One::Parser> ParserPointer;
 
 class RequestParser;
 typedef RefCount<Http::One::RequestParser> RequestParserPointer;
 
+class ResponseParser;
+typedef RefCount<Http::One::ResponseParser> ResponseParserPointer;
+
 } // namespace One
 } // namespace Http
 
 namespace Http1 = Http::One;
 
 #endif /* SQUID_SRC_HTTP_ONE_FORWARD_H */

=== modified file 'src/servers/HttpServer.cc'
--- src/servers/HttpServer.cc	2014-11-10 12:11:20 +0000
+++ src/servers/HttpServer.cc	2014-11-13 13:51:36 +0000
@@ -138,57 +138,57 @@
 }
 
 void clientProcessRequestFinished(ConnStateData *conn, const HttpRequest::Pointer &request);
 
 bool
 Http::Server::buildHttpRequest(ClientSocketContext *context)
 {
     HttpRequest::Pointer request;
     ClientHttpRequest *http = context->http;
     if (context->flags.parsed_ok == 0) {
         clientStreamNode *node = context->getClientReplyContext();
         debugs(33, 2, "Invalid Request");
         quitAfterError(NULL);
         // setLogUri should called before repContext->setReplyToError
         setLogUri(http, http->uri, true);
         clientReplyContext *repContext = dynamic_cast<clientReplyContext *>(node->data.getRaw());
         assert(repContext);
 
         // determine which error page templates to use for specific parsing errors
         err_type errPage = ERR_INVALID_REQ;
-        switch (parser_->request_parse_status) {
+        switch (parser_->parseStatusCode) {
         case Http::scRequestHeaderFieldsTooLarge:
             // fall through to next case
         case Http::scUriTooLong:
             errPage = ERR_TOO_BIG;
             break;
         case Http::scMethodNotAllowed:
             errPage = ERR_UNSUP_REQ;
             break;
         case Http::scHttpVersionNotSupported:
             errPage = ERR_UNSUP_HTTPVERSION;
             break;
         default:
             // use default ERR_INVALID_REQ set above.
             break;
         }
-        repContext->setReplyToError(errPage, parser_->request_parse_status, parser_->method(), http->uri,
+        repContext->setReplyToError(errPage, parser_->parseStatusCode, parser_->method(), http->uri,
                                     clientConnection->remote, NULL, in.buf.c_str(), NULL);
         assert(context->http->out.offset == 0);
         context->pullData();
         return false;
     }
 
     if ((request = HttpRequest::CreateFromUrlAndMethod(http->uri, parser_->method())) == NULL) {
         clientStreamNode *node = context->getClientReplyContext();
         debugs(33, 5, "Invalid URL: " << http->uri);
         quitAfterError(request.getRaw());
         // setLogUri should called before repContext->setReplyToError
         setLogUri(http, http->uri, true);
         clientReplyContext *repContext = dynamic_cast<clientReplyContext *>(node->data.getRaw());
         assert(repContext);
         repContext->setReplyToError(ERR_INVALID_URL, Http::scBadRequest, parser_->method(), http->uri, clientConnection->remote, NULL, NULL, NULL);
         assert(context->http->out.offset == 0);
         context->pullData();
         return false;
     }
 

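The switch in buildHttpRequest() above maps the parser status code to
an error page template. The same mapping can be read in isolation as
the sketch below; the Status and ErrPage enums and errorPageFor() are
hypothetical stand-ins for Http::StatusCode, err_type and the inline
switch, not Squid definitions.

```cpp
// Hypothetical stand-ins; the numeric values are the registered HTTP codes.
enum class Status {
    BadRequest = 400,
    MethodNotAllowed = 405,
    UriTooLong = 414,
    RequestHeaderFieldsTooLarge = 431,
    HttpVersionNotSupported = 505
};

enum class ErrPage { InvalidReq, TooBig, UnsupReq, UnsupHttpVersion };

// Pick the error page for a parse failure; anything not listed falls
// back to the generic invalid-request page.
ErrPage errorPageFor(Status parseStatus)
{
    switch (parseStatus) {
    case Status::RequestHeaderFieldsTooLarge:
    case Status::UriTooLong:
        return ErrPage::TooBig;
    case Status::MethodNotAllowed:
        return ErrPage::UnsupReq;
    case Status::HttpVersionNotSupported:
        return ErrPage::UnsupHttpVersion;
    default:
        return ErrPage::InvalidReq;
    }
}
```
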
=== modified file 'src/tests/testHttp1Parser.cc'
--- src/tests/testHttp1Parser.cc	2014-11-09 14:57:25 +0000
+++ src/tests/testHttp1Parser.cc	2014-11-13 12:13:27 +0000
@@ -52,85 +52,85 @@
     HttpRequestMethod method;
     int uriStart;
     int uriEnd;
     const char *uri;
     int versionStart;
     int versionEnd;
     AnyP::ProtocolVersion version;
 };
 
 static void
 testResults(int line, const SBuf &input, Http1::RequestParser &output, struct resultSet &expect)
 {
 #if WHEN_TEST_DEBUG_IS_NEEDED
     printf("TEST @%d, in=%u: " SQUIDSBUFPH "\n", line, input.length(), SQUIDSBUFPRINT(input));
 #endif
 
     CPPUNIT_ASSERT_EQUAL(expect.parsed, output.parse(input));
     CPPUNIT_ASSERT_EQUAL(expect.needsMore, output.needsMoreData());
     if (output.needsMoreData())
         CPPUNIT_ASSERT_EQUAL(expect.parserState, output.parsingStage_);
-    CPPUNIT_ASSERT_EQUAL(expect.status, output.request_parse_status);
+    CPPUNIT_ASSERT_EQUAL(expect.status, output.parseStatusCode);
     CPPUNIT_ASSERT_EQUAL(expect.msgStart, output.req.start);
     CPPUNIT_ASSERT_EQUAL(expect.msgEnd, output.req.end);
     CPPUNIT_ASSERT_EQUAL(expect.suffixSz, output.buf_.length());
     CPPUNIT_ASSERT_EQUAL(expect.methodStart, output.req.m_start);
     CPPUNIT_ASSERT_EQUAL(expect.methodEnd, output.req.m_end);
     CPPUNIT_ASSERT_EQUAL(expect.method, output.method_);
     CPPUNIT_ASSERT_EQUAL(expect.uriStart, output.req.u_start);
     CPPUNIT_ASSERT_EQUAL(expect.uriEnd, output.req.u_end);
     if (expect.uri != NULL)
         CPPUNIT_ASSERT_EQUAL(0, output.uri_.cmp(expect.uri));
     CPPUNIT_ASSERT_EQUAL(expect.versionStart, output.req.v_start);
     CPPUNIT_ASSERT_EQUAL(expect.versionEnd, output.req.v_end);
     CPPUNIT_ASSERT_EQUAL(expect.version, output.msgProtocol_);
 }
 
 void
 testHttp1Parser::testParserConstruct()
 {
     // whether the constructor works
     {
         Http1::RequestParser output;
         CPPUNIT_ASSERT_EQUAL(true, output.needsMoreData());
         CPPUNIT_ASSERT_EQUAL(Http1::HTTP_PARSE_NONE, output.parsingStage_);
-        CPPUNIT_ASSERT_EQUAL(Http::scNone, output.request_parse_status); // XXX: clear() not being called.
+        CPPUNIT_ASSERT_EQUAL(Http::scNone, output.parseStatusCode); // XXX: clear() not being called.
         CPPUNIT_ASSERT_EQUAL(-1, output.req.start);
         CPPUNIT_ASSERT_EQUAL(-1, output.req.end);
         CPPUNIT_ASSERT(output.buf_.isEmpty());
         CPPUNIT_ASSERT_EQUAL(-1, output.req.m_start);
         CPPUNIT_ASSERT_EQUAL(-1, output.req.m_end);
         CPPUNIT_ASSERT_EQUAL(HttpRequestMethod(Http::METHOD_NONE), output.method_);
         CPPUNIT_ASSERT_EQUAL(-1, output.req.u_start);
         CPPUNIT_ASSERT_EQUAL(-1, output.req.u_end);
         CPPUNIT_ASSERT(output.uri_.isEmpty());
         CPPUNIT_ASSERT_EQUAL(-1, output.req.v_start);
         CPPUNIT_ASSERT_EQUAL(-1, output.req.v_end);
         CPPUNIT_ASSERT_EQUAL(AnyP::ProtocolVersion(), output.msgProtocol_);
     }
 
     // whether new() works
     {
         Http1::RequestParser *output = new Http1::RequestParser;
         CPPUNIT_ASSERT_EQUAL(true, output->needsMoreData());
         CPPUNIT_ASSERT_EQUAL(Http1::HTTP_PARSE_NONE, output->parsingStage_);
-        CPPUNIT_ASSERT_EQUAL(Http::scNone, output->request_parse_status);
+        CPPUNIT_ASSERT_EQUAL(Http::scNone, output->parseStatusCode);
         CPPUNIT_ASSERT_EQUAL(-1, output->req.start);
         CPPUNIT_ASSERT_EQUAL(-1, output->req.end);
         CPPUNIT_ASSERT(output->buf_.isEmpty());
         CPPUNIT_ASSERT_EQUAL(-1, output->req.m_start);
         CPPUNIT_ASSERT_EQUAL(-1, output->req.m_end);
         CPPUNIT_ASSERT_EQUAL(HttpRequestMethod(Http::METHOD_NONE), output->method_);
         CPPUNIT_ASSERT_EQUAL(-1, output->req.u_start);
         CPPUNIT_ASSERT_EQUAL(-1, output->req.u_end);
         CPPUNIT_ASSERT(output->uri_.isEmpty());
         CPPUNIT_ASSERT_EQUAL(-1, output->req.v_start);
         CPPUNIT_ASSERT_EQUAL(-1, output->req.v_end);
         CPPUNIT_ASSERT_EQUAL(AnyP::ProtocolVersion(), output->msgProtocol_);
         delete output;
     }
 }
 
 void
 testHttp1Parser::testParseRequestLineProtocols()
 {
     // ensure MemPools etc exist

-------------- next part --------------
A non-text attachment was scrubbed...
Name: ParserNG_pt2_response_parser_mk1.patch.sig
Type: application/octet-stream
Size: 287 bytes
Desc: not available
URL: <http://lists.squid-cache.org/pipermail/squid-dev/attachments/20141114/069eee44/attachment-0001.obj>

