[squid-dev] [PATCH] Increase request buffer size to 64kb

Mon Apr 4 00:32:46 UTC 2016

On 03/30/2016 11:50 PM, Nathan Hoad wrote:

> Alex, I've tried 8, 16, 32, 128 and 512 KB values - all sizes leading
> up to 64 KB scaled appropriately. 128 and 512 were the same or
> slightly worse than 64, so I think 64 KB is the "best value".

Sounds good, but it is even more important that you have explained *why*
below.

> On 30 March 2016 at 21:29, Amos Jeffries <squid3 at treenet.co.nz> wrote:
>> One thing you need to keep in mind with all this is that the above
>> macros *do not* configure the network I/O buffers.

> I don't think this is quite true - I don't think it's intentional, but
> I am led to believe that HTTP_REQBUF_SZ does influence network IO
> buffers in some way. See below.

You are probably both right: HTTP_REQBUF_SZ influences network I/O but
not necessarily all network I/O buffer capacities. In other words,
increasing HTTP_REQBUF_SZ improves uncacheable miss performance until
HTTP_REQBUF_SZ matches the size of the second smallest buffer (or I/O
size) in the buffer chain. In your test, that limit was read_ahead_gap.

>> In the long-term plan those internal uses will be replaced by SBuf

... or MemBlob or something that does not exist yet but uses MemBlob.

SBuf is trying to provide "everything to everybody" and, naturally,
becomes a bad choice in some situations with strict requirements (e.g.,
when one has to preserve a raw buffer pointer but does not want single
buffer content ownership).

> Looking purely at system calls, it shows the reads from the upstream
> server are being read in 16 KB chunks, whereas writes to the client
> are done in 4 KB chunks. With the patch, the writes to the client
> increase to 16 KB, so it appears that HTTP_REQBUF_SZ does influence
> network IO in this way.

Naturally: One cannot write using 16 KB chunks when the data goes
through a 4KB buffer. Please note that the position of that 4KB buffer
in the buffer chain is almost irrelevant -- the weakest link in the
chain determines "bytes pipelining" or "bytes streaming" speed.
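
To illustrate the weakest-link point, here is a minimal sketch. It is not
Squid code: the function name and the "list of capacities" model are made
up for this email, and it assumes every byte passes through every buffer
in the chain.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical model, not Squid code: each element is the capacity in bytes
// of one buffer (or the size of one I/O) that the response passes through.
std::size_t largestChunkThroughChain(const std::vector<std::size_t> &capacities)
{
    assert(!capacities.empty()); // an empty chain has no meaningful limit
    // The smallest capacity bounds how many bytes can move in one step.
    return *std::min_element(capacities.begin(), capacities.end());
}
```

For a chain of 16 KB, 4 KB, and 64 KB buffers, this returns 4096 no
matter where the 4 KB buffer sits in the list.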

> However post-patch the improvement is quite
> substantial, as the write(2) calls are now using the full 64 KB
> buffer. 

This is an important finding! It shows that read_ahead_gap documentation
lies or at least misleads: Evidently, that directive controls not just
accumulation of data but [maximum] network read sizes, at least for
HTTP. Please fix that documentation in your next patch revision.

> At this stage, I'm not entirely sure what the best course of action
> is. I'm happy to investigate things further, if people have
> suggestions. read_ahead_gap appears to influence downstream write
> buffer sizes, at least up to the maximum of HTTP_REQBUF_SZ.

In other words, Squid Client (http.cc and clients/*) is able to give
Squid Server (client_side*cc and servers/*) the results of a single
Client network read (i.e., a single read from the origin server or
peer). The result is copied to the ClientStream StoreIOBuffer. Thus, if
we do not want to slow Squid down artificially, the StoreIOBuffer
capacity should match the maximum Client read I/O size.
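
A rough way to see the cost of a mismatch, as a sketch only: the helper
below is hypothetical (not an existing Squid function) and assumes that
each copy step moves at most one buffer-full of data.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper, not Squid code: the number of copy steps needed to
// pass readSize bytes through an intermediate buffer, assuming each step
// moves at most bufferCapacity bytes.
std::size_t copyStepsNeeded(std::size_t readSize, std::size_t bufferCapacity)
{
    assert(bufferCapacity > 0); // a zero-capacity buffer cannot move data
    return (readSize + bufferCapacity - 1) / bufferCapacity; // ceiling division
}
```

Under that assumption, a 16 KB read going through a 4 KB buffer needs
four steps, while a buffer matching the read size needs one.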

Do we use that StoreIOBuffer to write(2) to the HTTP client? If not,
what controls the I/O buffer size for writing to the HTTP client?

> It would
> be nice if that buffer size was independently run-time configurable

Actually, it would be nice if the buffer sizes just matched, avoiding
both artificial slowdown and the need for careful configuration in most
cases.

Whether configuration is also desirable in some special situations is a
separate question. If nobody comes forward with such a special
situation/need, then we may be better off avoiding adding yet another
tunable and simply tying ClientStream buffer capacity to the
read_ahead_gap value. The only reason to add a configuration option
[that nobody we know needs] would be backward compatibility.

If backward compatibility is deemed important here, we can add a
configuration option, but change the _default_ buffer size to track
read_ahead_gap and, hence, avoid artificial slowdown (at least in v4).
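
What I have in mind for the "default tracks read_ahead_gap" option could
look roughly like the sketch below. The struct, its fields, and the
function are all hypothetical names for illustration; they are not
existing Squid configuration code, and using 0 for "not configured" is
just one possible convention.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical settings, not actual Squid configuration code.
struct BufferSettings {
    std::size_t readAheadGap;               // value of read_ahead_gap, in bytes
    std::size_t clientStreamBufferOverride; // 0 means "not configured"
};

// Use the explicit setting when the admin provided one; otherwise follow
// read_ahead_gap so that the two sizes match by default.
std::size_t clientStreamBufferCapacity(const BufferSettings &s)
{
    return s.clientStreamBufferOverride != 0 ? s.clientStreamBufferOverride
                                             : s.readAheadGap;
}
```

With no override, the ClientStream buffer capacity simply follows
read_ahead_gap; an admin who needs the old behavior can still pin it.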

How much impact does increasing read_ahead_gap and HTTP_REQBUF_SZ by B
and G bytes, respectively, have on overall Squid memory usage? Is that
B+G per concurrent transaction, roughly? Are there any other buffer
capacities that depend on those two parameters?

If the increase is B+G then your changes increase Squid RAM footprint by
C*(60+48) KB where C is the number of concurrent master transactions. A
Squid dealing with 1000 concurrent transactions would see its RAM usage
increase by about 100 MB, which is not terrible (and decreased response
times may reduce the number of concurrent transactions, partially
mitigating that increase). The commit message (and release notes) should
disclose the estimated increase though.
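
The arithmetic behind that estimate, as a sketch: the function name is
made up, and the model assumes the increase really is B+G per concurrent
master transaction with no other buffers growing, which is exactly the
open question above.

```cpp
#include <cassert>
#include <cstdint>

// Rough estimate only: assumes the extra RAM is exactly (B + G) per
// concurrent master transaction and that no other buffers grow.
std::uint64_t extraRamKB(std::uint64_t transactions,
                         std::uint64_t bKB, std::uint64_t gKB)
{
    return transactions * (bKB + gKB);
}
```

With 1000 transactions and the 60 KB and 48 KB figures, this gives
108000 KB, i.e., a bit over 105 MB when dividing by 1024.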

Thank you,

Alex.