[squid-users] Http write cache
Amos Jeffries
squid3 at treenet.co.nz
Sun Sep 10 17:25:18 UTC 2017
On 10/09/17 21:14, Olivier MARCHETTA wrote:
> Hello,
>
>> Origin servers can sometimes respond to requests with payload ("uploads") before the request has fully arrived, but any subsequent network issues are guaranteed to result in data loss - so the practice is discouraged.
>
> If I understand, when it's a download (GET), Squid will replace the payload with the object in cache, if fresh.
Nod. This is possible because two identical GET requests are expected to
produce the same response, so a fresh cached copy can be delivered in
place of the origin server's payload.
> But the HTTP control messages are still coming from the Origin server.
Not necessarily. There are no "control messages" as such in HTTP. The
cache controls are delivered along with the cached payload to indicate
what can be done with it. Synchronous server contact (aka revalidation)
before delivering a response is only required if those controls say so.
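A rough sketch of that revalidation step in Python (the host, path and
ETag value here are made up for illustration): the cache re-contacts the
origin with a conditional request, and a 304 reply means the stored
payload can be reused without the body being sent again.

  import http.client

  # Conditional GET: ask the origin whether the cached copy is still valid.
  conn = http.client.HTTPSConnection("example.com")
  conn.request("GET", "/doc.txt", headers={
      "If-None-Match": '"abc123"',  # ETag remembered from the cached response
  })
  resp = conn.getresponse()
  if resp.status == 304:
      # "Not Modified": serve the payload already in cache, nothing re-sent.
      pass
  else:
      # A fresh payload arrived; it replaces the cached copy.
      body = resp.read()
  conn.close()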
> In case of an upload (PUT), using the Squid cache won't accelerate it,
> because the client has to wait for the Origin server's response to the payload transfer (the request).
Yes. Squid has never seen the request before, so has no idea what
response will appear as a result.
>
> The only option to make uploads faster is if the Origin server is aware that the client is using a reverse proxy cache and responds to the upload request before the full payload transfer.
>
Close, but not quite. The server does not need to know about the proxy,
it just has to know the upload payload is a "pointless waste of
bandwidth" (where data loss doesn't matter) and deliver its response early.
For example, this is usually seen with NTLM authentication, where
uploads without credentials are denied early: the upload has to be
repeated in full with the right credentials anyway, so all the bytes
from the first attempt can be dropped in-transit by the proxy.
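A minimal sketch of that "deny early" behaviour, assuming a toy origin
listening on 127.0.0.1:8080 (everything here is made up for
illustration, not NTLM-specific): the server reads only the request
headers and, if no credentials are present, answers at once instead of
waiting for the upload body to finish arriving.

  import socket

  srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
  srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
  srv.bind(("127.0.0.1", 8080))
  srv.listen(1)
  while True:
      conn, _ = srv.accept()
      data = b""
      while b"\r\n\r\n" not in data:      # stop once the headers have arrived
          chunk = conn.recv(4096)
          if not chunk:
              break
          data += chunk
      headers = data.split(b"\r\n\r\n", 1)[0].lower()
      if b"authorization:" not in headers:
          # No credentials: deny early, the upload bytes never need to arrive.
          conn.sendall(b"HTTP/1.1 401 Unauthorized\r\n"
                       b"Content-Length: 0\r\nConnection: close\r\n\r\n")
      else:
          conn.sendall(b"HTTP/1.1 201 Created\r\n"
                       b"Content-Length: 0\r\nConnection: close\r\n\r\n")
      conn.close()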
> Tell me if I'm wrong, but I think that I understand now.
> Meaning that if I want to "bufferize" the writes it has to happen with another protocol before the WebDAV connection to Sharepoint Online.
>
The "other protocol" is WebDAV as far as I know. HTTP is just about
delivery of some request and its corresponding response. How WebDAV
transfers use HTTP messaging, and which parts of HTTP and WebDAV the
client and server implement may or may not support the behaviour you want.
You are then colliding with the difference in definition between "cache"
and "buffer". Caches store *past* data for the purpose of reducing
current/future server work; buffers store *current* data awaiting delivery.
An upload is normally not something seen previously, so it is not cacheable.
Proxies and the network itself *do* buffer data along the way. But that
in no way adds any asynchronous properties to HTTP. The client still has
to wait for the HTTP response to be delivered back to it before it can
consider the HTTP part of that transaction over - the "transaction" in
this context may or may not be the full WebDAV upload+processing on the
server.
HTTP has some mechanisms that can help improve upload behaviour and
avoid pointless bandwidth delivery: notably the Expect:100-continue and
Range features and the 201/202 status codes. WebDAV extensions to HTTP
add various other things I'm not very familiar with.
Between them they can a) signal to the client that a server is
contactable before the payload gets delivered, b) let that payload be
delivered in small chunks to minimize loss, and c) indicate that any
given part has completed arrival and is awaiting some state (ie full
object arrival) and/or some async processing.
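As one concrete illustration of a), here is a rough sketch of the
Expect: 100-continue handshake over a raw socket (hypothetical host and
path, plain HTTP on port 80): the client sends only the headers first,
and the payload goes out only if the server signals it is willing to
receive it.

  import socket

  body = b"...large upload payload..."
  s = socket.create_connection(("example.com", 80))
  s.sendall(
      b"PUT /upload/doc.bin HTTP/1.1\r\n"
      b"Host: example.com\r\n"
      b"Content-Length: " + str(len(body)).encode() + b"\r\n"
      b"Expect: 100-continue\r\n"
      b"\r\n"
  )
  interim = s.recv(4096)                  # wait for the server's verdict
  if interim.startswith(b"HTTP/1.1 100"):
      s.sendall(body)                     # server agreed: now send the payload
      final = s.recv(4096)                # e.g. "HTTP/1.1 201 Created"
  else:
      final = interim                     # early rejection, payload never sent
  print(final.split(b"\r\n", 1)[0].decode())
  s.close()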
BUT, as should be obvious, these are all application-logic level things
(ie WebDAV) and require explicit support by both the endpoint
applications on server and client for that logic to take place. The
async properties arise from how things are done *between* HTTP
transactions. The interactions are separate synchronous request+response
message pairs as far as Squid and any HTTP infrastructure is concerned.
Amos