[squid-users] TCP_MISS/304 question

Fri Oct 14 12:07:21 UTC 2016

On 14/10/2016 10:44 a.m., Yuri Voinov wrote:
> 
> 
> 
> 14.10.2016 2:48, Alex Rousskov wrote:
>> On 10/13/2016 01:44 PM, Yuri Voinov wrote:
> 
>>> However, this is nothing more than word games, Alex.
> 
>> ... unless the definition of a hit affects your billing or your
>> interpretation of Squid documentation or the developer interpretation of
>> the code. Definitions matter! You yourself have seen their importance
>> when you showed your excellent byte hit ratio results but folks were
>> looking at the ordinary document hit ratio numbers instead.
> Sure. But difference with TCP_HIT itself and byte hit is obvious.
> 

Even that is not as obvious as it appears at first. Since 3.5.22 it is
possible that the proxy finds an item in its cache that needs to be
revalidated. When asked, the server responds with a 304 that contains
info far older than what the proxy found (maybe the proxy's copy came
from a more up-to-date server IP).
In this situation Squid discards the outdated server response and
delivers its own cached content. That should be marked as TCP_HIT under
the current tagging system.

This case and others involving the (thankfully rare) If-Unmodified-Since
header are why I am trying to work out a better tagging scheme that will
be less confusing and more descriptive for the operations.

> 
>>> The question is -
>>> can we more or less significant differences from known what hit proxy
>>> code level and / or transactions which, obviously, on the proxy level,
>>> we can see in its entirety.
> 
>> Sorry, I do not understand the question.
> I want to say that on the proxy level, seeing the transaction as a
> whole, we are able to differentiate a hit, or something like it, from all
> other transactions. We see the whole session in its entirety. We see
> repeated queries from the same client to the same resource. Accordingly,
> we can judge quite clearly what is happening from the behavior of the
> client or server headers. Correct?
> 

HTTP/1.0 was simple. HIT and MISS were easily known and matched what you
have learnt to expect.

HTTP/1.1 adds 2x time-based If-*Since headers, 2x ETag based If-*Match
headers, and one extendable If: header - each of which has a 200 or 304
response. On top of that, the client cache, proxy cache and upstream
cache each have one of the content-vs-nothing states.

The result is that the log line describes one of the 3*(2^5) = 96
different transaction cases which could occur.

My math there is assuming that each header adds only a binary
condition (X, or not X). If they add trinary (X, not-X, ignore) then it's
2^2*(3^5) = 972 cases, which seems a bit much to me.
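
As a rough check of the binary count, here is a small Python sketch. The
header names and the three cache positions are only labels for the
conditions listed above, not anything Squid defines:

```python
from itertools import product

# Labels for the five conditional request headers mentioned above.
CONDITIONALS = [
    "If-Modified-Since",
    "If-Unmodified-Since",
    "If-Match",
    "If-None-Match",
    "If",
]

# The three cache positions, matching the factor of 3 in the estimate.
CACHES = ["client", "proxy", "upstream"]

# Binary model: each header condition either holds (True) or does not (False).
cases = [
    (cache, flags)
    for cache in CACHES
    for flags in product([False, True], repeat=len(CONDITIONALS))
]

print(len(cases))  # 3 * 2**5 = 96
```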

> Specifically, in this particular case, the proxy IMS settings are enabled:
> 
> refresh_all_ims on
> reload_into_ims on
> 

These are converting CC:no-cache or CC:max-age=0 client headers into IMS
(If-Modified-Since) revalidations against the server. All they do is
convert the Squid<->server transaction into a revalidation.

The client should still get a /200 object even though the server side
used revalidation. It is a bug for Squid to respond to those client
requests with /304 unless the client *also* sent If-* headers along with
the CC reload/refresh header.
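
A minimal sketch of that rule, in Python. The function name and the
example requests are hypothetical, and this is only an illustration of
the rule as stated here, not how Squid implements it:

```python
# Conditional request headers, compared case-insensitively.
CONDITIONAL_HEADERS = {
    "if-modified-since",
    "if-unmodified-since",
    "if-match",
    "if-none-match",
    "if",
}


def client_may_get_304(request_headers):
    """Return True if a 304 reply to this client request is legitimate."""
    names = {name.lower() for name in request_headers}
    return bool(names & CONDITIONAL_HEADERS)


# A plain forced reload: the client sent no validator, so it needs a 200.
reload_only = {"Cache-Control": "no-cache"}

# A reload that also carries a validator: a 304 is acceptable.
reload_with_ims = {
    "Cache-Control": "max-age=0",
    "If-Modified-Since": "Fri, 14 Oct 2016 00:00:00 GMT",
}

print(client_may_get_304(reload_only))      # False
print(client_may_get_304(reload_with_ims))  # True
```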

> At the web-page level we have a periodic reload/refresh directive, which
> forces a freshness check of the content (after it is initially stored in
> the shared cache).
> 

These reload/refresh requests coming from the client are supposed to
happen only when a user identifies breakage in the web page and manually
forces it (refresh button in the browser). Bots can use them, but should
not need to.

So when HTTP is working properly they should almost never happen (client
never sees breakage).

If you find them happening a lot it is a sign of breakage.

> In this situation (and I've checked that these web-page elements are
> stored in cache) TCP_MISS/304 means TCP_REFRESH_UNMODIFIED.
> 
> So, this is HIT exactly.

No "exactly" about it.

If it was *exactly* a HIT that would mean the cached response was a 304
message. Not some object the 304 was about, but the actual 304 status
line + mime headers.

Which would also be logged as TCP_HIT/304, identical to the different
case where a HIT on a normal cache object generated a new 304 to the
client.

> 
> I'm not saying - literally. And in fact. Correctly?

With 96 permutations of cases there may be the odd mistake. But where
Squid has a tag code for one case that case is usually tagged correctly.
So the TCP_MISS/304 is definitely *not* a TCP_REFRESH_UNMODIFIED - but
it also may not be strictly a MISS either.

It is entirely possible that a TCP_MISS/304 is a real TCP_MISS/304. MISS
on the proxy cache and 304 from upstream server.

The REFRESH MODIFIED/UNMODIFIED tags are given to the time-based
If-*Since headers. The ETag based ones are not tagged the same because
'(un)modified' is not applicable. So this TCP_MISS/304 may be one of the
ETag revalidations on proxy content that we don't have a special code for
yet.

> 
>>>> Unfortunately, there are too many definitions of a "hit".
> 
>>> There are not many definitions of hit. We are talking about the caching
>>> proxy, which is basically no different from all the other caches, and
>>> subject to the same rules.
> 
>> You are oversimplifying a complex subject matter. If Squid delivers a
>> single response comprising 1000 bytes from the cache and 10 bytes from
>> the origin server, is that a hit or a miss? If Squid delivers the entire
>> response from the cache but spends 10 minutes talking to the origin
>> server about that object first, is that a hit or a miss? Different
>> people will give you different answers to those questions.
> 10 minutes is a bit above the TCP timeout, so it will be aborted, I think.
> So Squid writes TCP_MISS_ABORTED in access.log. :)
> 

Or TCP_HIT_ABORTED or TCP_REFRESH_ABORTED or TCP_CLIENT_REFRESH_ABORTED :-(

The distinction is in the headers and your definition of HIT/MISS.

> 
>> We have [poorly defined] byte hits, document hits, revalidation hits,
>> stale hits, partial hits, etc., etc.
> That is true. The documentation is the problem.
> 

Partially. The documentation issue is caused in part by the worse problem
of cases overlapping and currently not being grouped together very well.

I am working on the fix for that. But it is going very slowly because it
has to touch so much code in areas that I'm not very familiar with yet.
Plus there is the definitions issue Alex talked about: agreeing on how to
define each tag as it is created leads to long discussions like this one.

> 
>>> If the first access does not find an object in the cache, it requests
>>> from the network,
> 
>> yes
> 
>>> saves in the cache,
> 
>> or does not
> Yes. Maybe, or maybe not. But in this case:
> 1) We know the transaction history and we know the object(s) in cache.
> 2) The proxy can easily check it, right? Just swap the object in from disk
> to memory. If this succeeds, the object is in cache, so we can qualify it
> as HIT. Otherwise, exactly MISS.
> 
> 
>>> and re-treatment or gets a hit,
> 
>> or does not
> 
>>> "the object is not changed." Dot.
> 
>> or the Squid-cached object did not change but the client-cached object
>> did. Or vice versa.
> The client-cached object comes from Squid. They (ideally) must not be
> different. The client cache and Squid's cache operate like a chain, one
> being the source for the other.

Within HTTP that is true. But clients are not restricted to HTTP as
sources for their cache contents. So we cannot draw such a clear line in
the definition of what is happening for a random line selected out of
somebody's access.log.

All we can say is what sub-group of those 96 cases has happened to cause
those tag values to appear. For exact knowledge (or to say what that
means in terms of byte counts) one needs to see the HTTP headers, at
least the client ones and preferably the full 11,2 trace data
containing both client and server messages.
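
To show how little the log line itself carries, here is a small Python
sketch that pulls the tag and status out of an access.log entry. It
assumes the default native "squid" log format, where the fourth field is
the result code; the sample line is made up for illustration:

```python
def parse_result(line):
    """Split the result field of a native-format access.log line."""
    fields = line.split()
    tag, status = fields[3].split("/")
    return tag, int(status)


# Hypothetical log line (documentation-range addresses, invented values).
sample = (
    "1476446841.123     45 192.0.2.10 TCP_MISS/304 312 GET "
    "http://example.com/foo - HIER_DIRECT/198.51.100.7 -"
)

print(parse_result(sample))  # ('TCP_MISS', 304)
```

That pair is everything the line says about the cache outcome. Which
If-* headers the client or the server actually exchanged is not in it.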

> 
>>> If the time in the cache
>>> object lifetime expires, or a lifetime on the server timed out - the
>>> object is requested again and a miss is recorded.
> 
>> * Yes, if you define a miss as "contact with the origin server".
> I want to add: "contact with the origin server to get content". Not for
> revalidation purposes. If revalidation returns "Object not changed" -
> this is positive and must be qualified as a HIT IMO.

Even this definition is tricky. For revalidations using the
If-Unmodified-Since header the server (or squid) is instructed to
deliver a 200+object when the "Object not changed" state happens.

That is usually where TCP_REFRESH_UNMODIFIED/200 comes from.

> 
>> * No, if contact with the origin server is OK for a hit as long as the
>> server does not send the response _body_ back to Squid.
> .... when revalidation true - i.e. object in shared cache not stale,
> this is HIT. We're not interested in client browser's cache state. Only
> shared cache matters.
> 
> 
> 
>>> if
>>> the proxy responds to the client "has not changed", it means, in fact,
>>> that the client has a copy of the object
> 
>> Yes.
> 
>>> and a copy of the proxy object,
> 
>> The copy in the proxy cache may be different from the copy in the client
>> cache or may not exist at all.
> Yes. If the object does not exist in the proxy - this is a proxy MISS. If
> the client cache does not contain the object - the client goes to the
> proxy and asks it for the object. Found - excellent, for the client this
> is a MISS, for the proxy - a HIT. If the proxy also does not contain the
> object - it will be MISS-MISS and the object is loaded from the origin.
> 

This is where the ETag based If-*Match headers get in the way of a
definition. The proxy could easily have what appears to be a useful
response to the URL, but be told only to send it if *not* matching a set
of ETag values. Or the opposite.

I guess what I (we?) am trying to get across is that defining the entire
Internet in black-and-white terms of HIT vs MISS is wrong for HTTP/1.1
traffic. Within 1.1 traffic REFRESH is the norm, and HIT/MISS on local
proxy cache is only a minor subset of what is going on - even for
transactions logged as TCP_HIT or TCP_MISS.

Squid [read: "the changes we are making to Squid-3/4"] is trying to
minimize bandwidth on both server and client connections. Sometimes only
one connection can be reduced, many times both can be.

What any particular sysadmin sees in regards to client or server
bandwidth used by Squid varies a lot, usually according to how well
Squid's HTTP/1.1 feature support matches the relevant client or server
feature support.

> 
>>> the proxy and responds to the client, performing REFRESH that the object
>>> did not change. What is this, if not hit?
> 
>> Assuming the proxy asked the origin server whether the object in the
>> client (or the proxy, depending on the circumstances) cache is fresh,
>> for many, it is
> 
>> * a [document] miss (because there was a potentially very slow contact
>> with the origin server) or
>> * a [byte] hit (because the response body came from the Squid cache and
>> not from the origin server).
> 
>> Resisting the existence of different valid hit definitions is futile
>> IMO. State what _your_ definition is (be as precise as possible; this
>> may require several iterations) and then you may ask whether a
>> particular transaction is a hit.
> 
>> Alex.
> I agree that there are a number of boundary cases. However, in most
> cases we are dealing with a relatively simple chain, which, in my
> opinion, is what should be considered. How should revalidation and its
> results be regarded? If revalidation confirms that the object is
> not stale and not expired - it's a hit, is it not?
> 
> If revalidation fails - object stale/expired - everything is clear and
> there is nothing to discuss. Definitely a miss.

Revalidation does not "fail". It happens or it does not. The result is
always a fresh object when it happens.

In order to cram the 96 revalidation states down into the overly simple
HIT vs MISS terminology you are implicitly using the byte-HIT
definition, where the size of updating headers only vs updating the
payload body is what differs.
So your definition only means HIT = low bytes, MISS = many bytes.
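
A rough Python sketch of what that byte-based view measures. The function
name and the byte counts are invented for illustration, and this is not
how Squid computes its own byte hit ratio:

```python
def byte_hit_ratio(bytes_to_client, bytes_from_server):
    """Fraction of client-side bytes that did not have to come from the server."""
    if bytes_to_client == 0:
        return 0.0
    saved = max(bytes_to_client - bytes_from_server, 0)
    return saved / bytes_to_client


# Revalidation answered by a headers-only 304 from the server, while the
# client still receives the full cached object: few server bytes.
low_bytes = byte_hit_ratio(10_000, 300)

# Full refetch: the server sends everything the client receives.
many_bytes = byte_hit_ratio(10_000, 10_000)

print(low_bytes, many_bytes)
```

Under this view the first transaction counts as a "HIT" and the second as
a "MISS", regardless of which tag ends up in the log.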

> 
> Well, let's say we do not know and cannot know about the object in the
> client cache. Assume also that we do not want to check whether this
> object is in the proxy cache. Let us assume that we do not want to spend
> resources to figure out what happened to the object later, in the
> client's browser or in the proxy's disk cache. OK. In this case, would it
> not be more correct to write TCP_NONE/304 in the log?

No. Knowing zero about the client object (no revalidation headers
present) is one of the 96 cases. It looks like this:

 GET /foo HTTP/1.1
 Host: example.com

The response to this has to always involve a "/200". So your /304 is not
possible.
The remainder of the tags are decided by whether Squid has cached
content and needs to revalidate it. That will determine whether
HIT/MISS/REFRESH_* is logged.

Both REFRESH and MISS will involve a server, but in different ways and
with different amounts of bandwidth used.
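
The decision for a request with no revalidation headers can be sketched
in Python as below. The function and its parameters are hypothetical and
heavily simplified (no aborts, no memory-vs-disk distinction); it only
mirrors the reasoning in this mail, not the actual Squid code paths:

```python
def tag_for_unconditional_get(cached, fresh=False, server_said_304=False):
    """Pick the log tag for a client GET carrying no If-* headers.

    The client-facing status is always 200 here, because the client has
    told us nothing about any copy it may hold.
    """
    if not cached:
        # Nothing in the proxy cache: full fetch from the server.
        return "TCP_MISS/200"
    if fresh:
        # Cached and still fresh: no server contact needed.
        return "TCP_HIT/200"
    if server_said_304:
        # Stale copy revalidated; the server confirmed it with a 304.
        return "TCP_REFRESH_UNMODIFIED/200"
    # Stale copy revalidated; the server sent a replacement object.
    return "TCP_REFRESH_MODIFIED/200"


print(tag_for_unconditional_get(cached=False))
print(tag_for_unconditional_get(cached=True, fresh=True))
print(tag_for_unconditional_get(cached=True, server_said_304=True))
print(tag_for_unconditional_get(cached=True, server_said_304=False))
```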

Which brings us to your description of the squid cache...

> 
> In this case, we're talking directly - "We do not know, hit it or not.
> We only know that the object has not changed since the last
> request/revalidation. We do not want to know, and you can interpret it
> any way you like".

That would be a TCP_HIT/200. Maximum possible to-client bandwidth
expenditure. Zero server bandwidth expenditure.

However, by using the condition "We only know that the object has not
changed" you have defined this use-case as a situation where
revalidation is not performed. That alone makes it irrelevant to this
discussion about revalidations.

> 
>  It would be more correct, it seems to me, than just to say -
> "TCP_MISS/304 - This is a cache miss, whatever it was not really."
> 

This status code does not describe that.

It describes a cache MISS where, once the server response was stored,
Squid was able to optimize away the bandwidth to the client and send only
a 304 response.

OR, it describes a MISS where Squid relayed conditional If-* headers to
the server. So the server responded with a 304 that Squid had to relay on
to the client (without adding to its cache).

Amos