[squid-users] TCP_MISS/304 question

Yuri Voinov yvoinov at gmail.com
Thu Oct 13 21:44:12 UTC 2016


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA256
 


14.10.2016 2:48, Alex Rousskov wrote:
> On 10/13/2016 01:44 PM, Yuri Voinov wrote:
>
>> However, this is nothing more than word games, Alex.
>
> ... unless the definition of a hit affects your billing or your
> interpretation of Squid documentation or the developer interpretation of
> the code. Definitions matter! You yourself have seen their importance
> when you showed your excellent byte hit ratio results but folks were
> looking at the ordinary document hit ratio numbers instead.
Sure. But the difference between TCP_HIT itself and a byte hit is obvious.
>
>
>
>> The question is -
>> can we more or less significant differences from known what hit proxy
>> code level and / or transactions which, obviously, on the proxy level,
>> we can see in its entirety.
>
> Sorry, I do not understand the question.
What I mean is that at the proxy level, seeing the transaction as a
whole, we are able to distinguish a hit, or something like one, from all
other transactions. We see the whole session in its entirety. We see
repeated requests from the same client for the same resource.
Accordingly, we can judge quite clearly from the client and server
headers what is happening. Correct?

Specifically, in this particular case, the proxy IMS settings are enabled:

refresh_all_ims on
reload_into_ims on

At the web-page level we have a periodic reload/refresh directive, which
forces a freshness check of the content (after it is initially stored in
the shared cache).

In this situation (and I've checked that this web page's elements are
stored in the cache) TCP_MISS/304 means TCP_REFRESH_UNMODIFIED.

So, this is exactly a HIT.

I don't mean literally, but in effect. Correct?

>
>
>>> Unfortunately, there are too many definitions of a "hit".
>
>> There is no many definitions of hit. We are talking about the caching
>> proxy, which is basically no different from all the other caches, and
>> subject to the same rules.
>
> You are oversimplifying a complex subject matter. If Squid delivers a
> single response comprising 1000 bytes from the cache and 10 bytes from
> the origin server, is that a hit or a miss? If Squid delivers the entire
> response from the cache but spends 10 minutes talking to the origin
> server about that object first, is that a hit or a miss? Different
> people will give you different answers to those questions.
10 minutes is a bit above the TCP timeout, so I think the transaction
will be aborted. So Squid will write TCP_MISS_ABORTED in access.log. :)
>
>
> We have [poorly defined] byte hits, document hits, revalidation hits,
> stale hits, partial hits, etc., etc.
That is certainly true. The documentation is the problem.
>
>
>
>> If the first access does not find an object in the cache, it requests
>> from the network,
>
> yes
>
>> saves in the cache,
>
> or does not
Yes. Maybe, or maybe not. But in this case:
1) We know the transaction history and we know the object(s) are in the
cache.
2) The proxy can easily check this, right? Just swap the object in from
disk to memory. If this succeeds, the object is in the cache, so we can
qualify it as a HIT. Otherwise, it is exactly a MISS.
>
>
>> and re-treatment or gets a hit,
>
> or does not
>
>> "the object is not changed." Dot.
>
> or the Squid-cached object did not change but the client-cached object
> did. Or vice versa.
The client-cached object is supplied by Squid. They (ideally) must not
differ. The client cache and Squid's cache operate as a chain, one being
the source for the other.
>
>
>
>> If the time in the cache
>> object lifetime expires, or a lifetime on the server timed out - the
>> object is requested again and a miss is recorded.
>
> * Yes, if you define a miss as "contact with the origin server".
I want to add: "contact with the origin server to get content", not for
revalidation purposes. If revalidation returns "Object not changed",
this is a positive result and must be qualified as a HIT, IMO.
>
> * No, if contact with the origin server is OK for a hit as long as the
> server does not send the response _body_ back to Squid.
.... when revalidation succeeds, i.e. the object in the shared cache is
not stale, this is a HIT. We're not interested in the state of the
client browser's cache. Only the shared cache matters.
>
>
>
>> if
>> the proxy responds to the client "has not changed", it means, in fact,
>> that the client has a copy of the object
>
> Yes.
>
>> and a copy of the proxy object,
>
> The copy in the proxy cache may be different from the copy in the client
> cache or may not exist at all.
Yes. If the object does not exist in the proxy, this is a proxy MISS. If
the client cache does not contain the object, the client goes to the
proxy and asks it for the object. If it is found, excellent: for the
client this is a MISS, for the proxy a HIT. If the proxy also does not
contain the object, it will be a MISS-MISS and the object is loaded from
the origin.
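
The client/proxy chain described above can be sketched roughly as
follows. This is only an illustration in Python, not Squid code: the
lookup() function and the plain dicts standing in for the client cache,
the proxy cache and the origin server are hypothetical, and freshness
and revalidation are deliberately ignored.

```python
def lookup(url, client_cache, proxy_cache, origin):
    """Return (client_result, proxy_result, body) for one request.

    All three stores are plain dicts here (an assumption made purely for
    illustration); real caches also track freshness, which is ignored.
    """
    if url in client_cache:
        # The client answers from its own cache; the proxy is never asked.
        return "HIT", None, client_cache[url]
    if url in proxy_cache:
        # Client MISS, proxy HIT: the proxy copy becomes the client copy.
        body = proxy_cache[url]
        client_cache[url] = body
        return "MISS", "HIT", body
    # MISS-MISS: load from the origin and fill both caches on the way back.
    body = origin[url]
    proxy_cache[url] = body
    client_cache[url] = body
    return "MISS", "MISS", body
```

Calling lookup() three times for the same URL, and emptying the client
cache after the first call, walks through all three cases in order:
MISS-MISS, then client MISS with proxy HIT, then a client HIT.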

>
>
>
>> the proxy and responds to the client, performing REFRESH that the object
>> did not change. What is this, if not hit?
>
> Assuming the proxy asked the origin server whether the object in the
> client (or the proxy, depending on the circumstances) cache is fresh,
> for many, it is
>
> * a [document] miss (because there was a potentially very slow contact
> with the origin server) or
> * a [byte] hit (because the response body came from the Squid cache and
> not from the origin server).
>
> Resisting the existence of different valid hit definitions is futile
> IMO. State what _your_ definition is (be as precise as possible; this
> may require several iterations) and then you may ask whether a
> particular transaction is a hit.
>
> Alex.
I agree that there are a number of boundary cases. However, in most
cases we are dealing with a relatively simple chain, which, in my
opinion, should be considered. How should revalidation and its results
be regarded? If revalidation confirms that the object is not stale and
not expired, it's a hit, isn't it?

If revalidation fails (the object is stale/expired), everything is clear
and there is nothing to discuss. Definitely a miss.
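
To make the disagreement concrete, the two readings discussed in this
thread can be written as two predicates over the same transaction. This
is a minimal sketch in Python; the Transaction class and both function
names are made up for illustration and do not correspond to anything in
Squid.

```python
from dataclasses import dataclass


@dataclass
class Transaction:
    contacted_origin: bool  # did the proxy talk to the origin server at all?
    body_from_origin: bool  # did the origin send the response body?


def is_hit_no_contact(t):
    """Hit only if the origin server was not contacted at all."""
    return not t.contacted_origin


def is_hit_no_body(t):
    """Hit as long as the response body did not come from the origin."""
    return not t.body_from_origin


# A successful revalidation: the origin was asked, but it sent no body.
revalidated = Transaction(contacted_origin=True, body_from_origin=False)
```

For the revalidated transaction the first predicate says miss and the
second says hit, which is exactly the case being argued about here.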

Well, let's say we do not know and cannot know about the object in the
client cache. Assume also that we do not want to check whether this
object is in the proxy cache. Let us assume that we do not want to spend
resources to figure out what happened to the object later, in the
client's browser or in the proxy's disk cache. OK. In this case, would
it not be more correct to write TCP_NONE/304 in the log?

In this case, we're saying directly: "We do not know whether it is a hit
or not. We only know that the object has not changed since the last
request/revalidation. We do not want to know, and you can interpret it
any way you like".

It would be more correct, it seems to me, than just saying
"TCP_MISS/304 - this is a cache miss, even if it really was not."

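Until something like that exists in the log itself, the same idea can be
applied when post-processing access.log. The Python sketch below is only
an illustration under several assumptions: the native log format with
the result code as the fourth whitespace-separated field, a hypothetical
tally() helper, made-up sample lines, and the simplification that any
code containing "HIT" (plus TCP_REFRESH_UNMODIFIED) counts as a hit.

```python
from collections import Counter

# Made-up sample lines in the native access.log layout (illustration only).
SAMPLE = [
    "1476395052.123 45 192.0.2.10 TCP_MISS/304 312 GET "
    "http://example.com/a.css - HIER_DIRECT/203.0.113.5 -",
    "1476395053.456 1 192.0.2.10 TCP_MEM_HIT/200 5120 GET "
    "http://example.com/b.js - HIER_NONE/- application/javascript",
    "1476395054.789 120 192.0.2.10 TCP_MISS/200 20480 GET "
    "http://example.com/c.png - HIER_DIRECT/203.0.113.5 image/png",
]


def tally(lines, ambiguous_as="unknown"):
    """Count hits and misses, with TCP_MISS/304 counted as the caller asks.

    ambiguous_as may be "hit", "miss" or "unknown".
    """
    counts = Counter()
    for line in lines:
        fields = line.split()
        if len(fields) < 4:
            continue  # skip malformed lines
        code, _, status = fields[3].partition("/")
        if code == "TCP_MISS" and status == "304":
            counts[ambiguous_as] += 1
        elif "HIT" in code or code == "TCP_REFRESH_UNMODIFIED":
            counts["hit"] += 1  # assumption: revalidated responses are hits
        else:
            counts["miss"] += 1
    return counts
```

With the default, TCP_MISS/304 lands in a separate "unknown" bucket;
tally(SAMPLE, "hit") or tally(SAMPLE, "miss") applies one of the two
definitions instead, so the reader chooses the interpretation.
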
WBR, Yuri
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
 
iQEcBAEBCAAGBQJYAAAsAAoJENNXIZxhPexGo5EH/15gr4ShSolL0I0RFZOnZbcg
UPZle45kf9ODeLHQ8RKeUwlmOo3jIEeX1WDoYV++scHsqMeBaydwG4ysjED8RhGf
TzfGJyzmTUDzcxe4QpYft3JFvml0uIc74RAPCVq7w6a4FKuPMVHvjqJwJeQtiKSU
V8zkME6SA4K2HrhtiZjvWvFV0YOmH9oQEj7t4S2lt/OJG6w0AsTV3qLHdC6kyeso
rLJkxmJDW7oH7Va+xoP7R6hflsoMv9t3MvuOS1slNyDSZ+nqsRPTJSrhksl7oCIV
I3GI4aPtOecWJOKeenbjWHDyueI6A1+1E5PKgpQW0ysziHi39HyB0Gz1yBhbeWo=
=bTMv
-----END PGP SIGNATURE-----
