[squid-users] TCP_MISS/304 question

Amos Jeffries squid3 at treenet.co.nz
Fri Oct 14 12:30:44 UTC 2016


On 15/10/2016 12:34 a.m., Yuri Voinov wrote:
> 
> A bit more details.
> 
> This is 4 transactions in chronological order. Two from wget -S and two
> from same PC via browser for the same URL:
> 
> *root @ khorne /tmp # wget -S
> http://www.gazeta.ru/nm2015/js/gazeta.media.query.js*
> --2016-10-14 17:18:05-- 
> http://www.gazeta.ru/nm2015/js/gazeta.media.query.js
> Connecting to 127.0.0.1:3128... connected.
> Proxy request sent, awaiting response...
>   HTTP/1.1 301 Moved Permanently
>   Server: nginx
>   Date: Fri, 14 Oct 2016 11:18:07 GMT
>   Content-Type: text/html
>   Content-Length: 178
>   Location: https://www.gazeta.ru/nm2015/js/gazeta.media.query.js
>   X-Cache: MISS from khorne
>   X-Cache-Lookup: MISS from khorne:3128
>   Connection: keep-alive
> Location: https://www.gazeta.ru/nm2015/js/gazeta.media.query.js [following]
> --2016-10-14 17:18:07-- 
> https://www.gazeta.ru/nm2015/js/gazeta.media.query.js

Notice how the Location header made wget send a second request to
*actually* load the object, and that request is for an HTTPS URL.

This means your use of HTTP is irrelevant. The HTTP request just results
in a 301 response. That is the end of the HTTP...
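
To make the redirect point concrete, here is a minimal sketch in Python
that walks a chain of responses and reports which URL the body finally
came from. The function name and the (status, headers) list are
hypothetical, chosen only for illustration. This is not how wget or
Squid implement it.

```python
from urllib.parse import urljoin, urlsplit

REDIRECT_STATUSES = (301, 302, 303, 307, 308)

def final_url(start_url, responses):
    """Follow Location headers through a list of (status, headers) pairs.

    `responses` is a hypothetical, already-captured chain of replies in
    the order the client received them.
    """
    url = start_url
    for status, headers in responses:
        if status in REDIRECT_STATUSES and "Location" in headers:
            # Location may be relative, so resolve it against the current URL.
            url = urljoin(url, headers["Location"])
        else:
            break  # first non-redirect reply: this URL supplied the body
    return url

chain = [
    (301, {"Location": "https://www.gazeta.ru/nm2015/js/gazeta.media.query.js"}),
    (200, {}),
]
url = final_url("http://www.gazeta.ru/nm2015/js/gazeta.media.query.js", chain)
print(urlsplit(url).scheme)
```

With the chain from the wget output above, the scheme of the URL that
delivered the 200 is https, not http.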


> Connecting to 127.0.0.1:3128... connected.
> Proxy request sent, awaiting response...
>   HTTP/1.1 200 OK
>   Server: nginx
>   Date: Fri, 14 Oct 2016 10:45:57 GMT
>   Content-Type: application/javascript; charset=windows-1251
>   Last-Modified: Fri, 30 Oct 2015 12:33:38 GMT
>   ETag: W/"cdf370-758-52351a306ac80"
>   Cache-Control: max-age=3600
>   Expires: Fri, 14 Oct 2016 11:45:57 GMT
>   Access-Control-Allow-Origin: *
>   Age: 1930
>   X-Cache: HIT from khorne
>   X-Cache-Lookup: HIT from khorne:3128
>   Transfer-Encoding: chunked
>   Connection: keep-alive
> Length: unspecified [application/javascript]
> Saving to: 'gazeta.media.query.js'
> 
> gazeta.media.query.     [ <=>               ]   1.84K  --.-KB/s    in
> 0s     
> 
> 2016-10-14 17:18:07 (138 MB/s) - 'gazeta.media.query.js' saved [1880]
> 
> /HTTP object in cache and HIT./

No. *HTTPS* object in cache and HIT.


> *
> **root @ khorne /tmp # wget -S
> https://www.gazeta.ru/nm2015/js/gazeta.media.query.js*
> --2016-10-14 17:18:30-- 
> https://www.gazeta.ru/nm2015/js/gazeta.media.query.js
> Connecting to 127.0.0.1:3128... connected.
> Proxy request sent, awaiting response...
>   HTTP/1.1 200 OK
>   Server: nginx
>   Date: Fri, 14 Oct 2016 10:45:57 GMT
>   Content-Type: application/javascript; charset=windows-1251
>   Last-Modified: Fri, 30 Oct 2015 12:33:38 GMT
>   ETag: W/"cdf370-758-52351a306ac80"
>   Cache-Control: max-age=3600
>   Expires: Fri, 14 Oct 2016 11:45:57 GMT
>   Access-Control-Allow-Origin: *
>   Age: 1953
>   X-Cache: HIT from khorne
>   X-Cache-Lookup: HIT from khorne:3128
>   Transfer-Encoding: chunked
>   Connection: keep-alive
> Length: unspecified [application/javascript]
> Saving to: 'gazeta.media.query.js.1'
> 
> gazeta.media.query.     [ <=>               ]   1.84K  --.-KB/s    in
> 0s     
> 
> 2016-10-14 17:18:30 (120 MB/s) - 'gazeta.media.query.js.1' saved [1880]
> 
> /HTTPS object in cache and HIT too./

No. The same HTTPS object from test #1 is still in cache and still
being HIT.

> 
> This is ok.

Uh, not if you are going to interpret the first test as being an HTTP
object in cache.

What this tells us is that fetching an HTTPS object twice in a row will
produce a HIT.
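
Why a HIT is plausible here can be sketched from the reply headers
alone: Age (1930, then 1953) is still below max-age=3600. The Python
below is a deliberately simplified freshness check, assuming only
Cache-Control: max-age and Age matter. It ignores Expires, heuristic
freshness and any Squid configuration, and the function names are
hypothetical.

```python
import re

def max_age(cache_control):
    """Return the max-age value from a Cache-Control header, or None."""
    match = re.search(r"(?:^|,)\s*max-age=(\d+)", cache_control, re.IGNORECASE)
    return int(match.group(1)) if match else None

def is_fresh(headers):
    """True if the stored reply is younger than its max-age lifetime."""
    lifetime = max_age(headers.get("Cache-Control", ""))
    if lifetime is None:
        return False  # sketch only: no other freshness source is considered
    return int(headers.get("Age", "0")) < lifetime

first = {"Cache-Control": "max-age=3600", "Age": "1930"}
second = {"Cache-Control": "max-age=3600", "Age": "1953"}
print(is_fresh(first), is_fresh(second))
```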

> 
> *Ctrl+F5 (force reload) from browser:*
> 
> 1476443947.419     92 192.168.100.103 TCP_MISS/200 2323 GET
> https://www.gazeta.ru/nm2015/js/gazeta.media.query.js -
> HIER_DIRECT/81.19.72.0 application/javascript
> 
> MISS - it is ok too, client browser sends no-cache.

Did you check the client request to verify that "no-cache" statement?

> 
> At this point we sure object in cache, right? Both in proxy cache and in
> client cache (client is the same in attempt 3 and 4). Now - refresh from
> browser on the same page (same session), which is equivalent of page
> auto-refresh.

Yes, that is a reasonable state to assume at this point. Though possibly
wrong, since it is an assumption.

> 
> *F5 (refresh) from the same browser:*
> 

NP: be aware that two fetches in a row is different from a force-refresh,
which is different from a non-forced refresh.

One of the two refreshes involves a no-cache header, the other involves
a max-age=0 header.
The double-fetch sends neither no-cache nor max-age=0.

Also be aware that "Refresh", the MSIE name for the button which got
copied by the other browsers, is browser GUI terminology, not HTTP
terminology.
HTTP terminology uses "force-refresh" or "reload" for the two request
header cases caused by F5 and Ctrl+F5.


> 1476443997.252     96 192.168.100.103 TCP_MISS/304 353 GET
> https://www.gazeta.ru/nm2015/js/gazeta.media.query.js -
> HIER_DIRECT/81.19.72.0 -
> 
> Here is it. Object in proxy cache, in client cache, revalidation is ok -
> object not changed. It must be TCP_REFRESH_UNMODIFIED, and this tag
> we've got with HTTP object via browser.

No. I think you have confused the GUI button name with the Squid log tag
name.
The REFRESH tag occurs on revalidation transactions. F5 (aka
"forced-refresh") can lead to that.
Ctrl+F5 (aka "reload") can only lead to MISS (aka reload, re-fetch,
"discard cached contents").

When a browser sends no-cache, the cached content must not be used, but
it can be updated by the reply that comes back.

When the browser sends max-age=0, the cached contents *can* be used
provided they meet the 0-second age criterion (i.e. revalidate, then use
the resulting cache object).
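
The two request cases above, plus the plain double-fetch, can be
sketched as a small classifier. This is an illustration of the rule as
described here, not Squid's actual logic. The function name and the
three return labels are hypothetical, and the real outcome also depends
on the proxy configuration and on the other request headers.

```python
def expected_handling(request_cache_control):
    """Classify a request by its Cache-Control header value.

    Returns one of three hypothetical labels:
      "miss"       - no-cache: cached content must not be used
      "revalidate" - max-age=0: revalidate, then use the cached object
      "cache"      - neither: a fresh cached object may be served as-is
    """
    directives = {
        part.strip().lower()
        for part in request_cache_control.split(",")
        if part.strip()
    }
    if "no-cache" in directives:
        return "miss"
    if "max-age=0" in directives:
        return "revalidate"
    return "cache"

for value in ("no-cache", "max-age=0", ""):
    print(repr(value), "->", expected_handling(value))
```

Checking which of these values the browser really sent is the first
step before reading anything into the log tag.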

> 
> /But shit! HTTPS goes TCP_MISS/304! We're expected to get
> TCP_REFRESH_UNMODIFIED/304! Because this is refresh operation, we're
> sure object in both caches - proxy and client, revalidation is ok, but
> this marks as MISS./

Your expectation was fooled by the browser mis-naming of things.

> 
> Why HTTP refresh goes with TCP_REFRESH_UNMODIFIED, and the same object
> via HTTPS goes with TCP_MISS? As shown above, object has no headers
> preventing caching.

HTTP is not relevant for the reason stated above. What your test has
done is fetch the same https:// URL in three different ways and see
three different log entries result from that.

Whether they were the right responses is unknown. But at least they were
different. I would be more worried if they were the same.
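
For comparing such entries side by side, a minimal Python sketch that
pulls the tag and status out of an access.log line may help. It assumes
the whitespace-separated layout seen in the lines quoted in this thread
(timestamp, elapsed, client, tag/status, size, method, URL, ...), with
each entry joined back onto one line. The class and function names are
hypothetical.

```python
from typing import NamedTuple

class AccessLogEntry(NamedTuple):
    timestamp: float
    elapsed_ms: int
    client: str
    tag: str
    status: int
    size: int
    method: str
    url: str

def parse_access_log_line(line):
    """Split one log line into its leading fields; later fields are ignored."""
    fields = line.split()
    tag, status = fields[3].split("/", 1)
    return AccessLogEntry(
        timestamp=float(fields[0]),
        elapsed_ms=int(fields[1]),
        client=fields[2],
        tag=tag,
        status=int(status),
        size=int(fields[4]),
        method=fields[5],
        url=fields[6],
    )

line = (
    "1476443997.252     96 192.168.100.103 TCP_MISS/304 353 GET "
    "https://www.gazeta.ru/nm2015/js/gazeta.media.query.js - "
    "HIER_DIRECT/81.19.72.0 -"
)
entry = parse_access_log_line(line)
print(entry.tag, entry.status)
```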

> 
> Is it bug or feature? Because of, when site goes under HTTPS, it will
> has lower hit with the same content. It seems wrong.

I hope the above clarifies the situation. There may or may not be bugs
involved, but this test does not demonstrate any that I can see.

It also does not demonstrate any difference in HIT with HTTP vs HTTPS.
In fact it demonstrates that HTTPS is getting HITs.

Amos

