[squid-users] squid cache

Thu Oct 1 00:29:29 UTC 2015

On 1/10/2015 8:47 a.m., Antony Stone wrote:
> On Wednesday 30 September 2015 at 21:35:32, Magic Link wrote:
> 
>> Hi, I configured Squid to use a cache. It seems to work because when I
>> tried downloading some software, the second download showed TCP_HIT in
>> the access.log.
> 
> Congratulations.
> 
>> The question I have is: why can't the majority of requests be cached (I
>> have a lot of TCP_MISS/200)?
> 
> Whether that is the *majority* of requests depends greatly upon what sort of 
> content you are requesting.
> 
> That may sound trite, but I can't think of a better way of expressing it.  Get 
> 10 users on your network to download the same image file (no, not all at the 
> same time), and you'll see that the 9 who were not first get the content a lot 
> faster than the 1 who was first.
> 
> If they're downloading other types of content, though, you may not get such a 
> "good" result.
> 

Or even if they are downloading different variations of the same object.
For example IE and Firefox request gzip'ed versions, but in different
ways. Chrome requests sdch compressed versions, Safari requests deflated
versions, and robots generally request identity (non-)encoded forms.
That's up to 5 MISSes that will get logged for one object before HITs
become the majority.
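
To make the variant point concrete, here is a minimal Python sketch of a
cache that stores one copy per (URL, Accept-Encoding) pair. It is a toy
model, not Squid's real storage logic, and the client names and header
values are made up for illustration.

```python
# Toy model: one stored copy per (URL, Accept-Encoding) combination.
# This is NOT how Squid is implemented; it only shows why one URL can
# produce several MISSes before HITs start to dominate.

def lookup(cache, url, accept_encoding):
    """Return 'HIT' if this exact variant is stored, else store it and return 'MISS'."""
    # Only case and surrounding whitespace are normalised, so two clients
    # asking for gzip "in different ways" still end up with different keys.
    key = (url, accept_encoding.strip().lower())
    if key in cache:
        return "HIT"
    cache[key] = "stored-body"
    return "MISS"

# Hypothetical clients and header values, for illustration only.
clients = [
    ("browser-a", "gzip, deflate"),
    ("browser-b", "gzip,deflate"),
    ("browser-c", "gzip, deflate, sdch"),
    ("browser-d", "deflate"),
    ("robot", "identity"),
]

cache = {}
url = "http://example.com/object"
results = []
for _round in range(2):          # every client fetches the object twice
    for _name, encoding in clients:
        results.append(lookup(cache, url, encoding))

print(results)
```

In this model the first pass logs five MISSes for the one URL and the
second pass is all HITs.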

>> I found that dynamic content is not cached, but I don't understand why.
> 
> What does "dynamic" mean?  It means it is not fixed / constant / stable.  In 
> other words, requesting the "same content" twice might result in different 
> answers, therefore if Squid gave you the first answer again in response to the 
> second request, that would not be what you would have got from the remote 
> server, and is therefore wrong.

Firstly check your config file for these, or some variation of them:
 acl QUERY url_regex cgi-bin \?
 cache deny QUERY

Remove them if you find them. This is how Squid-2 used to deal with
HTTP/1.1 dynamic content in the HTTP/1.0-only proxy. A lot of tutorials
still exist that say to do it, but that is wrong for current Squid.

Current Squid versions (3.2+) are all HTTP/1.1 proxies and can safely
deal with caching all that dynamic content properly.
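
If the config is long, a small script can do the checking. The Python
sketch below assumes the rules look like the two lines quoted above (an
acl using url_regex or urlpath_regex with a cgi-bin or \? pattern, then a
"cache deny" naming it). The function name and sample text are
hypothetical; variations written differently will not be detected.

```python
def find_legacy_query_rules(conf_text):
    """Return (line_number, line) pairs that look like the old QUERY rules."""
    acl_names = set()
    hits = []
    for number, raw in enumerate(conf_text.splitlines(), start=1):
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split()
        # e.g. "acl QUERY url_regex cgi-bin \?"
        if (fields[0] == "acl" and len(fields) >= 4
                and fields[2] in ("url_regex", "urlpath_regex")):
            if any("cgi-bin" in p or "\\?" in p for p in fields[3:]):
                acl_names.add(fields[1])
                hits.append((number, line))
        # e.g. "cache deny QUERY", referring to an acl matched earlier
        elif fields[0] == "cache" and len(fields) >= 3 and fields[1] == "deny":
            if any(name in acl_names for name in fields[2:]):
                hits.append((number, line))
    return hits

# Illustrative config fragment, not taken from anyone's real squid.conf.
sample = """\
# sample fragment
acl QUERY url_regex cgi-bin \\?
cache deny QUERY
http_port 3128
"""

for number, line in find_legacy_query_rules(sample):
    print(f"line {number}: {line}")
```

To check a real file, read its contents and pass the text to the
function, then review each reported line by hand before removing it.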

> 
> Example: eBay
> 
> You look up an auction which is due to end in 2 minutes.  You see the current 
> price and the number of bids (plus the details of what it is, etc).
> 
> 5 minutes later you request the same URL again.  It would be wrong of Squid to 
> show you the same page, with 2 minutes to go, and the bidding at X currency 
> units, from Y other bidders.  No, Squid should realise that the content it 
> previously requested is now stale, and it needs to fetch the new current 
> content and show you who won the auction and for how much.
> 
> That is dynamic content.  The remote server tells Squid that there is no point 
> in caching the page it just fetched, because within 1 second it may well be 
> stale and need fetching anew.

In HTTP/1.0 that would have been the end of it, with the content not
cacheable.

But 1 whole second is a long time and popular auctions might have lots
of visitors during that time.

So in HTTP/1.1 the response *is* cacheable and even re-usable for that
whole second. The server response usually now instructs the proxy cache
to store the object, but do a fast revalidation (REFRESH) query to find
out if it has been changed before using it as a HIT. These get logged as
REFRESH instead of HIT or MISS.
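
The store-then-revalidate flow can be sketched in a few lines of Python.
This is a toy model under stated assumptions: the Origin and ToyCache
classes, the ETag-style validator and the one-second lifetime are all
invented for illustration, and the result labels are only loosely
modelled on what a proxy might log.

```python
class Origin:
    """Hypothetical origin server that holds one object with a version tag."""
    def __init__(self, body):
        self.body = body
        self.version = 1

    def etag(self):
        return f'"v{self.version}"'

    def update(self, body):
        self.body = body
        self.version += 1

    def get(self, if_none_match=None):
        # A conditional request whose validator still matches gets a
        # bodiless 304; anything else gets the full 200 response.
        if if_none_match == self.etag():
            return 304, self.etag(), None
        return 200, self.etag(), self.body


class ToyCache:
    """Stores one object; revalidates it once it is older than max_age seconds."""
    def __init__(self, origin, max_age=1.0):
        self.origin = origin
        self.max_age = max_age
        self.entry = None  # (body, etag, stored_at)

    def fetch(self, now):
        if self.entry is None:
            _status, etag, body = self.origin.get()
            self.entry = (body, etag, now)
            return "MISS", body
        body, etag, stored_at = self.entry
        if now - stored_at < self.max_age:
            return "HIT", body  # still fresh, no contact with the origin
        status, new_etag, new_body = self.origin.get(if_none_match=etag)
        if status == 304:
            self.entry = (body, etag, now)  # same object, freshness renewed
            return "REFRESH_UNMODIFIED", body
        self.entry = (new_body, new_etag, now)
        return "REFRESH_MODIFIED", new_body


origin = Origin("current bid: 10")
cache = ToyCache(origin)
log = [cache.fetch(0.0)[0], cache.fetch(0.5)[0], cache.fetch(2.0)[0]]
origin.update("current bid: 12")
log.append(cache.fetch(4.0)[0])
print(log)
```

The timestamps are passed in explicitly so the example is deterministic.
The revalidation at 2.0 reuses the stored body after a cheap 304, while
the one at 4.0 picks up the changed object.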

> 
> A lot of the Internet works that way these days.
> 

The newer your Squid the better it will be at dealing with modern uses
of HTTP.

Amos