[squid-users] Range header is a hit ratio killer

Eliezer Croitoru eliezer at ngtech.co.il
Wed Aug 10 11:19:38 UTC 2016


Well, it will differ from system to system, but the main point was not about prefetching.
Indeed, prefetching is sometimes not possible, but when parallel requests cause your proxy to download an amplified amount of data, it means the proxy and the clients have some conflict of "interest".
The clients will blame the proxy, while you will probably blame the software or the code.

In the caching world, and from my understanding of how proxies work, prefetching can only be compared to a repeated download.
That is, since a proxy never fetches anything without a client requesting something first, we can separate a couple of things in the proxy.
The request, the prefetching, and the cache policy are a couple of different "things".
When you use a rule or config that forces the proxy to use 500% of the bandwidth, you have an "issue".
With enough admin logic, this specific issue can be converted from one form into another, leaving the cache policy to the internal parts of the cache.

The simplest way to understand the issue is to understand what Amos described. It is possible that the proxy tries to download the full object 5 times if 5 clients (or the same client over a couple of connections) ask for the same object.
The solution to such an issue would be to consolidate these requests into one.
And since code changes in Squid take time, you could convert the issue into another form.
The simplest way to do so is to inspect each request at a level that identifies a Range request (one answered with 206 Partial Content) and sends it into a single "prefetch" queue.
This external prefetch queue (software, script, or code) will be able to resolve the "500% amplification", which some would describe as an attack.
This way the clients' requests will be served live without causing amplification attacks, while the cache is filled externally/artificially with objects.
Depending on the purpose of the cache, you would be able to make it work.
I have implemented this idea a couple of times in the past using a set of Ruby scripts, and I must admit that some objects are not worth the code invested in them.
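
To make the queue idea above a bit more concrete, here is a minimal sketch in Python (not the Ruby scripts mentioned above, which are not shown here). Everything in it is an assumption for illustration: the PrefetchQueue class, its offer and drain methods, the fetch_full callback, and the example.com URL are all made up. How requests reach the queue, and how the full object is then pushed through the proxy so it lands in the cache, is left out on purpose.

```python
from collections import deque


class PrefetchQueue:
    """Hypothetical queue: each URL seen in a Range request is fetched in full once."""

    def __init__(self):
        self._pending = deque()   # URLs waiting for a full download
        self._seen = set()        # URLs already queued or fetched

    def offer(self, url, headers):
        """Queue url if the request carries a Range header and is new.

        Returns True when the URL was newly queued, False otherwise.
        """
        is_range = any(name.lower() == "range" for name in headers)
        if not is_range or url in self._seen:
            return False
        self._seen.add(url)
        self._pending.append(url)
        return True

    def drain(self, fetch_full):
        """Call fetch_full(url) once per queued URL; return how many ran."""
        count = 0
        while self._pending:
            fetch_full(self._pending.popleft())
            count += 1
        return count


# Five clients ask for different ranges of the same (made-up) object.
queue = PrefetchQueue()
url = "http://example.com/big.iso"
for start in range(0, 5000, 1000):
    queue.offer(url, {"Range": f"bytes={start}-{start + 999}"})

fetched = []
queue.drain(fetched.append)   # stand-in for a real downloader
print(fetched)
```

The five range requests end up as a single entry in the queue, so the full object is downloaded once in the background instead of once per client request.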

Hope this clears up the picture, the words, and their meanings,
Eliezer

----
Eliezer Croitoru
Linux System Administrator
Mobile: +972-5-28704261
Email: eliezer at ngtech.co.il


-----Original Message-----
From: k simon [mailto:chio1990 at gmail.com] 
Sent: Tuesday, August 9, 2016 6:36 AM
To: Eliezer Croitoru; squid-users at lists.squid-cache.org
Subject: Re: [squid-users] Range header is a hit ratio killer



On 16/8/7 21:20, Eliezer Croitoru wrote:
> Hey Simon,
>
> I do not know the plans, but it will depend on a couple of things which can fit one case but not the other.
> The assumption that we can fetch any part of the object is the first step for any solution whatsoever.
> However, it is not guaranteed that each request will be public.
>
> The idea of static chunks has existed for many years in many applications and in many forms, and the YouTube video player uses a similar idea. Google video clients and servers put the byte "range" in the URL rather than in a request header.
> Technically it would be possible to implement such an idea, but it has its own cost.
> Eventually, if the file is indeed public (what Squid was designed to cache), then it might not be a big problem.
> Depending on the target sites, the solution will be different.
> Before deciding on a specific solution my preferred path is to analyze the requests.
>
> By observing amplified traffic of 500% relative to the client side, do you mean that the incoming traffic to the server is 500% of the output towards the clients?
> If so, I think there might be a "smarter" solution than the 206 range offset limit.
> The old method of prefetching works pretty well in many cases. From what you describe, it might have better luck than the plain "fetch everything on the wire in real time".
>
> I cannot guarantee that prefetching is the right solution for you, but I think a case like this deserves a couple of eyes to understand whether there is a right way to handle the situation.
>
  I think prefetching may not fit a forward proxy, as we do not know exactly which requests are "hot". LRU should be more efficient.

Simon


