[squid-users] Range header is a hit ratio killer

Sun Aug 7 14:12:49 UTC 2016

On 6/08/2016 9:56 p.m., k simon wrote:
> Hi, list,
>   Code 206 is the biggest pain for our forward proxy. Squid uses
> "range_offset_limit" to process byte-range requests. When it is set to
> "none", it has 2 well-known issues:
> 1.  It boosts the traffic on the server side; we observed it amplified
> 500% compared to the client side on our box.

To which the answer currently is to see if enabling collapsed_forwarding
works okay for your needs.

> 2.  It always fails on a lossy link, and Squid refetches it again and
> again.
>   I've noticed that nginx has officially supported "byte-range caching"
> since 1.9.8 via the ngx_http_slice_module module.
> (1.
> http://nginx.org/en/docs/http/ngx_http_slice_module.html?_ga=1.140845234.106894549.1470474534
> 

So? What relevance do other software's features have to Squid behaviour?

 <http://wiki.squid-cache.org/SquidFaq/AboutSquid#How_to_add_a_new_Squid_feature.2C_enhance.2C_of_fix_something.3F>

... to be fair the storage code in Squid is a bit hairy in places. So
paying for it to be done is unlikely to be cheap. But still, waiting
won't fix the problem. We nearly got there in Squid-2.7, but the
experiment there could not be completely ported across to Squid-3 and
had some important problems anyway.

> 2. https://www.nginx.com/resources/admin-guide/content-caching/  ).
>   The solution is not perfect, but it's really more usable than
> "range_offset_limit". The secret is that it uses fixed-size objects in
> place of the whole file, so we can alter the request range offset and
> pass it to the server;

Ah, that's what range_offset_limit does today. It updates the server
request to say "deliver all of it" and stores the response in a file
the size of the whole response the server says will be arriving.
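
To make the difference between the two approaches concrete, here is a
rough Python sketch. It is not Squid or nginx code; the function names
and the 1 MiB slice size are made up for illustration. One function
models "deliver all of it" (no Range header goes upstream), the other
models covering the client range with fixed-size, aligned slices.

```python
from typing import List, Optional, Tuple

ByteRange = Tuple[int, int]  # inclusive (first, last) byte positions


def whole_object_fetch(client_range: ByteRange) -> Optional[ByteRange]:
    """Model of "deliver all of it": the client range is ignored upstream."""
    return None  # None stands for "send no Range header, fetch everything"


def slice_aligned_fetches(client_range: ByteRange,
                          slice_size: int) -> List[ByteRange]:
    """Model of fixed-size slicing: cover the client range with aligned slices.

    The last slice may run past the end of the object; a real server
    would simply return fewer bytes for it.
    """
    first, last = client_range
    start = (first // slice_size) * slice_size  # round down to a boundary
    slices = []
    while start <= last:
        slices.append((start, start + slice_size - 1))
        start += slice_size
    return slices


def format_range(r: Optional[ByteRange]) -> str:
    """Render the upstream Range header value, or a marker when absent."""
    return "(no Range header)" if r is None else f"bytes={r[0]}-{r[1]}"


if __name__ == "__main__":
    client = (1_500_000, 2_200_000)  # hypothetical client request
    print(format_range(whole_object_fetch(client)))
    for s in slice_aligned_fetches(client, 1_048_576):  # assumed 1 MiB slices
        print(format_range(s))
```

In this model each aligned slice could be cached under its own key, so a
failed transfer costs one slice rather than the whole object.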

The reason you are seeing that 500% increase in bandwidth is that
multiple Range requests arrive while the initial part of the first
response is still arriving back to Squid, so 5 of them get sent through
to the server. When that first one finishes, its object becomes
available for use as a HIT and followup Range requests get bits of it
(so you don't see 600% -> millions of % bandwidth increase).
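
A toy model of that timing, in Python, may help. It is only a sketch
under simplifying assumptions (one URL, a fixed fetch duration, invented
arrival times) and is not how Squid is implemented. It counts how many
full-object fetches go upstream when an in-flight response cannot be
shared.

```python
from typing import Iterable


def count_upstream_fetches(arrival_times: Iterable[float],
                           fetch_duration: float) -> int:
    """Count full-object fetches when in-flight responses cannot be reused.

    Every request that arrives before the first fetch has completed is
    treated as a MISS and triggers its own full fetch. Requests arriving
    after that point are served as HITs from the stored object.
    """
    fetches = 0
    cached_at = None  # time at which the first complete copy is usable
    for t in sorted(arrival_times):
        if cached_at is not None and t >= cached_at:
            continue  # HIT: served from the completed object
        fetches += 1  # MISS: another full fetch goes to the server
        if cached_at is None:
            cached_at = t + fetch_duration
    return fetches


if __name__ == "__main__":
    # Hypothetical numbers: five requests land within the first 10 seconds,
    # three more arrive after the first fetch has finished.
    arrivals = [0, 1, 2, 3, 4, 12, 15, 20]
    print(count_upstream_fetches(arrivals, fetch_duration=10))
```

The count is bounded by the number of requests that arrive inside the
first fetch window, which is why the amplification levels off instead of
growing without limit.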

collapsed_forwarding alters this by letting the first response be used by
other requests while it is still incomplete. But YMMV regarding the
savings, and CF affects all traffic, so it may cause behaviours you don't
want on other types of request. Worth a try though.
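
As a rough illustration of the idea (a simplified Python model, not
Squid's code; the labels and timings are invented), each request for one
URL can be classified by whether it starts a fetch, joins the fetch
already in flight, or is served from the completed object.

```python
from typing import List


def classify_requests(arrivals: List[float], fetch_duration: float,
                      collapse: bool) -> List[str]:
    """Label each request for one URL as MISS, COLLAPSED or HIT.

    Assumes a single object and a fixed fetch duration. With collapse
    set to False, requests arriving mid-fetch each go to the server.
    """
    labels = []
    first_done = None  # completion time of the first upstream fetch
    for t in sorted(arrivals):
        if first_done is None:
            first_done = t + fetch_duration
            labels.append("MISS")       # starts the upstream fetch
        elif t >= first_done:
            labels.append("HIT")        # complete object is available
        elif collapse:
            labels.append("COLLAPSED")  # shares the in-flight response
        else:
            labels.append("MISS")       # separate upstream fetch
    return labels


if __name__ == "__main__":
    arrivals = [0, 1, 2, 3, 4, 12]  # hypothetical arrival times in seconds
    for mode in (False, True):
        labels = classify_requests(arrivals, 10, collapse=mode)
        print(mode, labels.count("MISS"), labels)
```

In this model the upstream fetch count drops from five to one when
collapsing is on. Real savings depend on how the traffic actually
overlaps.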

> Perhaps forwarding the original range offset and caching a part of
> the object with a range key is a better idea.
> And Squid should know how
> to assemble those objects and process requests with a Range header.
>   And caching fixed-size objects may benefit disk I/O.
> It sounds similar to the big-rock db concept, though I've not
> succeeded with rock on FreeBSD or Ubuntu boxes.
>   Does Squid have some plan to support this method, or another solution?
> 

Squid is software. It doesn't have its own plans (at least I hope not).

I'm not aware of any plans specifically to add Range caching any time
soon. Ideas for how to do it get thrown around in squid-dev a couple of
times a year, so there are lots of ideas, but so far nothing concrete
has come out of it. Yes, rock and/or memory caches are looking like the
most easily adapted cache types for storing partial objects. Someone
still has to do the actual coding work though.

Amos