[squid-users] Dynamic/CDN Content Caching Challenges

Thu Apr 14 09:18:25 UTC 2016

On 14/04/2016 8:03 p.m., Muhammad Faisal wrote:
> Hi,
> I'm trying to get Squid 3.5 to cache dynamic content (I have also tried
> several other versions, e.g. 2.7, 3.1, 3.4). By dynamic I mean that the
> URL for the actual content always changes, which wastes cache storage
> and gives a low hit rate. As I understand it, I have two challenges at
> the moment:
> 
> 1- Websites with dynamic URLs for the requested content (e.g. filehippo,
> download.com, etc.)

If the URL is dynamically generated, the resource/content behind it may
be too, and would then fail to meet the uniqueness requirement of
StoreID: that every *object* cached at a particular store ID MUST be
identical (binary and/or semantically) regardless of the URL(s) mapped
to that ID.
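
As a rough illustration of that requirement, here is a minimal Python
sketch that flags store IDs whose sampled responses are not
byte-identical. The function name and the (store_id, url, body) sample
layout are made up for this example and are not part of Squid. It only
checks binary identity, so semantically equivalent but byte-different
objects would still be reported.

```python
import hashlib
from collections import defaultdict


def find_store_id_conflicts(samples):
    """Return {store_id: [urls]} for store IDs whose sampled bodies differ.

    samples: iterable of (store_id, url, body_bytes) tuples collected by
    fetching a few URLs that a candidate pattern maps to the same ID.
    """
    digests = defaultdict(dict)
    for store_id, url, body in samples:
        # Hash each body so large downloads can be compared cheaply.
        digests[store_id][url] = hashlib.sha256(body).hexdigest()

    conflicts = {}
    for store_id, by_url in digests.items():
        # More than one distinct digest means the pattern is unsafe.
        if len(set(by_url.values())) > 1:
            conflicts[store_id] = sorted(by_url)
    return conflicts
```

An empty result on a reasonable number of samples is a hint, not a
guarantee, that the pattern is safe to use.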

> 2- Streaming websites where the dynamic URL returns 206 (Partial
> Content), e.g. tune.pk videos or Windows updates. (Setting
> range_offset_limit to -1 causes havoc on the upstream, so we kept it
> disabled. Is there some way to control this behaviour?)

3- Non-HTTP streaming: ICY and the like, which are transferred through
HTTP proxies and supported by Squid, but are not cacheable.

4- Dynamic sites where the content *actually* changes between two URLs
which you think are the same. Unless one has access to the server code
(i.e. an open source CDN, or at minimum one that has published its
mirroring URL structure) it can be a lot of work to guarantee that any
two URLs matched by a regex pattern meet the uniqueness requirement of
StoreID.

Solving #4 usually solves #1 on the way.

> 
> If someone has successfully configured the above scenario, please help
> me out, as I don't have the programming background to deal with this
> complexity.

NP: You should not need programming know-how for this. Just a "good"
level of regex experience and knowledge (mistakes are painful in this
particular situation), plus the ability and willingness to analyse the
targeted CDN's URL structure vs behaviour in a lot of detail (more a
systems analysis skill than coding).

The Store-ID helper provided with Squid, fed with the regex patterns in
the wiki "database" for some of the popular CDNs, should give you some
savings, even if only for the popular jQuery ones.
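
To show roughly what such a helper does with a pattern, here is a
minimal Python sketch. It is not the bundled helper. The CDN hostname
and the rule are hypothetical, and it assumes helper concurrency is
disabled so that the URL is the first token on each request line. The
"OK store-id=..." and "ERR" replies follow the Squid 3.4+ helper
protocol.

```python
import re

# Hypothetical rule: collapse numbered mirrors of an imaginary CDN onto
# a single store ID. Real patterns must be verified against the CDN.
RULES = [
    (re.compile(r"^https?://mirror\d+\.cdn\.example\.com/(.+)$"),
     r"http://cdn.example.com.squid.internal/\1"),
]


def answer(line, rules=RULES):
    """Build the helper reply for one request line sent by Squid."""
    fields = line.split()
    if not fields:
        return "ERR"
    url = fields[0]  # assumes no channel-ID prefix (concurrency off)
    for pattern, template in rules:
        match = pattern.match(url)
        if match:
            return "OK store-id=" + match.expand(template)
    return "ERR"  # no rule matched: Squid keeps the original URL as key


def serve(lines, out):
    """Answer each request line, flushing so Squid is never kept waiting."""
    for line in lines:
        out.write(answer(line) + "\n")
        out.flush()
```

Calling serve(sys.stdin, sys.stdout) at the bottom of a script would
turn this into something store_id_program could run, but the patterns
are the part that needs the real work.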

> 
> I tried using different Store-ID helpers but saw no saving on upstream;
> the content is still coming from the origin. Below is the helper I have
> used:

Your low success rate may come from the focus on http:// URLs. Most of
the major CDNs you are listing in this helper are actually using
https:// URLs for their content nowadays.
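
One way to check how much of the traffic this affects is to count
request types in access.log. The Python sketch below assumes the default
native "squid" log format, where the method is the 6th field and the URL
the 7th. The function name is made up for this example. Requests logged
as CONNECT are tunnels in which the proxy never sees the full URL, so no
Store-ID pattern can apply to them unless the traffic is decrypted.

```python
from collections import Counter


def scheme_breakdown(log_lines):
    """Count requests by how much of the URL the proxy could see."""
    counts = Counter()
    for line in log_lines:
        fields = line.split()
        if len(fields) < 7:
            continue  # skip blank or truncated lines
        method, url = fields[5], fields[6]
        if method == "CONNECT":
            counts["connect-tunnel"] += 1  # only host:port is visible
        elif url.startswith("https://"):
            counts["https"] += 1
        elif url.startswith("http://"):
            counts["http"] += 1
        else:
            counts["other"] += 1
    return counts
```

For example, scheme_breakdown(open("/var/log/squid/access.log")) would
give the split for a log at that (assumed) location.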

Amos