[squid-users] Dynamic/CDN Content Caching Challenges
Muhammad Faisal
faisalusuf at yahoo.com
Thu Apr 14 09:32:42 UTC 2016
Thanks, Amos, for the detailed response.
Well, for Squid we are redirecting only HTTP traffic via policy routing.
The object being served to clients is the same, but because every user
is redirected differently, a new copy of the object gets stored.
What about HTTP streaming content that comes back with a 206 response
code; how should we deal with it? AFAIK Squid doesn't cache 206 partial
content. Is this correct?
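The only workaround I am aware of is to make Squid fetch the whole
object so that the stored reply is a normal, cacheable 200. A minimal
sketch of what I mean (the domain, the 200 MB ceiling and the ACL name
are just placeholder values, not something we have tested):

  # Fetch the full object when a client requests a range from these
  # sites, so the reply can be stored as a complete 200.
  acl ranged_sites dstdomain .download.windowsupdate.com
  range_offset_limit 200 MB ranged_sites
  # Keep downloading even if the client aborts the transfer.
  quick_abort_min -1 KB

But as noted in the quoted discussion below, setting the range offset
limit too aggressively causes a lot of extra upstream traffic.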
Take filehippo as an example; below is the sequence. When I click the
download button there are two requests: a 301 (whose Location header
points to the requested content) and then a 200:
301 headers:
GET /download/file/6853a2c840eaefd1d7da43d6f2c94863adc5f470927402e6518d70573a99114d/ HTTP/1.1
Host: filehippo.com
Accept:
text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
Accept-Encoding: gzip, deflate, sdch
Accept-Language: en-US,en;q=0.8
Cookie: FHSession=mfzdaugt4nu11q3yfxfkjyox;
FH_PreferredCulture=l=en-US&e=3/30/2017 1:38:22 PM;
__utmt_UA-5815250-1=1; __qca=P0-1359511593-1459345103148;
__utma=144473122.1934842269.1459345103.1459345103.1459345103.1;
__utmb=144473122.3.10.1459345119355; __utmc=144473122;
__utmz=144473122.1459345103.1.1.utmcsr=(direct)|utmccn=(direct)|utmcmd=(none);
__utmv=144473122.|1=AB%20Test=new-home-v1=1
Referer:
http://filehippo.com/download_vlc_64/download/56a450f832aee6bb4fda3b01259f9866/
Upgrade-Insecure-Requests: 1
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36
(KHTML, like Gecko) Chrome/49.0.2623.87 Safari/537.36
HTTP/1.1 301 Moved Permanently
Accept-Ranges: bytes
Age: 0
Cache-Control: private
Connection: keep-alive
Content-Length: 0
Content-Type: text/html
Date: Wed, 30 Mar 2016 13:38:45 GMT
Location: http://fs37.filehippo.com/9546/46cfd241f1da4ae9812f512f7b36643c/vlc-2.2.2-win64.exe
Via: 1.1 varnish
X-Cache: MISS
X-Cache-Hits: 0
x-debug-output: FHSession=mfzdaugt4nu11q3yfxfkjyox;
FH_PreferredCulture=l=en-US&e=3/30/2017 1:38:22 PM;
__utmt_UA-5815250-1=1; __qca=P0-1359511593-1459345103148;
__utma=144473122.1934842269.1459345103.1459345103.1459345103.1;
__utmb=144473122.3.10.1459345119355; __utmc=144473122;
__utmz=144473122.1459345103.1.1.utmcsr=(direct)|utmccn=(direct)|utmcmd=(none);
__utmv=144473122.|1=AB%20Test=new-home-v1=1
X-Served-By: cache-lhr6334-LHR
200 headers: Why is ATS not caching the octet stream, despite having
CONFIG proxy.config.http.cache.required_headers INT 1?
GET /9546/46cfd241f1da4ae9812f512f7b36643c/vlc-2.2.2-win64.exe HTTP/1.1
Host: fs37.filehippo.com
Accept:
text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
Accept-Encoding: gzip, deflate, sdch
Accept-Language: en-US,en;q=0.8
Cookie: __utmt_UA-5815250-1=1; __qca=P0-1359511593-1459345103148;
__utma=144473122.1934842269.1459345103.1459345103.1459345103.1;
__utmb=144473122.3.10.1459345119355; __utmc=144473122;
__utmz=144473122.1459345103.1.1.utmcsr=(direct)|utmccn=(direct)|utmcmd=(none);
__utmv=144473122.|1=AB%20Test=new-home-v1=1
Referer:
http://filehippo.com/download_vlc_64/download/56a450f832aee6bb4fda3b01259f9866/
Upgrade-Insecure-Requests: 1
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36
(KHTML, like Gecko) Chrome/49.0.2623.87 Safari/537.36
HTTP/1.1 200 OK
Accept-Ranges: bytes
Age: 739
Connection: keep-alive
Content-Length: 31367109
Content-Type: application/octet-stream
Date: Wed, 30 Mar 2016 13:26:43 GMT
ETag: "81341be3a62d11:0"
Last-Modified: Mon, 08 Feb 2016 06:34:21 GMT
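The fs37.filehippo.com URL above is exactly the per-mirror pattern I
would like to collapse with a Store-ID helper. A minimal sketch of what
I have in mind, using the storeid_file_rewrite helper shipped with
Squid 3.5 (the helper path, the regex and the internal domain are my
own guess at the URL structure, untested):

  # squid.conf (helper path may differ per distribution)
  store_id_program /usr/lib/squid/storeid_file_rewrite /etc/squid/storeid.db
  store_id_children 5 startup=1

  # /etc/squid/storeid.db (two tab-separated columns: regex, store ID)
  ^http:\/\/fs\d+\.filehippo\.com\/\d+\/[0-9a-f]+\/(.*)$	http://filehippo.com.squid.internal/$1

This only helps if every fsNN mirror really serves identical bytes for
the same filename, which is the uniqueness requirement Amos describes
below.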
--
Regards,
Faisal.
------ Original Message ------
From: "Amos Jeffries" <squid3 at treenet.co.nz>
To: squid-users at lists.squid-cache.org
Sent: 4/14/2016 2:18:25 PM
Subject: Re: [squid-users] Dynamic/CDN Content Caching Challenges
>On 14/04/2016 8:03 p.m., Muhammad Faisal wrote:
>> Hi,
>> I'm trying to get dynamic content cached by Squid 3.5 (I have tried
>> many other versions of Squid, e.g. 2.7, 3.1, 3.4). By dynamic I mean
>> that the URL for the actual content always changes; this results in
>> wasted cache storage and a low hit rate. As per my understanding I
>> have two challenges atm:
>>
>> 1- Websites with dynamic URLs for the requested content (e.g.
>> filehippo, download.com, etc.)
>
>If the URL is dynamically generated, the resource/content behind it may
>be too, and thus would fail to meet the uniqueness requirement of
>StoreID:
> That every *object* cached at a particular store ID MUST be identical
>(either binary and/or semantically) regardless of the URL(s) mapped to
>that ID.
>
>
>> 2- Streaming web sites where the dynamic URL returns 206 (partial
>> content), e.g. tune.pk videos, or Windows updates (setting the range
>> offset limit to -1 causes havoc on the upstream, so we kept it
>> disabled; is there some way to control the behaviour?)
>
>3- Non-HTTP streaming: ICY and the like, which are transferred over
>HTTP proxies and supported by Squid but are not cacheable.
>
>4- Dynamic sites where the content *actually* changes between two URLs
>which you think are the same. Unless one has access to the server code
>(ie open source CDN, or at minimum they published the mirroring URL
>structure) it can be a lot of work to guarantee that any two URLs
>matched by a regex pattern meet the uniqueness requirement of StoreID.
>
>Solving #4 usually solves #1 on the way.
>
>
>>
>> If someone has successfully configured the above scenario, please
>> help me out, as I don't have a programming background to deal with
>> this complexity.
>
>NP: You should not need programming know-how for this. Just a "good"
>level of regex experience and knowledge (mistakes are painful in this
>particular situation), plus the ability and willingness to analyse the
>targeted CDNs' URL structure vs behaviour in a lot of detail (more
>systems analysis skill than coding).
>
>
>The Store-ID helper provided with Squid, fed with the regex patterns in
>the wiki "database" for some of the popular CDNs, should give you some
>savings, if only for the popular jQuery ones.
>
>
>>
>> I tried using different Store-ID helpers but there is no saving on
>> the upstream; the content is still coming from the origin. Below is
>> the helper I have used:
>
>Your low success rate may be from the focus on http:// URLs. Most of
>the major CDNs you are listing in this helper are actually using
>https:// URLs for their content nowadays.
>
>Amos
>
>_______________________________________________
>squid-users mailing list
>squid-users at lists.squid-cache.org
>http://lists.squid-cache.org/listinfo/squid-users