[squid-users] Dynamic/CDN Content Caching Challenges

Amos Jeffries squid3 at treenet.co.nz
Thu Apr 14 10:59:14 UTC 2016


On 14/04/2016 9:32 p.m., Muhammad Faisal wrote:
> Thanks Amos for a detailed response.
> Well for Squid we are redirecting only HTTP traffic from policy routing.
> The object is unique which is being served to clients but due to
> different redirection of every user a new object is stored.
> 
> What about http streaming content having 206 response code how to deal
> with it? afaik squid dont cache 206 partial content. Is this correct?

Correct, Squid does not cache 206 responses from the server. But a HIT
served by Squid can have a 206 status.

> 
> e.g filehippo below is the sequence:
> 
> When I click download button there are two requests one 301 which
> contains (Location header for the requested content) and second 200:
> 
> 301 Headers: ?
> 
> GET
> /download/file/6853a2c840eaefd1d7da43d6f2c94863adc5f470927402e6518d70573a99114d/
> HTTP/1.1
> Host: filehippo.com
> Accept:
> text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
> Accept-Encoding: gzip, deflate, sdch
> Accept-Language: en-US,en;q=0.8
> Cookie: FHSession=mfzdaugt4nu11q3yfxfkjyox;
> FH_PreferredCulture=l=en-US&e=3/30/2017 1:38:22 PM;
> __utmt_UA-5815250-1=1; __qca=P0-1359511593-1459345103148;
> __utma=144473122.1934842269.1459345103.1459345103.1459345103.1;
> __utmb=144473122.3.10.1459345119355; __utmc=144473122;
> __utmz=144473122.1459345103.1.1.utmcsr=(direct)|utmccn=(direct)|utmcmd=(none);
> __utmv=144473122.|1=AB%20Test=new-home-v1=1
> Referer:
> http://filehippo.com/download_vlc_64/download/56a450f832aee6bb4fda3b01259f9866/
> 
> Upgrade-Insecure-Requests: 1
> User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36
> (KHTML, like Gecko) Chrome/49.0.2623.87 Safari/537.36
> 
> HTTP/1.1 301 Moved Permanently
> Accept-Ranges: bytes
> Age: 0
> Cache-Control: private
> Connection: keep-alive
> Content-Length: 0
> Content-Type: text/html
> Date: Wed, 30 Mar 2016 13:38:45 GMT
> Location:
> http://fs37.filehippo.com/9546/46cfd241f1da4ae9812f512f7b36643c/vlc-2.2.2-win64.exe
> 
> Via: 1.1 varnish
> X-Cache: MISS
> X-Cache-Hits: 0
> x-debug-output: FHSession=mfzdaugt4nu11q3yfxfkjyox;
> FH_PreferredCulture=l=en-US&e=3/30/2017 1:38:22 PM;
> __utmt_UA-5815250-1=1; __qca=P0-1359511593-1459345103148;
> __utma=144473122.1934842269.1459345103.1459345103.1459345103.1;
> __utmb=144473122.3.10.1459345119355; __utmc=144473122;
> __utmz=144473122.1459345103.1.1.utmcsr=(direct)|utmccn=(direct)|utmcmd=(none);
> __utmv=144473122.|1=AB%20Test=new-home-v1=1
> X-Served-By: cache-lhr6334-LHR
> 

Ew. Borked server. 302 may be old but there are situations (this being
one) where it actually is appropriate to respond with a temporary status.

It also seems to contain an amateur attempt at cache-optimization by
someone who does not understand what middleware does.


You could technically force this to cache. But it's not worth it. Let the
site admin who made that yucky response deal with the 2x latency cost
they created. Better to use Store-ID to cache the thing its Location
header is pointing to.


> 200 Header: Why ATS is not caching octet stream despite having CONFIG
> proxy.config.http.cache.required_headers INT 1

Squid is not ATS. The 301 response above is Cache-Control:private, so
only the receiving browser is allowed to cache it. What was the question?

> GET /9546/46cfd241f1da4ae9812f512f7b36643c/vlc-2.2.2-win64.exe HTTP/1.1
> Host: fs37.filehippo.com

What do you know about the components of that URL...

* What does "9546" mean;
 - just a random number?
 - some form of customer-ID VideoLAN has with Filehippo?
 - some form of category ID that represents VLC software type etc?

* What does the long random looking hex number mean;
 - just a random visitor session ID?
 - the hash sum for the VLC binary being fetched?

... or something else?

Try some manual requests with different values and see what happens to
the response. Pay particular attention to the ETag response header and
the object size, and if you want to be paranoid take the SHA1 and MD5
hashes of the response object when it looks like it should be identical.
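
The comparison described above can be sketched roughly as follows. This
is only an illustration, not a tool from the Squid project: the function
names fetch(), fingerprint() and same_object() are made up for this
example, and it assumes the whole object fits in memory.

```python
import hashlib
from urllib.request import urlopen


def fetch(url):
    """Fetch a URL and return (headers, body). Hypothetical helper."""
    with urlopen(url) as resp:
        return dict(resp.headers.items()), resp.read()


def fingerprint(headers, body):
    """Collect the values worth comparing between two responses."""
    lower = {k.lower(): v for k, v in headers.items()}
    return {
        "etag": lower.get("etag"),
        "length": len(body),
        "sha1": hashlib.sha1(body).hexdigest(),
        "md5": hashlib.md5(body).hexdigest(),
    }


def same_object(a, b):
    """True when size and both hashes match. ETag is reported but not
    required to match, since different mirrors may emit different ETags."""
    return (a["length"] == b["length"]
            and a["sha1"] == b["sha1"]
            and a["md5"] == b["md5"])
```

Fetch the same file through two different URL variants, run
fingerprint() on each result and pass both to same_object(). If the
answer is yes across many variants, the URL part that was changed is a
candidate for dropping.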

Check your logs for patterns in the URLs and test the other files you
find people fetching in the same way.
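
One way to look for such patterns is to group the logged URLs by their
final path segment and list the names that show up under more than one
URL. The sketch below assumes the default native access.log layout,
where the URL is the seventh whitespace-separated field. The function
names are hypothetical.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def urls_from_access_log(lines):
    """Yield the URL field from native-format access.log lines (assumed layout)."""
    for line in lines:
        fields = line.split()
        if len(fields) > 6:
            yield fields[6]


def group_by_filename(urls):
    """Map file name -> sorted distinct URLs, keeping only names seen
    under two or more different URLs."""
    groups = defaultdict(set)
    for url in urls:
        name = urlsplit(url).path.rstrip("/").rsplit("/", 1)[-1]
        if name:
            groups[name].add(url)
    return {name: sorted(found)
            for name, found in groups.items() if len(found) > 1}
```

Names that appear under many URLs are only candidates. Each one still
needs the manual checks before it goes into a Store-ID pattern.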

If that checks out then you know what your Store-ID pattern can drop and
what needs to be kept.
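
As an illustration of where that ends up, here is a minimal Store-ID
helper sketch. The pattern in it is purely an assumption: it pretends
the testing showed that the "fsNN" host number and the 32-digit hex
token vary per visitor while the numeric ID and the file name identify
the object. None of that has been verified for Filehippo. The
"--helper" argument and the "filehippo.squid.internal" key are
conventions invented for this example.

```python
import re
import sys

# ASSUMED pattern: host number and hex token are droppable, numeric ID and
# file name are kept. Verify against real responses before using it.
FILEHIPPO = re.compile(
    r"^http://fs\d+\.filehippo\.com/(\d+)/[0-9a-f]{32}/([^/?]+)$"
)


def store_id(url):
    """Return the normalized cache key for a URL, or None to leave it alone."""
    m = FILEHIPPO.match(url)
    if m is None:
        return None
    return "http://filehippo.squid.internal/%s/%s" % (m.group(1), m.group(2))


def handle_line(line):
    """Turn one request line from Squid into one helper reply line."""
    parts = line.split()
    if not parts:
        return "ERR"
    channel = ""
    # With helper concurrency enabled, Squid prefixes a numeric channel ID
    # that must be echoed back at the start of the reply.
    if parts[0].isdigit() and len(parts) > 1:
        channel, parts = parts[0] + " ", parts[1:]
    sid = store_id(parts[0])
    if sid is None:
        return channel + "ERR"
    return channel + "OK store-id=" + sid


def main():
    for line in sys.stdin:
        sys.stdout.write(handle_line(line) + "\n")
        sys.stdout.flush()  # Squid waits for each reply, so do not buffer


if __name__ == "__main__" and sys.argv[1:] == ["--helper"]:
    main()
```

A script like this would be hooked in with the store_id_program
directive, and store_id_access can limit which requests are sent to it.
An ERR reply tells Squid to keep using the original URL as the cache key.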

This is the hard way, and a "lot of work" as I mentioned earlier. If you
want to help the community then please contribute back by putting your
findings into the wiki Store-ID database pages so all that work does not
go to waste.

> 
> HTTP/1.1 200 OK
> Accept-Ranges: bytes
> Age: 739
> Connection: keep-alive
> Content-Length: 31367109
> Content-Type: application/octet-stream
> Date: Wed, 30 Mar 2016 13:26:43 GMT
> ETag: "81341be3a62d11:0"
> Last-Modified: Mon, 08 Feb 2016 06:34:21 GMT
> 

Amos