[squid-users] cache github zip repositories

Amos Jeffries squid3 at treenet.co.nz
Thu Sep 15 00:35:03 UTC 2016


On 15/09/2016 11:54 a.m., Hardik Dangar wrote:
> Hello,
> 
> I am trying to cache GitHub zip URLs, since Composer (the PHP dependency
> management tool) downloads them. We are about 40 developers on a LAN, so
> caching them locally would really help us. Our Squid version is 3.5.12 and
> the cache server runs Ubuntu 16.04. Here is the squid.conf file we use:
> https://gist.github.com/hardikdangar/df31d5bce725eff66e06f3abd6e77600
> 
> Here is the part which I want to cache,
> say for example you want to download repo from GitHub then URL looks like
> https://github.com/hardikdangar/test/archive/master.zip
> but it redirects to the following,
> https://codeload.github.com/hardikdangar/test/zip/master
> 
> You can see the response parameters via redbot.org
> https://redbot.org/?uri=https%3A%2F%2Fcodeload.github.com%2Fhardikdangar%2Ftest%2Fzip%2Fmaster
> 
>   HTTP/1.1 200 OK
>     Content-Length: 929
>     Access-Control-Allow-Origin: https://render.githubusercontent.com
>     Content-Security-Policy: default-src 'none'; style-src 'unsafe-inline'
>     Strict-Transport-Security: max-age=31536000
>     Vary: Authorization,Accept-Encoding
>     X-Content-Type-Options: nosniff
>     X-Frame-Options: deny
>     X-XSS-Protection: 1; mode=block
>     ETag: "9ea9838812d6f7bc53763eb1577da04e2fa473d5"
>     Content-Type: application/zip
>     Content-Disposition: attachment; filename=test-master.zip
>     X-Geo-Block-List:
>     Date: Wed, 14 Sep 2016 23:24:44 GMT
>     X-GitHub-Request-Id: 77092BF1:7F40:346461:57D9DC3C
> 
> Now if I make any change to the above repository, GitHub changes the ETag,
> and if I don't change anything the ETag remains the same, so I believe we
> should be able to cache those .zip files.
> 
> By default, Squid does not cache codeload.github.com. To get it into the
> cache, I added:
> refresh_pattern codeload.github.com 900 20% 4320 reload-into-ims
> 
> As I understand it, this should check the ETag on each new request, since
> GitHub does not provide Last-Modified. This does cache the zip file, but
> on the next request, even if I change the content and the ETag changes,
> Squid sends the file from its cache instead of downloading the new one.
> 
> I have no clue why this happens. Can anyone help me figure out what's wrong
> here? Why does Squid not detect the new ETag when the repository is
> updated? Why does it send the cached file even though a new file is
> available?
> 

Consider: how does Squid know the ETag has changed on the server?

What you know about things happening in real life is not what Squid knows.

In fact, how do *you* know someone else did not commit a change during
the ~1 second it takes to look at the page and click the download button?
Simply, you don't, and cannot until the new object has been fetched.

Likewise, Squid cannot know if the object is the same until it has
fetched a MISS from the server. Squid has even less to go on than you:
it does not look at the previous page content, so it cannot even 'see'
whether a commit is listed there that is newer than the object it
fetched earlier.
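
To make that concrete, here is a small Python simulation of the idea. It is
only a sketch: `origin_respond` and `cache_lookup` are made-up names standing
in for the server and the cache, not anything from Squid, and whether a given
Squid version revalidates this particular response or re-fetches it in full is
a separate question. The point is just that a changed ETag is only noticed
when the origin is actually contacted.

```python
def origin_respond(current_etag, if_none_match=None):
    """Hypothetical origin: 304 if the validator still matches, else 200."""
    if if_none_match is not None and if_none_match == current_etag:
        return 304, current_etag
    return 200, current_etag


def cache_lookup(cached_etag, still_fresh, current_etag):
    """Hypothetical cache: only a stale entry causes the origin to be asked."""
    if still_fresh:
        # The origin is never contacted, so a new ETag goes unnoticed.
        return "served from cache", cached_etag
    status, etag = origin_respond(current_etag, if_none_match=cached_etag)
    if status == 304:
        return "revalidated, unchanged", cached_etag
    return "replaced with new copy", etag


# The repository changed ("new"), but the cached entry is still fresh:
print(cache_lookup("old", True, "new"))
# The same situation once the cached entry has gone stale:
print(cache_lookup("old", False, "new"))
```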

There is no Cache-Control or Expires header indicating a specific
storage timeout or revalidation procedure. So refresh_pattern defaults
will be used. These responses will be cached for the refresh_pattern
'Min' duration (900 minutes) before being considered for revalidation.
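
As a rough illustration of that rule, the following Python sketch models the
freshness decision for the refresh_pattern line quoted above (900 20% 4320).
It is a simplification, not Squid's code: `is_fresh` is a made-up function,
and it assumes the response carries no Expires, Cache-Control or usable
Last-Modified, so the percent value plays no part.

```python
def is_fresh(age_minutes, min_minutes=900, max_minutes=4320):
    """Simplified freshness check for a response without Expires,
    Cache-Control or Last-Modified (assumption: percent is not applied)."""
    if age_minutes > max_minutes:
        return False  # older than Max: always stale
    return age_minutes <= min_minutes  # within Min: treated as fresh


print(is_fresh(5))     # fetched 5 minutes ago -> True
print(is_fresh(1000))  # fetched 1000 minutes ago -> False
```

Under this model, a zip fetched a few minutes ago is served straight from the
cache no matter what has happened to the repository in the meantime.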


NP 1: Synthesizing Last-Modified from the Date header is only just being
fixed in Squid over the past few weeks, and some parts of it are still to
be committed. So in older Squid I would not expect that response to be
revalidated, just re-fetched in full.


NP 2: The Vary header indicates that every person logged in gets a
differently cached response based on how their credentials are hashed on
each request (in Authorization tokens). So caching these objects will
not help much with many developers involved. It will be of most help for
the anonymous visitors, where the username is always a generic NIL value.
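
A small Python sketch of why that is. It is not Squid's actual key format;
`variant_key` is a hypothetical helper that just builds a cache key from the
URL plus the request headers named in Vary, using the
"Authorization,Accept-Encoding" value from the response above.

```python
import hashlib


def variant_key(url, vary, request_headers):
    """Hypothetical cache key: URL plus the request headers named in Vary."""
    names = [n.strip().lower() for n in vary.split(",") if n.strip()]
    headers = {k.lower(): v for k, v in request_headers.items()}
    # A header the client did not send contributes an empty value.
    parts = [url] + [f"{n}={headers.get(n, '')}" for n in names]
    return hashlib.sha256("\n".join(parts).encode()).hexdigest()


url = "https://codeload.github.com/hardikdangar/test/zip/master"
vary = "Authorization,Accept-Encoding"

alice = variant_key(url, vary, {"Authorization": "token aaa", "Accept-Encoding": "gzip"})
bob = variant_key(url, vary, {"Authorization": "token bbb", "Accept-Encoding": "gzip"})
anon1 = variant_key(url, vary, {"Accept-Encoding": "gzip"})
anon2 = variant_key(url, vary, {"Accept-Encoding": "gzip"})

print(alice == bob)    # different credentials, separate cache entries
print(anon1 == anon2)  # no credentials, one shared cache entry
```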

HTH
Amos


