[squid-users] Using Digests to reduce traffic between peers, Parent - Sibling configuration question

Jester Purtteman jester at optimera.us
Tue Oct 27 04:14:22 UTC 2015



I have been wrestling with squid for a while and my reading has brought
"Cache-Digests" to my attention.  I suspect the answer is "that would be
neat, but that's not how it works", but I thought I'd ask a few questions.
I am running an ISP in a remote area only served by satellite links.  The
layout of my network is approximately (yes, very much simplifying here):


----Internet----<HORRIFICALLY-EXPENSIVE-SATELLITE-LINK> ---- Servers ----


We use the servers on the cheap link to perform some basic tunneling and
administration and to host some of our websites, nothing too fancy.  The
servers behind the nightmare satellite link provide RADIUS, Squid, a web
based login system, and some primitive SCADA that we use to monitor the


The wall that I am running up against is this whole issue of caching
dynamic content and browsers that do goofy things like asking for pages to
be served un-cacheable.  It is pretty clear the bookshelf missing the leg on
Craigslist that was posted in 2012 is never, ever going to sell, but Chrome
just can't be sure enough I guess.  So, like any good amateur Squid admin I
violate HTTP standards (I know, for shame :)!) and have it cache things for
a few minutes (so you may get shown the same ad twice, so what?).  I have
been pondering things like Riverbed and a bunch of other technologies, but
in the final analysis, they tend to only really work well when you're doing
SAMBA or something else with CRAZY repetition in the datastream and small
byte shifts.  Oh, it's good voodoo when it works, but it's not really
applicable to the HTTP caching problem, especially when they want about
3 kidneys, an arm, and a firstborn child every year for a license.


Then I read the Cache-Digests section, and I *think* it either does
something very cool, or is perhaps a bit of hacking away from doing
something very cool.  So, I thought I'd ask for thoughts.  I am wondering if
it is possible using the existing layout, theoretically possible, or is just
a plain bad idea to use digests to refresh content on the expensive side of
the link.  The idea would go something like this: we'd have a server on the
cheap link, a server at the expensive end of the link, and a VPN type tunnel
between them.  I can tell you from much practice that OpenVPN and some
compression can get this part done.  We'll call them cheap-server,
expensive-server, and clients.  The layout becomes:


<Cheap-Server> ---- Internet --- Satellite ----<Expensive-Server> ---


<Expensive-server> is a transparent proxy using tproxy and exists already.
It has a pretty poor cache rate, mostly because of my ham-handed inability
to write good cache rules, but partly because the content providers in this
world need that jpg that hasn't changed since 2006 to go with "no-cache"
set, (grr ;). 


So, here's my theory:  Set up <expensive-server> so that it caches
EVERYTHING, all of it, and catalogs it with this Digest.  It doesn't expire
anything, ever; the only way something gets released from that cache is when
the drive starts running out of room.  Its digest is then sent to
<cheap-server>, which doesn't cache ANYTHING, NOTHING.  When a request comes
through from a client, <Expensive-Server> checks the refresh rules, and if
the object isn't too stale it gets served just like it does now, but if it IS
expired, it then asks <Cheap-Server> "hey, how expired is this?" and
<Cheap-Server> (which has all the bandwidth it could ever want) grabs the
content, and digests it.  If the digest for the new retrieval matches
something in the digest sent by <expensive-server>, then <cheap-server>
sends up a message that says "it's still fresh, the content was written by
lazy people or idiots, carry on".
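
To make the exchange above concrete, here is a minimal Python sketch of the
idea as I imagine it.  This is NOT how Squid's Cache Digests work, just an
illustration of the proposal.  Everything in it is an assumption made up for
the example: the class names, the use of SHA-256 as the "digest", keying the
table by URL, and the fetch callable standing in for a real HTTP fetch.

```python
import hashlib


def content_digest(body: bytes) -> str:
    """Hash a response body (SHA-256 is an arbitrary choice for this sketch)."""
    return hashlib.sha256(body).hexdigest()


class ExpensiveServer:
    """Hypothetical cache behind the satellite link: keeps everything."""

    def __init__(self):
        self.cache = {}  # url -> cached body

    def store(self, url, body):
        self.cache[url] = body

    def digest_table(self):
        # The summary shipped to the cheap side: url -> hash of cached body.
        return {url: content_digest(body) for url, body in self.cache.items()}


class CheapServer:
    """Hypothetical helper on the cheap link: caches nothing but digests."""

    def __init__(self, fetch):
        self.fetch = fetch          # callable(url) -> bytes, stands in for HTTP
        self.remote_digests = {}    # last table received from the far side

    def receive_digests(self, table):
        self.remote_digests = dict(table)

    def revalidate(self, url):
        """Return None if the far side's copy is still good, else the new body."""
        body = self.fetch(url)
        new_digest = content_digest(body)
        if self.remote_digests.get(url) == new_digest:
            return None             # "it's still fresh, carry on"
        self.remote_digests[url] = new_digest
        return body                 # content really changed, send it over


# Usage: a fake origin whose content changes once.
origin = {"http://example.com/a.jpg": b"old image bytes"}
expensive = ExpensiveServer()
expensive.store("http://example.com/a.jpg", b"old image bytes")

cheap = CheapServer(lambda url: origin[url])
cheap.receive_digests(expensive.digest_table())

unchanged = cheap.revalidate("http://example.com/a.jpg")   # nothing to send
origin["http://example.com/a.jpg"] = b"new image bytes"
changed = cheap.revalidate("http://example.com/a.jpg")     # new body to send
```

The point of the sketch is that only a tiny "still fresh" answer crosses the
satellite link when the content hasn't actually changed, no matter what the
cache headers claim.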


As far as I can tell from (very limited) experimenting and reading, this
doesn't *appear* to be how it works, but I may well just have this messed
up.  So, I thought I'd ask: is that the idea, is that possible, plausible,
on the road map, or just plain insane?  I'm not a gifted coder, but in a
pinch I can usually do more good than harm, just wondering if this is worth
digging into.  Curious what your thoughts are on this, thank you!
