[squid-users] High-Availability in Squid

Amos Jeffries squid3 at treenet.co.nz
Mon Aug 31 14:28:59 UTC 2015


On 31/08/2015 6:23 p.m., Ashish Mukherjee wrote:
> Hi,
> 
> Agree that Squid is a specialized proxy and more optimal architecture for
> the purpose and trying to achieve HA on the Browser side is certainly a bad
> idea.
> 
> Talking specifically of a reverse proxy scenario, whether one uses Squid or
> Apache mod_proxy or something else may well depend upon what other features
> are needed and the traffic volumes. In many reverse proxy environments
> where more complex control is needed, mod_proxy seems to be often used with
> modules like mod_rewrite.

Does it not strike you as somehow wrong that "flexibility" is gained by
mangling the original request URL in ways such that the script engines
do not see what the original actually is?

A very large portion of that "complexity" in the backend applications
and CGI is having to deal with the way the URL was (or might have been)
mangled by the server itself, and then guessing what URIs to output that
the client would understand in the public context.


> My understanding of Squid is that it does great
> as a proxy but does not provide these features, as that is not its
> purpose.  Does Squid have its own production level extensions for some
> scenarios which may be typically addressed by Apache modules?


The key is again in the middle word of the phrase "reverse proxy
scenario". If it is *proxy* related, Squid does it.


Extensions? Everything relevant is core functionality for a proxy. But
yes, there are add-ons and extensions for integrating with particular
network situations. We call them "helpers".

"production level"? Squid is the de-facto benchmark all the other
proxying software is compared against. Including mod_proxy. We usually
see them crowing about how fast they are at one particular little
targeted feature while glossing over the things they traded away to get
that speed. Squid goes somewhat slower overall, but fast enough and
"does everything".


virtual hosting?
 http_port 80 accel vhost
 https_port 443 accel vhost

mod_rewrite?
 url_rewrite_program (a helper interface, script your own poison)

mod_proxy?
 cache_peer

authentication?
 auth_param (a helper interface)

security policies?
 acl (including a helper interface ACL)
 many *_access directives

message payload transcoding?
 ESI
 icap_service
 ecap_service


Okay, that last one is not exactly internal to Squid (except ESI, which
is), but that is because of the line between proxy and origin: touching
the message content is not a proxy function.
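
To make the url_rewrite_program entry in the list above more concrete,
here is a minimal sketch of such a helper in Python. It assumes the
Squid 3.4+ reply syntax (OK rewrite-url="..." to rewrite, ERR to leave
the URL alone) and that the URL is the first field after an optional
numeric channel ID. The REWRITES table, the example.com URLs and the
script path in the comment are hypothetical, chosen only for
illustration.

```python
#!/usr/bin/env python3
"""Sketch of a Squid url_rewrite_program helper (assumes Squid 3.4+ replies).

Hypothetical squid.conf wiring (the path is an assumption):
    url_rewrite_program /usr/local/bin/rewrite_helper.py
"""
import sys

# Hypothetical prefix mapping, for illustration only.
REWRITES = {
    "http://www.example.com/old/": "http://www.example.com/new/",
}


def rewrite(url):
    """Return the rewritten URL, or None to leave the request untouched."""
    for prefix, target in REWRITES.items():
        if url.startswith(prefix):
            return target + url[len(prefix):]
    return None


def handle_line(line):
    """Build the helper reply for one request line sent by Squid."""
    fields = line.split()
    if not fields:
        return "ERR"
    # When helper concurrency is enabled, Squid prefixes each line with a
    # numeric channel ID, which has to be echoed back in the reply.
    channel = ""
    if fields[0].isdigit() and len(fields) > 1:
        channel = fields[0] + " "
        fields = fields[1:]
    new_url = rewrite(fields[0])
    if new_url is None:
        return channel + "ERR"  # ERR here means "no change"
    return channel + 'OK rewrite-url="' + new_url + '"'


def main():
    # Squid keeps the helper running and feeds it one request per line.
    for line in sys.stdin:
        sys.stdout.write(handle_line(line) + "\n")
        sys.stdout.flush()  # reply immediately; Squid waits on each answer


if __name__ == "__main__":
    main()
```

The helper never touches message bodies, only the request URL, which
keeps it on the proxy side of the line described above. Anything more
elaborate (database lookups, regex tables) would slot into rewrite().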


The one scenario where using Apache modules makes complete sense is when
dealing with FastCGI and/or a mix of FastCGI and static content on the
same server. That's where Apache came from, and it does it well.

Once you start getting into using HTTP to pull from other servers and/or
ports on one server you are moving well into territory where a proper
proxy is the better tool (not just Squid, there are others). Apache
simply won't scale. Squid scales both horizontally and vertically. Our
poster child installations are Wikimedia (~200 Squid instances serving up
Wikipedia on a scale of TB/sec), and FrontierNET at CERN (a mesh layout
pumping Petabytes of science data around, where the small files are
measured in GB).

Amos


