[squid-users] load balancing and site failover
Brendan Kearney
bpk678 at gmail.com
Wed Mar 25 21:26:07 UTC 2015
On Wed, 2015-03-25 at 15:03 +1300, Amos Jeffries wrote:
> On 25/03/2015 9:55 a.m., brendan kearney wrote:
> > Was not sure if bugzilla was used for mailing list issues. If you would
> > like me to open one, I will but it looks like the list is working again.
>
> Bugzilla is used, list bugs under the "project services" product.
>
>
> As for your query...
>
> > On Mar 24, 2015 2:25 PM, "Brendan Kearney" wrote:
> >
> >> On Tue, 2015-03-24 at 10:18 -0400, Brendan Kearney wrote:
> >>> while load balancing is not a requirement in a proxy environment, it
> >>> does afford a great deal of functionality, scaling and fault tolerance
> >>> in one. several if not many on this list probably employ them for their
> >>> proxies and likely other technologies, but they are not all created
> >>> equal.
> >>>
> >>> i recently looked to see if a specific feature was in HAProxy. i was
> >>> looking to see if HAProxy could reply to a new connection with a RST
> >>> packet if no pool member was available.
> >>>
> >>> the idea behind this is, if all of the proxies are not passing the
> >>> service check and are marked down by the load balancer, the reply of a
> >>> RST in the TCP handshake (i.e. SYN -> RST, not SYN -> SYN/ACK -> ACK)
> >>> tells the browser to failover to the next proxy assigned by the PAC
> >>> file.
> >>>
> >>> where i work, we have this configuration working. the load balancers
> >>> are configured with the option to send a reset when no proxy is
> >>> available in the pool. the PAC file assigns all 4 of the proxy VIPs in
> >>> a specific order based on which proxy VIP is assigned as the primary.
> >>> In every case, if the primary VIP does not have an available pool
> >>> member, the browser fails over to the next in the list. failover would
> >>> happen again, if the secondary VIP replies with a RST during
> >>> connection establishment. the process repeats until a TCP connection
> >>> establishes or all proxies assigned have been exhausted. the browser
> >>> will use the proxy VIP that it successfully connects to, for the
> >>> duration of the session. once the browser is closed and reopened, the
> >>> evaluation of the PAC file occurs again, and the process starts anew.
> >>> plug-ins such as Proxy Selector are the exception to this, and can be
> >>> used to reevaluate a PAC file by selecting it for use.
> >>>
> >>> we have used this configuration several times, when we found an ISP link
> >>> was flapping or some other issue more global in nature than just the
> >>> proxies was affecting our egress and internet access. i can attest to
> >>> the solution as working and elegantly handling site wide failures.
> >>>
> >>> being that the solutions where i work are proprietary commercial
> >>> products, i wanted to find an open source product that does this. i
> >>> have been a long time user of HAProxy, and have recommended it for
> >>> others here, but sadly it cannot perform this function. per their
> >>> mailing list, they use the network stack of the OS for connection
> >>> establishment and cannot cause a RST to be sent to the client during a
> >>> TCP handshake if no pool member is available.
> >>>
> >>> they suggested an external helper that manipulates IPTables rules based
> >>> on a pool member being available. they do not feel that a feature like
> >>> this belongs in a layer 4/7 reverse proxy application.
>
> They are right. HTTP != TCP.
i didn't confuse that detail. i was unaware that HAProxy could
not tie layer 7 status to layer 3/4 actions. the decisions they made
and how they architected the app are why they cannot do this, not that it
is technically impossible to do it. i may be spoiled because i work
with equipment that can do this for me.
>
> In particular TCP depends on routers having a full routing map of the
> entire Internet (provided by BGP) and deciding the best upstream hop
> based on that global info. Clients have one (and only one) upstream
> router for each server they want to connect to.
i will contest this. my router does not need a full BGP map to route
traffic locally on my LAN or remotely out its WAN interface. hell, it
does not even run BGP, and i can still get to the intarwebs, no problem.
it, too, has only one upstream router / default route.
>
> In HTTP each proxy (aka router) performs independent upstream connection
> attempts, failover, and verifies it worked before responding to the
> client with a final response. Each proxy only has enough detail to check
> its upstream(s). Each proxy can connect to any server (subject to ACLs).
how are you comparing an HTTP proxy (a layer 7 application) to a router
(a layer 3 device)? routers route traffic and proxies proxy traffic.
very different functions. routers don't look past a certain point in the
headers in order to make decisions on where to send the traffic.
proxies look all the way to the end of the headers and sometimes into
the payload, too. proxies are more akin to a protocol specific
firewall. proxies also don't send the incoming traffic out an interface.
they terminate the client session, and initiate a new session on behalf
of the client. the fact that the proxy can elect how to send a request
it is making on behalf of a client does not make a proxy a router. the
fact that one connection is terminated and a new one is initiated rules
out a proxy being any kind of router, in my opinion. even with SSL
or the CONNECT method, the connection is still made by the proxy to the
remote server. the client never makes a connection to the remote
server, therefore the traffic was not routed. it was proxied.
>
> >>>
> >>> my search for a load balancer solution went through ipvsadm, balance and
> >>> haproxy before i selected haproxy. haproxy was more feature rich than
> >>> balance, and easier to implement than ipvsadm. do any other list
> >>> members have a need for such a feature from their load balancers? do
> >>> any other list members have site failover solutions that have been
> >>> tested or used and would consider sharing their design and/or pain
> >>> points? i am not looking for secret sauce or confidential info, but
> >>> more high level architecture decisions and such.
>
>
> I haven't tested it but this should do what you are asking:
>
> acl err http_status 500-505 408
> deny_info TCP_RESET err
> http_reply_access deny err
>
> It replaces the response from Squid with a TCP RST packet.
this is useful in the case that the proxy is alive and well, but cannot
get to the internet. in my example, the ISP issue would seem to be
covered, though i am not sure how the actual implementation would go.
the client has a TCP session established with the load balancer, which
gets the full SYN -> SYN/ACK -> ACK treatment. the load balancer would
get the RST from the proxy, and presumably send the RST back to
the client. While this does seem to hold up logically, the
implementation may have nuances that have to be dealt with. Does the
RST in the middle of an established TCP session cause the browser to
fail over to the next proxy assigned? i would have to test that out.
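
One way to start testing this from the client side is to look at how a RST during the handshake surfaces through the socket API. The Python sketch below is only an illustration and does not model any browser: `handshake_result` and `first_reachable` are hypothetical helper names, and the 2 second timeout is an arbitrary assumption. A RST in reply to the SYN shows up as a refused connection, which is the signal the PAC failover relies on. A RST on an already established session would instead surface later as a reset error on send or receive, and is not covered here.

```python
import socket


def handshake_result(host, port, timeout=2.0):
    """Classify the outcome of a TCP handshake with one proxy VIP.

    Returns "connected", "refused" (SYN answered with RST) or
    "unreachable" (timeout or any other socket error).
    """
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
    except ConnectionRefusedError:
        # The peer replied to our SYN with a RST.
        return "refused"
    except OSError:
        # Timeouts, unreachable networks, resolution failures and so on.
        return "unreachable"
    sock.close()
    return "connected"


def first_reachable(proxies, timeout=2.0):
    """Walk an ordered list of (host, port) pairs, as a client working
    through a PAC-style proxy list would, and return the first one that
    completes a handshake, or None if all of them fail."""
    for host, port in proxies:
        if handshake_result(host, port, timeout) == "connected":
            return (host, port)
    return None
```

Note that a VIP answering with a RST fails immediately, while one that silently drops the SYN costs the full timeout before the next candidate is tried.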
now, what about the case where the proxies are not alive and well behind
the load balancer, and they are not able to reply with a RST? This is
the scenario that i would want the load balancer to be able to manage.
this is where tying a layer 7 status to a layer 3/4 action on the load
balancer becomes relevant. then, the ability for the load balancer to
do this negates the need to manage this in the proxy layer, and removes
any nuances that may be encountered with the implementation.
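
For reference, the external helper suggested on the HAProxy list could look roughly like the Python sketch below. It is a sketch under stated assumptions, not a tested implementation: all function names are hypothetical, the VIP `192.0.2.10` and port `3128` in the usage note are placeholders, and the health check is a bare TCP connect, where a real deployment would more likely send an HTTP request through each proxy to get a layer 7 status. The rule itself uses the iptables REJECT target with `--reject-with tcp-reset`, so new SYNs to the VIP are answered with a RST while every pool member is down. It assumes iptables runs on the host that owns the VIP.

```python
import socket
import subprocess


def member_is_up(host, port, timeout=1.0):
    """Layer 4 check only: can we complete a TCP handshake with the member?"""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def reset_rule(action, vip, port):
    """Build the iptables argv that inserts ("-I") or deletes ("-D") a rule
    answering new connections to vip:port with a TCP RST."""
    return ["iptables", action, "INPUT", "-p", "tcp", "-d", vip,
            "--dport", str(port), "--syn",
            "-j", "REJECT", "--reject-with", "tcp-reset"]


def desired_action(members_up, rule_installed):
    """Decide what to do with the rule given the current pool state."""
    pool_up = any(members_up)
    if not pool_up and not rule_installed:
        return "-I"   # whole pool is down: start sending RSTs
    if pool_up and rule_installed:
        return "-D"   # at least one member is back: stop sending RSTs
    return None       # nothing to change


def reconcile(vip, port, members, rule_installed, dry_run=True):
    """Check every member once and apply (or just print) the needed change.
    Returns whether the rule is installed afterwards."""
    states = [member_is_up(host, mport) for host, mport in members]
    action = desired_action(states, rule_installed)
    if action is None:
        return rule_installed
    cmd = reset_rule(action, vip, port)
    if dry_run:
        print(" ".join(cmd))
    else:
        subprocess.run(cmd, check=True)  # needs root privileges
    return action == "-I"
```

A caller would run something like `reconcile("192.0.2.10", 3128, members, rule_installed)` in a loop and carry the returned flag into the next pass. Keeping the decision in `desired_action` separate from the command execution makes the logic easy to check without touching the firewall.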
>
> Amos