[squid-users] No failover when default parent proxy fails (Squid 3.5.12)

Jens Offenbach wolle5050 at gmx.de
Thu Mar 16 11:36:59 UTC 2017


Thanks for your detailed explanation...

Well, the "default" option does not seem to be the right choice to achieve the expected behaviour. Hopefully, the following change will make all traffic pass through the primary proxy whenever it is reachable:

# OPTIONS WHICH AFFECT THE NEIGHBOR SELECTION ALGORITHM
# -----------------------------------------------------------------------------
  cache_peer proxy.materna.de parent 8080 0 no-digest no-query connect-timeout=5 connect-fail-limit=2 weight=2
  cache_peer roxy.materna.de parent 8080 0 no-digest no-query connect-timeout=5 connect-fail-limit=2

I am sorry for the incorrect test setup... I just added:
$ iptables -A OUTPUT -p tcp -d 139.2.1.3 -j REJECT

and the failover takes place very fast:
# iptables -D OUTPUT -p tcp -d 139.2.1.3 -j REJECT (Simulate primary proxy is online)
1489663755.767   4063 10.30.216.160 TCP_TUNNEL/200 26329103 CONNECT repository.apache.org:443 - FIRSTUP_PARENT/139.2.1.3 -
# iptables -A OUTPUT -p tcp -d 139.2.1.3 -j REJECT (Simulate primary proxy offline)
1489663818.913  33845 10.30.216.160 TCP_TUNNEL/200 20569674 CONNECT repository.apache.org:443 - ANY_OLD_PARENT/139.2.1.4 -
# iptables -D OUTPUT -p tcp -d 139.2.1.3 -j REJECT (Simulate primary proxy back online)
1489663850.148   4521 10.30.216.160 TCP_TUNNEL/200 20809195 CONNECT repository.apache.org:443 - FIRSTUP_PARENT/139.2.1.3 -
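
To check the same thing over a longer access.log, the peer that served each request can be pulled out of the hierarchy field. The following is only a minimal sketch: it assumes Squid's default native log format (hierarchy code and peer in the ninth whitespace-separated field), and the function names are made up for illustration.

```python
# Minimal sketch: report which parent served each request in access.log.
# Assumes the default native "squid" logformat; field positions would differ
# with a custom logformat directive.
SAMPLE_LOG = """\
1489663755.767   4063 10.30.216.160 TCP_TUNNEL/200 26329103 CONNECT repository.apache.org:443 - FIRSTUP_PARENT/139.2.1.3 -
1489663818.913  33845 10.30.216.160 TCP_TUNNEL/200 20569674 CONNECT repository.apache.org:443 - ANY_OLD_PARENT/139.2.1.4 -
1489663850.148   4521 10.30.216.160 TCP_TUNNEL/200 20809195 CONNECT repository.apache.org:443 - FIRSTUP_PARENT/139.2.1.3 -
"""


def parse_line(line):
    """Split one native-format line into the fields of interest."""
    fields = line.split()
    # Field 8 looks like "FIRSTUP_PARENT/139.2.1.3": hierarchy code, then peer.
    hierarchy, _, peer = fields[8].partition("/")
    return {
        "elapsed_ms": int(fields[1]),
        "method": fields[5],
        "target": fields[6],
        "hierarchy": hierarchy,
        "peer": peer,
    }


def summarise(log_text):
    """Parse every non-empty line of an access.log excerpt."""
    return [parse_line(line) for line in log_text.splitlines() if line.strip()]


if __name__ == "__main__":
    for entry in summarise(SAMPLE_LOG):
        print(entry["peer"], entry["hierarchy"], entry["elapsed_ms"], "ms")
```

Keep in mind that the elapsed time covers the whole tunnel lifetime, not just the time needed to pick a peer.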

Those two parent proxies are not under my control. I will talk to my IT guys about what is wrong here.

Next Wednesday is the next planned downtime of the primary proxy. I will test the failover again then.

You really helped me a lot. Thank you very much!

Regards,
Jens

Sent: Thursday, 16 March 2017 at 11:51
From: "Amos Jeffries" <squid3 at treenet.co.nz>
To: squid-users at lists.squid-cache.org
Subject: Re: [squid-users] No failover when default parent proxy fails (Squid 3.5.12)
On 16/03/2017 10:39 p.m., Jens Offenbach wrote:
> This is the scenario:
>
> Squid 3.5.12 is installed on "squid-proxy.mycompany.com". The two parent proxies are:
> - Primary: proxy.mycompany.de:8080 (139.2.1.3)
> - Fallback: roxy.mycompany.de:8080 (139.2.1.4)
>
> I had misunderstood the "default" option in "cache_peer". If I understand it correctly now, it acts as a fallback, so I moved it to "roxy.mycompany.de". "proxy.mycompany.de" should always be used, and "roxy.mycompany.de" only when "proxy.mycompany.de" fails.
>

Well, kind of. Unless that peer is selected by one of the other
algorithms (for that it has to be 'alive') it will be appended as the
last-resort peer to be used regardless of DEAD/alive status.


> squid.conf:
>
...
>
> # OPTIONS WHICH AFFECT THE NEIGHBOR SELECTION ALGORITHM
> # -----------------------------------------------------------------------------
> cache_peer proxy.materna.de parent 8080 0 no-digest no-query connect-timeout=5 connect-fail-limit=2
> cache_peer roxy.materna.de parent 8080 0 no-digest no-query connect-timeout=5 connect-fail-limit=2 default
>
...
> # OPTIONS INFLUENCING REQUEST FORWARDING
> # -----------------------------------------------------------------------------
> always_direct allow to_matnet
> never_direct allow all
>
> # DNS OPTIONS
> # -----------------------------------------------------------------------------
> dns_nameservers 139.2.34.171
> dns_nameservers 139.2.34.37
>
...
>
> Now, I block traffic on "squid-proxy.mycompany.com" to the primary proxy "proxy.mycompany.de" (139.2.1.3) using IPTables:
> $ iptables -A OUTPUT -p icmp -d 139.2.1.3 -j DROP
> $ iptables -A OUTPUT -p tcp -d 139.2.1.3 -j DROP
> $ iptables -A OUTPUT -p udp -d 139.2.1.3 -j DROP
>

Are you trying to test connection timeout issues or a host going offline?
These iptables rules will force a timeout but not emulate a host
disconnection. Particularly when ICMP is also dropped.

When a host disconnects Squid will receive active signals (maybe via
ICMP) that the TCP SYN packet cannot get through. That speeds up failure
recovery enormously. If the peer software simply crashes/exits,
different signals happen but with the same super fast effects.

REJECT rules would be a better emulation of a machine disconnecting, or
an only-TCP REJECT rule emulates a peer software crash, etc. That way
the ICMP signalling still happens similar to those types of failure.
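
The difference between an active rejection and a silent drop can be seen from any client that times a TCP connect. The sketch below is only an analogy, not a reproduction of the iptables rules: a local port with nothing listening stands in for a rejecting peer, and the helper names are invented for this example.

```python
# Minimal sketch: time a TCP connect and classify how it ended.
# An actively refused connection fails almost at once; a silently dropped
# SYN would only end when the timeout expires.
import socket
import time


def time_connect(host, port, timeout=5.0):
    """Attempt a TCP connect and return (outcome, seconds_elapsed)."""
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            outcome = "connected"
    except ConnectionRefusedError:
        outcome = "refused"   # active rejection, reported immediately
    except socket.timeout:
        outcome = "timeout"   # nothing came back before the timeout
    except OSError as exc:
        outcome = "error: %s" % exc
    return outcome, time.monotonic() - start


def closed_local_port():
    """Return a local port that (very likely) has nothing listening on it."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        probe.bind(("127.0.0.1", 0))
        return probe.getsockname()[1]


if __name__ == "__main__":
    outcome, elapsed = time_connect("127.0.0.1", closed_local_port())
    print(outcome, "after %.3f s" % elapsed)
```

Pointing time_connect() at a peer behind a DROP rule would be expected to report "timeout" only after the full timeout has passed, which is the slow case described above.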



> On the test machine, I use:
> $ export http_proxy=http://squid-proxy.mycompany.com:3128/
> $ export https_proxy=http://squid-proxy.mycompany.com:3128/
> $ export HTTP_PROXY=http://squid-proxy.mycompany.com:3128/
> $ export HTTPS_PROXY=http://squid-proxy.mycompany.com:3128/
>
> Trying to download a resource:
> $ wget https://repository.apache.org/content/groups/snapshots/org/apache/karaf/apache-karaf/4.1.1-SNAPSHOT/apache-karaf-4.1.1-20170315.084054-35.tar.gz
>
> The download hangs for 2 minutes until it gets started. A retry shows the same results, the download starts after 2 minutes showing:
> --2017-03-16 09:31:26-- https://repository.apache.org/content/groups/snapshots/org/apache/karaf/apache-karaf/4.1.1-SNAPSHOT/apache-karaf-4.1.1-20170314.154157-34.tar.gz
> Resolving squid-proxy.mycompany.com (squid-proxy.mycompany.com)... 10.152.132.41
> Connecting to squid-proxy.mycompany.com (squid-proxy.mycompany.com)|10.152.132.41|:3128... connected.
>
> cache.log:
>
...
> 2017/03/16 10:17:48 kid1| Starting Squid Cache version 3.5.12 for x86_64-pc-linux-gnu...
> 2017/03/16 10:17:48 kid1| Service Name: squid
> 2017/03/16 10:17:48| pinger: Initialising ICMP pinger ...
> 2017/03/16 10:18:09.579 kid1| 44,2| peer_select.cc(258) peerSelectDnsPaths: Find IP destination for: http://proxy.materna.de:8080/squid-internal-dynamic/netdb' via proxy.materna.de
> 2017/03/16 10:18:09.579 kid1| 44,2| peer_select.cc(280) peerSelectDnsPaths: Found sources for 'http://proxy.materna.de:8080/squid-internal-dynamic/netdb'

These can be avoided by adding the no-netdb-exchange option to the
cache_peer config lines. But it is probably a good idea to keep them for
production use as they will be the way of detecting a peer's recovery to
live status.

...
> 2017/03/16 10:18:37.951 kid1| 44,2| peer_select.cc(258) peerSelectDnsPaths: Find IP destination for: repository.apache.org:443' via proxy.materna.de
> 2017/03/16 10:18:37.951 kid1| 44,2| peer_select.cc(258) peerSelectDnsPaths: Find IP destination for: repository.apache.org:443' via proxy.materna.de
> 2017/03/16 10:18:37.951 kid1| 44,2| peer_select.cc(258) peerSelectDnsPaths: Find IP destination for: repository.apache.org:443' via roxy.materna.de
> 2017/03/16 10:18:37.951 kid1| 44,2| peer_select.cc(258) peerSelectDnsPaths: Find IP destination for: repository.apache.org:443' via roxy.materna.de
> 2017/03/16 10:18:37.951 kid1| 44,2| peer_select.cc(280) peerSelectDnsPaths: Found sources for 'repository.apache.org:443'
> 2017/03/16 10:18:37.951 kid1| 44,2| peer_select.cc(281) peerSelectDnsPaths: always_direct = DENIED
> 2017/03/16 10:18:37.951 kid1| 44,2| peer_select.cc(282) peerSelectDnsPaths: never_direct = ALLOWED
> 2017/03/16 10:18:37.951 kid1| 44,2| peer_select.cc(292) peerSelectDnsPaths: cache_peer = local=0.0.0.0 remote=139.2.1.3:8080 flags=1
> 2017/03/16 10:18:37.951 kid1| 44,2| peer_select.cc(292) peerSelectDnsPaths: cache_peer = local=0.0.0.0 remote=139.2.1.3:8080 flags=1
> 2017/03/16 10:18:37.951 kid1| 44,2| peer_select.cc(292) peerSelectDnsPaths: cache_peer = local=0.0.0.0 remote=139.2.1.4:8080 flags=1
> 2017/03/16 10:18:37.951 kid1| 44,2| peer_select.cc(292) peerSelectDnsPaths: cache_peer = local=0.0.0.0 remote=139.2.1.4:8080 flags=1
> 2017/03/16 10:18:37.951 kid1| 44,2| peer_select.cc(295) peerSelectDnsPaths: timedout = 0
>

Hmm. Something is going wrong with our logic to ensure unique IP:port
entries in the list of selected paths. It should not be affecting your
issue much though.

> access.log
>
> 1489656077.628 159679 10.30.216.160 TCP_TUNNEL/200 26328966 CONNECT repository.apache.org:443 - ANY_OLD_PARENT/139.2.1.4 -
>

Uhm. One thing to be very wary of is that transactions are not logged
until they are completed, so that things like their full duration and
byte counts can be recorded.

When CONNECT requests are involved, some people who are not fully aware
of the meaning of that request method can be surprised by the lack of
log entries. It is a tunnel, and whole *weeks* worth of various traffic
can happen inside before it reaches that complete state for logging.
You might see nothing actually happening except CONNECT lines being
logged with zero sizes, or huge amounts of https:// URLs being fetched
without a single access.log line occurring ... or any mix of behaviour in
between.

This connection had 26MB transferred over it. The 'connect' stage (TCP
SYN / SYN-ACK exchange) may have been successful within the first 11
seconds (5sec timeout on first two cache_peer in that cache.log list,
then immediate success on the third) and just nothing visibly happening
on it at the HTTP level for a bit while the TLS crypto did things.
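
As a rough illustration of that estimate, the arithmetic can be written out as follows. This is a sketch under simplifying assumptions: attempts are made strictly in the order of the cache.log path list, each unreachable entry costs the full connect-timeout, and a successful connect takes negligible time. The function name is made up for this example.

```python
# Minimal sketch: how long it takes to reach the first usable path when
# every unreachable entry burns the full connect-timeout.
CONNECT_TIMEOUT_S = 5  # connect-timeout=5 from the cache_peer lines

# Path list in the order shown in cache.log: (remote IP, reachable?).
# 139.2.1.3 is the blocked primary, 139.2.1.4 the fallback.
PATHS = [
    ("139.2.1.3", False),
    ("139.2.1.3", False),
    ("139.2.1.4", True),
    ("139.2.1.4", True),
]


def seconds_until_connected(paths, timeout_s):
    """Return (ip, seconds_waited) for the first reachable path."""
    waited = 0
    for ip, reachable in paths:
        if reachable:
            return ip, waited
        waited += timeout_s  # this attempt timed out before moving on
    return None, waited      # every path failed


if __name__ == "__main__":
    ip, waited = seconds_until_connected(PATHS, CONNECT_TIMEOUT_S)
    print("connected to", ip, "after about", waited, "seconds")
```

Two timed-out attempts at 5 seconds each account for roughly 10 seconds, which is consistent with the connect stage finishing within the first 11 seconds.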

If things are breaking or going slowly at the TLS layer or higher, then
there is nothing you can do in this Squid. As far as this Squid is
concerned the TCP tunnel was setup fine and working. What is inside it
is opaque.

I have just done a test of those two peers from here to see how the
setup goes, and there is a delay of over 2min 10-12sec before my ISP's
NAT system cuts the connection. Something is very broken with those
particular peers or the network they reside in. That whole process
should have taken under 350ms and been terminated by their end.

Amos

_______________________________________________
squid-users mailing list
squid-users at lists.squid-cache.org
http://lists.squid-cache.org/listinfo/squid-users

