[squid-dev] [RFC] dns_wait_for_all

Alex Rousskov rousskov at measurement-factory.com
Wed Sep 14 20:15:17 UTC 2016


Hello,

    Currently, when connecting to an origin server, Squid sends
concurrent DNS A and AAAA queries and waits for both answers before
proceeding with the HTTP transaction. If the authoritative DNS server
(or something on its path) breaks or significantly delays IPv6 (AAAA)
transactions, then Squid waits until timeout, even if Squid already has
a usable IPv4 address from a successful A query. This naturally leads to
admins disabling IPv6, willingly or under external pressure.

As Happy Eyeballs algorithms and related discussions/measurements have
established, the best migration path to IPv6 requires making sure that
enabling IPv6 does not create [user-visible] problems. Once that is
accomplished, software can and should prefer IPv6 over IPv4 by default.
I hope that we do not need to revisit those discussions to accept that
principle.

Currently, Squid violates that principle -- enabling IPv6 leads to
user-visible problems, and those problems lead to IPv6 being disabled.


This will not fix all IPv6 problems, but I propose to modify Squid so
that it starts connecting after receiving the first usable DNS response:

> dns_wait_for_all <on|off>
> 
> Determines whether Squid resolves domain names of all possible
> destinations in all supported address families before deciding which
> IP address to try first when contacting an origin server or cache_peer.
> 
> Before Squid can connect to a peer, it needs an IP address. Obtaining an
> IP address often requires a DNS lookup. Squid often makes two concurrent
> DNS lookups: An "A" query for an IPv4 address and an "AAAA" query for an
> IPv6 address. This directive does not affect the number of DNS queries
> sent or the side-effects of those queries (e.g., IP cache updates), but
> if two concurrent lookups are initiated, and this directive is off, then
> Squid proceeds immediately after receiving the first usable DNS answer.
> 
> This directive does not affect forwarding retries. For example, if
> dns_wait_for_all is off, and Squid gets an IPv4 address first, but the
> TCP connection to that IPv4 address fails, Squid will wait for the IPv6
> address resolution to complete (if it has not yet) and will then connect
> to an IPv6 address (if possible).
> 
> Furthermore, this directive does not affect the number of peer domain
> names that Squid will attempt to resolve or peer addresses that Squid
> may connect to. If Squid is allowed to forward a transaction to two
> peers, then Squid will resolve both peer names and, if failures make it
> necessary, will connect to all IP addresses of both peers (subject to
> other restrictions such as connect_retries).
> 
> See also: dns_v4_first
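
To illustrate the proposed semantics, here is a minimal C++ sketch of
the "proceed or keep waiting" decision. It is not Squid code:
LookupState and addressToTryNow() are hypothetical names invented for
this example, and preferring IPv6 when both answers are usable is an
assumption (the sketch ignores dns_v4_first).

```cpp
#include <string>

// Hypothetical model of one address-family lookup; not a real Squid type.
struct LookupState {
    bool finished;        // an answer, failure, or timeout has arrived
    std::string address;  // usable IP address; empty if none was obtained
};

// Returns the address to connect to now, or an empty string when the
// caller should keep waiting. waitForAll models the proposed
// dns_wait_for_all setting (true means "on").
std::string
addressToTryNow(const LookupState &aaaa, const LookupState &a, bool waitForAll)
{
    if (waitForAll && !(aaaa.finished && a.finished))
        return std::string(); // "on": wait until both lookups have ended

    // Assumption for this sketch: prefer IPv6 when both answers are usable.
    if (aaaa.finished && !aaaa.address.empty())
        return aaaa.address;
    if (a.finished && !a.address.empty())
        return a.address;
    return std::string(); // nothing usable yet
}
```

With the AAAA lookup still pending and a usable A answer in hand, this
function returns the IPv4 address when waitForAll is false and an empty
string (keep waiting) when it is true.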

I suggest enabling this option by default because it will help with
IPv6 adoption, but I certainly do not insist on that default.


While we call both queries "concurrent", Squid sends the AAAA query just
before sending the A query. All other factors being equal, IPv6 will
usually win the DNS race. However, even if AAAA loses, Squid will use
IPv6 the next time it needs to connect to the same server.


From a development point of view, supporting this feature properly means
creating an AsyncJob that will initiate DNS queries and update the
destinations list as the answers come in while informing the caller (if
it is still alive) of any new answers. Today, FwdState does
approximately this:

  1. Call peerSelect(&serverDestinations, fwdPeerSelectionComplete)
     and wait for the fwdPeerSelectionComplete callback.

  2. When fwdPeerSelectionComplete is called,
     start iterating over pre-filled serverDestinations.

To support dns_wait_for_all, FwdState will do _approximately_ this:

  1. Call peerSelect(serverDestinations, fwdPeerSelected)
     and wait for the first fwdPeerSelected subscription callback.

  2. Every time fwdPeerSelected is called,
     start or resume iterating over still-unused serverDestinations
     if we were actually waiting for the next destination to try.
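
The two steps above can be sketched in C++ roughly as follows. This is
only an illustration under stated assumptions, not the actual FwdState
or peer selection code: DestinationFeed and Forwarder are hypothetical
stand-ins, the callback takes no arguments, and everything runs
synchronously rather than through AsyncJob/AsyncCall machinery.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the proposed resolver job: it appends each
// new answer to the destinations list and then notifies its subscriber.
class DestinationFeed {
public:
    using Callback = std::function<void()>;

    explicit DestinationFeed(Callback cb): onNewDestination(std::move(cb)) {}

    void addAnswer(const std::string &ip) {
        destinations.push_back(ip);
        if (onNewDestination)
            onNewDestination();
    }

    std::vector<std::string> destinations;

private:
    Callback onNewDestination;
};

// Hypothetical stand-in for FwdState.
class Forwarder {
public:
    // Step 1: subscribe to destination updates.
    Forwarder(): feed([this] { fwdPeerSelected(); }) {}
    Forwarder(const Forwarder &) = delete; // the callback captures "this"

    // Step 2: called for every new destination; resumes iteration only
    // if we were actually waiting for the next destination to try.
    void fwdPeerSelected() {
        if (waiting)
            tryNextDestination();
    }

    // A connection attempt failed: try the next unused destination or
    // go back to waiting for more answers.
    void noteConnectFailure() {
        waiting = true;
        tryNextDestination();
    }

    DestinationFeed feed;
    std::vector<std::string> attempted; // destinations tried so far

private:
    void tryNextDestination() {
        if (nextUnused >= feed.destinations.size())
            return; // nothing unused yet; remain in the waiting state
        attempted.push_back(feed.destinations[nextUnused++]);
        waiting = false; // a (simulated) connection attempt is in progress
    }

    std::size_t nextUnused = 0;
    bool waiting = true;
};
```

In this sketch, the first addAnswer() call triggers an immediate
attempt, a second answer arriving during that attempt is only recorded,
and noteConnectFailure() then moves on to the recorded address. If no
unused address remains, the forwarder waits until the next answer
arrives.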

The DNS code dealing with concurrent A and AAAA queries will need to be
adjusted to report the first answer while waiting for the second one.

It is questionable whether the new AsyncJob should continue running
(i.e., resolving more peer names) after FwdState is gone (or no more
retries are needed). However, I do not want to complicate this
discussion by introducing that side effect. We can decide to optimize
that later, with or without another configuration option to control the
behavior.

Once this new infrastructure is in place, we can accommodate IPv6
further by experimenting with limited semi-concurrent TCP connections
and other Happy Eyeballs-inspired tricks (with proxy specifics in mind).


Any better ideas or objections to adding dns_wait_for_all?


Thank you,

Alex.

