[squid-users] 3.5.4 Can't access Google or Yahoo SSL pages

Wed May 6 15:48:05 UTC 2015

On 6/05/2015 7:20 p.m., Chris Palmer wrote:
>>>> There has been a change in behaviour in 3.5.4. It now really does
>>>> prefer to contact a site using an ipv6 address rather than a v4. The
>>>> network stack here doesn't permit v6 so the traffic to sites such as
>>>> google was failing. Setting the following restored the previous
>>>> behaviour:
>>>>
>>>> dns_v4_first on
>>>
>>> As far as I'm aware squid won't try to use ipv6 unless your server has a
>>> Global address, so that shouldn't be needed? Also, wouldn't squid simply
>>> treat that as a DNS name that resolves to a bunch of addresses, so as
>>> long as the IPv6 addresses fail to connect at all, it should have still
>>> ended up succeeding with ipv4 addresses?
>>>
> 
> The description of dns_v4_first (squid.conf.documented) says
>  With the IPv6 Internet being as fast or faster than IPv4 Internet
>  for most networks Squid prefers to contact websites over IPv6.
> 
>  This option reverses the order of preference to make Squid contact
>  dual-stack websites over IPv4 first. Squid will still perform both
>  IPv6 and IPv4 DNS lookups before connecting.
> 
> This does indicate specific treatment of IPv6 addresses.

Yes. RFC 6540:

"
   o  IPv6 support must be equivalent or better in quality and
      functionality when compared to IPv4 support in a new or updated IP
      implementation.
"

We do this in Squid by making the default connection behaviour attempt
IPv6 addresses first, then IPv4.

On a machine properly supporting IPv6, including the case where IPv6
addresses have not been assigned, there is no latency.

On a machine where IPv6 has been mangled or broken by attempts to
disable it (which is actually not possible in modern systems), IPv4
suffers by default as much as IPv6.
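
To illustrate the intended behaviour (IPv6 first, then IPv4, moving on
whenever an address fails), here is a minimal Python sketch. It is not
Squid's code; the names order_addresses and connect_first and the
v4_first flag are hypothetical, chosen only to mirror what dns_v4_first
is described as doing.

```python
import socket

def order_addresses(addrinfos, v4_first=False):
    """Sort getaddrinfo()-style tuples so the preferred family comes first.

    Hypothetical helper: v4_first=True mimics "dns_v4_first on".
    """
    preferred = socket.AF_INET if v4_first else socket.AF_INET6
    # sorted() is stable, so the order within each family is preserved.
    return sorted(addrinfos, key=lambda ai: ai[0] != preferred)

def connect_first(addrinfos, timeout=5.0):
    """Try each address in turn and return the first connected socket.

    A failure on one address (for example, no IPv6 route) must not stop
    the loop; the next address is tried instead.
    """
    last_error = None
    for family, socktype, proto, _canon, sockaddr in addrinfos:
        try:
            sock = socket.socket(family, socktype, proto)
        except OSError as err:
            # The address family itself may be unsupported by the kernel.
            last_error = err
            continue
        try:
            sock.settimeout(timeout)
            sock.connect(sockaddr)
            return sock
        except OSError as err:
            last_error = err
            sock.close()
    raise last_error or OSError("no addresses to try")
```

Typical use would be something like
connect_first(order_addresses(socket.getaddrinfo(host, 443,
type=socket.SOCK_STREAM))). On a host with no IPv6 route the IPv6
attempts should fail quickly and the IPv4 addresses then get their turn.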

> 
>>> Finally, I'm running squid-3.5.4, don't have ipv6 (just like everyone
>>> else, I still do have the standard fe80:xxx ipv6 link local address) and
>>> google.com works just fine without "dns_v4_first" - which implies my
>>> statements above are correct
>>>
>>> ie this smells like you actually do have ipv6 enabled, but it's broken
>>> in some subtle way (like the pmtu issue Amos mentioned)
>>>
>>
>>
>> The tunnel.cc code producing that read/write error is one of the bits
>> still broken with regard to errno usage. So I don't entirely trust that
>> "endpoint not connected" detail; it seems right but could be something
>> subtly different.
>>
>> Anyway, to get the output at all, the TCP SYN/ACK handshake has to have
>> already set up the IPv6 connection with no errors. Then a (first? TLS?)
>> read/write operation attempted on the IPv6 socket fails and produces
>> that log message.
>>
>> There is supposed to be callback event protections preventing closed
>> sockets being used for read/write (adding to suspicion about the log
>> message), which is where I'm currently trying to figure out what state
>> things could be in.
>>
>> Amos
>>
> 
> I've done some more investigation. The problem is not SSL-related, but
> merely IPv6-related.
> A network trace confirms that, in the failing IPv6 case, no traffic
> leaves the squid host. This is expected, as there is no v6 routing. I
> had the default IPv6 link-local address on each interface, but as an
> experiment removed all IPv6 addresses from all interfaces, and the
> problem is still the same. Similarly a traceroute6 correctly reports
> no route.
> 
> In the "configure" help, there is an option to not build v6 support. The
> comment indicates that squid probes the kernel to determine if it is
> v6-capable. The problem arises when the kernel is compiled with v6
> support, but v6 is not operational on any interface - no addresses, no
> routes etc.
> 
> With 3.5.3 I didn't "see" the default "dns_v4_first off" actually doing
> anything. So for a destination host that had a v4 address it would try
> to use those, and it worked. I don't know whether it never tried the v6
> address, or maybe it did, realised it had failed, and then successfully
> tried one of the other v4 addresses. With 3.5.4 it definitely goes for
> the v6 address and then fails hard (transport not connected) without
> attempting to use any of the v4 addresses.
> 
> My hunch is that 3.5.4 has introduced a problem with error handling in
> the v6 code, causing it to fail and never try any other addresses.
> 
> For the majority of installations that do not have v6 capability, I
> suspect that the default setting of "dns_v4_first off" is inefficient
> for sites with both v4 and v6 addresses, as it always tries the v6
> addresses and fails, then goes for the v4 ones that work? If that is the
> case, a better default for v4 installations might be "dns_v4_first on".
> It would obviously fail on v6-only destinations but that is to be
> expected.

On a properly configured machine the inefficiency is measured in
nanoseconds or less. It's one syscall per IP that fails.
This also helps meet the RFC 6540 criteria, since the inefficiency is
added as a prefix to IPv4 connection usage (enhancing IPv6 relative
performance).

> 
> There is a warning in the documentation about using dns_v4_first though
> which I don't really understand.

This bug itself is a good example of what the warning is about.

If you ran Squid with "dns_v4_first on", or if it were the default, this
fairly major bug would not have been found for a long time. The same
thing applies to regular routing problems, IPv6 connectivity issues,
and so on. All of them are hidden away out of sight but still very much
exist when dns_v4_first is used.

They *will* come up to bite you later on. It's best that Squid lets you
know where the problems are so they can be solved quickly.

> I'd like to know what the implications are - and whether I would be
> better simply building squid
> without v6 support at all.

I think the bug was introduced by the fix to bug 4234. If you can assist
by building and running Squid with that bug patch reversed it would help.

Amos