[squid-users] forward_max_tries 1 has no effect

Fri Nov 24 07:43:39 UTC 2017

On 24/11/17 10:03, Ivan Larionov wrote:
>>
>> On Nov 23, 2017, at 12:32 AM, Amos Jeffries <squid3 at treenet.co.nz> wrote:
>>
>> On 23/11/17 14:20, Ivan Larionov wrote:
>>> Hello.
>>> We have an issue with squid where it tries to re-forward / retry a failed request even when forward_max_tries is set to 1. It happens when there's no response and the parent just closes the connection.
>> ...
>>> It doesn't happen 100% of the time. Sometimes squid returns 502 after the 1st try, sometimes it retries once. Also I haven't seen more than 1 retry.
>>
>> Please enable debug_options 44,2 to see what destinations your Squid is actually finding.
> 
> I'll check this on Monday.
> 
>>
>> forward_max_tries is just a rough cap on the number of server names which can be found when generating that list. The actual destinations count can exceed it if one or more of the servers happens to have multiple IPs to try.
>>
>> The overall transaction can involve retries if one of the other layers (TCP or HTTP) contains retry semantics to a single server.
>>
>>
>>
>>> Could it be a bug? We'd really like to disable these retries.
>>
>> Why are you trying to break HTTP?
>> What is the actual problem you are trying to resolve here?
>>
> 
> Why do you think I'm trying to break HTTP?
> 
> squid forwards the request to the parent, but the parent misbehaves and just closes the connection after 40 seconds. I'm trying to prevent a retry of the request in that situation. Why does squid retry when I never asked it to and specifically set "forward_max_tries 1"?
> 
> And this is not a connection failure: squid successfully establishes the connection and sends the request, the parent ACKs it, but never responds and proactively closes the connection.
> 

This is not misbehaviour on the part of either Squid or the parent.
<https://tools.ietf.org/html/rfc7230#section-6.3.1>
"Connections can be closed at any time, with or without intention."

As has been discussed in other threads recently there are servers out 
there starting to greylist TCP connections, closing the first one some 
time *after* SYN+ACK regardless of what the proxy sends and accepting 
any followup connection attempts.

NP: That can result in exactly the behaviour you describe from the peer 
as Squid does not wait for a FIN to arrive before sending its upstream 
HTTP request - Squid will "randomly" get a FIN or a RST depending on 
whether the FIN or the DATA packet wins the race into the Squid machine's 
TCP stack. FIN and RST have different retry properties which might 
explain your "sometimes retries" behaviour.

Also, TCP connections fail quite often for many other reasons anyway. 
Anything from power fluctuations at a router to BGP switching the packet 
route and dropping packets. These are most often short-term situations which are 
is resolved by the time the repeat is attempted.

What you are trying to do will result in Squid being unable to cope with 
any of these transitory restrictions from the TCP environment and force 
the client to receive a terminal error page.
That will greatly slow down detection of and recovery from the slightly 
longer-lived TCP issues in Squid itself, and may result in N other 
clients also unnecessarily receiving the same error response, as bad 
connection attempts get spread between many clients (all getting 
errors) instead of being isolated to the one or few who hit it when the 
issue initially occurs.

Setting the retries to large numbers (ie the recent default change to 
25), or to low numbers (eg the old default of 5) are reasonable things 
to do depending on the network stability to your upstreams. But going 
all the way to 0 retries is guaranteed to lead to more client-visible 
problems than necessary.

All that aside, I phrased it as a question because you might have had a 
good reason for increasing the visible failure rates.

> We're already fixing parent behavior, but still want to disable retries on squid side.
> 

Since you describe this as peer misbehaviour, treating it with Squid's 
normal TCP failure recovery is the best approach. Retry is the intended 
correct behaviour for a proxy to perform on any idempotent requests. 
In your case that means up to 1 retry before declaring a non-temporary 
route failure.

NP: idempotent vs non-idempotent may be another reason behind the 
observed behaviour of retry happening only sometimes.

If you are doing this due to overall latency/delay on the affected 
client traffic you would be better off reducing the timeouts involved 
(cache_peer connect-timeout= parameter AFAICS) than aiming at a retry 
count of 0. Perhaps also requiring a few standby=N persistent 
connections to be maintained if the peer is HTTP/1.1 capable.
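
As a rough sketch only, the cache_peer line for such a setup might look 
something like the following. The host name, ports, and all numeric 
values are placeholders (assumptions, not taken from this thread), and 
max-conn= is added on the assumption that a connection cap larger than 
the standby pool is wanted. Check the cache_peer documentation for your 
Squid version before relying on any of these options.

```
# Hypothetical parent peer; host name, ports and values are placeholders.
#   connect-timeout=5  stop waiting on a TCP connect to this peer after 5s
#   standby=4          try to keep about 4 connections to the peer open
#   max-conn=50        assumed cap, chosen to be larger than standby
cache_peer parent.example.com parent 3128 0 no-query connect-timeout=5 standby=4 max-conn=50
```

The idea is to shorten how long a client waits on a bad connection 
while leaving Squid's retry behaviour in place.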

Amos