[squid-users] forward_max_tries 1 has no effect

Ivan Larionov xeron.oskom at gmail.com
Wed Nov 29 21:15:33 UTC 2017


Thanks Alex.

Unfortunately I don't have enough C/C++ skills to fix it.

I've created a bug report –
https://bugs.squid-cache.org/show_bug.cgi?id=4788

We've also changed parent behavior so it will not silently close the
connection but will return 502 in this exact situation and seems like it
fixes unexpected squid re-forward.

On Tue, Nov 28, 2017 at 7:12 PM, Alex Rousskov <
rousskov at measurement-factory.com> wrote:

> On 11/28/2017 02:27 PM, Ivan Larionov wrote:
>
> > Another interesting fact is that I can't reproduce this issue if squid
> > has no other traffic except my testing requests. But it's easy to
> > reproduce when server has other traffic.
>
> I did not check your logs carefully, but I believe that (when things do
> not work the way you expect) your Squid is retrying a failed persistent
> connection (rather than re-forwarding the request after receiving a bad
> response). See the "pconn race happened" line below:
>
> > 1_retry.log:2017/11/28 12:55:12.731| 17,3| FwdState.cc(416) fail:
> ERR_ZERO_SIZE_OBJECT "Bad Gateway"
> > 1_retry.log:2017/11/28 12:55:12.731| 17,5| FwdState.cc(430) fail: pconn
> race happened
> > 1_retry.log:2017/11/28 12:55:12.731| 93,5| AsyncJob.cc(84) mustStop:
> HttpStateData will stop, reason: HttpStateData::continueAfterParsingHeader
> > 1_retry.log:2017/11/28 12:55:12.731| 17,3| FwdState.cc(618) retryOrBail:
> re-forwarding (1 tries, 40 secs)
> > 1_retry.log:2017/11/28 12:55:12.731| 17,4| FwdState.cc(621) retryOrBail:
> retrying the same destination
>
>
> Assuming you tested with forward_max_tries set to 1, the retryOrBail
> lines above confirm the off-by-one problem I was describing in my
> previous response.
>
> AFAICT, compensating by setting forward_max_tries to zero will _not_
> work (for reasons unrelated to the problem at hand).
>
>
> FWIW, your current options include those outlined at
> http://wiki.squid-cache.org/SquidFaq/AboutSquid#How_to_
> add_a_new_Squid_feature.2C_enhance.2C_of_fix_something.3F
>
>
> Alex.
>
>
> > On Tue, Nov 28, 2017 at 7:32 AM, Alex Rousskov wrote:
> >
> >     On 11/27/2017 05:19 PM, Ivan Larionov wrote:
> >
> >     > I see retries only when squid config has 2 parents. If I comment
> out
> >     > everything related to "newproxy" I can't reproduce this behavior
> anymore.
> >
> >     The posted logs are not detailed enough to confirm or deny that IMO,
> but
> >     I suspect that you are dealing with at least one bug.
> >
> >
> >     > https://wiki.squid-cache.org/SquidFaq/InnerWorkings#When_
> does_Squid_re-forward_a_client_request.3F
> >     <https://wiki.squid-cache.org/SquidFaq/InnerWorkings#When_
> does_Squid_re-forward_a_client_request.3F>
> >     >
> >     >> Squid does not try to re-forward a request if at least one of the
> following conditions is true:
> >     >>
> >     >> The number of forwarding attempts exceeded forward_max_tries. For
> >     >> example, if you set forward_max_tries to 1 (one), then no requests
> >     >> will be re-forwarded.
> >
> >
> >     AFAICT, there is an off-by-one bug in Squid that violates the above:
> >
> >     >     if (n_tries > Config.forward_max_tries)
> >     >         return false;
> >
> >     The n_tries counter is incremented before Squid makes a request
> >     forwarding attempt. With n_tries and Config.forward_max_tries both
> set
> >     to 1, the quoted FwdState::checkRetry() code will not prevent
> >     re-forwarding. There is a similar problem in FwdState::reforward().
> This
> >     reasoning needs confirmation/testing.
> >
> >     Please note that simply changing the ">" operator to ">=" may break
> >     other things in a difficult-to-detect-by-simple-tests ways. The
> correct
> >     fix may be more complex than it looks and may involve making policy
> >     decisions regarding forward_max_tries meaning. The best fix would
> remove
> >     checkRetry() and reforward() duplication. This code is difficult to
> work
> >     with; many related code names are misleading.
> >
> >
> >     >> Squid has no alternative destinations to try. Please note that
> >     >> alternative destinations may include multiple next hop IP
> addresses
> >     >> and multiple peers.
> >
> >     The fact that Squid sends two requests to the same peer with only one
> >     peer address selected suggests that Squid is retrying a failed
> >     persistent connection rather than re-forwarding after receiving a bad
> >     response. Again, the logs are not detailed enough to distinguish the
> two
> >     cases. I can only see that a single peer/destination address was
> >     selected (not two), which is correct/expected behavior. I cannot see
> >     what happened next with sufficient detail.
> >
> >     Going forward, you have several options, including:
> >
> >     A. Post a link to compressed ALL,7+ logs to confirm bug(s).
> >     B. Fix the broken condition(s) in FwdState. See above.
> >
> >     HTH,
> >
> >     Alex.
> >
> >
> >     > 2017/11/27 15:53:40.542| 5,2| TcpAcceptor.cc(220) doAccept: New
> connection on FD 15
> >     > 2017/11/27 15:53:40.542| 5,2| TcpAcceptor.cc(295) acceptNext:
> connection
> >     > on local=0.0.0.0:3128 <http://0.0.0.0:3128> <http://0.0.0.0:3128>
> >     remote=[::] FD 15 flags=9
> >     > 2017/11/27 15:53:40.543| 11,2| client_side.cc(2372)
> parseHttpRequest:
> >     > HTTP Client local=127.0.0.1:3128 <http://127.0.0.1:3128>
> >     <http://127.0.0.1:3128>
> >     > remote=127.0.0.1:53798 <http://127.0.0.1:53798>
> >     <http://127.0.0.1:53798> FD 45 flags=1
> >     > 2017/11/27 15:53:40.543| 11,2| client_side.cc(2373)
> parseHttpRequest:
> >     > HTTP Client REQUEST:
> >     > ---------
> >     > GET http://HOST:12345/ HTTP/1.1
> >     > Host: HOST:12345
> >     > User-Agent: curl/7.51.0
> >     > Accept: */*
> >     > Proxy-Connection: Keep-Alive
> >     >
> >     >
> >     > ----------
> >     > 2017/11/27 15:53:40.543| 85,2| client_side_request.cc(745)
> >     > clientAccessCheckDone: The request GET http://HOST:12345/ is
> ALLOWED;
> >     > last ACL checked: localhost
> >     > 2017/11/27 15:53:40.543| 85,2| client_side_request.cc(721)
> >     > clientAccessCheck2: No adapted_http_access configuration. default:
> ALLOW
> >     > 2017/11/27 15:53:40.543| 85,2| client_side_request.cc(745)
> >     > clientAccessCheckDone: The request GET http://HOST:12345/ is
> ALLOWED;
> >     > last ACL checked: localhost
> >     > 2017/11/27 15:53:40.543| 17,2| FwdState.cc(133) FwdState:
> Forwarding
> >     > client request local=127.0.0.1:3128 <http://127.0.0.1:3128>
> >     <http://127.0.0.1:3128>
> >     > remote=127.0.0.1:53798 <http://127.0.0.1:53798>
> >     <http://127.0.0.1:53798> FD 45 flags=1,
> >     > url=http://HOST:12345/
> >     > 2017/11/27 15:53:40.543| 44,2| peer_select.cc(258)
> peerSelectDnsPaths:
> >     > Find IP destination for: http://HOST:12345/' via 127.0.0.1
> >     > 2017/11/27 15:53:40.543| 44,2| peer_select.cc(280)
> peerSelectDnsPaths:
> >     > Found sources for 'http://HOST:12345/'
> >     > 2017/11/27 15:53:40.543| 44,2| peer_select.cc(281)
> peerSelectDnsPaths:
> >     >  always_direct = DENIED
> >     > 2017/11/27 15:53:40.543| 44,2| peer_select.cc(282)
> peerSelectDnsPaths:
> >     >   never_direct = ALLOWED
> >     > 2017/11/27 15:53:40.543| 44,2| peer_select.cc(292)
> peerSelectDnsPaths:
> >     >     cache_peer = local=127.0.0.3 remote=127.0.0.1:18070 <
> http://127.0.0.1:18070>
> >     > <http://127.0.0.1:18070> flags=1
> >     > 2017/11/27 15:53:40.543| 44,2| peer_select.cc(295)
> peerSelectDnsPaths:
> >     >       timedout = 0
> >     > 2017/11/27 15:53:40.543| 11,2| http.cc(2229) sendRequest: HTTP
> Server
> >     > local=127.0.0.3:57091 <http://127.0.0.3:57091>
> >     <http://127.0.0.3:57091> remote=127.0.0.1:18070 <
> http://127.0.0.1:18070>
> >     > <http://127.0.0.1:18070> FD 40 flags=1
> >     > 2017/11/27 15:53:40.543| 11,2| http.cc(2230) sendRequest: HTTP
> Server
> >     > REQUEST:
> >     > ---------
> >     > GET http://HOST:12345/ HTTP/1.1
> >     > User-Agent: curl/7.51.0
> >     > Accept: */*
> >     > Host: HOST:12345
> >     > Cache-Control: max-age=259200
> >     > Connection: keep-alive
> >     >
> >     >
> >     > ----------
> >     >
> >     > [SKIPPED 40 seconds until parent closes TCP connection with
> FIN,ACK]
> >     >
> >     > 2017/11/27 15:54:20.627| 11,2| http.cc(1299)
> continueAfterParsingHeader:
> >     > WARNING: HTTP: Invalid Response: No object data received for
> >     > http://HOST:12345/ AKA HOST/
> >     > 2017/11/27 15:54:20.627| 17,2| FwdState.cc(655)
> >     > handleUnregisteredServerEnd: self=0x3e31838*2 err=0x409b338
> >     > http://HOST:12345/
> >     > 2017/11/27 15:54:20.627| 11,2| http.cc(2229) sendRequest: HTTP
> Server
> >     > local=127.0.0.3:41355 <http://127.0.0.3:41355>
> >     <http://127.0.0.3:41355> remote=127.0.0.1:18070 <
> http://127.0.0.1:18070>
> >     > <http://127.0.0.1:18070> FD 40 flags=1
> >     > 2017/11/27 15:54:20.627| 11,2| http.cc(2230) sendRequest: HTTP
> Server
> >     > REQUEST:
> >     > ---------
> >     > GET http://HOST:12345/ HTTP/1.1
> >     > User-Agent: curl/7.51.0
> >     > Accept: */*
> >     > Host: HOST:12345
> >     > Cache-Control: max-age=259200
> >     > Connection: keep-alive
> >     >
> >     >
> >     > ----------
> >     >
> >     > [SKIPPED 40 seconds again until parent closes TCP connection with
> FIN,ACK]
> >     >
> >     > 2017/11/27 15:55:00.728| ctx: enter level  0: 'http://HOST:12345/'
> >     > 2017/11/27 15:55:00.728| 11,2| http.cc(719) processReplyHeader:
> HTTP
> >     > Server local=127.0.0.3:41355 <http://127.0.0.3:41355>
> >     <http://127.0.0.3:41355>
> >     > remote=127.0.0.1:18070 <http://127.0.0.1:18070>
> >     <http://127.0.0.1:18070> FD 40 flags=1
> >     > 2017/11/27 15:55:00.728| 11,2| http.cc(720) processReplyHeader:
> HTTP
> >     > Server REPLY:
> >     > ---------
> >     > HTTP/1.0 502 Bad Gateway
> >     > Cache-Control: no-cache
> >     > Connection: close
> >     > Content-Type: text/html
> >     >
> >     > <html><body><h1>502 Bad Gateway</h1>
> >     > The server returned an invalid or incomplete response.
> >     > </body></html>
> >     >
> >     > ----------
> >     > 2017/11/27 15:55:00.728| ctx: exit level  0
> >     > 2017/11/27 15:55:00.728| 20,2| store.cc(996) checkCachable:
> >     > StoreEntry::checkCachable: NO: not cachable
> >     > 2017/11/27 15:55:00.728| 20,2| store.cc(996) checkCachable:
> >     > StoreEntry::checkCachable: NO: not cachable
> >     > 2017/11/27 15:55:00.728| 88,2| client_side_reply.cc(2073)
> >     > processReplyAccessResult: The reply for GET http://HOST:12345/ is
> >     > ALLOWED, because it matched (access_log
> stdio:/var/log/squid/access.log
> >     > line)
> >     > 2017/11/27 15:55:00.728| 11,2| client_side.cc(1409)
> sendStartOfMessage:
> >     > HTTP Client local=127.0.0.1:3128 <http://127.0.0.1:3128>
> >     <http://127.0.0.1:3128>
> >     > remote=127.0.0.1:53798 <http://127.0.0.1:53798>
> >     <http://127.0.0.1:53798> FD 45 flags=1
> >     > 2017/11/27 15:55:00.728| 11,2| client_side.cc(1410)
> >     sendStartOfMessage:
> >     > HTTP Client REPLY:
> >     > ---------
> >     > HTTP/1.1 502 Bad Gateway
> >     > Date: Mon, 27 Nov 2017 23:54:20 GMT
> >     > Cache-Control: no-cache
> >     > Content-Type: text/html
> >     > X-Cache: MISS from ip-172-23-18-130
> >     > X-Cache-Lookup: MISS from ip-172-23-18-130:3128
> >     > Transfer-Encoding: chunked
> >     > Connection: keep-alive
> >     >
> >     >
> >     > ----------
> >
> >
> >
> >
> > --
> > With best regards, Ivan Larionov.
>
>


-- 
With best regards, Ivan Larionov.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.squid-cache.org/pipermail/squid-users/attachments/20171129/91eb60c6/attachment-0001.html>


More information about the squid-users mailing list