[squid-users] stale-if-error returning a 502
Alex Rousskov
rousskov at measurement-factory.com
Mon Feb 12 16:06:22 UTC 2024
On 2024-02-12 10:13, Robin Carlisle wrote:
> I have been having success so far with the config workaround.. config
> snippet :-
>
> /max_stale 31536000 seconds
> refresh_pattern . 0 20% 4320 max-stale=31536000/
>
> When an object has expired due to max-age and the PC is offline
> (ethernet unplugged), squid attempts an origin refresh and gives me :
>
> /0 ::1 TCP_REFRESH_FAIL_OLD/200 35965 GET
> https://widgets.api.labs.dev.framestoresignage.com/api/v1/instagram/labs/posts.json <https://widgets.api.labs.dev.framestoresignage.com/api/v1/instagram/labs/posts.json> - HIER_NONE/- application/json/
>
> Previously it had been passing the 502 through to the client application.
Glad this workaround helps. Just keep in mind that the configuration
snippet above changes max-stale for _all_ responses.
> I am continuing to test this - but it looks like I have a working solution.
Meanwhile, the fix for the underlying Squid bug was officially accepted
and should become a part of v6.8 release (at least).
Thank you,
Alex.
> On Fri, 9 Feb 2024 at 14:31, Alex Rousskov wrote:
>
> On 2024-02-09 08:53, Robin Carlisle wrote:
>
> > I am trying the config workaround approach.
>
> Please keep us posted on your progress.
>
> > Below is the config snippet I have added. I made the
> > assumption that for the /refresh_pattern, max-stale=NN /config,
> the NN
> > is in minutes as per the rest of that config directive.
>
> That assumption is natural but incorrect: Unlike the anonymous
> positional min and max parameters (that use minutes), refresh_pattern
> max-stale=NN uses seconds. Documentation improvements are welcome.
>
> Said that, the workaround should still prevent the application of the
> broken default refresh_pattern max-stale=0 rule, so you should still
> see
> positive results for the first NN seconds of the response age.
>
> Instead of specifying max-stale=NN, consider adding refresh_pattern
> rules recommended by squid.conf.documented (and included in
> squid.cond.default). Those rules do not have max-stale options at all,
> and, hence, Squid will use (explicit or default) max_stale directive
> instead.
>
> HTH,
>
> Alex.
>
>
> > I am testing this right now
> >
> > # this should allow stale objects up to 1 year if allowed by
> > Cache-Control repsonseheaders ...
> >
> > # ... setting both options just in case
> >
> > max_stale 525600 minutes
> >
> > refresh_pattern . 0 20% 4320 max-stale=525600
> >
> >
> > Thanks again for your help
> >
> >
> > Robin
> >
> >
> >
> >
> > On Thu, 8 Feb 2024 at 17:42, Alex Rousskov
> > <rousskov at measurement-factory.com
> <mailto:rousskov at measurement-factory.com>
> > <mailto:rousskov at measurement-factory.com
> <mailto:rousskov at measurement-factory.com>>> wrote:
> >
> > Hi Robin,
> >
> > AFAICT from the logs you have privately shared and your
> > squid.conf
> > that you have posted earlier, your Squid overwrites
> > stale-if-error=31536000 in the response with "refresh_pattern
> > max-stale=0" default. That 0 value is wrong. The correct value
> > should be
> > taken from max_stale directive that defaults to 1 week, not zero:
> >
> > refresh_pattern
> > ...
> > max-stale=NN provide a maximum staleness factor. Squid
> won't
> > serve objects more stale than this even if it failed to
> > validate the object. Default: use the max_stale global
> limit.
> >
> > This wrong default is a Squid bug AFAICT. I posted an
> _untested_ fix as
> > Squid PR 1664: https://github.com/squid-cache/squid/pull/1664
> <https://github.com/squid-cache/squid/pull/1664>
> > <https://github.com/squid-cache/squid/pull/1664
> <https://github.com/squid-cache/squid/pull/1664>>
> >
> > If possible, please test the corresponding patch:
> >
> https://github.com/squid-cache/squid/commit/571973589b5a46d458311f8b60dcb83032fd5cec.patch <https://github.com/squid-cache/squid/commit/571973589b5a46d458311f8b60dcb83032fd5cec.patch> <https://github.com/squid-cache/squid/commit/571973589b5a46d458311f8b60dcb83032fd5cec.patch <https://github.com/squid-cache/squid/commit/571973589b5a46d458311f8b60dcb83032fd5cec.patch>>
> >
> > AFAICT, you can also work around that bug by configuring an
> explicit
> > refresh_pattern rule with an explicit max-stale option (see
> > squid.conf.documented for examples). I have not tested that
> theory
> > either.
> >
> >
> > HTH,
> >
> > Alex.
> >
> >
> > On 2024-02-07 13:45, Robin Carlisle wrote:
> > > Hi,
> > >
> > > I have just started my enhanced logging journey and have a
> small
> > snippet
> > > below that might illuminate the issue ...
> > >
> > > /2024/02/07 17:06:39.212 kid1| 88,3| client_side_reply.cc(507)
> > > handleIMSReply: origin replied with error 502, forwarding to
> > client due
> > > to fail_on_validation_err/
> > >
> > > A few lines below in the log it looks like squid sent :-
> > >
> > > /2024/02/07 17:06:39.212 kid1| 11,2| Stream.cc(280)
> > sendStartOfMessage:
> > > HTTP Client REPLY:
> > > ---------
> > > HTTP/1.1 502 Bad Gateway
> > > Server: squid/5.7
> > > Mime-Version: 1.0
> > > Date: Wed, 07 Feb 2024 17:06:39 GMT
> > > Content-Type: text/html;charset=utf-8
> > > Content-Length: 3853
> > > X-Squid-Error: ERR_READ_ERROR 0
> > > Vary: Accept-Language
> > > Content-Language: en
> > > X-Cache: MISS from labs-maul-st-15
> > > X-Cache-Lookup: HIT from labs-maul-st-15:3129
> > > Via: 1.1 labs-maul-st-15 (squid/5.7)
> > > Connection: close/
> > >
> > >
> > > The rest of the logs are quite large and contain URLs I
> cannot put
> > > here. The logs were generated with debug_options to ALL,3.
> > >
> > > Any ideas? Or should I generate more detailed logs and
> send them
> > > privately?
> > >
> > > Thanks again,
> > >
> > > Robin
> > >
> > >
> > >
> > >
> > > On Fri, 2 Feb 2024 at 11:20, Robin Carlisle
> > > <robin.carlisle at framestore.com
> <mailto:robin.carlisle at framestore.com>
> > <mailto:robin.carlisle at framestore.com
> <mailto:robin.carlisle at framestore.com>>
> > <mailto:robin.carlisle at framestore.com
> <mailto:robin.carlisle at framestore.com>
> > <mailto:robin.carlisle at framestore.com
> <mailto:robin.carlisle at framestore.com>>>>
> > > wrote:
> > >
> > > Hi, thanks for your reply.
> > >
> > > I have been looking at :
> > >
> >
> https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Cache-Control <https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Cache-Control> <https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Cache-Control <https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Cache-Control>> <https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Cache-Control <https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Cache-Control> <https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Cache-Control <https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Cache-Control>>>
> > >
> > > /The stale-if-error response directive indicates that the
> > cache can
> > > reuse a stale response when an upstream server
> generates an
> > error,
> > > or when the error is generated locally. Here, an error is
> > considered
> > > any response with a status code of 500, 502, 503, or 504.
> > >
> > > Cache-Control: max-age=604800, stale-if-error=86400
> > > In the example above, the response is fresh for 7 days
> (604800s).
> > > Afterwards, it becomes stale, but can be used for an
> extra 1 day
> > > (86400s) when an error is encountered.
> > >
> > > After the stale-if-error period passes, the client will
> > receive any
> > > error generated/
> > >
> > > Given what you have said and what the above docs say -
> I am still
> > > confused as it looks like (in my test cases) the cached
> > response can
> > > be used for 3600 secs (this works), after which the cached
> > response
> > > can still be used for an additional 31536000 seconds
> on an error
> > > (this doesnt work).
> > >
> > > I am going to dig into the error logging you suggested
> to see
> > if I
> > > can make sense of that - and will send on if I can't.
> > >
> > > Thanks v much for your help again,
> > >
> > > Robin
> > >
> > >
> > >
> > >
> > >
> > > On Thu, 1 Feb 2024 at 18:27, Alex Rousskov
> > > <rousskov at measurement-factory.com
> <mailto:rousskov at measurement-factory.com>
> > <mailto:rousskov at measurement-factory.com
> <mailto:rousskov at measurement-factory.com>>
> > > <mailto:rousskov at measurement-factory.com
> <mailto:rousskov at measurement-factory.com>
> > <mailto:rousskov at measurement-factory.com
> <mailto:rousskov at measurement-factory.com>>>> wrote:
> > >
> > > On 2024-02-01 12:03, Robin Carlisle wrote:
> > > > Hi, I am having trouble with stale-if-error
> response.
> > >
> > > If I am interpreting Squid code correctly, in
> primary use
> > cases:
> > >
> > > * without a Cache-Control:stale-if-error=X in the
> original
> > > response,
> > > Squid sends a stale object if revalidation results
> in a
> > 5xx error;
> > >
> > > * with a Cache-Control:stale-if-error=X and object
> age at
> > most
> > > X, Squid
> > > sends a stale object if revalidation results in a
> 5xx error;
> > >
> > > * with a Cache-Control:stale-if-error=X and object age
> > exceeding X,
> > > Squid forwards the 5xx error response if revalidation
> > results in
> > > a 5xx
> > > error;
> > >
> > > In other words, stale-if-error=X turns on a "fail on
> > validation
> > > errors"
> > > behavior for stale objects older than X. It has no
> other
> > effects.
> > >
> > > In your test case, the stale objects are much
> younger than
> > > stale-if-error value (e.g., Age~=3601 vs.
> > stale-if-error=31536000).
> > > Thus, stale-if-error should have no relevant effect.
> > >
> > > Something else is probably preventing your Squid
> from serving
> > > the stale
> > > response when facing a 5xx error. I do not know
> what that
> > > something is.
> > >
> > > I recommend sharing (privately if you need to protect
> > sensitive
> > > info) a
> > > pointer to a compressed ALL,9 cache.log collected
> while
> > > reproducing the
> > > problem (using two transactions similar to the
> ones you
> > have shared
> > > below -- a successful stale hit and a problematic
> one):
> > >
> >
> https://wiki.squid-cache.org/SquidFaq/BugReporting#debugging-a-single-transaction <https://wiki.squid-cache.org/SquidFaq/BugReporting#debugging-a-single-transaction> <https://wiki.squid-cache.org/SquidFaq/BugReporting#debugging-a-single-transaction <https://wiki.squid-cache.org/SquidFaq/BugReporting#debugging-a-single-transaction>> <https://wiki.squid-cache.org/SquidFaq/BugReporting#debugging-a-single-transaction <https://wiki.squid-cache.org/SquidFaq/BugReporting#debugging-a-single-transaction> <https://wiki.squid-cache.org/SquidFaq/BugReporting#debugging-a-single-transaction <https://wiki.squid-cache.org/SquidFaq/BugReporting#debugging-a-single-transaction>>>
> > >
> > > Alternatively, you can try to study cache.log
> yourself after
> > > setting
> > > debug_options to ALL,3. Searching for "refresh" and
> > > "handleIMSReply" may
> > > yield enough clues.
> > >
> > >
> > > HTH,
> > >
> > > Alex.
> > >
> > >
> > >
> > >
> > > > # /etc/squid/squid.conf :
> > > >
> > > > acl to_aws dstdomain .amazonaws.com
> <http://amazonaws.com>
> > <http://amazonaws.com <http://amazonaws.com>>
> <http://amazonaws.com <http://amazonaws.com> <http://amazonaws.com
> <http://amazonaws.com>>>
> > > <http://amazonaws.com <http://amazonaws.com>
> <http://amazonaws.com <http://amazonaws.com>>
> > <http://amazonaws.com <http://amazonaws.com>
> <http://amazonaws.com <http://amazonaws.com>>>>
> > > >
> > > > acl from_local src localhost
> > > >
> > > > http_access allow to_aws
> > > >
> > > > http_access allow from_local
> > > >
> > > > cache allow all
> > > >
> > > > cache_dir ufs /var/cache/squid 1024 16 256
> > > >
> > > > http_port 3129 ssl-bump cert=/etc/squid/maul.pem
> > > > generate-host-certificates=on
> > dynamic_cert_mem_cache_size=4MB
> > > >
> > > > sslcrtd_program
> /usr/lib/squid/security_file_certgen -s
> > > > /var/lib/squid/ssl_db -M 4MB
> > > >
> > > > acl step1 at_step SslBump1
> > > >
> > > > ssl_bump bump step1
> > > >
> > > > ssl_bump bump all
> > > >
> > > > sslproxy_cert_error deny all
> > > >
> > > > cache_store_log stdio:/var/log/squid/store.log
> > > >
> > > > logfile_rotate 0
> > > >
> > > > shutdown_lifetime 3 seconds
> > > >
> > > >
> > > > # /usr/bin/proxy-test :
> > > >
> > > > #!/bin/bash
> > > >
> > > > curl --proxy http://localhost:3129
> <http://localhost:3129>
> > <http://localhost:3129 <http://localhost:3129>>
> <http://localhost:3129 <http://localhost:3129>
> <http://localhost:3129 <http://localhost:3129>>>
> > > <http://localhost:3129 <http://localhost:3129>
> <http://localhost:3129 <http://localhost:3129>>
> > <http://localhost:3129 <http://localhost:3129>
> <http://localhost:3129 <http://localhost:3129>>>> \
> > > >
> > > > --cacert /etc/squid/stuff.pem \
> > > >
> > > > -v
> > "https://stuff.amazonaws.com/api/v1/stuff/stuff.json
> <https://stuff.amazonaws.com/api/v1/stuff/stuff.json>
> > <https://stuff.amazonaws.com/api/v1/stuff/stuff.json
> <https://stuff.amazonaws.com/api/v1/stuff/stuff.json>>
> > >
> <https://stuff.amazonaws.com/api/v1/stuff/stuff.json
> <https://stuff.amazonaws.com/api/v1/stuff/stuff.json>
> > <https://stuff.amazonaws.com/api/v1/stuff/stuff.json
> <https://stuff.amazonaws.com/api/v1/stuff/stuff.json>>>
> > > >
> <https://stuff.amazonaws.com/api/v1/stuff/stuff.json
> <https://stuff.amazonaws.com/api/v1/stuff/stuff.json>
> > <https://stuff.amazonaws.com/api/v1/stuff/stuff.json
> <https://stuff.amazonaws.com/api/v1/stuff/stuff.json>>
> > >
> <https://stuff.amazonaws.com/api/v1/stuff/stuff.json
> <https://stuff.amazonaws.com/api/v1/stuff/stuff.json>
> > <https://stuff.amazonaws.com/api/v1/stuff/stuff.json
> <https://stuff.amazonaws.com/api/v1/stuff/stuff.json>>>>" \
> > > >
> > > > -H "Authorization: token MYTOKEN" \
> > > >
> > > > -H "Content-Type: application/json" \
> > > >
> > > > --output "/tmp/stuff.json"
> > > >
> > > >
> > > >
> > > > Tests ..........
> > > >
> > > >
> > > > At this point in time the network cable is
> > unattached. Squid
> > > returns
> > > > the cached object it got when the network was
> online
> > earlier.
> > > The Age of
> > > > this object is just still under the max_age of
> 3600.
> > > Previously I
> > > > was using offline_mode but I found that it did
> not try to
> > > revalidate
> > > > from the origin after the object expired
> (defined via
> > max-age
> > > response).
> > > > My understanding is that stale-if-error
> should work
> > under my
> > > > circumstances.
> > > >
> > > >
> > > > # /var/log/squid/access.log
> > > >
> > > > 1706799404.440 6 127.0.0.1 NONE_NONE/200 0
> CONNECT
> > > > stuff.amazonaws.com:443
> <http://stuff.amazonaws.com:443>
> > <http://stuff.amazonaws.com:443
> <http://stuff.amazonaws.com:443>> <http://stuff.amazonaws.com:443
> <http://stuff.amazonaws.com:443>
> > <http://stuff.amazonaws.com:443
> <http://stuff.amazonaws.com:443>>>
> > > <http://stuff.amazonaws.com:443
> <http://stuff.amazonaws.com:443>
> > <http://stuff.amazonaws.com:443 <http://stuff.amazonaws.com:443>>
> > > <http://stuff.amazonaws.com:443
> <http://stuff.amazonaws.com:443>
> > <http://stuff.amazonaws.com:443
> <http://stuff.amazonaws.com:443>>>> - HIER_NONE/- -
> > > >
> > > > 1706799404.440 0 127.0.0.1 TCP_MEM_HIT/200
> 20726 GET
> > > > https://stuff.amazonaws.com/stuff.json
> <https://stuff.amazonaws.com/stuff.json>
> > <https://stuff.amazonaws.com/stuff.json
> <https://stuff.amazonaws.com/stuff.json>>
> > > <https://stuff.amazonaws.com/stuff.json
> <https://stuff.amazonaws.com/stuff.json>
> > <https://stuff.amazonaws.com/stuff.json
> <https://stuff.amazonaws.com/stuff.json>>>
> > > > <https://stuff.amazonaws.com/stuff.json
> <https://stuff.amazonaws.com/stuff.json>
> > <https://stuff.amazonaws.com/stuff.json
> <https://stuff.amazonaws.com/stuff.json>>
> > > <https://stuff.amazonaws.com/stuff.json
> <https://stuff.amazonaws.com/stuff.json>
> > <https://stuff.amazonaws.com/stuff.json
> <https://stuff.amazonaws.com/stuff.json>>>> - HIER_NONE/-
> > > application/json
> > > >
> > > >
> > > > # extract from /usr/bin/proxy-test
> > > >
> > > > < HTTP/1.1 200 OK
> > > >
> > > > < Date: Thu, 01 Feb 2024 13:57:11 GMT
> > > >
> > > > < Content-Type: application/json
> > > >
> > > > < Content-Length: 20134
> > > >
> > > > < x-amzn-RequestId:
> 3a2d3b26-df73-4b30-88cb-1a9268fa0df2
> > > >
> > > > < Last-Modified: 2024-02-01T13:00:45.000Z
> > > >
> > > > < Access-Control-Allow-Origin: *
> > > >
> > > > < x-amz-apigw-id: SdZwpG7qiYcERUQ=
> > > >
> > > > < Cache-Control: public, max-age=3600,
> > stale-if-error=31536000
> > > >
> > > > < ETag: "cec102b43372840737ab773c2e77858b"
> > > >
> > > > < X-Amzn-Trace-Id:
> > Root=1-65bba337-292be751134161b03555cdd6
> > > >
> > > > < Age: 3573
> > > >
> > > > < X-Cache: HIT from labs-maul-st-31
> > > >
> > > > < X-Cache-Lookup: HIT from labs-maul-st-31:3129
> > > >
> > > > < Via: 1.1 labs-maul-st-31 (squid/5.7)
> > > >
> > > > < Connection: keep-alive
> > > >
> > > >
> > > >
> > > >
> > > > Below .. the curl script executes again. The
> Age has gone
> > > over the
> > > > max-age so squid attempted to refresh from the
> > origin. The
> > > machine is
> > > > still offline so the refresh failed. I
> expected that the
> > > > stale-if-error response would instruct squid to
> return the
> > > cached object
> > > > as a 200.
> > > >
> > > >
> > > > # /var/log/squid/access.log
> > > >
> > > > 1706799434.464 5 127.0.0.1 NONE_NONE/200 0
> CONNECT
> > > > stuff.amazonaws.com:443
> <http://stuff.amazonaws.com:443>
> > <http://stuff.amazonaws.com:443
> <http://stuff.amazonaws.com:443>> <http://stuff.amazonaws.com:443
> <http://stuff.amazonaws.com:443>
> > <http://stuff.amazonaws.com:443
> <http://stuff.amazonaws.com:443>>>
> > > <http://stuff.amazonaws.com:443
> <http://stuff.amazonaws.com:443>
> > <http://stuff.amazonaws.com:443 <http://stuff.amazonaws.com:443>>
> > > <http://stuff.amazonaws.com:443
> <http://stuff.amazonaws.com:443>
> > <http://stuff.amazonaws.com:443
> <http://stuff.amazonaws.com:443>>>> - HIER_NONE/- -
> > > >
> > > > 1706799434.464 0 127.0.0.1
> > TCP_REFRESH_FAIL_ERR/502 4235
> > > GET
> > > > https://stuff.amazonaws.com/stuff.json
> <https://stuff.amazonaws.com/stuff.json>
> > <https://stuff.amazonaws.com/stuff.json
> <https://stuff.amazonaws.com/stuff.json>>
> > > <https://stuff.amazonaws.com/stuff.json
> <https://stuff.amazonaws.com/stuff.json>
> > <https://stuff.amazonaws.com/stuff.json
> <https://stuff.amazonaws.com/stuff.json>>>
> > > > <https://stuff.amazonaws.com/stuff.json
> <https://stuff.amazonaws.com/stuff.json>
> > <https://stuff.amazonaws.com/stuff.json
> <https://stuff.amazonaws.com/stuff.json>>
> > > <https://stuff.amazonaws.com/stuff.json
> <https://stuff.amazonaws.com/stuff.json>
> > <https://stuff.amazonaws.com/stuff.json
> <https://stuff.amazonaws.com/stuff.json>>>> - HIER_NONE/- text/html
> > > >
> > > >
> > > > # extract from /usr/bin/proxy-test
> > > >
> > > > < HTTP/1.1 502 Bad Gateway
> > > >
> > > > < Server: squid/5.7
> > > >
> > > > < Mime-Version: 1.0
> > > >
> > > > < Date: Thu, 01 Feb 2024 14:57:14 GMT
> > > >
> > > > < Content-Type: text/html;charset=utf-8
> > > >
> > > > < Content-Length: 3853
> > > >
> > > > < X-Squid-Error: ERR_READ_ERROR 0
> > > >
> > > > < Vary: Accept-Language
> > > >
> > > > < Content-Language: en
> > > >
> > > > < X-Cache: MISS from labs-maul-st-31
> > > >
> > > > < X-Cache-Lookup: HIT from labs-maul-st-31:3129
> > > >
> > > > < Via: 1.1 labs-maul-st-31 (squid/5.7)
> > > >
> > > > < Connection: close
> > > >
> > > >
> > > >
> > > > Hope someone can help me with this. All the best,
> > > >
> > > >
> > > > Robin Carlisle
>
More information about the squid-users
mailing list