[squid-users] stale-if-error returning a 502

Alex Rousskov rousskov at measurement-factory.com
Fri Feb 9 14:31:08 UTC 2024


On 2024-02-09 08:53, Robin Carlisle wrote:

> I am trying the config workaround approach.

Please keep us posted on your progress.

>  Below is the config snippet I have added.    I made the 
> assumption that for the /refresh_pattern, max-stale=NN /config, the NN 
> is in minutes as per the rest of that config directive.

That assumption is natural but incorrect: Unlike the anonymous 
positional min and max parameters (that use minutes), refresh_pattern 
max-stale=NN uses seconds. Documentation improvements are welcome.

Said that, the workaround should still prevent the application of the 
broken default refresh_pattern max-stale=0 rule, so you should still see 
positive results for the first NN seconds of the response age.

Instead of specifying max-stale=NN, consider adding refresh_pattern 
rules recommended by squid.conf.documented (and included in 
squid.cond.default). Those rules do not have max-stale options at all, 
and, hence, Squid will use (explicit or default) max_stale directive 
instead.

HTH,

Alex.


> I am testing this right now
> 
> # this should allow stale objects up to 1 year if allowed by 
> Cache-Control repsonseheaders ...
> 
> # ... setting both options just in case
> 
> max_stale 525600 minutes
> 
> refresh_pattern . 0  20% 4320 max-stale=525600
> 
> 
> Thanks again for your help
> 
> 
> Robin
> 
> 
> 
> 
> On Thu, 8 Feb 2024 at 17:42, Alex Rousskov 
> <rousskov at measurement-factory.com 
> <mailto:rousskov at measurement-factory.com>> wrote:
> 
>     Hi Robin,
> 
>           AFAICT from the logs you have privately shared and your
>     squid.conf
>     that you have posted earlier, your Squid overwrites
>     stale-if-error=31536000 in the response with "refresh_pattern
>     max-stale=0" default. That 0 value is wrong. The correct value
>     should be
>     taken from max_stale directive that defaults to 1 week, not zero:
> 
>           refresh_pattern
>           ...
>           max-stale=NN provide a maximum staleness factor. Squid won't
>           serve objects more stale than this even if it failed to
>           validate the object. Default: use the max_stale global limit.
> 
>     This wrong default is a Squid bug AFAICT. I posted an _untested_ fix as
>     Squid PR 1664: https://github.com/squid-cache/squid/pull/1664
>     <https://github.com/squid-cache/squid/pull/1664>
> 
>     If possible, please test the corresponding patch:
>     https://github.com/squid-cache/squid/commit/571973589b5a46d458311f8b60dcb83032fd5cec.patch <https://github.com/squid-cache/squid/commit/571973589b5a46d458311f8b60dcb83032fd5cec.patch>
> 
>     AFAICT, you can also work around that bug by configuring an explicit
>     refresh_pattern rule with an explicit max-stale option (see
>     squid.conf.documented for examples). I have not tested that theory
>     either.
> 
> 
>     HTH,
> 
>     Alex.
> 
> 
>     On 2024-02-07 13:45, Robin Carlisle wrote:
>      > Hi,
>      >
>      > I have just started my enhanced logging journey and have a small
>     snippet
>      > below that might illuminate the issue ...
>      >
>      > /2024/02/07 17:06:39.212 kid1| 88,3| client_side_reply.cc(507)
>      > handleIMSReply: origin replied with error 502, forwarding to
>     client due
>      > to fail_on_validation_err/
>      >
>      > A few lines below in the log it looks like squid sent :-
>      >
>      > /2024/02/07 17:06:39.212 kid1| 11,2| Stream.cc(280)
>     sendStartOfMessage:
>      > HTTP Client REPLY:
>      > ---------
>      > HTTP/1.1 502 Bad Gateway
>      > Server: squid/5.7
>      > Mime-Version: 1.0
>      > Date: Wed, 07 Feb 2024 17:06:39 GMT
>      > Content-Type: text/html;charset=utf-8
>      > Content-Length: 3853
>      > X-Squid-Error: ERR_READ_ERROR 0
>      > Vary: Accept-Language
>      > Content-Language: en
>      > X-Cache: MISS from labs-maul-st-15
>      > X-Cache-Lookup: HIT from labs-maul-st-15:3129
>      > Via: 1.1 labs-maul-st-15 (squid/5.7)
>      > Connection: close/
>      >
>      >
>      > The rest of the logs are quite large and contain URLs I cannot put
>      > here.   The logs were generated with debug_options to ALL,3.
>      >
>      > Any ideas?   Or should I generate more detailed logs and send them
>      > privately?
>      >
>      > Thanks again,
>      >
>      > Robin
>      >
>      >
>      >
>      >
>      > On Fri, 2 Feb 2024 at 11:20, Robin Carlisle
>      > <robin.carlisle at framestore.com
>     <mailto:robin.carlisle at framestore.com>
>     <mailto:robin.carlisle at framestore.com
>     <mailto:robin.carlisle at framestore.com>>>
>      > wrote:
>      >
>      >     Hi, thanks for your reply.
>      >
>      >     I have been looking at :
>      >
>     https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Cache-Control <https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Cache-Control> <https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Cache-Control <https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Cache-Control>>
>      >
>      >     /The stale-if-error response directive indicates that the
>     cache can
>      >     reuse a stale response when an upstream server generates an
>     error,
>      >     or when the error is generated locally. Here, an error is
>     considered
>      >     any response with a status code of 500, 502, 503, or 504.
>      >
>      >     Cache-Control: max-age=604800, stale-if-error=86400
>      >     In the example above, the response is fresh for 7 days (604800s).
>      >     Afterwards, it becomes stale, but can be used for an extra 1 day
>      >     (86400s) when an error is encountered.
>      >
>      >     After the stale-if-error period passes, the client will
>     receive any
>      >     error generated/
>      >
>      >     Given what you have said and what the above docs say - I am still
>      >     confused as it looks like (in my test cases) the cached
>     response can
>      >     be used for 3600 secs (this works), after which the cached
>     response
>      >     can still be used for an additional 31536000 seconds on an error
>      >     (this doesnt work).
>      >
>      >     I am going to dig into the error logging you suggested to see
>     if I
>      >     can make sense of that - and will send on if I can't.
>      >
>      >     Thanks v much for your help again,
>      >
>      >     Robin
>      >
>      >
>      >
>      >
>      >
>      >     On Thu, 1 Feb 2024 at 18:27, Alex Rousskov
>      >     <rousskov at measurement-factory.com
>     <mailto:rousskov at measurement-factory.com>
>      >     <mailto:rousskov at measurement-factory.com
>     <mailto:rousskov at measurement-factory.com>>> wrote:
>      >
>      >         On 2024-02-01 12:03, Robin Carlisle wrote:
>      >          > Hi, I am having trouble with stale-if-error response.
>      >
>      >         If I am interpreting Squid code correctly, in primary use
>     cases:
>      >
>      >         * without a Cache-Control:stale-if-error=X in the original
>      >         response,
>      >         Squid sends a stale object if revalidation results in a
>     5xx error;
>      >
>      >         * with a Cache-Control:stale-if-error=X and object age at
>     most
>      >         X, Squid
>      >         sends a stale object if revalidation results in a 5xx error;
>      >
>      >         * with a Cache-Control:stale-if-error=X and object age
>     exceeding X,
>      >         Squid forwards the 5xx error response if revalidation
>     results in
>      >         a 5xx
>      >         error;
>      >
>      >         In other words, stale-if-error=X turns on a "fail on
>     validation
>      >         errors"
>      >         behavior for stale objects older than X. It has no other
>     effects.
>      >
>      >         In your test case, the stale objects are much younger than
>      >         stale-if-error value (e.g., Age~=3601 vs.
>     stale-if-error=31536000).
>      >         Thus, stale-if-error should have no relevant effect.
>      >
>      >         Something else is probably preventing your Squid from serving
>      >         the stale
>      >         response when facing a 5xx error. I do not know what that
>      >         something is.
>      >
>      >         I recommend sharing (privately if you need to protect
>     sensitive
>      >         info) a
>      >         pointer to a compressed ALL,9 cache.log collected while
>      >         reproducing the
>      >         problem (using two transactions similar to the ones you
>     have shared
>      >         below -- a successful stale hit and a problematic one):
>      >
>     https://wiki.squid-cache.org/SquidFaq/BugReporting#debugging-a-single-transaction <https://wiki.squid-cache.org/SquidFaq/BugReporting#debugging-a-single-transaction> <https://wiki.squid-cache.org/SquidFaq/BugReporting#debugging-a-single-transaction <https://wiki.squid-cache.org/SquidFaq/BugReporting#debugging-a-single-transaction>>
>      >
>      >         Alternatively, you can try to study cache.log yourself after
>      >         setting
>      >         debug_options to ALL,3. Searching for "refresh" and
>      >         "handleIMSReply" may
>      >         yield enough clues.
>      >
>      >
>      >         HTH,
>      >
>      >         Alex.
>      >
>      >
>      >
>      >
>      >          > # /etc/squid/squid.conf :
>      >          >
>      >          > acl to_aws dstdomain .amazonaws.com
>     <http://amazonaws.com> <http://amazonaws.com <http://amazonaws.com>>
>      >         <http://amazonaws.com <http://amazonaws.com>
>     <http://amazonaws.com <http://amazonaws.com>>>
>      >          >
>      >          > acl from_local src localhost
>      >          >
>      >          > http_access allow to_aws
>      >          >
>      >          > http_access allow from_local
>      >          >
>      >          > cache allow all
>      >          >
>      >          > cache_dir ufs /var/cache/squid 1024 16 256
>      >          >
>      >          > http_port 3129 ssl-bump cert=/etc/squid/maul.pem
>      >          > generate-host-certificates=on
>     dynamic_cert_mem_cache_size=4MB
>      >          >
>      >          > sslcrtd_program /usr/lib/squid/security_file_certgen -s
>      >          > /var/lib/squid/ssl_db -M 4MB
>      >          >
>      >          > acl step1 at_step SslBump1
>      >          >
>      >          > ssl_bump bump step1
>      >          >
>      >          > ssl_bump bump all
>      >          >
>      >          > sslproxy_cert_error deny all
>      >          >
>      >          > cache_store_log stdio:/var/log/squid/store.log
>      >          >
>      >          > logfile_rotate 0
>      >          >
>      >          > shutdown_lifetime 3 seconds
>      >          >
>      >          >
>      >          > # /usr/bin/proxy-test :
>      >          >
>      >          > #!/bin/bash
>      >          >
>      >          > curl --proxy http://localhost:3129
>     <http://localhost:3129> <http://localhost:3129 <http://localhost:3129>>
>      >         <http://localhost:3129 <http://localhost:3129>
>     <http://localhost:3129 <http://localhost:3129>>> \
>      >          >
>      >          >    --cacert /etc/squid/stuff.pem \
>      >          >
>      >          >    -v
>     "https://stuff.amazonaws.com/api/v1/stuff/stuff.json
>     <https://stuff.amazonaws.com/api/v1/stuff/stuff.json>
>      >         <https://stuff.amazonaws.com/api/v1/stuff/stuff.json
>     <https://stuff.amazonaws.com/api/v1/stuff/stuff.json>>
>      >          > <https://stuff.amazonaws.com/api/v1/stuff/stuff.json
>     <https://stuff.amazonaws.com/api/v1/stuff/stuff.json>
>      >         <https://stuff.amazonaws.com/api/v1/stuff/stuff.json
>     <https://stuff.amazonaws.com/api/v1/stuff/stuff.json>>>" \
>      >          >
>      >          >    -H "Authorization: token MYTOKEN" \
>      >          >
>      >          >    -H "Content-Type: application/json" \
>      >          >
>      >          >    --output "/tmp/stuff.json"
>      >          >
>      >          >
>      >          >
>      >          > Tests  ..........
>      >          >
>      >          >
>      >          > At this point in time the network cable is
>     unattached.  Squid
>      >         returns
>      >          > the cached object it got when the network was online
>     earlier.
>      >         The Age of
>      >          > this object is just still under the max_age of 3600.
>      >         Previously I
>      >          > was using offline_mode but I found that it did not try to
>      >         revalidate
>      >          > from the origin after the object expired (defined via
>     max-age
>      >         response).
>      >          >    My understanding is that stale-if-error should work
>     under my
>      >          > circumstances.
>      >          >
>      >          >
>      >          > # /var/log/squid/access.log
>      >          >
>      >          > 1706799404.440      6 127.0.0.1 NONE_NONE/200 0 CONNECT
>      >          > stuff.amazonaws.com:443
>     <http://stuff.amazonaws.com:443> <http://stuff.amazonaws.com:443
>     <http://stuff.amazonaws.com:443>>
>      >         <http://stuff.amazonaws.com:443
>     <http://stuff.amazonaws.com:443>
>      >         <http://stuff.amazonaws.com:443
>     <http://stuff.amazonaws.com:443>>> - HIER_NONE/- -
>      >          >
>      >          > 1706799404.440      0 127.0.0.1 TCP_MEM_HIT/200 20726 GET
>      >          > https://stuff.amazonaws.com/stuff.json
>     <https://stuff.amazonaws.com/stuff.json>
>      >         <https://stuff.amazonaws.com/stuff.json
>     <https://stuff.amazonaws.com/stuff.json>>
>      >          > <https://stuff.amazonaws.com/stuff.json
>     <https://stuff.amazonaws.com/stuff.json>
>      >         <https://stuff.amazonaws.com/stuff.json
>     <https://stuff.amazonaws.com/stuff.json>>> - HIER_NONE/-
>      >         application/json
>      >          >
>      >          >
>      >          > # extract from /usr/bin/proxy-test
>      >          >
>      >          > < HTTP/1.1 200 OK
>      >          >
>      >          > < Date: Thu, 01 Feb 2024 13:57:11 GMT
>      >          >
>      >          > < Content-Type: application/json
>      >          >
>      >          > < Content-Length: 20134
>      >          >
>      >          > < x-amzn-RequestId: 3a2d3b26-df73-4b30-88cb-1a9268fa0df2
>      >          >
>      >          > < Last-Modified: 2024-02-01T13:00:45.000Z
>      >          >
>      >          > < Access-Control-Allow-Origin: *
>      >          >
>      >          > < x-amz-apigw-id: SdZwpG7qiYcERUQ=
>      >          >
>      >          > < Cache-Control: public, max-age=3600,
>     stale-if-error=31536000
>      >          >
>      >          > < ETag: "cec102b43372840737ab773c2e77858b"
>      >          >
>      >          > < X-Amzn-Trace-Id:
>     Root=1-65bba337-292be751134161b03555cdd6
>      >          >
>      >          > < Age: 3573
>      >          >
>      >          > < X-Cache: HIT from labs-maul-st-31
>      >          >
>      >          > < X-Cache-Lookup: HIT from labs-maul-st-31:3129
>      >          >
>      >          > < Via: 1.1 labs-maul-st-31 (squid/5.7)
>      >          >
>      >          > < Connection: keep-alive
>      >          >
>      >          >
>      >          >
>      >          >
>      >          > Below .. the curl script executes again.  The Age has gone
>      >         over the
>      >          > max-age so squid attempted to refresh from the
>     origin.  The
>      >         machine is
>      >          > still offline so the refresh failed.   I expected that the
>      >          > stale-if-error response would instruct squid to return the
>      >         cached object
>      >          > as a 200.
>      >          >
>      >          >
>      >          > # /var/log/squid/access.log
>      >          >
>      >          > 1706799434.464      5 127.0.0.1 NONE_NONE/200 0 CONNECT
>      >          > stuff.amazonaws.com:443
>     <http://stuff.amazonaws.com:443> <http://stuff.amazonaws.com:443
>     <http://stuff.amazonaws.com:443>>
>      >         <http://stuff.amazonaws.com:443
>     <http://stuff.amazonaws.com:443>
>      >         <http://stuff.amazonaws.com:443
>     <http://stuff.amazonaws.com:443>>> - HIER_NONE/- -
>      >          >
>      >          > 1706799434.464      0 127.0.0.1
>     TCP_REFRESH_FAIL_ERR/502 4235
>      >         GET
>      >          > https://stuff.amazonaws.com/stuff.json
>     <https://stuff.amazonaws.com/stuff.json>
>      >         <https://stuff.amazonaws.com/stuff.json
>     <https://stuff.amazonaws.com/stuff.json>>
>      >          > <https://stuff.amazonaws.com/stuff.json
>     <https://stuff.amazonaws.com/stuff.json>
>      >         <https://stuff.amazonaws.com/stuff.json
>     <https://stuff.amazonaws.com/stuff.json>>> - HIER_NONE/- text/html
>      >          >
>      >          >
>      >          > # extract from /usr/bin/proxy-test
>      >          >
>      >          > < HTTP/1.1 502 Bad Gateway
>      >          >
>      >          > < Server: squid/5.7
>      >          >
>      >          > < Mime-Version: 1.0
>      >          >
>      >          > < Date: Thu, 01 Feb 2024 14:57:14 GMT
>      >          >
>      >          > < Content-Type: text/html;charset=utf-8
>      >          >
>      >          > < Content-Length: 3853
>      >          >
>      >          > < X-Squid-Error: ERR_READ_ERROR 0
>      >          >
>      >          > < Vary: Accept-Language
>      >          >
>      >          > < Content-Language: en
>      >          >
>      >          > < X-Cache: MISS from labs-maul-st-31
>      >          >
>      >          > < X-Cache-Lookup: HIT from labs-maul-st-31:3129
>      >          >
>      >          > < Via: 1.1 labs-maul-st-31 (squid/5.7)
>      >          >
>      >          > < Connection: close
>      >          >
>      >          >
>      >          >
>      >          > Hope someone can help me with this.  All the best,
>      >          >
>      >          >
>      >          > Robin Carlisle


More information about the squid-users mailing list