[squid-dev] [PATCH] Retry cache peer DNS failures more frequently

Thu Jun 23 22:50:57 UTC 2016

Hello,

Did anyone have any thoughts on the issues I had with this? I don't want
this to slip through the cracks :)

Thank you,

Nathan.

On 17 May 2016 at 15:57, Nathan Hoad <nathan at getoffmalawn.com> wrote:

> Hello,
>
> Attached is a patch which makes the changes recommended by Amos - each
> peer now gets its own event for retrying resolution, dependent on the DNS
> TTL. This should also fix the concerns raised by Alex. A few caveats though:
>
>  - the cache manager shows generic "peerRefreshDNS" names for each event.
> I can't find any examples that give it a dynamic name, e.g. I'd like
> something like "peerRefreshDNS(example.com)", but I can't think of how
> I'd do that without leaking memory or making some significant changes to
> the event handler system.
>
>  - I can't figure out how to reproduce the second failure case, where a
> result comes back but it has no IP addresses. I _think_ using the TTL
> instead of negative_dns_ttl would be valid in that situation, but I can't
> be sure. I figured this was the safest option.
>
>  - eventDelete does not appear to be clearing out events as I expect it
> to, so if you reconfigure Squid you end up with some dead events, like so:
>
> [root at xxx ~]# squidmgr events | grep peerRefresh
> Last event to run: peerRefreshDNS
> peerRefreshDNS                  0.331 sec           1    yes
> peerRefreshDNS                  0.679 sec           1    yes
> peerRefreshDNS                  47.649 sec          1    yes
> peerRefreshDNS                  61.619 sec          1    yes
> peerRefreshDNS                  207.682 sec         1    yes
> peerRefreshDNS                  207.682 sec         1    yes
> peerRefreshDNS                  207.682 sec         1    yes
> peerRefreshDNS                  207.682 sec         1    yes
> peerRefreshDNS                  207.682 sec         1    yes
> [root at xxx ~]# squid -k reconfigure
> [root at xxx ~]# squidmgr events | grep peerRefresh
> Last event to run: peerRefreshDNS
> peerRefreshDNS                  0.763 sec           1    yes
> peerRefreshDNS                  0.763 sec           1    yes
> peerRefreshDNS                  41.755 sec          1    yes
> peerRefreshDNS                  55.755 sec          1    yes
> peerRefreshDNS                  56.187 sec          1    no
> peerRefreshDNS                  202.250 sec         1    no
> peerRefreshDNS                  202.250 sec         1    no
> peerRefreshDNS                  3599.758 sec        1    yes
> peerRefreshDNS                  3599.758 sec        1    yes
> peerRefreshDNS                  3599.758 sec        1    yes
> peerRefreshDNS                  3599.758 sec        1    yes
> peerRefreshDNS                  3599.758 sec        1    yes
>
> If I run squid -k reconfigure again, then the events with invalid callback
> data are cleared out, so it doesn't grow indefinitely at least. I'm not
> sure how or if I should fix this.
>
> Thank you,
>
> Nathan.
>
>
> On 10 May 2016 at 18:13, Alex Rousskov <rousskov at measurement-factory.com>
> wrote:
>
>> On 05/10/2016 01:50 AM, Amos Jeffries wrote:
>>
>> > Then each peer gets its own re-lookup event scheduled
>>
>> If applied correctly, this approach would also solve the misapplication
>> problem I described in my concurrent review. Unfortunately, it requires
>> serious work. Fortunately, you have already converted CachePeer from
>> being a POD into a proper class. That will help!
>>
>>
>> Thank you,
>>
>> Alex.
>>
>>
>
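
To make the per-peer approach discussed in the quoted thread concrete, here
is a minimal, self-contained C++ sketch. It is not Squid code: Peer,
LookupResult, Scheduler and retryDelay are hypothetical names, and it assumes
that both failure cases (a lookup error, or an answer with no addresses) fall
back to a negative TTL. Each peer owns its event name string, which is one
possible way a label like "peerRefreshDNS(example.com)" could exist without
leaking memory, and scheduling a refresh cancels any earlier pending one for
the same peer.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a cache peer; this is NOT Squid's CachePeer.
struct Peer {
    std::string host;
    std::string eventName; // owned by the peer, freed with it

    explicit Peer(const std::string &h)
        : host(h), eventName("peerRefreshDNS(" + h + ")") {}
};

// Hypothetical summary of one DNS lookup result.
struct LookupResult {
    bool failed;      // the lookup itself returned an error
    int addressCount; // number of IP addresses in the answer
    int ttlSeconds;   // TTL reported with the answer
};

// Chooses the delay before the next lookup for a peer.
// Assumption: both failure cases use the negative TTL.
int retryDelay(const LookupResult &r, int negativeTtlSeconds) {
    if (r.failed || r.addressCount == 0)
        return negativeTtlSeconds;
    return std::max(r.ttlSeconds, 1); // never reschedule with zero delay
}

// One pending refresh, identified by the peer it belongs to.
struct Event {
    const Peer *peer;
    double when; // absolute time in seconds
};

// Toy scheduler that keeps at most one pending refresh per peer.
class Scheduler {
public:
    void schedule(const Peer &p, double now, int delaySeconds) {
        cancel(p); // drop any stale event for this peer first
        events.push_back({&p, now + delaySeconds});
    }

    void cancel(const Peer &p) {
        events.erase(std::remove_if(events.begin(), events.end(),
                                    [&p](const Event &e) { return e.peer == &p; }),
                     events.end());
    }

    std::size_t pending() const { return events.size(); }

private:
    std::vector<Event> events;
};
```

In this sketch, a reconfigure would call cancel() for each peer before the
peer is destroyed, so no event outlives the data it points to.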