[squid-dev] [PATCH] Reuse reserved Negotiate and NTLM helpers after an idle timeout.
Amos Jeffries
squid3 at treenet.co.nz
Sun Jul 30 03:48:33 UTC 2017
On 27/07/17 18:52, Christos Tsantilas wrote:
> The patch.
>
> Στις 26/07/2017 12:37 μμ, ο Christos Tsantilas έγραψε:
>> Squid can be killed or maimed by enough clients that start multi-step
>> connection authentication but never follow up with the second HTTP
>> request while keeping their HTTP connection open. Affected helpers
>> remain in the "reserved" state and cannot be reused for other clients.
>> Observed helper exhaustion has happened without any malicious intent.
>>
>> To address the problem, we add a helper reservation timeout. Timed out
>> reserved helpers may be reused by new clients/connections. To minimize
>> problems with slow-to-resume-authentication clients, timed out
>> reserved helpers are not reused until there are no unreserved running
>> helpers left. The reservations are tracked using unique integer IDs.
>>
Since NTLM and Negotiate are both very stateful security protocols this
re-use is only possible if the helper is using concurrency in its
communications with Squid. To do so otherwise would randomly allow
replay attacks to succeed - in a way that would be extremely nasty to
troubleshoot. Attackers are fully able to flood an auth backend with
traffic outside of Squid and slow it down sufficiently for this attack
to become a problem.
Note that the type-1 tokens where the TCP connection can be closed
also should not reserve a helper - if that is happening it is a bug
regardless of this patch.
To reach the desired behaviour it would be better to actually implement
concurrency support for the stateful helpers interfaces. So we can
ensure the helper is fully aware of the separation between clients auth
sessions regardless of whether any given token gets replayed.
It should not be much of a step to use the concurrency channel-ID as
part of the reservationId. Helpers that support concurrency should not
have trouble servicing auth on other channel-ID until the total number
of reserved channels gets extremely high. At that point restarting the
helper is sane and the existing on-persistent-overload logics can be
applied to whether Squid dies or simply kills the blocked helper (ERR).
This patch is going most of the way towards that already by using
reservation-ID to replace hardcoded class dependencies, and moving a lot
of code to the Base class. But it does not go far enough to make the
change safe to add to Squid IMO.
NP: I have only briefly checked the code to be sure this was not
actually doing the concurrency change yet without mentioning it.
However, one other major issue stands out;
StatefulGetFirstAvailable() is a C-style accessor method for the
stateful_helper, we simply have not managed to make it one properly yet.
The value it is given is the context within/for which the caller needs
it to find a usable *_server. Changing that context in order to find a
different server result (for some other context) is not an appropriate
thing to be doing.
The only thing in the new code for StatefulGetFirstAvailable() which
might lead to it needing non-const is the HelperServerBase::reserved()
method being indirectly called. That method should be "virtual bool
reserved() const = 0". Fixing that should make the
StatefulGetFirstAvailable() API const'ness change unnecessary.
-1 for now, and please do not apply this until the replay attack problem
is fully resolved.
Amos
More information about the squid-dev
mailing list