[squid-dev] Squid 5.6 leaking memory when peeking for an origin with an invalid certificate

Hamilton Coutinho hamilton.coutinho at gmail.com
Mon Mar 27 19:47:31 UTC 2023


Hello all,

We moved one step closer to the root cause. We bisected the code and found
out that the commit that introduced the leak is:

$ git show --stat  800967afde7921eb22a2711c5a75b5cd9c9d7178
commit 800967afde7921eb22a2711c5a75b5cd9c9d7178 (HEAD)
Author: Christos Tsantilas <christos at chtsanti.net>
Date:   Mon Feb 22 21:15:32 2021 +0000

    Handling missing issuer certificates for TLSv1.3 (#766)

    Prior to TLS v1.3 Squid could detect and fetch missing intermediate
    server certificates by parsing TLS ServerHello. TLS v1.3 encrypts the
    relevant part of the handshake, making such "prefetch" impossible.

    Instead of looking for certificates in TLS ServerHello, Squid now waits
    for the OpenSSL built-in certificate validation to fail with an
    X509_V_ERR_UNABLE_TO_GET_ISSUER_CERT_LOCALLY error. Squid-supplied
    verify_callback function tells OpenSSL to ignore that error. Squid
    SSL_connect()-calling code detects that the error was ignored and, if
    possible, fetches the missing certificates and orchestrates certificate
    chain validation outside the SSL_connect() sequence. If that validation
    is successful, Squid continues with SSL_connect(). See comments inside
    Security::PeerConnector::negotiate() for low-level details.

    In some cases, OpenSSL is able to complete SSL_connect() with an ignored
    X509_V_ERR_UNABLE_TO_GET_ISSUER_CERT_LOCALLY error. If Squid validation
    fails afterwards, the TLS connection is closed (before any payload is
    exchanged). Ideally, the negotiation with an untrusted server should not
    complete, but complexity BIO changes required to prevent such premature
    completion is probably not worth reaching that ideal, especially since
    all of this is just a workaround.

    The long-term solution is adding SSL_ERROR_WANT_RETRY_VERIFY to OpenSSL,
    giving an application a chance to download the missing certificates
    during SSL_connect() negotiations. We assist OpenSSL team with that
    change, but it will not be available at least until OpenSSL v3.0.

    This description and changes are not specific to SslBump code paths.

    This is a Measurement Factory project.

 acinclude/lib-checks.m4       |   6 ++
 compat/openssl.h              |  30 +++++++++
 src/security/Handshake.cc     |  46 --------------
 src/security/Handshake.h      |   5 +-
 src/security/Io.cc            |  27 ++++++++
 src/security/Io.h             |  14 +++-
 src/security/PeerConnector.cc | 211
++++++++++++++++++++++++++++++++++++++++++++-----------------
 src/security/PeerConnector.h  |  46 ++++++++++----
 src/security/Session.h        |   4 ++
 src/ssl/bio.cc                |   7 --
 src/ssl/bio.h                 |   8 ---
 src/ssl/gadgets.h             |   1 +
 src/ssl/support.cc            | 355
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++----------------------------
 src/ssl/support.h             |  57 +++++++++++++++--
 src/tests/stub_libsecurity.cc |   4 +-
 15 files changed, 582 insertions(+), 239 deletions(-)

On Thu, Mar 23, 2023 at 4:56 PM Hamilton Coutinho <
hamilton.coutinho at gmail.com> wrote:

> Hi all,
>
> We found a way to reproduce the leak: set up squid to run in intercept
> mode + SSL description + memory pools off (to ease identifying the leak)
> and then generate requests to sites with invalid certs (eg, CA not
> installed), for instance: curl -k https://slscr.update.microsoft.com.
> squidclient mgr:mem should show ever increasing HttpRequest instances.
>
> As far as I can tell, the HttpRequest object created
> in ConnStateData::buildFakeRequest() is never freed because its refcount >
> 0.
>
> Any ideas where an HTTPMSGUNLOCK() might be missing?
>
> Thanks!
>
> On Wed, Jan 18, 2023 at 11:28 AM Hamilton Coutinho <
> hamilton.coutinho at gmail.com> wrote:
>
>> Hi Alex,
>>
>> Thanks for the prompt reply! Thanks also for the clarifications.
>>
>> Agreed, I just realized the requests seem to be failing
>> with Http::scServiceUnavailable, so my focus turned
>> to Security::PeerConnector::sslCrtvdHandleReply() and friends.
>>
>> Best.
>>
>> On Wed, Jan 18, 2023 at 11:11 AM Alex Rousskov <
>> rousskov at measurement-factory.com> wrote:
>>
>>> On 1/18/23 13:46, Hamilton Coutinho wrote:
>>> > Hi all,
>>> >
>>> > We are observing what seems to be several objects leaking in the
>>> output
>>> > mgr:mem, to the tune of 10s of 1000s
>>> >
>>> of HttpRequest, HttpHeaderEntry, Comm::Connection, Security::ErrorDetail, cbdata
>>> PeekingPeerConnector (31), etc.
>>> >
>>> > We dumped a core and managed to find some HttpRequest objects and they
>>> > all seem to have failed in the same way, with an
>>> ERR_SECURE_CONNECT_FAIL
>>> > category, for a site that has a certificate signed by a CA authority
>>> not
>>> > available to squid.
>>> >
>>> > If I would guess, the origin of the problem might be in
>>> > Ssl::PeekingPeerConnector::checkForPeekAndSpliceMatched():
>>> >
>>> >      if (finalAction == Ssl::bumpTerminate) {
>>> >          bail(new ErrorState(ERR_SECURE_CONNECT_FAIL,
>>> Http::scForbidden,
>>> > request.getRaw(), al));
>>> >          clientConn->close();
>>> >          clientConn = nullptr;
>>> >
>>> > Wondering if assigning null to clientConn there would be premature.
>>>
>>>
>>> FWIW, that connection pointer reset itself looks OK to me. ConnStateData
>>> and/or others should have a connection closure handler attached to the
>>> clientConn descriptor. That handler should be notified by Comm and
>>> initiate cleanup of the objects responsible for client-Squid
>>> communication.
>>>
>>> The bail() call above should inform the requestor about the
>>> error/termination and terminate this AsyncJob. That requestor should
>>> then close the Squid-server connection and clean up associated state.
>>>
>>> While there may be bugs in those "should..." sequences, please note that
>>> the pasted code is not related to handling of untrusted origin servers
>>> (unless your ssl_bump rules specifically activate the terminate action
>>> upon discovering such an origin server). The pasted code is reacting to
>>> an "ssl_bump terminate" rule matching.
>>>
>>>
>>> Cheers,
>>>
>>> Alex.
>>>
>>> _______________________________________________
>>> squid-dev mailing list
>>> squid-dev at lists.squid-cache.org
>>> http://lists.squid-cache.org/listinfo/squid-dev
>>>
>>
>>
>> --
>> Hamilton
>>
>
>
> --
> Hamilton
>


-- 
Hamilton
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.squid-cache.org/pipermail/squid-dev/attachments/20230327/bab43fe4/attachment.htm>


More information about the squid-dev mailing list