[squid-dev] [PATCH] switch session/connection for OpenSSL

Sun Jul 9 10:30:35 UTC 2017

On 09/07/17 15:28, Alex Rousskov wrote:
> On 06/10/2017 06:27 AM, Amos Jeffries wrote:
> 
>> Yes this is far from complete, and intentionally much smaller that
>> the previous patch.
> 
> I believe this patch is a huge step in the right direction. It has one
> serious flaw (namespace misuse), but I hope that can be addressed
> without too much controversy. The details are further below.
> 
> 
>> On 27/04/17 05:24, Alex Rousskov wrote:
>>> Needless to say, I would be happy if we can come up with better
>>> definitions or even better concepts. The above is a starting point.
> 
>> I do not think it is up to us to define these things. So I have taken a
>> much longer reading of all the RFCs since SSLv3.0 through to current
>> TLS/1.3 and isolated what are the authoritative definitions AFAICT.
> 
> For the record, I am happy to adopt any "authoritative" definitions that
> suite Squid specifics well, but, IMO, it is up to _us_ to decide
> suitability and, if necessary, to define and/or clarify things within
> Squid context. Quoting RFCs is often useful, but it is not a panacea for
> any problem, including the terminology problem.
> 
> For example, a TLS Connection exists, as a concept, whether any RFC
> defines or not. An RFC may not need that definition, may assume that
> every reader already understands that concept, or may just be sloppy. We
> should reuse what we can and fill the gaps with our own definitions as
> needed.
> 
> 
>> * moves the pieces that are doing what is defined as solely TLS
>> Connection things to security/TlsConnection.* files.
> 
> Please do not misinterpret the namespace comments below as a
> disagreement with the above change. Moving TLS Connection code to
> security/TlsConnection.* files is fine with me.
> 
> 
>> * adds a Security::TlsConnection::Pointer type for use by code dealing
>> with TLS Connection logic.
> 
> This is also great.
> 
> 
>>   - SessionPointer still exists for code performing TLS Session logic.
>> see PeerConnector description for the distinction.
> 
> Sure, this is fine. TLS Session is a perfectly valid concept as well.
> 
> 
>>   - I have not gone through and renamed uses of SessionPointer beyond
>> those directly involved with the above code shuffle.
> 
> Can you give an example or two of old uses that should be fixed? Do you
> plan on doing this renaming as a part of this series of patches? It
> would be nice to get rid of the inconsistencies in one sweep or, if
> necessary, several close-in-time changes...
> 

class fde "ssl" member. This is a pointer to the TLS session state for 
the TCP connection whose FD the fd_table entry is storing.

> 
>> +namespace TlsConnection {
> 
> We should not use namespaces for what is a class. Very roughly speaking:
> If you can create multiple instances of X, then X is (or can be) a
> class. It is not a namespace. In this context:
> 
> * I can have five "TLS connections". A "TLS connection" is (or can be) a
> class.
> 
> * All TLS-related functions, classes, constants, and other interfaces
> can be grouped under a single "Tls" namespace. I cannot have five TLSs.
> "Tls" is (or can be) a namespace.

They might be one day. That is out of scope of this patch which 
concentrates solely on documenting the border between TLS connection and 
TLS session.

> 
> N.B. I do not suggest adding a TLS connection class now (and we already
> have library-specific types that currently fulfill that role,
> essentially). Using Security::ConnectionPointer is enough.

The classes in Squid which most accurately match the "TLS connection" 
definition are actually class fde and class Comm::Connection. Since fde 
stores the FD value, I/O handlers and pointer to an active SSL session 
(indirectly via the SSL* pointer), and Comm::Connection stores the FD 
and related state/session data.

> 
> 
>> +    if (Security::TlsConnection::CreateServer(ctx, conn, "client https start")) {
> 
> I would replace TlsConnection::CreateServer, which reads like nonsense

Take a look at the naming of this code in Squid-3. These are the names 
you have spent most of a year trying to get me to undo. Now you are 
saying they were nonsense.

"
SSL *
Ssl::CreateClient(SSL_CTX *sslContext, const int fd, const char *squidCtx)
{
     return SslCreate(sslContext, fd, Ssl::Bio::BIO_TO_SERVER, squidCtx);
}

SSL *
Ssl::CreateServer(SSL_CTX *sslContext, const int fd, const char *squidCtx)
{
     return SslCreate(sslContext, fd, Ssl::Bio::BIO_TO_CLIENT, squidCtx);
}
"

> IMHO, with something like:
> 
> * Accept
> * AcceptConnection
> * PrepareToAccept
> * CreateConnection(ToSquid)
> 
> And the corresponding names for the Squid-to-peer TLS connections:
> 
> * Connect
> * InitiateConnection
> * PrepareToConnect
> * CreateConnection(FromSquid)
> 
> "Accept" and "connect" are common networking terms that can be applied
> to TLS connections well. The last two versions in each group are more
> precise because these functions only _prepare_ the TLS connection for
> future negotiations; they do not actually start those negotiations.

The CreateX() functions are the constructors of the _data structure_ for 
storing TLS context data. They have zero properties and similarity with 
accept(2) or connect(2) in TCP connections.

*IF* these functions actually did connection semantics then all of the 
logic in the class PeerConnector hierarchy is completely wrong.
  (you may recall that one of the major mistakes I made very early on 
which led down this whole muckup was thinking that these did the 
'connect' and PeerConnector did the encryption).

This distinction is supported directly by the fact that this logic in 
these functions is identical for clients and servers.

** In no way does a TLS client code in PeerConnector accept(2) a 
connection. Therefore this logic is not doing accept(2).

b) Likewise, it is also not performing connect(2) equivalent semantics.

This is where your insistence that TLS connection and TCP connection are 
equivalent falls down completely. Because a TLS connection _exists_ as a 
result of these functions, but cannot be used for I/O.
  In order to use as a transport channel a session must also be 
negotiated. That session negotiation is the equivalent of connect(2) - 
yet it does *not* create a "TLS connection" - it creates a "TLS session".

In fact the more I think about it now, these functions are somewhat a 
TLS equivalent of socket(2). Since a socket/FD has to be allocated 
before a connection can begin to be actively started, and lots of 
behaviour flags etc get applied to the socket/FD after creation and 
before first use to manage the connections eventual behaviour.

For lack of anything actually suitable I bowed to your request to retain 
the Squid-3 terminology here. That being the words Create and 
Client/Server rather than connect/accept/listen/prepare etc.

> 
> All these functions should be inside the existing Security namespace. If
> you think we will have non-SSL/TLS things inside Security, then we can
> also add a Tls namespace, but I do not see the need for that right now.
> 

Er, they are in Security::

Lets talks a short side-step here and consider the Comm::Connection 
object design. It is supposed to be a generic abstraction for *any* 
transport (including TLS encrypted ones) such that the code using it 
does not need to be aware of whether it was TCP, UDP, SCTP, TLS, DTLS or 
whatever - just pass the Conn object to read/write API with a buffer and 
get that buffer filled or emptied.

With "TLS connection" being defined as being a pair [transport 
connection, TLS session]. The TLS connection "object" is most correctly 
typed as either std::pair<Comm::Connection, Security::SessionPointer>, 
OR adding a Comm::Connection with member pointing to the TLS session.

(ignoring for now that SessionPointer is currently borked).

Right now there is also a fuzzy line in the comm code between 
Comm::Connection and class fde, so IMO fde holding the SessionPointer is 
wrong, but that is where it currently has to be placed due to the CommIO 
handling logics. I'm intentionally leaving that tangled web out of scope 
for this patch.

I did consider making these functions methods of a class TlsConnection, 
but that would easily lead to confusion of thinking that class *was* 
representing a TLS connection and get in the way of future cleanup of 
the above tangle.

So for this patch I am simply adding a namespace placeholder for 
whatever the class will be and ensuring the static functions are in 
there. In the long term end product they may stay as statics like this 
in a class, or be renamed to non-static methods.

"Tls" is being used here to distinguish from Ssl:: things, but as you 
point out we have not gone far enough to warrant a Tls:: namespace yet. 
I do fully expect that there will be other security types before too 
much longer. Specifically some Security::SslFoo things. Then there is 
SCTP etc.

So making the name fully descriptive and avoiding even more renames 
seems to be a good choice. There should be no harm in documenting that 
these were created to be TLS specific things.

> 
>> +/// Create TLS state objects and associate with a TLS connection to form a TLS connection.
>> +/// Client connection structures and TLS/SSL I/O (Comm and BIO) are initialized.
> 
> 
>> +/// Create TLS state objects and associate with a TLS connection to form a TLS connection.
>> +/// Server connection structures and TLS/SSL I/O (Comm and BIO) are initialized.
> 
> 
> I am guessing something went wrong in the first sentences during
> find-and-replace ("TLS connection to form a TLS connection"). See below
> for a specific fix suggestion.

Oops. Yes. That first should be "TCP". Though using "transport" now as 
per below change.

> 
> Also, please avoid the "client connection" and "server connection"
> terminology, especially in high-level little-context descriptions like
> these, because such terms are very confusing in Squid context. As you
> know, we already have "client side" that has "server agents", and
> "server side" that has "client agents". To reduce confusion, use
> "from-client connection" and "to-server connection" phrasing instead
> (or, if you prefer, the verbose "client-to-Squid connection" and
> "Squid-to-server connection" variants.
> 

The Client/Server words are following our current Squid-oriented 
(Squid-as-Client and Squid-as-Server) terminology.

ie. "Client connection structures" refers to TLS connection structures 
where Squid is the client.

The TLS terminology, Squid terminology, library APIs seem to align. Only 
the BIO enum naming uses server/client the other way around, and those 
explicitly state "_TO_" for clarity. So I don't seen much in the way of 
confusion here.

> Please consider using these replacements:
> 
> /// Creates a TLS Connection and
> /// associates it with the given from-client transport connection.
> /// \returns false on errors (and logs details using DBG_IMPORTANT)
> 
> /// Creates a TLS Connection and
> /// associates it with the given to-server transport connection.
> /// \returns false on errors (and logs details using DBG_IMPORTANT)
> 

Done the \return change. The other doc changes are wrong because the 
CreateClient uses a to-server transport connection to generate a 
to-server / from-Squid TLS connection.

> 
> The other comments are about relatively minor problems:
> 
> 
>> + * Initializes TLS data structures and associates with a TCP Connection
>> + * to form a Client or Server TLS Connection.
> 
>> + * Initiates encryption on a TCP Connection to peers or servers.
> 
>> + * The caller must monitor the connection for TCP closure because this job will
> 
> 
> s/TCP Connection/transport connection/g
> 
> s/the connection for TCP/the transport connection for/
> 
> because when we start supporting SslBump for HTTPS proxies and/or HTTP/2
> for HTTPS proxies, that transport will not be TCP. There is nothing
> specific to TCP in that code IIRC.

Done.

> 
> Please apply similar adjustments to other "TCP" phrase added by the patch.
> 
> 
>> + * Initializes TLS data structures and associates with a TCP Connection
>> + * to form a Client or Server TLS Connection.
> 
> s/associates with/associates them with/

Done.

>> + * Initializes TLS data structures and associates with a TCP Connection
>> + * to form a Client or Server TLS Connection.
> 
> I would replace "to form a Client or Server TLS Connection" with "to a
> TLS Connection with a Client or Server".
> 

Nouns. The TLS data structures are not being associated with a TLS 
connection, they *are* the "Client TLS" or "Server TLS" part that makes 
the transport turn into a "Client or Server TLS connection" 
collective/aggregate thing.

> 
>> ... The primary operational
>> + * difference between the server and client is that the server is
>> + * generally authenticated, while the client is only optionally
>> + * authenticated."
> 
> I recommend removing that text -- that "primary operational difference"
> has little to do with the CreateTlsConnection code.
> 
> Overall, quoting fairly generic "client" and "server" definitions feels
> like noise to me, but I am not going to fight you about that.
> 

That is the distinguishing factor for client vs server at this level of 
TLS. It relates directly to what is correct handling of the BIO 
TO_SERVER/TO_CLIENT parameter sent to this function.

Sadly that is as detailed as they are. Quoting in full in part documents 
where the line is between the fuzzy formal definitions and anything we 
tack on to, as you said earlier, 'fill the gaps'.

> 
>> -    // Temporary ssl for getting X509 certificate from SSL_CTX.
>> -    Security::SessionPointer ssl(Security::NewSessionObject(ctx));
>> +    // Temporary Pointer<SSL> for getting X509 certificate from SSL_CTX.
>> +    Security::TlsConnection::Pointer ssl(Security::TlsConnection::NewSslObject(ctx));
> 
> s/Pointer<SSL>/SSL object/
> 
> ... because the comment is about the temporary OpenSSL SSL object, not
> the pointer to it (the local pointer variable is also temporary, but
> that is not the point here -- it is the temporary nature of the
> underlying SSL object that is rather unusual and worth commenting on).

Okay, done.

> 
> 
>> PS: Applying the definitions to PeerConnector, it has become clear that
>> it (and children) not following a MUST requirement about the underlying
>> TCP transport connection being terminated in the case where Handshake
>> negotiation failed due to a Record protocol violation. They are leaving
>> this closure to the caller which is a layering violation
> 
> There is no layering violation in letting the transport-aware caller to
> do transport-specific things. I seriously doubt separating handshake
> failure detection code from the code reacting to that failure violates
> any RFC MUSTs! There are certainly interface problems between
> PeerConnector and its users, but they are mostly software problems
> unrelated to RFCs/terminology and are outside this thread scope AFAICT.
> 

Maybe I was not clear enough.

* The layering violation is in requiring the caller / semantics layer to 
be aware of TLS syntax and how issues there need to be handled.

* The RFC violation is that the caller does *not* perform the close due 
to the protocol syntax errors. It seems to instead die on semantic 
interpretations of the bad syntax, which might be leaving nasty security 
holes on some garbage.

The TLS layer already is aware that there is a transport (relatively 
opaque in type though). So it is already in a position to close() that 
transport as required in the spec and let the caller do whatever its 
handler was for that.

The caller already has a close handler in place to do its own "stuff" 
when TLS layer aborts with the close().

Thank you
Amos