I tried to implement a protocol that can run TLS over TLS using `twisted.protocols.tls`, an interface to OpenSSL using a memory BIO.
I implemented this as a protocol wrapper that mostly looks like a regular TCP transport, but which has `startTLS` and `stopTLS` methods for adding and removing a layer of TLS respectively. This works fine for the first layer of TLS. It also works fine if I run it over a "native" Twisted TLS transport. However, if I try to add a second TLS layer using the `startTLS` method provided by this wrapper, there's immediately a handshake error and the connection ends up in some unknown, unusable state.
The wrapper and the two helpers that let it work look like this:
```python
from twisted.python.components import proxyForInterface
from twisted.internet.error import ConnectionDone
from twisted.internet.interfaces import ITCPTransport, IProtocol
from twisted.protocols.tls import TLSMemoryBIOFactory, TLSMemoryBIOProtocol
from twisted.protocols.policies import ProtocolWrapper, WrappingFactory


class TransportWithoutDisconnection(proxyForInterface(ITCPTransport)):
    """
    A proxy for a normal transport that disables actually closing the
    connection.  This is necessary so that when TLSMemoryBIOProtocol notices
    the SSL EOF it doesn't actually close the underlying connection.  All
    methods except loseConnection are proxied directly to the real transport.
    """
    def loseConnection(self):
        pass


class ProtocolWithoutConnectionLost(proxyForInterface(IProtocol)):
    """
    A proxy for a normal protocol which captures clean connection shutdown
    notification and sends it to the TLS stacking code instead of the
    protocol.  When TLS is shut down cleanly, this notification will arrive.
    Instead of telling the protocol that the entire connection is gone, the
    notification is used to unstack the TLS code in OnionProtocol and hidden
    from the wrapped protocol.  Any other kind of connection shutdown (SSL
    handshake error, network hiccups, etc.) is treated as a real problem and
    propagated to the wrapped protocol.
    """
    def connectionLost(self, reason):
        if reason.check(ConnectionDone):
            self.onion._stopped()
        else:
            super(ProtocolWithoutConnectionLost, self).connectionLost(reason)


class OnionProtocol(ProtocolWrapper):
    """
    OnionProtocol is both a transport and a protocol.  As a protocol, it can
    run over any other ITransport.  As a transport, it implements stackable
    TLS.  That is, whatever application traffic is generated by the protocol
    running on top of OnionProtocol can be encapsulated in a TLS
    conversation.  Or, that TLS conversation can be encapsulated in another
    TLS conversation.  Or *that* TLS conversation can be encapsulated in yet
    *another* TLS conversation.

    Each layer of TLS can use different connection parameters, such as keys,
    ciphers, certificate requirements, etc.  At the remote end of this
    connection, each has to be decrypted separately, starting at the
    outermost and working in.  OnionProtocol can do this itself, of course,
    just as it can encrypt each layer starting with the innermost.
    """
    def makeConnection(self, transport):
        self._tlsStack = []
        ProtocolWrapper.makeConnection(self, transport)

    def startTLS(self, contextFactory, client, bytes=None):
        """
        Add a layer of TLS, with SSL parameters defined by the given
        contextFactory.

        If *client* is True, this side of the connection will be an SSL
        client.  Otherwise it will be an SSL server.

        If extra bytes which may be (or almost certainly are) part of the
        SSL handshake were received by the protocol running on top of
        OnionProtocol, they must be passed here as the *bytes* parameter.
        """
        # First, create a wrapper around the application-level protocol
        # (wrappedProtocol) which can catch connectionLost and tell this
        # OnionProtocol about it.  This is necessary to pop from _tlsStack
        # when the outermost TLS layer stops.
        connLost = ProtocolWithoutConnectionLost(self.wrappedProtocol)
        connLost.onion = self
        # Construct a new TLS layer, delivering events and application data
        # to the wrapper just created.
        tlsProtocol = TLSMemoryBIOProtocol(None, connLost, False)
        tlsProtocol.factory = TLSMemoryBIOFactory(contextFactory, client, None)
        # Push the previous transport and protocol onto the stack so they
        # can be retrieved when this new TLS layer stops.
        self._tlsStack.append((self.transport, self.wrappedProtocol))
        # Create a transport for the new TLS layer to talk to.  This is a
        # passthrough to the OnionProtocol's current transport, except for
        # capturing loseConnection to avoid really closing the underlying
        # connection.
        transport = TransportWithoutDisconnection(self.transport)
        # Make the new TLS layer the current protocol and transport.
        self.wrappedProtocol = self.transport = tlsProtocol
        # And connect the new TLS layer to the previous outermost transport.
        self.transport.makeConnection(transport)
        # If the application accidentally got some bytes from the TLS
        # handshake, deliver them to the new TLS layer.
        if bytes is not None:
            self.wrappedProtocol.dataReceived(bytes)

    def stopTLS(self):
        """
        Remove a layer of TLS.
        """
        # Just tell the current TLS layer to shut down.  When it has done
        # so, we'll get notification in _stopped.
        self.transport.loseConnection()

    def _stopped(self):
        # A TLS layer has completely shut down.  Throw it away and move back
        # to the TLS layer it was wrapping (or possibly back to the original
        # non-TLS transport).
        self.transport, self.wrappedProtocol = self._tlsStack.pop()
```
I have simple client and server programs for exercising this, available from Launchpad (`bzr branch lp:~exarkun/+junk/onion`). When I use them to call the `startTLS` method above twice, with no intervening call to `stopTLS`, this OpenSSL error comes up:

```
OpenSSL.SSL.Error: [('SSL routines', 'SSL23_GET_SERVER_HELLO', 'unknown protocol')]
```
Why do things go wrong?
There are at least two problems with `OnionProtocol`:

1. `TLSMemoryBIOProtocol` becomes the `wrappedProtocol`, when it should be the outermost;
2. `ProtocolWithoutConnectionLost` does not pop any `TLSMemoryBIOProtocol`s off `OnionProtocol`'s stack, because `connectionLost` is only called after a `FileDescriptor`'s `doRead` or `doWrite` methods return a reason for disconnection.

We can't solve the first problem without changing the way `OnionProtocol` manages its stack, and we can't solve the second until we figure out the new stack implementation. Unsurprisingly, the correct design is a direct consequence of how data flows within Twisted, so we'll start with some data flow analysis.
Twisted represents an established connection with an instance of either `twisted.internet.tcp.Server` or `twisted.internet.tcp.Client`. Since the only interactivity in our program happens in `stoptls_client`, we'll only consider the data flow to and from a `Client` instance.

Let's warm up with a minimal `LineReceiver` client that echoes back lines received from a local server on port 9999:
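That warm-up client isn't reproduced here. To keep the data path visible without pulling in Twisted, below is a minimal stand-in that models `LineReceiver`'s buffering and echo behavior; `FakeTransport`, `EchoClient`, and their method bodies are invented for illustration, with comments noting which real calls they mimic:

```python
# A stand-in for the elided warm-up example: a stripped-down model of the
# LineReceiver data path, with no Twisted dependency.

class FakeTransport:
    """Collects writes the way Client buffers outgoing data."""
    def __init__(self):
        self.written = []

    def write(self, data):
        self.written.append(data)


class EchoClient:
    """Buffers incoming bytes and echoes each complete line back."""
    delimiter = b"\r\n"

    def __init__(self):
        self._buffer = b""
        self.transport = None

    def makeConnection(self, transport):
        self.transport = transport

    def dataReceived(self, data):
        # In Twisted, the reactor calls this (indirectly) from Client.doRead.
        self._buffer += data
        while self.delimiter in self._buffer:
            line, self._buffer = self._buffer.split(self.delimiter, 1)
            self.lineReceived(line)

    def lineReceived(self, line):
        self.sendLine(line)

    def sendLine(self, line):
        # In Twisted this calls transport.write, i.e. Client.write.
        self.transport.write(line + self.delimiter)


client = EchoClient()
client.makeConnection(FakeTransport())
client.dataReceived(b"hello\r\nwor")
client.dataReceived(b"ld\r\n")
print(client.transport.written)  # [b'hello\r\n', b'world\r\n']
```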
Once the connection's established, a `Client` becomes our `LineReceiver` protocol's transport and mediates input and output:

New data from the server causes the reactor to call the `Client`'s `doRead` method, which in turn passes what it's received to `LineReceiver`'s `dataReceived` method. Finally, `LineReceiver.dataReceived` calls `LineReceiver.lineReceived` when at least one line is available.

Our application sends a line of data back to the server by calling `LineReceiver.sendLine`. This calls `write` on the transport bound to the protocol instance, which is the same `Client` instance that handled incoming data. `Client.write` arranges for the data to be sent by the reactor, while `Client.doWrite` actually sends the data over the socket.
We're ready to look at the behaviors of an `OnionClient` that never calls `startTLS`:

`OnionClient`s are wrapped in `OnionProtocol`s, which are the crux of our attempt at nested TLS. As a subclass of `twisted.internet.policies.ProtocolWrapper`, an instance of `OnionProtocol` is a kind of protocol-transport sandwich; it presents itself as a protocol to a lower-level transport and as a transport to a protocol it wraps, through a masquerade established at connection time by a `WrappingFactory`.

Now, `Client.doRead` calls `OnionProtocol.dataReceived`, which proxies the data through to `OnionClient`. As `OnionClient`'s transport, `OnionProtocol.write` accepts lines to send from `OnionClient.sendLine` and proxies them down to `Client`, its own transport. This is the normal interaction between a `ProtocolWrapper`, its wrapped protocol, and its own transport, so naturally data flows to and from each without any trouble.
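Roughly, that ordinary flow has this shape (a sketch reconstructed from the description above, not a diagram from the original):

```
reactor -> Client.doRead -> OnionProtocol.dataReceived -> OnionClient.dataReceived
network <- Client.write  <- OnionProtocol.write        <- OnionClient.sendLine
```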
`OnionProtocol.startTLS` does something different. It attempts to interpose a new `ProtocolWrapper` (which happens to be a `TLSMemoryBIOProtocol`) between an established protocol-transport pair. This seems easy enough: a `ProtocolWrapper` stores the upper-level protocol as its `wrappedProtocol` attribute, and proxies `write` and other attributes down to its own transport. `startTLS` should be able to inject a new `TLSMemoryBIOProtocol` that wraps `OnionClient` into the connection by patching that instance over its own `wrappedProtocol` and `transport`:
Here's the flow of data after the first call to `startTLS`:

As expected, new data delivered to `OnionProtocol.dataReceived` is routed to the `TLSMemoryBIOProtocol` stored on the `_tlsStack`, which passes the decrypted plaintext to `OnionClient.dataReceived`. `OnionClient.sendLine` also passes its data to `TLSMemoryBIOProtocol.write`, which encrypts it and sends the resulting ciphertext to `OnionProtocol.write` and then `Client.write`.

Unfortunately this scheme fails after a second call to `startTLS`. The root cause is this line in `startTLS`: `self.wrappedProtocol = self.transport = tlsProtocol`.
Each call to `startTLS` replaces the `wrappedProtocol` with the innermost `TLSMemoryBIOProtocol`, even though the data received by `Client.doRead` was encrypted by the outermost:

The `transport`s, however, are nested correctly. `OnionClient.sendLine` can only call its transport's `write` (that is, `OnionProtocol.write`), so `OnionProtocol` should replace its `transport` with the innermost `TLSMemoryBIOProtocol` to ensure writes are successively nested inside additional layers of encryption.

The solution, then, is to ensure that data flows through the first `TLSMemoryBIOProtocol` on the `_tlsStack` to the next one in turn, so that each layer of encryption is peeled off in the reverse order it was applied:
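The ordering requirement can be demonstrated without TLS at all. In the sketch below, each `Layer` is a toy reversible transform standing in for one TLS session; the class and tag names are invented for illustration:

```python
# A toy model of nested encryption: peeling must happen in the reverse
# order of wrapping, which is exactly what the buggy code gets wrong.

class Layer:
    def __init__(self, tag):
        self.tag = tag.encode()

    def encrypt(self, data):
        return self.tag + b"(" + data + b")"

    def decrypt(self, data):
        prefix = self.tag + b"("
        if not (data.startswith(prefix) and data.endswith(b")")):
            raise ValueError("wrong layer: cannot decrypt")
        return data[len(prefix):-1]


first = Layer("tls1")    # first startTLS: outermost on the wire
second = Layer("tls2")   # second startTLS: innermost, wraps the plaintext

# Outgoing: the innermost layer encrypts before the outermost.
wire = first.encrypt(second.encrypt(b"hello"))

# Incoming: peeling must start with the outermost layer ...
assert second.decrypt(first.decrypt(wire)) == b"hello"

# ... but the buggy wrappedProtocol assignment hands the wire bytes to the
# innermost layer first, which fails immediately, like the handshake error.
try:
    second.decrypt(wire)
except ValueError as exc:
    print(exc)   # wrong layer: cannot decrypt
```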
Representing `_tlsStack` as a list seems less natural given this new requirement. Fortunately, representing the incoming data flow linearly suggests a new data structure:

Both the buggy and the correct flow of incoming data resemble a singly-linked list, with `wrappedProtocol` serving as `ProtocolWrapper`'s next link and `protocol` serving as `Client`'s. The list should grow downward from `OnionProtocol` and always end with `OnionClient`. The bug occurs because that ordering invariant is violated.

A singly-linked list is fine for pushing protocols onto the stack but awkward for popping them off, because it requires traversal downward from its head to the node to remove. Of course, this traversal happens every time data's received, so the concern is the complexity implied by an additional traversal rather than worst-case time complexity. Fortunately, the list is actually doubly linked:
The `transport` attribute links each nested protocol with its predecessor, so that `transport.write` can layer on successively lower levels of encryption before finally sending the data across the network. We have two sentinels to aid in managing the list: `Client` must always be at the top and `OnionClient` must always be at the bottom.

Putting the two together, we end up with this:
(This is also available on GitHub.)
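The full corrected implementation isn't reproduced here, but the doubly-linked bookkeeping can be sketched in plain Python, with toy transforms in place of `TLSMemoryBIOProtocol`. All class and function names below are invented for illustration:

```python
# Not the real implementation (see the GitHub link); just a pure-Python
# sketch of the doubly-linked protocol list with its two sentinels.

class ClientEnd:
    """Top sentinel: stands in for the TCP-level Client transport."""
    def __init__(self):
        self.above = None          # protocol link, toward the application
        self.sent = []

    def write(self, data):         # ciphertext leaving for the network
        self.sent.append(data)

    def dataReceived(self, data):  # bytes arriving from the network
        self.above.dataReceived(data)


class AppEnd:
    """Bottom sentinel: stands in for OnionClient."""
    def __init__(self):
        self.below = None          # transport link, toward the network
        self.received = []

    def dataReceived(self, data):  # plaintext, fully decrypted
        self.received.append(data)

    def write(self, data):
        self.below.write(data)


class ToyTLS:
    """One stacked layer: peels a tag going up, adds it going down."""
    def __init__(self, tag):
        self.tag = tag.encode()
        self.above = self.below = None

    def dataReceived(self, data):
        prefix = self.tag + b"("
        assert data.startswith(prefix) and data.endswith(b")")
        self.above.dataReceived(data[len(prefix):-1])

    def write(self, data):
        self.below.write(self.tag + b"(" + data + b")")


def start_layer(app, layer):
    """Grow the list downward: the new layer goes directly above the
    application, so the first layer stays outermost on the wire."""
    old = app.below
    old.above, layer.below = layer, old
    layer.above, app.below = app, layer


client, app = ClientEnd(), AppEnd()
client.above, app.below = app, client
start_layer(app, ToyTLS("tls1"))
start_layer(app, ToyTLS("tls2"))

app.write(b"hello")                        # wrapped by tls2, then tls1
client.dataReceived(b"tls1(tls2(hello))")  # peeled by tls1, then tls2
print(client.sent, app.received)           # [b'tls1(tls2(hello))'] [b'hello']
```

Pushing a layer is O(1) at the bottom of the list, and the `above`/`below` links play the roles that `wrappedProtocol`/`protocol` and `transport` play in the real code.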
The solution to the second problem lies in `PopOnDisconnectTransport`. The original code attempted to pop a TLS session off the stack via `connectionLost`, but because only a closed file descriptor causes `connectionLost` to be called, it failed to remove stopped TLS sessions that didn't close the underlying socket.

At the time of this writing, `TLSMemoryBIOProtocol` calls its transport's `loseConnection` in exactly two places: `_shutdownTLS` and `_tlsShutdownFinished`. `_shutdownTLS` is called on active closes (`loseConnection`, `abortConnection`, `unregisterProducer`, and after `loseConnection` and all pending writes have been flushed), while `_tlsShutdownFinished` is called on passive closes (handshake failures, empty reads, read errors, and write errors). This all means that both sides of a closed connection can pop stopped TLS sessions off the stack during `loseConnection`. `PopOnDisconnectTransport` does this idempotently, because `loseConnection` is generally idempotent, and `TLSMemoryBIOProtocol` certainly expects it to be.
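The `PopOnDisconnectTransport` code itself isn't shown here, but the idempotency requirement it satisfies can be sketched in a few lines; the class below is an invented stand-in, not the real transport proxy:

```python
# A sketch of idempotent pop-on-loseConnection: loseConnection may be
# called more than once per TLS layer, so the pop must happen only once.

class PopOnDisconnect:
    def __init__(self, stack):
        self._stack = stack
        self._popped = False

    def loseConnection(self):
        # Per the analysis above, TLSMemoryBIOProtocol can reach this via
        # either _shutdownTLS or _tlsShutdownFinished; pop the stopped
        # session exactly once, and make repeat calls a no-op.
        if not self._popped:
            self._popped = True
            self._stack.pop()


stack = ["tcp", "tls1"]
t = PopOnDisconnect(stack)
t.loseConnection()
t.loseConnection()   # second call is a no-op
print(stack)         # ['tcp']
```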
The downside to putting stack management logic in `loseConnection` is that it depends on the particulars of `TLSMemoryBIOProtocol`'s implementation. A generalized solution would require new APIs across many levels of Twisted. Until then, we're stuck with another example of Hyrum's Law.