Change History (30)

This probably needs a bit more testing but I think the work is basically done.

The code in the branch also addresses #4888 and one other ticket that exarkun filed about abstraction problems in the new TLS stuff. Fixing these abstraction problems turned out to be a blocker for fixing #4888 so that's why the work is also in this branch; I can split it up if the reviewer would prefer.

I'm not entirely sure of the utility of passing just a TLSMemoryBioProtocol to the IOpenSSLServerConnectionCreator and IOpenSSLClientConnectionCreator. Is the intent to reach into it to look at the host/port of the wrapped protocol's transport? Or, is it assumed that the provider of these interfaces will already have the hostname passed to it? The implementation currently does nothing with it but attach it to the OpenSSL.Connection. It does look like that would be useful for implementing handshake callbacks, though. Perhaps it's to split the interface into requesting a client or a server? Two of three implementers run the same code for client and server, and the third only implements the client part.

Why not make OpenSSLCertificateOptions a subclass of OpenSSLCertificateLegacyOptions instead of the other way around with the __class__ hack? I have a feeling there's a reason, but I don't see it.

twisted.web.client line 854: this doesn't seem right. How are users supposed to specify non-ASCII hostnames? IDNA-encode them before passing them to Agent?

I'm not entirely sure of the utility of passing just a TLSMemoryBioProtocol to the IOpenSSLServerConnectionCreator and IOpenSSLClientConnectionCreator.

This is the whole point of this branch :-).

The intent is to preserve a reference from the SSL.Connection (which is what the info_callback receives) to the TLSMemoryBIOProtocol (which is the object with the failVerification method that allows info_callback to terminate the handshake). Furthermore the intent is to push the responsibility for this reference into the connection creator, where it belongs, rather than the TLSMemoryBIOProtocol itself, which is where it had to be before.

It does look like that would be useful for implementing handshake callbacks, though.

In fact, it is. Necessary, even ;-).

Perhaps it's to split the interface into requesting a client or a server? Two of three implementers run the same code for client and server, and the third only implements the client part.

Right now, OpenSSLCertificateOptions is just a gigantic grab bag of TLS options that you can (still, despite our best efforts) set in a variety of nonsensical configurations. However, clients and servers have very different security properties and verification code potentially needs to know which is which to supply a better API than that. Modeling this with a boolean sucks, as you can see throughout the twisted codebase, because sometimes you want isServer and sometimes you want isClient and sometimes you want (as in the case of ITLSTransport.startTLS) a beNormal flag. Having two separate interfaces with two separate method names made sense to me, since in the future hopefully we'll have specific classes which only know how to handle one side of the connection and will just complain when you use them in the wrong place. (For example: this hostname argument is actually nonsense for a server, it's just that the giant grab-bag approach we've taken with CertificateOptions makes that difficult to untangle.)

Why not make OpenSSLCertificateOptions a subclass of OpenSSLCertificateLegacyOptions instead of the other way around with the __class__ hack? I have a feeling there's a reason, but I don't see it.

I talked to dreid about this and I'm going to remove the hack entirely. The goal of said hack was to display a reference to CertificateOptions from each interface's API documentation, and thereby have the interfaces imply that they are security-critical and that applications should only very rarely implement them, and instead use the implementation.

Instead, I should have used those actual words instead of trying to imply it, and explicitly typed a @see instead of relying on "implementers of".

twisted.web.client line 854: this doesn't seem right. How are users supposed to specify non-ASCII hostnames? IDNA-encode them before passing them to Agent?

This is definitely a flaw in the way Agent models URLs, but it's out of scope to fix it in this branch. You had to IDNA-encode the relevant portion of your URL anyway to call request(); see IAgent.request's documentation. If you're treating a URL as bytes, the hostname portion must be ASCII bytes or it's not a domain name.

Sounds like you should improve the comments or documentation in this code to explain the intent a bit more based on your responses to habnabit's review.

Also,

multiple cases of clientConnectionForTLS and serverConnectionForTLS should have docstrings.

I do think that the web client changes should be in a different branch, because I am not comfortable reviewing all of it at once.

Along the lines of my general point at the beginning of this comment, maybe you should update some of the documentation in the core TLS code to mention the word "hostname" somewhere, since these changes seem to be in support of hostname verification? I'm not sure if that's true, but there's definitely some intent missing from some documentation somewhere.

Are your changes in twisted.internet.tcp relevant to this change? (no I think)

but anyway you should use requireModule instead of import.

I like the word "wrap" instead of "convert" for describing what an adapter does. (_ConnectionFactory docstring)

I'm confused about the various implementations of clientConnectionForTLS. I see two calls: one in WebClientConnectionCreator, specifically on a CertificateOptions instance, and CertificationOptions.serverConnectionForTLS calling it on self. I see no way for _ConnectionFactory.clientConnectionForTLS or _WebToNormalContextFactory.clientConnectionForTLS to be called.

There are no callers for any of the implementations of serverConnectionForTLS, anywhere.

This code is really incomprehensible. There's a strong lack of intent for the new interfaces, and it's unclear how things are being called. The documentation needs more intent, and I think there may be bugs in the code as well.

Sounds like you should improve the comments or documentation in this code to explain the intent a bit more based on your responses to habnabit's review.

The problem with adding stuff to the branch to explain its reasoning is that
it's a factoring fix for being unable to get the hostname from where it is to
where it needs to be, and yet, the branch itself has nothing to do with
hostnames.

multiple cases of clientConnectionForTLS and serverConnectionForTLS should have docstrings.

Should they really? Guidance I've received previously is "if you're just
implementing an interface, let it inherit its interface docstring unless it's
doing something special". I mean, I'll do it if you think it'll help, I just
think maybe we need to have a discussion about how to go about this.

I do think that the web client changes should be in a different branch, because I am not comfortable reviewing all of it at once.

OK, done. Sorry for the large branch, but if that hadn't worked there would
ahve been no point in doing this, so I wanted to make sure. Reasonable to land
them separately though, I will re-apply them in a new branch once this is
landed.

Along the lines of my general point at the beginning of this comment, maybe you should update some of the documentation in the core TLS code to mention the word "hostname" somewhere, since these changes seem to be in support of hostname verification? I'm not sure if that's true, but there's definitely some intent missing from some documentation somewhere.

The problem is that, while yes, that's true, this change is simply an internal
factoring issue. The previous implementation of hostname verification was just
buggy, but its documentation is still accurate and it's also in the right
place: CertificateOptions explains its hostname argument and literally
everything else eventually points to CertificateOptions via a @see or a
L{}, along with appropriately stern cautionary language.

Are your changes in twisted.internet.tcp relevant to this change? (no I think)

but anyway you should use requireModule instead of import.

I've reverted them.

I like the word "wrap" instead of "convert" for describing what an adapter does. (_ConnectionFactory docstring)

OK.

I'm confused about the various implementations of

clientConnectionForTLS. I see two calls: one in WebClientConnectionCreator,
specifically on a CertificateOptions instance, and
CertificationOptions.serverConnectionForTLS calling it on self. I see no way
for _ConnectionFactory.clientConnectionForTLS or
_WebToNormalContextFactory.clientConnectionForTLS to be called.

There are no callers for any of the implementations of

serverConnectionForTLS, anywhere.

So, for those playing along with our home game, there was a line that converted
an Interface into a list, got the 0th index of that list, and then used that
with getattr to produce a method. This works because zope interfaces, when
iterated, yield each of their interface elements identifiers (i.e. method or
attribute names) in turn. Since I knew that each interface here (either
IOpenSSLClientConnectionCreator or IOpenSSLServerConnectionCreator) had
exactly one method, that meant that I could pull that method out and retrieve
the method from its respective provider.

I've removed this brilliantly elegant one-line hack in
twisted.internet.tls.TLSMemoryBIOFactory.__init__ and replaced it with six
lines of tedious conditional boilerplate in
twisted.internet.tls.TLSMemoryBIOFactory._createConnection, since none of you
plebians can appreciate my transcendant genius.

This code is really incomprehensible. There's a strong lack of intent for the
new interfaces, and it's unclear how things are being called. The
documentation needs more intent, and I think there may be bugs in the code as
well.

You definitely said "intent" four times in this review, but upon reflection I
don't know what you intended. I don't know what else I can put into the code
that would satisfy your requirement here.

There's a motivation for the change here, of course, but recording every
historical detail of the way old, bad things used to work in the code seems
like it would be a pretty big mistake that would be confusing; users already
get enough of that by just seeing lots of code lying around in our various
modules. Also, users don't want to read our Kalevala when they're trying to
figure out how to do things. So I guess I'll the motivations for the change in
detail here, and you can tell me how much of this you think belongs in the
code, and what parts exactly comprise my "intent" with these interfaces.

Previously, the contract between the various ways to initiate TLS (startTLS,
TLSMemoryBIOFactory directly, connectSSL, listenSSL) and their respective
contextFactory arguments was almost completely unspecified. The only
guidance provided was that it should subclass either
twisted.internet.ssl.ClientContextFactory or
twisted.internet.ssl.Contextfactory. Except that the main implementation
that was suggested for use, twisted.internet.ssl.CertificateOptions, does not
actually do either of those things. And in fact, it doesn't even provide the
isClient or method attributes which these base-classes suggest might be
required, but are, in fact, not.

So IOpenSSLServerConnectionCreator and IOpenSSLClientConnectionCreator are
just specified, abstract interfaces for doing the job that ContextFactory and
its stand-ins were performing.

Hostname verification in OpenSSLCertificateOptions requires that, when
validating the hostname against the certificate, feedback be provided to the
TLSMemoryBIOProtocol. Specifically, that the connection be terminated when
the verification of the service identity fails.

However, to get a reference to the TLSMemoryBIOProtocol to the
CertificateOptions, the info_callback (set on the OpenSSL.SSL.Context)
needed to be able to get it through the OpenSSL.SSL.Connection passed to it.
The only mutable field suitable for this purpose is the set_app_data /
get_app_data field.

The problem is that if we just started doing that everywhere, then applications
which had been working around this problem for themselves and might have been
using set_info_callback / set_app_data would break. So it is important to
only set those fields when the user requested something new in this
release, i.e. the hostname.

The branch which added the hostname specification feature did this poorly, by
having TLSMemoryBIOProtocol specifically check to see if its contextFactory
was specifically a CertificateOptions and had had hostname passed before
calling set_app_data with self. That was super gross, and made it
impossible to fix #4888, because twisted.web.client.WebClientContextFactory
was trying to be a good citizen and compose rather than inherit, and
composition breaks when you start putting isinstance checks in places.

So this branch corrects that factoring.

There are new interfaces, with new behavior, which are opt-in, but also specify
a more sensible API, delegating the construction of the
OpenSSL.SSL.Connection to the thing responsible for configuring the actual
connection, in other words, what we currently call a ContextFactory.

Since they are opt-in, there is, of course, backwards compatibility, so
existing objects with a getContext method will continue to work.

This means that OpenSSLCertificateOptions explicitly opts-in to the new
interface by using directlyProvidesonly when it knows that the user has
requested new behavior by passing along the hostname argument. In the
future, I'd like to transition this to being the default behavior (perhaps by
splitting up OpenSSLCertificateOptions into a couple of different
classes, one for clients and one for servers. It is, after all, nonsensical to
call methods like set_tlsext_host_name on a server.

Ultimately all of this behavior is documented, in the sense that it's all
pretty simple: twisted's TLS support has simply gone from calling getContext
to calling either clientConnectionForTLS or serverConnectionForTLS.
Connection creators that are appropriate only for servers or clients
respectively - and hopefully we'll have that distinction in the future - can
implement only the appropriate side of the connection. So it's important to
communicate which side of the conversation a connection object is for. I chose
two different interfaces with differently-named methods rather than a boolean
because the booelean is used somewhat inconsistently throughout Twisted.
Sometimes it's isClient, sometimes it's isServer, sometimes it's
isNormal. I really wanted something neutral where clients were not more
"true" than servers, and two methods involves less magic than, say, using
Constants.

(exarkun helped with this review)

Thanks to both of you. I hope the above clears up any confusion, and I'm happy
to put any of that into docstrings if you think it's helpful.

Yes, I definitely think some of these methods need docstrings. For example, serverConnectionForTLS should explicitly state in its documentation that it is a simple passthrough to clientConnectionForTLS. It also seems that the behavior of clientConnectionForTLS has a public effect (putting the protocol into the "app data" of the connection), so that should be documented.

I realized that you don't have documentation anywhere for the fact that OpenSSLCertificateOptions becomes an IOpenSSLClientConnectionCreator when hostname is passed. That should probably be explicitly documented, since it's not statically discernable.

So it's clear that server and clients have different requirements of SSL, but I'm not sure that the distinction you're making them is meaningful (in fact, it's clearly not in any of the implementations I've seen so far). The *construction* interface of the certificate options is certainly a place where you want a difference: clients want "hostname" and "bundle" and "client cert", servers want a certificate chain (or more, if you're using SNI). After the construction is done, you don't need different APIs for getting *connections*: they both just take no arguments and do the right thing to give you a connection object. So I am not seeing any good evidence for clientConnectionForTLS and serverConnectionForTLS to be different. It's reasonable for "connectionForTLS" to do different things and to return different objects from the different interfaces, but you don't need two different names for that method.

(you will need some different APIs between servers and clients for starting a connection with session resumption, but that would require different signatures entirely, so they will be different methods).

While reflecting on some of the factoring issues here, it occurred to me that the interface we have in trunk, with twisted.protocols.tls doing an isinstance check, there are gnarly user-visible consequences to this API which we probably do not want to live with long term. This branch is going to change that and give us an interface which does not expose that very unfortunate detail to applications. So sadly I think this needs to be a release blocker.

On the more fortunate side I should have something viable to finish this up within the next couple of hours.

example, serverConnectionForTLS should explicitly state in its documentation
that it is a simple passthrough to clientConnectionForTLS. It also seems that
the behavior of clientConnectionForTLS has a public effect (putting the
protocol into the "app data" of the connection), so that should be
documented.

I've added docstrings to all of those.

I realized that you don't have documentation anywhere for the fact that

OpenSSLCertificateOptions becomes an IOpenSSLClientConnectionCreator when
hostname is passed. That should probably be explicitly documented, since it's
not statically discernable.

While I was trying to address this point, I realized that the way that
hostname= was added to CertificateOptions required lots of
backwards-compatibility hacks and general god-object shenanigans. So instead I
changed the implementation strategy for #5190 to be a new function that
constructs a private type that implements onlyIOpenSSLClientConnectionCreator. The hostname attribute doesn't even live
on OpenSSLCertificateOptions any more; it's on a hidden type that only
implements the relevant interface. And does not implement
IOpenSSLServerConnectionCreator.

This involved completely re-tooling the documentation, so I'm sorry it's
exploded the size of the branch somewhat. On the plus side, ssl.rst is no
longer a huge rambling excuse for not verifying certificate information; it
begins with an explanation of why you might want to do TLS correctly and
then proceeds to just do things in more or less the right way.

However, I wrote a pretty significant portion of this documentation at 5AM, so
it may end in the middle of a sentence. I don't think I'm an effective
proofreader in this case so please pay special attention there.

So it's clear that server and clients have different requirements of SSL,

but I'm not sure that the distinction you're making them is meaningful (in
fact, it's clearly not in any of the implementations I've seen so far). The
*construction* interface of the certificate options is certainly a place
where you want a difference: clients want "hostname" and "bundle" and "client
cert", servers want a certificate chain (or more, if you're using SNI). After
the construction is done, you don't need different APIs for getting
*connections*: they both just take no arguments and do the right thing to
give you a connection object. So I am not seeing any good evidence for
clientConnectionForTLS and serverConnectionForTLS to be different. It's
reasonable for "connectionForTLS" to do different things and to return
different objects from the different interfaces, but you don't need two
different names for that method. (you will need some different APIs between
servers and clients for starting a connection with session resumption, but
that would require different signatures entirely, so they will be different
methods).

I want the method to break when it is called for the wrong side of the
connection, in the regular python way that things break, i.e. AttributeError.
For common applications, things should only implement one or the other. Some
weird peer-to-peer protocols might be able to re-use the same context factory
for client and server; in that case they could happily implement both
interfaces. However, if you implement one but not the other, someone
delegating to you shouldn't need to forward lots of magical Zope Interface
provider-checking junk just to make sure they're not doing anything unsafe.

same argument: “This may be a L{Certificate}” – is that true? shouldn’t this be a “bag of certs”? A single cert seldom makes sense except for cert pinning? please explain; couldn't check what it actually does because of the next point. :)

the doc string says it’s an adapter of old-style context classes into new connection creators. It should have a less generic name to reflect that. please give good reasons or fix.

it does that getContext() thing in the constructor and I assume that it’s because of the quick-check of parameters that has been removed from Port. That should be documented like it was there because it’s absolutely not clear why anybody would do that.

serverConnectionForTLS and clientConnectionForTLS: doc strings speak about SSL, in these cases it’s in fact OpenSSL connections

The “Twisted provides SSL support as a transport --- that is, as an alternative to TCP.” paragraph still speaks about SSL; probably because of the APIs that all have SSL in their name? I would suggest to speak consistently about TLS and excuse for the legacy API names.

I believe the :api:twisted.internet.ssl.CertificateOptions <twisted.internet.ssl.CertificateOptions> syntax is only required is you’re aliasing?

“Work is ongoing to make” uses hyphens.

maybe mention homebrew?

“*SSL* echo server and client”

“*SSL* echo server”

“listen for *SSL* traffic”

“TCP echo server *-* even going so far” has an hyphen.

“Assuming that you can buy your own *SSL* certificate from a certificate authority, this is a fairly realistic *SSL* server.”

“*SSL* echo client”

“It is invoked at an agreed-upon” has a stray, non-semantic linefeed.

"PrivateCertificate.options ," should have no space before the comma.

“a common requirement for *SSL* servers”

“The client creates an uncustomized CertificateOptions which is all that's necessary for an *SSL* client to interact with an *SSL* server.”

same argument: “This may be a L{Certificate}” – is that true? shouldn’t this be a “bag of certs”? A single cert seldom makes sense except for cert pinning? please explain; couldn't check what it actually does because of the next point. :)

Actually, this is true. The "bag of certs" functionality is implemented by the caCerts argument, and unfortunately there's no registered adapter from list to IOpenSSLTrustRoot. There probably ought to be, but that's not this ticket (and it's not documented).

the doc string says it’s an adapter of old-style context classes into new connection creators. It should have a less generic name to reflect that. please give good reasons or fix.

Just as a meta-review comment here, this should be in a separate section.
Making life easier for maintainers is always a good thing, but fretting about
the non-goodness of a private name is very much bikeshedding :).

Nevertheless, addressed.

it does that getContext() thing in the constructor and I assume that it’s because of the quick-check of parameters that has been removed from Port. That should be documented like it was there because it’s absolutely not clear why anybody would do that.

I've put that in the constructor's docstring.

serverConnectionForTLS and clientConnectionForTLS: doc strings speak about SSL, in these cases it’s in fact OpenSSL connections

TLSMemoryBIOFactory

_connectionCreator also talks about SSL instead of OpenSSL

_createConnection same for @return

Sure.

the 4888.bugfix top file is part of the branch

Removed.

docs/projects/core/howto/ssl.rst

This was a team effort; cyli dealt with the doc issues, so I can't speak to them directly. But I think she hit everything.

Oops, reviewing my comment there, I guess I didn't hit the bare except:. My plan to address that is to add a decorator like @_somePyOpenSSLVersionsAreBadAtErrorHandling around the whole function just for diagnostic value (to call log.err if it fails so we can see it and fix it), then change the actual verification logic to just be except VerificationError. It shouldn't impact the functional behavior of this branch much though.

Looks generally good to me! An assortment of minor nits and documentation bugs:

echoserv_ssl.py got updated to use react, but echoclient_ssl.py did not. Intentional? Additionally, the new ssl_clientauth_{client,server}.py files don't use react either.

ssl_clientauth_server.py does print(reason) without from __future__ import print_function.

Including public.pem seems redundant, considering server.pem has the certificate in it too.

In _sslverify.py: _hostnameASCII doesn't seem like a good name to me. It's a unicode string, so it's not really in ASCII encoding. It is encodable to ASCII, presumably, since it's decoded punycode. Unfortunately I can't come up with anything better.

I'm not sure what's going on with the topfile situation. You edit another branch's topfile, which seems odd but does also clarify which API should be used. There's no topfile for #7098, thouggh.

In ssl.rst:

"In a realistic server, it's very important that these names match; in a realistic client, they should always be the same." -- I'm not entirely sure what this is trying to convey. It probably should be reworded.

Perhaps the STARTTLS client/server should be broken out into separate listings files like the rest of the code.

The STARTTLS client/server reference a keys directory which used to be included in the documentation but got edited out. Maybe they should use server.pem like the rest of the examples?

The STARTTLS client also does print(...) without the __future__ import.

"The ciphers are activated by default, however it is necessary ..." reads awkwardly to me. "The ciphers are activated by default. However, it is necessary ..." ?

"CertificateOptions also supports the older SSLv3 protocol (which may be required interoperate with an existing service or piece of software), just pass OpenSSL.SSL.SSLv3_METHOD to its initializer:" is a comma splice! Two separate sentences please.

In test_sslverify.py: serviceIdentitySetup needs documentation for each new parameter.

In twisted.internet.interfaces: IOpenSSLClientConnectionCreator suggests using twisted.internet.ssl.CertificateOptions. Should this not be suggesting settingsForClientTLS?

In twisted.protocols.tcp: maybe TLSMemoryBIOFactory should be more explicit that IOpenSSL{Client,Server}ConnectionCreator are preferred to a ContextFactory, and that isClient determines which interface is checked?

Looks generally good to me! An assortment of minor nits and documentation bugs:

It would be nice if you had explicitly called out whether you felt that these
nits and bugs required re-review or not :). As you did not I am obliged to
conservatively assume that you did.

Although, oops, I just discovered YET ANOTHER SERIOUS SECURITY PROBLEM in this
branch (the default None trust root was not actually being changed to
peerTrust) so I would have had to do that anyway.

But I fixed it, with tests even. So EVERYTHING IS GREAT NOW and we can LAND THIS IMMEDIATELY, right?

echoserv_ssl.py got updated to use react, but echoclient_ssl.py did not. Intentional? Additionally, the new ssl_clientauth_{client,server}.py files don't use react either.

OK, fine. I'll update it to use react, and to use endpoints…

ssl_clientauth_server.py does print(reason) without from __future__ import print_function.

In this case I'm not going to change it. The __future__ import isn't always necessary, and if you don't need it, it's distracting (especially in example code where brevity is priceless); best is to just only ever print single expressions, which these examples do.

Including public.pem seems redundant, considering server.pem has the certificate in it too.

Yes, but I want to draw a very clear distinction between the certificates that
would, in real life, be safe to share around everywhere and the ones you have
to protect very carefully. In real life the client would not have access to
server.pem and someone trying to write a secure application shouldn't be
copying it around.

In _sslverify.py: _hostnameASCII doesn't seem like a good name to me. It's a unicode string, so it's not really in ASCII encoding. It is encodable to ASCII, presumably, since it's decoded punycode. Unfortunately I can't come up with anything better.

This is why it's got an underscore in front of it. We can always change it later.

I'm not sure what's going on with the topfile situation. You edit another branch's topfile, which seems odd but does also clarify which API should be used. There's no topfile for #7098, thouggh.

So, the other branch's topfile will make the release notes look wrong, since this adjusts the behavior added later so that the user-visible change is different.

I added a .misc to at least indicate that this bug is being fixed.

In ssl.rst:

"In a realistic server, it's very important that these names match; in a realistic client, they should always be the same." -- I'm not entirely sure what this is trying to convey. It probably should be reworded.

Oops, this was really trying to just talk about the client, I think the mention of the "server" was to evoke the DN present in the certificate itself.

I think the new wording is clear.

Perhaps the STARTTLS client/server should be broken out into separate listings files like the rest of the code.

I'm going to do this, because you're right, and I'm going to be spending some time on this anyway, but once again, it's important to split out required feedback where some code doesn't meet our standards and things that you've noticed which should be optional.

The STARTTLS client/server reference a keys directory which used to be included in the documentation but got edited out. Maybe they should use server.pem like the rest of the examples?

Yup. I moved them to the examples directory so that they can be just like the other examples, too.

"The ciphers are activated by default, however it is necessary ..." reads awkwardly to me. "The ciphers are activated by default. However, it is necessary ..." ?

Not going to touch this. I just moved that paragraph because I was editing the order of the document, I didn't actually change it at all.

"CertificateOptions also supports the older SSLv3 protocol (which may be required interoperate with an existing service or piece of software), just pass OpenSSL.SSL.SSLv3_METHOD to its initializer:" is a comma splice! Two separate sentences please.

Again: I didn't change this. But in this case, fixed anyway.

In test_sslverify.py: serviceIdentitySetup needs documentation for each new parameter.

Right you are.

In twisted.internet.interfaces: IOpenSSLClientConnectionCreator suggests using twisted.internet.ssl.CertificateOptions. Should this not be suggesting settingsForClientTLS?

Right you are. Good catch.

In twisted.protocols.tcp:

I think you mean twisted.protocols.tls.

maybe TLSMemoryBIOFactory should be more explicit that IOpenSSL{Client,Server}ConnectionCreator are preferred to a ContextFactory, and that isClient determines which interface is checked?

"@see: L{twisted.internet.ssl}" --- consider explaining why the reader
may want to follow that link eg "for a full overview of the Twisted
TLS APIs".

From reading the note "Before implementing this interface yourself,
consider using L{twisted.internet.ssl.CertificateOptions}." I expected
to find that t.i.s.CertificateOptions implemented this interface, but
it doesn't. Should it? If not then perhaps this comment is not clear.

I know this is a private module, but perhaps this function and
SimpleVerificationError should also be private since they are
implementation details and assigned to the already public attributes
verifyHostname and VerificationError.

Is there any way to document the dynamically assigned verifyHostname and
VerificationError attributes so that they appear in the API
documentation? Consider adding a paragraph to the module docstring
explaining how these two are assigned based on the availability and
version of pyOpenSSL and service_identity. Or perhaps that explanation
should be included in the module API docs for t.i.ssl.

_tolerateErrors

It's not clear to me why this wrapper is needed. Maybe add a note
explaining why this is needed in addition to the try:...except: in
"_identityVerifyingInfoCallback" below. eg What sort of errors would
you expect to be logged here? Why not just add this as a catchall
except clause in "_identityVerifyingInfoCallback"?

"@rtype: L{callable}" I've been told in the past
(ticket:6700#comment:6) that L{callable} is not a valid type. I'm not
sure what to think. Consider changing or removing that and other such
rtype docs.

_ClientTLSSettings

Consider adding a short supplement to "@see: L{settingsForClientTLS} "
explaining why the reader may want to follow that link.

_ClientTLSSettings

Please add docstrings for the "hostname" arg and the "hostnameBytes"
and "hostnameASCII" attributes.

Requiring unicode domain names does make sense (I think) but elsewhere
in twisted (and in pyOpenSSL by the look of it), domain names are
expected to be pre-encoded bytes. So it would be good to read about
the rationale and especially the need for "_hostnameASCII" which isn't
clear to me. Consider adding such an explanation to the class
docstring.

I also didn't really understand what _idnaBytes does, above and beyond
the encoding provided in the stdlib. Consider adding a little extra
documentation to that function too.

_ClientTLSSettings.clientConnectionForTLS

The argument name "protocol" doesn't match the "tlsProtocol" of
IOpenSSLClientConnectionCreator.clientConnectionForTLS. Consider
bringing the two in line so that a keyword argument can be supplied
which matches both interface and implementation.

_ClientTLSSettings._identityVerifyingInfoCallback

It looks like this callback may be called at various states of TLS
establishment. Maybe add a note in the docstring explaining when this
callback may be called and add a link to the API docs for
"set_info_callback" if there are any.

OpenSSLCertificateOptions

In t.i.ssl you export OpenSSLCertificateOptions with the more generic
name "CertificateOptions". And the API docs seem to refer to that
generic name. That indicates to me that we're trying to break the
association between CertificateOptions and the OpenSSL
implementation. So why not just create the CertificateOptions alias
directly in this module (and use it in this module)?

source:branches/tls-abstraction-7098/twisted/internet/ssl.py

No comments

source:branches/tls-abstraction-7098/twisted/test/test_sslverify.py

ServiceIdentityTests.serviceIdentitySetup

"@param validClientCertificate" --- the defaults don't really need
documenting since they are obvious from the function signature, but if
you do choose to document the defaults consider using bool values
rather than 'yes' and 'no'. Same for other parameters in this func.

'serverHostname.encode("idna")' --- should you use
_sslverify._idnaBytes here?

"self._tlsConnection = self.factory._createConnection(self)" --- isn't
this an abstraction violation too? Albeit within the same module. But
you're calling a private method on an object supplied by the caller of
this public API. Consider adding a note about that or raising a ticket if
_createConnection ought to be made public.

TLSMemoryBIOProtocol.failVerification

"@param reason" seems to be incomplete.

ContextFactoryToConnectionFactory

This seems to be about enforcing stricter checking for a deprecated
API. Is there a ticket for that deprecation? Consider adding a link to
that ticket here.

...and that is kind of confirmed by the new docs for TLSMemoryBIOFactory.init.

As I've pointed out to the previous couple of reviewers, you haven't really
made it clear whether this is optional or mandatory feedback.

I appreciate your attention to detail, and, as a change author, I don't want to
ever tell a reviewer that serious issues should be put aside because "this
needs to happen fast". But I feel compelled to point out that this is (A) a
serious security problem for every Twisted program using client-side TLS and
(B) now a release blocker due to interface-incompatibility issues. If there
were ever a ticket where we should be very deliberate about separating out
inessential concerns to future tickets, this would be the one ;-).

Please generally keep this in mind for future reviews.

Points:

Merge Conflicts (using combinator) although I note that buildbot seems to
have merged the branch successfully, using git.

Yeah, git can apparently resolve the tree conflicts automatically, so, I've
merged forward so you won't see that problem again.

"@see: L{twisted.internet.ssl}" --- consider explaining why the reader
may want to follow that link eg "for a full overview of the Twisted
TLS APIs".

From reading the note "Before implementing this interface yourself,
consider using L{twisted.internet.ssl.CertificateOptions}." I expected
to find that t.i.s.CertificateOptions implemented this interface, but
it doesn't. Should it? If not then perhaps this comment is not clear.

There's a gnarly compatibility issue here.

(Subclassing, kids. Not even once.)

Bascially, if we were to make CertificateOptions actually implement this
interface, then TLSMemoryBIOProtocol would start calling
serverConnectionForTLS, which means that an old subclass of
CertificateOptions would have its getContext method short-circuited. As it
happens, CertificateOptions doesn't need any of the new interfaces to do
its job server-side, and unlike with settingsForClientTLS there's no new API
yet where a client might indicate to this API that a surprise in how it
manipulates the context would be OK.

I'm going to document this as "historical reasons" and just add a new
server-side API later (one that uses our convenient high-level objects like
Certificate and not having this confusing level-mixing where some arguments
are pyOpenSSL things and some aren't).

IOpenSSLClientConnectionCreator

See comments above about @see and @note

I don't see how this applies in this case. It's referencing a function that
returns a regular provider of a documented interface.

That is an implementation detail; if someone were to look at this docstring,
they should see that it raises twisted.internet.ssl.VerificationError, which
is the publicly exported name.

However, I'll refer more specifically to that name to make sure the reference
resolves correctly.

I know this is a private module, but perhaps this function and
SimpleVerificationError should also be private since they are
implementation details and assigned to the already public attributes
verifyHostname and VerificationError.

They are private, they're in a private module. Do you mean they should
additionally start with _? Actually this comment reminded me to remove
underscore prefixes from some other stuff that shouldn't need to be
double-secret-extra-private, which may have been your opposite intended effect
:-).

Is there any way to document the dynamically assigned verifyHostname and
VerificationError attributes so that they appear in the API
documentation? Consider adding a paragraph to the module docstring
explaining how these two are assigned based on the availability and
version of pyOpenSSL and service_identity. Or perhaps that explanation
should be included in the module API docs for t.i.ssl.

Hmm. The markers already in __all__ are technically the way to take care of
this. Although the fact that they're dynamically assigned is going to throw pydoctor for a loop, so...

There's actually a decorator hiding in a different branch which could help with this; I filed #7216 for extracting it.

_tolerateErrors

It's not clear to me why this wrapper is needed. Maybe add a note
explaining why this is needed in addition to the try:...except: in
"_identityVerifyingInfoCallback" below.

I added a comment.

eg What sort of errors would you expect to be logged here? Why not just add
this as a catchall except clause in "_identityVerifyingInfoCallback"?

Bikeshed bikeshed bikeshed :-).

One reason it's a separate function is that it's a totally separate concern.
This is dealing with error-reporting bugs in pyOpenSSL,
_identityVerifyingInfoCallback is dealing with verifying identity.

Another reason it's separate is that it's supposed to deal with any possible
implementation issues, not just the verify callback, so maintenance of
_identityVerifyingInfoCallback should never touch this. Since no
unidentified exceptions are *actually* raised, it would be very tempting for a
maintainer to just rip out that try suite.

Finally, it's separate to facilitate testing. We want the identity callback
itself to be buggy, not just one particular line in it. Mocking out parts
of a function is not really possible.

"@rtype: L{callable}" I've been told in the past
(ticket:6700#comment:6) that L{callable} is not a valid type. I'm not
sure what to think. Consider changing or removing that and other such
rtype docs.

Well, core Python doesn't have abstract interfaces, but callability is an
abstract interface :-(. I think I disagree with the comment you're
referencing, not because it's wrong, but because there's simply nothing better
to say.

_ClientTLSSettings

Consider adding a short supplement to "@see: L{settingsForClientTLS} "
explaining why the reader may want to follow that link.

OK, I made this into just some text with a link instead of a @see.

_ClientTLSSettings

Please add docstrings for the "hostname" arg and the "hostnameBytes"
and "hostnameASCII" attributes.

Oh yeah, and _ctx and so on...

Requiring unicode domain names does make sense (I think) but elsewhere
in twisted (and in pyOpenSSL by the look of it), domain names are
expected to be pre-encoded bytes. So it would be good to read about
the rationale and especially the need for "_hostnameASCII" which isn't
clear to me. Consider adding such an explanation to the class
docstring.

I also didn't really understand what _idnaBytes does, above and beyond
the encoding provided in the stdlib. Consider adding a little extra
documentation to that function too.

I've attempted to do so.

_ClientTLSSettings.clientConnectionForTLS

The argument name "protocol" doesn't match the "tlsProtocol" of
IOpenSSLClientConnectionCreator.clientConnectionForTLS. Consider
bringing the two in line so that a keyword argument can be supplied
which matches both interface and implementation.

Done.

_ClientTLSSettings._identityVerifyingInfoCallback

It looks like this callback may be called at various states of TLS
establishment. Maybe add a note in the docstring explaining when this
callback may be called and add a link to the API docs for
"set_info_callback" if there are any.

OpenSSLCertificateOptions

In t.i.ssl you export OpenSSLCertificateOptions with the more generic
name "CertificateOptions". And the API docs seem to refer to that
generic name. That indicates to me that we're trying to break the
association between CertificateOptions and the OpenSSL
implementation. So why not just create the CertificateOptions alias
directly in this module (and use it in this module)?

Not a new problem in this branch; not going to address it.

Just so you know though, the fact that the references to this thing all refer
to the generic name are because that's where pydoctor thinks it is, and
L{OpenSSLCertificateOptions} will fail. I'm pretty sure that attempts to
alias it in other ways will fail too.

source:branches/tls-abstraction-7098/twisted/test/test_sslverify.py

ServiceIdentityTests.serviceIdentitySetup

"@param validClientCertificate" --- the defaults don't really need
documenting since they are obvious from the function signature, but if
you do choose to document the defaults consider using bool values
rather than 'yes' and 'no'. Same for other parameters in this func.

I preferred to use 'yes' and 'no' because I am documenting the answer to the
question posed rather than the specific value.

'serverHostname.encode("idna")' --- should you use
_sslverify._idnaBytes here?

Yeah, you're right. And I added an inverse _idnaText operation so that the
tests never have the string "idna" in them.

I don't think that advice applies. These tests all failed pretty cleanly as I
was writing them, and there are multiple assertions in there because they are
asserting about two parts of the connection being lost. I might refactor these
tests later to

source:branches/tls-abstraction-7098/twisted/protocols/tls.py

TLSMemoryBIOProtocol.makeConnection

"self._tlsConnection = self.factory._createConnection(self)" --- isn't
this an abstraction violation too? Albeit within the same module. But
you're calling a private method on an object supplied by the caller of
this public API. Consider adding a note about that or raising a ticket if
_createConnection ought to be made public.

_createConnection oughtn't be public. Objects in the same module can be
"friends", I think :-). If I were doing all of this from scratch I probably
would have eschewed the self.factoryWrapperProtocol idiom, and had
meaningful attributes passed to the constructor, but working within the
existing structure here I think that this is fine.

TLSMemoryBIOProtocol.failVerification

"@param reason" seems to be incomplete.

Good catch, fixed.

ContextFactoryToConnectionFactory

This seems to be about enforcing stricter checking for a deprecated
API. Is there a ticket for that deprecation? Consider adding a link to
that ticket here.

It's not really about "stricter checking"; it's just the implementation
mechanism so we have one thing to remove when we remove the old interfaces
rather than semaring the compatibility code all inside the implementation. The
implementation talks to the interface it wants to talk to, and this wraps it up
so it still supports the old version.

_idnaText says it returns unicode but calls .encode('ascii') in the idna branch. Which type is correct?

I also get a bunch of errors building the docs:

/Users/habnabit/src/git/Twisted/docs/core/howto/ssl.rst:141: WARNING: Include file u'/Users/habnabit/src/git/Twisted/docs/core/examples/starttls_server.py' not found or reading it failed
/Users/habnabit/src/git/Twisted/docs/core/howto/ssl.rst:148: WARNING: Include file u'/Users/habnabit/src/git/Twisted/docs/core/examples/starttls_client.py' not found or reading it failed
/Users/habnabit/src/git/Twisted/docs/core/howto/ssl.rst:174: WARNING: Include file u'/Users/habnabit/src/git/Twisted/docs/core/examples/ssl_clientauth_server.py' not found or reading it failed
/Users/habnabit/src/git/Twisted/docs/core/howto/ssl.rst:183: WARNING: Include file u'/Users/habnabit/src/git/Twisted/docs/core/examples/ssl_clientauth_client.py' not found or reading it failed
/Users/habnabit/src/git/Twisted/docs/core/howto/ssl.rst:139: WARNING: download file not readable: /Users/habnabit/src/git/Twisted/docs/core/examples/starttls_server.py
/Users/habnabit/src/git/Twisted/docs/core/howto/ssl.rst:146: WARNING: download file not readable: /Users/habnabit/src/git/Twisted/docs/core/examples/starttls_client.py
/Users/habnabit/src/git/Twisted/docs/core/howto/ssl.rst:172: WARNING: download file not readable: /Users/habnabit/src/git/Twisted/docs/core/examples/ssl_clientauth_server.py
/Users/habnabit/src/git/Twisted/docs/core/howto/ssl.rst:181: WARNING: download file not readable: /Users/habnabit/src/git/Twisted/docs/core/examples/ssl_clientauth_client.py

I think this is related to the merging forward; these files got put in the wrong place.

My primary concern at this point is the lack of unit testing coverage. I did run coverage.py over the code and it looks like the new code is covered well. It's just not all covered very directly.

If you feel fine about it, and once the above points are addressed (which are simple things), I'd say at this point that the branch is good to land. It's gone through a number of different reviewers and I feel their feedback has been satisfactorily responded to.