Since this was mentioned to me at IETF 101, I managed to find the time to
look it up and review. Several design decisions have left me confused; most
notably the notion of a call-out to HTTPS in the first place. Much of the
document is unclear to me, despite my background in both Internet Mail and
the use of PKI in application protocols, and it appears to willfully
ignore prior art in this area.
1) A History Lesson
First, some history: In XMPP-land, we faced a similar issue with
large-scale providers (most notably Google) not wishing to host or manage
the certificates for their customers, alongside an assertion that customers
would not wish to provide their private keys to their provider. Despite the
strong evidence to the contrary in the case of HTTPS, the community
nevertheless developed POSH (RFC 7711), and did so in a protocol-neutral
way. In POSH, an entity fetches a well-known document from an HTTPS server
in order to securely obtain information with which to validate a
certificate by means other than building a traditional chain etc.
Notably, this was against a backdrop of CAs which were not free, and
generally quite expensive - this has changed markedly over the years since,
and I should note that POSH's support for self-signed certificates is
probably no longer relevant.
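To make the POSH mechanism concrete, here is a minimal sketch of the check a
client performs once it has fetched the well-known document. It assumes the
RFC 7711 layout as I recall it (a "fingerprints" array of objects keyed by
hash name, with base64 values computed over the DER certificate); the
function names are mine, and the "url" delegation form and expiry handling
are deliberately left out.

```python
import base64
import hashlib
import json


def posh_url(domain: str, service: str) -> str:
    """Build the well-known POSH location for a domain and service name."""
    return f"https://{domain}/.well-known/posh/{service}.json"


def cert_matches_posh(cert_der: bytes, posh_document: str) -> bool:
    """Return True if the SHA-256 fingerprint of the presented certificate
    is listed in the POSH document (given here as already-fetched JSON text)."""
    doc = json.loads(posh_document)
    # POSH fingerprints are base64 encodings of the raw digest.
    digest = base64.b64encode(hashlib.sha256(cert_der).digest()).decode("ascii")
    return any(fp.get("sha-256") == digest for fp in doc.get("fingerprints", []))
```

The point of the design is that the HTTPS fetch carries the trust, so the
certificate presented on the application protocol need not chain to anything.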
However, it surprises me that the MTA-STS draft does not appear to note
this prior art at all, and this makes me wonder whether it was even on the
radar.
Importantly, POSH was never deployed very heavily - I can find only one
deployment (and "most users opt to just give us the cert"). This was in
part because of Google's withdrawal from standards-based IM, which
liberated the community from having to support a use-case only Google
really felt was important, and in part because of the rise of free CAs,
which avoided the cost issue associated with traditional PKIX.
For reference, the XMPP community has a high penetration of DANE records
(around 10% of the self-selected group who test their servers through
community tooling) and a very high penetration of CA-signed certificates
(mostly Let's Encrypt).
2) HTTPS Call-out
Given the policy is essentially trust-on-first-use, it's not clear to me
why much of the STS policy cannot be transferred within SMTP itself,
perhaps in response to the EHLO issued after STARTTLS completes. In-band
delivery is good enough for HTTP's own STS variant (which carries its policy
in a response header), and feels intuitively simpler for MTAs to implement.
3) DNSSEC or not?
The MTA-STS problem is reasonably well-defined in the document - SMTP
servers often do host numerous domains, and unfortunately operating one's
own server has become a rarity, so domains are concentrated on a few
servers. STARTTLS is, as the abstract notes, theoretically susceptible to a
downgrade attack - but exploiting it requires either an active MITM or
jumping through some fairly tight hoops.
The draft then goes on to compare the solution to DANE, and notes that
DNSSEC is not required with MTA-STS - "at a cost of risking malicious
downgrade attacks". These would be performed by DNS spoofing, which has a
known history of occurring. In any case, what is distinctly unclear to me
is whether MTA-STS without DNSSEC is materially different from RFC 7672
without DNSSEC; if unsecured DNS is "good enough" for MTA-STS, my immediate
question is whether it might therefore be good enough for at least some
cases of DANE.
I'd note that in RFC 7672, the presence of a (DNSSEC-secure) TLSA record is
sufficient to mandate TLS - hence my question is whether an insecure TLSA
record stipulating a particular trust anchor and/or valid certificate
(PKIX-TA and PKIX-EE) might be sufficient to meet the same security
requirements here.
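For readers less familiar with TLSA, a small sketch of the record fields in
question may help. It parses the presentation form of a TLSA RDATA string
and names the certificate usage; the numeric codes are the registered ones,
but the class and function names are my own, and nothing here says how an
MTA ought to treat a record that arrived without DNSSEC.

```python
from dataclasses import dataclass

# Certificate usage acronyms for the first TLSA field.
USAGE_NAMES = {0: "PKIX-TA", 1: "PKIX-EE", 2: "DANE-TA", 3: "DANE-EE"}


@dataclass(frozen=True)
class Tlsa:
    usage: int          # see USAGE_NAMES
    selector: int       # 0 = full certificate, 1 = SubjectPublicKeyInfo
    matching_type: int  # 0 = exact match, 1 = SHA-256, 2 = SHA-512
    data: bytes         # certificate association data


def parse_tlsa(rdata: str) -> Tlsa:
    """Parse TLSA presentation format, e.g. '0 1 1 <hex digits>'."""
    usage, selector, matching_type, *hex_parts = rdata.split()
    return Tlsa(int(usage), int(selector), int(matching_type),
                bytes.fromhex("".join(hex_parts)))


def relies_on_public_ca(record: Tlsa) -> bool:
    """PKIX-TA and PKIX-EE still require a chain to a public trust anchor."""
    return record.usage in (0, 1)
```

The two PKIX usages are the interesting ones for this question, since they
constrain an ordinary CA-issued chain rather than replace it.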
4) Wildcard on Wildcard Action.
It is deeply unfortunate that MTA-STS mandates a name match based on
dNSName SANs only. I would have thought that emulating an SRV, and matching
a corresponding sRVName, would be more useful - and overall, the idea that
a new matching algorithm has been included so as to match an "mx pattern"
to a dNSName wildcard just feels like an exploit waiting to happen. It
would feel considerably safer to do one of:
a) Make matches operate the same way as DANE, by being based on hashes of
SubjectPublicKeyInfo and/or the complete certificate. (Similar to POSH's
approach of a certificate hash).
b) Make matches operate the same way as RFC 6125 (unreferenced, I note).
c) Both/either of the above.
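As a rough illustration of options (a) and (b), the sketch below shows how
little logic each needs. The first function is a deliberately conservative
subset of RFC 6125 matching: the reference identifier is always a concrete
name, and a wildcard is honoured only as the entire left-most label of the
presented identifier. The second compares a SHA-256 digest of the whole
certificate; hashing the SubjectPublicKeyInfo instead would need a
certificate parser, which I have left out. Function names are mine.

```python
import hashlib


def matches_rfc6125(reference: str, presented: str) -> bool:
    """Match a concrete reference name against a presented dNSName."""
    if "*" in reference:
        return False  # the reference identifier is never itself a pattern
    ref = reference.lower().rstrip(".").split(".")
    pres = presented.lower().rstrip(".").split(".")
    if len(ref) != len(pres):
        return False  # a wildcard stands for exactly one label
    if pres[0] == "*":
        # Refuse overly broad names such as "*.com" (a choice made here).
        return len(pres) > 2 and ref[1:] == pres[1:]
    return ref == pres


def matches_certificate_digest(cert_der: bytes, expected_sha256_hex: str) -> bool:
    """Option (a), full-certificate form: compare SHA-256 over the DER bytes."""
    return hashlib.sha256(cert_der).hexdigest() == expected_sha256_hex.lower()
```

Neither function ever compares one pattern against another, which is the
property I would like the draft to preserve.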
I assume the logic behind allowing a wildcard-to-wildcard match is to allow
a Google user to simply specify ".googlemail.com" and ".l.google.com" as
their MX identity patterns; however it feels as though Google could simply
use a known name within the certificate for all their receiving MTAs,
irrespective of the DNS records involved. The wildcard approach, of course,
presupposes that the administrator of the mail domain somehow does not know
the precise names of the MTAs used in their own DNS records.
I further assume the logic in mandating matches against dNSName SANs is
because these are readily available; however sRVName SANs, by restricting
their use to a particular service, have the advantage that customers might
find handing these over to their service provider more acceptable.
5) Terminology and Nomenclature.
It is well-known that naming things remains the hardest problem in
technology.
However, this draft appears to have taken bold strides in demonstrating
that coming up with new names for things needn't be so challenging.
Take §7.1, for example, which dictates that the SNI extension MUST contain
the "MX hostname" - this latter term does not appear anywhere else in the
document. I'm going to guess that it means the RHS of the MX record, in the
sense used by RFC 7672 (an informative reference). "MX host", which appears
once in RFC 5321, also appears elsewhere in this draft, including in §1.1,
where it is in this definition:
   o MTA-STS Policy: A commitment by the Policy Domain to support PKIX
     [RFC5280] authenticated TLS for the specified MX hosts.
Impressively, by my reading, the commitment is for the Policy Domain to
support PKIX for all SMTP, and not just for the specified hosts.
Using more common - and more uniform - terminology would be of huge
benefit: "Sending MTA", "Receiving MTA", and so on are well-known terms. If
a new term is needed, please do define it. If you mean to use terms from
other RFCs, these need to be Normative References and noted in the
Terminology section.
6) Reference Identifiers and derivation.
RFC 6125 provides a slew of terminology and best practice - from the same
UTA working group, as I recall. RFC 7672 also provides terminology and
specifies much of the relevant behaviour.
It feels as though this draft should at least attempt to use the same
terminology, and ideally the same behaviour, as RFC 7672 (and RFC 6125).
This is particularly noticeable in the difference between the reference
identifiers used within RFC 7672 (and used within the SNI discussed there)
and those in this draft; compare, for example, this draft's §7.1 with
RFC 7672 §8.1.
7) Trust Anchors
RFC 7672, in Section 1.3.4, suggests that MTAs cannot rely on a set of
common trust anchors. While I'm not actually convinced this is really the
case, I find it odd that on the one hand we have a consensus
standards-track document that makes this assertion, yet on the other this
draft makes - implicitly - the opposite assertion.
It would be useful to understand if circumstances have changed.
Dave.