Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

allow domains of length 254 with a trailing dot #73

Open
wants to merge 4 commits into
base: main
Choose a base branch
from

Conversation

zacknewman
Copy link

Fixes #72.

The test I added actually had length of 255.
Forgot to change my local rust version to the MSRV.

if input.len() > MAX_NAME_LENGTH {
return Err(InvalidDnsNameError);
/// The maximum lenght of a domain name is 253 when there is no trailing dot.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you have chapter and verse from the relevant RFCs on this?

Copy link
Author

@zacknewman zacknewman Feb 8, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Depends on what RFC you choose to adhere to. The complicated thing about this is there are so many variations of "representation format". The comment in code states that RFC 1035 is adhered to with the added allowance of underscores. First, does that comment mean § 2.3.1 (i.e., "preferred name syntax")? If so, then the code is wrong for allowing trailing dots at all. If that is indeed the goal, then the length check is fine; however an additional check that verifies there is not a trailing dot needs to be added.

Copy link
Author

@zacknewman zacknewman Feb 8, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The issue this relates to explains the problem. 255 is the maximum length a domain is allowed to be per § 2.3.4, but that refers to the wire format. When dealing with a representation format, you have to calculate what the wire-format length is from it. For "simple" representation formats (i.e., where you don't de-escape "characters" and where each "character" is encoded in 1 byte), the algorithm is as simple as adding 2 to the length when the domain does not contain a trailing dot or adding 1 when it does. From that, the max length of the former is clearly 253 since 253 + 2 = 255; meanwhile the max length of the latter is 254 since 254 + 1 = 255.

Copy link
Author

@zacknewman zacknewman Feb 8, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Honestly, this PR only addresses the incorrect/inconsistent treatment of trailing dots. Personally, I think using "preferred name syntax" is far too restrictive. The fact the code adds additional "characters" (specifically underscores) is indicative of this restriction. There is no real gain by being so restrictive since there are real domains that can never be accessed by clients that use rustls since rustls will incorrectly reject them as being invalid.

On the most extreme side, you could allow all ASCII domains. That is to say all domains whose labels consist of any and all ASCII sans . (since that is used as the label separator). A less extreme case would be allowing all "printable" ASCII. Or use what Servo's IDNA crate recommends and allow domains that conform to WHATWG's URL standard. All of those are better since you will allow rustls to be used by clients to access more valid domains than it incorrectly rejecting them. This is a tangential problem though; although answering the question "what domains in representation format do we want to allow?" is necessary to solve both the length bounds and the allowed "characters".

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So what problem are you actually solving here? It all seems pretty theoretical to me.

Copy link
Author

@zacknewman zacknewman Feb 8, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@djc,

My actual issue is that rustls does not allow one to customize how DnsName parses domains in representation format. This means you literally cannot use rustls as a client-side TLS software suite if you ever wanted to access domains that it does not deem valid but in fact are. If you don't want to modify how DnsName parses domains in representation format, can we add another method/trait that defers to downstream code? I want my code to accept more domains that rustls doesn't, but ServerName does not allow me to pass a DnsName that is valid because there is no way for me to construct it without relying on DnsName::try_from_str. For example, if Firefox ever wanted to use rustls for its TLS-handling, it literally couldn't without sacrificing the ability to access certain domains that it can now (e.g., Firefox happily accesses domains that contain ; in a label).

I would like to construct a DnsName that is a valid domain whose labels consist of any printable ASCII (sans space and .) or perhaps at a minimum what WHATWG says is allowed (and thus what popular web browsers accept since I don't know why rustls shouldn't accept something that web browsers do) since those are valid domains per RFC 1035 (but of course may not be valid per the section on "preferred syntax" which rustls has already (partially) deviated from), but I cannot since DnsName::try_from_str is overly restrictive. This means changing its behavior to allow more domains, live with the fact that my code is overly restrictive and can't be used to access legitimate domains, add additional functionality that defers to downstream code that allows for more valid domains, or use native-tls or other crate that does what I want.

Clearly, this PR is one part of the "change its behavior to allow more domains". I was planning to create another "issue" asking if we could allow more characters; but if there is push back on something as simple (and "theoretically"/technically correct) as this, then surely there will be push back on allowing more characters (e.g., printable ASCII). Thus perhaps adding another associated method/trait that allows calling code to dictate what is valid would be the best approach.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This means you literally cannot use rustls as a client-side TLS software suite if you ever wanted to access domains that it does not deem valid but in fact are.

Do you have a practical example of such a name -- specifically one that is allowed by RFC4985, issued in a certificate under cabforum BR 7.1.2.7.12, etc?

DnsName::try_from_str does not allow " " which is a valid domain per that section.

To be clear the DnsName type we have here exists to support the ServerName type, and ServerName contains the name or address for a TLS server from the view of a client. Therefore, there is little to be gained to allow a client to express not-useful values like " ", even if they are technically allowed and useful in DNS software.

I'm definitely open to updating references here to (probably) refer to https://www.rfc-editor.org/rfc/rfc9499.html#name-names -- a BCP which reflects modern usage better than RFC1035

Copy link
Author

@zacknewman zacknewman Feb 8, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you have a practical example of such a name -- specifically one that is allowed by RFC4985, issued in a certificate under cabforum BR 7.1.2.7.12, etc?

There was no mention of that RFC in the code which is why I keep citing RFC 1035. If the goal is to allow only those domains allowed by RFC 4985, then code should state that and importantly the validation function should be using it to determine what is a valid domain and not "preferred syntax" of RFC 1035 with the added allowance of underscores, added restriction that " " is forbidden, and unless this PR is accepted, an inconsistent stance on when trailing dots are allowed: normally yes despite violating "preferred syntax", but not allowed when the domain has length 254 despite being valid per RFC 1035. Since I'm not familiar with that RFC, I'll have to get back to you. I will say rustls should support "private" deployments. Thus if I can construct an X.509 v3 certificate for a domain that I control, I can create the appropriate AAAA/A record for said domain, and other very popular software (e.g., Firefox) have no problem accessing said domain and handling said certificate, then rustls should be able to as well, right? Point being, "practical example" is doing a lot of heavy-lifting. Do you require me to use a "popular" CA (e.g., Let's Encrypt) to issue the certificate?

Tangential, with popular CAs like Let's Encrypt planning to issue X.509 v3 certificates to IP addresses, do you know if rustls already/will support such certs?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A cursory read of RFC 4985 seems to suggest that IP addresses are not allowed; thus rustls intends to not support such certs despite popular CAs like Let's Encrypt issuing them?

Copy link
Member

@cpu cpu Feb 8, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@briansmith
Copy link

For example, if Firefox ever wanted to use rustls for its TLS-handling, it literally couldn't without sacrificing the ability to access certain domains that it can now (e.g., Firefox happily accesses domains that contain ; in a label).

Hi, I wrote both Firefox's certificate DNS name parsing code and also (IIUC) this pki-types code. You can see the current Firefox source code with the same "253" limit here: https://searchfox.org/mozilla-central/source/security/nss/lib/mozpkix/lib/pkixnames.cpp#1916

Also, you are discussing the IP address stuff with the people who implemented IP address support in this library and/or for Let's Encrypt, just FYI.

@briansmith
Copy link

I would like to construct a DnsName that is a valid domain whose labels consist of any printable ASCII (sans space and .) or perhaps at a minimum what WHATWG says is allowed

It is intentional that the WHATWG how-to-deal-with-URLs-in-HTML specification is not used to decide how to parse domain names in certificates in either Firefox or any of these Rust libraries. I believe it is true that there are certain domain names that Firefox and other browsers will try to connect to, but they've become pretty uniform regarding rejecting certificates with names that differ from what's enforced here. So, there may be some domain names that you could connect to using http:// but not https:// because certificate validation would always fail. And, also, the CA would immediately get in trouble for issuing certificates with those domains, if it is a public CA.

@zacknewman
Copy link
Author

zacknewman commented Feb 9, 2025

Hi, I wrote both Firefox's certificate DNS name parsing code and also (IIUC) this pki-types code. You can see the current Firefox source code with the same "253" limit here: https://searchfox.org/mozilla-central/source/security/nss/lib/mozpkix/lib/pkixnames.cpp#1916

Interesting. So you purposefully allow trailing dots for domains up to length 253 but not for a domain of length 254? What is the reason for that? As the issue shows, a lot of actual domain-related software allow trailing dots consistently including for domains of length 254. I cannot find a single "official" RFC or spec that states how trailing dots are handled in this code (and apparently Firefox's certificate DNS name parsing code) is valid. The RFCs and other standards I am familiar with either forbid trailing dots (e.g., the RFC that is cited in this code) or allow them.

When I was talking about Firefox, I was talking about how Firefox access domains and what domains are allowed but not from a certificate enforcement perspective.

Also, you are discussing the IP address stuff with the people who implemented IP address support in this library and/or for Let's Encrypt, just FYI.

And? I never said the "IP address stuff" was wrong. As you can see, I prefaced my question with "tangential ...". I was genuinely curious what the process was for IP address handling since I was previously told in this thread that I should be using RFC 4985; however I (correctly) pointed out that RFC forbids IPs, so it doesn't seem correct to use that RFC as a guide of what to allow since that would prevent handling certificates issued to IPs. Not sure what you are getting it at with this point. I also "thumbs upped" the response I was given from the person I'm assuming you are referring to added the support to Let's Encrypt.

@zacknewman
Copy link
Author

zacknewman commented Feb 9, 2025

It is intentional that the WHATWG how-to-deal-with-URLs-in-HTML specification is not used to decide how to parse domain names in certificates in either Firefox or any of these Rust libraries. I believe it is true that there are certain domain names that Firefox and other browsers will try to connect to, but they've become pretty uniform regarding rejecting certificates with names that differ from what's enforced here. So, there may be some domain names that you could connect to using http:// but not https:// because certificate validation would always fail. And, also, the CA would immediately get in trouble for issuing certificates with those domains, if it is a public CA.

So rustls only supports "public CA"s? That's very unfortunate. If it is true that most browsers won't process X.509 v3 certificates issued to domains that are rejected by DnsName::try_from_str, then I suppose the CA being public does not matter. I haven't had time to test this claim though. I'm going by my experience of what domains are considered valid by a lot of software and official standards and RFCs, but only from an IP address resolution point-of-view.

If I am unable to get Chromium or Firefox to properly process an X.509 v3 certificate issued to an "invalid domain" per DnsName::try_from_str, then I suppose it does not matter that more characters are supported. I still claim that not allowing domains with a trailing dot when they are length 254 is incorrect (while allowing a trailing dot for shorter domains). I would love to see an official source that says otherwise.

@briansmith
Copy link

If I am unable to get Chromium or Firefox

I would say that Chromium probably is setting the upper bound on how lax they would be for non-public CA certificates, as Chrome and Edge actually have "enterprise" customers that they care about. You can look at the source code in Chromium to see how they configure their certificate verifier (which I think is in BoringSSL now). I doubt there is any appetite to get any laxer than that.

@zacknewman
Copy link
Author

zacknewman commented Feb 9, 2025

I would say that Chromium probably is setting the upper bound on how lax they would be for non-public CA certificates, as Chrome and Edge actually have "enterprise" customers that they care about. You can look at the source code in Chromium to see how they configure their certificate verifier (which I think is in BoringSSL now). I doubt there is any appetite to get any laxer than that.

I am curious about what "security" benefit there is to being so strict about what is deemed a "valid" domain. If I have a valid domain per RFC 1035 (not necessarily the "preferred syntax" section though) and I have a valid private key that corresponds to a valid X.509 v3 certificate that is signed by a CA that transitively links to a trusted CA cert, then why should it matter what is in the common name or subject alternative name if it matches the domain the certificate is being sent from? I would argue it's better to be more relaxed as you can simplify the parsing code which itself has benefits from a security perspective since there is a lower probability of having a bug. Not to mention, shouldn't we strive to be more inclusive about what services can utilize TLS and other X.509-based "technologies"? It's a good thing that public CAs like Let's Encrypt (will) support IP-based certificates since that is more services that can gain the benefits of TLS. Very bizarre to me to play "gatekeeper" and even more bizarre that the domains that have been "blessed" is based on a somewhat arbitrary algorithm seeing how neither RFC 1035 "preferred syntax" nor RFC 4985 is being adhered to.

I realize downstream applications may not be able to communicate with a service that uses certain domains or certain CAs will refuse to sign a certificate from certain domains, but that should be an application-based decision. Perhaps a more specific question is better: what are the security drawbacks of being able to process an X.509 v3 certificate issued to "foo-.com"?

@djc
Copy link
Member

djc commented Feb 11, 2025

Section 3 of RFC 6066 includes this language:

"HostName" contains the fully qualified DNS hostname of the server, as understood by the client. The hostname is represented as a byte string using ASCII encoding without a trailing dot.

As such, we could instead add a check that the given string does not include a trailing dot, but increasing the allowed length to accomodate a trailing dot is clearly incorrect for the stated purpose of representing a RFC 6066 HostName (which is what this type is intended for).

@zacknewman
Copy link
Author

zacknewman commented Feb 11, 2025

Section 3 of RFC 6066 includes this language:

"HostName" contains the fully qualified DNS hostname of the server, as understood by the client. The hostname is represented as a byte string using ASCII encoding without a trailing dot.

As such, we could instead add a check that the given string does not include a trailing dot, but increasing the allowed length to accomodate a trailing dot is clearly incorrect for the stated purpose of representing a RFC 6066 HostName (which is what this type is intended for).

That's now the third RFC I've been told this code is supposed to adhere to. Regardless, I love that RFC. That's the level of permissiveness I've been vouching for. RFC 6066 states that any ASCII domain that is not of the format of an IPv6 or IPv4 address is allowed. That's such an easy implementation too! Verify the last u8 is not b'.', split on b'.', verify u8::is_ascii for each u8, verify each label has length inclusively between 1 and 63, verify the total length does not exceed 253, and verify it's not also a valid IPv6 or IPv4 address (e.g., IpAddr::from_str fails). I do think you should re-consider the stance of forbidding a trailing dot though for three reasons:

  • rustls already treats domains with a trailing dot as equivalent to one without when matching what is in the X.509 v3 certificate, so I don't think it hurts to keep allowing a trailing dot
  • Trailing dots exist in the wild and can even be beneficial since it prevents a "search domain" from being appended
  • Disallowing a trailing dot is a breaking change in that domains that are currently successfully parsed by DnsName::try_from_str may no longer be

I understand the idea of forbidding a trailing dot since it's easier to justify (i.e., this code implements RFC 6066 with no exceptions). Also calling code can always trim it if they want to allow it, but it's still easy to justify the implementation as "RFC 6066- conforming with the amendment that a trailing dot is also allowed". In the case you do allow a trailing dot, then the length check needs to be such that 253 is the max iff there is no trailing dot; otherwise 254 is the max length allowed.

If you would like me to submit a PR that correctly and consistently implements RFC 6066, then I'd be happy to.

@briansmith
Copy link

When the user types "https://example.org." in the address bar, the client should give the certificate verification library the string "example.org." as the domain name. The certificate verification library should match a dNSName "example.org" in a certificate
with "example.org." as though it were "example.org". That is, for the purposes of certificate verification, the trailing dot in the reference identifier is ignored, even though it isn't ignored for cookie handling and other things.

The certificate verifier should reject a certificate with a dNSName of "example.org." as trailing dots in dNSName are not allowed, IIRC.

@zacknewman
Copy link
Author

zacknewman commented Feb 11, 2025

When the user types "https://example.org." in the address bar, the client should give the certificate verification library the string "example.org." as the domain name. The certificate verification library should match a dNSName "example.org" in a certificate with "example.org." as though it were "example.org". That is, for the purposes of certificate verification, the trailing dot in the reference identifier is ignored, even though it isn't ignored for cookie handling and other things.

The certificate verifier should reject a certificate with a dNSName of "example.org." as trailing dots in dNSName are not allowed, IIRC.

I think we should be careful about basing our decisions on what browsers do since there are other clients out there as well (e.g., this whole PR was motivated by TLS support when connecting to a database server). Regardless, I don't feel strongly about forbidding or allowing a trailing dot (so long as it's consistent (e.g., if some are allowed trailing dots, then 254-character domains should be allowed too)).

@ctz
Copy link
Member

ctz commented Feb 11, 2025

As such, we could instead add a check that the given string does not include a trailing dot, but increasing the allowed length to accomodate a trailing dot is clearly incorrect for the stated purpose of representing a RFC 6066 HostName (which is what this type is intended for).

Note that rustls already strips a trailing period if needed to meet this clause of RFC6066.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

DnsName::try_from_str incorrectly disallows domains of length 254 that have a trailing .
5 participants