> They're obviously different things. The what that is working is not necessaril...

quesera · on Dec 27, 2017

As an aside, is there a name for this purposeful perspective of strict literalism?

I say "purposeful" because -- while you're obviously knowledgeable about the subject matter -- it can't have failed to occur to you that this approach cannot succeed outside of a very structured context.

(Discussion of whether our conversation was within or without that context omitted, though it might have been the only important discussion possible)

Is this a subcategory of the formal language / langsec efforts? Just standard standards-writing practice? Something else?

zAy0LfpBZLC8mAC · on Dec 27, 2017

> As an aside, is there a name for this purposeful perspective of strict literalism?

Correctness?

> I say "purposeful" because -- while you're obviously knowledgeable about the subject matter -- it can't have failed to occur to you that this approach cannot succeed outside of a very structured context.

Erm ... which is why I am applying it to the extraordinarily structured context of formal languages, protocol specifications, and computer software?!

> Is this a subcategory of the formal language / langsec efforts? Just standard standards-writing practice? Something else?

I would say the langsec efforts are an attempt to raise awareness that sloppy thinking about semantics is the root of a major proportion of vulnerabilities, to establish a label for this problem, and to try to establish some sort of best practices for avoiding such problems. Good standards-writing for protocols is, of course, extremely literal, as protocol implementations necessarily will be. Any ambiguity in the standard will result in interoperability problems and possibly vulnerabilities, and in the long run in unnecessary complexity as people try to plaster over the differences in interpretation between implementations to improve interoperability, thus increasing the probability of vulnerabilities even further.

quesera · on Dec 27, 2017

But you're not correct. You're doggedly and dogmatically wrong, in the only context that matters, which is the one in which this conversation was spawned.

Again, if for the purposes of the spec, you want to apply the label "FQDN DNS name" to "8.8.8.8", that's fine and great. You can also call it a "finalized mapping token" (which has the advantage of being literally correct), or a "turtle" (which would be surprising but not misleading).

But applying a label to the data does not change the nature of the data. In the larger context, the data was created as a text representation of an IP address, was never used in a DNS context, and the concept of "fully-qualified" doesn't have a lot of meaning where there is no process by which to further qualify any partial tokens.

It remains a textual representation of an IP address, even if it is used in a different context. Just as "Alice" remains a first name even if it is mislabeled or misused.

Of course, data can fit the validation criteria for multiple types, and it can be misused. "Alice" is a first name, but it is also a valid hostname. It is not a hostname just by virtue of being validly parseable as such. And if I wanted to know the first names of people here at HN, but asked for their hostnames, I would generally not get the answer I wanted.

If some awful code somewhere misused first names as hostnames, the network guy with very limited context might say "I see a query for hostname 'Alice'", but the people with larger context would ask "Why is this firstname being misused as a hostname?".

This HN thread was never a SNI spec internal debate, and no one here benefits from assuming that highly restrictive context.

I have had discussions with some of the langsec folks in the past. I greatly respect their work, and they are wise enough to know that their context is not useful in general discussion.

Your initial statement was condescending and misleading. As the conversation continues, it becomes clear that this was intentional, and you are willing to die on the hill of tiny irrelevant context. Noted.

zAy0LfpBZLC8mAC · on Dec 27, 2017

> But applying a label to the data does not change the nature of the data.

This is not about applying a label, this is about type-tagging. The label is irrelevant, the type tag is not.

> In the larger context, the data was created as a text representation of an IP address, was never used in a DNS context

Except that it was. As per the SNI specification, putting something into the SNI hostname field is "using it in a DNS context", which is why doing so is a bug. It's the exact same bug as putting plain text into HTML. The fact that "<" represents a "less than sign" in plain text is irrelevant to what "<" means when it appears in HTML. The semantics of HTML are not governed by the spec of plain text. Using plain text where HTML is expected is a bug. The fact that a human with a larger context might be able to recognize that the string "3 < a / b = 42" appearing in an HTML document was probably not intended to contain a malformed HTML tag does not change that it in fact does.
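A minimal sketch of the plain-text-in-HTML analogy, using Python's standard `html` module. The variable names are purely illustrative; the point is only that the same characters mean different things depending on the context they are placed in, and that a conversion step is needed when crossing contexts.

```python
import html

plain = "3 < a / b = 42"

# Bug: plain text is dropped into an HTML context unchanged,
# so "<" now starts (malformed) markup instead of meaning "less than".
broken = "<p>" + plain + "</p>"

# Fix: convert the plain text into its HTML representation first.
fixed = "<p>" + html.escape(plain) + "</p>"

print(broken)
print(fixed)
```

The escaped form can be converted back with `html.unescape`, which is what makes it a proper encoding rather than a reinterpretation.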

> Just as "Alice" remains a first name even if it is mislabeled or misused.

Essentialism, anyone?

> "Alice" is a first name, but it is also a valid hostname. It is not a hostname just by virtue of being validly parseable as such.

Exactly! It is by virtue of the context in which it appears. And the context of the SNI host name field makes whatever appears in it a DNS FQDN.

> And if I wanted to know the first names of people here at HN, but asked for their hostnames, I would generally not get the answer I wanted.

Yep. And equally, when the server implementing SNI asks you for a DNS hostname, but you supply an IP address, you are not answering the question being asked, and you should expect whatever your reply is to be treated as a DNS hostname.

> If some awful code somewhere misused first names as hostnames, the network guy with very limited context might say "I see a query for hostname 'Alice'", but the people with larger context would ask "Why is this firstname being misused as a hostname?".

Sure, they might. That doesn't change the fact that as far as the protocol is concerned, it still is a hostname, which is precisely why it will fail to work.

> This HN thread was never a SNI spec internal debate, and no one here benefits from assuming that highly restrictive context.

You claimed that you could specify IP addresses in SNI messages. You still can't.

> I have had discussions with some of the langsec folks in the past. I greatly respect their work, and they are wise enough to know that their context is not useful in general discussion.

So, this is not a discussion about whether or not you can specify IP addresses in the SNI hostname field?

> Your initial statement was condescending and misleading. As the conversation continues, it becomes clear that this was intentional, and you are willing to die on the hill of tiny irrelevant context. Noted.

You are still wrong and apparently massively confused.

Let's assume we have a server that has a certificate for the host name 1.2.3.4. That is, a certificate with a subject alternative name of type dNSName, value "1.2.3.4". Now, an HTTP client is instructed to request the URI https://1.2.3.4/foobar, POSTing a valuable secret to that URI. This HTTP client puts "1.2.3.4" into the SNI DNS hostname field, as you seem to believe to be correct behaviour according to the RFC, right? Now, the server will correctly respond to that with said certificate, right?

What happens next? Should the client accept that certificate or not? Why should it? Why not?

quesera · on Dec 27, 2017

> So, this is not a discussion about whether or not you can specify IP addresses in the SNI hostname field?

In fact, no. This is a discussion about whether textual representations of IP addresses can be used as inputs to tools that speak SNI, be used to specify a particular cert on the server side, and be conveniently extracted out of the sniffed network traffic comprising that handshake.

As it happens, the input conversion, the processing for usage in code written for spec implementation, the network stack conversions, and the sniffer capture reconversion back to a text representation for human viewing are all manipulative of the data.

But all of these manipulations are predictable and reversible, and if you want to call one of those data stages a "DNS FQDN", that's great but it isn't inherently correct, outside of the context of the spec which deigns to treat all final mapping tokens as DNS FQDNs, and to label them such -- but does not actually make them fully-qualified, nor the results of DNS queries.

We might have different opinions about the context of this discussion, but I would suggest that if you were to reread the thread from the beginning, there's really not much opportunity for confusion.

In any event, it's clear that this discursion is not advancing anyone's understanding of anything. Good luck in your future endeavours.

zAy0LfpBZLC8mAC · on Dec 27, 2017

> In fact, no. This is a discussion about whether textual representations of IP addresses can be used as inputs to tools that speak SNI, be used to specify a particular cert on the server side, and be conveniently extracted out of the sniffed network traffic comprising that handshake.

Well, then your original description was pretty misleading. I'm just wondering why you didn't include, I dunno, street names? You can use most street names as input for the same purpose, right?

> But all of these manipulations are predictable and reversible

Nope, that's precisely the problem. You can not distinguish the hostname "1.2.3.4" from the IP address "1.2.3.4" after they have been encoded in that manner into the SNI request, hence the encoding is not reversible (and leads to collisions).
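The collision can be illustrated with a small sketch. The types `DnsName` and `IpAddress` and the function `naive_sni_encode` are hypothetical, made up for this illustration; they stand in for an encoder that accepts both kinds of value and writes only the string into the field.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DnsName:
    value: str


@dataclass(frozen=True)
class IpAddress:
    value: str


def naive_sni_encode(target) -> bytes:
    # Hypothetical encoder: writes the string and drops the type tag.
    return target.value.encode("ascii")


a = naive_sni_encode(DnsName("1.2.3.4"))
b = naive_sni_encode(IpAddress("1.2.3.4"))

# Two different typed inputs, one identical output: the receiver
# has no way to tell which of the two was meant.
print(DnsName("1.2.3.4") != IpAddress("1.2.3.4"), a == b)
```

Because two distinct inputs map to the same bytes, no decoder can recover the original type, which is what "not reversible" means here.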

> We might have different opinions about the context of this discussion, but I would suggest that if you were to reread the thread from the beginning, there's really not much opportunity for confusion.

Well, there is, as it's trivially true and at the same time completely pointless to point out that you can use any string that fits the syntactic requirements to put it into a field of a protocol message to produce a syntactically valid protocol message.

So, yes, you can put a string that was produced as a textual representation of an IP address into the SNI hostname field. It just so happens that that is essentially guaranteed to lead to certificate validation failure during connection establishment with any non-vulnerable TLS implementations.
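A simplified sketch of why validation fails, assuming the usual rule that a client which was given an IP address as its target compares it only against iPAddress entries in the certificate, never against dNSName entries. The function `matches` and the `(type, value)` tuples are illustrative and not any real library's API; wildcard handling and everything else a real validator does are left out.

```python
import ipaddress
from typing import List, Tuple

San = Tuple[str, str]  # (type, value), e.g. ("dNSName", "example.com")


def matches(target: str, sans: List[San]) -> bool:
    """Check a connection target against subject alternative names."""
    try:
        ip = ipaddress.ip_address(target)
    except ValueError:
        ip = None

    if ip is not None:
        # Target is an IP address: only iPAddress entries count.
        return any(
            kind == "iPAddress" and ipaddress.ip_address(value) == ip
            for kind, value in sans
        )
    # Target is a DNS name: only dNSName entries count.
    return any(
        kind == "dNSName" and value.lower() == target.lower()
        for kind, value in sans
    )


# The certificate from the example above: dNSName "1.2.3.4".
print(matches("1.2.3.4", [("dNSName", "1.2.3.4")]))
```

Under these assumptions the client that connected to the IP address 1.2.3.4 rejects the certificate, even though the strings are identical.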

Was that really the point that you were trying to make?

quesera · on Dec 27, 2017

Your literalism defeats you.

Of course you can store first names in a last name field. You can even choose to invert the label-semantic relationship between the two fields in your code. If "Bob" is stored in last_name, the data does not semantically become something it is not just because someone mislabeled it. It's a first name, no matter who wants to call it what.

"8.8.8.8" is not a DNS FQDN. Of course it could be, but it is not, in current practice. If you think of it as a DNS FQDN, you cannot assume it to have any relationship to IP address 8.8.8.8, which in all resolver libraries, it does. So it is not a DNS FQDN, and in fact is not even looked up when supplied. It is very precisely only ever used as a textual representation of an IP address, in dotted quad format. The direct representation of an IP address could not even be posted here. So it is as close and most familiar as we can get to an IP address. And I was always careful to qualify my comments that it was a textual representation of an IP address. Because I know the difference!

You also know that it is possible for "snitest" (or, more precisely, "snitest.") to be a DNS FQDN. But it almost never is. In practice, it is going to be a nodename, and it might not even be looked up in DNS. It's a token, a name, representing the final form of the query made by the client, that resulted in a returned address. And that is what I said all along.

The SNI spec does not get to redefine "IP address (textual representation of)" and "nodename" outside of its own context.

You could have been correct if you said, at the beginning, that while it might be a nodename or an IP address (text representation) in all ordinary contexts, the SNI spec defines anything that results in a valid mapping to an IP address as a "DNS FQDN". And isn't that interesting and not particularly useful? It could also call them "turtles", with equal usefulness.

But instead you accused me of not having read the spec, and you proclaimed that SNI only works with turtles.

You're wrong on both counts. And we disagree about the definition of "working". Security exploits work. There is no deep epistemological problem created by the fact that the word is given meaning by its context.

Nodenames work in SNI. And textual representations of IP addresses work in SNI. Which is exactly what I wrote in my initial comment.

Since we can't agree on fundamental definitions, I don't think we're going to understand each other. I stand by my initial assertions.

zAy0LfpBZLC8mAC · on Dec 27, 2017

> Of course you can store first names in a last name field.

Nope.

> You can even choose to invert the label-semantic relationship between the two fields in your code.

In which case you aren't storing first names in the last name field, you simply are using an uncommon label for the first name field. Whether something is the first name field is not defined by its label, but by the declaration of its purpose, either explicitly in a specification or implicitly in your code. Having a descriptive, non-misleading label is simply helpful for maintainability, but not relevant to the discussion at hand.

> If "Bob" is stored in last_name, the data does not semantically become something it is not just because someone mislabeled it.

I am assuming you mean a field that is actually declared to contain last names (as opposed to a field that is declared to contain first names, but labeled "last_name"), as that was the premise in my previous arguments:

Are you saying that your software will not produce letters saying "Dear Mr. Bob" if it finds "Bob" in that last name field?

> "8.8.8.8" is not a DNS FQDN.

Well, yes it is. Saying '"8.8.8.8" is not a DNS FQDN' is like saying '"Martin" is not a last name'. Just because you will likely categorize "Martin" as referring to a first name without further context, does not mean it is not a last name.

> Of course it could be, but it is not, in current practice.

Except it isn't current practice.

> If you think of it as a DNS FQDN, you cannot assume it to have any relationship to IP address 8.8.8.8,

Well, yeah, that's my point!?

> which in all resolver libraries, it does.

No, not all resolver libraries, and none of the resolver libraries relevant to this discussion. Take a pure DNS library, feed in 8.8.8.8 as the DNS domain name, and get back NXDOMAIN (or possibly something else in an alternate root). What Unix host name resolver APIs do is irrelevant as that is not the convention referenced by the RFC, the RFC references DNS hostnames.
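To make the "pure DNS" view concrete without touching the network, here is a sketch of how a domain name is laid out in a DNS message: a sequence of length-prefixed labels ending in a zero byte. The helper name `dns_wire_name` is made up for this example, and the overall name length limit is not checked.

```python
def dns_wire_name(name: str) -> bytes:
    """Encode a domain name as length-prefixed labels (DNS wire format)."""
    if name.endswith("."):
        name = name[:-1]  # drop the explicit root dot
    out = bytearray()
    for label in name.split("."):
        raw = label.encode("ascii")
        if not 1 <= len(raw) <= 63:
            raise ValueError(f"bad label length: {label!r}")
        out.append(len(raw))
        out += raw
    out.append(0)  # the empty root label terminates the name
    return bytes(out)


print(dns_wire_name("8.8.8.8"))
print(dns_wire_name("example.com"))
```

At this level "8.8.8.8" is simply four one-character labels under the top-level label "8"; nothing in the encoding treats it as an address.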

> So it is not a DNS FQDN, and in fact is not even looked up when supplied.

Just because Unix hostname resolution APIs happen to be unable to resolve the DNS FQDN 8.8.8.8, does not make it not a DNS FQDN. Are you telling me next that _service-name.example.com is not a DNS FQDN because underscores are not allowed in internet hostnames? That format is used for SRV and ACME lookups precisely because it is not an internet hostname, but still a DNS FQDN, so it cannot collide with hostnames, but still can be looked up in the DNS.
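The distinction between a DNS name and an internet hostname can be sketched as two different syntax checks. Both functions below are simplified illustrations with made-up names: `is_dns_name` only checks label lengths, and `is_hostname` only applies the letters-digits-hyphen rule per label.

```python
import re

# Letters, digits, hyphens; no leading or trailing hyphen; 1 to 63 chars.
_LDH_LABEL = re.compile(r"[A-Za-z0-9](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?")


def _labels(name: str) -> list:
    if name.endswith("."):
        name = name[:-1]
    return name.split(".")


def is_dns_name(name: str) -> bool:
    # DNS itself only constrains label length (total length not checked here).
    return all(1 <= len(label.encode("utf-8")) <= 63 for label in _labels(name))


def is_hostname(name: str) -> bool:
    # Hostnames additionally restrict the characters allowed in each label.
    return all(_LDH_LABEL.fullmatch(label) is not None for label in _labels(name))


name = "_service-name.example.com"
print(is_dns_name(name), is_hostname(name))
```

The underscore name passes the first check and fails the second, which is exactly the property that keeps it from colliding with hostnames.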

> It is very precisely only ever used as a textual representation of an IP address, in dotted quad format.

So ... four-number OIDs don't exist?

> The direct representation of an IP address could not even be posted here.

Which doesn't change that not every representation that takes the form "1.2.3.4" is a representation of an IP address, just as not every representation that takes the form "Martin" is a representation of a first name. The fact that you can not post Martin (the human being) here, does not change the fact that the string "Martin" can be a reference to Ms. Jane Martin.

> So it is as close and most familiar as we can get to an IP address.

And yet, if you find that string in the SNI hostname field, it does not represent an IP address.

> And I was always careful to qualify my comments that it was a textual representation of an IP address. Because I know the difference!

But syntax alone does not define semantics, context matters. The mere fact that some string can be read as a textual representation of an IP address does not actually make it a textual representation of an IP address.

> You also know that it is possible for "snitest" (or, more precisely, "snitest.") to be a DNS FQDN. But it almost never is.

Which is irrelevant to the semantics of the SNI hostname field.

> In practice, it is going to be a nodename, and it might not even be looked up in DNS.

As above: Feed it into a DNS resolver library and find that it can indeed look up the TLD "snitest". That "Martin" is used in practice as a first name is irrelevant when the form field into which someone wrote the string "Martin" is labeled "last name".

> The SNI spec does not get to redefine "IP address (textual representation of)" and "nodename" outside of its own context.

It doesn't. It explicitly references DNS hostnames. It never says anything about "nodenames". And it even explicitly states that literal IP addresses are forbidden (and mind you that "literal" here does not mean "four bytes", but dotted quad or hex-colon notation, which are commonly referred to as "IP literals" in RFCs).

Also, the RFC even explicitly says 'It is RECOMMENDED that clients include an extension of type "server_name" in the client hello whenever they locate a server by a supported name type.'. The spec in no way redefines anything, it simply says "if you happen to have located the server using a DNS hostname, you may put it here, if you used anything else, you cannot use this extension".
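A rough sketch of the client-side decision this implies: send the extension only when the server was located by a DNS hostname, and omit it when the target is an IP literal. The function name `server_name_for` is hypothetical, and using Python's `ipaddress` module to detect IP literals is an assumption of this sketch rather than anything the RFC prescribes.

```python
import ipaddress
from typing import Optional


def server_name_for(target: str) -> Optional[str]:
    """Return the value for the server_name extension, or None to omit it."""
    try:
        ipaddress.ip_address(target)
    except ValueError:
        # Not an IP literal: treat it as a DNS hostname.
        # The extension carries the name without a trailing dot.
        return target[:-1] if target.endswith(".") else target
    # IP literal (dotted quad or hex-colon): do not send the extension.
    return None


for t in ("example.com", "example.com.", "1.2.3.4", "::1"):
    print(t, "->", server_name_for(t))
```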

> You could have been correct if you said, at the beginning, that while it might be a nodename or an IP address (text representation) in all ordinary contexts, the SNI spec defines anything that results in a valid mapping to an IP address as a "DNS FQDN".

It doesn't, and thus that would have been incorrect, which is why I didn't say that.

> But instead you accused me of not having read the spec, and you proclaimed that SNI only works with turtles.

Well, at least you haven't understood the spec?

> Security exploits work.

Yep. But does a server that crashes when the security exploit works work?

> There is no deep epistemological problem created by the fact that the word is given meaning by its context.

Except that that is not the problem. The problem is that you are equivocating between semantically different senses of "work".

"I have a car with a broken transmission that I live in. It works fine for keeping me warm. So, the car works. I want to sell the car. As the car works, people should pay me the price for a working car."

You see the flaw in that reasoning, right? That's the fallacy in your reasoning. Saying "the car works for keeping me warm" is perfectly fine, but it does not imply "the car works", because that would imply "... for doing whatever is commonly understood to be the primary purpose of cars".

No one is denying that you can specify and implement a protocol where you can specify IP addresses for the selection of the TLS certificate to use. If you do so, that works. But that protocol is not SNI. So, if you ask whether that is a working SNI implementation: No, it's not.

> Nodenames work in SNI.

If you mean by that unqualified hostnames: No, they don't. You only can agree between clients and servers that are controlled by you on a method of encoding nodenames as DNS hostnames, and then use SNI with those.

> And textual representations of IP addresses work in SNI.

No, they don't. You can not express IP addresses in SNI.