Surprisingly short text, given how much of it goes into setting up an analogy of little value.
The problem with XMPP has always been that it is too extensible, with too few interoperability guarantees. You'll know what this means if you've ever set up OMEMO across 3 different client implementations. It also does too much: anything from Kafkaesque (pun intended) persistent pubsub to content labels. The quality of clients is simply not good enough, so most XEPs end up in specialized variants of clients or in proprietary implementations (Facebook, GTalk, EA Origin) that lean more on ejabberd's scalability than on other XMPP features like federation.
Also, XML is not a good serialization format, and some of the requirements in the protocol are pretty exotic and not trivial to implement in a lot of languages, like setting up TLS on an existing connection (plus the XMPP WG's absolute refusal to allow on-connect TLS). JSON over websockets or HTTP is just better at reaching a wide audience of developers.
I don't get the XML hate, to be honest. Everyone who drops XML to make it simpler to use eventually re-invents most of XML.
I find typing YAML to be much better than typing XML, but when it comes to serialization, I honestly don't see why JSON is that much better. To solve interoperability, there are three or four JSON schema standards and sometimes extensive written documentation just to explain the types of fields and when they can occur.
Now I need to find out which type of JSON an application uses (does it always use UTF8 or does it break the standard? Which version of the JSON Schema does it use? Is it using a schema at all? What happens when "$ref" appears in the content body? How should I deal with duplicate keys?) and how to properly encode it. It's the quick & dirty way to serialize data, and that's great for messing around with prototypes and getting started quick, but terrible for business critical applications and public-facing APIs.
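To make that concrete, here's a small Python sketch (stdlib json only) of two of those gotchas; the duplicate-key behaviour in particular is easy to miss:

    import json

    # Duplicate keys: the default parser silently keeps the last value.
    doc = '{"id": 1, "id": 2}'
    print(json.loads(doc))  # {'id': 2} -- the first value vanishes without a warning

    # You have to opt in (via object_pairs_hook) just to notice the duplicates.
    def reject_duplicates(pairs):
        keys = [k for k, _ in pairs]
        if len(keys) != len(set(keys)):
            raise ValueError("duplicate keys: %r" % keys)
        return dict(pairs)

    try:
        json.loads(doc, object_pairs_hook=reject_duplicates)
    except ValueError as err:
        print(err)

    # Encoding: the default encoder escapes non-ASCII instead of emitting UTF-8.
    print(json.dumps({"name": "café"}))                      # {"name": "caf\u00e9"}
    print(json.dumps({"name": "café"}, ensure_ascii=False))  # {"name": "café"}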
All serialization formats are stupid and messy in some way but I think XML gets a bad rep because of the defaults many parsing libraries picked (and the vulnerabilities they introduced). Protobuf and friends are probably a much better serialization system for chat messages.
I don't see why developers fear XML so much. I think it has to do with the fact that everything is built by web developers now. Sockets have been replaced by WebSockets and concise protocols have been replaced by messy HTTP requests. I wonder how long it'll be before ISPs start blocking any traffic not directed towards port 80/443.
When do you put something in an attribute and when do you give it its own tag? What if semantically it belongs in an attribute, but the complicated parsing rules make putting it there cumbersome? Should you use <![CDATA[]]> or base64 to encode non-XML data? If you use CDATA, what if the data contains ]]>? If you use tags for hierarchical information, what do you do if there happens to be some random non-whitespace/comment data between the tags, as if it were HTML?
I don't really like interacting with JSON as a human user (mostly because disallowing trailing commas and comments are both utterly awful design choices), but I'd take it a hundred times over the enormous ball of complexity and decision paralysis that XML imposes on you. There's just too much there there.
> When do you put something in an attribute and when do you give it its own tag? [...]
Oh, the same old red herrings around irrelevant details.
The simplest answer is: it doesn't matter. Let your serializer of choice handle it.
More in-depth answer: depends on what your schema tools and serializers support best. For example, I define my classes in C# and use dot net's built-in schema exporter to generate schemas (which in turn get compiled again into strongly-typed classes for other languages, e.g., Java). I chose DataContractSerializer, and its rules are simple: "user" data belongs to elements, serializer metadata belongs to attributes. Which makes sense because arbitrary attributes can appear on elements without breaking the schema (e.g., DCS uses xsi:type attribute for polymorphic deserialization). It also decides for you to use base64 for binary data.
Bottom line: use tooling and don't try to tweak the details of the XML form.
>The simplest answer is: it doesn't matter. Let your serializer of choice handle it.
Except, this absolutely matters! I've had serializer incompatibilities across platforms on both things like json and much simpler binary formats.
Letting the "serializer handle it" falls apart in reality, especially on extremely complicated formats like XML. The only reliable way to make complicated cross platform de/serializing work is to define a feature subset and adhere to that.
OK, but this approach is totally useless when it comes to a public API, which is the topic of this thread. Especially when it comes to backward compatibility.
The topic is about decision problems, not about backward compatibility. That said, JSON has corner cases too, similar to XML or worse. All those complaints are just rationalization of hipster propaganda.
JSON has corner cases but the advantage is that every JSON document has a single obvious mapping to language primitives — dicts, lists, strings, floats. Nobody has to agree beforehand how to load JSON data.
In contrast, the generic mapping for XML is a Tree[str (name), Dict[str, str] (attrs), Union[str, Tree] (body)], which maps so poorly between languages that people do one of two things: implement formats on top of XML to do serialization, which leads to non-interoperability when different software does it differently, or parse to a database-like "abstract XML" object that you query with XPath.
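A quick Python illustration of that difference (stdlib only; the documents are made up):

    import json
    import xml.etree.ElementTree as ET

    # JSON: one obvious mapping to dicts, lists, strings and numbers.
    order = json.loads('{"id": 17, "items": ["apple", "pear"]}')
    print(order["items"][0])   # 'apple', and order["id"] is already an int

    # XML: the generic mapping is a tree of (tag, attributes, children, text),
    # so even this trivial document needs a convention for "this is a list"
    # and "this attribute is a number".
    doc = ET.fromstring('<order id="17"><item>apple</item><item>pear</item></order>')
    print(doc.get("id"))                                  # '17' -- still a string
    print([item.text for item in doc.findall("item")])    # repeated tags == list, by convention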
JSON maps well to javascript types, but not anything else. And float isn't an obvious mapping: the standard does have a concept of an integer number, and most numbers are indeed integers. Arrays do everything objects do, but have better performance and better defined behavior.
>Nobody has to agree beforehand how to load JSON data.
Such agreement is never necessary, it's up to the programmer what to write, and the standard doesn't specify behavior of JSON parsers anyway, it only defines JSON documents. For example there's no need to use hashtables, it's a random javascript artifact due to parsing JSON with the eval function.
>when different software does it differently
I assume you mean schemaless documents here. Those are always abstract databases, both XML and JSON. I suppose there's jq that can query abstract JSON databases.
> JSON maps well to JavaScript types, but not anything else.
I grant you that JSON might be equally as awkward as XML in languages like C, but pretty much every language -- Python, Ruby, Java -- has very sane mappings to and from JSON types. You don't ever really have to "query" a JSON object, you just `json.loads` and `for item in obj["key"]:`. Even in the cases with schemas you're still usually only working with primitive types.
> Such agreement is never necessary, it's up to the programmer what to write...
What I mean is that there's no weirdness like having to encode types in the base document. You don't have to do things like inventing a convention for how a list or a number is spelled out in elements, where different projects / parsers might do it differently. The "abstract JSON types" are actually useful and expressive, whereas in XML everyone has to carve out their own way to represent lists, mappings, and numbers out of trees, because basically nobody works with just trees in day-to-day work.
I think we might be talking about two different use-cases. If what you want to do with XML / JSON is serialize arbitrary classes in $specific_language and then read it back then nothing really matters; the on-disk format is just an implementation detail. But abstract JSON works really really well as a schema everyone agrees on and supported by every language.
> You don't have to do things like [...] carve out their own way to represent lists, mappings, and numbers
I work with XML extensively, and out of hundreds of classes and fields, I've needed an arbitrary dictionary maybe a handful of times. Mapping/dictionary is JSON's abysmal replacement for a class/struct, in which case you'd have XML like:
<MyClass>
  <Field1>Value</Field1>
</MyClass>
IOW, _the tag is the key_ ! List? Simply repeated elements. Numbers? What are you talking about, they're directly representable in XML and XSD knows about integers, floats, etc. (unlike json).
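Roughly what the schema side of that looks like (an illustrative XSD fragment, not from any real schema):

    <xs:complexType name="MyClass">
      <xs:sequence>
        <xs:element name="Field1" type="xs:string"/>
        <xs:element name="Count"  type="xs:int"/>
        <!-- a list: just allow the element to repeat -->
        <xs:element name="Item"   type="xs:string" minOccurs="0" maxOccurs="unbounded"/>
      </xs:sequence>
    </xs:complexType>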
You don't show anything with that. Sure, you can walk through any schemaless JSON document, because it has the generic JSON document structure, but the same can be done for XML too, in any language. You can't make sense of the document this way beyond its well-formedness. There being numbers doesn't help you much; you can't tell anything about them beyond them being numbers.
>JSON maps well to javascript types, but not anything else.
Except it does. Take python. Ruby. Any language that has a notion of dicts, lists and strings/ints/floats. That's basically every high level language ever. Even exotic stuff like tcl. And e.g. C, a low level language, has a thousand implementations of those same structures.
The often-mentioned design paralysis of choice between elements and attributes - in JSON, too, there can be many ways to implement a collection of name/value pairs. One interesting case is the compound key: you can use a mini serialization format and still make it an object (actually saw this in the wild).
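Something like this, to give an idea of the kind of compound key I mean (made-up example):

    {
      "user:42:en": {"name": "Alice"},
      "user:43:de": {"name": "Bob"}
    }

It's still a JSON object, but every consumer now has to parse the "type:id:locale" mini format out of the keys by hand.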
> in JSON there can be many ways to implement a collection of name/value pairs.
Are there? JSON has dicts and lists. I mean you could store a collection of name value pairs in a list, maybe even a list of lists, but that's just stupid and an incorrect usage of the format.
Whereas in XML there really are tons of ways to do that, and ALL of them are awkward.
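For illustration, here are the same two name/value pairs encoded a few of the usual ways:

    <!-- tags as keys -->
    <settings><timeout>30</timeout><retries>5</retries></settings>

    <!-- attributes as values -->
    <settings timeout="30" retries="5"/>

    <!-- generic entry elements, key in an attribute -->
    <settings>
      <entry key="timeout" value="30"/>
      <entry key="retries" value="5"/>
    </settings>

    <!-- key and value as child elements -->
    <settings>
      <entry><key>timeout</key><value>30</value></entry>
      <entry><key>retries</key><value>5</value></entry>
    </settings>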
Most of these just amount to "some parsers are bad", which.. I mean sure? But xml's surface area for parsers to be 'bad' is so much greater, and its gotchas are thus much more subtle. I hope you're not trying to suggest that all xml parsers are identical and perfect.
Your assertion was that json's edge cases are "as bad or a little worse" but imo this document doesn't suggest that at all. Every single thing listed in it can go more wrong with xml, not less.
This just doesn't work for interoperable protocols (or file formats). If you have a protocol with multiple implementations, they may be written in different languages, with different XML libraries.
The serializer exports the schema and you map the schema to whatever code is needed, using whatever XML library, to extract the data based on the schema. XML rules are strict, and how to extract data from the document, given the schema, is unambiguous. If you don't have a decent XML library, then you're stuck. Oh wait, the same holds for HTML or any other format.
This is really not a simple answer. Now you've just compounded the problem with N solutions, where N is the number of encoders in the wild in use by people who want to talk to your service.
And then you've basically got SOAP, which is one of those acronyms that manages to be none of its constituent words in practice.
Posted in a comment above, but the idea is that text content is text content, because XML was intended as a document interchange format, not a data interchange format.
When you look at SVG content for instance, you'll notice colors and coordinates are contained within attributes, because it means a document reader which cannot understand some crazy new drawing element would still interpret that data as text content, leaving the document accessible (if perhaps badly formatted).
If you care about that rule, then structured, semantic, non-user-accessible data uses elements for structure and attributes for data. This also lets you ignore the difference between semantic and non-semantic whitespace - no whitespace has semantics.
CDATA sections are (usually) represented in tooling, but should be considered just a text node with different escaping rules (unless your document format actually assigns a purpose, which is a really bad idea from an interoperability perspective). CDATA is not meant to provide a way to embed binary data (both XML and JSON are somewhat bad for information which is not primarily text).
Similarly, processing instructions are somewhat orthogonal to the document format. I believe XSLT is the only spec which defined a standard behavior for them, but there were examples such as commercial document editors which saved information like the cursor position as processing instructions.
FWIW (as a human interacting with JSON) - there are extensions to JSON such as JSON5 which aim to add the extra flexibility that makes data entry easier. JSON5 adds comments, trailing commas, unquoted symbols, single quoted strings, and so on. Perhaps its biggest issue is that it allows non-finite number values like NaN, which makes it a superset of JSON at the data model layer. So you can't be guaranteed JSON5 text can be stripped and quoted into valid JSON text.
One of the virtues of JSON is that you can break it into parseable chunks by line, which also enables stuff like line-delimited JSON streams. JSON5 seems like a good format for a configuration file, though.
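For reference, a small illustration of the JSON5 extras mentioned above:

    {
      // comments are allowed
      unquotedKey: 'single-quoted string',
      list: [1, 2, 3,],   // trailing comma is fine
      ratio: NaN,         // legal in JSON5, but has no plain-JSON equivalent
    }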
The problem with JSON alternatives is that now I come along and say I wish HJSON [0] had become (or would become) the standard... and so we have the N+1 standards xkcd comic strip again.
I liked this summary: XML is almost always misused [1]
The gist is that XML is best used for markup, because it inherits assumptions from the "document" metaphor that are not needed and are sometimes unhelpful for data interchange. An example of this is hashmaps & sets--they intentionally leave "order" or "sequence" out of their representation, but documents as a form of data interchange force you to keep thinking about it.
https://xmpp.org/extensions/xep-0394.html - a laughably perverse usage of the already perverse XML format
I don't know why they did this, but I guess they had reasons - perhaps XML doesn't actually do that well even for markup.
But then web developers came along and just put this directly into the DOM of their web clients, leading to endless XSS exploits, so XEP-0071 was burned at the stake.
XEP-0393 might look ad-hoc, but it's essentially what people were typing into their chats and emails since time immemorial.
People sometimes think this is Markdown and then pick a markdown library off the shelf, and then the HTML passthrough bites them, leading us back to the beginning.
I really don't understand how Matrix and Mastodon etc are allowed to pass around HTML embedded in JSON as if that somehow solves all those problems.
Tbh if a client is dumb enough to put xhtml-im directly into DOM with no verification that is the client's problem, not the XEP's, and that should be no reason to cancel it.
JSON seems to have better parsers than XML, and XML is more verbose. XML has comments though. That said, I think they're more or less equivalent. I hate YAML, though. Awful awful language. Surprisingly complex, and whitespace sensitivity in a language that's templated all the time? Must be the work of the devil. JSON > XML >>>> YAML, IMO.
I used to be in the whitespace-hate camp, but I just use linters now and, presto, literally all of my problems with YAML are gone.
Use YAML for k:v things, programming languages everywhere else. It's eminently readable and writable and with those linters I mentioned, untroublesome.
> I just use linters now and, presto, literally all of my problems with YAML are gone.
Copy a valid fragment of YAML from one file into another file, run the linter, and presto! garbage. Whitespace-indented formats are poison for auto-formatters. They might have made sense back in the 90s when such "advanced" tooling was rare, but they're a bad choice today.
I don't understand this. If you copy a piece of JSON out from one file to another without thinking it will also result in garbage. You need to know where to place code. That seems reasonable to me.
If you copy&paste an object {"a": 5, "b": 6} or an array [1, 2, 3] from one JSON file to another (or from one location to another in the same file), then the structure of the result is unambiguously determined by the opening and closing delimiters {}[]. When you do the same with an indented YAML fragment, the indentation level at the source location might be different from the indentation level at the destination, wreaking havoc on the structure of the result. You have to manually adjust the indentation of the pasted fragment before auto-formatting the file, or else the information about the intended indentation is lost.
I don't know, man. Have you ever had the need to template a yaml file so that a piece of text is inserted from another yaml file? Indentation becomes quite important, and you remember that experience. Indentation math is not my idea of fun.
I think XML gets a bad rep for a similar reason that CSV gets a bad rep.
i.e. there is a wide variance of what “passes” for XML, and two systems that profess to “speak” XML often speak mutually incomprehensible dialects.
I think each of your valid complaints about JSON extends to XML as well, for example. Totally agreed that XML gets an especially bad rep because most XML parsing/generating libraries in popular programming languages are frustrating.
> i.e. there is a wide variance of what “passes” for XML
not really, it's standardized.
what differs is how it's used, i.e. on top of most serialization formats there is another implicit, often overlooked serialization layer which roughly maps domain logic to the structures the serialization format supports.
The problem with XML is similar to that of XMPP: too many features and variations, knobs to twist, things that are easy to subtly get wrong, etc.
Also, string content encoding is terrifying in XML, as it mixes string content, string-formatted control structures, and pretty printing in a messy way. (Imagine pretty printing _inside_ of a JSON string with no clear separation of whether a newline is content or formatting.)
Then you hit the hell of having to interact with some system (almost always written in Java) that's using some 20 year old SAX parser that barfs if the tags aren't in the exact order it expects.
> I don't see why developers fear XML so much. I think it has to do with the fact that everything is built by web developers now.
I heard exactly this same thing 15 years ago from a mainframe programmer I was working with on an integration project, but flipped around: "this XML nonsense... you web developers want everything to look like HTML!" Wasn't a good look then, isn't a good look now.
> does it always use UTF8 or does it break the standard?
It always uses UTF-8, or it is invalid JSON and MUST NOT be parsed. By the same argument against XML, you could ask whether it actually honours the encoding field (I've found many cases where things lie in the encoding field and are still parsed fine by XML parsers).
> Which version of the JSON Schema does it use?
Quite literally the only one I have ever seen in use, ever, is JSON Schema[0].
> Is it using a schema at all?
Most often, no.
> What happens when "$ref" appears in the content body?
Nothing? JSON doesn't have lookups. There is no way to do lookups in JSON. If you do otherwise, you do not have JSON anymore.
> How should I deal with duplicate keys?
This is an actual, real issue with JSON.
XML is a terrible format. It's filled to the brim with footgun features like entity expansion (only ever used to DOS servers), no data types (everything is a string, your parser just needs to know better), no meaningful reason to have both attributes and content, ambiguity between an array of one element and an element, etc. etc. etc.
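For anyone who hasn't seen the entity-expansion footgun in action, here's a Python sketch. The stdlib behaviour depends on the underlying Expat version, and I'm assuming the third-party defusedxml package works the way its documentation describes (rejecting entity declarations by default):

    # The "billion laughs" shape: nested entity definitions that multiply when
    # expanded. Real payloads use far more levels than this toy version.
    payload = """<?xml version="1.0"?>
    <!DOCTYPE bomb [
      <!ENTITY a "ha">
      <!ENTITY b "&a;&a;&a;&a;&a;&a;&a;&a;&a;&a;">
      <!ENTITY c "&b;&b;&b;&b;&b;&b;&b;&b;&b;&b;">
    ]>
    <bomb>&c;</bomb>"""

    import xml.etree.ElementTree as ET
    print(len(ET.fromstring(payload).text or ""))  # the stdlib expands the entities

    import defusedxml.ElementTree as DET           # third-party hardening wrapper
    try:
        DET.fromstring(payload)
    except Exception as err:
        print(type(err).__name__)                  # EntitiesForbidden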
XML was originally meant to be a document markup format, an evolution of SGML. While the markup was consistent, the interpretation of it was to be left to the actual document format definition.
Unfortunately, there was another camp which was trying to change it to not just be an extensible document interchange format but a data interchange format. These have different requirements.
For example, someone asked when you put data in an element vs an attribute. There was a push to provide guidance at one point based on SVG - everything other than text data (such as coordinates making up graphics) were attributes, such that a non-SVG view of the document would just be all of the textual data appended.
Most of the tooling issues came from this disconnect between document and data oriented interchange, such as tooling having options to toggle between interpretations of pretty fundamental concepts such as namespaces.
It also became pretty common for technologies to come out of the document-oriented space (e.g. XPath and XSLT) which led to it being basically impossible to compose or decompose XML-based data without potentially changing its meaning - unless you were doing so with tools that understood the interpretation of the data itself.
CBOR is pretty good as a smaller / "better" JSON if you have a free hand choosing.
It has ambitions to replace ASN.1 / X.509 coding for certs, but I don't see it being used.
It is a bytewise binary coding, so you can't really write it by hand, whereas you definitely can expect to do that for JSON. However, if you must send binary data, then it is a more natural fit: it will send it with a short header overhead rather than having to bloat it with base64.
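A rough size comparison in Python, using the third-party cbor2 package (API names from memory, so treat this as a sketch):

    import base64
    import json
    import cbor2

    blob = bytes(range(256)) * 4   # 1 KiB of arbitrary binary data

    # JSON has no binary type, so you pay the base64 tax (about a third extra).
    as_json = json.dumps({"payload": base64.b64encode(blob).decode("ascii")})

    # CBOR has a native byte-string type: a short length header, then the raw bytes.
    as_cbor = cbor2.dumps({"payload": blob})

    print(len(as_json), len(as_cbor))          # the base64+JSON version is noticeably larger
    assert cbor2.loads(as_cbor)["payload"] == blob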
All of these complaints also seem like valid complaints about XML, except the thing about duplicate keys.
Also as far as I know, there is only one "JSONSchema", and it has a clearly defined version scheme. Are there other JSON Schema standards that I'm not aware of? I'd be interested to see them.
As long as you don't take stuff like ASN.1 as representative of binary formats. Wait, there is a human-readable serialization format for ASN.1 using XML...
>Now I need to find out which type of JSON an application uses (does it always use UTF8 or does it break the standard?
UTF-8 isn't mandated by the XML standard; it's required by XMPP. So in this scenario the same could be required for JSON.
>Which version of the JSON Schema does it use?
No one uses those and everyone is happy. As for XML schemas, XMPP makes them even harder to use than it makes everything else, because it has this endless XML document that you end up having to parse with a streaming parser, and those understandably don't support schemas. You end up having to build the little DOMs yourself, for every stanza. This isn't me theorizing; this is what clients (dino, gajim, tkabber) do. Now with the homebrew DOM you're unable to use XPath (unless, again, you implement it yourself, which is no easy feat), which makes XML's awkwardness, i.e. the fact that XML doesn't map onto common language structures like dicts and lists, so, so much worse.
> How should I deal with duplicate keys?
Generally you don't because libraries don't even support that, and people generally are decent enough to never use those.
>All serialization formats are stupid and messy in some way but I think XML gets a bad rep because of the defaults many parsing libraries picked (and the vulnerabilities they introduced).
And because it is messy. Not pushing for json in particular but even json doesn't have dtds, entities and so on and so forth. Thus parsing libraries don't implement such features, thus fewer vulnerabilities. Whereas vulns in xml parsers are basically the norm - look at python's out of the box xml parsers - there are four of them and each one has at least some vuln marked in the table in official python docs!
Again, I'm no json fan, but it's already light years better than xml. "Any damn fool could produce a better data format than XML." Look at bittorrent's ad-hoc format, bencode. Even that is much better than xml. And in a high level language an entire parser would take you one evening to write with no prior knowledge of the format. And that implementation would be probably about the size of those ad-hoc DOM implementations in xmpp clients, that don't even do any parsing themselves, and would be much more pleasant to use.
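To back that up, here's roughly what a from-scratch bencode decoder looks like in Python (an evening-project sketch, not a battle-tested implementation):

    def bdecode(data: bytes):
        """Decode the bencoded value at the start of `data`."""
        def parse(i):
            # returns (value, index just past the value)
            c = data[i:i + 1]
            if c == b"i":                        # integer: i<digits>e
                end = data.index(b"e", i)
                return int(data[i + 1:end]), end + 1
            if c == b"l":                        # list: l<values>e
                i, items = i + 1, []
                while data[i:i + 1] != b"e":
                    value, i = parse(i)
                    items.append(value)
                return items, i + 1
            if c == b"d":                        # dict: d<key><value>...e
                i, result = i + 1, {}
                while data[i:i + 1] != b"e":
                    key, i = parse(i)
                    value, i = parse(i)
                    result[key] = value
                return result, i + 1
            colon = data.index(b":", i)          # byte string: <length>:<bytes>
            length = int(data[i:colon])
            return data[colon + 1:colon + 1 + length], colon + 1 + length

        value, _ = parse(0)
        return value

    print(bdecode(b"d3:bar4:spam3:fooi42ee"))    # {b'bar': b'spam', b'foo': 42}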
XML is also superfluous by this standard, since by induction on the principle of "unnecessary reimplementation", we could be exchanging data via S-expressions.
> Now I need to find out which type of JSON an application uses (does it always use UTF8 or does it break the standard? Which version of the JSON Schema does it use? Is it using a schema at all? What happens when "$ref" appears in the content body? How should I deal with duplicate keys?) and how to properly encode it. It's the quick & dirty way to serialize data, and that's great for messing around with prototypes and getting started quick, but terrible for business critical applications and public-facing APIs.
I've never known any of XML's features to help with actually solving a business problem. Oh great, your documents have a mandatory schema; what benefit does that actually give you? It doesn't mean you can skip doing logical validation of a request/response post-deserialization (a schema can enforce that an id is an integer, but not that it's an id that actually exists in your database). In theory it might make it easier to complain about bugs where the real system doesn't match the documentation, but in practice it's more likely to be considered an error in the schema than a bug in the system. As far as I can tell all that using a schema actually does is means that you'll occasionally reject a document that you could have processed, or (even better) refuse to load the document because the schema's website is down.
Similarly with namespaces: the only impact I've ever known XML namespaces to have is to frustrate users who can't understand why their XPaths aren't matching until they configure all their namespaces. I know in theory there are cases where one XML document embedded in another might mean that an XPath would match something it shouldn't, but I've literally never seen that happen in real life, whereas having everything silently not match until someone configures namespaces happens all the time.
Similarly with custom entities, the only thing I've seen them used for is DOSes.
There's also a bunch of other issues: standard XML Schema is awful (RelaxNG is better, but you can't use it because it's not the official schema format), the format is almost-but-not-quite whitespace-insensitive which gives you the worst of both worlds (and similarly for text encodings), and XML is deeply associated with verbose overengineering because that's the main thing it's historically been used for. But even putting those aside, it really is as bad as it's made out to be.
> Protobuf and friends are probably a much better serialization system for chat messages.
Completely agreed.
> I don't see why developers fear XML so much. I think it has to do with the fact that everything is built by web developers now. Sockets have been replaced by sockets and concise protocols have been replaced by messy HTTP requests. I wonder how long it'll be before ISPs start blocking any traffic not directed towards port 80/443.
That's pretty backwards IMO. Have you ever seen what SOAP requests actually look like? It's like a complete reimplementation of everything that HTTP does, in more verbose form... but it's only ever used on top of HTTP. The web/JSON stack has a long way to go to catch up with WS-* for messy overcomplication (and I say that as someone who thinks WS-* is not actually as bad as it's generally considered).
> I've never known any of XML's features to help with actually solving a business problem.
Namespaces let you version data and unambiguously mix elements with the same (simple) name in the same document. Esp. the first point is necessary for long-term data archival.
> Oh great, your documents have a mandatory schema; what benefit does that actually give you?
It can be compiled to strongly-typed DTOs for your language of choice. I.e., seamless, strongly-typed cross-language data exchange. As opposed to manually picking apart the document with DOM or letting the serializer guesstimate the type as with untyped json.
Also, schema can express (and validate) in-document references.
Etc. XML without tooling is painful, yes. With tooling it's a powerful and reliable tool.
> Namespaces let you version data and unambiguously mix elements with the same (simple) name in the same document. Esp. the first point is necessary for long-term data archival.
How do namespaces help with versioning? That seems like a complete non-sequitur.
As for unambiguously mixing elements with the same simple name, I acknowledged that that's a theoretical possibility, but I've never seen it be important in practice.
> It can be compiled to strongly-typed DTOs for your language of choice. I.e., seamless, strongly-typed cross-language data exchange.
The tooling for that is very limited and ineffective, IME, to the point that you're better off writing some class definitions and generating XML or JSON serializers from those. There's a huge impedance mismatch between the kind of constraints that are natural to express in XML schema and the kind that are natural to express in programming languages.
> How do namespaces help with versioning? That seems like a complete non-sequitur.
They tell you how to interpret data and to which schema definition the data conforms. Elements `<a:MyElt>` and `<b:MyElt>` tell you explicitly how to interpret them. Without the namespace, you have to guess.
> The tooling for that is very limited and ineffective, IME, to the point that you're better off writing some class definitions and generating XML or JSON serializers from those. There's a huge impedance mismatch between the kind of constraints that are natural to express in XML schema and the kind that are natural to express in programming languages.
My experience is totally the opposite. If anything, XSD can express more constraints than most PLs will allow.
> They tell you how to interpret data and to which schema definition the data conforms to. Elements `<a:MyElt>` and `<b:MyElt>` tell you explicitly how to interpret them. Without the namespace, you have to guess.
So you'd mix and match elements from different versions of the schema in the same document? Does that work? I've never seen that done and can't imagine how code would handle that unless it was via some very simple translation rules (in which case the value would be minimal).
(I've seen documents that use the (single) schema declaration as a way of declaring that they're version 3.0 or version 3.1, but there doesn't seem to be any practical advantage to that over something more lightweight like "_version": "3.0" at the start of the document).
> If anything, XSD can express more constraints than most PLs will allow.
I don't actually disagree with this, but they're different constraints and it's not easy to losslessly convert. So it's very hard to use XSD as the source of truth and generate good, idiomatic versions of your constraints in the PL representation of your types. (It's also difficult to generate good, idiomatic versions of your PL constraints in XSD)
> So you'd mix and match elements from different versions of the schema in the same document?
No, the use-case is having an archive of documents conforming to different schemas. Or another use-case: schema evolves during the system's lifetime and you don't want to / can't upgrade old data to new schemas.
And yes, I even mix and match different schemas in the same document: pre-parsed information is stored in "my" elements, whereas the original data source is stored as extension in the XML, in its own namespace etc. So when the need arises for further processing/parsing, everything's already there in the document, with _the_ definitive source of truth. (Uninterpreted raw data)
> idiomatic versions of your PL constraints in XSD
That way is very easy: no PLs support (the XSD equivalent of) foreign keys, so that's "solved". Structs and inheritance are directly expressible, and even sum types from the languages that support it. Granted, XSD using sum types generates clumsy classes in PLs that don't.
> No, the use-case is having an archive of documents conforming to different schemas. Or another use-case: schema evolves during the system's lifetime and you don't want to / can't upgrade old data to new schemas.
Right, I talked about that case - AFAICS the schema is acting as a basic version tag (which is worth having, but can be done much more simply).
> And yes, I even mix and match different schemas in the same document: pre-parsed information is stored in "my" elements, whereas the original data source is stored as extension in the XML, in its own namespace etc. So when the need arises for further processing/parsing, everything's already there in the document, with _the_ definitive source of truth. (Uninterpreted raw data)
Embedding the original document sounds useful, but namespaces still seem vastly overengineered for that case - you'd presumably have a standard, well-defined place for the original document to go, so anything parsing/using your document knows about it and can just skip that node. I guess you get a little bit of value from being able to write xpaths that will never accidentally hit a node in the embedded document, but again that's something I've never seen actually be a problem in real life. Namespacing seems to be built to support the idea that you'd arbitrarily interleave nodes from multiple schemata, and that still seems like a solution in search of a problem.
> That way is very easy: no PLs support (the XSD equivalent of) foreign keys, so that's "solved". Structs and inheritance are directly expressible, and even sum types from the languages that support it.
Oh? Can you point me at a good implementation for Haskell or especially Scala? (TBH I think if we're accepting that the PL is the source of truth for what the constraints are then we don't gain much from encoding more of them into schema versus just checking them after parsing, but every little helps).
Except that it's syntactically separate so no other version tag can masquerade as yours. PLs have namespaces as a separate syntactic construct as well, and for a good reason.
> Embedding the original document sounds useful, but namespaces still seem vastly overengineered for that case
Quite the opposite, it's the simplest option. Everything (original and interpreted data) is kept together, and because of NSs, there's no danger of misinterpreting the one for the other.
> Can you point me at a good implementation for Haskell or especially Scala?
Not using those.
> I think if we're accepting that the PL is the source of truth
XSD can be processed to automatically generate parsing and checking code for whatever other PL than the original one.
> XSD can be processed to automatically generate parsing and checking code for whatever other PL than the original one.
Well, where are the actual working implementations of these things that you're saying are possible? You say there are tools that have good conversions between XML schema and language sum types; what tools? (and if not in Haskell/Scala then what languages?) Because my experience is that you just don't get good idiomatic representations from the tools, and end up either maintaining the schema and the code in parallel manually, or autogenerating a "dumb" schema that's missing most of your validity constraints.
Over a decade ago I worked with a vendor who had a SOAP service where every method had a single parameter, and that parameter was a Base64-encoded XML document containing the actual parameters. Just this year I was reviewing the web services documentation for a Fortune 500 company and realized that their SOAP offering was a transparently thin (like a soap bubble?) façade over a simpler "POST us some XML" API. Between these two experiences I have worked with more SOAP services than I can count, and I can't think of a single one where SOAP made anything easier.
> I've never known any of XML's features to help with actually solving a business problem.
Arm are publishing their full ISA processor specifications in XML, so it is 100% machine readable; see [1] for the actual specification and [2, 3] for why you want a machine-readable ISA spec in the first place. Arm has been really ahead with this, and all the other processor manufacturers are playing catch-up.
I'm not saying that XML is the ideal format for this purpose, but it clearly works for them. Which other format would you think is better for this purpose? Minimal requirements:
works for gigabytes of data including graphics, automatic format checking, conversion to/from other formats, widely supported, easy to hire for, standard IDEs (e.g. VSCode plugins), long-term stability (processors need support for decades).
It may be "100% machine readable" in some sense, but I looked at a couple of those XML files at random and it seems like the majority of the content is a human-oriented markup document. Looking at it from the other side: why are they using XML rather than JSON? I don't think they're gaining much from schema or namespaces; I suspect the main reason is that that lets them have a single-file markup document that contains embedded data in a standardised way (by using XHTML and XSLT), which is sort of legitimate but only really for a niche archival use case (because normally having structured source data and some process that generates the HTML document from it is absolutely fine).
Realistically you can achieve much the same thing by having a HTML file that embeds a chunk of JSON somewhere and then has a bit of javascript to render the page based on that JSON (i.e. filling the role of XSLT) and in most practical respects that's a lot better. The only missing part is that there's no defined standard for how you do that (it's not hard to do it, but there are different places where you could put the JavaScript and the JSON and for an archival document you want to be sure a future reader will know where to look) - and, given that XHTML+XSLT is widely seen as a dead end, I suspect there's not much enthusiasm for defining one.
How do you plan to serialize chat messages into protobuf format? Just a protobuf message with one json/xml blob in it? Protobuf can't easily contain an arbitrary tree structure.
> Existing chat programs use formatted messages with quotes, bold text, smilies and whatnot.
Right, but they're not arbitrarily nested tree structures. They're structured data that you want to represent in a structured way (e.g. lists of different kinds of spans), but I don't think a full DOM tree is a good representation.
Quotes apparently allow nesting. That said, serializing trees in linear form isn't really a problem: each node simply specifies its parent's index. The problem is that these nodes are polymorphic, and protobuf doesn't like that.
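A sketch of that linearisation in Python; the node layout is invented for illustration and isn't any real protocol's schema:

    from dataclasses import dataclass
    from typing import Optional

    # One flat node type instead of a recursive tree: every node records the index
    # of its parent, so a whole formatted message is just a repeated list of nodes,
    # which is the kind of shape protobuf-style formats handle naturally.
    @dataclass
    class Node:
        kind: str                      # "body", "quote", "bold", "text", ...
        text: str = ""
        parent: Optional[int] = None   # index into the list; None for the root

    message = [
        Node("body"),                                   # 0: root
        Node("quote", parent=0),                        # 1: a quote...
        Node("quote", parent=1),                        # 2: ...with a nested quote
        Node("text", "original text", parent=2),        # 3
        Node("text", "my reply", parent=0),             # 4
    ]

    # Rebuilding the tree is just grouping by parent index.
    children = {i: [] for i in range(len(message))}
    for i, node in enumerate(message):
        if node.parent is not None:
            children[node.parent].append(i)
    print(children)   # {0: [1, 4], 1: [2], 2: [3], 3: [], 4: []}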
If XMPP and Matrix were based on the same protocol stack, the arguments for keeping XMPP this whole time would have made sense.
But they aren't. Matrix is HTTP and JSON while XMPP is TCP and XML. Protocol extensions to XMPP have added HTTP transports - but that's just wrapping non-HTTP requests in HTTP and making processing harder.
I think a good comparison is X11 vs Wayland. X11 was retrofitted with all the functionality Wayland has natively - "why switch" is a legitimate question. But over the next 20-30 years the maintenance burden of the X protocol's legacy means we will see any new development using Wayland, because it's so much easier to do. In the same sense, chat apps will probably favor Matrix over XMPP because of the protocol maintenance burden - even when drop-in libraries exist that implement XMPP, the inherent complexity makes them buggier and slower and requires a lot of developer nuance to get right, and it's all due to legacy cruft you have to accommodate on an old standard.
It's also like how more modern kernels stopped trying to be Unix compatible. Trying to meet the POSIX standard in the 21st century is kind of a waste of effort when a lot of it is unnecessary. The model is sound; it's just that adhering strictly to its eccentricities is an unnecessary headache.
I was impressed at how easy it is to work with the matrix protocol. While playing around I was doing it all interactively in bash using curl. Want to read the latest messages? Curl the sync url. Want to send a new message? Post a small json object.
It’s actually easier to use by hand than IRC which requires holding an open connection and quickly responding to pings.
It becomes a little harder when end to end encryption is on but you just import the library they supply for almost every language and then e2e becomes transparent.
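To give an idea of the curl flow described above, here's the same thing in Python (endpoint paths are from my memory of the client-server API, and the homeserver, token and room id are placeholders, so double-check against the spec):

    import requests

    homeserver = "https://matrix.example.org"                   # placeholder
    auth = {"Authorization": "Bearer YOUR_ACCESS_TOKEN"}         # placeholder
    room_id = "!abc123:example.org"                              # placeholder

    # Reading: one GET against /sync returns the latest events.
    sync = requests.get(f"{homeserver}/_matrix/client/v3/sync", headers=auth).json()

    # Sending: one PUT with a small JSON body posts a message to a room.
    requests.put(
        f"{homeserver}/_matrix/client/v3/rooms/{room_id}/send/m.room.message/txn1",
        headers=auth,
        json={"msgtype": "m.text", "body": "hello from a script"},
    )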
I've tried to set up some Matrix projects. The Client-Server API is easy to work with, but as soon as encryption is involved, things start getting messy. Many libraries have a hard time working right with E2EE enabled, because suddenly you need to keep track of all manner of things that aren't always documented well.
I tried to hack E2EE in by using Pantalaimon [0] but running that on a server with the necessary management capabilities is very tricky and doesn't do cross signing, so I've come to the conclusion that it's effectively useless for my use cases.
Every now and then I check back on the current state of E2EE in libraries and it does seem to be improving. Hopefully the entire process becomes easier next time I get the time to work on my proof of concept code.
> I was impressed at how easy it is to work with the matrix protocol
I think you were interacting with Synapse, the Matrix homeserver implementation, which is why you had an easy time. But I can't imagine the work done by Synapse to sync up with other servers is easy - and that's what I would consider the real protocol.
There are two protocols. The Server-Server protocol, and the Client-Server protocol. Both are "real". I was using the Client-Server protocol which synapse implements but other servers would implement exactly the same HTTP protocol.
...Which indeed makes life harder. As mentioned elsewhere in the discussion, there are compliance suites, and capabilities and feature discovery, but the clients are indeed quite diverse, and sometimes it's hard to know what to expect from them.
> XML is not a good serialization format
I personally find it more suitable than JSON for many purposes, including extensible IM, for a couple of reasons: namespaces help to avoid conflicts, while the data model itself is more convenient for general data encoding since it's straightforward to encode sum types in XML (where a tag corresponds to a constructor), unlike in JSON (where there's a few ways to achieve it, though with JSON-over-HTTP that part is usually only done on the top level and goes into HTTP query). Maybe I'd prefer s-expressions to XML for an IM, but not JSON.
> and some of the requirements in the protocol are pretty exotic and not trivial to implement in a lot of languages. Like setting up TLS on an existing connection (plus the absolute refusal of the XMPP WG to allow on-connect TLS).
I wouldn't call that one "exotic": STARTTLS is used by a few other common protocols, and allows for opportunistic TLS; easily usable with common libraries (OpenSSL, GnuTLS) too. There's XEP-0368: SRV records for XMPP over TLS [1] for direct TLS connections though.
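For what it's worth, the "TLS on an existing connection" part really isn't exotic with modern libraries. A Python sketch (the host is a placeholder, and the actual <starttls/> negotiation that has to happen on the cleartext stream first is elided):

    import socket
    import ssl

    context = ssl.create_default_context()

    # Plain TCP connection first (the standard XMPP client port).
    sock = socket.create_connection(("xmpp.example.org", 5222))

    # ...stream header plus <starttls xmlns='urn:ietf:params:xml:ns:xmpp-tls'/>,
    # then wait for the server's <proceed/> -- omitted here...

    # Upgrade the same socket in place; everything from here on is encrypted.
    tls_sock = context.wrap_socket(sock, server_hostname="xmpp.example.org")
    print(tls_sock.version())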
I've never understood why XML gets so much hate outside of the context of the browser. Yes, it's verbose but fully featured parsers are everywhere and it seems far easier to include a random (base-64'd) binary blob in place if required as opposed to JSON. This is an area I know little about -- I'd love someone more knowledgeable to educate me.
I think it's partly down to being exceptionally verbose for simple tasks. Dealing with extra empty text nodes, namespacing, etc. JSON for most simple use cases is just "dump kv, load kv".
I'm not sure I understand the issue with base64 in JSON? Have I misunderstood the point there?
Additionally, XML, with its nesting and attributes, doesn't map particularly well to a specific programming construct. Also, you can represent an object in a lot of ways.
JSON, on the other hand, is dead simple: you have your list/array, dict/object and four value types. Representing JSON in your favorite language is straightforward, and how the serialized version of a hierarchy looks is immediately obvious in nearly all cases.
Depends on the language. XML maps nicely to C/C++/Java/Rust/... structures, that you can generate from a schema at compile time. Dealing with arbitrarily typed arrays and objects is painful in some languages because you need to use variants and dynamic-cast every access to the document.
That's exactly the problem I was going for - feature richness is the enemy of simplicity. The fact that you need to read a schema (and learn one of the schema languages) makes starting out with XML just much harder than JSON.
You implicitly need a schema in JSON too: it's the API documentation.
Then, every implementer in a statically typed language needs to translate the JSON format into structures in their own language. And this is impossible to automate, unless the JSON API uses a formally defined schema (eg. OpenAPI or GraphQL schemas), but then we're back to square 1.
If you don't care about that (eg. in dynamically typed languages), you can parse XML just like you parse JSON, using something like https://github.com/martinblech/xmltodict
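For example (xmltodict's default conventions, as far as I recall them):

    import xmltodict   # third-party: the library linked above

    doc = xmltodict.parse("""
    <order id="17">
      <item>apple</item>
      <item>pear</item>
    </order>
    """)
    print(doc["order"]["@id"])    # '17'  -- attributes get an '@' prefix
    print(doc["order"]["item"])   # ['apple', 'pear'] -- repeated tags become a list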
Unbounded extensibility is a great way to get adoption of a standard, in the sense that a lot of products will say "Fully $standard Compatible!" in their marketing materials. It is a bad way to ensure that the standard actually means anything outside the marketing materials.
> Also, XML is not a good serialization format, and some of the requirements in the protocol are pretty exotic and not trivial to implement in a lot of languages. Like setting up TLS on an existing connection (plus the absolute refusal of the XMPP WG to allow on-connect TLS). JSON over websocket or HTTP are just better at reaching a wide audience of developers.
"StartTLS" behavior is pretty broadly pushed for across the IETF. It usually goes hand in hand with SASL support in protocols like IMAP, SMTP, and XMPP.
This is really done for three reasons:
1. Detection via SRV with alternative 'secure' protocol versions requires twice as many DNS lookups
2. It makes it far more likely a client/server will ignore the text on avoiding insecure fallback.
3. It makes it harder for someone who wants to do introspection of traffic to do so cheaply by blocking the standard secure ports.
Rather than multiple ports, the trend has been toward requiring TLS now that TLS certificates are something that can often be automated for free. IMHO it took considerable push-back to get a cleartext version of HTTP/2 approved.
I have never seen a good justification for this attestation that isn't some variation of "you're not accommodating the optimization I want to do that is orthogonal to rendering this binary representation into a persistent representation suitable for long term storage or transmission to another system".
Even the great XML/JSON flamewar ended up with the same exact ecosystem of tools being implemented for both encoding schemes: XPath/jq, JSON Schema / XML Schema, etc...
It's all encoding. There isn't "good" or "bad"; there's "different" and "works with what I've got with no additional work". That's it.
One of the particular issues I recall hearing someone talk about with XMPP is that it uses XML in a way that makes it hard to use existing libraries--I believe XMPP basically relies on an unclosed XML document that keeps getting appended to, so you need to have strong partial parsing guarantees that's unusual for XML formats.
That's indeed a common complaint, though streaming XML parsing is not that uncommon, and there's even a common algorithm/API for that--SAX [1]--supported by multiple libraries; otherwise parsing even finite but large documents (e.g., Wikipedia dumps) would require a lot of memory.
That really shouldn't be necessary, though! XMPP is conceptually transmitting a series of independent messages, not an arbitrarily complex document tree. There's no reason why it couldn't have recognized this at the protocol level and transmitted each message independently. (This could have been as simple as placing a CR/LF delimiter between each message and forbidding that sequence within messages.)
It's true that DOM parsers will have a hard time. But event-based, SAX-style parsers have no issue with an unclosed document, and there are plenty of those around (and were around, back in the heyday of XMPP).
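Right; for example Python's stdlib pull parser has no problem with a stream that is never closed. A sketch with made-up stanzas (real XMPP adds namespaces on top of this):

    import xml.etree.ElementTree as ET

    # The <stream> element is opened and never closed; stanzas arrive in chunks.
    chunks = [
        b"<stream><message><body>hi",
        b"</body></message>",
        b"<message><body>still parsing fine</body></message>",
    ]

    parser = ET.XMLPullParser(events=("end",))
    for chunk in chunks:
        parser.feed(chunk)
        for _event, elem in parser.read_events():
            if elem.tag == "message":
                print(elem.find("body").text)   # handle each stanza as it completes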
As someone who started writing an XMPP client at the time, I can confirm that the stupid unclosed document was painful. It's a leaky abstraction that goes beyond the parsers (that you can't use). It didn't help that the official documentation heralded this as a beautiful innovation.
Actually, removing some stuff from XPath could give guarantees while parsing partial files. That things get ugly sometimes if XML documents aren't complete doesn't mean that other encodings are automatically better.
People should care about some good hedge grammars for extensible formats instead of getting political about that stupid encoding. There should be something beyond the current XML standards and libraries, but please make it not be JSON.
I've posted it in another subthread already, but XML, with its nesting and attributes, doesn't map particularly well to a specific programming construct. Also, you can represent an object hierarchy in a lot of ways and it's very verbose for even the smallest stuff.
JSON, on the other hand, is dead simple: you have your list/array, dict/object and four value types. Representing JSON in your favorite language is straightforward, and how the serialized version of a hierarchy looks is immediately obvious in nearly all cases.
Wait, what? Since SMTP's STARTTLS is quite old, I thought it was by now well known that _any_ form of late TLS is a massive security vulnerability...
Some form of transport layer security not being the default and starting late would, for me, in itself be enough of a reason to exclude XMPP from a lot of things.
I'm not aware of any vulnerability in XMPP due to the use of starttls. If you (or anyone) can prove otherwise, I'd be happy to see that. We can do coordinated disclosure across projects via the XSF.
There is no attack surface because the data exchanged before the upgrade is mostly superfluous. I always felt like the refusal in the WG to add a dedicated TLS-only port in the spec was more feeding someone's conception of niceness than anything else.
Happy to see it has become de-facto supported by now though. I tried writing a client in the late 2000s with .NET TlsStream and it was not a great experience.
If any data which can potentially be exchanged before STARTTLS can affect the connection establishment in any way (except simply failing it) then there is a problem because of MITM attacks.
If there is nothing like that then there is no point to have STARTTLS.
Idk if that is the case for XMPP. Note that what matters is what can be sent, not what is sent, as a MITM attacker can modify anything.
But even if not, if there is a risk of an implementation being affected by anything sent by a MITM attacker impersonating a server, it's a problem.
Like attacking vulnerabilities in XML parsers.
Either way, I haven't had time to look at the protocol, but STARTTLS is, security-wise, generally a bad idea.
> If there is nothing like that then there is no point to have STARTTLS.
I agree that there is really no benefit to have STARTTLS with modern TLS implementations (e.g. now that we have SNI everywhere), but this was not entirely the case last time the XMPP core specs were revised. However there are multiple benefits to be gained from switching, and that is a change already in progress for some time. I don't doubt that the next revision of the XMPP RFCs will reflect this.
The reason that they needed to set up such a complicated analogy is because they needed a way to justify a hand-wavey fitness function of "extensibility/modularity" instead of the much simpler and more useful "does this thing actually survive?"
Despite (or, as you argue, because of) its adaptability, the answer to that question is empirically "no."
in my experience (working with academic data, JSON-LD), being aggressively semantic and adding namespaces to your format very rarely results in out of the box machine interoperability, and picking the right format out of a bunch being offered is always extra complexity on the side of the developer. but most of the time you'll likely be writing code specifically for one source of data.
this leads to such standards having a shadow de-facto convention-based standard (e.g. everyone on the fediverse plays nice with Mastodon, despite ActivityPub allowing for things to be different).
if you need it, just versioning your schema is imo a much better experience for developers
Granted, in the case of XMPP (but I'm no expert here) they might have overused namespaces with all those protocol extensions (XEPs) using their own NS. XML namespaces are one of the few additions on top of SGML and might've looked like a good idea at a time when many new vocabularies were expected, but they're controversial even among those who introduced them. Eg consider the following key quote from [1]:
> On the one hand, the pain that is caused by XML Namespaces seems massively out of proportion to the benefits that they provide. Yet, every step on the process that led to the current situation with XML Namespaces seems reasonable.
> This is the very situation where matrix devs should have made use of the properties of XMPP to improve it. Even the outstanding feature (I admit, it's a fantastic idea) of matrix, decentralized conversation store, could have been implemented in XMPP as an XEP. Imagine the time and effort spent on improving XMPP, instead of reinventing wheels in matrix. We could have had a neat ubiquitous IM platform.
The approach XMPP took may have made sense when it was created, and it definitely had a lot of success early on at creating a truly federated IM network, but a lot of that evaporated when people and companies started needing more from the system than XMPP could guarantee.
XMPP + a bag of XEPs isn't a "neat ubiquitous IM platform" unless we get an XMPP2 that mandates certain key modern XEPs. It's just a big giant mess, the same one it's been for a long time.
Maybe that's where Matrix should have started, I dunno. But they are where they are and the reality is that it's not up to the IETF to dictate how things should evolve. If Matrix supplants XMPP as a dominant open IM federation, and the people behind it want it to be standardized, all the "but XMPP did it first" in the world shouldn't prevent that.
I think the point that FOSS enthusiasts fail to understand repeatedly is that once a protocol becomes so embedded and fractured over multiple implementations, it becomes impossible to change.
There have been efforts to modernise IRC as well, but IRC still isn't even keeping pace, let alone catching up.
I think the easiest way to handle this is to actually just drop legacy and start again when you reach this point. It is infinitely easier to build your own solution from scratch than it is to move a mountain and convince an existing community and ecosystem to do what you think is best.
And matrix is proof this works. After trying it again this year, I think they have finally created a product which works really well.
> And matrix is proof this works. After trying it again this year, I think they have finally created a product which works really well.
The server and clients improved. The protocol did not improve.
The protocol is still designed to be a metadata sponge that proliferates this data as far as possible. There’s no way you’d design it in the same way today if you didn’t have intelligence funding in the earliest stages…
If XMPP clients improved, you’d say the same thing. “Wow, I guess XMPP works”
I don't think I'd characterize HTTP as having fractured in anything like the way XMPP has, which seems to have deliberately chosen a wild-west approach to extensibility vs. HTTP's more modest evolution. Until we got http2 and http3 in a span of 5 years, but that's because a big megacorp forced it to happen -- and even then those are backwards compatible with http1.1.
So is there a list of "Modern XMPP" compliant clients, server software and hosts that I can compare the experience of to the Matrix ecosystem?
I was a pretty big user of XMPP with Pidgin, first on GTalk and then later on DDG's server back in the day, but that's certainly not comparable. I'd really like an open chat solution to succeed, but from where I sit, on the question "Can I convince any of my friend groups to move from Discord to this?", Matrix is substantially in the lead.
It takes an appropriately set up xmpp server and essentially reproduces a social network. Microblogs, groups, and chat. Implements OMEMO and the like if memory serves. It is a great platform and just as compelling as Discord IMO.
> It takes an appropriately set up xmpp server and essentially reproduces a social network. Microblogs, groups,
This to me is missing the point - I don't even like Discord's attempts to expand into social networking, and Skype's attempts at doing the same were the impetus for many of my friend groups to move to Discord. What's needed is something that does group chats and PMs effectively.
No, this discussion isn't a comparison of implementations; it's a question of why "the latest implementation" is tied to a new extensible protocol instead of a new baseline of the existing extensible protocol. There may well be reasons, but "the first implementation was on the new protocol" explains the current state, not how things got there.
That's pretty neat. Are there any compliant clients? If I google the XEP you linked I can't find anything but mailing lists and posts talking about it.
We're working on surfacing this information. It's hard keeping track of a diverse and evolving ecosystem - it's been attempted in the past (years ago), but manually keeping things up to date didn't work out because the info quickly got stale. More recently we've built tooling so that projects can document what they support in a machine-readable format on their site or in their repo. Most active XMPP projects are already on board with this, and we should see some nice front ends to it live soon.
The ultimate goal is an easy user-facing site with a shortlist of the clients that implement the expected modern features, alongside a longer filterable list for people who don't necessarily care about certain features (calls, for example, which are generally not expected in TUI apps).
Standards need extensibility, but you can't just slap a major extensibility framework on top of a small core and then get a blanket claim that competing standards must be implemented as extensions. A good litmus test: how useful is XMPP without extensions today?
> it will be better for XMPP and Matrix devs to combine their efforts
I don't like that open chat protocols have taken so long to get adoption either, but coercing an organizational structure seems like the best way to keep the status quo.
Imo, good standards have been tried and proven in a "many-degrees-of-freedom manner" before being accepted, such as a major OSS project with thousands of active users, which matrix is doing now. Design by committee has a terrible track record.
Just for context -- there's a discussion in the IETF about what collaboration tools it should use (e.g., for working group meetings and general discussion) in addition to e-mail. A few different options have been trialed, including Zulip and Matrix.
This is NOT about the IETF recommending one chat / collaboration protocol over another, or standardising Matrix.
I'd argue XMPP died long ago, at least for the purpose of the "Anything in this world that has survived for a long-time, had to be fit to withstand selective pressures." argument.
Of course it isn't completely dead, just like some enthusiasts still use about every imaginable historical computing platform, but for practical purposes, I think most public communities have moved, usually either to Matrix or closed platforms like Discord or Telegram.
If the choice is between xmpp and matrix it's a no-brainer to be honest.
I was a matrix node operator for a while, because I love operating all sorts of decentralized stuff for people in my free time. In short, my findings were that the server implementations are not ready for critical production use by the likes of the IETF, for example.
The choice of server implementation is super important because it locks you into a certain SQL structure and possibly even a certain password hashing.
So back then I obviously went with synapse because the others were not even nearly ready. But at the same time matrix.org was struggling under a new wave of sign-ups. And they're still struggling. I still see messages on boards and IRC like "why is my client just hanging when I join a room", likely because people in that room are federating with Matrix.org and it's super slow.
So that wasn't a good impression of synapse right off the bat. Since it's the first, reference implementation and written in Python, I figured it just wasn't good at scaling; I would need more k8s resources to scale it than I would for an implementation in Rust or Golang.
So now I'm waiting for dendrite or one of the other ones to become fully featured with group messages, encryption and all that until I re-launch my matrix instance.
I also won't launch it without a proper implementation for account approval, or invites, either one I make myself or someone else publishes one before that.
AND even if one of the faster implementations is ready, you still have to take into account the client app support for platforms like iOS and Android.
I agree. Matrix isn't quite ready for prime time, but it's getting there. Synapse is getting more efficient with every update, and the spec is improving as well.
For example, they recently released the new sync API, which makes joining a room almost instant.
I'm really looking forward to low-bandwidth Matrix as well, which is going to improve performance by a lot.
Client compatibility seems pretty good overall, with Element and Fluffychat available to almost everyone.
In my eyes, the biggest weak point for Matrix as an IM protocol is that there's no business model for server providers.
Since someone said I should provide a detailed rationale:
* It takes a large amount of text to set up a very poor and low value analogy
* It doesn't say what features XMPP has that actually make it superior to matrix
* It seems to extrapolate, from the fairly generic forward-looking "Imagine a world", that matrix was started out of ignorance of the existence of xmpp - a claim that is easily dismissed by any of the development documents.
It has a few questionable follow-on claims - basically treating encryption as optional depending on the deployment, despite every current system (from the last 10+ years) recognizing that encrypting messages is a core requirement of any communication system.
I have updated the comment with the explicit problems. I still believe that the link post has near zero information as it provides no actual facts to back up any statement. All it says is "xmpp exists so it should keep existing because [hand wave]", with a dash of pseudo-biology thrown in.
I feel I need to be clear here: I understand that simply saying "this is nonsense" to an arbitrary post is not reasonable, but it is also unreasonable to require a detailed teardown of posts that are nothing but nonsense. Requiring such work is a bedrock tactic of scammers, pseudo-science, spammers, etc., as it makes it harder to stop them: if every response has to be well thought out and cited, eventually people give up, and the posts go uncontested, which gives them the appearance of relevance or legitimacy.
This post is a nonsense article. It contains no information, it makes no actual claims beyond "we have xmpp so we should continue to have xmpp", it is far too long for the actual content, and the content that is there is largely hand-wavy references to a badly made and incorrect pseudo-scientific analogy.
This is the kind of post that does deserve a dismissive "this is nonsense" response.
One solid argument: xmpp failed as big tech (Google, Facebook) protocol and doesn't provide anything radically better now: just a ton of protocol extensions and incompatible clients.
Matrix is on the rise: it has business with EU governments, it delivers new features (spaces), and with voice channels it will probably become a real rival to Discord/Slack.
> xmpp failed as big tech (Google, Facebook) protocol
Did it? They embraced it, made it possible for external XMPP clients to interact with their own services, and then suddenly cut them off. For all we know, they are still using XMPP internally; they just don't federate anymore.
That sounds like a failure of XMPP. There is no reason for someone to need two message platforms other than userbases being split. If you actually wanted both for functionality purposes, it shows both of them are lacking.
It is a great pity that something that has created so much value for so many companies now somewhat languishes. A real shame. Not that xmpp isn't great, but it'd be so much better if the platforms that benefited from it had only contributed back more.
Like many extension-oriented protocols, XMPP works for internal use where the same organisation (or two closely cooperating organisations) control both ends, but doesn't work for communication between two loosely connected or unrelated parties.
Honestly neither websockets over HTTP nor the underlying protocols used by XMPP are particularly good. They should be designed more like IMAP; at CMU during the first significant IETF instant messaging effort, Rob Earhart distilled the lessons of IMAP into an “Access Protocol” Internet draft that makes for a better substrate for pretty much any Internet protocol: https://datatracker.ietf.org/doc/draft-earhart-ap-spec/
A lot of the reason the IETF process around IM was so lengthy and led to so much cruft and weirdness is that there were telco folks on the lists who insisted that only small UDP packets would ever get through telco networks and TCP was an absolute non-starter, and kept trying to center the discussion around their current data networks rather than even the planned 2G and 3G networks.
Also, just like email, an instant message protocol should be connection-based (TCP), provide good response time for plain text messaging over a 9600bps or slower link, and only require a page or two of C code to parse without heavy use of the standard library.
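To make the "page or two to parse" point concrete, here is a tiny sketch in Python of the kind of format being described - a made-up, IMAP-ish tagged line protocol, not the actual Access Protocol draft: one command per line, a client-chosen tag first so replies can be matched to requests, and the body as the untokenised remainder of the line.

```python
# Hypothetical wire format, for illustration only (not the Access Protocol):
#   <tag> <verb> <target> <body...>
def parse_line(line: str) -> dict:
    # Everything after the third space is the body, so message text may
    # contain spaces without any quoting or escaping rules.
    tag, verb, target, body = line.rstrip("\r\n").split(" ", 3)
    return {"tag": tag, "verb": verb.upper(), "target": target, "body": body}

print(parse_line("a001 SEND bob@example.org still fine over a 9600bps link\r\n"))
# {'tag': 'a001', 'verb': 'SEND', 'target': 'bob@example.org',
#  'body': 'still fine over a 9600bps link'}
```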
In a vacuum, something like that Access Protocol may be good, but we’re not in a vacuum. The web has won, and basing things on top of web specs conveys significant advantages: the general tooling for HTTP/XML/JSON is much better than the tooling for any specific-purpose binary or text protocol that I know of, so you can get started building stuff much more quickly, with better understood and handled failure modes, and better tools for debugging when anything goes wrong; and the large fraction of the world’s software that runs inside a web browser can also take advantage of this without needing some kind of non-standard server proxy. Most of the time I would say these advantages significantly outweigh the disadvantages of inefficiency.
There are reasons JMAP was designed to sit on top of HTTP and JSON. It makes life much easier.
There's something to be said for that here, however. When was the last time you typed out a chat/IRC message longer than 512 bytes (IRC's line limit)? That's well under the 576-byte minimum datagram size every IPv4 host must accept, so such messages generally fit in a single packet and avoid fragmentation and the retransmission amplification that comes with it.
It also uses only symmetric crypto, for "that's a feature not a bug" reasons; among them: it forces its users to forge decentralized WoTs instead of relying on big centralized platforms to authenticate their own friends to them.
> A lot of the reason the IETF process ... led to so much cruft and weirdness is that there were telco folks
That has been a major problem for a long time now.
The whole "why didn't Matrix just build on XMPP?" thing is incredibly tedious - you'd hope people would have got over it by now.
It's like saying "Why have Canvas, when we already have SVG?". Or "Why have NNTP, when we already have mailing lists"? Or "Why have Git when we have Subversion"? Or "Why have Linux when we already have BSD"?
The projects are diametrically opposite in their most fundamental ways - with the sole exception that they can both be used for instant messaging. Just like SVG and Canvas both render graphics, and NNTP & SMTP can both be used for discussion forums, and Git & SVN both manage source code. They have completely different tradeoffs; one is not necessarily better or worse than the other; they are just utterly different approaches and can and should coexist based on folks' preferences and requirements.
* Matrix is a decentralised conversation history replication protocol, not a messaging protocol. Architecturally it has no concept of direct messages or store-and-forward messaging for IM; instead everything is a group conversation with a conversation history getting replicated between servers (and their clients). Conversely, XMPP is a message passing system (although it's true you could layer a DAG-based conversation database on top in a XEP, at the expense of compatibility with the rest of XMPP). It's literally the difference between NNTP (Matrix) and SMTP+IMAP (XMPP). (A simplified sketch of such an event DAG follows this list.)
* Matrix is a single monolithic spec, defining precisely what features exist for a given stable release. New features are proposed as PRs to the spec (MSCs), often with competing options, but only one gets ratified and merged. Conversely, XMPP is a cloud of XEPs, with compatibility XEPs published occasionally.
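To illustrate the first point with a rough sketch (field names simplified; real events carry more fields such as auth_events, depth, hashes and signatures, and current room versions derive event IDs from content hashes): each event references the most recent event(s) its sender's server had seen, so a room's history forms a DAG that servers replicate and reconcile, rather than a stream of store-and-forward messages.

```python
# Simplified illustration only, not the exact federation wire format.
event_a = {
    "event_id": "$a",
    "type": "m.room.message",
    "sender": "@alice:server1.example",
    "content": {"body": "hello"},
    "prev_events": [],        # first event in this fragment of the room
}
event_b = {
    "event_id": "$b",
    "type": "m.room.message",
    "sender": "@bob:server2.example",
    "content": {"body": "hi"},
    "prev_events": ["$a"],    # extends Alice's event
}
event_c = {
    "event_id": "$c",
    "type": "m.room.message",
    "sender": "@carol:server3.example",
    "content": {"body": "hey"},
    "prev_events": ["$a"],    # sent concurrently on another server: the DAG forks
}
# A later event would list ["$b", "$c"] in prev_events, merging the fork.
```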
There are obviously other differences (e.g. Matrix mandating E2EE for private conversations (https://matrix.org/blog/2020/05/06/cross-signing-and-end-to-...), XML v. JSON etc) but the two points above are the fundamental differences which mean that you wouldn't be able to build Matrix on XMPP short of effectively creating a new protocol - at which point you've effectively built Matrix anyway.
This said, there are also some similarities which often get misrepresented or misunderstood:
* Both protocols are extensible. In Matrix, you can send any data you like (just define a namespace and off you go), and extend existing data with anything you like. You can also go wild and define your own APIs, and room versions (i.e. dialects of the protocol) under your own namespace. Matrix provides mechanisms for backwards compatibility & fallback for clients (and servers, in future) which don't speak your dialect. However, it's abundantly clear that when doing so you are going off piste (although you're welcome to propose your extensions as a change to the main spec). (A short sketch of sending such a custom event follows this list.)
* Both protocols are open standards, maintained by non-profit foundations (https://matrix.org/foundation and https://xmpp.org/about/xmpp-standards-foundation/). Both started off being built by commercial teams (New Vector Ltd and Jabber Inc respectively) before progressively shifting to an open governance model.
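As a sketch of the "define a namespace and off you go" point: the homeserver, room, token and event type below are all invented, and this assumes the standard client-server "send event" endpoint.

```python
import requests

HOMESERVER = "https://matrix.example.org"      # hypothetical homeserver
ROOM_ID = "!abcdef:matrix.example.org"         # hypothetical room (URL-encode in real code)
ACCESS_TOKEN = "..."                           # obtained via login, not shown here

# A made-up, namespaced event type: nothing in the spec defines it, but
# servers will store and relay it, and clients that don't recognise the
# namespace ignore it (or render a fallback if one is provided).
event_type = "com.example.pollapp.vote"
content = {"question_id": "q42", "choice": "yes"}

resp = requests.put(
    f"{HOMESERVER}/_matrix/client/v3/rooms/{ROOM_ID}/send/{event_type}/txn-1",
    headers={"Authorization": f"Bearer {ACCESS_TOKEN}"},
    json=content,
    timeout=10,
)
print(resp.json())  # e.g. {"event_id": "$..."} on success
```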
Finally, the weirdest thing about this random IETF mailing list post popping up from April 2021 is that I believe the IETF chose to go with Zulip in the end: their tools team was freaked out that Matrix was openly federating and replicating history from non-IETF servers onto their instance (plus Synapse's admin UI is lacking). On the Matrix side, our solution to this will obviously be to work with the Zulip folks to get Zulip talking Matrix :)
""" Matrix is a decentralised conversation history replication protocol, not a messaging protocol. Architecturally it has no concept of direct messages or store-and-forward messaging for IM; instead everything is a group conversation with a conversation history getting replicated between servers (and their clients). Conversely, XMPP is a message passing system (although it's true you could layer a DAG-based conversation database on top in a XEP, at the expense of compatibility with the rest of XMPP). It's literally the difference between NNTP (Matrix) and SMTP+IMAP (XMPP).
"""
This is how Matrix solved my problems right away and why XMPP never actually understood or cared about my problems.
I want all my messages and history on all my devices wherever I login with the proper credentials... not a hand-wavy reference to 3 different ways to implement that with 9-15 different XEP combinations which no one has done yet.
TLS/crypto and the federation on top are great/mandatory, but Matrix solved the core problem first, and that's what matters.
> I want all my messages and history on all my devices wherever I login with the proper credentials
Is this not possible with XMPP? Genuine question. I thought there were recent XEPs for standardized handling of message history (including export/import).
Correct, it's been possible for many years. What's generally the case is that people complaining about things like this are still using/remembering software that hasn't changed much in the last decade. Common culprits are Pidgin and Adium (both based on libpurple) which, while still developed, lack many modern features. I'm hoping that will change, but it's subject to the usual economics of open source.
XMPP is caught between a group of people who complain that it is stuck in the past (i.e. they think it still only supports the features it supported in 2008) and another group of people who complain the opposite - that it has "too many XEPs" (this is due to the fact that XMPP evolves through a mechanism of creating and deprecating extensions to the core protocol). In reality the XSF annually publishes the current list of recommended XEPs for different use cases (see https://xmpp.org/about/compliance-suites/ ) and the XEPs for multi-device messaging have been part of this set for many years, and are implemented in practically every actively maintained XMPP client.
No, the whole point of MSCs is that they aren't part of the protocol - they are just random proposals for new ideas; the more the merrier. The protocol doesn't have extensions.
When an MSC makes it through the process, it is added to the protocol, right?
Then all the servers and clients need to implement those MSCs or else they are not compliant with the protocol.
Either you're not compliant with the protocol (Matrix), or you're not compliant with a certain extension (XMPP).
I don't understand how these are different. Matrix has a lot of non-compliant servers and clients. XMPP has a lot of servers and clients that don't implement certain extensions.
> When an MSC makes it through the process, it is added to the protocol, right?
Yes.
> Then all the servers and clients need to implement those MSCs or else they are not compliant with the protocol.
It's no longer a MSC at that point; it's part of the next versioned release of the protocol. So yeah, if the feature profile their client/server is targeting doesn't implement the required features from that version of the protocol, it's not compliant.
> Either you're not compliant with the protocol (Matrix), or you're not compliant with a certain extension (XMPP).
> I don't understand how these are different. Matrix has a lot of non-compliant servers and clients. XMPP has a lot of servers and clients that don't implement certain extensions.
The difference is that in Matrix there's only one specification which servers & clients can be compliant or not compliant with. Whereas in XMPP, there's an arbitrary combinatoric explosion of XEPs which you may or may not be compliant with.
> The difference is that in Matrix there's only one specification which servers & clients can be compliant or not compliant with. Whereas in XMPP, there's an arbitrary combinatoric explosion of XEPs which you may or may not be compliant with.
We don't version XMPP in this way, but instead annually publish the "compliance suites" listing the required XEPs for a range of profiles: https://xmpp.org/about/compliance-suites/ - so for example an XMPP client can be compliant with the 2021 mobile requirements. In this document we also hint to developers which XEPs are looking promising for the future, so they have ample notice to play with them and give feedback.
Beyond the compliance suites, implementations are obviously free to experiment with other XEPs, and that experimentation feeds into the standards process and determining what will be in the following year's compliance suite.
I agree that the bickering is tedious and old at this point (and perhaps has been since the start), but there is a sense that the technical arguments miss the point of what the contention is about. Now, my argument is based mostly on conjecture and observation, and I mainly use XMPP because Conversations is the only thing that runs on my Blackberry Q10 (another thread entirely); I have not much of a horse in the race in any other way.
Though XMPP and Matrix are diametrically opposed in terms of their protocol semantics, and can be meant to occupy different use-cases at their edges (say, message passing vs. eventually-consistent data stores), their core use-cases are much the same: one-to-one and many-to-many messaging for both public and private groups, in competition with other, proprietary applications occupying the same space (e.g. Slack, Signal, Telegram, Discord). For sure, it might be agreed that there's a wide spectrum of difference even in the aforementioned proprietary applications (the whole banquets and barbecues notion), but I'm hoping it's not controversial to think that one can implement an application in the same UX vein as the proprietary alternatives using either XMPP or Matrix.
That, I feel, is the core of the contention -- that somehow Matrix has wrested focus/effort/velocity away from a still-viable project in terms of end-user-goals, thus somehow dooming it (or open-source/community-owned messaging in general) in the process. At face value, this might very well be nonsense; if XMPP cannot compete, or is not viable, nothing anyone outside the project does will affect this fact. Conversely, if Matrix didn't exist in its current form, it might not exist at all -- it's not a foregone conclusion that efforts would've been poured into XMPP or whatever else instead. In any case, competition is generally thought to be a good thing, insofar as it helps drive competitors to improve.
Emotionally, though, I think I understand the contention, and I've seen it happen not with protocols, but with things like Linux distributions, where people would lament the proliferation of disjoint efforts, where no clear, viable contender to the proprietary solutions existed at the time -- and some might argue does not exist still; nevertheless there's not much lamenting nowadays, where multiple viable distributions and desktop environments exist.
In some sense, community-owned messaging is still in a precarious state, and splitting up efforts, as it were -- since efforts are not just split in protocol implementation, but also in the ecosystem of applications etc. -- makes it feel even more precarious. Whether there's any rational basis to this, and whether either protocol has a technical advantage over the other, I don't know.
Has the IETF any policy about not having competing standards?
I mean, if XMPP is an IETF standard and IRC already existed, surely they already have two. See also IMAP, JMAP and POP; several routing protocols; TFTP, FTP and SFTP.
Frankly this is a moot point until there is more than one complete implementation of the Matrix "protocol", including particularly the critical "homeserver" part.
Until then it really isn't standardized; the protocol is, effectively, "do whatever the one and only implementation does".
No strong opinion. What's more important though is adoption of some interoperable standard. So far we have a huge number of AOL/CompuServe-era walled-garden IMs that still refuse to interoperate. Which is ridiculous.
> "Anything in this world that has survived for a long-time, had to be fit to withstand selective pressures. So look at what existed for long time, that's probably has properties to adapt well."
This seems confused and misguided.
All it means is that it survived the selective pressures to which it was historically exposed. It doesn't mean they were the best of all possible species, just the best (given the environment) compared to the others at the time, solely at the task of making more copies of themselves.
> It doesn't mean they were the best of all possible species, just the best (given the environment) compared to the others at the time, solely at the task of making more copies of themselves.
There's a very long list of failed 'perfect' standards. "Working in the current environment and recruiting new users" looks like a very reasonable bar for a standard to clear.
I'm not quite understanding the "instead of Matrix" part… Why not both? Is the IETF unable to support more than a single standard for a particular purpose/task?
Summaries, and my counter-arguments, as someone who just replaced XMPP with Matrix and is attempting contributions to Matrix code:
> [1] EVOLUTION AND NATURAL SELECTION: https://en.wikipedia.org/wiki/Lindy_effect and a biology analogy taken a bit too far. Extensibility and modularity are important. Extensibility allows adding features, and modularity allows removing features.
The last part sounds reasonable. The first part misses that nature is constantly creating new variations that try to become dominant. The Lindy effect suggests that older things stick around. But in nature, they don't stick around for the sake of age. They stick around because nothing better has out-competed them yet. This analogy seems moot to me.
> [2] IGNORANCE: I think this kind of trend "Protocol ABC doesn't have this XYZ feature, so let me start a protocol from scratch" should be discouraged.
Yes, but this is a very shallow argument. The rest of the world has advanced in terms of protocol infrastructure. We're on HTTP/2, and playing with HTTP/3 [1]. IPv6 might actually happen in my lifetime. Meanwhile, XMPP is one-long-XML-document-over-TCP and requires custom tooling for everything. IM protocols don't exist in a vacuum, and should be re-evaluated based on global knowledge, not just IM protocol history.
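To make the "custom tooling" point concrete, here is a rough sketch using Python's stdlib pull parser (with simplified stand-in tags, not real XMPP stanzas): because the stream's root element never closes while the connection is alive, you can't hand the data to an ordinary whole-document parser; you have to pull complete stanzas out of an endless document as chunks arrive.

```python
from xml.etree.ElementTree import XMLPullParser

parser = XMLPullParser(events=("start", "end"))
depth = 0

# Simulated network chunks: the stream header arrives first and its root
# element never closes; stanzas then trickle in across TCP segment boundaries.
chunks = [
    "<stream>",
    "<message to='a@example.org'><body>hi",
    "</body></message><presence/>",
]

for chunk in chunks:
    parser.feed(chunk)
    for event, elem in parser.read_events():
        if event == "start":
            depth += 1
        else:  # "end"
            depth -= 1
            if depth == 1:  # a direct child of the never-closing root = one stanza
                print("complete stanza:", elem.tag)
```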
> [3] FLEXIBLE DEPLOYMENT: IM platforms should be able to be deployed as minimal as possible or as featureful as possible. Certain features should be able to be optionally enabled or disabled, based on the needs of the deployer. If an activist collective doesn't want to store any messages on server for privacy purposes, it can be done by dropping the XEP responsible for archiving. Matrix cannot do this.
However, Matrix is essentially an append-only event store, so in this particular example, yes, you wouldn't be able to disable storing. It's fundamental to Matrix. You can of course prune historical events, but that's not the intent. I feel the bigger question "should an IM protocol be based on distributed state replication?" deserves a better argument than "it's not implemented as an extension." It provides other benefits, like read markers and complete ordering of e.g. ACL changes in rooms.
---
This seems to be many words, but not much insight into the concrete merits of either XMPP or Matrix.
[1] I'll leave chat protocols on blockchains out of this.
XMPP is extensible huh? Extend it to not torture implementation developers with xml.
You can't.
That's the law of software. Good components tend to be easily replaceable, bad components do not. Thus over time, as software evolves, it ends up made of mostly or only bad components.
I admittedly don't know much about XMPP, but I've heard that it relies heavily on XML, which nowadays can be an off-putting serialization format. Is there an XEP for using JSON (with a schema), or even more dev-friendly formats like protobuf?
XML has a few drawbacks (serialisation), but I'm not sure if it being off-putting is one; you could always simply pass it through a library such as xmljson [0], if you prefer to work with what you're familiar with.
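For what it's worth, a minimal sketch of that approach, using the third-party xmljson package and a made-up stanza (BadgerFish is just one of several conventions it offers):

```python
import json
from xml.etree.ElementTree import fromstring

from xmljson import badgerfish as bf   # pip install xmljson

stanza = fromstring(
    "<message to='romeo@example.net' type='chat'>"
    "<body>Wherefore art thou?</body>"
    "</message>"
)

# BadgerFish convention: attributes become "@..." keys, element text becomes "$".
print(json.dumps(bf.data(stanza), indent=2))
# {"message": {"@to": "romeo@example.net", "@type": "chat",
#              "body": {"$": "Wherefore art thou?"}}}
```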