XSLT is a failure wrapped in pain

grndn · on Dec 6, 2014

The problems that XML and XSLT address have not gone away. It saddens me when the XML-hating JSON community starts reinventing solutions that have all the same issues of bloat and complexity (see JSON schema, Collection+JSON, Siren, etc).

I would not be surprised if someone soon announces a "JSON Transformation" tool that can convert one JSON schema to another. Followed shortly by a standard for JSON namespaces so you can mix different schemas, a standard for binary JSON, a standard for JSON-encryption, and so on.

"Those who cannot remember the past are condemned to repeat it."

coldnebo · on Dec 6, 2014

Flashback to the early days of XML: "oh god, why would anyone use something as bloated and slow as CORBA to define an interface! Look how simple XML is, and it's human readable!!"

CORBA is laughing its head off: "Who's bloated and overly complex now, eh?"

XML: "Well at least I developed an appreciation of the problem domain... Unlike those arrogant JSON kids"

CORBA: "You were the same way at their age."

XML: "Sure was, gramps!"

(Both chuckle)

ultimape · on Dec 7, 2014

I've been actively staying away from CORBA myself.

Back in college, we tried our hand at using Ice http://www.zeroc.com/iceVsCorba.html and it seemed to do the trick.

Still, it does assume that both parties have access to the template generated by Ice, which brings back the issue of needing some sort of schema.

That being said, there do seem to be options for interacting using JSON: http://ice2rest.make-it-app.com/ and as of 3.3.1 it apparently supports Google's Protocol Buffers for flexibility.

crdoconnor · on Dec 6, 2014

>The problems that XML and XSLT address have not gone away. It saddens me when the XML-hating JSON community starts reinventing solutions that have all the same issues of bloat and complexity (see JSON schema, Collection+JSON, Siren, etc).

JSON schema:

Validating schemas is not nearly as important for JSON as it is for XML. Thanks to JSON's relative simplicity, the problem of "almost-but-not-quite" valid encoding mostly goes away.

As a result, not many people use JSON schema.

Even when I worked with XML, I very rarely came across code that performed actual XML validation. Most people would just wing it and hope nothing broke. That's the first dirty secret of XML validation.

The second dirty secret is that if you are consuming an API that provides invalid XML (a common occurrence), you just deal with it and try to make it work. XML validity be damned.

Collection+JSON:

Literally never heard of this. Don't need it either. Once the JSON is deserialized you can use the language's own tools (and a combination of lists/arrays + maps). So what's the point? I don't miss XPath or XQuery.

Siren:

A solution attempting to solve a non-problem, copied from a solution from the XML world that didn't solve a problem either.

Arwill · on Dec 6, 2014

It's a generational thing. Programmers who grew up on JavaScript will prefer JSON. It's not a unique phenomenon. My guess is that NoSQL was born out of the fact that many young programmers simply don't know how to write SQL; if they don't know or want to use SQL and rely on ORM mapping alone, then they may as well do away with SQL databases too.

marcuskaz · on Dec 6, 2014

> It's a generational thing. Programmers who grew up on JavaScript will prefer JSON

Not really, JSON is a simpler format with better parsing built-in for most languages. It is easier to use for programmers and performs better across the network.

This is as true today as it was seven years ago when I wrote this article: https://mkaz.com/2007/03/02/goodbye-xml-hello-json/

_ondq · on Dec 6, 2014

And YAML is a better JSON with comments and no need to balance braces.

matwood · on Dec 6, 2014

> if they don't know/want to use SQL and rely on ORM mapping alone, then they can as well do away with SQL databases too.

Until that whole leaky abstraction problem kicks in.

tree_of_item · on Dec 6, 2014

The idea behind doing away with the SQL database is to remove the leaky abstraction of an ORM, by making the object model the true shape of the data and not a wrapper over SQL.

krig · on Dec 6, 2014

In my experience, the people forcing schemas and bloat on JSON are the same people who forced schemas and bloat on XML.

crdoconnor · on Dec 6, 2014

Fortunately they're usually ignored.

Mikhail_Edoshin · on Dec 6, 2014

So the problems are real, it's just that XML and XSLT (and now JSON schema et al.) solve them badly (to your taste)?

carsongross · on Dec 6, 2014

I took a KI(RF)SS stab at a JSON schema description language, attempting to avoid the insanity of the XML world but still provide a bit of structure for JSON, a while back:

http://jschema.org/

Haven't touched it in a few years, but I think the core idea is sound: as much schema as can be packed into 15 additional productions beyond the original 15 in the JSON spec.

I wrote a typeloader for Gosu based on it:

http://vimeo.com/29348322

skrebbel · on Dec 6, 2014

I once braindumped a similar idea on https://github.com/eteeselink/relax-json (inspired by RELAX NG, one of the few great things to come out of the XML world).

A core difference between it and your JSchema is that it, itself, is not JSON - just like XML, I don't think JSON makes for a good format to write down schema definitions. In fact, I don't think JSON is very human friendly at all[0], which is OK for a data interchange format, occasionally read by humans but hardly ever written by hand.

I did not further develop RELAX JSON, however, when I realized that TypeScript interface definitions[1] are a great JSON schema language:

    enum Color { Yellow, Brown }; 

    interface Banana {
        color: Color;
        length: number;
    }

    interface FruitBasket {
        bananas: Array<Banana>;
        apples?: Array<{color: Color; radius: number}>;
    }

It's best to use interfaces and not classes, because interfaces allow for optional fields (with a `?` after the field name), which is pretty common in APIs and wherever JSON is used.

I will write a validate-JSON-with-TypeScript-type-definitions checker as soon as I find a need for it. Open to ideas here, guys! (or alternatives)
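For illustration, here is a minimal sketch of what such a checker might boil down to for the two interfaces above. TypeScript erases interfaces at compile time, so the runtime check has to be written (or generated) separately. The guard functions `isRecord`, `isColor`, `isBanana`, `isApple` and `isFruitBasket` are hypothetical names, not part of any library, and the sketch assumes the numeric enum values are what travels over the wire.

```typescript
enum Color { Yellow, Brown }

interface Banana {
    color: Color;
    length: number;
}

interface FruitBasket {
    bananas: Banana[];
    apples?: Array<{ color: Color; radius: number }>;
}

// Hypothetical helper: true for plain JSON objects (not null, not arrays).
function isRecord(v: unknown): v is Record<string, unknown> {
    return typeof v === "object" && v !== null && !Array.isArray(v);
}

// Assumes colors are sent as the numeric enum values 0 and 1.
function isColor(v: unknown): v is Color {
    return v === Color.Yellow || v === Color.Brown;
}

function isBanana(v: unknown): v is Banana {
    return isRecord(v) && isColor(v.color) && typeof v.length === "number";
}

function isApple(v: unknown): v is { color: Color; radius: number } {
    return isRecord(v) && isColor(v.color) && typeof v.radius === "number";
}

function isFruitBasket(v: unknown): v is FruitBasket {
    if (!isRecord(v)) return false;
    if (!Array.isArray(v.bananas) || !v.bananas.every(isBanana)) return false;
    // `apples` is optional: absent is fine, present must be well-formed.
    if (v.apples === undefined) return true;
    return Array.isArray(v.apples) && v.apples.every(isApple);
}

const good: unknown = JSON.parse('{"bananas": [{"color": 0, "length": 12}]}');
const bad: unknown = JSON.parse('{"bananas": [{"color": "green", "length": 12}]}');
const goodOk = isFruitBasket(good); // accepted
const badOk = isFruitBasket(bad);   // rejected: "green" is not a Color value
```

The obvious drawback is that the guards duplicate the interface by hand, which is exactly the part a real checker would want to derive from the type definitions.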

[0] Compare gruntfiles to webpack configs (tl;dr: they're JSON and JS-that-returns-an-object, respectively. the latter allows code, comments, unquoted keys, etc etc).

[1] http://www.typescriptlang.org/Handbook#interfaces

mtrn · on Dec 6, 2014

> I would not be surprised if someone soon announces a "JSON Transformation" tool that can convert one JSON schema to another.

I'm not exactly guilty of that, but I do have some use cases where one JSON document needs to be mapped to another one.

In a weak moment, I actually sketched a simple protocol for using JavaScript to express these "transformations": https://gist.github.com/miku/620aecc5ad782f261e3b

hyp0 · on Dec 6, 2014

There have been several JSON transformation tools, but they haven't taken off. (EDIT: in general, not just for transformation) When a task gets too complex for JSON, it's easier to switch to XML. Thus: XML protects JSON.

But I agree there is some pressure on JSON. And if someone can come up with a way to do schema and transformation that isn't complex, it will be adopted like crazy.

Counter-point: (1) all the cool kids use dynamic typing these days and don't need schemas (a schema is a form of typing). (2) Transformation is easy in FP (which the cool kids are also using) and doesn't need a separate transformation tool.

crdoconnor · on Dec 6, 2014

>There have been several JSON transformation tools, but they haven't taken off. When a task gets too complex for JSON, it's easier to switch to XML. Thus: XML protects JSON.

Uh, no. In general if you're sensible enough to use JSON as a data interchange format, you're probably sensible enough to use a real programming language to do transformation.
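As a rough illustration of that point, mapping one JSON shape to another in a general-purpose language is usually a plain function plus `map`. The `LegacyUser` and `User` shapes and the `toUser` function below are made up for the example.

```typescript
// Hypothetical source and target shapes.
interface LegacyUser {
    first_name: string;
    last_name: string;
    emails: string[];
}

interface User {
    fullName: string;
    primaryEmail: string | null;
}

// The whole "transformation" is an ordinary function.
function toUser(u: LegacyUser): User {
    const first = u.emails[0];
    return {
        fullName: u.first_name + " " + u.last_name,
        primaryEmail: first === undefined ? null : first,
    };
}

const input =
    '[{"first_name":"Ada","last_name":"Lovelace","emails":["ada@example.org"]},' +
    '{"first_name":"Alan","last_name":"Turing","emails":[]}]';

// The cast assumes the input really has the LegacyUser shape.
const users = (JSON.parse(input) as LegacyUser[]).map(toUser);
const output = JSON.stringify(users);
```

Edge cases (here, an empty `emails` array) are handled with normal control flow, which is the part that tends to get awkward in a dedicated transformation language.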

coldnebo · on Dec 6, 2014

There is a lot of confusion between typing systems, metatyping systems (that can implement arbitrary type systems) and the transport representation of such systems.

I agree with your counterpoints, but the cool kids are still having issues with transport representation of arbitrary types. Sure, Ruby (for example) can use Marshal.dump and Marshal.load to marshal arbitrary types, but what happens when the other end doesn't have the type loaded or available?

Ouch, maybe we should invite Java Enterprise Beans to the party to comment on the semantics of distributed types?

JSON is currently deceptively simple precisely because its wire representation (with simple types) is equivalent to its type definition which can be directly evaled in js to produce a memory instantiation. But none of that holds in the general case... Try for example marshalling a JSON object with a tree structure.

Maybe we end up going in Alan Kay's direction, where we stop trying to send representations across the wire and just send code... But that too was tried in the past. It has other limitations.

It's complicated.

crdoconnor · on Dec 6, 2014

>JSON is currently deceptively simple precisely because its wire representation (with simple types) is equivalent to its type definition which can be directly evaled in js to produce a memory instantiation.

It's not deceptively simple. It's just simple. The fact that it can be evaluated in JS is incidental to its use as a general purpose format for serializing data structures.

>Try for example marshalling a JSON object with a tree structure.

I've done this lots of times. I don't see any issue with it.

coldnebo · on Dec 6, 2014

The core problem is definitely the support of custom types. I agree, if you refuse custom types everything gets a lot simpler.

Here's a very simple example of marshaling comparing Marshal dump and load in Ruby using YAML, vs. custom JSON marshalers: http://www.skorks.com/2010/04/serializing-and-deserializing-...

Note that the post shows a tree structure in YAML because Marshal gives it to you for free (synonymous with Java serialize behavior). But the post punts on implementing the same tree in JSON, probably because it's messy and complicated.

Nothing about that looks simple to me. For example, the JSON class label "A" has to be interpreted by something to actually instantiate an A Object. YAML is a bit better-- it at least defines that there is a Ruby class instance being marshaled -- but it doesn't help you if that class doesn't exist the same on client and server.

Pretty soon this leads to madness in production systems where the server evolves "A" (v2) among a sea of separately evolving clients "A" (v1, v1.1, v1.2). Then versioned class annotations get introduced, followed by attempts to differentiate between serializable and non-serializable fields, deep and shallow serialization, etc. etc. Pretty soon, your JSON marshaling isn't so simple anymore.
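To make the concern concrete, here is a small sketch of what carrying a custom type over JSON tends to involve. The `Envelope` format, the class `A`, the `decoders` table and the `"A@1"` label scheme are all assumptions invented for this example, not an existing convention.

```typescript
// Hypothetical wire envelope: the type label and version travel with the data.
interface Envelope {
    type: string;
    version: number;
    data: unknown;
}

// Hypothetical custom type standing in for the "A" in the discussion.
class A {
    name: string;
    constructor(name: string) {
        this.name = name;
    }
}

type Decoder = (data: unknown) => unknown;

// Each side must ship a decoder for every label and version it accepts.
const decoders: { [label: string]: Decoder } = {
    "A@1": function (data: unknown): unknown {
        if (typeof data !== "object" || data === null) {
            throw new Error("A@1: expected an object");
        }
        const name = (data as { name?: unknown }).name;
        if (typeof name !== "string") {
            throw new Error("A@1: missing string field 'name'");
        }
        return new A(name);
    },
};

function marshal(a: A): string {
    const envelope: Envelope = { type: "A", version: 1, data: { name: a.name } };
    return JSON.stringify(envelope);
}

function unmarshal(text: string): unknown {
    const envelope = JSON.parse(text) as Envelope;
    const label = envelope.type + "@" + envelope.version;
    const decode = Object.prototype.hasOwnProperty.call(decoders, label)
        ? decoders[label]
        : undefined;
    if (decode === undefined) {
        // The receiving side has no idea what this label means.
        throw new Error("no decoder for " + label);
    }
    return decode(envelope.data);
}

const wire = marshal(new A("example"));
const back = unmarshal(wire);
```

A client that only knows `"A@1"` fails as soon as the server starts sending `"A@2"`, which is the versioning problem described above in miniature.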

crdoconnor · on Dec 6, 2014

>The core problem is definitely the support of custom types. I agree, if you refuse custom types everything gets a lot simpler.

Which makes perfect sense. If you cut everything down to bool, null, number, string, list and map you can represent anything and you get to remain language agnostic.

Dates can be encoded as strings and so can most of the other more 'awkward' types. This is additional work, but it's not that complicated and not a lot is gained anyhow by putting this stuff in the spec.
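A minimal sketch of the dates-as-strings approach, assuming both sides agree on ISO 8601 in UTC. The `Invoice` shape and the `encodeInvoice` / `decodeInvoice` helpers are hypothetical.

```typescript
// Hypothetical record with a date field.
interface Invoice {
    id: number;
    issuedAt: Date;
}

function encodeInvoice(inv: Invoice): string {
    // JSON.stringify calls Date.prototype.toJSON, which emits ISO 8601 in UTC.
    return JSON.stringify(inv);
}

function decodeInvoice(text: string): Invoice {
    const raw: unknown = JSON.parse(text);
    if (typeof raw !== "object" || raw === null) {
        throw new Error("expected a JSON object");
    }
    const rec = raw as { [key: string]: unknown };
    const id = rec["id"];
    const issuedAtText = rec["issuedAt"];
    if (typeof id !== "number" || typeof issuedAtText !== "string") {
        throw new Error("malformed invoice");
    }
    // The "additional work": the receiver must know this string is a date.
    const issuedAt = new Date(issuedAtText);
    if (isNaN(issuedAt.getTime())) {
        throw new Error("issuedAt is not a valid date");
    }
    return { id: id, issuedAt: issuedAt };
}

const original: Invoice = { id: 7, issuedAt: new Date(Date.UTC(2014, 11, 6, 12, 0, 0)) };
const wire = encodeInvoice(original); // {"id":7,"issuedAt":"2014-12-06T12:00:00.000Z"}
const decoded = decodeInvoice(wire);
```

Nothing in the JSON itself says that `issuedAt` is a date; that knowledge lives in the decoder on each side.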

You really can't get more complicated than this anyhow, without introducing nasty security vulnerabilities.

coldnebo · on Dec 6, 2014

> you can represent anything

If you represent something, you need to interpret it later... i.e. both client and server need the same interpretation in order to avoid errors.

> Dates can be encoded as strings and so can most of the other more 'awkward' types.

You say 'encoding', I say 'serialization'.

> It's not that complicated

It isn't as long as you use the same platform for encoding and decoding.

Maybe our experiences are different. I remember one time I had to decrypt a string in Ruby that had been encrypted in Java. I thought, this will be simple, it's a standard AES encryption, I'll just stuff the string into the corresponding Ruby object and decrypt! I mean, both of these objects were implemented according to the same industry standard, right? Boy was that a learning experience. :) Framing, padding and seeding were not implemented the same way -- they were left as platform implementation details that only someone trying to integrate across systems would ever notice.

felixgallo · on Dec 6, 2014

As an example of the deceptive simplicity, could you please describe how many bytes are required to deserialize numbers represented in JSON? Note: answer is not 2, 4, or 8.

crdoconnor · on Dec 6, 2014

The JSON spec has no opinion on that and that's a good thing. It is up to you, your language and your parser to decide how large the largest JSON number should be.

Would it be simpler to enforce a global 32 bit or 64 bit limit on those numbers in the spec? I don't think so. Why should the limitations of embedded microcontrollers exchanging data apply to those of astronomical systems and vice versa?
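One concrete consequence of leaving number size to the parser: a parser that maps every JSON number to an IEEE 754 double, as JavaScript's `JSON.parse` does, silently rounds large integers. The sketch below shows this; `losesPrecision` is a hypothetical and deliberately naive helper that only makes sense for plain integer digit strings.

```typescript
// 2^53 + 1 is a valid JSON number, but it has no exact double representation,
// so a double-based parser rounds it to the nearest representable value.
const wireText = '{"id": 9007199254740993}';
const parsed = JSON.parse(wireText) as { id: number };
const roundTripped = JSON.stringify(parsed); // {"id":9007199254740992}

// Naive check (assumption: input is a canonical integer like "42", not "1.0"
// or "1e3"): compare the digits sent with the digits the parser gives back.
function losesPrecision(integerText: string): boolean {
    return String(JSON.parse(integerText)) !== integerText;
}

const bigIsLossy = losesPrecision("9007199254740993");   // true
const smallIsLossy = losesPrecision("9007199254740991"); // false, 2^53 - 1 is exact
```

A parser in a language with arbitrary-precision integers would read the same text without loss, so the two ends of a connection can disagree about the value without either one reporting an error.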

coldnebo · on Dec 6, 2014

I should have been more specific... it's not really about trees per se, it's more about custom types. (I was thinking of trees of custom types).

ultimape · on Dec 6, 2014

I'm a fan of the techniques described in: http://smile.amazon.com/Expert-Business-Objects-Books-Profes...

Basically, it uses the underlying runtime to serialize the entire object to the other machine. The theory being, if object oriented programming ties data to functions, why not ship both over the wire?

It works really well with the micro-services paradigm.

Not exactly a 100% solution, but it does solve the issues at least among the languages the runtime supports.

skywhopper · on Dec 6, 2014

Ultimately, I think the problem is that everyone thinks we need to be able to do all the same things with structured, hierarchical data that we can do with normalized, relational data. And it's like going from 2-D to 3-D. Things don't get just twice as complicated, they get exponentially more complicated as the problem-space and schema complexity grows.

Thus any toolset that tries to address relationships, schemas, searching, and grouping in an XML/JSON type data format is going to be exponentially more complicated than RDBMSes and SQL.

boronine · on Dec 6, 2014

I'd just like to plug Teleport [0] into the list of reinvented JSON-based solutions. It's in active development :)

[0] http://teleport-json.org

crdoconnor · on Dec 6, 2014

That actually looks pretty nice.

lowmagnet · on Dec 6, 2014

The only use of XSL in my past was to take XML that's already in the right structure and simply iterate via apply-templates. XSL excels at this sort of simple template work.

So does mustache, with a bit less resistance/overhead.

So does Polymer if you make your XML tags into web components and apply them. It also has all of the benefits of Polymer's isolation.

I think XSL was and is a good idea. I just think that other things have come along that are easier to get into.

benihana · on Dec 6, 2014

>XML-hating JSON community

What are you even talking about? It seems like you think that XML and JSON are two mutually exclusive technologies and that if one is bad, the other is good. Or that if one is used, the other can't be? I don't understand why you even started talking about JSON; the article was about why XML and XSLT suck, not why JSON is superior to them.

I'm not really sure what point you're making. JSON is meant to make passing data in JavaScript easier. It's a necessary tool for interactive, JavaScript-heavy web frontends. I'm not sure why there's so much condescension in your post for JSON and its tooling. I'm just a bit confused; it comes off like you have a personal stake in XML and you find any mention of JSON, even obliquely, as offensive.

There's no reason that XML and JSON can't both be tools we use when the situation calls for it. This kind of dogmatic defense or condemnation of technologies offhand doesn't really do the HN community or the programmers at large any good.

general_failure · on Dec 6, 2014

> JSON is meant to make passing data in JavaScript easier.

No, it's not.

nly · on Dec 6, 2014

What's with linking to an old "XML sucks" page with no content, insight or original thought? Is it bash <whatever-was-on-the-homepage-a-few-hours-ago> time?

On XSLT: find something that fills its role completely, with the same level of tooling, and then have a rant about inferior tools being popular. Until then it doesn't really matter if it sucks.

hobs · on Dec 6, 2014

If you refresh HN regularly, you will see that really often. People are just gaming the community for a few upvotes or views as far as I can tell.

I see this timeline all the time:

1. User posts article about xyz technology

2. User posts article negating post 1

3. Most boring shitstorm ever occurs

4. karma karma karma

ramped · on Dec 6, 2014

I think you just described every discussion medium on the Internet.

As a reddit refugee, I was hoping for a little more on HN.

VMG · on Dec 6, 2014

Since it is the hot topic of the day, let me add this perspective: http://www.snoyman.com/blog/2012/04/xslt-rant.html

Quote

> I'm not even talking about the hideously verbose syntax, or the completely obtuse data model. The fact that you can't know what any single line of code does without reviewing every other line in the program makes this language an abomination.

craigching · on Dec 6, 2014

That link is on the page ;)

robert_tweed · on Dec 6, 2014

One quote sums it up:

“XML is simply lisp done wrong.” – Alan Cox

It's not that XSLT isn't useful in some situations. It is. It's not that clean, simple and efficient XSLT is impossible. It is, but it's hard.

The fact that it isn't Turing complete can be a good thing. It can also cause a lot of headaches.

The main problem is that XSLT as designed and as implemented is an over-engineered god-awful mess. XSLT 2 was a huge improvement, but nobody implemented it, or they maybe only implemented bits of it in nonstandard ways (MSXML), so none of the better parts were reliable.

The idea of XSLT was sound and XPath was pretty nice, but anyone who thinks XSLT is "good" probably has never worked on a large XSLT-based project (one where XSLT files routinely include other XSLT files and XML documents routinely link to other XML documents via XLink).

People say complexity gets out of control with OOP. Those issues pale into insignificance compared with rampant pattern matching split over many files when you have dozens of different schemas and are dealing with massive document graphs (with the occasional cyclic edge for good measure).

Good luck trying to reliably predict results in advance, or add any sort of control-flow logic to deal with edge cases without resorting to hard-coding and unrolling recursion.

sergiosgc · on Dec 6, 2014

XSLT is Turing complete: http://www.unidex.com/turing/utm.htm

It is a functional language, probably one of the reasons people don't like it.

robert_tweed · on Dec 6, 2014

To quote Haskeller Michael Snoyman (linked elsethread):

  Please don't insist on calling your bastard child of a language "functional."

It's a declarative language. It's a functional language if you squint at it, sort of. Maybe. It certainly doesn't fit with the most broadly acceptable (stricter) definitions of what a functional language is.

The biggest problem, with XSLT 1 at least (and that's primarily what I'm talking about because that's what exists in the wild), is that the output of a template function is a string, whereas the input is a node tree. So to get generalised recursion or perform function composition over the input data, you need to do some evil tricks.

To give the impression of Turing completeness, you need to first parse the input stream by splitting it up into strings and then call functions that perform string processing, which means you can no longer use XPath, or any of the things that really make XSLT. At that point you're not really writing XSLT anymore, you're doing simple recursive string processing using XSLT as a horrendously overweight and improperly-equipped wrapper around your native string libraries.

Incidentally, I don't accept the linked article as a proof of Turing completeness, since it only implements some trivial programs using Turing machines. That's not the same thing as proving equivalence with a universal Turing machine. However, others seem to have done it, presumably using tricks such as string processing, or ignoring the input altogether.

You could for instance, recreate Conway's Game of Life using pure XSLT, but it wouldn't contain any actual XML transformation code and all the state would exist in the running XSLT program, not the XML document. It would also blow the stack as most XSLT engines don't support tail recursion and there's no other way to loop indefinitely.

What you certainly can't do is have one function that outputs a set of XML nodes and another function that has a for-each that consumes that set, or have the ability to run apply-templates over it. That is nothing like functional programming.

bambax · on Dec 6, 2014

Rather than singing the praise of XSLT (which I love) I'll show an example: this tool transforms any rich text to markdown:

http://markitdown.medusis.com

It's about 30 lines of XSLT that run in the browser. [Edit: it's not 30 lines but 230, but I was thinking of the number of "rules" (templates) of which there are only 29.]

There are very few other tools of its kind, and I don't think any client-side one exists with the same simplicity. This attempt, for example,

https://github.com/domchristie/to-markdown/blob/master/src/t...

is about 180 lines of JS, is incomplete, doesn't work with many special cases, etc.

There is no better templating language than XSLT; every other templating approach (in PHP or Python on the server, in JavaScript on the client) feels like a horrible kludge once you've experienced XSLT.

Yes, XSLT is practically dead, that's a fact. But we should be very sad about it, instead of dancing on the coffin like the OP with its stupid quotes.

crdoconnor · on Dec 6, 2014

http://markitdown.medusis.com/xsl/html2mk.xsl

It looks like 234 lines of code to me.

>This attempt for example https://github.com/domchristie/to-markdown/blob/master/src/t....

Is badly written, but still written in a better language. They're using regexps to parse HTML (omfg!), but that kind of nastiness doesn't excuse XSLT as a language.

>There is no better templating language than XSLT

Except mako, jinja2, django templating language, liquid, etc.

bambax · on Dec 6, 2014

You're right, it's 230 lines; I updated my post accordingly. I wrote the thing 3 years ago and remembered the number of rules (templates) instead of the number of actual lines. But a line's a line, so I was wrong.

"written in a better language" doesn't mean much, however. A better language for what? I'm not picking on JavaScript, which I love and use every day; but templating in JS versus XSLT is crazy.

The templating languages that you mention are, in my opinion, extremely complex and very unpalatable; and they only work server-side.

But that's all a matter of taste, I guess. What I don't understand is why so many people go out of their way to declare their hate of XSLT (and all things XML), especially now that XSLT is all but dead...?

crdoconnor · on Dec 6, 2014

>"written in a better language" doesn't mean much, however. A better language for what?

For general purpose programming, which is what transformations eventually end up requiring. XSLT isn't Java or JavaScript. It theoretically can do everything that those two can do, but we both know in practice that once you do anything sufficiently complex in XSLT it will become horrendous. Even the biggest proponents of XSLT won't argue that you should put business logic in there (I hope, anyway).

>I'm not picking on JavaScript, which I love and use every day; but templating in JS versus XSLT is crazy.

Because?

>The templating languages that you mention are, in my opinion, extremely complex and very unpalatable

What??!? I can get web designers with no coding experience to edit them! Try doing that with XSLT. They're that simple! Not only are they conceptually simpler, they are mathematically provably simpler due to their not being Turing-complete.

>and they only work server-side.

I'm not a client-side developer, so I don't know what the state of the art is in client-side templating languages, but there are a few that look ok (dust, jsrender, handlebars...).

>But that's all a matter of taste, I guess. What I don't understand is why so many people go out of their way to declare their hate of XSLT

Because they have spent time debugging it and know that every second they spent doing that was both painful and unnecessary.

If you don't understand the pain it might mean that you don't even realize that the same pain you felt debugging it wasn't actually necessary.

My sincere hope is that all the people who create business critical XSLT abortions will stop doing it so I won't ever be called in again to fix what they did. That's why I'm passionate in my XSLT (and XML) hate.

bambax · on Dec 6, 2014

> the same pain you felt debugging it

XSLT is "right the first time"; I have had to do corrections and evolutions but very rarely (never?) have I had to hunt for a weird unexplained behavior.

Of course I have seen horrible XSLTs, but horrible is in no way limited to XSLT (the only thing that's XSLT-specific is when people try to do imperative programming in XSLT).

I'm currently trying to make sense of a database model where every. single. property. is a flag in just one table (isClient, isProspect, isActive, isAForeigner, isMale, isFemale, etc. etc.)

No XSLT involved whatsoever. Big pain.

woah · on Dec 6, 2014

Looks like 230 lines to me https://github.com/bambax/markitdown.medusis.com/blob/master...

patrickg · on Dec 6, 2014

+1 - I have to deal with structured documents encoded in XML on a daily basis, and most of the time there is no other tool for me than to write an XSLT stylesheet.

I really like the computational model of XSLT (push vs. pull), it is so elegant. But it takes quite some time to fully understand what is going on.

What I think is bad is that the infrastructure for XSLT is not perfect. There is only one good XSLT 2 processor that I know of, everything else is XSLT 1.

I am currently eliminating some XSLT scripts with custom (Go) programs, because of speed issues.

geonik · on Dec 6, 2014

Those who rant about XML and XSLT know nothing about what they talk about. 50% of our server's behavior (200,000 Java LOCs) is orchestrated declaratively by a small number of XML files that use around 30 custom namespaces. These are

1. parsed on server startup for setting up persistence, business rules, REST endpoints etc

2. transformed by XSLT to a) produce nice HTML documentation, including DOT class diagrams b) generate Java source code c) validate declaration integrity and cross-referencing

With the right XSDs, IDE support is excellent (auto complete for everything). Take the time to learn it, apply it according to your needs, and reap the benefits: in the long run, maintenance work is down by an order of magnitude.

moonshinefe · on Dec 6, 2014

I'm afraid your argument of "It worked for us" and "it only uses a small number of files with 30 custom namespaces" pretty much is the most unconvincing argument I've ever heard.

In fact, the latter point is one of the reasons most people like to avoid XML.

geonik · on Dec 6, 2014

"the latter point is one of the reasons most people like to avoid XML"

And these are exactly those who will never "get" that XML is much more than a data container for tree-like structures. They should stick to JSON or CSV for that matter.

ilitirit · on Dec 6, 2014

Yeah XML namespaces are a pain to work with, but when it comes to stuff like documentation it's hard to beat XML.

Mixed content in JSON is definitely not as simple as it is in XML.

sly010 · on Dec 6, 2014

I will risk that this will be an unpopular opinion, but if you are having problems with XML, you are using it to solve the wrong problem.

I understand writing XSLT and XML Schema can be difficult and I see how typing out XML namespaces can be a pain, but every sentence about XML in that article is a joke. Those quotes are all intended to be funny, not objective. No one actually brought any objective facts against XML. Because they can't. The fact is it is widely used in many places. Can anyone tell me an alternative for serialising an object tree where you also need to preserve ordering and type information, store text longer than one line, or store any kind of formatting information? (And yes, you can use JSON to do that, but the resulting document will be 5x longer.)

(meta: Funny quotes bashing useful technologies is the cat video equivalent of HN. Last week's article beating OOP was the same pattern.)

crdoconnor · on Dec 6, 2014

Objective facts:

* XML is complicated enough that its parsers are commonly full of obscure bugs. JSON/YAML doesn't have this problem.

* XML is complicated enough that its parsers can have security vulnerabilities (e.g. see billion laughs for just one). JSON/YAML doesn't have this problem.

* XML is complicated enough that you can create an almost-but-not-quite valid encoding. The (already complicated enough) parsers have to deal with this and the ones that don't are considered broken. JSON/YAML doesn't have this problem.

* XML's complexity does not give you any additional benefit over YAML or JSON. Serializing/deserializing dates as strings is not a problem. It never was.

XSLT is just the shitty icing on the already crappy cake. A committee created a disastrous Turing-complete programming language to munge this already overcomplicated data format.

jffry · on Dec 6, 2014

YAML allows deserialization into arbitrary native types, which most definitely is [1] an issue (see: the flood of Rails/YAML vulns a while back)

[1] http://blogs.teamb.com/craigstuntz/2013/02/04/38738/

crdoconnor · on Dec 6, 2014

That is an issue, but it's more of an education/naming issue since it is, after all, intentional.

I think it's really dumb that most YAML libraries have a load() and a safe_load(). If they had a load() and a dangerous_load() then the problem basically wouldn't exist.

berns · on Dec 6, 2014

XML is very simple. Maybe YAML is simpler but both are very simple. XML is tedious to write, verbose and repetitive, but not complicated nor complex. XSLT is also conceptually very simple. XML, XML schema tools and XSLT make a very powerful combination that has proven to be useful in a myriad of real world problems.

TazeTSchnitzel · on Dec 6, 2014

If only XML were actually as simple as you claim. If only there weren't a myriad of namespace, encoding, character literal substitution and other complexities.

personZ · on Dec 6, 2014

If we eliminated everything where implementations have had obscure bugs or security vulnerabilities, there would literally be nothing left.

> XML's complexity does not give you any additional benefit over YAML or JSON.

This is so incredibly wrong, on every level, that it beggars belief and reads like something you would come across on a "beginning programmers" forum. As others have said, JSON/YAML thus far have seen limited usage (no, that configuration file on your app is not a complex example). But as it grows people are starting to ask questions like "Gosh, wouldn't it be nice if my perimeter or the source system via a metadata file could validate the JSON passed to us?" "Wouldn't it be nice to be able to convert from one JSON form to another?"

And the exact same complexity is arising...poorly, and with the same hiccups that the XML system went through.

I mean some of the comments are incredible. Like "JSON is simple enough that errors aren't big" -> Hey, sorry that those bank transfer got lost, but it turns out that we mistyped the account number field name and the destination system just ate it. Json.

¯\(°_o)/¯

Sorry that the dates are completely wrong, but all of those years of discovery about time zones and regional settings...just make it some sort of string and they'll figure it out.

¯\(°_o)/¯

scrollaway · on Dec 6, 2014

> Hey, sorry that those bank transfer got lost, but it turns out that we mistyped the account number field name and the destination system just ate it.

The JSON approach does not give you everything-and-the-kitchen-sink. A lot of people consider that a feature.

If you want to do schema validation on top of json messages, you're free to do it when you receive them - the data format does not prevent you from that, it merely does not advocate and standardize one-way-of-doing-it.

The fact the various existing json schema solutions have not found a leader amongst themselves speaks loudly to the fact that it's a useless feature for most people, and the format is better off without it. Whatever the RFC would come up with, people would find fault in it... so if most users don't care, why force one solution over any other?

GP is foolish to think XML does not have benefits over JSON, but you're a lot more foolish to think those benefits (the ones you advertise, anyway) should be part of the language. You say "As JSON grows...", but that's exactly the thing: it doesn't grow. It's a simple data format and needs no new feature. Would trailing commas and comments be nice? They sure would. But we can live without them in the format itself... let alone schema validation which can be done externally.

crdoconnor · on Dec 6, 2014

>GP is foolish to think XML does not have benefits over JSON

I am? What benefits would those be?

personZ · on Dec 6, 2014

It's a simple data format and needs no new feature.

XSLT was developed entirely independently of XML. XML Schemas were developed entirely independently of XML.

XML itself is absurd simple. It is the epitome of simple. But you build an ecosystem of tools and standards around it. And that is of course already happening in JSON -- JSON Schemas, for instance, are now a thing.

scrollaway · on Dec 7, 2014

> XML itself is absurd simple. It is the epitome of simple.

I can't possibly argue with you if you actually believe that. XML is not simple. XML has CDATA, DOCTYPEs, comments, attributes, significant whitespace and so much more which JSON does not have.
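
A few of the features listed above can be made concrete with a small Python sketch using the standard library's `xml.etree.ElementTree`. The document and its element names are made up for illustration; the point is only that a consumer has to know how attributes, comments, CDATA and whitespace each surface in the parsed tree.

```python
import xml.etree.ElementTree as ET

# Made-up document exercising an attribute, a comment, CDATA and whitespace.
DOC = """<note lang="en"><!-- reviewed -->
  <body><![CDATA[1 < 2 & 3]]></body>
</note>"""

root = ET.fromstring(DOC)
body = root.find("body")

lang = root.get("lang")   # attributes live beside the content, not inside it
text = body.text          # CDATA arrives as ordinary text
tail = body.tail          # the newline after </body> is kept as data
children = len(root)      # the default parser drops the comment
```

JSON has no counterpart to any of these four, which is roughly the distinction being drawn here.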

crdoconnor · on Dec 6, 2014

>If we eliminated everything where implementations have had obscure bugs or security vulnerabilities, there would literally be nothing left.

The point is that by eliminating this data format you get rid of those obscure bugs and security vulnerabilities and you lose nothing of value doing it.

>This is so incredibly wrong, on every level, that it belies belief and reads like something you would come across on a "beginning programmers" forum.

I wouldn't find this quite so pathetic if I didn't have to school you on XML parser vulnerabilities.

>As others have said, JSON/YAML thus far have seen limited usage

What are you smoking? JSON is everywhere these days. More commonly used in new web APIs than XML for sure.

>But as it grows people are starting to ask questions like "Gosh, wouldn't it be nice if my perimeter or the source system via a metadata file could validate the JSON passed to us". "Wouldn't it be nice to be able to convert from one JSON form to another."

The first I hear occasionally, but it honestly isn't ever a problem. You can put validation in the code that parses the JSON. Invalid date sent? Return an error when your javascript/python/java returns an error parsing it. Name too long? Ditto. You don't need additional outside validation if your programming language doesn't suck.
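
As a rough illustration of validating inside the parsing code, here is a minimal Python sketch. The message shape and the field names (`account_number`, `amount`, `value_date`) are assumptions invented for the example, not anything from a real banking API.

```python
import json
from datetime import date

# Hypothetical message shape: these field names are assumptions.
REQUIRED_FIELDS = {"account_number", "amount", "value_date"}


def parse_transfer(raw: str) -> dict:
    """Parse a JSON transfer message, rejecting anything unexpected."""
    message = json.loads(raw)  # raises json.JSONDecodeError on malformed JSON
    if not isinstance(message, dict):
        raise ValueError("expected a JSON object")

    missing = REQUIRED_FIELDS - set(message)
    unknown = set(message) - REQUIRED_FIELDS
    if missing or unknown:
        # A mistyped field name shows up as one missing and one unknown key.
        raise ValueError(f"missing={sorted(missing)} unknown={sorted(unknown)}")

    # date.fromisoformat raises ValueError for an impossible calendar date.
    message["value_date"] = date.fromisoformat(message["value_date"])
    return message
```

With this shape, a message that spells the key `acount_number` is rejected rather than silently accepted, and so is a date such as `2014-02-30`.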

The second question isn't one I have ever heard in 12 years of software development. Generally you want to do something useful with JSON input. That useful thing isn't normally "make more JSON that looks slightly different".

>And the exact same complexity is arising

Nope. Ain't no billion laughs vulns in any JSON parsers that I know of. No subtle parser bugs causing fucked up behavior down the line either.

>I mean some of the comments are incredible. Like "JSON is simple enough that errors aren't big" -> Hey, sorry that those bank transfer got lost, but it turns out that we mistyped the account number and the destination system just ate it. Json.

If you mistyped the account number on your banking system and it got caught by an XML validator your systems must be fucked.

That's the worst excuse for XML I've ever heard: that your systems are so terribly programmed that you must find user errors via validation of your data interchange format. Jesus.

>Sorry that the dates are completely wrong, but all of those years of discovery about time zones and regional settings...just make it some sort of string and they'll figure it out.

Essentially, yes. ISO 8601 and you're done. Where's the problem?
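
A small Python sketch of that idea, assuming both sides agree to send timestamps with an explicit UTC offset. The date and the +01:00 offset are chosen purely for illustration.

```python
from datetime import datetime, timedelta, timezone

# An aware timestamp: 09:30 at UTC+01:00 (offset chosen for illustration).
sent = datetime(2014, 12, 6, 9, 30, tzinfo=timezone(timedelta(hours=1)))

wire = sent.isoformat()                  # the string that goes into the JSON
received = datetime.fromisoformat(wire)  # what the other side reconstructs

# The offset travels with the string, so both sides agree on the instant.
as_utc = received.astimezone(timezone.utc)
```

The catch the parent comment alludes to still applies if a sender omits the offset: the string then parses as a naive local time and the receiver has to guess the zone.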

personZ · on Dec 6, 2014

It's a glorious time in software development when people who make and use trivial web apps think that their domain dominates, and that their superficial knowledge reigns supreme.

gerbilly · on Dec 6, 2014

Exactly this.

It's one thing to knowingly keep using simpler data formats or approaches (callback based concurrency model) to build systems that are small, _and will remain small_.

That's defensible.

But what I see is a bunch of new programmers not bothering to learn established systems, systems that have tackled a much larger problem domain, and deriding them as legacy garbage.

XSLT has its cruft, but let's see the JSON/YAML fanbois tackle the same problem domain with their toy formats, then we can compare like with like.

sharpneli · on Dec 6, 2014

Or someone else has used it to solve the wrong problem.

Or your customer demands you to solve the wrong problem with XML.

A nice example: Simple configuration files which are best described as simply option=value or maybe json if someone wants to go really wild.

A customer comes and wants configuration files to be XML. Then your sales department agrees and now you have to implement XML files. The end result: Configuration files are no longer easily editable by humans. Yay!

Another example: Someone decided that using makefiles is too hard, so let's make the equivalent but with XML! I'm looking at you ant! Now they still have the same problems as makefiles but they are much harder to edit.

crdoconnor · on Dec 6, 2014

In my experience

* Configuration files are best made with YAML (it's the most human readable).

* APIs / other forms of serialization/deserialization over a network are best done with JSON (chop it in half and it will fail fast unlike yaml. still fairly readable tho).

* Programming languages (like ant) should not be written in either one ever (fortunately I've never heard of a YAML or JSON based language).

* XML does a bad to terrible job of all three.
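
The "chop it in half and it will fail fast" point about JSON can be sketched in Python as follows. The payload is made up, and the helper name `is_complete_json` is hypothetical.

```python
import json

payload = json.dumps({"user": "alice", "roles": ["admin", "ops"]})
truncated = payload[: len(payload) // 2]  # simulate a transfer cut off midway


def is_complete_json(text: str) -> bool:
    """Return True only if the text parses as a whole JSON document."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False
```

Because every JSON object and array needs its closing delimiter, the truncated half is rejected outright. A block-style YAML document cut at a line boundary can still parse as a shorter, valid document, which is presumably the contrast being drawn.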

patsplat · on Dec 6, 2014

Agreed. How much XML was used to introduce DSLs and dynamic typing into Java?

gioele · on Dec 6, 2014

How many languages can tell you that byte X of output has been generated from byte Y of input going through such and such steps?

When programming in XSLT it is great to fire up a debugger (let's say oXygen), run your transformation, click on the wrong output and be able to go step-by-step backwards.

How many languages designed before 1999 (yeah, XSLT is 15 years old) can claim to be able to do so?

blablabla123 · on Dec 6, 2014

> How many languages can tell you that byte X of output has been generated from byte Y of input going through such and such steps?

Having written transformations in Python that needed to carry that information... How do you do that in XSLT? (And do you think it's worth writing new code in XSLT?)

Allan_Smithee · on Dec 6, 2014

Mercury (1998).

dexen · on Dec 6, 2014

Also the venerable, tongue-in-cheek, "Not the comp.text.sgml FAQ" [1] by Joe English.

Sample:

  Q.  I'm designing my first DTD.  Should I use elements or attributes to store data?
  A.  Of course.  What else would you use?

[1] http://www.flightlab.com/~joe/sgml/faq-not.txt

krab · on Dec 6, 2014

The point with namespaces leads me to think that the main problem with XML is that on the surface it looks very simple but in fact it's not. It is tempting to take shortcuts to process the XML: for example parsing with regexes, looking at the namespace prefix and not at its definition, producing XML without proper escaping. There are also some gotchas like certain characters being non-representable in XML.
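
Two of those shortcuts can be shown with a short Python sketch using only the standard library. The namespace URI and element names are invented for the example.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Two documents that differ only in the prefix bound to the same namespace URI.
doc_a = '<a:item xmlns:a="http://example.com/ns">x</a:item>'
doc_b = '<b:item xmlns:b="http://example.com/ns">x</b:item>'

# A namespace-aware parser resolves the prefix to its definition, so the
# two elements have the same name even though the prefixes differ.
tag_a = ET.fromstring(doc_a).tag
tag_b = ET.fromstring(doc_b).tag

# Producing XML by string concatenation needs explicit escaping of the text.
unsafe_text = "Tom & Jerry <3"
fragment = "<title>" + escape(unsafe_text) + "</title>"
title = ET.fromstring(fragment).text
```

Code that matches on the literal string `a:item` would treat the two documents differently, and concatenating `unsafe_text` without `escape` would produce a document that does not parse at all.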

I personally like XML and XSLT (2.0) but to be able to work efficiently you need to spend some time learning, which is not obvious at first sight.

What about the alternatives?

JSON has a big advantage which is its unambiguous automapping to objects. This benefit is not that apparent in languages like Java where you'd still declare a class to represent either the XML or the JSON document. Moreover, there are projects which essentially try to bring schema and namespaces to JSON. JSON-LD is an example of a namespace without an explicit support in the underlying format. There is even a command-line tool, jq, a big part of which is an engine similar to XPath.

S-expressions if used widely would probably go the same path as JSON - recreating a lot of what is considered as bloat in XML.

Another mentioned alternative was a custom text format. I assume the author meant just to design a format from scratch. I wrote that to use XML efficiently, you need to put in some work. But making a backwards (and forwards?) compatible text format which correctly handles malformed and malicious input requires much more effort.

I don't know anything about ndb.

ilitirit · on Dec 6, 2014

XSLT does indeed have implementation issues, but it really is very powerful once you know how to use it properly. And I don't get how it's a "failure". It's one of the most widely used technologies in the publishing industry.

Our company once had an application that processed the end-of-year high school student results and then published them in various newspapers. The input files were text files generated from the Education Department's database from various regions. The process took around 10-15 minutes (lots of rules had to be run against the data). I replaced it with a Windows JScript script and XSLT. It took 15 seconds to transform the data.

That said, I still use XSLT regularly but I'd be lying if I said I enjoyed working with it. Using a decent IDE for development and debugging can help.

Someone did give me a nice tip for working with and learning XSLT though - "translate your transformation rules directly from simple English to the template rules".

eg. "I need to insert the node <member> under <group>"

  <xsl:template match="group">
    <group>
      <member/>
      <xsl:apply-templates/>
    </group>
  </xsl:template>

"But if the group node doesn't exist, then create it"

  <xsl:template match="root[not(group)]">
    <root>
      <group>
        <member/>
      </group>
      <xsl:apply-templates/>
    </root>
  </xsl:template>

radicalbyte · on Dec 6, 2014

The problem with XSLT, XML, SOAP, WS-* is that the community is driven by vendors with deep pockets who use their power to kill interoperability.

Imagine LISP, but in the hands of Sauron or Palpatine. That's the XML group of technology.

praptak · on Dec 6, 2014

> Imagine LISP, but in the hands of Sauron or Palpatine.

Common Lisp?

paulsutter · on Dec 6, 2014

JSON/Javascript is just easier to learn and work with than XML/XSLT. For starters, you don't need to learn a special language that you'll never use for anything else.

Just in case you /do/ need to use XML every day: the primary benefit of XSLT is that it lets you avoid using XML libraries to munge some XML. Because the XML libraries are so horrendous to use from any language.

qwerta · on Dec 6, 2014

I am sorry but Javascript is not much better than XSLT. Horrible legacy technology full of bugs nobody was bothered to fix.

Havvy · on Dec 6, 2014

JavaScript isn't full of bugs nobody has bothered to fix. It's full of specified behavior that cannot be fixed without breaking backwards compatibility. That said, the issues are well known and actually bypassable.

Whether you should be using it for the problem XSLT tries to solve is another question. Probably not, since there are other templating languages.

ilitirit · on Dec 6, 2014

> JSON/Javascript is just easier to learn and work with than XML/XSLT.

Unfortunately running Javascript on the back-end for this purpose is not something most companies do.

crdoconnor · on Dec 6, 2014

You're joking, right? There are JSON libraries for every back end language and ecosystem you could imagine.

rwallace · on Dec 6, 2014

The first part of the first quote is incorrect. The problem XML solved was bloody hard, though most people nowadays have forgotten what it was. It was to get the world's programmers to stop creating binary file formats, to put an end to the situation where every time you went onto a site, the first thing you had to do was reverse engineer binary files with a hex dump utility. XML, to its very great credit, solved that problem.

Now if you want to start asking whether it solves the problem as well as today's competing technologies like JSON, at that point I will step back and hold my peace.

bibonix · on Dec 6, 2014

XSLT is one of the most powerful and elegant technologies created in the last 15 years. Those who don't understand it, and therefore can't use it, should just do their homework and learn it better.

donum · on Dec 6, 2014

For two years I have been working full time on a website which gets rendered via XSLT (via Apache Cocoon). When I joined, it was kind of a mess and I had a hard time understanding XSLT and the templates written.

Just recently I had the chance to do a rewrite. What I did is I created "my own HTML". Basically every module of Twitter Bootstrap has its own XSLT template. That means you have a very easy XML "HTML" syntax, but the output is Twitter Bootstrap. And every piece of output HTML is defined just once in the whole application so it's easy to maintain.

With the help of XSLT you can abstract a lot of things. One example: I have an element called <colgroup/>. It can contain up to 12 <col/> elements. If I set the @size attribute on one of the columns, the @size attribute for the others will be calculated automatically and the output matches the Twitter Bootstrap CSS classes.

I have to say, I love it. I can't imagine writing the whole mess of Twitter Bootstrap plain HTML in a template anymore.

raverbashing · on Dec 6, 2014

My definition of powerful and elegant is lisp

XML is just death by overengineering

peteretep · on Dec 6, 2014

   > My definition of powerful and elegant is lisp

Dude, XML is just s-exprs and XSLT is macros.

serichsen · on Dec 6, 2014

> XML is just s-exprs

No. That myth has been decisively addressed by Erik Naggum about 12 years ago. His summary:

> They are not identical. The aspects you are willing to ignore are more important than the aspects you are willing to accept. Robbery is not just another way of making a living, rape is not just another way of satisfying basic human needs, torture is not just another way of interrogation. And XML is not just another way of writing S-exps. There are some things in life that you do not do if you want to be a moral being and feel proud of what you have accomplished.

Please read his posting/rant for the arguments. Dude. (I'll just tell you to search for "naggum xml", there are more than enough copies in circulation, and you'll find a few more postings by other people.)

Now, as for XSLT: The big problem is the hairy syntax. It is really (at least) two languages (the XML tags, and the query language that is used inside selectors). In effect, you are writing at least three languages completely intermixed in a single file: the output language (most often some XML or HTML variant), the XSLT tag language (another XML format), and the XSLT query language (an incredibly limited ad hoc micro-language inside some XML attributes).

XSLT is a very limited language, as opposed to Lisp macros, which can use the entire Lisp language.

And yes, I have used XSLT in my job, and I do have reason to think that the XSLT-stylesheets I wrote have an acceptable quality. However, I know that I could have done their job better if I could have used some structured data, an HTML formatter, and a real programming language.

DennisP · on Dec 6, 2014

Tell any lisp programmer he doesn't get first-class functions and has to use nothing but macros, and he'll run screaming in horror.

dasil003 · on Dec 6, 2014

XML and XSLT are analogous to s-exprs and macros; to say they are "just" those things is willful ignorance of a whole boatload of complexity.

raverbashing · on Dec 6, 2014

Good, so I'll replace the '<' in XML with '#:-D' and the '>' with '%:-/'

Or replace indenting levels with the XPath of where it belongs and make order optional

Syntax matters.

raverbashing · on Dec 6, 2014

REALLY? http://en.wikipedia.org/wiki/XSLT#Example_1_.28transforming_...

VMG · on Dec 6, 2014

I want to enter this piece of evidence into the trial:

Comma separated string parsing XSLT: http://stackoverflow.com/a/2850181/92493

arethuza · on Dec 6, 2014

If you are using XSLT to work on anything other than XML you are doing very strange things.

raverbashing · on Dec 6, 2014

Oh good, so it's only good in the world of XML

That's what I call USELESS

patrickg · on Dec 6, 2014

Apples and oranges? There is no other way of encoding structured documents these days than XML. Like it or not, XML is the de facto standard for data exchange (export from databases, product information software,....)

XML might be overengineered (which, except for a few things, I don't agree with), but there is currently no alternative for it.

dragonwriter · on Dec 6, 2014

> XML might be overengineered (which, except for a few things, I don't agree with), but there is currently no alternative for it.

There's perhaps no general alternative to it that covers all of the things XML tries to do; there are lots of specific alternatives that cover specific things that XML tries to do. The complaint against XML isn't that there is a better general replacement so much that there is a better replacement for each (or at least, very many) of the applications and that trying to shoehorn all of them into a single solution has costs that outweigh the benefits.

boomlinde · on Dec 6, 2014

The site in question lists several alternatives, some of which are widely in use especially in cases where XML falls short: brevity, fast to parse, easily readable... Most of all, calling it the de facto standard is either dishonest or clueless.

patrickg · on Dec 6, 2014

I have been working for many years in the publishing business (dealing with structured documents, product data etc.). I can tell you that all of my customers are more open to XML than to any other document format (there exist none as suitable for the job).

You can call me clueless or dishonest, I don't care. I can only share my experience with the topic. You don't have to believe me.

boomlinde · on Dec 6, 2014

Your customers in the field you work in, perhaps, but it's hard to tell that's what you mean when you write data exchange. It's a wide field that is not limited to the publishing business or your customers.

ilitirit · on Dec 6, 2014

> Most of all, calling it the de facto standard is either dishonest or clueless

It is in fact the de facto standard in the publishing industry. The "other" format is of course PDF.

Maybe all of this will change when we have more technologies that support formats that can handle mixed content as easily as XML.

Mikhail_Edoshin · on Dec 6, 2014

Does anyone use Microsoft XPS? It looked rather interesting; like PDF without any interactivity (except for links) and with special support for publishing (color management, job tickets, etc.). And internally it is a collection of XML docs and binary data zipped together into a single file; pretty neat, must be easier to use in automated workflows.

patrickg · on Dec 6, 2014

I can't answer the question but anyway:

None of my customers have ever asked for that. I have not seen a printing house that demands XPS. So I doubt that it plays any role in the market (Germany here).

Mikhail_Edoshin · on Dec 6, 2014

Thanks, that's informative. I haven't seen it used much either; the only real use I saw (aside from viewing) was an XPS printer driver for a Canon inkjet printer. (I've also heard that it produced better results than the standard driver, but have no first-hand experience.)

justincormack · on Dec 6, 2014

databases are not structured documents, you can export them eg in JSON perfectly well. Or for that matter in SQL as is the usual practise. Marked up structured text is a different matter, XML still has a use case there.

patrickg · on Dec 6, 2014

I was a bit unclear. When I get database dumps, I get them as Excel files or as an XML document.

Most of the times the documents I get are hierarchically structured.

Yes, JSON could be fine as well. But it simply lacks a standard toolchain, which XSLT and its ilk provide.

I am not trying to defend XML in any way. I just want to say the two things:

a) My customers never deal with JSON, but often with XML, so JSON (and other formats) are not an issue for me.

b) There is a very nice toolchain for XML, including formatters, transformation tools, database publishing tools (my very own: https://speedata.github.io/publisher/index.html) and many others. I have not found such a toolchain for other formats.

justincormack · on Dec 6, 2014

But XML does not map well to a database dump. A database is a set of sets of tuples. It is not generically hierarchical. It maps to a set of csv files say, as a transfer medium, but it is designed to be manipulated through relational calculus, which is not easily mapped to xslt.

You are using tools not because they map well to the problem space, but because they are the tools you have, which your customers want, and which you are familiar with, but that does not mean they are actually mapping well to the domain.

patrickg · on Dec 6, 2014

Actually the XML part is the one that maps best to the problem space. I deal with structured documents (such as part/title/paragraph ... or product group/product/components). The database part (usually Excel sheets or SQL based stores) are the ones that are "insufficient".

radicalbyte · on Dec 6, 2014

The problem is that it's often misused. Using an XSLT that takes 0.5s to run to transform some XML to HTML during web requests on a busy site? Idiotic.

Using it for async transformations - html to pdf, customer message format to your message format. Fine.

crdoconnor · on Dec 6, 2014

The problem is not that it is misused. The problem is that there is a small subset of problems for which it works passably (like many other technologies) and a large universe of problems for which it will cause you massive pain.

brunnsbe · on Dec 6, 2014

Using compiled XSLT-translets in Java (XSLTC) has a huge impact on performance, with transformations taking 0.5s the problem seems to be in the way your XSLT is written and pushing or pulling the data, not XSLT itself: http://xalan.apache.org/old/xalan-j/xsltc_usage.html

Mikhail_Edoshin · on Dec 6, 2014

Compiled XSLT stylesheets must be very efficient; they're automata, like regular expressions.

duncanawoods · on Dec 6, 2014

I believe the fundamental pain of xslt was... that it was an FP language. When teaching XSLT, the difference between those who said "its elegant" vs. those who said "its pain" - is whether the individual could grok FP.

Angle bracket overload, verbosity of end tags, library support, poor whitespace handling, namespace pain were all obstacles too but it was FP that made standard problems feel like math proofs and for developers to take days to solve problems they could code in minutes in their usual OO/imperative language.

When I see the pain FP causes in the real world I'm never quite sure whether it's nature or nurture. I currently believe it's a bit of both but the nature part will always hobble FP adoption - if you find algebraic proofs elegant, you will like FP. If you are "normal" and proving a theorem fills you with terror then you would prefer your programming language to resemble a cookery recipe.

I also believe all templating, especially for code-generation, requires three brains - understanding the input data-structure, understanding the processing of the template and understanding the behaviour of the output. Each keystroke in your templating language has to be carried out with full understanding of all three parts. It's too much for those who still struggle with more common two brain programming problems.

VMG · on Dec 6, 2014

> I believe the fundamental pain of xslt was... that it was an FP language.

From the post I linked in my other comment

> Oh, and the fact that you can call a language functional when it lacks first class functions makes my eye twitch. I'm tempted to upload a video of my eye twitching just to prove it.

sqrt17 · on Dec 6, 2014

That seems like the difference between object-based (early VB) and object-oriented.

XSLT is referentially transparent (no setf for you) but withholds from you most of the goodies that people take for granted in functional or logic programming.

You could see it was written by well-intentioned FP enthusiasts. IMO the best alternative at the time when XSLT was developed would have been XMLPerl - embedding an imperative language in something that deals with the XML-specific parts appropriately. But Perl was never enterprisey enough, and XmlPerl died a painless death.

chc · on Dec 6, 2014

> I believe the fundamental pain of xslt was... that it was an FP language. When teaching XSLT, the difference between those who said "its elegant" vs. those who said "its pain" - is whether the individual could grok FP.

I've heard this before, but I don't find it to be true for me anyway. XSLT has never really clicked for me, while I really like Clojure and OCaml. Maybe the FP is part of the problem, but I also think XSLT is just a particularly obtuse functional language. XSLT makes it hard to figure out how to express even moderately complex algorithms (e.g. a map-reduce function is literally just that in Scheme, while I'm not sure I could write one correctly without several tries in XSLT) — and once you do express them, they're buried under an impenetrable mound of XML boilerplate that makes them hard to maintain or understand later.

moonshinefe · on Dec 6, 2014

That was my experience working on a huge XML/XSLT project recently. Agreed.

pja · on Dec 6, 2014

The FP bit was fine. It's wrapping the functional program in the most obtuse syntax known to man (apart from C++ template syntax which at least has the get-out clause that it was never meant to be a turing complete programming language that programmers took seriously and used to get real work done) that's the problem.

Full disclosure: My last exposure to XSLT was a long, long time ago and I've been carefully avoiding it ever since.

rumcajz · on Dec 6, 2014

It's more of a declarative language. Like Prolog.

justincormack · on Dec 6, 2014

I don't think it's really either a functional programming language (no first class functions), or a logic programming language like Prolog (not implemented via unification). It is a declarative language that is in an independent category.

dalke · on Dec 6, 2014

I really enjoyed proving theorems for analysis - I have an undergrad degree in mathematics - but not so much algebraic proofs. Where does that put me? :)

In any case, we know that algebraists who program also use Maple, Magma, Mathematica, R, and Sage, or even straight Python, C, etc. FP languages are a minority even in the professional mathematics world.

kabdib · on Dec 6, 2014

I once worked on an embedded system with not a lot of RAM or code space, and we were ripping features out to put new stuff in and rewriting things to make new things fit (not a bad practice, re-writing, btw). I wrote some tools to do global optimization at the assembly language level and got back 40K because our compiler was one of those $5000 / seat pieces of crap, and we didn't have a debugger, so object-to-source-line mapping, who needs it?

Anyway, one of the PMs on the project insisted that this choice little hunk'O'hell communicate with the outside world -- at about 2K bits / sec on a good day -- using XML. Because it was standard. Because XML added to anything makes it better, no matter what it is. Because, well, nobody ever saw that traffic except other computers, but XML!

I wanted to kill, kill, kill, but instead I wrote an XML parser (kind of) that fit into about 1K of code. "Don't use entities or CDATA or namespaces," I said, and went away to a better place. I think the PM was happy. That group got sold to a company famous for its 60s-era radios and crappy teevees, and I assume everyone is happy there, because I have not heard a word from them, and XML!

chinpokomon · on Dec 6, 2014

I've seen XSLT done well, and I've seen the mess it can be when written by someone with only a passing knowledge. As a consultant I usually have dealt with the latter, and that usually goes hand in hand with a poorly designed XML schema.

One space where XSLT can demonstrate its strength is when transforming some horribly serialized interoperability data structure. If the system from which you are receiving data produces terse XML, you aren't going to solve anything by rewriting the upstream system to produce equally lousy JSON. If you don't have the ability to fix the upstream service to produce better structured data, XSLT and XPath are wonderful tools to morph it into something more manageable. That transformation process is better written with XSLTs than it is in trying to do the same thing by slurping the data directly into some business object and trying then to work with a bad model. Don't go down the path of "garbage out, garbage in."

If you have access to both sides of the process, it might be worth rewriting the upstream system, but when working with a legacy system XSLT might be the best glue technology in your arsenal.

ohyesyodo · on Dec 6, 2014

Okay, so what am I supposed to use if I want to transform an XML file from one format to another because two different systems need XML for input/output and they have different fixed formats?

crdoconnor · on Dec 6, 2014

I would use a language that has a decent XML parser (e.g. python + lxml) for input and a decent templating language (e.g. jinja2) for output.

Assuming it was a simple transformation, the python parsing code could be under 10 lines. Most of what you wrote would be templating. It would be 98% declarative.

If it got complicated though (e.g. you're doing some aggregation or something), the python bit would grow but it would probably never end up looking that horrendous, unlike XSLT.

The same pattern could be applied to many other language ecosystems. You just need to make sure you get the best XML parsing library and the best templating language.
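
A rough sketch of that parse-then-template pattern, with a loud caveat: to keep it runnable with the standard library alone, it swaps in `xml.etree.ElementTree` for lxml and `string.Template` for jinja2. The source and target formats are invented for illustration.

```python
import xml.etree.ElementTree as ET
from string import Template

# Source and target formats are made up for this example.
SOURCE = "<order id='42'><line sku='A1' qty='2'/><line sku='B7' qty='1'/></order>"

ORDER_TEMPLATE = Template("<purchase ref='$ref'>$items</purchase>")
ITEM_TEMPLATE = Template("<item code='$sku' count='$qty'/>")

# Parsing step: pull out only the values the target format needs.
order = ET.fromstring(SOURCE)
lines = [line.attrib for line in order.findall("line")]

# Templating step: the shape of the output lives in the templates.
items = "".join(ITEM_TEMPLATE.substitute(attrs) for attrs in lines)
result = ORDER_TEMPLATE.substitute(ref=order.get("id"), items=items)
```

Note that `string.Template` does no escaping, so real input values would need to be escaped before substitution; that is one reason to reach for a proper templating library instead.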

ilitirit · on Dec 6, 2014

You could do the exact same thing using XSLT. eg.

    var result = new XSLTProcessor().importStylesheet(xsl).transform(....);

It's a bit of a pointless example because it really depends on the transformations you need. I'm sure in some cases XSLT would be better for the job, and in other cases another language. Most of the time it would generally just depend on your environment, available tools and skillset.

crdoconnor · on Dec 6, 2014

>I'm sure in some cases XSLT would be better for the job

In some (simple transformation) cases XSLT would be no worse, but mostly it would be worse. I can't see it being clearer or easier to maintain under any circumstances.

Once your code evolves toward doing anything mildly complicated you'll wish you had never written your transformation in XSLT.

jgalt212 · on Dec 6, 2014

I don't know if you're trolling, but here's how I'd do it in Python

  import xml.etree.ElementTree as ET
  tree = ET.parse('data.xml')
  new_tree = my_transform(tree)
  new_tree.write("output.xml")

ohyesyodo · on Dec 6, 2014

I'm not sure if you are trolling, but you left out the actual transform. In my experience, XSLT feels optimized for transforms, which most other languages aren't. I also dislike XSLT, but whenever I do things like this in Python, C# or C++ it tends to get messier than my XSLT when the transforms are nontrivial.

jgalt212 · on Dec 6, 2014

I'm getting downvotes here? Obviously, not from ohyesyodo b/c he does not have enough karma to allocate downvotes.

ohyesyodo · on Dec 8, 2014

You are probably getting downvotes because your response did not make any sense.

jgalt212 · on Dec 11, 2014

If you know Python, my response makes perfect sense, in that it's easy to transform xml string -> python tree -> new python tree -> new xml string or json string or csv string.
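For what it's worth, the pipeline described above might look something like this with only the standard library. The input document, the target layout and the body of my_transform are invented for illustration; the original comment never showed them.

```python
import csv
import io
import json
import xml.etree.ElementTree as ET

# Hypothetical terse input document.
SOURCE = "<people><p n='Ann' a='31'/><p n='Bob' a='27'/></people>"


def my_transform(root):
    """Build a new tree with a different (made-up) layout."""
    new_root = ET.Element("staff")
    for p in root.iter("p"):
        person = ET.SubElement(new_root, "person")
        ET.SubElement(person, "name").text = p.get("n")
        ET.SubElement(person, "age").text = p.get("a")
    return new_root


# xml string -> python tree -> new python tree
new_root = my_transform(ET.fromstring(SOURCE))

# new tree -> new xml string
as_xml = ET.tostring(new_root, encoding="unicode")

# new tree -> list of dicts -> json string
records = [{child.tag: child.text for child in person} for person in new_root]
as_json = json.dumps(records)

# list of dicts -> csv string
buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=["name", "age"])
writer.writeheader()
writer.writerows(records)
as_csv = buf.getvalue()
```

The transform here is trivial on purpose; whether this stays readable for nontrivial transforms is exactly what the thread is arguing about.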

snarfy · on Dec 6, 2014

I like this one:

>“XML combines the efficiency of text files with the readability of binary files”

One dirty thing about XML is that it appears human readable, but it is not human writable. You'll see something in the XML that you think you can change, but now it doesn't validate anymore after you change it using a text editor. You need an XML editor that understands XML validation to make edits. If I cannot use a basic text editor, it's not basic text. If it's not basic text, it's no better than any other binary protocol, albeit a very inefficient one.

Mikhail_Edoshin · on Dec 6, 2014

How do you manage to change C or Python source code? Omit a ";" in C or ":" in Python and it will refuse to compile.

pm24601 · on Dec 6, 2014

I normally don't do +1 on a one-line answer but this was the perfect rebuttal.

I wonder how s/he does handle that pesky ';'

hokkos · on Dec 6, 2014

What article are we supposed to look at? It is witty quotes and links; there is no substance in this article, just bashing. Is it supposed to be an answer to this article that had a lot of discussion on proggit? http://www.reddit.com/r/programming/comments/2o5nvy/why_i_li...

mhd · on Dec 6, 2014

I wasn't surprised, given the domain. Plan 9 fanboys tend to have a very simplistic view of the world and think that "KISS", "do one thing well" and "just pipe stuff" is the universal solution to all problems.

If the alternative looks like a (t/n)roff file, I'd gladly take XML, though.

brokentone · on Dec 6, 2014

Reminds me of the person (people, team, etc) who decided it was a good idea to code websites in XML + XSLT. Because XHTML is just not good enough. The most prominent of these (I'm shocked it's still this way) is http://www.skechers.com/

lsaferite · on Dec 7, 2014

Why are you shocked? Investing in a software stack for a website is non-trivial. If it's working for them then it's throwing money away to change it right now.

pm24601 · on Dec 6, 2014

The problem with most tools is that they go through the "bright and shiny" stage:

  "New tool - cool! Let's use it on EVERY problem."

And are thus misused.

XSLT exists for inter-organizational data transfer and transformation. Don't use it for any other situation.

XML is a good (not perfect) persistent data storage mechanism where you need the data to outlast the program that created it.

I go into more explanation here: http://sworddance.com/blog/2014/12/06/xml-is-not-bad-just-mi...

Let's not blame a tool that was misused.

considerjoost · on Dec 6, 2014

If you ever find yourself somewhere you have to work with XML data and the people there use things like XSLT (and suicide is not an option) you should consider using Joost. Joost implements the lesser known STX language. It's a far cry from Haskell or other better tools but nonetheless more useful and practical than XSLT for tasks like cleaning and extracting interesting things from blobs of XML. It's available here: http://joost.sourceforge.net

mqsiuser · on Dec 6, 2014

Software hackers will come up with a single right solution. This solution is cemented in theory. Transforming one (message) tree into another must be done in the following way: http://www.usethetree.com

Don't just downvote me. Challenge me!

Edit: Maybe downvote & challenge me :-) ?

cenazoic · on Dec 6, 2014

Anyone tried the "XSLT-powered open source content management system", http://www.getsymphony.com/?

ogig · on Dec 6, 2014

I've made several small sites with it. It had some great ideas: the data model was totally flexible while easy to use, and XML + XSLT as server-side templating was nice to work with. The in-browser developer tools were good too.

Unfortunately there were also a bunch of bad points that never got fixed. Breaking changes for plugins exhausted the small contributor community. I think the project is basically dead at this point and I've moved to another CMS for small sites. (Keystone.js)

All in all, if I were to rewrite symphonycms now, I would drop XSLT in favour of jade or something less annoying to write in.

EDIT: I've been browsing the symphonycms repo after writing this and it's untrue to say the project is dead, since Brendo is still actively committing.

based2 · on Dec 6, 2014

http://cocoon.apache.org/1365_1_1.html

jules · on Dec 6, 2014

The syntax of XSLT is obviously godawful, but the architecture was sound.

icantthinkofone · on Dec 6, 2014

I think I should collect a bunch of quotes espousing the virtues of XML and XSLT and put them on a web site so people can link to them.

lafar6502 · on Dec 6, 2014

XML is technically a superset of JSON; I just can't imagine any clear advantage of JSON over XML. It's the same league, with some minor syntactic differences. And what is the end of the story for JSON is just the beginning in XML: the concept of namespaces is powerful and totally foreign to JSON. Then XML Schema, another job well done and poorly mimicked in some JSON libraries. XSLT just builds on these concepts and is not a general purpose tool; I don't see a reason to have a strong opinion about something so narrowly specialized.

Mikhail_Edoshin · on Dec 6, 2014

I really don't understand how programmers can hate XSLT (and XML for that matter). It's a marvelous piece of technology, truly. Maybe the whole stack is so ahead of its time that nobody gets it?

miohtama · on Dec 6, 2014

XSLT is a 15-year-old technology. Whether it is painful or not should be obvious today, even before starting a project.

tete · on Dec 6, 2014

cat-v is actually pretty nice. A lot of their stuff looks a bit like "Oh my god, why" and "That person just doesn't have a clue". I thought the same with so many technologies that I got into. I always thought KISS was nice and I always felt like abstraction is a good way to reach that KISS.

Turns out it is not. Things can fail, really badly even and they do fail, really often even, even when there are companies with big pockets standing behind it and then good luck debugging the mess. This is true even for really proven technologies, but might be less obvious on those.

At some point when you are good at some technology, even if it's really popular and mainstream, be it some big SQL database (yes, all of PG, SQLite and MySQL, even though I love some of those), C, C++, Java, C# or Python/Ruby/Perl/Node.js, you will constantly end up dealing with the implementation of the underlying technology.

I am not saying XSLT or any of the above don't have their use cases, but rather that a lot of them are over-engineered. Using most of these technologies, I know there are issues; I send bug reports and patches and hey, things get fixed really quickly. That's all good and you can never fully avoid these things, but the more simpler you get the less there is that end up biting you and causing you to start out with, let's say, your SASS-to-CSS compiler having an issue, then digging deeper through every library and finding a GCC bug or whatever. Such things happen... more often than one would think. So, given developer pay and the issues it causes (often being a blocker for a whole team), that's actually a really big problem.

danieldk · on Dec 6, 2014

>but the more simpler you get the less there is that end up biting you

And increasing the probability that you reinvent the wheel, badly. I have been using libxml and libxslt for years and as far as I remember I never encountered a bug. Both projects have been developed for years and are used by a gazillion other projects.

It is many times more likely that you will be bitten by a bug in your own custom configuration file parser than e.g. in libxml2.

I am not arguing for or against XML, but code reuse. Simplicity also means not reinventing the wheel and keeping your own projects simple by leveraging existing, good, libraries.
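As a small illustration of that point, here is roughly what leaning on an existing parser for a configuration file looks like. This is only a sketch: it uses Python's standard xml.etree.ElementTree (which is backed by expat, not libxml2), and the config layout and the names load_config and is_well_formed are assumptions invented for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical configuration file contents.
CONFIG = """<config>
  <option name="host">example.org</option>
  <option name="port">8080</option>
</config>"""


def load_config(xml_text):
    """Return the options as a dict; the parsing itself is the library's job."""
    root = ET.fromstring(xml_text)
    return {opt.get("name"): opt.text for opt in root.iter("option")}


def is_well_formed(xml_text):
    """Report whether the library's parser accepts the text."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


settings = load_config(CONFIG)
```

Quoting, escaping, encodings and error reporting are all handled by code that many other projects already exercise, which is the kind of reuse being argued for.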