To dovetail off another HN thread (Free Springer math books)...after reading Bostock's piece, I suddenly realized that the classic Grammar of Graphics by Leland Wilkinson he quotes might be free...and it is! The 1999 version anyway:
On Amazon, the cheapest available price is ~$55. The second edition, which was printed in 2005 (and should be due to be free soon on Springer), costs $132+.
I'm a huge fan of D3. As I work on multiple platforms, not just those involving web tech and JavaScript, I've often wondered whether it would be better to put a lot of work into taking a more cross-language, cross-rendering-system approach to building libraries. I'm wondering if you've ever thought about how you could design something like d3, or extend it, to create an ecosystem where assets are shared across platforms/runtimes/mediums much more easily?
I'm often left frustrated by how many of the visualizations we make are simple in terms of graphical components yet hard to reuse, because we mix their implementations unnecessarily with the development environment (the DOM in d3's case). I think you are in a great position to get consensus on something like that. I was thinking of something more like a clean file format, runtime, and tooling for building reactive visualizations that can be reused, composed, and embedded into any environment by virtue of having a clean reference runtime that could be ported to many platforms: web, mobile, desktop, VR, and rich publishing/printing needs as well.
ps. I love how you've pulled out a smaller clean part here.
Most attempts to create cross-platform abstractions have been a failure because they (necessarily) oversimplify, and you are left with a common-denominator approach that lacks in expressiveness, performance, or both. It’s better to use the standards directly: Canvas, SVG, WebGL.
Also I think people greatly underestimate the synergy that comes from embracing (rather than abstracting) standards, in terms of the available documentation and tooling: the element inspector built-in to your browser, compatibility with external stylesheets, etc. D3’s success is likely due in part to its embracing web standards, for example using selections to manipulate the DOM (and SVG) rather than introducing a new graphical representation.
That said, for higher-level applications it might be desirable to abstract the representation. I just think people jump too quickly to generalize because they overlook the subtle features that make the underlying rendering systems good.
Yeah, that sounds right. By sticking to a platform you leverage existing tooling and skills, and reduce the scope of focus. Guess you can always share those .svg, .geojson, etc. files, and you get a lot of mileage from sharing effort across communities.
I think you are looking for something like Vega (http://vega.github.io/) which is more a declarative grammar for visualization (and also includes interaction and the ability to stream data). As far as I can see there is only one implementation of the grammar (in javascript) but there's no reason it couldn't be implemented on top of opengl, for example.
I haven’t used React enough to form a strong opinion on right vs. wrong, but one of my objectives in decoupling D3 (see the new modules at https://github.com/d3) is to make it easier for people to use just the parts of D3 they need.
The recent more-modular approach d3 is taking (specifically, the separation of DOM selection-driven APIs from the data manipulation stuff) means you can now just use d3 with React directly.
For example, you can use d3-scale to map from data to drawing coordinates, or straight to SVG path data (using d3-path/d3-shape). You can then easily render this to an inline SVG using React (or to some other backend using React libs like react-canvas and gl-react). Building the svg elements directly this way makes it easier to attach React event handlers to them, and gives you more fine grained control, rather than using off-the-shelf components.
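As a rough sketch of the idea above, not D3's actual API: the hand-written `linearScale` and `linePath` helpers below are hypothetical stand-ins for what d3-scale and d3-shape provide, and the data and chart dimensions are made up. The point is only that scales and path generators are plain functions returning numbers and strings, so nothing here touches the DOM.

```javascript
// Hypothetical stand-in for a d3-scale linear scale: maps a value from the
// input domain [d0, d1] to the output range [r0, r1] by linear interpolation.
function linearScale([d0, d1], [r0, r1]) {
  return value => r0 + ((value - d0) / (d1 - d0)) * (r1 - r0);
}

// Hypothetical stand-in for a d3-shape line generator: turns points into an
// SVG path data string ("M" = move to, "L" = line to).
function linePath(points, x, y) {
  return points
    .map((p, i) => `${i === 0 ? "M" : "L"}${x(p.t)},${y(p.value)}`)
    .join("");
}

// Made-up data and a 200x100 drawing area.
const data = [{t: 0, value: 0}, {t: 5, value: 50}, {t: 10, value: 100}];
const x = linearScale([0, 10], [0, 200]);
const y = linearScale([0, 100], [100, 0]); // SVG's y axis points down

const d = linePath(data, x, y);
console.log(d); // M0,100L100,50L200,0
```

The resulting string can be passed as the `d` attribute of a `path` element that React renders into an inline SVG, which is where event handlers would be attached.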
I wired things up using a Flux-like architecture, where the Store would periodically signal a data change and the component would render it, transitions and all. It all worked out pretty much as I expected.
You are one of my technology heroes! Sorry if this is off topic. Languages, my own and foreign, were my worst subjects through school. It was a huge accomplishment for me to learn to write grammatically. The tool I used to learn was a systematic approach to graphing sentences. Now I've stopped everything and set out on a journey to learn to write software to help younger students learn to analyze grammar, to understand the parts of a sentence, and to isolate the parts to understand the semantic meaning from structure. More importantly, to help students learn to create semantic meaning from composition.
My goal requires a drag and drop user interface to build tree visualizations. Starting from zero I've been focused on learning Javascript. I'm not interested in natural language processing with machines and machine learning. The best I've seen on this subject is Ben Podgursky's nlpviz[1] which uses Stanford's CoreNLP service with d3.js to create a parse tree visualization.
Do you know anybody who is working on this (English grammar visualization)? And are there resources available on the web? What I need is an interactive tree visualization builder.
Your sentence got passed through Stanford's NLP service, and here the result is visualized with d3.js[1]. Your use of the verb "to find" is quite interesting: depending on structure, it means different things. According to this graph it is a verb phrase (VP) followed by a noun phrase (NP:DObj) and a gerund phrase behaving as an adjective-phrase object complement. The verb "to find" can function in a couple of different ways. It can take only a noun-phrase direct object (NP:DObj) with no object complement, such as the noun "cat" in "to find a cat", with the semantic meaning of discovering something that wasn't there earlier. Or it can function like the verb "consider", which equates a noun direct object (NP:DObj) with an object complement (ObjComp) and therefore requires two grammatical structures following it, unlike the first case, which has only one.
What semantic meaning can we get from parsing whether one or two grammatical structures follow the verb "to find"? In the first case, a dog can find the cat; but in the second case, a dog can't consider, which is a higher-order cognitive function that requires reasoning. If I were building AI, I could already use Stanford's NLP service to determine programmatically how many grammatical structures follow "to find", and I could also deduce that the noun before the verb "to find" (in this case "you", or me) is something capable of higher-level logical reasoning: most likely a person or, perhaps, a dolphin, a parrot, or IBM's Watson in very rare, specific cases.
Any sentence can be parsed into its constituent parts. "You might find Stefanie Posavex's work inspiring" is a free standing grammatical sentence which is used as part of a larger sentence. It can be broken further conceptually into you might find something and Stefanie Posavex's work is inspiring. What I want to do is help students learn to take these last two sentences and mix them together with you and find into a meaningful larger sentence using interactive graphical visualizations.
A verb functioning as consider requiring two grammatical structures following it is one of a small number of ways a verb can behave. There are only a small handful of verbs that function as consider such as to think, to find, and to nickname.
Graphical visualization helps so much in understanding this. The visualization I linked to is only a proof of concept. If I used more specific labels on the nodes of the graph, you would understand the concepts more clearly than from me trying to write them out. I'm working day and night on this. I'm learning about closures, prototypes, and the difference between pseudo-classical inheritance and delegation in JavaScript. I've only minimally looked at how to work with tree structures. I'm so friggen' far away, I'm just a cook.
> I'm not interested in natural language processing with machines and machine learning.
I'm confused, why not? "English grammar visualization" is inherently an NLP problem. Both handwritten and statistically learned grammars are very large and therefore hard to visualize. I think it would be better to focus on a manageable language fragment. Some kind of interactive second language learning environment--but that probably already exists. What is it exactly that you want to do?
BTW: I wonder for how many people "graphing sentences" is an effective approach to language learning.
When I first got to college and realized that I was having a very difficult time with writing, I went to the grammar section in the school's library. After failing to work through a few of the more popular books on grammar, I discovered a small, obscure grammar book. The way the book worked was to first explain a handful of grammar concepts and then have the reader parse several sentences before moving on to more complex concepts. It was so systematic and methodical.
I reviewed the book a couple of years ago and, out of curiosity, searched for one of the concepts online. There were only a handful of search results; the book is obscure, and I think the author made up the word. One of the results linked to an article on the IBM website written by one of the Watson engineers. I wrote him an email because I imagined their engineers were using much more complicated concepts in linguistic research than a layman like myself would ever understand.
We were interested in two very different things. He is interested in helping a machine understand semantic meaning through grammatical structure while I'm interested in helping people understand semantic meaning through grammatical structure. Sure we are both using the same grammar book to parse a sentence. But how we are doing it is completely different. He is writing a program to do it and I'm writing a program to hold a child's hand while the child does it.
A sentence is a pattern of patterns. For the same reason Mike Bostock's work in visualization helps people understand patterns whether it is about the road to the White House or the road to the Super Bowl based on team salaries, visualization helps students with pattern recognition in their own writing.
we should work together! i'm working on grmmr (green-bridge.org) and we have a solid system for identifying grammar & visualizing it well. this system is built upon years and years of actual classroom usage. we're looking to build an app / a game that will let the user identify and analyze the grammar and get applauded for that
I mostly work on security or network stuff, and mostly in areas that are broken. D3 is one of few projects I've encountered which actually makes me happy software exists -- it makes beautiful and useful things, easily, for a lot of people. Thanks for your work in creating it!
Speaking about mobile first. Are there plans to slim down D3 itself?
d3-scale is just 18KB, great, but d3.min.js is 149KB and d3-scale relies on some more d3 addon JS files like d3-array, d3-color, etc. - so to sum up: 187KB minified JS for d3-scale including all dependencies, correct?
Tests have shown that the optimal goal is to keep a web app under 300KB of minified JS.
Including dependencies and localization, d3-scale is 69KB minified and 24KB gzipped. You can make it smaller by excluding optional dependencies if they are not needed, such as d3-time and d3-time-format; you can make it even smaller by creating a custom bundle using Rollup. Here’s an example:
So, yes. Making it easier for people to include only the code they are using is one of the motivations for breaking D3 up into separate modules (and adopting ES6 modules).
I'm not a d3 dev, but from what I've heard, the current d3.min.js is v3, the current standard. The various small modules that are coming out are v4, and are meant to entirely replace the larger d3 package. Ideally, the entirety of v4 should be smaller than the entirety of v3.
Thanks, sounds good. I would like to pick only specific parts to render some charts, to slim down the JS size. If that's possible with v4, or already with d3-scale today (as another comment suggested), that's great.
I'm a huge fan of d3, thanks for all of your work on it.
How is d3_scale different from the scales included in vanilla d3? What is the motivation? Should I plan on trying to migrate my existing d3 apps to use these new modules?
First, it represents the 4.0 API, so it’s a preview of what “vanilla D3” will look like sometime next year when D3 4.0 is released.
Second, it’s a library you can use independently of the rest of D3. So if you don’t need the other parts of D3--for example you’re using React to manipulate the DOM, or you want to render charts to Canvas rather than SVG--then you can just use d3-scale (today).
Not all of the new D3 modules have been released; for example, I’m still working on layouts and selections. So for simplicity’s sake I would say that most people should keep using the default 3.x build of D3 and wait for the 4.0 release. But keep an eye on these new modules, and perhaps tinker with them, so that you’re ready when 4.0 comes out.
It’s expensive (in terms of effort) to design for multiple screens, but often necessary. Responsive design isn’t something that can be transparently solved by technology because it’s a design problem to decide what you want to show on different screens. That said, once you’ve made those decisions, technology can make them easier to implement.
(For example, see ai2html for a simple workflow used by The New York Times to create static graphics at various resolutions: http://ai2html.org/)
For most #content today, it’s probably best to design for mobile first and then think about an enhanced (or “optimized”) display for desktop later. But there are plenty of visualization applications, such as systems monitoring, where you might want to design first (or even only) for a large desktop display.
It’s a lot easier to have rich interactive displays on desktop. Our tiny phones are powerful computers, but there’s only so much you can squeeze out of a tiny display that gets occluded by our fat, meatstick fingers. ;)
Hi Mike, nice to see you here. In your opinion, what would be the best way to visualize a bipartite graph where the edges have weights and other properties? More background if you're interested: the graph is a result from an LDA (Latent Dirichlet Allocation) analysis on a set of documents. There are a couple hundred topics and a few thousand words/phrases. Thanks!
I’m not sure; it would depend on what you want to “ask” your data / what you want to learn from the visualization. I would look for existing visualizations with similar applications for inspiration, and I would try a bunch of forms (with your data) to evaluate what works, and iterate.
Great open source work. I've used d3 on many projects and, perhaps interesting to some, have used it often not as a charting library, but as a view/event model for DOM manipulation with advanced UI requirements (think dc.js and crossfiltering). This is after noticing that d3 works with large data sets very quickly - even those directly adding, removing, and editing DOM nodes & data.
My question for you is, broadly, where the speed of DOM manipulation comes from. Typically DOM interactions are very expensive in terms of execution time. I searched the docs and very briefly the source code to no avail.
Cheers, and thanks for open sourcing your work. It has been invaluable to my projects.
The speed comes from doing less. :) Selections are designed so that you only modify the parts of the DOM that need to be updated, and don’t waste effort touching parts that are unchanged. The downside is that it can be more work for you, since D3 (unlike, say, React) relies on you to tell it what needs updating, rather than computing it automatically. But in my experience that trade-off is often worth it.
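To illustrate "doing less", here is a minimal sketch of a keyed join in plain JavaScript. It is not how d3-selection is implemented; `dataJoin` and the sample data are hypothetical. It only shows the bookkeeping that lets a caller touch just the elements that need to be created, updated, or removed.

```javascript
// Hypothetical sketch of a keyed data-join: compare what is currently
// rendered against the new data and classify each key.
function dataJoin(renderedKeys, data, key) {
  const rendered = new Set(renderedKeys);
  const incoming = new Set(data.map(key));

  return {
    // Data with no rendered element yet: create elements for these.
    enter: data.filter(d => !rendered.has(key(d))),
    // Data that already has an element: update it in place.
    update: data.filter(d => rendered.has(key(d))),
    // Rendered keys with no matching data: remove these elements.
    exit: renderedKeys.filter(k => !incoming.has(k))
  };
}

// Made-up example: "a" disappears, "d" arrives, "b" and "c" stay.
const onScreen = ["a", "b", "c"];
const next = [{id: "b", value: 2}, {id: "c", value: 3}, {id: "d", value: 4}];
const join = dataJoin(onScreen, next, d => d.id);

console.log(join.enter.map(d => d.id));  // [ 'd' ]
console.log(join.update.map(d => d.id)); // [ 'b', 'c' ]
console.log(join.exit);                  // [ 'a' ]
```

Only one element is created and one removed here; the two that persist can be updated selectively or left alone, which is the part D3 leaves to the caller.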
One more question, from a lady at the airport listening to me talk about d3 and some back story:
You've had the opportunity to monetize d3 in the beginning, yet didn't (paid model). When it garnered reputation, you didn't implement freemium or other cost-free models. (Ads, consulting, enterprise, etc)
At all stages you didn't monetize even when it'd be deemed fine. Why?
I don't know why, but as a student developer I really appreciate not bloating it with ads or freemium or whatever. I am of the opinion that scientific tools are best developed non-profit and open-source.
Hey Mike, would I be able to break it down even further and get JUST the d3 linear scale?
Also, I've noticed that d3-axis as a separate module isn't ready yet. D3 scale and axis I have found to be superb when crafting graphs with Angular - is the d3-axis module a low or high priority for you?
Yes, using Rollup you can make a smaller build. (See other comment where I linked to an example.)
d3-axis depends on d3-selection and d3-transition, so it’ll be one of the last modules to be released, I expect. That said, most of the functionality is in a sense already available in the scale’s ticks and tickFormat methods.
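To make the point about ticks concrete, here is a hand-written approximation, not D3's code. The `ticks` and `format` helpers are hypothetical stand-ins for a scale's ticks and tickFormat, and the domain and range are made up. Given tick values, labels, and positions, drawing the axis marks yourself is a small step.

```javascript
// Hypothetical tick generator: picks a step of 1, 2, or 5 times a power of
// ten so that roughly `count` ticks fit in [start, stop]. Assumes start < stop.
function ticks(start, stop, count) {
  const rawStep = (stop - start) / Math.max(1, count);
  const power = Math.pow(10, Math.floor(Math.log10(rawStep)));
  const ratio = rawStep / power;
  const step = power * (ratio >= 7.5 ? 10 : ratio >= 3.5 ? 5 : ratio >= 1.5 ? 2 : 1);

  // Multiply integer indices by the step rather than accumulating sums.
  const first = Math.ceil(start / step);
  const last = Math.floor(stop / step);
  const result = [];
  for (let i = first; i <= last; i++) result.push(i * step);
  return result;
}

// Hypothetical formatter standing in for a scale's tickFormat.
const format = value => value.toFixed(0);

// Map each tick to a pixel position with a plain linear mapping; this is
// the information an axis needs in order to draw its marks and labels.
const domain = [0, 100];
const range = [0, 500];
const position = v =>
  range[0] + ((v - domain[0]) * (range[1] - range[0])) / (domain[1] - domain[0]);

const axis = ticks(domain[0], domain[1], 5).map(v => ({label: format(v), x: position(v)}));
console.log(axis.map(t => `${t.label}@${t.x}`).join(" "));
// 0@0 20@100 40@200 60@300 80@400 100@500
```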
Ordinal scales don’t have a continuous input domain, so you can’t plug them into the zoom behavior like a linear scale. You could have the zoom behavior transform the scale’s output range instead (or equivalently, use the zoom behavior to apply a transform to the SVG). Can you elaborate on when you would want to do this? It might make a good example.
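A sketch of the range-transform idea, assuming a zoom transform with a scale factor `k` and a translation `tx`. The `ordinalScale` and `zoomRange` helpers are hypothetical, not D3 functions. Zooming stretches the output range, so fewer categories fall inside the viewport.

```javascript
// Hypothetical ordinal scale: gives each category an evenly sized slot
// within the output range [r0, r1].
function ordinalScale(categories, [r0, r1]) {
  const step = (r1 - r0) / categories.length;
  const index = new Map(categories.map((c, i) => [c, i]));
  return {step, position: c => r0 + index.get(c) * step};
}

// Apply a zoom transform (scale factor k, translation tx) to the output
// range instead of the non-continuous input domain.
function zoomRange([r0, r1], {k, tx}) {
  return [r0 * k + tx, r1 * k + tx];
}

const rows = ["a", "b", "c", "d"];
const width = 400;

const base = ordinalScale(rows, [0, width]);
// Zoom in 2x and pan left by 200px: the range becomes [-200, 600].
const zoomed = ordinalScale(rows, zoomRange([0, width], {k: 2, tx: -200}));

// Categories whose slot still overlaps the viewport [0, width].
const visible = rows.filter(r => {
  const start = zoomed.position(r);
  return start + zoomed.step > 0 && start < width;
});

console.log(base.step, zoomed.step); // 100 200
console.log(visible);                // [ 'b', 'c' ]
```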
Not the original questioner, but you can imagine trying to implement something like a TableLens in D3 (https://www.cs.ubc.ca/~tmm/courses/cpsc533c-04-fall/readings...), and zooming in the ordinal row axis just acts as a window onto a subset of the data.
Hello! A more general question: What are some of your favourite resources (books, articles, conferences) on information visualization? I'm familiar with the works of Tufte and Cleveland, but I was curious if you have any other recommendations. Thanks!
I read an article about the similarities and differences between data art and data visualisation a short while back (couple of weeks?), but I really can't seem to find it right now. Mike Bostock's visualisations, though very visually pleasing, always tend towards the functional side of the spectrum, serving a purpose of enhancing the comprehension of concepts or relationships that lie in the data, rather than being designed for purely æsthetic aspirations.
There's obviously a place for both visualisations and art, though, and the line between the two is not always clearly defined.
>To this end, D3 espouses abstractions that are useful for any visualization application and rejects the tyranny of charts.
Hmm, a lot of projects floundered when they got that (meta/abstract)ambitious, and forgot the plainer beginnings that made them successful. Hope D3 can pull it off.
This is not a new ambition; it has been my goal since starting Protovis in 2009. It’s working out okay so far, but I totally agree that things can go off the rails if you spend too much time thinking about abstraction and not about practical examples and real-world usage.
Is the idea to have the next version of d3 be completely modular? So that if we use a tool like webpack, we'll be able to pull in the parts that we need?
Mike, I'm a big fan of your work. Have been using it since early releases.
I'd love to hear your thoughts on how to make visualization more appealing for non-programming users who want to interact with data beyond the basic clicks and drops.
> D3 ... rejects the tyranny of charts
Let me share a couple of counter-examples that attempt to accomplish the opposite, albeit for the common good:
What you see is a set of charts built with D3 under the hood and placed on grid layout with a mix of configuration settings and basic control structures. What's unusual in this approach is that it introduces end-users to programmable visualization through a simplified DSL. It helps users step outside of GUI editor sandbox and yet it doesn't expose them directly to JavaScript and SVG.
I'd appreciate your critique of this approach in general, not necessarily our implementation of it :) Is this level of "tyranny" acceptable in your view?
I personally make the better part of my living off being able to do custom D3 visualizations with some degree of speed and flexibility, so thank you Mike for your continued contributions and making it free for the rest of us. I'm seriously looking forward to D3v4 ;)
Hey Mike, glad you're doing an AMA.
I have a few questions:
1. There is no great JavaScript library for manipulating data, like pandas for Python. Are you interested in starting one as a d3 module? This could benefit from d3 being a standard and from your experience in writing JavaScript code.
2. Do you regret the original .enter() .exit() functions? These were powerful, but were a barrier for entry because they were hard to grasp. Do you wish you'd gone full functional from the start?
3. How do you get paid? Are you sponsored? Are you looking for sponsors?
2. Not at all; I still think the data-join is a powerful concept for transforming the DOM based on data, and the new work on non-DOM D3 modules does not reflect a shift in my opinion. I’m simply decoupling D3 so that people can use the parts independently. I’m still developing the new d3-selection and d3-transition modules, and those will be released soon, but I’m saving them for last because I want to let the design “bake” to make sure I’m happy with it before release.
3. I am not currently paid (other than a buck or two I make off stickers). My wife is currently my de facto sponsor. ;) I’d like to find a way to make this financially sustainable, but I don’t know yet exactly what that will entail, and I’d like to get 4.0 released first.
Mike, your work never fails to inspire. The cubism slides (http://bost.ocks.org/mike/cubism/intro/#0) read like a revelation. It astounded me how much data you manage to beautifully convey.
w/r/t #3 - I just did the same thing with Semantic UI and our 2.0 release. Took a year off after working at The New Republic, but kept a desk there to work on open source while receiving a modicum of donations from the community. Was a hard year but the quality of work that the freedom provided was worth it.
Perhaps you should consider a donate button in the docs for the time-being while you're going down this path with the project.
Wes McKinney (author of pandas) built something like pandas for JavaScript as part of his (now acquired) startup datapad.io. I think he mentioned open sourcing it at some point, but he's probably too busy building Ibis at Cloudera now.
Nomenclature is becoming confusing. In "R" (the statistical programming language), vector is analogous to column, and now in D3, vector is analogous to row.
Could just be the difference between storing arrays in memory using row-major order vs. column-major order. Not sure which ordering JavaScript uses, but R is definitely column-major.
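For what it's worth, JavaScript has no built-in two-dimensional array type, so the layout is a convention rather than a language property. A small sketch of the two orientations, assuming the terminology clash is between an array of row objects and R-style column vectors; `toColumns` and `toRows` are hypothetical helpers and the table is made up.

```javascript
// Row-oriented: an array of records, one object per row.
const table = [
  {name: "a", value: 1},
  {name: "b", value: 2},
  {name: "c", value: 3}
];

// Convert to column-oriented: one array ("vector") per column, as in R.
function toColumns(rows) {
  const columns = {};
  for (const row of rows) {
    for (const [field, value] of Object.entries(row)) {
      (columns[field] = columns[field] || []).push(value);
    }
  }
  return columns;
}

// And back again: build one object per index across all columns.
function toRows(columns) {
  const fields = Object.keys(columns);
  const length = fields.length ? columns[fields[0]].length : 0;
  return Array.from({length}, (_, i) =>
    Object.fromEntries(fields.map(f => [f, columns[f][i]])));
}

const columns = toColumns(table);
console.log(columns.value);   // [ 1, 2, 3 ]
console.log(toRows(columns)); // same records as `table`
```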
Would love to see a design surface like Webflow for D3. Then you would get high-fidelity 'create' mode while retaining fine control of SVG programmability.
http://link.springer.com/search?query=grammar+of+graphics