I work with chemical and materials companies on full stack data capabilities and interactive viz / dashboarding is a recurring pain point. Frequently there are larger scale processes (beyond a single engineer’s or scientist’s scope) that still require scientific-level interactive viz that the likes of PowerBI or Tableau can’t (or won’t?) provide… if your company even has a subscription. Things like being able to dynamically re-group/nest variables and recalculate statistical tests, dual axis capabilities, automatic sig fig number reformatting, just all kinds of quality-of-life features that in some cases are possible but in most cases are considered extreme edge cases and require too much manual config / aren’t templatizable.
Of course on the other end you’ve got the whole Python/matplotlib/seaborn/bokeh/plotly/vega/altair “ecosystem” (although it’s more of a swamp if you ask me), which require someone to maintain Python code and a means to stand up an internal server. Not to mention that most use cases require significant customization. Plotly Dash always seems somewhat promising but as someone below mentioned it’s actually kind of slow? Every time I try it I’m just kind of underwhelmed.
I hear ggplot in R is good but I’ve never used R and it’s hard to get a critical mass of people in a company behind R so that’s kind of off the table.
The only programs that really get the aesthetics of scientific plotting right without a ton of customization are JMP, Origin, and Igor Pro (props if you’ve heard of it), but these are all desktop apps… although JMP is starting to make a push into cloud-hosted stuff.
I guess all that is to say if anyone is interested in starting a company in this space, let me know.
Matplotlib works well for static plots. Altair and others freeze at around 4000 data points, which is crazy. Streamlit + matplotlib is impossible to maintain but is quick to get up and running.
Exactly. This is the exact stack I envision to be the future in that space. The work of Scott Logic is awesome (and the company looks really nice as well! People and values)
I completely agree with the entire stack. I’ve basically been learning d3 for this exact reason - the primitives are so intuitive and I can tell I’ll be able to make what I want. And yeah streamlit is so close to being useful but just not quite there. But isn’t plotly built on d3?
You'd still need to implement anything that's missing, such as custom selection widgets and data transformations (like other statistical tests), but I like the technical design as something to build on top of. It uses https://github.com/observablehq/plot under the hood, which aims to have just as flexible a grammar as ggplot (it's already quite capable) but with interactive features (it's built by the creator of d3 and uses d3 under the hood).
Igor Pro is nice but the underlying language which grew out of a set of macros is pretty rough to work with (by default works via side effects and mutation of global state, though there are ways to contain them). However, I know people who have built some nice GUIs on top of it. While it does have HDF5 I/O integration, there's no memory mapping and it will choke on larger data sets from what I recall.
I've heard promising things about Makie [1] in Julia; there is also capability to build a dashboard called Genie [2] (and a commercial dashboard builder [3]) though not sure if Makie and Genie play nicely together at the moment.
I agree with cactusfrog that d3 is a step in the right direction for dataviz. It really is a swamp out there, with twenty or so different ways to do similar things but usually not quite what you want (or at least not as easy as you want it to be)! I've been researching dataviz out of curiosity and annoyance with the current state of dataviz software for a while before this post popped up and would be interested in integrating a lot of the past decade-or-so of open-source work in this space to a tool for full stack data viz. I'd love to work with as many engineers who do data viz as I can! You can get in touch with me via email at puterproblems [at] proton.me if you're interested (and to find out more about me as I'm aware this is a brand new account :D).
Doesn't R require coding too? And my recollection of Igor Pro was from long ago (on a 68k Mac), but it also required coding in its bespoke scripting language. In fact I walked away from it for that exact reason... I wasn't going to spend brain cells on somebody's proprietary language.
I think that for things like dashboards, we're still stuck between "code" and "no code" tools. I don't know of a happy medium.
Vega/Altair is a declarative grammar, so I wouldn't dump it in with the others. It's also convenient because it reduces to json (easy to store) and has TypeScript libraries for native presentation in a browser.
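To make the "reduces to json" point concrete, here is a minimal sketch using only the standard library. The spec follows the Vega-Lite layout as I understand it; the field names ("temperature", "yield") and the data values are made up for illustration.

```python
import json

# A hypothetical Vega-Lite scatter-plot spec written as a plain dict.
# Field names and data values are invented for this example.
spec = {
    "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
    "data": {"values": [
        {"temperature": 300, "yield": 0.42},
        {"temperature": 350, "yield": 0.57},
        {"temperature": 400, "yield": 0.61},
    ]},
    "mark": "point",
    "encoding": {
        "x": {"field": "temperature", "type": "quantitative"},
        "y": {"field": "yield", "type": "quantitative"},
    },
}

# The whole chart is just data, so it round-trips through JSON unchanged.
stored = json.dumps(spec, indent=2)
restored = json.loads(stored)
```

Because the chart is plain data, you can keep `stored` in a database column or a config file and hand it to whatever renderer you use in the browser later.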
For the folks who don't like the default style of matplotlib, fonts, colors etc...
I made a stylesheet for my plots based on tailwindcss which you can use: https://www.bruderer.ai/blog/matplotlib
Additionally, Seaborn (https://seaborn.pydata.org/) is a great mention for people that want to use Matplotlib with better default aesthetics, amongst other conveniences:
"Seaborn is a Python data visualization library based on matplotlib. It provides a high-level interface for drawing attractive and informative statistical graphics."
I really like the book's subchapter on colours, wish it was even more elaborated on. Colours are one of the subtle things I so often find difficult to get right.
As to using matplotlib in published research: when I started out as an undergrad, everybody in the research team used OriginLab for plotting -- my impression of it then was pretty good. At some point, I started using matplotlib + LaTeX + SciencePlots and it caught on, mostly because there's no need to shift all the data around to a separate programme. The SciencePlots package does the heavy lifting with fonts and styling for specific journals, so it's just a matter of designing the right plot geometry and information density [1].
I can recommend proplot (https://github.com/proplot-dev/proplot) as a ”beautifier” wrapper for Matplotlib—particularly useful for scientific publications
My weapon of choice is still R + dplyr + ggplot2 - mostly because I have been using it for so long and know it by heart.
I'd love to try something new, but don't feel the whole Python world is it. Is there any modern take - doesn't have to be production ready but should show a promising future? Anything from a more modern ecosystem, like Rust or zig, maybe?
It probably depends on your interests but I'm excited about the efforts using GPU rendering for interactivity (it's related to vispy but I forget the latest status of whether vispy is the current focus... I think the last I remember is someone was prototyping a vulkan-based system).
Not necessarily in Rust itself. Actually I like R as a language. I was thinking more of something along the lines of Typst. A modern take on a proven concept, possibly with a Rust-inspired scripting language.
As an Engineer (but in a scientific field) I just love Matplotlib. I moved a number of years ago from Matlab and even Excel (love how I still see papers with that mid grey background and full magenta on figures).
I'm a poor data analyst and programmer but have always managed to do what I need, mostly just time series and scatter plots. Sometimes a little more involved for my ability: such as a set of heat maps with common scaling and some tiles omitted...
I see some comments about the quality of the rendering and the look, but I think the default is really good. Nice proportionality of text, lineweight etc IMO.
Anyway I'll get to my point. I really can't get any value out of the Matplotlib official documentation. I'm not going to criticize it, just say that it's not compatible with my brain. On this basis, good quality and accessible literature like this is very well received.
I find the same thing about the official docs. I can't really put my finger on the reason, except that it might be due to the vastness of MPL and the need for the docs to auto-update with each revision cycle.
When I do find useful docs, they're usually in the form of an example that I can use as a starting point. Referring to the docs comes later if needed. The consequence is that I probably use only 1% of MPL's capabilities, but that's already more than I could ever have imagined.
I've only been using Google Copilot for a few weeks, so it's too soon to know if that's how I'll deal with the situation in the longer term.
Nice book. But frankly TikZ and pgfplots are better for scientific visualization, unless your plot is too complex (in which case you have to settle for a lower quality plot).
TikZ does have a steep learning curve and can get pretty convoluted, but once you get a hang of it, you have a tremendous amount of control over what you are doing. I sometimes feel like TikZ is more like TeX instead of LaTeX given how primitive it can be, but it can be quite powerful if you give it enough time.
I would actually be very surprised if that is true. tikz is one of those examples (like e.g. M4 macros or assembly code or OpenSCAD) where producing working code requires significant mental arithmetic. Most skilled humans need to write down intermediate calculations on a piece of paper.
If you don't believe me, look at the "tikz unicorn" of GPT4 demo fame. Then try to replicate that with the currently available "degraded because of safety" GPT4 versions. Then ask yourself how on Earth you could use that to produce publication quality graphics.
I tried the "Draw a unicorn in TiKZ" in chatGPT now, and it actually looks even better than in that paper, and the code is very clean too. Not surprising as the quality of outputs from chatbots has steadily improved in the ~17 months since.
How do you use those outside of Latex? Lots of figures are coming from Matlab or Python, where Latex is a non-starter. Is there a cookbook to pollinate?
You can use matplotlib's pgf backend for direct use in LaTeX. That said, not sure I agree that's better for scientific viz. Perhaps in the past. Also tikz/pgf is much more fiddly and has its limitations too (imho).
Matplotlib is very easy to customize via “style sheets”, so you can make it match any style you want. Note that you can even turn on the “usetex” option and provide a preamble to compile all text/equations via TeX.
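For anyone who hasn't tried it, a minimal sketch of the style-sheet idea. The rcParams keys are real Matplotlib settings, but the values are an arbitrary "house style" I made up, and the TeX lines are left commented out because they need a working TeX installation.

```python
import io

import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt

# Hypothetical house style: real rcParams keys, arbitrary values.
house_style = {
    "font.size": 9,
    "axes.spines.top": False,
    "axes.spines.right": False,
    "lines.linewidth": 1.2,
    "figure.figsize": (3.4, 2.6),
    # "text.usetex": True,  # enable only if TeX is installed
    # "text.latex.preamble": r"\usepackage{amsmath}",
}

# Apply the style only inside this block; the global defaults are restored after.
with plt.style.context(house_style):
    fig, ax = plt.subplots()
    ax.plot([0, 1, 2, 3], [0, 1, 4, 9], label=r"$y = x^2$")
    ax.set_xlabel("x")
    ax.set_ylabel("y")
    ax.legend()
    buf = io.BytesIO()  # render to memory here; pass a filename to write to disk
    fig.savefig(buf, format="png", dpi=200)
```

The same key/value pairs can go into a `.mplstyle` file, one `key: value` per line, and be loaded by path with `plt.style.use(...)`, which is what makes the style reusable across scripts.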
No doubt it’s very customizable. But the display quality seems to me not as good as even matlab. At least with the defaults, which are a bit barebones. You have to work on it.
The syntax of python is also a bit verbose. Like, why do we need plt.show(), when I plot a figure obviously I want to plot it! Now I know, the point is, see matlab, gnuplot etc.
Compare with the default plot of gnuplot.
The limitations of latex are that you can’t do much calculation with data, and its memory usage.
Don’t get me wrong, I still sometimes go to python for plotting. But it’s a general-purpose programming language, and for specific applications there are custom tools.
I often use Matplotlib within a GUI. It's convenient to be able to defer the display of a plot until it's fully built, for instance if the plot is displaying data in a live update mode.
> The syntax of python is also a bit verbose. Like, why do we need plt.show(), when I plot a figure obviously I want to plot it!
That is kind of annoying. Fortunately, it's easy to turn that off by activating "interactive mode", in which figures automatically appear when you use a plotting command. You can do this via "plt.ion()", or have it set automatically with
Just keep in mind that when you do this, most plots will by default reuse the existing frame without clearing it. So you need to manually call plt.figure() more often (which in my opinion is more logical than plt.show).
There’s also a change in semantics regarding whether the script “hangs” until the user closes the plot window, if I recall correctly.
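A small sketch of the interactive-mode behaviour described above. I'm forcing the headless Agg backend so it runs anywhere, which is an assumption of the example; with a GUI backend the windows would pop up as you go, and as far as I know `plt.show()` only blocks when interactive mode is off.

```python
import matplotlib
matplotlib.use("Agg")  # headless for this sketch; use a GUI backend in practice
import matplotlib.pyplot as plt

plt.close("all")               # start from a clean slate
plt.ion()                      # turn interactive mode on
interactive = plt.isinteractive()

plt.plot([1, 2, 3])            # drawn into the current figure
plt.plot([3, 2, 1])            # reuses the same figure and axes
lines_in_first = len(plt.gca().lines)

plt.figure()                   # explicitly start a fresh figure
plt.plot([2, 2, 2])
open_figures = len(plt.get_fignums())

plt.ioff()                     # back to the default, where plt.show() is needed
```

The second `plt.plot` call lands on the same axes as the first, which is the "reuse the existing frame" behaviour; only the explicit `plt.figure()` gives you a new one.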
Don’t get me wrong either – I personally prefer Gnuplot by quite a margin, Matplotlib is indeed too verbose. These days I use Matplotlib more mainly because I use Python more.
I’m just saying that it doesn’t take too much configuration to make publication-quality plots in Matplotlib either, by e.g. enabling the TeX integration and setting up a reusable stylesheet file. The style sheet format is actually very reasonable, and several examples ship with Matplotlib. There’s also the Seaborn package if you want some nicer default stylesheets.
(Side note, the same goes for Gnuplot… As shown by e.g. gnuplotting.org you can definitely create publication-quality plots there as well.)
gnuplot + tikz terminal. I have yet to see better looking figures.
And technically, it's one of the rare solutions, if not the only one, that can give you bibtex-resolved citations in figure labels....
(Don't do it, the journals hate it)