
I avoid widgets for data exploration, which should be written from the start in a well-tested and library-focused sort of way even when it's ad hoc.

Model tuning absolutely should not be done with something like a widget. In fact, I could see how that could easily lead to unreported issues with multiplicity of testing when someone's just sliding around a slider and seeing what looks best, oblivious to the statistical consequences. Model tuning is better handled by having a separate sort of model specification file, in which parameters, data cleaning steps, etc., have to be registered ahead of time before any code is executed whatsoever. That allows full transparency and reproducibility: you can even map model specs to unique IDs and backtrack to which analysts executed the job to fit that model, how many times it was updated, etc. ... whereas someone monkeying around in a notebook with a slider bar, that's absolutely not OK. It would be OK for demoware, but never ever for serious production cases.
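To make the spec-file idea concrete, here is a minimal sketch in Python. The names `ModelSpec`, `spec_id`, and `SpecRegistry` are all hypothetical and only illustrate the pattern, not any particular shop's system. It assumes a spec is just a model name, a parameter dict, and a tuple of cleaning steps, that its ID is a truncated SHA-256 of its canonical JSON, and that a run is refused unless its spec was registered first.

```python
import hashlib
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class ModelSpec:
    """Hypothetical spec: everything that defines a fit, fixed up front."""
    model: str
    params: dict
    cleaning_steps: tuple


def spec_id(spec: ModelSpec) -> str:
    # Canonical JSON (sorted keys, fixed separators) so that the same spec
    # always produces the same ID regardless of dict insertion order.
    canonical = json.dumps(asdict(spec), sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:12]


class SpecRegistry:
    """Hypothetical in-memory registry mapping spec IDs to specs and runs."""

    def __init__(self):
        self._specs = {}  # spec ID -> ModelSpec
        self._runs = {}   # spec ID -> list of analysts who ran it

    def register(self, spec: ModelSpec) -> str:
        sid = spec_id(spec)
        self._specs.setdefault(sid, spec)
        self._runs.setdefault(sid, [])
        return sid

    def record_run(self, sid: str, analyst: str) -> None:
        # Refuse to log a run for a spec that was never registered.
        if sid not in self._specs:
            raise KeyError(f"spec {sid} was not registered before execution")
        self._runs[sid].append(analyst)

    def runs(self, sid: str) -> list:
        return list(self._runs[sid])


registry = SpecRegistry()
spec = ModelSpec(
    model="ridge",
    params={"alpha": 0.1, "fit_intercept": True},
    cleaning_steps=("drop_na", "winsorize_1pct"),
)
sid = registry.register(spec)
registry.record_run(sid, analyst="alice")
```

Every parameter change produces a new ID, so the number of specs registered for one analysis is itself a record of how many variants were tried, which is exactly what a slider hides.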

Avoiding widgets works great in practice. When I was working in quant finance, this was a reason why we heavily decoupled all data presentation code from all data exploration code.

We also realized that interactive plots add nothing 99% of the time and are virtually never worth the headache. Just use static plots until there's a serious use case that truly requires interactivity. Above all, don't use interactivity just because it's the shiny new thing.

bqplot, Bokeh, d3py, etc. are great engineering projects that unfortunately don't have pragmatic use cases and are generally adopted out of hype and an obsession with the new rather than out of pragmatism.

After working with lots of these tools, we began to realize that, e.g., mousing over line charts or maps and clicking to drill down into data points was simply not helpful. Streaming plots do have some applications when you have to view real-time dashboards, but even there they get abused and used incorrectly, mostly due to bad dashboard ergonomics (cough Bloomberg). So there can actually be a cost-effectiveness argument for avoiding the streaming dashboard anyway, much like the cost-effectiveness arguments for trimming alert menus to avoid alert fatigue. The cognitive foibles of the user matter to the design!

In a Tufte sort of sense, it just did not actually aid in perceptual understanding. It's more of a "let's do it cause we can" thing than a "let's do it because it actually offers actionable insight" thing.




Hey, Tufte also criticizes Excel, which is still the most widely used tool for analyzing data and making plots that are not static. Engineers love it. Anyway, I will stop my replies here.


Nah. Excel is terrible. Tufte's right about that one. I'm sure there are engineers who like Excel -- I mean Coldplay keeps selling albums, so?



