Hacker News
Announcing ggraph: A grammar of graphics for relational data (data-imaginist.com)
115 points by adamnemecek on Feb 25, 2017 | 30 comments



More people should be aware of the work that goes into plotting data in R. All UI design is a subset of data visualization, and boy are the R people really ahead of the pack in data visualization.

Some other projects that might be of interest:

https://vega.github.io/vega/

http://ggvis.rstudio.com/

http://bokeh.pydata.org/en/latest/


"All UI design is a subset of data visualization"

Since UI design potentially encompasses almost all interactions with designed objects, you must be working with a very narrow definition of what a UI is.

What about the question of "parametric" vs "strategic" choices in UI design? Surely strategic design decisions require a level of investigation into the user's motivations (etc.) that is well outside the remit of data visualization.


I've done a lot of work with graph network libraries in R/ggplot2 for data visualization, and even figured out how to make the plots interactive in WebGL, scaling to hundreds of nodes with minor code changes.

Tutorial/Intro: http://minimaxir.com/2016/12/interactive-network/

Practical example using HaveIBeenPwned data: http://minimaxir.com/2016/12/pwned-network/

The catch is that this trick uses the ggnetwork library, which is less actively developed than ggraph. I remember trying to port the interactive code to ggraph without much success (ggnetwork serves more as a translator for native ggplot2 functions, while ggraph offers more flexibility). Now that the official release of ggraph is out, I'll give these types of visualizations another try.


I'm in talks with the plotly devs regarding ggraph support; once that materialises it should cover most interaction


Great to hear! I'll keep an eye out :)


That's fantastic!


For network visualization in R I've found this package to be extremely easy to use http://datastorm-open.github.io/visNetwork/

It performs well with large networks (> 1000 nodes) and makes nice javascript plots that can be zoomed, highlighted and dragged about.


This looks perfect for something I'm working on, thanks! Have been using d3 so far but have been hitting up against performance issues.


Try switching out svg for canvas...


For larger graphs (10K nodes, 100K nodes, ...) we built http://github.com/graphistry/pygraphistry, which leverages GPU client+cloud acceleration. We're happy to share API keys with folks doing fun things.

There's a REST endpoint and a Pandas (Python/Jupyter) convenience library, so should be quick to use even if the data is in a database. Our customers are primarily doing stuff like security incident investigations over Splunk event logs, for example.

For anyone interested in bigger datasets, just send a note to info@graphistry.com .


Nice job Thomas! I've gotten a hell of a lot of mileage out of ggraph already, your work has been tremendous. Eventually it will be nice to dump plots to JavaScript for poking purposes, but for publication there is nothing that comes close to ggraph. And I've tried everything.


Really happy you like it. Would love to see something published using it


Oh -- you will :-)


From the perspective of a Dot/Graphviz user, this appears to allow you to define a graph, then program how it's laid out. Do I have that correct?


Yes, you use already available network packages like igraph and network (where you can import dot files) and then use the declarative API of ggraph to specify your plot


Can anyone recommend software for showing how network graphs change over time? I have a time-series relation dataset of <subject, verb, object, timestamp> and I'm looking for ways to visualise it.


The development version of this package supports graph animation (not yet on CRAN): https://github.com/bwlewis/rthreejs


If you're in R this might work: https://github.com/statnet/ndtv


Depending on the number of time points you can use the faceting functions in ggraph to create small multiples


Sacha Epskamp's tutorial has a nice section on this:

http://sachaepskamp.com/files/Cookbook.html

Also the section on penalized Ising models is great, I haven't seen anything else like it.


About 10,000 relations, each with a unique point in time. Looks like the facet isn't quite what I want.


You can of course animate it like this https://twitter.com/thomasp85/status/694905779539812352 but a static representation is often easier to interrogate. I would try to bin the time points and facet on them...
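
To make the binning idea concrete, here is a rough sketch of how the <subject, verb, object, timestamp> relations could be grouped into equal-width time windows before faceting. It is plain Python rather than R, just to show the logic. The function name `bin_edges_by_time`, the sample edges, and the choice of equal-width bins are all assumptions for illustration, not anything from ggraph itself.

```python
from collections import defaultdict

def bin_edges_by_time(edges, n_bins):
    """Group (subject, verb, object, timestamp) tuples into n_bins equal-width windows.

    Hypothetical helper: returns {bin_index: [(subject, verb, object), ...]}.
    Bins that receive no edges are simply absent from the result.
    """
    if not edges:
        return {}
    times = [t for _, _, _, t in edges]
    t_min, t_max = min(times), max(times)
    # Fall back to a width of 1 when every timestamp is identical.
    width = (t_max - t_min) / n_bins or 1
    bins = defaultdict(list)
    for s, v, o, t in edges:
        # Clamp so the latest timestamp lands in the last bin, not a new one.
        idx = min(int((t - t_min) / width), n_bins - 1)
        bins[idx].append((s, v, o))
    return dict(bins)

# Made-up sample data; timestamps are arbitrary numeric units.
edges = [
    ("alice", "emails", "bob", 0),
    ("bob", "emails", "carol", 10),
    ("carol", "calls", "alice", 55),
    ("alice", "calls", "dave", 100),
]
panels = bin_edges_by_time(edges, n_bins=4)
```

Each key in `panels` would then act as the facet variable, giving one small multiple per time window. With roughly 10,000 relations, the number of bins is a judgment call: too many and the panels become unreadable, too few and the changes over time get averaged away.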


Tangentially related: there's a port of ggplot to python: http://ggplot.yhathq.com/

The problem with ggplot and friends is R. It's frankly painful and archaic to work with.


Without needing to get into an R vs Python discussion I think it is fair to say that this is a perfectly valid subjective opinion and that a lot of people will disagree completely... none of the ggplot2 Python ports holds a candle to the original, feature-wise. So the question is whether you will reach out for the lesser tool or learn a new programming style...


I'm not a big Python fan, frankly. But R is painful enough for me to keep looking around. Right now my workflow is to do all my data processing in another language and just use R for plotting.


With dplyr & purrr, if you can't find a DSL in R that works for you, the odds are you'll need Lisp or Haskell to satisfy you.


With the Apache Arrow project, my understanding is that there will be an opportunity for languages like R and Python w/Pandas to share in-memory dataframe structures. I could imagine a future world where much tighter interaction with languages like Lisps or Haskell is possible.


Haskell is what I use.


Base R is painful. dplyr isn't.


True, but 90% of the work in most data processing is before you can use dplyr or the rest of the tidyverse.



