
Yep, close -- it's where an edge can connect more than 2 nodes

Two things to make this way more intuitive & practical:

* Modeling: We use them to help folks correlate across events, or across wide database rows. Ex: finding bots & fraudsters in your signup events or accounts table. Several signup events might share the same weird User Agent, same weird IP, same weird domain name, etc., and it's good to see events connected across multiple such things! So any time you have an event or a wide row, you can think of it as a hyperedge connecting (correlating) the multiple entities involved.

* Visualizing: It's typically weird visually to draw an edge connecting more than 2 nodes, so most people don't. Instead, sometimes our users want to see a correlation event as an explicit node in a graph. So there's a natural bipartite graph from events to entities, like "(signup)->(user agent)" + "(signup)->(ip)" + ... . But other times that's annoying, so they'd rather just see "(ip:node)<-[signup:edge]->(useragent:node)". Interestingly, for really wide data, having an explicit "hypernode" (event node) does increase the number of nodes... but significantly decreases the number of edges, b/c it avoids drawing a really big, densely connected cluster (a clique) for every event. Roughly: the hypernode form adds |events| nodes and about |number of columns| edges per event, vs. on the order of |number of columns|^2 edges per event when entities are connected directly.
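To make the edge-count tradeoff concrete, here's a minimal plain-Python sketch of the two encodings. It is not how pygraphistry builds things internally; the events, column names, and helper functions (`bipartite_edges`, `direct_edges`) are all made up for illustration.

```python
from itertools import combinations

# Hypothetical signup events; every column name and value here is invented.
events = [
    {"event_id": "signup_1", "ip": "10.0.0.1", "user_agent": "ua_weird",
     "domain": "spam.example", "referrer": "ref_a"},
    {"event_id": "signup_2", "ip": "10.0.0.1", "user_agent": "ua_weird",
     "domain": "mail.example", "referrer": "ref_b"},
    {"event_id": "signup_3", "ip": "10.0.0.2", "user_agent": "ua_normal",
     "domain": "spam.example", "referrer": "ref_a"},
]
entity_cols = ["ip", "user_agent", "domain", "referrer"]


def bipartite_edges(rows, cols):
    """Hypernode form: one node per event, one edge to each entity it mentions."""
    return [(row["event_id"], f"{col}::{row[col]}") for row in rows for col in cols]


def direct_edges(rows, cols):
    """Direct form: no event nodes; each event links every pair of its entities."""
    edges = []
    for row in rows:
        for a, b in combinations(cols, 2):
            # Keep the event id as an edge label so the source event is recoverable.
            edges.append((f"{a}::{row[a]}", f"{b}::{row[b]}", row["event_id"]))
    return edges


bi = bipartite_edges(events, entity_cols)
di = direct_edges(events, entity_cols)

# Per event: k edges with a hypernode vs k*(k-1)/2 edges without one.
print(len(events), "extra event nodes,", len(bi), "bipartite edges")
print(0, "extra event nodes,", len(di), "direct edges")
```

With 4 entity columns the direct form already needs 6 edges per event instead of 4, and the gap widens quickly as rows get wider.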

If you want to play with it, try `graphistry.hypergraph(pd.read_csv('http://blah.csv'))['graph'].plot()` using http://github.com/graphistry/pygraphistry (or skip the plot and just pull the `._nodes` and `._edges` dataframes out of the result)

Nowadays, we also do a lot with embedding arbitrary data and exploring similarity graphs that way (using k-nn to infer similarity edges), so not just across entity links but also, say, timestamps, dollar amounts, and byte counts. Any dataset embedding can thus be thought of as a projection of the hypergraph, one that supports more interesting columns than just entity columns (categoricals). So for the example above, run `graphistry.nodes(pd.read_csv('http://blah.csv')).umap().plot()` and we'll infer the `._edges`. All the nodes in an embedding view are essentially hypernodes, and instead of connecting them to entity/attribute nodes... you project those out and just connect the hypernodes to one another. This gives a very different way to think about stuff like explainability of AI :)
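For intuition on the "k-nn for similarity edges" part, here's a rough brute-force sketch in plain Python. It is not what `.umap()` does under the hood: there is no embedding step, the feature vectors are invented, and `knn_edges` is a hypothetical helper. It only shows the idea of connecting each row (hypernode) to its nearest neighbors directly.

```python
import math

# Hypothetical rows already turned into numeric feature vectors
# (e.g. scaled amount and scaled byte count). Values are made up.
points = {
    "row_a": (0.0, 0.0),
    "row_b": (0.0, 1.0),
    "row_c": (10.0, 10.0),
    "row_d": (10.0, 11.0),
}


def knn_edges(vectors, k):
    """Connect each row to its k nearest other rows by Euclidean distance."""
    edges = []
    for src, p in vectors.items():
        # Brute force: measure distance to every other row, then keep the k closest.
        nearest = sorted(
            (math.dist(p, q), dst) for dst, q in vectors.items() if dst != src
        )
        edges.extend((src, dst, dist) for dist, dst in nearest[:k])
    return edges


edges = knn_edges(points, k=1)
for src, dst, dist in edges:
    print(f"{src} -> {dst} (distance {dist:.1f})")
```

Brute force is O(n^2), so this is only for small toy data; it also assumes the features were scaled sensibly beforehand, since raw timestamps or byte counts would otherwise dominate the distance.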



