
You're missing the point of this project. It's not about the feasibility of throwing a billion points at a pile of software to get an image; I can do that with a simple Python script. It's about doing so to create a meaningful and accurate data visualization, not just a picture of, say, shiny spheres or a scene from Avatar.

I actually have a background in 3D computer graphics, and it's precisely because of my detailed knowledge of raytracing, rasterization, OpenGL, BMRT, photon maps, computational radiometry, BRDFs, computational geometry, statistical sampling, and so on, that when I came to the field of data science, and specifically to the problem of visualizing large datasets, I realized the total lack of tooling in this space.

The field of information visualization lags behind general "computer-generated imagery" by decades. When I first presented my ideas around Abstract Rendering (which became Datashader) to my DARPA collaborators, even to famous visualization people like Bill Cleveland or Jeff Heer, it was clear that I was thinking about the problem in an entirely different way. I recall our DARPA PM asking Hanspeter Pfister how he would visualize a million points, and he said, "I wouldn't. I'd subsample, or aggregate the data."

Datashader eats a million points for breakfast.

Since you're clearly a computer graphics guy, the way to think about this problem is not one of naive rendering, but rather one of dynamically generating correct primitives & aesthetics at every image scale, so that the viewer has the most accurate understanding of what's actually in the dataset. So it's not just a particle cloud, nor is it NURBS with a normal & texture map; rather, it's a bunch of abstract values from which a data scientist may want to synthesize any combination of geometry and textures.
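
To make that concrete, here's a minimal sketch of the pipeline (synthetic data, made-up column names): the raw points are first reduced to abstract per-pixel values on a sampling grid, and only afterward are those values mapped to an image. Aggregation and shading are independent, swappable steps.

    import datashader as ds
    import datashader.transfer_functions as tf
    import numpy as np
    import pandas as pd

    # A million synthetic points standing in for a real dataset
    n = 1_000_000
    df = pd.DataFrame({'x': np.random.standard_normal(n),
                       'y': np.random.standard_normal(n)})

    canvas = ds.Canvas(plot_width=800, plot_height=600)  # the sampling grid
    agg = canvas.points(df, 'x', 'y', agg=ds.count())    # abstract values: a count per pixel
    img = tf.shade(agg, how='log')                       # aesthetics applied only at the end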

I chose the name "datashader" for a very specific and intentional reason: we are dynamically invoking a shader - usually a bunch of Python code for mathematical transformation - at every point, within a sampling volume (typically a square, but it doesn't have to be). One can imagine drawing a map of the rivers of the US, with each river shaded by some function of all the industrial plants in its watershed. Both the domain of integration and the function to evaluate are dynamic for each point in the view frustum.
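
In Datashader terms, the reduction is that shader: it is evaluated over every point falling in each pixel's sampling volume, and swapping it changes what each pixel means. A short sketch along the same lines ('value' is a hypothetical per-point column):

    import datashader as ds
    import datashader.transfer_functions as tf
    import numpy as np
    import pandas as pd

    n = 1_000_000
    df = pd.DataFrame({'x': np.random.standard_normal(n),
                       'y': np.random.standard_normal(n),
                       'value': np.random.random(n)})  # hypothetical quantity per point

    canvas = ds.Canvas(plot_width=800, plot_height=600)
    # Same points, different "shader": integrate the quantity over each
    # pixel's sampling volume, or average it instead.
    agg_sum  = canvas.points(df, 'x', 'y', agg=ds.sum('value'))
    agg_mean = canvas.points(df, 'x', 'y', agg=ds.mean('value'))
    img = tf.shade(agg_mean, cmap=['lightblue', 'darkblue'])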

> Datashader eats a million points for breakfast.

So does OpenGL on a decade-old computer.
