How Graphviz thinks the USA is laid out (2021)

travisd · on July 28, 2022

> [ Note: this is where I lost interest and stopped writing. ]

Kudos for actually bothering to publish the damn thing. Many people (myself likely included) would have spent hours agonizing over minute unimportant details and ultimately never actually gotten it out the door.

openfuture · on July 28, 2022

I believe that people should be a lot more trigger happy when it comes to writing stream of consciousness and sharing it.. but at the same time we should be structuring our communication tools around iterative refinement so that the reader can pick up the slack and make the adjustments they feel make sense and slowly the final version of the text emerges from this process.. Right now we do a lot of denial of service attacks on science because we've raised the bar to an anxiety inducing extent, I believe it is important to bring it back down again so that we can all be humans.

sph · on July 28, 2022

I've never seen that line before and that might motivate me to finally keep a blog. I like collecting thoughts, but I often dislike the polishing required to turn thoughts into a complete blog post.

Sometimes one just wants to do the 80% of the effort and be done with it.

Kaibeezy · on July 28, 2022

The remaining 20% of the job takes another 80% of the effort.

samstave · on July 28, 2022

You should defin

dotancohen · on July 28, 2022

I noticed that, too, and realized how thought-provoking the ability to just say "OK, I'm done" really is. That, and marking (though not actually serving) as `Content-Type: text/shitpost` really demonstrates that the author is more interested in sharing the technical aspects of his discovery than in needless elaboration and commentary.

samstave · on July 28, 2022

>sharing the technical aspects of his discovery than in needless elaboration and commentary.

He should start a recipe blog.

blue1 · on July 28, 2022

To me, it feels like sloppiness presented as a virtue.

Even if you have lost interest, there are more civil ways to close a text.

eCa · on July 28, 2022

If you had said ’lazyness’ instead of ’sloppiness’ it would have become a higher order joke.

Kaibeezy · on July 28, 2022

Higher order if misspell ‘laziness’? Don’t get. I am so unsophistimacated :(

revolvingocelot · on July 28, 2022

"According to Larry Wall, the original author of the Perl programming language, there are three great virtues of a programmer; Laziness, Impatience and Hubris

Laziness: The quality that makes you go to great effort to reduce overall energy expenditure. It makes you write labor-saving programs that other people will find useful and document what you wrote so you don't have to answer so many questions about it.

Impatience: The anger you feel when the computer is being lazy. This makes you write programs that don't just react to your needs, but actually anticipate them. Or at least pretend to.

Hubris: The quality that makes you write (and maintain) programs that other people won't want to say bad things about."

[0] https://web.archive.org/web/20150205083209/http://threevirtu... [1]

[1] NB that threevirtues.com, in this late year, has been taken over by some sort of sea salt seller

Kaibeezy · on July 28, 2022

Ah! Thorough, helpful and interesting answer.

absurddoctor · on July 28, 2022

Given the author, higher order jokes seem especially appropriate.

gred · on July 28, 2022

I wonder if some of the oddities generated by the "neato" and "fdp" engines (e.g. SC behaving like a peninsula) would be addressed by also modeling the borders with the oceans (i.e. the coasts) and with other countries -- just like he modeled the borders between states (e.g. the ocean would "push back" on SC).

dang · on July 27, 2022

Discussed at the time:

How Graphviz thinks the USA is laid out - https://news.ycombinator.com/item?id=25611053 - Jan 2021 (80 comments)

How Graphviz thinks the USA is laid out - https://news.ycombinator.com/item?id=25604462 - Jan 2021 (4 comments)

zokier · on July 28, 2022

It bothers me that the graphviz algorithms almost always feel to me like they do a poor job of laying out nodes. Like here in the neato layout you have both nodes overlapping each other and edges overlapping nodes, both of which should be avoidable, and in fdp there is also some overlap and additionally the RI and WA placements are "wrong" (non-planar). sfdp is just a mess, and twopi is even worse. Is there any software that works better?

I realize that this might be a fundamentally difficult problem, but it just feels odd that there hasn't been much improvement in this area.

Bromeo · on July 28, 2022

I believe you can explicitly set "overlap=false".
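A minimal sketch of what that looks like in practice, assuming the graph is written out as DOT text from an edge list. The helper name `to_dot` and the sample edges are made up for illustration; `overlap` is a graph-level attribute.

```python
def to_dot(edges, overlap="false"):
    """Render an undirected edge list as DOT text with a graph-level overlap setting."""
    lines = ["graph states {", f"  overlap={overlap};"]
    for a, b in edges:
        lines.append(f"  {a} -- {b};")
    lines.append("}")
    return "\n".join(lines)


# A handful of state borders, just to have something to lay out.
SAMPLE_EDGES = [("SC", "NC"), ("SC", "GA"), ("NC", "GA"), ("NC", "TN"), ("GA", "TN")]

if __name__ == "__main__":
    print(to_dot(SAMPLE_EDGES))
```

Saving the output as `states.dot` and running `neato -Tpng states.dot -o states.png` should then render it with overlap removal turned on.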

graphviz · on July 28, 2022

That's right. The initial layout in neato (which is really statistical multidimensional scaling) treats nodes as points, then draws shapes on top of the points without considering overlap. overlap=false or overlap=prism invokes a second layout phase as described in this report: https://link.springer.com/content/pdf/10.1007/978-3-642-0021...

There may be better methods by now - it would be a good intern project to look at that.

It would be a good idea to simply take the state centroids as an initial placement and only run the overlap removal phase.

In edge routing, we were really impressed with the "repulsive curve" work by Keenan Crane at CMU, https://www.cs.cmu.edu/~kmcrane/Projects/RepulsiveCurves/ind... - scroll halfway down for the graph layout examples.
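The centroid idea above could be sketched roughly as follows. This is only an illustration under several assumptions: the centroid values are rough, hand-entered longitude/latitude pairs for four states (not authoritative data), the helper name `to_positioned_dot` and the scale factor are hypothetical, and it relies on neato's `-n` flag, which as I understand the documentation takes existing `pos` values (in points) and runs only the overlap adjustment and edge routing.

```python
# Rough (longitude, latitude) centroids; approximate placeholder values only.
ROUGH_CENTROIDS = {
    "TN": (-86.3, 35.8),
    "NC": (-79.5, 35.5),
    "SC": (-81.0, 34.0),
    "GA": (-83.5, 32.7),
}
BORDERS = [("TN", "NC"), ("TN", "GA"), ("NC", "SC"), ("NC", "GA"), ("SC", "GA")]


def to_positioned_dot(centroids, edges, scale=40.0):
    """Emit DOT text where every node carries a pos derived from its centroid."""
    # Shift so the smallest coordinates land at the origin, then scale to points.
    min_lon = min(lon for lon, _ in centroids.values())
    min_lat = min(lat for _, lat in centroids.values())
    lines = ["graph states {", "  overlap=false;"]
    for name, (lon, lat) in sorted(centroids.items()):
        x = (lon - min_lon) * scale
        y = (lat - min_lat) * scale
        lines.append(f'  {name} [pos="{x:.1f},{y:.1f}"];')
    for a, b in edges:
        lines.append(f"  {a} -- {b};")
    lines.append("}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(to_positioned_dot(ROUGH_CENTROIDS, BORDERS))
```

Feeding the result to `neato -n -Tpng` should keep the geographic arrangement and only nudge nodes apart where they overlap.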

bjourne · on July 28, 2022

Wow, that's really fascinating. I never knew there was a connection between knot theory and laying out graphs.

zimpenfish · on July 28, 2022

I tried this with the London tube map in 2004. It was not entirely successful[1].

[1] https://rjp.frottage.org/codeblog/tubemap_20041203_1745.html

ygra · on July 28, 2022

Typical metro maps are quite complicated to automatically calculate. At least I know of no efficient algorithm and force-directed layout doesn't really work, since you want the stations in roughly the right place, especially relative to other stations. A colleague tried something based on a paper a few years ago [1], alas it only really worked with a commercial ILP solver. We've tried an open-source one a year before that and could only get results in acceptable time for three lines or so.

I think most tube maps are hand-arranged, perhaps with some algorithmic help, but oftentimes you also have to fit a specific aspect ratio or shape (because the greater metropolitan area is also shown on the map, along with fare zones), which I don't think any algorithm currently does well. And the manual approach can result in niceties like the actual circle around the city center here: https://www.rsag-online.de/fahrplan/liniennetzplaene

[1] https://www.yworks.com/blog/automatic-metro-map-generation

alphalima · on July 28, 2022

I wonder if adding River Thames crossings will improve things. That's a major component of the tube map.

zimpenfish · on July 28, 2022

Huh, didn't think of that. I'll give it a go later.

sph · on July 28, 2022

Looks mostly decent, just needs vertical and horizontal mirroring. Also I don't understand the colour choice, sometimes it switches from one to another on the same line (Barking -> Upney).

zimpenfish · on July 28, 2022

> sometimes it switches from one to another on the same line (Barking -> Upney)

Beats me. I just ran the Perl to convert the list of tubes to the neato format and it made them both green. Maybe I generated the images and then later fixed a bug in the generation? I could probably check but I'd have to resurrect Trac from 18 years ago and ... uh.

(See also EastCote and Eastcote on the map where there's now only EastCote in the data file.)

samstave · on July 28, 2022

November 2005

[ Note: this is where I lost interest and stopped writing. ]

zimpenfish · on July 28, 2022

I probably broke hobix with a Ruby upgrade and started a spiral into blog engine madness...

th0ma5 · on July 28, 2022

I couldn't find any comments in the previous threads pointing out that the dot output uses rank, which keeps increasing as the edges are listed, unless you say rank=same on all edges.

vonwoodson · on July 28, 2022

Four Corners is a square, but I wonder: why not an X? If you can literally step from Arizona to Colorado I’d say that they’re connected. Anyway, cool maps. I love how they help promote the kinds of default Graphviz layouts in a commonly understood dataset. Well done.

normac2 · on July 28, 2022

I think it's because you are going to cross through New Mexico or Utah first (unless we represent "you" as a point, line, or other zero-width object, and have it traverse a 45° line through the infinitely small point of intersection of the four states—which I'd say would be a kind of generous model for any path that happens in real life).

samatman · on July 28, 2022

What a great lead. Spoiler alert: I've been to Four Corners.

I've also flown over Labrador, and would never say "I've been to Labrador" as a result. While I have also visited Indiana, I have travelled through it by rail or car much more often, between Chicago and Michigan. If I had only done this, I would probably say "well, technically I've been to Indiana, but I drove right through it".

Leading me to the "boots on the ground" theory of visitation: a place has been visited if your feet/footwear make firm contact with the ground, or an object anchored to the ground in a durable fashion.

At Four Corners, it's quite possible to step directly from New Mexico to Utah, and from Arizona to Colorado, without setting foot in the other two states in the quadrant.

At most, you fly over Arizona and Colorado, on the way from New Mexico to Utah. As we've established, flying over a place is not a visitation.

Which brings us to the fun part! Every tourist who visits Four Corners walks a circle around the States. Only some, and most children, do the jumps!

If you don't do the jumps you can't say "I've been to Utah straight from New Mexico", which is too bad, since that and its dual are the only topologically interesting part about Four Corners! The other pairings share long borders.

Also, Monument Valley is breathtaking. Truly a wonder of the world.

jokoon · on July 28, 2022

I had a lot of fun using graphviz to create a call graph of my C++ code, thanks to clang.

Unfortunately, I was not able to generate something readable.

VoidWhisperer · on July 28, 2022

Two curious points:

1. I see it thinks Hawaii is effectively in Canada. I'm assuming this is because the default behavior for unlinked nodes is for them to end up near the top in empty space?

2. Why are there no edges connected to Arkansas as it is within the main continental US?

rsstack · on July 28, 2022

AK is Alaska.

VoidWhisperer · on July 30, 2022

Oh.. I think I had a brain fart when writing this. Thank you

mminer237 · on July 28, 2022

It's probably because the US is generally wider than tall but the NE US has many more states, so that leaves an otherwise empty space in the top-left.

nebulous1 · on July 28, 2022

Is there anything interesting in how neato came to orient this correctly?

graphviz · on July 28, 2022

It is surprising, though maybe they gave a good initial placement?

euroclydon · on July 28, 2022

Neat for trivia. TN and MO are tied at 8 for the most connected state.
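The count of 8 can be checked by tallying node degrees from an edge list. The sketch below is deliberately partial: it lists only the borders that touch TN or MO, so it confirms that each has 8 neighbors but says nothing about whether any other state has more. The names `BORDERS_TN_MO` and `degrees` are made up for this example.

```python
from collections import Counter

# Only the borders touching Tennessee or Missouri; not the full state graph.
BORDERS_TN_MO = [
    ("TN", "KY"), ("TN", "VA"), ("TN", "NC"), ("TN", "GA"),
    ("TN", "AL"), ("TN", "MS"), ("TN", "AR"), ("TN", "MO"),
    ("MO", "IA"), ("MO", "IL"), ("MO", "KY"), ("MO", "AR"),
    ("MO", "OK"), ("MO", "KS"), ("MO", "NE"),
]


def degrees(edges):
    """Count how many edges touch each node in an undirected edge list."""
    counts = Counter()
    for a, b in edges:
        counts[a] += 1
        counts[b] += 1
    return counts


if __name__ == "__main__":
    deg = degrees(BORDERS_TN_MO)
    print(deg["TN"], deg["MO"])  # prints: 8 8
```

Running the same tally over the complete border list from the article would be needed to confirm the tie for first place.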

brrrrrm · on July 28, 2022

RI/NY doesn't seem right

linksnapzz · on July 28, 2022

It's right.

The border isn't on land; it's a nautical boundary. There's a line from just south of Stonington CT, running between Montauk Point (NY/L.I.) and Block Island (RI).

vanilla_nut · on July 28, 2022

I was wondering about this same thing, makes perfect sense that there’s a nautical border between Long Island and Rhode Island. Thanks!