Hacker News new | past | comments | ask | show | jobs | submit login
You probably don't need PostGIS (blog.rebased.pl)
147 points by lobo_tuerto on April 9, 2020 | hide | past | favorite | 51 comments



You (probably) should just use PostGIS, because it's the easiest thing to do, everybody understands it, it's eminently google-able, and ubiquitous. Unless you have a specific reason why you can't, just use PostGIS and enjoy your life.


I strongly agree. The article explains alternatives to PostGIS, but doesn't say anything about why you should avoid PostGIS. I see no reason at all to avoid it, even when you're just storing lat/long points. (It's "overkill," says the article, but that's harmless at worst.)


> (It's "overkill," says the article, but that's harmless at worst.)

I don't disagree in this specific case, but that's not always true.

I remember on a contract years ago another team working on a project - that started before I joined and was still going when I left - to build a big data analytics pipeline (hadoop, kafka, etc.) that achieved nothing. The whole thing could much more easily have been implemented with SQL Server (which they already had), PostgreSQL, or whatever.

At the very least it cost them a lot of wasted effort plus maybe a dozen sets of salaries and contractor rates (there was a mix). But it also stymied solutions in other areas because the answer was, "oh, we don't need to do that because the big data project has it covered." The opportunity cost was significant: the company lacked the capabilities it needed for as long as this white elephant continued lumbering on its way.


The only reason I can think of it to avoid the extension installation, but then it goes on to use another extension for a function that's available in PostGIS, along with everything else.

So yeah maybe if you only need to store the data and don't need operations (why even bother then), otherwise just use postgis.


Also, you will know that if in future your use case extends, you wont have to rewrite things.


I'm future you! I made the mistake of taking a system that did some GIS like things. The first layer was easy! PG has some handy built in types! Hoory. But then Business needed features! Two years later you've got eight half-baked complex features that would have been almost trivial with this extension. Don't make my mistake again! Yea, you may not need it but, you will WANT it.


Indeed, it is very often the case. Furthermore, it is not like PostGIS is adding costly overhead. Zero impact where you don't use it. And potentially a lot of trouble saved where you do.


Isn't this like saying that you don't need a Datetime library since you can just use unix epoch timestamp? You should specify what coordinate system you're using [1] (not unlike a timezone). There's more edge-cases than you think: the suggested containment operator wouldn't work on shapes that cross the antimeridian line (at 180 W = 180E). The earthdistance module assumes the Earth is a sphere.

[1] https://en.wikipedia.org/wiki/Geographic_coordinate_system#G...


In fairness, assuming the earth is a sphere is rarely a problem. A typical “closest to me” algo is will have a ton more UX error due to differences between travel time and as the crow flies travel time. Frankly I agree with the article in the sense I have implemented decent solutions with MySQL just using a bit of math to filter to a reasonable lat,long span and then ordering with the haversine formula.

You don’t need PostGis to find the nearest McDonald’s closest to an IP address of every visitor. A degree of lat is 69 miles, and a degree of long is easy to calculate based on the lat, but you may need an extra case if you want to be correct for the tiny number of people that live near the antimeredian line.

I wonder if people run toward PostGis just because they don’t know to google haversine or the spherical law of cosines.

That being said, I have often come up with decent hacks that solve a customer problems but biting the bullet and adopting Postgres may have been ultimately better. I wish PostGis was a skill I have, but in a pinch the law of cosines is quite passable.


The thing is, it starts with find me the closest point. And then, in the next meeting, management wants to know which are in a specific area. And this case is about points contains by a polygon.

I'll gleefully counter-argue some run toward PostGis just because they don't like NIH syndrome ^^


> I wonder if people run toward PostGis just because they don’t know to google haversine or the spherical law of cosines.

Even if they do know how, it looks like running PostGis is an easier way to achieve the desired outcome.


...if there was only one common datetime library, and it pulled in 800 MB of dependencies. In that case, I might default to using unix timestamps, too, despite the edge cases.


The arguments here aren't great. It's also mentioned to install some other extension (why, then, not use PostGIS at this point?). Don't really understand what the author was getting at.

Also, as was mentioned, this was posted earlier today: https://news.ycombinator.com/item?id=22820485


The "other extensions" are part of Postgres core, and are just "extensions" in the sense that they're loadable modules that are not loaded by default.

PostGIS is a whole big external thing that Postgres doesn't ship with.

Any running Postgres cluster can load and use the extensions in the article, with no ops work. Even environments like RDS have these extensions available.

If you want PostGIS, you need to explicitly run PostGIS; if you want PostGIS-as-a-Service, you need to find someone who is explicitly offering PostGIS-as-a-Service.


Worth pointing out that RDS includes PostGIS, which is good since you can't poke the DB server directly. Although I am sympathetic to the blog post, since PostGIS is overkill in complexity for simple use cases like a store locator.


Redis has GIS features built-in for store locator type calculations, so that's another option:

https://redislabs.com/redis-best-practices/indexing-patterns...

Here's a couple Lyft talks:

Redis at Lyft: 1,000 Instances (2017)

https://www.youtube.com/watch?v=U4WspAKekqM

Geospatial Indexing: The 10 Million QPS Redis Architecture Powering Lyft (2017)

https://www.youtube.com/watch?v=cSFWlF96Sds (2017)


I think it's still very valuable to raise awareness of the existence of a lot of these more basic features in postgres core. Though I think it would be better titled "You might not need PostGIS".


Well you actually very probably do need PostGIS if you're doing real spatial analysis.

PostgreSQL geometry field are just that, 2D geometries, on a plane, in no particular projection system.

Spatial geometries are relative to a projection system. Try to "draw" a square with GPS coordinate and project it in your local official projection reference, you'll probably not get a squared.

Now, if you don't need precision, feel free to use geometries. But be very aware of the trade off you're making. The article, in my opinion, dismisses them a little too fast.


I think the position of the article is “if you’re just making a web app with some dots on a map, you’re not doing ‘real spatial analysis’”.

I basically agree with the post because I found myself in a similar position a few years ago. I inherited a map with dots on it, and I thought “oh, I need to move this to PostGIS.” But it turned out to be overkill. I wasn’t using any geometries, and I didn’t even need distances from a radius. I was just showing events by location, time, and type. Two columns for latitude and longitude were more than enough.


Oh fair enough, not really arguing that point.

But there is a major trade-off and the article does not even mentions it. It is perfectly fine to consider the GPS coordinate system and basic geometries are good enough for short distance. As long as you're aware that GPS coordinate are actually set in a precise projection system and that you are making a informed decision.


Not related, but what if I wanted is "projection-less" 3d spatial data, then does PostGIS support it.


Yes, PostGIS supports projection agnostic ”geography” types: https://postgis.net/workshops/postgis-intro/geography.html


OpenStreetMap is a prominent user of Mapnik, but it's not an OpenStreetMap project. It was started way back in time for the sake of starting it:

https://mapnik.org/news/update

When it got so far along, OSM started using it in the demo rendering system that backs the 'standard' tiles on openstreetmap.org. Those tiles often get called "mapnik", because Mapnik is a big chunk of the rendering system, but they are really OpenStreetMap-Carto, a map style maintained for OSM.


Notably, the main OpenStreetMap database does not use PostGIS, and stores longitudes and latitudes as scaled integers.

https://wiki.openstreetmap.org/wiki/Rails_port/Database_sche...


Really OSM is kind of a special case. Database-wise it is more of a catalog of points and edges. Traditionnally, GIS databases support more complexe geometries natively.


Limit with PostGIS is that it uses “Simple features model” - OSM model is way more, not less complex: topology (object element hierarchy/links), versioning, editing metadata, schemaless tags etc - none of them were possible in plain postgis. Only more free data schemas (eg jsonb) are added to since OSM debate over db solutions over 10yrs back, the other things are still not there really.


Well, you've said better than I. IMHO, PostGIS is geared toward building classical GIS databases, while OSM is more a Topological Catalog. If I'm not mistaken, they say so themselves and visual map are just a by-product of geo-referencing "everything". Way more complex and way more harder to work with.

Edit: maybe I should have said "higher level geometries" than "more complex". In the sense that OSM "knows" about point and edges and nothing else.


Has there been any work to minimize the size and number of dependencies, or modularize it, or produce a "PostGIS-lite"? Usually I'm happy to grab a good library even if it's overkill, but PostGIS pulls in many hundreds of megabytes -- far more than every other library I use combined. I wish there were a way to say "I want the GIS data types, but I don't need JPEG2000 and every other possible image format".


> If using built-in Points, note that the order of ll_to_earth’s coordinates is reversed: points are (x, y) which corresponds to (lng, lat). But the function takes lat, lng.

This is a strong reason to use PostGIS.


It’s a contentious point, but plenty of formats and standards use lng, lat ordering.

https://macwright.org/lonlat/

https://macwright.org/2016/07/15/longitude-latitude-is-the-r...


The built-in geometric types work as expected only at the equator. They become more and more distorted as the move toward the poles. Circles become ellipses. Lines curve.

PostGIS's built-in Geometry types maintain their shape anywhere on the globe.


I think you might be thinking of Geography types. Geometry types really just operate in the 2D euclidean space which their associated SRS maps them to. And no 2D SRS can accurately represent the reality of working on a sphere, so I'd be very surprised if you found you could move a circle from the equator to europe and it remain a circle when back-projected.

Geography types try to deal with this sort of weirdness but are unfortunately slower and more complicated to work with.


> Can I measure the distance between two given points? Make sure you have the earthdistance module enabled: CREATE EXTENSION IF NOT EXISTS earthdistance CASCADE (cascade pulls in requirements, which are just the cube module). Then use the eponymous earth_distance function; note that it doesn’t take coordinates directly, but its own type, which you convert to with ll_to_earth(lat, long).

Maybe I'm missing the point but I don't really see how that is easier then using "CREATE EXTENSION postgis;" and "ST_Distance" ?


`CREATE EXTENSION postgis` implies that you have PostGIS installed. PostGIS is about five times the size of Postgres itself, and has about twenty requirements, some of which have their own heavy trees of recursive requirements, some of which are near-impossible to build in certain environments (e.g. macOS Homebrew) without weird finessing. The (alpine!) Docker image of a PostGIS-enabled Postgres is 713MB. That's a lot of extra binary if you just need to record some points!

And the real problem with all that isn't the size/deps themselves, but rather the fact that the size/deps will probably lead to you wanting to use a binary distribution (or pre-made Docker image) of PostGIS rather than treating PostGIS as any other Postgres extension installed against a regular running Postgres cluster. Which then means that you'll need to find/settle for the set of extensions that are packaged for that binary distribution, rather than just using the standard set of extensions that come with a standard Postgres distribution (e.g. the PGDG Postgres apt repo, which contains many individually-packaged extensions.)


Unless you're referring to something else, the official [Alpine PostGIS docker image is 151MB](https://hub.docker.com/layers/postgis/postgis/9.5-2.5-alpine...), and you can find minimal images around 35MB.

These images contain geometry engine binaries like GEOS, which you will likely need if you want to perform operations other than `distance` at decent speeds.


I think the bigger issue is whether your production database platform supports PostGIS. If you don't have C extension privileges over your production database, you're going to have to start convincing some admins to do you a favour.

(These days, it's actually quite a common request though and it's generally one of the first extensions most e.g. cloud offerings support)


Postgres.app comes with PostGIS plus GDAL etc. There’s no good reason to use Homebrew Postgres over Postgres.app.


Maintaining your own custom fork of Postgres is a pretty good reason.


Only if there's a good reason for doing that.


Here I thought this article would have been written by The Flat Earth Society!

If the web frameworks - specifically Django - do a lot of the heavy lifting with their integration and the installation in modern Postgres is so much easier, I don't really understand why NOT use it?


If you are storing points, you probably want to do things with that data. You don't know exactly what yet, but having standard formats and tooling is probably a good idea.

What is the downside, exactly? I guess you have to learn a few things, but it might be good to know anyway.


Yes, for many use cases you can use other extensions to achieve what you want, but what advantage you have for not using PostGIS?


As mentioned above, in another comment, a big tradeoff is potentially maintenance and operations work. It may not be a hard thing to maintain, but it's another dependency that's not core to Postgres.

I've done a lot of ops work in my dev career, and I'm always reluctant to add another dependency (JS, btw, makes me crazy with how little is included in the standard library, and NPM is required for everything). So, I might be tempted to follow the advice in the article, and avoid using PostGIS for the simple use cases provided in the article.

For reference, I've also helped with non-devs work on python code for GIS systems for daily delivery systems. So I would most definitely be tempted to try to utilize PostGIS if I was in the GIS world dealing with map updates, and daily starts and stops.

Side note. Having been a programmer for several decades, and professionally for a decade at the time, I had truly forgotten how bad novice programmers are. Nested loops on long running functions all over the place (calculating optimal routes is not cheap computationally). It felt good to get some perspective at the time by helping the GIS department with their code.


Back in the day, a GIS pro gave a talk at OSCON on how far he could push Open Source software to do mapping for free.

One of the best talks I've ever seen, although if you do that much work yourself, you might as well form a company.

I consulted for the Canadian JPL, so I had tons of satellite data, but it was mostly ice, tundra and forest regions. Not much urban data, alas. :)

Although you could take your notebook down to the Ministry of Works and they would plug in a cable and give you all their street data at the time for free. I actually did that once.


If I just want to find the nearest point in a small country using the <-> operator, do I need any extension at all?


Possibly, yes but be aware of the trade off and evaluate the actual error to decide for yourself for that given country.


This is a dupe from earlier today.


latitudes should be numeric(7,5)...


Coordinate in general should be in an explicit and documented coordinate system (I suppose you mean GPS coordinate system, EPSG:4326/WGS 84)


Mongo is another alternative, has pretty solid support for geospatial queries built in

https://docs.mongodb.com/manual/geospatial-queries/


I stored the coordinates of Atlantis, Mu, and Sandy Island in a MongoDB database, but then…




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: