PostGraphQL: PostgreSQL meets GraphQL (compose.com)
339 points by otoolep on Dec 13, 2016 | 124 comments



I tried applying GraphQL to standard webapp/services projects and could not get very far. Maybe I'm missing something; help is appreciated.

GraphQL lets consumers (Web UIs, mobile apps, etc.) define the shape of data they want to receive from a backend service. Theoretically, this is awesome, since at the time of writing the backend service you don't always know which fields and relationships a consumer might require. The problem arises when you add security. There are plenty of things that a client is not supposed to see, often based on the roles of the requesting user. That means that you can't pass a GraphQL query directly to an underlying DB provider without verifying what is being requested. You'd end up writing about the same amount of code that you'd have written with a standard REST-style (or other) interface.

I also considered using it for server-to-DB data fetches, where my backend web service would ask an underlying GraphQL-aware driver for some data. I did not find it particularly more expressive than using SQL or an ORM.

One good thing about GraphQL is that it sort of formalises the schema and the relationship between entities. You could perform validations on incoming query parameters and mutations. It also helps clients understand the underlying schema and would serve as excellent documentation. This might be useful, but a REST API also offers some level of intuitiveness, and ORMs (especially in strongly-typed languages) offer some level of parameter and query validation.

These are probably early days, but it'd be nice to see some real-world use cases for GraphQL that lie somewhere between simple todo-list apps and the unique needs of Facebook.


The point isn't to write less code -- especially server-side code.

The point is to provide a consistent, feature-rich, declarative, verifiable and performant API that is easy for the client to develop against.

SQL is an excellent analog -- no, it doesn't make the SQL server implementation easy, but it's a consistent, declarative, and powerful API. "Even" business analysts can write SQL queries.

If you don't have a strong incentive to make life easy for the client, the case for GraphQL isn't as strong.


Great point about not focusing on code reduction. GraphQL is such a big change from REST that the primary benefit will be different for each team you ask. From my perspective in a startup with ~50 developers, it is this:

By providing a clear and powerful interface you can significantly reduce inter-team communication, leading to both higher throughput and faster iteration cycles.

From a technical perspective, reducing round trips is important, especially on mobile devices, and GraphQL makes this trivially easy.

These two points, I suspect, are the primary reason many medium+ sized development teams end up implementing their own bastardised version of GraphQL inside their existing REST API. Many people we talk to at http://graph.cool are interested in deploying a thin GraphQL wrapper on top of their existing API for exactly this reason.


GraphQL works really well for some edge cases, and those edge cases will be more evident if you're Facebook. For smaller projects, the benefit of integrating GraphQL libraries on the client and the server vs. special handling for a limited set of APIs is not obvious.

By special handling, I meant:

    https://example.com/friends?fields=name,pic,posts&posts_len=10
The vast majority of APIs will not require this kind of result-shaping. And for those that do, passing comma separated fields (using a protocol teams already understand) is worth trying before adopting something more drastic. Of course, YMMV.


Often what gets lost in these kinds of discussions is the difference between what is theoretical and what is practical. Yes, you can have a perfectly designed REST API which has all of these features.

What happens, though, when you want a new piece of data? Do you redesign the API to fit this new piece of information into it, or do you make a new endpoint? What if the new piece of data doesn't actually belong with `friends`, but if you do the fetch for friends and this new piece of data at the same time you can get a performance boost? Do you make a new endpoint `friendsandfamily`?

Now, what if you want to migrate all calls over to that endpoint? You need to figure out whether any of your frontend code or legacy systems are still using the old endpoint before you remove it.

The maintenance of that system is much higher because you've conflated what data you want with how you want to get it.


how is this solved by GraphQL though?

You still need to add a new edge `friendsandfamily` (same as adding a new endpoint) or add a new field to the existing object (same as adding a new field to the existing endpoint).

What am I missing?


(1) GraphQL returns only what the client asks for. No need to worry that adding to an existing resource will increase response times or bloat the response for existing queries.

(2) If you decide to go the friendsandfamily route, you'll just be adding `family`. You won't have `friends` endpoint and `family` endpoint and `friendsandfamily` endpoint. A GraphQL schema doesn't increase combinatorially in size; rather the possible compositions increase combinatorially.
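
For illustration, a query against a hypothetical schema (field names made up): the client composes `friends` and `family` in one request instead of hitting a combined endpoint:

    {
      user(id: 3) {
        friends { name pic }
        family { name relation }
      }
    }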


for 2) isn't this the same as adding an optional "include=family" to the existing friends endpoint?

This is what the JSONAPI spec suggests too.


Yep. You can use query parameters to fix 1) and 2).

Go down the rabbit hole a little bit further, and you have GraphQL (if you're lucky....if not, you'll just have a mess).


I gave a talk about this recently [1]. One of the subtle points of why GraphQL has the potential to be more productive for smaller teams & projects is because it is well specified. Standards are a bit constraining, but they're also enabling. There are already some nice tools built on top of GraphQL and more will come as the project matures. This is of course the same value proposition as Swagger, JSON-API, and other formalizations of REST. In my opinion those projects are likely more mature today - especially for native mobile apps. But in a few years? I suspect GraphQL will be more productive even for small teams.

[1] https://www.youtube.com/watch?v=P0uJHI7GjIc


Even when it does require result shaping you can just use a non-standard-REST POST request where the requested information is in the HTTP body. Implementing GraphQL on the server side seems like a lot of effort for no immediate gains. I am pretty sure it works very well for Netflix and Facebook though.


Does GraphQL actually provide performance advantages right now? From what I've read about it, it seems like it suffers from a bit of an N+1 queries problem if you don't do a lot of custom optimization, which seemed like considerably more work than a REST API.


Just like SQL, GraphQL needs a good query planner for complicated queries.

REST is more like a NoSQL database where you write all the joins by hand: on the plus side, you get to control the query execution, on the negative side, you have to control the query execution (also you have round-trips and lack some possible internal optimizations).

The "performance" of GraphQL refers to the fact that most things can be queried in one round-trip, rather than many for RPC, REST, or vanilla HTTP. Each of these are abstractions, not implementations.

So the maximum performance of GraphQL is better than the maximum performance of REST, particularly over mobile networks...as you said though, the devil is in the implementation.


fwiw I think Facebook's current implementation interfaces with a read-through cache. I think they've managed to make it fast without having to spend all of the effort on query planning that way.


I also built a prototype with it. IMHO the best feature is strongly typed data and introspection, which enables lots of tools (like GraphiQL). But you can build something like this for traditional RPC APIs too - that's not necessarily related to the other GraphQL features.

The next advantage is of course letting the client define what it wants to retrieve in a single query. Whether this is useful depends on your application, I think. For mine - while it was cool to use - it had no big benefit compared to having a set of orthogonal RPC APIs that the client could use to fetch the information it needs. Fewer round trips are a reason, but if you have your API on HTTP/2 or something else which supports small queries efficiently, it might not have too large an impact.

The other large difference is that with a "lots of small APIs" approach you have to compose the data on the client, while with GraphQL it's done on the server. One advantage of doing it on the server is that the load is more predictable, since the complexity of the queries is well known and there can't be clients which try to issue queries with an abnormal amount of complexity.

What I disliked most about GraphQL were mutations. The ability to include multiple mutations in a request sometimes made it hard to reason about which of those succeeded or errored based on the query result. I also don't see a huge benefit in specifying a complex query result for a mutation - I am also now one of the believers that queries should be separated from mutations/commands and that a mutation mostly needs to report back whether the operation succeeded.

So all in all I now think it is an interesting option for a query API, if the target application requires complex user-defined queries. But I would rather not use it if clients mostly query the same data, or for command/control APIs.


You can compose data on the server without GraphQL. PostgREST does this (ironically not very "RESTful") thing: http://postgrest.com/api/reading/#embedding-foreign-entities


Would it be possible to write the server part such that either all mutations succeed, or none of them do?


I've written an interface similar to GraphQL some time before it was made public using JSON to describe the desired data tree, filters, etc.

The real upside of GraphQL is that it has become popular enough that there are parsers in most languages, and that you don't waste time designing and testing a new standard. If you use it for a public API, you can also simply point people to the existing documentation.

It's definitely not a widely accepted standard, but still better than rolling your own!


A question from someone that has never used GraphQL: what makes it better than SQL? You don't even need parsers in different languages for SQL, you could just send it straight to the DB. But I assume I'm missing something.


You might be confused by how PostGraphQL works, as it makes the queries very close to the database structure.

In real life there will usually be a lot more going on to serve a GraphQL query than simply translating it to SQL.

The data you'll send might come from many different sources, including any kind of database, caches, or just constants in your code.

As commented here, permissions over data are a big topic, and if you were accepting raw SQL, you'd have to parse/tokenize it and somehow validate it before executing it. This is what GraphQL does, but instead of trimming down and restricting the usage of SQL, it defines a language that is similar - close to being a subset of it - with a much simpler syntax.


You can allow predefined SQL strings and define security rules for each string. If you need a new query, you have to write a new security rule for it. It won't work for dynamically generated SQL, but for most CRUD SQL it should work.


If you have N predefined SQL queries, then why not just wrap them in REST endpoints? (It sounded to me like untog was suggesting that SQL be the interface to the data.) If the user can write whatever query they want, then validating by string isn't really a good idea...


> You don't even need parsers in different languages for SQL, you could just send it straight to the DB.

You probably don't want to do that for security reasons. So either way you need a parser, and at that point, it makes sense to use something that is database agnostic and allows you to model relationships without complex joins or otherwise exposing the internal representation of the data.


Clients can use it to request exactly what they need. Since it's a graph, you can just keep grabbing nodes to get the fields you need in the shape that works for your client. GitHub added a GraphQL endpoint for their API; the improvements they cited were reduced overfetching of data the client didn't use (e.g. URLs to linked resources) and fewer underfetches (e.g. using the results of the first request to populate a second or third request).

http://githubengineering.com/the-github-graphql-api/

It's not really GraphQL vs SQL, it's GraphQL vs RESTful APIs.


I have done the same, rolling my own. We copied a popular public API - LinkedIn's, I believe. We called it "api selectors".


> You'd end up writing about the same amount of code that you'd have written with a standard REST-style interface.

With every fad that has hit backend development - ORMs, move-everything-to-the-frontend, merging frontend and backend a la Meteor, GraphQL, etc. - there have been promises of simplicity and less code.

The problem, as you discovered, is that the overwhelming majority of non-CRUD code on the backend is directly related to security and cannot be abstracted away or moved anywhere else. You end up dancing the same dance, no matter what paradigm you use and some of them make that dance a bit more awkward.

Personally, I settled on Swagger a few years back. There's no fancy paradigm; it's just plain REST with an open spec. Because you define your API ahead of time, you can automate CRUD, validation, testing, and documentation. All that's left is the nitty-gritty, which you have to do no matter what, and it doesn't have an opinion on how that should be done; that part is up to you.


It can be moved, though - to the database itself (constraints, grants).


We have an app running on graphql with 3 different user roles.

I'm not sure what other people do, but in field resolvers we check the current user's access level and see if they should be seeing this thing. If not, the resolvers return errors.

In a more elegant approach, we have some of our object types duplicated/inherited. Say, "User" and "UserAdminView". If an admin queries user(id:3), they receive "UserAdminView". If a regular user does that, they receive a watered-down "User" type, which exposes only the fields users can see. The UI then selects the appropriate fragment.

Maybe there are much better ways though.


Writing the same amount of code is OK.

Doing one round-trip instead of several, no matter how deep and complicated your data tree is, may be the interesting part.


That can easily be achieved by not following REST so strictly.


Actually, it can easily be achieved by following REST strictly - especially the part where the key to REST API design is selecting resources and designing appropriate representations for the intended use.

"REST" as consistently and independent of use case a trivial CRUD layer over base tables is...well, almost as from actual REST as the JSON-based RPC over HTTP version of "REST".


Sorry, can you provide examples of deeply nested result hierarchies that are expressed well with REST?


I use JSON API which allows for "compound documents" instead of deeply nested results[1]. I would say it's well expressed, and it doesn't violate any of the principles of REST.

[1] http://jsonapi.org/format/#document-compound-documents


I am not sure I follow the conversation. What would a deeply nested result hierarchy be, exactly?

Are we talking about something like:

    CREATE TABLE person (id INT, name TEXT, birthdate DATE);
    CREATE TABLE car (id INT, brand TEXT, model TEXT, licence TEXT);
    CREATE TABLE p_c (pid INT, cid INT,
                      FOREIGN KEY (pid) REFERENCES person (id),
                      FOREIGN KEY (cid) REFERENCES car (id));

And for a given person, we want to return their details plus the cars they own?

    gopher://example.com/person/{id}
    # Get person ID
    1, Joe, 1970-01-01

    gopher://example.com/car/{id}
    # Get car ID
    1, Ford, T, 1313

    gopher://example.com/person/{id}/cars
    # Get all cars owned by person ID
    1, Joe, 1970-01-01, 1 Ford, T, 1313
    ,,, 2 Ford, S, 1717

    gopher://example.com/person/{id}/cars/{id}
    # Get specific car

    gopher://example.com/car/{id}/owners
    # Get owner(s) of car ID

    gopher://example.com/car/{id}/owners/{id}
    # Get specific owner

    gopher://example.com/car/{id}/owners/{id}/birthdate
    # Get specific owner's birthdate

    gopher://example.com/car/{id}/owners/birthdate
    # Get birthdate of all owners
    # (assuming no domain overlap between IDs and field names;
    #  other solutions possible otherwise)

If this is the sort of thing we're talking about, I've got that t-shirt and assumed everyone had too, so I guess I'm missing the point here. By how much?


You seem to have messed up the format of your RFC4266 gopher:// urls. The first character of the path is the type, not part of the selector. Neither types 'c' nor 'p' are standardized, but I believe the convention is that they are calendar and ASCII-based page-layout documents, respectively.

I'm not sure what the correct type actually is, though. For JSON the closest RFC1436 type is probably 0, though I would be inclined to consider using (non-standard) type j instead.

</pedantic>


Indeed, it's been a long time and it wasn't common to actually use URLs back then!


Thanks for the example. I am also trying to wrap my head around it.

Is it possible to get all that information in one round trip? I believe that is one of the benefits of GraphQL.


> Is it possible to get all that information in one round trip?

If we're talking about the same thing, yes.

One pattern that I use in my APIs is as follows:

Assume a JSON object like:

{ name: "Joe", colour: [ {red: 0, green: 128, blue: 90}, {red: 35, green: 88, blue: 199} ], hair: { length: { value: 9, uom: "cm"} }

I have on occasion provided an API like:

    # Assume that {id} returns an object like the above.

    GET    /api/{id}        # Returns the full object
    POST   /api/{id}        # Replaces the object
    PUT    /api/{id}        # Overwrites properties in the object

    GET    /api/{id}/name   # Returns "Joe"
    POST   /api/{id}/name   # Overwrites the name
    PUT    /api/{id}/name   # Overwrites the name (also)
    DELETE /api/{id}/name   # Removes the "name" property

    GET /api/{id}/colour/0/green    # Returns "green". Other methods as above.
    *   /api/{id}/hair/length/value
    PUT /api/{id}/hair/colour       # Creates a new property

    GET /api/{id}/colour/0;2          # Returns the first and third items in the array
    GET /api/{id}/colour/1/red;green  # Returns those two properties from the object.
                                      # Note that this imposes some restrictions on
                                      # property naming (no semicolons)

And this I never implemented, but I would if I had a need:

    GET /api/{id}/colour/(0/red;green);(1/blue)   # Returns [ { red: 0, green: 128}, {blue: 199} ]


The problem is that you are making up the syntax. GraphQL gives you this power in a standard that is machine readable so tools can be built around it.


Seems that way, with the added benefit of the client specifying the fields they need. Valuable if the underlying API changes to add a field but the clients can't use it yet.


I find that if I'm having to deeply nest result hierarchies into my REST payload I probably haven't spent a lot of time thinking about how the API I am building will be consumed. Usually designing my endpoints to align more closely with the concepts consumers of the API expect prevents the need to nest a lot of data. You may have to break your endpoints away from your data structures somewhat and reorganize your API into the logical structures that more closely represent the way interactions are performed in your software.

That's not to discount GraphQL, because I think it clearly has its use cases, especially at Facebook size.


Verily!

GraphQL is one way to do just that.

REST has the virtues of uniformity and discoverability. I don't yet understand whether GraphQL possesses any of these too, though.


In theory, yes; in practice, I never saw REST discoverability used much IRL.


GraphQL puts discoverability front and centre, by prioritizing introspection and static typing. GraphiQL (the browser-based GraphQL explorer) is pretty hard to beat.

Check out Github's GraphQL explorer for a great example of this: https://developer.github.com/early-access/graphql/explorer/


Because 90% of "REST" developers don't actually build hypermedia APIs, they aren't really "REST" - if you define "REST" as something that at least somewhat complies with Roy Fielding's dissertation, where he coined the term REST.


Can you provide standardized examples of REST's uniformity and discoverability?



It's worth noting that Roy Fielding, the guy who coined the term REST, doesn't consider an API RESTful unless it implements HATEOAS. At least, that's what he said in some talk I watched some time ago.


He doesn't consider an API RESTful unless it is a hypermedia API, of which HATEOAS is an excellent example.


Naming and formatting it differently is an easy way to get away from the REST dogma, though.


For me the main benefit of graphql is to have a common interface for mixed backends. One query seems unified, while it may request data from an SQL DB, the cache, redis, the file system, elastic search, etc.

It makes it very easy to switch backends, or to reuse the same query on the client or the server. Microservices get one common standard language to query data from whatever source they want.

ORMs tried to do that, but they work well only for relational data. Plus, unless you use JS everywhere, you can't use the same one on both the client and the server.


Anyone know if GraphQL's counterparts address this somehow? Like Falcor, om.next, Datomic, DataScript, Relay, Neo4j, etc.?

(I know some of those might not be the same, just throwing in a bunch of names vaguely clustered in my head)


SPARQL is the W3C-standard "real" graph QL; most implementations accept queries directly over HTTP. There are implementations that do role-based security depending on the connection. Commercial implementations such as StarDog or Oracle Semantic Network, among many others, will allow you to limit certain subsets of your data in many ways, and to connect to many other datasources via different translators/mappers.

For the JSON developer, though, you would need to combine it with a nice JSON-LD formatter, which is not yet common among implementations.


> That means that you can't pass a GraphQL query directly to an underlying DB provider without verifying what is being requested. You'd end up writing about the same amount of code that you'd have written with a standard REST-style (or other) interface.

One benefit of committing to PostgreSQL as a backend: It already supplies a rich set of features for fine-grained security, and with views it's easily possible to add the filtering features.

This can reduce the need for much of the app middleware that does this traditionally. This is the thinking behind PostgREST, for example.
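
For example, a minimal sketch (schema, table, role, and column names are all hypothetical) of pushing the filtering into the database with a view plus a grant, rather than middleware:

    -- expose only safe columns and rows; the base table stays private
    CREATE VIEW api.profiles AS
      SELECT id, display_name, avatar_url
        FROM private.users
       WHERE NOT banned;

    GRANT SELECT ON api.profiles TO web_user;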


I've seen access control implemented both at the database level (views and stored procedures) and at the middleware level (Java/Hibernate). I'll greatly recommend the first of the two approaches, even if it means that developers have to deal with SQL and the PL of the database. Thing is, they need to know this stuff anyway to be worth their pay. Hiding behind an ORM gives you this nice, familiar, and deceiving feeling of data just being objects which you can manipulate pretty much for free. I've seen many examples of code which load up the entire contents of tables, then proceed to manipulate them in clunky Java. Scales terribly, of course.


Hm, I'm curious. How do you implement database-level access control for the generic case of a website with hundreds of thousands of users? Do you suggest, given a Postgres database for example, adding the same number of roles, and having individual DB connections initiated per request?


Hmm.. may depend upon the granularity that you need, I guess. As far as I recall, databases implement access control at the table level, but often you need finer control than that.

In the system that I most recently worked on we had to implement our own access control logic, as table level was too coarse. It was a datacenter management system which supported multiple users, each with a set of roles, and each user belonging to several companies. The data center inventory that would be returned by queries depended both upon your role (sysadmin vs. customer admin vs. regular Joe) and upon your company affiliation. I guess I'm mostly writing this to illustrate how quickly this kind of thing becomes intertwined with the domain and with your business logic.

Will try to answer your questions to the best of my ability:

1) The number of users is not a scalability issue, but the number of requests may be, of course. I'll assume that's what you mean. My best advice would be to profile your system to see how it behaves under great load. If access control turns out to be a problem, maybe you are doing overly complex joins. One path forward may be to "denormalize" your database schema, i.e. accept that the same information is stored in several places for the sake of efficiency. Another idea is to spin up an Elasticsearch box to help with the query that is your #1 bottleneck.

2) As for the question about individual database connections, well, I think I'd pool them or I would build upon some framework or app server which pools them. It's an easy win with standardized solutions.

Caveat: Bear in mind that although I seem to be posing as some kind of expert right now, I'm really not. I'm more of an all-round full stack kind of guy. So seek advice from multiple sources :)


You treat database roles as user groups (admin/employee) and, based on that, define table (view) level access. Then whenever a request is made, your API code switches to the appropriate role and sends the specific user id to the database (CUGs for example). Then you use views/RLS to filter access to specific rows.
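
A minimal sketch of that per-request switch, assuming pooled connections and hypothetical role/setting names:

    -- inside the request's transaction
    SET LOCAL ROLE employee;                       -- the user's group
    SELECT set_config('app.user_id', '42', true);  -- the specific user, transaction-local

    -- a view or RLS policy can then filter rows with:
    --   ... WHERE owner_id = current_setting('app.user_id')::int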


It might be something already mentioned below; I'm on mobile, so I might have missed it.

However, what we do: we use PG role-based row-level access. E.g. each database user has the subset of data it has access to explicitly defined. All you need to do is ensure your application's business-level views do not mix/confuse logical views of partial data. And! You choose the correct database user in your back-end data stream/feed.

Have I already mentioned that PG is a great tool, and that the feature gap between MS SQL, Oracle, and PG constantly narrows as we speak?

_Edit_ typos & errors


> There are plenty of things that a client is not supposed to see, often based on the roles of the requesting user. That means that you can't pass a GraphQL query directly to an underlying DB provider without verifying what is being requested. You'd end up writing about the same amount of code that you'd have written with a standard REST-style interface.

Why would you need to write the same amount of code? The only code that would remain the same is the validation logic.


The amount of code needed to verify that only the proper fields and records are included in a particular response, taking into account the user's permissions etc., is often not so different from the code needed to retrieve just that data in the first place.

There is an issue here where as you give more sophisticated querying capabilities to the client, you need to have correspondingly more sophisticated validation of those queries on the server, which offsets some of the gain.


Perhaps another way of asking it: how is GraphQL different to sending SQL statements from the client? In both scenarios you'd be "only" handling validation logic and ignoring the rest.


Not much difference, except that GraphQL does not specify how the data is processed.

You could serve the data out of a flat file for all the endpoint cares.

It also allows you to secure access against arbitrary data requests, since on the server end, the resources only see the individual requests to the DB.

In my experience, it's much easier and less verbose to validate GraphQL after parsing than to validate arbitrary SQL you plan to pass to the backend.


My understanding is that different user roles would see different schemas depending on what they are allowed to see. This should definitely reduce the amount of code that needs to be written. I haven't actually implemented such an API though, so I may be wrong.


It's useful for unifying upstream providers. If all you're doing is using it to pretend the server-side application layer doesn't need to exist, it's going to hurt.


Comments like these are why I read Hacker News.


> I tried applying GraphQL to standard webapp/services projects

I'm sorry I don't understand.

Graphs are not a thing for running webapps/services. They are a subset of CS/maths aimed at solving specific problems that can be expressed as graphs.

If you don't have a graph problem to solve, you don't need a graph database.


GraphQL is a bit of a misnomer, it's not actually for querying mathematical graphs. It's so named because it was originally for https://graph.facebook.com


Been lurking HN for years; created an account just to say this. I got into GraphQL shortly after the reference implementation was released, through a bit after Relay came out. I was so excited about the technology, but unfortunately there was not much out there in terms of middleware, tutorials, support, or adoption. I built a few toy apps with it, but work applications were designed as usual and life went on. It is exciting to see GraphQL picking up speed, and I cannot express how happy I am that this exists.


When you finally decide to pick a username, you pick that monstrosity xD


Related: https://edgedb.com/

I can't remember where I stumbled upon it, but it implements an object database over PostgreSQL and offers a GraphQL superset for querying. I don't know if it's still in active development, but it looks interesting.


Development is very active; a public beta will be released in a couple of months.


This is from the same company that maintains Python libraries like asyncpg and uvloop. I've been waiting for EdgeDB, and it's their main product; hoping it will be out soon.


A quick search of the Github issues seems to indicate that PostGraphQL does suffer from the N+1 query problem at least to some extent (https://github.com/calebmer/postgraphql/issues/219).

Which probably doesn't matter as long as you only have a few nested children, but might seriously kill performance for some cases. I always find it disappointing when that much of the power of a database like Postgres is thrown away by an abstraction. I find the general approach of PostGraphQL (and Postgrest, which is a similar project for a Postgres REST API) very interesting, it would be nice if it also generated reasonable queries for this kind of data.


The issue mentioned in that post - the fact that this query generates 6 SQL queries instead of 2

{allPosts(first:5){title}}

is a limitation of the implementation and not inherent to the architecture. We spent a lot of time optimising these cases for graph.cool, and the result is that we actually have a more scalable system by issuing a few simple SQL queries instead of one giant join. It does open the door for the result to be internally inconsistent, but that is no different from querying a bunch of cached REST endpoints.
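
For example (hypothetical tables), the batched version of a nested posts-with-comments fetch is one query per level rather than one per parent row:

    -- one query for the parents...
    SELECT id, title FROM posts ORDER BY id LIMIT 5;
    -- ...and one for all their children, instead of five
    SELECT * FROM comments WHERE post_id = ANY(ARRAY[1, 2, 3, 4, 5]);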


You are doing the job of the database query planner. A good join (using indexes) will always beat your 3 simple sequential queries. This mindset, I think, is due to MySQL 4.x, which sucked at joins.


PostgREST generates ONE query no matter what fields or how many levels you request, so it does use all the power provided by PostgreSQL.


Is there any way to implement access control logic outside of row security policies in Postgres? Or have any sort of logic in between requests based on what type of request is coming in?

The automagical nature of this software seems great, but any relatively complex application would require not only CRUD manipulation but also side effects to go along with it.

With Express, I suppose you could have a middleware fire off beforehand to parse the incoming query, figure out what it is, and take any extra action as necessary, such as denying a request or making some side effect of the query happen. However, this would be an open-by-default policy for queries that you have a Postgres schema for but lack the Express JavaScript to parse.


You use a combination of the roles system + RLS + views, and you have all the access control you need.

For side effects (like sending an email) you have a lot of options (a proxy, the database, an external script that reacts to an event generated by the DB)... you just have to get out of the mindset that everything has to be done in one place/framework/language, and you'll end up writing a lot less code.
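
A sketch of the database-event option using Postgres's LISTEN/NOTIFY (trigger, channel, and table names hypothetical); an external worker LISTENs on the channel and sends the email:

    CREATE FUNCTION notify_new_order() RETURNS trigger AS $$
    BEGIN
      PERFORM pg_notify('orders', NEW.id::text);  -- publish the new row's id
      RETURN NEW;
    END;
    $$ LANGUAGE plpgsql;

    CREATE TRIGGER orders_notify
      AFTER INSERT ON orders
      FOR EACH ROW EXECUTE PROCEDURE notify_new_order();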


What I'd like to know is how (or if!) it does query building. I recently had a project where GraphQL looked like a perfect fit, since I needed an arbitrary amount of possibly nested data about some entities. The problem I had was that the GraphQL library basically requires you to do a SELECT *, and then picks the fields the client wants from that. Unfortunately, the DB schema was nowhere near 1:1 with the GraphQL schema I wanted, and getting certain fields required some very expensive JOINs. There's a stagnant PR to add a query planner, but at the moment it offers no way to assemble your query to match the client's request: the library says 'get the data' with no details about precisely which data must be gotten.


It's true that graphql-js pushes you in a direction where the code that fetches the data from the DB is inefficient, but that is not a GraphQL problem; that is an implementation problem. To be efficient, you need to implement your own logic (in the resolvers) that analyses the entire GraphQL AST, generates an optimal query, and sidesteps the default execution engine in graphql-js.


This is awesome!

Lots of confusion here on what GraphQL is actually trying to accomplish. In no way were the authors of GraphQL attempting to replace or enhance SQL with the GraphQL language; they are two completely different things. GraphQL sets out to solve annoyances that people run into while building and maintaining large RESTful services.


I'm nearing completion of a real-world commercial application of some complexity that uses this along with React Relay. I knew SQL but not PostgreSQL coming into the project.

I can say that this has been a joy to work with. This is how I will be building apps for a while now.

Huge thanks to Caleb Meredith for all the work he has done on https://github.com/calebmer/postgraphql


Hi, did you use Express? Koa? Or the plain http server for it?

I am using Koa and was wondering if you ran into any gotchas that might help me.

Also, what % of your code is server-side versus client-side React?


I spent 4 days trying to implement this in a side project. My biggest complaint was that it was too automatic. It created resources that I didn’t want.

As an example, I had a table (A) that was linked to a user table (B) through a many-to-many table (C). I needed the shape of table A available as a type so that I could get all associated rows of A for users (B), but not let anyone see all of table A. I had no way of telling PostGraphQL to hide table A but allow the shape of A to be public.

I hope they expand its configurability more in the future.


Another way to say this is: PostGraphQL exposes as an API whatever it sees in the specified database schema. It's your job to show it only the stuff it's supposed to expose, and there is a lot of flexibility available in the database to help you do just that (role system, views, stored procedures, RLS, constraints).


They don't need to expand anything for you to have this functionality. You don't expose the schema where your tables live; you create a different schema (api) where mirror views of those tables live. The second step is to add conditions to those views such that when your client conceptually does "select * from A/B", they see only the rows they are supposed to see, and not all the rows.
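
A sketch of that (schema, column, and role names hypothetical), for the table-A-through-join-table-C case above:

    CREATE SCHEMA api;

    -- mirror view: rows of A visible only through the requester's C links
    CREATE VIEW api.a AS
      SELECT a.*
        FROM private.a
        JOIN private.c ON c.aid = a.id
       WHERE c.uid = current_setting('app.user_id')::int;

    -- clients only get grants on the api schema
    GRANT USAGE ON SCHEMA api TO web_user;
    GRANT SELECT ON api.a TO web_user;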


This is super cool - but as others have said, you can't expose this to a public URL, it needs a security model. The security model in PG is, unfortunately, severely outdated.

Now, if you combined this with a good RBAC security model, particularly if you baked that model into the GraphQL -> SQL conversion layer, so it sends SQL queries that work on an allowed subset.. that'd be very cool.

The security mechanisms in Firebase might serve as an inspiration :)


> The security model in PG is, unfortunately, severely outdated.

What would you like to see? (PG dev here)


I should have said "the security model in most databases is outdated" - the bitterness comes from my own failure to argue for an alternative when we built the security model in Neo4j.

Just like PG, we based security in Neo on role-based access. Users have roles, those roles then have permissions to do lots of cool things.

What we should have done, had we had infinite time (and to be clear: it was the correct choice to build a role-based system given the constraints, I just wish there'd be more time in a day..), was to build a resource-based system; instead of roles, you have graphs of direct and delegated privileges over specific resources; resource-based access control.

If I build an app today, I don't want to say "Users can edit x,y,z columns on the comments table". I want to say "Users can edit x,y,z columns on the comments table, if they were the ones that wrote them and the comment does not have a lock flag".

The bulk of web app code today, for small to mid-sized apps anyway, is about dealing with this resource control. Which is ridiculous - imagine how many CPU cycles are spent on secondary queries to look up ownership hierarchies from dumb ORMs. There's no good reason that couldn't be modeled at the level of, or below, the query planner, having it simply design the query plans to abide by the per-user access constraints.

The cost of that would be massively higher than what the planner does today - but looking at the performance of the whole system, it'd be amazing.

This is, kind of, what Firebase does, right - it rolls resource-based access control into its query planner, combined with good federated user management.


> Users can edit x,y,z columns on the comments table, if they were the ones that wrote them and the comment does not have a lock flag

    GRANT UPDATE (subject, text) ON comments TO user;

    CREATE POLICY comments_update_own_not_locked
      ON comments FOR UPDATE TO user
      USING (locked IS NOT TRUE
             AND created_by = current_setting('postgrest.claims.username'))
      WITH CHECK (created_by = current_setting('postgrest.claims.username'));

Row-level security in Postgres is awesome!
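
(One prerequisite the snippet above leaves implicit: policies have no effect until RLS is switched on for the table.)

    ALTER TABLE comments ENABLE ROW LEVEL SECURITY;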


Holy crap, I hadn't realized just how powerful RLS ended up - last time I looked at this was before 9.5 and it was roles all the way down.. That is fantastic!

That's exactly what I wanted.

I guess it's true as they say, that the easiest way to get the right answer on the internet is to publish the wrong answer :)


The features provided by RLS are not really new; it just adds a nice syntax for defining the rules. All the things you mentioned can be implemented using roles and views.


Be careful: RLS and views don't play well together. That is to say, your policies on a table for a particular user will not be applied if that table is accessed through a view.
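
That's because a view runs with its owner's privileges, and table owners bypass their own policies by default. When the view owner also owns the table, one mitigation (PostgreSQL 9.5+) is:

    -- apply policies even to the table's owner, and thus through its views
    ALTER TABLE comments FORCE ROW LEVEL SECURITY;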


I don't understand why you can't express this with some triggers and row-level security. Could you please give a technical example?


Hi Jake.

We are currently developing a security model for http://graph.cool that sounds very similar to what you describe wanting for Neo4j. If you have time I would love to show it to you and get your feedback (email in profile).

You mention the security model of Firebase as an example to emulate. Unfortunately, Firebase only allows permission rules to consider path elements of the data, so in reality it ends up being fairly restricted and difficult to maintain.


I <3 you guys in every way :P this is a perfect example of why:

No defensiveness, no push back, just a prompt to start a conversation with a potential user to better their product. :)


The PostgreSQL community has so many friendly people that just exude knowledge and charm. I really like watching talks by Christophe Pettus, to name one person in particular.


Hi, we typically run PG behind a Rails API for authentication but presumably the OP was referring to the Google Firebase JSON based security models [0] which, it seems, are related to "native" authentication methods. Not sure how this would be supported by PG though.

[0] https://firebase.google.com/docs/database/security/


The toolset for something like this is there (row level security, column based permissions), but to get to something equivalent as described there you'd need to build something using the available infrastructure...


If you look at the demos, it looks like you can do RBAC security. The user id is stored in a session-local (transaction-local?) variable, and then used to filter various tables using Postgres's row-based security model.


Makes sense — PostGraphQL was inspired by PostgREST, and that's exactly what PostgREST does.


GraphQL seems like a nice idea.

Are there any negatives that might not be obvious from just reading the docs?



What does something like this mean: "GraphQL will replace REST in the same way MongoDB replaced PostgreSQL"


Initially, MongoDB was marketed as a simpler, better, drop-in replacement for SQL databases (PostgreSQL didn't support schema-less JSON at the time). It was just crappy, dishonest marketing, like the MongoDB team often produce.

GraphQL does the same when it compares itself to REST. It has problems of its own, and it's a false dichotomy. It's just marketing.


GraphQL was designed to solve Facebook's issue: efficiently fetching data for an entire view. In addition, it provides structure that makes it easier for large dev teams to collaborate. If your app is similar to Facebook from a technical perspective, just jump right in. If you are very different, definitely do some more exploration first. For example, I spoke to a dev on the Facebook ad analytics team, and he complained that he couldn't find a good way to fit their queries into the GraphQL paradigm. Their queries are much more dynamic than most consumer/business apps' and include advanced aggregations, etc.


You/he seem to say "GraphQL is not as flexible as SQL", and that is true, but it's not in competition with SQL (it competes with REST). Answering with a question: can that Facebook dev fit the same query into REST somehow better than into GraphQL? Any other way besides "I'll build this new endpoint for this new query"?


Also: is GraphQL useful for non-social products?


Sure, why not? GraphQL is not specific to social products.

Any application where a user might want/need bits and pieces of various objects stitched together would be appropriate, and that's a lot of applications.

For example if you are loading a user's dashboard view you may need basic user info, their unread messages, the list of objects they are looking at (e.g. list of invoices) and things like that.

Without GraphQL you'd be making separate requests to pull all of that together; with GraphQL it can happen in one round trip.
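
Something like this (hypothetical schema), fetching all three in a single request:

    {
      viewer {
        name
        unreadMessages { subject from }
        invoices(first: 10) { number total }
      }
    }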


I'm curious to hear how folks who adopt GraphQL handle server-side schema changes over time. I'm all for the idea of almost never breaking clients [1], but sometimes you need to coordinate between client and server logic; how do you do that if the dependency of any client can literally be your entire DB schema? Perhaps there is a subset of the server-side schema that is designed to be GraphQL-able and is kept very stable over time?

[1] https://www.youtube.com/watch?v=oyLBGkS5ICk


I'm working on a non-trivial app with this in the stack (React, Relay, PostGraphQL, PostgreSQL). I'm a big fan. The author, @calebmer, is a great guy to work with also.


You may still keep the company alive if copying the code does not instantly make it possible to replicate your business.

Ref: non-imitable competitive advantages.


Coming from the semweb community, and being the maintainer of a visual query builder for its SPARQL query language (cf. http://datao.net), I would like to mention that graph queries can easily be displayed (and authored) fully graphically. This is a huge benefit for bridging the gap between IT and domain experts in companies.


For many of us here, GraphQL is a premature optimization that is quite intrusive in the stack. Think hard whether you really need it.


If you have had negative experiences with it, it would be really interesting to hear your perspective. We have all seen the ugly parts of REST and such, and GraphQL looks like it could solve a lot of problems. Whether that is real, or we just don't have enough experience yet with its unique problems, is the real question.


It's very hard to validate if you're doing validation at the application layer. If you move your validation to the database and your users are actually DB users, it's a joy to work with.


If your users are DB users, it's going to be hard to spread out into e.g. SOA, and also to take advantage of multiple kinds of DB.


I am wondering if there are any implementations that translate Swagger API configurations (or similarly RAML, JSON Schema) to GraphQL. For example, it could be integrated with an API gateway in order to translate GraphQL queries into ordinary REST API calls on the fly.


The pitch on the project page for Schema Driven APIs[0] mentions "hide implementation details with views of your data" (as in CREATE VIEW) and "automatic relations with the REFERENCES constraint". Do both of these work together? E.g. could I create an "api" schema that consists entirely of views that query other schema(s), but still have the relations automatically determined?

[0] https://github.com/calebmer/postgraphql#schema-driven-apis


Yes, you can. That is EXACTLY how you are supposed to do it (in PostgREST and PostGraphQL): do not expose your tables; expose a schema containing only views and functions.

There are small bugs related to relation detection between views, but 95% of the time it works; everything can be fixed, and there are also workarounds for those bugs.


Finally the back-end devs have some hard feelings about frameworks too, lol.

I like GraphQL, but I'm a front-end dev. Apollo Client is pure gold: optimistic UI, streamed results, batching...


If there is an application-level function that alters the stored data before presenting it to the client in the standard REST model, how is this case handled? Is it handled on the client side as well?


This would be a fantastic fit for open/public/research databases.




