Nice comparison! Worth noting that pg_cron is available on almost all managed PostgreSQL services. Also, many thanks to Devrim Gunduz and Christoph Berg for providing community packages.
I wrote pg_cron with the intention of keeping it as simple, reliable, and low maintenance as possible, so it's likely to remain that way.
It is possible to implement more advanced job schedulers on top of pg_cron if needed. For instance, you can set up a few parallel jobs that run every N seconds and each take an item from a job queue table.
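A minimal sketch of that queue pattern, assuming a hypothetical job_queue table; FOR UPDATE SKIP LOCKED lets several parallel workers claim different rows without blocking each other:

    -- Hypothetical queue table; names and columns are illustrative.
    CREATE TABLE IF NOT EXISTS job_queue (
        id         bigserial PRIMARY KEY,
        payload    jsonb NOT NULL,
        started_at timestamptz
    );

    -- Query each worker job runs: claim one unclaimed item.
    WITH next_job AS (
        SELECT id FROM job_queue
        WHERE started_at IS NULL
        ORDER BY id
        LIMIT 1
        FOR UPDATE SKIP LOCKED
    )
    UPDATE job_queue q
    SET started_at = now()
    FROM next_job
    WHERE q.id = next_job.id
    RETURNING q.id, q.payload;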
As I just learned, apparently you can make HTTP requests from postgres like so:
copy (select 'hello world') to program 'curl -m 10 --data-binary @- https://some-url-here';
There's also a postgres extension for making HTTP requests but this seems to work out of the box (if curl is installed).
The "-m 10" parameter is 10 second timeout, to reduce the risk of this command hanging. I did not test what happens if curl returns non-zero exit code, this would also need to be tested and handled.
One could use this to monitor pg_cron tasks with external cron monitoring services. I'm not sure this would be a good idea overall, but one could :-)
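If one did, it might look something like this (a sketch only; the job name, schedule, and URL are placeholders, and COPY ... TO PROGRAM needs superuser or pg_execute_server_program rights):

    -- Ping an external cron-monitoring endpoint shortly after the nightly job.
    SELECT cron.schedule(
        'ping-monitor',
        '5 0 * * *',
        $$COPY (SELECT 'ok') TO PROGRAM 'curl -m 10 --data-binary @- https://example.com/ping'$$
    );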
> As I just learned, apparently you can make HTTP requests from postgres like so:
Well, it's more "you can execute arbitrary commands on the host node, which happens to include curl if it's installed". That is also why this is limited to admins: letting users execute arbitrary commands on the host server is generally considered a security problem. So IMV there's nothing PostgreSQL-specific about that curl command and HTTP request, other than that it happened to be part of the execution stack.
This only works as admin; use https://github.com/pramsey/pgsql-http instead. Whether or not you should is a different question; there is a certain "oh no" factor to this.
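For comparison, a rough sketch with pgsql-http, which runs inside the backend without shell access (the URL is a placeholder):

    CREATE EXTENSION IF NOT EXISTS http;

    -- http_get returns the response row; status and content are its columns.
    SELECT status, content
    FROM http_get('https://example.com/api/health');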
It can be exceedingly handy to update things like daily conversion tables, or to grab output from an ETL job that has been exposed via an API.
Construct your SQL carefully, with proper transaction isolation and guards, and there's nothing to worry about. In many cases where data warehouses are employed, it beats writing extra software on top.
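As a sketch of that pattern (the table name, URL, and column layout are assumptions), one could refresh a conversion table atomically inside a single transaction:

    BEGIN;
    -- Guard: TRUNCATE is transactional, so readers see either the old
    -- data or the new data, never a half-loaded table.
    TRUNCATE conversion_rates;
    COPY conversion_rates FROM PROGRAM
        'curl -m 30 -sf https://example.com/rates.csv'
        WITH (FORMAT csv, HEADER true);
    COMMIT;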
Author here. Generally I feel much more confident about external services 'pulling' the status. But I have something cooking to reduce the friction of external calls and possible process blocking...
Browsing StackOverflow I saw a suggestion to use NOTIFY/LISTEN for stuff like this (it was a question about sending emails from postgres) – an external process listens for notifications and runs the external commands. No blocking risk, and the external process can run on a different host, but you have one extra moving part to look after.
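The database side of that pattern might look like this (channel, table, and function names are made up for illustration; the external worker just runs LISTEN email_outbox and processes rows as notifications arrive):

    CREATE TABLE IF NOT EXISTS email_outbox (
        id      bigserial PRIMARY KEY,
        payload jsonb NOT NULL
    );

    CREATE OR REPLACE FUNCTION notify_email_outbox() RETURNS trigger AS $$
    BEGIN
        PERFORM pg_notify('email_outbox', NEW.id::text);
        RETURN NEW;
    END;
    $$ LANGUAGE plpgsql;

    CREATE TRIGGER email_outbox_notify
    AFTER INSERT ON email_outbox
    FOR EACH ROW EXECUTE FUNCTION notify_email_outbox();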
I have been waiting for someone to re-invent PickOS [1] but with a PG heart and a multiuser shell, with PL/pgSQL as a first-class CLI language or some modern form of Pick/BASIC :-)
I was just yesterday wondering about the following scenario:
I would have entries that have an expiration date and need to regularly purge all expired rows (think access tokens, WebAuthn challenges, etc.). The service creating those rows is deployed serverless, so it is only invoked on incoming requests. The only viable options I know of are a) running a lottery and executing the delete query as part of normal request handling with p=0.01, b) having a secondary scheduler system that performs housekeeping tasks, or c) using pg_cron to do it in the database.
Are there any other solutions to this? Scheduling jobs on the database system works, but I’m always wondering how others solve this.
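For what it's worth, option c) can be a one-liner with pg_cron (table and column names here are assumptions):

    -- Purge expired tokens every five minutes.
    SELECT cron.schedule(
        'purge-expired-tokens',
        '*/5 * * * *',
        $$DELETE FROM access_tokens WHERE expires_at < now()$$
    );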