I thought this was going to be a configuration generator and/or parser of an existing configuration file. While the documentation here is excellent (and the style of the site very nice), when I hit the home page, I am still left asking "now what?", needing to sort through an endless alphabetical list of configuration parameters.
I think it would be much more interesting if I could load an existing configuration file and have a tool like this parse my configuration, giving it the highlighting and hyperlinks to documentation. It could analyze the file and give recommendations. Or I could specify my desired configuration outcome (high availability, low latency, multi-user, etc.) and it would create a starting template for me to work from.
p.s. I also personally dislike/distrust disqus and would not lean on or trust the comments there. Let alone it not working at all for anyone blocking cross-site cookies, etc.
[EDIT]
OK, so maybe I didn't read the blog post about "what's coming", which might be in line with what I just wrote. Specifically (from TFA):
> Right now we are working hard on a fully featured application service where you can have a graphical configuration interface (or UI) with Drag & Drop of your postgresql.conf files with automatic validation, and a REST API where you can store and share your custom postgresql.conf configuration files. You will also be able to download your configurations in several formats, like the native postgresql.conf, YAML or JSON.
So I guess that's getting closer to what would actually be useful to someone like me. Wake me up when that option actually exists.
Yes, that is coming. If you want to be woken up when the config file management is up, I'd recommend subscribing to the mailing list and/or following us on Twitter; you will be promptly informed ;)
Re: Disqus. It's not our favorite service either. We tried Commento on another site and the experience was terrible: data was permanently lost. We welcome other suggestions; for now, Disqus does the job.
Semi-OT rant: can we stop these "for humans" qualifiers? AFAIK it started with requests and has since grown out of control: https://github.com/search?q=%22for+humans%22 (3234 repos at the moment) It hardly means anything any more and creators of alternatives that these "for humans" projects are supposed to replace probably won't be too happy with the implication that their work is somehow not for humans.
Maybe the blog post is linked because you want to highlight the not-yet-released configurator?
> Right now we are working hard on a fully featured application service where you can have a graphical configuration interface (or UI) with Drag & Drop of your postgresql.conf files with automatic validation, and a REST API where you can store and share your custom postgresql.conf configuration files. You will also be able to download your configurations in several formats, like the native postgresql.conf, YAML or JSON.
Sounds great but <s>all we get for now is a screenshot</s> (sorry, the screenshot is for a new feature available now) so it's kind of a bummer...
> Semi-OT rant: can we stop these "for humans" qualifiers?
It wasn't really intentional, and I actually didn't even know where or when this originated. It just sounded "right" (non-native English speaker here). There are two reasons why this felt right:
* Tuning Postgres well is hard. Not everybody, not every "human" can easily do it.
* There are several efforts (mostly from academia) to have non-humans (A.I.) tune databases automatically (e.g. Andy Pavlo's work).
> why link to a blog post rather than the actual site?
Because the blog post contains an explanation of what the web site is and how you can use it, and provides much more context than the direct parameters page.
About the new functionality, it is something in the works and early disclosure is good for potential feedback. And yes, the screenshot is what you can get today, so feel free to enjoy it!
IDK, I don't mind them at all. Yeah, of course, all software is written for humans in some sense, but no single piece of software is ever going to be "the one" that satisfies every single person out there.
Some people like to keep it simple, others like to geek out on esoteric settings and full configurability. Personally I like when a software is upfront about who it's for, instead of pretending to cater to everyone's taste.
Even Windows comes in several flavors because one size doesn't fit all.
This is on a tangent, but I hate the "for humans" trend that Kenneth Reitz started with `requests`. What, does everyone else write code for porcupines?
You can optimize a format or structure for the ease of automated processing, or for the ease of human reading and writing. These goals often require contradictory features and properties.
But this is just a site documenting all the available parameters, with added extras like StackOverflow links. It's nice enough, but I suspect that if Postgres configuration was voodoo to you before, this isn't going to change much.
We don't believe much in such static tuning recommendations. Our experience tells us that they can do as much good as bad.
It is much more important to understand and learn a bit about how to tune them. You need to understand the workload, the usage pattern, to do a proper tuning.
This site is a first step in this direction: provide guidance, centralize the available documentation, provide general recommendations. Other steps will follow, all focused on helping Postgres users tune the configuration better.
But as of today, we know a lot of people using this site in their daily work, as a very convenient mechanism to check information you need to have handy when tuning Postgres. And/or use it to share stable and versioned URLs when you want to provide a link to reference what you are talking about, be it a blog post or a link in a document.
All in all, we hope it can be useful as it is, and even more with the steps that will come after ;)
> We don't believe much in such static tuning recommendations. Our experience tells us that they can do as much good as bad.
PGTune just automates the RAM + Connections math you would normally do manually. PGTune is a good starting point, but you still need to know what each config does and configure beyond PGTune. Nobody is saying PGTune does your configs for you, it's just automating what we always did manually before.
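For illustration only, the classic rule-of-thumb math that tools like PGTune automate is roughly the following sketch (this is not PGTune's actual code, and as the reply below argues, not numbers to trust blindly):

```python
# Illustrative sketch of the classic rule-of-thumb math automated by
# tools like PGTune -- NOT PGTune's actual code.

def rule_of_thumb(total_ram_gb: float) -> dict:
    """Classic starting-point values for a dedicated Postgres server."""
    return {
        "shared_buffers_gb": total_ram_gb / 4,           # the common "1/4 of RAM" rule
        "effective_cache_size_gb": total_ram_gb * 0.75,  # the common "3/4 of RAM" rule
    }

print(rule_of_thumb(64))  # {'shared_buffers_gb': 16.0, 'effective_cache_size_gb': 48.0}
```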
But I mostly disagree with that math too ;) Mostly because that math depends on other parameters that are not easy to know statically (otherwise, why aren't they the Postgres defaults?).
For example: shared_buffers is 1/4 of RAM and effective_cache_size 3/4. Well, several benchmarks have already pointed out that 1/4 is not necessarily a good number, and you need to benchmark your own workload. Similarly, effective_cache_size is slightly over dimensioned for dedicated servers and definitely too big for shared servers.
Even more clearly, the max_connections recommendation may even become a significant problem for your database. You should almost always have a connection pooler in front of Postgres and have max_connections a small multiple of your cores. PgTune's recommendation is probably an order of magnitude higher than usual good values, which may lead to much worse performance.
Another example: min_wal_size should always be higher than what is recommended if you have enough disk, and max_wal_size should definitely be significantly higher than what is recommended.
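To make the above concrete, a purely illustrative postgresql.conf fragment for a hypothetical dedicated 64 GB server with a connection pooler in front might look like this (benchmark your own workload before adopting any of these values):

```
# Illustrative values only -- benchmark your own workload
shared_buffers = 16GB         # start near 1/4 of RAM, then benchmark up or down
effective_cache_size = 40GB   # somewhat below the usual 3/4-of-RAM rule
max_connections = 64          # a small multiple of cores, with a pooler in front
min_wal_size = 4GB            # well above the 80MB default, given enough disk
max_wal_size = 16GB           # significantly above the 1GB default
```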
You should blog about your learnings, would love to read more about "PG performance configs you should look into". We're probably leaving too many configs at default values just because we didn't know they were worth tweaking.
I'm sure you've thought of this, but one thing you could do eventually is give a visitor a walkthrough, starting with the type of workload they expect, and highlight at each step what the most important levers available to them are. By all means stop short of calculating actual values etc, but just give people some entry point other than the full list of parameters, however nice the write-up under each parameter is (unless I'm missing something in the UI!)
Indeed. This is pretty much part of what will be coming in the next iteration.
What we're thinking is providing a set of "cards", where each card contains a theme: logging tuning, memory parameters, autovacuum, etc. And then, every card contains a recommended set of parameters to tune, and guidance about how to tune them.
I've always wondered where pushes for database "performance" tuning come from, and I've suspected it might be an Oracle thing, due to that application's nature as somewhat of a black box, with a license prohibition on any public performance testing against other products.
MySQL and Postgres are generally good enough out of the box to handle most workloads. Tuning effort is best spent on things like query optimization and normalization, or identifying unintentionally inefficient nested queries that might have cropped up through the life of the database: old 'select *' reports that managers of long ago may have mandated, or rogue cron jobs that run meaningless reporting. Identifying records to truncate, or creating new databases entirely for different types of data instead of packing it all into one giant database as some companies tend to do, is also worthwhile.
Postgres performance with the default configuration can be significantly slower (30-40%, sometimes more than 100%) than with a properly tuned configuration. It is quite important, and one of the main recommendations and jobs we do in our daily work.
For example, if you leave random_page_cost at the default value and have fast SSDs, it is very likely that the fancy indexes you created may not be used and seq scans may be used instead. No amount of query tuning may fix that.
A classic one is shared_buffers, whose default is 128MB. Unless you are running PG on an AWS Lambda ^___^ or a Raspberry Pi, this is typically a very low number.
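As a sketch, the two settings mentioned might be adjusted like this on a dedicated, SSD-backed server (values illustrative only):

```
random_page_cost = 1.1   # the 4.0 default assumes spinning disks; a value near
                         # seq_page_cost (1.0) on fast SSDs lets the planner
                         # favor your indexes over seq scans
shared_buffers = 8GB     # default is only 128MB
```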
Absolutely. But we have found over many years that knowing which 5-10 (out of the 300) to tune, and to what values, is a hard job for most.
Despite that, we end up tuning 30-40 for most of the customer environments we work with. For instance, logging (for appropriate auditing or troubleshooting) accounts for about a dozen. Autovacuum takes its fair share too, and if you have a heavy-traffic db, tuning it is a must.
I use PG as a place to put OLAP cubes generated from primary-source event-sourcing data. In that use-case, PG really doesn't need any kind of durability guarantees, because it's not the canonical store of the data it's holding (i.e. the same data could just be repopulated into a new PG instance in the event of a hardware failure.) PG's default tuning is very sub-optimal for this use-case. :)
(Actually, question for the audience, anyone with experience in heavily-OLAP workloads: what kind of DBMS would you load an OLAP cube into, if it were still too large after projection/dimension-reduction to fit into memory, and you needed to run CPU-intensive, non-embarrassingly-parallel queries on it?)
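For the rebuild-on-failure use case described above, PostgreSQL's documented non-durable settings are the usual levers; a minimal sketch (only safe when the data can be repopulated from elsewhere):

```
fsync = off                  # don't force WAL writes to disk
synchronous_commit = off     # don't wait for WAL flush on commit
full_page_writes = off       # skip torn-page protection
checkpoint_timeout = 30min   # checkpoint less often, batch more writes
```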
Also, throwing some hardware at it, like a bigger SSD or a memory increase (both quite cheap these days), will improve performance on an out-of-the-box server as well.
Example: in 2017, a client started getting heavier traffic and users started seeing timeouts. The server had 16 GB RAM. Mind you, it was the development server as well, so lots of stuff was running there. We bought a dedicated server, upped the memory to 64 GB, and did the switch together in one night. Downtime of only 30 minutes, and to this day the client is good to go.
Wow, this is absolutely amazing! I’m planning on using PostgreSQL at a deep level for YC Startup School, and I haven’t been able to find too many super in depth PostgreSQL resources for what I’m doing (looking at event generation, permissioning, and view refreshes). I’m sure this can help deepen my expertise :)
Does anyone happen to have any advice for configuration tuning when running Postgres within Docker for Mac? I'm sure there are multiple ways to enhance the configuration for running within that environment, but the filesystem consistency concerns[1] in particular have always made me a bit paranoid.
This site seems really helpful for diving deeper into each of the individual parameter options, but I'd still appreciate any practical tips anyone may have.
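One common approach (a sketch, not Docker-for-Mac-specific advice): the official postgres image passes trailing arguments straight through to the server process, so settings can be overridden per-container without editing postgresql.conf inside it, and using a named volume instead of a host bind mount sidesteps some of the osxfs filesystem overhead:

```shell
# Named volume (not a host bind mount) avoids osxfs consistency overhead;
# the trailing -c flags are handed directly to the postgres server.
docker run -d --name pg \
  -e POSTGRES_PASSWORD=secret \
  -v pgdata:/var/lib/postgresql/data \
  postgres:12 \
  -c shared_buffers=1GB \
  -c random_page_cost=1.1
```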
The layout of this website feels like it was made by someone who thought that simultaneously combining every single design trend of the last 15 years was a good idea.
I have been using MySQL/MariaDB for many years. And I have wanted to switch to Postgres, but I have read so many stories over the years of difficulties configuring (like this article/video is addressing) and tuning/debugging Postgres to cause me some concern.
Is there any validity to my concerns? (use case is CRUD apps and CMS's)
I wouldn't say that Postgres is hard to tune. Yes, it has ~300 parameters in a config file, but a lot of those are to do with logging, admin, etc.
There are about 6 parameters that I might touch on every deployment:
Listen address, shared buffers (default it to 25% of RAM), connection counts, max workers (CPU core count), max parallel workers, and max WAL keep segments.
That's 95% of the tuning you need on an average deployment. The rest is a bit dependent on workload (like checkpoint timeouts on heavy-write boxes).
EDIT: And when I say "need", the actual defaults out of the box will usually get you a long way. Just set the right listen address and let it do its thing.
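As a sketch, those half-dozen settings on a hypothetical 16-core, 64 GB box might look like this (values illustrative; note that wal_keep_segments was renamed wal_keep_size in Postgres 13):

```
listen_addresses = '10.0.0.5'   # or '*' behind a firewall
shared_buffers = 16GB           # ~25% of RAM
max_connections = 200
max_worker_processes = 16       # ~CPU core count
max_parallel_workers = 16
wal_keep_segments = 64          # WAL retained for lagging replicas
```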
Not really. In nearly every case, the "Postgres is too hard" complaints are really just "Postgres doesn't work exactly like MySQL" complaints.
The area MySQL people get hung up on, generally, is the security model in pg_hba.conf. But this is a seriously powerful feature, and not at all hard to configure once you grasp what's going on.
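For anyone who hasn't seen it, pg_hba.conf is a top-to-bottom, first-match-wins rule list; a minimal illustrative example (database and user names are hypothetical):

```
# TYPE   DATABASE   USER       ADDRESS        METHOD
local    all        postgres                  peer           # local admin via OS identity
host     app_db     app_user   10.0.0.0/24    scram-sha-256  # app servers, password auth
host     all        all        0.0.0.0/0      reject         # everything else: denied
```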
I appreciate the note about complaints. I try to filter these kinds of things out (like amazon reviews that say "it's not what I thought it was so I rate a zero").
I will look into the security conf the next time I try it out. Thanks!
I never made any changes to Postgres for development and it was blazing fast. I made a few simple changes for production to let it use more memory (just a few lines that you'll find in any guide) and that was about it. I guess there are plenty of knobs for very pro users when they need to squeeze out the last few percent of performance, but generally it just works fast.
This site is actually misleading, in my opinion; someone with no prior exposure might think PostgreSQL users are obligated to spend their time on endless configuration push-ups. That is definitely not the case. As the parent points out, developers rarely have to change more than a few things. Production means real loads, and real loads drive administrators to configure and tune, but that's as true for MySQL/MariaDB as it is for PostgreSQL.
Over here[1] you can read about the 1163 snake-cased MariaDB variables if you want... 231 of them are specific to InnoDB alone.
No need to be concerned, really. I used MySQL a while back before switching to Postgres, and the stories are the same on both sides :) I think you will hear that only "enterprise" DB users supposedly have it nice...