
Database optimization posts are always interesting, but it's really hard to do an apples-to-apples comparison. Your performance is going to depend mostly on your hardware, internal database settings, and OS-level tuning. I'm glad this one included some insight into which SQLite settings were disabled, but there are always going to be too many factors to easily compare this to your own setup.

For most SQL systems, the fastest way to do inserts is always just going to be batched inserts. There are some extra tricks to reduce network costs and optimize batches [0], but at its core you are still inserting into the table through the normal insert path. Beyond that, you can basically only reduce the amount of work done on the DB side per insert, or tune your OS for your workload.
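
As a concrete illustration, here is a minimal sketch of batched inserts using Python's sqlite3 module. The "events" table, the batch size, and the generated rows are all hypothetical; the idea (one statement and one commit per chunk rather than per row) carries over to other databases and client libraries.

  import sqlite3

  BATCH_SIZE = 10_000

  conn = sqlite3.connect("example.db")  # hypothetical database
  conn.execute("CREATE TABLE IF NOT EXISTS events (id INTEGER PRIMARY KEY, payload TEXT)")

  rows = [(f"payload-{i}",) for i in range(100_000)]  # hypothetical data

  # One executemany call and one commit per chunk, instead of a round trip
  # and an implicit transaction per row.
  for start in range(0, len(rows), BATCH_SIZE):
      chunk = rows[start:start + BATCH_SIZE]
      with conn:  # opens a transaction, commits on success
          conn.executemany("INSERT INTO events (payload) VALUES (?)", chunk)

  conn.close()

Chunking keeps each transaction short while still amortizing the per-statement overhead across many rows.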

Some other DB systems (more common in NoSQL) let you do real bulk loads [1], where you write direct(ish) database files and bypass much of the normal write path.

[0] https://dev.mysql.com/doc/refman/5.7/en/insert-optimization....

[1] https://blog.cloudera.com/how-to-use-hbase-bulk-loading-and-...




Oracle, DB2, MySQL, SQL Server, and PostgreSQL all support bulk insert. Two obvious use cases are QA databases and the L part of ETL, both of which pretty much require it.
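
As a sketch of what the L part can look like, here is a bulk load into PostgreSQL using psycopg2's COPY support. The connection string, the staging_users table, and the in-memory buffer are hypothetical stand-ins for a real staging table and an exported file.

  import io
  import psycopg2  # assumes psycopg2 is installed and a PostgreSQL instance is reachable

  # Hypothetical connection string and staging table; adjust for your environment.
  conn = psycopg2.connect("dbname=test user=postgres")
  cur = conn.cursor()
  cur.execute("CREATE TABLE IF NOT EXISTS staging_users (id INT, name TEXT)")

  # In a real ETL load this buffer would be a file exported from the source system.
  buf = io.StringIO("".join(f"{i}\tuser-{i}\n" for i in range(100_000)))

  # COPY streams the whole batch through a single command instead of per-row INSERTs.
  cur.copy_expert("COPY staging_users (id, name) FROM STDIN", buf)
  conn.commit()

  cur.close()
  conn.close()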


For relational databases in the enterprise with large amounts of data, batched inserts work great.

For large deletes, it is often better to copy the rows that won't be deleted into a new table and rename that table when done.
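
A rough sketch of that copy-and-rename pattern, here with sqlite3 against a hypothetical events table that has a created_at column; other databases follow the same shape with their own DDL.

  import sqlite3

  conn = sqlite3.connect("example.db")  # hypothetical database with a large events table

  with conn:
      # Copy only the rows we want to keep into a fresh table...
      conn.execute(
          "CREATE TABLE events_keep AS "
          "SELECT * FROM events WHERE created_at >= '2023-01-01'"
      )
      # ...then swap the tables instead of deleting millions of rows in place.
      conn.execute("DROP TABLE events")
      conn.execute("ALTER TABLE events_keep RENAME TO events")
      # Note: CREATE TABLE ... AS SELECT does not carry over indexes or
      # constraints, so recreate them here if the table has any.

  conn.close()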

With large updates, it is important to look at the query plan and make sure good indexes support the update. Batching also works well in this scenario.
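
A minimal sketch of batching a large update with sqlite3, again against a hypothetical events table with status and id columns. Each iteration touches a bounded number of rows in its own transaction, so locks are held briefly and progress survives an interruption.

  import sqlite3

  BATCH_SIZE = 5_000

  conn = sqlite3.connect("example.db")  # hypothetical database and schema

  # Apply a large update in bounded batches so each transaction stays short.
  while True:
      with conn:
          cur = conn.execute(
              "UPDATE events SET status = 'archived' "
              "WHERE id IN (SELECT id FROM events WHERE status = 'stale' LIMIT ?)",
              (BATCH_SIZE,),
          )
      if cur.rowcount == 0:
          break  # nothing left to update

  conn.close()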



