
These are great suggestions. Today every db write is indeed wrapped in a mutex. The two optimizations I am experimenting with are:

1. When new messages are inserted, immediately append them to a file on disk. Then, in batch, insert them into SQLite.

2. When dequeuing messages, keep n message IDs in memory as a ready queue, and keep the IDs of dequeued messages in a second list. Messages on the ready queue can be served immediately (with a SELECT, which is fast), and the update to the dequeued status can then happen in batch.

Appreciate the tips!




For 1, we went with a per-goroutine bytes.Buffer that was batch "inserted" every x milliseconds or n messages. The catch is that anything still in the buffer is lost on a crash, which for most queues we don't care about. Queues where we can't afford to lose anything are set to 0 ms, 1 msg; where losing messages is acceptable, this is great for perf.
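A minimal Go sketch of that kind of batching buffer, for illustration only. The `Batcher` type, its methods, and the `flush` callback are hypothetical names, not the commenter's code. It assumes messages contain no newlines and that an interval of 0 means "flush on count only".

```go
package main

import (
	"bytes"
	"fmt"
	"sync"
	"time"
)

// Batcher (hypothetical) accumulates newline-delimited messages in a
// bytes.Buffer and hands them to flush once maxMsgs is reached or the
// interval elapses, whichever happens first.
type Batcher struct {
	mu      sync.Mutex
	buf     bytes.Buffer
	count   int
	maxMsgs int
	flush   func(batch []byte, n int)
	stop    chan struct{}
	done    chan struct{}
}

// NewBatcher starts the timer goroutine. Assumption: interval <= 0
// disables the timer, so only the message count triggers a flush.
func NewBatcher(maxMsgs int, interval time.Duration, flush func([]byte, int)) *Batcher {
	b := &Batcher{maxMsgs: maxMsgs, flush: flush,
		stop: make(chan struct{}), done: make(chan struct{})}
	go b.loop(interval)
	return b
}

func (b *Batcher) loop(interval time.Duration) {
	defer close(b.done)
	if interval <= 0 {
		<-b.stop
		return
	}
	t := time.NewTicker(interval)
	defer t.Stop()
	for {
		select {
		case <-t.C:
			b.Flush()
		case <-b.stop:
			return
		}
	}
}

// Add appends one message and flushes if the count threshold is reached.
func (b *Batcher) Add(msg []byte) {
	b.mu.Lock()
	defer b.mu.Unlock()
	b.buf.Write(msg)
	b.buf.WriteByte('\n')
	b.count++
	if b.count >= b.maxMsgs {
		b.flushLocked()
	}
}

// Flush hands over whatever is currently buffered.
func (b *Batcher) Flush() {
	b.mu.Lock()
	defer b.mu.Unlock()
	b.flushLocked()
}

func (b *Batcher) flushLocked() {
	if b.count == 0 {
		return
	}
	// Copy the bytes: Reset reuses the buffer's underlying storage.
	batch := append([]byte(nil), b.buf.Bytes()...)
	n := b.count
	b.buf.Reset()
	b.count = 0
	b.flush(batch, n) // called with the lock held, which keeps batches ordered
}

// Close stops the timer and flushes the remainder.
func (b *Batcher) Close() {
	close(b.stop)
	<-b.done
	b.Flush()
}

func main() {
	b := NewBatcher(3, 50*time.Millisecond, func(batch []byte, n int) {
		// A real flush would do one batched INSERT here.
		fmt.Printf("flushing %d message(s), %d bytes\n", n, len(batch))
	})
	for i := 0; i < 7; i++ {
		b.Add([]byte(fmt.Sprintf("msg-%d", i)))
	}
	b.Close()
}
```

Setting `maxMsgs` to 1 makes every `Add` flush immediately, which corresponds to the "0 ms, 1 msg" setting for queues that must not lose messages. Anything still buffered when the process dies is lost, as described above.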

For 2. you could probably do something like this:

    BEGIN TRANSACTION;
    -- Select the 100 oldest pending messages
    SELECT id, message, created_at FROM queue WHERE status = 'pending' ORDER BY created_at ASC LIMIT 100;
    -- Mark the selected messages as 'processing';
    -- ? is the created_at of the last row returned above
    UPDATE queue SET status = 'processing' WHERE status = 'pending' AND created_at <= ?; -- assuming created_at is strictly monotonic
    COMMIT;

Basically, select a batch and then abuse the ordering properties to mark the whole batch in one statement. You can then dispatch all the messages from your select evenly to sender threads. Sender threads signal over a buffered channel that they've completed or failed, and the database can be updated accordingly. At startup, you can just SELECT where status = 'processing' and recover.

This is a pretty decent translation of how ours works.
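For readers who want to see the dispatch step spelled out, here is a rough Go sketch of fanning a selected batch out to sender goroutines and collecting results over a buffered channel. `Message`, `Result`, `Dispatch`, and `ErrSend` are hypothetical names, and the `send` callback stands in for whatever actually delivers a message. The database updates are left out. Workers pull from a shared channel, so the spread is by availability rather than strictly even.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// Message mirrors the id and message columns selected from the queue table.
type Message struct {
	ID   int64
	Body string
}

// Result is what a sender reports back for one message.
type Result struct {
	ID  int64
	Err error
}

// ErrSend is a placeholder error used by the demo below.
var ErrSend = errors.New("send failed")

// Dispatch hands the batch to `workers` sender goroutines (workers must be
// at least 1) and returns the IDs that were sent and the IDs that failed.
func Dispatch(batch []Message, workers int, send func(Message) error) (done, failed []int64) {
	jobs := make(chan Message)
	// Buffered to the batch size so senders never block when reporting.
	results := make(chan Result, len(batch))

	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for m := range jobs {
				results <- Result{ID: m.ID, Err: send(m)}
			}
		}()
	}
	for _, m := range batch {
		jobs <- m
	}
	close(jobs)
	wg.Wait()
	close(results)

	// The caller would now update the database in batch: mark `done` as
	// completed and return `failed` to 'pending' (or retry them).
	for r := range results {
		if r.Err != nil {
			failed = append(failed, r.ID)
		} else {
			done = append(done, r.ID)
		}
	}
	return done, failed
}

func main() {
	batch := []Message{{1, "a"}, {2, "b"}, {3, "c"}, {4, "d"}}
	done, failed := Dispatch(batch, 2, func(m Message) error {
		if m.ID == 3 {
			return ErrSend
		}
		return nil
	})
	fmt.Println("sent:", len(done), "failed:", len(failed))
}
```

If the process dies mid-batch, the rows are still marked 'processing' in the table, which is what makes the startup recovery query described above work.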



