

Interesting. My understanding is that part of why Mastodon is so slow/resource-hungry is that it serializes background tasks to Redis for a sidecar process to pick up, and that that's the normal way to do things. If Rails has a concurrent runtime, why don't they just run background work directly?


Rails isn't super opinionated about database writes; it's mostly left up to developers to discover that for relational DBs you do not want to be doing a bunch of small writes all at once.

That said, it has tools specifically to address this, which started appearing a few years ago: https://github.com/rails/rails/pull/35077

The way my team handles it is to stick Kafka in between what's generating the records (for us, a bunch of web scraping workers) and a consumer that pulls off the Kafka queue and runs an insert when its internal buffer reaches around 50k rows.

Rails is also looking to add more direct support for background work with https://github.com/basecamp/solid_queue but this is still very new - most larger Rails shops are going to be running a second system and a gem called Sidekiq that pulls jobs out of Redis.

In terms of read queries, again I think that comes down to individual developers realizing (hopefully very early in their careers) that it's something that needs to be considered. Rails isn't going to stop you from making N+1-type mistakes or hammering your DB with 30 separate queries for each page load. But it has plenty of tools and documentation on how to do better.
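To make the N+1 pattern concrete, here's a toy sketch in plain Ruby (the `FakeDB` class is hypothetical, not ActiveRecord): fetching each post's author individually issues one query per post, while a batched lookup (roughly what Rails' `includes` does for you) needs just one extra query.

```ruby
# Toy illustration of the N+1 access pattern -- FakeDB is hypothetical,
# not a real ORM; it just counts how many "queries" each approach issues.
class FakeDB
  attr_reader :query_count

  def initialize(authors)
    @authors = authors   # { id => name }
    @query_count = 0
  end

  # One query per call: this is what the N+1 pattern does per row.
  def find_author(id)
    @query_count += 1
    @authors[id]
  end

  # One query for many ids: the batched / eager-loading approach.
  def authors_for(ids)
    @query_count += 1
    @authors.slice(*ids)
  end
end

post_author_ids = [1, 2, 3, 1, 2]  # author id on each of 5 posts

naive = FakeDB.new(1 => "ann", 2 => "ben", 3 => "cat")
post_author_ids.each { |id| naive.find_author(id) }  # 5 queries

eager = FakeDB.new(1 => "ann", 2 => "ben", 3 => "cat")
eager.authors_for(post_author_ids.uniq)              # 1 query
```

With 30 items per page the naive loop issues 30 queries where the batched version issues one, which is exactly the "hammering your DB" failure mode described above.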


`insert_all` seems to be an example of what I mean about how the framework encourages you to do the wrong thing. Here there is a lower-level escape hatch to do a bulk insert, but it says it doesn't run your callbacks/validations. So if you're using "good" design (or using libraries that work by hooking into that functionality), you can't use it. Laravel was the same way.

The new queue you linked is database-backed, but the whole point is that you want to just run a job without needing to serialize anything outside of your process. It should just schedule it onto the thread pool and give you a promise for when it's done.

The Kafka thing also seems to be an example of what I mean: in Scala I'd just make a `new Queue` with a thread safe library, and have a worker pull off and do an insert every hundred rows or so, or after e.g. 5 ms have passed, whichever is first. No extra infrastructure needed, minimal RAM used, your queueing delay is in the single digit ms, and you get the scaling benefits. Takes maybe 10-20 lines of code.
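For what it's worth, the same pattern is also short in plain Ruby, since the core `Queue` class is thread-safe. A rough sketch (the flush block is a stand-in for whatever bulk insert you'd actually run; the batch size and wait time are illustrative):

```ruby
# In-process batched writes: producers push rows onto a thread-safe queue;
# a single worker flushes them when the buffer hits max_batch rows or
# max_wait seconds have passed, whichever comes first.
class BatchWriter
  STOP = Object.new

  def initialize(max_batch: 100, max_wait: 0.005, &flush)
    @queue = Queue.new   # Ruby's core Queue is thread-safe
    @max_batch = max_batch
    @max_wait = max_wait
    @flush = flush
    @worker = Thread.new { run }
  end

  def push(row)
    @queue << row
  end

  def close
    @queue << STOP   # drain remaining rows, then stop the worker
    @worker.join
  end

  private

  def run
    buffer = []
    deadline = monotonic + @max_wait
    loop do
      row = try_pop
      if row.equal?(STOP)
        @flush.call(buffer) unless buffer.empty?
        return
      end
      buffer << row unless row.nil?
      if buffer.size >= @max_batch || (!buffer.empty? && monotonic >= deadline)
        @flush.call(buffer)
        buffer = []
        deadline = monotonic + @max_wait
      elsif buffer.empty?
        deadline = monotonic + @max_wait  # nothing pending, keep the timer fresh
      end
    end
  end

  # Non-blocking pop plus a short sleep, so the deadline is honored without
  # relying on Queue#pop(timeout:), which needs Ruby >= 3.2.
  def try_pop
    @queue.pop(true)
  rescue ThreadError
    sleep 0.001
    nil
  end

  def monotonic
    Process.clock_gettime(Process::CLOCK_MONOTONIC)
  end
end

batches = []
writer = BatchWriter.new(max_batch: 3, max_wait: 0.05) { |rows| batches << rows.dup }
6.times { |i| writer.push(i) }
writer.close
```

No Kafka, no Redis: the queueing delay is bounded by `max_wait` and the only extra infrastructure is one thread.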

You can then take that and abstract it into a repository pattern so that you could have an ORM that does batching for you with single item interfaces (for non-transactional workflows), but none of them seem to do this.


I suppose I've just been in Rails land for a while, so I can't make an apples-to-apples comparison to how other frameworks approach things, but I don't think insert_all is encouraging anything wrong - by the time a Rails team is reaching for it, I can almost guarantee they understand the implications of it.

And again, maybe I'm just not understanding, but I really like having our background processes handled completely separately from our main web application. Maybe it's just the peace of mind of knowing that I can scale them independently of each other.


It's not that insert_all is encouraging anything wrong; it's that the normal way to use ActiveRecord does. insert_all is the right way to do things performance-wise, so you'd want to use it when possible, but if you were using other features of the framework like callbacks/validations for create/update, then you can't. The happy-path of an ORM tends to push you in a direction where bad performance all over the place is the default, and it does it in a way where if you didn't have properly calibrated performance expectations, you might think that the bottleneck is because IO is slow, but actually it could easily handle 10x the workload with better access patterns.
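The tension is easy to show with a toy model in plain Ruby (a sketch, not ActiveRecord): the per-row path runs registered hooks, the bulk path never does, so any behavior wired into callbacks silently stops happening the moment you switch to bulk inserts.

```ruby
# Toy sketch of the ORM trade-off (not ActiveRecord): per-row create runs
# registered hooks; an insert_all-style bulk write skips them entirely.
class ToyModel
  @rows = []
  @callbacks = []

  class << self
    attr_reader :rows

    def before_create(&blk)
      @callbacks << blk
    end

    # Happy path: one write per record, callbacks fire each time.
    def create(attrs)
      @callbacks.each { |cb| cb.call(attrs) }
      @rows << attrs
    end

    # Bulk path: a single multi-row write; callbacks are skipped, which is
    # exactly why hook-dependent code can't use it.
    def insert_all(attrs_list)
      @rows.concat(attrs_list)
    end
  end
end

audit = []
ToyModel.before_create { |attrs| audit << attrs[:name] }

ToyModel.create(name: "a")                            # hook fires
ToyModel.insert_all([{ name: "b" }, { name: "c" }])   # hooks skipped
```

All three rows land in storage, but the audit hook only ever saw the first one, which is the kind of silent divergence the comment above is describing.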


Having a Redis job queue is extremely standard, especially in web app development, regardless of language. For one thing, if the web server crashes for any reason, the jobs still continue processing, and you also have a log of jobs in case they fail, etc.


Are people using it for reliability though? Are they running redis in a mode where it persists a journal? If not, then if redis crashes for any reason, you're in the same situation.

And, like, Mastodon apparently uses a queue to do things like send new user registration emails. Why not just send the email from the new user request handler? Then if there's an error, you can tell the user in the response instead of saying "okay you should get an email" and then having it go into the ether. I was under the impression this had something to do with not wanting to tie up the HTTP worker because you want it to quickly get back to doing HTTP requests, but if it can concurrently process requests, there's no issue.

Similarly they have an ingest queue for other federated servers sending them updates. But if things are fast, why wouldn't they just process the updates in the HTTP handler? You don't need a reliable queue because if e.g. you crash, the other side will not get their HTTP response, and they'll know to retry.


It may just be out of habit and not any underlying language reasoning. Things like sending emails, or doing anything but simple database operations, make sense to do in a queue. For instance, I've worked at multiple places where we did this using Celery with Python or BullMQ with JavaScript. At some of them we had a log that persisted for a certain amount of time so we could rerun e.g. emails that never got sent.


I’m sure it’ll be in my neighborhood.

Let me know if you’re in town and we’ll throw a party for competitors, staff, and fans at my place.


> The Boring Company is excited to announce the second Not-a-Boring Competition which challenges teams to come up with tunneling solutions and answer the question, “Can you beat the snail?”. The challenge this year comes with a couple of twists and turns! See the Abbreviated Rules for the 2022-2023 Not-A-Boring Competition.

https://www.boringcompany.com/s/2022-Not-a-Boring-Competitio...


The Boring Company didn't install a legal septic system for the families living at their HQ in Texas. And it goes downhill from there: https://www.bloomberg.com/news/articles/2022-06-15/elon-musk...


The Boring Company's latest test tunnel was about 1034 times slower than a snail (their current stated goal).

https://www.reddit.com/r/BoringCompany/comments/xugxdl/comme...


Has the Boring Company really achieved anything that is not a flamethrower (not not a flamethrower?) and a slow EV taxi lane under Las Vegas?


It's achieved distracting voters and politicians from proven mass transit technologies.


How does this compare to other companies that dig tunnels?


Per Musk's original introduction of the snail, it's 14x faster than existing boring machines. So TBC did extremely poorly.


Well, yes, that's how development and improvement work. You have to start somewhere and improve from there.

Remember, the first SpaceX launch attempts didn’t make orbit. Did that make them quit because they ‘failed’?

Improvement is the name of Musk's game.


What an incredible legacy you're creating Nick. Keep dropping those breadcrumbs for the rest of us to follow!


I got this email from an information request with the City of Kyle.

From Brian Gettinger, "Tunnel Evangelist and Business Development Lead" at Elon Musk's The Boring Company:

> Wanted to keep you guys updated on progress here. I am meeting with CAMPO on the 25th. I talked to {} last week and he is pretty excited about potentially resurrecting the Lone Star rail concept in a different form. I see the concept coming together like this:

> 1) TBC deploys individual systems in San Antonio & Austin

> 2) Development entity forms or reconstitutes to lead connection from San Antonio to Austin - likely collaborating with TXDOT to follow I-35 ROW

> 3) SATX to ATX system deployed in Segments each with individual utility

>- 1. Kyle to Austin

>- 2. New Braunfels to SATX

>- 3. San Marcos to Kyle

>- 4. New Braunfels to San Marcos

> With the leaked articles about ATX and SATX there is enough in public space for you to start having "what if" conversations with folks, just don't implicate me or TBC directly. TBC and I are most useful operating in the background, not in the front of the parade.

> Let me know how I can be helpful.

Response from the city manager:

> Hi Brian,

> Looks solid to me. We would love a connection into downtown Austin AND ABIA. Once there is a plan for the overall plan we will need to know what the cost is to Kyle for the station and we can begin identifying a location and funding plan. Please let us know what our next steps should be.


I've been really impressed with Particle easing that transition from Arduino to production: https://www.particle.io/



thanks!


I like where I live and I don't want my landlord to be thinking about how easy it is to replace me.

