Hacker News | jaseemabid's comments

I recently wrote an explainer for the sqlite file format with some helpful diagrams. This might help.

https://blog.jabid.in/2024/11/24/sqlite.html


An immutable, append-only persistent log doesn't imply storing everything _forever_.

If you want to remove something you could add a tombstone record (like Cassandra) and eventually remove the original entry during routine maintenance operations like repacking into a more efficient format, archival into cold storage, TTL handling etc.
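A minimal Python sketch of the tombstone idea, assuming a toy in-memory log. `Record`, `AppendOnlyLog`, and `compact` are hypothetical names for illustration, not any real database's API. A delete is appended as a tombstone, and the original entry is only physically dropped at compaction time.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Record:
    key: str
    value: Optional[str]  # None marks a tombstone (hypothetical convention)


class AppendOnlyLog:
    """Toy append-only log: records are never modified in place."""

    def __init__(self) -> None:
        self.records: List[Record] = []

    def put(self, key: str, value: str) -> None:
        self.records.append(Record(key, value))

    def delete(self, key: str) -> None:
        # Deleting appends a tombstone instead of touching old records.
        self.records.append(Record(key, None))

    def get(self, key: str) -> Optional[str]:
        # The most recent record for a key wins; a tombstone reads as "absent".
        for record in reversed(self.records):
            if record.key == key:
                return record.value
        return None

    def compact(self) -> None:
        # Maintenance pass: keep only the latest record per key,
        # then drop keys whose latest record is a tombstone.
        latest = {}
        for record in self.records:
            latest[record.key] = record
        self.records = [r for r in latest.values() if r.value is not None]
```

Real systems usually keep tombstones for a grace period before dropping them, so that every replica sees the delete; this sketch drops them immediately.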


A notable example of a large-scale app built with a very similar architecture is ATproto/Bluesky[1].

"ATProto for Distributed Systems Engineers" describes how updates from the users end up in their own small databases (called PDS) and then a replicated log. What we traditionally think of as an API server (called a view server in ATProto) is simply one among the many materializations of this log.
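A rough Python sketch of "a view is one materialization of the log". The event shape and the function names below are assumptions made up for illustration; they are not ATProto's record format or API. The point is only that several independent views can be derived by replaying the same log.

```python
from collections import defaultdict
from typing import Dict, List

# Hypothetical replicated log: an ordered list of user events.
event_log = [
    {"user": "alice", "action": "post", "text": "hello"},
    {"user": "bob", "action": "post", "text": "hi"},
    {"user": "alice", "action": "post", "text": "again"},
]


def posts_by_user(events: List[dict]) -> Dict[str, List[str]]:
    """One materialization: each user's posts, in log order."""
    view = defaultdict(list)
    for event in events:
        if event["action"] == "post":
            view[event["user"]].append(event["text"])
    return dict(view)


def post_counts(events: List[dict]) -> Dict[str, int]:
    """A second, independent materialization of the same log."""
    counts = defaultdict(int)
    for event in events:
        if event["action"] == "post":
            counts[event["user"]] += 1
    return dict(counts)
```

Either view can be thrown away and rebuilt from the log at any time, which is what makes the log the source of truth.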

I personally find this model of thinking about dataflow in large-scale apps pretty neat and easy to understand. The parallels are unsurprising since both the Restate blog and ATProto docs link to the same blog post by Martin Kleppmann.

This architecture seems to be working really well for Bluesky, as they clearly aced multiple 10x events very recently.

[1]: https://atproto.com/articles/atproto-for-distsys-engineers


That blog post is a great read as well. Truly, the log abstraction [1] and "Turning the DB inside out" [2] have been hugely influential.

In a way, this article suggests extending that

(1) from a log that represents data (upserts, cdc, etc.) to a log of coordination commands (update this, acquire that log, journal that step)

(2) have a way to link the events related to a broader operation (handler execution) together

(3) make the log aware of handler execution (better yet, put it in charge), so you can automatically fence outdated executions
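Point (3) could look roughly like the Python sketch below. `FencedLog`, `start_execution`, and the epoch scheme are assumptions for illustration only, not Restate's actual API: the log hands out a new epoch for every (re)started execution of a handler invocation and rejects appends from any older epoch.

```python
from typing import Dict, List, Tuple


class FencedLog:
    """Toy log that knows which execution attempt is current per invocation."""

    def __init__(self) -> None:
        self.entries: List[Tuple[str, int, str]] = []
        self.current_epoch: Dict[str, int] = {}

    def start_execution(self, invocation_id: str) -> int:
        # Each new attempt (e.g. a retry after a suspected crash) bumps the
        # epoch, which implicitly fences every earlier attempt.
        epoch = self.current_epoch.get(invocation_id, 0) + 1
        self.current_epoch[invocation_id] = epoch
        return epoch

    def append(self, invocation_id: str, epoch: int, command: str) -> bool:
        # Commands from an outdated execution are rejected, not journaled.
        if epoch != self.current_epoch.get(invocation_id):
            return False
        self.entries.append((invocation_id, epoch, command))
        return True
```

For example, if a slow first attempt wakes up after a retry has started, its `append` returns `False` and nothing it does reaches the journal.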

[1] https://engineering.linkedin.com/distributed-systems/log-wha...



Table/log duality goes back further than Kleppmann though. An earlier article that really influenced me was

https://engineering.linkedin.com/distributed-systems/log-wha...


Martin Kleppmann was also directly involved with Bluesky as a consultant.


Pkl was one of the best internal tools at Apple, and it’s so good to see it finally getting open sourced.

My team migrated several kloc of k8s configuration to pkl with great success. Internally we used to write alert definitions in pkl and it would generate configuration for 2 different monitoring tools, a pretty static documentation site and link it all together nicely.

Would gladly recommend this to anyone and I’m excited to be able to use this again.


Was about to ask if you had k8s api models available internally, and that someone should create some tool to generate that from the spec. But turns out it already exists in the open!

https://github.com/apple/pkl-k8s-examples


Coming from yaml+kustomize, all those curly braces are a tough sell. It looks like they roughly double the number of lines in the file.


While I learned to accept YAML, it messes up editor usage.

It is so sensitive that basic text editing like copy and paste, tab, in/decreasing indent never quite do what I expect in IntelliJ.

I paste parts of yaml into another yaml and it ends up somewhere unpredictable.


Curly braces are great at ensuring correctness though, as goto fail; has shown.


Why yes, I would like to see more of those in my k8s, so glad we finally have the technology

https://github.com/apple/pkl-k8s-examples/blob/96ba7d415a85c...


This is what I wanted Apple News to be.

I wish it would give me a good curated news feed from dozens of sources, and adapt based on feedback. I badly wanted to love it, but no matter how much I tried it ended up looking something like a mix of Buzzfeed and Murdoch propaganda.

Happy to see the idea is not dead and new companies are giving it a shot.


Knowing Apple, it's probably illegal for those 2 teams to talk to each other.


Great post! I find it interesting that all root servers are located in the United States. Should some of them be elsewhere for redundancy?


They're definitely not, that's just what their IP lookup would have you believe. There are in fact tons of root servers located all over the world leveraging anycast.

https://www.cloudflare.com/learning/dns/what-is-anycast-dns/

https://labs.ripe.net/author/emileaben/dns-root-server-trans...


There is a solarized version as well which I've used a few times in the past. https://thomasf.github.io/solarized-css


ASCII is English and limiting access to knowledge for the rest of humanity for a simpler encoding is just not an acceptable option. Someone needs to interpret those 7k words and write a (complicated?) program once so that billions can read in their own language? Sounds like an easy win to me.


counterpoint:

A complicated program is never an easy win, and English is already spoken in every country in the world.


Sure, spoken, but both Arabic and CJK ideograms are written in far more countries in the world, with far more people, and for far longer in history than the ASCII set. The oldest surviving great works of Mathematics were written in Arabic and some of the oldest surviving great works of Poetry were written in Chinese, as just two easy and obvious examples of things worth preserving in "plain text".


So your argument is... it's easier to teach billions of people fluent English... than for software to support UTF-8?

You are aware that a majority of the world's population speaks no English whatsoever?


Playing the devil's advocate here. I am not a native English speaker, I'm a French speaker, but I'm happy that English is kind of the default international language. It's a relatively simple language. I actually make fewer grammar mistakes in English than I do in my native language. I suppose it's probably not a politically correct thing to say, the English are the colonists, the invaders, the oppressors, but eh, maybe it's also kind of a nice thing for world peace, if there is one relatively simple language that's accessible to everyone?

Go ahead and make nice libraries that support Unicode effectively, but I think it's fair game, for a small software development shop (or a one-person programming project), to support ASCII only for some basic software projects. Things are of course different when you're talking about governments providing essential services, etc.


English isn't even ASCII anyway.

Some loanwords like façade or café retain their accents.

Units like ° £ € and symbols like © ® × ÷ ½ aren't ASCII.

It doesn't take much to need one of these cases in a project.


I know almost no one who actually types the accented e, let alone the c with the cedilla. I scarcely ever see the degree symbol typed. Rather, I see facade, cafe, and "degrees".

That aside, the big problem with Unicode is not those characters; they're a simple two-byte extension. They obey the simple bijective mapping of binary character <-> character on screen. Unicode doesn't. You have to deal with multiple code points representing one on-screen grapheme, which in turn may or may not translate into a single on-screen glyph. Also bi-directional text, or even vertical text (see the recent post about Mongolian script). Unicode is still probably one of the better solutions possible, but there's a reason you don't see it everywhere: it means not just updating to wide chars but having to deal with a text shaper, redo your interfaces, and tons of other messy stuff. It's very easy for most people to look at that and ask why they'd bother if only a tiny percentage of users use, say, vertical text.
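A small Python illustration of the "multiple code points, one grapheme" point, using only the standard library's `unicodedata` module. The variable names are just for this example. The same visible word "café" can be stored with a precomposed é or with a plain e followed by a combining accent, and the two strings differ in length and compare unequal until normalized.

```python
import unicodedata

# "café" with é as a single precomposed code point (U+00E9).
composed = "caf\u00e9"

# "café" with é as 'e' followed by a combining acute accent (U+0301).
decomposed = "cafe\u0301"

# Both render as the same four graphemes, but the code point counts differ.
composed_len = len(composed)        # 4 code points
decomposed_len = len(decomposed)    # 5 code points

# A naive comparison treats them as different strings.
naive_equal = composed == decomposed

# Normalizing both to the same form (NFC here) makes them comparable.
normalized_equal = (
    unicodedata.normalize("NFC", composed)
    == unicodedata.normalize("NFC", decomposed)
)

# Byte lengths in UTF-8 differ as well.
composed_bytes = len(composed.encode("utf-8"))      # 5 bytes
decomposed_bytes = len(decomposed.encode("utf-8"))  # 6 bytes
```

Normalization only handles equivalence between sequences; splitting text into user-perceived characters, shaping, and bidirectional layout need more than the standard library offers.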


The first point is just because of the keys on a keyboard.

I see many uses of "pounds" or "GBP" on HN. Anyone with the symbol on the keyboard (British and Irish obviously, plus several other European countries) types £. When people use a phone keyboard, and a long-press or symbol view shows $, £ and €, they can choose £.

Danish people use ½ and § (and £). These keys are labelled on the standard Danish Windows keyboard.

There's plenty of scope for implementing enough Unicode to support most Latin-like languages without going as far as supporting vertical or RTL text.


For some reason people seem to think that the only options are UTF-8 and ASCII. That choice never existed. There are thousands upon thousands of character encodings in use. Before Unicode every single writing system had its own character encoding that is incompatible with everything else.
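To make the incompatibility concrete, here is a short Python sketch using codecs that ship with the standard library. The variable names are only for this example. The same single byte means a different character depending on which legacy encoding the reader assumes.

```python
# One byte, as it might sit in a pre-Unicode text file.
raw = b"\xe9"

# Interpreted as Latin-1 (Western European), it is a Latin letter.
as_latin1 = raw.decode("latin-1")

# Interpreted as Windows-1251 (Cyrillic), it is a Cyrillic letter.
as_cp1251 = raw.decode("cp1251")

# Without out-of-band knowledge of the encoding, the byte is ambiguous.
ambiguous = as_latin1 != as_cp1251

# With Unicode, each character has its own code point, and UTF-8 is just
# one way to serialize those code points to bytes.
utf8_bytes = as_latin1.encode("utf-8")
```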


You didn't say spoken by every person. Merely spoken in every country. Even the existence of tourists in a country would pass this incredibly low bar...


Does Asian include Indians as well? Kind of amusing since Apple and Google are both led by "Asian" immigrant men.


India and China each have a billion people but the article can't find a politically correct way to distinguish between them.

Or a more serious explanation is that there is little data here because surveys lump "asians" together.


*Microsoft and Google

