How does the author know their website has hundreds of subscribers? AFAIK it is not possible to identify subscribers to RSS feeds, and counting hits won't help. Am I missing something here?
Hi! Some feed aggregators include the subscriber count in the User-Agent header, so I can pick these counts from the access logs and add them up. This is how the logs look:
Picking a few days of logs where the subscriber count has not changed much, I get a rough estimate of the total count of subscribers reported by the feed readers like this:
In case anyone is wondering why we see multiple entries for Feedly and Feedbin in the first log snippet, that's because in an older design of my website, I had multiple sections each serving its own feed at paths like /blog/feed.xml, /maze/feed.xml, etc. Later I consolidated all of them into a unified feed at /feed.xml. So the feed readers still hit the old feed URLs and then get redirected to the unified feed URL.
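Roughly, the adding-up step could look like this in Python. This is a minimal sketch, not the exact script, and it assumes nginx combined-format logs where the User-Agent is the last quoted field; the file path is illustrative:

```python
# Minimal sketch: sum the per-reader subscriber counts reported in User-Agents.
# Assumes nginx "combined" format logs; "access.log" is an illustrative path.
import re

COUNT_RE = re.compile(r"(\d+) subscribers")

# reader (User-Agent with the changing number stripped) -> last reported count;
# readers like Feedbin include a feed-id in the UA, so separate feeds stay separate
latest = {}
with open("access.log") as log:
    for line in log:
        try:
            user_agent = line.rsplit('"', 2)[-2]  # last quoted field in combined format
        except IndexError:
            continue  # skip malformed lines
        match = COUNT_RE.search(user_agent)
        if match:
            # "Feedbin feed-id:2688376 - 9 subscribers" and the same UA reporting
            # 10 subscribers should collapse into a single reader entry
            reader = COUNT_RE.sub("subscribers", user_agent)
            latest[reader] = int(match.group(1))

print(sum(latest.values()))
```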
You can get a rough estimate based on unique IPs hitting the RSS feed. Moreover, some of the online feed readers report the number of subscribers of your feed as part of their User-Agent. An example from my blog logs: `"Feedbin feed-id:2688376 - 9 subscribers"`
In my own logs, the ones that show up are Feedly, Inoreader, Newsblur, Feedbin, The Old Reader, and a few small/personal ones.
Of course, they only report the subscriber count for their own platform. And then you can also pool together all the separate requests fetching /feed/ and add them all up.
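A minimal sketch of that unique-IP estimate, assuming the same nginx combined log format (the file path and feed URL are placeholders):

```python
# Count distinct client IPs fetching the feed; a floor rather than an exact
# figure, since readers behind aggregators or CDNs collapse into a few IPs.
ips = set()
with open("access.log") as log:            # illustrative path
    for line in log:
        if '"GET /feed/' in line:          # adjust to your feed URL(s)
            ips.add(line.split()[0])       # first field = client IP
print(len(ips))
```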
> Whatever eventually supplants Postgres is quite likely going to be based on Arrow - polyglot zero-copy vector processing is the future.
Can you elaborate on this? I understand it's a very opinionated statement, but I still don't see how "polyglot" and "vector processing" could be considered the future of OLTP and general-purpose DBMSs.
Polyglot means not having to fight with marshaling overheads when integrating bespoke compute functions into SQL, or when producing input to / consuming the output from queries. This could radically change the way in which non-expert people construct complex queries and efficiently push more logic into the database layer, and open the door to bypassing SQL as the main interface to the DBMS altogether.
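To make the zero-copy part concrete, here's a toy PyArrow sketch (nothing Postgres-specific, and the file name is made up): the columns written below can be memory-mapped and consumed as-is by pandas, polars, DuckDB, or a Rust/Java process, without re-serializing anything row by row.

```python
# Toy illustration of Arrow's zero-copy promise; "scores.arrow" is made up.
import pyarrow as pa
import pyarrow.ipc as ipc

table = pa.table({"id": list(range(5)), "score": [0.1, 0.2, 0.3, 0.4, 0.5]})

# Arrow IPC: the on-disk/on-the-wire layout is the in-memory layout,
# so writing is mostly a matter of dumping the column buffers.
with pa.OSFile("scores.arrow", "wb") as sink:
    with ipc.new_file(sink, table.schema) as writer:
        writer.write_table(table)

# A reader (in any Arrow-capable language) can memory-map the file and use
# the columns directly instead of deserializing row by row.
with pa.memory_map("scores.arrow", "rb") as source:
    loaded = ipc.open_file(source).read_all()
print(loaded.column("score"))
```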
Vector processing means improved mechanical sympathy. Even for OLTP the row-at-a-time execution model of Postgres is leaving a decent chunk of performance on the table because it doesn't align with how CPU & memory architectures have evolved.
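A loose analogy in Python/NumPy: the interpreted loop exaggerates the gap, since most of its cost is interpreter overhead rather than the CPU and cache effects that matter inside a database, but the shape of the problem is the same, per-row dispatch versus one tight pass over contiguous memory.

```python
# Loose analogy: per-row processing vs. one vectorized pass over a column.
import time
import numpy as np

column = np.random.rand(5_000_000)

start = time.perf_counter()
total = 0.0
for value in column:        # "row at a time": one dispatch per value
    total += value
row_time = time.perf_counter() - start

start = time.perf_counter()
vec_total = column.sum()    # one tight loop over contiguous memory
vec_time = time.perf_counter() - start

print(f"row-at-a-time: {row_time:.2f}s, vectorized: {vec_time:.4f}s")
```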
Honestly, I can't envision a near future where SQL is not the main interface. Happy to see the future proving me wrong here though!
While I can buy the arguments about how having a better data structure to communicate between processes (on the same server) could help, it's a bit difficult to wrap my mind around how Arrow will help in distributed systems (compared to any other performant data structure). Do you have any resources to understand the value proposition in that area?
Same for vector processing: it would be great to read a bit more about optimizations that would help improve Postgres, leaving out pure analytical use cases.
> it's a bit difficult to wrap my mind around how Arrow will help in distributed systems
Comparing it with the role of Protobuf is perhaps easiest; there's a good FAQ entry [0] which concludes: "Arrow and Protobuf complement each other well. For example, Arrow Flight uses gRPC and Protobuf to serialize its commands, while data is serialized using the binary Arrow IPC protocol".
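A hedged sketch of how that division of labour looks from a client's point of view (the endpoint and ticket name are made up): gRPC/Protobuf carry the command, the Arrow IPC stream carries the data.

```python
# Sketch of an Arrow Flight fetch; the endpoint and ticket name are made up.
import pyarrow.flight as flight

client = flight.connect("grpc://localhost:8815")
# The DoGet command itself travels as Protobuf over gRPC...
reader = client.do_get(flight.Ticket(b"my_dataset"))
# ...but the result streams back as Arrow record batches, not serialized rows.
table = reader.read_all()
print(table.num_rows)
```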
This will be increasingly significant due to the hardware trends in network & memory (and ultimately storage too) compared with CPUs. I posted about that in a comment a few days ago [1], but it's worth sharing again:
> here’s a chart comparing the throughputs of typical memory, I/O and networking technologies used in servers in 2020 against those technologies in 2023
> Everything got faster, but the relative ratios also completely flipped
> memory located remotely across a network link can now be accessed with no penalty in throughput
I am no expert on Postgres, but the thread seems to suggest the default out-of-the-box JIT is actually more efficient than the custom vectorized executor that was built for the PoC. That probably rules out any low-hanging optimizations based purely on vectorization for OLTP specifically, but there are undoubtedly many wider ideas that could in principle be adopted to bring OLTP performance in line with a state-of-the-art research database like Umbra (memory-first design, low-latency query compilation, adaptive execution, etc.). As usual with databases, though, if the cost estimation is off and your query plan sucks, then worrying about fine-tuning the peak performance is ~irrelevant.
The idea is nice, but it requires trusting their users a lot. How do they prevent users from setting a super high "linker_workers" value that gets consumed by every agent? This could open the door for malicious users to saturate the entire system...
_peregrine_ is right, we do have control over the linker_workers value for every agent. Agents are not exposed to our users; everything is transparent to them.
As a user, you just need to create a new connection to your Kafka cluster providing your credentials, and then choose the topic you want to ingest into Tinybird. We take care of everything for you; we balance the workers using the approach explained in the blog post.
We can also fine-tune it for specific users, or for relevant events such as Black Friday.
It's a bit different for Enterprise customers; in those cases we can set up dedicated agents and fine-tune linker_workers and other parameters to optimize for their use case.
If you have any further questions you can reach us on our public Slack channel or via email.
You bring up a topic I have been concerned about myself but never managed to articulate. I usually perform quite well at my job and easily get "special" attention and recognition, which is good. At jobs, I tend to start out motivated just by the work itself, but at some point, after a few victories, recognitions, salary increases, or promotions, I find myself being that guy seeking attention and recognition, and I start feeling demotivated if I'm not getting it.
Perhaps that's the big thing to solve. Being attention/recognition dependent doesn't look like being a good professional.
You were supposed to show some form of anger, even a slight one, if I was right. You say "doesn't look like being a good professional" (good boy); who are you trying to please? Are your parents proud of you?
Simply follow things along those lines. Pay attention to your vocabulary: __all__, special, good professional, "I've been doing everything right", etc. That's how you do basic psychology. Your subconscious is talking, just listen :) Hope that helps.