It's the kind of software where early venture capital funding can be poisonous rather than helpful. Choosing the right abstractions to build a solid, flexible platform requires a lot of user feedback, and you don't get that unless you have *paying* customers who have run their critical processes on your product for a while. You need to build, sell, implement, and ship several upgrades. We bootstrapped a low-code BPM platform (pyrus.com) and were lucky to break even after several years. VCs push you to grow, while premature scaling can be harmful in the long run. It takes time before your platform is mature enough to serve inherently different use cases.
I use pivot tables all the time. The concept is brilliant, but the Excel UI leaves a lot to be desired.
At first, you're amazed at the flexibility, but once you become comfortable, you suddenly hit the limitations: you can't sort by a calculated column, you can't categorize without adding columns to the source data, etc.
I looked at Quantrix for a while, and it was a bit too complex for practical purposes. I wonder if there are any decent PivotTable tools out there?
We have seriously looked at FoundationDB to replace our SQL-based storage for distributed writes. We decided not to proceed unless we are about to outgrow our existing deployment, a standard leader-follower setup on off-the-shelf hardware. The limiting factor for the latter is the number of NVMe drives we can put into a single machine. That gives us a couple dozen TB of structured data (we don't store blobs in the database) before we have to worry.
fdb is best when your workload is pretty well defined and will stay that way for a decade or so. That's usually not the case for new products, which evolve fast. The two most famous fdb installations are iTunes and Snowflake metadata. When you rewrite a petabyte-scale database on fdb, you transform continuous SRE/devops opex costs into a developer capex investment. It also comes with a reduced risk of occasional data loss. For me it's mostly a financial decision, not really a technical one.
> transform continuous SRE/devops opex costs into a developer capex investment
Would you mind expanding/educating me on this point? When I think of capex I think of “purchasing a thing that’s depreciated over a time window”. If you’d said “transform SRE/COGS costs into developer/R&D/opex costs” I would’ve understood, but eventually the thing leaves development and goes back into COGS.
Basically the SREs don't have anything to do with fdb for the most part. You add a node, quiesce a node, delete a node. Otherwise it's self-balancing and trouble-free from an SRE pov.
See my other message for the developer issues, though. IMHO fdb as it is today is too hard for most developers if their use case is anything beyond simple Redis-style keys.
- developer time is approximately fungible with money
- project delivery is building a thing that you own, that has value, and that you will use to produce other value...
- ...which can therefore be entered on the balance sheet.
I've just left a company a little after it floated. In the run-up to the float, we were directed to maximise the hours we logged as capital work. That meant any kind of feature delivery; bugfixes were opex.
I believe this was done to grow the balance sheet and maximise market cap.
I assume a couple of things here: 1) that SRE costs would be lower with fdb at scale because it handles outages itself, e.g. auto-resharding; and 2) that a migration project from *sql to fdb would be finite (hence an investment I hastily called capex).
Would love to hear from anyone with experience in fdb whether these assumptions hold.
We'd have used the Record Layer, but it was Java-only at the time. It would have required us either to rewrite parts of our backend in Java or to implement some wrappers.
The article says nothing about how they instantaneously updated millions of user feeds. That was the most challenging task, as it's way easier to scale reads than writes in distributed systems. Rumor has it early Twitter had a target of 5 seconds to update every one of the 50M fan feeds when Justin Bieber touched a screen. I would love to hear some technical details on how they did it.
I remember reading a case study from their engineering blog about this a few years ago - I couldn't quickly find it, but maybe it's still out there. It was something about optimizing for read speed, because one write from a celebrity would cause thousands or millions of reads.
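To make the tradeoff concrete, here's a minimal sketch with hypothetical table names (not what Twitter actually ran): fan-out-on-write copies each new post into every follower's materialized feed, which turns one celebrity post into millions of row writes, while fan-out-on-read keeps the write to a single row and pays the cost at query time instead.

```sql
-- Fan-out-on-write: copy the new post into every follower's materialized feed.
-- For an author with 50M followers this single statement becomes ~50M row writes.
INSERT INTO feed_entries (user_id, post_id, created_at)
SELECT f.follower_id, :post_id, :created_at
FROM followers AS f
WHERE f.followee_id = :author_id;

-- Fan-out-on-read: write one row per post and assemble the feed at read time.
SELECT p.post_id, p.created_at
FROM posts AS p
JOIN followers AS f ON f.followee_id = p.author_id
WHERE f.follower_id = :viewer_id
ORDER BY p.created_at DESC
LIMIT 50;
```

Reportedly, large feed systems end up with a hybrid: push writes for ordinary accounts and merge high-follower authors in at read time.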
PRQL is a breath of fresh air. Reporting languages generally lack built-in visualization and drill-down capabilities. An ideal reporting query should define not only how to select, join, and aggregate data, but also how to visualize the output and how to present details in response to user clicks. There are some limited efforts, like in Power BI and Splunk, but we need a standard. I wonder if the PRQL folks will address this need in the future.
Double-entry bookkeeping is essentially the law of conservation of energy applied to balance sheets: every transaction debits one account and credits another by the same amount, so the books always balance. It's all the more profound given that it was invented some five centuries before programming.
The sad truth is that in disputes Airbnb is usually on the host's side, not the guest's. It looks like, in their business model, sellers have far more leverage over the marketplace than buyers.
Both CTEs and this idea address the same problem: poor readability of complex SQL queries. Compared to CTEs, the author takes the idea of splitting a complex query into parts to the next level.
To your point: a solid IDE will show you what's being processed at each line (or returned, if the cursor is on the last line) in an autocomplete window or a side panel.
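For anyone who hasn't used them, here's a minimal sketch (hypothetical table names) of how a CTE splits one complex query into named steps that read top-down:

```sql
-- Each WITH block names an intermediate result; the final SELECT reads like a pipeline.
WITH recent_orders AS (
    SELECT customer_id, total
    FROM orders
    WHERE created_at >= DATE '2024-01-01'
),
customer_revenue AS (
    SELECT customer_id, SUM(total) AS revenue
    FROM recent_orders
    GROUP BY customer_id
)
SELECT c.name, r.revenue
FROM customer_revenue AS r
JOIN customers AS c ON c.id = r.customer_id
ORDER BY r.revenue DESC;
```

The approach discussed in the article pushes this decomposition further, but the readability goal is the same.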
Luckily, it was a one-time effort that took less than 1% of our annual development resources that year. After we walked away from large arrays allocated on the LOH (Large Object Heap), there were no issues despite further traffic growth.