Optimizing the Lichess Tablebase Server (lichess.org)
246 points by cristoperb 5 months ago | 55 comments



Lichess is one of those things you just have to sit and appreciate like a fine wine. It's absolutely wonderful for people in the chess community. I use it every day and am inspired by the functionality and performance, especially knowing it's a 1-2 person shop with limited budget.


You forgot to mention that it's free, open source, and doesn't ask for your money and never will, yet a lot of people donate. Their expenses are public. It's also available as an app!


I wish more open source end-user software learned from Lichess, in terms of how user friendly, well designed and well maintained it is.


Me too. Recently the new beta mobile app is even cleaner and has haptic feedback which is so cool.


I gotta be honest: I aspire to create something as valuable and as cool as lichess one day.


> here are the empirical distribution functions (ECDFs) with 30ms added to each response time

> The added constant seems artificial, but it's just viewing the results from the point of view of a client with 30ms ping time. Otherwise the log scaled x-axis would overemphasize the importance of a few milliseconds at the low end.

I thought this was interesting - maybe it's a standard practice I was just unaware of but it seems like a smart trick.
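To see why the offset matters, here is a minimal sketch of the trick quoted above. The response times are made up for illustration (they are not Lichess measurements) and the helper names `ecdf` and `log_width` are hypothetical. Only the 30 ms constant comes from the article.

```python
import math

def ecdf(samples):
    """Return sorted values and cumulative fractions for an empirical CDF."""
    xs = sorted(samples)
    n = len(xs)
    return xs, [(i + 1) / n for i in range(n)]

def log_width(lo, hi):
    """Horizontal distance between two values on a log10 axis."""
    return math.log10(hi) - math.log10(lo)

PING_MS = 30  # constant from the quoted article: a client with 30 ms ping

# Hypothetical server-side response times in milliseconds (not Lichess data).
server_ms = [1, 2, 4, 50, 400, 3000]

raw_x, raw_y = ecdf(server_ms)
client_x, client_y = ecdf([t + PING_MS for t in server_ms])

# Without the offset, 1 ms -> 4 ms takes as much axis as 750 ms -> 3000 ms.
raw_fast = log_width(1, 4)
raw_slow = log_width(750, 3000)

# With the offset, the same 3 ms difference shrinks to a sliver of the axis.
client_fast = log_width(1 + PING_MS, 4 + PING_MS)
client_slow = log_width(400 + PING_MS, 3000 + PING_MS)

print(f"raw axis:    fast={raw_fast:.3f} slow={raw_slow:.3f}")
print(f"client axis: fast={client_fast:.3f} slow={client_slow:.3f}")
```

Plotting `client_x` against `client_y` on a log-scaled x-axis then gives the client's view, where a few milliseconds at the low end no longer dominate the picture.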


Did they have to reduce cost or is there any other reason to not stick 20TB of SSDs in a box and call it a day? 4TB SSDs only cost ~$300, even HP or Dell SFF drives aren't much more expensive.

I guess they were interested in doing the testing and optimization for fun. From a product standpoint I probably would have invested my limited time in other projects.


Lichess is a non-profit with a lot of volunteers, they probably don't have the same time vs hardware cost balance as most for-profit companies do


It is important not to automatically assume that all non-profits are impoverished and run by volunteers.

One of the most famous examples is Wikipedia.

Technically yes, they are a non-profit. Impoverished? Certainly not!

Look at the financials, as others have already pointed out. Especially if you are in the habit of donating to non-profits, the financials can make for interesting reading.


If you look at lichess financials they currently have two full time employees - in this case it's not a bad assumption. Wikipedia has significantly more users and does fundraisers


Lichess is a non-profit. It is run entirely on donations and volunteering. It has only 1 employee, the dude who founded the non-profit, and it seems he takes far less money than he could make from any other job based on how talented he is.

Also the organization is based in France. I don’t know what impact that has on costs but it’s worth mentioning.


We're up to 2 employees now! The founder and a mobile dev.

The impact on costs is "not small", because as a rough estimate, the charity pays overall about twice what the dev gets in take-home money, because French employer taxes are high (keyword for the Frenchies reading us: URSSAF).

Source: am President of the Lichess charity and have the honour and pleasure of dealing with most of the French administrative paperwork.


I had no idea that was the case, that's incredibly impressive!


They managed to reduce max response times by an order of magnitude. If this project took a week (even two) and some users went from 15s response times to 1.5s response times, only projects where the user experience is even worse or where you work for a for-profit organization where there's money to be made elsewhere (and you admit you don't really care about customer pain) would be a better justification of time.


>testing and optimization for fun

In no other industry would an engineer think like that... except in IT.

We definitely have too powerful and cheap Hardware, combined with lazy Wetware who just wants to "call it a day"....be proud of your work....or so they say.


Not calling it a day anywhere is why Lichess is such a good website.


You think engineers in other industries won't sometimes choose the more exciting option when a boring but well-understood one would do the trick? That's definitely not true in (at least) mechanical and electrical engineering from what I've seen. From people spending millions trying to have the entire factory operated by robots so they could save 100k on humans to engineers specifying friction stir welders for the most basic of welding jobs, overengineering of parts that would make the people at Juicero blush, etc etc etc.

I have no idea why software people think their industry is the only one where people cut corners. Some form of meta-imposter syndrome perhaps.


>From people spending millions trying to have the entire factory operated by robots so they could save 100k on humans to engineers specifying friction stir welders for the most basic of welding jobs

Look, I come from that industry (metalworking). If you do friction stir where it's not needed you should be kicked out of your job, but I've never heard of such a thing in reality. Don't tell me you're buying another friction stir CNC to save 100k "on people": friction stir is slow and expensive, and any robot can do normal welding faster.

Yes people are expensive, but un-optimised work is even more expensive (on factory level), NO ONE in the metal industry would do something like this if it was not necessary (well except the defence sector, because those guys are crazy and have unlimited money).

I call your made up story complete BS.


Most things in life are a compromise and it's easy to get tempted to find the perfect solution instead of spending your time on actually moving forward.

In all industries there is always something you can do better if only you spend more time. But at most places time is worth money and I'd say $3000 for a few SSDs is little enough to not make this worth my time.


[flagged]


Please keep it civil. There's no need to attack the parent, take that kind of crap elsewhere.


> From a product standpoint

Makes sense from that perspective, but Lichess is not run as a for-profit company with a product, it's run as a non-profit organization (which it is), so a perspective shift is needed to understand their decisions :)


Take a look at their financials and $1500 for SSDs would not be out of place.

They have yearly expenses of more than $500,000

https://docs.google.com/spreadsheets/d/1Si3PMUJGR9KrpE5lngSk...

Seems really weird to be using hard drives when they already have expenses like that.


As mentioned elsewhere, we're renting most of our infra from OVH, and paying, monthly, for 40TB of SSDs or NVMes would simply explode our yearly budget.

Source: am president of the Lichess charity (and also one of the sysadmins)


Looks like rented stuff to me; you can't just add drives...

And while 500k is a lot, maybe they can do so much with it because they do not just throw $1500 in drives at every problem.


The reason is buried in another article

"WDL tables (.rtbw) store the outcome of positions, e.g. if a position is winning. An engine will use this very frequently to decide which endgames to aim for. WDL tables should be stored on the fastest disk (preferably SSD) you have." "DTZ tables (.rtbz) tell the engine how to finish the endgame once it is on the board. They are optional, but required to reliably convert complicated endings."

Seems reasonable to put the WDL table on the SSD for better engine performance. I do understand not choosing SSDs. The number of lookups for positions always remains the same per user per game. Yet the tablebase is growing more than exponentially.

https://lichess.org/@/lichess/blog/7-piece-syzygy-tablebases...
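The placement advice in that quote can be sketched as a rule keyed on file extension. This is only an illustration of the quoted recommendation, not a description of how the Lichess server lays out its disks. The mount points and the `placement` helper are hypothetical.

```python
from pathlib import PurePosixPath

# Hypothetical mount points; the thread does not describe the real layout.
FAST_DISK = PurePosixPath("/mnt/ssd/tables")
SLOW_DISK = PurePosixPath("/mnt/hdd/tables")

def placement(table_name):
    """Pick a directory for a Syzygy table file based on its extension."""
    suffix = PurePosixPath(table_name).suffix
    if suffix == ".rtbw":  # WDL: probed very frequently by the engine
        return FAST_DISK / table_name
    if suffix == ".rtbz":  # DTZ: needed once the endgame is on the board
        return SLOW_DISK / table_name
    raise ValueError(f"not a Syzygy table file: {table_name}")

print(placement("KQvK.rtbw"))
print(placement("KQvK.rtbz"))
```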


Why scale up when you can optimise? I'm probably going to be downvoted for this, but imo this is really the mindset that leads to bloated software.


Agreed.

This is the implicit assertion that developer time is more expensive than hardware costs.

Seems true in the short term, until the whole system crumbles.


Some questionable choices are made in this optimization.

The reason for the optimization is that there is so much IO activity the RAID checks can't complete.

It is unclear from the article whether the RAID checks ever completed on the 17TiB of data. Instead, they chose to disable the periodic RAID checks and switch to checking for errors as each page of data is read in. The two are not equivalent, and both should be used for important data.

Finding corrupt data only as you try to read it can lead to long running data corruptions, maybe to the point your backups do not go back far enough to restore the uncorrupted data. Underpinning this also is a change to RAID 0... While the fastest option, they are putting a lot of faith in that NVMe config handling that kind of workload.

Hope they have good backups...

EDIT: A good way to solve this is to spin up a temporary server, restore your backups to it, do the full data checks and when successful, you have also checked your backup and restore process along with the integrity of the file. You still want to have enough overhead available to complete the RAID checks on the primary server and don't use RAID 0 for performance.


They are indeed not equivalent, but for our use case this is sufficient: if we detect data corruption we can just throw away the files and download/regenerate them (this is a freely available dataset, if a bit large; https://en.wikipedia.org/wiki/Endgame_tablebase will explain it better than me). For this reason, it is also not backed up.
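The "detect corruption, then throw away and re-download" idea can be sketched at the file level. Note that this swaps in whole-file SHA-256 checks as a stand-in; the thread describes checking as each page is read, which is a different mechanism. The function names and the demo files below are hypothetical.

```python
import hashlib
import os
import tempfile

def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large tables need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def files_to_refetch(expected):
    """Return paths that are missing or whose hash differs from the expected one."""
    bad = []
    for path, want in expected.items():
        if not os.path.exists(path) or sha256_of(path) != want:
            bad.append(path)
    return bad

# Demo with throwaway files standing in for table files.
with tempfile.TemporaryDirectory() as d:
    good = os.path.join(d, "good.bin")
    corrupt = os.path.join(d, "corrupt.bin")
    for p in (good, corrupt):
        with open(p, "wb") as f:
            f.write(b"table data")
    expected = {p: sha256_of(p) for p in (good, corrupt)}
    with open(corrupt, "wb") as f:
        f.write(b"table dat\x00")  # simulate silent corruption
    refetch = files_to_refetch(expected)
    print(refetch == [corrupt])
```

Because the dataset can be regenerated or downloaded again, a flagged file only costs a re-fetch, which is the reason given above for not keeping backups.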


There is also lishogi but it is small enough to not require such optimizations yet.

Shogi is the most entertaining of the chess variants. Xiangqi not as much.


A lichess is a female lich I'm assuming? (It's like baron / baroness)


Noble titles are a poor comparison since they're the rare example where there actually is an exclusively-male root form. For most words the root form is neuter, and both male-only (if it exists) and female-only forms require an affix.

Properly, a male lich is "werlich" and a female lich is "wiflich" (unlike other words the /f/ sound is not likely to disappear); the plurals add "-en". But generally sex is irrelevant for undead{cn} so the neuter form by far predominates.

"lichess" is an abominable mixture of German and French roots ... so naturally it is indistinguishable from the rest of English.


Note: "chess" is not a Germanic word (it derives, via Arabic, from the Persian شاه (shah), meaning king). Ironically enough, it comes to English via the Old French eschés, meaning that "lichess" is arguably made from entirely French roots.


Hm, I guess the "libre" is French, but "live", "light", and most importantly "lich" are all German.

If we look for relatives of "libre", they include "leed"(song) and the first half of Leopold (adding "bold") and Luther (adding "army"). The common meaning is "people".


It's "Libre" chess, as in "Free (and open source)" chess


I know it's not a fair comparison but I'm truly impressed by the quality of engineering shown by the Lichess team, when their main competitor was for example boasting about a migration to GCP and yet suffering from repeated outages due to fairly organic growth in popularity. While I believe they employ 100x more people.

Lichess' mobile app was a weak spot, however the v2 rewrite in Flutter is already pretty good while still in beta.

And keep in mind Thibault pays himself less than 60k/year.


I don't think he needs to feel bad about increasing his salary. Make it 200k/yr and make his life easier, which can only be good for the project long term.


I don't know him personally but from the talks he's given, he seems to be ideological about Lichess and his own lifestyle, in a way that would be considered fairly anti-capitalistic by most of the HN crowd :)


Do you have links to any of these talks you could recommend?


Not OP but I can recommend this talk by Thibault (the founder): https://www.youtube.com/watch?v=LZgyVadkgmI


IDK about France (where Thibault is from, and IDK if he lives there), but where I'm from, you would have a very comfortable life earning 5k every month, so his self-imposed 60k/yr salary doesn't seem unreasonable at all. At some point, more money yields diminishing returns.


> but where I'm from, you would have a very comfortable life earning 5k every month, so his self-imposed 60k/yr salary doesn't seem unreasonable at all.

(Some) HN commenters seem weirdly out of touch when it comes to salary outside of IT-heavy cities in the US. The other day someone claimed $125k/year for an employee wasn't "big money" (https://news.ycombinator.com/item?id=40927175), so I'd take any comments saying some salary is high/low with a box filled with sand.


To be fair that really isn't 'big money' in most of those cities, assuming big money has some connotation of significantly above average after tax and expenses disposable income in those areas, especially relative to your peers. I don't think it would be unfair to say that would be big money compared to many European workers in the same jobs though.


I don't know if that 5K is before or after taxes. You easily lose half of what your employer actually pays.


€60k pre-tax is roughly in the top 10% of incomes in the country based on a quick google. Not opulent, but definitely comfortable.


His salary is more like €55k though.

It's comfortable outside of Paris and other expensive cities. But he could easily double that given his background. Before quitting his job he already worked with Play and the Typesafe (now Lightbend) stack before the peak of its hype, when companies were paying top dollar for consultants.


Lichess is a great service to casual chess players like myself to get a quick game against another human. Never much of a wait.

What I do want to know is how does one pronounce Lichess? Lie chess? Le chess? League chess?


According to https://lichess.org/faq#name: "Lichess is a combination of live/light/libre and chess. It is pronounced lee-chess"

They also link this video: https://www.youtube.com/watch?v=KRpPqcrdE-o


I guess it's because of the lychee fruit?


Thanks.


/li:/ as in libre.


I’m team lie-chess.


Lichess is a great example of how efficient Wikipedia should have been (both on the code and organization level). :-)


I think you're highly overestimating how many devs Chess.com has


I am not, that's why I said employees not devs.



