hawaiianSpork's comments

Parquet has been the lakehouse file format of choice for nearly half a decade. But we are starting to see other contenders that are optimized for lower latency, like Lance: https://github.com/lancedb/lance


5 years is not a super long time. It just can feel that way sometimes.


I looked for the York Abstract Machine used in the paper and couldn't find anything about it outside of this paper: https://www.cs.york.ac.uk/plasma/publications/pdf/ManningPlu...

It would be nice to play with the code.


I wonder if exposing this as a language server would be helpful?


Yes, I've been considering that since the beginning. Website (backend in C++, frontend in Svelte) takes priority, because this solution is good enough so far, and I'd really like to have access to my ZK on my phone. Probably not a website meant for the open internet: I have a server at home + use Wireguard VPN, so my phone can connect to local services/sites at all times.


If you are looking to do data validation from the JVM, you may try Baleen (written in Kotlin): https://github.com/ShopRunner/baleen/

I'm one of the contributors. We created a DSL in the language to describe the data and create tests. You can then use that data description to validate JSON, CSV, Avro... One of the neat things we came up with was the concept of a data trace, which is like a stack trace except that it records the path through the data to a particular error.
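
To make the data trace idea concrete, here is a rough sketch in plain Python. It is not Baleen's API or its DSL: the `validate` function, the `ValidationError` class, and the dict/list/type schema convention are all hypothetical names made up for illustration. It only shows the general shape of the idea, where each error carries the path through the data that led to it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ValidationError:
    # Hypothetical error type: `trace` is the path through the data,
    # e.g. ("dogs", 1, "age"), analogous to frames in a stack trace.
    trace: tuple
    message: str

    def __str__(self):
        path = " -> ".join(repr(step) for step in self.trace) or "<root>"
        return f"{self.message} at {path}"


def validate(value, schema, trace=()):
    """Walk `value` against `schema`, yielding errors tagged with their trace.

    Assumed schema convention (illustrative only):
      dict  -> object with the given fields
      [sub] -> list whose items all match `sub`
      type  -> leaf value must be an instance of that type
    """
    if isinstance(schema, dict):
        if not isinstance(value, dict):
            yield ValidationError(trace, "expected an object")
            return
        for key, sub in schema.items():
            if key not in value:
                yield ValidationError(trace + (key,), "missing field")
            else:
                # Extend the trace with the field name as we descend.
                yield from validate(value[key], sub, trace + (key,))
    elif isinstance(schema, list):
        if not isinstance(value, list):
            yield ValidationError(trace, "expected a list")
            return
        for i, item in enumerate(value):
            # Extend the trace with the list index as we descend.
            yield from validate(item, schema[0], trace + (i,))
    elif not isinstance(value, schema):
        yield ValidationError(trace, f"expected {schema.__name__}")


schema = {"pack": str, "dogs": [{"name": str, "age": int}]}
record = {
    "pack": "A",
    "dogs": [{"name": "Fido", "age": 3}, {"name": "Rex", "age": "old"}],
}
errors = list(validate(record, schema))
for err in errors:
    print(err)
```

The point of carrying the trace along the recursion is that the error message can say where in a deeply nested record the problem is, instead of only saying what went wrong.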

