Hacker News new | past | comments | ask | show | jobs | submit login
Exploring Tweets with SQLite and WASM (divu.in)
98 points by divyenduz on April 2, 2021 | hide | past | favorite | 24 comments



I have a CLI tool (written in Python) called twitter-to-sqlite which imports Tweets into a SQLite database - it's a really fun source of data to play with: https://datasette.io/tools/twitter-to-sqlite


I've been using your Datasette tools a bunch, they're really great for quick data analysis. Recently I've been play with doing an `sqlite foo.db dump` and then loading the dump into sql.js as the "DB" for statically-hosted websites.

It's probably impractical because the sqlite.wasm file is a couple megs, but it's a fun proof of concept.


That sounds like a really useful trick.


I came up with it because I was trying to figure out how to host data from datasette on GitHub Pages.


Aha, amazing, I have played with datasette in past, great work. I will link this tool back in the blog post today if you don’t mind :)


The question that keeps bother us, what is the WASM perform impact compared to native SQLite ?


Performance? Not sure TBH, wasn't a primary aim for me. The slowest bit of this experiment is to load the wasm file in the browser.


I wrote a Postgres FDW in Go to query Twitter using SQL - https://hub.steampipe.io/plugins/turbot/twitter

(Steampipe is an open source CLI to query cloud APIs using SQL. No DB required!)


Can this extend to rqlite?

It would be pretty epic


Ha, what’s your aim with that? Making changes in the browser and have them saved somewhere?

I am pretty sure I can make this upload to s3 or something to emulate that

But a part of this experiment was its offline nature, where you can play around with tweets etc without having them uploaded to a server.

Interesting idea though about having them “sync”


This could enable offline-first sync


I agree, would be an interesting POC, on my backlog


Cool but we’ve had in browser databases since ages. Things like indexedDb and WebSql. But I guess it’s a cool WebAssembly experiment nonetheless.


WebSQL died and shipping a locked database version compiled with features you want is usually better than relying on every browser DB implementation and version supporting the stuff you want to use. The apps that need SQL aren't too sensitive to initial load/download time increase by adding a MB of WASM which can later be handled by caching.


WebSQL never reached proper standardization because there was a requirement of at least two independent implementations for that and all the available ones were based on... SQLite.


All being based on SQLite wouldn't have been the problem. But it specified a database "accepting the same SQL dialect as SQLite version XYZ", and that's a fairly obvious no-go (and nobody involved thought it was worth the effort to do a better specification).


IMO, anyone given the choice between IndexedDB and "compile SQLite to webassembly and use that" would pick SQLite in a heartbeat. IndexedDB is awful in almost every possible way. It's slow, the API sucks, it's limited, etc


Ha, would be nice-ish if browser have this pre-available, TBH most loading time is loading SQLite async, then it is instantaneous


But that’s a bad idea, once browser version will be locked to a database version, security updates etc would be tied and a yada yada


Oh absolutely, just an experiment, I am really curious about doing things like video editing etc on the client side.

I did use this to delete a lot of old tweets though.


    SELECT substr(date,1,7) as d, count(*), min(likes+retweets) FROM tweets GROUP BY d ORDER BY d DESC;
Yep, looks like you mostly tossed tweets over a year old that didn't meet some threshold. Very cool to play with in the browser.


Yeah, I am slowly removing myself from the internet, I don’t think that I follow anyone on any social network where ever I exist.

This is also a great way to backup tweets etc.

Of course I have multiple SQLite files with all the data, whenever I want to delete “more”, I just take a new Twitter dump.


WebSQL is deprecated IIRC - the problem with complex databasing like SQL is that users are never meant to be querying directly on it. SQLite has a number of functions that are pointers to your own functions to override/add behavior. If a user could simply query that - poof you're SOL.

Which means to be safe from the host system you have to virtualize the execution somehow - WASM is exactly that. Therefore if you wanted sqlite in a browser, you can simply compile it to wasm.


mozilla killed WebSql because it lacked "developer aesthetics", now we got shit indexedDb nobody uses without middleware to ease the pain.

"Aesthetics and Web Developers" https://hacks.mozilla.org/2010/06/beyond-html5-database-apis...




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: