A fast SQLite PWA notebook for CSV files

mwenge · on Dec 30, 2021

There are a lot of different solutions knocking around for running queries on one or more CSV files. This is a web-based notebook that prioritizes loading large files quickly so you can get running queries on them as soon as possible. It supports most text files, Excel files, JSON. It can also display your results as graphs.

vmchale · on Dec 30, 2021

I quite like Q for this though it doesn't have graphs.

Jd is also decent. More expressive languages for data.

cfeduke · on Dec 30, 2021

Visidata[0] is another great tool in this vein I often forget the name of when I really need it. Python, in the terminal, kind of like vim on CSV with graphing capabilities.

0. https://www.visidata.org/

leonim · on Dec 30, 2021

Second this recommendation for the terminal. For my CLI toolbox, VisiData is my favorite.

I find VisiData is great for quickly exploring and querying data from the CLI. It can handle many types of files (SQLite, CSV, TSV, Excel, JSON, YAML, etc). Visidata loads all the data into memory, and so is very responsive when exploring the data. It allows you to quickly do all sorts of of adhoc queries interactively, without having to write a valid SQL query.

I haven't used Q. When I first heard of it, I liked the idea that Q allowed you to run random queries on CSV and TSV files. However, it seemed like it would be slow if you wanted to do follow up queries, since it had to repopulate the in memory SQLite file for each query. Though it looks like the latest version has a way to cache the generated sqlite file. So that seems like it could help.

Also, if I have some CSV, TSV, JSONL data sqlite-utils is useful for converting them to SQLite, and then exploring with Visidata or SQL queries.

Q: https://github.com/harelba/q sqlite-utils: https://github.com/simonw/sqlite-utils

chrisweekly · on Dec 30, 2021

In this space I absolutely love https://lnav.org -- the "mini-ETL powertool" featuring embedded SQLite and *nix-y CLI for chaining / scripting.

Edit:

PS Unaffiliated, just a fan since 2016(?)

PPS Despite the name and original / primary use case, lnav is useful for things beyond "just" logfiles(!)

cube00 · on Dec 30, 2021

A PWA and vanilla JavaScript; we need more of these. Fantastic job!

oefrha · on Dec 30, 2021

Not sure what's your definition of "vanilla JavaScript", but https://dirtylittlesql.com/separators.js is clearly produced by a bundler, bundling a bunch of npm packages like events, buffer, etc. I would be pretty alarmed if someone wrote a single 7659-line vanilla JavaScript file by hand in 2021.

And of course the SQLite part is wasm.

maga · on Dec 30, 2021

Vanilla JS nowadays means not using a framework, in yesteryears it meant not using jQuery, in either case, it doesn't preclude from having dependencies.

DemocracyFTW · on Dec 30, 2021

to be fair, in case the dependencies are pegged to specific known-good versions and are bundled into one or a few files then two concerns with dependencies go away, namely, insanely deep and wide file system trees and the lingering danger of any one of hundreds of software titles getting malware-ized. One could even add that it's probably a good idea to prefer tried-and-tested existing software over writing everything from scratch.

lghh · on Dec 30, 2021

Why does it matter that it's vanilla Javascript? I don't think this would actually fit the general definition of vanilla Javascript, but I'm more curious why it matters at all?

mikojan · on Dec 30, 2021

Not OP, but in my opinion:

When I am reading through the source code of somebodies personal project, it's absolutely amazing to:

- see how to do X in discernible, concise, modern JavaScript

- follow along significant architectural decisions

- watch a web API in action, not a specialized abstraction that will necessarily hide some capabilities

That being said: This project doesn't offer much in that regard. It's hard to read and makes little use of modern JavaScript.

adolph · on Dec 30, 2021

This is really wonderful! The discussion about lay people's knowledge of sql reminded me that the Pandas API is often useful for non-sql folk. Likewise there are some projects similar to dirtylittlesql to bring Python data manipulation to the browser.

https://github.com/jtpio/jupyterlite

https://github.com/gzuidhof/starboard-notebook

kapilvt · on Dec 30, 2021

Also worth a look, built as a static web spa

https://github.com/lana-k/sqliteviz

js4ever · on Dec 30, 2021

Very cool, but I didn't found a way to configure the delimiter (most of my files use ; as a delimiter)

mwenge · on Dec 30, 2021

Forgot to add a semi-colon as a possible delimiter! Fixed it there now.

lucasverra · on Dec 30, 2021

hey there, i've left a github issue / feature request.

Column UI would be awesome

monkeydust · on Dec 30, 2021

What would be cool is integrating this with OpenAI Codex - the natural language to SQL transformations were impressive when I played around with them.

This could then be pitched to non-techie folk who want to simply ask questions from their data on flat file.

eof · on Dec 30, 2021

I’m skeptical of the utility of building for non techie people. It takes literally tens of minutes to Learn SQL necessary to a single csv file; and the overlap of people needing to query csv files and capable of learning basic select syntax is probably nearly perfectly overlapping.

I don’t knock the research at all, it’s very cool. But it has to be basically perfect for someone even slightly experienced to consider using, otherwise it will cause more frustration then time saving.

Closi · on Dec 30, 2021

> It takes literally tens of minutes to Learn SQL necessary to a single csv file

It might take you, as a programmer or technically literate person, tens of minutes to learn the SQL necessary to parse a csv file.

But I have met a whole raft of people that would firstly take much longer to learn enough SQL to get by, then would still need help when basic syntax error messages came up, and then would get frustrated before getting someone else to do basic analysis for them.

eof · on Dec 30, 2021

I have met those people to, but none of those people have ever had a csv file to run queries on.

Analysis has a lot of meanings. Getting basic facts (there are N rows that X) and selecting subsets of data are realistic analysis non technical people might do on csv files.

And I want to be clear that parsing a csv file with sqlite or MySQL is much more complex than learning “SELECT .. FROM .. WHERE” (with no joins!) when there is a tool like OPs that magically does the “parsing”.

Closi · on Dec 30, 2021

In reality, a basic query might be something like "Show me how many units of product code 010201 were sold each month this year", which actually isn't that easy to write in SQL (but very easy to express in plain English).

wswope · on Dec 30, 2021

> And I want to be clear that parsing a csv file with sqlite or MySQL is much more complex than learning “SELECT .. FROM .. WHERE” (with no joins!) when there is a tool like OPs that magically does the “parsing”.

Sqlite CSV import:

.mode csv

.import file.csv tablename

hgarg · on Dec 30, 2021

You mean like this? - https://leesaapp.com/

monkeydust · on Dec 30, 2021

Yes, thanks for sharing.

pjmlp · on Dec 30, 2021

Nice app, best of all no Electron needed.

tanin · on Dec 30, 2021

Welp, I was gonna post my web0 desktop Electron app for working with CSVs using SQL: https://superintendent.app

eatonphil · on Dec 30, 2021

Here's another one I'm working on [0]. Run it as a desktop Electron app or as a server/web app. There's also a serverless/in-memory web app demo running here [1].

[0] https://github.com/multiprocessio/datastation

[1] https://app.datastation.multiprocess.io

mynameismon · on Dec 30, 2021

Similar: Microsoft LogParser (https://www.microsoft.com/en-in/download/details.aspx?id=246...)

fxj · on Dec 30, 2021

Love it! Could you also provide some options for decimal point data? Comma instead of period.

It would be great if I could use it as a kernel in JupyterLite!

pletnes · on Dec 30, 2021

There is a sqlite kernel for regular jupyter. Not sure about jupyterlite though.

stsourlidakis · on Dec 30, 2021

This is really good! Would be great if there was a way to rename the created tables!

Multrex · on Dec 30, 2021

Is it available as Docker Image?

cube00 · on Dec 30, 2021

Given it's a PWA couldn't you just install it and use it offline?