Side note: I love the attention that SQLite is getting now. It's so refreshing after the dark ages of misapplied technology: microservices, Docker, NoSQL, and countless front-end frameworks.
I picked up some very neat new SQLite tricks from this post independent of the extension itself (which is very cool) - I didn't know about the "define" module which lets you create new table-valued functions that are defined as SQL queries themselves.
Would be interesting if this can do HTTP/1.1 pipelining. Need to take a closer look.
What makes me hesitate to look closer is that I'm almost always doing some text processing between the HTTP response and the SQL commands. Due to resource constraints, I don't want to store HTML cruft.
I mostly am doing
HTTP response --> text processing --> SQLite3
rather than
HTTP response --> SQLite3
However, I also have a need for storing different combinations of HTTP request headers and values in a database. I currently use the UNIX filesystem as a database (one header per file; djb's envdir loads the headers into the environment from a selected folder), but maybe I could use SQLite3. For unprocessed text, e.g., from pipelined DoH responses, I use tmux buffers as a temporary database. Then do something like
The ip-map.txt file gets loaded into memory of a localhost forward proxy.
Databases such as NetBSD's db, djb's cdb, kdb+, or sqlite3 can help with the "append unique" step if the data gets big.
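If SQLite does end up replacing the filesystem-as-database approach, the "append unique" step maps nicely onto a UNIQUE constraint plus INSERT OR IGNORE. A minimal sketch via Python's stdlib sqlite3 (the table layout and names here are my own invention, not anything from the extension):

```python
import sqlite3

db = sqlite3.connect(":memory:")  # or a real file like headers.db
db.execute("""
    CREATE TABLE IF NOT EXISTS headers (
        profile TEXT NOT NULL,   -- plays the role of the envdir folder
        name    TEXT NOT NULL,   -- header name
        value   TEXT NOT NULL,   -- header value
        UNIQUE (profile, name, value)
    )
""")

rows = [
    ("default", "User-Agent", "curl/8.0"),
    ("default", "Accept", "text/html"),
    ("default", "User-Agent", "curl/8.0"),  # duplicate, silently skipped
]
# INSERT OR IGNORE makes repeated appends idempotent: a duplicate row
# hits the UNIQUE index and is dropped instead of raising an error.
db.executemany("INSERT OR IGNORE INTO headers VALUES (?, ?, ?)", rows)

print(db.execute("SELECT count(*) FROM headers").fetchone()[0])  # 2
```

The same INSERT OR IGNORE pattern works unchanged from the sqlite3 CLI if you'd rather stay in shell pipelines.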
Note: Any JSON I retrieve is "text", not binary data. Most HTTP responses are HTML, sometimes with embedded JSON. Pipelined DoH is binary but I still use drill to print the packets as text. (When I finally learn ldns I will stop using drill.)
Re pipelining: unless Go's network package does this magically under the hood, sqlite-http doesn't take advantage of it! If you think that would be a nice feature, file an issue and I'll see what can be done.
How complex is your text processing? If I'm working with a small dataset, I typically just save the "raw" responses into a table and create several intermediate tables that extract the data I need. Something like:
create table raw_responses as select http_get('https:/....') as response_body;
create table intermediate1 as select xxxx(response_body) as message from raw_responses;
create table final as select message from intermediate1;
(some CTEs[0] will be very useful here)
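To make the staged-tables pattern concrete without the extension loaded, here's a sketch via Python's stdlib sqlite3: the raw_responses rows are faked, and substr/instr stands in for whatever real extraction function (the xxxx placeholder above) you'd use; the CTE replaces the separate intermediate table:

```python
import sqlite3

db = sqlite3.connect(":memory:")
# Stand-in for `create table raw_responses as ...` -- a couple of fake
# response bodies with a <title> element we want to pull out.
db.execute("CREATE TABLE raw_responses (response_body TEXT)")
db.executemany(
    "INSERT INTO raw_responses VALUES (?)",
    [("<html><title>Hello</title></html>",),
     ("<html><title>World</title></html>",)],
)

# Same staged extraction, but the intermediate step is a CTE: each
# stage reads the previous one, and only `final` is materialized.
db.execute("""
    CREATE TABLE final AS
    WITH intermediate1 AS (
        SELECT substr(
                 response_body,
                 instr(response_body, '<title>') + 7,
                 instr(response_body, '</title>')
                   - instr(response_body, '<title>') - 7
               ) AS message
        FROM raw_responses
    )
    SELECT message FROM intermediate1
""")

print([r[0] for r in db.execute("SELECT message FROM final")])
# ['Hello', 'World']
```

CREATE TABLE ... AS accepts a WITH clause directly in SQLite, so the whole pipeline can stay in one statement.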
The "lambda" pattern described about half-way through this post might be useful as well, if you need to do some complex text processing.
Go's HTTP client doesn't support pipelining in HTTP/1.1. It will do keep-alive / socket re-use. HTTP/2 and later support request multiplexing, which Go implements. Your code appears to create a new client on each request, so I'm not sure whether the underlying transport will be re-used or whether any multiplexing will take place. I would be surprised if it is.
Wouldn't you be doing JSON processing instead of text processing?
I myself have never had to do text processing, only JSON processing, and that was convenient to do directly in Postgres. I assume it's just as convenient in SQLite, given the recent advancements in SQLite's JSON capabilities.
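For example, SQLite's built-in JSON1 functions can pull fields straight out of a stored response body. A sketch via Python's stdlib sqlite3 (the table and JSON shape are made up for illustration; JSON1 is compiled into essentially every modern SQLite build):

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE responses (body TEXT)")
db.execute(
    "INSERT INTO responses VALUES (?)",
    ('{"user": {"name": "alice", "langs": ["c", "go"]}}',),
)

# json_extract walks a JSON path expression into the document.
name = db.execute(
    "SELECT json_extract(body, '$.user.name') FROM responses"
).fetchone()[0]
print(name)  # alice

# json_each is a table-valued function that explodes a JSON array
# into rows -- handy for normalizing nested data into tables.
langs = [r[0] for r in db.execute(
    "SELECT value FROM responses, json_each(body, '$.user.langs')"
)]
print(langs)  # ['c', 'go']
```

Recent SQLite (3.38+) also supports the Postgres-style `->` and `->>` operators as shorthand for json_extract.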
While this is really cool (and a feat of engineering, no less), I'm always really concerned when someone suggests making an HTTP call from their database server.
Many years ago my boss told me to make a scheduled task in Windows to execute a SQL query to make an HTTP call and I asked him why we couldn't just use crontab/cURL? His response: "cURL? Like from the 90s?"
Anyhow I didn't last very long. Got fired shortly thereafter.
Yeah, "don't make network requests from the database" is common advice, but it doesn't quite apply to sqlite-http. It's mainly meant for data-processing scripts that use SQL, typically on a local database file, which is pretty low risk. You probably wouldn't want/need sqlite-http on a SQLite db that a live application uses, unless you use the no network[0] option
On SSRF: There is a "no network"[0] option for the extension, if you don't trust SQL input. This is what the example Datasette server[1] uses
Also of note: this is mainly meant for local data processing, and I doubt there's a need for sqlite-http in a SQLite DB used by regular applications. If you're just wrangling some data locally with these extensions, I doubt there's much of a security risk.
Direct link to the project: https://github.com/asg017/sqlite-http
Few other projects of interest:
- sqlite-html: https://github.com/asg017/sqlite-html
- sqlite-lines: https://github.com/asg017/sqlite-lines
- various other extensions: https://github.com/nalgeon/sqlean
And past HN threads:
- https://news.ycombinator.com/item?id=32335295
- https://news.ycombinator.com/item?id=32288165