Hello everyone! Author here. I didn't expect anyone to find this repo, much less post it on Hacker News!
This project is inactive for two main reasons:
- SQLite is not a great general-purpose SQL engine. Poor performance of joins is a serious problem that I couldn't solve. The virtual table support is good but not quite good enough; not enough parts of the query are pushed down into the virtual table interface to permit efficient querying of remote tables. Many "ALTER" features are not implemented in SQLite which is a tough sell for experimental data manipulation.
- T-SQL, the procedural language I chose to implement atop SQLite, is not a great general-purpose programming language. Using C# in LINQpad is a more pleasant experience for experimentally messing around with data. R Studio is a good option if you need statistical functions.
I think several good solutions in this problem space exist. A local install of SQL Server Express can be linked to remote servers, allowing you to join local tables to remote ones. That setup serves nearly all of SQL Notebook's use cases better than SQL Notebook does. LINQpad is also very convenient for a lot of use cases.
I appreciate the interest! I may spin off the import/export functionality into its own app someday, as I had a lot of plans in that area, but I think SQL Notebook as it stands is a bit too flawed to develop fully.
A web app for writing and running SQL queries and visualizing the results. Supports Postgres, MySQL, SQL Server, Crate, Vertica, and Presto. Written in Node.js.
SQLite definitely presents some unique challenges, despite it generally being the "simplest" relational database. I've written a web-based SQLite database browser, which you can find here: https://github.com/coleifer/sqlite-web -- importantly, it supports all the `ALTER TABLE` statements you'd expect (add column, drop column, rename column, etc). This is done by renaming the table temporarily, creating a new table with the desired changes, then copying the data into the new table. Kinda clunky, but it works :)
I feel like rewriting this on top of SQL Server Express would still be valuable. Setting up and managing linked servers manually, especially with non-SQL Server databases can be a pita especially considering managing their drivers etc.
Also, SQL Server is awful at importing/exporting data directly. Almost all forms are broken in various ways and are a source of endless grief. The only reliable built-in way is to use SSIS which is like using a 200-ton tank for a paperweight.
They probably meant SQL Server Compact, which is a smaller package and comparable to SQLite. It was meant to power WinFS back in 2006. But it's closed source and now legacy.
There are many business applications that target SQL Server exclusively.
On the one hand, I find that highly annoying. On the other hand, even if you don't like Microsoft, SQL Server is fairly awesome, performance-wise and stability-wise.
But still, there are many applications where you cannot really choose what database engine you want to use.
I suppose PostgreSQL+FDW would be just as good as SQL Server Express+Linked Servers technically. You end up trading users' MSSQL experience for pgsql.
I wonder if it would be easy to support both backends and be able to switch between them depending on the query and which features and integrations you want.
> Also, SQL Server is awful at importing/exporting data directly. Almost all forms are broken in various ways and are a source of endless grief. The only reliable built-in way is to use SSIS which is like using a 200-ton tank for a paperweight.
So much this. BCP is inferior to 5 lines of C# code, the tools for importing and exporting data in SSMS are buggy and cumbersome, SSIS is a gigantic monster. Simple "Export the results of my query into an INSERT statement" is a moronically obvious workflow and is non-existent.
I've read that the tools division of MSSQL was running on a skeleton crew for a long time, and it shows.
Despite everyone saying SSMS is hands down the best database IDE, while this may be true it is also true that MSSQL tooling is a train wreck compared to what it could be. It almost seems Microsoft intends to keep it this way, perhaps because they'd have to finally admit to the decade of neglect? Who knows the exact reason, but it's a sad situation.
Sql Server Reporting Services is great but the server website was obviously mid-2000s until very recently, even containing markup that wouldnt render in Chrome.
SSIS is a hellscape of unnecessary designers for an ETL framework, and generates ultra-brittle packages that crash if the slightest schema tweaks happen.
SSDT stores your schema and publishes diffs for easy deployment, but will gobble up all your RAM for half an hour to compile... if it doesn't crash.
And SSMS has the worst tabbed interface ever for managing queries.
I agree with you on all points. Products on the periphery of SQL Server have been severely neglected for years. There is a replacement for some of SSRS in PowerBI and SSAS, which is actually pretty nice. If I'm not mistaken, people serious about version control and deployment in SQL Server use redgate software.
My question is how other databases fare in these areas. I'm not familiar with any solutions for Reporting UI, ETL, and Schema Version Control in MySQL or PostgreSQL. What built-in tools do other databases have that outperform SQL Server's (obviously crumbling) functionality?
The general consensus is usually that this MS tooling, yes the one just described, is very good! Certainly far better than the competition. It boggles the mind.
>Simple "Export the results of my query into an INSERT statement" is a moronically obvious workflow and is non-existent.
I've only used SSIS for a few months now, but wouldn't this just be a simple Data Flow Task for that. Now, I'm not saying I would use SSIS for that (because I frankly hate SSIS) but it is a trivial thing to do.
I've never had an import job where run time was a key concern... but the old ADO.Net Bulk Insert thingy has always been good enough when Entity Framework was too slow (which, for inserts, it generally is).
How large a job are you talking? Did you use a transaction? A transaction + being in a nearby location has been good enough for every import I've done (millions of rows).
Can your software use ODBC connection (specifically System DSN sources)? I'm waiting for my IT dept to install .NET 4.6 so I can't try your software right now.
Do you recommend it for a noob on data analysis? I'd basically set up a MySQL or PostgreSQL somewhere else (like a vps) and load and download data from there.
> T-SQL, the procedural language I chose to implement atop SQLite, is not a great general-purpose programming language.
T-SQL ( I'm assuming the microsoft variant ) is not a procedural language, it's a declarative language.
> Using C# in LINQpad is a more pleasant experience for experimentally messing around with data.
This is not true. T-SQL is as close to the data as you can get. C#/LINQ/EF/etc are great for business layer/presentation layer. But for messing around with data, you can't really do any better than T-SQL. Trust me on this.
If you are a front-end developer, then LINQpad is great in building .NET code/CRUD libraries/etc for dealing with data, but if you want to work with data, T-SQL is where it is at.
Developing expertise in T-SQL ( it's not as easy as people think ) will open up a new way of understand data/programming/thinking. SQL Server allows importing of .net code/functions which you can run on the sql server via T-SQL so there isn't really much you can't do with T-SQL.
> A local install of SQL Server Express can be linked to remote servers, allowing you to join local tables to remote ones.
SQL Server Developer Edition is free now. That is a great starting RDBMs to learn databases on windows environment. It has almost everything other than enterprise level features ( clustering/replication/etc ) that most people don't need.
> T-SQL ( I'm assuming the microsoft variant ) is not a procedural language, it's a declarative language.
I think he's referring to the terrible terrible terrible procedural components of T-SQL - cursors, local variables, IF/THEN control blocks, etc.
Right Tool For The Job. Sql is good for querying/storing data, so use it for that. Then let the C# handle manipulating the data and applying rules and whatnot.
> I think he's referring to the terrible terrible terrible procedural components of T-SQL - cursors, local variables, IF/THEN control blocks, etc.
Sure. It also has stored procedures/functions/etc. But just because it has elements of imperative->procedural language doesn't mean it is a procedural language. No more than C#'s LINQ implementations mean C# is a declarative language or lambda expressions means it is a functional language.
And though I'm not of a fan of cursors, it is a legacy of a time when it was somewhat needed. And nothing about local variables/controls/etc makes T-SQL "terrible".
All of it is necessary for database maintenance, t-sql programming, etc.
> And nothing about local variables/controls/etc makes T-SQL "terrible".
Compared to a full-featured programming language? It's good for its intent, which is to provide a minimal bit of procedural scaffolding for scripts that alter the database. It is terrible as a general-purpose procedural programming language, and I've worked with too much horrifying T-SQL and PL/SQL that pushes these tools way too far out of their intended workflow.
Yes, T-SQL is predominantely a declarative language. But it includes procedural components for writing procedural scripts. And unless we're talking about maintenance scripts or deployment scripts, don't use them.
You just need to calm down. Everyone here has spent a certain amount of time dealing with shitty code.
That's all you're really claiming here. That some people write shitty code. And I agree. I've spent plenty of time with almost every flavor of SQL code that was incredibly shitty.
But I don't really blame the languages that much. I blame the people who didn't understand when and where to use one language or another.
And frankly, if you're seriously going to complain about T-SQL vs other "more full featured programming languages" and then make a blanket statement about not using any procedural components on SQL flavors . . . ummm, you might be your own worst enemy here.
Since SQL Server 2005, modern alternatives to the procedural components have been added to the product.
Even with these features existing it does not make any sense to call SQL a procedural language.
Choosing the right tool for the job is certainly important. I notice many people write off SQL almost immediately when it is mentioned here.
Often they come from a different language and do not see the value in the relational model. It's unfortunate too because there is real value in being able to go between a relational mindset seamlessly to an object oriented or functional one.
I use SQL Server daily and I love the relational model. The core SQL language I dislike because it's verbose, backwards, and restricts sane forms of reuse. The procedural components of MS SQL are just plain miserable.
I'm constantly frustrated that the two options in the database world appear to be "use SQL and all its anachronistic warts" or "give up on the relational model". NoSQL throws the baby out with the bath-water.
But this is a tangent. The original post came about calling T-SQL a "procedural language". Which is inaccurate... but the author was talking about adding extensions to Sqlite - SQlite is already a SQL dialect. So if you're bolting on T-SQL to Sqlite, what's the main thing it will bring to the table?
It's procedural components. So, in the context of comparing Sqlite to T-Sql, using the term "procedural language" might be a misnomer, but it makes sense here.
Yeah, this is spot on. SQLite provides DDL and DML commands, but no support for composing programs from these commands. I leaned on T-SQL for the syntax and behavior of variables, basic control structures, procedures, error handling, and some common library functions, but that's all that it takes from T-SQL. I didn't intend to make a judgment about T-SQL, but rather to indicate which parts were implemented in SQL Notebook.
No algebraic types. No good way to query a graph of objects down, only a flat resultset. Infuriating performance problems if you layer views too deep or use scalar functions, which are sane encapsulation. No higher-level concepts of code reuse like templates. If I have a common pattern like "subclass table A by creating table B with a foreign primary key to A and create view vwB that has the combined columns of A and B" there is no SQL construct to express this common pattern.
You just said to use the right tool for the right job, a few comments above.
Everything you are saying here sounds like a job for something other than a relational database.
I would shit a brick and beat any database developer with it that introduced the idea of a class as a primitive. Fuck that happy horse shit.
Also, graphs? Really?
You're looking for an ORM and blaming SQL for not being that.
There is no common database pattern that creates one table as a subset of another table. RDBMS is, by nature and design, not something you modify in that way on the fly. If you want that kind of object oriented functionality, push your data into a cache layer or column store or KV.
You said it yourself. Use the right tool for the right job. Nothing you've said makes even remotely any sense as a criticism of T-SQL or SQL in general. That's not the right tool for what you're talking about.
Not that a class needs to be a primitive, but that patterns like inheritance should be reusable in some way through templating or the like.
And the inability to pull down a graph as a single operation is the perfect example of sql's limitations.
Relational model is great right? Right. So why throw it out the window and give me a glorified excel spreadsheet as a result set? If I want to pull down a thing and it's related subthings in a single operation, I have to join which is wasteful in that case. This actualy hamstrings ORMs.
I understand relational databases just fine. I still have no interest in the Stockholm syndrome of defending a platform where a = a can evaluate to something equivalent to False, regardless of the theoretical underpinnings of three-valued logic.
There are so many pain points in SQL that are defended because theory.
TVFs are intensely verbose to use compared to the corresponding SVF for a single column. Cross Apply will destroy your performance. I'm venting because I've watched clean, legible code get destroyed into soup of a zillion tiny joins and endless repeated boilerplate for performance improvement by this misfeature.
> but if you want to work with data, T-SQL is where it is at.
Depends on your definition of "working with data" I guess.....TSQL is fast, but beyond that almost anything beats it for functionality when messing around.
I recently had to teach a series of workshops on SQL and I was trying to figure out the best system to allow students to independently work with small datasets without having to install any software. I found Alon Zakai's absolutely fantastic version of SQLite in JavaScript here:
We recently launched https://www.db-fiddle.com/ which you may find useful, you could save your scheme/data sql and your students can fork it for their own version :)
I did not see anything when I started putting it together, but I already had a number of datasets that I put together for illustrating specific points. My favorite is a list of pairs of airport codes and the distances between them. This is a simple table, but students can have a lot of fun with a challenge like "Be the worst possible travel agent you can be. Find an itinerary from NYC to CDG, with a maximum of four stops, that will cover the most miles." Hilarity ensues as we look at the result, but it is challenging enough for beginners learning about joins, etc.
The other great thing about this type of setup is that students can feel free to experiment with deleting records and doing all kinds of things in SQL knowing that they can immediately restore everything to the initial state by hitting refresh.
The last component that I think helped the class was adding an anonymous "share" button that allowed students in the class to submit the contents of their editor so that I could run it for the whole class and we could talk about the trade-offs of different approaches.
Only Mozilla dev, a die-hard NoSQL fan, prevented adoption in Firefox (he since left Mozilla) and Microsoft couldn't decide which of it's 10 SQL engines (no joke, think about MS Access, AD, Exchange, Outlook, WinFS, etc all had/have their own embedded SQL database incarnations) to use in Internet Explorer. In the end Microsoft nowadays ships SQLite with Windows. But WebSQL isn't supported by Firefox and IE/Edge - but the web users moved on, and like 99% of mobile devices have a webkit/blink based browser, and Chrome and Safari have the largest user base on desktop too.
It's not quite that simple. The WebSQL standard wasn't a "standard" at all, it was just "browsers already expose whatever SQLite they have, lets write that down!". No tests, no conformity, no way to tell if browser X supported SQLite feature Y, nothing. Just raw access to whatever SQLite was baked in. There were lots of inconsistencies.
Yeah, it was super simple and useful, but nobody wanted to do the work to make sure interoperability was a thing, in part because SQLite has very complex behavior and lots of quirks to adhere to.
In a terminal, `sqlite3 mydb.` works pretty well also. I find this generally a lot faster than GUI tools for quick peek or some queries (although for complex relations and visualization a good GUI is the thing to have). Not a huge MS fan but SQL Management Studio is pretty good (don't know about sqlite support.. I bet it's possible though).
Ouch, that would be very useful to me had I known about it two months ago, when I was exploring the database dump from my old Wordpress blog (I'm finalizing the process of re-launching it as a static site). I managed though, by combination of MySQL Workbench and Common Lisp REPL.
Anyway, bookmarking for the next time I'll need to play with relational data.
Yeah, LINQPad is great! I use it for so many weird cases, everything from rather complex scraping with phantomjs to trying out things like regexs. I've saved small scripts that make it easier to get into new projects, such as making lists of files that have been changed the most and people who have the most commits etc.. Really recommend it for .NET devs.
-How about import from clipboard (useful for cut and paste from excel)
-It doesn't seem to recognize tab delimiters in a .txt file. Maybe the import window should have a delimiter selector?
-Does it have a crosstab/pivot tool? Most sql dialects are lacking here because they make you explicitly define crosstab columns which is a pain for exploration work.
It's got a straight-forward installer that's 6MB. I was able to run it and use it in seconds.
Zeppelin has source available, or a binary package with a spark interpreter, or a binary package with "all interpreters". That one is 712MB. If I just want to use it, I still don't understand what to actually do. I'm downloading the big .tgz as we speak.
Following the installation guide,
A:\TEMP\zeppelin\zeppelin-0.7.1-bin-all>bin\zeppelin.cmd
Log dir doesn't exist, create A:\TEMP\zeppelin\zeppelin-0.7.1-bin-all\logs
Pid dir doesn't exist, create A:\TEMP\zeppelin\zeppelin-0.7.1-bin-all\run
The system cannot find the path specified.
A:\TEMP\zeppelin\zeppelin-0.7.1-bin-all>mkdir logs
A subdirectory or file logs already exists.
A:\TEMP\zeppelin\zeppelin-0.7.1-bin-all>bin\zeppelin.cmd
The system cannot find the path specified.
And now I'm completely stuck.
So the value in this compared to Zeppelin is that it's small, straight-forward, and simple.
Says more about how long I haven't used a MS OS than anything but I initially had a 5 second "why are they installing it on their floppy drive?" moment there :-)
Ditto. For me A: was always a 3.5" drive with B: being, optionally, a 5.25" drive (if installed). Not sure how universal that was though. I do like how C: has mostly hung on though.
I just installed zeppelin the other day. Apart from the rather big download, it was just executing "bin/zeppelin-daemon.sh start" and localhost:8080 was working.
This was on linux though.
Isn't this self hosted, running on the windows client, so you can quickly work with small datasets that are local? Vs. Zeppelin which is hosted (or, if local, is running via docker), and would require uploading the data?
If you already have a trusted Zeppelin or Jupyter or other notebook, then this may not be needed, but if you want to keep all local, this may be a quick and easy alternative... assuming you like SQL.
I haven't used it but it looks like Zeppelin is a Java web application that's available on localhost by default. No docker needed, just download and run.
Fair enough; I've never used it locally as I always am using hosted versions. Still, lots of extra overhead to just beat up some tables in sql. If you think you would do more beyond that, however, makes sense to try a notebook...
I've always wanted a nice SQL-oriented "notebook" type of application.
I get something of this experience in Emacs via `org-mode`, `sql-mode`, and `ob-sql-mode` minus the data-importing functionality... though with babel it's probably doable in a code block using a script.
Bonus: org-mode lets you export to many formats which makes sharing results quite easy.
In my daily work I often have the need to analyze excel and csv files from clients. I use http://harelba.github.io/q/ and it worked most of the time. But this one seems promising. Especially being able to query data from a file and join with data from a database.
Is there any Windows SQL software that can use system/machine ODBC data sources? My company uses OpenLink's ODBC drivers to access our main database (Progress OpenEdge.) I have no problem using Python, Pandas, and pyodb to connect to the data base but it isn't the best environment to develop queries.
MySQL workbench (it is available as a free download but you have to dodge quite a few "No I really mean I want the free community ed" screens) is quite good for that.
How does an SO survey change the view of the comment? If they use Linux and know three other developers two that use a Mac and one on Windows isn't the comment still true? From what they see more folks use non-Windows. Also, not every developer in the world filled out the SO survey. The comment was lamenting that it was Windows Only, I think that's still relevant.
The comment can lament whatever it wants, but its anecdotal evidence is clearly implying a premise that's proven false by actual data that shows a majority of developers use Windows.
This project is inactive for two main reasons:
- SQLite is not a great general-purpose SQL engine. Poor performance of joins is a serious problem that I couldn't solve. The virtual table support is good but not quite good enough; not enough parts of the query are pushed down into the virtual table interface to permit efficient querying of remote tables. Many "ALTER" features are not implemented in SQLite which is a tough sell for experimental data manipulation.
- T-SQL, the procedural language I chose to implement atop SQLite, is not a great general-purpose programming language. Using C# in LINQpad is a more pleasant experience for experimentally messing around with data. R Studio is a good option if you need statistical functions.
I think several good solutions in this problem space exist. A local install of SQL Server Express can be linked to remote servers, allowing you to join local tables to remote ones. That setup serves nearly all of SQL Notebook's use cases better than SQL Notebook does. LINQpad is also very convenient for a lot of use cases.
I appreciate the interest! I may spin off the import/export functionality into its own app someday, as I had a lot of plans in that area, but I think SQL Notebook as it stands is a bit too flawed to develop fully.