Show HN: Sheet Markup – add spreadsheets to a Markdown document

thangalin · on April 12, 2023

My text editor KeenWrite supports R Markdown. I wrote a simple function to convert CSV data into a Markdown table[1] along with a tutorial demonstrating usage[2]. This allows users to keep the data separate from the document.
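To illustrate the CSV-to-table idea, here is a minimal Python sketch. It is not the R function linked in [1]; the name `csv_to_markdown` and the choice to treat the first row as the header are assumptions made for this example.

```python
import csv
import io


def csv_to_markdown(csv_text):
    """Convert CSV text (first row = header) into a pipe-delimited Markdown table."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    if not rows:
        return ""
    header, *body = rows
    width = len(header)

    def fmt(cells):
        # Pad short rows to the header width and escape literal pipes
        # so they are not mistaken for cell separators.
        cells = (list(cells) + [""] * width)[:width]
        return "| " + " | ".join(c.strip().replace("|", "\\|") for c in cells) + " |"

    lines = [fmt(header), "|" + "|".join(["---"] * width) + "|"]
    lines.extend(fmt(row) for row in body)
    return "\n".join(lines)


if __name__ == "__main__":
    print(csv_to_markdown("Item,Cost\nRent,1500\nUtilities,200\n"))
```

Keeping the data in a CSV file and generating the table at build time means the document source never contains hand-aligned pipes.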

Having the ability to apply spreadsheet functions as per EqualTo is brilliant.

[1]: https://github.com/DaveJarvis/KeenWrite/blob/main/R/csv.R

[2]: https://youtu.be/XSbTF3E5p7Q?list=PLB-WIt1cZYLm1MMx2FBG9KWzP...

diarmuid_glynn · on April 12, 2023

I've never seen KeenWrite before, it looks nice!

Feel free to reach out (email in my profile) if you'd like to discuss how Sheet Markup could be added to KeenWrite. Assuming you're displaying the preview using some sort of modern HTML renderer with canvas support, it should be pretty easy to do.

thangalin · on April 12, 2023

KeenWrite uses the Java-based FlyingSaucer library for rendering HTML in the preview. Moreover, KeenWrite exports Markdown as XHTML, which ConTeXt imports for typesetting into PDF. While Sheet Markup would be a fun integration, it would have to execute prior to displaying (i.e., not use JavaScript for in-browser rendering, but use Java to preprocess the Markdown tables before inserting them into the XHTML).

diarmuid_glynn · on April 13, 2023

Thanks for the explanation. I think integrating Sheet Markup with that process would be challenging, since (as I understand it) XHTML doesn't support canvas.

adiM · on April 13, 2023

Note that ConTeXt also has support for spreadsheet-like calculations in tables, so it could perhaps be translated at the markup level.

sebastianavina · on April 13, 2023

I want to write a blog using blot.im, but I was looking for something like KeenWrite to ease the process.

Also, I must ask: does anyone know of a KeenWrite-like Emacs extension? I would really appreciate it.

thangalin · on April 13, 2023

As far as I know, there are no other editors or word processors that offer interpolated variables at a keystroke[1], the primary reason I developed the software. A similar editor to KeenWrite is Jupyter Notebooks, which looks like it has an Emacs extension[2]. I think of KeenWrite as being meant for long-form prose with a little math and stats thrown in (e.g., sci-fi novels); Jupyter Notebooks is meant for math and stats with short-form prose thrown in.

[1]: https://youtu.be/CFCqe3A5dFg?t=48 (variables tutorial)

[2]: https://github.com/nnicandro/emacs-jupyter

diarmuid_glynn · on April 12, 2023

We've found "sheet markup" (a simplified, textual representation of a spreadsheet) useful in other contexts, such as when interacting with an LLM. I think there might be quite a few other interesting uses, happy to discuss.

breck · on April 12, 2023

Interesting! I am very interested in this stuff. If you ever want to chat I've spent way too long toying with different versions.

In my experience editing Markdown tables by hand isn't fun.

I just go with CSVs or PSVs or Space Separated values.

Here's how I do it in Scroll (my markdown alternative):

https://try.scroll.pub/#scroll%0A%20table%20%7C%0A%20%20Item...

diarmuid_glynn · on April 12, 2023

I haven't seen Scroll before, it looks nice!

I've sent a mail to the gmail referenced in your profile page.

layer8 · on April 12, 2023

Where can I find the table syntax specification and the formula language specification?

diarmuid_glynn · on April 12, 2023

We don't have a formal grammar yet. We'll put one together and add it to the GitHub repo tomorrow.

Informally: each row in the sheet is a new line, and each cell is separated with a pipe (|). Cells can contain either values (various number formats supported) or formulas. Example:

    ```equalto
    **Item**       | **Cost**
    Rent           | $1500
    Utilities      | $200
    Groceries      | $360
    Transportation | $450
    Entertainment  | $120
    **Total**      | =SUM(B2:B6)
    ```
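The informal description above (one row per line, cells separated by pipes, values or formulas in each cell) can be sketched in a few lines of Python. This is only an illustration under assumptions, not EqualTo's parser: the function names are made up, only `=SUM(range)` over single-letter columns is handled, and bold headers or other text simply count as 0.

```python
import re


def parse_sheet(markup):
    """Split sheet markup into rows of trimmed cell strings (one row per line)."""
    return [[cell.strip() for cell in line.split("|")]
            for line in markup.strip().splitlines()]


def numeric(grid, row, col):
    """Numeric value of a cell; missing or non-numeric cells count as 0."""
    if row >= len(grid) or col >= len(grid[row]):
        return 0.0
    text = grid[row][col].replace("$", "").replace(",", "")
    try:
        return float(text)
    except ValueError:
        return 0.0


def evaluate(grid, cell):
    """Evaluate '=SUM(A1:B2)'-style cells; return any other cell unchanged."""
    m = re.fullmatch(r"=SUM\(([A-Z])(\d+):([A-Z])(\d+)\)", cell)
    if m is None:
        return cell
    # A1 references are 1-based; the grid is 0-based.
    first_col, last_col = ord(m.group(1)) - ord("A"), ord(m.group(3)) - ord("A")
    first_row, last_row = int(m.group(2)) - 1, int(m.group(4)) - 1
    return sum(numeric(grid, r, c)
               for r in range(first_row, last_row + 1)
               for c in range(first_col, last_col + 1))
```

For the budget example, `evaluate(grid, grid[6][1])` adds up rows 2 through 6 of column B.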

benatkin · on April 12, 2023

Why not just use GFM?

    Item           | Cost
    -------------- | -------------
    Rent           | $1500
    Utilities      | $200
    Groceries      | $360
    Transportation | $450
    Entertainment  | $120
    **Total**      | `=SUM(B2:B6)`

https://github.github.com/gfm/#tables-extension-

Getting rid of the vertical divider is nice but I'd rather think of it as a tiny modification to GFM than a distinct language.

Putting the formula in backquotes could make it look more like a formula, and backquotes could also be required for formulas to prevent accidentally invoking one.

It really is nice to have the horizontal bar gone, though. I think I might make my own format based on it. I tried to get rid of the bar but saw that you can't. In fact the only way you can have everything on one side of the bar is to only have a header (thead) when it would often be useful to only have a body (tbody).

FWIW original markdown requires pipes at the start and end of each row but not GFM.

diarmuid_glynn · on April 12, 2023

I hadn't reviewed GFM's table extension previously, thanks for sharing.

At first sight, I think GFM's table extension and Sheet Markup have different goals. While the table extension is intended for displaying a single table of data, Sheet Markup is for defining an interactive spreadsheet, including things like formulas. Such a spreadsheet might not really be a single "table" as such; it might be multiple separate logical tables. Also, I suspect that we will in future want to extend Sheet Markup with additional features which would be "even further" from what GFM's table extension supports.

But thanks, certainly food for thought!

benatkin · on April 12, 2023

I'm working on internal DSLs for markdown. I think it's a pretty powerful language and often it can be used in a different way rather than changed. For instance, a badge with a link could. A nice thing about a true internal DSL is that they are supported because it's the same language, being an internal DSL rather than an external DSL.

External DSLs of course give you full flexibility, as you are no longer constrained by the language. https://javieracero.com/blog/internal-vs-external-dsl/

My past work and this gave me the idea to do something that sits between an internal DSL of markdown and an external DSL - to allow tables without row dividers, but put them in fenced code blocks with a different language name so they don't get displayed wrongly by existing markdown tools, instead displayed as code. And because this is the only difference, to make it display as a table using existing gfm tools, an empty header could be added, since normally it's not desirable to have the whole thing as a header.

Here's an empty header that at least on https://loilo.github.io/gfm-preview/ shows up shorter than a normal line:

    []()|||
    -|
    Rent | $1500 | paid
    Utilities | $200 | unpaid

Though it isn't md I think I will have md in the name of the extension, much like jsonl has l in the name but a jsonl file with two or more lines of data isn't a single valid JSON document.

Edit: here's one that displays on GitHub:

    []()|[]()|[]()
    -|-|-
    Rent | $1500 | paid
    Utilities | $200 | unpaid
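The empty-header trick can be automated. Here is a small Python sketch that takes divider-less rows and prepends the `[]()` header and `-` divider shown above so GFM tools render a table. The function name is hypothetical, and it assumes rows have no leading or trailing pipes and no escaped pipes inside cells.

```python
def with_empty_header(body):
    """Prepend an empty GFM header row and divider to header-less table rows."""
    rows = [line for line in body.strip().splitlines() if line.strip()]
    if not rows:
        return ""
    # Column count is taken from the widest row (pipes + 1).
    width = max(line.count("|") + 1 for line in rows)
    header = "|".join(["[]()"] * width)
    divider = "|".join(["-"] * width)
    return "\n".join([header, divider, *rows])


if __name__ == "__main__":
    print(with_empty_header("Rent | $1500 | paid\nUtilities | $200 | unpaid"))
```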

diarmuid_glynn · on April 12, 2023

I see. I'm a big fan of DSLs, but the internal vs. external distinction is not something I've seen articulated before.

For now, I'm treating Sheet Markup as an external DSL, which can be embedded in a Markdown document using a fenced code block. But there are certainly benefits (and costs) to developing an internal DSL for spreadsheets along the lines of what you're suggesting.

antman · on April 12, 2023

B2:B6 is a bit out of context? perhaps also add SUM(*Cost*)?

Groxx · on April 12, 2023

Seeing it in a context without a UI to show you the row and column numbers felt a bit awkward to me too...

... and it made me wonder if some other syntax would work better, which made me think that maybe something like this would work?

    =sum(column | 2+ | above)
         targets whole column "stream"
                  second item and later
                       above this cell
         pipes manipulate the target "stream"

I'm waffling between rx-like and unix-like for terms though. Or something else. But a much more relative-and-whole-sheet-focused language seems like it could be a lot nicer than pinning cell IDs everywhere.
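One way to read the pipe idea is as a chain of filters over a column of cells. The Python sketch below is just one interpretation of that hypothetical syntax: the stage names `column`, `N+`, and `above` come from the example, while the function name `resolve` and everything else is an assumption.

```python
def resolve(grid, row, col, selector):
    """Resolve a selector like 'column | 2+ | above' for the cell at (row, col).

    Rows and columns are 0-based indices into grid (a list of rows).
    """
    stream = []
    for stage in (s.strip() for s in selector.split("|")):
        if stage == "column":
            # Target every cell in the current cell's column.
            stream = [(r, line[col]) for r, line in enumerate(grid) if col < len(line)]
        elif stage.endswith("+") and stage[:-1].isdigit():
            # "2+" keeps the second item and everything after it.
            stream = stream[int(stage[:-1]) - 1:]
        elif stage == "above":
            # Keep only cells above the current cell.
            stream = [(r, v) for r, v in stream if r < row]
        else:
            raise ValueError(f"unknown stage: {stage!r}")
    return [v for _, v in stream]


if __name__ == "__main__":
    grid = [["Item", "Cost"], ["Rent", 1500], ["Utilities", 200], ["Total", None]]
    print(sum(resolve(grid, 3, 1, "column | 2+ | above")))
```

Because the selector is relative to the cell holding the formula, adding a row above the total would be picked up without editing any cell IDs.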

diarmuid_glynn · on April 12, 2023

Note that you can represent much more complex spreadsheets using Sheet Markup, and while in the above example it might be clear what SUM(COST) would mean, in a more complex spreadsheet, with multiple different tables, it might be ambiguous.

As for why one would possibly ever want to use Sheet Markup for more complex spreadsheets, one use is as a way to interact with an LLM. We've started to see some interesting results using GPT-4 to analyze various kinds of spreadsheets that have been encoded in Sheet Markup.

pflanze · on April 12, 2023

Pandoc also supports[1] pretty featureful ways to declare tables, may be worth looking into.

[1] https://pandoc.org/MANUAL.html#tables

diarmuid_glynn · on April 12, 2023

Pandoc tables remind me of the reStructuredText tables, which I used back in the day: https://docutils.sourceforge.io/docs/user/rst/quickref.html#...

Very powerful, but I found it challenging to remember the syntax since I was only using them intermittently. Still, it could indeed form the basis of a more advanced spreadsheet markup syntax, supporting things like merged cells (which Sheet Markup does not, and probably never will, support).

layer8 · on April 12, 2023

Thanks, that would be great.

seanosaur · on April 12, 2023

This looks great! Are there plans to spin up a plugin for Obsidian and similar apps?

diarmuid_glynn · on April 12, 2023

I'm not familiar with Obsidian, I'll take a look.

One thing I should mention, we have another tool which makes it easy to embed a spreadsheet in another app via an IFRAME:

- https://www.equalto.com/suresheet

The benefit of using the above is that a Sure Sheet URL will always load the "same" spreadsheet. Edits aren't automatically saved, unlike (say) a Google Sheet.

noisy_boy · on April 13, 2023

I use advanced tables[0] and Excel to markdown tables[1] plugins - they make working with tables in Obsidian a bit easier.

That said, I think tables are markdown's Achilles heel - anything involving multi-line content starts to make things complicated.

[0]: https://github.com/tgrosinger/advanced-tables-obsidian

[1]: https://github.com/ganesshkumar/obsidian-excel-to-markdown-t...

MilStdJunkie · on April 12, 2023

Dang, that's nifty. In Asciidoc, everything has to go to CSV or some kind of delimited format, which means you need the TextQL extension if you want to calculate at the document layer. I've given guidance that the compute needs to be done before it goes into the document change control - TextQL lets you cheat, but you don't want to hinge a document process on something like that.

A little off topic, but it's something I wanted to ask from a more technical audience than myself. Asciidoc's table model can be instructed to use any arbitrary character as the delimiter (pipes, commas, tabs, etc) , which led a lot of people to ask me: why not support JSON as tabular format? At the moment, JSON has to be rendered via PlantUML (JSON) block. The only answer I could give (aside from RFC 4180, which is at the heart of adoc's table model) was that JSON, like XML, can recurse a record arbitrarily - making it pretty difficult, from a compute perspective, to render with a given resource. You can have columns in columns in columns. Here's my confession: I'm not really sure my answer holds any water. Any table model that supports merging and splitting (which Asciidoc's does) can support a modest level of recursion. So probably, the real reason, is that it's just too damn hard to extend the table model to JSON data.
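To make the "columns in columns" problem concrete, here is a Python sketch that flattens nested JSON records into dotted column names. It says nothing about how Asciidoc works internally; the function names and the dotted-name convention are assumptions for illustration.

```python
import json


def flatten(record, prefix=""):
    """Flatten nested dicts into a single dict with dotted keys."""
    flat = {}
    for key, value in record.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, name + "."))
        else:
            flat[name] = value
    return flat


def json_to_rows(text):
    """Turn a JSON array of objects into (columns, rows) for a flat table."""
    records = [flatten(r) for r in json.loads(text)]
    columns = []
    for rec in records:
        for name in rec:
            if name not in columns:
                columns.append(name)
    # Records that lack a column get an empty cell.
    rows = [[rec.get(name, "") for name in columns] for rec in records]
    return columns, rows


if __name__ == "__main__":
    print(json_to_rows('[{"a": 1, "b": {"c": 2}}, {"a": 3, "b": {"d": 4}}]'))
```

The column set is only known after every record has been read, and each level of nesting can add columns, which is part of what makes JSON awkward as a table source compared with a fixed-width delimited format.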

iddan · on April 12, 2023

Great stuff! If you'd like to add a spreadsheet to your React / MDX I created a similar small component https://github.com/iddan/react-spreadsheet

diarmuid_glynn · on April 12, 2023

Very nice - I don't think I've come across this project before.

I'm curious, what's your vision for react-spreadsheet? I notice it supports some formulas, and a "single sheet" view. Do you plan to make it a more complete spreadsheet component in future, or do you see that as out of scope?

iddan · on April 12, 2023

Thank you! The vision is to support as many spreadsheet capabilities as possible while retaining the simplest API and a small enough size. So multi-sheet is possible as long as using the component with a single sheet is still simple.

8n4vidtmkvmk · on April 13, 2023

so...this isn't bidirectional? because the first thing i did was edit the spreadsheet on the right and i was expecting it to update the markdown on the left but i guess not.

should just disable editing the spreadsheet if that's how it's going to be

sagaro · on April 13, 2023

There are some use cases where it doesn't need to be bidirectional and yet provide users with the ability to run some formulas or sanity checks etc.

For instance I can publish an article on the economics of ecommerce. And someone consuming my article might want to check the unit economics and might want to just divide the numbers by the total units sold. Instead of having to do that on a calculator, he can just do it on the spreadsheet.

diarmuid_glynn · on April 13, 2023

Right, the idea is you can use Sheet Markup to author interactive spreadsheets. Consumers of those interactive spreadsheets can then modify the data / formulas in the spreadsheet to perform ad-hoc analysis (which are not saved).

That said, I think 8n4vidtmkvmk has a point that it would be nice if when authoring the Sheet Markup, edits in the spreadsheet preview would be applied "in kind" to the Sheet Markup (bidirectional sync). This would mean you could author the Sheet Markup using the spreadsheet preview, instead of relying 100% on Sheet Markup for authoring.
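The "back" half of a bidirectional sync would need to turn the edited grid into markup again. A rough Python sketch of that serialization step, assuming the grid is a list of rows of cell strings (the function name is hypothetical and this is not how the StackEdit extension works today):

```python
def to_sheet_markup(grid):
    """Serialize rows of cell strings as pipe-separated, column-aligned lines."""
    if not grid:
        return ""
    width = max(len(row) for row in grid)
    padded = [list(row) + [""] * (width - len(row)) for row in grid]
    col_widths = [max(len(row[c]) for row in padded) for c in range(width)]
    lines = []
    for row in padded:
        cells = [cell.ljust(col_widths[c]) for c, cell in enumerate(row)]
        # Trailing padding on the last cell is dropped.
        lines.append(" | ".join(cells).rstrip())
    return "\n".join(lines)


if __name__ == "__main__":
    print(to_sheet_markup([["**Item**", "**Cost**"], ["Rent", "$1500"],
                           ["**Total**", "=SUM(B2:B2)"]]))
```

Writing formulas back as formulas (rather than their computed values) is what would keep the round trip lossless.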

8n4vidtmkvmk · on April 14, 2023

Yes, there should be 2 modes of operation. One for authoring and one for consumers. This link looks like it's authoring because it has the markdown on the left. Viewing mode presumably wouldn't let me edit the markdown at all. In that mode, letting me tinker with the numbers and not have them save might be helpful. But I'd go one step further and add a "Make a copy" button under the sheet which exports the data into Google Sheets or downloads a CSV.

blackbear_ · on April 12, 2023

Happy to see the world slowly catching up to org mode in emacs.. ;) jokes aside, cool stuff!

bestouff · on April 13, 2023

That's cool and all, but I would have preferred ```csv as a marker instead.

rad_gruchalski · on April 12, 2023

This looks really cool. I would love to have an embedded spreadsheet on my blog for some use cases. However, it seems that all entered data is siphoned into your service. Correct?

diarmuid_glynn · on April 12, 2023

No, the data you enter is not sent to our service.

If you inspect the "Network" tab in Chrome, you can verify that there isn't any network I/O after you modify the markdown.

Edit: and thanks for the compliment! I should mention that most of the look-and-feel is courtesy of StackEdit:

- https://stackedit.io/

Our contribution was to extend StackEdit to render spreadsheets using our Sheet Markup syntax.

rad_gruchalski · on April 12, 2023

I am sorry, I didn't explain properly. I watched your demo at https://www.youtube.com/watch?v=HobLlnD7Im0&t=5s. A sheet is loaded via the API. What happens with the data I enter? Does it stay in the browser? Can I use an API to dump and restore data and formulas?

diarmuid_glynn · on April 12, 2023

Ah, gotcha! The video is referring to a different product (in beta), "EqualTo Sheets". Sheet Markup uses some of our EqualTo Sheets tech, but it's a different product.

So, regarding EqualTo Sheets:

> What happens with the data I enter?

Data entered into an EqualTo Sheets workbook is saved to the EqualTo server.

This is to some extent the value we provide with EqualTo Sheets: you can just paste the code snippet we provide into your code base and immediately have a functioning workbook that saves changes and supports parallel editing.

> Can I use an API to dump and restore data and formulas?

Yes, we have a bunch of APIs. You can export / import XLSX, as well as read / write individual cells using REST and GraphQL APIs. Some more details:

- https://sheets.equalto.com/beta-readme

- https://sheets.equalto.com/docs/

- Join the open beta (just provide an email address and click on a link in the email you receive): https://sheets.equalto.com/

rad_gruchalski · on April 13, 2023

Thanks for explaining. I see that you are a German company. Does that mean the data remains inside of the EU? When I use an EqualTo sheet, can I somehow know where my data ends up?

Asking because your subprocessors list doesn't give an immediate answer.

diarmuid_glynn · on April 13, 2023

Currently, EqualTo Sheets data is stored in the US on Heroku. We have a signed DPA with Salesforce (owner of Heroku), so as to maintain GDPR compliance. Additionally, we can provide self-hosted instances to Enterprise customers. Feel free to reach out to me (email in my profile) if you'd like to discuss this further.

> Asking because your subprocessors list doesn't give an immediate answer.

Fair complaint :) I'll update our subprocessor page tomorrow to make this clearer.

thih9 · on April 12, 2023

If I add a row at the end saying “another | $1000”, the total doesn’t get updated (the SUM still points to B2-B6, so doesn’t include the new field). Is that intentional?

diarmuid_glynn · on April 12, 2023

Yes, that's intentional. Assuming “another | $1000” is on the 7th row, you would need to update the formula to:

    =SUM(B2:B7)

to incorporate it into the sum.
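Since the range is fixed, a tool that appends rows would have to rewrite the formula itself. A minimal Python sketch of that rewrite, assuming plain A1-style ranges without `$` anchors or sheet names (the helper name is made up for this example):

```python
import re


def extend_range(formula, new_last_row):
    """Rewrite the end row of every A1-style range in a formula."""
    def repl(match):
        start_col, start_row, end_col = match.group(1), match.group(2), match.group(3)
        return f"{start_col}{start_row}:{end_col}{new_last_row}"

    return re.sub(r"([A-Z]+)(\d+):([A-Z]+)(\d+)", repl, formula)


if __name__ == "__main__":
    print(extend_range("=SUM(B2:B6)", 7))
```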

nine_k · on April 12, 2023

I wonder when Markdown will adopt enough features of Org Mode to become a viable replacement for people not using Emacs.

(Yes, Org Mode has spreadsheets, of sorts.)

diarmuid_glynn · on April 12, 2023

Cheers!

I've heard of, but never used, the emacs spreadsheet / org mode stuff. I should probably review it for concepts that I could steal / be inspired by ;P

yawnxyz · on April 13, 2023

wow so cool! Do you guys have plans to open source some of this in the future?

diarmuid_glynn · on April 13, 2023

Thanks!

Some of the tech is open source ( https://github.com/EqualTo-Software/stackedit-sheet-markup ) and some of it depends on tech in our closed-source EqualTo Sheets product ( https://sheets.equalto.com/ ), which is in beta right now. We've considered open-sourcing some / all of EqualTo Sheets, and it may yet happen, but it's not something I could commit to right now.