Hacker News
PySheets – Spreadsheet UI for Python (pysheets.app)
287 points by tosh 6 months ago | 72 comments



The author of PySheets here. The app is written entirely in Python, running on PyScript, using PyScript-LTK, with two Python VMs, MicroPython and Pyodide. The web server has minimal logic and runs on gunicorn on DigitalOcean. Storage is on Firestore. The app can easily be packaged up as a standalone, "on-prem" app, but I have not given that much priority for now. I would love to hear what you all think of writing web apps in the browser in Python.


LTK, a Python library to create browser UIs, is part of the open-source PyScript project. See https://github.com/pyscript/ltk. Anyone planning to visit PyCon US in Pittsburgh, I will be in the Anaconda booth most of the time. See you there.


>>> Would love to hear what you all think of writing web apps in the browser in Python.

I like the idea. I'm not a commercial dev, but a so-called "scientific" programmer, meaning that I use programming mainly as a problem-solving tool. Once in a while I create little apps for my colleagues to use, many of whom don't program, though they can manage spreadsheets quite well.

I'm pretty committed to Python at this point, but deployment of an app is a headache, and I've explored a variety of solutions. I've written a couple of web apps using *flet*, and they run on pretty much every platform I've tested. This seems like a nice approach.

The thing I'd like to figure out is how to give a web app access to a user's files, though I appreciate why this should be difficult for security reasons.


Any PySheets sheet runs in the PyScript context and can access the JavaScript window and DOM. Therefore, you can use the browser to access the file system: https://developer.mozilla.org/en-US/docs/Web/API/FileSystem.

Alternatively, you could create a form to "upload" a specific file, but instead of uploading it, read the bytes from a PySheets cell function.
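A rough sketch of that second option, assuming the file's bytes have already reached Python: the standard library can parse them in memory without any server round trip. How the bytes travel from the browser's file input into the cell function is not shown, and `rows_from_csv_bytes` is a hypothetical helper name, not part of PySheets or PyScript.

```python
import csv
import io


def rows_from_csv_bytes(data: bytes, encoding: str = "utf-8") -> list[dict[str, str]]:
    """Parse CSV bytes held in memory into a list of row dictionaries."""
    # Wrap the bytes in a text stream; newline="" lets the csv module
    # handle line endings itself.
    text = io.TextIOWrapper(io.BytesIO(data), encoding=encoding, newline="")
    return list(csv.DictReader(text))


# Stand-in for bytes read from a user-selected file (made-up sample data).
sample = b"name,score\nada,3\ngrace,5\n"
rows = rows_from_csv_bytes(sample)
print(rows)
```

All values come back as strings, so numeric columns would still need converting before analysis.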


Wow, PyScript has come a long way. I remember when loading it into the browser would take 5-10 seconds. This seems much faster. Great work!!


Lots of blood, sweat, and tears. And MicroPython!


Why no ISO 26300 (OpenDocument) support?


Most people will import external sheets using Pandas. It has numerous conversion methods for table-structured data sources, such as https://pandas.pydata.org/docs/reference/api/pandas.read_exc....


Shameless plug: If you have bigger data sets, check out rowzero.io.

We implemented something like PySheets initially where the formula language was full Python. But we found the Python interpreter to be the bottleneck during (e.g.) large CSV imports, and the GIL prevented parallelizing evaluation. It was also harder for business users to adopt due to small syntactic differences between Python and the Excel formula language.

So we implemented the spreadsheet engine and formula language in Rust. We have a Python code window that allows you to write arbitrary Python functions. Those functions can be called as formulas from any spreadsheet cell. We seamlessly marshal Pandas dataframes from Python land to spreadsheet land and back. It gives you 90% of the benefits of pure Python without compromising on performance.
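To illustrate the general idea of calling Python functions as formulas, here is a minimal sketch in plain Python. It is not Row Zero's implementation (their engine is in Rust, as described above); the `formula` decorator, the `FORMULAS` registry, and the `MARGIN` example are all hypothetical names made up for this illustration.

```python
from typing import Callable

# Hypothetical registry mapping formula names to Python callables.
FORMULAS: dict[str, Callable[..., object]] = {}


def formula(func: Callable[..., object]) -> Callable[..., object]:
    """Register a Python function under an upper-case formula name."""
    FORMULAS[func.__name__.upper()] = func
    return func


@formula
def margin(revenue: float, cost: float) -> float:
    """Gross margin as a fraction of revenue."""
    return (revenue - cost) / revenue


def call_formula(name: str, *args: object) -> object:
    """Look up a registered formula by name (case-insensitive) and call it."""
    return FORMULAS[name.upper()](*args)


# A cell containing =MARGIN(200, 150) would resolve to this call.
result = call_formula("margin", 200.0, 150.0)
print(result)  # 0.25
```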


Rowzero is a better spreadsheet, while PySheets is a better Jupyter Notebook. Although they converge in certain aspects, their distinct target audiences set them apart. This divergence may create some overlap, but it also leaves ample room for user preference.

PySheets currently runs inside the browser, on top of WebAssembly, and the limitations there are bigger than just Python's slowness. You have only 4 GB of addressable memory, including the interpreter and libraries. Network bandwidth is also a limiting factor for client-side computation.
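For a sense of scale, the 4 GB figure follows from 32-bit addressing in WebAssembly. The back-of-the-envelope calculation below gives a hard ceiling only: it assumes every byte holds 8-byte float values, which is unrealistic because the interpreter, libraries, and Python object overhead all take their share.

```python
# 32-bit WebAssembly addresses memory with 32-bit pointers.
WASM32_POINTER_BITS = 32
address_space_bytes = 2 ** WASM32_POINTER_BITS
address_space_gib = address_space_bytes / 2 ** 30

# Assumption for illustration: cells stored as raw 8-byte floats, nothing else in memory.
BYTES_PER_FLOAT64 = 8
max_float64_cells = address_space_bytes // BYTES_PER_FLOAT64

print(address_space_gib)   # 4.0
print(max_float64_cells)   # 536870912
```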

That said, PySheets can render a sheet based on a 50,000-row Excel sheet in 0.5s and needs about 20s to do a full end-to-end recompute run. There are limits to what you can do in the browser without using an external kernel that can run Polars on large datasets. But I think most people will be fine with what PySheets lets them do.

Finally, as the author of PySheets I am honored that a "competitor" sees us as a threat. I am quite impressed by Rowzero myself. Nice work :-)


Kudos on the technical achievement. We considered the thick client approach you're doing, and one of the reasons we punted was because it was so hard.

One really nice thing about your approach is it minimizes infrastructure cost. That positions you well for embedding use cases, like New York Times visualizations, that we struggle to do economically.

Best of luck!


Yes, my total development bill for EVERYTHING, including DigitalOcean, Google, and OpenAI is about $15.


Kudos to you. I would be quite flattered to have built a thing that competes with what a small startup built.


I am feeling pretty okay now, indeed. I played golf today. It was on a par-3 course, so it only tested my short game. However, I scored -1, with almost a hole-in-one. I blame it on the success of PySheets :-)


I've been trying to find a platform for creating dashboards where some data comes from spreadsheets and some data comes from databases. Something like a notebook interface crossed with a Grafana interface, while also enabling forms for input, is sorely missing. While it can be stitched together, speed/performance and flexibility (in terms of JS or Python) seem to be lacking at the moment.

I want to use such a thing to create internal dashboards similar to retool.


Does it need to be live (i.e., when the database or underlying spreadsheet updates, does it need to be reflected in real time on the dashboard), or are you OK with a static display?

Live-updating data is a pain. I've messed around with using JavaScript to force-refresh HTML iframes on a timer, but I was never really satisfied with this. I've heard you can do things with websockets, but that is starting to get too complicated for me (I'm not a programmer).

For static stuff, one of the data scientists in my org pointed me to Streamlit (https://streamlit.io/). It's a Python package I found very easy to use. You can easily combine SQL with CSV imports and display them all on one dashboard, and you can use forms, toggle buttons, etc. to control the display.


You can do that today with PySheets. On the PySheets landing page, you can find a live example. The data comes directly out of a sheet that uses a service to convert metrics into charts. For example, one of the three charts shown on https://pysheets.app/#Traction is directly embedded as an iframe from https://pysheets.app/embed?U=uXNuCGO2JU1E5aL7zcOh&k=C12. If I rerun the sheet that produces the charts, the PySheets landing page updates automatically with the latest data.


You should try http://rowzero.io. We connect directly to DBs and data warehouses, support Python natively, and scale up to hundreds of millions of rows.

Lots of people use us for dashboards.


Rowzero seems incredible, but this and PySheets target the wrong users. You are targeting data scientists, while I would target finance people to get traction. So let me tell you why I would use it as a data scientist but not as a finance guy:

1) It runs in the cloud. I would go with something that runs locally (or on premise), since there is sensitive data there (with Rust as a backend that should be fine; with Python you need to ship a set of libraries using Docker), or it should be integrated into GCP/AWS/Azure.

2) You need to create a PowerPoint/Word alternative as well, where you can just copy/paste stuff, or you need to make copy/paste into PowerPoint/Word easy.

3) Push hard on big data and DB connections; right now those are the bottlenecks. Also create some Python APIs for popular services in finance (Bloomberg, Factset, CapitalIQ, ...) so that they are available out of the box with a subscription.

4) Do something for the text part, like getting embeddings for similarity or fuzzy matching in Python. The interface could probably also be different for analyzing text (highlighting keywords in green, searching in text, and so on). People in finance often work with PDFs as well, and having everything in one platform is nicer than having two windows as they do today.


PySheets has been designed to run on-prem and on GCP as well. The beta version you are looking at is just offered as a zero-install experimentation platform. We are actively talking with financial institutions, and both co-founders on the team, https://pysheets.app/#Team, have a long history in Finance, so we are very sensitive to all the (correct) points you make. We will look in more detail at your very helpful suggestions!


Any chance you could expand on how the DAG is implemented in Rust for the execution engine? I'm trying to do something similar (not for spreadsheets but rather for a language: https://docs.yoctoproject.org/bitbake/bitbake-user-manual/bi...). I cannot find any good examples of how to implement something like this in Rust. E.g. should I use a graph library like petgraph, or roll my own?


PySheets is not based on Rust. It is 100% Python.


I was replying to the rowzero guy; rowzero is written in Rust.


Both of the solutions seem interesting for different reasons. @breakognize. You said 90% of the benefits. Can you or @laffa give an example of the 10% that would prevent me from using your solution?


Are Row Zero and/or PySheets open source?


A major part is, in the form of PyScript-LTK. I keep moving more of PySheets to LTK as I find reusable parts. I truly love open source, but I am also trying to get some revenue for the months of work I spent on developing PySheets.


nah, but it would be nice to have a communist version too.


That was not the point; there is a natural focus on HN towards open source software. Open source is not equal to communism.


For a non-browser python based spreadsheet application: https://pyspread.gitlab.io/


Great idea: an easy-to-use GUI for non-technical users and Pandas for data-oriented users at the same time.

Is there a similar project that is self-hosted? I would be uncomfortable uploading health-related data to an external service.


There’s a cool one I’ve used called MitoSheet[0]. It runs locally and has some great features, though it didn't support TSV files last time I checked. It’s still being actively developed. I believe it was developed with Y Combinator funding.

[0] https://www.trymito.io/


Grist is sort of similar. It's a spreadsheet/database hybrid that lets you use Python for formulas, and they have a self-hosted option:

https://www.getgrist.com/product/self-managed


Heavy emphasis on "sort of"; it enforces data types on columns, which is a significant difference from both spreadsheets and pysheets. This enables/requires more database-like behavior and planning (which is great for a lot of applications), but importing spreadsheets is much less intuitive and spreadsheet competence won't get you very far.

Grist's closer to "what if Access had an interface that was more like Excel". Pysheets is more like "what if Python data structures had a GUI that looked like Excel".

To put it another way, I love Grist but _would not_ recommend people who are using spreadsheets to try to bring their spreadsheets into it. I also love pysheets and _would_ recommend it for that usage.


I created buckaroo [1] as a better dataframe viewer for jupyter with built in summary stats. It's built to bring a better dataframe experience to people already using pandas/polars. All of it is extensible [2] so that you can customize stats and transformations to your workflow.

[1] https://github.com/paddymul/buckaroo

[2] https://youtu.be/GPl6_9n31NE


The PySheets server runs anywhere, for example: my laptop, Google AppEngine, and DigitalOcean. I designed it with on-prem in mind, so that PySheets could be deployed at companies that do not want to share data with external services.

That said, only the data stored in the sheet itself is stored in PySheets. Most use cases will load data from another place, filter and convert it, and then render a result. Still, self-hosting would be an interesting use case.


In the docs for my own project in this space, I created a whole related projects page. I figured if someone makes it to my docs, and buckaroo doesn't solve their problem, they should find something that does help them.

https://buckaroo-data.readthedocs.io/en/latest/articles/rela...


That's nice. I will do the same for PySheets, once the dust settles on the original launch.


Not exactly the same but Spyder IDE has very nice spreadsheet functionality for data. Maybe it works for you.


Another good option is Neptyne. Integrates with Google Sheets (provided you're comfortable storing your data there)


Any chance of a video walkthrough or tutorial? I can't figure out what the workflow is and which use cases PySheets addresses from looking at the landing page. I don't want to register an account just to find out.


Yes, I will do some videos in the coming week. I did an extended demo for the weekly PyScript FUN meeting, but it turned out it was not recorded <facepalm>.


That's awesome, really looking forward to seeing more about PySheets!


I just spent 30 minutes with RowZero and PySheets. RowZero seems to support huge datasets. I have been using quadratichq as a Python spreadsheet, but now I think RowZero has more features (and a cheaper price point). I thought PySheets might be open source, but it seems to be closed, 2x the price, and limited to 50 rows. Finally, I was not able to figure out how to import https://www.w3resource.com/python-exercises/pandas/excel/Sal... into PySheets.

I didn't know about VisiData; this is incredible. Thanks, everyone, for the informative post.


In the '00s and early '10s there was a London startup, Resolver Systems [1], that was trying to bring Python and spreadsheets together. Alas, they didn't make it, but I wonder if that was because Python had a much smaller mindshare back then.

[1] http://www.resolversystems.com


Yes. They actually had a product, and IIRC, I had downloaded and tried it.

I think they are some of the same folks who later founded PythonAnywhere, which I had also tried.

I read recently somewhere that they were acquired by Anaconda.


Anaconda is also a heavy sponsor of PyScript. The https://pyscript.com website is an online IDE for quickly trying out PyScript and building a website. If you like PythonAnywhere, you will love pyscript.com as well.


I'm a big fan of PythonAnywhere. The team are great.


This looks pretty cool, I am someone who gets annoyed by excel, sheets, numbers for not just letting you code it in a nice language like python and then visualize/query after that.

But then I see "AI-driven", which I should note is the _third_ line of text on the web page. I assume it is an important feature for the author of the page.

I control-f, "ai-driven", it is only used one other time on the page:

"Perform easy AI-driven visualization with Matplotlib"

There is no further elaboration on the home page and I have been unable to find additional docs. (Someone please post a snarky RTFM response with a link to the manual, cuz like I said I am very interested in this. I did google "pysheets docs" which uhh linked to a python library with the same name...)

Last week, for the first time ever, I used noted "AI" ChatGPT to review a resume I had written. I wouldn't normally do this, but the company I was applying for heavily emphasized that they use chatgpt to generate code and review things.

Ever the skeptic, I decided to try it myself. I have to say I was impressed with the results. EXCEPT, ChatGPT pointed out a grammar error in my resume which literally did not exist. Like, the sentence it was critiquing in its feedback was not found anywhere in my resume, nor was there anything similar (from my perspective; I'm sure 1000 layers deep in its network there was some similarity to something that had the error, and wouldn't it be cool if we could effectively debug that).

ANYWAY, when I see ai-driven without elaboration in a spreadsheet program, I am very concerned that my data might be "hallucinated" and I would encourage the author to explain what exactly this means. Will my charts be correct 99% of the time but sometimes a hallucination? What's going on here? I would probably be signing up for the beta right now if I had any idea. Thanks.

(final snark: funny that one of the authors is named Kurt Vile, what are the odds https://www.youtube.com/watch?v=4uAXMl-Bfiw)


If you sign up for PySheets, we give you 7 tutorials. Two explain how to use AI to import data, convert it to Dataframes, and visualize them using Matplotlib. The generated code is impressive and can help novice data scientists explore the Pandas and Pyplot APIs.

The AI is used to generate Python code, not to analyze or generate data in the sheet. I will clarify that on the landing page. Hopefully, that will inspire you to try it out.

This is a different Kurt Vile :-)


Thanks!!!


This looks like a great and very polished project! Leveraging Python in spreadsheets is a great idea, which is probably why Excel is doing it already, but it's nice to see an implementation that's so clear and easy to use.

It's hardly a criticism of PySheets specifically, but I wish spreadsheets were more restrictive (i.e., forced sheets into a table format) so that people could build out spreadsheets in an org without creating an unholy mess that needs to be picked apart and reverse-engineered in something that isn't a spreadsheet.


I envisioned many of the use cases not to store data in the sheet, but to use PySheets as a better Jupyter Notebook: Import data, convert to Dataframes, massage, analyze, learn, and export. A good example is how I have a sheet that loads PySheets usage metrics, converts to dataframes, plots in graphs and then renders as live charts on the pysheets.app landing page.


Very interesting software/app. My current company has a lot of Excel files with a lot of business logic embedded in them as Excel formulas. When we import an Excel file into PySheets, does it also recognize the formulas in the original Excel file? Are there any videos that show what PySheets can do? Thank you.


Try cut-and-pasting a sheet from Google Sheets to PySheets. It works quite well. At the moment, PySheets does not handle Excel functions. This is on our possible roadmaps, but we just did not get to it yet. I really only worked on PySheets for about 3 months, since resigning from my last job in February.


Thank you. As someone else has also commented here, a video walkthrough would be very helpful for us to get a sense of what the app can offer.


Python is the new Excel. And now PySheets is the Python.


:-)


That's not what I hoped it was. Also it's weird to shove AI into this.

I was hoping for spreadsheets I could integrate into Jupyter.


You can load spreadsheets into Jupyter today. With Pandas or Polars, you can import CSV or Excel sheets quite easily. PySheets is reimagining what Jupyter Notebooks would look like if you use a DAG, not a linear execution flow.
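To make the DAG idea concrete, here is a minimal sketch of dependency-ordered recomputation using Python's standard `graphlib` module. It is not PySheets' actual implementation; the `cells` layout and the `recompute` function are assumptions made for this illustration.

```python
from graphlib import TopologicalSorter

# Hypothetical sheet: each cell lists the cells it depends on and a function
# that computes its value from those dependencies.
cells = {
    "A1": ((), lambda: 10),
    "A2": ((), lambda: 32),
    "B1": (("A1", "A2"), lambda a1, a2: a1 + a2),
    "C1": (("B1",), lambda b1: b1 * 2),
}


def recompute(cells):
    """Evaluate every cell after all of the cells it depends on."""
    # TopologicalSorter expects a mapping of node -> predecessors.
    graph = {name: deps for name, (deps, _) in cells.items()}
    values = {}
    for name in TopologicalSorter(graph).static_order():
        deps, func = cells[name]
        values[name] = func(*(values[dep] for dep in deps))
    return values


values = recompute(cells)
print(values["C1"])  # 84
```

Unlike a notebook's top-to-bottom execution, the evaluation order here depends only on the dependencies between cells, not on where they sit in the sheet. A real engine would also recompute only the cells downstream of an edit rather than the whole sheet.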

Just as Copilot or Sourcegraph's Cody is used in VS Code, PySheets uses OpenAI to suggest the Python code to write when the sheet contains a Pandas dataframe of a certain shape. The AI accelerates figuring out what APIs to call and when. I myself find Matplotlib and Pyplot highly confusing, and a coding assistant that writes my code in this niche makes me a lot more productive. It is cool to say, "Take the dataframe in E13 and generate an orange bar graph for it," and see the code generated.


No no no. I can do grids of data with Pandas.

What I can't do (or don't know how to do) is have the data presented in an editable and interactive sheet inside Jupyter.

I've tried the widgets, and they're generally painful to use, as is maintaining the extensions they require.


An all-Python DAG with a spreadsheet interface for taking inputs and displaying output to end users? Sweet!


And it all runs in your own browser without needing any cloud kernels.


For the non-GUI folks:

There is visidata for Python&Spreadsheets&On-Prem(&SQL) as a TUI.

visidata.org


This is awesome, and client-side Python is magic :-)


Thanks, much appreciated.


What's the tech stack?


Browser > JS > WebAssembly > Pyodide/MicroPython > PyScript > LTK > PySheets > Your Python.


I think the underlying platform (built by that team) was posted here only a week or so ago... I'll see if I can find it

EDIT: found it! LTK: https://github.com/pyscript/ltk


awesome


Thanks! - the author of PySheets


Interesting.


Thanks!



