Hacker News new | past | comments | ask | show | jobs | submit login

Spreadsheets are a hammer so if that's all you know then all problems look like the nail. Really, many power users of Excel ought to be looking at more capable, testable and readable solutions from professional data analysts and scientists. Python would be a good start, but there are many options better than Excel for critical systems.



> Really, many power users of Excel ought to be looking at more capable, testable and readable solutions from professional data analysts and scientists. Python would be a good start.

I can write python, but I can get the answer out of excel in a fraction of the time for most use cases.

As an added bonus, excel documents are far more ubiquitous and I can share them with clients, and they can see my workings. Python isn’t very transparent and business users can’t usually check the logic.

Excel is also much more interactive and allows for much more “discovery” and playing with the data.

What I would say though, is most power users don’t know about functionality in excel like PowerQuery, relational data models and DAX that actually do turn it into a serious and repeatable tool.

(I’m not arguing that Python isn’t better for many use cases, and excel definitely isn’t right for most critical applications, but I also think Python is much slower for ad-hoc data analysis so both have different places in the market. Anyone think python is faster? I’m down for a race!).


> they can see my workings

How? Do they check that every cell in a column actually has the same formula? Do they check data format everywhere?

Python script is something that is possible to be actually reviewed, and results - reproduced, and "formula's" there are actually readable.

But for some stupid reason excel files are still shared over email.


It's only "stupid" if one suffers from a catastrophic lack of empathy for people who are not programmers and for the incentives to which they are exposed--which do not include using a software developer's preferred tools, nor do they include the carving out of time with which to learn them.

I would hope that we would be generally wiser than that here.


Like being a "programmer" is some genetic trait or something, or like using excel in any more or less productive manner doesn't require carving out hell a lot of time, it's just that time is taken from user's lives in small pieces, and doing PROGRAMMING seems like taking a university course.

I actually have a lot of empathy for people who forced to deal with all those problems, otherwise I just wouldn't care.


> Do they check that every cell in a column actually has the same formula?

They just have to check the top cell as most of my formulas are array formulas.

You don’t have to drag a formula down - that’s a common misconception in the latest versions. If you do =A1 + B1 and want to apply it to the 1000 cells below you just write = A1:A1000 + B2:B1000.

That’s still not that readable though, so I’ll apply those cells two named ranges “Sales” and “Taxes”.

Then the formula is = Sales + Taxes once and that will populate the whole column of data.

Then there’s M Code and PowerQuery which literally allows you to review the data cleaning line by line and even see the data state at any intermediary step. It also has

> Python script is something that is possible to be actually reviewed

The problem for me is that, as someone who works in consulting, it can’t be reviewed by my boss or a client, neither of whom can program. But they can review a tidy excel sheet.

And then they can’t edit it either, so if I go on holiday and I’ve built some sort of model nobody else can make progress until I come back, or if I move projects I’m also stuck maintaining the model on the old one.


That's terrible practice, ironically, as it's extremely unreadable. How could someone looking at it know what Sales and Tax actually are? You have to go into the formula name box and dig in to find sales = "yada yada" etc. That doesn't seem too bad until you have a decently sized file and you have to dig into 40 formulas to find the one you want, and go check that it's actually referencing what you want.

I work as a banker, and what you do is one of the very first things new employees are taught not to do in excel. It's an amazing solution for the person that built it, but a terrible one for anyone looking to check the work.


> That's terrible practice, ironically.

Can you point me to the 'best practice' guide you are referring to? What authority on excel standards said this?

Personally I tried to find articles saying it's not best practice by typing in "dont use named ranges best practice" or "named ranges in excel are bad" into google, but it mostly brings up articles stating that using named ranges is best practice and improves readability!

> How could someone looking at it know what Sales and Tax actually are?

If you really want to use that example, you would click in the formula bar and it will highlight the ranges, and colour code them automatically.

If it's on a different sheet, you just hit ctrl + g and type in the name, and it will take you directly to the cell it's linked to (which usually in my case, is linked to a sheet that contains all my model's assumptions in one place, each one with a named range describing what it is). It's much easier and quicker than going to =Assumptions!G52.

> I work as a banker, and what you do is one of the very first things new employees are taught not to do in excel.

Seems like a silly thing to teach people IMO. In my experience it makes formulas much more readable (both writer and reader), makes it much faster to build models, and cuts down errors substantially.

I personally find that formulas are much easier to review, because the named ranges provide some intent. If someone writes =(A2Assumptions!92)/Assumptions!91 I've got to really unpick it to work out if it's right, but if someone labels it =(A2Miles_Per_Hour)/Average_Miles_Per_Vehicle then I can see that the formula is wrong almost instantly.

Additionally if I want to write another formula using those values, I can just type it straight into the formula bar without having to go and click on the right cell reference in another sheet.


>Can you point me to the 'best practice' guide you are referring to? What authority on excel standards said this?

Well investment bankers tend to be pretty darn good at excel, and the banks I've been at would scorn you for doing it, as well as all the standardized training given out to fresh recruits to the industry. We had a buy-side deal recently fall through partially because the (fairly sophisticated) model the company selling itself used was absolutely unreadable and unaccountable. They did exactly what you said (naming), including with their assumptions. It was 20 sheets of unpenetrable mass, and we were all turned off by the fact that you couldn't follow it whatsoever, and was basically unaccountable.

>If you really want to use that example, you would click in the formula bar and it will highlight the ranges, and colour code them automatically. If it's on a different sheet, you just hit ctrl + g and type in the name, and it will take you directly to the cell it's linked to (which usually in my case, is linked to a sheet that contains all my model's assumptions in one place, each one with a named range describing what it is). It's much easier and quicker than going to =Assumptions!G52.

This how I can tell you've never seriously worked with excel, because that's MUCH slower than using the native/macabacus auditing tools. Adds up over thousands of times. And what if Sales isn't just a simple cell in another sheet, but is actually tied to a named formula itself, tied to a named formula itself, tied to another worksheet (with named formulas in them!). That's a deep rabbit hole and you to be going down with little transparency, where you're having to look up the formula manager to find whatever the fuck the named variables are actually referring to (what if someone follows your advice and names the tax rate as "Tax" and now ctrl-g doesn't work!)

>Seems like a silly thing to teach people IMO. In my experience it makes formulas much more readable (both writer and reader), makes it much faster to build models, and cuts down errors substantially.

It makes formulas much easier to read, but with zero accountability, and considering someone will very likely be looking at the excel at some point, with no idea what you did, they have to check each and every one out to make sure it's not bullshit. Plus, it doesn't really make modeling faster assuming you've built out your source numbers/assumptions well and use best practices (like A24+A25+A26 instead of A25+A26+A24).

>I personally find that formulas are much easier to review, because the named ranges provide some intent. If someone writes =(A2Assumptions!92)/Assumptions!91 I've got to really unpick it to work out if it's right, but if someone labels it =(A2Miles_Per_Hour)/Average_Miles_Per_Vehicle then I can see that the formula is wrong almost instantly.

Use tracing, and if the assumption tab is well built out it really isn't faster, at all. +alt w,n

>Additionally if I want to write another formula using those values, I can just type it straight into the formula bar without having to go and click on the right cell reference in another sheet.

Use multiple windows, makes life much easier.


> Well investment bankers tend to be pretty good at excel

> This is how I can tell you have never seriously worked in excel.

Ah, so the style guide and best practice is based on “Trust me - I rock and you suck”.

I can guarantee that I do use excel seriously, and my personal experience is the exact opposite. You wouldn’t see beautiful code with variable names like A2, and it’s exactly the same for me with excel.

Besides - a feature you can turn off in about 5 minutes stopped you from doing a deal?! Why not just explode the named ranges out? I suspect the issue here is an overly complex model rather than named ranges - I’ve seen lots of these without named ranges too and they are even less manageable in that state!


This debate about whether to use named-ranges in Excel formulas brings back some distant memories. I was on a team at Goldman Sachs that built Excel tools for the investment bankers and because of that we'd help out with the new banker Excel training held at the giant Chelsea Piers sports complex.

Every year freshly minted MBA's would begin the training and immediately become aghast when the trainers told them to never, ever, use named-ranges in formulas. Not only had the trainees been explicitly taught to use named-ranged in their MBA programs, but any idiot could plainly see that [=enterprise_value/ebitda] was better in every way than [=C13/F22]. More expressive, more readable, easier to spot errors, etc.

The trainers would argue that in an MBA program you build your models, submit them, and move on to the next assignment, so you don't get a view of the longer-term problems that arise from named-ranges. What does [=ebitda] actually tell you? Is that the last quarter's actual number, is it the current estimate of the next fiscal quarter, is it a 12-month blended forward estimate?

And as you try to solve those questions with [=ebitda_est_next_fiscal_qtr] you often end up with two more problems: an unambiguous name for you can unambiguously mean something else to someone else and as you update your model over time, if you forget to update your named-range references, you have formulas that look right, but are wrong, e.g., [=ebitda_est_next_fiscal_qtr] now refers to an actual reported ebitda number, not next quarter's estimate.

[=C47] while not telling you much at all, is unambiguous. It doesn't look "right" or "wrong". It can't mislead you. If you want to know what it is, go look at cell C47. The new MBA's would argue, "Wait a second, C47 could now point to the wrong data as well, and it could be mislabeled with a stale row or column header. And the trainers would reply, "Exactly, and when you go to see what C47 represents you will have the context to recognize those errors and fix them."


You put it in much better words than I could, thanks haha.

Also, very cool experience at Goldman!


heh, your exchange looked pretty much like a holywar between programmers. Except problems looked like if you're discussing merits of different old varieties of BASIC, with GOTO and line numbers. I really don't see how excel is more visual, or easier to grasp after adding just a several columns of data or formulas.

I wonder if there could be made some middleground language, visual enough to not being scary for spreadsheet people, and debuggable and readable enough to not become a mess when model becomes big. Jupyter comes to mind, but it looks like it didn't get any traction outside data science.


Many power users of excel do use more capable, testable and readable solutions. But they also use excel. Programming sucks in many ways that excel does not. Excel is incredibly flexible and adaptable.

If you need to treat one row differently as a special case that is easy in excel but can be a lot of work with programming, both for technical reasons (you need to handle the special case with special code) and for silly reasons (in excel you click the row which is special; in a program you must describe how to find the row.) Certainly there is an argument that one shouldn’t have such special cases as they make maintainability a nightmare but sometimes it matters a lot more to have something now rather than something later.

In excel, the data is the most important thing and in the real world it often is too. In a program, the data (especially the intermediate data) is quite invisible. Power users may spot errors because intermediate values look wrong, rather than by carefully reading the formulas. Though if the values look right or there are a lot of them, errors may not be spotted.

Excel is very ad-hoc which is the problem people complain about but also it’s power in dealing with the changing real world and anything exploratory. Once it is working and stable in excel, it is more wise to do it in python.


Excel is more than just a hammer it’s the fork that is in every single drawer of every single company/org in the world.

It’s not about every problem looking like a nail, it’s that, nobody had to pay for this fork, somehow everyone has it and it’s such a great fork that it can do 90% of the problems you throw at it.

Yea sure some problems would be better solved with specialized tools, (eg knife, spoon, etc) but those require training and budget. No one has those. Everyone has a fork.


Cutting steak with a fork is a pretty good metaphor for working with a spreadsheet heavy enterprise team.


> nobody had to pay for this fork

Actually, everyone had to pay for that fork. Excel is a paid product, and isn't even included in the cost of Windows. It is, however, bundled with MS Office, which includes Word, which is even more of a must have at almost every single company/org.


I know very few people who have paid for excel specifically. everyone pays for MS office, albeit because it includes excel, but in an workplace setting the main thing you're paying for is outlook/exchange the fact that word/excel/powerpoint are bundled is a plus.

The main point I'm making is that no one has to go to their manager and justify paying for it. It's already bought and paid for, and because of its ubiquity most people have a pretty good idea of how to use it.


Even better, because some of those deals include MSDN, when Excel users grow beyond VBA, they just get VB.NET installed and keep doing AddOns in .NET.


While technically true, the office coffee costs more usually. It is generally just there, and no one usually is thinking of the costs, because it’s too small to care about.


I disagree.

Spreadsheets are like a Swiss Army Knife.

I have used them for

- Templates for Printing - Data Analysis - Financial Tracking - Report Generation - Calculation sheet for engineering stuff

and a multitude more uses that I cannot even remember now.

They are not a hammer, though that is one form in which they can work.

Ultimately they are a tool, and a damn fine tool at that.

Their potential is limited only by the creativity of the user using them.


Spreadsheets are great. I just wish folks would use a more appropriate tool for what they really want: a very light-touch database. Something like Airtable is perfect for structuring a bit of data into rows and columns.

But, nothing beats the availability (both of sharing and editing) of an open Google Sheet.


And then there's the pricing.. "We already have Google". No way my company would pay for Airtable, Notion, etc

Its spreadsheets all the way.


Shameless plug: I'm building a Desktop app that serves the niche between Excel and Python.

It enables you to load CSVs and write SQL on those CSVs.

The main benefit is that it is very fast when compared to Excel or Python.

Excel cannot really handle 1GB CSV file. With Python, just loading 1GB file and iterating through every line probably already takes more than a minutes.

Superintendent pushes most logic to C and Sqlite, so it's super fast.

Here's the app: https://superintendent.app


I think of Excel as the modern back of the envelope. You can prototype some things, proof of concept etc.

But if you're running things in prod via excel you haven't actually thought enough about it to call it prod.


How many professional construction sites do not have a supply of hammers? I think your analogy actually fits pretty well seeing as how Excel is an invaluable tool that fits into a larger enterprise.


Spreadsheets are sometimes more readable than code. It’s harder for certain kinds of bugs to get through when you see the whole program, inputs, and outputs, and all intermediate states at once.


what would python be working on then? a .csv, textfile, database, something else? Or would the data be stored as part of the script itself?


It varies and how you're outgrowing it. I heavily use spreadsheets, but if I start to hit some limits or need to formalize a process I'll switch to something like Python. As a data store I may still query the spreadsheet, or use csv, sqlite, or something that scrapes the data source. I really like columnar data because I like mucking with it by piping data on the command line (which has its own, well known ways to bungle data).

As for reasons you're outgrowing a spreadsheet; access and permissions, friendlier error messages or a better "wizard-like" process, formalizing input data, outgrowing data size, smarter caching for computing values, protection or clarity around modifying "constant" or "magic" numbers. Many of those you could implement in a spreadsheet, but its often more complex than using something else. That something else probably varies with which problem you're trying to solve.


You would have the script take in the data as input from somewhere else, maybe your organizations shared drive. could be a .csv or whatever format. could be a database like sqlite. Much better than some big hulking macro heavy excel workbook.


Then how do I edit, extend and quickly filter and search that CSV? I'd probably do it with Excel to be honest.

Lots of data in business isn't generated by another program, it's built and maintained by users.


I think you raise a good point that often gets forgotten. When you switch from a general purpose tool to something specific, you lose visibility and introspection tools.

I see people move from a filesystem based workflow to a database and while that may be faster and more efficient you lose those tools to look at and tweak data. You either need to write those or teach them how to navigate the database. People rightfully hate black-boxes as soon as stuff stops working as expected.

Personally, I love the Linux command line and will pipe stuff in and out or stash things in flat files. I may move the data back into a spreadsheet to get more visibility or look for trends.


Python or R have tons of ways to edit, extend, and quickly filter and search within stuff like a CSV. You could even do all of that with a bash script. It will probably run a whole lot faster on your computer than opening the excel workbook.


I think we might be talking at cross purposes. How, in Python, do I extend a CSV with new numbers, or edit existing values? Use 'input'? I could implement an excel-like interface, but then I already have Excel.

I could just open it in a text editor I suppose, but then I have to deal with escaping, and moving around cells, and that sounds painful.


The Pandas library makes manipulating tabular data easy. And it's better than Excel when it comes to cleaning messy data.


Plus 100 for pandas. It is a truly amazing library. If I need to do some exploratory data analysis with it I'll often do so in a Jupyter notebook, but always make sure serious coding happens in a real editor like vscode.

But things really are hopeless.

Even Microsoft can't control the monster they unleashed on the world. Execs and managers using Excel instead of Planner for project tracking ("Spreadsheets are for manipulating numbers, not word processing!"). Used in that way, Excel isn't so fragile, but why is it being used at all? For f*cks sake, hasn't anyone ever heard of Word tables?

Honestly, anyone who uses Excel as a database should be exiled to... someplace where they can't do any more damage. Maybe connectivity-poor rural Arkansas. That's technological malpractice. It's indefensible, and an embarrassment to the whole tech sector that anyone is trying to excuse it.


You would read in the csv, edit what you need or add/remove data, then write a new file or overwrite the old one. You don't need to implement any interface, there are tools that do all of this. You could probably write the code up for something simple like this in the time it takes for excel to open the file and you to make the change manually by finding the field, clicking away with the mouse, and typing it into the cell. Specifics beyond that can be searched for on the internet as there are a wide variety of methods you could tailor to your purpose.


TSV, you have several tools to parse it. Even AWK.




Consider applying for YC's Fall 2025 batch! Applications are open till Aug 4

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: