Hacker News new | past | comments | ask | show | jobs | submit login
A Fast Excel Formula Parser and Evaluator (github.com/lesterlyu)
106 points by EntICOnc on May 15, 2022 | hide | past | favorite | 34 comments



I’m surprised that Excel formulas have such poor coverage in text editors like NP++ or VSCode. Developing horribly convoluted formulas with nested IF statements in the formula bar is painful, and the PowerQuery editor doesn’t even have syntax highlighting.

Even the ancient built-in VBA IDE has better support.

Surely Excel formulas are the most commonly across the whole population of computer users?


As I consultant that works extensively in Excel, I understand your point. The lack of decent syntax highlighting, auto-indentation or “IDE mode” instead of the formula bar can be painful. Especially when working with models developed by others.

On the other hand, you can see it as a feature. It forces you to keep your formulas short, break them down in helper columns and overall make your logic cleaner. If you are writing a 4 line long formula, you are doing something wrong. Sytax highlighting would still surely help, but it is not a replacement for clean modelling.


Amateur user of Excel/Calc/Google sheets here. My worst is...

    =CONCATENATE(
        IF(A1<0;"-";"+");
        TEXT(INT(ABS(A1));"00") ;"d";   
        IF((ABS(A1) - INT(ABS(A1)) - INT( (ABS(A1) - INT(ABS(A1)))*60 )/60) * 3600 > 59.5;
            TEXT(INT((ABS(A1) - INT(ABS(A1)))*60)+1;"00");
            TEXT(INT((ABS(A1) - INT(ABS(A1)))*60);"00")
        );"m";
        IF((ABS(A1) - INT(ABS(A1)) - INT( (ABS(A1) - INT(ABS(A1)))*60 )/60) *   3600 > 59.5;
    TEXT(0;"00");
            TEXT(ROUND(    (ABS(A1)-INT(ABS(A1))-INT((ABS(A1)-  INT(ABS(A1)))*60)/60)*3600;0    );"00")
        );"s" 
    )
...which should take a value in cell A1 that is in the range [0,360) and return the value in angle notation (degrees, minutes and seconds). I especially wanted this function all in one cell to save loads of helper cells (the spreadsheet calculates the position of the moon, sun, planets and a physical ephemeris for each planet so a lot of angles to display).

It is a pain to apply, I use a text editor to search and replace the cell reference for each angle. I'd settle for the formula bar display of the formula not reformatting the line breaks.


In Excel you could use =LET(a; A1; ...) to define a variable 'a' local to the formula. You would have to change A1 in one place only.

I'm not sure this works with LibreOffice though. Probably not.


I feel you can do something with floor and trunc to get the remainder, then using round to get the sig figs correct. You can make it abs of it at the start, and then only add a negative at the start of the concat if it's negative.

I rapidly tried and I get something a bit shorter (about half the length), but my rounding ends up with imperfect precision and sometimes I get 10m60s rather than 11m0s due to it. Appropriate rounding at the appropriate points would fix that.


You could put that into a VBA formula, although it wouldn't be as fast unfortunately (as I learned when I tried simplifying excel formulae - which are _compiled_ whereas VBA I think is interpreted? - in any case, excel was like 50x faster for me).

I assume you have tried the '$' signs to see if they will help?


Thanks for reply. Yes, if it was just for Excel I'd make a user defined function that returned a text value. I wanted the formula to work the same on Libre/Open office and on Google sheets so I went for an actual spreadsheet formula. Actually, the ';' separator came from the LibreOffice version.

'$' signs for A1? I'd still need to search and replace for A1 each time I needed to convert an angle in a different cell but I take your point that I could move the output around more easily. Speed does not seem to be an issue (the rest of the spreadsheet has calls to trig functions by the score)


Not sure about the cross compatibility to Libre and Google, but in Excel at least, replace your cell references with INDIRECT() references. Then store the cell you want to calculate on in another cell…say A1 (literally just type C1 in A1). Then reference with INDIRECT(“&$A$1&”). Saves you the search and replace for A1. Also works great if you want to copy formulas across tons of columns and maintain the correct column header references as you can insert the INDIRECT reference into table notation.


If you prepend the column/row with a dollar, it is locked when you drag and extend.

e.g. $A1 locks the column, A$1 the row and $A$1 both

Can be pretty helpful if you put a constant somewhere.


Thanks for your reply. I am aware of the absolute cell reference notation and use it frequently when I have a set of constants (e.g. longitude, latitude, year, month, day, timezone &c) that are referenced in many formulas. However the situation in my post was the other way round sort of.

As a concrete example (just in case I've mis-understood the situation), I have the right ascension, declination and phase angle of Venus as decimal angles in cells G80, H80 and M80 and I want the output strings in cells F3, G3 and H3. I still need to search and replace the A1 cell reference in my original post for each of the formulas.

As the poster a couple levels above pointed out I could do this as a user defined function in VBA, but I wanted my daft spreadsheet to work on Libre/Open office and on Google sheets with little or no modification. Hence the chain of if statements and round() functions.


>I'd settle for the formula bar display of the formula not reformatting the line breaks.

I can't imagine how much time this one thing would have saved over my last 40 years of spreadsheeting.



Wow that’s actually kind of cool.


In the old Excel with only cell grid references I would agree. However in "new Excel" with LET expressions and LAMBDA it is now possible to write legible and self-documenting code where the lack of decent editing support is becoming a problem.


Wish I could +1 this 100x. Break into small clean chunks that are easy to follow.


It’s really new but the Advanced Formula Editor[1] tackles this, and goes further. It’s a collaboration between the Excel team and MS’ Cambridge boffins.

[1] https://www.microsoft.com/en-us/garage/blog/2022/03/a-new-wa...


> Surely Excel formulas are the most commonly across the whole population of computer users?

Maybe, but of developers who’ve installed an editor? And would go through copying formulas back and forth?

Surely at that point you’re using vba, or you’ve moved to scripting excel from the outside, or off of excel entirely?


Some of us have jobs where we write code 50% of the time and the other 50% is yelling at the biologists that work in our labs to please for the love of god stop doing so much stuff with excel formulas.


Yes, but those things have been historically mostly provided by volonteers, and I can understand they didn't want to spend a lot of effort trying to clean the mess of a giant corporation of Microsoft for free.

MS should have provided a formulat parser and interpretter as a FOSS portable C lib decades ago. But at the time, they didn't pretend to be the good guys to win market shares, they publically despised anything open or free, so it couldn't happen.

I guess it would have help OOo a lot at the time, and would help pandas today.

Now that MS has VSCode, maybe they'll end up doing a LSP, but I doubt it. Programmers are not their target for excel.


Python and Pandas > Excel for anything outside of exploratory spreadsheet-y like calculations (I work frequently with 200k rows + of data, I _had_ to learn Python as Excel was too slow eventually and broke...now having a programming language to do formulae, I can't go back for any of my production processes, using excel now mostly for pivot tables and exploratory analysis (Excel is pretty awesome at that)


Hiya! I'm a cofounder of Mito [1] - where a spreadsheet extension to Jupyter Notebooks / Lab. Our most used feature is our exploratory graphing and Excel-like pivot tables - which seems super relevant.

Totally no pressure (although this is clearly a pitch), but if you get the chance to check it out and have feedback on where we fall short of being as awesome as Excel, your feedback would be super appreciated!

[1] https://trymito.io


Unfortunately I'm restricted on the programs I can use at work, but it does look good! Good luck with it!


Well, yes and no - nothing can compete with Excel in terms of modeling speed and agility. A capable analyst can do miracles with Excel, find other routes to a solution, and then produced model can go to Python/Pandas developer for the processing optimization.


Excel > * for communicating with anyone not a data scientist and still knows what they're doing


...Of course, you need to do professional data manipulation (whether in Python or with a real database) for 200k rows of data. Excel is never meant for handling that amount of information


Blockpad (https://blockpad.net) is also pretty good for this in the engineering space.


you might like bamboo

https://bamboolib.8080labs.com/


It seems interesting but their acquisition by Databricks stopped development on Bamboo last year.


Nothing will replace a spreadsheet in financial modeling.


I've always been fuzzy on Excel's precision. It's frequently described as preserving or storing fifteen significant figures. I wasn't sure if it rounds before storing, or if that's just how it displays. After some experimentation, I've satisfied myself that it can store more than it can display.

For example, =PI() displays at most 3.14159265358979, no matter how many digits I ask it to display. If I add 1.999E-15, I get 3.1415926535898. If I type 3.14159265358979 (omitting the invisible trailing 3), then add 1.999E-15, I get 3.14159265358979.

In other words, Excel uses double-precision floating point, but never displays more than fifteen significant figures. You may be in for a surprise if, say, you round trip a spreadsheet through CSV.


Excel uses IEEE 754 double precision which stores up to 17 significant digits but the Excel UI itself only displays up to 15 significant digits.

[0] https://en.wikipedia.org/wiki/Double-precision_floating-poin.... [1] https://www.mrexcel.com/excel-tips/17-or-15-digits-of-precis... [2] https://www.youtube.com/watch?v=WZfjmbEDbfI


Looks great! I have received many requests to add formulas to my grid product DataGridXL (https://datagridxl.com) and this looks excellent. Starred :-)


With Excel formula compatibility, a js Range object (allowing people to keep track of multiple cursors on spreadsheets and to create relationships between multiple spreadsheets in a document), a way to define new functions in js that would work in formulas, and Excel import/export (sans macros/vba), there'd be no need for Excel for me.

edit: just making sure, don't be offended - https://docs.microsoft.com/en-us/office/vba/api/Excel.Range(...


I wish someone can do a Rust/C based implementation for this so that I can use it in other languages.




Consider applying for YC's Spring batch! Applications are open till Feb 11.

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: