I'm glad William included slide 10 calling attention to the hostile and insulting attitude Wolfram Research has toward mathematicians and reproducible science in general. (I think some of Sage Math Inc's other closed-source competitors likely have similar attitudes, but Wolfram Research seems to be the worst.)
"You should realize at the outset that while knowing about the internals of
Mathematica may be of intellectual interest, it is usually much less
important in practice than you might at first suppose. Indeed, in almost all
practical uses of Mathematica, issues about how Mathematica works inside
turn out to be largely irrelevant. Particularly in more advanced
applications of Mathematica, it may sometimes seem worthwhile to try to
analyze internal algorithms in order to predict which way of doing a given
computation will be the most efficient. But most often the analyses will
not be worthwhile. For the internals of Mathematica are quite
complicated."
For comparison, if you want to audit the Sage Math algorithms that your research depends upon, all you need to do is fire up a text editor (or browse their github). And you won't find any statement in the Sage Math docs telling you not to bother because you're too dumb to understand what you're reading anyway.
This is why when Big Bang Theory debuted, I thought Sheldon Cooper was specifically supposed to be a parody of Stephen Wolfram.
In math, how you came by the results you came by is always relevant. Don't tell mathematicians they don't need to know that. It's their job to know that.
Well, no. They are not at all alike. Stephen Wolfram has a lot more _people skills_ than Sheldon Cooper. Sheldon Cooper is the stereotype of the Asperger scientist who succeeds despite his inability to interact with people, while Stephen Wolfram has been at the helm of a tech company for 30+ years, and that's not something you can really do without having to interact with other people.
I suspect Wolfram also succeeds despite the nature of his people skills. I found this letter from Feynman[1] to be an interesting and early read on his behavior.
Wolfram invented v1 of Mathematica, and founded a company to sell it to technical users. It takes minimal people skills to hire people and to sell a technically good product.
As a consequence, math publications that rely on closed-source computations are not independently verifiable, and fail to meet the standard of a rigorous proof.
Yes, though of course Stein is referring there to the Wolfram quote that's on slide 28 (roughly: certain kinds of development can't be done in academia) and not the condescending rejection of inquiry about Mathematica's internals from earlier in the presentation.
you can do such a thing with matlab. many, many of the function calls and algorithms in matlab can be directly viewed.
and honestly, it seems like diving into sage may not be as trivial as you make it sound. is it not a massive glue of many different languages and implementations?
The point is not that it is trivial, but that the system is set up so you can do it. This describes why that is important: http://www.ams.org/notices/200710/tx071001279p.pdf.
This story reminds me of Prof. Tim Davis of Texas A&M, formerly Florida, who I heard had a hard time getting tenure, after making the software and mathematical world a much better place.
He (and his group) developed CHOLMOD and UMFPACK and other sparse solvers used everywhere. Basically, when you type A\b in Matlab, it calls his code.
It was an incredibly challenging task going from sort-of/kind-of being able to solve linear systems to where we are today. Hardly anybody thinks about it. Again, you just type A\b, even when A is poorly conditioned. You can write a crappy solver in less than 100 lines of code, but if you read his papers, building a rock-solid solver was a very difficult task.
Unfortunately this kind of work is important, but pretty thankless.
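To make the "crappy solver in less than 100 lines" remark concrete, here is a minimal sketch in Python of dense Gaussian elimination with partial pivoting. The function name `solve_dense` is made up for illustration. This is a textbook toy, not how CHOLMOD, UMFPACK, or Matlab's backslash work: it ignores sparsity, fill-in, ordering, and conditioning, which is where the hard work in a production solver goes.

```python
def solve_dense(A, b):
    """Solve A x = b for a small dense system (illustrative toy only)."""
    n = len(A)
    # Build an augmented copy [A | b] so the caller's data is left untouched.
    M = [[float(v) for v in row] + [float(rhs)] for row, rhs in zip(A, b)]

    for col in range(n):
        # Partial pivoting: use the row with the largest entry in this column.
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if M[pivot][col] == 0.0:
            raise ValueError("matrix is singular")
        M[col], M[pivot] = M[pivot], M[col]

        # Eliminate the entries below the pivot.
        for row in range(col + 1, n):
            factor = M[row][col] / M[col][col]
            for k in range(col, n + 1):
                M[row][k] -= factor * M[col][k]

    # Back substitution on the resulting upper-triangular system.
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        s = sum(M[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (M[row][n] - s) / M[row][row]
    return x


if __name__ == "__main__":
    # 2x + y = 3 and x + 3y = 5
    print(solve_dense([[2, 1], [1, 3]], [3, 5]))
```

Even this version only avoids the most obvious failure (a zero or tiny pivot). It does nothing about ill-conditioned or large sparse matrices, which is the gap the comment is pointing at.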
Duff (who was Davis's post-doc advisor IIRC) also deserves credit for MA57, which is used in MATLAB's sparse symmetric indefinite solvers [1]. And the giant BLAS and LAPACK crew (as well as the previous LINPACK authors) deserve some credit for dense systems.
You're right, of course, and the two little words "his group" don't do justice to the army of PhD, masters, and undergraduate students who spent their time on this massive codebase.
But in this case, and in the context of the OP, it was Davis's oversight and possibly tangible sacrifices (I don't know the details of his career) that made it possible.
Each time we use some numerical methods and it Just Works, we should mumble a tiny prayer of thanks to folks like these, and the writers of BLAS, LAPACK, ARPACK, and many others that provide the backbone of an incredible amount of work.
I heard (and hope it's true!) Prof Davis got a well-deserved license fee from MathWorks...
Prof Doug Lea at Oswego is another unsung hero who has done a ton for programming and particularly parallel programming. A big chunk of java's std library is copyright Josh Bloch and Doug Lea. In particular, Prof Lea built jsr166.
Yeah, he licenses SuiteSparse out for commercial use.
His personal payout would be very tiny (if any at all), after U of F's take and money his lab gets. But at least it's some funding to keep his lab going.
Yeah, it was the 'go to' seminal text for a while. I've used his malloc implementation in a few projects. He's also an ACM fellow, so, not really unsung (not denigrating his work, just saying that his work is fairly well known in both academia and the industry).
Sorry, I meant unsung in the sense that his work doesn't produce lots of the usual markers of academic success, i.e. papers. Also, Oswego is not really what you'd call an academic powerhouse. Measured by impact, I'd think there'd be a software engineering group at CMU, MIT, Berkeley, Stanford, Michigan, Madison, etc. that would love to have him work there.
I'm not sure this is the same case here, although I am biased, I readily admit. Everyone and their mom uses Matlab. Sage, on the other hand, is mainly used in the mathematics community, which I'm pretty sure is much smaller than the community (community?) of engineers, physicists, biologists, data scientists...
It's SHAMEFUL that academics like Stein who dedicate their lives to developing amazing open-source software do not get funding and frequently fail to get tenure. These individuals truly are making the world better!
Most academics do not get funding or tenure. Most funded academic mathematicians lose funding at some point. The funding rate for NSF grants in mathematics is certainly less than 33%, and one can only submit a standard proposal once per year. The competition is FIERCE. (I've been on panels: it is terrifying to see who doesn't get funded.) Ask any mathematician -- theorem prover or software writer -- about excellent people she knows who have lost their grants. You will instantly get multiple examples.
Mathematicians who write software are no more likely to be "truly making the world better" than those who prove theorems and teach hundreds or thousands of students every year.
I'm not saying anything about Stein here. He's a rare mathematician: strong on theory and practice, a passionate advocate for his causes, and a respected teacher. But there is a danger of missing the real point here -- most good things don't get funded or recognized. This has gone on for centuries. The funding situation now in the US (for both theory and practice) is better than it has been almost everywhere for almost all time.
It seems trivially obvious that a better use of funds at many universities would be paying for actual academic research at a few hundred K per year, vs a big-name football coach, with all the associated staff, assistants, equipment, etc, for tens of millions per year.
Since when has it become acceptable that the goal of an academic institute is/should be dirty ROI?
And then why just football? Start funding cabaret, rave parties and pole-dance events as they surely will have more ROI.
Sports has been the undoing of US education in schools [1] earlier and now it seems even in higher ed.
The sooner they get rid of sports from educational institutes the better for them.
I was on staff at a small science and engineering school with a top 25 football team, and I was the faculty rep for the cycling team. I had no love for football, and I say that as someone who played in high school. That was until I saw the finances and realized they funded the entire sports program. As much as it pained me, I had a hard time complaining after that.
By definition of positive ROI, the university has more money than before, which in particular means it can spend more money on research by building football stadia.
On the wider scale, extracurriculars only affect[2] share-of-students rather than increase the total number of students[1], so such programs have a globally negative ROI. But each individual university is making a rational decision.
[1] I'm assuming there's a negligible percentage of students that would avoid college entirely if no or very few colleges had football programs. It's safe to ignore football scholarships, because you still have the option of giving the students free money, which is cheaper than giving them free money and also running a football program.
[2] I'm also assuming the football program itself doesn't generate enough revenue to offset its costs, and only affects enrollment. I honestly don't know if they make enough money in tickets and trinkets to offset the debt service for a stadium, salaries for coaches, free tuition for students, etc. If the ROI is positive (or even negative, but with a positive cap rate), then it might be rational economically to continue them.
You're assuming that the ROI generated by football isn't a transfer from other universities who lose students to the spending uni. More likely, funding football is a zero sum game which generates no overall benefit for the research community.
> Mathematicians who write software are no more likely to be "truly making the world better" than those who prove theorems and teach hundreds or thousands of students every year.
How can this be true in the case of a mathematician who writes something that many/most of the others use?
I think nontraditional research organizations like YC Research and Google's Project Zero can solve this problem.
--
Open source basic infrastructure -- everything from Sage to OpenSSL -- suffers from a market failure.
We rely on these foundational projects for billions of dollars a year in commerce, but they often get minimal funding and are supported by semi-broke volunteers working in anonymity.
* GPG is maintained by one guy, who was about to give up before a few people threw coins in his tip jar after this story a year ago: https://news.ycombinator.com/item?id=9003791
* OpenSSL was comically underfunded and underappreciated until Heartbleed happened and people remembered how much it matters
Stein's story is powerful and shows how neither traditional companies nor universities help here. In the world of math software:
* The companies created a bunch of closed-source walled gardens (Mathematica, Matlab, etc).
* The universities were unwilling to support free and open tools. They gave tenure and support only for authors of research papers, not tools, no matter how useful or widely deployed.
Even the guy who made NumPy and SciPy didn't get recognition for it---wtf.
I think that these new, independent organizations with rich patrons can fill in the gap. Organizations like YC Research, Project Zero, and Canonical.
These projects are a crucial part of the infrastructure of the modern information society. The solution to a lack of funding for these things is not charity from rich patrons, but governmental investment based on taxes.
To be clear, this is a side effect of using publication count as a surrogate for an individual's total contributions to an academic field. Everyone knows this is the metric for success, so you either have to optimize for it, or risk looking like a failure.
Setting up a company can help because it will let other researchers support the project by buying the software on research grants. A grant can also pay for consultant-style improvements to software. It doesn't really matter if the software is also being given away for free. The important thing is to have an invoice to give to the university financial services staff.
I'd prefer it if the granting agencies supported this sort of software infrastructure directly, but, lacking that, a company is a way to hire people to tackle some of the weaknesses of Sage, whether they be in its core functionality or its UI.
"Setting up a company can help because it will let other researchers support the project by buying the software on research grants." This is absolutely correct, and has happened many times now in the last few months (I started the company itself a year ago).
By the way, another way to get financial support for an open source project is to sell a short tutorial book. For example, the creator of Laravel did this on Leanpub (disclosure: I'm a co-founder of Leanpub) and did pretty well: https://leanpub.com/laravel is #7 in lifetime earnings on Leanpub. (Also, one of the core contributors on Laravel has done even better: https://leanpub.com/codebright is #2 in lifetime earnings on Leanpub.) Similarly, the creator of Trailblazer is doing pretty well recently with his Leanpub book: https://leanpub.com/trailblazer is #9 in revenue over the past week.
Anyway, my point is that even if you're in a niche, if you are the clear expert in that niche (say if you created the framework, or in your case, the software), then a book may be one worthwhile component of a monetization strategy. If you can sell a $30 book to 4000 people, you can earn some decent money. (The royalties on a $30 book on Leanpub are $26.50, so multiplying by 4000 results in over $100K.)
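The arithmetic behind that estimate can be checked with a few lines of Python. This is just a sketch using the figures quoted in the comment ($26.50 royalty on a $30 book, 4000 copies); the names are made up for illustration and nothing here is official Leanpub data.

```python
# Figures as quoted in the comment; treat them as assumptions, not official data.
ROYALTY_PER_COPY = 26.50   # author's royalty on a $30 book, as stated
LIST_PRICE = 30.00
COPIES_SOLD = 4000


def total_royalties(royalty_per_copy: float, copies: int) -> float:
    """Total author earnings for a given per-copy royalty and sales count."""
    return royalty_per_copy * copies


total = total_royalties(ROYALTY_PER_COPY, COPIES_SOLD)
print(f"Total royalties: ${total:,.0f}")  # prints "Total royalties: $106,000"
print(f"Royalty share of list price: {ROYALTY_PER_COPY / LIST_PRICE:.1%}")
```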
Hopefully one day textbook prices can return to a sane level, say between $30 and $80. Today's textbook racket (I don't want to call it a market) is obscene.
As an academic contributor to a mathematical software project (much less mature than Sage), I want to thank you for giving this talk. Hope it starts a much-needed public discussion about this topic.
Best of luck with the talk! I'm way out in the hinterlands of your world but it struck me how small this world may be: as an MD doing some applied ML stuff out in San Diego I have actually troubleshot a bug with Fernando Perez, and know a few other folks from the early days of what became Jupyter.
The problem with Sage, while it's an amazing piece of software, is that it is abysmally documented. As is Maxima.
Sure, there's an example on how to do integration and anything on a high-school level but every time I wanted to do something a little bit more complex, I was completely lost. I found the Mathematica documentation to be miles ahead.
You are of course completely right. Almost everybody who works on Sage does so as part of some research mathematics project, often in their spare time, and they just don't have time to do the massive amount of work to bring the documentation to the same level of user friendliness as Mathematica. I hope SageMath, Inc. will be able to make strides to improve documentation for open source math software. Also, this by Greg Bard is a bright spot: http://www.gregorybard.com/Sage.html
I hate to be that guy, but this needs to be said. The guy on page 15 was right. Sage should have at least put a little more effort into their applications, because if it did, it would be much more popular and thus more developed.
Back in 2011 or so, as a young undergrad, I latched onto Sage and used it for an undergrad research project. My BS was from a tiny university (one year, I was the only Physics major in the school), and I tried to turn all my friends and profs onto Sage. Being small meant there was no dept. standard, so I tried to impress it on the dept. (3 people really), but they stuck with Mathematica because Sage didn't even have an easy-to-use ODE solver! For Pete's sake... I understand that Sage is a niche project for the math community, but if that's the case, that's the only place you'll find funding and devs from.
This is often said here amongst the startup nerds: make sure you have an audience willing to pay. Hey, many of us in the "more applied community" would love to have a FOSS tool that rivals mathematica, we exist! But it needs to do things well, or at least well enough that in linear combination with the fact that it is open source, the overall goodness vector for the project's value has a timelike norm. Then, we'd clamor for it, you get downloads, and one day, the funders will go, "hey, that's good shit right there, I better be a part of it!"
They don't need to do things for others, or for others' interest. But then, no one should be surprised when such efforts don't get funding. I mean, doing something niche implies that less people will be interested which implies that less people will fund it, right? It's almost a direct consequence of choosing to serve a niche.
As a developer who has recently begun contributing to SageMath, I have to concur with this comment. It's taken me some time to realize the extent to which Sage is geared toward pure mathematicians. The recent changes, for example, in Sage's piecewise() function have made it much less friendly in numerical evaluation than Mathematica's Piecewise[]. If Sage is to have a wider audience, transitions between symbolics and numerics need to become much more expedient.
Perhaps I can help in making that happen. And so might other people reading this thread!
Please, please help! We desperately need input from you and people with similar numerical/applied experience. Sage is geared toward pure math only because that's the background of nearly all the contributors. But our mission statement has always made it clear that we very much want Sage to be of value outside of pure math.
William and I both presented at a RethinkDB meetup this last fall (SageMathCloud leverages RethinkDB changefeeds in awesome ways) and I got to talk to him a bit about some of these frustrations. It really is a rough spot to be in and I wish him all the best.
Every time a user types anything into any document in SageMathCloud, RethinkDB changes propagate those changes to all other users of the document. We use RethinkDB heavily at all levels of SageMathCloud.
Prof. Stein's earlier post that had appeared here on HN (http://wstein.org/mathsoftbio/history.pdf) is enormously absorbing reading also. It has much more to tell about the significant challenges that stand in the way of developing high quality open source math software.
Mr Stein released the first version of SageMath some time ago--not sure how long but maybe 2007.
among other things, SageMath reconciled the confusing namespace soup that is scientific python (numpy, scipy, and matplotlib--three brilliant libraries with partial overlap in functionality and in package names) which SageMath gathered (along with other libraries like SymPy) and put them under a single rubric, 'SageMath'--one (large) install and you have all of scientific python.
SageMath also included a notebook
seems not such a big deal now with Anaconda and Jupyter notebook, but in 2007 it certainly was.
Does make sense - academia is about theory, businesses are part of implementing end user solutions. Most of academia runs on a tech stack delivered by commercial entities anyway. Building products is not as much about creativity as delivering a fixed product with a service plan and support chain in addition to product development. Companies have various operations to create full-fledged products - academia can supply only the R&D part. And this is a good division of labour, IMO.
I have come to the same conclusion, despite a decade of wishful thinking in the opposite direction. That said, companies can do a much, much better job of working together with academia, and I hope SageMath, Inc. does.
Prof. Stein,
You do not know me, but you have been an inspiration to me. I came across several of your books during a year of post-bac study. They spoke to me, especially "Algebraic Number Theory, a Computational Approach". They also steered me toward your home page, and your work on Sage Math. I thought to myself, "yes. yes!"
Though I'm not punk as fuck, I'm definitely a 'walk to your own drum beat' believer, and a skateboarding professor that heads an open source project taking on Mathematica would make an awesome lodestar. I was 32 when I quit a great job at a very well-known Wall Street investment firm (back-office, not master-of-the-universe stuff, but definitely a good place to be) so that I could study nothing but math for a year. I should point out, my math grades up to this point were:
- D in my senior year in high school
- C in the only undergrad math course I had to take
So, everyone was like, "You're effin crazy, what the eff are you doing, you're making an effin bad decision..." Etc. Well, it was the best decision I ever made. Two weeks after leaving my job I was in a dorm room with an 18 year old football player (very, very awkward), but a year later I was a class or two away from a degree in math. My wife and I decided to add moving (again), wedding planning, and another thing to our life, so I didn't quite finish a degree. I received a bunch of As and a few Bs. It was a miracle. (No, it was a lot of hard work, and having seen the light which is the beauty of mathematics).
I've thought many, many, many long hours about the issues of open source development and how it might be made sustainable. I've had to, as it relates intimately to the reason I left my job and went off on this new path. I've got a couple ideas that I believe are very realistically workable. In short, the first go I'll be making at one of these ideas is, software is developed by a community which then makes the source code open source but not compiled into programs, and with no beautiful logos or easy to use UIs. They then copyright that code for a month and charge non-members a small (think Spotify) amount to have access to the compiled, bundled, UI'ified versions that are encrypted with a monthly key. Then, at the end of the month, that software is all marked as "old", put in the public domain, and the keys are "unlocked". If the software is useful, the price is right, and the user is not a programmer, then they'll hopefully pay $10 or $15 a month even though they could use last month's software for free. Also like Spotify, paying this fee would gain a user access to all the community's software. The subscription fees will be allocated to programmers who will be paid to work on software pro rata according to some weighted combo of votes from users and votes by community members. Community members are, of course, free to work on whatever they'd like to in addition to that. Community members receive a payout from the subscription, basically whatever subscription revenue there is minus that paid out for paid development (per the previously mentioned mechanism) minus operating expenses. You can only have your software in that "repository" if you are a member, and you must buy in to be a member, sorta like a co-op.
So, that was a very sloppy explanation, but hopefully you get the general points. My main point however, is, please don't go corporate. Even companies like Patagonia, though it is a "B-Corporation" for the public benefit, are clearly driven by the bottom line. How else could one explain why they charge $35 for 40 different types of hats? We don't need 40 different types of hats. But, it drives their bottom line, so that's what we get (albeit, in addition to the great things they also do).
"You know what I hate about f*cking banking? It reduces people to numbers..." You know, the line from "The Big Short". It's not just banking. It's any venture that is driven by the profit motive. Pure and simple.
Profit motive => Reduce everything to numbers
Not right away. Not in a loud and crash fashion. Not one person. But systematically, insidiously, creeping, all together, a step at a time, with the flash of amazing marketing departments and the financial soundness of a well disciplined finance department. Whatever it is you think you're doing, will be metamorphosed into the fungible unit of exchange, like something out of a Kafka book, both absurd and meaningless, while at the same time horrifying.
This 'comment' is very sloppily written because it is being written with some urgency because, (god bless HN, where else will I get to randomly interact with Prof. Stein?) I sincerely believe that you have changed the center of gravity in the world and this is an impassioned plea to keep on keeping on when it comes to helping us that are building a future where (given that intellectual property will clearly make up the bulk of our wealth) the wealth is a well tended commons and not a well guarded garden.
(Speaking briefly to the "academic" angle of things. I understand a bit what the atmosphere is like. I'm going through a divorce at the moment, and my wife just successfully passed her major comps exam and is on her way to a Ph.D. from Johns Hopkins in their political theory department. I am very happy for her and wish her the best, but my point is that, I get the pressures in academia to prioritize certain things while other things, which should really be valued and promoted, are totally overlooked or even punished. But, the business world is not the answer.)
Thank you for your comment, which I've carefully read. Feel free to email me at wstein@gmail.com, though I can't guarantee I'll have much time to answer, since I'm pretty busy. Indeed, one must constantly guard against the many intrinsic evils of corporations.
What you are describing is the status quo, but it is certainly possible for government to fund software development. As William points out in his presentation, it is already being done indirectly through software license purchases. I think it would be interesting to see what academics could achieve if they were offered some modest grants aimed at developing and maintaining viable open-source alternatives to all commercial software for which government is currently paying license fees.
I sought government funding for Axiom. One of the direct comments in feedback was that the government does not fund software that competes with a commercial product. There were other issues (such as a lack of professional accounting for handling grants) but this issue could not be overcome.
> the government does not fund software that competes with a commercial product
That is such a ridiculous constraint. Do they mean that, if I start selling tap water for $100 a gallon, the government cannot provide its citizens with an alternative? Obviously it both can and does in many important areas (water, education, electricity, roads and defence to name a few). The decision on whether government should be active in a market should be based on an analysis of the benefits it can bring to society – be it savings, innovation or equality of opportunity.
"I think it would be interesting to see what academics could achieve if they were offered some modest grants aimed at developing and maintaining viable open-source alternatives to all commercial software"
I think this would be a waste of academics. Commercial software is not expensive (mostly) because of some secret sauce. It's because delivering a functioning product requires lots of work that is thoroughly mundane and repeatable.
Analogously, one could employ chemists to bottle Coca-Cola or metallurgists to package hammers but that would be just a waste of everyone's assets.
Should government make its own pencils? I don't think so.
Good products require lots of work that it is hard to be intrinsically motivated to do.
Academics already perform a lot of repetitive and mundane work (e.g. teaching, writing grant applications). A fallacy of the current academic climate is that academic results must be novel. This has led to pathological behaviour where researchers flood publication venues with incremental results portrayed as breakthroughs. Performing a public service such as maintaining a widely used software package is at best seen as a second-tier achievement.
I am not suggesting that the government should be involved in making pencils. But I do think funding independent development of open tools for research, education and other government-funded work is a good idea.
To support my claim, compare the cost of healthcare in the US, where the government relies on the industry to keep medical products and services cheap, with the price in countries where the government provides its citizens with an alternative.
Another issue is "sustained development". I collected over 100 "computer algebra programs" on a CD. I distributed this at a computer algebra conference. All 100+ were academic attempts, usually by small groups or one person. They are amazing programs that will never get widely used.
Mathematica, Maple, Axiom, Maxima and other programs are large, multi-person, multi-year, multi-million dollar efforts with contributions by PhD-level researchers.
Axiom, I estimate, has about 300 person-years, over many years at IBM Research, with an estimated cost of 42 million dollars. People who invented new areas of computational mathematics were primary contributors. IBM sold Axiom and it was a commercial competitor to MMA and Maple. It is now open source (due to the good graces of the Numerical Algorithms Group, NAG).
Magnus, which I was also involved in, is much smaller and very specialized. It was originally developed by government grants but development fell off once that ended. Magnus was developed at City College of New York.
Based on that experience I feel that computational mathematics development requires company backing to develop any well-maintained and well-documented system.
The downside is that companies tend to die in less than 15 years:
"The average lifespan of a company listed in the S&P 500 index of leading US companies has decreased by more than 50 years in the last century, from 67 years in the 1920s to just 15 years today, according to Professor Richard Foster from Yale University."
and that's for LARGE companies. Small companies die quicker.
So what happens to computational mathematics when Wolfram Research (Mathematica) or Cybernet Systems (Maple), etc. dies? Does your MA* research die? Is there suddenly a huge black hole in the middle of computational mathematics? Can you no longer reproduce your results?
Mathematica won't be open sourced when WR fails because software is now considered a company asset. Even if it was open sourced, my contacts tell me that the internals are not well documented. Computational mathematics is REALLY hard to reverse engineer.
Somehow we need to make it possible to maintain, modify, and extend existing systems. This requires a few things, in my opinion.
We need academic (and grant funded) programs that specifically target computational mathematics. The goal is to develop a stream of people who have the necessary background, not to develop a new system.
We need to deeply DOCUMENT the ALGORITHMS so they can be reproduced in any of the existing systems. Theory is fine but programming involves design tradeoffs, such as a choice of representation, available functions, test suites, boundary conditions, reference results, etc. There are a dozen equations for things like the gamma function but some are better than others for implementation.
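As a small illustration of the point about the gamma function, here is a sketch in Python comparing two textbook ways of evaluating it against the standard library's `math.gamma`. The function names are made up for this example, and neither is claimed to be what Axiom, Sage, Mathematica, or any other system actually uses. The first applies the leading-order Stirling formula directly; the second applies the same formula at a larger argument and walks back using the recurrence Gamma(x+1) = x*Gamma(x), with an arbitrary default shift of 20.

```python
import math


def gamma_stirling(x: float) -> float:
    """Leading-order Stirling approximation: sqrt(2*pi/x) * (x/e)**x."""
    return math.sqrt(2.0 * math.pi / x) * (x / math.e) ** x


def gamma_shifted_stirling(x: float, shift: int = 20) -> float:
    """Stirling at x + shift, then divide back using Gamma(x+1) = x*Gamma(x)."""
    value = gamma_stirling(x + shift)
    for k in range(shift):
        value /= (x + k)
    return value


def relative_error(approx: float, exact: float) -> float:
    return abs(approx - exact) / abs(exact)


if __name__ == "__main__":
    for x in (0.5, 2.0, 5.0):
        exact = math.gamma(x)  # reference value from the standard library
        print(
            x,
            relative_error(gamma_stirling(x), exact),
            relative_error(gamma_shifted_stirling(x), exact),
        )
```

Same mathematics, noticeably different accuracy for small arguments. The choice of formula, the shift, and the range where each is acceptable are exactly the kind of design tradeoffs that would need to be written down for someone to reproduce an implementation.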
We need government focus. Computational Mathematics is vital and is fundamental research. We need a "summer of mathematics" workshop that involves all of the players presenting a reasonably unified approach. OpenDreamKit in Europe is doing something big about it now. The U.S. should step up and participate in some official capacity. Computational mathematics benefits everyone and should be an international effort.
I hope that SageMath can bring these things into focus and lead us to a better place.
Academia is about theory that is correct (and therefore potentially useful). There is a huge potential conflict of interest whenever you use a commercial stack in your research, for the providers of the stack are not really interested in providing error-free robust products. Lingering errors in Mathematica are well-documented, for instance, and all they do about them is to put up smokescreens. Does one need to wait for a huge scandal for the change of attitude here?
I like the idea of SageMathCloud. I had a numerical methods class that I took where we used a web-based Python math environment that the professor was having his grad students build. It was pretty buggy and would go down sometimes.
Last I heard, they switched back to MATLAB. Having taught MATLAB, I wouldn't wish that on anyone. But if SageMathCloud had been around, it would have been a good option.
This is why I love sagemath, and open source in general. I cloned the repo a few weeks ago after hearing him mention how he is mainly the sole developer. It's a huge undertaking and I know how draining something like that can be for motivation.
I may not have time to do a lot but I am gonna join in and help as much as possible. Documentation, bug fixes, whatever. This project deserves it imo
Minor clarification -- I'm definitely not "mainly the sole developer" of Sage. There are over a hundred people that have contributed to Sage during the last year, and our release manager (not me) works very hard putting together these releases. I did most of the work on SageMathCloud though...
If you're reading this @WilliamStein, firstly, kudos for the whole effort. I think providing Sage as cloud math software is a powerful strategy, especially with the trend of end users preferring low power devices.
I have some thoughts regarding the comment on Slide15 (making Sage good for some applications): I see that python and Jupyter are very popular for machine learning and allied computations. Can SAGE leverage this to provide a service that a large audience would happily pay for -- and then use that to bootstrap a full fledged mathematical software? (Also, on that note, is there any coherence between the leaders of Sage and Jupyter?)
I'm reading this. The business model of the company is basically exactly what you're suggesting. Yes, there is very strong coherence between the leaders of Sage and Jupyter. (The founder of Jupyter is just to the right near the back in the picture of Sage Days 1 in slide 16...)
SageMathCloud is amazing: https://cloud.sagemath.com/, for those that want to try it out. Lots of neat bonus features like Python notebooks, Latex support, and even terminal access. The free tier works great for most applications.
Will, I have been looking for ways to build a company around GNU Octave too. If you find a way to make money, please let me know. Matlab is one of the Ma's that needs a direct replacement. There's so much free Matlab code out there that needs a free environment to run on.
Hi Jordi! Here's what we are doing: https://cloud.sagemath.com/policies/pricing.html
I have so far found a way to lose money. However, I hope that if we can grow usage sufficiently, we'll start making money. You and I are on the same team.
I'm not a math guy, but I am a GNU guy who wants to learn more, and I'm curious what all the math geeks think about GNU Octave? How does it compare to matlab and mathematica? Any quirks or catches that make it unusable for certain situations? Stuff like that...
Octave is nice, and it compares well with matlab in my experience. However, I haven't used it in quite some time, as I switched from octave+gnuplot to the scientific python stack some years ago (and I prefer it immensely).
Everyone has their own story around this. For me, the reason I did not go with Octave when I chose to migrate from Matlab to Python was that I wanted to be part of a general-purpose language community. Octave had the nice advantage of being very close to the Matlab syntax --- which has some nice properties (e.g. easier matrix-math expressions and matrix-building expressions)
However, I felt there would be more innovation around the programming language itself (think decorators, generators, futures, compilers, and coroutines). The boundary region between "applied-math-programming" and general purpose computing requires a lot of sometimes tedious work. Having a general purpose computing language would mean that more people would be available for that work.
Even now, one of the principal challenges of NumPy which is the foundation of the Python scientific stack is that it combines applied math (fft, linear algebra, polynomials) with computer-science (type systems, data-structures, and multiple-dispatch functionals). Maintaining all of that with one group of people is difficult.
Suppose we could refactor the NumPy code base into 1) a data-declaration type system (look at datashape.pydata.org, which is a generalization of dtype), 2) a multiple-dispatch generic function system (the ufuncs), and 3) a container object. These could all be maintained by separate groups (and even #1 and #2 could be pulled into the Python language itself). See the libdynd project for a reference example of what it could look like.
Then NumPy could be maintained as a set of math libraries on top of that.
Then, it would be relatively straightforward to build the octave DSL on top of the Python computing stack and we would be able to share work.
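A toy sketch of the three-layer split described above, just to show how the pieces could be separated. Every name here (`DType`, `GenericFunction`, `Container`) is hypothetical; none of this is the NumPy, datashape, or libdynd API.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass(frozen=True)
class DType:
    """Layer 1: a data declaration (only a name and an item size here)."""
    name: str
    itemsize: int


INT64 = DType("int64", 8)
FLOAT64 = DType("float64", 8)


@dataclass
class Container:
    """Layer 3: a container pairing raw values with a declared type."""
    dtype: DType
    values: List[float]


class GenericFunction:
    """Layer 2: a generic function dispatching on the dtypes of ALL arguments."""

    def __init__(self, name: str) -> None:
        self.name = name
        self._impls: Dict[Tuple[DType, ...], Tuple[Callable, DType]] = {}

    def register(self, *dtypes: DType, result: DType):
        def decorator(func: Callable) -> Callable:
            self._impls[dtypes] = (func, result)
            return func
        return decorator

    def __call__(self, *args: Container) -> Container:
        key = tuple(arg.dtype for arg in args)
        if key not in self._impls:
            names = [dtype.name for dtype in key]
            raise TypeError(f"{self.name}: no implementation for {names}")
        func, result = self._impls[key]
        # Apply the scalar kernel elementwise across the argument values.
        columns = [arg.values for arg in args]
        return Container(result, [func(*items) for items in zip(*columns)])


add = GenericFunction("add")


@add.register(INT64, INT64, result=INT64)
def _add_int_int(a, b):
    return a + b


@add.register(INT64, FLOAT64, result=FLOAT64)
def _add_int_float(a, b):
    return float(a) + b
```

In this sketch the math kernels (`_add_int_int` and friends) know nothing about dispatch or storage, which is the property that would let separate groups maintain each layer.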
I can't overstate the importance of an Octave-like DSL.
Julia is gaining marketshare and mindshare among grad students not just due to its speed, but because it is a more fun and intuitive environment in which to code mathy stuff.
These people will in turn filter into industry, and if not them, then at least their code.
Also macros. As Julia gains more utility for run-of-the-mill data science, its dplyr-like DSL abilities will be very attractive.
Do you see this type system and generic function library as useful for general purpose programming as well? How would that play with mypy and type hints?
I think the only reason to reach for octave is if you've got a ton of matlab muscle-memory or have some kind of external force influencing you to use matlab and don't want or can't afford the license.
There's a lot more stuff in sage, like good support for graph theory and group theory, which IPython doesn't even touch. It also wraps a number of other excellent open source projects.
sagemath is such a distribution, but also a rather thick layer of algorithms on top of that. many of them are actually written in cython, to make them fast and to have a good binding with those libraries, etc.
almost none, because sagemath more or less switched to use jupyter as its graphical notebook interface. technically, there is a small preparser (for a little bit of syntax sugar on top of python) and some deeper integration of the plotting capabilities.
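To give a feel for what "a little bit of syntax sugar on top of python" can mean, here is a toy preparser. It is NOT Sage's actual preparser (which handles far more cases); `toy_preparse` and `toy_eval` are made-up names, and `fractions.Fraction` stands in for an exact number type. It rewrites `^` to `**` and wraps integer literals so that division stays exact.

```python
import re
from fractions import Fraction

# Integer literals that are not part of an identifier or a float like 1.5.
_INT_LITERAL = re.compile(r"(?<![\w.])(\d+)(?![\w.])")


def toy_preparse(source: str) -> str:
    """Rewrite math-style input into plain Python source."""
    source = source.replace("^", "**")  # caret means exponentiation
    return _INT_LITERAL.sub(r"Fraction(\1)", source)


def toy_eval(source: str):
    """Preparse and evaluate an expression (toy only: eval is unsafe on untrusted input)."""
    return eval(toy_preparse(source), {"Fraction": Fraction})


if __name__ == "__main__":
    print(toy_preparse("2^3"))        # Fraction(2)**Fraction(3)
    print(toy_eval("1/3 + 1/6"))      # 1/2
```

Plain Python would evaluate `1/3 + 1/6` in floating point and treat `^` as bitwise XOR, which is the kind of gap such a layer papers over.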
Yes, they are bound to sage. I really, really want to break things apart into separate Python modules that can be used outside Sage. However, that's an enormous amount of work that doesn't help at all with finishing a math research paper, so it's unlikely to happen without money. I've proposed and brought up exactly this very frequently on the Sage mailing lists in the last year. If the company makes money, one of my dreams is that all of Sage will be available as smaller modules that are pip installable....
Readers should keep in mind that cython originated from the sage codebase. I bet there's lots of gems in there that many people would like to play with independently of sage.
Will you need to release these modules under the GPL?
Recently we did factor out the code in sage for Cython signal handling (so you can hit control+c to interrupt blocking Cython code!) into a separate library called cysignals. We changed the license from GPL to BSD when doing this!
I hope they improve their software development practises. Packaging SageMath for Debian has been impossible; they use about 100 dependencies and have re-invented their own internal package manager to build all of these with sage-specific patches.
In 2007, Tim Abbott (before founding ksplice and zulip) beautifully and properly packaged Sage for Debian, and in fact Sage was included in Debian standard! Unfortunately, he stopped working on the project (when he started those projects), and of course we had no money to hire somebody else to maintain what he did. If I had money, it would be a total no brainer to spend it on supporting packaging SageMath for Debian. Your remarks about it being impossible because of our package manager and dependencies aren't correct, because things were the same in 2007.
OK, more accurately I should have said "much much more cost than is necessary", instead of "impossible".
The point is that it should not cost significant continual effort to package SageMath for Debian: if SageMath was following good engineering practises, then Tim Abbott's work would still function today, even taking into account necessary but normal and minimal maintenance costs that Debian volunteers (including myself) would be happy to do for Sage.
The challenge of packaging Sage was primarily around packaging its dozens of dependencies (some of which I had to talk to the authors to fix their licensing terms) and making sure that an up-to-date version of those dependencies was available in Debian. It took about a month of my time to package Sage well for Debian (at the time I maintained over 100 Debian packages for MIT, so I was quite efficient at this).
What killed my Debian packaging effort was that Sage was very large and my package was submitted to the NEW queue (where Debian does copyright review) when all the reviewers were busy with managing a release freeze. So it took more than 6 months for Debian's FTP masters to fully review it, and by the time they did, I had moved from being a grad student to the CTO of Ksplice. It probably would have been just a week or two's work for me to update Sage across those 6 months, but running a startup is a lot of work, and I never found the time to do that work :(.
Overall, my opinion is that Sage is well-engineered and not difficult to package given its scale (it has a fantastic test suite, so it was very easy to check if the package worked, and it was easy to get them to merge changes to improve the tooling). The problem is that it's a large project, with a large number of diverse dependencies, new versions of which aren't always backwards-compatible upgrades. If you talk to the folks who package other large projects for Debian, I'd be surprised if you find any that don't involve significant maintenance work.
It would be great if you do pay someone to work on this, but please also keep in mind my points about continual costs. To reduce these, Sage upstream (you) does have to change some of its practises.
"Sage upstream (you) does have to change some of its practises"
This is also being worked on and there's progress being made.
I'm still pushing to completely separate Sage-the-library from Sage-the-distribution. There's been a little pushback but nothing that can't be overcome with basic configuration management practices.
I'm actually one of the few people being paid to work on Sage, and this is one of the tasks I have interest in.
Although my personal work is more focused on Windows support for the time being, this is definitely on the docket. We had a workshop about two months ago in France focused specifically on packaging Sage, and there are some excellent folks from LogiLab who are making serious progress on the Debian packaging. I hope to circle back around to that myself after I've made more progress on Windows.
Great! Could you elaborate what LogiLab are doing? There is creating .debs, and there is creating a .dsc source package for inclusion into the official Debian archives. If you guys want to do the latter, you should talk more to the other people mentioned here:
The main barrier at the moment is that sage patches many dependencies. It is better to upstream those patches, not only because it's good engineering practise, but also because it's unlikely that Debian policy (in practise: the admin, infrastructure, and security teams) would allow us to include (e.g.) a duplicate maxima-with-sage-patches in Debian, just to satisfy Sage.
OTOH if you "just want to" create .debs, the task is much easier. But then there's no chance of it entering Debian officially.
No, the goal is to have it actually in Debian. One argument that's been made against this is that Debian moves too slowly, and an old (but supported!) version of Sage is not useful to its current core user base of research mathematicians who often need the bleeding edge and/or are developing new code directly in Sage.
My counter argument is that we want to expand Sage's user base beyond a small core of researchers, and improve its usability as a Mathematica replacement for students and some scientists who are less interested in things like bleeding edge combinatorics research (they might be but not necessarily the majority). As a Mathematica replacement Sage isn't there yet, but it's good to get a head start on making it easier to package and install, as part of that effort. Like for me, if it can solve some differential equations for me and do some integrals I don't care if the version I got through apt is a couple years old.
As for the upstream issues, part of the problem there is that some of the upstream dependencies of Sage refuse to accept patches needed for them to integrate with Sage. That's a long story. I think the best approach there, which has already been tried in past approaches to patching for Debian, is to maintain Sage-specific forks of that software that include the necessary patches (IMO they should also be swappable with the originals via update-alternatives if possible). As far as I know there's nothing legally preventing that; the obstacle is more the effort involved in maintaining a fork and a package for that fork.
In the long term, I think, it would make sense to completely replace and rewrite some of the code that these external dependencies are used for. But in many cases there's an enormous amount of work involved, and that would only be possible with significant funding. And quite possibly not worth the effort compared to other ways that effort could be spent.
Perhaps we should continue this by email - you can contact me at infinity0@debian.org
> Debian moves to slowly [..] My counter argument is that we want to expand Sage's user base [..]
Beyond that - "too slowly" applies only for "Debian stable"; and users that are OK with less stability can use "Debian testing". Usually this is quite bug-free; things only enter testing if it's been in "Debian unstable" bug-free for 5-10 days.
It's also much easier to get your software into Ubuntu, if it's already available in Debian testing/unstable - and that would likely expand your user base quite a lot.
> some of the upstream dependencies of Sage refuse to accept patches [..] [we could] maintain Sage-specific forks of that software [..] swappable with the originals via update-alternatives if possible [..] [or] completely replace and rewrite [it] [..]
Yeah, the situation is complicated. We could try different approaches for each dependency too, and perhaps some of them will change their mind. Debian does (on purpose) make it quite high-cost to maintain forked packages, in the sense that we would have to argue our way through many layers of admins of different systems, to incentivise us to get patches accepted upstream.
When you have time, could you write up the details of the situation on your end? Something similar to the wiki page I posted earlier - or you could also just edit that directly, if you wish.
I also definitely want Sage in Debian, and don't think the "Debian moves too slowly" argument is valid anymore for Sage. It was a compelling argument in 2007, as mentioned elsewhere, but Sage is much more mature now. infinity0 -- thanks for your encouragement from Debian!!
I just tried to get a binary to install it on Windows. Instead they are handing out only OVA files (VirtualBox images). As much as they might despise closed-source, there are simple things that businesses necessitate - like not being hostile to your audience. I am a fucking programmer, I don't have a problem using VirtualBox. At the same time, it infuriates me that they did not bundle up an MSI or EXE for their Windows audience. Now I feel less inclined to even try it, because I already use Mathematica. This basic understanding of "energy to demo something" is the ethos of HN. I am super surprised after reading that PDF to find these simple energy laws of consumerism violated. Aye...I digress...
I'm sorry. The point of my talk is that Sage is not a viable alternative to Mathematica, etc., for many reasons, e.g., not having a native Windows version. Porting a huge amount of deep technical software from Linux to Windows is the sort of difficult and thankless work that will not get somebody tenure in academia. I tried hard to get funding to get help on a Windows port, and Microsoft donated $30K for this effort back in 2008. However, $30K is not enough to fund such a huge project. In fact, I once met with a bunch of the Revolution Analytics devs, and learned they were getting millions from Microsoft just to port R to run natively on Windows. This was disturbing, because R is just one of the 100 components of Sage. In my talk, I mentioned the new grant in Europe that is funding the first ever full-time Sage employee, and it turns out his main job so far is working on a native Windows port of Sage. Unfortunately, though he is incredibly good, he'll probably discover how daunting this project really is. (It's not a one guy for a few months sort of project... And yes, I tried very hard once to port Sage to Windows and failed.)
The good news on that front is most of the problem is not Sage itself but some of its dependencies. Of course some of the problem children are core dependencies and can't just be made optional. But fast progress is being made. I have most of Sage running on Windows, currently with an unfortunate Cygwin dependency. However I have hopes to work on native Windows ports of some of the trickier dependencies, notably GAP and Cysignals.
Thanks for your reply William. I was just frustrated, sorry that I disrespected the work, effort, and time you have put into Sage to get it to where it is today. I will be giving the web version of Sage a try and perhaps contribute to it on an open-source basis eventually.
I have another perspective though; I knew Allan Steel (Magma guy) as an undergrad at Sydney University. He is an extraordinarily smart person and humble and genial as well.
Everyone should be thankful to him and the University of Sydney for having the wisdom to fund the development of Magma.
Allan Steel is awesome!! I really like him and have learned so much from him -- it's one reason I put his picture up for Magma in that slide. (I think you might be overestimating the extent to which U Sydney funds development of Magma.)
Because there's no way to grow. He's been unable to secure grant funding for actual employees, and even had trouble getting money to keep the servers running.
He's looking for ways to make safe and open source software dominate. And there needs to be a lot of growth for that to happen.
I have tried for a while now, and I thought I could do both. But... (1) It is difficult on a personal level--for example last month SageMathCloud got hit by a major DDOS attack 15 minutes before I had to teach a class. I have family and though I love to work, there are only so many hours in a day. (2) There I am at a big old state university, and there are many complicated byzantine conflict of interest and IP rules, which have been a pain to navigate, and our university commercialization office isn't the best. (3) Investors greatly prefer that the person/company they are investing in is not just a side project for the person running it. All that said, the mathematics department at University of Washington is full of supportive faculty; I'm doing what I'm doing more for the people I want to hire than just for myself.
Why not take a leave of absence to at least get the company started and acquire some funding? After that, you could just have a consulting role with the company.
I did during my 2014-2015 sabbatical. Building a successful company is vastly more difficult and demanding of attention than I could have imagined. Maybe I'm just not as good at doing multiple difficult things at once as other people.
He has a higher risk tolerance than me. I would work on this 80% and do the 20% required stuff as a tenured professor, but what do I know. Maybe I am overly enamored with becoming a tenured professor. If he was already spending his time as a professor working on this, I don't understand the difference. Still, all the best and good luck to him.
The work of a tenured professor is more than 20% time. A normal teaching load is 4-5 classes a year, plus significant committee work, student advising, etc. It takes an enormous amount of time. And it can be awesome, fun, and many of my colleagues love doing it. But it doesn't result in creating a free open source alternative to Mathematica.
I think people underestimate how much work a professor does. Everybody in my department who started a company either did it before starting at the university or they did it while on sabbatical.
If you do the bare minimum work as a tenured professor you're going to get an awful lot of people extremely mad at you at all levels of the academic hierarchy.
If you do the bare minimum work and you're spending a lot of time on your own company, at my university, you will almost certainly get a pink slip. Doesn't matter if you have tenure. We are not allowed to work more than one day a week on such things, and that requires approval, which probably won't happen if we're not publishing, have terrible teaching evaluations, and aren't doing a full share of service and advising.
So is the business model likely to be similar to that of RStudio where organisations can buy an annual 'commercial licence' to support the project (in addition to the GPLed version being freely available as mentioned below)?
All too familiar. I am reminded why I stopped writing software for academic research. It's considered low-academic-value work, on a par with system administration or I.T. helpdesk work.
This is a prime example of the tragedy of the commons with FOSS. People wonder what academic or corporate incentives need to change, but I'd argue in many cases this work lies between research and business and that's OK. I think the closest physical-world analogue is civil engineering of public works, and we'd be wise to make something of this (in the US they are perhaps both underfunded, hmmm).
The programming language of Sage is Python, which is a better overall language than the custom math-only language of Magma. Also, there are over 80K packages for Python on PyPI compared to a handful of packages for Magma. Ecosystems...
I currently use Magma in my research. I have run into a crash bug I cannot fix or debug, holding up an element of my thesis work. A lot of effort has gone into working around Magma limitations, which Sage doesn't have because it is sanely designed. If you've never had to use Magma, consider yourself lucky.
See Slide 4 for why it's important to be open source.
TL;DR: Researcher A finds things he wants improved in Magma (closed source) but can't. Researcher B tries to write improved FOSS implementation, but lost his job, likely in part by spending too much time writing said code and not doing other things like writing papers. Researcher A moves on and has a successful academic career. Moral: writing FOSS can cost you your academic job; it's safer to find something else to do.
Right - would you rather rely on a theorem which the author claimed is true, but won't let you see the proof, or one where the proof is published and reviewed? That's the choice between closed and open source math software.
Absolutely YES. Sage is and will always be 100% open source. To ensure this, the GPL copyright is spread amongst over 500 people. The software written by the company SageMath, Inc. is also completely open source (https://github.com/sagemathinc/smc).
This. $200 for 4 months for 25 users works out to $2 a month per user. You could double or triple it and still be less than the personal plan.
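Spelling out that arithmetic as a quick check (the function name is just for illustration; the only inputs are the numbers quoted above):

```python
def cost_per_user_month(total_price: float, months: int, users: int) -> float:
    """Price per user per month for a flat multi-user plan."""
    return total_price / (months * users)


base = cost_per_user_month(200, 4, 25)
print(base)      # 2.0
print(2 * base)  # 4.0 if the plan price were doubled
print(3 * base)  # 6.0 if it were tripled
```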
Those large multiuser plans probably come out of grants, departmental budgets, etc. So they're likely not all that sensitive to price. Pretend you were still in academia, and had found online some great cloud-hosted software you wanted to use in a course, would it change your / your department's decision about purchasing that software if covering 70 students for a semester cost $400 vs $800 vs $1000?
This question keeps me up at night, as I presume it does for my colleagues at Julia Computing. Not that switching to a fully commercial model is necessarily a bad thing, but Julia Labs, like any academic group, always has to worry about where funding will come from.
I think that the problem is that the old model (proprietary development) is pretty unworkable too - but the effects are distributed less starkly. In the old days we had compilers from various vendors at a range of obsolescence, source code would work under a particular compiler and the compiler license was associated with that component. There was no budget for buying a new compiler for a particular project and maintenance gradually got worse and worse, meaning things had to be rebuilt eventually. Good for developers, good for vendors, bad for the bottom line and wider economic development. I wonder what the right size of operation for implementing, innovating and supporting a project like Julia is (not what people would dream of, but the operation that would just about do the job effectively) and I wonder what models could be created to sustain that kind of operation over the right kind of timescale.
The two aren't really comparable -- SageMath integrates a huge list of systems which don't have any support in Julia at all (for example, group theory, my personal area).
DoD is the only organization that is currently providing me with grant funding... (and not much). Microsoft also donated some money to Sage this week! And of course there is Google Summer of Code, which is sponsoring many students to work on Sage this summer.
I'm not familiar enough, but is there any path for you to make it more marketable with things like:
-some sort of iOS/android app
-somehow tying in computing modularization? I'm thinking like tensorflow or other such things
-picking off modeling tasks as use-cases. I get this from my step dad whose lab work has been in modeling..well, this: http://esd1.lbl.gov/research/projects/ascem/
I wonder if a simplified variant of SageMath / SageMathCloud would make sense as a Sandstorm app. IPython on Sandstorm is a generally excellent experience, and the sage notebook is a similar concept.
"Algorithm implementations are proprietary.
This needs to end. Science is not done behind a
curtain. At least, it has not been hidden since
Tartaglia and Cardano fought over solving the cubic."
However, even if the Ma* software were suddenly open sourced it would be obvious that there was a huge problem.
Mathematics rests on several pillars which are currently absent in computational mathematics.
Mathematics rests on proofs. Where are the proofs for computational mathematical software?
Mathematics rests on research papers, books, and references. Where are the explanations of the theory behind the code? Where are the explanations of the design choices, such as which version of an equation was used and why? Computational mathematics needs much more than bare code.
Mathematics rests on courses and students. There may be one course but a whole focused curriculum on computational mathematics needs to exist.
Mathematics rests on funding. Universities, government, and some government organizations, like Oak Ridge, are the primary support.
William and I have had several discussions around our common problem of finding funding. I was the lead on Magnus (Infinite Group Theory) at City College of New York. We struggled for funding all the time. Axiom can't be funded because there is no organization to handle receipts. Funding agencies like accountants.
Indeed, accounting is vital to open source funding. I contacted several large organizations asking them to set up an "open source accounting firm" (OSAF) which would accept and administer the grants to open source projects. OSAF would accept the grant, maintain the account, disburse funds for valid receipts, and maintain financial records for inspection. Such an accounting organization is needed if an open source project is going to get government or company funding.
If SageMath could handle the OSAF issue then the various contributing projects used by Sage could apply for grants from companies or government, knowing that there is an organization capable of managing the funds. This has the non-trivial side benefit of making SageMath the primary focus.
Oh, and SageMath could take the "overhead" (more than 50% at most schools) for "paperwork". My provost lived rather well on the grants.
Thanks for posting. (1) I strongly agree that the problems of rigorous computational mathematics are far, far deeper than anything I even begin to hint at in my talk. It's daunting. (2) I think http://www.numfocus.org/ -- a new org I think Travis Oliphant co-founded -- may be an attempt to solve some of the accounting organization issues. What do you think of what they are doing? Maybe the best thing is to work with them.
I'm not familiar with the software itself, but how would starting a company, which is just a legal interface around the effort, advance the effort? Several comments here indicated that documentation was an inhibitor. Would the company structure fix that?
"You should realize at the outset that while knowing about the internals of Mathematica may be of intellectual interest, it is usually much less important in practice than you might at first suppose. Indeed, in almost all practical uses of Mathematica, issues about how Mathematica works inside turn out to be largely irrelevant. Particularly in more advanced applications of Mathematica, it may sometimes seem worthwhile to try to analyze internal algorithms in order to predict which way of doing a given computation will be the most efficient. But most often the analyses will not be worthwhile. For the internals of Mathematica are quite complicated."
Reference: http://reference.wolfram.com/language/tutorial/WhyYouDoNotUs...
For comparison, if you want to audit the Sage Math algorithms that your research depends upon, all you need to do is fire up a text editor (or browse their github). And you won't find any statement in the Sage Math docs telling you not to bother because you're too dumb to understand what you're reading anyway.