Ask HN: How much of your time is spent on classical algorithmic problems?
75 points by petenixey on March 16, 2013 | 61 comments
Almost none of the time my team and I spend writing code is spent focussed on solving "classical algorithmic problems" - search / sort / comparison etc. It is all writing clear logic.

Our efforts are all around writing unambiguous code which does what it says on the tin (and says on the tin what it does). Code which other programmers can easily work with and code which interacts defensively with the rest of the system and only affects that which it's told to.

I'm interested in how representative that is. How many of you have programming jobs which are genuinely algorithmically driven?

What fraction of your time is spent on "classical algorithmic problems"?




5% - but it's the very important 5% where I'm optimizing existing code. The "classical algorithmic problems" you reference are almost all optimization problems; there's a naive search/sort/comparison, but how can you do it better?

The algorithmic problems I deal with are things like:

- Why the hell is this API call running O(n*m) SQL queries? Can I tweak my logic and use of the ORM to get it down to O(1)? The ORM doesn't let you forget about the underlying algorithms, it just makes it easy for you to shoot yourself in the foot if you don't understand them.

- Why is my single SQL query so slow? DB optimization is very much a "classic algorithmic problem", in that you need to understand the operations the DB performs and optimization is reducing its search space / # of operations.

- We're using too many jQuery pseudoselectors like ":hidden" and they're causing framerate drop on a particular page; can we use some dynamic programming / memoization techniques to dramatically reduce use of pseudoselectors?

- We need to figure out what font size will allow a piece of text to fit within a container on a PDF, but the bounds of the font (space consumed) don't scale perfectly linearly with the font's point-size. Finding the best fit is a search problem! (A sketch of that search follows at the end of this comment.)

- And that's not even getting into infrastructure scaling issues! Do you know why Vertica might be a better fit for your data set and end user needs than Hadoop? If you understand the difference between an ETL approach and a MapReduce approach to data analysis, you're thinking about algorithms!

I spend a meaningful amount of time on "classical algorithmic problems" while doing frontend and backend web development, even if I've never had to re-implement mergesort.
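To make the font-fitting item above concrete, here's a minimal sketch of that search, assuming a hypothetical measure_height(text, size) helper standing in for whatever the PDF library reports:

    # Hedged sketch: binary-search for the largest font size whose rendered
    # text still fits in a container. The fit is monotonic in size, so a
    # binary search beats linear trial-and-error.
    def best_fit_size(text, max_height, measure_height, lo=4.0, hi=72.0, tol=0.1):
        """Largest point size in [lo, hi] whose measured height <= max_height."""
        best = lo
        while hi - lo > tol:
            mid = (lo + hi) / 2.0
            if measure_height(text, mid) <= max_height:
                best = mid          # fits: try larger sizes
                lo = mid
            else:
                hi = mid            # too big: try smaller sizes
        return best

    # Toy usage with a fake measurement where height grows with point size.
    print(best_fit_size("hello", 20.0, lambda t, s: s * 1.3))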


Apart from maybe the last one, none of them are classical algo problems.

They're basic math problems.

A query takes N milliseconds, which is too long. It does x * y = z of these things and a * b = c of those things, and when I run my profiler the time mainly consists of z and c.

As z is the biggest number, can I reduce x or y?

Basic maths.

And even for the last one, you're better off actually trying out the technologies than trying to armchair-theorize about which tech is a better fit: without reading all their code you can't really see what they're actually doing compared to what they say they do. Heroku being a pertinent case in point.


What's "classical" in your book?

Search trees, dynamic programming, and reducing an algorithm's worst-case runtime are about as "classical" as you get; that's why they're in every beginning algorithms textbook.


Unless you know "classical algorithms" and their typical performance characteristics, you are bound to waste time tweaking and may even rediscover a few complexity theorems, reductions and even complexity classes all on your own.

Or you could just know the basics and the classical algorithms, and be able to do a smell test of algorithmic performance.


In the last two months I started deliberately increasing the amount of coding time that is classical algorithm time. Not by working through books and problem sets, but by redefining day-to-day engineering problems that have an easy way out - bash away at them with "industry best practices" and whatever comes to mind until they don't break - into formalized versions that become first data-driven and then a DSL, so that each "100 lines" solution becomes an effective one-liner, in the fashion suggested by VPRI's work on new languages. This motivates studying at the algorithmic level and researching PL theory to synthesize that ideal solution.

As a result of this, I am writing very, very few lines of code, in a start-and-stop fashion. I believe it is over 70% algorithmic at this point. The reliance on deep insight is leading me to actively avoid the typical coder's flow state and spend a lot more time on "sit and think" instead. If I hit flow for a long stretch something is wrong.

And, certainly, I would have more "shipping code" at the moment without going this route. But it's still ultimately guided by problems I'm familiar with from doing them the "best practices" way - and each time I make progress, I think to myself "why did I not do it this way before?" (It helps that I have a lot of freedom, working alone.)

Also, it's quite scary doing things this way. The high uncertainty about what the solution is, even knowing the problems well going in, is absolutely terrifying.


Can you show some code that is the result from this process?

It seems really interesting, although I've always consciously fought with the tendency to overformalize problems I don't yet understand well.


I'm embarrassed to show it since I only have parts of a system right now, not a finished system. But as I implied in the OP, the code I'm writing is mostly conventional algorithms and data structures; the twist is in some subtle detail that ripples out towards the top level. I'll give some examples which will hopefully be more instructive than a code dump.

One of my targets was the entity system used to describe a scene in video games. Conventionally this involves an OO hierarchy of some kind, either compositional or inheritance based. The problem is that you don't know up front all of the sorts of properties an entity will need and what algorithms it will work with; the integration of the various subsystems acting upon an entity is a high-maintenance area and tends to involve messy boilerplate data tracking. I pursued the problem further, and came up with a way to move the integration process out of the subsystems code.

This was done by creating a customized set of collections (array, key-value, bag) that associate their data against an entity structure, can handle multiple instances of data associated against the same entity, can efficiently access data of a specific entity, and can remove all of the data related to that entity upon a despawn request. Once these constraints are satisfied, the only thing each subsystem has to do is work with those collections.
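A minimal sketch of one such collection (the names here are illustrative, not the actual code): values are associated with an entity id, multiple values per entity are allowed, and a despawn drops everything tied to that entity.

    # Hedged sketch of a "bag" collection as described above; EntityBag and
    # despawn are made-up names, not the commenter's real API.
    from collections import defaultdict

    class EntityBag:
        def __init__(self):
            self._data = defaultdict(list)   # entity id -> list of values

        def add(self, entity, value):
            self._data[entity].append(value)

        def get(self, entity):
            return self._data.get(entity, [])

        def despawn(self, entity):
            self._data.pop(entity, None)     # remove all data tied to the entity

        def items(self):
            # subsystems iterate over (entity, value) pairs and do their work
            for entity, values in self._data.items():
                for value in values:
                    yield entity, value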

Also of interest, and still in progress, is an implementation of J. Paul Morrison's "Flow Based Programming" concepts to control overall program structure. I was looking around at ways to minimize coupling, and this particular method added a degree of structure and contractual behavior.

The first thing that happened after taking on this concept was to work through the ramifications for interactive programs: I had to start with a big data structure containing all the state, and then split it out and merge it together to attain synchronized behaviors. This in turn motivated the entity system idea. It's a very spatial model, so I am now doing some work on a graphical tool.
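To give a flavor of the FBP idea (a toy illustration, not the poster's system): components only see their input and output queues of packets, and the wiring between them lives entirely outside the components, which is where the reduced coupling comes from.

    # Toy flow-based-programming sketch: each component reads packets from an
    # input queue and writes to an output queue; the graph wiring is external.
    import queue, threading

    def component(fn, inq, outq):
        def run():
            while True:
                packet = inq.get()
                if packet is None:          # sentinel: shut down and propagate
                    outq.put(None)
                    return
                outq.put(fn(packet))
        threading.Thread(target=run, daemon=True).start()

    # Wire up a tiny pipeline: numbers -> doubler -> printer
    a, b, c = queue.Queue(), queue.Queue(), queue.Queue()
    component(lambda x: x * 2, a, b)
    component(lambda x: print(x) or x, b, c)

    for item in [1, 2, 3, None]:
        a.put(item)
    while c.get() is not None:              # drain until the sentinel arrives
        pass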

NoFlo is an example of a more finished FBP implementation than mine: http://noflojs.org/

I have some other ideas too, but they're in a very early stage so there's little to say. Having prior experience in the domain is a huge help, and I wouldn't be taking this approach to conquer arbitrary new problems. I'd try to make at least one mess first ;)


My current job is to write software for those huge assembly-line robots you see in factories.

I entered this field (taking a significant pay cut) thinking that maybe I could put some of my algorithmic knowledge to use. Turns out 90% of my job is rote logic - rewriting process flows and the like. Nothing too smart. Really boring tedious stuff.

The 'cleverest' task I did was implementing a Kalman filter to smooth over some encoder readings. That was the most fun I had in a while.
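For anyone curious, a scalar Kalman filter for smoothing a noisy reading really is only a few lines. A minimal sketch (the noise constants here are made up, not the commenter's tuning):

    # Minimal scalar Kalman filter for smoothing noisy encoder-style readings.
    # q (process noise) and r (measurement noise) are illustrative values; in
    # practice they come from tuning against the real sensor.
    def kalman_smooth(measurements, q=1e-4, r=0.25):
        x, p = measurements[0], 1.0          # initial estimate and variance
        out = []
        for z in measurements:
            p += q                           # predict: uncertainty grows
            k = p / (p + r)                  # Kalman gain
            x += k * (z - x)                 # update estimate toward measurement
            p *= (1 - k)                     # uncertainty shrinks after update
            out.append(x)
        return out

    print(kalman_smooth([10.2, 9.8, 10.5, 10.1, 9.9, 10.3]))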


Averaged over my entire career, about 0%.

In isolation, I've had a few brief stretches where I spent some time doing low level algorithmic stuff. I wrote some sorting/searching stuff by hand once, because I was working on external files, not doing it in memory. There was, at the time, either no library that did exactly what I was trying to do, or I didn't know about it.

More recently I've done some graph theory based stuff (Floyd-Warshall algorithm for "all pairs - shortest paths"), as I was experimenting with some social network analysis stuff that I wanted to do. But even that probably won't go anywhere. As I explore the capabilities of existing off-the-shelf graph database products, I'm finding most - if not all - of what I need. At one point I thought I might need to roll my own, but it's a burgeoning field and there are more and more OSS options popping up all the time.
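(For reference, Floyd-Warshall itself is tiny - the work is in everything around it. A minimal version, assuming a dense adjacency matrix with math.inf for missing edges:)

    # Floyd-Warshall all-pairs shortest paths: dist is an n x n matrix of edge
    # weights with math.inf where there is no edge. O(n^3) time, O(n^2) space.
    import math

    def floyd_warshall(dist):
        n = len(dist)
        d = [row[:] for row in dist]                 # don't mutate the input
        for k in range(n):
            for i in range(n):
                for j in range(n):
                    if d[i][k] + d[k][j] < d[i][j]:
                        d[i][j] = d[i][k] + d[k][j]
        return d

    INF = math.inf
    print(floyd_warshall([[0, 3, INF],
                          [3, 0, 1],
                          [INF, 1, 0]]))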


The most useful part of studying algorithms is knowing what problems they solve. If you've never heard of sorting algorithms, you'll waste time implementing a bad solution.

(Everyone knows about sorting algorithms, but many algorithms are not well-known. They're almost never needed, but when they're useful, they're really useful.)


I would say the most useful part of studying algorithms is knowing how to analyze them and understanding the meaning of asymptotic analysis. Understanding that asymptotically fast algorithms might be slow for the problem sizes you are dealing with is the most important thing an algorithms course can teach.


This. I've been marking undergrad assignments, and you would not believe how many sets and sorted sets I've seen implemented by linear scan over an unsorted array (or linear scan over a sorted array with linear time insert, and a few other, worse things). This lack of understanding of algorithms leads to slow, unscalable solutions, and even writing business middleware you at least need to know which standard library collections to use for good performance.
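The difference shows up even in a toy benchmark (a quick sketch, nothing rigorous): membership tests against an unsorted list are O(n) per lookup, against a set roughly O(1).

    # Why the collection choice matters, in miniature.
    import time

    items = list(range(100_000))
    as_list, as_set = items, set(items)

    def time_lookups(container):
        start = time.perf_counter()
        for i in range(0, 100_000, 100):
            _ = i in container
        return time.perf_counter() - start

    print("list:", time_lookups(as_list))   # linear scans add up
    print("set: ", time_lookups(as_set))    # hash lookups stay flat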


Leading up to my current job, virtually none. However, at my current job (Firebase), it now makes up a significant percentage of what I do. The way I've described it to friends is: "You know how most companies ask you tree traversal/sorting/search questions in an interview, then when you show up for work the first day, your job turns out to be 'add this button to this web form'? We actually work on those algorithm problems." It helps that we are an infrastructure company and that our user interface is an API.

If this is something you're looking for, my recommendation is to look at either large companies that have large infrastructure needs (Google, Facebook, Twitter, Amazon, LinkedIn, etc.) or small infrastructure-related startups, depending on your preferred working environment. If the hard parts of your job are offloaded to the database/language runtime/middleware, look for companies that work on databases, language runtimes or middleware. And in the interview, ask them for examples of hard problems that they have solved.

Also, I have absolutely nothing against the companies that don't do this kind of work. In fact, Firebase exists so others don't have to do this kind of work. This is just an observation that occasionally, the skillset tested in these interviews is not in line with the responsibilities a successful candidate will end up having, so make sure you ask.


No one spends time on the job solving "classical algorithmic problems"; they are solved already ;) Unless you work in research and are looking for more efficient solutions.

But understanding algorithms brings you to a new level and opens a wide scope of other problems you can solve by applying this knowledge.

If you are not using knowledge about algorithms in your programming job it doesn't mean such knowledge is useless. Maybe that means your problems are not hard enough for this knowledge to be applied.


I don't spend a lot of time on these kinds of problems. But the times I do, it's an exhilarating experience. Knowing the theory of big O notation and the time complexity of algorithms has allowed me to build amazing things; I've even used it for GUI work and JavaScript in the browser.

The most exciting stuff has been implementing things where I was unsure whether what I was trying to do would be possible at all in a browser. The feeling you then get when it actually works, and you know that you've pushed a limit you've never seen anyone push in the browser before, can't be beaten.

If you push yourself to the limit, think outside the box, etc., you are going to need knowledge of algorithms.


I did game development in a previous lifetime. Probably half my day was implementing algorithms and thinking about the correct data structure for a problem.

I spend less time these days implementing algorithms. However, you need a good understanding of algorithms to decide between using ActiveRecord or a service layer. You need a good understanding of algorithms to pick between a relational DB, a key-value store, or a document store.

Maybe you aren't picking difficult enough problems if you don't have to think about this kind of stuff.


I've mainly done web development and typical university style assignments for the bulk of my programming experience- barely encountering algorithmic problems that weren't well defined.

Recently, I started creating a throwaway physics/game engine for fun and it is an awesome way to learn some really cool algorithms. Also, I did a small image manipulation app that exposed me to some cool stuff as well.

As a new grad it definitely pushed me more towards game development- would you mind sharing your reasons for leaving?


Game development is hard. The average engineering role at a games company involves a lot of work on hard deadlines.

It's also tough for some engineers because there's usually a very clear product vs engineering divide -- you're implementing stuff other people come up with, and when they change their minds you throw it out and do it again a different way.

That said, making games is also a ton of fun and is a great way to be exposed to a lot of interesting engineering challenges. If you like games and you can put up with the issues above, you should definitely consider it.


Pretty much 0%. Mostly because of libraries already solving the problem for me.


Just about every project I've worked on contained one "chunk" of the project that was an algorithm, while the remaining 95% of the functionality was implemented by Postgres, Apache, Django, etc. This is why I (and other people who work this way) am able to get things done so fast. As I've gotten better at launching projects, my algorithm-writing skills have not gotten much better, but my "open source understanding and researching" skills have multiplied a hundred-fold.


In 25 years of coding I needed to sort something about three times (not counting SQL) in total. So when somebody asks me to write some sorter, like in a job interview, I'll probably fail...


Well, here's something to think about: is it possible that sorting some data could have improved the performance of your code? Here is an example that my adviser used when he last taught algorithms at the undergrad level:

Given three arrays of integers, determine if it is possible to take a sum of three integers, one from each array, that equals a target value (say, 0).

It is not immediately obvious, but sorting the arrays will allow you to solve the problem much faster (whether or not this is the fastest solution possible is actually an open question). This problem has connections to several computational geometry problems, and I have heard that variations of it come up in certain real-world applications.
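A sketch of the standard trick: sort two of the arrays, then for each element of the third walk the sorted pair with two pointers, giving roughly O(n^2) after sorting instead of the naive O(n^3).

    # Sort-based approach to the three-array sum problem: for each a in A, look
    # for b in B and c in C with b + c == target - a via two pointers.
    def three_array_sum(A, B, C, target=0):
        B, C = sorted(B), sorted(C)
        for a in A:
            want = target - a
            i, j = 0, len(C) - 1
            while i < len(B) and j >= 0:
                s = B[i] + C[j]
                if s == want:
                    return a, B[i], C[j]
                elif s < want:
                    i += 1
                else:
                    j -= 1
        return None

    print(three_array_sum([5, -2, 9], [1, 3, 7], [-4, 0, 1]))   # -> (-2, 1, 1)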

Also, there is more to algorithms than just sorting.


My experience over about 20 years of writing software professionally is that I need to tackle one algorithmic problem a year. Probably amounts to a day or two of work, and usually it is a refinement or adaptation of an existing algorithm, rather than totally new code. Call it less than 1% of my time.


I work on a large commercial database, mainly on optimizing joins, and a lot of the work that has to be done is algorithmic.

Library functions are normally too generic, and therefore we end up implementing lots of different variations of fast and memory-efficient searching, sorting, graph optimizing, etc.


At my last job, the coolest project I did was to create an image duplicate detection system (so that it would tell, given an image, whether the image is new, or a version of what was uploaded before, or contains fragments of some known image, etc). It used OpenCV for feature point detection, so I had to maintain an index to search for those points (there were around 7M images, and load was up to 50 rps). I came up with a rather clever algorithm, which as I later discovered (thanks, Wikipedia) is a kind of implicit k-d tree.
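These days the off-the-shelf version of such an index looks something like scipy's k-d tree. A sketch with random stand-in descriptors (nothing like the real 7M-image setup):

    # Sketch of a k-d tree index over feature-point descriptors, queried for
    # nearest neighbours; the random "descriptors" are illustrative only.
    import numpy as np
    from scipy.spatial import cKDTree

    rng = np.random.default_rng(0)
    db_descriptors = rng.random((10_000, 32))    # descriptors of indexed images
    tree = cKDTree(db_descriptors)

    query = rng.random((5, 32))                  # descriptors from a new upload
    dists, idx = tree.query(query, k=1)          # nearest stored descriptor each
    print(idx, dists)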

That task was the most algorithm-intensive task I ever had, and the algorithmic part took 20% of the time. 80% was straight coding, gluing things together, testing and putting all that into the production system.


100% of my time is spent on cache invalidation and naming things.


It's off-topic, but could anyone critique a rule for (not) naming things I came up with recently:

If something (an expression or function) is used more than once, give it a name. If it's used once, name it iff you would otherwise comment to explain it. Otherwise, inline it.
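For what it's worth, applied to code the rule looks something like this (a toy example, all names made up):

    # The expression is used once, but it would otherwise need a comment to
    # explain its intent, so it gets a name instead.
    order_total, order_discount, account_balance = 120.0, 10.0, 95.0

    # Before: a bare expression that needs a comment
    # if order_total - order_discount > account_balance * 1.1:  # over soft limit
    #     print("reject")

    # After: the name carries the explanation
    over_soft_credit_limit = order_total - order_discount > account_balance * 1.1
    if over_soft_credit_limit:
        print("reject")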


Sure, that's basically a rule for when to do the "introduce explaining variable" refactoring [1]. There's a bit of taste as to when to name things in this instance, but certainly if a name can replace a comment then it's better to use a name. You might additionally use a name for something you wouldn't comment, if the logic is straightforward but the code is long enough that it would otherwise be hard to read.

If you use something more than once, then you either have to give it a name or repeat it, and generally people like to avoid repeating things. Some people will accept two copies and get rid of the duplication when they get to three copies, though purists would dislike that.

[1] http://www.refactoring.com/catalog/introduceExplainingVariab...


A lot of CS people complain about that. Especially in college and those who are just joining the work-force.

The classical problems were problems back in the day. There is no point in re-implementing them nowadays. But it is very important to know how they are implemented, and the logic behind them.

Why?

Because we are tackling much, much harder problems. Fields like data analysis, NLP, machine learning are yet to reach their potential. If we cannot master the easy classical problems, how are we expected to tackle the bigger fields? By master, I mean understand and not re-implement/re-invent the wheel every single time.


Conversations like this make me wonder if there's some disruption waiting to happen. My understanding is that from a theoretical perspective software should be reducible to a combination of algorithms and business rules, but we seem to spend our time on something else. Is there some meta-solution to the rote stuff that we just haven't noticed? Doesn't it seem like the most tedious tasks - Extract-Transform-Load, for example - could get automated away? Is there something preventing that from happening? What is it?


You are assuming that you know both the algorithms and the business rules when you start the project. That's rarely if ever true.

People in charge of setting those business rules also rarely know in detail what they actually want. This is what is meant by the joke that writing accounting software is an order of magnitude harder than building a kernel, even if the latter is much more challenging technically speaking.


First, in education, sorting algorithms are used as prime examples of well-defined problems with well-studied solutions; they are not taught because every student is expected to work on sorting problems for the rest of their life. It is usually the first part of a course on the design and analysis of algorithms.

As for my answer: Most of my time is definitely in the usual grind of software engineering: designing, testing, maintaining, etc... However, at my lab we have been engaged in a number of projects involving network protocol engineering and the development of formal code/model verification tools. As a first step toward the development of systems like this, you must be able to construct a precise formal model, and then show that in such a model so-and-so properties hold true (including memory usage, upper/lower bounds, stability [in cases of routing algorithms], intractability of certain values [in the case of secure protocols], etc...). If you are developing software for usual line-of-business type applications, you'll do much less basic algorithm development than if you are, say, developing the routing software for a wireless ad-hoc mesh network.


When I was writing games, there was a lot of re-implementing nearest neighbor searches in Python. Now that I'm working on a SaaS web app, it's 90% UI busy work.


A lot. I work with large amounts of data so relatively minor changes in algorithmic complexity can determine if a method is practical or not.

Recently I found a great new dataset but it didn't work well with our database - the naive method of doing lookups was at least 10x too slow to be practical. Studying the underlying algorithms of our database helped me find a solution that took up 10x as much space but was 100x faster.
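The generic shape of that trade, schematically (not the actual database work): precompute an index that costs extra space but turns each lookup into a hash hit instead of a scan.

    # Space-for-speed sketch: a precomputed reverse index versus a full scan.
    records = [{"id": i, "key": i % 1000, "payload": i * 3} for i in range(200_000)]

    # Naive lookup: scan everything, O(n) per query.
    def lookup_scan(key):
        return [r for r in records if r["key"] == key]

    # Indexed lookup: extra memory for the index, near-constant time per query.
    index = {}
    for r in records:
        index.setdefault(r["key"], []).append(r)

    def lookup_indexed(key):
        return index.get(key, [])

    assert lookup_scan(42) == lookup_indexed(42)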


How much of my time is spent on algorithmic problems? Not as nearly as much as I would like!

I once wanted to stay away from web development because I knew a lot of it would be boring stuff like learning to work around browser bugs (which change with every generation of browsers, rendering your previous learning useless), rather than the mathematical/algorithmic mind puzzles which I really enjoy. I gave in because there's so much work available (and I found some clients who pay really well).

I think having experience working on both low-level embedded software and algorithmically intensive software (natural language processing, etc) gives me some advantage as a web developer. When the need to devise an algorithm or do something else hard-core comes up (which it does from time to time), I take it on with pleasure rather than shrinking back in fear. Or if I find a bug in my platform, I can drop down into the C source, figure it out, and submit a fix to the maintainers, rather than waiting for someone else to help.


I'm writing my master thesis in computer science, so quite a bit (it's a lisp-like macro system for Java). But then it's not a real job.


That seems to be a popular topic. At least, I just came across this yesterday: http://www.cs.utah.edu/~rafkind/papers/dissertation.pdf


Thanks, this is relevant to my interests :)


0%, these are solved problems that you can easily use a library for.


I guess that depends on which "classical" algorithmic problems he is referring to. The field of "algorithms" is like an infinite sea which will never be fully explored; there will always be more algorithmic problems for which no library is available.

For example, last year I worked on a software package which was intended to match clothing automatically to form outfits. The client was a fashion consultant; the software had to duplicate what he did in his day-to-day work, algorithmically. You won't find that one in "The Art of Computer Programming"! (Or in any library, for that matter.)


Depends on what you're doing. There are plenty of cases where, for example, the Java collections library doesn't quite do the trick for various reasons. E.g. the standard collections are very memory inefficient for storing primitive types, or the standard linked list implementation doesn't support all the operations you might want - e.g. being able to clone an iterator.


Directly writing? Almost none. Indirectly? A lot.

Queueing theory turns out to be a big part of the system I'm working on, as do graph traversals. I'm not proving anything new, but I am certainly transforming known solutions to be present implicitly in the way data flows through the device (a network gateway).

I've also done work on numerical solutions, implementing various formulae the researchers have come up with, and dealing with various optimization (in the mathematical sense) problems.

Lately there has been a lot of talk on "new classic" things, applying machine learning algorithms to data sets, and finding relevant classifiers and adjustments to the ML algorithms that provide better results. Hopefully I'll get into implementing and doing real work on that soon.

I do work in academia though, in a department focused on implementing research in ways that industry can actually use it.


Zero. It's all refactoring ugly, old, unmaintainable and unexpandable code to remove all those characteristics. It's an exercise in "engineering" that involves zero algorithmic work apart from the occasional replacing of a linear scan through an array with some more efficient data structure.


At Mixrank we process a lot of data, so a lot of our time goes into designing efficient data systems. Algorithmic complexity really matters, and small tweaks can result in large improvements.

If you'd like to spend more time designing algorithms, then working on big data systems is a sure bet.


Only a few times, but when I needed to solve something like that, it was really important. A few examples: processing a giant, highly-interconnected graph that was too big for memory, so figuring out how to split it up into parts small enough to handle; learning Bayesian statistics to make strong inferences from millions of weakly-indicating data points; making inferences about the nature of edges in a graph with tens of millions of nodes.


~50% - I'm in "big data" right now, partly because I get to worry about the algorithmic complexity of things all the time. As the scale of a task goes up and the feasibility of just throwing more machines at it decreases, you can suddenly justify spending a lot of programmer time on interesting things.


Your job (much like your life) is what you make of it. If you seek out problems that are unsolved & require solutions that involve thinking up a clean/simple solution, you'll find yourself spending much more time solving algorithmic problems than just writing/copying rote logic.


Zero percent. On the other hand, as much as 50% of the problems I solve require understanding the nature of the problem and its solution in terms of the classical algorithms, which are then by definition already available somewhere, at least as pseudo-code.


Professionally: 0

My own project: At least once a week.


Zero. I have not implemented any "classical algorithmic" code since college. Nobody writes search/sort/comparison code, or even implements data structure code such as balanced trees, skip lists, etc. because your framework or libraries provide those.


About 1/4th I'd say, another 1/3rd is thinking about architectural decisions (or restructuring bits that are broken) and the rest is mundane work including meetings. I work on infrastructure stuff


At BitGym we solve vision and sensor problems, and it has to be efficient because the result will run on an iPad. So yes, we spend a fair amount of time on it. Maybe 20% of our heads down time.


I think it depends. I've been building GUI builders and there is a ton of tree manipulation and traversal. Beyond that though most operations are already available in the standard libs.


Roughly 50% of my current duties involve parsing and transforming an AST, as well as performing static analysis.
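In Python terms, the flavor of that work looks like the ast module's visitor/transformer pattern (a small sketch for illustration; the actual tooling targets HTML/CSS):

    # Parse source into an AST, rewrite a node, and emit source again.
    import ast

    source = "total = price * qty\nprint(total)"
    tree = ast.parse(source)

    class RenameVariable(ast.NodeTransformer):
        def visit_Name(self, node):
            if node.id == "total":
                node.id = "order_total"      # simple rename as a stand-in transform
            return node

    new_tree = ast.fix_missing_locations(RenameVariable().visit(tree))
    print(ast.unparse(new_tree))             # ast.unparse needs Python 3.9+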

All of it is to support web developers by giving them tools to handle HTML and CSS well. But this is not the usual case for my career; much of it has been exactly what you describe instead.

I'm just glad that I knew enough of this stuff to be able to tackle these problems when there was a need to.


Zero.


I would say less than 1%. Most of the time it's picking the best from already existing solutions to suit the need.


Not much of my time but a lot of my value.


Take a job involving map-reduce. Search/sort/comparison is about all you'll do.
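Even the canonical word-count example is mostly a sort in disguise: the shuffle step between map and reduce is exactly a group-by-key (a sort/comparison). A miniature sketch:

    # MapReduce in miniature: map emits (key, value) pairs, the shuffle groups
    # them by key (a sort/comparison step), reduce aggregates per key.
    from itertools import groupby
    from operator import itemgetter

    docs = ["the cat sat", "the cat ran", "a dog sat"]

    mapped = [(word, 1) for doc in docs for word in doc.split()]        # map
    shuffled = sorted(mapped, key=itemgetter(0))                        # shuffle
    reduced = {k: sum(v for _, v in grp)                                # reduce
               for k, grp in groupby(shuffled, key=itemgetter(0))}

    print(reduced)   # {'a': 1, 'cat': 2, 'dog': 1, 'ran': 1, 'sat': 2, 'the': 2}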


First, let's look at what is an 'algorithm': The examples that got famous were the sorting and searching algorithms in Knuth's TAOCP. So, people quickly coded up bubble sort which has running time proportional to n^2 for sorting n items, and in practice n^2 grew too darned fast. So, people were eager for Shell sort, quick sort, heap sort, merge sort, etc. which all had running times proportional or nearly so to (n)log(n) in average case or even worst case. Then there was radix sort which could beat (n)log(n). So, people got excited about 'algorithms'.

So, what did we get? We wanted to sort (or work with trees, etc.). We knew how to do that manually. We could write the code, but our first cut code tended to be too slow. Heap sort, etc. are really clever. Stare at the code (typically are doing an informal proof of correctness by mathematical induction based on the definition of the natural numbers) and can confirm that it sorts, and with analysis as in Knuth can confirm the running time, e.g., worst case (n)log(n). So, net, there's not much question about what it does, that it actually sorts, and how fast it is. Great -- for sorting and searching.

Do I ever do such things? Actually, early in my project I wrote a bunch of heap sort routines, polymorphic, etc. I wanted to write some 'stable' versions but didn't get around to it this time. Also can write versions that sort by 'surrogate', that is, move only pointers and not the data. For some of the details of how Microsoft's Visual Basic .NET works with strings and instances of objects, there is some question about the value of sorting by surrogate! Then with the results, can have an O(n) routine that will take the 'permutation' of the pointers and move the data. Etc. I didn't write everything I could think of in sorting, but some of what I wrote is good and used in my code. And I got some polymorphic code that is only 2-3 times slower than a non-polymorphic version. So, yes, I did some work in algorithms, even some quite old ones.

Next I needed a 'priority queue' so borrowed the heap data structure from heap sort and, presto, got a nice priority queue routine. It's in my production code.
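In Python the same heap-backed priority queue is a few lines over heapq (a sketch of the idea, obviously not the poster's .NET code):

    # Heap-backed priority queue: push and pop-min are both O(log n).
    import heapq

    class PriorityQueue:
        def __init__(self):
            self._heap = []
            self._count = 0                      # tie-breaker keeps pops stable

        def push(self, priority, item):
            heapq.heappush(self._heap, (priority, self._count, item))
            self._count += 1

        def pop(self):
            priority, _, item = heapq.heappop(self._heap)
            return priority, item

    pq = PriorityQueue()
    for p, task in [(3, "low"), (1, "urgent"), (2, "medium")]:
        pq.push(p, task)
    print([pq.pop() for _ in range(3)])   # [(1, 'urgent'), (2, 'medium'), (3, 'low')]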

So, yes, I worked in algorithms. And for my project, that work is crucial in the sense that without it my software wouldn't work or, if I had used code I knew about from others, not work as well.

But now there is another issue in computing and software: There are old examples; since they are more explicit, let's start with those. For a positive integer n, suppose A is an n x n matrix of real numbers and we seek x, n x 1, to solve Ax = b where b is also real numbers and n x 1.

Now we need an 'algorithm'? Not really! An 'algorithm' as in sorting is not enough! Instead we need some mathematics prior to any sense of an 'algorithm'! Usually the mathematics we use is Gauss elimination, and a good chunk of a course in linear algebra makes clear just what that is doing and why it works.

Here's much of the difference: Looking at a description of code for, say, heap sort, it's easy to see that it sorts. Looking at a description of code for Gauss elimination, it's totally obscure why it solves Ax = b, that Ax = b has none, one, or infinitely many solutions, and if infinitely many how to characterize them. Instead of just the code, just the 'algorithm' for Gauss elimination, we need the math from, say, linear algebra. Similarly for a numerical solution to an initial value problem for an ordinary differential equation.

Then more generally, for some software, it may be that some math, new or old, is crucial for knowing what the software will do, much as the linear algebra was crucial for knowing what some Gauss elimination code will do. E.g., in Gauss elimination, turns out have some choice about the 'pivot' elements. Well, typically want to select the pivot that is largest in absolute value. Why? Some math! Next, do a lot of 'inner product' arithmetic. Then want to accumulate inner products in double precision. Next, when get a solution, want to do a little more work with double precision and do some 'iterative improvement'. And would like to know the 'condition number' to get some error bounds -- all in the code and all needing some math to know why it works.

So, net, for many good solutions to real problems, an 'algorithm' alone does not have enough to recommend it; it needs some prior logical support from, say, some applied math. So, in code, algorithms are not everything that is important.

In my project, some original math was crucial, and got converted into code. But to have any confidence in what the code will do, can't just stare at the code and, instead, need the math. That is, just an algorithm was not enough.

So, in my project there has been work in both algorithms and applied math. That work is crucial in that without it the project would flop or nearly so.

But, right, also is working through a lot of routine logic. E.g., the code for yesterday was to be clear, at the start of the code for a Web page, just where the user came from. There were three cases, the first two legal and the third a symptom of an attempt at hacking. So, I needed to work out the logic. For that I needed some answers about how Microsoft's ASP.NET works. E.g., if we transfer to page B from page A via server.transfer("B") in page A's code, then in the code of page B, what is the value of Request.UrlReferrer? Use the TIFO method -- try it and find out. So, right, such 'logic' is routine but takes time.

Then where most of the time went! Right, to none of the above! Most of the time, likely over 80%, went, may I have the envelope, please? Yes, here it is (drum roll): The nominees for where most of the time went are (1) working through poorly written technical documentation, (2) software installation, (3) software configuration, (4) data backup and recovery, especially for the boot partition, (5) system security, e.g., fighting viruses, and (6) fighting bugs. And the winner is, (1) working through poorly written technical documentation! Heck, reading Knuth, Ullman, Sedgewick, etc. was fast, fun, and easy, but reading the documentation on the 'platform' was painful and wasteful.

So, the crucial stuff unique to the project took the least time. The rest of the work unique to the project took much more time but still was fast, fun, and easy. Work on (1)-(6) essentially independent of the project took maybe 80% of the time.

In particular, the most valuable work, that gives the project some potential, took the least time; routine work still unique to the project took much more time but still was fast; nearly all the time went for work not unique to the project.

Why? There is good/bad news: Some of the good news is that there is now so much infrastructure software that we write less of our own code and use code written by others. The bad news is that when using code written by others, we also need to use their documentation, and for the industry so far, writing good documentation is, in practice, essentially too difficult. E.g., I had to use just the TIFO method to determine what would be the value of Request.UrlReferrer. As far as I could tell from my 4000+ Web pages of documentation, the only way to know was the TIFO method. Then, of course, in my code I did put some documentation. I wish I didn't have to write my own documentation of the tools I am using from others.

Net, what's crucial and unique to a project may not take the most time. That is, the work that takes the most time may not be the most important work. And, then, can't measure importance to a project of some work by the time the work took.


Interesting, but I noticed you write about your project, singular. Have you had any other projects?


Sure. At one time I was working to support my wife and myself through our Ph.D. programs. The work was in applied math and computing on mostly US Navy problems. At one time, there was an urgent request: Evaluate the survivability of the US SSBN fleet under a controversial scenario of global nuclear war limited to sea.

Point: A claim is that if find an SSBN, then can kill it with a nuke -- likely true. Another claim is that can't find an SSBN -- not really true but close to true since a promising effort to find the SSBNs would be 'provocative' and raise alarms. Also, if find and sink one, then that's the start of nuclear war, and the other SSBNs will be free to fire. So, what is tough is to find all the SSBNs at one time and sink them all at once. But if in the special scenario, then could find and sink the SSBNs one at a time. Then how long would they last?

The Navy wanted their answer quickly, in two weeks. That was about right, since my wife already had us scheduled for a vacation cabin in Shenandoah starting the day after the due date!

So, I derived some math, wrote some software, and both the Navy and my wife got what they wanted on time!

My understanding is that my work was later sold to another group interested in US national security. I could tell you what that group was, but then I'd have to ...!

There have been other projects!



