Frequently Forgotten Fundamental Facts about Software Engineering (2001) [pdf] (kictanet.or.ke)
178 points by fauria on April 28, 2015 | 73 comments



When you say things like "I consider these to be facts" and "you may not agree with these facts", it's a clue that we may not in fact be dealing with facts.


The lack of citations is concerning. I could read his book but it worries me when numbers like these are thrown around without some way to verify the analysis.

The "28x productivity" (or 10x or Xx ...) claim always triggers warning bells for me. http://morendil.github.io/folklore.html does a pretty good job explaining how the research backing that widely accepted "fact" may be questionable.

I also really like Dan Luu's review of the research behind static typing at http://danluu.com/empirical-pl/ as an example of how a "well studied" claim can still be questionable due to the massive difficulty in evaluating software engineering empirically.

Software engineering is a very human activity and that makes it very hard to measure and quantify.

A book that does a better job is Making Software (http://www.amazon.com/Making-Software-ebook/dp/B004D4YI6G/re...) but as my first link points out, it still has some issues. At least its goal is to get more rigorous in our scientific analysis of software engineering.


"The Leprechauns of Software Engineering" is a fun read that takes a look into the origins of some of the software engineering folklore: https://leanpub.com/leprechauns/read


Thanks for the reference! It IS a fun read and informative.


In his defense, the full quote is actually not so bad.

I don’t expect you to agree with all these facts; some of them might even upset you. Great! Then we can begin a dialog about which facts really are facts and which are merely figments of my vivid loyal opposition imagination!

It's interesting that he was (is?) a vocal dissenter on Open Source.


Wikipedia says this:

>in 2000, Glass criticized open-source software, predicting that it will not reach far, and "will be limited to one or a few cults emerging from a niche culture." Glass's basis for this bold prediction was that open-source software "goes against the grain of everything I know about the software field"

Not to judge based on something that small, but if "everything he knew" told him open source was not going to reach far, then perhaps his other facts are off.


I don't see in what sense his "facts" are factual.

There are no references. They look more like random opinions - some interesting, but with a fair sprinkling of platitudes - supported by some equally random numbers, of the "Did you know fifty percent of statistics are made up?" kind.

I think it's interesting how dated the piece looks. Equivalent writing today on Quora, Medium, or HN - never mind an IEEE journal - would be more likely to discuss real research.

It still might not deal entirely in facts, but I think standards of argument have improved significantly - possibly because it's so much easier to find and reference studies than it was when a lot of debate happened very slowly in print.


I think he puts the "Facts" out there with a grain of salt - and admits it in the preamble. They're very much a source of discussion rather than hard and true facts. (Software engineering is a social science where experiments are virtually impossible to replicate. As such, it's only slightly more rigorous than "Facts" in History or Political Science classes.)


'phenomena' might have been a better choice. We have a bad habit of forgetting things that can actually be empirically measured.


One of the first things here is about the relationship between problem complexity and solution complexity. But how is this complexity measured empirically?

We have the notion of cyclomatic complexity, an empirical measurement that can be used to assess the complexity of a piece of code, and presumably that code is the solution to a problem. So, OK, we have something like a measurement of solution complexity, but how is problem complexity measured, and what is its empirical relationship to the measure of complexity of the solution? Hell, what units would we even measure such complexity in?
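
As a rough illustration of the solution side only (my sketch, not anything from the article), cyclomatic complexity is essentially one plus the number of decision points in the code, which a few lines of Python can approximate:

    # Illustrative sketch only: approximate cyclomatic complexity as
    # 1 + the number of decision points found in the source.
    import ast

    DECISION_NODES = (ast.If, ast.For, ast.While, ast.IfExp, ast.ExceptHandler)

    def approx_cyclomatic_complexity(source: str) -> int:
        tree = ast.parse(source)
        decisions = sum(isinstance(node, DECISION_NODES) for node in ast.walk(tree))
        # Each extra operand of an 'and'/'or' chain adds another branch.
        decisions += sum(len(node.values) - 1
                         for node in ast.walk(tree) if isinstance(node, ast.BoolOp))
        return decisions + 1

    print(approx_cyclomatic_complexity("def f(x):\n    return x if x > 0 else -x\n"))  # -> 2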

I think people who do what we do have a certain insecurity about how much of what we do is decidedly not science. And we have this tendency to talk in science-y ways about what we do, which sound good to our ears, but aren't quite rooted in anything concrete.


The problem with cyclomatic complexity measurements is that if you take two similar pieces of code, one with error handling for all potential failures and one with absolutely no error handling, you get a large variation in the resulting complexity metric. Both pieces of code will probably pass simple correctness tests. But you can't really even make a statement about which is more robust, as the code with all the error handling paths may be full of latent bugs of worse severity than (another thing hard to quantify) simply not handling the errors.
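
A hypothetical pair of functions makes the point: the happy-path behaviour is the same, but counting decision points the usual way, the defensive version scores several times higher.

    # Two hypothetical versions of the same task, decision points counted by hand.

    def read_config_naive(path):
        # No error handling; no explicit branches, so a complexity of about 1
        # (or 2 if you count the inline generator loop).
        with open(path) as f:
            return dict(line.rstrip().split("=", 1) for line in f)

    def read_config_defensive(path):
        # One loop, two ifs, and two except handlers: five decision points,
        # so a complexity of about 6 for the same happy-path behaviour.
        config = {}
        try:
            with open(path) as f:
                for line in f:
                    if "=" not in line:
                        continue
                    key, value = line.split("=", 1)
                    if key.strip():
                        config[key.strip()] = value.strip()
        except FileNotFoundError:
            return {}
        except OSError:
            return {}
        return config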

So, when people start talking about the cyclomatic complexity, it seems to me that we are better off understanding we are deep into a philosophical discussion rather than a scientific one.


Cyclomatic complexity may just be another way of saying "lines of code":

(http://www.scirp.org/journal/PaperDownload.aspx?paperID=779)

We found that due mostly to issues regarding population variance, that the linearity of the relationship between these two measurements has been severely underestimated. Using modern statistical tools we develop linear models that can account for the majority of CC by LOC alone. We conclude that CC has no explanatory power of its own and that LOC and CC measure the same property. We also conclude that if CC does have any validity as a measure of either complexity or test space size, then we must conclude these factors grow linearly with size regardless of software language, paradigm, or methodology.

I was looking for another reference, and I found this one, which seems far better researched.


You can estimate complexity by the time it takes to add or fix functionality in that part of the code. That's pretty much why you're measuring complexity in the first place: so you know whether you're hemorrhaging or not.


The article author, Robert Glass, took these from his book, "Facts and Fallacies of Software Engineering" (ISBN 978-0321117427), which is definitely worth picking up, or in the very least worth skimming the table of contents¹ of.

¹ For example, here: http://blog.codinghorror.com/revisiting-the-facts-and-fallac...


> One of the two most common causes of runaway projects is unstable requirements. Requirements errors are the most expensive to fix during production. Missing requirements are the hardest requirements errors to correct.

Hmm, doesn't this conflict with the idea of "rapid prototyping", where new requirements are thrown in whenever necessary?


In my experience, "rapid prototyping" is a euphemism for "hacking some shit together". Once a product is perceived to be complete, management will insist on adding to it incrementally instead of discarding any part of it.

They will also avoid allocating time/resources to fix modules that are known to carry significant technical debt. Only when said technical debt begins to cause severe problems will any official attempt be made to address the issue.

In the meantime, savvy senior engineers will fight a guerrilla war to keep technical debt at bay. They will require incremental improvements to be carried out in parallel with bug fixes and new feature development, typically whenever you touch a file/function known to be in bad shape. This is more or less OK, but it can result in an irrational fear within the team of changing anything for a non-essential reason, eventually defeating its own purpose.

In the end, I always assume that whatever code I get to commit into the source control system, no matter how crappy, will eventually find its way to our customers' machines. No matter what claims of temporariness people make, if it gets out of your localhost, it will be promoted to officialdom in no time.


At one point I had been so burned by this that any time I was asked to do a POC for something, my POC would output all of its data to console and nowhere else. I proved I could sort things in some weird way or read an impossibly large file, but there was no way in hell they could ship my prototype.

Then we had a calm discussion about how long it would take to productize it (usually weeks or a couple of months). It might not seem like that big of a coup but attempting to ship 2x as much as you actually can is a sure way to be absolutely miserable by version 2.1 or 2.2 of the product.


I do mostly web development, so this would be tricky, but I'd love to see a Balsamiq-like Bootstrap theme, so that you can present a "working wireframe" with the same connotations.


In cases where you perform A/B testing, the POC may require a slick interface whilst still being a POC. In that case I have looked for ways to reduce complexity, and ideally any services that require hosting should be clearly hacked together, such that anyone seeing them can tell this will not go into production. Choosing to ignore thread-safety and use local, non-persistent in-memory datastores helps not only to ship the POC quickly, but also to make its temporary nature rapidly apparent.
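
For what it's worth, a sketch of the kind of deliberately non-persistent store described above (the class and method names are illustrative, not from the comment):

    # Deliberately throwaway in-memory "datastore" for a proof of concept.
    # Not thread-safe and not persistent: everything vanishes when the
    # process exits, which keeps the POC's temporary nature obvious.

    class InMemoryStore:
        def __init__(self):
            self._items = {}
            self._next_id = 1

        def create(self, record: dict) -> int:
            record_id = self._next_id
            self._next_id += 1
            self._items[record_id] = dict(record)
            return record_id

        def get(self, record_id: int) -> dict:
            return self._items[record_id]

        def list_all(self) -> list:
            return list(self._items.values())

    store = InMemoryStore()
    user_id = store.create({"name": "Ada", "variant": "B"})
    print(store.get(user_id))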


Cool, I like your approach. In many environments, a CLI is very clearly (though probably incorrectly) perceived to be "not done yet".

Maybe it is possible to marshal user expectations by carefully controlling how much of the GUI is produced. Don't make any pages/windows until the underlying business logic is production-ready... it's a wild guess, but it might work.


Your proposed approach is very much in line with what Joel Spolsky prescribed in his iceberg essay:

http://www.joelonsoftware.com/articles/fog0000000356.html


Consider prototypes a way of quickly getting the user to confirm: no, that's not right :-)

Only half a joke, too: it's easier to critique and improve something concrete, even if it's objectively rubbish, than to go from 0 to 100% through abstract discussion alone.


The only problem is that quick prototypes and temporary solutions have a habit of becoming permanent, especially if you are understaffed.


Rapid prototyping isn't a silver bullet (there is no silver bullet). Some approaches can help you better stay on top of unstable requirements than others, but if your requirements are missing or unstable, your project is unstable and poorly defined, period.


The way I've understood it: Prototyping may be an effort to develop Requirements. It should not be used to develop a Product.


> It should not be used to develop a Product.

I disagree with that conclusion. Prototypes help in both clarifying requirements as well as eliciting requirements. However, for the latter, a prototype may be necessary but not sufficient.


How can I explain this to my boss? :)


Make it look like a prototype? I've taken to plastering big warnings on non-production systems. Otherwise someone looks at a fragile dev environment, doesn't see it breaking outright, then goes and sells live customers into it and uses the prototype/demo system to set up real accounts. No amount of verbal or email warnings will change this. Hurray, now the one-off demo system is a production environment! As a bonus, you get to write data migration programs to move these customers to the entirely unrelated final product!

But if instead there are big red Xs and so on in the UI, then it becomes clear: "oh, this is a prototype/demo".
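
One hypothetical way to do that is to gate a loud banner on an environment flag, so every page of the demo system announces itself (the variable name and markup here are made up, not from the comment):

    # Hypothetical sketch: prepend a loud banner to every rendered page
    # unless the APP_ENV environment variable says "production".
    import os

    BANNER = (
        '<div style="background:#c00;color:#fff;font-size:2em;padding:1em;">'
        "DEMO / PROTOTYPE SYSTEM - DATA MAY BE DELETED WITHOUT NOTICE"
        "</div>"
    )

    def with_environment_banner(page_html: str) -> str:
        if os.environ.get("APP_ENV", "dev") != "production":
            return BANNER + page_html
        return page_html

    print(with_environment_banner("<h1>Accounts</h1>"))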


Instead of explaining it, steps can be taken to force a real transition between prototype and product, such as prototyping in a language or for a stack that your company does not use in production.


Alternatively, if the company still forces you to push the prototype to production, this can be a great way of sneaking in the technology stack that you wanted to use in the first place.


In addition to what brianmcc said: Given that the requirements are going to change, what are you going to do about it? The ability to change rapidly gives you a chance (not certainty) to respond successfully to a requirements change.


What certainly helps:

* Having small, self-contained, loosely-coupled modules.

* Having an extensive test suite for each module.

* Having a small (or zero) amount of technical debt.

* Talking to your customers constantly, preferably before you commit a lot of resources to a new development.

Easier said than done, but definitely not impossible.


I'd add obviousness here. Make the design and code as obvious as possible. Simple and easy to understand.


Correct. I guess I was thinking of agile rather than rapid prototyping.


Does this book contain references for the facts he cites? I would like to use several of his arguments in my own work but it's hard without proper references.


Yes, each of the facts listed in the book has a Sources paragraph and bibliographical references supporting the fact.


From the Coding Horror article, number 30 is:

"COBOL is a very bad language, but all the others are so much worse."


I always say "people don't have requirements, they have problems they want solved". IMO too many people in software get hung up thinking "requirements" means some kind of detailed design document. You're gonna wait a long time for someone with a problem to tell you how to solve it...


Yes! But often they have a hard time expressing it as a problem; instead they think of a way to solve it and present that to you. This is bad both because you as the developer don't know what they really want to do, and because, being inexperienced at software design, their proposed solution is usually not very good.

So often I find the first step is to back up from their proposed solution by gently probing their reasoning, and only then, once the problem is better defined, going forward again to sketch out a solution.


This exactly.

I always get people wanting another "email when this happens". When I dig a bit more into the details, a confirmation page on save turns out to be a far better solution.

I find the worst are semi-technical managers who have been promoted too early, as they come up with junior-level programmer-type solutions and think they are being helpful by working that out for you.


The problem is, they want fixed prices, fixed deadlines, and no budget for testing or redoing anything.


So, they just want low-quality software? The market happily addresses this desire :-\


This is a prime example of where the writer is stuck in a pre-Agile world. (This was 15 years ago, and most of his hands-on work was well before then.) If one follows and believes in the Waterfall method, tightening requirements is the most important thing for ensuring that the original goals are met. Modern software engineering has realized that changing requirements are a reality, and forcing sign-offs doesn't help as much as a flexible process.


This was the agile party line, but nothing in agile development helps with the core problem of changing requirements. If you design/code a solution to one set of problems using an underlying set of assumptions, and someone shows up and changes those assumptions, it doesn't matter whether you're using an agile process or a heavyweight waterfall process. The only difference is going to be in the amount of time it takes to iterate the design and tear the existing product apart and rebuild it with the new set of assumptions.

The real saving with "agile" methodologies is the understanding that the code is the documentation. This doesn't free you from having requirements or design documents; it just allows you to spend less time on that part of the process. For any sufficiently complex project, not having block diagrams of how everything fits together and basic documentation of subsystem interfaces just means you waste a ton of time reading the detailed implementation before you can understand how the system works.

In other words, it's the same problem you have with heavyweight processes. If you have to read 500 pages of design documents to understand how to integrate your routine, that is the same as having to read 50,000 lines of code to understand how to integrate a piece of code.


I think we're going in the same direction. The investment is less before you have to change. If you spend 6 months gathering requirements that are obsolete (or wrong for unanticipated reasons) before they're finished, you've lost 6 months. Working in an iterative process ("Is this what you want?", "No, how about this?", "Or this?") reflects the reality that requirements change, sometimes for external reasons and sometimes because people don't know what they want.

Agile also shouldn't be an excuse not to document. My (perhaps not fully informed) view is that it's more about iteration.


Agile/scrum makes a valiant attempt with "user stories", but these frequently end up being precise specifications for the wing feathers of the actually desired magical flying unicorn pony.


So I have a database, and 4 years on from the initial design a relationship between two central tables goes from one-to-many to many-to-many (I asked at the time and was promised it wouldn't happen).

Neither Agile nor Waterfall is going to save me from a pain-in-the-arse change that affects a lot of the application.
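
To illustrate why such a change ripples so far, here is a hypothetical sketch (the entity names are made up, not from the comment): every piece of code that assumed a single foreign key has to move to a junction table.

    from dataclasses import dataclass

    # Before: one-to-many, so every query can assume a single project_id column.
    @dataclass
    class SampleV1:
        sample_id: int
        project_id: int

    # After: many-to-many, so membership lives in junction rows and every
    # piece of code that read sample.project_id has to change.
    @dataclass
    class ProjectSampleLink:
        project_id: int
        sample_id: int

    @dataclass
    class SampleV2:
        sample_id: int  # project membership is now looked up via ProjectSampleLink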


4 years on, neither will save you. That's an issue of trading off short-term schedules against long-term flexibility. (Not an absolute right or wrong.)

Agile could have helped uncover the need earlier, or enabled the project to react to it mid-project. Post-production changes don't seem better or worse served by either.


"Agile could have helped uncover the need earlier, or enabled the project to react to it mid-project."

The project is an in house database tracking the samples and pretty much everything else for a DNA sequencing centre. The (sequencing) tech changes fairly rapidly and database migrations are frequent. It is constantly mid-project.

If I asked three and a half years ago and was promised, "no, it won't ever happen", how would "agile" have helped uncover it earlier? Would a load of unit tests written beforehand have made a difference? A stand-up meeting every morning? A scrum master? Sorry, that's just bullshit.

It came up when it came up, a couple of weeks back. For the time being the workflow in the organisation will have to work around it. It's not a common case, and it's really not worth the effort at the moment.

For the record, I don't do TDD or Scrum-type Agile. I do follow a lot of the principles in the agile manifesto; I have to. I do what works for me and the organisation, and blindly following a methodology wouldn't.


"Modern software engineering has realized that changing requirements are a reality, and forcing sign-offs doesn't help as much as a flexible process."

a "flexible process" sounds great. However, the reality is that when you allow flexibility, the project rarely ever gets completed on time or within budget..which also seems to be the main requirements for a business.

I've seen it way too much.


requirements creep is a nightmare when people get into the habit of it.


couldn't agree more.


A lot of this mirrors what I've seen in the real world.

"...except for the additional maintenance task of “understanding the existing product.” This task is the dominant maintenance activity, consuming roughly 30 percent of maintenance time."

I've definitely lost a large chunk of my programming life trying to understand unclear code in large systems!

Edit: removed ambiguous quantification


Always code as if the guy who ends up maintaining your code will be a violent psychopath who knows where you live. — John F Woods



Someone seems to have told my colleagues that the violent psychopath only shows up during code reviews - so we never do them.

)-;


I code as if I'm going to be the poor soul figuring out this code in a year's time. Which is normally the case, so it works pretty well.


Worth reading. I especially love the section about estimates. I have seen all those problems at my clients in real projects. This is what got me interested in #NoEstimates in the first place (http://devteams.at/how_i_got_interested_in_noestimates)...

All the other sections of the essay are interesting too, and most are still very relevant - 14 years later!


I agree, totally worth reading, clear, concise, and probably very accurate. I liked the section about efficiency best. "Efficiency is more often a matter of design than of good coding".

Thanks for posting this.


I like this one - and remember it every time I try to pay up for a good programmer. One also has to wonder how much of it is that poor programmers are overpaid.

P2. Good programmers are up to 30 times better than mediocre programmers, according to “individual differences” research. Given that their pay is never commensurate, they are the biggest bargains in the software field.

Much of the list stands the test of time, though it's very clear that his thinking predates Agile. Most old school software engineering experts talk of the need to tighten up unstable design. Modern thinking is to create processes that better react to the instability.


Can you judge how good a programmer is from an hour or two's worth of interview?


It's hard to differentiate Just Good Enough from Almost Good Enough in a 10 week internship. :-)

The 30x ones usually have a professional reputation that precedes them - you know before the interview starts, and you spend the hour or two selling them on the role. They don't send resumes out. You either have to build people into them, or go hunting once you hear that their companies are struggling. (Or if you hear they are being mistreated.)


> "REU3. Disagreement exists about why reuse-in-the-large is unsolved, although most agree that it is a management, not technology, problem (will, not skill). (Others say that finding sufficiently common subproblems across programming tasks is difficult. This would make reuse-in-the-large a problem inherent in the nature of software and the problems it solves, and thus relatively unsolvable)."

Of all the topics he talks about, I think this is the one that has changed the most in 14 years. Open source libraries, GitHub, etc., have made reuse-in-the-large so much easier, and it's so much more common now.


I interpret it differently. Layering and libraries were solved ages ago (although we have many, many more layers now). This isn't what people were striving for when they talked about reuse in-the-large. It was more about domain models and objects like having a common Customer entity or reusing an insurance domain object model across different customers.

Horizontal re-use has always been possible but vertical reuse is a pipe dream.


Why is it that these lessons remain forgotten? I find them to be accurate. I can't help but to think that we are all doomed to repeat the same mistakes over and over again.


Because with our new Silver-Bullet Development Methodology(tm), it's a completely new world where none of your decades of experience apply! This completely changes the nature of software development forever!

p.s.: we sell certifications!

(repeat for a new Silver Bullet roughly each decade)


Individuals don't forget, organizations do, i.e. when they lose the staff who knew it and replace them with staff who don't.


Because management doesn't want them to be true.


Those aren't forgotten facts so much as they're often "not-learned-yet" facts. Many points are of varying validity, but nearly all come only from experience.

One item on the list I believe would change: REU2, reuse-in-the-large. With so many new services available, the number of addressable "common" use cases has become very granular. So, reuse-in-the-large takes on a new definition for me.


Agreed. It seems to me "frameworks" usually qualify as "in the large".


What a stellar list. The reuse part in particular needs to be read by many people. As he points out, reuse in-the-large will never happen for very good people reasons, not technology reasons.

> REU5. Pattern reuse is one solution to the problems inherent in code reuse.

Patterns are just Reuse in-the-small anyways.


I like the title. I think it should be an abbreviation of its own, like "FAQ": FFFF, fee-four. Like "Let's write a FFFF on memory management" or "Where can I read a FFFF on web app security?"


Q1. Quality is a collection of attributes. Various people define those attributes differently, but a commonly accepted collection is portability, reliability, efficiency, human engineering, testability, understandability, and modifiability.

Q2. Quality is not the same as satisfying users, meeting requirements, or meeting cost and schedule targets. However, all these things have an interesting relationship: User satisfaction = quality product + meets requirements + delivered when needed + appropriate cost.

As someone who specializes in quality, I'd say either he was wrong, or the practical definition has shifted.

What he calls "user satisfaction" is what I would equate with "product quality," at least in terms of what we mean when we aim for a particular quality bar before releasing. The various factors he lists as comprising quality are aspects of satisfaction that may or may not apply to a given audience of users, but the absence of one or more does not necessarily indicate low quality unless it's relevant to the user.

Compare Twitter when it started with Twitter now, for example. When it started, it was a relatively unreliable product, but still high quality for its set of users--at the very least, it was high enough quality that spending time and money to raise it may not have been a good idea. It was successful as it stood. Now the set of users and their expectations have shifted, both because the field's bar has risen in general and because it has enterprise use, so reliability is a much bigger deal. It's still high quality, but for different reasons.

Some of those things (portability, testability, modifiability) are simply conflating code quality with product quality, unless what you're developing is a code component. Even efficiency is meaningless to quality unless inefficiency causes the -customer- to bear more load. Maybe he meant to conflate those things, but I don't think they belong together. They're rather orthogonal: vim is a high product-quality app with (supposedly, haven't read it) pretty awful code quality. Conversely, I've seen plenty of pretty codebases that produced crappy apps.

And meeting requirements, delivering at an appropriate time (which is another form of meeting a requirement), and at an appropriate cost (yet another form of meeting a requirement) - these are all absolutely important aspects of quality. The scale of quality a free-as-in-beer product is measured on will always be different from the one a paid product is measured on.

Basically, he's taking a very absolute approach to quality, rather than considering context. There's really no such thing as a "high quality product" in the absolute. It's all relative to the intended audience and use.

I will agree, though, that quality goes way beyond sheer absence of objective software defects. It's a shame that most of the industry tries to define it that way.



