This is quite a treasure trove of data for Word 1.0:
- 55 man years
- 38 man years by full time employees (FTEs)
- 7000 LOC/yr code written per FTE
- product + tools = 347 KLOC
- 12,511 bugs, 9377 deemed fixable
- 1197 bugs postponed (13% of the fixable ones)
- 247 bugs/yr fixed per FTE
- 38 bugs/KLOC
- Total 5 yrs (1st yr just 1 person, last yr had 13 max, other yrs had 10 avg)
It is quite amazing that a young company allowed such a long-running project and kept at it despite many schedule misses over 5 years. This is now almost never observed, even in resource-rich mega corps.
Also, it seems developer productivity and bug count per KLOC have more or less remained the same all these years. I haven't spent much time researching this, but if it's true then I wonder what we gained from improvements in languages and other development infrastructure? Perhaps the ability to scale to teams of hundreds of developers per project, totalling millions of LOC? Interestingly, the Word 1.0 team never went above 13 people despite so many delays - maybe because it was hard to scale using the infrastructure available at the time (for example, no OOP)? Regardless, the fact that average developer productivity in terms of KLOC and bugs/KLOC has remained fairly constant is very interesting.
I know I'm showing the grey in my beard to say this, but I LOVED Word for DOS. It was fast and incredibly capable, and IIRC it was the only word processor available at the time (late 80s) that had a solid & usable stylesheet implementation.
The thing is, though, most people didn't use it. Then as now, people just manually formatted titles and whatnot, without ever realizing that they could define "Header 1" to be whatever they wanted and use it throughout, etc.
The folks who loved it, though, were people working in larger documents. At the time, I had a student job at my university, and more than once I helped grad students with dissertations and theses manage their stylesheets, and to watch the light bulb come on was pretty neat.
Word 5 for DOS was so solid that I used it as my primary word processor for YEARS after it was technically out of date. WFW was a DOG initially -- incredibly slow, ridiculously unstable, etc. Even the final DOS version (5.5) was weak; they reskinned it to use a text-mode implementation of the CUA-style menu structure, replacing the idiosyncratic menus of the original DOS version and thereby destroying its speed.
(Kids: Back in the day, every program had its own menus and commands; there was no real uniformity of interface at all. It kinda sucked, but in this case it was a real strength of the product.)
I loved the WordPerfect "RevealCodes" mode - made style management really easy.
Word hid everything in paragraph markers (or section markers). Suddenly things become opaque and you get into the mode of "create a new doc and copy and paste everything into it except for the final paragraph marker" - this is the equivalent of Ctrl+Alt+Del for a Word doc, and I use it to this day... Sigh...
To some degree WordPerfect also failed due to a classic "Innovator's Dilemma" situation. They had a huge customer base of MS-DOS users and tried to keep them happy by maintaining a similar user experience in the Windows edition. And they were fairly successful in that goal, but the product ended up looking weird and counterintuitive to new users who were familiar with the standard Windows look and feel.
The lesson is that sometimes you need to have the courage to disregard what existing customers want.
> The lesson is that sometimes you need to have the courage to disregard what existing customers want.
The problem is that this can kill you, too. If you abandon your current customers, chances are that they’ll choose a new solution because you’ve taken away their “easy path” upgrade and also pissed them off. You also send a message to your potential new customers that you won’t care about them if it becomes inconvenient.
This is the real innovator's dilemma: there is no path to guaranteed (or even likely) victory. You can take a chance on the future, but it is likely to kill you faster than clinging to the past.
Sure there are no guarantees. Even leaders who understand the innovator's dilemma eventually miss a disruptive shift in the market and lose their dominant position. That's why I'm not too concerned about the current dominance of large tech companies like Apple, Google, and Facebook. In 50 years MBA students will probably be reading case studies about how they screwed up and were acquired for pennies on the dollar by competitors who don't even exist today.
Windows 3.0's hardware requirements were an 8088 with 384K-2MB of memory.
That Word 1.0 could run on that OS with such limited resources was a very impressive achievement. I assume a non-trivial amount of time was involved in designing/debugging all the GUI features around such a limited amount of memory.
Today it would be very hard to find anything non-trivial that can run in 2MB of memory - not to mention a working word processor plus OS/GUI.
Everything is different today, yet nothing is. A quote from the report:
The methods of scheduling used were fatally flawed. A schedule should be considered a tool used to predict a ship date, it should not be considered a contract by development. Because there was so much pressure to meet the schedule, development got into a mode which Chris Mason refers to as "infinite defects".
Developers get credit every time they can check a feature off, so they are more inclined to mark off their current feature and go on even though it really is not done. There was a prevailing attitude of the "testers will find it" when thinking about potential bugs in code being developed. In many cases they did find it, and that is what caused our stabilization phase to grow from the expected 3 months (which is a pretty random number anyway), to 13 months.
Because every task was cut to the bare minimum, performance work that should have been done was neglected until the very end of the project, reducing what we could do in a reasonable amount of time.
A member of that team told me that for much of the project the function to calculate the height of a line of text, which is a very complicated operation involving fonts, sub- and super-scripts, etc., was implemented as "return 12;" (that was the entire body of the function) and marked as "DONE". With a bug, of course, but we can fix the bugs later...
I thought this was a super interesting read. Despite the year, so many of the problems are the exact same. This project was massively late. Development went on for five years and they always thought they were a year or less away from shipping. My favorite section is on Schedule Analysis. Particularly this quote:
"A schedule should be considered a tool to predict a ship date, it should not be considered a contract by development."
"The idea that a schedule is God leads to infinite defects, as explained above. Also, the belief that a schedule must be ambitious so that the development team will work hard is severely misguided." (bottom of page 12)
Those two sentences, written almost 30 years ago now, pretty much sum up my perspective on how software projects are scheduled. I think plenty of companies have yet to learn this lesson. In fact, so much of this document reflects problems that, to my knowledge, many organizations still struggle with today.
I'm also just amazed at how deep this goes and how personal it is. Its creation must have upset a lot of people.
> The idea that a schedule is God leads to infinite defects
A great example of this is recounted by Joel Spolsky [1] who was a project manager for the early Excel team (but presumably the problems in the Word team became legend):
[In] the very first version of Microsoft Word for Windows ... the project managers had been so insistent on keeping to the “schedule” that programmers simply rushed through the coding process, writing extremely bad code, because the bug fixing phase was not a part of the formal schedule. There was no attempt to keep the bug-count down. Quite the opposite. The story goes that one programmer, who had to write the code to calculate the height of a line of text, simply wrote “return 12;” and waited for the bug report to come in about how his function is not always correct. The schedule was merely a checklist of features waiting to be turned into bugs. In the post-mortem, this was referred to as “infinite defects methodology”.
Immediately brings to mind: The first book I read on TDD recommended writing "return 12" and waiting for a test that fails. So I guess PR/marketing could say Microsoft pioneered an early form of TDD in WinWord, combining it with the power of pair-collaboration between a developer and a tester. Just one more in a long line of MS innovations.
Your book seems to be selling a peculiar flavor of TDD. Typical TDD says you write the failing test first. You don’t wait for it to appear after writing your broken code.
It's not that it is the first thing written--it's that it recommends writing broken code like "return 12" until a test knocks it down. I think that book (it's been 15 years!) went through a currency-exchange example, but you know how they go: write your first test "The sqrt of 4 is 2".. make it pass "return 2" etc. In all cases, the person writing the code knows it won't stand, but they do it anyway. That was the parallel which prompted my comment.
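For anyone who hasn't seen that "fake it till you make it" rhythm, here's a minimal Python sketch of the cycle being described, using the sqrt example above (`my_sqrt` is a hypothetical name, not from any particular book):

```python
import math

# Step 1 (red): write the test first; it fails because my_sqrt doesn't exist yet.
#     assert my_sqrt(4) == 2
#
# Step 2 (green, "fake it"): a deliberately wrong stub -- the moral
# equivalent of "return 12;" -- passes that single test:
#     def my_sqrt(x):
#         return 2
#
# Step 3: a second test ("the sqrt of 9 is 3") knocks the fake down and
# forces the general implementation:
def my_sqrt(x):
    return math.isqrt(x)

assert my_sqrt(4) == 2
assert my_sqrt(9) == 3
```

The difference from the WinWord story, of course, is that here the "tester" is an automated suite run before check-in, not a colleague filing a bug report months later.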
Indeed. About the only difference in essence between this and well written contemporary post mortems is that individuals are named. In other words, it's not a "blameless" post mortem; on the contrary, the knives come out and people are savaged by name.
Which actually takes away from it in several places, IMHO.
If you read the savagery about the standard dialog stuff,
1. The author inserts pointless opinion about the likely success or failure of that project
2. The author may have been very wrong? I am not a Microsoft archeologist, but it looks like the goal of having significant sets of standard dialogs as a shared library was in fact, wildly successful?
In a lot of ways, it was a bright spot in programs that fucked the UI up for everything else - the standard dialogs were so standard and good that when java and other things had their own file/etc dialogs, everyone noticed and complained.
SDM dialogs live on even today in the bowels of Office apps, although they have since been supplanted by at least two "hot new things". Last I heard there was talk of moving to React.
Never does a new technology fully replace the old.
I often find myself reading 'old' (15+ years) books on planning, government and NGO working-group papers on major capital projects / project management (PM), and journal articles on various PM topics, and sometimes it feels as if very little material progress has been made in the discipline since the '70s. If anything, our propensity for relying on technology to 'take care of' particular parts in some cases takes us backwards.
As someone who schedules and oversees complex programmes and their governance for (most of) a living, a few observations that are common across all domains:
- We are still very bad at understanding the total scope of work (whether it has been done before or not)
- We are still very bad at understanding the work that must be done to deliver that scope (whether it has been done before or not)
- We are still very bad at understanding how long it takes to do the work, especially if the work is more cerebral (whether it has been done before or not)
- We still think we can predict the future with certainty with enough information
- Things are becoming more complex and (perhaps) more uncertain
- We are still bad at making decisions and usually avoid them, or make them without giving proper thought to their consequences
A schedule's value is not just in *predicting* an end date. A schedule primarily does three things: Past / Status / Forecast
- It records work/effort in the past
- It gives you the status against plan
- It forecasts the likely times when future work will take place and the amount of work left to complete the project
This in turn serves multiple project stakeholders in different ways. Leaders want to be able to budget, because the cost of money can be more or less expensive at different times. The cost of resources can be more or less expensive at different times. Workers want to know what they should be doing and how long they have to do it. Managers need to balance limited resources and (re)direct effort.
Even with risk- and resource-loaded probabilistic scheduling with tens of thousands of Monte Carlo iterations, your actual path is still only one of those possible realities. But as in all models, there are sensitivities, and if the model is properly constructed, understanding the sensitivities in the precedence graph is sometimes as important as calculating the critical path.
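For readers who haven't seen this technique, here is a toy sketch of Monte Carlo scheduling in Python. All task names and estimates are invented for illustration; real tools add resource constraints, correlations, and risk events on top of this:

```python
import random

# Each task: (optimistic, most likely, pessimistic) duration in days,
# plus a list of predecessor tasks. Dict order here happens to be a
# valid topological order of the precedence graph.
tasks = {
    "design": ((5, 10, 25), []),
    "code":   ((20, 40, 90), ["design"]),
    "test":   ((10, 20, 60), ["code"]),
    "docs":   ((5, 8, 15), ["design"]),  # parallel branch off "design"
}

def simulate(tasks):
    """Sample one possible project: return the finish day of the last task."""
    finish = {}
    for name, ((lo, mode, hi), preds) in tasks.items():
        start = max((finish[p] for p in preds), default=0.0)
        finish[name] = start + random.triangular(lo, hi, mode)
    return max(finish.values())

# Thousands of sampled futures give a distribution of ship dates,
# not a single "the" ship date.
runs = sorted(simulate(tasks) for _ in range(10_000))
p50 = runs[len(runs) // 2]
p90 = runs[int(len(runs) * 0.9)]
print(f"median finish: {p50:.0f} days; 90th percentile: {p90:.0f} days")
```

The gap between the median and the 90th percentile is exactly the uncertainty that a single-date schedule hides, and varying one task's estimates while holding the rest fixed is a crude way to probe the sensitivities mentioned above.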
> a schedule must be ambitious so that the development team will work hard is severely misguided
Several times in my career I have seen a CEO shamelessly say "well done, team, another record-breaking year - but we didn't meet our 'stretch goals', so there will be no bonuses (for you)." Managers occupy this weird world in which their staff are clever enough to work on advanced technology but too stupid to see through such transparent deception.
This is why “build one to throw away” is an anti-pattern. No one ever throws it away, so building it like it’s a prototype means prototype code eventually makes it to production.
The little sticker on the first page of the memo indicates that we're able to read this because it's part of the Comes V Microsoft antitrust litigation. See http://iowa.gotthefacts.org/
Perhaps a good time to remember not to put things in writing at work (this includes Slack) that you wouldn't want seen in exhibits for the prosecution.
I browsed around randomly (the exhibits are untitled) and found http://iowa.gotthefacts.org/011607/0000/PX00697.pdf : "we have sorted criticisms of Microsoft into three categories: lying, cheating, and arrogance" (to Ballmer)
I find the snippet on CRLF funny. Even Microsoft didn't like them!
> In Opus paragraph marks are represented by two characters (carriage return-line feed) except for section marks which are only one character (chSect) and possibly other format files (Unix files with only line feeds and Mac files with only carriage returns). In Mac code all paragraphs end with a single character. Our model caused many problems and complicated code on top of the complications arising from being different from Mac Word.
Which, for those who don't know, inherited it down the chain from raw teletype machine Baudot code (e.g. see the videos by CuriousMarc on restoring early TTYs) - so it's 'technically correct'.
Back then the machine could either feed a line OR return the carriage to the start. Not both!
That distinction between carriage return and line feed is in turn a general feature of physical typewriter mechanisms, though even manual typewriters would give you an automatic line feed every time you returned the carriage: https://www.youtube.com/watch?v=r97JHr13T98 . Line feed could also be manually controlled by using the platen knobs on the sides of the platen (the big roller which the paper was fed onto).
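The three conventions the quote mentions (CRLF on DOS, bare CR on classic Mac, bare LF on Unix) still coexist today, and the usual defensive pattern is the same one Opus arguably should have used internally: normalize everything to a single-character line break on input. A minimal sketch in Python:

```python
def normalize_newlines(text: str) -> str:
    """Collapse CRLF (DOS), bare CR (classic Mac), and LF (Unix) to LF.

    Order matters: CRLF must be replaced first, or its CR would be
    rewritten separately and each CRLF would become two line breaks.
    """
    return text.replace("\r\n", "\n").replace("\r", "\n")

# The same two-line document in all three historical conventions:
dos, mac, unix = "para 1\r\npara 2", "para 1\rpara 2", "para 1\npara 2"
assert normalize_newlines(dos) == normalize_newlines(mac) == normalize_newlines(unix)
```

One internal representation, with conversion only at the file boundary, avoids exactly the "complicated code" the postmortem complains about.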
This is a great piece of software development history! Very interesting to see what things the industry seems to have improved on (somewhat) and what things we still struggle with.
Things we have improved:
- Developers are generally expected to be responsible for testing (and now operating) code.
- Code reviews and ownership are common practice.
- Distinctions between prototype/development/bugfixing stages are no longer strictly enforced.
Things that we still struggle with:
- Overspecialization/knowledge "siloing", and low bus factors generally.
- Navigating the tradeoffs between re-using existing solutions vs. building your own.
Very interesting, thanks! Does anyone know what exactly the "programmer assistants" did? It sounds like a job that was left by the side of the road of technological progress.
Thanks for spotting that. It's the first time I've come across the title. They seem to have been a combination of tester and sysadmin. Take a look at the table on pages 2 and 3 where you'll see 'PA' under the 'Position' heading.
I've just started on the doc - where did you find the expansion of 'PA' into 'programmer assistant'?
Only one of the 19 SDEs was female, whereas three of the nine PAs were. That's in line with the gender ratios I've seen in testing and development groups during my career. Testing is a feminised role, in line with its perceived low status.
(I'm a (white cis male) tester, and I've always seen myself as a programmer assistant, as well as taking on extra tasks such as those falling under the 'PA' role here.)
In Fred Moody's "I Sing The Body Electronic"[0], there is a passage where the author gets to read such postmortem documents and is surprised by the brutal honesty, but does not give any quotes. Now I see what he meant.
Thanks for the recommendation. I have to add G. Pascal Zachary's "Showstopper! The Breakneck Race to Create Windows NT and the Next Generation at Microsoft" as another excellent look inside Microsoft's software development practices during the mid 1990s.
Hardly wild - I would say that as software has got easier with lower barrier to entry, development gets less and less "serious". I mean, how serious was the Apollo guidance computer project?
For example, Jeff Harbers, who gets savaged on page 4, seems to have had some very perverse incentives imposed on him. What was it about the organisation that caused that? Considering that Microsoft was earning orders of magnitude more from each product than it was costing to build it, why did he treat the schedule as "a contract between the development team and himself"? So far, the answer to that is not in the document.
I kind of like how personalized the report is. A lot of modern methodologies seem to want to treat devs as interchangeable resources instead of real persons that may do good and bad things and learn from them. To me that makes work feel very dehumanizing.
Which isn't realistic in some cases. For example, if the postmortem determines that project management was lacking, that traces back to somebody, even if he or she isn't named.
But we have to look deeper. Project management doesn't exist in isolation. Any software development organization will typically have a project management office (PMO) with documented methodology, policies, procedures, templates, best practices, etc. If a particular project had poor project management then why did that happen? Did the PMO fail to provide the right resources? Is the hiring process for project managers faulty? Are we giving project managers the necessary training before assigning them to real projects?
Occasionally individuals really are to blame. But organizations get better results when everyone assumes positive intent by individual employees and focuses on finding process problems.
The Microsoft culture was one of fighting for your own resources. If you failed to receive them you were to blame.
Bill Gates formed a highly competitive internal environment. You were expected to stand up and fight for your cause - even against BillG himself. If you were right against Bill it was extremely good for your career. This was a time where the success of a BillG review was determined by how many F-bombs came out, and it was never zero.
To expound, if you focus too much on “Jeff did X wrong”, the take home message of the postmortem is “don’t be Jeff”. If you focus on “trying to solve X by Y failed because Z”, you have the more generally applicable message, “don’t do Y to solve X unless you can deal with Z”.
Also keep in mind that Microsoft today is a robust corporation with many jealously defended fiefdoms; carefully picking the people to throw under the bus to protect your fiefdom and attack your internal competitors’ fiefdoms is an important skill set in such an environment. You could be seeing that in this postmortem, and that is exactly what people are trying to stop by keeping postmortems “blameless”.
It also depends on what the consequences of that fuck up are. I would love to be told exactly about all the things I am doing wrong. But I also need to get the chance to learn from them and improve.
> I think you limit them if you don't cast blame at all.
Yes. Then we get face-saving but at best pointless cop-outs like "we just didn't communicate enough".
"At best" pointless, because this easily leads to "hey we need to communicate more", which leads to an incessant meetings and things like 10 Slack channels to monitor...
The point is not to avoid pointing out mistakes. The point is to (a) allow mistakes and (b) learn from them.
Reading this, it is interesting that Word for Windows 1.0 was usable at all. But I did use it quite a lot. It had problems, but generally, the "save often, each time with a different name" enabled one to get good results in editing documents.
Although that bit is written in the third person, Peter Jackson is the report author (according to the second page). That doesn't change your point but it does add a bit of flavour to some of the text.
Was Bryan's illness related to the stressful environment? If so, interesting that he later returned to be Project Lead, esp when he was "better, but not well"
One interesting thought: 7 KLOC per FTE-year works out to about 28 lines of code per work day! It does not seem like much until you realize each line of code requires integration, testing, bug fixing, and documentation, if you are lucky!
I think it would be good for more organizations to use data like this to set realistic schedules because it always takes longer than you would think you need. Luckily with tools like git this is much easier.
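The 28 LOC/day figure is easy to re-derive from the stats at the top of the thread (the 250 work days per year is my assumption, not a number from the source):

```python
# Back-of-envelope check of the ~28 LOC per work day figure.
loc_per_fte_per_year = 7000   # from the postmortem stats quoted above
work_days_per_year = 250      # assumption: 52 weeks * 5 days, minus holidays
loc_per_day = loc_per_fte_per_year / work_days_per_year
print(loc_per_day)  # → 28.0
```

That sort of sanity arithmetic is exactly what a git log makes easy today: you can measure your own team's historical throughput instead of guessing.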
"Peter continually experimented with strategies to improve morale ... But the amount of administrative overhead ... and the distractions of going from one to the next probably did more damage than good."
I know this feeling - watching someone try to treat the symptom (stress, burnout), without addressing the root cause.
But knowing that the root cause is a systematic issue (very hard to fix), what can a project lead do to help morale and the team?
I find it interesting that even though they knew that they were building a multi-platform product, they insisted on inheriting CRLF from DOS anyways, knowing full well that it created issues. That little decision he described as a failure painted a good picture for me.
I keep my CV in LaTeX, and did my undergraduate thesis that way; but I'm very aware that for day-to-day non-maths work this is not a simple workflow, and that most people would prefer a system in which they can see the effect of their changes immediately. (See the discussion in another comment about styles in Word and how few people use them.)
WYSIWYG and DTP was the liberatory technology of the 80s and early 90s - suddenly everyone could produce documents with little training.