New: Book Report: The Mythical Man-Month (a Study Guide)

If this book report seems a little heavy on the questions? It's because it's the first draft of a study guide? For people reading the book? Oh man it's way too long? But hey give me a break, it's a first draft?

The Mythical Man-Month

This is a study guide for folks reading the Mythical Man-Month. When reading a book like this, it's useful to have some questions buzzing around in the back of your brain. In case you don't already have enough of those, this study guide provides a few extra.

This book is about managing large software projects. Fred Brooks wrote it in 1975; back then, there wasn't a big literature about such things. There hadn't been many such projects. Many of them had been debacles; nobody wanted to brag about them later. Actually, Fred Brooks had presided over something of a debacle himself--his project was famously late. This book is a post-mortem: why things went wrong, lessons learned, how we can (or why we can't) avoid similar stumbles.

Pick up a recent edition of the book--the 20th anniversary edition includes chapters at the end pointing out which parts of the original have [not] withstood the decades. E.g., the Waterfall Method of project planning isn't SOP anymore. In the preface, he also points out something important about software project management reports in general:

In preparing my retrospective and update of The Mythical Man-Month, I was struck by how few of the propositions asserted in it have been critiqued, proven, or disproven by ongoing software engineering research and experience.

As you read this book (or anything), always be on your guard for snake oil, untested assertions, and handwavery. Some techniques will help your team, some will harm it. Learning to recognize which are which is an important part of leadership.

Preface to the Original edition

Brooks gives us the whole book in a nutshell in the Preface.
I wanted to explain the quite different management experiences encountered in System/360 hardware development and OS/360 software development.

...

Briefly, I believe that large programming projects suffer management problems different from small ones, due to division of labor. I believe the critical need to be the preservation of the conceptual integrity of the product itself.

Got that? He's telling us that Management is important. He's saying that Leadership is important. He's saying that Design is important. 120 people produce no more than 12 do if they're not working towards the same goal. And unless you can help them all see where that goal is, they will not all work towards it. He's not going to teach you new algorithms, tools, none of that. This is people skills. He's reminding you that this stuff matters.

The Tar Pit

Brooks' OS/360 debacle was largely a schedule slip. Here, he points out that junior programmers are optimistic about how long it takes to implement part of a large system. They don't think about the (considerable) time it takes to provide the correctness and polish their code will need to be part of a real-world product, or the (considerable) time to design and implement the interface between their system component and the rest of the system.

Try to remember your first projects working on larger software systems. It took longer to get things done than your quick-and-dirty hacks for school homework assignments, didn't it? Where did the time go?

As a rule of thumb, I estimate that a programming product costs at least three times as much as a debugged program with the same function... A programming system component costs at least three times as much as a stand-alone program of the same function.

If you're not sure whether a junior programmer has considered these things, their schedule guesstimate might be 9x optimistic. When you get an estimate from a co-worker, how can you find out whether they've allowed for correctness and API design?
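The 9x figure is just the two rule-of-thumb multipliers from the quote compounded. Here is a minimal sketch of that arithmetic. The function name, its flags, and the two-week example are made up for illustration, and Brooks says "at least" three times, so treat the result as a floor rather than a forecast.

```python
# Brooks' rule-of-thumb multipliers from the quote above (both are "at least").
PRODUCT_FACTOR = 3  # debugged program -> programming product (correctness, polish)
SYSTEM_FACTOR = 3   # stand-alone program -> system component (interfaces, integration)


def adjusted_estimate(naive_weeks, is_product=True, is_system_component=True):
    """Scale a naive estimate by whichever multipliers the estimator ignored."""
    factor = 1
    if is_product:
        factor *= PRODUCT_FACTOR
    if is_system_component:
        factor *= SYSTEM_FACTOR
    return naive_weeks * factor


# Hypothetical example: a "two-week" hack that must ship inside a larger system.
print(adjusted_estimate(2))  # 18
```

If the estimator already allowed for one of the two costs, switch off the matching flag and the multiplier drops from 9 to 3.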

In practice, actual (as opposed to formal) authority is acquired from the very momentum of accomplishment.

The Mythical Man-Month

Here, Brooks points out more things that go wrong with managing schedules. They're hard to estimate. They're hard to boost, too. This chapter gives us Brooks' Law: "Adding manpower to a late software project makes it later." When you add new people to a project, for the first few months, they slow you down as you teach them things and they bump into things. Only later, when they're up to speed, might they help your schedule. Unless you slow down trying to manage them.

Why would anyone ever think to throw people onto a project at the last minute? Maybe a clue is the context in which this book was written, IBM in 1975:

I wanted to explain the quite different management experiences encountered in System/360 hardware development and OS/360 software development.

Maybe back in 1975, most of IBM's experience was with hardware. If design went slowly, maybe there was a temptation to make up time with more people: run extra shifts at the manufacturing plant.

The need for intercommunication slows everyone down. Can we ease this by designing better interfaces? Can we design our software architectures so that not every engineer needs to pester every other engineer to learn every interface?
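Brooks notes in this chapter that if every part of a task must be coordinated with every other part, the effort grows as n(n-1)/2. The sketch below compares that "everybody talks to everybody" count with a hypothetical arrangement where people talk freely inside a small team and only one liaison per team talks to the other teams. The team structure, the function names, and the team size of 6 are assumptions for illustration, not something from the book.

```python
def pairwise_channels(n):
    """Number of distinct pairs among n people: n(n-1)/2."""
    return n * (n - 1) // 2


def channels_with_teams(n, team_size):
    """Hypothetical alternative: full communication inside each team,
    plus one liaison per team talking to every other team's liaison."""
    full_teams, leftover = divmod(n, team_size)
    sizes = [team_size] * full_teams + ([leftover] if leftover else [])
    within = sum(pairwise_channels(size) for size in sizes)
    between = pairwise_channels(len(sizes))
    return within + between


for people in (12, 120):
    print(people, pairwise_channels(people), channels_with_teams(people, team_size=6))
# 12 66 31
# 120 7140 490
```

Going from 12 people to 120 multiplies the all-pairs count by more than a hundred, which is one way to see why a narrow, well-specified interface between teams is worth the design effort.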

Gutless estimating: It's bad enough that the junior engineers on your team underestimate how long it will take them to accomplish something. Sometimes an executive does, too. They lean on you, and that's no fun. Something that happens often at some companies: someone pre-announces an unrealistic date to the world. Do you have any stories like this? If not, ask some veteran programmer to tell you some. Maybe buy that veteran a drink first--they tend to be sad stories. In hindsight, could the teams have dodged these disasters?

The Surgical Team

This chapter explores the idea of a small programming team. Some parts of this idea have aged better than others.

Aged well: team of people with complementary roles and diverse skill sets.
Aged less well: some of the suggested roles now seem absurd.

Brooks thinks that every senior programmer needs a secretary. And an editor (tech writer) and another secretary for the tech writer. Brooks was a writer in a time of typewriters, large presses, and other awkward tools. Nowadays instead of giving the senior programmer so many people to manage correspondence and documentation, we have Gmail and the wiki. The "program clerk" role largely went away when revision control systems came along—a few admins can "clerk" the "programs" of many, many engineers.

Suppose you wanted to plan a "surgical team" for your organization in the modern day. What roles would it have? What assumptions do you need to make about how this team would fit in to the organization at large?

Aristocracy, Democracy, and System Design

If one person designs a system, that design captures only one person's knowledge. For a large system with many different pieces, some pieces' designs will be clunky.

If many, many people design a system, that system never gets designed.

Brooks praises Reims Cathedral:

The joy that stirs the beholder comes as much from the integrity of the design as from any particular excellences. As the guidebook tells, this integrity was achieved by the self-abnegation of eight generations of builders, each of whom sacrificed some of his ideas so that the whole might be of pure design.

Invoking a cathedral as metaphor might set off alarm bells in your head. Didn't ESR use that same metaphor in his essay The Cathedral and the Bazaar? There ESR praised the "bazaar" over the "cathedral": harnessing hundreds of open source folks to work on a project instead of a small, limited group. But that essay points out that some tasks scale better than others. He wrote:

So does the leader/coordinator for a bazaar-style effort really have to have exceptional design talent, or can he get by through leveraging the design talent of others?

I think it is not critical that the coordinator be able to originate designs of exceptional brilliance, but it is absolutely critical that the coordinator be able to recognize good design ideas from others.

Brooks and ESR would agree that you can't just let loose a horde of 150 programmers and hope that a great design emerges. You need some architects with "taste".

How much of a system's design should the architects figure out (leaving details to the horde)? Brooks thinks it's the interfaces: the APIs and UIs:

By the architecture of the system, I mean the complete and detailed specification of the user interface. For a computer this is the programming manual. For a compiler it is the language manual. For a control program it is the manuals for the language or languages used to invoke its functions. For the entire system it is the union of the manuals the user must consult to do his entire job.

Suppose you're Fred Brooks and you have 15 great programmers and 150 good ones. How might you divide tasks between them? What are some ways you could set up bottom-up feedback without drowning in noise?

The Second-System Effect

Have you ever worked on a "second system"? You work on a successful system. It's working. You look for ways to improve it, to "get it right." You get around to elegantly adding those improvements that you couldn't "bolt on" before. And somehow... the result is a mess. Years later, you figure out that many of those "improvements" were gratuitous; some of them made the system worse. Has this happened to you? Or have you seen it happen from a distance?

How can we guard against this effect? Brooks says every design team needs a third-system veteran, someone who has experienced their second system. How might a system like this work at your organization? Anyone can design something; you can't force them to take advice from a veteran. What can you do?

Passing the Word

This chapter is mostly of interest to the historian, or people who like to hear stories of the old days when "we had to walk back and forth through the snow, twenty miles, uphill both ways".

The book assumes that most of a project's "architecture" happens up front, then stays frozen. Wow, it's the opposite of agile. This chapter explains the clumsiness: it talks about project communications back in the 1970s.

Back then, communicating a spec change was a major ordeal. Word processor software was a newfangled clunky thing. There was no revision control system software to keep track of changes. There was physical typesetting; physical pages to truck to far-flung teams.

The section titled Conferences and Courts talks about their change process. He mentions that it happened less often than it should have--Brooks wishes that his architecture team had been more agile. But when you hear about what it took to communicate a change... it makes one glad to be working in these relatively easy times.

Why Did the Tower of Babel Fail?

It's another chapter about the difficulty of communication, but this chapter describes problems that are still with us.

So it is today. Schedule disaster, functional misfits, and system bugs all arise because the left hand doesn't know what the right hand is doing.

Part of the chapter describes the 1970s solution: write everything down. Since they didn't have an intranet and a wiki, and they generated a lot of information, they used microfiche. Ha ha ha. If you get bored reading about 1970s communications technology, you might skip ahead to the section titled Organization in the Large Programming Project.

Like the "Surgical Team" chapter, this section discusses roles. But instead of people working on a small team, these roles are the "diplomats" that keep those teams moving in the same general direction: producers and technical directors. If you work on a large project, who are the people who have this high-level role? How would you describe their roles? What challenges do you think they face?

Calling the Shot

This chapter has more ponderings on schedule estimation, including some still-relevant thoughts about why things might take longer than you'd expect. Along with this, there are some good reasons for "fudge factors" to apply when you hear a schedule from a naive estimator.

Have you ever estimated how long it would take you to complete a task? How far off were you? Why?

Has someone else ever estimated how long it would take you to complete a task? Why was their estimate so far off?

Have you ever estimated how long it would take someone else to complete a task? How far off were you? Why?

This chapter measures programming output in Lines Of Code. Yes, people really did this back in the 1970s and '80s without irony.

Ten Pounds in a Five-Pound Sack

This chapter discusses the tragedy of metrics: once programmers know what you're trying to measure, they will optimize for it, sometimes at the expense of overall system quality. E.g., if you tell programmers to use less memory, you might notice the system slows down with swapping:

...Digging-in showed that the control program modules were each making many, many disk accesses. Even high-frequency supervisor modules were making many trips to the well, and the result was quite analogous to page thrashing.

You need to measure things, but leaders need to make sure that things stay on track.

The project was large enough and management communication poor enough to prompt many members of the team to see themselves as contestants making brownie points, rather than as builders making programming products.

Have you seen this kind of problem, where a high-level strategy gets mis-applied at the low level? Could a different approach have yielded the benefits without the low-level problems?

Representation is the Essence of Programming is a fun section, reminding us of the importance of designing our data structures.

The Documentary Hypothesis

Brooks was a writer; it's no wonder that he advises that a project's mission, architecture, and everything be expressed in writing. This chapter describes some things worth writing down. If your customers can see your source code, then API documentation is simple. But he wants more.

One document he suggests maintaining is an org chart:

This becomes intertwined with the interface specification, as Conway's Law predicts: "Organizations which design systems are constrained to produce systems which are copies of the communication structures of these organizations." Conway goes on to point out that the organization chart will initially reflect the first system design, which is almost certainly not the right one.

Even with our great software tools, editing an org chart is tricky. People get tetchy when you mess with it. Some inertia is good. Brooks advises some inertia in changing system documents, too... and not just because of clunky 1970s documentation technology:

...the best engineering manager I ever saw served often as a giant flywheel, his inertia damping the fluctuations that came from market and management people.

Have you worked with customers before? What sorts of second-guessing have you had to apply to their suggestions? How do you prioritize their requests?

The task of the manager is to develop a plan and then to realize it. But only the written plan is precise and communicable.

How much precision do you want in a plan? Are there other ways you might communicate it?

Plan to Throw One Away

Twenty years later, Brooks was no longer fond of this chapter. This chapter points out that your first attempt at implementation might not be good; it's OK to scrap it and start over. Later on, as he moved away from the waterfall model, Brooks was OK with the idea of more incremental improvements.

But there's still some good advice here:

Project after project designs a set of algorithms and then plunges into construction of customer-deliverable software on a schedule that demands delivery of the first thing built.

Have you ever been on such a project? Did your customers forgive you?

Plan the Organization for Change reminds us of tricky things to keep in mind when changing things.

Management structures also need to be changed as the system changes. This means that the boss must give a great deal of attention to keeping his managers and his technical people as interchangeable as their talents allow.

I'm not sure I've ever seen a management structure change that was described as in response to a system change. Have you? What was the reason? How was it communicated?

Sharp Tools

This chapter is mostly of interest to the historian. It's a fun read, but if you're in a hurry, I'll summarize it for you: tools are better now than they were in the 1970s. We gripe about our tools, but they are awesome.

The Whole and the Parts

This chapter discusses the creation of high-quality systems.

Have you worked on a large project? How did the project find bugs hidden in the spaces "between" modules worked on by multiple teams? Did teams do anything to make this easier?

Hatching a Catastrophe

Earlier, we learned that everyone is terrible at estimating software development schedules. Here, we learn that they're also terrible at noticing schedule slips. He recommends using scheduling software to find bottlenecks and watch those carefully.
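One common way a scheduling tool finds the bottleneck is a critical-path calculation: the longest chain of dependent tasks sets the earliest possible finish date, so a slip anywhere on that chain is a slip for the whole project. Below is a minimal sketch of the idea. The task names, durations, and dependencies are invented for illustration, and it assumes the dependency graph has no cycles.

```python
from functools import lru_cache

# Hypothetical tasks: name -> (duration in days, prerequisite task names).
TASKS = {
    "spec":      (5, []),
    "kernel":    (20, ["spec"]),
    "compiler":  (15, ["spec"]),
    "docs":      (10, ["spec"]),
    "integrate": (10, ["kernel", "compiler"]),
    "release":   (2, ["integrate", "docs"]),
}


def critical_path(tasks):
    """Return (total_days, path) for the longest chain of dependent tasks."""

    @lru_cache(maxsize=None)
    def longest_to(name):
        # Longest chain that ends with (and includes) the named task.
        duration, deps = tasks[name]
        if not deps:
            return duration, (name,)
        best_days, best_path = max(longest_to(dep) for dep in deps)
        return best_days + duration, best_path + (name,)

    return max(longest_to(name) for name in tasks)


days, path = critical_path(TASKS)
print(days, " -> ".join(path))  # 37 spec -> kernel -> integrate -> release
```

In this made-up project, "docs" could slip by twenty days before it delayed the release, while a one-day slip in "kernel" costs a day right away. Those are the tasks to watch carefully.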

The bosses also need to let underlings safely report schedule slippage.

Have you worked on a project that had a schedule?
Have you worked on a project that did not have a schedule?
Any differences in the projects that could be attributed to the difference?

The Other Face

This chapter discusses how your product interacts with customers: usability and documentation. Some parts of this chapter have aged better than others.

The flow chart is a most thoroughly oversold piece of program documentation. Many programs don't need flow charts at all; few programs need more than a one-page flow chart.

Flow charts have fallen out of favor. What would you say are the most oversold pieces of program documentation nowadays?

The chapter has some interesting ideas about using code comments to clarify code. What did you think about using ASCII art arrows to show the path of GOTO statements? Do you still think C is a primitive programming language, now that you've seen PL/I?

No Silver Bullet--Essence and Accident in Software Engineering

If you're reading an old edition of the book, you don't have this chapter. Fortunately, it's out there in a place called the internets.

This 1987 essay points out that software development had made great strides recently. Software development was getting faster! But the easy speed-ups were gone: the remaining problems could not be so easily knocked down with tools.

Our tools were getting really good at dealing with the accidental problems. But the essential problems hadn't been tackled at all.

I believe the hard part of building software to be the specification, design, and testing of this conceptual construct, not the labor of representing it and testing the fidelity of the representation. We still make syntax errors, to be sure; but they are fuzz compared to the conceptual errors in most systems.

Putting together a software system isn't like assembling homogeneous bricks.

...it is necessarily an increase in the number of different elements. In most cases, the elements interact with each other in some nonlinear fashion, and the complexity of the whole increases much more than linearly.

And of course, by the time you design a decent solution, the requirements have changed:

The software entity is constantly subject to pressures for change. Of course, so are buildings, cars, computers. But manufactured things are infrequently changed after manufacture; they are superseded by later models, or essential changes are incorporated in later serial-number copies of the same basic design.

How can we approach some of these essential problems? If we want to do something similar to what others have done before, we might be able to take advantage of their work; e.g., if you want to write a simple database-backed web application, there are several platforms that make this pretty easy. But probably the very fact that you're thinking about large-scale software development means that you're thinking about some project that's not so straightforward.

When you consider the lifetime of a major software project: the evolution from hallway conversation to design to tinkering to large-scale development to refinement to deployment to support to maintenance and improvement: where does the time go? If you wanted to accomplish all this with half the effort, what would need to change?

Some of Brooks' possible-solutions-on-the-horizon have come to pass, at least partially. Are there any of these that you think might speed up software development further in the future? Or have these veins been mined out?


Posted 2009-11-08