The Mythical Man-Month

Oliver Hu
Keqiu’s Management Notes
27 min read · Feb 21, 2023


by Frederick P. Brooks, JR. Reading notes v0.3

Even though 2023 has just started, I feel this book will be one of my best reads this year. An interesting anecdote about this book concerns the Chinese translation of its title, which literally reads: The Mythology of Man on Moon (Moon shares the same character as Month in Chinese; 1 month = the time it takes the moon to orbit the earth). I had heard of the Chinese version of this book for a long time, but I thought it was a made-up mythology.

My previous org leader recommended this book to me and told me I would learn a lot from it. When he mentioned the name of the book and asked if I'd ever read it, I paused for quite a bit until I connected its name to the translation. Dang! It is Man-Month, not Man on Moon!

The book is Dr. Brooks's reflection on his learnings from managing the IBM System/360 and OS/360 projects, two extremely complicated undertakings in terms of technical program management. The book provides numerous frameworks and ideas for understanding common software management challenges. Almost 50 years have passed since its initial release, and it is shocking to see how little has changed in software engineering. The technologies have changed drastically and society has arguably advanced significantly, yet there is still no silver bullet for software engineering (or, in general, for unsolvable problems), people still make the same mistakes, and the core of humanity and software engineering has not changed, and might not change in the future either.

I will take the chance, while writing reading notes on this book, to reflect on many of my own experiences as well. Hopefully it doesn't go on too long.

Chapter 1 The Tar Pit

There are two ways a program can be converted into a more useful but more costly object. These two ways are represented by the boundaries in the diagram.

Moving down across the horizontal boundary, a program becomes a programming product. This is a program that can be run, tested, repaired, and extended by anybody. It is usable in many operating environments, for many sets of data. 3x more work.

Moving across the vertical boundary, a program becomes a component in a programming system. This is a collection of interacting programs, coordinated in function and disciplined in a format, so that the assemblage constitutes an entire facility for large tasks. To become a programming system component, a program must be written so that every input and output conforms in syntax and semantics with precisely defined interfaces.

One rarely controls the circumstances of his work, or even its goal. In management terms, one’s authority is not sufficient for his responsibility.

This diagram depicts a common mistake: underestimating the effort to productionize a solution. When you ask less experienced developers to estimate the effort of a task, they usually estimate only the cost of A Program. Similarly, when you ask someone who is not familiar with the domain (say, AI engineers estimating an infra effort), they usually estimate only the cost of "writing the code". However, that is at most 10% of the work. Most of the effort is spent on making this piece of code a Programming Product that can be used by many teams: it needs good test coverage to avoid future regressions, the other face (documentation) well done, and a design that is extensible. We also rarely build standalone features; a feature has to fit into the existing ecosystem, for example working with the resource management frameworks, compliance, security, and the input/output expectations of the overall ecosystem. It all adds up: we need 10x more effort to productionize an idea.

1x to build a proof of concept (experimentation) and 10x to productionize the solution (production). It is a common challenge in building ML systems to empower both innovation acceleration and productionization stability. We dreamed of and tried to build an omnipotent production solution that also happened to be productive for experimentation use cases; it didn't work out at all and our AI users hated our product.

On the product development side, the continuous integration and delivery pipeline (CI/CD) doesn't usually change, and the same system rarely needs an overhaul. In AI, however, nobody knows what will come out in 2–3 years, and the AI & Data community is more vibrant and prolific than any other area. What makes things worse is that AI innovations can pose drastically different requirements on the underlying ML system, and it is not realistic to build an ML system that covers all the use cases, even just for the upcoming 2–3 years. For example, the shift from linear models, to hybrid linear + deep learning models (<100M parameters), to large deep learning models (1B–100B), to multi-modal learning, multi-task learning, and graph neural nets, to ultra-large (1T+) generative and DLRM models: each new paradigm puts drastically different requirements on the system.

The best way out, from my experience, is to provide at least two levels of APIs (pierceable infrastructure): a lower level of APIs that can be directly consumed for fast innovation and AI research, and a higher level of APIs for production and re-training. The high-level production API is thin and built on top of the lower-level APIs. In other words, the lower-level APIs or components are decoupled from the production API but highly aligned with the design principles of the system. When users want to productionize a pipeline, they can easily customize the components and chain them up to form a production pipeline.
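To make the idea concrete, here is a minimal Python sketch of what a pierceable two-level API could look like. All names here (`read_data`, `transform`, `train`, `production_pipeline`) are hypothetical placeholders, not from any real system:

```python
# Hypothetical sketch of a "pierceable" two-level API: low-level components
# that researchers compose freely, plus a thin production wrapper on top.

def read_data(path):
    """Low-level component: load raw training examples (stubbed)."""
    return [{"source": path, "row": i} for i in range(3)]

def transform(rows, scale=1):
    """Low-level component: feature transformation, directly customizable."""
    return [{**r, "scaled": r["row"] * scale} for r in rows]

def train(rows):
    """Low-level component: fit a model (stubbed)."""
    return {"model": "stub", "n_examples": len(rows)}

def production_pipeline(path, scale=1):
    """High-level API: a thin chain of the low-level pieces.
    Users who outgrow it can call the components directly and
    re-chain them into their own production pipeline."""
    return train(transform(read_data(path), scale=scale))

result = production_pipeline("/data/train", scale=2)
print(result["n_examples"])
```

The design point is that the high-level function adds no logic of its own; swapping one stage never requires forking the production API.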

Chapter 2 The Mythical Man-Month

Good cooking takes time. If you are made to wait, it is to serve you better, and to please you — Menu of Restaurant Antoine, New Orleans.

The Mind of the Maker, divides creative activity into three stages: the idea, the implementation, and the interaction. A book or a computer or a program comes into existence first as an ideal construct, built outside time and space, but complete in the mind of the author. It is realized in time and space, by pen, ink, and paper, or wire, silicon and ferrite.

Cost does indeed vary as the product of the number of men and the number of months. Progress does not. Hence the man-month as a unit for measuring the size of a job is a dangerous and deceptive myth. It implies that men and months are interchangeable.

Men and months are interchangeable commodities only when a task can be partitioned among many workers with no communication among them.

For some years, I have been successfully using the following rule of thumb for scheduling a software task:

  • 1/3 planning.
  • 1/6 coding.
  • 1/4 component test and early system test.
  • 1/4 system test, all components in hand.

This differs from conventional scheduling in several important ways:

  • The fraction devoted to planning is larger than normal. Even so, it is barely enough to produce a detailed and solid specification, and not enough to include research or exploration of totally new techniques.
  • The half of the schedule devoted to debugging of completed code is much larger than normal.
  • The part that is easy to estimate, i.e., coding, is given only 1/6 of the schedule.
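Brooks's fractions can be turned into a tiny scheduling helper; a minimal sketch, with the phase names taken from the list above:

```python
from fractions import Fraction

# Brooks's rule of thumb, as fractions of the total schedule.
SCHEDULE = {
    "planning": Fraction(1, 3),
    "coding": Fraction(1, 6),
    "component test and early system test": Fraction(1, 4),
    "system test, all components in hand": Fraction(1, 4),
}

def allocate(total_weeks):
    """Split a total schedule (in weeks) by Brooks's fractions."""
    return {phase: float(frac * total_weeks) for phase, frac in SCHEDULE.items()}

# The four fractions sum to exactly 1, i.e. they cover the whole schedule.
assert sum(SCHEDULE.values()) == 1

print(allocate(24))  # planning gets 8 weeks; coding only 4
```

Note how half the total (1/4 + 1/4) lands in testing, which is the point Brooks makes about debugging time being underestimated.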

Until estimating is on a sounder basis, individual managers will need to stiffen their backbones and defend their estimates with the assurance that their poor hunches are better than wish-derived estimates.

Oversimplifying outrageously, we state Brooks’s Law:

Adding manpower to a late software project makes it later.

Due to the complexity of large software projects, it is close to impossible to design every piece ahead of time (this is covered in later chapters). As a result, large software projects require extensive communication among team members, across teams, and across different levels of organizations. A simple design change can implicitly impact multiple teams, and considering the complexity of human nature, this makes the communication cost even higher.

Imagine each org has a chain from engineer -> mgr -> sr mgr -> director -> sr director -> VP: 5 layers. If you include TLs at each layer, that is probably at least 10 people on each side of the communication. Now suppose there is a disagreement, originating from a leaf team, that we need to align between VP org A and VP org B. If you follow the communication chain, with 10 hops from a manager in Org A over to Org B, the other side will hear a completely different message at the end. So you need some fully connected communication between the two orgs if they collaborate very closely (like one team building the interface and the other team building the implementation of the exact same product); that is 10 x 10 = 100 communication paths. This is the communication hell I experienced 2 years ago: I had to communicate a message directly to 20 folks, otherwise they heard completely different stories, and even the folks I communicated to chose their own communication paths and often ended up garbling the actual message I wished to convey.

We will discuss the purpose of organization in the next few chapters; here the point is that communication is a lot of overhead if not properly organized. The best strategy is to ensure proper funding, reasonable organization, and explicit role clarity among individuals and teams at the very beginning, so the project can progress properly. Adding headcount in the middle of a project rarely helps the schedule.

Decision-making frameworks (and role clarity) are often more critical than expected; this part is expanded in the next section.

Chapter 3 The Surgical Team

These studies revealed large individual differences between high and low performers, often by an order of magnitude.

Note the differences between a team of two programmers vs the surgeon-copilot team.

  1. In the conventional team, the partners decide the work, and each is responsible for design and implementation of part of the work. In the surgical team, the surgeon and copilot are each cognizant of all the design and all of the code. This saves the labor of allocating space, disk accesses, etc. It also ensures the conceptual integrity of the work.
  2. In the conventional team the partners are equal, and the inevitable differences of judgment must be talked out or compromised. Since the work and resources are divided, the differences in judgment are confined to overall strategy and interfacing, but they are compounded by differences of interest (e.g., whose space will be used for a buffer). In the surgical team, there are no differences of interest, and differences of judgment are settled by the surgeon unilaterally. These two differences (lack of division of the problem and the superior-subordinate relationship) make it possible for the surgical team to act uno animo.

So far, so good. The problem is how to build things that today take 5000 man-years, not things that take 20 or 30. How is the surgical team concept to be used on large jobs when several hundred people are brought to bear on the task?

The success of the scaling-up process depends on the fact that the conceptual integrity of each piece has been radically improved — that the number of minds determining the design has been divided by seven. So it is possible to put 200 people on a problem and face the problem of coordinating only 20 minds, those of the surgeons.

Let it suffice here to say that the entire system also must have conceptual integrity, and that requires a system architect to design it all, from the top down. To make that job manageable, a sharp distinction must be made between architecture and implementation, and the system architect must confine himself scrupulously to architecture. However, such roles and techniques have been shown to be feasible and, indeed, very productive.

Chapter 4 Aristocracy, Democracy, and System Design

This great church is an incomparable work of art. There is neither aridity nor confusion in the tenets it sets forth… it is a zenith of a style, the work of artists who had understood and assimilated all their predecessors’ successes, in complete possession of the techniques of their times, but using them without indiscreet display nor gratuitous feats of skills.

I will contend that conceptual integrity is the most important consideration in system design.

It is better to have a system omit certain anomalous features and improvements, but to reflect one set of design ideas, than to have one that contains many good but independent and uncoordinated ideas.

The purpose of a programming system is to make a computer easy to use. Ease of use is enhanced only if the time gained in functional spec exceeds the time lost in learning, remembering, and searching manuals.

Because ease of use is the purpose, this ratio of function to conceptual complexity is the ultimate test of system design. Neither the function alone nor simplicity alone defines a good design.

For a given level of function, however, that system is best in which one can specify things with the most simplicity and straightforwardness. Simplicity is not enough….(Mooer’s system) they are not, however, straightforward.

The separation of architectural effort from implementation is a very powerful way of getting conceptual integrity on very large projects.

By the architecture of a system, I mean the complete and detailed spec of the user interface.

The architect of a system, like the architect of a building, is the user’s agent. It is his job to bring professional and technical knowledge to bear in the unalloyed interest of the user, as opposed to the interests of the salesman, the fabricator, etc.

Architecture must be carefully distinguished from implementation. As Blaauw has said, “Where architecture tells what happens, implementation tells how it is made to happen.”

Conceptual integrity of a system determines its ease of use.

…as to the aristocracy charge, the answer must be yes and no. Yes, in the sense that there must be few architects, their product must endure longer than that of an implementer, and the architect sits at the focus of forces which he must ultimately resolve in the user's interest. If a system is to have conceptual integrity, someone must control the concepts. That is an aristocracy that needs no apology. No, because the setting of external specifications is not more creative work than the designing of implementations. It is just different creative work.

The worst buildings are those whose budget was too great for the purposes to be served.

Long before the external specifications are complete, the implementer has plenty to do. Given some rough approximations as to the function that will ultimately be embodied in the external specifications, he can proceed. He must have well-defined space and time objectives, and he must know the system configuration on which his product must run.

It is such a great summary that the purpose of a programming system is ease of use. The goal is never to build more features, nor to be extreme about simplicity. It is simplicity + straightforwardness, which come from conceptual integrity.

There are many counterexamples in real life that fail badly. On the more-features front, there have been numerous products that are full of features but hard to use, with no clarity on the purpose of the product: think of smartphones before the iPhone came out, or some databases with hundreds of configurations that users need to tune. Simplicity is not the answer either, at least not the end goal for a customer-facing product. For example, MapReduce brings powerful primitives to run parallel workloads at scale; however, it is far from straightforward. That is why when Spark came out (though it is not straightforward either for many users), it beat MR in both performance and straightforwardness. I'd also argue that performance is the baseline; what differentiates Spark from MR is its expressiveness and ease of use. There was a callout in Innovator's Dilemma that disruptive innovations do not necessarily come from performance or disruptive technology, but more often come as ease of use improved by another order of magnitude.

Aristocracy vs democracy is an interesting one. I have been contemplating the differences between organizing a software team and organizing a society. If we argue it is better to have a few minds "dictating" the design of software, why did democracy become the mainstream of humanity? (Almost all governments claim to embrace democracy, regardless of the reality.) I think the nuance lies in the implementation of democracy. For a certain period of time, for example, the President of the U.S. bears great power (executive orders, nominating Supreme Court justices, appointing secretaries, etc.) and can operate in an aristocracy-like state. But in order to be re-elected, they do have to perform their job following the expectations of the general public and their promises during the election campaign. Software organizations operate in a similar way. There is no lifetime architect, or an architect for everything, and the executives of an organization are bounded by a governing board (board of directors). However, each individual project should definitely be led by one primary architect (or one Directly Responsible Individual) to make decisions. In the end, someone has to make a decision; if no decision can be made, the organization is paralyzed. The quality of a leader is defined by his or her decision-making quality and velocity.

Chapter 5 The Second-System Effect

The architect has two possible answers when confronted with an estimate that is too high: cut the design, or challenge the estimate by suggesting a cheaper implementation. The latter is inherently an emotion-generating activity; the architect is now challenging the builder's way of doing the builder's job. For it to be successful, the architect must

  • remember that the builder has the inventive and creative responsibility for the implementation; so the architect suggests, not dictates;
  • always be prepared to suggest a way of implementing anything he specifies, and be prepared to accept any other way that meets the objectives as well;
  • deal quietly and privately in such suggestions.
  • be ready to forego credit for suggested improvements.

How does the project manager avoid the second-system effect? By insisting on a senior architect who has at least two systems under his belt.

Chapter 6 Passing the Word

The manual must not only describe everything the user does see, including all interfaces; it must also refrain from describing what the user does not see.

The telephone log: one useful mechanism is a telephone log kept by the architect. In it he records every question and every answer. Each week the logs of the several architects are concatenated, reproduced, and distributed to the users and implementers.

Chapter 7 Why Did the Tower of Babel Fail?

“Come, let us go down, and there make such a babble of their language that they will not understand one another’s speech” Thus the Lord dispersed them from there all over the earth, so that they had to stop building the city.

Organization in the Large Programming Project

If there are n workers on a project, there are (n^2 - n)/2 interfaces across which there may be communication, and there are potentially almost 2^n teams within which coordination must occur. The purpose of organization is to reduce the amount of communication and coordination necessary; hence organization is a radical attack on the communication problems treated above.
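The (n^2 - n)/2 figure is just the number of distinct pairs among n workers, i.e. "n choose 2"; a quick sketch shows how fast it grows:

```python
from math import comb

def interfaces(n):
    """Pairwise communication interfaces among n workers: (n^2 - n) / 2."""
    return (n * n - n) // 2

# The formula is exactly "n choose 2": every pair of workers is a channel.
for n in (2, 10, 100):
    assert interfaces(n) == comb(n, 2)
    print(n, interfaces(n))  # 2 -> 1, 10 -> 45, 100 -> 4950
```

Doubling the team roughly quadruples the interfaces, which is why organization (pruning channels into a tree) is such a radical attack on the problem.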

The means by which communication is obviated are division of labor and specialization of function.

Let us consider a tree-like programming organization, and examine the essentials which any substrate must have in order to be effective. They are:

  1. A mission
  2. A producer
  3. A technical director/architect
  4. A schedule
  5. A division of labor
  6. Interface definitions among the parts

All of this is obvious and conventional except the distinction between the producer and the technical director. Let us first consider the two roles, then their relationship.

Producer: assembles the team, divides the work, and establishes the schedule. He acquires and keeps on acquiring the necessary resources. This means that a major part of his role is communication outside the team, upwards and sideways. He establishes the pattern of communication and reporting within the team. Finally, he ensures that the schedule is met, shifting resources and organization to respond to changing circumstances.

Technical director: conceives of the design to be built, identifies its sub-parts, specifies how it will look from the outside, and sketches its internal structure. He provides unity and conceptual integrity to the whole design; thus he serves as a limit on system complexity.

The producer and the technical director may be the same man. This is workable on very small teams, 3–6 programmers. On larger projects it is very rarely workable, for two reasons:

  1. The man with strong management talent and strong technical talent is rarely found. Thinkers are rare; doers are rarer, and think-doers are rarest.
  2. On the larger projects each of the roles is necessarily a full-time job, or more. It is hard for the producer to delegate enough of his duties to give him any technical time. It is impossible for the director to delegate his without compromising the conceptual integrity of the design.

The producer may be the boss, the director his right-hand man. The director may be the boss, and the producer his right-hand man.

I think the producer as boss is a more suitable arrangement for larger sub trees of a really big project.

The best part of this chapter is the discussion around a manager vs a technical lead, in today's terms. The exact analogy was used by my previous manager to explain the differences between a manager and a TL, so when I read it the first time, I thought he had "stolen" it from the book.

A manager and a tech lead might overlap in some responsibilities, but the priorities differ between the two roles. The manager is responsible for setting the cast and budget; the TL is responsible for the design and conceptual integrity. It is OK for the same person to do both jobs, though that is hard to scale or sustain, so we usually have two people. And it is OK for either to be the boss, but not for both to be the boss. It is similar to the earlier discussion around architect vs implementer: it is OK for the architect to give a default implementation when asked, but they should not dictate; it is the implementer's choice.

In the previous leadership team of our org, there was an infamous conflict between the TL and the director due to ambiguous role clarity, and that unfortunately rippled across the whole org: every layer of managers and TLs fought to be the boss. In hindsight, it is hard to say whose decisions were better, but it's clear that it is horrible to operate an org without role clarity. In the current setup of the org, it is interesting to see that our leader has quite strong opinions in mind, but he doesn't often express them and mostly delegates to the technical leaders inside the org. He does exercise strong opinions in people management, though, and believes that his primary job is to put the right people in the right places and things will work out naturally. This seems to have worked out pretty well.

Chapter 8 Calling the Shot

Portman found his programming teams missing schedules by about one-half: each job was taking approximately twice as long as estimated… these showed that the estimating error could be entirely accounted for by the fact that his teams were only realizing 50% of the working week as actual programming and debugging time.

effort = (constant) x (number of instructions)^1.5
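The quoted relation is easy to play with; a small sketch (the constant is project-specific, and 1.0 below is just a placeholder):

```python
def effort(instructions, constant=1.0):
    """Brooks quotes effort = constant * (number of instructions)^1.5.
    The constant varies by project; 1.0 is only a placeholder here."""
    return constant * instructions ** 1.5

# The exponent > 1 is the whole point: doubling the program size grows
# the effort by a factor of 2^1.5, i.e. about 2.83x, not 2x.
ratio = effort(20_000) / effort(10_000)
print(round(ratio, 2))  # ~2.83
```

Note the constant cancels in the ratio, so the superlinearity holds regardless of how it is calibrated.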

Chapter 9 Ten Pounds in a Five-Pound Sack

More function means more space, speed being held constant. So the first area of craftsmanship is in trading function for size. The second area of craftsmanship is space-time trade-offs.

Representation is the Essence of Programming

Beyond craftsmanship lies invention, and it is here that lean, spare, fast programs are born. Almost always these are the result of strategic breakthrough rather than tactical cleverness. Much more often, strategic breakthrough will come from redoing the representation of the data or tables. This is where the heart of a program lies.

Tough decisions are all about making intelligent trade-offs; easy decisions are usually only easy because they deal with accidental complexity. Similarly, challenging or innovative software decisions are also mostly about trade-offs. This "representation is the essence of programming" resonates with me deeply, recalling a recent project to rewrite the whole TensorFlow Avro Reader based on additional internal schema information. With prior knowledge of the data representation, we could effectively provision array buffer sizes ahead of time to avoid resizing arrays on the fly. Similarly, we chose to fuse a few common operators into one single operator (similar to kernel fusion) to minimize op launch and lifecycle overhead. Combined, we were able to speed up I/O by more than 100x.
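The buffer pre-provisioning idea can be illustrated with a toy sketch; this is a simplified stand-in for the concept, not the actual TensorFlow Avro Reader code:

```python
# Hypothetical illustration: knowing array lengths from the schema lets us
# allocate the output buffer once, instead of growing it while decoding.

def decode_with_resizing(values):
    """Naive reader: append one value at a time, letting the buffer grow."""
    out = []
    for v in values:
        out.append(v)  # may trigger repeated reallocation as the list grows
    return out

def decode_preallocated(values, length_hint):
    """Schema-aware reader: size the buffer up front from the schema hint."""
    out = [None] * length_hint  # one allocation, no resizing on the fly
    for i, v in enumerate(values):
        out[i] = v
    return out

batch = list(range(1000))
# Both produce the same result; the second avoids incremental growth.
assert decode_with_resizing(batch) == decode_preallocated(batch, len(batch))
```

In a real columnar reader the same trick applies to fixed-size tensor buffers, where avoiding reallocation matters far more than it does for a Python list.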

Chapter 10 The Documentary Hypothesis

Documents for a Software Project

What: objectives. This defines the need to be met and the goals, desiderata, constraints, and priorities.

What: product specifications. This begins as a proposal and ends up as the manual and internal documentation. Speed and space specifications are a critical part.

When: schedule

How much: budget.

Where: space allocation

Who: organization chart. This becomes intertwined with the interface spec, as Conway's Law predicts: "Organizations which design systems are constrained to produce systems which are copies of the communication structures of these organizations." Conway goes on to point out that the org chart will initially reflect the first system design, which is almost surely not the right one. If the system design is to be free to change, the org must be prepared to change.

Why have formal documents?

  1. Writing the decisions down is essential.
  2. The documents will communicate the decisions to others. Since the manager’s fundamental job is to keep everybody going in the right direction, his chief daily task will be communication, not decision-making, and his documents will immensely lighten this load.
  3. A manager's documents give him a database and checklist.

I can't agree more with Conway's Law. It was discussed in a previous chapter that the purpose of organization is to avoid/reduce communication and coordination (a tree structure rather than a fully connected one). This means most communication happens at the local level, and each layer will try to maximize the collaboration within its subtree. So if you have an early-stage functionality prematurely scaled into 2 different VP organizations, it becomes a disaster and an escalation hell. I discussed my own team in previous blog posts: we used to separate the interface and implementation of our deep learning product into two VP orgs, and in the end we spent at least 60% of our energy escalating and aligning directions. The interface side had no idea of the implementation details and thought the impl team did all the fun and hardcore engineering; the impl team was segregated from customer interactions and felt they did all the hard work while the interface team took all the credit. And here I am talking about a product that had a total of only 10 people, split half-and-half across the two VP orgs.

Chapter 11 Plan to Throw One Away

There is nothing in this world constant but inconstancy. — Swift

The management question, therefore, is not whether to build a pilot system and throw it away. You will do that. The only question is whether to plan in advance to build a throwaway, or to promise to deliver the throwaway to customers.

Plan the Organization for Change The common failing of programming groups today is too little management control, not too much.

Cosgrove offers a great insight. He observes that the reluctance to document designs is not due merely to laziness or time pressure. Instead it comes from the designer’s reluctance to commit himself to the defense of decisions which he knows to be tentative.

The total cost of maintaining a widely used program is typically 40% or more of the cost of developing it.

Chapter 12 Sharp Tools

A good workman is known by his tools. — Proverb

One feels that if all those scattered tool builders were gathered in to augment the common tool team, greater efficiency would result. But it is not so.

That, it develops, was a better way to allocate and schedule. Although machine utilization may have been a little lower (and often it wasn’t), productivity was way up.

Harris data suggest that an interactive facility at least doubles productivity in system programming.

My takeaway from this chapter is: you can't bet on a platform team or a tools team to build everything you need, nor should a platform attempt to build everything everyone needs. The platform should focus on the right abstractions that benefit most customers and allow users to extend them. There will be a fine boundary around who should build what. As described in this chapter, if the specialized tool makers are decentralized inside the org, it is generally fine.

Chapter 13 The Whole and the Parts

Top-down design. Each of the architecture, implementation, and realization can be best done by top-down methods.

Many poorer systems come from an attempt to salvage a bad basic design and patch it with all kinds of cosmetic relief. Top-down design reduces the temptation.

Experience shows that they are not the whole truth — the use of clean, debugged components saves much more time in system testing than that spent on scaffolding and thorough component test.

Perhaps there are no bugs? No! Resist the temptation! That is what systematic system testing is all about.

Chapter 14 Hatching a Catastrophe

How does one control a big project on a tight schedule? The first step is to have a schedule. Each of a list of events, called milestones, has a date.

Two interesting studies of estimating behavior by government contractors on large-scale development projects show that:

  1. Estimates of the length of an activity, made and revised carefully every two weeks before the activity starts, do not significantly change as the start time draws near, no matter how wrong they ultimately turn out to be.
  2. During the activity, overestimates of duration come steadily down as the activity proceeds.
  3. Underestimates do not change significantly during the activity until about three weeks before the scheduled completion.

And chronic schedule slippage is a morale-killer. As we have seen, one must get excited about a one-day slip. Such are the elements of catastrophe.

The PERT technique, strictly speaking, is an elaboration of critical-path scheduling in which one estimates three times for every event, times corresponding to different probabilities of meeting the estimated dates.
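Brooks doesn't spell out the formula here, but the classic three-point PERT estimate combines the three times as a weighted average; a minimal sketch:

```python
def pert_expected(optimistic, most_likely, pessimistic):
    """Classic three-point PERT estimate: a weighted average of the three
    times, with the most likely time weighted four times as heavily."""
    return (optimistic + 4 * most_likely + pessimistic) / 6

# Three estimates for one milestone, in days: best case 4, likely 6, worst 14.
print(pert_expected(4, 6, 14))  # 7.0 -- the pessimistic tail pulls it up
```

The asymmetry is the useful part: a long pessimistic tail shifts the expected date later even when the most likely estimate stays put.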

Reducing the role conflict. The boss must first distinguish between action information and status information. He must discipline himself not to act on problems his managers can solve, and never to act on problems when he is explicitly reviewing status.

…this whole process is helped if the boss labels meetings, reviews, conferences, as status-review meetings versus problem-action meetings, and controls himself accordingly.

Chapter 15 The Other Face

For the program product, the other face to the user is fully as important as the face to the machine.

Showing them how the job is done is much more successful.

Chapter 16 No Silver Bullet — Essence and Accident in Software Engineering

I suggest:

  • Exploiting the mass market to avoid constructing what can be bought.
  • Using rapid prototyping as part of a planned iteration in establishing software requirements.
  • Growing software organically, adding more and more function to systems as they are run, used, and tested.
  • Identifying and developing the great conceptual designers of the rising generation.

The complexity of software is an essential property, not an accidental one. Many of the classical problems of developing software products derive from this essential complexity and its nonlinear increases with size.

Promising Attacks on the Conceptual Essence

  • Build versus buy. The most radical possible solution for constructing software is not to construct it at all. Any such product is cheaper to buy than to build afresh.
  • Requirements refinement and rapid prototyping. For the truth is, the clients do not know what they want. They usually do not know what questions must be answered, and they almost never have thought of the problem in the detail that must be specified. So in planning any software activity, it is necessary to allow for an extensive iteration between the client and the designer as part of the system definition.
  • Incremental development — grow, not build, software. Any software should be grown by incremental development. The approach necessitates top-down design, for it is a top-down growing of the software. It allows easy backtracking. It lends itself to early prototypes. Each added function and new provision for more complex data or circumstances grows organically out of what is already there.
  • Great Designers. The central question of how to improve the software art centers, as it always has, on people. The differences between the great and the average approach an order of magnitude… I think the most important single effort we can mount is to develop ways to grow great designers.
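Brooks's "grow, don't build" point can be sketched as a top-down skeleton that runs end to end from day one, with stubs fleshed out incrementally. All names and data here are illustrative, not from the book:

```python
# Top-down incremental growth: the top-level structure exists first,
# each subfunction starts as a stub, and the system always runs.

def load_input():
    return [3, 1, 2]           # stub: a real loader replaces this later

def process(data):
    return sorted(data)        # first real increment of actual logic

def report(result):
    print("result:", result)   # stub: real reporting replaces this later

def main():
    # The skeleton is runnable from the start, so backtracking is cheap:
    # a bad increment is replaced without touching the overall shape.
    report(process(load_input()))

main()  # → result: [1, 2, 3]
```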

It is surprising to see the build vs. buy tradeoff discussed here; the topic has been around for ages!

A provocative idea I have is that this buy > build principle applies not only to software products but also to software engineering teams. One honest confession: when folks asked me a few years back what I would do differently, my answer was that I would rather acquire a team with relevant expertise and a working product than build my current team from scratch. It took me 2+ years to build a team I am proud of, one that gives the company 100x more AI training capability, but it was 2+ years. Assume the company makes $10B every year, and 20% comes from AI; that is $2B. Suppose we had acquired a team two years ago with a similar or reasonably worse slate of talent, say only 10% as capable as the team we have now. We could have improved the productivity of all those hundreds of AI developers by at least 10x with that acquisition. Assuming a 10% conversion rate from AI productivity to revenue, we lost on the order of a year of the current revenue contributed by AI, at least $1B. We can practically acquire any startup with that amount of money.
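The back-of-envelope above can be written out explicitly. Every number here is an illustrative assumption (the half-year attribution factor is my own, chosen to reproduce the "at least $1B" figure), not company data:

```python
# Back-of-envelope for the buy-vs-build argument, using illustrative figures.
annual_revenue = 10e9        # assumed company revenue per year
ai_share = 0.20              # assumed fraction of revenue attributed to AI
ai_revenue = annual_revenue * ai_share        # $2B per year

# Assumption: acquiring a ready team would have pulled the AI revenue ramp
# forward, and roughly half a year of that ramp is attributable to the delay.
attributable_delay_years = 0.5
estimated_loss = ai_revenue * attributable_delay_years
print(f"${estimated_loss / 1e9:.1f}B")  # → $1.0B
```

The point of writing it down is not precision but checkability: each assumption is explicit, so anyone can swap in their own numbers and see whether the acquisition still pays for itself.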

Chapter 17 No Silver Bullet Refired

System complexity is a function of myriad details that must each be specified exactly, either by some general rule or detail-by-detail, but not just statistically. It seems very unlikely that uncoordinated works of many minds should have enough coherence to be exactly described by general rules.

It takes a lot of effort to accept the existence of insoluble problems.

Productivity numbers. Productivity numbers are very hard to define, hard to calibrate, and hard to find.

Chapter 19 The Mythical Man-Month after 20 Years

The central argument: Conceptual integrity and the Architect.

Conceptual integrity. A clean, elegant programming product must present to each of its users a coherent mental model of the application, of strategies for doing the application, and of the user-interface tactics to be used in specifying actions and parameters. The conceptual integrity of the product, as perceived by the user, is the most important factor in ease of use.

Any product that is sufficiently big or urgent to require the effort of many minds thus encounters a peculiar difficulty: the result must be conceptually coherent to the single mind of the user and at the same time designed by many minds.

The architect. I argue in Chapters 4 through 7 that the most important action is the commissioning of some one mind to be the product’s architect, who is responsible for the conceptual integrity of all aspects of the product perceivable by the user. The architect forms and owns the public mental model of the product that will be used to explain its use to the user. The architect is also the user’s agent, knowledgeably representing the user’s interest in the inevitable trade-offs among function, performance, size, cost, and schedule.

Separation of architecture from implementation and realization. Architecture versus implementation defines a clean boundary between parts of the design task, and there is plenty of work on each side of it.

Today I am more convinced than ever. Conceptual integrity is central to product quality. Having a system architect is the most important single step toward conceptual integrity. These principles are by no means limited to software systems, but to the design of any complex construct, whether a computer, an airplane, a Strategic Defense Initiative, a Global Positioning System.

The Second System Effect: Featuritis and Frequency Guessing. Designing for large user sets. System architects for machine-vendor-supplied software have always had to design for a large, amorphous user set rather than for a single, definable application in one company. Defining the user set. Since an architect’s image of the user consciously or subconsciously affects every architectural decision, it is essential for a design team to arrive at a single shared image. And that requires writing down the attributes of the expected user set, including:

  • Who they are.
  • What they need.
  • What they think they need.
  • What they want.

Frequencies. For any software product, any of the attributes of the user set is in fact a distribution, with many possible values, each with its own frequency. How is the architect to arrive at these frequencies? Surveying this ill-defined population is a dubious and costly proposition. Over the years I have become convinced that an architect should guess, or, if you prefer, postulate, a complete set of attributes and values with their frequencies, in order to develop a complete, explicit, and shared description of the user set. Write down explicit guesses for the attributes of the user set. It is far better to be explicit and wrong than to be vague.
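One way to follow the "write down explicit guesses" advice is to record the postulated user-set attributes as checkable data. The attributes, values, and frequencies below are entirely hypothetical, invented for illustration:

```python
# Postulated (guessed) attribute frequencies for an imagined user set.
# "Explicit and wrong" beats "vague": written down, the guesses can be
# reviewed, shared across the design team, and sanity-checked.
postulated_users = {
    "skill":    {"novice": 0.50, "regular": 0.35, "power": 0.15},
    "platform": {"laptop": 0.70, "phone": 0.30},
}

# A minimal consistency check: each attribute's frequencies must sum to 1.
for attribute, dist in postulated_users.items():
    assert abs(sum(dist.values()) - 1.0) < 1e-9, attribute
print("postulated user model is internally consistent")
```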

User power versus ease of use. One of the hardest issues facing software architects is exactly how to balance user power versus ease of use.

Incremental transition from novice to power user.

Information hiding is the only way of raising the level of software design.
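A tiny, generic illustration of information hiding (my example, not Brooks's): callers depend only on a small public interface, so the hidden representation can change without breaking any of them.

```python
from datetime import date

class Milestones:
    """Schedule tracker whose internal representation is hidden."""

    def __init__(self):
        # Private by convention: could later become a database, a sorted
        # list, anything — no caller may rely on this being a dict.
        self._dates = {}

    def set_date(self, event, scheduled):
        self._dates[event] = scheduled

    def slip_days(self, event, actual):
        # Callers never touch _dates directly, so swapping the storage
        # strategy cannot break them.
        return (actual - self._dates[event]).days

m = Milestones()
m.set_date("code complete", date(2023, 3, 1))
print(m.slip_days("code complete", date(2023, 3, 2)))  # → 1
```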

People Are Everything (Well, Almost Everything) The quality of the people on a project, and their org and management, are much more important factors in success than are the tools they use or the technical approaches they take.

The Power of Giving Up Power. If creativity comes from individuals and not from structures or processes, then a central question facing the software manager is how to design structure and process so as to enhance, rather than inhibit, creativity and initiative.

The Principle of Subsidiary Function teaches us that the center will gain in authority and effectiveness if the freedom and responsibility of the lower formations are carefully preserved, with the result that the org as a whole will be happier and more prosperous.

How can such a structure be achieved? That large org will consist of many semi-autonomous units, which we may call quasi-firms. Each of them will have a large amount of freedom, to give the greatest possible chance to creativity and entrepreneurship. Each quasi-firm must have both a profit and loss account, and a balance sheet.

The key thrust was delegating power down. It was like magic! Improved quality, productivity, morale. We have small teams, with no central control. The teams own the process, but they have to have one. They have many different processes. They own the schedule, but they feel the pressure of the market. This pressure causes them to reach for tools on their own.

After all, software engineering, like chemical engineering, is concerned with the nonlinear problems of scaling up into industrial-scale processes.

This chapter summarizes many of the changes and learnings 20 years after the initial release of the book, with a lot of takeaways.

  1. I am also convinced that the conceptual integrity and a centralized understanding of the product is key to the ease of use for the product. The concept should be crisp, and the design principles should be practiced in designing every piece of the software. Every tradeoff should be made based on the concept.
  2. Understanding the target audience is key to success. Note that this does not mean relying on the target audience to tell you what to do; it means knowing who your target audience is and understanding their asks. Don’t just do what they want: understand what they need, and the rationale behind their wants.
  3. I am getting convinced on the idea about frequencies…and the architect is expected to guess/postulate the attributes of their products, not based on surveys. Of course, you need a good architect, that is the pre-requisite to everything (almost everything).
  4. People are everything…people are absolutely everything. I would further argue the leadership team is close to everything. You can have a B+ team and an A+ leadership that gradually grows that B+ team into an A team. If you have an A+ team but a B+ leadership, you end up with a B+ team: the A+ members either quit or become demotivated and devolve into B+.
