My favorite books on software engineering

I started to teach myself programming when I was 8 years old.  The machine was a brand-new Macintosh Plus, and I was totally hooked on HyperCard.  I mostly learned by tearing apart the examples it came with, and figuring things out myself.  And, while this was fun, it really didn’t teach me a lot about software engineering.

I personally look at the difference between programming and software engineering as the difference between building a dog house and building a real house.

You can learn how to build a dog house completely on your own: self-taught, over a weekend or two of dedicated effort.  The consequences of completely screwing up your dog house are pretty minimal.  All that said, it’s not actually easy to make a super nice dog house, so there’s some real effort involved if you’re going to do it well, and there are some spectacular dog houses out there.

However… building a real house is an entirely different kind of endeavor.  To even start, you’re going to need to seek out some professional education.  You certainly will need a lot more than a few weekends.  You’re also going to need a team of specialists to help.  And, of course, the consequences of screwing up a real house will be very expensive, if not actually dangerous.

So, when I went off to school to learn how to go from being an avid programmer to a professional software engineer, I had a lot to learn.

I may talk about my schooling at some future point, but it was actually the reading I did on my own, at the recommendation of some of my early mentors, that taught me the most.  So, I want to pay it forward, and recommend a few of the books I found most formative.

Code Complete, by Steve McConnell, is where I was first introduced to most of the essential concepts of large-scale software engineering.  As each concept was introduced, it was accompanied by lots of examples in code, and excellent, in-depth explanations.  It is no exaggeration that this book completely changed how I wrote code. If you only ever read one book on coding, make it this one.

Design Patterns, by Erich Gamma et al., is the foundational book for the entire concept of design patterns.  In short, straightforward chapters, it teaches some of the most fundamental and reusable building blocks of complex software systems.  The patterns described in this book have become the very language of software design.

Refactoring, by Martin Fowler, outlines the most essential skills in changing existing code. Whether it’s to add a new feature, fix a bug, or anything else, you’ll want to learn and follow the procedures outlined in this book.


To be fair, these are three among a great number of amazing books on software engineering, so the fact that I don’t list a particular book here doesn’t mean it’s not valuable.  There are a lot of other books I have on my shelf and have read cover-to-cover (some of them multiple times).  However, of all the books in my collection, I view these three as having been most personally influential on the way I write and think about code.

Fail early, fail loudly

One easy step to help simplify your programs is to follow the adage “fail early and fail loudly”.  By failing early, I mean that your code should actively look for problems and stop as soon as something wrong is encountered.  By failing loudly, I mean that your code should raise the alarm in a way that makes it obvious to other parts of the system (and people reading the code) that something unusual has just occurred.

Naturally, this doesn’t mean that your code should explode whenever the slightest thing is wrong.  On the contrary, it means that each object/package/component in your system should demonstrate good cohesion by refusing to operate in a situation where it doesn’t have enough context to do so.  Each component should fail when encountering something that it isn’t supposed to be able to handle, and delegate responsibility for handling that situation to the caller.  The caller can then decide whether it has sufficient context to handle the situation, or whether to pass the responsibility along to yet another component.

Failing early is a benefit because you can then build whole sections of your program which don’t have to consider some particular error case.  For example, consider building a web service which accepts objects encoded in XML as input.  If all the necessary error checking is performed up front (e.g. the XML is validated against the DTD, the elements required by the API are confirmed to be present, and various values are confirmed to be within acceptable ranges), the remainder of the web service call can be written to assume that the request was perfectly valid.  The notion of Short Circuit statements I discussed recently is another variation on failing early.  All of them reduce the mental load of reading (and writing) the code which follows them.
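As a small sketch of this idea in Python (the element names and ranges are invented for illustration, and DTD validation is skipped since the standard library doesn’t provide it), here is what doing all the error checking up front might look like:

```python
import xml.etree.ElementTree as ET

class InvalidRequestError(ValueError):
    """Raised the moment a request fails validation -- fail early."""

def parse_order_request(xml_text):
    """Validate an incoming XML request entirely up front.

    Everything downstream of this function may assume the request
    is perfectly valid.
    """
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as exc:
        raise InvalidRequestError(f"malformed XML: {exc}")

    # Required elements are confirmed to be present...
    for tag in ("item-id", "quantity"):
        if root.find(tag) is None:
            raise InvalidRequestError(f"missing required element <{tag}>")

    # ...and values are confirmed to be within acceptable ranges.
    quantity = int(root.find("quantity").text)
    if not 1 <= quantity <= 100:
        raise InvalidRequestError(f"quantity {quantity} out of range [1, 100]")

    return {"item_id": root.find("item-id").text, "quantity": quantity}
```

The payoff is in everything that calls this: the rest of the request-handling code never re-checks for missing elements or bad ranges.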

Failing loudly is a benefit because it reduces the mental effort required to follow up on errors which occur in other parts of the system.  If a component fails loudly, a caller must ensure that they correctly respond to error conditions in order to be correct themselves.  The best example of this is a checked exception.  In Java, for example, any method which can possibly throw a checked exception must explicitly declare it in its signature.  Calling methods are forced—by the compiler—either to catch the exception or to declare that they throw the same exception themselves.  This radically reduces the mental effort required to follow up on the error, and provides a built-in mechanism to remind you when you’ve forgotten to do so.

(The fact that some APIs abuse checked exceptions horrendously is a topic for another time.)
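Python has no checked exceptions, but the same failing-loudly discipline can be approximated by raising a specific, context-carrying exception type instead of quietly returning a sentinel value.  A minimal sketch (the names here are invented for illustration):

```python
class InsufficientFundsError(Exception):
    """A loud failure: carries enough context for the caller to respond."""
    def __init__(self, requested, available):
        super().__init__(f"requested {requested}, only {available} available")
        self.requested = requested
        self.available = available

def withdraw(balance, amount):
    # Fail loudly: raise a distinct exception rather than returning
    # None or -1, which a careless caller could silently ignore.
    if amount > balance:
        raise InsufficientFundsError(amount, balance)
    return balance - amount
```

The caller must either handle the exception or let it propagate—roughly the same choice Java’s catch-or-declare rule forces, just without the compiler enforcing it.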

There are a number of cases where these two ideas apply, but they are especially true around where data enters a program’s runtime environment.  Some examples include: reading from a database, accepting user input from the mouse/keyboard, accepting an incoming TCP/HTTP connection, or simply reading data from a file.  In all of these cases, doing all of your error checking as soon as the data has entered the runtime environment will allow subsequent code to assume all the data was correct.

Failing early and failing loudly can also be seen as corollaries of Cohesion and Coupling.  If a component is well-integrated to a single purpose, it will not attempt to handle conditions which are outside of that purpose.  Instead, it will fail as soon as it is asked to cope with such a situation.  In order for a component to offer loose coupling, it must reduce the amount of knowledge needed to interact with it.  One aspect of this is to make it very clear what to expect in case the component has problems.  Both of these concepts lead back to preserving the Unit Economy of the reader (and author!) of the code.

How to stop hating to write tests

Nearly every developer I’ve ever worked with either hates writing automated tests, or doesn’t do it at all.  And why shouldn’t they?  After all, it’s a ton of tedious work which doesn’t impress anyone looking at the final product.  Yeah, yeah, it improves quality a bit, but still… it takes so much time and effort to write the tests in the first place, and even more effort to keep them from breaking all the time.  Right?

Of course not.

The problem is that we’ve mostly not been taught to write tests, and our testing frameworks tend to lead us in the wrong direction.  For example, consider this made-up little example which follows a pattern I’ve seen all too often:

class MyObnoxiousUnitTest(TestBase):

    def setup(self):
        # Do a little work to set things up.  Maybe this is creating
        # a database connection, maybe clearing out a directory of
        # stale test results, etc.
        pass

    def test_something(self):
        # here's about 5-10 lines of code to set up some test data
        # ...
        # ...
        # ...
        # ...

        # and here's another 5-10 lines of code to verify the results
        # ...
        # ...
        # ...
        # ...

        # now let's have another 3-4 lines of code to tweak some
        # little thing
        # ...

        # and now another one or two lines to verify that
        # ...
        pass
Had enough? And that’s just one test… what about your next one?  I suspect it will look very much the same, and be documented just as well.  Except, you’ll copy-n-paste a little bit from the first setup block, and tweak it some so it looks similar without being quite the same.  The same for the next one… and the next…  And good luck if someone else wrote the tests in the first place.

Before long, you’ve got a test file which is hundreds of lines long with code which has been copy-pasted into existence, but none of which is documented or easy to follow.  So, now what happens when you want to add another test?  More copy-pasta?  Probably.  And the problem gets even worse.  No wonder everyone hates testing.


A Better Way

Fortunately, this isn’t the only way to write automated tests, and there are even a number of frameworks which can help (e.g., rspec in Ruby, mamba in Python, or mocha in JavaScript).  This “better” style of testing grew out of a movement called Behavior-Driven Development (BDD).¹

Let’s start with a pretty typical example testing a hypothetical CSV reader class², and then pick it apart:

with description("with a CSV file full of valid data") as self:

    with before.each:
        self.csv_doc = CsvDoc("my-test-data.csv")

    with it("should contain a list of headers"):
        self.csv_doc.headers.should.equal(["alpha", "bravo"])

    with it("should contain only elements of the right form"):
        for element in self.csv_doc.rows:
            sorted(element.keys()).should.equal(["alpha", "bravo"])

    with description("when the data is modified and re-read"):

        with before.each:
            self.csv_doc.append_row({"alpha": "a", "bravo": "b"})
            self.csv_doc.save("saved-data.csv")
            self.csv_doc2 = CsvDoc("saved-data.csv")

        with it("should have the same contents as the first doc"):
            self.csv_doc2.rows.should.equal(self.csv_doc.rows)

The first thing you’ll notice is that this approach is a lot more structured. There isn’t just one big test function with a bunch of code in it.  Instead, we have a definite pattern where we:

    1. Define, in English, what the state we’re testing is (i.e., with description)
    2. Write some code to make that state true (i.e., with before.each)
    3. State, in English, one specific thing which should be true now (i.e., with it)
    4. Write some code to verify that it really did happen.

You can see that exact pattern repeated several times in this example.  This makes the tests much easier to follow, and gives a great deal of built-in documentation as to exactly what conditions are being tested, and what the expected outcomes are—in plain English.

The second thing you’ll notice is that this pattern not only repeats, but becomes progressively more nested.  Each nesting means that any state which happened in the outer layers will also be applied to the inner layers.  So, in our final test case (i.e., the last with it statement) we get both of the with before.each statements run before our test.  This provides an exceptionally easy way to share state between individual tests, thus saving us the massively problematic copy-pasta in the more conventional approach.

Finally, the third thing you’ll probably notice is that each with it block is super short.  Since each one only has assertions, and each one only asserts a single condition (described with an English sentence), it really doesn’t need much code.  This makes the tests both extremely well-documented, and very easy to modify.  Stop for a moment, and think how eager you would be to add a missing test case at 16:45 on a Friday with each approach…


The important take-away here is that the architecture of your tests matters.  We’re often fed the line that test code is throw-away code, and therefore the same rules don’t apply as when writing “real” code.  This is a colossal mistake.  Written badly, your test code will massively slow down a development team, and be a major source of conflict among its members.  Written to the same standards as any other code, it can be fast to write, easy to change, and save you a ton of time and trouble.


¹ While BDD gets the credit for originating this mode of testing, it recommends going way, way beyond what I personally do or would recommend.

² I’m using Python along with the mamba and sure libraries in this example only because that’s what I happen to be working in these days.

Diving Deep on Coupling

Last time, I described how Cohesion applies to everything from writing a single line of code all the way up to designing a remote service. Now, let’s consider the same thing for Coupling.

Recall that Coupling is the mental load required to understand how a particular component relates to another component. If we take a line of code as a single component, then what defines how it is connected to the lines around it? For a start: the local variables it uses, the methods it calls, the conditional statements it is part of, the method it is contained in, and the exceptions it catches or throws. The more of these things a single line of code involves, the more coupled it is to the rest of the system.

As an example, consider a line of code which uses a few local variables to call a method and store the result. This could be more or less coupled depending upon a number of factors. How many local variables are needed? Are any of the variables static or global variables? Is the method call private to the class, a public method on another class, or a static method defined somewhere? Is the result being stored in a local variable, an instance variable, or a static/global variable? Depending upon the answers to these questions, that one line may be more or less coupled to the other lines around it.

The implication of having coupling which is too tight for a single line of code is that you have to understand a lot of other lines in order to understand that one. If it uses global variables, then you have to also understand what other code modifies the state of those variables. If it uses many local variables, then you have to understand the code which sets their values. If it calls a method on another object, then you have to understand what impact that method call will have. All of these things increase the amount of information you need to keep in mind to understand that line of code.

Now, consider what coupling would mean for a remote service which is part of a large distributed system. The connections such a service has are defined by the API it offers, the other services it consumes, and how their APIs are defined. For the service’s own API, consider the following: does the API respond to many service calls or just a few? Do the service calls require a lot of structured data to be passed in? How easy is it for a caller to obtain all the necessary information? How much is the service’s internal implementation attached to the API it presents? How common is the communication protocol clients must implement? For the other services it consumes, consider: how many other services does it use? How are their APIs defined (considering the questions above)? Just as with a single line of code, the answers to these questions will define how tightly coupled a service is to the rest of the system around it.

Having coupling which is too tight for a remote service carries troubles, too. Changes to downstream systems may force the service to need an update. Any change to the API may require upstream services to change as well. It may be impossible to change the service’s implementation if it is too tightly coupled to its own API. Finally, it may be difficult to break the service into separate services as it grows in scope. It can be a costly and painful mistake down the road to allow too much coupling between services in a distributed environment.

Diving Deep on Cohesion

The concepts of coupling and cohesion apply at all levels of programming, from writing a single method all the way up to planning the architecture of an entire distributed system. As you build each piece (an individual line of code, a method, an object, or an entire remote service), you have to make sure it has strong cohesion and loose coupling. Does the component do exactly one thing which is easy to describe and conceptualize? Does it have relatively few, easy-to-understand connections to the other components around it?

Consider what it means to have strong cohesion for a single line of code. To have good cohesion, it would need to produce a single clear outcome. On the other hand, a line of code with poor cohesion will tend to have multiple side effects, or calculate many values at once:

int balance = priorBalances[balanceIndex++] - withdrawals[withdrawalIndex++];

float gravitation = UNIVERSAL_G *
    (bodyA.mass * KG_PER_LB) *
    (bodyB.mass * KG_PER_LB) /
    ((bodyA.position.x - bodyB.position.x) *
    (bodyA.position.x - bodyB.position.x) +
    (bodyA.position.y - bodyB.position.y) *
    (bodyA.position.y - bodyB.position.y));

In both of these cases, the code is doing multiple things at once, and in order to understand what is going on, you have to mentally pull it apart, understand each piece, and then integrate them back together. Both of these lines can easily be re-written as several lines which each demonstrate much better cohesion:

int balance = priorBalances[balanceIndex] - withdrawals[withdrawalIndex];
balanceIndex++;
withdrawalIndex++;

float massA = bodyA.mass * KG_PER_LB;
float massB = bodyB.mass * KG_PER_LB;
float xRadiusPart = bodyA.position.x - bodyB.position.x;
float yRadiusPart = bodyA.position.y - bodyB.position.y;
float radiusSquared = xRadiusPart * xRadiusPart + yRadiusPart * yRadiusPart;
float gravitation = UNIVERSAL_G * massA * massB / radiusSquared;

Each of the re-written examples has statements which are simpler, easier to understand, and clearly accomplish a single result.

At the far other end of the size spectrum, consider what strong cohesion means for a single service in a massively distributed system (e.g., Amazon.com). In Amazon’s earliest days, there was a single, central piece of software, called Obidos, which was responsible for everything from presenting HTML, to calculating the cost of an order, to contacting the UPS server to find out where a package was. This ultimately resulted in a single program which constantly broke down, was impossible to understand fully, and actually took over a day to compile. The crux of the problem is that Obidos tried to do too much, and wound up with terrible cohesion. There was no way anyone could get their head around the essential functions it performed without dropping all kinds of important information.

That was many years ago, and since then, Amazon has considerably improved its situation. As an example, there is now a single service whose sole purpose is to compose the totals and subtotals for an order. It communicates with other services which each compute individual charges (e.g., shipping charges, tax, etc.), and all it does is put them together in the right order. This new service is much easier to understand, far easier to describe, and much, much easier to work with on a daily basis.


“Thank you” to Adam M. for pointing out an error in the code example!  It’s been fixed up now.

Coupling & Cohesion

In my previous post, I discussed how the mind is naturally limited in the number of things it can consider at once, and how we create abstractions to increase the range of our thinking. By using abstractions, we can hide the details of how something works, thereby allowing ourselves to handle more information and still only have to keep in mind a small number of discrete items. This concept is called unit economy.

A consequence of this limitation is that we naturally design complex systems by breaking them down into simpler pieces. If any one piece is still too complex to build, then we break that piece down even further. The act of breaking a system into pieces serves the same function in engineering that creating abstractions does in thinking. Both allow us to ignore the details of how a part of the system works, and just keep in mind the overall notion of what it does.

In order for this decomposition to work, however, we must follow two principles: coupling and cohesion.

Coupling is the extent to which two components are interconnected. This connection can be defined in terms of actual connections in the final design, but, for our purposes, consider it in terms of how much one has to know about one component in order to understand the function of the other. The crucial point is that coupling describes the mental load required to understand the relationship between the two components.

To take some examples in the physical world, consider a toaster and a gas stove. A toaster is loosely coupled to the rest of the kitchen. It has a single plug, which is an industry standard, and which is shared by nearly every other electrical appliance in the kitchen. On the other hand, a gas stove is tightly coupled to the rest of the kitchen. It requires a gas main, a vent to be installed above it, an exhaust pipe, and it must be mounted flush with the rest of the cabinetry. When installing a toaster, you simply have to find a flat surface near a plug. When installing a gas stove, you need to understand quite a bit about the structure of the whole kitchen. The mental effort required to understand how a toaster is connected to the rest of the kitchen is far less than that required for the stove.

Cohesion is the extent to which all the parts of a component serve a unified purpose. For the purposes of computing mental load, we measure this by how easily we can come up with a single sentence which describes the essence of what the component does, and by whether each part of the component is needed to accomplish that task. The crucial point in terms of unit economy is that we are able to come up with a simple abstraction for the component which allows us to ignore the details of how the component works.

For some examples of strong and weak cohesion, consider a television set and a Swiss Army knife. For the television set, the description of what it does is pretty simple: “A television set converts a TV signal into a visible picture”. On the other hand, describing a Swiss Army knife isn’t nearly so simple. Attempting to come up with a similar statement gets pretty awkward: “A Swiss Army knife is a multi-function device which provides the ability to conveniently store and reveal tools to: cut things in a variety of ways, drive screws of various kinds, etc.” When considering building some kind of system with these things, it’s much easier to keep in mind a simple definition (like the TV set) than a rambling, complex one (like the Swiss Army knife).

For further reading:

  • McConnell, Steve. “Code Complete: A Practical Handbook of Software Construction, 2nd Edition”. Chapter 2 (Amazon)