Fundamentals of Software Engineering - Nathaniel Schutta and Jakub Pilimon
From Coder to Engineer
With Early Release ebooks, you get books in their earliest form—the
author’s raw and unedited content as they write—so you can take
advantage of these technologies long before the official release of these
titles.
Fundamentals of Software Engineering
by Nathaniel Schutta and Jakub Pilimon
Copyright © 2023 O’Reilly Media. All rights reserved.
Printed in the United States of America.
Published by O’Reilly Media, Inc. , 1005 Gravenstein Highway North,
Sebastopol, CA 95472.
O’Reilly books may be purchased for educational, business, or sales
promotional use. Online editions are also available for most titles
(http://oreilly.com). For more information, contact our
corporate/institutional sales department: 800-998-9938.
Preface
Despite the way coding is taught, developers spend far more time reading
code than writing it. In most beginner coding courses, you jump
immediately into writing code, focusing on core language concepts and
idioms without acknowledging that you’d never learn Polish or Portuguese
in a similar manner. And while most academic projects start from a blank
slate, practicing developers are almost always working within the confines
of code that has taken years to arrive at its current state. While it may not be
your first choice, you will work with code you did not write. Take heart:
there are techniques to help you orient yourself in a new codebase.
TERMS MATTER
Reading old code is often the stuff of nightmares for most developers,
to the point that we often use the term legacy code to describe it. This is
usually not meant as a compliment. However, you shouldn’t disparage
the success of an existing application. If a product has delivered
business value for years and has justified continued investment, that is
worthy of a pat on the back. We prefer a more positive frame such as
heritage code or existing code. For a more in-depth discussion of the
topic, please see Chapter X.
First and foremost, you need to understand the domain, the problem
you are trying to solve. And the domains developers work in are very
demanding! Software is eating the world, meaning software engineers
are tasked with increasingly challenging business contexts; much
of the proverbial low-hanging fruit has already been picked. But that is
only one part of the problem when dealing with existing code.
Second, you must see the problem through the eyes of the developer
who came before you, and that is often the most challenging aspect of
software development.
Third, it’s possible the code isn’t the right abstraction. Perhaps it is
modeled in a way that is too generic or fails to capture the proper
nuance of the domain.
It also doesn’t help that you are almost always dealing with patches on top
of patches. Maybe the last developer didn’t have a full understanding of the
problem or they weren’t up to speed on some new language feature that
could greatly simplify the job at hand. Add in the typical demands to “fix it
fast,” and you could spend an afternoon deciphering a single method.
Cognitive Biases
Of course you don’t write bad code, do you? On more than one occasion
your humble authors have struggled with some code, uttering less polite
variations of “what idiot wrote this” only to discover that it was actually
written by none other than ourselves. And frankly, if you read code you
wrote a few years ago, you should be a little disappointed—that’s a sign of
growth; you know more today than you did then. That is a good thing!
You also have a couple of cognitive biases working against you when you
work with existing code. First is the IKEA effect, which says, in a nutshell,
that you place a higher value on things you create. One study found people
would pay 63% more for a product they successfully assembled themselves
versus the identical product put together by someone else. There are
actually several examples of companies profiting off the IKEA effect! If
you’ve ever picked your own strawberries or apples, you are often paying a
premium to, well, do some of the work yourself.
Additionally, there is the mere exposure effect: you tend to prefer the things
you are already familiar with. This leads to the typical dogmatism many
developers have around programming languages. Developers tend to think
time began with whatever language they learned first. When Java first
introduced lambda expressions, someone on a language-specific mailing
list asked why Java needed these “newfangled lambdas,” not realizing
lambdas are not a new concept in programming languages and were part of
the original plan for Java itself!
Developers can be very provincial around their preferred tools, which is
something Paul Graham touches on in his essay Beating the Averages.
Graham says programming languages exist on a power continuum, but you
often can’t recognize why a language is more powerful than another. To
demonstrate his point he introduces the hypothetical Blub language and a
very productive Blub programmer. When the Blub programmer looks down
the power continuum, all they see are languages that lack features they use
every day, and they can’t understand why anyone would choose such an
inferior tool. When they look up the power continuum all they see are a
bunch of weird features they don’t have in Blub, and they can’t imagine
why anyone would need those to be productive since they aren’t in Blub
and they are a very good Blub programmer!
Learning a new language takes time. How do you justify the investment it
takes to learn something you might not use daily at work? Learning a new
language will change how you code even if you don’t get to use the new
tool day in and day out. When you seek out a new language challenge, try
to pick something that is different from what you use at work. If you’re an
experienced Java developer, look beyond other C-like languages such as C#
(not that there’s anything wrong with learning C#, mind you) towards
different paradigms. Consider instead a dynamic language like Ruby or a
functional language like Haskell. Trust us, even just a cursory examination
of a language outside your normal neighborhood will fundamentally alter
your approach to programming. You may come to appreciate your regular
language more, or you may find yourself writing code in a different way.
Take time to learn new things.
Learning new languages gets easier over time. The more languages you
know, the more you have to compare to. Think back to the first language
you learned—you were starting at zero. By the third, fourth, fifth language,
you start to see how one idiom is just like another one from Java, or how
some structure was borrowed from Ruby.
With the Golden Rule in mind, write your documentation for those who will
come after you. Favor lightweight, low-ceremony approaches seeking to
answer common questions such as:
WARNING
Metrics Can Mislead
Code coverage (how much of the code base is executed when the tests are run) can be a
very useful metric on a project. However, there are no silver bullets in software, and it is
possible to fail even with 100% code coverage. A friend of ours joined a project that
was having regressions with every release. As he was getting up to speed on the code he
asked the tech lead if there were any tests. The tech lead very proudly said, “yes, we
have right around 92% code coverage.” Very impressed, our friend was somewhat
surprised they had so many regressions but he continued his analysis.
Looking at the test code he found some startling patterns. At first he thought these were
isolated, but eventually he discovered they were endemic to the code base. He went
back to the tech lead and said, “I couldn’t help but notice your tests don’t have any
asserts.” The tech lead responded by repeating the code coverage statistic.
The meta lesson is: be wary of any metric, because metrics can mislead. But don’t lose
sight of the value and purpose of a practice. If it is just about ceremony, you are unlikely
to get the benefit you expect. Project teams should regularly challenge themselves and
their approach; don’t be afraid to change course when warranted.
Software Archeology
Once you’ve surveyed the team and familiarized yourself with any existing
documentation, it is time to open your editor of choice and practice some
software archeology. Roll up your sleeves and root around in the code base!
To paraphrase Sir Isaac Newton, look for smoother pebbles and prettier
shells. Look at the code structure—how is the code organized? Some
languages have first class constructs for packaging code, others rely on
conventions. How does the code fit together? What domain concepts are
expressed in the code? Read the tests—what do they tell you about the
functionality?
Once you have your bearings, run the application. What does it do? Find a
specific element, be it something on a user interface or a parameter to a
service call, and map that back to the code. Hunt for a landmark; if you
know a given action results in an update to the datastore, find that in the
code. Use your debugger to walk through the code. Did it work the way you
anticipated? Did you end up on a vastly different code path? Ultimately,
you are building a mental model of the code; you are loading it into your
brain.
NOTE
“The goal of software design is to create chunks or slices that fit into a human mind.
The software keeps growing but the human mind maxes out, so we have to keep
chunking and slicing differently if we want to keep making changes.” —Kent Beck
Use your editor to navigate the code. Many editors make it very easy to
jump to methods in other classes as you work your way through the code.
Consider collapsing all the method bodies to give you a smaller surface area
to peruse (see Figure 1-1). Read the method names. What does that tell you
about the purpose of the module?
Figure 1-1. Collapse methods to orient yourself in the codebase
Do not assume the code does what the name implies. Naming things is hard
and, as code evolves, variable and method names may no longer reflect
reality. Don’t rush, confirm your hunches. It is tempting to cut corners, but
take your time.2 Exceptions can mislead. More than once we have
encountered exceptions that made incorrect assumptions about possible
error conditions.
Figure 1-2. Use your IDE to help you navigate code
Use your source code management tool as well. Many modern tools allow
you to quickly move about your project. Look at the change history of the
files. What changes frequently? What do the commit logs tell you about the
updates? Start with the most frequently modified classes, something git can
show you with a command like this:
git log --pretty=format: --since="1 year ago" --name-only --
"*.java" | sort | uniq -c | sort -rg | head -10
Who on your team made the most recent modification or the most frequent
changes? Don’t be afraid to reach out to your teammates with questions!
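If you would rather script this kind of change-frequency analysis than chain shell commands, the same idea can be sketched in Python. This is an illustrative sketch under stated assumptions: the function names `count_changes` and `most_changed_files` are made up for this example, and the second function assumes `git` is on your path and that `repo_dir` points at a git working tree.

```python
import subprocess
from collections import Counter


def count_changes(log_output):
    """Count how often each file path appears in `git log --name-only` output."""
    paths = (line.strip() for line in log_output.splitlines())
    # Blank lines separate commits in the output; skip them.
    return Counter(path for path in paths if path)


def most_changed_files(repo_dir=".", since="1 year ago", pattern="*.java", top=10):
    """Run git in repo_dir and return the `top` most frequently modified files."""
    result = subprocess.run(
        ["git", "log", "--pretty=format:", f"--since={since}",
         "--name-only", "--", pattern],
        cwd=repo_dir, capture_output=True, text=True, check=True,
    )
    return count_changes(result.stdout).most_common(top)
```

Calling `most_changed_files()` from the root of a repository returns a list of `(path, count)` pairs, most frequently changed first. Keeping the parsing in `count_changes` separate from the git invocation means the counting logic can be exercised with a plain string.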
Figure 1-3. Use your source code management tool to help you understand the code
Wrapping Up
Arguably, coding is taught backwards: you learn to write before you learn
to read, and yet you will spend a significant amount of your career reading
code written by someone else. While you may not enjoy existing code as
much as greenfield development, it comes with the paycheck. Rather than
run from the situation, learn to embrace it; there is much to gain
professionally. Be aware of cognitive biases. Don’t be afraid to roll up your
sleeves and root around in an unfamiliar codebase—you will learn
something. As your understanding grows, leave the code better than you
found it, easing the path of the next developer...which just might be you!
Additional Resources
Code as Design: Three Essays by Jack W. Reeves
Reading Code Is Harder Than Writing It
Reading Other People’s Code
How to quickly and effectively read other people’s code
How to read code without ripping your hair out
1 Or more commonly “Do unto others as you would have them do unto you.”
3 Though your organization’s lawyers likely have strong opinions about the use of such tools,
double check your corporate policies before you paste your pricing algorithm into one of them!
Chapter 2. Writing Code
Less is More
With the size of some code bases you might think developers are paid by
the character. While some problems genuinely require millions of lines to
solve, in most cases you should favor smaller code bases. The less code, the
less there is to load into people’s brains. Many projects reach the size where
it is no longer possible for one developer to actually understand all of the
code, which is one of the forces that has given rise to microservices and
functions as a service.
NOTE
“The goal of software design is to create chunks or slices that fit into a human mind.
The software keeps growing but the human mind maxes out, so we have to keep
chunking and slicing differently if we want to keep making changes.” —Kent Beck
The typical big balls of mud often have dictionary-sized Getting Started
guides and build processes that are measured in phases of the moon. It can
take developers new to the project weeks or months to get productive
within the code. The smaller the code base, the less time it takes for a new
developer to get their head wrapped around the code and the faster they can
start contributing. See Chapter X, Working With Heritage Code, for more
details.
THE 0TH LAW OF COMPUTER SCIENCE
Many of the practices software engineers espouse in an effort to tame
code boil down to the 0th Law of Computer Science: high cohesion,
low coupling. Cohesion is a measure of how things relate to one another.
High cohesion means essentially that like things are together. A
notification function that also contains print logic would be an example
of low cohesion. Coupling refers to the amount of interdependence
between modules or routines. Code with tight coupling can be difficult
to modify as changes to one part of the code unexpectedly affect other,
seemingly unrelated parts of the system. Changing the notification
service shouldn’t break the print module.
High cohesion and low coupling tend to result in code that is more
readable and simpler to maintain and evolve. Many patterns are ways of
achieving high cohesion and low coupling, often at different levels of
abstraction. At their best, arguably, microservices are high cohesion,
low coupling applied to services.
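As a rough illustration of the notification and print example, consider the sketch below. The `Printer` and `Notifier` classes are hypothetical and exist only to show the shape: each class holds one kind of logic (high cohesion), and the notifier depends only on a callable handed to it rather than on any concrete delivery or printing code (low coupling).

```python
class Printer:
    """Knows only how to render text for printing (hypothetical)."""

    def render(self, text):
        return f"[PRINT] {text}"


class Notifier:
    """Knows only how to build and hand off notifications (hypothetical)."""

    def __init__(self, send):
        # `send` is any callable taking a message; the notifier does not care
        # whether it emails, logs, or queues, which keeps coupling low.
        self._send = send

    def notify(self, user, event):
        self._send(f"{user}: {event}")


# Usage: collect notifications in a list instead of sending them anywhere.
sent = []
notifier = Notifier(sent.append)
notifier.notify("ada", "build finished")
```

Because neither class references the other, changing how notifications are delivered cannot break printing, and vice versa.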
Class Vehicle {
    Engine engine
    Number num_wheels
    Number num_doors
    Function brake {}
    Function accelerate {}
}
Class Truck < Vehicle {
    Number tow_capacity
}
Class ElectricVehicle < Vehicle {
    Number range
    # wait… EVs don’t have engines…
}
But along come electric vehicles with nary a combustion engine to be found. An EV is-a
vehicle, but not like those other vehicles. Composition is more flexible and should be
favored over inheritance. That isn’t to say you should never use inheritance, just that
you should prefer composition.
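One way the vehicle model might look with composition is sketched below in Python. This is an assumption-laden illustration rather than the authors' design: the class names `CombustionEngine` and `ElectricMotor`, the `power_source` field, and the `describe` methods are all invented for the example. The vehicle has-a power source instead of inheriting an engine.

```python
from dataclasses import dataclass


@dataclass
class CombustionEngine:
    displacement_liters: float

    def describe(self):
        return f"{self.displacement_liters}L combustion engine"


@dataclass
class ElectricMotor:
    range_km: int

    def describe(self):
        return f"electric motor, {self.range_km} km range"


@dataclass
class Vehicle:
    # A vehicle has-a power source; it no longer assumes an engine.
    power_source: object
    num_wheels: int = 4
    num_doors: int = 4

    def describe(self):
        return f"{self.num_wheels}-wheel vehicle with {self.power_source.describe()}"


truck = Vehicle(CombustionEngine(5.0), num_doors=2)
ev = Vehicle(ElectricMotor(400))
```

Adding a new kind of power source means writing one small class with a `describe` method; the `Vehicle` class and its existing users stay untouched.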
Write short methods, as in single-digit lines of code. Like a Linux or Unix
command line tool, methods should do one thing and only one thing, and
they should do it well, working in concert to accomplish larger goals. Any
method name that includes a conjunction (and, or, but) is a sign the method
is doing too much. Method names should be clear and concise and avoid
being clever. If you are having a hard time naming a method, it might be
doing too much. Try breaking it apart and see what happens. Be descriptive.
Remove logic duplication, even in small amounts. Simplify, then simplify
some more.
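To make the conjunction warning concrete, imagine a function named `validate_and_save_order`. The sketch below shows one way it might be broken apart; the order shape (a dict with `items` and `total`) and all function names are assumptions made for illustration.

```python
def validate_order(order):
    """Return True when the order has at least one item and a positive total."""
    return bool(order["items"]) and order["total"] > 0


def save_order(order, store):
    """Append the order to the given store (a plain list here)."""
    store.append(order)


def submit_order(order, store):
    """Compose the two single-purpose functions into the larger goal."""
    if not validate_order(order):
        return False
    save_order(order, store)
    return True
```

Each piece now has a name without a conjunction, fits in a few lines, and can be tested or replaced on its own.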
Write Code to be Read
There are any number of guidelines you can apply when it comes to the
“correct” length of a function from reuse to picking an often arbitrary
number of lines. Martin Fowler offers sage advice: mind the separation
between intent and implementation. In other words, how long does it take
for you to understand what a function is doing? You should be able to read
the name and understand immediately what the code does without
investigating the method body itself. This principle will often lead to
functions with only a few or even just a single line of code; make the
intention clear.
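A small sketch of separating intent from implementation follows. The invoice structure and the names `is_overdue` and `overdue_invoices` are hypothetical; the point is only that the one-line helper gives the condition a name, so the calling code reads as a statement of intent.

```python
from datetime import date


def is_overdue(invoice, today):
    """Intent: an invoice is overdue when it is unpaid past its due date."""
    return not invoice["paid"] and invoice["due"] < today


def overdue_invoices(invoices, today):
    # Reads like a sentence; the caller need not open is_overdue to follow it.
    return [inv for inv in invoices if is_overdue(inv, today)]
```

Inlining the condition would save a function definition, but every reader would then have to decode the boolean expression to learn what it means.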
Code can be written like a newspaper article with an “inverted pyramid”.
Articles start with the lead then move to key facts and then on to deeper
background. You as the reader can stop at whatever level of detail you wish
—maybe just the first couple of paragraphs, maybe all the way to the end.
As they say in publishing, don’t bury the lead.
Code should follow the same model. The class name is almost like the
headline of an article. From there you should be able to skim the method
signatures to get a general understanding. If you want to explore a function
or a call to another class you are free to do so, but you should still
understand the gist of the code.
NAMING THINGS IS HARD
Every developer has stared at their editor struggling to name a variable,
method or class. And while this struggle can indicate an insufficient
understanding of the problem or some overly complex code, there is a
reason foo, bar and foobar are such common occurrences. It can be time
consuming to come up with meaningful names. Don’t rush!3 It is worth
the effort to come up with good names. Don’t be afraid to reach out to a
teammate for their input. Sometimes just explaining what you’re
working on will be enough to inspire the perfect moniker.
Don’t hesitate to refactor poorly named code. Modern editors have
powerful tools that make renaming a straightforward endeavor. Just
make sure you aren’t constantly renaming core domain concepts,
because that usually indicates a problem of misconception.
It may also help to play the gibberish game. Often, the first word you
use to define a concept isn’t the best option but it may shape your
thinking about the problem domain. When you are first working
through the domain, make up a word! After you’ve done some
additional analysis, go ahead and replace the gibberish with real words;
you’ll likely have come up with something that’s very different, and
clearer, than your first reaction. The next time you’re really stuck on
what to call something, throw in some nonsense words and circle back.
// Dear maintainer:
//
// Once you are done trying to
// 'optimize' this routine,
// and have realized what a terrible
// mistake that was,
// please increment the following
// counter as a warning
// to the next person:
//
// total_hours_wasted_here = 42
Tests as Documentation
If comments aren’t an appropriate way to document code, what should you
do? Write tests. Tests, especially those written in more fluent styles, are
executable specifications that evolve along with the production code.
Documentation, whether code comments, readmes or specifications, tends
to diverge from the code as soon as it’s written. Tests written while you
write code allow you to refactor freely and increase your confidence in the
quality of your application. They also act as signposts for the developers
that follow you. Testing is explored in more depth in Chapter X Automated
Testing.
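As a brief sketch of tests acting as an executable specification, the example below uses Python's standard `unittest` module. The `ShoppingCart` class is hypothetical and stands in for production code; the test method names are written so that they read as statements about its behavior.

```python
import unittest


class ShoppingCart:
    """Hypothetical production class under test."""

    def __init__(self):
        self._items = []

    def add(self, name, price):
        self._items.append((name, price))

    def total(self):
        return sum(price for _, price in self._items)


class ShoppingCartSpec(unittest.TestCase):
    def test_a_new_cart_has_a_total_of_zero(self):
        self.assertEqual(ShoppingCart().total(), 0)

    def test_total_is_the_sum_of_item_prices(self):
        cart = ShoppingCart()
        cart.add("pen", 2)
        cart.add("notebook", 5)
        self.assertEqual(cart.total(), 7)
```

A developer new to the code can skim the test names to learn what the cart promises, and unlike a comment, a test that no longer matches the code fails loudly.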
Adding tests to existing code allows you to capture what you’re learning
about how the code works and unlocks your knowledge such that other
developers can benefit from your work. As you rename a method or
variable and prune some dead code, you’re actively leaving the code better
than you found it.
Some developers insist they need to write copious comments for those who
consume their services. While these comments are less smelly than those
mentioned earlier, they aren’t your only option. Again, tests make for a
more resilient documentation mechanism. Utilizing Consumer-Driven
Contracts allows you to convey what your service does while also giving
you the confidence to iterate as necessary. As long as you haven’t violated
the contract, you can evolve your code freed from the worry that you might
inadvertently break a downstream system. Consumers gain confidence in
your services, as they have a set of tests they can execute that simulates the
expected behavior of your code. They can modify their code without fear of
introducing a new defect.
Consumer-Driven Contracts are a vital part of reliable and resilient
software. Many languages and frameworks have projects you can (and
should!) leverage with your applications. From Spring Cloud Contract to
Pact versions for nearly every platform, you have options.
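The idea behind a consumer-driven contract can be shown with a hand-rolled sketch. To be clear, this is not the API of Pact or Spring Cloud Contract; `ORDER_CONTRACT`, `violations`, and `get_order` are hypothetical names, and real tools cover far more (HTTP verbs, status codes, contract publishing, and so on). The consumer states which fields it reads and what types it expects, and the provider checks its responses against that statement.

```python
# Hypothetical contract published by a consumer: the fields it reads
# from the provider's response and the types it expects.
ORDER_CONTRACT = {"id": int, "status": str, "total": float}


def violations(contract, response):
    """List the ways a provider response breaks the consumer's contract."""
    problems = []
    for field, expected_type in contract.items():
        if field not in response:
            problems.append(f"missing field: {field}")
        elif not isinstance(response[field], expected_type):
            problems.append(f"wrong type for {field}")
    return problems


def get_order(order_id):
    """Stand-in for the provider; extra fields do not break the contract."""
    return {"id": order_id, "status": "shipped", "total": 19.99, "carrier": "UPS"}
```

The provider is free to add fields such as `carrier` or to rework its internals; the check only fails when something the consumer relies on disappears or changes type.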
----
if (condition)
    doFoo();
    doBar(); // misleading indentation: without braces, doBar() always runs
----
Avoid the error-prone forms in your toolchain of choice. Just because you
clearly understand a construct does not guarantee that the knowledge is widely
distributed across your team. Don’t be afraid to update (or establish) coding
standards to cover these cases. There are a number of static analysis tools
you can add to your deployment pipeline to keep you and your team from
inadvertently introducing these types of problems. Take advantage of them.
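Every toolchain has its own error-prone forms. As one example from Python (chosen here for illustration; it is not discussed in the text above), a mutable default argument is shared across calls, which surprises many developers. The function names below are hypothetical.

```python
def add_tag_buggy(tag, tags=[]):
    # Error-prone: the default list is created once and shared across calls.
    tags.append(tag)
    return tags


def add_tag(tag, tags=None):
    # Safer form: create a fresh list on each call when none is supplied.
    if tags is None:
        tags = []
    tags.append(tag)
    return tags
```

Calling `add_tag_buggy("a")` and then `add_tag_buggy("b")` returns a list containing both tags the second time. Linters such as Pylint flag this pattern, which is the kind of check worth running automatically rather than relying on every reviewer to spot it.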
Code Reviews
Code reviews can vary from asking a colleague for feedback on a method
all the way to hours-long walkthroughs with several developers. Regardless
of the specific implementation details, code reviews are an excellent way to
learn, share experience, and socialize knowledge. More eyes on code is a
good thing and part of the reason some organizations use pair programming.
Whether formal or not, there are certain practices that can improve your
code review practice. First and foremost, don’t be snarky. Avoid sarcasm.
Asking for feedback can be very stressful for people, and many people take
criticism personally. How you share your comments is critical. Be
empathetic to your teammate. While it may be tempting to use a code
review as an opportunity to drop some esoteric bit of trivia on your team,
the goal is to improve the code, not exhibit your technical expertise.
NOTE
“…the only way to make something great is to recognize that it might not be great yet.
Your goal is to find the best solution, not to measure your personal self-worth by it.” —
Jonas Downey
Focus your attention on the most important things. While style points
matter, your effort is better spent lower down on the Code Review Pyramid.
You can (and should) automate formatting and style-related issues; let a
computer handle those. Your time and effort should be spent on the things
computers can’t detect for you. Are method and variable names clear and
concise? Is the code readable? Is there duplication? Does the code have the
proper logging, tracing and metrics? Are interfaces consistent with the rest
of the code? Did the developer use any error-prone forms?
It is Hard to be Criticized
Developers often invest a lot of themselves into their work, so sometimes it
is hard not to take feedback personally. Code reviews are not an opportunity
to embarrass someone because they didn’t know about some new language
feature or didn’t immediately see a simpler way to solve a problem. No one
is perfect; everyone makes mistakes. Code reviews are about building better
applications and are about the code, not the coder. Don’t get personal in a
code review. Be humble and ask helpful questions. Critiques are more
digestible when they are sandwiched by compliments, so be sure to point
out the good things too.
Share your experiences. Personal stories carry immense weight and defuse
people’s natural resistance to change. Offer assistance with things you’ve
encountered on previous projects. Be careful with blanket proclamations.
Make sure you have all the details before you pronounce something won’t
work, as you may be missing a key bit of context. Is there some background
you don’t have? Perhaps there are some constraints you aren’t aware of.
Stick to the code.
NOTE
“Every single one of us is doing the absolute best we can given our state of
consciousness.” —Deepak Chopra
Fostering Trust
If you shouldn’t do checkboxes or LGTM reviews, what should you do?
Regardless of your code review process, don’t lose sight of the purpose of
code reviews. You should be sharing experiences, learning and growing as a
team while avoiding problematic practices. Code reviews should foster
collective code ownership and trust amongst the team. Promoting a
bug of the week or just taking the time to share something you ran into can
be incredibly powerful.
It should go without saying, but treat your teammates with respect. Be kind,
do what’s right, do what works. Don’t be afraid to take a moment to review
your approach and ask if there might be a better way. Whether you’re
following an agile development methodology or not, you should adapt and
adjust on a regular basis. If something isn’t working, change it!
Wrapping Up
Developers write code; it is part and parcel of the job. Avoiding error-prone
forms and overly clever code can be the difference between a codebase
that’s a pleasure to work on and one that developers avoid like the plague.
From code reviews to analysis tools, there are many ways to help you write
better code. Favor writing tests over copious comments. When critiquing
code, be empathetic. Never forget, code should be written to be read by
humans; adhering to that principle goes a long way towards ensuring you
write code others will stamp with the elusive “good” label!
Additional Resources
Portrait of a Noob
Favor composition over inheritance
The Mythical Man-Month: Essays on Software Engineering
An Appropriate Use of Metrics
Simple Made Easy
1 Like on the freeway when people notice the state trooper in the median and everyone slows
down
2 Though with modern monitor sizes you may want to stick to a single page.
4 You can often tell the experience level of a developer by their use (or avoidance) of code
comments.
5 Dogmatism, whether in software or life, is rarely the right path. Favor pragmatism.
6 One of your authors, who shall remain nameless in this instance, spent the better part of two
weeks debugging that code block.
7 Every developer has stared at some code wondering what idiot wrote this only to slowly
realize they actually were the “idiot” that wrote said code.
8 With the exception of certain, very specialized programming examples. You’ll know it when
you encounter it.