Refactoring Booklet
Message Chains (81): Hide Delegate (189), Extract Function (106), Move Function (198)
Middle Man (81): Remove Middle Man (192), Inline Function (115), Replace Superclass with Delegate (399), Replace Subclass with Delegate (381)
Mutable Data (75): Encapsulate Variable (132), Split Variable (240), Slide Statements (223), Extract Function (106), Separate Query from Modifier (306), Remove Setting Method (331), Replace Derived Variable with Query (248), Combine Functions into Class (144), Combine Functions into Transform (149), Change Reference to Value (252)
Repeated Switches (79): Replace Conditional with Polymorphism (272)
Shotgun Surgery (76): Move Function (198), Move Field (207), Combine Functions into Class (144), Combine Functions into Transform (149), Split Phase (154), Inline Function (115), Inline Class (186)
Speculative Generality (80): Collapse Hierarchy (380), Inline Function (115), Inline Class (186), Change Function Declaration (124), Remove Dead Code (237)
Temporary Field (80): Extract Class (182), Move Function (198), Introduce Special Case (289)
Data Clumps (78): Extract Class (182), Introduce Parameter Object (140), Preserve Whole Object (319)
Divergent Change (76): Split Phase (154), Move Function (198), Extract Function (106), Extract Class (182)
Duplicated Code (72): Extract Function (106), Slide Statements (223), Pull Up Method (350)
Insider Trading (82): Move Function (198), Move Field (207), Hide Delegate (189), Replace Subclass with Delegate (381), Replace Superclass with Delegate (399)
Large Class (82): Extract Class (182), Extract Superclass (375), Replace Type Code with Subclasses (362)
Lazy Element (80): Inline Function (115), Inline Class (186), Collapse Hierarchy (380)
Long Function (73): Extract Function (106), Replace Temp with Query (178), Introduce Parameter Object (140), Preserve Whole Object (319), Replace Function with Command (337), Decompose Conditional (260), Replace Conditional with Polymorphism (272), Split Loop (227)
Long Parameter List (74): Replace Parameter with Query (324), Preserve Whole Object (319), Introduce Parameter Object (140), Remove Flag Argument (314), Combine Functions into Class (144)

Books in the Martin Fowler Signature Series are personally chosen by Fowler, and his signature ensures that he has worked closely with authors to define topic coverage, book scope, critical content, and overall uniqueness. The expert signatures also symbolize a promise to our readers: you are reading a future classic.

Second Edition

Martin Fowler
with contributions by Kent Beck

Boston • Columbus • New York • San Francisco • Amsterdam • Cape Town
Dubai • London • Madrid • Milan • Munich • Paris • Montreal • Toronto • Delhi • Mexico City
São Paulo • Sydney • Hong Kong • Seoul • Singapore • Taipei • Tokyo
Many of the designations used by manufacturers and sellers to distinguish their products are claimed as trademarks. Where those designations appear in this book, and the publisher was aware of a trademark claim, the designations have been printed with initial capital letters or in all capitals.

The author and publisher have taken care in the preparation of this book, but make no expressed or implied warranty of any kind and assume no responsibility for errors or omissions. No liability is assumed for incidental or consequential damages in connection with or arising out of the use of the information or programs contained herein.

For information about buying this title in bulk quantities, or for special sales opportunities (which may include electronic versions; custom cover designs; and content particular to your business, training goals, marketing focus, or branding interests), please contact our corporate sales department at [email protected] or (800) 382–3419.

For government sales inquiries, please contact [email protected].

For questions about sales outside the U.S., please contact [email protected].

Visit us on the Web: informit.com/aw

Library of Congress Control Number: 2018950015

Copyright © 2019 Pearson Education, Inc.

Cover photo: Waldo-Hancock Bridge & Penobscot Narrows Bridge by Martin Fowler
Lightbulb graphic: Irina Adamovich/Shutterstock

All rights reserved. Printed in the United States of America. This publication is protected by copyright, and permission must be obtained from the publisher prior to any prohibited reproduction, storage in a retrieval system, or transmission in any form or by any means, electronic, mechanical, photocopying, recording, or likewise. For information regarding permissions, request forms and the appropriate contacts within the Pearson Education Global Rights & Permissions Department, please visit www.pearsoned.com/permissions/.

ISBN-13: 978-0-13-475759-9
ISBN-10: 0-13-475759-9
1 18

For Cindy
—Martin
Contents

Chapter 3: Bad Smells in Code
  Mysterious Name
  Duplicated Code
  Long Function
  Long Parameter List
  Global Data
  Mutable Data
  Divergent Change
  Shotgun Surgery
  Feature Envy
  Data Clumps
  Primitive Obsession
  Repeated Switches
  Loops
  Lazy Element
  Speculative Generality
  Temporary Field
  Message Chains
  Middle Man
  Insider Trading
  Large Class
  Alternative Classes with Different Interfaces
  Data Class
  Refused Bequest
  Comments
Chapter 4: Building Tests
  The Value of Self-Testing Code
  Sample Code to Test
  A First Test
  Add Another Test
  Modifying the Fixture
  Probing the Boundaries
  Much More Than This
Chapter 5: Introducing the Catalog
  Format of the Refactorings
  The Choice of Refactorings
Chapter 6: A First Set of Refactorings
  Extract Function
  Inline Function
  Extract Variable
  Inline Variable
  Change Function Declaration
  Encapsulate Variable
  Rename Variable
  Introduce Parameter Object
  Combine Functions into Class
  Combine Functions into Transform
  Split Phase
Chapter 7: Encapsulation
  Encapsulate Record
  Encapsulate Collection
  Replace Primitive with Object
  Replace Temp with Query
  Extract Class
  Inline Class
  Hide Delegate
  Remove Middle Man
  Substitute Algorithm
Chapter 8: Moving Features
  Move Function
  Move Field
  Move Statements into Function
  Move Statements to Callers
  Replace Inline Code with Function Call
  Slide Statements
  Split Loop
  Replace Loop with Pipeline
  Remove Dead Code
Chapter 9: Organizing Data
  Split Variable
  Rename Field
  Replace Derived Variable with Query
  Change Reference to Value
  Change Value to Reference
Chapter 10: Simplifying Conditional Logic
  Decompose Conditional
  Consolidate Conditional Expression
  Replace Nested Conditional with Guard Clauses
  Replace Conditional with Polymorphism
  Introduce Special Case
  Introduce Assertion
Chapter 11: Refactoring APIs
  Separate Query from Modifier
  Parameterize Function
  Remove Flag Argument
  Preserve Whole Object
  Replace Parameter with Query
  Replace Query with Parameter
  Remove Setting Method
  Replace Constructor with Factory Function
  Replace Function with Command
  Replace Command with Function
Chapter 12: Dealing with Inheritance
  Pull Up Method
  Pull Up Field
  Pull Up Constructor Body
  Push Down Method
  Push Down Field
  Replace Type Code with Subclasses
  Remove Subclass
  Extract Superclass
  Collapse Hierarchy
  Replace Subclass with Delegate
  Replace Superclass with Delegate
Bibliography
Index
Foreword to the First Edition

your design. You will quickly add these refactorings and their names to your development vocabulary.

My first experience with disciplined, “one step at a time” refactoring was when I was pair-programming at 30,000 feet with Kent Beck. He made sure that we applied refactorings from this book’s catalog one step at a time. I was amazed at how well this practice worked. Not only did my confidence in the resulting code increase, I also felt less stressed. I highly recommend you try these refactorings: You and your code will feel much better for it.

— Erich Gamma, Object Technology International, Inc.
January 1999
Preface

for a week or two. All this was to make the code look better, not to make it do anything it didn’t already do.

How do you feel about this story? Do you think the consultant was right to suggest further cleanup? Or do you follow that old engineering adage, “if it works, don’t fix it”?

I must admit to some bias here. I was that consultant. Six months later, the project failed, in large part because the code was too complex to debug or tune to acceptable performance.

The consultant Kent Beck was brought in to restart the project—an exercise that involved rewriting almost the whole system from scratch. He did several things differently, but one of the most important changes was to insist on continuous cleaning up of the code using refactoring. The improved effectiveness of the team, and the role refactoring played, is what inspired me to write the first edition of this book—so I could pass on the knowledge that Kent and others have acquired by using refactoring to improve the quality of software.

Since then, refactoring has become an accepted part of the vocabulary of programming. And the original book has stood up rather well. However, eighteen years is an old age for a programming book, so I felt it was time to go back and rework it. Doing this had me rewrite pretty much every page in the book. But, in a sense, very little has changed. The essence of refactoring is the same; most of the key refactorings remain essentially the same. But I do hope that the rewriting will help more people learn how to do refactoring effectively.

What Is Refactoring?

Refactoring is the process of changing a software system in a way that does not
alter the external behavior of the code yet improves its internal structure. It is a
disciplined way to clean up code that minimizes the chances of introducing bugs.
In essence, when you refactor, you are improving the design of the code after it
has been written.

“Improving the design after it has been written.” That’s an odd turn of phrase.
For much of the history of software development, most people believed that we
design first, and only when done with design should we code. Over time, the
code will be modified, and the integrity of the system (its structure according to
that design) gradually fades. The code slowly sinks from engineering to hacking.

Refactoring is the opposite of this practice. With refactoring, we can take a
bad, even chaotic, design and rework it into well-structured code. Each step is
simple, even simplistic. I move a field from one class to another, pull some code
out of a method to make it into its own method, or push some code up or down
a hierarchy. Yet the cumulative effect of these small changes can radically improve
the design. It is the exact reverse of the notion of software decay.
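
One of the small steps mentioned here, pulling code out of a function into its own function, can be sketched as follows. This is only an illustration: the function names and the shape of the invoice object are hypothetical and do not come from the book. The point is that the "before" and "after" versions return exactly the same string, so external behavior is unchanged while the structure improves.

```javascript
// Hypothetical example: summaryBefore, summaryAfter, and the invoice fields
// are illustrative names, not taken from the text.

// Before: one function mixes the banner, the calculation, and the formatting.
function summaryBefore(invoice) {
  const lines = [];
  lines.push("*** Customer Owes ***");
  let outstanding = 0;
  for (const order of invoice.orders) outstanding += order.amount;
  lines.push(`name: ${invoice.customer}`);
  lines.push(`amount: ${outstanding}`);
  return lines.join("\n");
}

// After: the calculation and the banner each live in their own function.
function bannerLine() {
  return "*** Customer Owes ***";
}

function calculateOutstanding(invoice) {
  return invoice.orders.reduce((sum, order) => sum + order.amount, 0);
}

function summaryAfter(invoice) {
  return [
    bannerLine(),
    `name: ${invoice.customer}`,
    `amount: ${calculateOutstanding(invoice)}`,
  ].join("\n");
}

const invoice = { customer: "BigCo", orders: [{ amount: 10 }, { amount: 32 }] };

// The observable behavior is identical before and after the change.
console.log(summaryBefore(invoice) === summaryAfter(invoice)); // prints true
```

Comparing the two outputs, as the last line does, is a crude stand-in for the tests that make such a step safe.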

With refactoring, the balance of work changes. I found that design, rather than
occurring all up front, occurs continuously during development. As I build the
system, I learn how to improve the design. The result of this interaction is a
program whose design stays good as development continues.

What’s in This Book?

This book is a guide to refactoring; it is written for a professional programmer.
My aim is to show you how to do refactoring in a controlled and efficient manner.
You will learn to refactor in such a way that you don’t introduce bugs into the
code but methodically improve its structure.

Traditionally, a book starts with an introduction. I agree with that in principle,
but I find it hard to introduce refactoring with a generalized discussion or
definitions, so I start with an example. Chapter 1 takes a small program with
some common design flaws and refactors it into a program that’s easier to
understand and change. This will show you both the process of refactoring and a
number of useful refactorings. This is the key chapter to read if you want to
understand what refactoring really is about.

In Chapter 2, I cover more of the general principles of refactoring, some
definitions, and the reasons for doing refactoring. I outline some of the challenges
with refactoring. In Chapter 3, Kent Beck helps me describe how to find bad
smells in code and how to clean them up with refactorings. Testing plays a very
important role in refactoring, so Chapter 4 describes how to build tests into code.

The heart of the book, the catalog of refactorings, takes up the rest of its
volume. While this is by no means a comprehensive catalog, it covers the key
refactorings that most developers will likely need. It grew from the notes I made
when learning about refactoring in the late 1990s, and I still use these notes now
as I don’t remember them all. When I want to do something, such as Split Phase
(154), the catalog reminds me how to do it in a safe, step-by-step manner. I hope
this is the part of the book that you’ll come back to often.

A Web-First Book

The World-Wide Web has made an enormous impact on our society, particularly
affecting how we gather information. When I wrote this book, most of the
knowledge about software development was transferred through print. Now I
gather most of my information online. This has presented a challenge for authors
like myself: Is there still a role for books, and what should they look like?

I believe there is still a role for books like this, but they need to change. The
value of a book is a large body of knowledge put together in a cohesive fashion.
In writing this book, I tried to cover many different refactorings and organize
them in a consistent and integrated manner.

But that integrated whole is an abstract literary work that, while traditionally
represented by a paper book, need not be in the future. Most of the book industry
still sees the paper book as the primary representation, and while we’ve
enthusiastically adopted ebooks, they are just electronic representations of an original
work based on the structure of a paper book.

With this book, I’m exploring a different approach. The canonical form of this
book is its web site or web edition. Access to the web edition is included with
the purchase of the print or ebook versions. (See note below about registering
your product on InformIT.) The paper book is a selection of material from the
web site, arranged in a manner that makes sense for print. It doesn’t attempt to
include all the refactorings on the web site, particularly since I may well add
more refactorings to the canonical web edition in the future. Similarly, the ebook
is a different representation of the web book that may not include the same set
of refactorings as the printed book. After all, ebooks don’t get heavy as I add
pages and they can be easily updated after they are bought.

I don’t know whether you’re reading the web edition online, an ebook on your
phone, a paper copy, or some other form I can’t imagine as I write this. I do my
best to make this a useful work, whatever way you wish to absorb it.

For access to the canonical web edition and updates or corrections as they
become available, register your copy of Refactoring, Second Edition, on the InformIT
site. To start the registration process, go to informit.com/register and log in (or create
an account if you don’t have one). Enter the ISBN 9780134757599 and click
Submit. You will be asked a challenge question, so be sure to have your copy of
the print or ebook available. After you’ve successfully registered your copy,
open the “Digital Purchases” tab on your Account page and click on the link under
this title to “Launch” the web edition.

JavaScript Examples

As in most technical areas of software development, code examples are very
important to illustrate the concepts. However, the refactorings look mostly the same
in different languages. There will sometimes be particular things that a language
forces me to pay attention to, but the core elements of the refactorings remain
the same.

I chose JavaScript to illustrate these refactorings, as I felt that this language
would be readable by the largest number of people. You shouldn’t find it difficult,
however, to adapt the refactorings to whatever language you are currently using.
I try not to use any of the more complicated bits of the language, so you should
be able to follow the refactorings with only a cursory knowledge of JavaScript.
My use of JavaScript is certainly not an endorsement of the language.

Although I use JavaScript for my examples, that doesn’t mean the techniques
in this book are confined to JavaScript. The first edition of this book used Java,
and many programmers found it useful even though they never wrote a single
Java class. I did toy with illustrating this generality by using a dozen different
languages for the examples, but I felt that would be too confusing for the reader.
Still, this book is written for programmers in any language. Outside of the example
sections, I’m not making any assumptions about the language. I expect the
reader to absorb my general comments and apply them to the language they
are using. Indeed, I expect readers to take the JavaScript examples and adapt
them to their language.
This means that, apart from discussing specific examples, when I talk about
“class,” “module,” “function,” etc., I use those terms in the general programming
meaning, not as specific terms of the JavaScript language model.
The fact that I’m using JavaScript as the example language also means that I
try to avoid JavaScript styles that will be less familiar to those who aren’t regular
JavaScript programmers. This is not a “refactoring in JavaScript” book—rather,
it’s a general refactoring book that happens to use JavaScript. There are many
interesting refactorings that are specific to JavaScript (such as refactoring from
callbacks, to promises, to async/await) but they are out of scope for this book.
If you want to understand what refactoring is, read Chapter 1—the example
should make the process clear.
If you want to understand why you should refactor, read the first two
chapters. They will tell you what refactoring is and why you should do it.
If you want to find where you should refactor, read Chapter 3. It tells you
the signs that suggest the need for refactoring.
If you want to actually do refactoring, read the first four chapters completely,
then skip-read the catalog. Read enough of the catalog to know, roughly,
what is in there. You don’t have to understand all the details. When you
actually need to carry out a refactoring, read the refactoring in detail and
use it to help you. The catalog is a reference section, so you probably won’t
want to read it in one go.

An important part of writing this book was naming the various refactorings.
Terminology helps us communicate, so that when one developer advises another
to extract some code into a function, or to split some computation into separate
phases, both understand the references to Extract Function (106) and Split Phase
(154). This vocabulary also helps in selecting automated refactorings.

Building on a Foundation Laid by Others

I need to say right at the beginning that I owe a big debt with this book: a debt
to those whose work in the 1990s developed the field of refactoring. It was
learning from their experience that inspired and informed me to write the first
edition of this book, and although many years have passed, it’s important that I
continue to acknowledge the foundation that they laid. Ideally, one of them
should have written that first edition, but I ended up being the one with the time
and energy.

Two of the leading early proponents of refactoring were Ward Cunningham
and Kent Beck. They used it as a foundation of development in the early
days and adapted their development processes to take advantage of it. In
particular, it was my collaboration with Kent that showed me the importance of
refactoring, an inspiration that led directly to this book.

Ralph Johnson leads a group at the University of Illinois at Urbana-Champaign
that is notable for its practical contributions to object technology. Ralph has long
been a champion of refactoring, and several of his students did vital early work
in this field. Bill Opdyke developed the first detailed written work on
refactoring in his doctoral thesis. John Brant and Don Roberts went beyond writing
words: they created the first automated refactoring tool, the Refactoring Browser,
for refactoring Smalltalk programs.

Many people have advanced the field of refactoring since the first edition of
this book. In particular, the work of those who have added automated refactorings
to development tools has contributed enormously to making programmers’ lives
easier. It’s easy for me to take it for granted that I can rename a widely used
function with a simple key sequence, but that ease relies on the efforts of IDE
teams whose work helps us all.

Bibliography

[mf-ua] Martin Fowler. “Bliki: UniformAccessPrinciple.”
https://martinfowler.com/bliki/UniformAccessPrinciple.html.
[mf-vo] Martin Fowler. “Bliki: ValueObject.” https://martinfowler.com/bliki/ValueObject.html.
[mf-xp] Martin Fowler. “Bliki: ExtremeProgramming.”
https://martinfowler.com/bliki/ExtremeProgramming.html.
[mf-xunit] Martin Fowler. “Bliki: Xunit.” https://martinfowler.com/bliki/Xunit.html.
[mf-yagni] Martin Fowler. “Bliki: Yagni.” https://martinfowler.com/bliki/Yagni.html.
[mocha] https://mochajs.org.
[Opdyke] William F. Opdyke. “Refactoring Object-Oriented Frameworks.” Doctoral
Dissertation. University of Illinois at Urbana-Champaign, 1992.
http://www.laputan.org/pub/papers/opdyke-thesis.pdf.
[Parnas] D. L. Parnas. “On the Criteria to Be Used in Decomposing Systems into Modules.”
In: Communications of the ACM, Volume 15 Issue 12, pp. 1053–1058. Dec. 1972.
[ref.com] https://refactoring.com.
[Wake] William C. Wake. Refactoring Workbook. Addison-Wesley, 2003. ISBN 0321109295.
[wake-swap] Bill Wake. “The Swap Statement Refactoring.”
https://www.industriallogic.com/blog/swap-statement-refactoring/.

Chapter 12 Dealing with Inheritance

class Scroll…
  constructor(id, dateLastCleaned, catalogID, catalog) {
    this._id = id;
    this._catalogItem = catalog.get(catalogID);
    this._lastCleaned = dateLastCleaned;
  }

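
Putting the pieces of this scroll example together, a minimal runnable sketch of the end state might look like the following. It rests on assumptions that are not in the text: the catalog repository is modeled as a plain `Map` keyed by catalog ID (the text only says it supplies catalog items indexed by an ID), dates are ordinary JavaScript `Date` objects rather than `LocalDate` values, the title and tags parameters have already been dropped from the constructor, and the sample data is made up.

```javascript
// Sketch only. Assumptions not in the text: the catalog is a Map keyed by
// catalog ID, dates are JavaScript Date objects, and the data is invented.
class CatalogItem {
  constructor(id, title, tags) {
    this._id = id;
    this._title = title;
    this._tags = tags;
  }
  get id() { return this._id; }
  get title() { return this._title; }
  hasTag(arg) { return this._tags.includes(arg); }
}

class Scroll {
  constructor(id, dateLastCleaned, catalogID, catalog) {
    this._id = id;
    this._catalogItem = catalog.get(catalogID); // shared reference, not a copy
    this._lastCleaned = dateLastCleaned;
  }
  get id() { return this._id; }

  // Forwarding methods: the scroll delegates to its catalog item.
  get title() { return this._catalogItem.title; }
  hasTag(aString) { return this._catalogItem.hasTag(aString); }

  needsCleaning(targetDate) {
    const threshold = this.hasTag("revered") ? 700 : 1500;
    return this.daysSinceLastCleaning(targetDate) > threshold;
  }
  daysSinceLastCleaning(targetDate) {
    const msPerDay = 24 * 60 * 60 * 1000;
    return Math.floor((targetDate - this._lastCleaned) / msPerDay);
  }
}

// Hypothetical repository with a single catalog item.
const catalog = new Map([
  ["gs-1", new CatalogItem("gs-1", "Treating Greyscale", ["medicine", "revered"])],
]);

// Two physical copies of the same writing share one catalog item.
const copyA = new Scroll(1, new Date(Date.UTC(2016, 0, 1)), "gs-1", catalog);
const copyB = new Scroll(2, new Date(Date.UTC(2018, 5, 1)), "gs-1", catalog);
const today = new Date(Date.UTC(2019, 0, 1));

console.log(copyA.title === copyB.title); // prints true
console.log(copyA.needsCleaning(today));  // prints true  (1096 days > 700)
console.log(copyB.needsCleaning(today));  // prints false (214 days)
```

Because both scrolls hold a reference to the same catalog item, an update to the title or tags in the catalog is visible through every copy, which is the property the inheritance-based model could not offer.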
How do I begin to talk about refactoring? The traditional way is by introducing
the history of the subject, broad principles, and the like. When somebody does
that at a conference, I get slightly sleepy. My mind starts wandering, with a
low-priority background process polling the speaker until they give an example.
The examples wake me up because I can see what is going on. With principles,
it is too easy to make broad generalizations—and too hard to figure out how to
apply things. An example helps make things clear.
So I’m going to start this book with an example of refactoring. I’ll talk about
how refactoring works and will give you a sense of the refactoring process. I can
then do the usual principles-style introduction in the next chapter.
With any introductory example, however, I run into a problem. If I pick a large
program, describing it and how it is refactored is too complicated for a mortal
reader to work through. (I tried this with the original book—and ended up
throwing away two examples, which were still pretty small but took over a hun-
dred pages each to describe.) However, if I pick a program that is small enough
to be comprehensible, refactoring does not look like it is worthwhile.
I’m thus in the classic bind of anyone who wants to describe techniques that
are useful for real-world programs. Frankly, it is not worth the effort to do all
the refactoring that I’m going to show you on the small program I will be using.
But if the code I’m showing you is part of a larger system, then the refactoring
becomes important. Just look at my example and imagine it in the context of a
much larger system.

Replace Superclass with Delegate

Example

I recently was consulting for an old town’s library of ancient scrolls. They keep
details of their scrolls in a catalog. Each scroll has an ID number and records its
title and list of tags.

class CatalogItem…
  constructor(id, title, tags) {
    this._id = id;
    this._title = title;
    this._tags = tags;
  }

  get id() {return this._id;}
  get title() {return this._title;}
  hasTag(arg) {return this._tags.includes(arg);}

One of the things that scrolls need is regular cleaning. The code for that uses
the catalog item and extends it with the data it needs for cleaning.

class Scroll extends CatalogItem…
  constructor(id, title, tags, dateLastCleaned) {
    super(id, title, tags);
    this._lastCleaned = dateLastCleaned;
  }

  needsCleaning(targetDate) {
    const threshold = this.hasTag("revered") ? 700 : 1500;
    return this.daysSinceLastCleaning(targetDate) > threshold;
  }
  daysSinceLastCleaning(targetDate) {
    return this._lastCleaned.until(targetDate, ChronoUnit.DAYS);
  }

This is an example of a common modeling error. There is a difference
between the physical scroll and the catalog item. The scroll describing the
treatment for the greyscale disease may have several copies, but be just one item in
the catalog.

In many situations, I can get away with an error like this. I can think of the title
and tags as copies of data in the catalog. Should this data never change, I can
get away with this representation. But if I need to update either, I must be careful
to ensure that all copies of the same catalog item are updated correctly.

Even without this issue, I’d still want to change the relationship. Using catalog
item as a superclass to scroll is likely to confuse programmers in the future, and
is thus a poor model to work with.

I begin by creating a property in Scroll that refers to the catalog item, initializing
it with a new instance.

class Scroll extends CatalogItem…
  constructor(id, title, tags, dateLastCleaned) {
    super(id, title, tags);
    this._catalogItem = new CatalogItem(id, title, tags);
    this._lastCleaned = dateLastCleaned;
  }

I create forwarding methods for each element of the superclass that I use on
the subclass.

class Scroll…
  get id() {return this._catalogItem.id;}
  get title() {return this._catalogItem.title;}
  hasTag(aString) {return this._catalogItem.hasTag(aString);}

I remove the inheritance link to the catalog item.

class Scroll…
  constructor(id, title, tags, dateLastCleaned) {
    this._catalogItem = new CatalogItem(id, title, tags);
    this._lastCleaned = dateLastCleaned;
  }

Breaking the inheritance link finishes the basic Replace Superclass with Delegate
refactoring, but there is something more I need to do in this case.

The refactoring shifts the role of the catalog item to that of a component of
scroll; each scroll contains a unique instance of a catalog item. In many cases
where I do this refactoring, this is enough. However, in this situation a better
model is to link the greyscale catalog item to the six scrolls in the library that
are copies of that writing. Doing this is, essentially, Change Value to Reference (256).

There’s a problem that I have to fix, however, before I use Change Value to
Reference (256). In the original inheritance structure, the scroll used the catalog
item’s ID field to store its ID. But if I treat the catalog item as a reference, it needs
to use that ID for the catalog item ID rather than the scroll ID. This means I
need to create an ID field on scroll and use that instead of one in catalog item.
It’s a sort-of move, sort-of split.

class Scroll…
  constructor(id, title, tags, dateLastCleaned) {
    this._id = id;
    this._catalogItem = new CatalogItem(null, title, tags);
    this._lastCleaned = dateLastCleaned;
  }
  get id() {return this._id;}

Creating a catalog item with a null ID would usually raise red flags and cause
alarms to sound. But that’s just temporary while I get things into shape. Once
I’ve done that, the scrolls will refer to a shared catalog item with its proper ID.

Currently the scrolls are loaded as part of a load routine.

load routine…
  const scrolls = aDocument
        .map(record => new Scroll(record.id,
                                  record.catalogData.title,
                                  record.catalogData.tags,
                                  LocalDate.parse(record.lastCleaned)));

The first step in Change Value to Reference (256) is finding or creating a repository.
I find there is a repository that I can easily import into the load routine. The
repository supplies catalog items indexed by an ID. My next task is to see how
to get that ID into the constructor of the scroll. Fortunately, it’s present in the
input data and was being ignored as it wasn’t useful when using inheritance.
With that sorted out, I can now use Change Function Declaration (124) to add both
the catalog and the catalog item’s ID to the constructor parameters.

load routine…
  const scrolls = aDocument
        .map(record => new Scroll(record.id,
                                  record.catalogData.title,
                                  record.catalogData.tags,
                                  LocalDate.parse(record.lastCleaned),
                                  record.catalogData.id,
                                  catalog));

class Scroll…
  constructor(id, title, tags, dateLastCleaned, catalogID, catalog) {
    this._id = id;
    this._catalogItem = new CatalogItem(null, title, tags);
    this._lastCleaned = dateLastCleaned;
  }

I now modify the constructor to use the catalog ID to look up the catalog item
and use it instead of creating a new one.

class Scroll…
  constructor(id, title, tags, dateLastCleaned, catalogID, catalog) {
    this._id = id;
    this._catalogItem = catalog.get(catalogID);
    this._lastCleaned = dateLastCleaned;
  }

I no longer need the title and tags passed into the constructor, so I use Change
Function Declaration (124) to remove them.

Chapter 1 Refactoring: A First Example

Imagine a company of theatrical players who go out to various events performing
plays. Typically, a customer will request a few plays and the company charges
them based on the size of the audience and the kind of play they perform. There
are currently two kinds of plays that the company performs: tragedies and
comedies. As well as providing a bill for the performance, the company gives its
customers “volume credits” which they can use for discounts on future
performances; think of it as a customer loyalty mechanism.

The performers store data about their plays in a simple JSON file that looks
something like this:

plays.json…
  {
    "hamlet": {"name": "Hamlet", "type": "tragedy"},
    "as-like": {"name": "As You Like It", "type": "comedy"},
    "othello": {"name": "Othello", "type": "tragedy"}
  }

The data for their bills also comes in a JSON file:

invoices.json…
  [
    {
      "customer": "BigCo",
      "performances": [
        {
          "playID": "hamlet",
          "audience": 55
        },
        {
          "playID": "as-like",
          "audience": 35
        },
        {
          "playID": "othello",
          "audience": 40
        }
      ]
    }
  ]

The code that prints the bill is this simple function:

  function statement (invoice, plays) {
    let totalAmount = 0;
    let volumeCredits = 0;
    let result = `Statement for ${invoice.customer}\n`;
    const format = new Intl.NumberFormat("en-US",
                          { style: "currency", currency: "USD",
                            minimumFractionDigits: 2 }).format;
    for (let perf of invoice.performances) {
      const play = plays[perf.playID];
      let thisAmount = 0;

      switch (play.type) {
      case "tragedy":
        thisAmount = 40000;
        if (perf.audience > 30) {
          thisAmount += 1000 * (perf.audience - 30);
        }
        break;
      case "comedy":
        thisAmount = 30000;
        if (perf.audience > 20) {
          thisAmount += 10000 + 500 * (perf.audience - 20);
        }
        thisAmount += 300 * perf.audience;
        break;
      default:
        throw new Error(`unknown type: ${play.type}`);
      }

      // add volume credits
      volumeCredits += Math.max(perf.audience - 30, 0);
      // add extra credit for every five comedy attendees
      if ("comedy" === play.type) volumeCredits += Math.floor(perf.audience / 5);

      // print line for this order
      result += ` ${play.name}: ${format(thisAmount/100)} (${perf.audience} seats)\n`;
      totalAmount += thisAmount;
    }
    result += `Amount owed is ${format(totalAmount/100)}\n`;
    result += `You earned ${volumeCredits} credits\n`;
    return result;
  }

Running that code on the test data files above results in the following output:

Statement for BigCo
 Hamlet: $650.00 (55 seats)
 As You Like It: $580.00 (35 seats)
 Othello: $500.00 (40 seats)
Amount owed is $1,730.00
You earned 47 credits

Comments on the Starting Program

What are your thoughts on the design of this program? The first thing I’d say is
that it’s tolerable as it is: a program so short doesn’t require any deep structure
to be comprehensible. But remember my earlier point that I have to keep examples
small. Imagine this program on a larger scale, perhaps hundreds of lines long.
At that size, a single inline function is hard to understand.

Given that the program works, isn’t any statement about its structure merely
an aesthetic judgment, a dislike of “ugly” code? After all, the compiler doesn’t
care whether the code is ugly or clean. But when I change the system, there is
a human involved, and humans do care. A poorly designed system is hard to
change, because it is difficult to figure out what to change and how these changes
will interact with the existing code to get the behavior I want. And if it is hard
to figure out what to change, there is a good chance that I will make mistakes
and introduce bugs.

Thus, if I’m faced with modifying a program with hundreds of lines of code,
I’d rather it be structured into a set of functions and other program elements that
allow me to understand more easily what the program is doing. If the program
lacks structure, it’s usually easier for me to add structure to the program first,
and then make the change I need.

In this case, I have a couple of changes that the users would like to make. First,
they want a statement printed in HTML. Consider what impact this change would
have. I’m faced with adding conditional statements around every statement that
adds a string to the result. That will add a host of complexity to the function.
Faced with that, most people prefer to copy the method and change it to emit
HTML. Making a copy may not seem too onerous a task, but it sets up all sorts of
problems for the future. Any changes to the charging logic would force me to
update both methods, and to ensure they are updated consistently. If I’m writing
a program that will never change again, this kind of copy-and-paste is fine. But
if it’s a long-lived program, then duplication is a menace.

When you have to add a feature to a program but the code is not structured in a
convenient way, first refactor the program to make it easy to add the feature,
then add the feature.

This brings me to a second change. The players are looking to perform more
kinds of plays: they hope to add history, pastoral, pastoral-comical,
historical-pastoral, tragical-historical, tragical-comical-historical-pastoral, scene individable,
and poem unlimited to their repertoire. They haven’t exactly decided yet what
they want to do and when. This change will affect both the way their plays are
charged for and the way volume credits are calculated. As an experienced
developer I can be sure that whatever scheme they come up with, they will change it
again within six months. After all, when feature requests come, they come not
as single spies but in battalions.

Again, that statement method is where the changes need to be made to deal with
changes in classification and charging rules. But if I copy statement to htmlStatement,
I’d need to ensure that any changes are consistent. Furthermore, as the rules
400 Chapter 12 Dealing with Inheritance The First Step in Refactoring 5
grow in complexity, it’s going to be harder to figure out where to make the changes and harder to do them without making a mistake.
Let me stress that it’s these changes that drive the need to perform refactoring. If the code works and doesn’t ever need to change, it’s perfectly fine to leave it alone. It would be nice to improve it, but unless someone needs to understand it, it isn’t causing any real harm. Yet as soon as someone does need to understand how that code works, and struggles to follow it, then you have to do something about it.

The First Step in Refactoring

Whenever I do refactoring, the first step is always the same. I need to ensure I have a solid set of tests for that section of code. The tests are essential because even though I will follow refactorings structured to avoid most of the opportunities for introducing bugs, I’m still human and still make mistakes. The larger a program, the more likely it is that my changes will cause something to break inadvertently; in the digital age, frailty’s name is software.
Since the statement returns a string, what I do is create a few invoices, give each invoice a few performances of various kinds of plays, and generate the statement strings. I then do a string comparison between the new string and some reference strings that I have hand-checked. I set up all of these tests using a testing framework so I can run them with just a simple keystroke in my development environment. The tests take only a few seconds to run, and as you will see, I run them often.
An important part of the tests is the way they report their results. They either go green, meaning that all the strings are identical to the reference strings, or red, showing a list of failures: the lines that turned out differently. The tests are thus self-checking. It is vital to make tests self-checking. If I don’t, I’d end up spending time hand-checking values from the test against values on a desk pad, and that would slow me down. Modern testing frameworks provide all the features needed to write and run self-checking tests.

Before you start refactoring, make sure you have a solid suite of tests. These tests must be self-checking.

As I do the refactoring, I’ll lean on the tests. I think of them as a bug detector to protect me against my own mistakes. By writing what I want twice, in the code and in the test, I have to make the mistake consistently in both places to fool the detector. By double-checking my work, I reduce the chance of doing something wrong. Although it takes time to build the tests, I end up saving that time, with considerable interest, by spending less time debugging. This is such an important part of refactoring that I devote a full chapter to it (Building Tests (85)).

As well as using all the functions of the superclass, it should also be true that every instance of the subclass is an instance of the superclass and a valid object in all cases where we’re using the superclass. If I have a car model class, with things like name and engine size, I might think I could reuse these features to represent a physical car, adding functions for VIN number and manufacturing date. This is a common, and often subtle, modeling mistake which I’ve called the type-instance homonym [mf-tih].
These are both examples of problems leading to confusion and errors, which can be easily avoided by replacing inheritance with delegation to a separate object. Using delegation makes it clear that it is a separate thing, one where only some of the functions carry over.
Even in cases where the subclass is reasonable modeling, I use Replace Superclass with Delegate because the relationship between a sub- and superclass is highly coupled, with the subclass easily broken by changes in the superclass. The downside is that I need to write a forwarding function for any function that is the same in the host and in the delegate, but fortunately, even though such forwarding functions are boring to write, they are too simple to get wrong.
As a consequence of all this, some people advise avoiding inheritance entirely, but I don’t agree with that. Provided the appropriate semantic conditions apply (every method on the supertype applies to the subtype, every instance of the subtype is an instance of the supertype), inheritance is a simple and effective mechanism. I can easily apply Replace Superclass with Delegate should the situation change and inheritance is no longer the best option. So my advice is to (mostly) use inheritance first, and apply Replace Superclass with Delegate when (and if) it becomes a problem.

Mechanics

Create a field in the subclass that refers to the superclass object. Initialize this delegate reference to a new instance.
For each element of the superclass, create a forwarding function in the subclass that forwards to the delegate reference. Test after forwarding each consistent group.
Most of the time you can test after each function that’s forwarded, but, for example, get/set pairs can only be tested once both have been moved.
When all superclass elements have been overridden with forwarders, remove the inheritance link.
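The mechanics of Replace Superclass with Delegate can be sketched on the scroll and catalog item example. This is a minimal illustration and not the book's finished code: the CatalogItem fields, the constructor signatures, and the plain Date arithmetic (standing in for the date library call in the text) are all assumptions.

```javascript
// Hypothetical CatalogItem: the fields are assumptions for illustration.
class CatalogItem {
  constructor(id, title, tags) {
    this._id = id;
    this._title = title;
    this._tags = tags;
  }
  get id() { return this._id; }
  get title() { return this._title; }
  hasTag(aString) { return this._tags.includes(aString); }
}

// Scroll no longer extends CatalogItem; it holds one as a delegate.
class Scroll {
  constructor(id, title, tags, dateLastCleaned) {
    // Step 1: a field that refers to the former superclass object.
    this._catalogItem = new CatalogItem(id, title, tags);
    this._lastCleaned = dateLastCleaned;
  }
  // Step 2: forwarding functions for the elements that still apply.
  get id() { return this._catalogItem.id; }
  get title() { return this._catalogItem.title; }
  hasTag(aString) { return this._catalogItem.hasTag(aString); }

  needsCleaning(targetDate) {
    const threshold = this.hasTag("revered") ? 700 : 1500;
    return this.daysSinceLastCleaning(targetDate) > threshold;
  }
  // Assumption: whole days computed from Date values in milliseconds.
  daysSinceLastCleaning(targetDate) {
    const msPerDay = 24 * 60 * 60 * 1000;
    return Math.floor((targetDate - this._lastCleaned) / msPerDay);
  }
}

const lastCleaned = new Date(Date.UTC(2020, 0, 1));
const today = new Date(Date.UTC(2023, 0, 1));
const revered = new Scroll(1, "Greyscale treatment", ["revered"], lastCleaned);
const ordinary = new Scroll(2, "Kitchen inventory", [], lastCleaned);
console.log(revered.needsCleaning(today));  // true: 1096 days exceeds 700
console.log(ordinary.needsCleaning(today)); // false: 1096 days is under 1500
```

Step 3, removing the inheritance link, shows up here as the absence of an extends clause on Scroll.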
class Stack {
  constructor() {
    this._storage = new List();
  }
}
class List {...}

Motivation

In object-oriented programs, inheritance is a powerful and easily available way to reuse existing functionality. I inherit from some existing class, then override and add additional features. But subclassing can be done in a way that leads to confusion and complication.
One of the classic examples of mis-inheritance from the early days of objects was making a stack be a subclass of list. The idea that led to this was reusing list’s data storage and the operations that manipulate it. While it’s good to reuse, this inheritance had a problem: All the operations of the list were present on the interface of the stack, although most of them were not applicable to a stack. A better approach is to make the list into a field of the stack and delegate the necessary operations to it.
This is an example of one reason to use Replace Superclass with Delegate: if functions of the superclass don’t make sense on the subclass, that’s a sign that I shouldn’t be using inheritance to use the superclass’s functionality.

    switch (play.type) {
    case "tragedy":
      thisAmount = 40000;
      if (perf.audience > 30) {
        thisAmount += 1000 * (perf.audience - 30);
      }
      break;
    case "comedy":
      thisAmount = 30000;
      if (perf.audience > 20) {
        thisAmount += 10000 + 500 * (perf.audience - 20);
      }
      thisAmount += 300 * perf.audience;
      break;
    default:
      throw new Error(`unknown type: ${play.type}`);
    }

    // add volume credits
    volumeCredits += Math.max(perf.audience - 30, 0);
    // add extra credit for every five comedy attendees
    if ("comedy" === play.type) volumeCredits += Math.floor(perf.audience / 5);

    // print line for this order
    result += ` ${play.name}: ${format(thisAmount/100)} (${perf.audience} seats)\n`;
    totalAmount += thisAmount;
  }
  result += `Amount owed is ${format(totalAmount/100)}\n`;
  result += `You earned ${volumeCredits} credits\n`;
  return result;
}
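A self-checking test of the kind described in The First Step in Refactoring might look like the sketch below. The statement function follows the code shown in this chapter. The sample plays, the invoice, and the hand-checked reference string are assumptions for illustration, and a real project would more likely use a testing framework than this hand-rolled comparison.

```javascript
function statement(invoice, plays) {
  let totalAmount = 0;
  let volumeCredits = 0;
  let result = `Statement for ${invoice.customer}\n`;
  const format = new Intl.NumberFormat("en-US",
    { style: "currency", currency: "USD", minimumFractionDigits: 2 }).format;
  for (let perf of invoice.performances) {
    const play = plays[perf.playID];
    let thisAmount = 0;
    switch (play.type) {
    case "tragedy":
      thisAmount = 40000;
      if (perf.audience > 30) thisAmount += 1000 * (perf.audience - 30);
      break;
    case "comedy":
      thisAmount = 30000;
      if (perf.audience > 20) thisAmount += 10000 + 500 * (perf.audience - 20);
      thisAmount += 300 * perf.audience;
      break;
    default:
      throw new Error(`unknown type: ${play.type}`);
    }
    volumeCredits += Math.max(perf.audience - 30, 0);
    if ("comedy" === play.type) volumeCredits += Math.floor(perf.audience / 5);
    result += ` ${play.name}: ${format(thisAmount/100)} (${perf.audience} seats)\n`;
    totalAmount += thisAmount;
  }
  result += `Amount owed is ${format(totalAmount/100)}\n`;
  result += `You earned ${volumeCredits} credits\n`;
  return result;
}

// Assumed sample data and a hand-checked reference string.
const plays = {
  "hamlet": { name: "Hamlet", type: "tragedy" },
  "as-like": { name: "As You Like It", type: "comedy" },
  "othello": { name: "Othello", type: "tragedy" },
};
const invoice = {
  customer: "BigCo",
  performances: [
    { playID: "hamlet", audience: 55 },
    { playID: "as-like", audience: 35 },
    { playID: "othello", audience: 40 },
  ],
};
const reference = [
  "Statement for BigCo",
  " Hamlet: $650.00 (55 seats)",
  " As You Like It: $580.00 (35 seats)",
  " Othello: $500.00 (40 seats)",
  "Amount owed is $1,730.00",
  "You earned 47 credits",
  "",
].join("\n");

// Self-checking: returns the list of lines that turned out differently.
function failuresAgainst(expected, actual) {
  const expectedLines = expected.split("\n");
  const actualLines = actual.split("\n");
  const failures = [];
  const lineCount = Math.max(expectedLines.length, actualLines.length);
  for (let i = 0; i < lineCount; i++) {
    if (expectedLines[i] !== actualLines[i]) {
      failures.push(`line ${i + 1}: expected "${expectedLines[i]}" but got "${actualLines[i]}"`);
    }
  }
  return failures;
}

const failures = failuresAgainst(reference, statement(invoice, plays));
console.log(failures.length === 0 ? "green" : ["red", ...failures].join("\n"));
```

Reporting the differing lines rather than a bare pass or fail keeps a red run useful: the failure list points at the part of the statement that changed.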
class NorwegianBlueParrotDelegate extends SpeciesDelegate {
  constructor(data, bird) {
    super(data, bird);
    this._voltage = data.voltage;
    this._isNailed = data.isNailed;
  }
  get airSpeedVelocity() {
    return (this._isNailed) ? 0 : 10 + this._voltage / 10;
  }
  get plumage() {
    if (this._voltage > 100) return "scorched";
    else return this._bird._plumage || "beautiful";
  }
}

This example replaces the original subclasses with a delegate, but there is still a very similar inheritance structure in SpeciesDelegate. Have I gained anything from this refactoring, other than freeing up inheritance on Bird? The species inheritance is now more tightly scoped, covering just the data and functions that vary due to the species. Any code that’s the same for all species remains on Bird and its future subclasses.
I could apply the same idea of creating a superclass delegate to the booking example earlier. This would allow me to replace those methods on Booking that have dispatch logic with simple calls to the delegate and letting its inheritance sort out the dispatch. However, it’s nearly dinner time, so I’ll leave that as an exercise for the reader.
These examples illustrate that the phrase “Favor object composition over class inheritance” might better be said as “Favor a judicious mixture of composition and inheritance over either alone,” but I fear that is not as catchy.

As I look at this chunk, I conclude that it’s calculating the charge for one performance. That conclusion is a piece of insight about the code. But as Ward Cunningham puts it, this understanding is in my head, a notoriously volatile form of storage. I need to persist it by moving it from my head back into the code itself. That way, should I come back to it later, the code will tell me what it’s doing; I don’t have to figure it out again.
The way to put that understanding into code is to turn that chunk of code into its own function, naming it after what it does, something like amountFor(aPerformance). When I want to turn a chunk of code into a function like this, I have a procedure for doing it that minimizes my chances of getting it wrong. I wrote down this procedure and, to make it easy to reference, named it Extract Function (106).
First, I need to look in the fragment for any variables that will no longer be in scope once I’ve extracted the code into its own function. In this case, I have three: perf, play, and thisAmount. The first two are used by the extracted code, but not modified, so I can pass them in as parameters. Modified variables need more care. Here, there is only one, so I can return it. I can also bring its initialization inside the extracted code. All of which yields this:

function statement…
  function amountFor(perf, play) {
    let thisAmount = 0;
    switch (play.type) {
    case "tragedy":
      thisAmount = 40000;
      if (perf.audience > 30) {
        thisAmount += 1000 * (perf.audience - 30);
      }
      break;
    case "comedy":
      thisAmount = 30000;
      if (perf.audience > 20) {
        thisAmount += 10000 + 500 * (perf.audience - 20);
      }
      thisAmount += 300 * perf.audience;
      break;
    default:
      throw new Error(`unknown type: ${play.type}`);
    }
    return thisAmount;
  }
When I use a header like “function someName…” in italics for some code, that means
that the following code is within the scope of the function, file, or class named in the
header. There is usually other code within that scope that I won’t show, as I’m not
discussing it at the moment.
The original statement code now calls this function to populate thisAmount:
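As a runnable sketch of this extraction: amountFor below is the extracted function from the text, while amountsFor is a hypothetical stand-in for the loop inside statement, included only to show the call site. The sample data at the bottom is likewise assumed.

```javascript
// perf and play are only read, so they become parameters. thisAmount is the
// one modified variable, so it is initialized here and returned.
function amountFor(perf, play) {
  let thisAmount = 0;
  switch (play.type) {
  case "tragedy":
    thisAmount = 40000;
    if (perf.audience > 30) {
      thisAmount += 1000 * (perf.audience - 30);
    }
    break;
  case "comedy":
    thisAmount = 30000;
    if (perf.audience > 20) {
      thisAmount += 10000 + 500 * (perf.audience - 20);
    }
    thisAmount += 300 * perf.audience;
    break;
  default:
    throw new Error(`unknown type: ${play.type}`);
  }
  return thisAmount;
}

// Hypothetical caller: the loop body now just asks for the amount.
function amountsFor(invoice, plays) {
  const amounts = [];
  for (let perf of invoice.performances) {
    const play = plays[perf.playID];
    let thisAmount = amountFor(perf, play);
    amounts.push(thisAmount);
  }
  return amounts;
}

// Assumed sample data; amounts are in cents.
const plays = { hamlet: { name: "Hamlet", type: "tragedy" } };
const invoice = { performances: [{ playID: "hamlet", audience: 55 }] };
console.log(amountsFor(invoice, plays)); // [ 65000 ]
```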
In this case the tests passed, so my next step is to commit the change to my local version control system. I use a version control system, such as Git or Mercurial, that allows me to make private commits. I commit after each successful refactoring, so I can easily get back to a working state should I mess up later. I then squash changes into more significant commits before I push the changes to a shared repository.
Extract Function (106) is a common refactoring to automate. If I was programming in Java, I would have instinctively reached for the key sequence for my IDE to perform this refactoring. As I write this, there is no such robust support for this refactoring in JavaScript tools, so I have to do this manually. It’s not hard, although I have to be careful with those locally scoped variables.
Once I’ve used Extract Function (106), I take a look at what I’ve extracted to see if there are any quick and easy things I can do to clarify the extracted function. The first thing I do is rename some of the variables to make them clearer, such as changing thisAmount to result.

function statement…
  function amountFor(perf, play) {
    let result = 0;
    switch (play.type) {
    case "tragedy":
      result = 40000;
      if (perf.audience > 30) {
        result += 1000 * (perf.audience - 30);
      }
      break;
    case "comedy":
      result = 30000;
      if (perf.audience > 20) {
        result += 10000 + 500 * (perf.audience - 20);
      }
      result += 300 * perf.audience;
      break;
    default:
      throw new Error(`unknown type: ${play.type}`);
    }
    return result;
  }

It’s my coding standard to always call the return value from a function “result”. That way I always know its role. Again, I compile, test, and commit. Then I move onto the first argument.

class AfricanSwallowDelegate extends SpeciesDelegate {
  constructor(data, bird) {
    super(data, bird);
    this._numberOfCoconuts = data.numberOfCoconuts;
  }

class NorwegianBlueParrotDelegate extends SpeciesDelegate {
  constructor(data, bird) {
    super(data, bird);
    this._voltage = data.voltage;
    this._isNailed = data.isNailed;
  }

Indeed, now I have a superclass, I can move any default behavior from Bird to SpeciesDelegate by ensuring there’s always something in the speciesDelegate field.

class Bird…
  selectSpeciesDelegate(data) {
    switch(data.type) {
      case 'EuropeanSwallow':
        return new EuropeanSwallowDelegate(data, this);
      case 'AfricanSwallow':
        return new AfricanSwallowDelegate(data, this);
      case 'NorwegianBlueParrot':
        return new NorwegianBlueParrotDelegate(data, this);
      default: return new SpeciesDelegate(data, this);
    }
  }
  // rest of bird's code...

  get plumage() {return this._speciesDelegate.plumage;}

  get airSpeedVelocity() {return this._speciesDelegate.airSpeedVelocity;}

class SpeciesDelegate…
  get airSpeedVelocity() {return null;}

I like this, as it simplifies the delegating methods on Bird. I can easily see which behavior is delegated to the species delegate and which stays behind.
Here’s the final state of these classes:

function createBird(data) {
  return new Bird(data);
}
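The bird classes appear only in fragments in this section, so here is one way they might fit together as a runnable whole. Treat it as a sketch under assumptions: the default plumage on SpeciesDelegate and the African swallow's speed formula are not shown in this section and are filled in for illustration only.

```javascript
function createBird(data) {
  return new Bird(data);
}

class Bird {
  constructor(data) {
    this._name = data.name;
    this._plumage = data.plumage;
    this._speciesDelegate = this.selectSpeciesDelegate(data);
  }
  get name() { return this._name; }
  // Bird only forwards; all species-specific behavior lives in the delegate.
  get plumage() { return this._speciesDelegate.plumage; }
  get airSpeedVelocity() { return this._speciesDelegate.airSpeedVelocity; }

  selectSpeciesDelegate(data) {
    switch (data.type) {
      case "EuropeanSwallow":
        return new EuropeanSwallowDelegate(data, this);
      case "AfricanSwallow":
        return new AfricanSwallowDelegate(data, this);
      case "NorwegianBlueParrot":
        return new NorwegianBlueParrotDelegate(data, this);
      default:
        return new SpeciesDelegate(data, this);
    }
  }
}

// Default behavior for any species without a delegate of its own.
class SpeciesDelegate {
  constructor(data, bird) {
    this._bird = bird;
  }
  // Assumption: the default plumage moves here from Bird.
  get plumage() { return this._bird._plumage || "average"; }
  get airSpeedVelocity() { return null; }
}

class EuropeanSwallowDelegate extends SpeciesDelegate {
  get airSpeedVelocity() { return 35; }
}

class AfricanSwallowDelegate extends SpeciesDelegate {
  constructor(data, bird) {
    super(data, bird);
    this._numberOfCoconuts = data.numberOfCoconuts;
  }
  // Assumption: illustrative formula, not taken from this section.
  get airSpeedVelocity() { return 40 - 2 * this._numberOfCoconuts; }
}

class NorwegianBlueParrotDelegate extends SpeciesDelegate {
  constructor(data, bird) {
    super(data, bird);
    this._voltage = data.voltage;
    this._isNailed = data.isNailed;
  }
  get airSpeedVelocity() {
    return (this._isNailed) ? 0 : 10 + this._voltage / 10;
  }
  get plumage() {
    if (this._voltage > 100) return "scorched";
    else return this._bird._plumage || "beautiful";
  }
}

const parrot = createBird({ type: "NorwegianBlueParrot", name: "Polly", voltage: 150, isNailed: false });
console.log(parrot.plumage, parrot.airSpeedVelocity); // scorched 25
```

Because Bird itself has no subclasses left, its single inheritance slot is free for another axis of variation such as wild versus captive.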
All well and good, but the Norwegian blue overrides the plumage property, which I didn’t have to deal with for the other cases. The initial Move Function (198) is simple enough, albeit with the need to modify the constructor to put in a back-reference to the bird.

class NorwegianBlueParrot…
  get plumage() {
    return this._speciesDelegate.plumage;
  }

class NorwegianBlueParrotDelegate…
  get plumage() {
    if (this._voltage > 100) return "scorched";
    else return this._bird._plumage || "beautiful";
  }

  constructor(data, bird) {
    this._bird = bird;
    this._voltage = data.voltage;
    this._isNailed = data.isNailed;
  }

class Bird…
  selectSpeciesDelegate(data) {
    switch(data.type) {
      case 'EuropeanSwallow':
        return new EuropeanSwallowDelegate();
      case 'AfricanSwallow':
        return new AfricanSwallowDelegate(data);
      case 'NorwegianBlueParrot':
        return new NorwegianBlueParrotDelegate(data, this);
      default: return null;
    }
  }

The tricky step is how to remove the subclass method for plumage. If I do

class Bird…
  get plumage() {
    if (this._speciesDelegate)
      return this._speciesDelegate.plumage;
    else
      return this._plumage || "average";
  }

then I’ll get a bunch of errors because there is no plumage property on the other species’ delegate classes.

through the loop. But play is computed from the performance, so there’s no need to pass it in as a parameter at all; I can just recalculate it within amountFor. When I’m breaking down a long function, I like to get rid of variables like play, because temporary variables create a lot of locally scoped names that complicate extractions. The refactoring I will use here is Replace Temp with Query (178).
I begin by extracting the right-hand side of the assignment into a function.

function statement…
  function playFor(aPerformance) {
    return plays[aPerformance.playID];
  }

top level…
  function statement (invoice, plays) {
    let totalAmount = 0;
    let volumeCredits = 0;
    let result = `Statement for ${invoice.customer}\n`;
    const format = new Intl.NumberFormat("en-US",
                          { style: "currency", currency: "USD",
                            minimumFractionDigits: 2 }).format;
    for (let perf of invoice.performances) {
      const play = playFor(perf);
      let thisAmount = amountFor(perf, play);

      // add volume credits
      volumeCredits += Math.max(perf.audience - 30, 0);
      // add extra credit for every five comedy attendees
      if ("comedy" === play.type) volumeCredits += Math.floor(perf.audience / 5);

      // print line for this order
      result += ` ${play.name}: ${format(thisAmount/100)} (${perf.audience} seats)\n`;
      totalAmount += thisAmount;
    }
    result += `Amount owed is ${format(totalAmount/100)}\n`;
    result += `You earned ${volumeCredits} credits\n`;
    return result;

I compile-test-commit, and then use Inline Variable (123).

top level…
  function statement (invoice, plays) {
    let totalAmount = 0;
    let volumeCredits = 0;
    let result = `Statement for ${invoice.customer}\n`;
    const format = new Intl.NumberFormat("en-US",
                          { style: "currency", currency: "USD",
                            minimumFractionDigits: 2 }).format;
  get airSpeedVelocity() {
    return (this._isNailed) ? 0 : 10 + this._voltage / 10;
  }
}

The system will shortly be making a big difference between birds tagged in the wild and those tagged in captivity. That difference could be modeled as two subclasses for Bird: WildBird and CaptiveBird. However, I can only use inheritance once, so if I want to use subclasses for wild versus captive, I’ll have to remove them for the species.
When several subclasses are involved, I’ll tackle them one at a time, starting with a simple one, in this case EuropeanSwallow. I create an empty delegate class for the delegate.

class EuropeanSwallowDelegate {
}

I don’t put in any data or back-reference parameters yet. For this example, I’ll introduce them as I need them.
I need to decide where to handle the initialization of the delegate field. Here, since I have all the information in the single data argument to the constructor, I decide to do it in the constructor. Since there are several delegates I could add, I make a function to select the correct one based on the type code in the document.

class Bird…
  constructor(data) {
    this._name = data.name;
    this._plumage = data.plumage;
    this._speciesDelegate = this.selectSpeciesDelegate(data);
  }

  selectSpeciesDelegate(data) {
    switch(data.type) {
      case 'EuropeanSwallow':
        return new EuropeanSwallowDelegate();
      default: return null;
    }
  }

Now I have the structure set up, I can apply Move Function (198) to the European swallow’s air speed velocity.

class EuropeanSwallowDelegate…
  get airSpeedVelocity() {return 35;}

class EuropeanSwallow…
  get airSpeedVelocity() {return this._speciesDelegate.airSpeedVelocity;}

This refactoring alarms some programmers. Previously, the code to look up the play was executed once in each loop iteration; now, it’s executed thrice. I’ll talk about the interplay of refactoring and performance later, but for the moment I’ll just observe that this change is unlikely to significantly affect performance, and even if it were, it is much easier to improve the performance of a well-factored code base.
The great benefit of removing local variables is that it makes it much easier to do extractions, since there is less local scope to deal with. Indeed, usually I’ll take out local variables before I do any extractions.
Now that I’m done with the arguments to amountFor, I look back at where it’s called. It’s being used to set a temporary variable that’s not updated again, so I apply Inline Variable (123).

top level…
  function statement (invoice, plays) {
    let totalAmount = 0;
    let volumeCredits = 0;
    let result = `Statement for ${invoice.customer}\n`;
    const format = new Intl.NumberFormat("en-US",
                          { style: "currency", currency: "USD",
                            minimumFractionDigits: 2 }).format;
    for (let perf of invoice.performances) {

      // add volume credits
      volumeCredits += Math.max(perf.audience - 30, 0);
      // add extra credit for every five comedy attendees
      if ("comedy" === playFor(perf).type) volumeCredits += Math.floor(perf.audience / 5);

      // print line for this order
      result += ` ${playFor(perf).name}: ${format(amountFor(perf)/100)} (${perf.audience} seats)\n`;
      totalAmount += amountFor(perf);
    }
    result += `Amount owed is ${format(totalAmount/100)}\n`;
    result += `You earned ${volumeCredits} credits\n`;
    return result;

Extracting Volume Credits

Here’s the current state of the statement function body:

top level…
  function statement (invoice, plays) {
    let totalAmount = 0;
    let volumeCredits = 0;
    let result = `Statement for ${invoice.customer}\n`;
    const format = new Intl.NumberFormat("en-US",
                          { style: "currency", currency: "USD",
                            minimumFractionDigits: 2 }).format;
I compile-test-commit that, and then rename the variables inside the new function.

function statement…
  function volumeCreditsFor(aPerformance) {
    let result = 0;
    result += Math.max(aPerformance.audience - 30, 0);
    if ("comedy" === playFor(aPerformance).type) result += Math.floor(aPerformance.audience / 5);
    return result;
  }

I’ve shown it in one step, but as before I did the renames one at a time, with a compile-test-commit after each.

Removing the format Variable

Let’s look at the main statement method again:

top level…
  function statement (invoice, plays) {
    let totalAmount = 0;
    let volumeCredits = 0;
    let result = `Statement for ${invoice.customer}\n`;
    const format = new Intl.NumberFormat("en-US",
                          { style: "currency", currency: "USD",
                            minimumFractionDigits: 2 }).format;
    for (let perf of invoice.performances) {
      volumeCredits += volumeCreditsFor(perf);

      // print line for this order
      result += ` ${playFor(perf).name}: ${format(amountFor(perf)/100)} (${perf.audience} seats)\n`;
      totalAmount += amountFor(perf);
    }
    result += `Amount owed is ${format(totalAmount/100)}\n`;
    result += `You earned ${volumeCredits} credits\n`;
    return result;

As I suggested before, temporary variables can be a problem. They are only useful within their own routine, and therefore they encourage long, complex routines. My next move, then, is to replace some of them. The easiest one is format. This is a case of assigning a function to a temp, which I prefer to replace with a declared function.

function statement…
  function format(aNumber) {
    return new Intl.NumberFormat("en-US",
                        { style: "currency", currency: "USD",
                          minimumFractionDigits: 2 }).format(aNumber);
  }

class PremiumBooking…
  get hasDinner() {
    return this._extras.hasOwnProperty('dinner') && !this.isPeakDay;
  }

I move it from the subclass to the delegate:

class PremiumBookingDelegate…
  get hasDinner() {
    return this._extras.hasOwnProperty('dinner') && !this._host.isPeakDay;
  }

I then add dispatch logic to Booking:

class Booking…
  get hasDinner() {
    return (this._premiumDelegate)
      ? this._premiumDelegate.hasDinner
      : undefined;
  }

In JavaScript, accessing a property on an object where it isn’t defined returns undefined, so I do that here. (Although my every instinct is to have it raise an error, which would be the case in other object-oriented dynamic languages I’m used to.)
Once I’ve moved all the behavior out of the subclass, I can change the factory method to return the superclass and, once I’ve run tests to ensure all is well, delete the subclass.

top level…
  function createPremiumBooking(show, date, extras) {
    const result = new Booking (show, date, extras);
    result._bePremium(extras);
    return result;
  }

class PremiumBooking extends Booking ...

This is one of those refactorings where I don’t feel that refactoring alone improves the code. Inheritance handles this situation very well, whereas using delegation involves adding dispatch logic, two-way references, and thus extra complexity. The refactoring may still be worthwhile, since the advantage of a mutable premium status, or a need to use inheritance for other purposes, may outweigh the disadvantage of losing inheritance.

Example: Replacing a Hierarchy

The previous example showed using Replace Subclass with Delegate on a single subclass, but I can do the same thing with an entire hierarchy.
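The hasDinner move from the premium booking example can be put together as a small runnable sketch. The class and method names follow the text, but the constructor signatures are simplified assumptions: the booking takes an isPeakDay flag directly rather than a date, and createBooking is a hypothetical factory for the non-premium case.

```javascript
class Booking {
  // Assumption: the peak-day flag is passed in, to keep date logic out of the sketch.
  constructor(show, isPeakDay) {
    this._show = show;
    this._isPeakDay = isPeakDay;
  }
  get isPeakDay() { return this._isPeakDay; }

  // Dispatch logic: ask the delegate if there is one, otherwise answer undefined.
  get hasDinner() {
    return (this._premiumDelegate)
      ? this._premiumDelegate.hasDinner
      : undefined;
  }

  _bePremium(extras) {
    this._premiumDelegate = new PremiumBookingDelegate(this, extras);
  }
}

class PremiumBookingDelegate {
  constructor(hostBooking, extras) {
    this._host = hostBooking; // back-reference for data that stays on the booking
    this._extras = extras;
  }
  get hasDinner() {
    return this._extras.hasOwnProperty("dinner") && !this._host.isPeakDay;
  }
}

function createBooking(show, isPeakDay) {
  return new Booking(show, isPeakDay);
}

// The factory now builds the superclass and attaches the delegate.
function createPremiumBooking(show, isPeakDay, extras) {
  const result = new Booking(show, isPeakDay);
  result._bePremium(extras);
  return result;
}

const show = { price: 100 };
console.log(createBooking(show, false).hasDinner);                          // undefined
console.log(createPremiumBooking(show, false, { dinner: true }).hasDinner); // true
console.log(createPremiumBooking(show, true, { dinner: true }).hasDinner);  // false
```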
This is almost the same, but there is a wrinkle in the form of the pesky call on super (which is pretty common in these kinds of subclass extension cases). When I move the subclass code to the delegate, I’ll need to call the parent case, but I can’t just call this._host._basePrice without getting into an endless recursion.
I have a couple of options here. One is to apply Extract Function (106) on the base calculation to allow me to separate the dispatch logic from price calculation. (The rest of the move is as before.)

class Booking…
  get basePrice() {
    return (this._premiumDelegate)
      ? this._premiumDelegate.basePrice
      : this._privateBasePrice;
  }
  get _privateBasePrice() {
    let result = this._show.price;
    if (this.isPeakDay) result += Math.round(result * 0.15);
    return result;
  }

class PremiumBookingDelegate…
  get basePrice() {
    return Math.round(this._host._privateBasePrice + this._extras.premiumFee);
  }

Alternatively, I can recast the delegate’s method as an extension of the base method.

class Booking…
  get basePrice() {
    let result = this._show.price;
    if (this.isPeakDay) result += Math.round(result * 0.15);
    return (this._premiumDelegate)
      ? this._premiumDelegate.extendBasePrice(result)
      : result;
  }

class PremiumBookingDelegate…
  extendBasePrice(base) {
    return Math.round(base + this._extras.premiumFee);
  }

Both work reasonably here; I have a slight preference for the latter as it’s a bit smaller.
The last case is a method that only exists on the subclass.

top level…
  function statement (invoice, plays) {
    let totalAmount = 0;
    let volumeCredits = 0;
    let result = `Statement for ${invoice.customer}\n`;
    for (let perf of invoice.performances) {
      volumeCredits += volumeCreditsFor(perf);

      // print line for this order
      result += ` ${playFor(perf).name}: ${format(amountFor(perf)/100)} (${perf.audience} seats)\n`;
      totalAmount += amountFor(perf);
    }
    result += `Amount owed is ${format(totalAmount/100)}\n`;
    result += `You earned ${volumeCredits} credits\n`;
    return result;

Although changing a function variable to a declared function is a refactoring, I haven’t named it and included it in the catalog. There are many refactorings that I didn’t feel important enough for that. This one is both simple to do and relatively rare, so I didn’t think it was worthwhile.

I’m not keen on the name: “format” doesn’t really convey enough of what it’s doing. “formatAsUSD” would be a bit too long-winded since it’s being used in a string template, particularly within this small scope. I think the fact that it’s formatting a currency amount is the thing to highlight here, so I pick a name that suggests that and apply Change Function Declaration (124).

top level…
  function statement (invoice, plays) {
    let totalAmount = 0;
    let volumeCredits = 0;
    let result = `Statement for ${invoice.customer}\n`;
    for (let perf of invoice.performances) {
      volumeCredits += volumeCreditsFor(perf);

      // print line for this order
      result += ` ${playFor(perf).name}: ${usd(amountFor(perf))} (${perf.audience} seats)\n`;
      totalAmount += amountFor(perf);
    }
    result += `Amount owed is ${usd(totalAmount)}\n`;
    result += `You earned ${volumeCredits} credits\n`;
    return result;

function statement…
  function usd(aNumber) {
    return new Intl.NumberFormat("en-US",
                        { style: "currency", currency: "USD",
                          minimumFractionDigits: 2 }).format(aNumber/100);
  }
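To see usd in isolation, here is a small sketch with assumed amounts held as integer cents. The usd function is taken from the text. The sample values and the floating-point comparison are illustrative additions.

```javascript
// Formats an amount held as integer cents as US dollars.
function usd(aNumber) {
  return new Intl.NumberFormat("en-US",
    { style: "currency", currency: "USD",
      minimumFractionDigits: 2 }).format(aNumber / 100);
}

// Assumed sample amounts, in cents. Integer arithmetic stays exact.
const amounts = [65000, 58000, 50000];
const total = amounts.reduce((sum, each) => sum + each, 0);
console.log(usd(total)); // $1,730.00

// The hazard that integer cents avoid: fractional values held as floats.
console.log(0.1 + 0.2 === 0.3); // false
console.log(10 + 20 === 30);    // true
```

The division by 100 happens only at the display boundary, so every caller can keep working in cents.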
Naming is both important and tricky. Breaking a large function into smaller ones only adds value if the names are good. With good names, I don’t have to read the body of the function to see what it does. But it’s hard to get names right the first time, so I use the best name I can think of for the moment, and don’t hesitate to rename it later. Often, it takes a second pass through some code to realize what the best name really is.
As I’m changing the name, I also move the duplicated division by 100 into the function. Storing money as integer cents is a common approach: it avoids the dangers of storing fractional monetary values as floats but allows me to use arithmetic operators. Whenever I want to display such a penny-integer number, however, I need a decimal, so my formatting function should take care of the division.

Removing Total Volume Credits

My next target variable is volumeCredits. This is a trickier case, as it’s built up during the iterations of the loop. My first move, then, is to use Split Loop (227) to separate the accumulation of volumeCredits.

top level…
  function statement (invoice, plays) {
    let totalAmount = 0;
    let volumeCredits = 0;
    let result = `Statement for ${invoice.customer}\n`;

    for (let perf of invoice.performances) {

class PremiumBooking…
  get hasTalkback() {
    return this._show.hasOwnProperty('talkback');
  }

I use Move Function (198) to move the subclass method to the delegate. To make it fit its home, I route any access to superclass data with a call to _host.

class PremiumBookingDelegate…
  get hasTalkback() {
    return this._host._show.hasOwnProperty('talkback');
  }

class PremiumBooking…
  get hasTalkback() {
    return this._premiumDelegate.hasTalkback;
  }

I test to ensure everything is working, then delete the subclass method:

class PremiumBooking…
  get hasTalkback() {
    return this._premiumDelegate.hasTalkback;
  }

I run the tests at this point, expecting some to fail.
Now I finish the move by adding dispatch logic to the superclass method to use the delegate if it is present.

class PremiumBooking…
  get basePrice() {
    return Math.round(super.basePrice + this._extras.premiumFee);
  }
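Once the subclass is gone, this basePrice override can no longer call super. One way around that is the extendBasePrice variant described in this example, sketched below as runnable code. The constructor signatures and the isPeakDay flag are assumptions made to keep the sketch short. The price rule and the method names are taken from the text.

```javascript
class Booking {
  // Assumption: the peak-day flag is passed in directly.
  constructor(show, isPeakDay) {
    this._show = show;
    this._isPeakDay = isPeakDay;
  }
  get isPeakDay() { return this._isPeakDay; }

  // The base calculation stays on the host; the delegate only extends it,
  // so nothing calls back into basePrice and there is no recursion.
  get basePrice() {
    let result = this._show.price;
    if (this.isPeakDay) result += Math.round(result * 0.15);
    return (this._premiumDelegate)
      ? this._premiumDelegate.extendBasePrice(result)
      : result;
  }

  _bePremium(extras) {
    this._premiumDelegate = new PremiumBookingDelegate(this, extras);
  }
}

class PremiumBookingDelegate {
  constructor(hostBooking, extras) {
    this._host = hostBooking;
    this._extras = extras;
  }
  extendBasePrice(base) {
    return Math.round(base + this._extras.premiumFee);
  }
}

const standard = new Booking({ price: 100 }, true);
const premium = new Booking({ price: 100 }, true);
premium._bePremium({ premiumFee: 10 });
console.log(standard.basePrice); // 115
console.log(premium.basePrice);  // 125
```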
I use a leading underscore on _bePremium to indicate that it shouldn’t be part of the public interface for Booking. Of course, if the point of doing this refactoring is to allow a booking to mutate to premium, it can be a public method.

Alternatively, I can do all the connections in the constructor for Booking. In order to do that, I need some way to signal to the constructor that we have a premium booking. That could be an extra parameter, or just the use of extras if I can be sure that it is always present when used with a premium booking. Here, I prefer the explicitness of doing this through the factory function.

With the structures set up, it’s time to start moving the behavior. The first case I’ll consider is the simple override of hasTalkback. Here’s the existing code:

class Booking…
  get hasTalkback() {
    return this._show.hasOwnProperty('talkback') && !this.isPeakDay;
  }

top level…
  function statement (invoice, plays) {
    let totalAmount = 0;
    let result = `Statement for ${invoice.customer}\n`;
    for (let perf of invoice.performances) {

      // print line for this order
      result += ` ${playFor(perf).name}: ${usd(amountFor(perf))} (${perf.audience} seats)\n`;
      totalAmount += amountFor(perf);
    }
    let volumeCredits = totalVolumeCredits();
    result += `Amount owed is ${usd(totalAmount)}\n`;
    result += `You earned ${volumeCredits} credits\n`;
    return result;

Once everything is extracted, I can apply Inline Variable (123):
top level…
  function statement (invoice, plays) {
    let totalAmount = 0;
    let result = `Statement for ${invoice.customer}\n`;
    for (let perf of invoice.performances) {

      // print line for this order
      result += ` ${playFor(perf).name}: ${usd(amountFor(perf))} (${perf.audience} seats)\n`;
      totalAmount += amountFor(perf);
    }

    result += `Amount owed is ${usd(totalAmount)}\n`;
    result += `You earned ${totalVolumeCredits()} credits\n`;
    return result;

Inheritance works well for this example. I can understand the base class without having to understand the subclass. The subclass is defined just by saying how it differs from the base case, both reducing duplication and clearly communicating what are the differences it’s introducing.
Actually, it isn’t quite as perfect as the previous paragraph implies. There are things in the superclass structure that only make sense due to the subclass, such as methods that have been factored in such a way as to make it easier to override just the right kinds of behavior. So although most of the time I can modify the base class without having to understand subclasses, there are occasions where such mindful ignorance of the subclasses will lead me to breaking a subclass by modifying the superclass. However, if these occasions are not too common, the inheritance pays off, provided I have good tests to detect a subclass breakage.
So why would I want to change such a happy situation by using Replace Sub-
Let me pause for a bit to talk about what I’ve just done here. Firstly, I know class with Delegate? Inheritance is a tool that can only be used once—so if I have
readers will again be worrying about performance with this change, as many another reason to use inheritance, and I think it will benefit me more than the
people are wary of repeating a loop. But most of the time, rerunning a loop like premium booking subclass, I’ll need to handle premium bookings a different way.
this has a negligible effect on performance. If you timed the code before and Also, I may need to change from the default booking to the premium booking
after this refactoring, you would probably not notice any significant change in dynamically—i.e., support a method like aBooking.bePremium(). In some cases, I can
speed—and that’s usually the case. Most programmers, even experienced ones, avoid this by creating a whole new object (a common example is where an HTTP
are poor judges of how code actually performs. Many of our intuitions are broken request loads new data from the server). But sometimes, I need to modify a data
by clever compilers, modern caching techniques, and the like. The performance structure and not rebuild it from scratch, and it is difficult to just replace a single
of software usually depends on just a few parts of the code, and changes anywhere booking that’s referred to from many different places. In such situations, it can
else don’t make an appreciable difference. be useful to allow a booking to switch from default to premium and back again.
But “mostly” isn’t the same as “alwaysly.” Sometimes a refactoring will have a When these needs crop up, I need to apply Replace Subclass with Delegate. I
significant performance implication. Even then, I usually go ahead and do it, be- have clients call the constructors of the two classes to make the bookings:
cause it’s much easier to tune the performance of well-factored code. If I introduce
a significant performance issue during refactoring, I spend time on performance booking client
tuning afterwards. It may be that this leads to reversing some of the refactoring aBooking = new Booking(show,date);
I did earlier—but most of the time, due to the refactoring, I can apply a more ef-
premium client
fective performance-tuning enhancement instead. I end up with code that’s both
aBooking = new PremiumBooking(show, date, extras);
clearer and faster.
So, my overall advice on performance with refactoring is: Most of the time you Removing subclasses will alter all of this, so I like to encapsulate the constructor
should ignore it. If your refactoring introduces performance slow-downs, finish calls with Replace Constructor with Factory Function (334).
refactoring first and do performance tuning afterwards.
The second aspect I want to call your attention to is how small the steps were top level…
to remove volumeCredits. Here are the four steps, each followed by compiling, testing, function createBooking(show, date) {
and committing to my local source code repository: return new Booking(show, date);
}
Split Loop (227) to isolate the accumulation function createPremiumBooking(show, date, extras) {
return new PremiumBooking (show, date, extras);
Slide Statements (223) to bring the initializing code next to the accumulation }
Extract Function (106) to create a function for calculating the total booking client
aBooking = createBooking(show, date);
Inline Variable (123) to remove the variable completely
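As a rough illustration of the four-step sequence used to remove an accumulator variable (Split Loop, Slide Statements, Extract Function, Inline Variable), here is a minimal sketch on a made-up example. The names summaryBefore, summaryAfter, totalWeightOf, and the orders data are hypothetical and do not come from the statement example; only the shape of the steps is taken from the text.

```javascript
// Hypothetical data, not part of the original statement example.
const orders = [{ price: 10, weight: 2 }, { price: 5, weight: 1 }];

// Starting point: one loop both builds the report and accumulates a total.
function summaryBefore(orders) {
  let totalWeight = 0;
  let result = "";
  for (const o of orders) {
    result += `item: ${o.price}\n`;
    totalWeight += o.weight;
  }
  result += `weight: ${totalWeight}\n`;
  return result;
}

// End point after the four steps.
function summaryAfter(orders) {
  let result = "";
  for (const o of orders) {
    result += `item: ${o.price}\n`;
  }
  // Step 4, Inline Variable: the local totalWeight is gone, the call is used directly.
  result += `weight: ${totalWeightOf(orders)}\n`;
  return result;
}

// Step 1, Split Loop, gave the accumulation its own loop.
// Step 2, Slide Statements, moved the initialization next to that loop.
// Step 3, Extract Function, turned the pair into this function.
function totalWeightOf(orders) {
  let result = 0;
  for (const o of orders) {
    result += o.weight;
  }
  return result;
}
```

Both versions produce the same text for the same input, which is the property each compile-test-commit cycle is meant to protect.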
I confess I don’t always take quite as short steps as these, but whenever things
get difficult, my first reaction is to take shorter steps. In particular, should a test
fail during a refactoring, if I can’t immediately see and fix the problem, I’ll revert
to my last good commit and redo what I just did with smaller steps. That works
because I commit so frequently and because small steps are the key to moving
quickly, particularly when working with difficult code.

I then repeat that sequence to remove totalAmount. I start by splitting the loop
(compile-test-commit), then I slide the variable initialization (compile-test-commit),
and then I extract the function. There is a wrinkle here: The best name for the
function is “totalAmount”, but that’s the name of the variable, and I can’t have
both at the same time. So I give the new function a random name when I extract
it (and compile-test-commit).

function statement…
  function totalAmount() {
    let totalAmount = 0;
    for (let perf of invoice.performances) {
      totalAmount += amountFor(perf);
    }
    return totalAmount;
  }

I also take the opportunity to change the names inside my extracted functions
to adhere to my convention.

function statement…
  function totalAmount() {
    let result = 0;
    for (let perf of invoice.performances) {
      result += amountFor(perf);
    }
    return result;
  }
  function totalVolumeCredits() {
    let result = 0;
    for (let perf of invoice.performances) {
      result += volumeCreditsFor(perf);
    }
    return result;
  }

class PremiumBooking extends Booking…
  constructor(show, date, extras) {
    super(show, date);
    this._extras = extras;
  }

There are quite a few changes that the premium booking makes to what it
inherits from the superclass. As is typical with this kind of
programming-by-difference, in some cases the subclass overrides methods on the superclass, in
others it adds new methods that are only relevant for the subclass. I won’t go
into all of them, but I will pick out a few interesting cases.

First, there is a simple override. Regular bookings offer a talkback after the
show, but only on nonpeak days.

Modify the creation of the subclass so that it initializes the delegate field
with an instance of the delegate.

This can be done in the factory function, or in the constructor if the constructor
can reliably tell whether to create the correct delegate.

Choose a subclass method to move to the delegate class.

Use Move Function (198) to move it to the delegate class. Don’t remove the
source’s delegating code.

If the method needs elements that should move to the delegate, move them. If it
needs elements that should stay in the superclass, add a field to the delegate that
refers to the superclass.

If the source method has callers outside the class, move the source’s delegating
code from the subclass to the superclass, guarding it with a check for
the presence of the delegate. If not, apply Remove Dead Code (237).

If there’s more than one subclass, and you start duplicating code within them, use
Extract Superclass (375). In this case, any delegating methods on the source
superclass no longer need a guard if the default behavior is moved to the delegate
superclass.

Test.

Repeat until all the methods of the subclass are moved.

Find all callers of the subclass’s constructor and change them to use the
superclass constructor.
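To make the Replace Subclass with Delegate mechanics more concrete, here is a minimal sketch of where the booking example might end up for the hasTalkback override alone. Several details are assumptions made for illustration: the premium rule (talkback on any day), the _bePremium method name, the shape of the date object with its isPeak flag, and the isPeakDay body do not appear in the text above.

```javascript
class Booking {
  constructor(show, date) {
    this._show = show;
    this._date = date;
  }
  // Hypothetical: the text does not show how a peak day is determined.
  get isPeakDay() { return this._date.isPeak === true; }

  // Delegating code kept on the superclass, guarded by a check for the delegate.
  get hasTalkback() {
    return this._premiumDelegate
      ? this._premiumDelegate.hasTalkback
      : this._show.hasOwnProperty('talkback') && !this.isPeakDay;
  }
  // Leading underscore signals that this is not meant as public interface.
  _bePremium(extras) {
    this._premiumDelegate = new PremiumBookingDelegate(this, extras);
  }
}

class PremiumBookingDelegate {
  // Takes the subclass-specific data plus a back-reference to the host booking.
  constructor(hostBooking, extras) {
    this._host = hostBooking;
    this._extras = extras;
  }
  // Assumed premium rule: a talkback is offered regardless of peak days.
  get hasTalkback() {
    return this._host._show.hasOwnProperty('talkback');
  }
}

// Factory functions hide which class (and which delegate) gets created.
function createBooking(show, date) {
  return new Booking(show, date);
}
function createPremiumBooking(show, date, extras) {
  const result = new Booking(show, date);
  result._bePremium(extras);
  return result;
}
```

Because clients only ever call the factory functions, the PremiumBooking subclass can be deleted once every overridden method has moved across in this way.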
Replace Subclass with Delegate

class Order {
  get daysToShip() {
    return this._warehouse.daysToShip;
  }
}

class PriorityOrder extends Order {
  get daysToShip() {
    return this._priorityPlan.daysToShip;
  }
}

class Order {
  get daysToShip() {
    return (this._priorityDelegate)
      ? this._priorityDelegate.daysToShip
      : this._warehouse.daysToShip;
  }
}

class PriorityOrderDelegate {
  get daysToShip() {
    return this._priorityPlan.daysToShip
  }
}

Motivation

If I have some objects whose behavior varies from category to category, the natural
mechanism to express this is inheritance. I put all the common data and
behavior in the superclass, and let each subclass add and override features as
needed. Object-oriented languages make this simple to implement and thus a
familiar mechanism.

But inheritance has its downsides. Most obviously, it’s a card that can only be
played once. If I have more than one reason to vary something, I can only use
inheritance for a single axis of variation. So, if I want to vary behavior of people
by their age category and by their income level, I can either have subclasses for
young and senior, or for well-off and poor; I can’t have both.

A further problem is that inheritance introduces a very close relationship between
classes. Any change I want to make to the parent can easily break children,
so I have to be careful and understand how children derive from the superclass.
This problem is made worse when the logic of the two classes resides in different
modules and is looked after by different teams.

Delegation handles both of these problems. I can delegate to many different
classes for different reasons. Delegation is a regular relationship between
objects, so I can have a clear interface to work with, which is much less coupling
than subclassing. It’s therefore common to run into the problems with subclassing
and apply Replace Subclass with Delegate.

There is a popular principle: “Favor object composition over class inheritance”
(where composition is effectively the same as delegation). Many people take this
to mean “inheritance considered harmful” and claim that we should never use
inheritance. I use inheritance frequently, partly because I always know I can use
Replace Subclass with Delegate should I need to change it later. Inheritance is a
valuable mechanism that does the job most of the time without problems. So I
reach for it first, and move onto delegation when it starts to rub badly. This usage
is actually consistent with the principle, which comes from the Gang of Four
book [gof] that explains how inheritance and composition work together. The
principle was a reaction to the overuse of inheritance.

Those who are familiar with the Gang of Four book may find it helpful to think
of this refactoring as replacing subclasses with the State or Strategy patterns.
Both of these patterns are structurally the same, relying on the host delegating
to a separate hierarchy. Not all cases of Replace Subclass with Delegate involve
an inheritance hierarchy for the delegate (as the first example below illustrates),
but setting up a hierarchy for states or strategies is often useful.

Mechanics

If there are many callers for the constructors, apply Replace Constructor with
Factory Function (334).

Create an empty class for the delegate. Its constructor should take any
subclass-specific data as well as, usually, a back-reference to the superclass.

Add a field to the superclass to hold the delegate.

  function totalVolumeCredits() {
    let result = 0;
    for (let perf of invoice.performances) {
      result += volumeCreditsFor(perf);
    }
    return result;
  }
  function usd(aNumber) {
    return new Intl.NumberFormat("en-US",
                        { style: "currency", currency: "USD",
                          minimumFractionDigits: 2 }).format(aNumber/100);
  }
  function volumeCreditsFor(aPerformance) {
    let result = 0;
    result += Math.max(aPerformance.audience - 30, 0);
    if ("comedy" === playFor(aPerformance).type) result += Math.floor(aPerformance.audience / 5);
    return result;
  }
  function playFor(aPerformance) {
    return plays[aPerformance.playID];
  }
  function amountFor(aPerformance) {
    let result = 0;
    switch (playFor(aPerformance).type) {
    case "tragedy":
      result = 40000;
      if (aPerformance.audience > 30) {
        result += 1000 * (aPerformance.audience - 30);
      }
      break;
    case "comedy":
      result = 30000;
      if (aPerformance.audience > 20) {
        result += 10000 + 500 * (aPerformance.audience - 20);
      }
      result += 300 * aPerformance.audience;
      break;
    default:
      throw new Error(`unknown type: ${playFor(aPerformance).type}`);
    }
    return result;
  }
}

The structure of the code is much better now. The top-level statement function
is now just seven lines of code, and all it does is laying out the printing of the
statement. All the calculation logic has been moved out to a handful of supporting
functions. This makes it easier to understand each individual calculation as well
as the overall flow of the report.

Splitting the Phases of Calculation and Formatting

So far, my refactoring has focused on adding enough structure to the function so
that I can understand it and see it in terms of its logical parts. This is often the
case early in refactoring. Breaking down complicated chunks into small pieces is
important, as is naming things well. Now, I can begin to focus more on the
functionality change I want to make: specifically, providing an HTML version of
this statement. In many ways, it’s now much easier to do. With all the calculation
code split out, all I have to do is write an HTML version of the seven lines
of code at the top. The problem is that these broken-out functions are nested
within the textual statement method, and I don’t want to copy and paste them
into a new function, however well organized. I want the same calculation functions
to be used by the text and HTML versions of the statement.

There are various ways to do this, but one of my favorite techniques is Split
Phase (154). My aim here is to divide the logic into two parts: one that calculates
the data required for the statement, the other that renders it into text or HTML.
The first phase creates an intermediate data structure that it passes to the second.

I start a Split Phase (154) by applying Extract Function (106) to the code that
makes up the second phase. In this case, that’s the statement printing code, which
is in fact the entire content of statement. This, together with all the nested functions,
goes into its own top-level function which I call renderPlainText.

function statement (invoice, plays) {
  return renderPlainText(invoice, plays);
}

function renderPlainText(invoice, plays) {
  let result = `Statement for ${invoice.customer}\n`;
  for (let perf of invoice.performances) {
    result += ` ${playFor(perf).name}: ${usd(amountFor(perf))} (${perf.audience} seats)\n`;
  }
  result += `Amount owed is ${usd(totalAmount())}\n`;
  result += `You earned ${totalVolumeCredits()} credits\n`;
  return result;

  function totalAmount() {...}
  function totalVolumeCredits() {...}
  function usd(aNumber) {...}
  function volumeCreditsFor(aPerformance) {...}
  function playFor(aPerformance) {...}
  function amountFor(aPerformance) {...}
I now examine the other arguments used by renderPlainText. I want to move
the data that comes from them into the intermediate data structure, so that all the
calculation code moves into the statement function and renderPlainText operates
solely on data passed to it through the data parameter.

My first move is to take the customer and add it to the intermediate object
(compile-test-commit).

function statement (invoice, plays) {
  const statementData = {};
  statementData.customer = invoice.customer;
  return renderPlainText(statementData, invoice, plays);
}

function renderPlainText(data, invoice, plays) {
  let result = `Statement for ${data.customer}\n`;
  for (let perf of invoice.performances) {
    result += ` ${playFor(perf).name}: ${usd(amountFor(perf))} (${perf.audience} seats)\n`;
  }
  result += `Amount owed is ${usd(totalAmount())}\n`;
  result += `You earned ${totalVolumeCredits()} credits\n`;
  return result;

Motivation

When I’m refactoring a class hierarchy, I’m often pulling and pushing features
around. As the hierarchy evolves, I sometimes find that a class and its parent are
no longer different enough to be worth keeping separate. At this point, I’ll merge
them together.

Mechanics

Choose which one to remove.

I choose based on which name makes most sense in the future. If neither name
is best, I’ll pick one arbitrarily.

Use Pull Up Field (353), Push Down Field (361), Pull Up Method (350), and Push
Down Method (359) to move all the elements into a single class.

Adjust any references to the victim to change them to the class that will
stay.

Remove the empty class.

Test.
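The merge described by these mechanics can be pictured with a small before-and-after sketch. The class names and the region field below are hypothetical, chosen only to show a subclass that no longer differs enough from its parent; the "Before" suffix exists solely so both versions fit in one runnable listing.

```javascript
// Before (hypothetical): the subclass adds one trivial field and nothing else.
class EmployeeBefore {
  constructor(name) { this._name = name; }
  get name() { return this._name; }
}
class SalesmanBefore extends EmployeeBefore {
  constructor(name, region) {
    super(name);
    this._region = region;
  }
  get region() { return this._region; }
}

// After: the field and getter have been pulled up, references to the
// subclass now name the surviving class, and the empty subclass is deleted.
class Employee {
  constructor(name, region) {
    this._name = name;
    this._region = region;
  }
  get name() { return this._name; }
  get region() { return this._region; }
}

const before = new SalesmanBefore("Ada", "north");
const after = new Employee("Ada", "north");
```

Which of the two names survives is a judgment call, as the mechanics note; here the parent's name is kept simply because it is the more general one.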
Similarly, I add the performances, which allows me to delete the invoice
parameter to renderPlainText (compile-test-commit).

top level…
  function statement (invoice, plays) {
    const statementData = {};
    statementData.customer = invoice.customer;
    statementData.performances = invoice.performances;
    return renderPlainText(statementData, plays);
  }

  function renderPlainText(data, plays) {
    let result = `Statement for ${data.customer}\n`;
    for (let perf of data.performances) {
      result += ` ${playFor(perf).name}: ${usd(amountFor(perf))} (${perf.audience} seats)\n`;
    }
    result += `Amount owed is ${usd(totalAmount())}\n`;
    result += `You earned ${totalVolumeCredits()} credits\n`;
    return result;

function renderPlainText…
  function totalAmount() {
    let result = 0;
    for (let perf of data.performances) {
      result += amountFor(perf);
    }
    return result;
  }
  function totalVolumeCredits() {
    let result = 0;
    for (let perf of data.performances) {
      result += volumeCreditsFor(perf);
    }
    return result;
  }

Now I’d like the play name to come from the intermediate data. To do this, I
need to enrich the performance record with data from the play
(compile-test-commit).

class Department…
  get totalAnnualCost() {
    return this.totalMonthlyCost * 12;
  }

The methods they use, monthlyCost and totalMonthlyCost, have different names and
different bodies, but do they represent the same intent? If so, I should use Change
Function Declaration (124) to unify their names.

class Department…
  get totalAnnualCost() {
    return this.monthlyCost * 12;
  }

  get monthlyCost() { … }

I then do a similar renaming to the annual costs:

class Department…
  get annualCost() {
    return this.monthlyCost * 12;
  }

I can now apply Pull Up Method (350) to the annual cost methods. The method
lands on Party, and the identical copies on Employee and Department are the ones
that get removed.

class Party…
  get annualCost() {
    return this.monthlyCost * 12;
  }

class Employee…
  get annualCost() {
    return this.monthlyCost * 12;
  }

class Department…
  get annualCost() {
    return this.monthlyCost * 12;
  }
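For the Extract Superclass example, the following sketch shows one way the hierarchy might look once name and annualCost have been pulled up to Party. The constructor signatures and both monthlyCost bodies are assumptions made so the code runs; the text elides them, and only the annualCost getter and the accessor names (name, id, headCount) are taken from it.

```javascript
class Party {
  constructor(name) {
    this._name = name;
  }
  get name() { return this._name; }
  // Pulled up: relies on each subclass supplying monthlyCost.
  get annualCost() {
    return this.monthlyCost * 12;
  }
}

class Employee extends Party {
  // Hypothetical constructor shape.
  constructor(name, id, monthlyCost) {
    super(name);
    this._id = id;
    this._monthlyCost = monthlyCost;
  }
  get id() { return this._id; }
  get monthlyCost() { return this._monthlyCost; }
}

class Department extends Party {
  // Hypothetical constructor shape: a department is a name plus its staff.
  constructor(name, staff) {
    super(name);
    this._staff = staff;
  }
  get headCount() { return this._staff.length; }
  // Assumed body: a department's monthly cost is the sum over its staff.
  get monthlyCost() {
    return this._staff
      .map(e => e.monthlyCost)
      .reduce((sum, cost) => sum + cost, 0);
  }
}

const ann = new Employee("Ann", 1, 100);
const bob = new Employee("Bob", 2, 200);
const dept = new Department("Engineering", [ann, bob]);
```

Unifying the method names first is what makes the pull-up possible: annualCost can live on Party only because both subclasses answer to monthlyCost.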
class Department…
get name() {return this._name;}
class Employee…
get annualCost() {
return this.monthlyCost * 12;
}
Examine remaining methods on the subclasses. See if there are common
parts. If so, use Extract Function (106) followed by Pull Up Method (350).

Check clients of the original classes. Consider adjusting them to use the
superclass interface.

Example

I’m pondering these two classes, they share some common functionality: their
name and the notions of annual and monthly costs:

Extract Superclass

class Department {
  get totalAnnualCost() {...}
  get name() {...}
  get headCount() {...}
}

class Employee {
  get annualCost() {...}
  get name() {...}
  get id() {...}
}

function renderPlainText…
  function totalVolumeCredits() {
    let result = 0;
    for (let perf of data.performances) {
      result += perf.volumeCredits;
    }
    return result;
  }

Finally, I move the two calculations of the totals.

function statement…
  const statementData = {};
  statementData.customer = invoice.customer;
  statementData.performances = invoice.performances.map(enrichPerformance);
  statementData.totalAmount = totalAmount(statementData);
  statementData.totalVolumeCredits = totalVolumeCredits(statementData);
  return renderPlainText(statementData, plays);

function renderPlainText…
  let result = `Statement for ${data.customer}\n`;
  for (let perf of data.performances) {
    result += ` ${perf.play.name}: ${usd(perf.amount)} (${perf.audience} seats)\n`;
  }
  result += `Amount owed is ${usd(data.totalAmount)}\n`;
  result += `You earned ${data.totalVolumeCredits} credits\n`;
  return result;

Although I could have modified the bodies of these totals functions to use the
statementData variable (as it’s within scope), I prefer to pass the explicit parameter.

And, once I’m done with compile-test-commit after the move, I can’t resist a
couple quick shots of Replace Loop with Pipeline (231).

function renderPlainText…
  function totalAmount(data) {
    return data.performances
      .reduce((total, p) => total + p.amount, 0);
  }
  function totalVolumeCredits(data) {
    return data.performances
      .reduce((total, p) => total + p.volumeCredits, 0);
  }

I now extract all the first-phase code into its own function (compile-test-commit).
top level…
  function statement (invoice, plays) {
    return renderPlainText(createStatementData(invoice, plays));
  }

statement.js…
  function htmlStatement (invoice, plays) {
    return renderHtml(createStatementData(invoice, plays));
  }
  function renderHtml (data) {
    let result = `<h1>Statement for ${data.customer}</h1>\n`;
    result += "<table>\n";
    result += "<tr><th>play</th><th>seats</th><th>cost</th></tr>";
    for (let perf of data.performances) {
      result += ` <tr><td>${perf.play.name}</td><td>${perf.audience}</td>`;
      result += `<td>${usd(perf.amount)}</td></tr>\n`;
    }
    result += "</table>\n";
    result += `<p>Amount owed is <em>${usd(data.totalAmount)}</em></p>\n`;
    result += `<p>You earned <em>${data.totalVolumeCredits}</em> credits</p>\n`;
    return result;
  }
(I moved usd to the top level, so that renderHtml could use it.)
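To see the two phases working together, here is a self-contained sketch that puts the first-phase function and the plain-text renderer in one script and runs them on sample data. The functions follow the listings in this chapter; the plays and invoice records are assumed sample data added for illustration, and the module export is dropped so the code runs as a single file.

```javascript
// Phase 1: calculate everything the statement needs.
function createStatementData(invoice, plays) {
  const result = {};
  result.customer = invoice.customer;
  result.performances = invoice.performances.map(enrichPerformance);
  result.totalAmount = totalAmount(result);
  result.totalVolumeCredits = totalVolumeCredits(result);
  return result;

  function enrichPerformance(aPerformance) {
    const result = Object.assign({}, aPerformance);
    result.play = playFor(result);
    result.amount = amountFor(result);
    result.volumeCredits = volumeCreditsFor(result);
    return result;
  }
  function playFor(aPerformance) {
    return plays[aPerformance.playID];
  }
  function amountFor(aPerformance) {
    let result = 0;
    switch (aPerformance.play.type) {
      case "tragedy":
        result = 40000;
        if (aPerformance.audience > 30) result += 1000 * (aPerformance.audience - 30);
        break;
      case "comedy":
        result = 30000;
        if (aPerformance.audience > 20) result += 10000 + 500 * (aPerformance.audience - 20);
        result += 300 * aPerformance.audience;
        break;
      default:
        throw new Error(`unknown type: ${aPerformance.play.type}`);
    }
    return result;
  }
  function volumeCreditsFor(aPerformance) {
    let result = Math.max(aPerformance.audience - 30, 0);
    if ("comedy" === aPerformance.play.type) result += Math.floor(aPerformance.audience / 5);
    return result;
  }
  function totalAmount(data) {
    return data.performances.reduce((total, p) => total + p.amount, 0);
  }
  function totalVolumeCredits(data) {
    return data.performances.reduce((total, p) => total + p.volumeCredits, 0);
  }
}

// Shared formatting helper: amounts are held in cents.
function usd(aNumber) {
  return new Intl.NumberFormat("en-US",
    { style: "currency", currency: "USD", minimumFractionDigits: 2 }).format(aNumber / 100);
}

// Phase 2: render the precomputed data; no calculation happens here.
function renderPlainText(data) {
  let result = `Statement for ${data.customer}\n`;
  for (let perf of data.performances) {
    result += ` ${perf.play.name}: ${usd(perf.amount)} (${perf.audience} seats)\n`;
  }
  result += `Amount owed is ${usd(data.totalAmount)}\n`;
  result += `You earned ${data.totalVolumeCredits} credits\n`;
  return result;
}

function statement(invoice, plays) {
  return renderPlainText(createStatementData(invoice, plays));
}

// Assumed sample data for illustration.
const plays = {
  hamlet: { name: "Hamlet", type: "tragedy" },
  "as-like": { name: "As You Like It", type: "comedy" },
};
const invoice = {
  customer: "BigCo",
  performances: [
    { playID: "hamlet", audience: 55 },
    { playID: "as-like", audience: 35 },
  ],
};
const data = createStatementData(invoice, plays);
```

An HTML renderer can consume the same data object unchanged, which is the point of the split: a new output format costs one new rendering function and no duplicated calculation.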
client…
  const numberOfMales = people.filter(p => p.isMale).length;

With that refactoring done, all knowledge of the subclasses is now safely
encased within the superclass and the factory function. (Usually I’m wary of a
superclass referring to a subclass, but this code isn’t going to last until my next
cup of tea, so I’m not going to worry about it.)

I now add a field to represent the difference between the subclasses; since I’m
using a code loaded from elsewhere, I might as well just use that.

I find the lack of symmetry with the gender code to be annoying. A future
reader of the code will always wonder about this lack of symmetry. So I prefer
to change the code to make it symmetrical, if I can do it without introducing
any other complexity, which is the case here.

function createPerson(aRecord) {
  switch (aRecord.gender) {
    case 'M': return new Person(aRecord.name, "M");
    case 'F': return new Person(aRecord.name, "F");
    default: return new Person(aRecord.name, "X");
  }
}

createStatementData.js
  export default function createStatementData(invoice, plays) {
    const result = {};
    result.customer = invoice.customer;
    result.performances = invoice.performances.map(enrichPerformance);
    result.totalAmount = totalAmount(result);
    result.totalVolumeCredits = totalVolumeCredits(result);
    return result;

    function enrichPerformance(aPerformance) {
      const result = Object.assign({}, aPerformance);
      result.play = playFor(result);
      result.amount = amountFor(result);
      result.volumeCredits = volumeCreditsFor(result);
      return result;
    }

I have more code than I did when I started: 70 lines (not counting htmlStatement)
as opposed to 44, mostly due to the extra wrapping involved in putting things
in functions. If all else is equal, more code is bad, but rarely is all else equal.
The extra code breaks up the logic into identifiable parts, separating the
calculations of the statements from the layout. This modularity makes it easier for me
to understand the parts of the code and how they fit together. Brevity is the soul
of wit, but clarity is the soul of evolvable software. Adding this modularity allows
me to support the HTML version of the code without any duplication of the
calculations.
Mechanics

Use Replace Constructor with Factory Function (334) on the subclass constructor.

If the clients of the constructors use a data field to decide which subclass to create,
put that decision logic into a superclass factory method.

If any code tests against the subclass’s types, use Extract Function (106) on
the type test and Move Function (198) to move it to the superclass. Test after
each change.

Create a field to represent the subclass type.

Change the methods that refer to the subclass to use the new type field.

Delete the subclass.

Test.

Often, this refactoring is used on a group of subclasses at once, in which case
carry out the steps to encapsulate them (add factory function, move type tests)
first, then individually fold them into the superclass.

Example

I’ll start with this stump of subclasses:

class Person…
  constructor(name) {
    this._name = name;
  }
  get name() {return this._name;}
  get genderCode() {return "X";}
  // snip

class Male extends Person {
  get genderCode() {return "M";}
}

class Female extends Person {
  get genderCode() {return "F";}
}

If that’s all that a subclass does, it’s not really worth having. But before I remove
these subclasses, it’s usually worth checking to see if there’s any
subclass-dependent behavior in the clients that should be moved in there. In this case, I
don’t find anything worth keeping the subclasses for.

createStatementData.js…
  export default function createStatementData(invoice, plays) {
    const result = {};
    result.customer = invoice.customer;
    result.performances = invoice.performances.map(enrichPerformance);
    result.totalAmount = totalAmount(result);
    result.totalVolumeCredits = totalVolumeCredits(result);
    return result;

    function enrichPerformance(aPerformance) {
      const result = Object.assign({}, aPerformance);
      result.play = playFor(result);
      result.amount = amountFor(result);
      result.volumeCredits = volumeCreditsFor(result);
      return result;
    }
    function playFor(aPerformance) {
      return plays[aPerformance.playID]
    }
    function amountFor(aPerformance) {
      let result = 0;
      switch (aPerformance.play.type) {
      case "tragedy":
        result = 40000;
        if (aPerformance.audience > 30) {
          result += 1000 * (aPerformance.audience - 30);
        }
        break;
      case "comedy":
        result = 30000;
        if (aPerformance.audience > 20) {
          result += 10000 + 500 * (aPerformance.audience - 20);
        }
        result += 300 * aPerformance.audience;
        break;
      default:
        throw new Error(`unknown type: ${aPerformance.play.type}`);
      }
      return result;
    }
    function volumeCreditsFor(aPerformance) {
      let result = 0;
      result += Math.max(aPerformance.audience - 30, 0);
      if ("comedy" === aPerformance.play.type) result += Math.floor(aPerformance.audience / 5);
      return result;
    }
    function totalAmount(data) {
      return data.performances
        .reduce((total, p) => total + p.amount, 0);
    }
    function totalVolumeCredits(data) {
      return data.performances
        .reduce((total, p) => total + p.volumeCredits, 0);
    }

Creating a Performance Calculator

The enrichPerformance function is the key, since it populates the intermediate data
structure with the data for each performance. Currently, it calls the conditional
functions for amount and volume credits. What I need it to do is call
those functions on a host class. Since that class hosts functions for calculating
data about performances, I’ll call it a performance calculator.

function createStatementData…
  function enrichPerformance(aPerformance) {
    const calculator = new PerformanceCalculator(aPerformance);
    const result = Object.assign({}, aPerformance);
    result.play = playFor(result);
    result.amount = amountFor(result);
    result.volumeCredits = volumeCreditsFor(result);
    return result;
  }

top level…
  class PerformanceCalculator {
    constructor(aPerformance) {
      this.performance = aPerformance;
    }
  }

So far, this new object isn’t doing anything. I want to move behavior into
it, and I’d like to start with the simplest thing to move, which is the play record.
Strictly, I don’t need to do this, as it’s not varying polymorphically, but this way
I’ll keep all the data transforms in one place, and that consistency will make the
code clearer.

To make this work, I will use Change Function Declaration (124) to pass the
performance’s play into the calculator.

function createStatementData…
  function enrichPerformance(aPerformance) {
    const calculator = new PerformanceCalculator(aPerformance, playFor(aPerformance));
    const result = Object.assign({}, aPerformance);
    result.play = calculator.play;
    result.amount = amountFor(result);
    result.volumeCredits = volumeCreditsFor(result);
    return result;
  }

Remove Subclass

formerly: Replace Subclass with Fields
inverse of: Replace Type Code with Subclasses (362)

class Person {
  get genderCode() {return "X";}
}
class Male extends Person {
  get genderCode() {return "M";}
}
class Female extends Person {
  get genderCode() {return "F";}
}

class Person {
  get genderCode() {return this._genderCode;}
}

Motivation

Subclasses are useful. They support variations in data structure and polymorphic
behavior. They are a good way to program by difference. But as a software system
evolves, subclasses can lose their value as the variations they support are moved
to other places or removed altogether. Sometimes, subclasses are added in
anticipation of features that never end up being built, or end up being built in a way
that doesn’t need the subclasses.

A subclass that does too little incurs a cost in understanding that is no longer
worthwhile. When that time comes, it’s best to remove the subclass, replacing it
with a field on its superclass.
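A possible end state for the Person example, once the Male and Female subclasses have been folded into a field, might look like the sketch below. The createPerson factory and the isMale client usage match the fragments shown in this section; the body of isMale, the constructor's default for a missing gender code, and the sample records are assumptions added so the code runs.

```javascript
class Person {
  constructor(name, genderCode) {
    this._name = name;
    // Assumed default: a missing code is treated like the base class's "X".
    this._genderCode = genderCode || "X";
  }
  get name() { return this._name; }
  get genderCode() { return this._genderCode; }
  // Assumed body: the type test now reads the field instead of the class.
  get isMale() { return "M" === this._genderCode; }
}

// The factory is the only place that still knows how records map to codes.
function createPerson(aRecord) {
  switch (aRecord.gender) {
    case "M": return new Person(aRecord.name, "M");
    case "F": return new Person(aRecord.name, "F");
    default:  return new Person(aRecord.name, "X");
  }
}

// Hypothetical input records.
const records = [
  { name: "Al", gender: "M" },
  { name: "Bea", gender: "F" },
  { name: "Cy" },
];
const people = records.map(createPerson);
const numberOfMales = people.filter(p => p.isMale).length;
```

Client code such as the numberOfMales count never mentions a subclass, which is why the subclasses could be deleted without touching it.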
class Employee…
  constructor(name, type){
    this.validateType(type);
    this._name = name;
    this._type = type;
  }
  validateType(arg) {
    if (!["engineer", "manager", "salesman"].includes(arg))
      throw new Error(`Employee cannot be of type ${arg}`);
  }
  get type() {return this._type;}
  set type(arg) {this._type = arg;}

  get capitalizedType() {
    return this._type.charAt(0).toUpperCase() + this._type.substr(1).toLowerCase();
  }
  toString() {
    return `${this._name} (${this.capitalizedType})`;
  }

This time toString is a bit more complicated, to allow me to illustrate something
shortly.

My first step is to use Replace Primitive with Object (174) on the type code.

class EmployeeType {
  constructor(aString) {
    this._value = aString;
  }
  toString() {return this._value;}
}

class Employee…
  constructor(name, type){
    this.validateType(type);
    this._name = name;
    this.type = type;
  }
  validateType(arg) {
    if (!["engineer", "manager", "salesman"].includes(arg))
      throw new Error(`Employee cannot be of type ${arg}`);
  }
  get typeString() {return this._type.toString();}
  get type() {return this._type;}
  set type(arg) {this._type = new EmployeeType(arg);}
  get capitalizedType() {
    return this.typeString.charAt(0).toUpperCase()
      + this.typeString.substr(1).toLowerCase();
  }

do is run Babel [babel]. That will be enough to catch any syntax errors in the
new function, but little more than that. Even so, that can be a useful step.

Once the new function fits its home, I take the original function and turn it
into a delegating function so it calls the new function.

function createStatementData…
  function amountFor(aPerformance) {
    return new PerformanceCalculator(aPerformance, playFor(aPerformance)).amount;
  }

Now I can compile-test-commit to ensure the code is working properly in its
new home. With that done, I use Inline Function (115) to call the new function
directly (compile-test-commit).

function createStatementData…
  function enrichPerformance(aPerformance) {
    const calculator = new PerformanceCalculator(aPerformance, playFor(aPerformance));
    const result = Object.assign({}, aPerformance);
    result.play = calculator.play;
    result.amount = calculator.amount;
    result.volumeCredits = volumeCreditsFor(result);
    return result;
  }

I repeat the same process to move the volume credits calculation.

function createStatementData…
  function enrichPerformance(aPerformance) {
    const calculator = new PerformanceCalculator(aPerformance, playFor(aPerformance));
    const result = Object.assign({}, aPerformance);
    result.play = calculator.play;
    result.amount = calculator.amount;
    result.volumeCredits = calculator.volumeCredits;
    return result;
  }

class PerformanceCalculator…
  get volumeCredits() {
    let result = 0;
    result += Math.max(this.performance.audience - 30, 0);
    if ("comedy" === this.play.type) result += Math.floor(this.performance.audience / 5);
    return result;
  }

Making the Performance Calculator Polymorphic

Now that I have the logic in a class, it’s time to apply the polymorphism. The
first step is to use Replace Type Code with Subclasses (362) to introduce subclasses
instead of the type code. For this, I need to create subclasses of the performance
calculator and use the appropriate subclass in createPerformanceData. In order to get
the right subclass, I need to replace the constructor call with a function, since
JavaScript constructors can’t return subclasses. So I use Replace Constructor with
Factory Function (334).

function createStatementData…
  function enrichPerformance(aPerformance) {
    const calculator = createPerformanceCalculator(aPerformance, playFor(aPerformance));
    const result = Object.assign({}, aPerformance);
    result.play = calculator.play;
    result.amount = calculator.amount;
    result.volumeCredits = calculator.volumeCredits;
    return result;
  }

top level…
function createPerformanceCalculator(aPerformance, aPlay) {
  return new PerformanceCalculator(aPerformance, aPlay);
}

With that now a function, I can create subclasses of the performance calculator
and get the creation function to select which one to return.

top level…
function createPerformanceCalculator(aPerformance, aPlay) {
  switch(aPlay.type) {
    case "tragedy": return new TragedyCalculator(aPerformance, aPlay);
    case "comedy" : return new ComedyCalculator(aPerformance, aPlay);
    default:
      throw new Error(`unknown type: ${aPlay.type}`);
  }
}

class TragedyCalculator extends PerformanceCalculator {
}
class ComedyCalculator extends PerformanceCalculator {
}

This sets up the structure for the polymorphism, so I can now move on to
Replace Conditional with Polymorphism (272).
I start with the calculation of the amount for tragedies.

class TragedyCalculator…
  get amount() {
    let result = 40000;
    if (this.performance.audience > 30) {
      result += 1000 * (this.performance.audience - 30);
    }
    return result;
  }

function createEmployee(name, type) {
  switch (type) {
    case "engineer": return new Engineer(name, type);
    case "salesman": return new Salesman(name, type);
    case "manager":  return new Manager (name, type);
    default: throw new Error(`Employee cannot be of type ${type}`);
  }
}

The type argument to the constructor is now useless, so it falls victim to Change
Function Declaration (124).

class Employee…
  constructor(name){
    this._name = name;
  }

function createEmployee(name, type) {
  switch (type) {
    case "engineer": return new Engineer(name);
    case "salesman": return new Salesman(name);
    case "manager":  return new Manager (name);
    default: throw new Error(`Employee cannot be of type ${type}`);
  }
}

I still have the type code accessors on the subclasses (get type). I’ll usually want
to remove these too, but that may take a bit of time due to other methods that
depend on them. I’ll use Replace Conditional with Polymorphism (272) and Push
Down Method (359) to deal with these. At some point, I’ll have no code that uses
the type getters, so I will subject them to the tender mercies of Remove Dead Code
(237).

Example: Using Indirect Inheritance

Let’s go back to the starting case, but this time, I already have existing subclasses
for part-time and full-time employees, so I can’t subclass from Employee for the type
codes. Another reason to not use direct inheritance is keeping the ability to
change the type of employee.
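
For the indirect-inheritance case, one possible end state can be sketched as follows. This is a minimal sketch under stated assumptions, not the text's own listing: the subclass names (EngineerType, ManagerType, SalesmanType), the static createEmployeeType helper, and the capitalizedName getter are hypothetical names chosen for illustration. The type code becomes a small class hierarchy of its own, so Employee stays a single class and its type can still be reassigned.

```javascript
// Hypothetical sketch: the type code lives in its own hierarchy,
// so Employee itself is never subclassed by job type.
class EmployeeType {
  get capitalizedName() {
    const s = this.toString();
    return s.charAt(0).toUpperCase() + s.slice(1).toLowerCase();
  }
}
class EngineerType extends EmployeeType { toString() { return "engineer"; } }
class ManagerType extends EmployeeType { toString() { return "manager"; } }
class SalesmanType extends EmployeeType { toString() { return "salesman"; } }

class Employee {
  constructor(name, type) {
    this._name = name;
    this.type = type; // goes through the setter below
  }
  get type() { return this._type; }
  set type(arg) { this._type = Employee.createEmployeeType(arg); }

  // Selector logic: maps the type code string to a type object.
  static createEmployeeType(aString) {
    switch (aString) {
      case "engineer": return new EngineerType();
      case "manager": return new ManagerType();
      case "salesman": return new SalesmanType();
      default: throw new Error(`Employee cannot be of type ${aString}`);
    }
  }
  toString() { return `${this._name} (${this.type.capitalizedName})`; }
}

const kim = new Employee("Kim", "engineer");
console.log(kim.toString()); // Kim (Engineer)
kim.type = "manager";        // the type remains mutable
console.log(kim.toString()); // Kim (Manager)
```

Because the selector sits behind the setter, the unknown-type check that validateType performed is covered by the default branch of the switch.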
Just having this method in the subclass is enough to override the superclass
conditional. But if you’re as paranoid as I am, you might do this:

class PerformanceCalculator…
  get amount() {
    let result = 0;
    switch (this.play.type) {
      case "tragedy":
        throw 'bad thing';
      case "comedy":
        result = 30000;
        if (this.performance.audience > 20) {
          result += 10000 + 500 * (this.performance.audience - 20);
        }
        result += 300 * this.performance.audience;
        break;
      default:
        throw new Error(`unknown type: ${this.play.type}`);
    }
    return result;
  }

I could have removed the case for tragedy and let the default branch throw an error.
But I like the explicit throw, and it will only be there for a couple more minutes (which
is why I threw a string, not a better error object).
After a compile-test-commit of that, I move the comedy case down too.

class ComedyCalculator…
  get amount() {
    let result = 30000;
    if (this.performance.audience > 20) {
      result += 10000 + 500 * (this.performance.audience - 20);
    }
    result += 300 * this.performance.audience;
    return result;
  }

I can now remove the superclass amount method, as it should never be called.
But it’s kinder to my future self to leave a tombstone.

class PerformanceCalculator…
  get amount() {
    throw new Error('subclass responsibility');
  }

… and let the variations override it as necessary. So I just push down the case for
comedies:

class PerformanceCalculator…
  get volumeCredits() {
    return Math.max(this.performance.audience - 30, 0);
  }

class ComedyCalculator…
  get volumeCredits() {
    return super.volumeCredits + Math.floor(this.performance.audience / 5);
  }

Status: Creating the Data with the Polymorphic Calculator

Time to reflect on what introducing the polymorphic calculator did to the code.

createStatementData.js
export default function createStatementData(invoice, plays) {
  const result = {};
  result.customer = invoice.customer;
  result.performances = invoice.performances.map(enrichPerformance);
  result.totalAmount = totalAmount(result);
  result.totalVolumeCredits = totalVolumeCredits(result);
  return result;

  function enrichPerformance(aPerformance) {
    const calculator = createPerformanceCalculator(aPerformance, playFor(aPerformance));
    const result = Object.assign({}, aPerformance);
    result.play = calculator.play;
    result.amount = calculator.amount;
    result.volumeCredits = calculator.volumeCredits;
    return result;
  }
  function playFor(aPerformance) {
    return plays[aPerformance.playID];
  }
  function totalAmount(data) {
    return data.performances
      .reduce((total, p) => total + p.amount, 0);
  }
  function totalVolumeCredits(data) {
    return data.performances
      .reduce((total, p) => total + p.volumeCredits, 0);
  }
}

Example

I’ll start with this overused employee example:

class Employee…
  constructor(name, type){
    this.validateType(type);
    this._name = name;
    this._type = type;
  }
  validateType(arg) {
    if (!["engineer", "manager", "salesman"].includes(arg))
      throw new Error(`Employee cannot be of type ${arg}`);
  }
  toString() {return `${this._name} (${this._type})`;}

My first step is to use Encapsulate Variable (132) to self-encapsulate the type code.

class Employee…
  get type() {return this._type;}
  toString() {return `${this._name} (${this.type})`;}

Note that toString uses the new getter by removing the underscore.

I pick one type code, the engineer, to start with. I use direct inheritance,
subclassing the employee class itself. The employee subclass is simple: just overriding
the type code getter with the appropriate literal value.

class Engineer extends Employee {
  get type() {return "engineer";}
}

Although JavaScript constructors can return other objects, things will get messy
if I try to put selector logic in there, since that logic gets intertwined with field
initialization. So I use Replace Constructor with Factory Function (334) to create a
new space for it.

function createEmployee(name, type) {
  return new Employee(name, type);
}

To use the new subclass, I add selector logic into the factory.

function createEmployee(name, type) {
  switch (type) {
    case "engineer": return new Engineer(name, type);
  }
  return new Employee(name, type);
}

I test to ensure that worked out correctly. But, because I’m paranoid, I then
alter the return value of the engineer’s override and test again to ensure the test
fails. That way I know the subclass is being used. I correct the return value and
continue with the other cases. I can do them one at a time, testing after each
change.

class Salesman extends Employee {
  get type() {return "salesman";}
}

class Manager extends Employee {
  get type() {return "manager";}
}

function createEmployee(name, type) {
  switch (type) {
    case "engineer": return new Engineer(name, type);
    case "salesman": return new Salesman(name, type);
    case "manager":  return new Manager (name, type);
  }
  return new Employee(name, type);
}

Once I’m done with them all, I can remove the type code field and the
superclass getting method (the ones in the subclasses remain).

class Employee…
  constructor(name, type){
    this.validateType(type);
    this._name = name;
  }

  toString() {return `${this._name} (${this.type})`;}

After testing to ensure all is still well, I can remove the validation logic, since
the switch is effectively doing the same thing.

class Employee…
  constructor(name, type){
    this._name = name;
  }
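
Pieced together, the direct-inheritance employee example might end up looking like the sketch below. It assumes the constructor's type argument has already been dropped and that the factory rejects unknown type codes in its default branch; the sample names passed to createEmployee are illustrative only.

```javascript
// Sketch of the end state for direct inheritance: one subclass per type code.
class Employee {
  constructor(name) { this._name = name; }
  // `type` is supplied by each subclass; the superclass keeps no type field.
  toString() { return `${this._name} (${this.type})`; }
}
class Engineer extends Employee { get type() { return "engineer"; } }
class Salesman extends Employee { get type() { return "salesman"; } }
class Manager extends Employee { get type() { return "manager"; } }

// The factory holds the selector logic that the constructor cannot.
function createEmployee(name, type) {
  switch (type) {
    case "engineer": return new Engineer(name);
    case "salesman": return new Salesman(name);
    case "manager": return new Manager(name);
    default: throw new Error(`Employee cannot be of type ${type}`);
  }
}

console.log(createEmployee("Ada", "engineer").toString()); // Ada (engineer)
```

The "paranoid" check described in the text corresponds to temporarily changing one of the literal return values and confirming that a test on toString fails.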
function createPerformanceCalculator(aPerformance, aPlay) {
  switch(aPlay.type) {
    case "tragedy": return new TragedyCalculator(aPerformance, aPlay);
    case "comedy" : return new ComedyCalculator(aPerformance, aPlay);
    default:
      throw new Error(`unknown type: ${aPlay.type}`);
  }
}
class PerformanceCalculator {
  constructor(aPerformance, aPlay) {
    this.performance = aPerformance;
    this.play = aPlay;
  }
  get amount() {
    throw new Error('subclass responsibility');
  }
  get volumeCredits() {
    return Math.max(this.performance.audience - 30, 0);
  }
}
class TragedyCalculator extends PerformanceCalculator {
  get amount() {
    let result = 40000;
    if (this.performance.audience > 30) {
      result += 1000 * (this.performance.audience - 30);
    }
    return result;
  }
}
class ComedyCalculator extends PerformanceCalculator {
  get amount() {
    let result = 30000;
    if (this.performance.audience > 20) {
      result += 10000 + 500 * (this.performance.audience - 20);
    }
    result += 300 * this.performance.audience;
    return result;
  }
  get volumeCredits() {
    return super.volumeCredits + Math.floor(this.performance.audience / 5);
  }
}

Again, the code has increased in size as I’ve introduced structure. The benefit
here is that the calculations for each kind of play are grouped together. If most
of the changes will be to this code, it will be helpful to have it clearly separated
like this. Adding a new kind of play requires writing a new subclass and adding
it to the creation function.
The example gives some insight as to when using subclasses like this is useful.
Here, I’ve moved the conditional lookup from two functions (amountFor and
volumeCreditsFor) to a single constructor function createPerformanceCalculator. The more
functions there are that depend on the same type of polymorphism, the
more useful this approach becomes.
An alternative to what I’ve done here would be to have createPerformanceData return
the calculator itself, instead of the calculator populating the intermediate data
structure. One of the nice features of JavaScript’s class system is that with it, using
getters looks like regular data access. My choice on whether to return the instance
or calculate separate output data depends on who is using the downstream data
structure. In this case, I preferred to show how to use the intermediate
data structure to hide the decision to use a polymorphic calculator.

Final Thoughts

This is a simple example, but I hope it will give you a feeling for what refactoring
is like. I’ve used several refactorings, including Extract Function (106), Inline Variable
(123), Move Function (198), and Replace Conditional with Polymorphism (272).
There were three major stages to this refactoring episode: decomposing the
original function into a set of nested functions, using Split Phase (154) to separate
the calculation and printing code, and finally introducing a polymorphic calculator
for the calculation logic. Each of these added structure to the code, enabling me
to better communicate what the code was doing.
As is often the case with refactoring, the early stages were mostly driven by
trying to understand what was going on. A common sequence is: Read the code,
gain some insight, and use refactoring to move that insight from your head back
into the code. The clearer code then makes it easier to understand it, leading to
deeper insights and a beneficial positive feedback loop. There are still some
improvements I could make, but I feel I’ve done enough to pass my test of leaving
the code significantly better than how I found it.
I’m talking about improving the code, but programmers love to argue
about what good code looks like. I know some people object to my preference for
small, well-named functions. If we consider this to be a matter of aesthetics,
where nothing is either good or bad but thinking makes it so, we lack any guide
but personal taste. I believe, however,
that we can go beyond taste and say that the true test of good code is how easy
it is to change it. Code should be obvious: When someone needs to make a
change, they should be able to find the code to be changed easily and to make
the change quickly without introducing any errors. A healthy code base maximizes
our productivity, allowing us to build more features for our users both faster and
more cheaply. To keep code healthy, pay attention to what is getting between
the programming team and that ideal, then refactor to get closer to the ideal.

Replace Type Code with Subclasses

subsumes: Replace Type Code with State/Strategy
subsumes: Extract Subclass
inverse of: Remove Subclass (369)

function createEmployee(name, type) {
  return new Employee(name, type);
}

function createEmployee(name, type) {
  switch (type) {
    case "engineer": return new Engineer(name);
    case "salesman": return new Salesman(name);
    case "manager":  return new Manager (name);
  }
}

Motivation

Software systems often need to represent different kinds of a similar thing. I may
classify employees by their job type (engineer, manager, salesman), or orders by
their priority (rush, regular). My first tool for handling this is some kind of type
code field; depending on the language, that might be an enum, symbol, string,
or number. Often, this type code will come from an external service that provides
me with the data I’m working on.
Most of the time, such a type code is all I need. But there are a couple of
situations where I could do with something more, and that something more is
subclasses. There are two things that are particularly enticing about subclasses.
First, they allow me to use polymorphism to handle conditional logic. I find this
most helpful when I have several functions that invoke different behavior
depending on the value of the type code. With subclasses, I can apply Replace
Conditional with Polymorphism (272) to these functions.
The second case is where I have fields or methods that are only valid for
particular values of a type code, such as a sales quota that’s only applicable to the
“salesman” type code. I can then create the subclass and apply Push Down Field
(361). While I can include validation logic to ensure a field is only used when
the type code has the correct value, using a subclass makes the relationship more
explicit.
When using Replace Type Code with Subclasses, I need to consider whether
to apply it directly to the class I’m looking at, or to the type code itself. Do I
make engineer a subtype of employee, or should I give the employee an employee
type property which can have subtypes for engineer and manager? Using direct
subclassing is simpler, but I can’t use it for the job type if I need it for something
else. I also can’t use direct subclasses if the type is mutable. If I need to move
the subclasses to an employee type property, I can do that by using Replace
Primitive with Object (174) on the type code to create an employee type class and
then using Replace Type Code with Subclasses on that new class.

Mechanics

Self-encapsulate the type code field.

Pick one type code value. Create a subclass for that type code. Override the
type code getter to return the literal type code value.

Create selector logic to map from the type code parameter to the new
subclass.
  With direct inheritance, use Replace Constructor with Factory Function (334) and put
  the selector logic in the factory. With indirect inheritance, the selector logic may
  stay in the constructor.

Test.

Repeat creating the subclass and adding to the selector logic for each type
code value. Test after each change.

Remove the type code field.

Test.

Use Push Down Method (359) and Replace Conditional with Polymorphism (272)
on any methods that use the type code accessors. Once all are replaced, you
can remove the type code accessors.
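
To illustrate how subclasses let polymorphism take over conditional logic that reads a type code, here is a small before-and-after sketch. The vacationDays property and its values are hypothetical, invented only for this illustration; they do not come from the employee example itself.

```javascript
// Before (hypothetical): a method that branches on the type code accessor.
class EmployeeBefore {
  constructor(name, type) { this._name = name; this._type = type; }
  get type() { return this._type; }
  get vacationDays() {
    switch (this.type) {
      case "manager": return 30;
      case "engineer": return 25;
      default: return 20;
    }
  }
}

// After: each subclass overrides the method, so nothing reads `type`.
class Employee {
  constructor(name) { this._name = name; }
  get vacationDays() { return 20; } // the common case stays on the superclass
}
class Engineer extends Employee { get vacationDays() { return 25; } }
class Manager extends Employee { get vacationDays() { return 30; } }
class Salesman extends Employee {}

const staff = [new Engineer("Ada"), new Manager("Grace"), new Salesman("Sam")];
console.log(staff.map(e => e.vacationDays).join(", ")); // 25, 30, 20
```

Once every such method has been converted, no caller depends on the type getter any more, which is the point at which the accessors become dead code.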
But the most important thing to learn from this example is the rhythm of
refactoring. Whenever I’ve shown people how I refactor, they are surprised by
how small my steps are, each step leaving the code in a working state that
compiles and passes its tests. I was just as surprised myself when Kent Beck showed
me how to do this in a hotel room in Detroit two decades ago. The key to effective
refactoring is recognizing that you go faster when you take tiny steps, the code
is never broken, and you can compose those small steps into substantial changes.
Remember that, and the rest is silence.

Push Down Field

inverse of: Pull Up Field (353)
Motivation
If a field is only used by one subclass (or a small proportion of subclasses), I
move it to those subclasses.
Mechanics
Declare field in all subclasses that need it.
Remove the field from the superclass.
Test.
Remove the field from all subclasses that don’t need it.
Test.
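
In JavaScript, where a field appears when it is first assigned, these steps amount to moving the assignment from the superclass constructor into the subclass that needs it. The sketch below assumes a sales quota that only salesmen use; the class shapes and the quota getter are illustrative, not taken from a listing in the text.

```javascript
// Before: every employee carries a quota, although only salesmen use it.
class EmployeeBefore {
  constructor(name, quota) {
    this._name = name;
    this._quota = quota;
  }
}

// After: the field is assigned only in the subclass that needs it.
class Employee {
  constructor(name) { this._name = name; }
  get name() { return this._name; }
}
class Engineer extends Employee {}
class Salesman extends Employee {
  constructor(name, quota) {
    super(name);
    this._quota = quota; // the pushed-down field
  }
  get quota() { return this._quota; }
}

const sam = new Salesman("Sam", 100);
const ada = new Engineer("Ada");
console.log(sam.quota);       // 100
console.log("_quota" in ada); // false
```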
Test.
Remove the method from each subclass that doesn’t need it.
Test.

Chapter 2
Principles in Refactoring
The example in the previous chapter should have given you a decent feel of what
refactoring is. Now you have that, it’s a good time to step back and talk about
some of the broader principles in refactoring.
Defining Refactoring
Like many terms in software development, “refactoring” is often used very
loosely by practitioners. I use the term more precisely, and find it useful to use
it in that more precise form. (These definitions are the same as those I gave in
the first edition of this book.) The term “refactoring” can be used either as a noun
or a verb. The noun’s definition is:
structured differently. So I swap hats and refactor for a while. Once the code is
better structured, I swap hats back and add the new capability. Once I get the
new capability working, I realize I coded it in a way that’s awkward to understand,
so I swap hats again and refactor. All this might take only ten minutes, but during
this time I’m always aware of which hat I’m wearing and the subtle difference
that makes to how I program.

Why Should We Refactor?

I don’t want to claim refactoring is the cure for all software ills. It is no “silver
bullet.” Yet it is a valuable tool: a pair of silver pliers that helps you keep a good
grip on your code. Refactoring is a tool that can (and should) be used for several
purposes.

Refactoring Improves the Design of Software

Without refactoring, the internal design (the architecture) of software tends to
decay. As people change code to achieve short-term goals, often without a full
comprehension of the architecture, the code loses its structure. It becomes harder
for me to see the design by reading the code. Loss of the structure of code has
a cumulative effect. The harder it is to see the design in the code, the harder it
is for me to preserve it, and the more rapidly it decays. Regular refactoring helps
keep the code in shape.
Poorly designed code usually takes more code to do the same things, often
because the code quite literally does the same thing in several places. Thus an
important aspect of improving design is to eliminate duplicated code. It’s not that
reducing the amount of code will make the system run any faster; the effect on
the footprint of the programs rarely is significant. Reducing the amount of
code does, however, make a big difference in modification of the code. The
more code there is, the harder it is to modify correctly. There’s more code for
me to understand. I change this bit of code here, but the system doesn’t do what
I expect because I didn’t change that bit over there that does much the same
thing in a slightly different context. By eliminating duplication, I ensure that the
code says everything once and only once, which is the essence of good design.

code to make some changes. That user, who we often forget, is actually the most
important. Who cares if the computer takes a few more cycles to compile
something? Yet it does matter if it takes a programmer a week to make a change that
would have taken only an hour with proper understanding of my code.
The trouble is that when I’m trying to get the program to work, I’m not thinking
about that future developer. It takes a change of rhythm to make the code easier
to understand. Refactoring helps me make my code more readable. Before
refactoring, I have code that works but is not ideally structured. A little time
spent on refactoring can make the code better communicate its purpose and say
more clearly what I want.
I’m not necessarily being altruistic about this. Often, this future developer is
myself. This makes refactoring even more important. I’m a very lazy programmer.
One of my forms of laziness is that I never remember things about the code I
write. Indeed, I deliberately try not to remember anything I can look up, because
I’m afraid my brain will get full. I make a point of trying to put everything I
should remember into the code so I don’t have to remember it. That way I’m
less worried about Maudite [maudite] killing off my brain cells.

Refactoring Helps Me Find Bugs

Help in understanding the code also means help in spotting bugs. I admit I’m
not terribly good at finding bugs. Some people can read a lump of code and see
bugs; I cannot. However, I find that if I refactor code, I work deeply on
understanding what the code does, and I put that new understanding right back into
the code. By clarifying the structure of the program, I clarify certain assumptions
I’ve made, to a point where even I can’t avoid spotting the bugs.
It reminds me of a statement Kent Beck often makes about himself: “I’m not a
great programmer; I’m just a good programmer with great habits.” Refactoring
helps me be much more effective at writing robust code.

Refactoring Helps Me Program Faster

In the end, all the earlier points come down to this: Refactoring helps me develop
code more quickly.
This sounds counterintuitive. When I talk about refactoring, people can easily
see that it improves quality. Better internal design, readability, reducing bugs: all
these improve quality. But doesn’t the time I spend on refactoring reduce the
speed of development?
When I talk to software developers who have been working on a system for a
while, I often hear that they were able to make progress rapidly at first, but now
it takes much longer to add new features. Every new feature requires more and
more time to understand how to fit it into the existing code base, and once it’s
added, bugs often crop up that take even longer to fix. The code base starts
looking like a series of patches covering patches, and it takes an exercise in
archaeology to figure out how things work. This burden slows down adding new
features, to the point that developers wish they could start again from a blank
slate.
I can visualize this state of affairs with the following pseudograph:

But some teams report a different experience. They find they can add new
features faster because they can leverage the existing things by quickly building
on what’s already there.

The difference between these two is the internal quality of the software.
Software with a good internal design allows me to easily find how and where I need
to make changes to add a new feature. Good modularity allows me to only have to
understand a small subset of the code base to make a change. If the code is clear,
I’m less likely to introduce a bug, and if I do, the debugging effort is much easier.
Done well, my code base turns into a platform for building new features for its
domain.
I refer to this effect as the Design Stamina Hypothesis [mf-dsh]: By putting our
effort into a good internal design, we increase the stamina of the software effort,
allowing us to go faster for longer. I can’t prove that this is the case, which is
why I refer to it as a hypothesis. But it explains my experience, together with the
experience of hundreds of great programmers that I’ve got to know over my career.
Twenty years ago, the conventional wisdom was that to get this kind of good
design, it had to be completed before starting to program, because once we
wrote the code, we could only face decay. Refactoring changes this picture.
We now know we can improve the design of existing code, so we can form and
improve a design over time, even as the needs of the program change. Since it
is very difficult to do a good design up front, refactoring becomes vital to
achieving that virtuous path of rapid functionality.

Pull Up Constructor Body

Motivation

Constructors are tricky things. They aren’t quite normal methods, so I’m more
restricted in what I can do with them.
If I see subclass methods with common behavior, my first thought is to use
Extract Function (106) followed by Pull Up Method (350), which will move it nicely
into the superclass. Constructors tangle that, because they have special rules
about what can be done in what order, so I need a slightly different approach.
If this refactoring starts getting messy, I reach for Replace Constructor with Factory
Function (334).

Mechanics

Define a superclass constructor, if one doesn’t already exist. Ensure it’s called
by subclass constructors.

Use Slide Statements (223) to move any common statements to just after the
super call.

Remove the common code from each subclass and put it in the superclass.
Add to the super call any constructor parameters referenced in the
common code.

Test.

If there is any common code that cannot move to the start of the constructor,
use Extract Function (106) followed by Pull Up Method (350).

Example

I start with the following code:

class Party {}

class Department extends Party {
  constructor(name, staff){
    super();
    this._name = name;
    this._staff = staff;
  }
  // rest of class...

The common code here is the assignment of the name. I use Slide Statements
(223) to move the assignment in Employee next to the call to super():

class Employee extends Party {
  constructor(name, id, monthlyCost) {
    super();
    this._name = name;
    this._id = id;
    this._monthlyCost = monthlyCost;
  }
  // rest of class...

With that tested, I move the common code to the superclass. Since that code
contains a reference to a constructor argument, I pass that in as a parameter.

class Party…
  constructor(name){
    this._name = name;
  }

class Employee…
  constructor(name, id, monthlyCost) {
    super(name);
    this._id = id;
    this._monthlyCost = monthlyCost;
  }

class Department…
  constructor(name, staff){
    super(name);
    this._staff = staff;
  }

Run the tests, and I’m done.
Most of the time, constructor behavior will work like this: Do the common
elements first (with a super call), then do extra work that the subclass needs.
Occasionally, however, there is some common behavior later.
Consider this example:

class Employee…
  constructor (name) {...}

  get isPrivileged() {...}

  assignCar() {...}

class Manager extends Employee…
  constructor(name, grade) {
    super(name);
    this._grade = grade;
    if (this.isPrivileged) this.assignCar(); // every subclass does this
  }

  get isPrivileged() {
    return this._grade > 4;
  }

The wrinkle here comes from the fact that the call to isPrivileged can’t be made
until after the grade field is assigned, and that can only be done in the subclass.
In this case, I do Extract Function (106) on the common code:

class Manager…
  constructor(name, grade) {
    super(name);
    this._grade = grade;
    this.finishConstruction();
  }

  finishConstruction() {
    if (this.isPrivileged) this.assignCar();
  }

Then, I use Pull Up Method (350) to move it to the superclass.

class Employee…
  finishConstruction() {
    if (this.isPrivileged) this.assignCar();
  }
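
The two parts of the Pull Up Constructor Body example (the Party, Employee, and Department constructors, and the Manager case with finishConstruction) can be combined into one runnable sketch. Several details here are assumptions made so the pieces fit together: Manager takes the full Employee argument list plus a grade, the default isPrivileged returns false, and assignCar simply sets a flag exposed through a hypothetical hasCar getter.

```javascript
class Party {
  constructor(name) { this._name = name; } // common assignment, pulled up
  get name() { return this._name; }
}

class Employee extends Party {
  constructor(name, id, monthlyCost) {
    super(name); // the name travels up as a parameter
    this._id = id;
    this._monthlyCost = monthlyCost;
  }
  get isPrivileged() { return false; }   // assumed default
  assignCar() { this._hasCar = true; }   // stand-in for the real behavior
  get hasCar() { return this._hasCar === true; }
  // Common code that must run late, pulled up from the subclasses.
  finishConstruction() {
    if (this.isPrivileged) this.assignCar();
  }
}

class Manager extends Employee {
  constructor(name, id, monthlyCost, grade) {
    super(name, id, monthlyCost);
    this._grade = grade;
    this.finishConstruction(); // only valid once _grade is assigned
  }
  get isPrivileged() { return this._grade > 4; }
}

class Department extends Party {
  constructor(name, staff) {
    super(name);
    this._staff = staff;
  }
}

const boss = new Manager("Grace", 1, 5000, 5);
const junior = new Manager("Linus", 2, 3000, 3);
const ops = new Department("Ops", [boss, junior]);
console.log(boss.hasCar, junior.hasCar, ops.name); // true false Ops
```

Calling finishConstruction from the Employee constructor instead would run before _grade exists, which is why the call stays at the end of the subclass constructor.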
Many dynamic languages do not define fields as part of their class definition— “It’s like I want to go 100 miles east but instead of just traipsing through the
instead, fields appear when they are first assigned to. In this case, pulling up a woods, I’m going to drive 20 miles north to the highway and then I’m going to
field is essentially a consequence of Pull Up Constructor Body (355). go 100 miles east at three times the speed I could have if I just went straight
there. When people are pushing you to just go straight there, sometimes you need
to say, ‘Wait, I need to check the map and find the quickest route.’ The prepara-
Mechanics
tory refactoring does that for me.”
Inspect all users of the candidate field to ensure they are used in the — Jessica Kerr,
same way. https://martinfowler.com/articles/preparatory-refactoring-example.html
If the fields have different names, use Rename Field (244) to give them the The same happens when fixing a bug. Once I’ve found the cause of the problem,
same name. I see that it would be much easier to fix should I unify the three bits of copied
code causing the error into one. Or perhaps separating some update logic from
Create a new field in the superclass. queries will make it easier to avoid the tangling that’s causing the error. By
The new field will need to be accessible to subclasses (protected in common refactoring to improve the situation, I also increase the chances that the bug will
languages). stay fixed, and reduce the chances that others will appear in the same crevices
of the code.
Delete the subclass fields.
Test. Comprehension Refactoring: Making Code Easier to Understand
Before I can change some code, I need to understand what it does. This code
may have been written by me or by someone else. Whenever I have to think to
understand what the code is doing, I ask myself if I can refactor the code to make
that understanding more immediately apparent. I may be looking at some condi-
tional logic that’s structured awkwardly. I may have wanted to use some existing
functions but spent several minutes figuring out what they did because they were
named badly.
At that point I have some understanding in my head, but my head isn’t a very
good record of such details. As Ward Cunningham puts it, by refactoring I move
the understanding from my head into the code itself. I then test that understanding
by running the software to see if it still works. If I move my understanding into
the code, it will be preserved longer and be visible to my colleagues.
That doesn’t just help me in the future—it often helps me right now. Early on,
I do comprehension refactoring on little details. I rename a couple variables now
that I understand what they are, or I chop a long function into smaller parts.
Then, as the code gets clearer, I find I can see things about the design that I
could not see before. Had I not changed the code, I probably never would have
seen these things, because I’m just not clever enough to visualize all these changes
in my head. Ralph Johnson describes these early refactorings as wiping the dirt
off a window so you can see beyond. When I’m studying code, refactoring leads
me to higher levels of understanding that I would otherwise miss. Those who
dismiss comprehension refactoring as useless fiddling with the code don’t realize
that by foregoing it they never see the opportunities hidden behind the confusion.
Litter-Pickup Refactoring

A variation of comprehension refactoring is when I understand what the code is
doing, but realize that it’s doing it badly. The logic is unnecessarily convoluted,
or I see functions that are nearly identical and can be replaced by a single
parameterized function. There’s a bit of a tradeoff here. I don’t want to spend a lot of
time distracted from the task I’m currently doing, but I also don’t want to leave
the trash lying around and getting in the way of future changes. If it’s easy to
change, I’ll do it right away. If it’s a bit more effort to fix, I might make a note
of it and fix it when I’m done with my immediate task.
Sometimes, of course, it’s going to take a few hours to fix, and I have more
urgent things to do. Even then, however, it’s usually worthwhile to make it a little
bit better. As the old camping adage says, always leave the camp site cleaner
than when you found it. If I make it a little better each time I pass through the
code, over time it will get fixed. The nice thing about refactoring is that I don’t
break the code with each small step—so, sometimes, it takes months to complete
the job but the code is never broken even when I’m part way through it.

Planned and Opportunistic Refactoring

The examples above—preparatory, comprehension, litter-pickup refactoring—are
all opportunistic. I don’t set aside time at the beginning to spend on
refactoring—instead, I do refactoring as part of adding a feature or fixing a bug. It’s part
of my natural flow of programming. Whether I’m adding a feature or fixing a
bug, refactoring helps me do the immediate task and also sets me up to make
future work easier. This is an important point that’s frequently missed. Refactoring
isn’t an activity that’s separated from programming—any more than you set aside
time to write if statements. I don’t put time on my plans to do refactoring; most
refactoring happens while I’m doing other things.
It’s also a common error to see refactoring as something people do to fix past
mistakes or clean up ugly code. Certainly you have to refactor when you run into
ugly code, but excellent code needs plenty of refactoring too. Whenever I
write code, I’m making tradeoffs—how much do I need to parameterize, where
to draw the lines between functions? The tradeoffs I made correctly for yesterday’s
feature set may no longer be the right ones for the new features I’m adding today.
The advantage is that clean code is easier to refactor when I need to change
those tradeoffs to reflect the new reality.

Pull Up Field

inverse of: Push Down Field (361)

class Employee {...} // Java

class Salesman extends Employee {
  private String name;
}
class Engineer extends Employee {
  private String name;
}

class Employee {
  protected String name;
}

class Salesman extends Employee {...}
class Engineer extends Employee {...}

Motivation

If subclasses are developed independently, or combined through refactoring, I
often find that they duplicate features. In particular, certain fields can be
duplicates. Such fields sometimes have similar names—but not always. The only way
I can tell what is going on is by looking at the fields and examining how they
are used. If they are being used in a similar way, I can pull them up into the
superclass.
By doing this, I reduce duplication in two ways. I remove the duplicate data
declaration and I can then move behavior that uses the field from the subclasses
to the superclass.
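As a rough JavaScript counterpart to the Java sketch for Pull Up Field, the following assumes a constructor-assigned `_name` field and a `name` getter, neither of which the sketch specifies. JavaScript has no `protected` modifier, so the underscore prefix is only a naming convention here.

```javascript
// Hypothetical end state after Pull Up Field: the duplicated field
// now lives on the superclass only.
class Employee {
  constructor(name) {
    this._name = name; // formerly declared separately in each subclass
  }
  get name() {
    return this._name; // behavior that uses the field can move up as well
  }
}

// The subclasses no longer declare their own copy of the field.
class Salesman extends Employee {}
class Engineer extends Employee {}

const salesman = new Salesman("Ada");
const engineer = new Engineer("Grace");
console.log(salesman.name, engineer.name);
```

Both subclasses read the same superclass field, so any later change to how the name is stored happens in one place.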
I look at both classes and see that they refer to the monthlyCost property which
isn’t defined on the superclass, but is present in both subclasses. Since I’m in a
dynamic language, I’m OK; if I were in a static language, I’d need to define an
abstract method on Party.
The methods have different names, so I Change Function Declaration (124) to
make them the same.

class Department…
  get annualCost() {
    return this.monthlyCost * 12;
  }

I copy the method from one subclass and paste it into the superclass.

class Party…
  get annualCost() {
    return this.monthlyCost * 12;
  }

In a static language, I’d compile to ensure that all the references were OK. That
won’t help me here, so I first remove annualCost from Employee, test, and then remove
it from Department.
That completes the refactoring, but does leave a question. annualCost calls
monthlyCost, but monthlyCost doesn’t appear in the Party class. It all works, because
JavaScript is a dynamic language—but there is value in signaling that subclasses
of Party should provide an implementation for monthlyCost, particularly if more
subclasses get added later on. A good way to provide this signal is a trap method
like this:

class Party…
  get monthlyCost() {
    throw new SubclassResponsibilityError();
  }

I call such an error a subclass responsibility error as that was the name used
in Smalltalk.

“for each desired change, make the change easy (warning: this may be hard),
then make the easy change”
— Kent Beck,
https://twitter.com/kentbeck/status/250733358307500032

For a long time, people thought of writing software as a process of accretion:
To add new features, we should be mostly adding new code. But good developers
know that, often, the fastest way to add a new feature is to change the code to
make it easy to add. Software should thus never be thought of as “done.” As new
capabilities are needed, the software changes to reflect that. Those changes can
often be greater in the existing code than in the new code.
All this doesn’t mean that planned refactoring is always wrong. If a team has
neglected refactoring, it often needs dedicated time to get their code base into a
better state for new features, and a week spent refactoring now can repay itself
over the next couple of months. Sometimes, even with regular refactoring I’ll see
a problem area grow to the point when it needs some concerted effort to fix. But
such planned refactoring episodes should be rare. Most refactoring effort should
be the unremarkable, opportunistic kind.
One bit of advice I’ve heard is to separate refactoring work and new feature
additions into different version-control commits. The big advantage of this is that
they can be reviewed and approved independently. I’m not convinced of this,
however. Too often, the refactorings are closely interwoven with adding new
features, and it’s not worth the time to separate them out. This can also remove
the context for the refactoring, making the refactoring commits hard to justify.
Each team should experiment to find what works for them; just remember that
separating refactoring commits is not a self-evident principle—it’s only worthwhile
if it makes life easier.

Long-Term Refactoring

Most refactoring can be completed within a few minutes—hours at most. But
there are some larger refactoring efforts that can take a team weeks to complete.
Perhaps they need to replace an existing library with a new one. Or pull some
section of code out into a component that they can share with another team. Or
fix some nasty mess of dependencies that they had allowed to build up.
Even in such cases, I’m reluctant to have a team do dedicated refactoring. Often,
a useful strategy is to agree to gradually work on the problem over the course
of the next few weeks. Whenever anyone goes near any code that’s in the
refactoring zone, they move it a little way in the direction they want to improve. This
takes advantage of the fact that refactoring doesn’t break the code—each small
change leaves everything in a still-working state. To change from one library to
another, start by introducing a new abstraction that can act as an interface
to either library. Once the calling code uses this abstraction, it’s much easier
to switch one library for another. (This tactic is called Branch By Abstraction
[mf-bba].)

Refactoring in a Code Review

Some organizations do regular code reviews; those that don’t would do better if
they did. Code reviews help spread knowledge through a development team.
Reviews help more experienced developers pass knowledge to those less
experienced. They help more people understand more aspects of a large software system.
They are also very important in writing clear code. My code may look clear to
me but not to my team. That’s inevitable—it’s hard for people to put themselves
in the shoes of someone unfamiliar with whatever they are working on. Reviews
also give the opportunity for more people to suggest useful ideas. I can only think
of so many good ideas in a week. Having other people contribute makes my life
easier, so I always look for reviews.
I’ve found that refactoring helps me review someone else’s code. Before I
started using refactoring, I could read the code, understand it to some degree,
and make suggestions. Now, when I come up with ideas, I consider whether they
can be easily implemented then and there with refactoring. If so, I refactor. When
I do it a few times, I can see more clearly what the code looks like with the
suggestions in place. I don’t have to imagine what it would be like—I can see it.
As a result, I can come up with a second level of ideas that I would never have
realized had I not refactored.
Refactoring also helps get more concrete results from the code review. Not
only are there suggestions; many suggestions are implemented there and then.
You end up with much more of a sense of accomplishment from the exercise.
How I’d embed refactoring into a code review depends on the nature of the
review. The common pull request model, where a reviewer looks at code without
the original author, doesn’t work too well. It’s better to have the original author
of the code present because the author can provide context on the code and
fully appreciate the reviewers’ intentions for their changes. I’ve had my best
experiences with this by sitting one-on-one with the original author, going
through the code and refactoring as we go. The logical conclusion of this style
is pair programming: continuous code review embedded within the process of
programming.

Often, Pull Up Method comes after other steps. I see two methods in different
classes that can be parameterized in such a way that they end up as essentially
the same method. In that case, the smallest step is for me to apply Parameterize
Function (310) separately and then Pull Up Method.
The most awkward complication with Pull Up Method is if the body of the
method refers to features that are on the subclass but not on the superclass.
When that happens, I need to use Pull Up Field (353) and Pull Up Method on
those elements first.
If I have two methods with a similar overall flow, but differing in details, I’ll
consider the Form Template Method [mf-ft].

Mechanics

Inspect methods to ensure they are identical.
If they do the same thing, but are not identical, refactor them until they have
identical bodies.
Check that all method calls and field references inside the method body refer
to features that can be called from the superclass.
If the methods have different signatures, use Change Function Declaration (124)
to get them to the one you want to use on the superclass.
Create a new method in the superclass. Copy the body of one of the methods
over to it.
Run static checks.
Delete one subclass method.
Test.
Keep deleting subclass methods until they are all gone.

Example

I have two subclass methods that do the same thing.
really doing is not refactoring but less careful restructuring that causes breakages
in the code base.
To a manager who is genuinely savvy about technology and understands the
design stamina hypothesis, refactoring isn’t hard to justify. Such managers should
be encouraging refactoring on a regular basis and be looking for signs that indicate
a team isn’t doing enough. While it does happen that teams do too much
refactoring, it’s much rarer than teams not doing enough.
Of course, many managers and customers don’t have the technical awareness
to know how code base health impacts productivity. In these cases I give my
more controversial advice: Don’t tell!
Subversive? I don’t think so. Software developers are professionals. Our job is
to build effective software as rapidly as we can. My experience is that refactoring
is a big aid to building software quickly. If I need to add a new function and the
design does not suit the change, I find it’s quicker to refactor first and then add
the function. If I need to fix a bug, I need to understand how the software
works—and I find refactoring is the fastest way to do this. A schedule-driven
manager wants me to do things the fastest way I can; how I do it is my
responsibility. I’m being paid for my expertise in programming new capabilities fast,
and the fastest way is by refactoring—therefore I refactor.

When Should I Not Refactor?

It may sound like I always recommend refactoring—but there are cases when it’s
not worthwhile.
If I run across code that is a mess, but I don’t need to modify it, then I don’t
need to refactor it. Some ugly code that I can treat as an API may remain ugly.
It’s only when I need to understand how it works that refactoring gives me any
benefit.
Another case is when it’s easier to rewrite it than to refactor it. This is a tricky
decision. Often, I can’t tell how easy it is to refactor some code unless I spend
some time trying and thus get a sense of how difficult it is. The decision to
refactor or rewrite requires good judgment and experience, and I can’t really boil
it down into a piece of simple advice.

Problems with Refactoring

Whenever anyone advocates for some technique, tool, or architecture, I always
look for problems. Few things in life are all sunshine and clear skies. You need
to understand the tradeoffs to decide when and where to apply something. I do
think refactoring is a valuable technique—one that should be used more by most
teams. But there are problems associated with it, and it’s important to understand
how they manifest themselves and how we can react to them.

Pull Up Method

inverse of: Push Down Method (359)

class Employee {...}

class Salesman extends Employee {
  get name() {...}
}

class Engineer extends Employee {
  get name() {...}
}

class Employee {
  get name() {...}
}

class Salesman extends Employee {...}
class Engineer extends Employee {...}

Motivation

Eliminating duplicate code is important. Two duplicate methods may work fine
as they are, but they are nothing but a breeding ground for bugs in the future.
Whenever there is duplication, there is risk that an alteration to one copy will
not be made to the other. Usually, it is difficult to find the duplicates.
The easiest case of using Pull Up Method is when the methods have the same
body, implying there’s been a copy and paste. Of course it’s not always as obvious
as that. I could just do the refactoring and see if the tests croak—but that puts a
lot of reliance on my tests. I usually find it valuable to look for the differences—
often, they show up behavior that I forgot to test for.
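A minimal sketch of where Pull Up Method can end up, assuming a `Party` superclass with `Employee` and `Department` subclasses and made-up cost figures. `SubclassResponsibilityError` is not a built-in JavaScript error; it is defined here only so the snippet runs, and the constructors are hypothetical.

```javascript
// Assumed helper: not part of the JavaScript standard library.
class SubclassResponsibilityError extends Error {}

class Party {
  // Pulled up: previously duplicated in each subclass.
  get annualCost() {
    return this.monthlyCost * 12;
  }
  // Trap method signaling that subclasses must supply monthlyCost.
  get monthlyCost() {
    throw new SubclassResponsibilityError();
  }
}

class Employee extends Party {
  constructor(monthlyCost) {
    super();
    this._monthlyCost = monthlyCost;
  }
  get monthlyCost() {
    return this._monthlyCost;
  }
}

class Department extends Party {
  constructor(staff) {
    super();
    this._staff = staff;
  }
  get monthlyCost() {
    return this._staff.reduce((sum, e) => sum + e.monthlyCost, 0);
  }
}

const staff = [new Employee(100), new Employee(250)];
const department = new Department(staff);
console.log(staff[0].annualCost);   // 1200
console.log(department.annualCost); // 4200
```

A subclass that forgets to define `monthlyCost` fails loudly at the trap method instead of quietly returning `NaN`.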
top level…
  function charge(customer, usage, provider) {
    return new ChargeCalculator(customer, usage, provider).charge;
  }

I have to decide how to deal with any supporting functions, in this case baseCharge.
My usual approach for a function that returns a value is to first Extract Variable
(119) on that value.

class ChargeCalculator…
  get baseCharge() {
    return this._customer.baseRate * this._usage;
  }
  get charge() {
    const baseCharge = this.baseCharge;
    return baseCharge + this._provider.connectionCharge;
  }

Then, I use Inline Function (115) on the supporting function.

class ChargeCalculator…
  get charge() {
    const baseCharge = this._customer.baseRate * this._usage;
    return baseCharge + this._provider.connectionCharge;
  }

I now have all the processing in a single function, so my next step is to move
the data passed to the constructor to the main method. I first use Change Function
Declaration (124) to add all the constructor parameters to the charge method.

class ChargeCalculator…
  constructor (customer, usage, provider){
    this._customer = customer;
    this._usage = usage;
    this._provider = provider;
  }
  charge(customer, usage, provider) {
    const baseCharge = this._customer.baseRate * this._usage;
    return baseCharge + this._provider.connectionCharge;
  }

top level…
  function charge(customer, usage, provider) {
    return new ChargeCalculator(customer, usage, provider)
      .charge(customer, usage, provider);
  }

Now I can alter the body of charge to use the passed parameters instead. I can
do this one at a time.

refactorings so exacerbate merge problems that they stop refactoring. CI and
refactoring work well together, which is why Kent Beck combined them in Extreme
Programming.
I’m not saying that you should never use feature branches. If they are sufficiently
short, their problems are much reduced. (Indeed, users of CI usually also use
branches, but integrate them with mainline each day.) Feature branches may be
the right technique for open source projects where you have infrequent commits
from programmers who you don’t know well (and thus don’t trust). But in a
full-time development team, the cost that feature branches impose on refactoring is
excessive. Even if you don’t go to full CI, I certainly urge you to integrate
as frequently as possible. You should also consider the objective evidence
[Forsgren et al.] that teams that use CI are more effective in software delivery.

Testing

One of the key characteristics of refactoring is that it doesn’t change the observable
behavior of the program. If I follow the refactorings carefully, I shouldn’t break
anything—but what if I make a mistake? (Or, knowing me, s/if/when.) Mistakes
happen, but they aren’t a problem provided I catch them quickly. Since each
refactoring is a small change, if I break anything, I only have a small change to
look at to find the fault—and if I still can’t spot it, I can revert my version control
to the last working version.
The key here is being able to catch an error quickly. To do this, realistically, I
need to be able to run a comprehensive test suite on the code—and run it
quickly, so that I’m not deterred from running it frequently. This means that in
most cases, if I want to refactor, I need to have self-testing code [mf-stc].
To some readers, self-testing code sounds like a requirement so steep as to be
unrealizable. But over the last couple of decades, I’ve seen many teams build
software this way. It takes attention and dedication to testing, but the benefits
make it really worthwhile. Self-testing code not only enables refactoring—it
also makes it much safer to add new features, since I can quickly find and kill
any bugs I introduce. The key point here is that when a test fails, I can look at the
change I’ve made between when the tests were last running correctly and
the current code. With frequent test runs, that will be only a few lines of code.
By knowing it was those few lines that caused the failure, I can much more easily
find the bug.
This also answers those who are concerned that refactoring carries too much
risk of introducing bugs. Without self-testing code, that’s a reasonable
worry—which is why I put so much emphasis on having solid tests.
There is another way to deal with the testing problem. If I use an environment
that has good automated refactorings, I can trust those refactorings even without
running tests. I can then refactor, providing I only use those refactorings that are
safely automated. This removes a lot of nice refactorings from my menu, but still
leaves me enough to deliver some useful benefits. I’d still rather have self-testing
code, but it’s an option that is useful to have in the toolkit.
This also inspires a style of refactoring that only uses a limited set of refactorings
that can be proven safe. Such refactorings require carefully following the steps,
and are language-specific. But teams using them have found they can do useful
refactoring on large code bases with poor test coverage. I don’t focus on that in
this book, as it’s a newer, less described and understood technique that involves
detailed, language-specific activity. (It is, however, something I hope to talk about
more on my web site in the future. For a taste of it, see Jay Bazuzi’s description
[Bazuzi] of a safer way to do Extract Method (106) in C++.)
Self-testing code is, unsurprisingly, closely associated with Continuous
Integration—it is the mechanism that we use to catch semantic integration conflicts.
Such testing practices are another component of Extreme Programming and a
key part of Continuous Delivery.

Legacy Code

Most people would regard a big legacy as a Good Thing—but that’s one of the
cases where programmers’ view is different. Legacy code is often complex,
frequently comes with poor tests, and, above all, is written by Someone Else
(shudder).
Refactoring can be a fantastic tool to help understand a legacy system. Functions
with misleading names can be renamed so they make sense, awkward
programming constructs smoothed out, and the program turned from a rough rock to a
polished gem. But the dragon guarding this happy tale is the common lack of
tests. If you have a big legacy system with no tests, you can’t safely refactor it
into clarity.
The obvious answer to this problem is that you add tests. But while this sounds
a simple, if laborious, procedure, it’s often much more tricky in practice. Usually, a
system is only easy to put under test if it was designed with testing in mind—in
which case it would have the tests and I wouldn’t be worrying about it.
There’s no simple route to dealing with this. The best advice I can give is to
get a copy of Working Effectively with Legacy Code [Feathers] and follow its guidance.
Don’t be worried by the age of the book—its advice is just as true more than a
decade later. To summarize crudely, it advises you to get the system under test
by finding seams in the program where you can insert tests. Creating these seams
involves refactoring—which is much more dangerous since it’s done without tests,
but is a necessary risk to make progress. This is a situation where safe, automated
refactorings can be a godsend. If all this sounds difficult, that’s because it is.
Sadly, there’s no shortcut to getting out of a hole this deep—which is why I’m
such a strong proponent of writing self-testing code from the start.

For each method called by the command’s execution method, apply Inline
Function (115).
If the supporting function returns a value, use Extract Variable (119) on the call
first and then Inline Function (115).
Use Change Function Declaration (124) to put all the parameters of the
constructor into the command’s execution method instead.
For each field, alter the references in the command’s execution method to
use the parameter instead. Test after each change.
Inline the constructor call and command’s execution method call into the
caller (which is the replacement function).
Test.
Apply Remove Dead Code (237) to the command class.

Example

I’ll begin with this small command object:

class ChargeCalculator {
  constructor (customer, usage, provider){
    this._customer = customer;
    this._usage = usage;
    this._provider = provider;
  }
  get baseCharge() {
    return this._customer.baseRate * this._usage;
  }
  get charge() {
    return this.baseCharge + this._provider.connectionCharge;
  }
}

It is used by code like this:

caller…
  monthCharge = new ChargeCalculator(customer, usage, provider).charge;

The command class is small and simple enough to be better off as a function.
I begin by using Extract Function (106) to wrap the class creation and invocation.

caller…
  monthCharge = charge(customer, usage, provider);
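Following the Replace Command with Function mechanics through to the end would leave a plain function. The sketch below assumes sample `customer` and `provider` objects with made-up values, and keeps the original command class alongside only to check that both give the same result.

```javascript
// The command object from the example, kept here for comparison.
class ChargeCalculator {
  constructor(customer, usage, provider) {
    this._customer = customer;
    this._usage = usage;
    this._provider = provider;
  }
  get baseCharge() {
    return this._customer.baseRate * this._usage;
  }
  get charge() {
    return this.baseCharge + this._provider.connectionCharge;
  }
}

// Assumed end state once every step of the mechanics has been applied.
function charge(customer, usage, provider) {
  const baseCharge = customer.baseRate * usage;
  return baseCharge + provider.connectionCharge;
}

// Hypothetical sample data.
const customer = { baseRate: 3 };
const provider = { connectionCharge: 7 };
const usage = 10;

const viaCommand = new ChargeCalculator(customer, usage, provider).charge;
const viaFunction = charge(customer, usage, provider);
console.log(viaCommand, viaFunction); // 37 37
```

Once callers use only `charge`, the class has no remaining references and can be deleted.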
class ChargeCalculator {
  constructor (customer, usage){
    this._customer = customer;
    this._usage = usage;
  }
  execute() {
    return this._customer.rate * this._usage;
  }
}

function charge(customer, usage) {
  return customer.rate * usage;
}

Motivation

Command objects provide a powerful mechanism for handling complex
computations. They can easily be broken down into separate methods sharing common
state through the fields; they can be invoked via different methods for different
effects; they can have their data built up in stages. But that power comes at a
cost. Most of the time, I just want to invoke a function and have it do its thing.
If that’s the case, and the function isn’t too complex, then a command object is
more trouble than it’s worth and should be turned into a regular function.

Mechanics

Apply Extract Function (106) to the creation of the command and the call to
the command’s execution method.
This creates the new function that will replace the command in due course.

Databases

When I wrote the first edition of this book, I said that refactoring databases was
a problem area. But, within a year of the book’s publication, that was no longer
the case. My colleague Pramod Sadalage developed an approach to evolutionary
database design [mf-evodb] and database refactoring [Ambler & Sadalage] that
is now widely used. The essence of the technique is to combine the structural
changes to a database’s schema and access code with data migration scripts that
can easily compose to handle large changes.
Consider a simple example of renaming a field (column). As in Change Function
Declaration (124), I need to find the original declaration of the structure and all
the callers of this structure and change them in a single change. The complication,
however, is that I also have to transform any data that uses the old field to use
the new one. I write a small hunk of code that carries out this transform and
store it in version control, together with the code that changes any declared
structure and access routines. Then, whenever I need to migrate between two
versions of the database, I run all the migration scripts that exist between my
current copy of the database and my desired version.
As with regular refactoring, the key here is that each individual change is small
yet captures a complete change, so the system still runs after applying the
migration. Keeping them small means they are easy to write, but I can string many of
them into a sequence that can make a significant change to the database’s
structure and the data stored in it.
One difference from regular refactorings is that database changes often are
best separated over multiple releases to production. This makes it easy to reverse
any change that causes a problem in production. So, when renaming a field, my
first commit would add the new database field but not use it. I may then set up
the updates so they update both old and new fields at once. I can then gradually
move the readers over to the new field. Only once they have all moved to the
new field, and I’ve given a little time for any bugs to show themselves, would
I remove the now-unused old field. This approach to database changes is an
example of a general approach of parallel change [mf-pc] (also called
expand-contract).
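The database field rename described in this part of the text can be sketched as a sequence of small migration steps. This is a toy illustration over an in-memory array of records, assuming hypothetical field names (`surname`, `lastName`) and a hypothetical `migrate` helper; real migration tooling runs against an actual database and differs in detail.

```javascript
// Each migration is small, complete on its own, and ordered by version.
const migrations = [
  {
    version: 1, // expand: add the new field, copying the existing data
    up: rows => rows.map(r => ({ ...r, lastName: r.surname })),
  },
  {
    version: 2, // contract: drop the old field once readers have moved
    up: rows => rows.map(({ surname, ...rest }) => rest),
  },
];

// Run every migration between the current and the desired version.
function migrate(rows, fromVersion, toVersion) {
  return migrations
    .filter(m => m.version > fromVersion && m.version <= toVersion)
    .sort((a, b) => a.version - b.version)
    .reduce((current, m) => m.up(current), rows);
}

const original = [{ id: 1, surname: "Rivera" }];
const expanded = migrate(original, 0, 1);   // both fields present
const contracted = migrate(original, 0, 2); // only lastName remains
console.log(expanded, contracted);
```

Stopping at version 1 corresponds to the release in which old and new fields coexist; version 2 would ship only after every reader uses the new field.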
scoreSmoking() {
  if (this._medicalExam.isSmoker) {
    this._healthLevel += 10;
    this._highMedicalRiskFlag = true;
  }
}

This allows me to treat the command similarly to how I’d deal with a nested
function. Indeed, when doing this refactoring in JavaScript, using nested functions
would be a reasonable alternative to using a command. I’d still use a command
for this, partly because I’m more familiar with commands and partly because with
a command I can write tests and debugging calls against the subfunctions.

Refactoring, Architecture, and Yagni

Refactoring has profoundly changed how people think about software architecture.
Early in my career, I was taught that software design and architecture was
something to be worked on, and mostly completed, before anyone started writing
code. Once the code was written, its architecture was fixed and could only decay
due to carelessness.
Refactoring changes this perspective. It allows me to significantly alter the
architecture of software that’s been running in production for years. Refactoring
can improve the design of existing code, as this book’s subtitle implies. But as I
indicated earlier, changing legacy code is often challenging, especially when it
lacks decent tests.
The real impact of refactoring on architecture is in how it can be used to form
a well-designed code base that can respond gracefully to changing needs. The
biggest issue with finishing architecture before coding is that such an approach
assumes the requirements for the software can be understood early on. But expe-
rience shows that this is often, even usually, an unachievable goal. Repeatedly,
I saw people only understand what they really needed from software once they’d
had a chance to use it, and saw the impact it made to their work.
One way of dealing with future changes is to put flexibility mechanisms into
the software. As I write some function, I can see that it has a general applicability.
To handle the different circumstances that I anticipate it to be used in, I can see
a dozen parameters I could add to that function. These parameters are flexibility
mechanisms—and, like most mechanisms, they are not a free lunch. Adding all
those parameters complicates the function for the one case it’s used right now.
If I miss a parameter, all the parameterization I have added makes it harder for
me to add more. I find I often get my flexibility mechanisms wrong—either because
the changing needs didn’t work out the way I expected or my mechanism design
was faulty. Once I take all that into account, most of the time my flexibility
mechanisms actually slow down my ability to react to change.
With refactoring, I can use a different strategy. Instead of speculating on what
flexibility I will need in the future and what mechanisms will best enable that, I
build software that solves only the currently understood needs, but I make this
software excellently designed for those needs. As my understanding of the users’
needs changes, I use refactoring to adapt the architecture to those new demands.
I can happily include mechanisms that don’t increase complexity (such as small,
well-named functions) but any flexibility that complicates the software has to
prove itself before I include it. If I don’t have different values for a parameter
from the callers, I don’t add it to the parameter list. Should the time come that
I need to add it, then Parameterize Function (310) is an easy refactoring to apply. I
often find it useful to estimate how hard it would be to use refactoring later to
support an anticipated change. Only if I can see that it would be substantially
harder to refactor later do I consider adding a flexibility mechanism now.
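To make this concrete, here is a minimal sketch with hypothetical delivery-fee functions and made-up rates. The first version handles only the one case currently needed; the second shows what applying Parameterize Function (310) might look like once a second rate actually appears.

```javascript
// Today: only one rate exists, so there is no rate parameter.
function standardDeliveryFee(order) {
  return order.weightKg * 2;
}

// Later, when callers really do need different rates:
function deliveryFee(order, ratePerKg) {
  return order.weightKg * ratePerKg;
}

const order = { weightKg: 5 };
console.log(standardDeliveryFee(order)); // 10
console.log(deliveryFee(order, 2));      // 10
console.log(deliveryFee(order, 3));      // 15
```

The step from the first function to the second is small, which is why deferring the parameter costs little.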
I repeat this for all the local variables. (This is one of those refactorings that I
felt was sufficiently simple that I haven’t given it an entry in the catalog. I feel
slightly guilty about this.)

class Scorer…
  constructor(candidate, medicalExam, scoringGuide){
    this._candidate = candidate;
    this._medicalExam = medicalExam;
    this._scoringGuide = scoringGuide;
  }

  execute () {
    this._result = 0;
    this._healthLevel = 0;
    this._highMedicalRiskFlag = false;

    if (this._medicalExam.isSmoker) {
      this._healthLevel += 10;
      this._highMedicalRiskFlag = true;
    }
    this._certificationGrade = "regular";
    if (this._scoringGuide.stateWithLowCertification(this._candidate.originState)) {
      this._certificationGrade = "low";
      this._result -= 5;
    }
    // lots more code like this
    this._result -= Math.max(this._healthLevel - 5, 0);
    return this._result;
  }

This approach to design goes under various names: simple design, incremental
design, or yagni [mf-yagni] (originally an acronym for “you aren’t going to need
it”). Yagni doesn’t imply that architectural thinking disappears, although it is
sometimes naively applied that way. I think of yagni as a different style of
incorporating architecture and design into the development process—a style that isn’t
credible without the foundation of refactoring.
Adopting yagni doesn’t mean I neglect all upfront architectural thinking. There
are still cases where refactoring changes are difficult and some preparatory
thinking can save time. But the balance has shifted a long way—I’m much more
inclined to deal with issues later when I understand them better. All this has led
to a growing discipline of evolutionary architecture [Ford et al.] where architects
explore the patterns and practices that take advantage of our ability to iterate
over architectural decisions.

Refactoring and the Wider Software Development Process

If you’ve read the earlier section on problems, one lesson you’ve probably drawn
is that the effectiveness of refactoring is tied to other software practices that a
team uses. Indeed, refactoring’s early adoption was as part of Extreme
Programming [mf-xp] (XP), a process which was notable for putting together a set of
relatively unusual and interdependent practices—such as continuous
integration, self-testing code, and refactoring (the latter two woven into test-driven
development).
Extreme Programming was one of the first agile software methods [mf-nm]
Now I’ve moved all the function’s state to the command object, I can use and, for several years, led the rise of agile techniques. Enough projects now use
refactorings like Extract Function (106) without getting tangled up in all the agile methods that agile thinking is generally regarded as mainstream—but in
variables and their scopes. reality most “agile” projects only use the name. To really operate in an agile way,
a team has to be capable and enthusiastic refactorers—and for that, many aspects
class Scorer… of their process have to align with making refactoring a regular part of their work.
execute () { The first foundation for refactoring is self-testing code. By this, I mean that
this._result = 0;
this._healthLevel = 0;
there is a suite of automated tests that I can run and be confident that, if I made
this._highMedicalRiskFlag = false; an error in my programming, some test will fail. This is such an important
foundation for refactoring that I’ll spend a chapter talking more about this.
this.scoreSmoking(); To refactor on a team, it’s important that each member can refactor when they
this._certificationGrade = "regular"; need to without interfering with others’ work. This is why I encourage Continuous
if (this._scoringGuide.stateWithLowCertification(this._candidate.originState)) {
Integration. With CI, each member’s refactoring efforts are quickly shared with
this._certificationGrade = "low";
this._result -= 5; their colleagues. No one ends up building new work on interfaces that are being
} removed, and if the refactoring is going to cause a problem with someone else’s
// lots more code like this work, we know about this quickly. Self-testing code is also a key element of
this._result -= Math.max(this._healthLevel - 5, 0); Continuous Integration, so there is a strong synergy between the three practices
return this._result; of self-testing code, continuous integration, and refactoring.
}
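The fragment above calls `this.scoreSmoking()` without showing its body. A minimal runnable sketch follows, assuming the extracted method simply holds the smoker check that was lifted out of `execute`. The stub `scoringGuide`, candidate, and medical exam objects are hypothetical stand-ins, and the elided "lots more code" is left out.

```javascript
class Scorer {
  constructor(candidate, medicalExam, scoringGuide) {
    this._candidate = candidate;
    this._medicalExam = medicalExam;
    this._scoringGuide = scoringGuide;
  }

  execute() {
    this._result = 0;
    this._healthLevel = 0;
    this._highMedicalRiskFlag = false;

    this.scoreSmoking();
    this._certificationGrade = "regular";
    if (this._scoringGuide.stateWithLowCertification(this._candidate.originState)) {
      this._certificationGrade = "low";
      this._result -= 5;
    }
    this._result -= Math.max(this._healthLevel - 5, 0);
    return this._result;
  }

  // Extracted method (assumed body): it needs no parameters because every
  // value it reads or writes is already a field on the command object.
  scoreSmoking() {
    if (this._medicalExam.isSmoker) {
      this._healthLevel += 10;
      this._highMedicalRiskFlag = true;
    }
  }
}

// Hypothetical stand-ins for the real collaborators.
const scoringGuide = { stateWithLowCertification: state => state === "XX" };

const smokerScore =
  new Scorer({ originState: "XX" }, { isSmoker: true }, scoringGuide).execute();
const nonSmokerScore =
  new Scorer({ originState: "YY" }, { isSmoker: false }, scoringGuide).execute();

console.log(smokerScore, nonSmokerScore); // -10 0
```

Because the state lives in fields, the extraction changes no signatures, which is the practical payoff of moving the locals onto the command first.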
  execute () {
    let result = 0;
    let healthLevel = 0;
    let highMedicalRiskFlag = false;

    if (this._medicalExam.isSmoker) {
      healthLevel += 10;
      highMedicalRiskFlag = true;
    }
    let certificationGrade = "regular";
    if (this._scoringGuide.stateWithLowCertification(this._candidate.originState)) {
      certificationGrade = "low";
      result -= 5;
    }
    // lots more code like this
    result -= Math.max(healthLevel - 5, 0);
    return result;
  }

That completes Replace Function with Command, but the whole point of doing
this refactoring is to allow me to break down the complicated functions, so let
me outline some steps to achieve that. My next move here is to change all the
local variables into fields. Again, I do these one at a time.

class Scorer…
  constructor(candidate, medicalExam, scoringGuide){
    this._candidate = candidate;
    this._medicalExam = medicalExam;
    this._scoringGuide = scoringGuide;
  }

  execute () {
    this._result = 0;
    let healthLevel = 0;
    let highMedicalRiskFlag = false;

    if (this._medicalExam.isSmoker) {
      healthLevel += 10;
      highMedicalRiskFlag = true;
    }
    let certificationGrade = "regular";
    if (this._scoringGuide.stateWithLowCertification(this._candidate.originState)) {
      certificationGrade = "low";
      this._result -= 5;
    }
    // lots more code like this
    this._result -= Math.max(healthLevel - 5, 0);
    return this._result;
  }

With this trio of practices in place, we enable the yagni design approach that
I talked about in the previous section. Refactoring and yagni positively reinforce
each other: not only is refactoring (and its prerequisites) a foundation for
yagni, but yagni also makes it easier to do refactoring. This is because it’s easier
to change a simple system than one that has lots of speculative flexibility included.
Balance these practices, and you can get into a virtuous circle with a code base that
responds rapidly to changing needs and is reliable.
With these core practices in place, we have the foundation to take advantage
of the other elements of the agile mindset. Continuous Delivery keeps our software
in an always-releasable state. This is what allows many web organizations to
release updates many times a day, but even if we don’t need that, it reduces risk
and allows us to schedule our releases to satisfy business needs rather than
technological constraints. With a firm technical foundation, we can drastically
reduce the time it takes to get a good idea into production code, allowing us to
better serve our customers. Furthermore, these practices increase the reliability
of our software, with fewer bugs to spend time fixing.
Stated like this, it all sounds rather simple, but in practice it isn’t. Software
development, whatever the approach, is a tricky business, with complex interactions
between people and machines. The approach I describe here is a proven
way to handle this complexity, but like any approach, it requires practice and
skill.

Refactoring and Performance

A common concern with refactoring is the effect it has on the performance of a
program. To make the software easier to understand, I often make changes that
will cause the program to run slower. This is an important issue. I don’t belong
to the school of thought that ignores performance in favor of design purity or in
hopes of faster hardware. Software has been rejected for being too slow, and
faster machines merely move the goalposts. Refactoring can certainly make software
go more slowly, but it also makes the software more amenable to performance
tuning. The secret to fast software, in all but hard real-time contexts, is
to write tunable software first and then tune it for sufficient speed.
I’ve seen three general approaches to writing fast software. The most serious
of these is time budgeting, often used in hard real-time systems. As you decompose
the design, you give each component a budget for resources: time and
footprint. That component must not exceed its budget, although a mechanism
for exchanging budgeted resources is allowed. Time budgeting focuses attention
on hard performance times. It is essential for systems, such as heart pacemakers,
in which late data is always bad data. This technique is inappropriate for other
kinds of systems, such as the corporate information systems with which I
usually work.
The second approach is the constant attention approach. Here, every programmer,
all the time, does whatever she can to keep performance high. This is a
common approach that is intuitively attractive, but it does not work very well.
Changes that improve performance usually make the program harder to work
with. This slows development. This would be a cost worth paying if the resulting
software were quicker, but usually it is not. The performance improvements are
spread all around the program; each improvement is made with a narrow
perspective of the program’s behavior, and often with a misunderstanding of how
a compiler, runtime, and hardware behave.

Most of the time, I prefer to pass arguments to a command on the constructor
and have the execute method take no parameters. While this matters less for a
simple decomposition scenario like this, it’s very handy when I want to manipulate
the command with a more complicated parameter setting lifecycle or customizations.
Different command classes can have different parameters but be mixed
together when queued for execution.
I can do these parameters one at a time.

function score(candidate, medicalExam, scoringGuide) {
  return new Scorer(candidate).execute(candidate, medicalExam, scoringGuide);
}
complicated function, but that would take too long to write, let alone for you to
read. Instead, I’ll go with a function that’s short enough not to need it. This one
scores points for an insurance application:

function score(candidate, medicalExam, scoringGuide) {
  let result = 0;
  let healthLevel = 0;
  let highMedicalRiskFlag = false;

  if (medicalExam.isSmoker) {
    healthLevel += 10;
    highMedicalRiskFlag = true;
  }
  let certificationGrade = "regular";
  if (scoringGuide.stateWithLowCertification(candidate.originState)) {
    certificationGrade = "low";
    result -= 5;
  }
  // lots more code like this
  result -= Math.max(healthLevel - 5, 0);
  return result;
}

I begin by creating an empty class and then Move Function (198) to move the
function into it.

function score(candidate, medicalExam, scoringGuide) {
  return new Scorer().execute(candidate, medicalExam, scoringGuide);
}

class Scorer {
  execute (candidate, medicalExam, scoringGuide) {
    let result = 0;
    let healthLevel = 0;
    let highMedicalRiskFlag = false;

    if (medicalExam.isSmoker) {
      healthLevel += 10;
      highMedicalRiskFlag = true;
    }
    let certificationGrade = "regular";
    if (scoringGuide.stateWithLowCertification(candidate.originState)) {
      certificationGrade = "low";
      result -= 5;
    }
    // lots more code like this
    result -= Math.max(healthLevel - 5, 0);
    return result;
  }
}

the factory method to return that object instead of creating it every time.
That change doubled the speed of the system, enough for the tests to be
bearable. It took us about five minutes.
I had speculated with various members of the team (Kent and Martin deny
participating in the speculation) on what was likely wrong with code we
knew very well. We had even sketched some designs for improvements
without first measuring what was going on.
We were completely wrong. Aside from having a really interesting
conversation, we were doing no good at all.
The lesson is: Even if you know exactly what is going on in your system,
measure performance, don’t speculate. You’ll learn something, and nine times
out of ten, it won’t be that you were right!
- Ron Jeffries

The interesting thing about performance is that in most programs, most of their
time is spent in a small fraction of the code. If I optimize all the code equally,
I’ll end up with 90 percent of my work wasted because it’s optimizing code that
isn’t run much. The time spent making the program fast (the time lost because
of lack of clarity) is all wasted time.
The third approach to performance improvement takes advantage of this
90-percent statistic. In this approach, I build my program in a well-factored
manner without paying attention to performance until I begin a deliberate
performance optimization exercise. During this performance optimization, I follow a
specific process to tune the program.
I begin by running the program under a profiler that monitors the program
and tells me where it is consuming time and space. This way I can find that small
part of the program where the performance hot spots lie. I then focus on those
performance hot spots using the same optimizations I would use in the
constant-attention approach. But since I’m focusing my attention on a hot spot, I’m getting
much more effect with less work. Even so, I remain cautious. As in refactoring,
I make the changes in small steps. After each step I compile, test, and rerun
the profiler. If I haven’t improved performance, I back out the change. I
continue the process of finding and removing hot spots until I get the performance
that satisfies my users.
Having a well-factored program helps with this style of optimization in two
ways. First, it gives me time to spend on performance tuning. With well-factored
code, I can add functionality more quickly. This gives me more time to focus on
performance. (Profiling ensures I spend that time on the right place.) Second,
with a well-factored program I have finer granularity for my performance analysis.
My profiler leads me to smaller parts of the code, which are easier to tune. With
clearer code, I have a better understanding of my options and of what kind of
tuning will work.
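One way to act on the "measure, don't speculate" advice before reaching for a full profiler is a small timing harness. This is only a sketch under stated assumptions: it targets Node.js (it uses `process.hrtime.bigint()`), the names `timeIt`, `createFresh`, and `createCached` are hypothetical, and the workload is invented to echo the factory-method anecdote. Real tuning would use a proper profiler against the actual program.

```javascript
// Run `fn` `runs` times and report the elapsed wall-clock time in ms.
function timeIt(label, fn, runs) {
  const start = process.hrtime.bigint();
  let last;
  for (let i = 0; i < runs; i++) last = fn();
  const elapsedMs = Number(process.hrtime.bigint() - start) / 1e6;
  return { label, elapsedMs, last };
}

// Candidate 1: build a fresh object on every call.
function createFresh() {
  return { items: Array.from({ length: 1000 }, (_, i) => i) };
}

// Candidate 2: build the object once and hand back the shared instance.
let cached = null;
function createCached() {
  if (cached === null) cached = Object.freeze(createFresh());
  return cached;
}

const results = [
  timeIt("fresh", createFresh, 2000),
  timeIt("cached", createCached, 2000),
];

for (const r of results) {
  console.log(`${r.label}: ${r.elapsedMs.toFixed(2)} ms`);
}
```

The numbers vary by machine, which is the point: the decision to keep or back out a change rests on what the measurement shows, not on what was expected.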
I’ve found that refactoring helps me write fast software. It slows the software
in the short term while I’m refactoring, but makes it easier to tune during
optimization. I end up well ahead.

Where Did Refactoring Come From?

I’ve not succeeded in pinning down the birth of the term “refactoring.” Good
programmers have always spent at least some time cleaning up their code. They
do this because they have learned that clean code is easier to change than complex
and messy code, and good programmers know that they rarely write clean code
the first time around.
Refactoring goes beyond this. In this book, I’m advocating refactoring as a key
element in the whole process of software development. Two of the first people
to recognize the importance of refactoring were Ward Cunningham and Kent
Beck, who worked with Smalltalk from the 1980s onward. Smalltalk is an
environment that even then was particularly hospitable to refactoring. It is a very
dynamic environment that allows you to quickly write highly functional software.
Smalltalk had a very short compile-link-execute cycle for its time, which made it
easy to change things quickly at a time where overnight compile cycles were not
unknown. It is also object-oriented and thus provides powerful tools for minimizing
the impact of change behind well-defined interfaces. Ward and Kent explored
software development approaches geared to this kind of environment, and their
work developed into Extreme Programming. They realized that refactoring was
important in improving their productivity and, ever since, have been working
with refactoring, applying it to serious software projects and refining it.
Ward and Kent’s ideas were a strong influence on the Smalltalk community,
and the notion of refactoring became an important element in the Smalltalk
culture. Another leading figure in the Smalltalk community is Ralph Johnson, a
professor at the University of Illinois at Urbana-Champaign, who is famous as
one of the authors of the “Gang of Four” [gof] book on design patterns. One of
Ralph’s biggest interests is in developing software frameworks. He explored how
refactoring can help develop an efficient and flexible framework.
Bill Opdyke was one of Ralph’s doctoral students and was particularly interested
in frameworks. He saw the potential value of refactoring and saw that it could
be applied to much more than Smalltalk. His background was in telephone switch
development, in which a great deal of complexity accrues over time and changes
are difficult to make. Bill’s doctoral research looked at refactoring from a tool
builder’s perspective. Bill was interested in refactorings that would be useful for
C++ framework development; he researched the necessary semantics-preserving
refactorings and showed how to prove they were semantics-preserving and how
a tool could implement these ideas. Bill’s doctoral thesis [Opdyke] was the first
substantial work on refactoring.

support a richer lifecycle. I can build in customizations using inheritance and
hooks. If I’m working in a language with objects but without first-class functions,
I can provide much of that capability by using commands instead. Similarly, I
can use methods and fields to help break down a complex function, even in a
language that lacks nested functions, and I can call those methods directly while
testing and debugging.
All these are good reasons to use commands, and I need to be ready to refactor
functions into commands when I need to. But we must not forget that this
flexibility, as ever, comes at a price paid in complexity. So, given the choice between
a first-class function and a command, I’ll pick the function 95% of the time. I
only use a command when I specifically need a facility that simpler approaches
can’t provide.

Like many words in software development, “command” is rather overloaded. In the
context I’m using it here, it is an object that encapsulates a request, following
the command pattern in Design Patterns [gof]. When I use “command” in this sense, I
use “command object” to set the context, and “command” afterwards. The word
“command” is also used in the command-query separation principle [mf-cqs], where a
command is an object method that changes observable state. I’ve always tried to avoid using
command in that sense, preferring “modifier” or “mutator.”

Mechanics

Create an empty class for the function. Name it based on the function.

Use Move Function (198) to move the function to the empty class.

  Keep the original function as a forwarding function until at least the end of the
  refactoring.

  Follow any convention the language has for naming commands. If there is no
  convention, choose a generic name for the command’s execute function, such as
  “execute” or “call”.

Consider making a field for each argument, and move these arguments to
the constructor.

Example

The JavaScript language has many faults, but one of its great decisions was to
make functions first-class entities. I thus don’t have to go through all the hoops
of creating commands for common tasks that I need to do in languages without
this facility. But there are still times when a command is the right tool for the job.
One of these cases is breaking up a complex function so I can better understand
and modify it. To really show the value of this refactoring, I need a long and
But I don’t like using the type code here; it’s generally a bad smell to pass a
code as a literal string. So I prefer to create a new factory function that embeds
the kind of employee I want into its name.

caller…
  const leadEngineer = createEngineer(document.leadEngineer);

top level…
  function createEngineer(name) {
    return new Employee(name, 'E');
  }

A crude way to automate a refactoring is to do text manipulation, such as a
search/replace to change a name, or some simple reorganizing of code for Extract
Variable (119). This is a very crude approach that certainly can’t be trusted without
rerunning tests. It can, however, be a handy first step. I’ll use such macros in
Emacs to speed up my refactoring work when I don’t have more sophisticated
refactorings available to me.
To do refactoring properly, the tool has to operate on the syntax tree of the
code, not on the text. Manipulating the syntax tree is much more reliable to
preserve what the code is doing. This is why at the moment, most refactoring
capabilities are part of powerful IDEs; they use the syntax tree not just for
refactoring but also for code navigation, linting, and the like. This collaboration
between text and syntax tree is what takes them beyond text editors.
Refactoring isn’t just understanding and updating the syntax tree. The tool also
needs to figure out how to rerender the code into text back in the editor view.
All in all, implementing decent refactoring is a challenging programming
exercise—one that I’m mostly unaware of as I gaily use the tools.
Many refactorings are made much safer when applied in a language with static
typing. Consider the simple Rename Function (124). I might have addClient methods
on my Salesman class and on my Server class. I want to rename the one on my
salesman, but it is different in intent from the one on my server, which I don’t
want to rename. Without static typing, the tool will find it difficult to tell whether
any call to addClient is intended for the salesman. In the refactoring browser, it
would generate a list of call sites and I would manually decide which ones to
change. This makes it a nonsafe refactoring that forces me to rerun the tests.
Such a tool is still helpful—but the equivalent operation in Java can be completely
safe and automatic. Since the tool can resolve the method to the correct class
with static typing, I can be confident that the tool changes only the methods it
ought to.
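To see why the rename is hard without static types, consider a sketch in JavaScript. Only the names `addClient`, `Salesman`, and `Server` come from the text; the class bodies and the `register` helper are hypothetical.

```javascript
class Salesman {
  constructor() { this.clients = []; }
  addClient(client) { this.clients.push(client); }
}

class Server {
  constructor() { this.connections = []; }
  addClient(client) { this.connections.push(client); }
}

// Nothing at this call site says which class `target` belongs to, so a
// tool cannot know whether renaming Salesman.addClient should touch it.
function register(target, client) {
  target.addClient(client);
}

const salesman = new Salesman();
const server = new Server();
register(salesman, "Acme");
register(server, "10.0.0.7");

console.log(salesman.clients, server.connections);
```

If `Salesman.addClient` were renamed and the call in `register` updated with it, the `Server` path would break at run time. That is why a tool without type information can only list candidate call sites and leave the choice to the programmer.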
Tools often go further. If I rename a variable, I can be prompted for changes
to comments that use that name. If I use Extract Function (106), the tool spots
some code that duplicates the new function’s body and offers to replace it with
a call. Programming with powerful refactorings like this is a compelling reason
to use an IDE rather than stick with a familiar text editor. Personally I’m a big
user of Emacs, but when working in Java I prefer IntelliJ IDEA or Eclipse—in large
part due to the refactoring support.
While sophisticated refactoring tools are almost magical in their ability to
safely refactor code, there are some edge cases where they slip up. Less mature
tools struggle with reflective calls, such as Method.invoke in Java (although more
mature tools handle this quite well). So even with mostly safe refactorings, it’s
wise to run the test suite every so often to ensure nothing has gone pear-shaped.
Usually I’m refactoring with a mix of automated and manual refactorings, so I
run my tests often enough.
The power of using the syntax tree to analyze and refactor programs is a
compelling advantage for IDEs over simple text editors, but many programmers
prefer the flexibility of their favorite text editor and would like to have
both. A technology that’s currently gaining momentum is Language Servers
[langserver]: software that will form a syntax tree and present an API to text
editors. Such language servers can support many text editors and provide
commands to do sophisticated code analysis and refactoring operations.

Going Further

It seems a little strange to be talking about further reading in only the second
chapter, but this is as good a spot as any to point out there is more material out
there on refactoring that goes beyond the basics in this book.
This book has taught refactoring to many people, but I have focused more on
a refactoring reference than on taking readers through the learning process. If
you are looking for such a book, I suggest Bill Wake’s Refactoring Workbook [Wake]
that contains many exercises to practice refactoring.
Many of those who pioneered refactoring were also active in the software
patterns community. Josh Kerievsky tied these two worlds closely together with
Refactoring to Patterns [Kerievsky], which looks at the most valuable patterns from
the hugely influential “Gang of Four” book [gof] and shows how to use refactoring
to evolve towards them.
This book concentrates on refactoring in general-purpose programming, but
refactoring also applies in specialized areas. Two that have got useful attention
are Refactoring Databases [Ambler & Sadalage] (by Scott Ambler and Pramod
Sadalage) and Refactoring HTML [Harold] (by Elliotte Rusty Harold).
Although it doesn’t have refactoring in the title, also worth including is Michael
Feathers’s Working Effectively with Legacy Code [Feathers], which is primarily a
book about how to think about refactoring an older codebase with poor test
coverage.
Although this book (and its predecessor) are intended for programmers with
any language, there is a place for language-specific refactoring books. Two of my
former colleagues, Jay Fields and Shane Harvey, did this for the Ruby programming
language [Fields et al.].
For more up-to-date material, look up the web representation of this book, as
well as the main refactoring web site: refactoring.com [ref.com].

Example

A quick but wearisome example uses kinds of employees. Consider an employee
class:

class Employee…
  constructor (name, typeCode) {
    this._name = name;
    this._typeCode = typeCode;
  }
  get name() {return this._name;}
  get type() {
    return Employee.legalTypeCodes[this._typeCode];
  }
  static get legalTypeCodes() {
    return {"E": "Engineer", "M": "Manager", "S": "Salesman"};
  }

This is used from

caller…
  candidate = new Employee(document.name, document.empType);

and

caller…
  const leadEngineer = new Employee(document.leadEngineer, 'E');

My first step is to create the factory function. Its body is a simple delegation
to the constructor.

top level…
  function createEmployee(name, typeCode) {
    return new Employee(name, typeCode);
  }

I then find the callers of the constructor and change them, one at a time, to
use the factory function instead.
The first one is obvious:

caller…
  candidate = createEmployee(document.name, document.empType);

With the second case, I could use the new factory function like this:

caller…
  const leadEngineer = createEmployee(document.leadEngineer, 'E');
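Putting the pieces of the employee example together gives a runnable sketch. The `doc` record is a hypothetical stand-in for the `document` object the callers read from (renamed to avoid clashing with the browser global), and its field values are invented. `createEngineer` is the factory-per-kind variant that bakes the 'E' type code into the function name.

```javascript
class Employee {
  constructor(name, typeCode) {
    this._name = name;
    this._typeCode = typeCode;
  }
  get name() { return this._name; }
  get type() { return Employee.legalTypeCodes[this._typeCode]; }
  static get legalTypeCodes() {
    return { E: "Engineer", M: "Manager", S: "Salesman" };
  }
}

// General factory: a plain delegation to the constructor.
function createEmployee(name, typeCode) {
  return new Employee(name, typeCode);
}

// Named factory: callers no longer pass a literal type code.
function createEngineer(name) {
  return new Employee(name, "E");
}

// Hypothetical input record.
const doc = { name: "Ada", empType: "M", leadEngineer: "Grace" };

const candidate = createEmployee(doc.name, doc.empType);
const leadEngineer = createEngineer(doc.leadEngineer);

console.log(candidate.type, leadEngineer.type); // Manager Engineer
```

Once every caller goes through a factory, the constructor call exists in one place per factory, so later changes to how employees are built stay local.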
  const martin = new Person("1234");
  martin.name = "martin";
  martin.id = "1234";   // this call to the setter is the one being removed

I do this in each place I create a person, testing after each change.
When they are all done, I can apply Inline Function (115) to the setting method.

Chapter 3 Bad Smells in Code

skim the table) and try to identify what it is you’re smelling, then go to the
refactorings we suggest to see whether they will help you. You may not find the
exact smell you can detect, but hopefully it should point you in the right direction.

Duplicated Code
If you see the same code structure in more than one place, you can be sure that
your program will be better if you find a way to unify them. Duplication means
that every time you read these copies, you need to read them carefully to see if
there’s any difference. If you need to change the duplicated code, you have to
find and catch each duplication.
The simplest duplicated code problem is when you have the same expression
in two methods of the same class. Then all you have to do is Extract Function
(106) and invoke the code from both places. If you have code that’s similar, but
not quite identical, see if you can use Slide Statements (223) to arrange the code
so the similar items are all together for easy extraction. If the duplicate fragments
are in subclasses of a common base class, you can use Pull Up Method (350) to
avoid calling one from another.
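As a minimal sketch of the simplest case, here are two methods of one class that would otherwise each repeat the same expression, with Extract Function applied so that both call a shared helper. The `Invoice` class, its fields, and the `subtotal` name are hypothetical, chosen only for illustration.

```javascript
class Invoice {
  constructor(lines, taxRate) {
    this._lines = lines;      // array of { price, quantity }
    this._taxRate = taxRate;  // whole-number percent, e.g. 20
  }

  // Extracted helper: before the refactoring, tax() and total() would each
  // have contained this reduce expression inline.
  subtotal() {
    return this._lines.reduce((sum, l) => sum + l.price * l.quantity, 0);
  }

  tax() {
    return this.subtotal() * this._taxRate / 100;
  }

  total() {
    return this.subtotal() + this.tax();
  }
}

const invoice = new Invoice(
  [{ price: 10, quantity: 3 }, { price: 5, quantity: 4 }],
  20
);

console.log(invoice.subtotal(), invoice.tax(), invoice.total()); // 50 10 60
```

A later change to how line amounts are summed now happens in one place instead of being hunted down in every copy.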
Long Function

In our experience, the programs that live best and longest are those with short
functions. Programmers new to such a code base often feel that no computation
ever takes place, that the program is an endless sequence of delegation. When
you have lived with such a program for a few years, however, you learn just how
valuable all those little functions are. All of the payoffs of indirection (explanation,
sharing, and choosing) are supported by small functions.
Since the early days of programming, people have realized that the longer a
function is, the more difficult it is to understand. Older languages carried an
overhead in subroutine calls, which deterred people from small functions. Modern
languages have pretty much eliminated that overhead for in-process calls. There
is still overhead for the reader of the code because you have to switch context
to see what the function does. Development environments that allow you to
quickly jump between a function call and its declaration, or to see both functions
at once, help eliminate this step, but the real key to making it easy to understand
small functions is good naming. If you have a good name for a function, you
mostly don’t need to look at its body.
The net effect is that you should be much more aggressive about decomposing
functions. A heuristic we follow is that whenever we feel the need to comment
something, we write a function instead. Such a function contains the code that
we wanted to comment but is named after the intention of the code rather than
the way it works. We may do this on a group of lines or even on a single line of
code. We do this even if the method call is longer than the code it replaces,
provided the method name explains the purpose of the code. The key here is
not function length but the semantic distance between what the method does
and how it does it.
Ninety-nine percent of the time, all you have to do to shorten a function is
Extract Function (106). Find parts of the function that seem to go nicely together
and make a new one.
If you have a function with lots of parameters and temporary variables, they
get in the way of extracting. If you try to use Extract Function (106), you end up
passing so many parameters to the extracted method that the result is scarcely
more readable than the original. You can often use Replace Temp with Query (178)
to eliminate the temps. Long lists of parameters can be slimmed down with
Introduce Parameter Object (140) and Preserve Whole Object (319).
If you’ve tried that and you still have too many temps and parameters, it’s time
to get out the heavy artillery: Replace Function with Command (337).
How do you identify the clumps of code to extract? A good technique is to
look for comments. They often signal this kind of semantic distance. A block of
code with a comment that tells you what it is doing can be replaced by a method
whose name is based on the comment. Even a single line is worth extracting if
it needs explanation.

Mechanics

If the value that’s being set isn’t provided to the constructor, use Change
Function Declaration (124) to add it. Add a call to the setting method within
the constructor.

  If you wish to remove several setting methods, add all their values to the
  constructor at once. This simplifies the later steps.

Remove each call of a setting method outside of the constructor, using the
new constructor value instead. Test after each one.

  If you can’t replace the call to the setter by creating a new object (because you
  are updating a shared reference object), abandon the refactoring.

Use Inline Function (115) on the setting method. Make the field immutable
if possible.

Test.

Example

I have a simple person class.

class Person…
  get name() {return this._name;}
  set name(arg) {this._name = arg;}
  get id() {return this._id;}
  set id(arg) {this._id = arg;}

At the moment, I create a new object with code like this:

  const martin = new Person();
  martin.name = "martin";
  martin.id = "1234";

The name of a person may change after it’s created, but the ID does not. To
make this clear, I want to remove the setting method for ID.
I still need to set the ID initially, so I’ll use Change Function Declaration (124)
to add it to the constructor.

class Person…
  constructor(id) {
    this.id = id;
  }

I then adjust the creation script to set the ID via the constructor.
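For the Remove Setting Method example, the end state can be sketched as a runnable class. This assumes the final mechanics step has been carried out, so the ID setter has been inlined into the constructor and no longer exists. The `tryToChangeId` helper is hypothetical and is there only to show what happens to a leftover assignment.

```javascript
class Person {
  constructor(id) {
    this._id = id;            // setter inlined: the field is written once, here
  }
  get name() { return this._name; }
  set name(arg) { this._name = arg; }
  get id() { return this._id; }
  // no `set id(...)`: the ID cannot be reassigned through the public interface
}

// Hypothetical helper: in strict mode, assigning to a getter-only
// property throws a TypeError instead of being silently ignored.
function tryToChangeId(person, newId) {
  "use strict";
  try {
    person.id = newId;
    return true;
  } catch (e) {
    return false;
  }
}

const martin = new Person("1234");
martin.name = "martin";

const changed = tryToChangeId(martin, "9999");
console.log(martin.name, martin.id, changed); // martin 1234 false
```

The underscore field is still reachable by convention-breaking code, so this signals intent rather than enforcing it. A language-level private field would be one way to go further.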
Remove Setting Method

class Person {
  get name() {...}
  set name(aString) {...}

Conditionals and loops also give signs for extractions. Use Decompose Conditional
(260) to deal with conditional expressions. A big switch statement should have
its legs turned into single function calls with Extract Function (106). If there’s more
than one switch statement switching on the same condition, you should apply
Replace Conditional with Polymorphism (272).
With loops, extract the loop and the code within the loop into its own method.
If you find it hard to give an extracted loop a name, that may be because it’s
doing two different things, in which case don’t be afraid to use Split Loop (227)
to break out the separate tasks.

Long Parameter List
I take advantage of the easily searchable name of the new function to rename
it by removing the prefix.

caller…
  if (thePlan.targetTemperature(thermostat.selectedTemperature) >
      thermostat.currentTemperature)
    setToHeat();
  else if (thePlan.targetTemperature(thermostat.selectedTemperature) <
           thermostat.currentTemperature)
    setToCool();
  else
    setOff();

class HeatingPlan…
  targetTemperature(selectedTemperature) {
    if (selectedTemperature > this._max) return this._max;
    else if (selectedTemperature < this._min) return this._min;
    else return selectedTemperature;
  }

As is often the case with this refactoring, the calling code looks more unwieldy
than before. Moving a dependency out of a module pushes the responsibility of
dealing with that dependency back to the caller. That’s the trade-off for the
reduced coupling.
But removing the coupling to the thermostat object isn’t the only gain I’ve
made with this refactoring. The HeatingPlan class is immutable: its fields are set in
the constructor with no methods to alter them. (I’ll save you the effort of looking
at the whole class; just trust me on this.) Given an immutable heating plan, by
moving the thermostat reference out of the function body I’ve also made
targetTemperature referentially transparent. Every time I call targetTemperature on the
same object, with the same argument, I will get the same result. If all the methods
of the heating plan have referential transparency, that makes this class much
easier to test and reason about.
A problem with JavaScript’s class model is that it’s impossible to enforce an
immutable class; there’s always a way to get at an object’s data. But writing a
class to signal and encourage immutability is often good enough. Creating classes
that have this characteristic is often a sound strategy and Replace Query with
Parameter is a handy tool for doing this.

Our key defense here is Encapsulate Variable (132), which is always our first
move when confronted with data that is open to contamination by any part of a
program. At least when you have it wrapped by a function, you can start seeing
where it’s modified and start to control its access. Then, it’s good to limit its
scope as much as possible by moving it within a class or module where only that
module’s code can see it.
Global data is especially nasty when it’s mutable. Global data that you can
guarantee never changes after the program starts is relatively safe, if you have
a language that can enforce that guarantee.
Global data illustrates Paracelsus’s maxim: The difference between a poison
and something benign is the dose. You can get away with small doses of global
data, but it gets exponentially harder to deal with the more you have. Even with
little bits, we like to keep it encapsulated; that’s the key to coping with changes
as the software evolves.

Mutable Data

Changes to data can often lead to unexpected consequences and tricky bugs. I
can update some data here, not realizing that another part of the software expects
something different and now fails, a failure that’s particularly hard to spot if it
only happens under rare conditions. For this reason, an entire school of software
development, functional programming, is based on the notion that data should
never change and that updating a data structure should always return a new copy
of the structure with the change, leaving the old data pristine.
These kinds of languages, however, are still a relatively small part of programming;
many of us work in languages that allow variables to vary. But this doesn’t
mean we should ignore the advantages of immutability; there are still many
things we can do to limit the risks of unrestricted data updates.
You can use Encapsulate Variable (132) to ensure that all updates occur through
narrow functions that can be easier to monitor and evolve. If a variable is being
updated to store different things, use Split Variable (240) both to keep them separate
and avoid the risky update. Try as much as possible to move logic out of
code that processes the update by using Slide Statements (223) and Extract Function
(106) to separate the side-effect-free code from anything that performs the update.
In APIs, use Separate Query from Modifier (306) to ensure callers don’t need to call
code that has side effects unless they really need to. We like to use Remove Setting
Method (331) as soon as we can; sometimes, just trying to find clients of a setter
helps spot opportunities to reduce the scope of a variable.
Mutable data that can be calculated elsewhere is particularly pungent. It’s not
just a rich source of confusion, bugs, and missed dinners at home—it’s also
unnecessary. We spray it with a concentrated solution of vinegar and Replace
Derived Variable with Query (248).
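To make the point about calculable mutable data concrete, here is a minimal sketch of Replace Derived Variable with Query. The class and field names are hypothetical, chosen only for illustration. In the first version a running total is updated alongside the source data; in the second the total is computed on demand, so there is no second piece of state to fall out of step.

```javascript
// Before: _production is mutable data that duplicates what _adjustments already knows.
class ProductionPlanBefore {
  constructor() {
    this._adjustments = [];
    this._production = 0;
  }
  get production() {return this._production;}
  applyAdjustment(anAdjustment) {
    this._adjustments.push(anAdjustment);
    this._production += anAdjustment.amount; // must be kept in sync by hand
  }
}

// After: the derived value is calculated from the source data whenever it is asked for.
class ProductionPlan {
  constructor() {
    this._adjustments = [];
  }
  get production() {
    return this._adjustments.reduce((sum, a) => sum + a.amount, 0);
  }
  applyAdjustment(anAdjustment) {
    this._adjustments.push(anAdjustment);
  }
}
```

The trade-off is a recalculation on each read, which is usually negligible for small collections.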
As a user of such a system, I might be annoyed to have my desires overridden
by the heating plan rules, but as a programmer I might be more concerned about
how the targetTemperature function has a dependency on a global thermostat object.
I can break this dependency by moving it to a parameter.
My first step is to use Extract Variable (119) on the parameter that I want to
have in my function.

class HeatingPlan…
  get targetTemperature() {
    const selectedTemperature = thermostat.selectedTemperature;
    if (selectedTemperature > this._max) return this._max;
    else if (selectedTemperature < this._min) return this._min;
    else return selectedTemperature;
  }

That makes it easy to apply Extract Function (106) on the entire body of the
function except for the bit that figures out the parameter.

class HeatingPlan…
  get targetTemperature() {
    const selectedTemperature = thermostat.selectedTemperature;
    return this.xxNEWtargetTemperature(selectedTemperature);
  }

  xxNEWtargetTemperature(selectedTemperature) {
    if (selectedTemperature > this._max) return this._max;
    else if (selectedTemperature < this._min) return this._min;
    else return selectedTemperature;
  }

I then inline the variable I just extracted, which leaves the function as a
simple call.

class HeatingPlan…
  get targetTemperature() {
    return this.xxNEWtargetTemperature(thermostat.selectedTemperature);
  }

I can now use Inline Function (115) on this method.

caller…
  if (thePlan.xxNEWtargetTemperature(thermostat.selectedTemperature) >
      thermostat.currentTemperature)
    setToHeat();
  else if (thePlan.xxNEWtargetTemperature(thermostat.selectedTemperature) <
           thermostat.currentTemperature)
    setToCool();
  else
    setOff();

Mutable data isn’t a big problem when it’s a variable whose scope is just a
couple of lines, but its risk increases as its scope grows. Use Combine Functions
into Class (144) or Combine Functions into Transform (149) to limit how much code
needs to update a variable. If a variable contains some data with internal structure,
it’s usually better to replace the entire structure rather than modify it in place,
using Change Reference to Value (252).

Divergent Change

We structure our software to make change easier; after all, software is meant to
be soft. When we make a change, we want to be able to jump to a single clear
point in the system and make the change. When you can’t do this, you are
smelling one of two closely related pungencies.
Divergent change occurs when one module is often changed in different ways
for different reasons. If you look at a module and say, “Well, I will have to change
these three functions every time I get a new database; I have to change these
four functions every time there is a new financial instrument,” this is an indication
of divergent change. The database interaction and financial processing problems
are separate contexts, and we can make our programming life better by moving
such contexts into separate modules. That way, when we have a change to one
context, we only have to understand that one context and ignore the other. We
always found this to be important, but now, with our brains shrinking with age,
it becomes all the more imperative. Of course, you often discover this only after
you’ve added a few databases or financial instruments; context boundaries are
usually unclear in the early days of a program and continue to shift as a software
system’s capabilities change.
If the two aspects naturally form a sequence (for example, you get data from
the database and then apply your financial processing on it), then Split Phase (154)
separates the two with a clear data structure between them. If there’s more
back-and-forth in the calls, then create appropriate modules and use Move Function
(198) to divide the processing up. If functions mix the two types of processing
within themselves, use Extract Function (106) to separate them before moving. If
the modules are classes, then Extract Class (182) helps formalize how to do the
split.

Shotgun Surgery

Shotgun surgery is similar to divergent change but is the opposite. You whiff this
when, every time you make a change, you have to make a lot of little edits to a
lot of different classes. When the changes are all over the place, they are hard
to find, and it’s easy to miss an important change.
In this case, you want to use Move Function (198) and Move Field (207) to put
all the changes into a single module. If you have a bunch of functions operating
on similar data, use Combine Functions into Class (144). If you have functions that
are transforming or enriching a data structure, use Combine Functions into Transform
(149). Split Phase (154) is often useful here if the common functions can combine
their output for a consuming phase of logic.
A useful tactic for shotgun surgery is to use inlining refactorings, such as Inline
Function (115) or Inline Class (186), to pull together poorly separated logic. You’ll
end up with a Long Method or a Large Class, but can then use extractions to
break it up into more sensible pieces. Even though we are inordinately fond of
small functions and classes in our code, we aren’t afraid of creating something
large as an intermediate step to reorganization.

Feature Envy

When we modularize a program, we are trying to separate the code into zones
to maximize the interaction inside a zone and minimize interaction between
zones. A classic case of Feature Envy occurs when a function in one module
spends more time communicating with functions or data inside another module
than it does within its own module. We’ve lost count of the times we’ve seen
a function invoking half-a-dozen getter methods on another object to calculate
some value. Fortunately, the cure for that case is obvious: The function clearly
wants to be with the data, so use Move Function (198) to get it there. Sometimes,
only a part of a function suffers from envy, in which case use Extract Function
(106) on the jealous bit, and Move Function (198) to give it a dream home.
Of course not all cases are cut-and-dried. Often, a function uses features of
several modules, so which one should it live with? The heuristic we use is to
determine which module has most of the data and put the function with that
data. This step is often made easier if you use Extract Function (106) to break the
function into pieces that go into different places.
Of course, there are several sophisticated patterns that break this rule. From
the Gang of Four [gof], Strategy and Visitor immediately leap to mind. Kent
Beck’s Self Delegation [Beck SBPP] is another. Use these to combat the divergent
change smell. The fundamental rule of thumb is to put things together that
change together. Data and the behavior that references that data usually change
together, but there are exceptions. When the exceptions occur, we move the
behavior to keep changes in one place. Strategy and Visitor allow you to
change behavior easily because they isolate the small amount of behavior that
needs to be overridden, at the cost of further indirection.

Primitive Obsession

Most programming environments are built on a widely used set of primitive
types: integers, floating point numbers, and strings. Libraries may add some
additional small objects such as dates. We find many programmers are curiously
reluctant to create their own fundamental types which are useful for their
domain, such as money, coordinates, or ranges. We thus see calculations that
treat monetary amounts as plain numbers, or calculations of physical quantities
that ignore units (adding inches to millimeters), or lots of code doing
if (a < upper && a > lower).
Strings are particularly common petri dishes for this kind of odor: A telephone
number is more than just a collection of characters. If nothing else, a proper type
can often include consistent display logic for when it needs to be displayed in a
user interface. Representing such types as strings is such a common stench that
people call them “stringly typed” variables.

Replace Query with Parameter

Motivation

When looking through a function’s body, I sometimes see references to something
in the function’s scope that I’m not happy with. This might be a reference to a
global variable, or to an element in the same module that I intend to move away.
To resolve this, I need to replace the internal reference with a parameter, shifting
the responsibility of resolving the reference to the caller of the function.
Most of these cases are due to my wish to alter the dependency relationships
in the code: to make the target function no longer dependent on the element I
want to parameterize. There’s a tension here between converting everything to
parameters, which results in long repetitive parameter lists, and sharing a lot of
scope which can lead to a lot of coupling between functions. Like most tricky
decisions, it’s not something I can reliably get right, so it’s important that I can
reliably change things so the program can take advantage of my increasing
understanding.
It’s easier to reason about a function that will always give the same result when
called with the same parameter values; this is called referential transparency. If a
function accesses some element in its scope that isn’t referentially transparent,
then the containing function also lacks referential transparency. I can fix that by
moving that element to a parameter. Although such a move will shift responsibility
to the caller, there is often a lot to be gained by creating clear modules with
referential transparency. A common pattern is to have modules consisting of pure
functions which are wrapped by logic that handles the I/O and other variable
elements of a program. I can use Replace Query with Parameter to purify parts
of a program, making those parts easier to test and reason about.
But Replace Query with Parameter isn’t just a bag of benefits. By moving a
query to a parameter, I force my caller to figure out how to provide this value.
This complicates life for callers of the functions, and my usual bias is to design
interfaces that make life easier for their consumers. In the end, it boils down to
allocation of responsibility around the program, and that’s a decision that’s neither
easy nor immutable, which is why this refactoring (and its inverse) is one that
I need to be very familiar with.

Mechanics

Use Extract Variable (119) on the query code to separate it from the rest of
the function body.

Apply Extract Function (106) to the body code that isn’t the call to the query.
  Give the new function an easily searchable name, for later renaming.

Use Inline Variable (123) to get rid of the variable you just created.

Apply Inline Function (115) to the original function.

Rename the new function to that of the original.

Example

Consider a simple, yet annoying, control system for temperature. It allows the
user to select a temperature on a thermostat, but only sets the target temperature
within a range determined by a heating plan.

class HeatingPlan…
  get targetTemperature() {
    if (thermostat.selectedTemperature > this._max) return this._max;
    else if (thermostat.selectedTemperature < this._min) return this._min;
    else return thermostat.selectedTemperature;
  }

caller…
  if (thePlan.targetTemperature > thermostat.currentTemperature) setToHeat();
  else if (thePlan.targetTemperature < thermostat.currentTemperature) setToCool();
  else setOff();
class Order…
  get finalPrice() {
    const basePrice = this.quantity * this.itemPrice;
    return this.discountedPrice(basePrice, this.discountLevel);
  }

  get discountLevel() {
    return (this.quantity > 100) ? 2 : 1;
  }

Once I’ve done this, there’s no need to pass the result of discountLevel to
discountedPrice; it can just as easily make the call itself.
I replace any reference to the parameter with a call to the method instead.

class Order…
  discountedPrice(basePrice, discountLevel) {
    switch (this.discountLevel) {
      case 1: return basePrice * 0.95;
      case 2: return basePrice * 0.9;
    }
  }

I can then use Change Function Declaration (124) to remove the parameter.

class Order…
  get finalPrice() {
    const basePrice = this.quantity * this.itemPrice;
    return this.discountedPrice(basePrice);
  }

  discountedPrice(basePrice) {
    switch (this.discountLevel) {
      case 1: return basePrice * 0.95;
      case 2: return basePrice * 0.9;
    }
  }

You can move out of the primitive cave into the centrally heated world of
meaningful types by using Replace Primitive with Object (174). If the primitive is a
type code controlling conditional behavior, use Replace Type Code with Subclasses
(362) followed by Replace Conditional with Polymorphism (272).
Groups of primitives that commonly appear together are data clumps and
should be civilized with Extract Class (182) and Introduce Parameter Object (140).

Repeated Switches

Talk to a true object-oriented evangelist and they’ll soon get onto the evils of
switch statements. They’ll argue that any switch statement you see is begging for
Replace Conditional with Polymorphism (272). We’ve even heard some people argue
that all conditional logic should be replaced with polymorphism, tossing most
ifs into the dustbin of history.
Even in our more wild-eyed youth, we were never unconditionally opposed to
the conditional. Indeed, the first edition of this book had a smell entitled “switch
statements.” The smell was there because in the late 90’s we found polymorphism
sadly underappreciated, and saw benefit in getting people to switch over.
These days there is more polymorphism about, and it isn’t the simple red flag
that it often was fifteen years ago. Furthermore, many languages support more
sophisticated forms of switch statements that use more than some primitive code
as their base. So we now focus on the repeated switch, where the same conditional
switching logic (either in a switch/case statement or in a cascade of if/else
statements) pops up in different places. The problem with such duplicate
switches is that, whenever you add a clause, you have to find all the switches
and update them. Against the dark forces of such repetition, polymorphism
provides an elegant weapon for a more civilized codebase.
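To illustrate the repeated-switch smell, here is a small sketch built on a hypothetical shipping domain; every name in it is an assumption made for the example. The first two functions repeat the same switch, so adding a shipping method means editing both. After Replace Conditional with Polymorphism, the one remaining conditional sits in a factory function.

```javascript
// Before: two functions repeat the same switch on order.method.
function shippingCostBefore(order) {
  switch (order.method) {
    case "standard": return 5;
    case "express": return 15;
    default: throw new Error(`unknown shipping method: ${order.method}`);
  }
}

function deliveryEstimateBefore(order) {
  switch (order.method) {
    case "standard": return "3-5 days";
    case "express": return "next day";
    default: throw new Error(`unknown shipping method: ${order.method}`);
  }
}

// After: each case becomes a class, so a new method means one new class
// plus one new line in the factory, not an edit to every switch.
class StandardShipping {
  get cost() {return 5;}
  get deliveryEstimate() {return "3-5 days";}
}

class ExpressShipping {
  get cost() {return 15;}
  get deliveryEstimate() {return "next day";}
}

// The single remaining conditional lives in the factory.
function createShipping(method) {
  switch (method) {
    case "standard": return new StandardShipping();
    case "express": return new ExpressShipping();
    default: throw new Error(`unknown shipping method: ${method}`);
  }
}
```

A single switch that selects the right object is not the smell; the duplication was.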
Loops
Loops have been a core part of programming since the earliest languages. But we
feel they are no more relevant today than bell-bottoms and flock wallpaper.
We disdained them at the time of the first edition—but Java, like most other
languages at the time, didn’t provide a better alternative. These days, however,
first-class functions are widely supported, so we can use Replace Loop with Pipeline
(231) to retire those anachronisms. We find that pipeline operations, such as
filter and map, help us quickly see the elements that are included in the processing
and what is done with them.
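A minimal sketch of Replace Loop with Pipeline, using hypothetical data and function names: the loop version mixes selection and transformation in one body, while the pipeline names each step with filter and map.

```javascript
// Hypothetical input data, for illustration only.
const people = [
  {name: "Ann", job: "programmer"},
  {name: "Bob", job: "designer"},
  {name: "Cy", job: "programmer"},
];

// Loop version: the reader has to trace the body to see what is kept and what is produced.
function programmerNamesLoop(input) {
  const names = [];
  for (const person of input) {
    if (person.job === "programmer") names.push(person.name);
  }
  return names;
}

// Pipeline version: filter states which elements are included, map states what is done with them.
function programmerNames(input) {
  return input
    .filter(person => person.job === "programmer")
    .map(person => person.name);
}
```

Both functions return the same array for the same input; the pipeline simply makes the two steps visible.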
Replace Parameter with Query

Motivation

The parameter list to a function should summarize the points of variability of
that function, indicating the primary ways in which that function may behave
differently. As with any statement in code, it’s good to avoid any duplication,
and it’s easier to understand if the parameter list is short.
If a call passes in a value that the function can just as easily determine for itself,
that’s a form of duplication, one that unnecessarily complicates the caller which
has to determine the value of a parameter when it could be freed from that work.
The limit on this is suggested by the phrase “just as easily.” By removing the
parameter, I’m shifting the responsibility for determining the parameter value.
When the parameter is present, determining its value is the caller’s responsibility;
otherwise, that responsibility shifts to the function body. My usual habit is to
simplify life for callers, which implies moving responsibility to the function
body, but only if that responsibility is appropriate there.
The most common reason to avoid Replace Parameter with Query is if removing
the parameter adds an unwanted dependency to the function body, forcing it to
access a program element that I’d rather it remained ignorant of. This may be a
new dependency, or an existing one that I’d like to remove. Usually this comes
up where I’d need to add a problematic function call to the function body, or
access something within a receiver object that I’d prefer to move out later.
The safest case for Replace Parameter with Query is when the value of the
parameter I want to remove is determined merely by querying another parameter
in the list. There’s rarely any point in passing two parameters if one can be
determined from the other.
One thing to watch out for is if the function I’m looking at has referential
transparency, that is, if I can be sure that it will behave the same way whenever
it’s called with the same parameter values. Such functions are much easier to
reason about and test, and I don’t want to alter them to lose that property. So I
wouldn’t replace a parameter with an access to a mutable global variable.

Mechanics

If necessary, use Extract Function (106) on the calculation of the parameter.

Replace references to the parameter in the function body with references to
the expression that yields the parameter. Test after each change.

Use Change Function Declaration (124) to remove the parameter.

Example

I most often use Replace Parameter with Query when I’ve done some other
refactorings that make a parameter no longer needed. Consider this code.

class Order…
  get finalPrice() {
    const basePrice = this.quantity * this.itemPrice;
    let discountLevel;
    if (this.quantity > 100) discountLevel = 2;
    else discountLevel = 1;
    return this.discountedPrice(basePrice, discountLevel);
  }

  discountedPrice(basePrice, discountLevel) {
    switch (discountLevel) {
      case 1: return basePrice * 0.95;
      case 2: return basePrice * 0.9;
    }
  }

When I’m simplifying a function, I’m keen to apply Replace Temp with Query
(178), which would lead me to

Lazy Element

We like using program elements to add structure, providing opportunities for
variation, reuse, or just having more helpful names. But sometimes the structure
isn’t needed. It may be a function that’s named the same as its body code reads,
or a class that is essentially one simple function. Sometimes, this reflects a function
that was expected to grow and be popular later, but never realized its dreams.
Sometimes, it’s a class that used to pay its way, but has been downsized with
refactoring. Either way, such program elements need to die with dignity. Usually
this means using Inline Function (115) or Inline Class (186). With inheritance, you
can use Collapse Hierarchy (380).

Speculative Generality

Brian Foote suggested this name for a smell to which we are very sensitive. You
get it when people say, “Oh, I think we’ll need the ability to do this kind of thing
someday” and thus add all sorts of hooks and special cases to handle things that
aren’t required. The result is often harder to understand and maintain. If all this
machinery were being used, it would be worth it. But if it isn’t, it isn’t. The
machinery just gets in the way, so get rid of it.
If you have abstract classes that aren’t doing much, use Collapse Hierarchy (380).
Unnecessary delegation can be removed with Inline Function (115) and Inline Class
(186). Functions with unused parameters should be subject to Change Function
Declaration (124) to remove those parameters. You should also apply Change
Function Declaration (124) to remove any unneeded parameters, which often get
tossed in for future variations that never come to pass.
Speculative generality can be spotted when the only users of a function or class
are test cases. If you find such an animal, delete the test case and apply Remove
Dead Code (237).

Temporary Field

Sometimes you see a class in which a field is set only in certain circumstances.
Such code is difficult to understand, because you expect an object to need all of
its fields. Trying to understand why a field is there when it doesn’t seem to be
used can drive you nuts.
Use Extract Class (182) to create a home for the poor orphan variables. Use
Move Function (198) to put all the code that concerns the fields into this new class.
You may also be able to eliminate conditional code by using Introduce Special Case
(289) to create an alternative class for when the variables aren’t valid.

Middle Man

One of the prime features of objects is encapsulation: hiding internal details from
the rest of the world. Encapsulation often comes with delegation. You ask a
director whether she is free for a meeting; she delegates the message to her diary
and gives you an answer. All well and good. There is no need to know whether
the director uses a diary, an electronic gizmo, or a secretary to keep track of her
appointments.
However, this can go too far. You look at a class’s interface and find half the
methods are delegating to this other class. After a while, it is time to use Remove
Middle Man (192) and talk to the object that really knows what’s going on. If only
a few methods aren’t doing much, use Inline Function (115) to inline them into
the caller. If there is additional behavior, you can use Replace Superclass with
Delegate (399) or Replace Subclass with Delegate (381) to fold the middle man into
the real object. That allows you to extend behavior without chasing all that
delegation.
top level…
  function xxNEWwithinRange(aPlan, tempRange) {
    const low = tempRange.low;
    const high = tempRange.high;
    const isWithinRange = aPlan.withinRange(low, high);
    return isWithinRange;
  }

Since the original function is in a different context (the HeatingPlan class), I need
to use Move Function (198).

caller…
  const tempRange = aRoom.daysTempRange;
  const isWithinRange = aPlan.xxNEWwithinRange(tempRange);
  if (!isWithinRange)
    alerts.push("room temperature went outside range");

class HeatingPlan…
  xxNEWwithinRange(tempRange) {
    const low = tempRange.low;
    const high = tempRange.high;
    const isWithinRange = this.withinRange(low, high);
    return isWithinRange;
  }

I then continue as before, replacing other callers and inlining the old function
into the new one. I would also inline the variables I extracted to provide the
clean separation for extracting the new function.
Because this variation is entirely composed of refactorings, it’s particularly
handy when I have a refactoring tool with robust extract and inline operations.

Insider Trading

Software people like strong walls between their modules and complain bitterly
about how trading data around too much increases coupling. To make things
work, some trade has to occur, but we need to reduce it to a minimum and keep
it all above board.
Modules that whisper to each other by the coffee machine need to be separated
by using Move Function (198) and Move Field (207) to reduce the need to chat. If
modules have common interests, try to create a third module to keep that
commonality in a well-regulated vehicle, or use Hide Delegate (189) to make another
module act as an intermediary.
Inheritance can often lead to collusion. Subclasses are always going to know
more about their parents than their parents would like them to know. If it’s time
to leave home, apply Replace Subclass with Delegate (381) or Replace Superclass with
Delegate (399).

Large Class

When a class is trying to do too much, it often shows up as too many fields. When
a class has too many fields, duplicated code cannot be far behind.
You can Extract Class (182) to bundle a number of the variables. Choose variables
to go together in the component that makes sense for each. For example,
“depositAmount” and “depositCurrency” are likely to belong together in a
component. More generally, common prefixes or suffixes for some subset of the variables
in a class suggest the opportunity for a component. If the component makes
sense with inheritance, you’ll find Extract Superclass (375) or Replace Type Code
with Subclasses (362) (which essentially is extracting a subclass) are often easier.
Sometimes a class does not use all of its fields all of the time. If so, you may
be able to do these extractions many times.
As with a class with too many instance variables, a class with too much code
is a prime breeding ground for duplicated code, chaos, and death. The simplest
solution (have we mentioned that we like simple solutions?) is to eliminate
redundancy in the class itself. If you have five hundred-line methods with lots of
code in common, you may be able to turn them into five ten-line methods with
another ten two-line methods extracted from the original.
The clients of such a class are often the best clue for splitting up the class.
Look at whether clients use a subset of the features of the class. Each subset is
a possible separate class. Once you’ve identified a useful subset, use Extract Class
(182), Extract Superclass (375), or Replace Type Code with Subclasses (362) to break
it out.
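The depositAmount and depositCurrency pairing mentioned above can be sketched as an Extract Class. The Account and Money classes below are hypothetical; only the two field names come from the text. The shared "deposit" prefix is the hint that the two fields form a component.

```javascript
// Before: two fields share the "deposit" prefix and always travel together.
class AccountBefore {
  constructor(owner, depositAmount, depositCurrency) {
    this._owner = owner;
    this._depositAmount = depositAmount;
    this._depositCurrency = depositCurrency;
  }
  get depositSummary() {
    return `${this._depositAmount} ${this._depositCurrency}`;
  }
}

// After: the pair is extracted into its own component (Money is a hypothetical name).
class Money {
  constructor(amount, currency) {
    this._amount = amount;
    this._currency = currency;
  }
  get amount() {return this._amount;}
  get currency() {return this._currency;}
  toString() {return `${this._amount} ${this._currency}`;}
}

class Account {
  constructor(owner, deposit) {
    this._owner = owner;
    this._deposit = deposit;
  }
  get deposit() {return this._deposit;}
  get depositSummary() {return this._deposit.toString();}
}
```

Once the component exists, behavior that concerns only the amount and currency (formatting, comparison, conversion) has an obvious home outside the large class.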
caller…
  const tempRange = aRoom.daysTempRange;
  const isWithinRange = xxNEWwithinRange(aPlan, tempRange);
  if (!isWithinRange)
    alerts.push("room temperature went outside range");

Refused Bequest

Subclasses get to inherit the methods and data of their parents. But what if they
don’t want or need what they are given? They are given all these great gifts and
pick just a few to play with.
The traditional story is that this means the hierarchy is wrong. You need to
create a new sibling class and use Push Down Method (359) and Push Down Field
(361) to push all the unused code to the sibling. That way the parent holds only
what is common. Often, you’ll hear advice that all superclasses should be abstract.
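The traditional remedy can be sketched as follows, using a hypothetical employee hierarchy (all names and the quota value are assumptions for illustration). A member that only some subclasses want is pushed down from the parent into a sibling, so the parent holds only what is common.

```javascript
// Before: every employee inherits quota, although engineers never use it.
class EmployeeBefore {
  constructor(name) {this._name = name;}
  get name() {return this._name;}
  get quota() {return 100;}
}
class EngineerBefore extends EmployeeBefore {}

// After Push Down Method: quota lives only in the sibling that needs it.
class Employee {
  constructor(name) {this._name = name;}
  get name() {return this._name;}
}
class Engineer extends Employee {}
class Salesperson extends Employee {
  get quota() {return 100;}
}
```

After the change an Engineer no longer exposes a quota at all, while a Salesperson still does.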
You’ll guess from our snide use of “traditional” that we aren’t going to advise
this—at least not all the time. We do subclassing to reuse a bit of behavior all
the time, and we find it a perfectly good way of doing business. There is a
smell—we can’t deny it—but usually it isn’t a strong smell. So, we say that if the
refused bequest is causing confusion and problems, follow the traditional advice.
However, don’t feel you have to do it all the time. Nine times out of ten this
smell is too faint to be worth cleaning.
The smell of refused bequest is much stronger if the subclass is reusing behavior
but does not want to support the interface of the superclass. We don’t mind
refusing implementations—but refusing interface gets us on our high horses. In this
case, however, don’t fiddle with the hierarchy; you want to gut it by applying
Replace Subclass with Delegate (381) or Replace Superclass with Delegate (399).

Comments
Don’t worry, we aren’t saying that people shouldn’t write comments. In our
olfactory analogy, comments aren’t a bad smell; indeed they are a sweet smell. The
reason we mention comments here is that comments are often used as a deodorant.
It’s surprising how often you look at thickly commented code and notice
that the comments are there because the code is bad.
Comments lead us to bad code that has all the rotten whiffs we’ve discussed
in the rest of this chapter. Our first action is to remove the bad smells by
refactoring. When we’re finished, we often find that the comments are superfluous.
If you need a comment to explain what a block of code does, try Extract Function
(106). If the method is already extracted but you still need a comment to explain
what it does, use Change Function Declaration (124) to rename it. If you need to
state some rules about the required state of the system, use Introduce Assertion
(302).

When you feel the need to write a comment, first try to refactor the code so that
any comment becomes superfluous.

A good time to use a comment is when you don’t know what to do. In addition to
describing what is going on, comments can indicate areas in which you aren’t sure.
A comment can also explain why you did something. This kind of information helps
future modifiers, especially forgetful ones.

class HeatingPlan…
  xxNEWwithinRange(aNumberRange) {
    return this.withinRange(aNumberRange.low, aNumberRange.high);
  }

Now I can begin the serious work, taking the existing function calls and having
them call the new function.

caller…
  const low = aRoom.daysTempRange.low;
  const high = aRoom.daysTempRange.high;
  if (!aPlan.xxNEWwithinRange(aRoom.daysTempRange))
    alerts.push("room temperature went outside range");

When I’ve changed the calls, I may see that some of the earlier code isn’t
needed anymore, so I wield Remove Dead Code (237).

caller…
  if (!aPlan.xxNEWwithinRange(aRoom.daysTempRange))
    alerts.push("room temperature went outside range");

I replace these one at a time, testing after each change.
Once I’ve replaced them all, I can use Inline Function (115) on the original
function.

class HeatingPlan…
  xxNEWwithinRange(aNumberRange) {
    return (aNumberRange.low >= this._temperatureRange.low) &&
      (aNumberRange.high <= this._temperatureRange.high);
  }

And I finally remove that ugly prefix from the new function and all its callers.
The prefix makes it a simple global replace, even if I don’t have robust rename
support in my editor.

class HeatingPlan…
  withinRange(aNumberRange) {
    return (aNumberRange.low >= this._temperatureRange.low) &&
      (aNumberRange.high <= this._temperatureRange.high);
  }

caller…
  if (!aPlan.withinRange(aRoom.daysTempRange))
    alerts.push("room temperature went outside range");
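To make the end state concrete, here is a minimal runnable sketch of the caller and HeatingPlan after Preserve Whole Object. Only withinRange and the alert message come from the example; the constructor, the temperatureAlerts wrapper, and the sample temperatures are assumptions added so the code can run on its own.

```javascript
// Hypothetical stand-in for the HeatingPlan class in the example.
class HeatingPlan {
  constructor(low, high) {
    this._temperatureRange = {low, high};
  }
  // Takes the whole range object rather than two unpacked numbers.
  withinRange(aNumberRange) {
    return (aNumberRange.low >= this._temperatureRange.low) &&
      (aNumberRange.high <= this._temperatureRange.high);
  }
}

// The caller logic, wrapped in a function (an assumption) so it can be exercised.
function temperatureAlerts(aPlan, aRoom) {
  const alerts = [];
  if (!aPlan.withinRange(aRoom.daysTempRange))
    alerts.push("room temperature went outside range");
  return alerts;
}

const aPlan = new HeatingPlan(18, 24);
const comfortableRoom = {daysTempRange: {low: 19, high: 23}};
const chillyRoom = {daysTempRange: {low: 15, high: 23}};

console.log(temperatureAlerts(aPlan, comfortableRoom)); // stays inside the plan
console.log(temperatureAlerts(aPlan, chillyRoom));      // dips below the plan
```

If HeatingPlan later needs more than low and high from the range, only withinRange changes; the caller already passes everything.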
Mechanics
Create an empty function with the desired parameters.
  Give the function an easily searchable name so it can be replaced at the end.
Fill the body of the new function with a call to the old function, mapping
from the new parameters to the old ones.
Run static checks.
Adjust each caller to use the new function, testing after each change.
  This may mean that some code that derives the parameter isn’t needed, so can
  fall to Remove Dead Code (237).
Once all original callers have been changed, use Inline Function (115) on the
original function.
Change the name of the new function and all its callers.

Example
Consider a room monitoring system. It compares its daily temperature range with
a range in a predefined heating plan.

caller…
  const low = aRoom.daysTempRange.low;
  const high = aRoom.daysTempRange.high;
  if (!aPlan.withinRange(low, high))
    alerts.push("room temperature went outside range");

class HeatingPlan…
  withinRange(bottom, top) {
    return (bottom >= this._temperatureRange.low) && (top <= this._temperatureRange.high);
  }

Instead of unpacking the range information when I pass it in, I can pass in the
whole range object.
I begin by stating the interface I want as an empty function.

class HeatingPlan…
  xxNEWwithinRange(aNumberRange) {
  }

Since I intend it to replace the existing withinRange, I name it the same but with
an easily replaceable prefix.
I then add the body of the function, which relies on calling the existing
withinRange. The body thus consists of a mapping from the new parameter to the
existing ones.

Chapter 4
Building Tests

Refactoring is a valuable tool, but it can’t come alone. To do refactoring properly,
I need a solid suite of tests to spot my inevitable mistakes. Even with automated
refactoring tools, many of my refactorings will still need checking via a test suite.
I don’t find this to be a disadvantage. Even without refactoring, writing good
tests increases my effectiveness as a programmer. This was a surprise for me and
is counterintuitive for most programmers—so it’s worth explaining why.

The Value of Self-Testing Code
If you look at how most programmers spend their time, you’ll find that writing
code is actually quite a small fraction. Some time is spent figuring out what ought
to be going on, some time is spent designing, but most time is spent debugging.
I’m sure every reader can remember long hours of debugging—often, well into
the night. Every programmer can tell a story of a bug that took a whole day (or
more) to find. Fixing the bug is usually pretty quick, but finding it is a nightmare.
And then, when you do fix a bug, there’s always a chance that another one will
appear and that you might not even notice it till much later. And you’ll spend
ages finding that bug.
The event that started me on the road to self-testing code was a talk at OOPSLA
in 1992. Someone (I think it was “Bedarra” Dave Thomas) said offhandedly,
“Classes should contain their own tests.” So I decided to incorporate tests into
the code base together with the production code. As I was also doing iterative
development, I tried adding tests as I completed each iteration. The project on
which I was working at that time was quite small, so we put out iterations every
week or so. Running the tests became fairly straightforward—but although it was
easy, it was still pretty boring. This was because every test produced output to
the console that I had to check. Now I’m a pretty lazy person and am prepared
to work quite hard in order to avoid work. I realized that, instead of looking at
the screen to see if it printed out some information from the model, I could get the
computer to make that test. All I had to do was put the output I expected in the
test code and do a comparison. Now I could run the tests and they would just
print “OK” to the screen if all was well. The software was now self-testing.

Make sure all tests are fully automatic and that they check their own results.

Now it was easy to run tests—as easy as compiling. So I started to run tests
every time I compiled. Soon, I began to notice my productivity had shot upward.
I realized that I wasn’t spending so much time debugging. If I added a bug that
was caught by a previous test, it would show up as soon as I ran that test. The
test had worked before, so I would know that the bug was in the work I had
done since I last tested. And I ran the tests frequently—which means only a few
minutes had elapsed. I thus knew that the source of the bug was the code I had
just written. As it was a small amount of code that was still fresh in my mind,
the bug was easy to find. Bugs that would have otherwise taken an hour or more
to find now took a couple of minutes at most. Not only was my software
self-testing, but by running the tests frequently I had a powerful bug detector.
As I noticed this, I became more aggressive about doing the tests. Instead of
waiting for the end of an increment, I would add the tests immediately after
writing a bit of function. Every day I would add a couple of new features and
the tests to test them. I hardly ever spent more than a few minutes hunting for
a regression bug.

A suite of tests is a powerful bug detector that decapitates the time it takes to
find bugs.

Tools for writing and organizing these tests have developed a great deal since
my experiments. While flying from Switzerland to Atlanta for OOPSLA 1997,
Kent Beck paired with Erich Gamma to port his unit testing framework from
Smalltalk to Java. The resulting framework, called JUnit, has been enormously
influential for program testing, inspiring a huge variety of similar tools [mf-xunit]
in lots of different languages.
Admittedly, it is not so easy to persuade others to follow this route. Writing
the tests means a lot of extra code to write. Unless you have actually experienced
how it speeds programming, self-testing does not seem to make sense. This is
not helped by the fact that many people have never learned to write tests or even
to think about tests. When tests are manual, they are gut-wrenchingly boring.
But when they are automatic, tests can actually be quite fun to write.
In fact, one of the most useful times to write tests is before I start programming.
When I need to add a feature, I begin by writing the test. This isn’t as backward
as it sounds. By writing the test, I’m asking myself what needs to be done to add
the function. Writing the test also concentrates me on the interface rather than the
implementation (always a good thing). It also means I have a clear point at which
I’m done coding—when the test works.

Preserve Whole Object

  const low = aRoom.daysTempRange.low;
  const high = aRoom.daysTempRange.high;
  if (aPlan.withinRange(low, high))

  if (aPlan.withinRange(aRoom.daysTempRange))

Motivation
If I see code that derives a couple of values from a record and then passes these
values into a function, I like to replace those values with the whole record itself,
letting the function body derive the values it needs.
Passing the whole record handles change better should the called function
need more data from the whole in the future—that change would not require me
to alter the parameter list. It also reduces the size of the parameter list, which
usually makes the function call easier to understand. If many functions are called
with the parts, they often duplicate the logic that manipulates these parts—logic
that can often be moved to the whole.
The main reason I wouldn’t do this is if I don’t want the called function to
have a dependency on the whole—which typically occurs when they are in
different modules.
Pulling several values from an object to do some logic on them alone is a smell
(Feature Envy (77)), and usually a signal that this logic should be moved into the
whole itself. Preserve Whole Object is particularly common after I’ve done
Introduce Parameter Object (140), as I hunt down any occurrences of the original
data clump to replace them with the new object.
If several bits of code only use the same subset of an object’s features, then
that may indicate a good opportunity for Extract Class (182).
One case that many people miss is when an object calls another object with
several of its own data values. If I see this, I can replace those values with a
self-reference (this in JavaScript).
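The self-reference case can be sketched as follows. The Parcel and ShippingRates classes, their fields, and the rates are all hypothetical, invented only to show an object passing `this` instead of several of its own values.

```javascript
// Hypothetical classes, invented for this illustration.
class ShippingRates {
  // Before: the caller pulls out two of its own values and passes them in.
  costFromParts(weightKg, distanceKm) {
    return weightKg * 5 + distanceKm * 2;
  }
  // After: the caller hands over itself; this function derives what it needs.
  costFor(aParcel) {
    return aParcel.weightKg * 5 + aParcel.distanceKm * 2;
  }
}

class Parcel {
  constructor(weightKg, distanceKm, rates) {
    this.weightKg = weightKg;
    this.distanceKm = distanceKm;
    this._rates = rates;
  }
  get shippingCostBefore() {
    return this._rates.costFromParts(this.weightKg, this.distanceKm);
  }
  get shippingCost() {
    return this._rates.costFor(this); // self-reference replaces the two values
  }
}

const parcel = new Parcel(4, 100, new ShippingRates());
console.log(parcel.shippingCostBefore === parcel.shippingCost); // both forms agree
```

The trade-off named above still applies: costFor now depends on the shape of Parcel, which may be unwelcome if the two classes live in different modules.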
Kent Beck baked this habit of writing the test first into a technique called
Test-Driven Development (TDD) [mf-tdd]. The Test-Driven Development approach
to programming relies on short cycles of writing a (failing) test, writing the code to
make that test work, and refactoring to ensure the result is as clean as possible.
This test-code-refactor cycle should occur many times per hour, and can be a
very productive and calming way to write code. I’m not going to discuss it further
here, but I do use and warmly recommend it.
That’s enough of the polemic. Although I believe everyone would benefit by
writing self-testing code, it is not the point of this book. This book is about
refactoring. Refactoring requires tests. If you want to refactor, you have to write
tests. This chapter gives you a start in doing this for JavaScript. This is not
a testing book, so I’m not going to go into much detail. I’ve found, however, that
with testing a remarkably small amount of work can have surprisingly big benefits.
As with everything else in this book, I describe the testing approach using
examples. When I develop code, I write the tests as I go. But sometimes, I need to
refactor some code without tests—then I have to make the code self-testing before
I begin.

Sample Code to Test
Here’s some code to look at and test. The code supports a simple application
that allows a user to examine and manipulate a production plan. The (crude) UI
looks like this:

[Figure: the production plan UI]

function deliveryDate(anOrder, isRush) {
  let result;
  let deliveryTime;
  if (anOrder.deliveryState === "MA" || anOrder.deliveryState === "CT")
    deliveryTime = isRush ? 1 : 2;
  else if (anOrder.deliveryState === "NY" || anOrder.deliveryState === "NH") {
    deliveryTime = 2;
    if (anOrder.deliveryState === "NH" && !isRush)
      deliveryTime = 3;
  }
  else if (isRush)
    deliveryTime = 3;
  else if (anOrder.deliveryState === "ME")
    deliveryTime = 3;
  else
    deliveryTime = 4;
  result = anOrder.placedOn.plusDays(2 + deliveryTime);
  if (isRush) result = result.minusDays(1);
  return result;
}

In this case, teasing out isRush into a top-level dispatch conditional is likely more
work than I fancy. So instead, I can layer functions over the deliveryDate:

function rushDeliveryDate (anOrder) {return deliveryDate(anOrder, true);}
function regularDeliveryDate(anOrder) {return deliveryDate(anOrder, false);}

These wrapping functions are essentially partial applications of deliveryDate, although
they are defined in program text rather than by composition of functions.
I can then do the same replacement of callers that I did with the decomposed
conditional earlier on. If there aren’t any callers using the parameter as data, I
like to restrict its visibility or rename it to a name that conveys that it shouldn’t
be used directly (e.g., deliveryDateHelperOnly).
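The wrapping approach can be tried out with the sketch below. deliveryDate is the tangled function from the example; the DayCount class is a hypothetical stand-in for whatever date type placedOn really is, assumed here to offer only plusDays and minusDays, and the sample order is invented.

```javascript
// Hypothetical date stand-in: a day number with the two methods the example calls.
class DayCount {
  constructor(day) { this.day = day; }
  plusDays(n) { return new DayCount(this.day + n); }
  minusDays(n) { return new DayCount(this.day - n); }
}

// The tangled function from the example, unchanged apart from layout.
function deliveryDate(anOrder, isRush) {
  let result;
  let deliveryTime;
  if (anOrder.deliveryState === "MA" || anOrder.deliveryState === "CT")
    deliveryTime = isRush ? 1 : 2;
  else if (anOrder.deliveryState === "NY" || anOrder.deliveryState === "NH") {
    deliveryTime = 2;
    if (anOrder.deliveryState === "NH" && !isRush)
      deliveryTime = 3;
  }
  else if (isRush)
    deliveryTime = 3;
  else if (anOrder.deliveryState === "ME")
    deliveryTime = 3;
  else
    deliveryTime = 4;
  result = anOrder.placedOn.plusDays(2 + deliveryTime);
  if (isRush) result = result.minusDays(1);
  return result;
}

// Explicit functions layered over it; callers no longer pass a literal flag.
function rushDeliveryDate(anOrder) { return deliveryDate(anOrder, true); }
function regularDeliveryDate(anOrder) { return deliveryDate(anOrder, false); }

const anOrder = {deliveryState: "NH", placedOn: new DayCount(0)};
console.log(rushDeliveryDate(anOrder).day);    // rush delivery day for an NH order
console.log(regularDeliveryDate(anOrder).day); // regular delivery day for an NH order
```

Because the wrappers only fix the flag, they cannot drift from the original logic, which is the appeal when untangling the conditional is not worth the effort.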
function rushDeliveryDate(anOrder) {
  let deliveryTime;
  if (["MA", "CT"].includes(anOrder.deliveryState)) deliveryTime = 1;
  else if (["NY", "NH"].includes(anOrder.deliveryState)) deliveryTime = 2;
  else deliveryTime = 3;
  return anOrder.placedOn.plusDays(1 + deliveryTime);
}
function regularDeliveryDate(anOrder) {
  let deliveryTime;
  if (["MA", "CT", "NY"].includes(anOrder.deliveryState)) deliveryTime = 2;
  else if (["ME", "NH"].includes(anOrder.deliveryState)) deliveryTime = 3;
  else deliveryTime = 4;
  return anOrder.placedOn.plusDays(2 + deliveryTime);
}

The two new functions capture the intent of the call better, so I can replace
each call of

  aShipment.deliveryDate = deliveryDate(anOrder, true);

with

  aShipment.deliveryDate = rushDeliveryDate(anOrder);

and similarly with the other case.
When I’ve replaced all the callers, I remove deliveryDate.
A flag argument isn’t just the presence of a boolean value; it’s that the boolean
is set with a literal rather than data. If all the callers of deliveryDate were like this:

The production plan has a demand and price for each province. Each province
has producers, each of which can produce a certain number of units at a particular
price. The UI also shows how much revenue each producer would earn if they
sell all their production. At the bottom, the screen shows the shortfall in
production (the demand minus the total production) and the profit for this plan. The
UI allows the user to manipulate the demand, price, and the individual producer’s
production and costs to see the effect on the production shortfall and profits.
Whenever a user changes any number in the display, all the others update
immediately.
I’m showing a user interface here, so you can sense how the software is used,
but I’m only going to concentrate on the business logic part of the software—that
is, the classes that calculate the profit and the shortfall, not the code that generates
the HTML and hooks up the field changes to the underlying business logic. This
chapter is just an introduction to the world of self-testing code, so it makes sense
for me to start with the easiest case—which is code that doesn’t involve user
interface, persistence, or external service interaction. Such separation, however, is
a good idea in any case: Once this kind of business logic gets at all complicated,
I will separate it from the UI mechanics so I can more easily reason about it and
test it.
This business logic code involves two classes: one that represents a single
producer, and the other that represents a whole province. The province’s
constructor takes a JavaScript object—one we could imagine being supplied by a
JSON document.
Here’s the code that loads the province from the JSON data:
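As a rough idea of what loading a province from a JSON-style object could look like, here is a minimal sketch. It is not the author's listing: the constructor shape, the field names, and every value in sampleProvinceData are assumptions, chosen so that the shortfall and profit getters quoted in this chapter yield the 5 and 230 that the tests expect. It is also simplified in that totalProduction is computed on demand rather than maintained by a production setter.

```javascript
// Hypothetical sample data; all names and figures are assumptions.
function sampleProvinceData() {
  return {
    name: "Asia",
    producers: [
      {name: "Byzantium", cost: 10, production: 9},
      {name: "Attalia", cost: 12, production: 10},
      {name: "Sinope", cost: 10, production: 6},
    ],
    demand: 30,
    price: 20,
  };
}

class Producer {
  constructor(data) {
    this.name = data.name;
    this.cost = data.cost;
    this.production = data.production;
  }
}

class Province {
  // Loads the province from a plain object such as a parsed JSON document.
  constructor(doc) {
    this.name = doc.name;
    this.producers = doc.producers.map(d => new Producer(d));
    this.demand = doc.demand;
    this.price = doc.price;
  }
  get totalProduction() {
    return this.producers.reduce((sum, p) => sum + p.production, 0);
  }
  get shortfall() { return this.demand - this.totalProduction; }
  get satisfiedDemand() { return Math.min(this.demand, this.totalProduction); }
  get demandValue() { return this.satisfiedDemand * this.price; }
  get demandCost() {
    let remainingDemand = this.demand;
    let result = 0;
    [...this.producers]                  // copy, so sorting does not reorder the fixture
      .sort((a, b) => a.cost - b.cost)   // cheapest producers fill demand first
      .forEach(p => {
        const contribution = Math.min(remainingDemand, p.production);
        remainingDemand -= contribution;
        result += contribution * p.cost;
      });
    return result;
  }
  get profit() { return this.demandValue - this.demandCost; }
}

const asia = new Province(sampleProvinceData());
console.log(asia.shortfall, asia.profit);
```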
The way that set production updates the derived data in the province is ugly, and
whenever I see that, I want to refactor to remove it. But I have to write tests before
I can refactor it.
The calculation for the shortfall is simple.

class Province…
  get shortfall() {
    return this._demand - this.totalProduction;
  }

That for the profit is a bit more involved.

class Province…
  get profit() {
    return this.demandValue - this.demandCost;
  }
  get demandCost() {
    let remainingDemand = this.demand;
    let result = 0;
    this.producers
      .sort((a,b) => a.cost - b.cost)
      .forEach(p => {
        const contribution = Math.min(remainingDemand, p.production);
        remainingDemand -= contribution;
        result += contribution * p.cost;
      });
    return result;
  }
  get demandValue() {
    return this.satisfiedDemand * this.price;
  }
  get satisfiedDemand() {
    return Math.min(this._demand, this.totalProduction);
  }

A First Test
To test this code, I’ll need some sort of testing framework. There are many out
there, even just for JavaScript. The one I’ll use is Mocha [mocha], which is
reasonably common and well-regarded. I won’t go into a full explanation of how to
use the framework, just show some example tests with it. You should be able to
adapt, easily enough, a different framework to build similar tests.

To book a premium concert, I issue the call like so:

  bookConcert(aCustomer, true);

Flag arguments can also come as enums:

  bookConcert(aCustomer, CustomerType.PREMIUM);

or strings (or symbols in languages that use them):

  bookConcert(aCustomer, "premium");

I dislike flag arguments because they complicate the process of understanding
what function calls are available and how to call them. My first route into an API
is usually the list of available functions, and flag arguments hide the differences
in the function calls that are available. Once I select a function, I have to figure
out what values are available for the flag arguments. Boolean flags are even worse
since they don’t convey their meaning to the reader—in a function call, I can’t
figure out what true means. It’s clearer to provide an explicit function for the task
I want to do.

  premiumBookConcert(aCustomer);

Not all arguments like this are flag arguments. To be a flag argument, the
callers must be setting the boolean value to a literal value, not data that’s flowing
through the program. Also, the implementation function must be using the
argument to influence its control flow, not as data that it passes to further functions.
Removing flag arguments doesn’t just make the code clearer—it also helps my
tooling. Code analysis tools can now more easily see the difference between
calling the premium logic and calling regular logic.
Flag arguments can have a place if there’s more than one of them in the
function, since otherwise I would need explicit functions for every combination of
their values. But that’s also a signal of a function doing too much, and I should
look for a way to create simpler functions that I can compose for this logic.

Mechanics
Create an explicit function for each value of the parameter.
  If the main function has a clear dispatch conditional, use Decompose Conditional
  (260) to create the explicit functions. Otherwise, create wrapping functions.
For each caller that uses a literal value for the parameter, replace it with a
call to the explicit function.
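For a function with a clear dispatch conditional, the mechanics might play out as in the sketch below. The function names bookConcert and premiumBookConcert come from the text; regularBookConcert, the customer object, and the booking details returned are hypothetical, invented so the code can run.

```javascript
// Explicit functions, one per value of the former flag.
// The returned booking details are invented for illustration.
function premiumBookConcert(aCustomer) {
  return {customer: aCustomer.name, seat: "front", price: 120};
}
function regularBookConcert(aCustomer) {
  return {customer: aCustomer.name, seat: "general", price: 60};
}

// The flag-taking function reduced to a dispatch conditional. Callers whose
// flag is real data flowing through the program can keep using it.
function bookConcert(aCustomer, isPremium) {
  return isPremium ? premiumBookConcert(aCustomer) : regularBookConcert(aCustomer);
}

const aCustomer = {name: "Ada"};
// Before: bookConcert(aCustomer, true);
const booking = premiumBookConcert(aCustomer);
console.log(booking.seat, booking.price);
```

A caller reading premiumBookConcert(aCustomer) no longer has to look up what a bare true means.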
Remove Flag Argument

function setDimension(name, value) {
  if (name === "height") {
    this._height = value;
    return;
  }
  if (name === "width") {
    this._width = value;
    return;
  }
}

function setHeight(value) {this._height = value;}
function setWidth (value) {this._width = value;}

Motivation
A flag argument is a function argument that the caller uses to indicate which
logic the called function should execute. I may call a function that looks like this:

function bookConcert(aCustomer, isPremium) {
  if (isPremium) {
    // logic for premium booking
  } else {
    // logic for regular booking
  }
}

The Mocha framework divides up the test code into blocks, each grouping
together a suite of tests. Each test appears in an it block. For this simple case, the
test has two steps. The first step sets up some fixture—data and objects that are
needed for the test: in this case, a loaded province object. The second step verifies
some characteristic of that fixture—in this case, that the shortfall is the amount
that should be expected given the initial data.
Different developers use the descriptive strings in the describe and it blocks differently.
Some would write a sentence that explains what the test is testing, but others prefer
to leave them empty, arguing that the descriptive sentence is just duplicating the code
in the same way a comment does. I like to put in just enough to identify which test is
which when I get failures.
If I run this test in a NodeJS console, the output looks like this:

’’’’’’’’’’’’’’

  1 passing (61ms)

Note the simplicity of the feedback—just a summary of how many tests are run
and how many have passed.

Always make sure a test will fail when it should.

When I write a test against existing code like this, it’s nice to see that all is
well—but I’m naturally skeptical. Particularly, once I have a lot of tests running,
I’m always nervous that a test isn’t really exercising the code the way I think it is,
and thus won’t catch a bug when I need it to. So I like to see every test fail at
least once when I write it. My favorite way of doing that is to temporarily inject
a fault into the code, for example:

class Province…
  get shortfall() {
    return this._demand - this.totalProduction * 2;
  }

Here’s what the console now looks like:
!

  0 passing (72ms)
  1 failing

  1) province shortfall:
     AssertionError: expected -20 to equal 5
      at Context.<anonymous> (src/tester.js:10:12)

The framework indicates which test failed and gives some information about
the nature of the failure—in this case, what value was expected and what value
actually turned up. I therefore notice at once that something failed—and I can
immediately see which tests failed, giving me a clue as to what went wrong (and,
in this case, confirming the failure was where I injected it).

Run tests frequently. Run those exercising the code you’re working on at least
every few minutes; run all tests at least daily.

In a real system, I might have thousands of tests. A good test framework
allows me to run them easily and to quickly see if any have failed. This simple
feedback is essential to self-testing code. When I work, I’ll be running tests very
frequently—checking progress with new code or checking for mistakes with
refactoring.

function baseCharge(usage) {
  if (usage < 0) return usd(0);
  const amount =
        withinBand(usage, 0, 100) * 0.03
        + withinBand(usage, 100, 200) * 0.05
        + withinBand(usage, 200, Infinity) * 0.07;
  return usd(amount);
}
function topBand(usage) {
  return usage > 200 ? usage - 200 : 0;
}

With the logic working the way it does now, I could remove the initial guard
clause. But although it’s logically unnecessary now, I like to keep it as it documents
how to handle that case.
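For the Parameterize Function example that ends with baseCharge calling withinBand for all three bands, the sketch below puts the finished pieces together so they can be run. The usd helper is not defined in the example; the version here, which rounds to whole cents, is an assumption, as are the sample usage figures. The three original band functions are kept only to check that the parameterized version agrees with them.

```javascript
// Hypothetical stand-in for the usd() helper; assumed to round to whole cents.
function usd(amount) {
  return Math.round(amount * 100) / 100;
}

// The single parameterized function that replaces the three band functions.
function withinBand(usage, bottom, top) {
  return usage > bottom ? Math.min(usage, top) - bottom : 0;
}

function baseCharge(usage) {
  if (usage < 0) return usd(0);
  const amount =
        withinBand(usage, 0, 100) * 0.03
        + withinBand(usage, 100, 200) * 0.05
        + withinBand(usage, 200, Infinity) * 0.07;
  return usd(amount);
}

// The original functions, kept here only for comparison.
function bottomBand(usage) { return Math.min(usage, 100); }
function middleBand(usage) { return usage > 100 ? Math.min(usage, 200) - 100 : 0; }
function topBand(usage) { return usage > 200 ? usage - 200 : 0; }

// For non-negative usage the parameterized calls match the originals.
const allBandsAgree = [0, 50, 100, 150, 200, 250].every(usage =>
  withinBand(usage, 0, 100) === bottomBand(usage) &&
  withinBand(usage, 100, 200) === middleBand(usage) &&
  withinBand(usage, 200, Infinity) === topBand(usage));
console.log(allBandsAgree);
console.log(baseCharge(250));
```

Passing Infinity as the top of the last band works because Math.min(usage, Infinity) is simply usage. For negative usage withinBand(usage, 0, 100) gives 0 where bottomBand would give a negative number, which is why the guard clause is now logically unnecessary.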
The Mocha framework can use different libraries, which it calls assertion
libraries, to verify the fixture for a test. Being JavaScript, there are a quadzillion
of them out there, some of which may still be current when you’re reading this.
The one I’m using at the moment is Chai [chai]. Chai allows me to write my
validations either using an “assert” style:

describe('province', function() {
  it('shortfall', function() {
    const asia = new Province(sampleProvinceData());
    assert.equal(asia.shortfall, 5);
  });
});

or an “expect” style:

describe('province', function() {
  it('shortfall', function() {
    const asia = new Province(sampleProvinceData());
    expect(asia.shortfall).equal(5);
  });
});

I usually prefer the assert style, but at the moment I mostly use the expect style
while working in JavaScript.
Different environments provide different ways to run tests. When I’m programming
in Java, I use an IDE that gives me a graphical test runner. Its progress bar
is green as long as all the tests pass, and turns red should any of them fail. My
colleagues often use the phrases “green bar” and “red bar” to describe the state
of tests. I might say, “Never refactor on a red bar,” meaning you shouldn’t be
refactoring if your test suite has a failing test. Or, I might say, “Revert to green”
to say you should undo recent changes and go back to the last state where you
had an all-passing test suite (usually by going back to a recent version-control
checkpoint).
Graphical test runners are nice, but not essential. I usually have my tests set
to run from a single key in Emacs, and observe the text feedback in my
compilation window. The key point is that I can quickly see if my tests are all OK.

middleBand uses two literal values: 100 and 200. These represent the bottom and
top of this middle band. I begin by using Change Function Declaration (124) to add
them to the call. While I’m at it, I’ll also change the name of the function to
something that makes sense with the parameterization.

function withinBand(usage, bottom, top) {
  return usage > 100 ? Math.min(usage, 200) - 100 : 0;
}

function baseCharge(usage) {
  if (usage < 0) return usd(0);
  const amount =
        bottomBand(usage) * 0.03
        + withinBand(usage, 100, 200) * 0.05
        + topBand(usage) * 0.07;
  return usd(amount);
}

I replace each literal with a reference to the parameter:

function withinBand(usage, bottom, top) {
  return usage > bottom ? Math.min(usage, 200) - bottom : 0;
}

then:

function withinBand(usage, bottom, top) {
  return usage > bottom ? Math.min(usage, top) - bottom : 0;
}

I replace the call to the bottom band with a call to the newly parameterized
function.

function baseCharge(usage) {
  if (usage < 0) return usd(0);
  const amount =
        withinBand(usage, 0, 100) * 0.03
        + withinBand(usage, 100, 200) * 0.05
        + topBand(usage) * 0.07;
  return usd(amount);
}

function bottomBand(usage) {
  return Math.min(usage, 100);
}

To replace the call to the top band, I need to make use of infinity.

Add Another Test
Now I’ll continue adding more tests. The style I follow is to look at all the things
the class should do and test each one of them for any conditions that might
cause the class to fail. This is not the same as testing every public method, which
is what some programmers advocate. Testing should be risk-driven; remember,
I’m trying to find bugs, now or in the future. Therefore I don’t test accessors that
just read and write a field: They are so simple that I’m not likely to find a bug
there.
This is important because trying to write too many tests usually leads to not
writing enough. I get many benefits from testing even if I do only a little testing.
My focus is to test the areas that I’m most worried about going wrong. That way
I get the most benefit for my testing effort.

It is better to write and run incomplete tests than not to run complete tests.

So I’ll start by hitting the other main output for this code—the profit
calculation. Again, I’ll just do a basic test for profit on my initial fixture.

describe('province', function() {
  it('shortfall', function() {
    const asia = new Province(sampleProvinceData());
    expect(asia.shortfall).equal(5);
  });
  it('profit', function() {
    const asia = new Province(sampleProvinceData());
    expect(asia.profit).equal(230);
  });
});

That shows the final result, but the way I got it was by first setting the expected
value to a placeholder, then replacing it with whatever the program produced (230).
I could have calculated it by hand myself, but since the code is supposed to be
working correctly, I’ll just trust it for now. Once I have that new test working
correctly, I break it by altering the profit calculation with a spurious * 2. I satisfy
myself that the test fails as it should, then revert my injected fault. This
pattern—write with a placeholder for the expected value, replace the placeholder
with the code’s actual value, inject a fault, revert the fault—is a common one I
use when adding tests to existing code.

If the original parameterized function doesn’t work for a similar function, adjust
it for the new function before moving on to the next.

Example
An obvious example is something like this:

function tenPercentRaise(aPerson) {
  aPerson.salary = aPerson.salary.multiply(1.1);
}
function fivePercentRaise(aPerson) {
  aPerson.salary = aPerson.salary.multiply(1.05);
}

Hopefully it’s obvious that I can replace these with

function raise(aPerson, factor) {
  aPerson.salary = aPerson.salary.multiply(1 + factor);
}

But it can be a bit more involved than that. Consider this code:

function baseCharge(usage) {
  if (usage < 0) return usd(0);
  const amount =
        bottomBand(usage) * 0.03
        + middleBand(usage) * 0.05
        + topBand(usage) * 0.07;
  return usd(amount);
}

function bottomBand(usage) {
  return Math.min(usage, 100);
}

function middleBand(usage) {
  return usage > 100 ? Math.min(usage, 200) - 100 : 0;
}

function topBand(usage) {
  return usage > 200 ? usage - 200 : 0;
}

Here the logic is clearly pretty similar—but is it similar enough to support
creating a parameterized method for the bands? It is, but may be a touch less obvious
than the trivial case above.
When looking to parameterize some related functions, my approach is to take
one of the functions and add parameters to it, with an eye to the other cases.
With range-oriented things like this, usually the place to start is with the middle
range. So I’ll work on middleBand to change it to use parameters, and then adjust
other callers to fit.

There is some duplication between these tests—both of them set up the fixture
with the same first line. Just as I’m suspicious of duplicated code in regular code,
I’m suspicious of it in test code, so will look to remove it by factoring to a common
place. One option is to raise the constant to the outer scope.

describe('province', function() {
  const asia = new Province(sampleProvinceData()); // DON'T DO THIS
  it('shortfall', function() {
    expect(asia.shortfall).equal(5);
  });
  it('profit', function() {
    expect(asia.profit).equal(230);
  });
});

But as the comment indicates, I never do this. It will work for the moment,
but it introduces a petri dish that’s primed for one of the nastiest bugs in testing—a
shared fixture which causes tests to interact. The const keyword in JavaScript only
means the reference to asia is constant, not the content of that object. Should a
future test change that common object, I’ll end up with intermittent test failures
due to tests interacting through the shared fixture, yielding different results
depending on what order the tests are run in. That’s a nondeterminism in the tests
that can lead to long and difficult debugging at best, and a collapse of confidence
in the tests at worst. Instead, I prefer to do this:

describe('province', function() {
  let asia;
  beforeEach(function() {
    asia = new Province(sampleProvinceData());
  });
  it('shortfall', function() {
    expect(asia.shortfall).equal(5);
  });
  it('profit', function() {
    expect(asia.profit).equal(230);
  });
});

The beforeEach clause is run before each test runs, clearing out asia and setting
it to a fresh value each time. This way I build a fresh fixture before each test is
run, which keeps the tests isolated and prevents the nondeterminism that causes
so much trouble.
When I give this advice, some people are concerned that building a fresh fixture
every time will slow down the tests. Most of the time, it won’t be noticeable. If
it is a problem, I’d consider a shared fixture, but then I will need to be really careful that no test ever changes it. I can also use a shared fixture if I’m sure it is truly immutable. But my reflex is to use a fresh fixture because the debugging cost of making a mistake with a shared fixture has bit me too often in the past.

Given I run the setup code in beforeEach with every test, why not leave the setup code inside the individual it blocks? I like my tests to all operate on a common bit of fixture, so I can become familiar with that standard fixture and see the various characteristics to test on it. The presence of the beforeEach block signals to the reader that I’m using a standard fixture. You can then look at all the tests within the scope of that describe block and know they all take the same base data as a starting point.

Modifying the Fixture

So far, the tests I’ve written show how I probe the properties of the fixture once I’ve loaded it. But in use, that fixture will be regularly updated by the users as they change values.

Most of the updates are simple setters, and I don’t usually bother to test those as there’s little chance they will be the source of a bug. But there is some complicated behavior around Producer’s production setter, so I think that’s worth a test.

describe('province'…
  it('change production', function() {
    asia.producers[0].production = 20;
    expect(asia.shortfall).equal(-6);
    expect(asia.profit).equal(292);
  });

This is a common pattern. I take the initial standard fixture that’s set up by the beforeEach block, I exercise that fixture for the test, then I verify the fixture has done what I think it should have done. If you read much about testing, you’ll hear these phases described variously as setup-exercise-verify, given-when-then, or arrange-act-assert. Sometimes you’ll see all the steps present within the test itself, in other cases the common early phases can be pushed out into standard setup routines such as beforeEach.

(There is an implicit fourth phase that’s usually not mentioned: teardown. Teardown removes the fixture between tests so that different tests don’t interact with each other. By doing all my setup in beforeEach, I allow the test framework to implicitly tear down my fixture between tests, so I can take the teardown phase for granted. Most writers on tests gloss over teardown, reasonably so, since most of the time we ignore it. But occasionally, it can be important to have an explicit teardown operation, particularly if we have a fixture that we have to share between tests because it’s slow to create.)

Parameterize Function

formerly: Parameterize Method

  function tenPercentRaise(aPerson) {
    aPerson.salary = aPerson.salary.multiply(1.1);
  }
  function fivePercentRaise(aPerson) {
    aPerson.salary = aPerson.salary.multiply(1.05);
  }

  function raise(aPerson, factor) {
    aPerson.salary = aPerson.salary.multiply(1 + factor);
  }

Motivation

If I see two functions that carry out very similar logic with different literal values, I can remove the duplication by using a single function with parameters for the different values. This increases the usefulness of the function, since I can apply it elsewhere with different values.

Mechanics

Select one of the similar methods.
Use Change Function Declaration (124) to add any literals that need to turn into parameters.
For each caller of the function, add the literal value.
Test.
Change the body of the function to use the new parameters. Test after each change.
For each similar function, replace the call with a call to the parameterized function. Test after each one.
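
The mechanics above can be sketched on a set of usage-band functions that differ only in their boundary literals. This is a rough illustration under stated assumptions, not a definitive result: the name withinBand and its bottom and top parameters are hypothetical, and the three original functions are kept only so that the parameterized version can be compared against them.

```javascript
// Hypothetical parameterized function (name and parameters are assumptions).
// Returns the portion of `usage` that lies between `bottom` and `top`.
function withinBand(usage, bottom, top) {
  return usage > bottom ? Math.min(usage, top) - bottom : 0;
}

// The three special-case functions it is meant to replace.
function bottomBand(usage) { return Math.min(usage, 100); }
function middleBand(usage) { return usage > 100 ? Math.min(usage, 200) - 100 : 0; }
function topBand(usage) { return usage > 200 ? usage - 200 : 0; }

// Compares old and new for one usage value. This only holds for
// non-negative usage; negative values need a guard in the caller.
function bandsAgree(usage) {
  return withinBand(usage, 0, 100) === bottomBand(usage)
    && withinBand(usage, 100, 200) === middleBand(usage)
    && withinBand(usage, 200, Infinity) === topBand(usage);
}
```

Passing Infinity as the upper bound of the top band is one way to let a single function cover an open-ended range; each caller is migrated one at a time, testing after each change.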
In this test, I’m verifying two different characteristics in a single it clause. As a general rule, it’s wise to have only a single verify statement in each it clause. This is because the test will fail on the first verification failure, which can often hide useful information when you’re figuring out why a test is broken. In this case, I feel the two are closely enough connected that I’m happy to have them in the same test. Should I wish to separate them into separate it clauses, I can do that later.

Probing the Boundaries

So far my tests have focused on regular usage, often referred to as “happy path” conditions where everything is going OK and things are used as expected. But it’s also good to throw tests at the boundaries of these conditions, to see what happens when things might go wrong.

Whenever I have a collection of something, such as producers in this example, I like to see what happens when it’s empty.

  describe('no producers', function() {
    let noProducers;
    beforeEach(function() {
      const data = {
        name: "No producers",
        producers: [],
        demand: 30,
        price: 20
      };
      noProducers = new Province(data);
    });
    it('shortfall', function() {
      expect(noProducers.shortfall).equal(30);
    });
    it('profit', function() {
      expect(noProducers.profit).equal(0);
    });
  });

With numbers, zeros are good things to probe:

describe('province'…
  it('zero demand', function() {
    asia.demand = 0;
    expect(asia.shortfall).equal(-25);
    expect(asia.profit).equal(0);
  });

as are negatives:

describe('province'…
  it('negative demand', function() {
    asia.demand = -1;
    expect(asia.shortfall).equal(-26);
    expect(asia.profit).equal(-10);
  });

At this point, I may start to wonder if a negative demand resulting in a negative profit really makes any sense for the domain. Shouldn’t the minimum demand be zero? In which case, perhaps, the setter should react differently to a negative argument, raising an error or setting the value to zero anyway. These are good questions to ask, and writing tests like this helps me think about how the code ought to react to boundary cases.

Think of the boundary conditions under which things might go wrong and concentrate your tests there.

The setters take a string from the fields in the UI, which are constrained to only accept numbers, but they can still be blank, so I should have tests that ensure the code responds to the blanks the way I want it to.

describe('province'…
  it('empty string demand', function() {
    asia.demand = "";
    expect(asia.shortfall).NaN;
    expect(asia.profit).NaN;
  });

Notice how I’m playing the part of an enemy to my code. I’m actively thinking about how I can break it. I find that state of mind to be both productive and fun. It indulges the mean-spirited part of my psyche.

This one is interesting:

  describe('string for producers', function() {
    it('', function() {
      const data = {
        name: "String producers",
        producers: "",
        demand: 30,
        price: 20
      };
      const prov = new Province(data);
      expect(prov.shortfall).equal(0);
    });
  });

This doesn’t produce a simple failure reporting that the shortfall isn’t 0. Here’s the console output:

  .........!

  9 passing (74ms)
  1 failing

  1) string for producers :
     TypeError: doc.producers.forEach is not a function
      at new Province (src/main.js:22:19)
      at Context.<anonymous> (src/tester.js:86:18)

Mocha treats this as a failure, but many testing frameworks distinguish between this situation, which they call an error, and a regular failure. A failure indicates a verify step where the actual value is outside the bounds expected by the verify statement. But this error is a different animal: it’s an exception raised during an earlier phase (in this case, the setup). This looks like an exception that the authors of the code hadn’t anticipated, so we get an error sadly familiar to JavaScript programmers (“… is not a function”).

How should the code respond to such a case? One approach is to add some handling that would give a better error response, either raising a more meaningful error message, or just setting producers to an empty array (with perhaps a log message). But there may also be valid reasons to leave it as it is. Perhaps the input object is produced by a trusted source, such as another part of the same code base. Putting in lots of validation checks between modules in the same code base can result in duplicate checks that cause more trouble than they are worth, especially if they duplicate validation done elsewhere. But if that input object is coming in from an external source, such as a JSON-encoded request, then validation checks are needed, and should be tested. In either case, writing tests like this raises these kinds of questions.

If I’m writing tests like this before refactoring, I would probably discard this test. Refactoring should preserve observable behavior; an error like this is outside the bounds of observable behavior, so I need not be concerned if my refactoring changes the code’s response to this condition.

If this error could lead to bad data running around the program, causing a failure that will be hard to debug, I might use Introduce Assertion (302) to fail fast. I don’t add tests to catch such assertion failures, as they are themselves a form of test.

Don’t let the fear that testing can’t catch all bugs stop you from writing tests that catch most bugs.

When do you stop? I’m sure you have heard many times that you cannot prove that a program has no bugs by testing. That’s true, but it does not affect the ability of testing to speed up programming. I’ve seen various proposed rules to ensure you have tested every combination of everything. It’s worth taking a look at these, but don’t let them get to you. There is a law of diminishing returns in testing, and there is the danger that by trying to write too many tests you become discouraged and end up not writing any. You should concentrate on where the risk is. Look at the code and see where it becomes complex. Look at a function and consider the likely areas of error. Your tests will not find every bug, but as you refactor, you will understand the program better and thus find more bugs. Although I always start refactoring with a test suite, I invariably add to it as I go along.

Much More Than This

That’s as far as I’m going to go with this chapter; after all, this is a book on refactoring, not on testing. But testing is an important topic, both because it’s a necessary foundation for refactoring and because it’s a valuable tool in its own right. While I’ve been happy to see the growth of refactoring as a programming practice since I wrote this book, I’ve been even happier to see the change in attitudes to testing. Previously seen as the responsibility of a separate (and inferior) group, testing is now increasingly a first-class concern of any decent software developer. Architectures often are, rightly, judged on their testability.

The kinds of tests I’ve shown here are unit tests, designed to operate on a small area of the code and run fast. They are the backbone of self-testing code; most tests in such a system are unit tests. There are other kinds of tests too, focusing on integration between components, exercising multiple levels of the software together, looking for performance issues, etc. (And even more varied than the types of tests are the arguments people get into about how to classify tests.)

Like most aspects of programming, testing is an iterative activity. Unless you are either very skilled or very lucky, you won’t get your tests right the first time. I find I’m constantly working on the test suite, just as much as I work on the main code. Naturally, this means adding new tests as I add new features, but it also involves looking at the existing tests. Are they clear enough? Do I need to refactor them so I can more easily understand what they are doing? Have I got the right tests? An important habit to get into is to respond to a bug by first writing a test that clearly reveals the bug. Only after I have the test do I fix the bug. By having the test, I know the bug will stay dead. I also think about that bug and its test: Does it give me clues to other gaps in the test suite?

When you get a bug report, start by writing a unit test that exposes the bug.

A common question is, “How much testing is enough?” There’s no good measurement for this. Some people advocate using test coverage [mf-tc] as a measure, but test coverage analysis is only good for identifying untested areas of the code, not for assessing the quality of a test suite.

Separate Query from Modifier

  function getTotalOutstandingAndSendBill() {
    const result = customer.invoices.reduce((total, each) => each.amount + total, 0);
    sendBill();
    return result;
  }

  function totalOutstanding() {
    return customer.invoices.reduce((total, each) => each.amount + total, 0);
  }
  function sendBill() {
    emailGateway.send(formatBill(customer));
  }

Motivation

When I have a function that gives me a value and has no observable side effects, I have a very valuable thing. I can call this function as often as I like. I can move the call to other places in a calling function. It’s easier to test. In short, I have a lot less to worry about.

It is a good idea to clearly signal the difference between functions with side effects and those without. A good rule to follow is that any function that returns a value should not have observable side effects: the command-query separation [mf-cqs]. Some programmers treat this as an absolute rule. I’m not 100 percent pure on this (as on anything), but I try to follow it most of the time, and it has served me well.

If I come across a method that returns a value but also has side effects, I always try to separate the query from the modifier.

Note that I use the phrase observable side effects. A common optimization is to cache the value of a query in a field so that repeated calls go quicker. Although this changes the state of the object with the cache, the change is not observable. Any sequence of queries will always return the same results for each query.

Mechanics

Copy the function, name it as a query.
    Look into the function to see what is returned. If the query is used to populate a variable, the variable’s name should provide a good clue.
Remove any side effects from the new query function.
Run static checks.
Find each call of the original method. If that call uses the return value, replace the original call with a call to the query and insert a call to the original method below it. Test after each change.
Remove return values from original.
Test.

Often after doing this there will be duplication between the query and the original method that can be tidied up.

Example

Here is a function that scans a list of names for a miscreant. If it finds one, it returns the name of the bad guy and sets off the alarms. It only does this for the first miscreant it finds (I guess one is enough).

  function alertForMiscreant (people) {
    for (const p of people) {
      if (p === "Don") {
        setOffAlarms();
        return "Don";
      }
      if (p === "John") {
        setOffAlarms();
        return "John";
      }
    }
    return "";
  }

I begin by copying the function, naming it after the query aspect of the function.

  function findMiscreant (people) {
    for (const p of people) {
      if (p === "Don") {
        setOffAlarms();
        return "Don";
      }
      if (p === "John") {
        setOffAlarms();
        return "John";
      }
    }
    return "";
  }

I remove the side effects from this new query.

  function findMiscreant (people) {
    for (const p of people) {
      if (p === "Don") {
        return "Don";
      }
      if (p === "John") {
        return "John";
      }
    }
    return "";
  }

I now go to each caller and replace it with a call to the query, followed by a call to the modifier. So

  const found = alertForMiscreant(people);

changes to

  const found = findMiscreant(people);
  alertForMiscreant(people);

I now remove the return values from the modifier.

  function alertForMiscreant (people) {
    for (const p of people) {
      if (p === "Don") {
        setOffAlarms();
        return;
      }
      if (p === "John") {
        setOffAlarms();
        return;
      }
    }
    return;
  }

Now I have a lot of duplication between the original modifier and the new query, so I can use Substitute Algorithm (195) so that the modifier uses the query.

  function alertForMiscreant (people) {
    if (findMiscreant(people) !== "") setOffAlarms();
  }
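
To illustrate separating a query from a modifier in runnable form, here is a minimal sketch. Everything in it is an assumption made for illustration: the Account class, its sentBills array (a stand-in for a real email gateway), and the sample invoice amounts are hypothetical and do not come from the text.

```javascript
// Hypothetical example; all names are illustrative assumptions.
class Account {
  constructor(invoices) {
    this.invoices = invoices;
    this.sentBills = []; // in-memory stand-in for an email gateway
  }

  // Query: returns a value and has no observable side effects.
  get totalOutstanding() {
    return this.invoices.reduce((total, each) => each.amount + total, 0);
  }

  // Modifier: changes state and deliberately returns nothing.
  sendBill() {
    this.sentBills.push({ amount: this.totalOutstanding });
  }
}

const account = new Account([{ amount: 20 }, { amount: 15 }]);
const first = account.totalOutstanding;  // safe to call as often as needed
const second = account.totalOutstanding; // same answer, nothing was sent
account.sendBill();                      // the side effect happens only here
```

Because the query can be called repeatedly without sending anything, a caller that needs both the total and the bill makes two explicit calls, which keeps the side effect visible at the call site.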
The best measure for a good enough test suite is subjective: How confident are you that if someone introduces a defect into the code, some test will fail? This isn’t something that can be objectively analyzed, and it doesn’t account for false confidence, but the aim of self-testing code is to get that confidence. If I can refactor my code and be pretty sure that I’ve not introduced a bug because my tests come back green, then I can be happy that I have good enough tests.

It is possible to write too many tests. One sign of that is when I spend more time changing the tests than the code under test, and I feel the tests are slowing me down. But while over-testing does happen, it’s vanishingly rare compared to under-testing.

Chapter 11
Refactoring APIs

Modules and their functions are the building blocks of our software. APIs are the joints that we use to plug them together. Making these APIs easy to understand and use is important but also difficult: I need to refactor them as I learn how to improve them.

A good API clearly separates any functions that update data from those that only read data. If I see them combined, I use Separate Query from Modifier (306) to tease them apart. I can unify functions that only vary due to a value with Parameterize Function (310). Some parameters, however, are really just a signal of an entirely different behavior and are best excised with Remove Flag Argument (314).

Data structures are often unpacked unnecessarily when passed between functions; I prefer to keep them together with Preserve Whole Object (319). Decisions on what should be passed as a parameter, and what can be resolved by the called function, are ones I often need to revisit with Replace Parameter with Query (324) and Replace Query with Parameter (327).

A class is a common form of module. I prefer my objects to be as immutable as possible, so I use Remove Setting Method (331) whenever I can. Often, when a caller asks for a new object, I need more flexibility than a simple constructor gives, which I can get by using Replace Constructor with Factory Function (334).

The last two refactorings address the difficulty of breaking down a particularly complex function that passes a lot of data around. I can turn that function into an object with Replace Function with Command (337), which makes it easier to use Extract Function (106) on the function’s body. If I later simplify the function and no longer need it as a command object, I turn it back into a function with Replace Command with Function (344).
Introducing the Catalog

The rest of this book is a catalog of refactorings. This catalog started from my personal notes that I made to remind myself how to do refactorings in a safe and efficient way. Since then, I’ve refined the catalog, and there’s more of it that comes from deliberate exploration of some refactoring moves. It’s still something I use when I do a refactoring I haven’t done in a while.

The sketch shows a code example of the transformation of the refactoring. It’s not meant to explain what the refactoring is, let alone how to do it, but it should remind you what the refactoring is if you’ve come across it before. If not, you’ll probably need to work through the example to get a better idea. I also include

The Choice of Refactorings

This is by no means a complete catalog of refactorings. It is, I hope, a collection of those that are most useful to have written down. By “most useful” I mean those that are both commonly used and worthwhile to name and describe. I find something worthwhile to describe for a combination of reasons: Some have interesting mechanics which help general refactoring skills, some have a strong effect on improving the design of code.

Some refactorings are missing because they are so small and straightforward that I don’t feel they are worth writing up. An example in the first edition was Slide Statements (223), which I use frequently but didn’t recognize as something I should include in the catalog (obviously, I changed my mind for this edition). These may well get added to the book over time, depending on how much energy I devote to new refactorings in the future.

Another category is refactorings that logically exist, but either aren’t used much by me or show a simple similarity to other refactorings. Every refactoring in this book has a logical inverse refactoring, but I didn’t write all of them up because I don’t find many inverses interesting. Encapsulate Variable (132) is a common and powerful refactoring but its inverse is something I hardly ever do (and it is easy to perform anyway) so I didn’t think we need a catalog entry for it.

Introduce Assertion

  if (this.discountRate)
    base = base - (this.discountRate * base);

  assert(this.discountRate >= 0);
  if (this.discountRate)
    base = base - (this.discountRate * base);

Motivation

Often, sections of code work only if certain conditions are true. This may be as simple as a square root calculation only working on a positive input value. With an object, it may require that at least one of a group of fields has a value in it.

Such assumptions are often not stated but can only be deduced by looking through an algorithm. Sometimes, the assumptions are stated with a comment. A better technique is to make the assumption explicit by writing an assertion.

An assertion is a conditional statement that is assumed to be always true. Failure of an assertion indicates a programmer error. Assertion failures should never be checked by other parts of the system. Assertions should be written so that the program functions equally correctly if they are all removed; indeed, some languages provide assertions that can be disabled by a compile-time switch.

I often see people encourage using assertions in order to find errors. While this is certainly a Good Thing, it’s not the only reason to use them. I find assertions to be a valuable form of communication: they tell the reader something about the assumed state of the program at this point of execution. I also find them handy for debugging, and their communication value means I’m inclined to leave them in once I’ve fixed the error I’m chasing. Self-testing code reduces their value for debugging, as steadily narrowing unit tests often do the job better, but I still like assertions for communication.

In this case, I’d rather put this assertion into the setting method. If the assertion fails in applyDiscount, my first puzzle is how it got into the field in the first place.

An assertion like this can be particularly valuable if it’s hard to spot the error source, which may be an errant minus sign in some input data or some inversion elsewhere in the code.

There is a real danger of overusing assertions. I don’t use assertions to check everything that I think is true, but only to check things that need to be true. Duplication is a particular problem, as it’s common to tweak these kinds of conditions. So I find it’s essential to remove any duplication in these conditions, usually by a liberal use of Extract Function (106).

I only use assertions for things that are programmer errors. If I’m reading data from an external source, any value checking should be a first-class part of the program, not an assertion, unless I’m really confident in the external source. Assertions are a last resort to help track bugs, though, ironically, I only use them when I think they should never fail.
Introduce Special Case 301
function enrichSite(aSite) {
const result = _.cloneDeep(aSite);
const unknownCustomer = {
isUnknown: true,
name: "occupant",
billingPlan: registry.billingPlans.basic,
paymentHistory: {
weeksDelinquentInLastYear: 0,
}
This page intentionally left blank };
client 3…
const weeksDelinquent = aCustomer.paymentHistory.weeksDelinquentInLastYear;
300 Chapter 10 Simplifying Conditional Logic
I can then modify the special-case condition test to include probing for this Chapter 6
new property. I keep the original test as well, so that the test will work on both
raw and enriched sites.
function isUnknown(aCustomer) {
if (aCustomer === "unknown") return true;
else return aCustomer.isUnknown; A First Set of Refactorings
}
I test to ensure that’s all OK, then start applying Combine Functions into Transform
(149) on the special case. First, I move the choice of name into the enrichment
function.
function enrichSite(aSite) {
const result = _.cloneDeep(aSite);
const unknownCustomer = { I’m starting the catalog with a set of refactorings that I consider the most useful
isUnknown: true, to learn first.
name: "occupant", Probably the most common refactoring I do is extracting code into a function
}; (Extract Function (106)) or a variable (Extract Variable (119)). Since refactoring is
all about change, it’s no surprise that I also frequently use the inverses of those
if (isUnknown(result.customer)) result.customer = unknownCustomer;
else result.customer.isUnknown = false;
two (Inline Function (115) and Inline Variable (123)).
return result; Extraction is all about giving names, and I often need to change the names as
} I learn. Change Function Declaration (124) changes names of functions; I also
use that refactoring to add or remove a function’s arguments. For variables, I use
client 1… Rename Variable (137), which relies on Encapsulate Variable (132). When changing
const rawSite = acquireSiteData(); function arguments, I often find it useful to combine a common clump of
const site = enrichSite(rawSite);
arguments into a single object with Introduce Parameter Object (140).
const aCustomer = site.customer;
// ... lots of intervening code ...
Forming and naming functions are essential low-level refactorings—but, once
const customerName = aCustomer.name; created, it’s necessary to group functions into higher-level modules. I use Combine
Functions into Class (144) to group functions, together with the data they operate
I test, then do the billing plan. on, into a class. Another path I take is to combine them into a transform (Combine
function enrichSite(aSite) {
Functions into Transform (149)), which is particularly handy with read-only data.
const result = _.cloneDeep(aSite); At a step further in scale, I can often form these modules into distinct processing
const unknownCustomer = { phases using Split Phase (154).
isUnknown: true,
name: "occupant",
billingPlan: registry.billingPlans.basic,
};
client 2…
const plan = aCustomer.billingPlan;
105
106 Chapter 6 A First Set of Refactorings Introduce Special Case 299
client 1…
Extract Function const rawSite = acquireSiteData();
const site = enrichSite(rawSite);
formerly: Extract Method const aCustomer = site.customer;
inverse of: Inline Function (115) // ... lots of intervening code ...
let customerName;
if (aCustomer === "unknown") customerName = "occupant";
else customerName = aCustomer.name;
function enrichSite(inputSite) {
return _.cloneDeep(inputSite);
}
client 2…
function printOwing(invoice) { const plan = (isUnknown(aCustomer)) ?
printBanner(); registry.billingPlans.basic
let outstanding = calculateOutstanding(); : aCustomer.billingPlan;
printDetails(outstanding);
client 3…
function printDetails(outstanding) { const weeksDelinquent = (isUnknown(aCustomer)) ?
console.log(`name: ${invoice.customer}`); 0
console.log(`amount: ${outstanding}`); : aCustomer.paymentHistory.weeksDelinquentInLastYear;
}
} I begin the enrichment by adding an isUnknown property to the customer.
function enrichSite(aSite) {
const result = _.cloneDeep(aSite);
Motivation const unknownCustomer = {
isUnknown: true,
Extract Function is one of the most common refactorings I do. (Here, I use the };
term “function” but the same is true for a method in an object-oriented language,
or any kind of procedure or subroutine.) I look at a fragment of code, understand if (isUnknown(result.customer)) result.customer = unknownCustomer;
what it is doing, then extract it into its own function named after its purpose. else result.customer.isUnknown = false;
During my career, I’ve heard many arguments about when to enclose code in return result;
}
its own function. Some of these guidelines were based on length: Functions
should be no larger than fit on a screen. Some were based on reuse: Any code
298 Chapter 10 Simplifying Conditional Logic Extract Function 107
{ used more than once should be put in its own function, but code only used once
name: "Acme Boston", should be left inline. The argument that makes most sense to me, however, is
location: "Malden MA",
the separation between intention and implementation. If you have to spend effort
// more site details
customer: { looking at a fragment of code and figuring out what it’s doing, then you should
name: "Acme Industries", extract it into a function and name the function after the “what.” Then, when you
billingPlan: "plan-451", read it again, the purpose of the function leaps right out at you, and most of the
paymentHistory: { time you won’t need to care about how the function fulfills its purpose (which
weeksDelinquentInLastYear: 7 is the body of the function).
//more
},
Once I accepted this principle, I developed a habit of writing very small
// more functions—typically, only a few lines long. To me, any function with more than
} half-a-dozen lines of code starts to smell, and it’s not unusual for me to have
} functions that are a single line of code. The fact that size isn’t important was
brought home to me by an example that Kent Beck showed me from the original
Smalltalk system. Smalltalk in those days ran on black-and-white systems. If you
wanted to highlight some text or graphics, you would reverse the video. Smalltalk’s
graphics class had a method for this called highlight, whose implementation was
just a call to the method reverse. The name of the method was longer than its
implementation, but that didn’t matter because there was a big distance between
the intention of the code and its implementation.
Some people are concerned about short functions because they worry about
the performance cost of a function call. When I was young, that was occasionally
a factor, but that’s very rare now. Optimizing compilers often work better with
shorter functions which can be cached more easily. As always, follow the general
guidelines on performance optimization.
Small functions like this only work if the names are good, so you need to pay
good attention to naming. This takes practice, but once you get good at it, this
approach can make code remarkably self-documenting.
Often, I see fragments of code in a larger function that start with a comment
to say what they do. The comment is often a good hint for the name of the
function when I extract that fragment.

Mechanics

Create a new function, and name it after the intent of the function (name
it by what it does, not by how it does it).

If the code I want to extract is very simple, such as a single function call, I still
extract it if the name of the new function will reveal the intent of the code in a
better way. If I can’t come up with a more meaningful name, that’s a sign that I
shouldn’t extract the code. However, I don’t have to come up with the best name
right away; sometimes a good name only appears as I work with the extraction.
It’s OK to extract a function, try to work with it, realize it isn’t helping, and then
inline it back again. As long as I’ve learned something, my time wasn’t wasted.

In some cases, the customer isn’t known, and such cases are marked in the
same way:

{
  name: "Warehouse Unit 15",
  location: "Malden MA",
  // more site details
  customer: "unknown",
}

I have similar client code that checks for the unknown customer:

client 1…
  const site = acquireSiteData();
  const aCustomer = site.customer;
  // ... lots of intervening code ...
  let customerName;
  if (aCustomer === "unknown") customerName = "occupant";
  else customerName = aCustomer.name;

client 2…
  const plan = (aCustomer === "unknown") ?
        registry.billingPlans.basic
        : aCustomer.billingPlan;

client 3…
  const weeksDelinquent = (aCustomer === "unknown") ?
        0
        : aCustomer.paymentHistory.weeksDelinquentInLastYear;

My first step is to run the site data structure through a transform that, currently,
does nothing but a deep copy.
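A minimal sketch of such a do-nothing transform follows. The name enrichSite and the sample record are assumptions for illustration, and the JSON round trip is just one way to get a deep copy; it only suits plain data without functions, dates, or cycles.

```javascript
// Hypothetical transform: for now it only returns a deep copy of its input.
// The JSON round trip is an assumption that the record holds plain data only.
function enrichSite(inputSite) {
  return JSON.parse(JSON.stringify(inputSite));
}

// Sample record, mirroring the shape used in the text.
const rawSite = {
  name: "Warehouse Unit 15",
  location: "Malden MA",
  customer: "unknown",
};

const site = enrichSite(rawSite);
site.location = "changed in the copy only"; // the original record is untouched
```

Because clients now read from the copy, later steps can add fields to it without disturbing the raw record.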
If the language supports nested functions, nest the extracted function inside the
source function. That will reduce the amount of out-of-scope variables to deal
with after the next couple of steps. I can always use Move Function (198) later.

Copy the extracted code from the source function into the new target
function.

Scan the extracted code for references to any variables that are local in scope
to the source function and will not be in scope for the extracted function.
Pass them as parameters.

If I extract into a nested function of the source function, I don’t run into these
problems.

Usually, these are local variables and parameters to the function. The most general
approach is to pass all such parameters in as arguments. There are usually no
difficulties for variables that are used but not assigned to.
If a variable is only used inside the extracted code but is declared outside, move
the declaration into the extracted code.

Any variables that are assigned to need more care if they are passed by value. If
there’s only one of them, I try to treat the extracted code as a query and assign
the result to the variable concerned.

Sometimes, I find that too many local variables are being assigned by the extracted
code. It’s better to abandon the extraction at this point. When this happens, I
consider other refactorings such as Split Variable (240) or Replace Temp with Query
(178) to simplify variable usage and revisit the extraction later.

Compile after all variables are dealt with.
Once all the variables are dealt with, it can be useful to compile if the language
environment does compile-time checks. Often, this will help find any variables
that haven’t been dealt with properly.

client 1…
  const customerName = aCustomer.name;

Then, the billing plan:

function createUnknownCustomer() {
  return {
    isUnknown: true,
    name: "occupant",
    billingPlan: registry.billingPlans.basic,
  };
}

client 2…
  const plan = aCustomer.billingPlan;

Similarly, I can create a nested null payment history with the literal:

function createUnknownCustomer() {
  return {
    isUnknown: true,
    name: "occupant",
    billingPlan: registry.billingPlans.basic,
    paymentHistory: {
      weeksDelinquentInLastYear: 0,
    },
  };
}

client 3…
  const weeksDelinquent = aCustomer.paymentHistory.weeksDelinquentInLastYear;

If I use a literal like this, I should make it immutable, which I might do with
freeze. Usually, I’d rather use a class.
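One point worth keeping in mind when freezing the special-case literal: Object.freeze is shallow, so the nested payment history would stay mutable unless it is frozen too. A minimal sketch, where deepFreeze is a hypothetical helper and the registry value is a stand-in:

```javascript
// Hypothetical helper: freezes an object and every nested object it holds.
function deepFreeze(value) {
  for (const key of Object.keys(value)) {
    const child = value[key];
    if (child !== null && typeof child === "object") deepFreeze(child);
  }
  return Object.freeze(value);
}

// Stand-in for the registry referenced in the text.
const registry = { billingPlans: { basic: "basic-plan" } };

function createUnknownCustomer() {
  return deepFreeze({
    isUnknown: true,
    name: "occupant",
    billingPlan: registry.billingPlans.basic,
    paymentHistory: {
      weeksDelinquentInLastYear: 0,
    },
  });
}

const unknown = createUnknownCustomer();
```

With this in place, both unknown and unknown.paymentHistory reject writes, so no client can accidentally change the shared special-case values.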
Example: Using a Transform

Both previous cases involve a class, but the same idea can be applied to a record
by using a transform step.
Let’s assume our input is a simple record structure that looks something
like this:

Replace the extracted code in the source function with a call to the target
function.

Test.

Look for other code that’s the same or similar to the code just extracted,
and consider using Replace Inline Code with Function Call (222) to call the new
function.
Some refactoring tools support this directly. Otherwise, it can be worth doing some
quick searches to see if duplicate code exists elsewhere.
I apply Extract Function (106) to the special case condition test.

function isUnknown(arg) {
  return (arg === "unknown");
}

client 1…
  let customerName;
  if (isUnknown(aCustomer)) customerName = "occupant";
  else customerName = aCustomer.name;

client 2…
  const plan = isUnknown(aCustomer) ?
        registry.billingPlans.basic
        : aCustomer.billingPlan;

client 3…
  const weeksDelinquent = isUnknown(aCustomer) ?
        0
        : aCustomer.paymentHistory.weeksDelinquentInLastYear;

I change the site class and the condition test to work with the special case.

class Site…
  get customer() {
    return (this._customer === "unknown") ? createUnknownCustomer() : this._customer;
  }

top level…
  function isUnknown(arg) {
    return arg.isUnknown;
  }

Then I replace each standard response with the appropriate literal value. I start
with the name:

  console.log("***********************");
  console.log("**** Customer Owes ****");
  console.log("***********************");

  // calculate outstanding
  for (const o of invoice.orders) {
    outstanding += o.amount;
  }

  // record due date
  const today = Clock.today;
  invoice.dueDate = new Date(today.getFullYear(), today.getMonth(), today.getDate() + 30);

  //print details
  console.log(`name: ${invoice.customer}`);
  console.log(`amount: ${outstanding}`);
  console.log(`due: ${invoice.dueDate.toLocaleDateString()}`);
}

You may be wondering what the Clock.today is about. It is a Clock Wrapper [mf-cw], an object
that wraps calls to the system clock. I avoid putting direct calls to things like Date.now() in my
code, because it leads to nondeterministic tests and makes it difficult to reproduce error conditions
when diagnosing failures.

It’s easy to extract the code that prints the banner. I just cut, paste, and put in
a call:

function printOwing(invoice) {
  let outstanding = 0;

  printBanner();

  // calculate outstanding
  for (const o of invoice.orders) {
    outstanding += o.amount;
  }
  //print details
  console.log(`name: ${invoice.customer}`);
  console.log(`amount: ${outstanding}`);
  console.log(`due: ${invoice.dueDate.toLocaleDateString()}`);
}
function printBanner() {
  console.log("***********************");
  console.log("**** Customer Owes ****");
  console.log("***********************");
}

Similarly, I can take the printing of details and extract that too:

function printOwing(invoice) {
  let outstanding = 0;

  printBanner();

  // calculate outstanding
  for (const o of invoice.orders) {
    outstanding += o.amount;
  }

  // record due date
  const today = Clock.today;
  invoice.dueDate = new Date(today.getFullYear(), today.getMonth(), today.getDate() + 30);

  printDetails();

  function printDetails() {
    console.log(`name: ${invoice.customer}`);
    console.log(`amount: ${outstanding}`);
    console.log(`due: ${invoice.dueDate.toLocaleDateString()}`);
  }
}

When I’m done with all the clients, I should be able to use Remove Dead Code
(237) on the global isUnknown function, as nobody should be calling it any more.

Example: Using an Object Literal

Creating a class like this is a fair bit of work for what is really a simple value.
But for the example I gave, I had to make the class since the customer could be
updated. If, however, I only read the data structure, I can use a literal object
instead.
Here is the opening case again, just the same, except this time there is no
client that updates the customer:

class Site…
  get customer() {return this._customer;}

class Customer…
  get name() {...}
  get billingPlan() {...}
  set billingPlan(arg) {...}
  get paymentHistory() {...}

client 1…
  const aCustomer = site.customer;
  // ... lots of intervening code ...
  let customerName;
  if (aCustomer === "unknown") customerName = "occupant";
  else customerName = aCustomer.name;

client 2…
  const plan = (aCustomer === "unknown") ?
        registry.billingPlans.basic
        : aCustomer.billingPlan;
client 3…
  const weeksDelinquent = (aCustomer === "unknown") ?
        0
        : aCustomer.paymentHistory.weeksDelinquentInLastYear;

As with the previous case, I start by adding an isUnknown property to the customer
and creating a special-case object with that field. The difference is that this time,
the special case is a literal.

class Customer…
  get isUnknown() {return false;}

This makes Extract Function seem like a trivially easy refactoring. But in many
situations, it turns out to be rather more tricky.
In the case above, I defined printDetails so it was nested inside printOwing. That
way it was able to access all the variables defined in printOwing. But that’s not an
option to me if I’m programming in a language that doesn’t allow nested functions.
Then I’m faced, essentially, with the problem of extracting the function to the
top level, which means I have to pay attention to any variables that exist only
in the scope of the source function. These are the arguments to the original
function and the temporary variables defined in the function.

Example: Using Local Variables
The easiest case with local variables is when they are used but not reassigned.
In this case, I can just pass them in as parameters. So if I have the following
function:
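As a sketch of this case, here is a top-level version of the extraction in which the locals are passed in as parameters. The pinned Clock stand-in and the sample invoice are assumptions added so the fragment runs on its own, and the banner call is left out for brevity.

```javascript
// Stand-in for the clock wrapper used in the text, pinned to a fixed date.
const Clock = { today: new Date(2024, 0, 15) };

function printOwing(invoice) {
  let outstanding = 0;

  // calculate outstanding
  for (const o of invoice.orders) {
    outstanding += o.amount;
  }

  recordDueDate(invoice);
  printDetails(invoice, outstanding);
}

// invoice is a parameter of the source function, so it is simply passed along.
function recordDueDate(invoice) {
  const today = Clock.today;
  invoice.dueDate = new Date(today.getFullYear(), today.getMonth(), today.getDate() + 30);
}

// outstanding is used here but never reassigned, so it becomes a parameter.
function printDetails(invoice, outstanding) {
  console.log(`name: ${invoice.customer}`);
  console.log(`amount: ${outstanding}`);
  console.log(`due: ${invoice.dueDate.toLocaleDateString()}`);
}

// Hypothetical sample data.
const sampleInvoice = { customer: "Acme", orders: [{ amount: 10 }, { amount: 32 }] };
printOwing(sampleInvoice);
```

Neither extracted function needs anything beyond its arguments, which is what makes the move to the top level safe.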
client…
const name = aCustomer.isUnknown ? "unknown occupant" : aCustomer.name;
I add a suitable method to the unknown customer:

class UnknownCustomer…
  get name() {return "occupant";}

Now I can make all that conditional code go away.

client 1…
  const customerName = aCustomer.name;

Once I’ve tested that this works, I’ll probably be able to use Inline Variable (123)
on that variable too.
Next is the billing plan property.

client 2…
  const plan = (isUnknown(aCustomer)) ?
        registry.billingPlans.basic
        : aCustomer.billingPlan;

client 3…
  if (!isUnknown(aCustomer)) aCustomer.billingPlan = newPlan;

For read behavior, I do the same thing I did with the name: take the common
response and reply with it. With the write behavior, the current code doesn’t call
the setter for an unknown customer, so for the special case, I let the setter be
called, but it does nothing.

Example: Reassigning a Local Variable
It’s the assignment to local variables that becomes complicated. In this case, we’re
only talking about temps. If I see an assignment to a parameter, I immediately
use Split Variable (240), which turns it into a temp.
For temps that are assigned to, there are two cases. The simpler case is where
the variable is a temporary variable used only within the extracted code. When
that happens, the variable just exists within the extracted code. Sometimes,
particularly when variables are initialized at some distance before they are used, it’s
handy to use Slide Statements (223) to get all the variable manipulation together.
The more awkward case is where the variable is used outside the extracted
function. In that case, I need to return the new value. I can illustrate this with
the following familiar-looking function:

function printOwing(invoice) {
  let outstanding = 0;

  printBanner();

  // calculate outstanding
  for (const o of invoice.orders) {
    outstanding += o.amount;
  }

  recordDueDate(invoice);
  printDetails(invoice, outstanding);
}
I’ve shown the previous refactorings all in one step, since they were
straightforward, but this time I’ll take it one step at a time from the mechanics.
First, I’ll slide the declaration next to its use.

function printOwing(invoice) {
  printBanner();

  // calculate outstanding
  let outstanding = 0;
  for (const o of invoice.orders) {
    outstanding += o.amount;
  }

  recordDueDate(invoice);
  printDetails(invoice, outstanding);
}

I then copy the code I want to extract into a target function.

function printOwing(invoice) {
  printBanner();

  // calculate outstanding
  let outstanding = 0;
  for (const o of invoice.orders) {
    outstanding += o.amount;
  }

  recordDueDate(invoice);
  printDetails(invoice, outstanding);
}
function calculateOutstanding(invoice) {
  let outstanding = 0;
  for (const o of invoice.orders) {
    outstanding += o.amount;
  }
  return outstanding;
}

There is a common technique to use whenever I find myself in this bind. I use
Extract Function (106) on the code that I’d have to change in lots of places: in
this case, the special-case comparison code.

function isUnknown(arg) {
  if (!((arg instanceof Customer) || (arg === "unknown")))
    throw new Error(`investigate bad value: <${arg}>`);
  return (arg === "unknown");
}

I’ve put a trap in here for an unexpected value. This can help me to spot any mistakes
or odd behavior as I’m doing this refactoring.
I can now use this function whenever I’m testing for an unknown customer. I
can change these calls one at a time, testing after each change.

client 1…
  let customerName;
  if (isUnknown(aCustomer)) customerName = "occupant";
  else customerName = aCustomer.name;

After a while, I have done them all.

client 2…
  const plan = (isUnknown(aCustomer)) ?
        registry.billingPlans.basic
        : aCustomer.billingPlan;

client 3…
  if (!isUnknown(aCustomer)) aCustomer.billingPlan = newPlan;

client 4…
  const weeksDelinquent = isUnknown(aCustomer) ?
        0
        : aCustomer.paymentHistory.weeksDelinquentInLastYear;

Once I’ve changed all the callers to use isUnknown, I can change the site class to
return an unknown customer.
class Site…
  get customer() {
    return (this._customer === "unknown") ? new UnknownCustomer() : this._customer;
  }

I can check that I’m no longer using the “unknown” string by changing isUnknown
to use the unknown value.

Since I moved the declaration of outstanding into the extracted code, I don’t need
to pass it in as a parameter. The outstanding variable is the only one reassigned in
the extracted code, so I can return it.
My JavaScript environment doesn’t yield any value by compiling (indeed less
than I’m getting from the syntax analysis in my editor), so there’s no step to do
here. My next thing to do is to replace the original code with a call to the new
function. Since I’m returning the value, I need to store it in the original variable.
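A sketch of that replacement step, built around the calculateOutstanding function from this example. The sample invoice is an assumption, and the other calls made by printOwing are left out so the fragment runs on its own.

```javascript
function calculateOutstanding(invoice) {
  let outstanding = 0;
  for (const o of invoice.orders) {
    outstanding += o.amount;
  }
  return outstanding;
}

function printOwing(invoice) {
  // The extracted code is now a query; its result is stored in the original variable.
  let outstanding = calculateOutstanding(invoice);
  console.log(`amount: ${outstanding}`);
  return outstanding; // returned only so the sketch is easy to check
}

// Hypothetical sample data.
const owed = printOwing({ customer: "Acme", orders: [{ amount: 10 }, { amount: 32 }] });
```

Once the call is in place, nothing reassigns outstanding inside printOwing any more, so declaring it with const would be a natural follow-up.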
function getRating(driver) {
  return (driver.numberOfLateDeliveries > 5) ? 2 : 1;
}

Mechanics

Check that this isn’t a polymorphic method.

If this is a method in a class, and has subclasses that override it, then I can’t
inline it.

Written this way, Inline Function is simple. In general, it isn’t. I could write
pages on how to handle recursion, multiple return points, inlining a method into
another object when you don’t have accessors, and the like. The reason I don’t
is that if you encounter these complexities, you shouldn’t do this refactoring.

Example

In the simplest case, this refactoring is so easy it’s trivial. I start with

function rating(aDriver) {
  return moreThanFiveLateDeliveries(aDriver) ? 2 : 1;
}
function moreThanFiveLateDeliveries(aDriver) {
  return aDriver.numberOfLateDeliveries > 5;
}

I can just take the return expression of the called function and paste it into the
caller to replace the call.

function rating(aDriver) {
  return aDriver.numberOfLateDeliveries > 5 ? 2 : 1;
}

But it can be a little more involved than that, requiring me to do more work
to fit the code into its new home. Consider the case where I start with this slight
variation on the earlier initial code.

Use Inline Function (115) on the special-case comparison function for the
places where it’s still needed.

Example

A utility company installs its services in sites.

Introduce Special Case
formerly: Introduce Null Object

class UnknownCustomer {
  get name() {return "occupant";}
}

Motivation

A common case of duplicated code is when many users of a data structure check
a specific value, and then most of them do the same thing. If I find many parts
of the code base having the same reaction to a particular value, I want to bring
that reaction into a single place.
A good mechanism for this is the Special Case pattern where I create a
special-case element that captures all the common behavior. This allows me to replace
most of the special-case checks with simple calls.
A special case can manifest itself in several ways. If all I’m doing with the object
is reading data, I can supply a literal object with all the values I need filled in.
If I need more behavior than simple values, I can create a special object with
methods for all the common behavior. The special-case object can be returned
by an encapsulating class, or inserted into a data structure with a transform.
A common value that needs special-case processing is null, which is why this
pattern is often called the Null Object pattern. But it’s the same approach for any
special case; I like to say that Null Object is a special case of Special Case.
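To make the pattern concrete, here is a compact sketch of a special-case object returned by an encapsulating class. The constructor shapes, the registry value, and the customerNameFor client are assumptions for illustration; only the general shape follows the utility-company example.

```javascript
// Stand-in for the registry of billing plans mentioned in the examples.
const registry = { billingPlans: { basic: "basic-plan" } };

class Customer {
  constructor(name, billingPlan) {
    this._name = name;
    this._billingPlan = billingPlan;
  }
  get name() { return this._name; }
  get billingPlan() { return this._billingPlan; }
  set billingPlan(arg) { this._billingPlan = arg; }
  get isUnknown() { return false; }
}

// Special-case object: the common reactions to "unknown" live here, once.
class UnknownCustomer {
  get name() { return "occupant"; }
  get billingPlan() { return registry.billingPlans.basic; }
  set billingPlan(arg) { /* writes to the unknown customer are ignored */ }
  get isUnknown() { return true; }
}

class Site {
  constructor(customer) { this._customer = customer; }
  get customer() {
    return (this._customer === "unknown") ? new UnknownCustomer() : this._customer;
  }
}

// A client no longer tests for the special value; it just asks.
function customerNameFor(site) {
  return site.customer.name;
}
```

The conditional survives in exactly one place, the Site getter, and every client that used to repeat it becomes a plain property access.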
Mechanics
Begin with a container data structure (or class) that contains a property which
is the subject of the refactoring. Clients of the container compare the subject
Inlining gatherCustomerData into reportLines isn’t a simple cut and paste. It’s not too
complicated, and most times I would still do this in one go, with a bit of fitting.
But to be cautious, it may make sense to move one line at a time. So I’d start
with using Move Statements to Callers (217) on the first line (I’d do it the simple
way with a cut, paste, and fit).
function reportLines(aCustomer) {
  const lines = [];
  lines.push(["name", aCustomer.name]);
  gatherCustomerData(lines, aCustomer);
  return lines;
}
function gatherCustomerData(out, aCustomer) {
  out.push(["location", aCustomer.location]);
}
function reportLines(aCustomer) {
  const lines = [];
  lines.push(["name", aCustomer.name]);
  lines.push(["location", aCustomer.location]);
  return lines;
}

The point here is to always be ready to take smaller steps. Most of the time,
with the small functions I normally write, I can do Inline Function in one go,
even if there is a bit of refitting to do. But if I run into complications, I go one
line at a time. Even with one line, things can get a bit awkward; then, I’ll use the
more elaborate mechanics for Move Statements to Callers (217) to break things
down even more. And if, feeling confident, I do something the quick way and
the tests break, I prefer to revert back to my last green code and repeat the
refactoring with smaller steps and a touch of chagrin.

At the end of the refactoring, I have the following code. First, there is the basic
rating class which can ignore any complications of the experienced China case:

class Rating {
  constructor(voyage, history) {
    this.voyage = voyage;
    this.history = history;
  }
  get value() {
    const vpf = this.voyageProfitFactor;
    const vr = this.voyageRisk;
    const chr = this.captainHistoryRisk;
    if (vpf * 3 > (vr + chr * 2)) return "A";
    else return "B";
  }
  get voyageRisk() {
    let result = 1;
    if (this.voyage.length > 4) result += 2;
    if (this.voyage.length > 8) result += this.voyage.length - 8;
    if (["china", "east-indies"].includes(this.voyage.zone)) result += 4;
    return Math.max(result, 0);
  }
  get captainHistoryRisk() {
    let result = 1;
    if (this.history.length < 5) result += 4;
    result += this.history.filter(v => v.profit < 0).length;
    return Math.max(result, 0);
  }
  get voyageProfitFactor() {
    let result = 2;
    if (this.voyage.zone === "china") result += 1;
    if (this.voyage.zone === "east-indies") result += 1;
    result += this.historyLengthFactor;
    result += this.voyageLengthFactor;
    return result;
  }
  get voyageLengthFactor() {
    return (this.voyage.length > 14) ? - 1: 0;
  }
  get historyLengthFactor() {
    return (this.history.length > 8) ? 1 : 0;
  }
}

The code for the experienced China case reads as a set of variations on the base:
class ExperiencedChinaRating…
  get voyageAndHistoryLengthFactor() {
    let result = 0;
    result += 3;
    result += this.historyLengthFactor;
    if (this.voyage.length > 12) result += 1;
    if (this.voyage.length > 18) result -= 1;
    return result;
  }

class Rating…
  get voyageProfitFactor() {
    let result = 2;
    if (this.voyage.zone === "china") result += 1;
    if (this.voyage.zone === "east-indies") result += 1;
    result += this.historyLengthFactor;
    result += this.voyageLengthFactor;
    return result;
  }

  get voyageLengthFactor() {
    return (this.voyage.length > 14) ? - 1: 0;
  }

(106) on the history length modification, both in the superclass and subclass. I
start with just the superclass:

class Rating…
  get voyageAndHistoryLengthFactor() {
    let result = 0;
    result += this.historyLengthFactor;
    if (this.voyage.length > 14) result -= 1;
    return result;
  }
  get historyLengthFactor() {
    return (this.history.length > 8) ? 1 : 0;
  }

Extract Variable
formerly: Introduce Explaining Variable
inverse of: Inline Variable (123)

  return order.quantity * order.itemPrice -
    Math.max(0, order.quantity - 500) * order.itemPrice * 0.05 +
    Math.min(order.quantity * order.itemPrice * 0.1, 100);

  const basePrice = order.quantity * order.itemPrice;
  const quantityDiscount = Math.max(0, order.quantity - 500) * order.itemPrice * 0.05;
  const shipping = Math.min(basePrice * 0.1, 100);
  return basePrice - quantityDiscount + shipping;

Mechanics

Ensure that the expression you want to extract does not have side effects.

Declare an immutable variable. Set it to a copy of the expression you want
to name.

Replace the original expression with the new variable.

Test.

If the expression appears more than once, replace each occurrence with the
variable, testing after each replacement.
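To try these steps on the pricing expression, the two forms can be wrapped in functions and compared. The function names priceBefore and priceAfter and the sample order are assumptions made for this sketch.

```javascript
// Before: one dense expression.
function priceBefore(order) {
  return order.quantity * order.itemPrice -
    Math.max(0, order.quantity - 500) * order.itemPrice * 0.05 +
    Math.min(order.quantity * order.itemPrice * 0.1, 100);
}

// After: each side-effect-free subexpression gets an immutable, named variable.
function priceAfter(order) {
  const basePrice = order.quantity * order.itemPrice;
  const quantityDiscount = Math.max(0, order.quantity - 500) * order.itemPrice * 0.05;
  const shipping = Math.min(basePrice * 0.1, 100);
  return basePrice - quantityDiscount + shipping;
}

// Hypothetical sample order: large enough to trigger the quantity discount.
const sampleOrder = { quantity: 600, itemPrice: 2 };
const before = priceBefore(sampleOrder);
const after = priceAfter(sampleOrder);
```

Comparing before and after on a few representative orders is the "Test" step in miniature: the variables only name parts of the expression, so the two results should be identical.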
Inline Variable
formerly: Inline Temp
inverse of: Extract Variable (119)

  let basePrice = anOrder.basePrice;
  return (basePrice > 1000);

  return anOrder.basePrice > 1000;

class Rating…
  get voyageProfitFactor() {
    let result = 2;
  get hasChinaHistory() {
    return this.history.some(v => "china" === v.zone);
  }
}

That’s given me the class for the base case. I now need to create an empty
subclass to house the variant behavior.

class ExperiencedChinaRating extends Rating {
}

I then create a factory function to return the variant class when needed.

function createRating(voyage, history) {
  if (voyage.zone === "china" && history.some(v => "china" === v.zone))
    return new ExperiencedChinaRating(voyage, history);
  else return new Rating(voyage, history);
}

I need to modify any callers to use the factory function instead of directly
invoking the constructor, which in this case is just the rating function.
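A sketch of what that caller change might look like. The Rating class here is a trimmed-down stand-in whose value getter returns a placeholder, so only the wiring between rating, createRating, and the two classes is being illustrated.

```javascript
// Trimmed-down stand-in for the class in the text; the scoring logic is omitted.
class Rating {
  constructor(voyage, history) {
    this.voyage = voyage;
    this.history = history;
  }
  get value() { return "B"; } // placeholder, not the real computation
}

class ExperiencedChinaRating extends Rating {
}

function createRating(voyage, history) {
  if (voyage.zone === "china" && history.some(v => "china" === v.zone))
    return new ExperiencedChinaRating(voyage, history);
  else return new Rating(voyage, history);
}

// The one caller now goes through the factory rather than a constructor.
function rating(voyage, history) {
  return createRating(voyage, history).value;
}
```

Because the empty subclass inherits everything, behavior is unchanged at this point; the factory just decides which class future overrides will apply to.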
What I want to focus on here is how a couple of places use conditional logic
to handle the case of a voyage to China where the captain has been to China
before.

function rating(voyage, history) {
  const vpf = voyageProfitFactor(voyage, history);
  const vr = voyageRisk(voyage);
  const chr = captainHistoryRisk(voyage, history);
  if (vpf * 3 > (vr + chr * 2)) return "A";
  else return "B";
}
function voyageRisk(voyage) {
  let result = 1;
  if (voyage.length > 4) result += 2;
  if (voyage.length > 8) result += voyage.length - 8;
  if (["china", "east-indies"].includes(voyage.zone)) result += 4;
  return Math.max(result, 0);
}

the next time I’m looking at this code, I don’t have to figure out again what’s
going on. (Often, a good way to improve a name is to write a comment to describe
the function’s purpose, then turn that comment into a name.)
Similar logic applies to a function’s parameters. The parameters of a function
dictate how a function fits in with the rest of its world. Parameters set the context
in which I can use a function. If I have a function to format a person’s telephone
number, and that function takes a person as its argument, then I can’t use it to
format a company’s telephone number. If I replace the person parameter with
the telephone number itself, then the formatting code is more widely useful.
Apart from increasing a function’s range of applicability, I can also remove
some coupling, changing what modules need to connect to others. Telephone
formatting logic may sit in a module that has no knowledge about people.
Reducing how much modules need to know about each other helps reduce how much
I need to put into my brain when I change something, and my brain isn’t as big
as it used to be (that doesn’t say anything about the size of its container, though).
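The telephone example can be sketched as follows. The function names, the record shapes, and the ten-digit output format are all assumptions chosen for illustration.

```javascript
// Coupled version: usable only with objects that look like a person.
function formatPersonPhone(aPerson) {
  return formatPhone(aPerson.phone);
}

// Decoupled version: takes the number itself, so it knows nothing about people.
function formatPhone(digits) {
  return `(${digits.slice(0, 3)}) ${digits.slice(3, 6)}-${digits.slice(6)}`;
}

// Hypothetical records; the company has no person-shaped interface.
const person = { name: "Ada", phone: "6175550123" };
const company = { legalName: "Acme Industries", switchboard: "6175550199" };

const personLine = formatPersonPhone(person);
const companyLine = formatPhone(company.switchboard);
```

Only formatPhone can serve both callers, and it could live in a module that never imports anything about people or companies.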
function captainHistoryRisk(voyage, history) {
  let result = 1;
  if (history.length < 5) result += 4;
  result += history.filter(v => v.profit < 0).length;
  if (voyage.zone === "china" && hasChina(history)) result -= 2;
  return Math.max(result, 0);
}
function hasChina(history) {
  return history.some(v => "china" === v.zone);
}
function voyageProfitFactor(voyage, history) {
  let result = 2;
  if (voyage.zone === "china") result += 1;
  if (voyage.zone === "east-indies") result += 1;
  if (voyage.zone === "china" && hasChina(history)) {
    result += 3;
    if (history.length > 10) result += 1;
    if (voyage.length > 12) result += 1;
    if (voyage.length > 18) result -= 1;
  }
  else {
    if (history.length > 8) result += 1;
    if (voyage.length > 14) result -= 1;
  }
  return result;
}

I will use inheritance and polymorphism to separate out the logic for handling
these cases from the base logic. This is a particularly useful refactoring if I’m
about to introduce more special logic for this case, and the logic for these repeat
China voyages can make it harder to understand the base case.

Choosing the right parameters isn’t something that adheres to simple rules. I
may have a simple function for determining if a payment is overdue, by looking
at if it’s older than 30 days. Should the parameter to this function be the payment
object, or the due date of the payment? Using the payment couples the function
to the interface of the payment object. But if I use the payment, I can easily access
other properties of the payment, should the logic evolve, without having to change
every bit of code that calls this function, essentially increasing the encapsulation
of the function.
The only right answer to this puzzle is that there is no right answer, especially
over time. So I find it’s essential to be familiar with Change Function Declaration
so the code can evolve with my understanding of what the best joints in the code
need to be.
Usually, I only use the main name of a refactoring when I refer to it from
elsewhere in this book. However, since renaming is such a significant use case
for Change Function Declaration, if I’m just renaming something, I’ll refer to this
refactoring as Rename Function to make it clearer what I’m doing. Whether I’m
merely renaming or manipulating the parameters, I use the same mechanics.

Mechanics

In most of the refactorings in this book, I present only a single set of mechanics.
This isn’t because there is only one set that will do the job but because, usually,
one set of mechanics will work reasonably well for most cases. Change Function
Declaration, however, is an exception. The simple mechanics are often effective,
but there are plenty of cases when a more gradual migration makes more sense.
So, with this refactoring, I look at the change and ask myself if I think I can
change the declaration and all its callers easily in one go. If so, I follow the simple
mechanics. The migration-style mechanics allow me to change the callers more
gradually, which is important if I have lots of them, they are awkward to get
to, the function is a polymorphic method, or I have a more complicated change to
the declaration.

Simple Mechanics

If you’re removing a parameter, ensure it isn’t referenced in the body of the
function.

Change the method declaration to the desired declaration.

Find all references to the old method declaration, update them to the
new one.

Test.

It’s often best to separate changes, so if you want to both change the name
and add a parameter, do these as separate steps. (In any case, if you run into
trouble, revert and use the migration mechanics instead.)

Migration Mechanics

If necessary, refactor the body of the function to make it easy to do the
following extraction step.

Use Extract Function (106) on the function body to create the new function.

If the new function will have the same name as the old one, give the new function
a temporary name that’s easy to search for.

If the extracted function needs additional parameters, use the simple
mechanics to add them.

Test.

Apply Inline Function (115) to the old function.

If you used a temporary name, use Change Function Declaration (124) again
to restore it to the original name.

Test.

If you’re changing a method on a class with polymorphism, you’ll need to add
indirection for each binding. If the method is polymorphic within a single class
hierarchy, you only need the forwarding method on the superclass. If the
polymorphism has no superclass link, then you’ll need forwarding methods on each
implementation class.
If you are refactoring a published API, you can pause the refactoring once
you’ve created the new function. During this pause, deprecate the original function
and wait for clients to change to the new function. The original function
declaration can be removed when (and if) you’re confident all the clients of the old
function have migrated to the new one.

function voyageRisk(voyage) {
  let result = 1;
  if (voyage.length > 4) result += 2;
  if (voyage.length > 8) result += voyage.length - 8;
  if (["china", "east-indies"].includes(voyage.zone)) result += 4;
  return Math.max(result, 0);
}
function captainHistoryRisk(voyage, history) {
  let result = 1;
  if (history.length < 5) result += 4;
  result += history.filter(v => v.profit < 0).length;
  if (voyage.zone === "china" && hasChina(history)) result -= 2;
  return Math.max(result, 0);
}
function hasChina(history) {
  return history.some(v => "china" === v.zone);
}
function voyageProfitFactor(voyage, history) {
  let result = 2;
  if (voyage.zone === "china") result += 1;
  if (voyage.zone === "east-indies") result += 1;
  if (voyage.zone === "china" && hasChina(history)) {
    result += 3;
    if (history.length > 10) result += 1;
    if (voyage.length > 12) result += 1;
    if (voyage.length > 18) result -= 1;
  }
  else {
    if (history.length > 8) result += 1;
    if (voyage.length > 14) result -= 1;
  }
  return result;
}

The functions voyageRisk and captainHistoryRisk score points for risk, voyageProfitFactor
scores points for the potential profit, and rating combines these to give the overall
rating for the voyage.
The calling code would look something like this:

const voyage = {zone: "west-indies", length: 10};
const history = [
  {zone: "east-indies", profit: 5},
  {zone: "west-indies", profit: 15},
  {zone: "china", profit: -2},
  {zone: "west-africa", profit: 7},
];

const myRating = rating(voyage, history);

class AfricanSwallow extends Bird {
  get plumage() {
    return (this.numberOfCoconuts > 2) ? "tired" : "average";
  }
  get airSpeedVelocity() {
    return 40 - 2 * this.numberOfCoconuts;
  }
}
class NorwegianBlueParrot extends Bird {
  get plumage() {
    return (this.voltage > 100) ? "scorched" : "beautiful";
  }
  get airSpeedVelocity() {
    return (this.isNailed) ? 0 : 10 + this.voltage / 10;
  }
}

Looking at this final code, I can see that the superclass Bird isn’t strictly needed.
In JavaScript, I don’t need a type hierarchy for polymorphism; as long as my
objects implement the appropriately named methods, everything works fine. In this
situation, however, I like to keep the unnecessary superclass as it helps explain
the way the classes are related in the domain.

Example: Using Polymorphism for Variation

With the birds example, I’m using a clear generalization hierarchy. That’s how
subclassing and polymorphism are often discussed in textbooks (including
mine), but it’s not the only way inheritance is used in practice; indeed, it probably
isn’t the most common or best way. Another case for inheritance is when I wish
to indicate that one object is mostly similar to another, but with some variations.
As an example of this case, consider some code used by a rating agency to
compute an investment rating for the voyages of sailing ships. The rating agency
gives out either an “A” or “B” rating, depending on various factors due to risk and
profit potential. The risk comes from assessing the nature of the voyage as well
as the history of the captain’s prior voyages.

function rating(voyage, history) {
  const vpf = voyageProfitFactor(voyage, history);
  const vr = voyageRisk(voyage);

Example: Renaming a Function (Simple Mechanics)

Consider this function with an overly abbreved name:

function circum(radius) {
  return 2 * Math.PI * radius;
}

I want to change that to something more sensible. I begin by changing the
declaration:

function circumference(radius) {
  return 2 * Math.PI * radius;
}

I then find all the callers of circum and change the name to circumference.
Different language environments have an impact on how easy it is to find all
the references to the old function. Static typing and a good IDE provide the best
experience, usually allowing me to rename functions automatically with little
chance of error. Without static typing, this can be more involved; even good
searching tools will then have a lot of false positives.
I use the same approach for adding or removing parameters: find all the callers,
change the declaration, and change the callers. It’s often better to do these as
separate steps; so, if I’m both renaming the function and adding a parameter, I
first do the rename, test, then add the parameter, and test again.
A disadvantage of this simple way of doing the refactoring is that I have to do
all the callers and the declaration (or all of them, if polymorphic) at once. If there
are only a few of them, or if I have decent automated refactoring tools, this is
reasonable. But if there’s a lot, it can get tricky. Another problem is when the
names aren’t unique: e.g., I want to rename the changeAddress method on a person
class but the same method, which I don’t want to change, exists on an insurance
agreement class. The more complex the change is, the less I want to do it in one
go like this. When this kind of problem arises, I use the migration mechanics
instead. Similarly, if I use simple mechanics and something goes wrong, I’ll revert
the code to the last known good state and try again using migration mechanics.
const chr = captainHistoryRisk(voyage, history);
if (vpf * 3 > (vr + chr * 2)) return "A";
else return "B"; Example: Renaming a Function (Migration Mechanics)
}
Again, I have this function with its overly abbreved name:
function circum(radius) {
return 2 * Math.PI * radius;
}
class Bird…
  get plumage() {
    return "unknown";
  }

I repeat the same process for airSpeedVelocity. Once I’m done, I end up with the
following code (I also inlined the top-level functions for airSpeedVelocity and plumage):

  function plumages(birds) {
    return new Map(birds
                   .map(b => createBird(b))
                   .map(bird => [bird.name, bird.plumage]));
  }
  function speeds(birds) {
    return new Map(birds
                   .map(b => createBird(b))
                   .map(bird => [bird.name, bird.airSpeedVelocity]));
  }

  function createBird(bird) {
    switch (bird.type) {
      case 'EuropeanSwallow':
        return new EuropeanSwallow(bird);
      case 'AfricanSwallow':
        return new AfricanSwallow(bird);
      case 'NorwegianBlueParrot':
        return new NorwegianBlueParrot(bird);
      default:
        return new Bird(bird);
    }
  }

  class Bird {
    constructor(birdObject) {
      Object.assign(this, birdObject);
    }
    get plumage() {
      return "unknown";
    }
    get airSpeedVelocity() {
      return null;
    }
  }
  class EuropeanSwallow extends Bird {
    get plumage() {
      return "average";
    }
    get airSpeedVelocity() {
      return 35;
    }
  }

To do this refactoring with migration mechanics, I begin by applying Extract
Function (106) to the entire function body.

  function circum(radius) {
    return circumference(radius);
  }
  function circumference(radius) {
    return 2 * Math.PI * radius;
  }

I test that, then apply Inline Function (115) to the old function. I find all the
calls of the old function and replace each one with a call of the new one. I can
test after each change, which allows me to do them one at a time. Once I’ve got
them all, I remove the old function.
With most refactorings, I’m changing code that I can modify, but this refactoring
can be handy with a published API, that is, one used by code that I’m unable
to change myself. I can pause the refactoring after creating circumference and, if
possible, mark circum as deprecated. I will then wait for callers to change to use
circumference; once they do, I can delete circum. Even if I’m never able to reach the
happy point of deleting circum, at least I have a better name for new code.

Example: Adding a Parameter

In some software, to manage a library of books, I have a book class which has
the ability to take a reservation for a customer.

class Book…
  addReservation(customer) {
    this._reservations.push(customer);
  }

I need to support a priority queue for reservations. Thus, I need an extra
parameter on addReservation to indicate whether the reservation should go in the
usual queue or the high-priority queue. If I can easily find and change all
the callers, then I can just go ahead with the change, but if not, I can
use the migration approach, which I’ll show here.
I begin by using Extract Function (106) on the body of addReservation to create the
new function. Although it will eventually be called addReservation, the new and old
functions can’t coexist with the same name. So I use a temporary name that will
be easy to search for later.

class Book…
  addReservation(customer) {
    this.zz_addReservation(customer);
  }
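The addReservation example stops at the forwarding step. As a rough sketch of how the migration might continue, the new function can take the extra parameter while the old declaration forwards a default. The names isPriority and _priorityReservations, and the type check, are assumptions made for illustration and do not come from the text.

```javascript
// Hedged sketch: isPriority and _priorityReservations are hypothetical names.
class Book {
  constructor() {
    this._reservations = [];
    this._priorityReservations = []; // assumed storage for the high-priority queue
  }

  // Old declaration, kept as a forwarder so unmigrated callers keep working.
  addReservation(customer) {
    this.zz_addReservation(customer, false);
  }

  // New declaration under its temporary, easy-to-search name.
  zz_addReservation(customer, isPriority) {
    // Fail loudly if a migrated caller forgets the new argument.
    if (typeof isPriority !== "boolean") {
      throw new TypeError("isPriority must be a boolean");
    }
    const queue = isPriority ? this._priorityReservations : this._reservations;
    queue.push(customer);
  }
}

const book = new Book();
book.addReservation("alice");         // caller not yet migrated
book.zz_addReservation("bob", true);  // caller already migrated
```

Once every caller uses the new function, the old one can be removed and zz_addReservation renamed back to addReservation.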
  class Bird {
    constructor(birdObject) {
      Object.assign(this, birdObject);
    }
    get plumage() {
      switch (this.type) {
        case 'EuropeanSwallow':
          return "average";
        case 'AfricanSwallow':
          return (this.numberOfCoconuts > 2) ? "tired" : "average";
        case 'NorwegianBlueParrot':
          return (this.voltage > 100) ? "scorched" : "beautiful";
        default:
          return "unknown";
      }
    }
    get airSpeedVelocity() {
      switch (this.type) {
        case 'EuropeanSwallow':
          return 35;
        case 'AfricanSwallow':
          return 40 - 2 * this.numberOfCoconuts;
        case 'NorwegianBlueParrot':
          return (this.isNailed) ? 0 : 10 + this.voltage / 10;
        default:
          return null;
      }
    }
  }

I now add subclasses for each kind of bird, together with a factory function to
instantiate the appropriate subclass.

  function plumage(bird) {
    return createBird(bird).plumage;
  }

  function airSpeedVelocity(bird) {
    return createBird(bird).airSpeedVelocity;
  }

  function createBird(bird) {
    switch (bird.type) {
      case 'EuropeanSwallow':
        return new EuropeanSwallow(bird);
      case 'AfricanSwallow':
        return new AfricanSwallow(bird);
      case 'NorwegianBlueParrot':
        return new NorwegianBlueParrot(bird);
      default:
        return new Bird(bird);
    }
  }

I can compile and test at this point. Then, if all is well, I do the next leg.

class AfricanSwallow…
  get plumage() {
    return (this.numberOfCoconuts > 2) ? "tired" : "average";
  }

Then, the Norwegian Blue:

class NorwegianBlueParrot…
  get plumage() {
    return (this.voltage > 100) ? "scorched" : "beautiful";
  }

Example: Changing a Parameter to One of Its Properties

The examples so far are simple changes of a name and adding a new parameter,
but with the migration mechanics, this refactoring can handle more complicated
cases quite neatly. Here’s an example that is a bit more involved.
I have a function which determines if a customer is based in New England.

  function inNewEngland(aCustomer) {
    return ["MA", "CT", "ME", "VT", "NH", "RI"].includes(aCustomer.address.state);
  }

Here is one of its callers:

caller…
  const newEnglanders = someCustomers.filter(c => inNewEngland(c));

inNewEngland only uses the customer’s home state to determine if it’s in New
England. I’d prefer to refactor inNewEngland so that it takes a state code as a
parameter, making it usable in more contexts by removing the dependency on the
customer.
With Change Function Declaration, my usual first move is to apply Extract
Function (106), but in this case I can make it easier by first refactoring the function
body a little. I use Extract Variable (119) on my desired new parameter.

  function inNewEngland(aCustomer) {
    const stateCode = aCustomer.address.state;
    return ["MA", "CT", "ME", "VT", "NH", "RI"].includes(stateCode);
  }

Now I use Extract Function (106) to create that new function.

  function inNewEngland(aCustomer) {
    const stateCode = aCustomer.address.state;
    return xxNEWinNewEngland(stateCode);
  }

  function xxNEWinNewEngland(stateCode) {
    return ["MA", "CT", "ME", "VT", "NH", "RI"].includes(stateCode);
  }

I give the function a name that’s easy to automatically replace to turn into the
original name later. (You can tell I don’t have a standard for these temporary
names.)
I apply Inline Variable (123) on the input parameter in the original function.

  function inNewEngland(aCustomer) {
    return xxNEWinNewEngland(aCustomer.address.state);
  }

I use Inline Function (115) to fold the old function into its callers, effectively
replacing the call to the old function with a call to the new one. I can do these
one at a time.

caller…
  const newEnglanders = someCustomers.filter(c => xxNEWinNewEngland(c.address.state));

Once I’ve inlined the old function into every caller, I use Change Function
Declaration again to change the name of the new function to that of the original.

caller…
  const newEnglanders = someCustomers.filter(c => inNewEngland(c.address.state));

top level…
  function inNewEngland(stateCode) {
    return ["MA", "CT", "ME", "VT", "NH", "RI"].includes(stateCode);
  }
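The stated aim of taking a state code instead of a customer is to make inNewEngland usable in more contexts. A minimal sketch of what that could look like follows. The customer records and the warehouses list are invented for illustration; only inNewEngland itself comes from the example.

```javascript
// Final form from the example: depends only on a state code.
function inNewEngland(stateCode) {
  return ["MA", "CT", "ME", "VT", "NH", "RI"].includes(stateCode);
}

// Hypothetical data, not from the original text.
const someCustomers = [
  {name: "Ann", address: {state: "MA"}},
  {name: "Bob", address: {state: "NY"}},
];
const warehouses = [
  {code: "W1", state: "VT"},
  {code: "W2", state: "TX"},
];

// The original caller, now passing the property it actually needs.
const newEnglanders = someCustomers.filter(c => inNewEngland(c.address.state));

// A caller that has no customer object at all can reuse the same function.
const newEnglandWarehouses = warehouses.filter(w => inNewEngland(w.state));
```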
Automated refactoring tools make the migration mechanics both less useful
and more effective. They make it less useful because they handle even complicated
renames and parameter changes more safely, so I don’t have to use the migration
approach as often as I do without that support. However, in cases like this example,
where the tools can’t do the whole refactoring, they still make it much easier as
the key moves of extract and inline can be done more quickly and safely
with the tool.

Example

My friend has a collection of birds and wants to know how fast they can fly and
what they have for plumage. So we have a couple of small programs to determine
the information.

  function plumages(birds) {
    return new Map(birds.map(b => [b.name, plumage(b)]));
  }
  function speeds(birds) {
    return new Map(birds.map(b => [b.name, airSpeedVelocity(b)]));
  }
  function plumage(bird) {
    switch (bird.type) {
      case 'EuropeanSwallow':
        return "average";
      case 'AfricanSwallow':
        return (bird.numberOfCoconuts > 2) ? "tired" : "average";
      case 'NorwegianBlueParrot':
        return (bird.voltage > 100) ? "scorched" : "beautiful";
      default:
        return "unknown";
    }
  }
  function airSpeedVelocity(bird) {
    switch (bird.type) {
      case 'EuropeanSwallow':
        return 35;
      case 'AfricanSwallow':
        return 40 - 2 * bird.numberOfCoconuts;
      case 'NorwegianBlueParrot':
        return (bird.isNailed) ? 0 : 10 + bird.voltage / 10;
      default:
        return null;
    }
  }

We have a couple of different operations that vary with the type of bird, so it
makes sense to create classes and use polymorphism for any type-specific behavior.
I begin by using Combine Functions into Class (144) on airSpeedVelocity and plumage.

  function plumage(bird) {
    return new Bird(bird).plumage;
  }
  function airSpeedVelocity(bird) {
    return new Bird(bird).airSpeedVelocity;
  }
Encapsulate Variable

formerly: Self-Encapsulate Field
formerly: Encapsulate Field

  let defaultOwner = {firstName: "Martin", lastName: "Fowler"};

  let defaultOwnerData = {firstName: "Martin", lastName: "Fowler"};
  export function defaultOwner() {return defaultOwnerData;}
  export function setDefaultOwner(arg) {defaultOwnerData = arg;}

Motivation

Refactoring is all about manipulating the elements of our programs. Data is more
awkward to manipulate than functions. Since using a function usually means
calling it, I can easily rename or move a function while keeping the old function
intact as a forwarding function (so my old code calls the old function, which calls
the new function). I’ll usually not keep this forwarding function around for long,
but it does simplify the refactoring.
Data is more awkward because I can’t do that. If I move data around, I have
to change all the references to the data in a single cycle to keep the code working.
For data with a very small scope of access, such as a temporary variable in a
small function, this isn’t a problem. But as the scope grows, so does the difficulty,
which is why global data is such a pain.
So if I want to move widely accessed data, often the best approach is to first
encapsulate it by routing all its access through functions. That way, I turn the
difficult task of reorganizing data into the simpler task of reorganizing functions.
Encapsulating data is valuable for other things too. It provides a clear point to
monitor changes and use of the data; I can easily add validation or consequential
logic on the updates. It is my habit to make all mutable data encapsulated like
this and only accessed through functions if its scope is greater than a single
function. The greater the scope of the data, the more important it is to encapsulate.

divide the conditions. Sometimes it’s enough to represent this division within
the structure of a conditional itself, but using classes and polymorphism can
make the separation more explicit.
A common case for this is where I can form a set of types, each handling the
conditional logic differently. I might notice that books, music, and food vary in
how they are handled because of their type. This is made most obvious when
there are several functions that have a switch statement on a type code. In that
case, I remove the duplication of the common switch logic by creating classes
for each case and using polymorphism to bring out the type-specific behavior.
Another situation is where I can think of the logic as a base case with variants.
The base case may be the most common or most straightforward. I can put this
logic into a superclass which allows me to reason about it without having to
worry about the variants. I then put each variant case into a subclass, which I
express with code that emphasizes its difference from the base case.
Polymorphism is one of the key features of object-oriented programming and,
like any useful feature, it’s prone to overuse. I’ve come across people who argue
that all examples of conditional logic should be replaced with polymorphism. I
don’t agree with that view. Most of my conditional logic uses basic conditional
statements: if/else and switch/case. But when I see complex conditional logic
that can be improved as discussed above, I find polymorphism a powerful tool.

Mechanics

If classes do not exist for polymorphic behavior, create them together with
a factory function to return the correct instance.
Use the factory function in calling code.
Move the conditional function to the superclass.
  If the conditional logic is not a self-contained function, use Extract Function (106)
  to make it so.
Pick one of the subclasses. Create a subclass method that overrides the
conditional statement method. Copy the body of that leg of the conditional
statement into the subclass method and adjust it to fit.
Repeat for each leg of the conditional.
Leave a default case for the superclass method. Or, if the superclass should be
abstract, declare that method as abstract or throw an error to show it should
be the responsibility of a subclass.
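The last step of the Replace Conditional with Polymorphism mechanics allows the superclass method to throw an error instead of holding a default case. JavaScript has no abstract keyword, so the following is a small sketch of the throwing variant, reusing the bird class names from the example. The error message and the 'Dodo' type are assumptions made for illustration.

```javascript
// Sketch of the "throw an error" option; wording of the message is illustrative.
class Bird {
  constructor(birdObject) {
    Object.assign(this, birdObject);
  }
  // No sensible default: each subclass is responsible for plumage.
  get plumage() {
    throw new Error(`plumage must be implemented by a subclass (type: ${this.type})`);
  }
}

class EuropeanSwallow extends Bird {
  get plumage() {
    return "average";
  }
}

class AfricanSwallow extends Bird {
  get plumage() {
    return (this.numberOfCoconuts > 2) ? "tired" : "average";
  }
}

// Factory function that returns the correct instance for a type code.
function createBird(bird) {
  switch (bird.type) {
    case 'EuropeanSwallow':
      return new EuropeanSwallow(bird);
    case 'AfricanSwallow':
      return new AfricanSwallow(bird);
    default:
      return new Bird(bird);
  }
}
```

With this variant, an unknown type code surfaces as an error at the point of use rather than as a silent "unknown" value. Which behavior is preferable depends on whether unrecognized types are expected in the data.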
Replace Conditional with Polymorphism

  switch (bird.type) {
    case 'EuropeanSwallow':
      return "average";
    case 'AfricanSwallow':
      return (bird.numberOfCoconuts > 2) ? "tired" : "average";
    case 'NorwegianBlueParrot':
      return (bird.voltage > 100) ? "scorched" : "beautiful";
    default:
      return "unknown";
  }

  class EuropeanSwallow {
    get plumage() {
      return "average";
    }
  }
  class AfricanSwallow {
    get plumage() {
      return (this.numberOfCoconuts > 2) ? "tired" : "average";
    }
  }
  class NorwegianBlueParrot {
    get plumage() {
      return (this.voltage > 100) ? "scorched" : "beautiful";
    }
  }

Motivation

Complex conditional logic is one of the hardest things to reason about in
programming, so I always look for ways to add structure to conditional logic. Often,
I find I can separate the logic into different circumstances (high-level cases) to

My approach with legacy code is that whenever I need to change or add a new
reference to such a variable, I should take the opportunity to encapsulate it. That
way I prevent the increase of coupling to commonly used data.
This principle is why the object-oriented approach puts so much emphasis on
keeping an object’s data private. Whenever I see a public field, I consider using
Encapsulate Variable (in that case often called Encapsulate Field) to reduce its
visibility. Some go further and argue that even internal references to fields within
a class should go through accessor functions, an approach known as
self-encapsulation. On the whole, I find self-encapsulation excessive: if a class is so
big that I need to self-encapsulate its fields, it needs to be broken up anyway.
But self-encapsulating a field is a useful step before splitting a class.
Keeping data encapsulated is much less important for immutable data. When
the data doesn’t change, I don’t need a place to put in validation or other logic
hooks before updates. I can also freely copy the data rather than move it, so I
don’t have to change references from old locations, nor do I worry about sections
of code getting stale data. Immutability is a powerful preservative.

Mechanics

Create encapsulating functions to access and update the variable.
Run static checks.
For each reference to the variable, replace with a call to the appropriate
encapsulating function. Test after each replacement.
Restrict the visibility of the variable.
  Sometimes it’s not possible to prevent access to the variable. If so, it may be
  useful to detect any remaining references by renaming the variable and testing.
Test.
If the value of the variable is a record, consider Encapsulate Record (162).

Example

Consider some useful data held in a global variable.

  let defaultOwner = {firstName: "Martin", lastName: "Fowler"};

Like any data, it’s referenced with code like this:

  spaceship.owner = defaultOwner;

and updated like this:

  defaultOwner = {firstName: "Rebecca", lastName: "Parsons"};
To do a basic encapsulation on this, I start by defining functions to read and
write the data.

  function getDefaultOwner() {return defaultOwner;}
  function setDefaultOwner(arg) {defaultOwner = arg;}

I then start working on references to defaultOwner. When I see a reference, I replace
it with a call to the getting function.

  spaceship.owner = getDefaultOwner();

defaultOwner.js…
  let defaultOwner = {firstName: "Martin", lastName: "Fowler"};
  export function getDefaultOwner() {return defaultOwner;}
  export function setDefaultOwner(arg) {defaultOwner = arg;}

defaultOwner.js…
  let defaultOwnerData = {firstName: "Martin", lastName: "Fowler"};
  export function getDefaultOwner() {return defaultOwnerData;}
  export function setDefaultOwner(arg) {defaultOwnerData = arg;}

  const owner1 = defaultOwner();
  assert.equal("Fowler", owner1.lastName, "when set");
  const owner2 = defaultOwner();
  owner2.lastName = "Parsons";
  assert.equal("Parsons", owner1.lastName, "after change owner2"); // is this ok?

The basic refactoring encapsulates the reference to the data item. In many
cases, this is all I want to do for the moment. But I often want to take the
encapsulation deeper to control not just changes to the variable but also to its contents.
For this, I have a couple of options. The simplest one is to prevent any changes
to the value. My favorite way to handle this is by modifying the getting function to
return a copy of the data.

defaultOwner.js…
  let defaultOwnerData = {firstName: "Martin", lastName: "Fowler"};
  export function defaultOwner() {return Object.assign({}, defaultOwnerData);}
  export function setDefaultOwner(arg) {defaultOwnerData = arg;}

I use this approach particularly often with lists. If I return a copy of the data,
any clients using it can change it, but that change isn’t reflected in the shared
data. I have to be careful with using copies, however: Some code may expect to
change shared data. If that’s the case, I’m relying on my tests to detect a problem.
An alternative is to prevent changes, and a good way of doing that is Encapsulate
Record (162).

  let defaultOwnerData = {firstName: "Martin", lastName: "Fowler"};
  export function defaultOwner() {return new Person(defaultOwnerData);}
  export function setDefaultOwner(arg) {defaultOwnerData = arg;}

  class Person {
    constructor(data) {
      this._lastName = data.lastName;
      this._firstName = data.firstName;
    }
    get lastName() {return this._lastName;}
    get firstName() {return this._firstName;}
    // and so on for other properties
  }

Now, any attempt to reassign the properties of the default owner will cause
an error. Different languages have different techniques to detect or prevent
changes like this, so depending on the language I’d consider other options.
Detecting and preventing changes like this is often worthwhile as a temporary
measure. I can either remove the changes, or provide suitable mutating functions.
Then, once they are all dealt with, I can modify the getting method to return
a copy.
So far I’ve talked about copying on getting data, but it may be worthwhile to
make a copy in the setter too. That will depend on where the data comes from
and whether I need to maintain a link to reflect any changes in that original data.

Again, I make the replacements one at a time, but this time I reverse the
condition as I put in the guard clause.

  function adjustedCapital(anInstrument) {
    let result = 0;
    if (anInstrument.capital <= 0) return result;
    if (anInstrument.interestRate > 0 && anInstrument.duration > 0) {
      result = (anInstrument.income / anInstrument.duration) * anInstrument.adjustmentFactor;
    }
    return result;
  }

The next conditional is a bit more complicated, so I do it in two steps. First, I
simply add a not.

  function adjustedCapital(anInstrument) {
    let result = 0;
    if (anInstrument.capital <= 0) return result;
    if (!(anInstrument.interestRate > 0 && anInstrument.duration > 0)) return result;
    result = (anInstrument.income / anInstrument.duration) * anInstrument.adjustmentFactor;
    return result;
  }

Leaving nots in a conditional like that twists my mind around at a painful angle,
so I simplify it:

  function adjustedCapital(anInstrument) {
    let result = 0;
    if (anInstrument.capital <= 0) return result;
    if (anInstrument.interestRate <= 0 || anInstrument.duration <= 0) return result;
    result = (anInstrument.income / anInstrument.duration) * anInstrument.adjustmentFactor;
    return result;
  }

Both of those lines have conditions with the same result, so I apply Consolidate
Conditional Expression (263).

  function adjustedCapital(anInstrument) {
    let result = 0;
    if (   anInstrument.capital <= 0
        || anInstrument.interestRate <= 0
        || anInstrument.duration <= 0) return result;
    result = (anInstrument.income / anInstrument.duration) * anInstrument.adjustmentFactor;
    return result;
  }

The result variable is doing two things here. Its first setting to zero indicates
what to return when the guard clause triggers; its second value is the final
computation. I can get rid of it, which both eliminates its double usage and gets me
a strawberry.

  function adjustedCapital(anInstrument) {
    if (   anInstrument.capital <= 0
        || anInstrument.interestRate <= 0
        || anInstrument.duration <= 0) return 0;
    return (anInstrument.income / anInstrument.duration) * anInstrument.adjustmentFactor;
  }
  function payAmount(employee) {
    let result;
    if (employee.isSeparated) return {amount: 0, reasonCode: "SEP"};
    if (employee.isRetired) return {amount: 0, reasonCode: "RET"};
    // logic to compute amount
    lorem.ipsum(dolor.sitAmet);
    consectetur(adipiscing).elit();
    sed.do.eiusmod = tempor.incididunt.ut(labore) && dolore(magna.aliqua);
    ut.enim.ad(minim.veniam);
    result = someFinalComputation();
    return result;
  }

At which point the result variable isn’t really doing anything useful, so I
remove it.

  function payAmount(employee) {
    if (employee.isSeparated) return {amount: 0, reasonCode: "SEP"};
    if (employee.isRetired) return {amount: 0, reasonCode: "RET"};
    // logic to compute amount
    lorem.ipsum(dolor.sitAmet);
    consectetur(adipiscing).elit();
    sed.do.eiusmod = tempor.incididunt.ut(labore) && dolore(magna.aliqua);
    ut.enim.ad(minim.veniam);
    return someFinalComputation();
  }

The rule is that you always get an extra strawberry when you remove a mutable
variable.

If I don’t need such a link, a copy prevents accidents due to changes on that
source data. Taking a copy may be superfluous most of the time, but copies in
these cases usually have a negligible effect on performance; on the other hand,
if I don’t do them, there is a risk of a long and difficult bout of debugging in the
future.
Remember that the copying above, and the class wrapper, both only work one
level deep in the record structure. Going deeper requires more levels of copies
or object wrapping.
As you can see, encapsulating data is valuable, but often not straightforward.
Exactly what to encapsulate, and how to do it, depends on the way the data is
being used and the changes I have in mind. But the more widely it’s used, the
more it’s worth my attention to encapsulate properly.
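The remark that copying only works one level deep can be seen in a short script. This is a sketch under assumptions: the nested address field and the defaultOwnerDeep function are invented for illustration and are not part of the original defaultOwner example.

```javascript
// Hypothetical nested record; the address field is an assumption.
let defaultOwnerData = {
  firstName: "Martin",
  lastName: "Fowler",
  address: {city: "Boston"},
};

// Shallow copy, in the style of the copying getter.
function defaultOwner() {
  return Object.assign({}, defaultOwnerData);
}

// One more level of copying: the nested record is copied as well.
function defaultOwnerDeep() {
  return Object.assign({}, defaultOwnerData, {
    address: Object.assign({}, defaultOwnerData.address),
  });
}

const shallow = defaultOwner();
shallow.lastName = "Parsons";      // top-level change stays in the copy
shallow.address.city = "Chicago";  // nested object is shared, so this leaks through

const deep = defaultOwnerDeep();
deep.address.city = "London";      // nested copy keeps this change isolated
```

After this runs, the shared lastName is untouched, but the shared address.city has been changed through the shallow copy. Only the deeper copy protects the nested record, and each further level of nesting would need the same treatment.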
  function payAmount(employee) {
    let result;
    if (employee.isSeparated) {
      result = {amount: 0, reasonCode: "SEP"};
    }
    else {
      if (employee.isRetired) {
        result = {amount: 0, reasonCode: "RET"};
      }
      else {
        // logic to compute amount
        lorem.ipsum(dolor.sitAmet);
        consectetur(adipiscing).elit();
        sed.do.eiusmod = tempor.incididunt.ut(labore) && dolore(magna.aliqua);
        ut.enim.ad(minim.veniam);
        result = someFinalComputation();
      }
    }
    return result;
  }

Nesting the conditionals here masks the true meaning of what is going on. The
primary purpose of this code only applies if these conditions aren’t the case. In
this situation, the intention of the code reads more clearly with guard clauses.
As with any refactoring change, I like to take small steps, so I begin with the
topmost condition.

  function payAmount(employee) {
    let result;
    if (employee.isSeparated) return {amount: 0, reasonCode: "SEP"};
    if (employee.isRetired) {
      result = {amount: 0, reasonCode: "RET"};
    }
    else {
      // logic to compute amount
      lorem.ipsum(dolor.sitAmet);
      consectetur(adipiscing).elit();
      sed.do.eiusmod = tempor.incididunt.ut(labore) && dolore(magna.aliqua);
      ut.enim.ad(minim.veniam);
      result = someFinalComputation();
    }
    return result;
  }

I test that change and move on to the next one.

Rename Variable

  let a = height * width;

  let area = height * width;

Motivation

Naming things well is the heart of clear programming. Variables can do a lot to
explain what I’m up to, if I name them well. But I frequently get my names
wrong: sometimes because I’m not thinking carefully enough, sometimes because
my understanding of the problem improves as I learn more, and sometimes
because the program’s purpose changes as my users’ needs change.
Even more than most program elements, the importance of a name depends
on how widely it’s used. A variable used in a one-line lambda expression is
usually easy to follow; I often use a single letter in that case since the variable’s
purpose is clear from its context. Parameters for short functions can often be
terse for the same reason, although in a dynamically typed language like
JavaScript, I do like to put the type into the name (hence parameter names
like aCustomer).
Persistent fields that last beyond a single function invocation require more
careful naming. This is where I’m likely to put most of my attention.

Mechanics

If the variable is used widely, consider Encapsulate Variable (132).
Find all references to the variable, and change every one.
  If there are references from another code base, the variable is a published variable,
  and you cannot do this refactoring.
  If the variable does not change, you can copy it to one with the new name, then
  change gradually, testing after each change.
Test.
138 Chapter 6 A First Set of Refactorings Replace Nested Conditional with Guard Clauses 267
Example Motivation
The simplest case for renaming a variable is when it’s local to a single function: I often find that conditional expressions come in two styles. In the first style,
a temp or argument. It’s too trivial for even an example: I just find each reference both legs of the conditional are part of normal behavior, while in the second
and change it. After I’m done, I test to ensure I didn’t mess up. style, one leg is normal and the other indicates an unusual condition.
Problems occur when the variable has a wider scope than just a single function. These kinds of conditionals have different intentions—and these intentions
There may be a lot of references all over the code base: should come through in the code. If both are part of normal behavior, I use a
condition with an if and an else leg. If the condition is an unusual condition, I
let tpHd = "untitled";
check the condition and return if it’s true. This kind of check is often called a
Some references access the variable: guard clause.
The key point of Replace Nested Conditional with Guard Clauses is emphasis.
result += `<h1>${tpHd}</h1>`; If I’m using an if-then-else construct, I’m giving equal weight to the if leg and
Others update it: the else leg. This communicates to the reader that the legs are equally likely and
important. Instead, the guard clause says, “This isn’t the core to this function,
tpHd = obj['articleTitle']; and if it happens, do something and get out.”
I often find I use Replace Nested Conditional with Guard Clauses when I’m
My usual response to this is apply Encapsulate Variable (132).
working with a programmer who has been taught to have only one entry point
result += `<h1>${title()}</h1>`; and one exit point from a method. One entry point is enforced by modern lan-
guages, but one exit point is really not a useful rule. Clarity is the key principle:
setTitle(obj['articleTitle']);
If the method is clearer with one exit point, use one exit point; otherwise don’t.
function title() {return tpHd;}
function setTitle(arg) {tpHd = arg;}
Mechanics
At this point, I can rename the variable.
Select outermost condition that needs to be replaced, and change it into a
let _title = "untitled"; guard clause.
function title() {return _title;} Test.
function setTitle(arg) {_title = arg;}
Repeat as needed.
I could continue by inlining the wrapping functions so all callers are using the
variable directly. But I’d rarely want to do this. If the variable is used widely If all the guard clauses return the same result, use Consolidate Conditional
enough that I feel the need to encapsulate it in order to change its name, it’s Expression (263).
worth keeping it encapsulated behind functions for the future.
In cases where I was going to inline, I’d call the getting function getTitle and not use Example
an underscore for the variable name when I rename it.
Here’s some code to calculate a payment amount for an employee. It’s only rele-
vant if the employee is still with the company, so it has to check for the two
Renaming a Constant other cases.
If I’m renaming a constant (or something that acts like a constant to clients) I
can avoid encapsulation, and still do the rename gradually, by copying. If the
original declaration looks like this:
const cpyNm = "Acme Gooseberries";
With the copy, I can gradually change references from the old name to the
new name. When I’m done, I remove the copy. I prefer to declare the new name
and copy to the old name if it makes it a tad easier to remove the old name and
put it back again should a test fail.
This works for constants as well as for variables that are read-only to clients
(such as an exported variable in JavaScript).
// Before: nested conditionals with a single exit point
function getPayAmount() {
  let result;
  if (isDead)
    result = deadAmount();
  else {
    if (isSeparated)
      result = separatedAmount();
    else {
      if (isRetired)
        result = retiredAmount();
      else
        result = normalPayAmount();
    }
  }
  return result;
}

// After: each special case is handled by a guard clause
function getPayAmount() {
  if (isDead) return deadAmount();
  if (isSeparated) return separatedAmount();
  if (isRetired) return retiredAmount();
  return normalPayAmount();
}
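
To check that the nested form and the guard-clause form behave the same, here is a minimal runnable sketch. The employee record shape, the function names payAmountNested and payAmountGuarded, and the fixed amounts are assumptions made for illustration; the text does not show the real calculations.

```javascript
// Hypothetical amounts; the real calculations are not shown in the text.
const AMOUNTS = {dead: 0, separated: 100, retired: 250, normal: 1000};

// Nested form: one exit point, with the normal case buried in the last else.
function payAmountNested(employee) {
  let result;
  if (employee.isDead) result = AMOUNTS.dead;
  else {
    if (employee.isSeparated) result = AMOUNTS.separated;
    else {
      if (employee.isRetired) result = AMOUNTS.retired;
      else result = AMOUNTS.normal;
    }
  }
  return result;
}

// Guard-clause form: each unusual case returns immediately.
function payAmountGuarded(employee) {
  if (employee.isDead) return AMOUNTS.dead;
  if (employee.isSeparated) return AMOUNTS.separated;
  if (employee.isRetired) return AMOUNTS.retired;
  return AMOUNTS.normal;
}

// One sample employee per branch, plus the normal case.
const sampleEmployees = [
  {isDead: true}, {isSeparated: true}, {isRetired: true}, {},
];
const allAgree = sampleEmployees.every(
  e => payAmountNested(e) === payAmountGuarded(e));
```

In the guarded version the normal case is the only line that is not a special case, which is the emphasis this refactoring is after.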
140 Chapter 6 A First Set of Refactorings Consolidate Conditional Expression 265
Mechanics

Ensure that none of the conditionals have any side effects.
  If any do, use Separate Query from Modifier (306) on them first.
Take two of the conditional statements and combine their conditions using a logical operator.
  Sequences combine with or, nested if statements combine with and.
Test.
Repeat combining conditionals until they are all in a single condition.
Consider using Extract Function (106) on the resulting condition.

Example

Perusing some code, I see the following:

function disabilityAmount(anEmployee) {
  if (anEmployee.seniority < 2) return 0;
  if (anEmployee.monthsDisabled > 12) return 0;
  if (anEmployee.isPartTime) return 0;
  // compute the disability amount

It’s a sequence of conditional checks which all have the same result. Since the
result is the same, I should combine these conditions into a single expression.
For a sequence like this, I do it using an or operator.

function disabilityAmount(anEmployee) {
  if ((anEmployee.seniority < 2)
      || (anEmployee.monthsDisabled > 12)) return 0;
  if (anEmployee.isPartTime) return 0;
  // compute the disability amount

I test, then fold in the other condition. Once the condition is complete, I can use
Extract Function (106) on it:

function disabilityAmount(anEmployee) {
  if (isNotEligableForDisability()) return 0;
  // compute the disability amount

  function isNotEligableForDisability() {
    return ((anEmployee.seniority < 2)
            || (anEmployee.monthsDisabled > 12)
            || (anEmployee.isPartTime));
  }

Introduce Parameter Object

Mechanics

If there isn’t a suitable structure already, create one.
  I prefer to use a class, as that makes it easier to group behavior later on. I usually
  like to ensure these structures are value objects [mf-vo].
Test.
Use Change Function Declaration (124) to add a parameter for the new structure.
Test.
Adjust each caller to pass in the correct instance of the new structure. Test after each one.
For each element of the new structure, replace the use of the original
parameter with the element of the structure. Remove the parameter. Test.

Example

I’ll begin with some code that looks at a set of temperature readings and determines
whether any of them fall outside of an operating range. Here’s what the
data looks like for the readings:

const station = { name: "ZB1",
                  readings: [
                    {temp: 47, time: "2016-11-10 09:10"},
                    {temp: 53, time: "2016-11-10 09:20"},
                    {temp: 58, time: "2016-11-10 09:30"},
                    {temp: 53, time: "2016-11-10 09:40"},
                    {temp: 51, time: "2016-11-10 09:50"},
                  ]
                };

I have a function to find the readings that are outside a temperature range.
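
The function and its caller are not reproduced at this point. As a hedged sketch inferred from the later versions of readingsOutsideRange in this example, the starting point might look like the following. The three-argument signature and the operatingPlan values are assumptions for illustration.

```javascript
const station = {
  name: "ZB1",
  readings: [
    {temp: 47, time: "2016-11-10 09:10"},
    {temp: 53, time: "2016-11-10 09:20"},
    {temp: 58, time: "2016-11-10 09:30"},
    {temp: 53, time: "2016-11-10 09:40"},
    {temp: 51, time: "2016-11-10 09:50"},
  ],
};

// Assumed starting signature: the range arrives as two loose numbers.
function readingsOutsideRange(station, min, max) {
  return station.readings
    .filter(r => r.temp < min || r.temp > max);
}

// Hypothetical operating plan; the field names follow the caller shown later.
const operatingPlan = {temperatureFloor: 50, temperatureCeiling: 55};

// The caller pulls the pair out of the plan and passes it along.
const alerts = readingsOutsideRange(station,
                                    operatingPlan.temperatureFloor,
                                    operatingPlan.temperatureCeiling);
```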
Notice how the calling code pulls the two data items as a pair from another
object and passes the pair into readingsOutsideRange. The operating plan uses different
names to indicate the start and end of the range compared to readingsOutsideRange.
A range like this is a common case where two separate data items are better
combined into a single object. I’ll begin by declaring a class for the combined data.

class NumberRange {
  constructor(min, max) {
    this._data = {min: min, max: max};
  }
  get min() {return this._data.min;}
  get max() {return this._data.max;}
}

I declare a class, rather than just using a basic JavaScript object, because I
usually find this refactoring to be a first step to moving behavior into the newly
created object. Since a class makes sense for this, I go right ahead and use one
directly. I also don’t provide any update methods for the new class, as I’ll probably
make this a Value Object [mf-vo]. Most times I do this refactoring, I create value
objects.

I then use Change Function Declaration (124) to add the new object as a parameter
to readingsOutsideRange.

function readingsOutsideRange(station, min, max, range) {
  return station.readings
    .filter(r => r.temp < min || r.temp > max);
}

In JavaScript, I can leave the caller as is, but in other languages I’d have to add
a null for the new parameter which would look something like this:

caller
alerts = readingsOutsideRange(station,
                              operatingPlan.temperatureFloor,
                              operatingPlan.temperatureCeiling,
                              null);

At this point I haven’t changed any behavior, and tests should still pass. I then
go to each caller and adjust it to pass in the correct range.

caller
const range = new NumberRange(operatingPlan.temperatureFloor, operatingPlan.temperatureCeiling);
alerts = readingsOutsideRange(station,
                              operatingPlan.temperatureFloor,
                              operatingPlan.temperatureCeiling,
                              range);

I still haven’t altered any behavior yet, as the parameter isn’t used. All tests
should still work.

Consolidate Conditional Expression

if (anEmployee.seniority < 2) return 0;
if (anEmployee.monthsDisabled > 12) return 0;
if (anEmployee.isPartTime) return 0;

if (isNotEligableForDisability()) return 0;

function isNotEligableForDisability() {
  return ((anEmployee.seniority < 2)
          || (anEmployee.monthsDisabled > 12)
          || (anEmployee.isPartTime));
}

Motivation

Sometimes, I run into a series of conditional checks where each check is different
yet the resulting action is the same. When I see this, I use and and or operators
to consolidate them into a single conditional check with a single result.

Consolidating the conditional code is important for two reasons. First, it makes
it clearer by showing that I’m really making a single check that combines other
checks. The sequence has the same effect, but it looks like I’m carrying out a
sequence of separate checks that just happen to be close together. The second
reason I like to do this is that it often sets me up for Extract Function (106).
Extracting a condition is one of the most useful things I can do to clarify my code.
It replaces a statement of what I’m doing with why I’m doing it.

The reasons in favor of consolidating conditionals also point to the reasons
against doing it. If I consider it to be truly independent checks that shouldn’t be
thought of as a single check, I don’t do the refactoring.
if (summer())
  charge = summerCharge();
else
  charge = regularCharge();

function summer() {
  return !aDate.isBefore(plan.summerStart) && !aDate.isAfter(plan.summerEnd);
}
function summerCharge() {
  return quantity * plan.summerRate;
}
function regularCharge() {
  return quantity * plan.regularRate + plan.regularServiceCharge;
}

With that done, I like to reformat the conditional using the ternary operator.

charge = summer() ? summerCharge() : regularCharge();

function summer() {
  return !aDate.isBefore(plan.summerStart) && !aDate.isAfter(plan.summerEnd);
}
function summerCharge() {
  return quantity * plan.summerRate;
}
function regularCharge() {
  return quantity * plan.regularRate + plan.regularServiceCharge;
}

Now I can start replacing the usage of the parameters. I’ll start with the
maximum.

function readingsOutsideRange(station, min, range) {
  return station.readings
    .filter(r => r.temp < min || r.temp > range.max);
}

caller
const range = new NumberRange(operatingPlan.temperatureFloor, operatingPlan.temperatureCeiling);
alerts = readingsOutsideRange(station,
                              operatingPlan.temperatureFloor,
                              range);

I can test at this point, then remove the other parameter.

function readingsOutsideRange(station, range) {
  return station.readings
    .filter(r => r.temp < range.min || r.temp > range.max);
}

caller
const range = new NumberRange(operatingPlan.temperatureFloor, operatingPlan.temperatureCeiling);
alerts = readingsOutsideRange(station, range);
That completes this refactoring. However, replacing a clump of parameters
with a real object is just the setup for the really good stuff. The great benefit of
making a class like this is that I can then move behavior into the new class. In
this case, I’d add a method for range that tests if a value falls within the range.
function readingsOutsideRange(station, range) {
return station.readings
.filter(r => !range.contains(r.temp));
}
class NumberRange…
contains(arg) {return (arg >= this.min && arg <= this.max);}
This is a first step to creating a range [mf-range] that can take on a lot of useful
behavior. Once I’ve identified the need for a range in my code, I can be constantly
on the lookout for other cases where I see a max/min pair of numbers and replace
them with a range. (One immediate possibility is the operating plan, replacing
temperatureFloor and temperatureCeiling with a temperatureRange.) As I look at how these
pairs are used, I can move more useful behavior into the range class, simplifying
its usage across the code base. One of the first things I may add is a value-based
equality method to make it a true value object.
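
As a hedged sketch of where this can lead, the class below combines the contains method with one possible value-based equality method. The method name equals and the decision to compare only min and max are assumptions; the text does not prescribe either.

```javascript
class NumberRange {
  constructor(min, max) {
    this._data = {min: min, max: max};
  }
  get min() {return this._data.min;}
  get max() {return this._data.max;}
  contains(arg) {return (arg >= this.min && arg <= this.max);}
  // Assumed name: two ranges are equal when both bounds are equal.
  equals(other) {
    return other instanceof NumberRange
      && this.min === other.min
      && this.max === other.max;
  }
}

function readingsOutsideRange(station, range) {
  return station.readings
    .filter(r => !range.contains(r.temp));
}

const station = {
  name: "ZB1",
  readings: [
    {temp: 47, time: "2016-11-10 09:10"},
    {temp: 53, time: "2016-11-10 09:20"},
    {temp: 58, time: "2016-11-10 09:30"},
  ],
};
// Hypothetical operating range of 50 to 55.
const outside = readingsOutsideRange(station, new NumberRange(50, 55));
```

Because the class has no setters, comparing by bounds is safe: two ranges that are equal now stay equal.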
Decompose Conditional

if (!aDate.isBefore(plan.summerStart) && !aDate.isAfter(plan.summerEnd))
  charge = quantity * plan.summerRate;
else
  charge = quantity * plan.regularRate + plan.regularServiceCharge;

if (summer())
  charge = summerCharge();
else
  charge = regularCharge();

Motivation

One of the most common sources of complexity in a program is complex conditional
logic. As I write code to do various things depending on various conditions,
I can quickly end up with a pretty long function. Length of a function is in itself
a factor that makes it harder to read, but conditions increase the difficulty. The
problem usually lies in the fact that the code, both in the condition checks and
in the actions, tells me what happens but can easily obscure why it happens.

As with any large block of code, I can make my intention clearer by decomposing
it and replacing each chunk of code with a function call named after the intention
of that chunk. With conditions, I particularly like doing this for the
conditional part and each of the alternatives. This way, I highlight the condition
and make it clear what I’m branching on. I also highlight the reason for the
branching.

This is really just a particular case of applying Extract Function (106) to my code,
but I like to highlight this case as one where I’ve often found a remarkably good
value for the exercise.

Mechanics

Apply Extract Function (106) on the condition and each leg of the conditional.

Combine Functions into Class

As well as a class, functions like this can also be combined into a nested
function. Usually I prefer a class to a nested function, as it can be difficult to
test functions nested within another. Classes are also necessary when there is
more than one function in the group that I want to expose to collaborators.

Languages that don’t have classes as a first-class element, but do have
first-class functions, often use the Function As Object [mf-fao] to provide this
capability.

Mechanics

Apply Encapsulate Record (162) to the common data record that the functions share.
  If the data that is common between the functions isn’t already grouped into a
  record structure, use Introduce Parameter Object (140) to create a record to group
  it together.
Take each function that uses the common record and use Move Function
(198) to move it into the new class.
  Any arguments to the function call that are members can be removed from the
  argument list.
Each bit of logic that manipulates the data can be extracted with Extract
Function (106) and then moved into the new class.

Example

I grew up in England, a country renowned for its love of Tea. (Personally, I don’t
like most tea they serve in England, but have since acquired a taste for Chinese
and Japanese teas.) So my author’s fantasy conjures up a state utility for providing
tea to the population. Every month they read the tea meters, to get a record
like this:

reading = {customer: "ivan", quantity: 10, month: 5, year: 2017};

I look through the code that processes these records, and I see lots of places
where similar calculations are done on the data. So I find a spot that calculates
the base charge:

client 1…
const aReading = acquireReading();
const baseCharge = baseRate(aReading.month, aReading.year) * aReading.quantity;
Being England, everything essential must be taxed, so it is with tea. But the
rules allow at least an essential level of tea to be free of taxation.

client 2…
const aReading = acquireReading();
const base = (baseRate(aReading.month, aReading.year) * aReading.quantity);
const taxableCharge = Math.max(0, base - taxThreshold(aReading.year));

I’m sure that, like me, you noticed that the formula for the base charge is
duplicated between these two fragments. If you’re like me, you’re already reaching
for Extract Function (106). Interestingly, it seems our work has been done for us
elsewhere.

client 3…
const aReading = acquireReading();
const basicChargeAmount = calculateBaseCharge(aReading);

function calculateBaseCharge(aReading) {
  return baseRate(aReading.month, aReading.year) * aReading.quantity;
}

Given this, I have a natural impulse to change the two earlier bits of client
code to use this function. But the trouble with top-level functions like this is that
they are often easy to miss. I’d rather change the code to give the function a
closer connection to the data it processes. A good way to do this is to turn the
data into a class.

To turn the record into a class, I use Encapsulate Record (162).

class Reading {
  constructor(data) {
    this._customer = data.customer;
    this._quantity = data.quantity;
    this._month = data.month;
    this._year = data.year;
  }
  get customer() {return this._customer;}
  get quantity() {return this._quantity;}
  get month() {return this._month;}
  get year() {return this._year;}
}

To move the behavior, I’ll start with the function I already have: calculateBaseCharge.
To use the new class, I need to apply it to the data as soon as I’ve acquired it.

client 3…
const rawReading = acquireReading();
const aReading = new Reading(rawReading);
const basicChargeAmount = calculateBaseCharge(aReading);

I then use Move Function (198) to move calculateBaseCharge into the new class.

client 3…
const rawReading = acquireReading();
const aReading = new Reading(rawReading);
const taxableCharge = taxableChargeFn(aReading);

Then I apply Move Function (198).

class Reading…
get taxableCharge() {
  return Math.max(0, this.baseCharge - taxThreshold(this.year));
}

client 3…
const rawReading = acquireReading();
const aReading = new Reading(rawReading);
const taxableCharge = aReading.taxableCharge;

Since all the derived data is calculated on demand, I have no problem should
I need to update the stored data. In general, I prefer immutable data, but many
circumstances force us to work with mutable data (such as JavaScript, a language
ecosystem that wasn’t designed with immutability in mind). When there is a
reasonable chance the data will be updated somewhere in the program, then
a class is very helpful.

Chapter 10
Simplifying Conditional Logic

Much of the power of programs comes from their ability to implement conditional
logic but, sadly, much of the complexity of programs lies in these conditionals.
I often use refactoring to make conditional sections easier to understand. I regularly
apply Decompose Conditional (260) to complicated conditionals, and I use
Consolidate Conditional Expression (263) to make logical combinations clearer. I
use Replace Nested Conditional with Guard Clauses (266) to clarify cases where I want
to run some pre-checks before my main processing. If I see several conditions
using the same switching logic, it’s a good time to pull Replace Conditional with
Polymorphism (272) out of the box.

A lot of conditionals are used to handle special cases, such as nulls; if that
logic is mostly the same, then Introduce Special Case (289) (often referred to as
Introduce Null Object (289)) can remove a lot of duplicate code. And, although I
like to remove conditions a lot, if I want to communicate (and check) a program’s
state, I find Introduce Assertion (302) a worthwhile addition.

Mechanics

Create a repository for instances of the related object (if one isn’t already
present).
Ensure the constructor has a way of looking up the correct instance of the
related object.
Change the constructors for the host object to use the repository to obtain
the related object. Test after each change.

Example

I’ll begin with a class that represents orders, which I might create from an incoming
JSON document. Part of the order data is a customer ID from which I’m
creating a customer object.

class Order…
constructor(data) {
  this._number = data.number;
  this._customer = new Customer(data.customer);
  // load other data
}
get customer() {return this._customer;}

class Customer…
constructor(id) {
  this._id = id;
}
get id() {return this._id;}
The customer object I create this way is a value. If I have five orders that refer
to the customer ID of 123, I’ll have five separate customer objects. Any change
I make to one of them will not be reflected in the others. Should I want to enrich
the customer objects, perhaps by gathering data from a customer service, I’d have
to update all five customers with the same data. Having duplicate objects like
this always makes me nervous—it’s confusing to have multiple objects represent-
ing the same entity, such as a customer. This problem is particularly awkward if
the customer object is mutable, which can lead to inconsistencies between the
customer objects.
If I want to use the same customer object each time, I’ll need a place to store
it. Exactly where to store entities like this will vary from application to application,
but for a simple case I like to use a repository object [mf-repos].
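
A minimal sketch of such a repository, assuming a module-level store keyed by customer ID. The function names initialize, registerCustomer, and findCustomer are hypothetical, as is the way the Order constructor reaches the repository directly.

```javascript
class Customer {
  constructor(id) {this._id = id;}
  get id() {return this._id;}
}

// Hypothetical repository: one Map holding the single Customer for each ID.
let _repositoryData;

function initialize() {
  _repositoryData = {customers: new Map()};
}

// Create the customer only if this ID has not been seen before.
function registerCustomer(id) {
  if (!_repositoryData.customers.has(id))
    _repositoryData.customers.set(id, new Customer(id));
  return findCustomer(id);
}

function findCustomer(id) {
  return _repositoryData.customers.get(id);
}

class Order {
  constructor(data) {
    this._number = data.number;
    // Look the customer up instead of creating a fresh copy per order.
    this._customer = registerCustomer(data.customer);
  }
  get customer() {return this._customer;}
}

initialize();
const first = new Order({number: 1, customer: "123"});
const second = new Order({number: 2, customer: "123"});
const sharesCustomer = first.customer === second.customer;
```

With this in place, both orders hold the same Customer instance, so a change made through one order is visible through the other. The price is that the constructor now depends on global state, which some applications will prefer to pass in explicitly.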
let customer = customerRepository.get(customerData.id);

Motivation

A data structure may have several records linked to the same logical data structure.
I might read in a list of orders, some of which are for the same customer. When I
have sharing like this, I can represent it by treating the customer either as a value
or as a reference. With a value, the customer data is copied into each order; with
a reference, there is only one data structure that multiple orders link to.

If the customer never needs to be updated, then both approaches are reasonable.
It is, perhaps, a bit confusing to have multiple copies of the same data, but it’s
common enough to not be a problem. In some cases, there may be issues
with memory due to multiple copies but, like any performance issue, that’s
relatively rare.

The biggest difficulty in having physical copies of the same logical data occurs
when I need to update the shared data. I then have to find all the copies and
update them all. If I miss one, I’ll get a troubling inconsistency in my data. In
this case, it’s often worthwhile to change the copied data into a single reference.
That way, any change is visible to all the customer’s orders.

Changing a value to a reference results in only one object being present for
an entity, and it usually means I need some kind of repository where I can access
these objects. I then only create the object for an entity once, and everywhere
else I retrieve it from the repository.

In most object-oriented languages, there is a built-in equality test that is supposed to
be overridden for value-based equality. In Ruby, I can override the == operator; in Java,
I override the Object.equals() method. And whenever I override an equality method, I
usually need to override a hashcode generating method too (e.g., Object.hashCode() in Java)
to ensure collections that use hashing work properly with my new value.

If the telephone number is used by more than one client, the procedure is still
the same. As I apply Remove Setting Method (331), I’ll be modifying several clients
instead of just one. Tests for non-equal telephone numbers, as well as comparisons
to non-telephone-numbers and null values, are also worthwhile.

Combine Functions into Transform

function enrichReading(argReading) {
  const aReading = _.cloneDeep(argReading);
  aReading.baseCharge = base(aReading);
  aReading.taxableCharge = taxableCharge(aReading);
  return aReading;
}

Motivation

Software often involves feeding data into programs that calculate various derived
information from it. These derived values may be needed in several places, and
those calculations are often repeated wherever the derived data is used. I prefer
to bring all of these derivations together, so I have a consistent place to find and
update them and avoid any duplicate logic.

One way to do this is to use a data transformation function that takes the
source data as input and calculates all the derivations, putting each derived value
as a field in the output data. Then, to examine the derivations, all I need do is
look at the transform function.

An alternative to Combine Functions into Transform is Combine Functions into
Class (144) that moves the logic into methods on a class formed from the source
data. Either of these refactorings is helpful, and my choice will often depend
on the style of programming already in the software. But there is one important
difference: Using a class is much better if the source data gets updated within
the code. Using a transform stores derived data in the new record, so if the source
data changes, I will run into inconsistencies.

One of the reasons I like to combine functions is to avoid duplication of
the derivation logic. I can do that just by using Extract Function (106) on the logic,
but it’s often difficult to find the functions unless they are kept close to the data
structures they operate on. Using a transform (or a class) makes it easy to find
and use them.

Mechanics

Create a transformation function that takes the record to be transformed
and returns the same values.
  This will usually involve a deep copy of the record. It is often worthwhile to write
  a test to ensure the transform does not alter the original record.
Pick some logic and move its body into the transform to create a new field
in the record. Change the client code to access the new field.
  If the logic is complex, use Extract Function (106) first.
Test.
Repeat for the other relevant functions.
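
The steps above can be sketched as follows. This is a minimal illustration, not the text’s own code: baseRate and taxThreshold are stubbed with invented values so the block runs, and the deep copy uses a JSON round-trip, which is adequate only for plain JSON-like records.

```javascript
// Hypothetical rate and threshold functions, invented so the sketch runs.
function baseRate(month, year) {return 2;}
function taxThreshold(year) {return 5;}

function enrichReading(original) {
  // Deep copy so the caller's record is left untouched.
  const result = JSON.parse(JSON.stringify(original));
  // Each derivation becomes a field on the copied record.
  result.baseCharge = baseRate(result.month, result.year) * result.quantity;
  result.taxableCharge =
    Math.max(0, result.baseCharge - taxThreshold(result.year));
  return result;
}

const rawReading = {customer: "ivan", quantity: 10, month: 5, year: 2017};
const snapshot = JSON.stringify(rawReading);
const aReading = enrichReading(rawReading);

// The transform must not alter the record it was given.
const originalUntouched = JSON.stringify(rawReading) === snapshot;

// Changing a source field afterwards leaves the derived fields stale.
aReading.quantity = 20;
const staleBaseCharge = aReading.baseCharge;
```

The last two lines show the weakness of a transform: after the quantity is changed, baseCharge still reflects the old quantity, which is why a class is the better fit when the source data is updated.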
Example
Where I grew up, tea is an important part of life—so much that I can imagine a
special utility that provides tea to the populace that’s regulated like a utility. Every
month, the utility gets a reading of how much tea a customer has acquired.
reading = {customer: "ivan", quantity: 10, month: 5, year: 2017};
Code in various places calculates various consequences of this tea usage. One
such calculation is the base monetary amount that’s used to calculate the charge
for the customer.
client 1…
const aReading = acquireReading();
const baseCharge = baseRate(aReading.month, aReading.year) * aReading.quantity;
Another is the amount that should be taxed—which is less than the base amount
since the government wisely considers that every citizen should get some tea
tax free.
client 2…
const aReading = acquireReading();
const base = (baseRate(aReading.month, aReading.year) * aReading.quantity);
const taxableCharge = Math.max(0, base - taxThreshold(aReading.year));
Looking through this code, I see these calculations repeated in several places.
Such duplication is asking for trouble when they need to change (and I’d bet it’s
“when” not “if”). I can deal with this repetition by using Extract Function (106) on
these calculations, but such functions often end up scattered around the program
making it hard for future developers to realize they are there. Indeed, looking
around I discover such a function, used in another area of the code.

Within the transformation function, I’m happy to mutate a result object, instead
of copying each time. I like immutability, but most common languages make it
difficult to work with. I’m prepared to go through the extra effort to support
it at boundaries, but will mutate within smaller scopes. I also pick my names
(using aReading as the accumulating variable) to make it easier to move the code
into the transformer function.

I change the client that uses that function to use the enriched field instead.

client 3…
const rawReading = acquireReading();
const aReading = enrichReading(rawReading);
const basicChargeAmount = aReading.baseCharge;

Once I’ve moved all calls to calculateBaseCharge, I can nest it inside enrichReading.
That would make it clear that clients that need the calculated base charge should
use the enriched record.

One trap to beware of here. When I write enrichReading like this, to return the
enriched reading, I’m implying that the original reading record isn’t changed. So
it’s wise for me to add a test.

it('check reading unchanged', function() {
  const baseReading = {customer: "ivan", quantity: 15, month: 5, year: 2017};
  const oracle = _.cloneDeep(baseReading);
  enrichReading(baseReading);
  assert.deepEqual(baseReading, oracle);
});

I can then change client 1 to also use the same field.

client 1…
const rawReading = acquireReading();
const aReading = enrichReading(rawReading);
const baseCharge = aReading.baseCharge;

There is a good chance I can then use Inline Variable (123) on baseCharge too.

Now I turn to the taxable amount calculation. My first step is to add in the
transformation function.

const rawReading = acquireReading();
const aReading = enrichReading(rawReading);
const base = (baseRate(aReading.month, aReading.year) * aReading.quantity);
const taxableCharge = Math.max(0, base - taxThreshold(aReading.year));

I can immediately replace the calculation of the base charge with the new field.
If the calculation was complex, I could Extract Function (106) first, but here it’s
simple enough to do in one step.

const rawReading = acquireReading();
const aReading = enrichReading(rawReading);
const base = aReading.baseCharge;
const taxableCharge = Math.max(0, base - taxThreshold(aReading.year));

Once I’ve tested that that works, I apply Inline Variable (123):

Once I’ve tested that, it’s likely I would be able to use Inline Variable (123) on
taxableCharge.

One big problem with an enriched reading like this is: What happens should
a client change a data value? Changing, say, the quantity field would result in
data that’s inconsistent. To avoid this in JavaScript, my best option is to use
Combine Functions into Class (144) instead. If I’m in a language with immutable
data structures, I don’t have this problem, so it’s more common to see transforms
in those languages. But even in languages without immutability, I can use
transforms if the data appears in a read-only context, such as deriving data to display
on a web page.

Mechanics

Check that the candidate class is immutable or can become immutable.
For each setter, apply Remove Setting Method (331).
Provide a value-based equality method that uses the fields of the value object.
  Most language environments provide an overridable equality function for this
  purpose. Usually you must override a hashcode generator method as well.

Example

Imagine we have a person object that holds onto a crude telephone number.

class Person…
constructor() {
  this._telephoneNumber = new TelephoneNumber();
}
get officeAreaCode() {return this._telephoneNumber.areaCode;}
set officeAreaCode(arg) {this._telephoneNumber.areaCode = arg;}
get officeNumber() {return this._telephoneNumber.number;}
set officeNumber(arg) {this._telephoneNumber.number = arg;}

class TelephoneNumber…
get areaCode() {return this._areaCode;}
set areaCode(arg) {this._areaCode = arg;}
get number() {return this._number;}
set number(arg) {this._number = arg;}

This situation is the result of an Extract Class (182) where the old parent still
holds update methods for the new object. This is a good time to apply Change
Reference to Value since there is only one reference to the new class.

The first thing I need to do is to make the telephone number immutable. I do
this by applying Remove Setting Method (331) to the fields. The first step of Remove
Setting Method (331) is to use Change Function Declaration (124) to add the two
fields to the constructor and enhance the constructor to call the setters.

class TelephoneNumber…
constructor(areaCode, number) {
  this._areaCode = areaCode;
  this._number = number;
}

Now I look at the callers of the setters. For each one, I need to change it to a
reassignment. I start with the area code.

class Person…
get officeAreaCode() {return this._telephoneNumber.areaCode;}
set officeAreaCode(arg) {
  this._telephoneNumber = new TelephoneNumber(arg, this.officeNumber);
}
get officeNumber() {return this._telephoneNumber.number;}
set officeNumber(arg) {this._telephoneNumber.number = arg;}

Change Reference to Value

class Product {
  applyDiscount(arg) {
    this._price = new Money(this._price.amount - arg, this._price.currency);
  }
Motivation
When I nest an object, or data structure, within another I can treat the inner
object as a reference or as a value. The difference is most obviously visible in
how I handle updates of the inner object’s properties. If I treat it as a reference,
I’ll update the inner object’s property keeping the same inner object. If I treat it
as a value, I will replace the entire inner object with a new one that has the
desired property.
If I treat a field as a value, I can change the class of the inner object to make
it a Value Object [mf-vo]. Value objects are generally easier to reason about,
particularly because they are immutable. In general, immutable data structures
are easier to deal with. I can pass an immutable data value out to other parts of
the program and not worry that it might change without the enclosing object
being aware of the change. I can replicate values around my program and not
worry about maintaining memory links. Value objects are especially useful in
distributed and concurrent systems.
This also suggests when I shouldn’t do this refactoring. If I want to share an
object between several objects so that any change to the shared object is visible
to all its collaborators, then I need the shared object to be a reference.
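
To make the value treatment concrete, here is a hedged sketch that borrows the person and telephone number example used for this refactoring. The equals method name and the telephoneNumber getter are assumptions added for illustration, and the phone values are invented.

```javascript
class TelephoneNumber {
  constructor(areaCode, number) {
    this._areaCode = areaCode;
    this._number = number;
  }
  // No setters: once built, a telephone number never changes.
  get areaCode() {return this._areaCode;}
  get number() {return this._number;}
  // Assumed method name; equality is based on field values, not identity.
  equals(other) {
    if (!(other instanceof TelephoneNumber)) return false;
    return this.areaCode === other.areaCode
      && this.number === other.number;
  }
}

class Person {
  constructor() {
    this._telephoneNumber = new TelephoneNumber();
  }
  get officeAreaCode() {return this._telephoneNumber.areaCode;}
  // Treating the number as a value: replace the whole object on update.
  set officeAreaCode(arg) {
    this._telephoneNumber = new TelephoneNumber(arg, this.officeNumber);
  }
  get officeNumber() {return this._telephoneNumber.number;}
  set officeNumber(arg) {
    this._telephoneNumber = new TelephoneNumber(this.officeAreaCode, arg);
  }
  // Hypothetical accessor, added so the sketch can hand the value out.
  get telephoneNumber() {return this._telephoneNumber;}
}

const person = new Person();
person.officeAreaCode = "312";
const handedOut = person.telephoneNumber;
person.officeNumber = "555-0142";
```

The handedOut object is unaffected by the later update, which is the property that makes a value safe to pass to other parts of a program.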
constructor (production) {
  this._initialProduction = production;
  this._productionAccumulator = 0;
  this._adjustments = [];
}
get production() {
  return this._initialProduction + this._productionAccumulator;
}

class ProductionPlan…
get production() {
  assert(this._productionAccumulator === this.calculatedProductionAccumulator);
  return this._initialProduction + this._productionAccumulator;
}
get calculatedProductionAccumulator() {
  return this._adjustments
    .reduce((sum, a) => sum + a.amount, 0);
}

Split Phase

const orderData = orderString.split(/\s+/);
const productPrice = priceList[orderData[0].split("-")[1]];
const orderPrice = parseInt(orderData[1]) * productPrice;

function parseOrder(aString) {
const values = aString.split(/\s+/);
return ({
productID: values[0].split("-")[1],
quantity: parseInt(values[1]),
});
}
function price(order, priceList) {
return order.quantity * priceList[order.productID];
}
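
The two phases can then be chained. The sketch below repeats the two functions so that it runs on its own; the order string format and the priceList contents are assumptions inferred from the parsing code, and the sample values are invented.

```javascript
// Phase one: turn the raw text into a small order record.
function parseOrder(aString) {
  const values = aString.split(/\s+/);
  return ({
    productID: values[0].split("-")[1],
    quantity: parseInt(values[1], 10),
  });
}

// Phase two: price the record without knowing about the text format.
function price(order, priceList) {
  return order.quantity * priceList[order.productID];
}

// Hypothetical input of the form "<prefix>-<productID> <quantity>".
const priceList = {"9876": 12};
const orderRecord = parseOrder("widget-9876 3");
const orderPrice = price(orderRecord, priceList);
```

Only parseOrder knows how the string is laid out, so a change to the input format touches one function and leaves the pricing logic alone.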
Motivation
When I run into code that’s dealing with two different things, I look for a way
to split it into separate modules. I endeavor to make this split because, if I need to
make a change, I can deal with each topic separately and not have to hold both
in my head together. If I’m lucky, I may only have to change one module without
having to remember the details of the other one at all.
One of the neatest ways to do a split like this is to divide the behavior into
two sequential phases. A good example of this is when you have some processing
whose inputs don’t reflect the model you need to carry out the logic. Before you
begin, you can massage the input into a convenient form for your main processing.
Or, you can take the logic you need to do and break it down into sequential
steps, where each step is significantly different in what it does.
The most obvious example of this is a compiler. Its basic task is to take some
text (code in a programming language) and turn it into some executable form
(e.g., object code for a specific hardware). Over time, we’ve found this can be
usefully split into a chain of phases: tokenizing the text, parsing the tokens into
a syntax tree, then various steps of transforming the syntax tree (e.g., for
optimization), and finally generating the object code. Each step has a limited scope
and I can think of one step without understanding the details of others.
Splitting phases like this is common in large software; the various phases in a
compiler can each contain many functions and classes. But I can carry out the
basic split-phase refactoring on any fragment of code, whenever I see an
opportunity to usefully separate the code into different phases. The best clue is when
different stages of the fragment use different sets of data and functions. By turning
them into separate modules I can make this difference explicit, revealing the
difference in the code.

Mechanics
Extract the second phase code into its own function.
Test.
Introduce an intermediate data structure as an additional argument to the
extracted function.
Test.
Examine each parameter of the extracted second phase. If it is used by the first
phase, move it to the intermediate data structure. Test after each move.
Sometimes, a parameter should not be used by the second phase. In this case,
extract the results of each usage of the parameter into a field of the intermediate
data structure and use Move Statements to Callers (217) on the line that populates it.
Apply Extract Function (106) on the first-phase code, returning the intermediate
data structure.
It’s also reasonable to extract the first phase into a transformer object.

Example
I’ll start with code to price an order for some vague and unimportant kind of
goods:

With the assertion in place, I run my tests. If the assertion doesn’t fail, I can
replace returning the field with returning the calculation:

class ProductionPlan…
  get production() {
    assert(this._production === this.calculatedProduction);
    return this.calculatedProduction;
  }

Then Inline Function (115):

class ProductionPlan…
  get production() {
    return this._adjustments
      .reduce((sum, a) => sum + a.amount, 0);
  }

I clean up any references to the old variable with Remove Dead Code (237):

class ProductionPlan…
  applyAdjustment(anAdjustment) {
    this._adjustments.push(anAdjustment);
    // removed as dead code: this._production += anAdjustment.amount;
  }

Example: More Than One Source
The above example is nice and easy because there’s clearly a single source for the
value of production. But sometimes, more than one element can combine in
the accumulator.

class ProductionPlan…
  constructor (production) {
    this._production = production;
    this._adjustments = [];
  }
  get production() {return this._production;}
  applyAdjustment(anAdjustment) {
    this._adjustments.push(anAdjustment);
    this._production += anAdjustment.amount;
  }

If I do the same Introduce Assertion (302) that I did above, it will now fail for
any case where the initial value of the production isn’t zero.
But I can still replace the derived data. The only difference is that I must first
apply Split Variable (240).
Replace Derived Variable with Query

  get discountedTotal() {return this._discountedTotal;}
  set discount(aNumber) {
    const old = this._discount;
    this._discount = aNumber;
    this._discountedTotal += old - aNumber;
  }

  get discountedTotal() {return this._baseTotal - this._discount;}
  set discount(aNumber) {this._discount = aNumber;}

Motivation
One of the biggest sources of problems in software is mutable data. Data changes
can often couple together parts of code in awkward ways, with changes in one
part leading to knock-on effects that are hard to spot. In many situations it’s not
realistic to entirely remove mutable data, but I do advocate minimizing the scope
of mutable data as much as possible.
One way I can make a big impact is by removing any variables that I could
just as easily calculate. A calculation often makes it clearer what the meaning of
the data is, and it is protected from being corrupted when you fail to update the
variable as the source data changes.
A reasonable exception to this is when the source data for the calculation is
immutable and we can force the result to be immutable too. Transformation
operations that create new data structures are thus reasonable to keep even if
they could be replaced with calculations. Indeed, there is a duality here between
objects that wrap a data structure with a series of calculated properties and
functions that transform one data structure into another. The object route is
clearly better when the source data changes and you would have to manage the
lifetime of the derived data structures. But if the source data is immutable, or
the derived data is very transient, then both approaches are effective.

Now, I look at the various parameters to applyShipping. The first one is basePrice
which is created by the first-phase code. So I move this into the intermediate
data structure, removing it from the parameter list.

function priceOrder(product, quantity, shippingMethod) {
  const basePrice = product.basePrice * quantity;
  const discount = Math.max(quantity - product.discountThreshold, 0)
          * product.basePrice * product.discountRate;
  const priceData = {basePrice: basePrice};
  const price = applyShipping(priceData, shippingMethod, quantity, discount);
  return price;
}
function applyShipping(priceData, shippingMethod, quantity, discount) {
  const shippingPerCase = (priceData.basePrice > shippingMethod.discountThreshold)
          ? shippingMethod.discountedFee : shippingMethod.feePerCase;
  const shippingCost = quantity * shippingPerCase;
  const price = priceData.basePrice - discount + shippingCost;
  return price;
}

The next parameter in the list is shippingMethod. This one I leave as is, since it
isn’t used by the first-phase code.
After this, I have quantity. This is used by the first phase but not created by it,
so I could actually leave this in the parameter list. My usual preference, however,
is to move as much as I can to the intermediate data structure.

function priceOrder(product, quantity, shippingMethod) {
  const basePrice = product.basePrice * quantity;
  const discount = Math.max(quantity - product.discountThreshold, 0)
          * product.basePrice * product.discountRate;
  const priceData = {basePrice: basePrice, quantity: quantity};
  const price = applyShipping(priceData, shippingMethod, discount);
  return price;
}
function applyShipping(priceData, shippingMethod, discount) {
  const shippingPerCase = (priceData.basePrice > shippingMethod.discountThreshold)
          ? shippingMethod.discountedFee : shippingMethod.feePerCase;
  const shippingCost = priceData.quantity * shippingPerCase;
  const price = priceData.basePrice - discount + shippingCost;
  return price;
}
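Carrying the Split Phase moves through to the end, the first-phase code can itself be extracted into a function that returns the intermediate data structure. The following is only a sketch of where the pricing example might land: the name calculatePricingData and the sample product and shipping values are assumptions made for illustration, and discount is assumed to move into priceData in the same way as basePrice and quantity.

```javascript
// First phase: derive the pricing data from the product and quantity.
// calculatePricingData is a hypothetical name for the extracted first phase.
function calculatePricingData(product, quantity) {
  const basePrice = product.basePrice * quantity;
  const discount = Math.max(quantity - product.discountThreshold, 0)
          * product.basePrice * product.discountRate;
  return {basePrice: basePrice, quantity: quantity, discount: discount};
}

// Second phase: apply shipping, reading only the intermediate data structure.
function applyShipping(priceData, shippingMethod) {
  const shippingPerCase = (priceData.basePrice > shippingMethod.discountThreshold)
          ? shippingMethod.discountedFee : shippingMethod.feePerCase;
  const shippingCost = priceData.quantity * shippingPerCase;
  return priceData.basePrice - priceData.discount + shippingCost;
}

function priceOrder(product, quantity, shippingMethod) {
  const priceData = calculatePricingData(product, quantity);
  return applyShipping(priceData, shippingMethod);
}

// Illustrative values only.
const product = {basePrice: 10, discountThreshold: 5, discountRate: 0.5};
const shippingMethod = {discountThreshold: 50, discountedFee: 1, feePerCase: 2};
console.log(priceOrder(product, 8, shippingMethod)); // 80 - 15 + 8 = 73
```

With the phases separated like this, a change to the shipping rules touches only applyShipping, and a change to discounting touches only calculatePricingData.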
class Organization…
class Organization {
constructor(data) {
this._title = (data.title !== undefined) ? data.title : data.name;
this._country = data.country;
}
get name() {return this._title;}
set name(aString) {this._title = aString;}
get country() {return this._country;}
set country(aCountryCode) {this._country = aCountryCode;}
}
Now, callers of my constructor can use either name or title (with title taking
precedence). I can now go through all constructor callers and change them
one-by-one to use the new name.
const organization = new Organization({title: "Acme Gooseberries", country: "GB"});
Once I’ve done all of them, I can remove the support for the name.
class Organization…
class Organization {
constructor(data) {
this._title = data.title;
this._country = data.country;
}
get name() {return this._title;}
set name(aString) {this._title = aString;}
get country() {return this._country;}
set country(aCountryCode) {this._country = aCountryCode;}
}
Now that the constructor and data use the new name, I can change the accessors,
which is as simple as applying Rename Function (124) to each one.
Apply Rename Function (124) to the accessors.
I want to change “name” to “title”. The object is widely used in the code base,
and there are updates to the title in the code. So my first move is to apply
Encapsulate Record (162).
class Organization {
constructor(data) {
this._name = data.name;
this._country = data.country;
}
get name() {return this._name;}
set name(aString) {this._name = aString;}
get country() {return this._country;}
set country(aCountryCode) {this._country = aCountryCode;}
}
Now that I’ve encapsulated the record structure into the class, there are four
places I need to look at for renaming: the getting function, the setting function,
the constructor, and the internal data structure. While that may sound like I’ve
increased my workload, it actually makes my work easier since I can now change
these independently instead of all at once, taking smaller steps. Smaller steps
mean fewer things to go wrong in each step—therefore, less work. It wouldn’t
be less work if I never made mistakes—but not making mistakes is a fantasy I
gave up on a long time ago.
Since I’ve copied the input data structure into the internal data structure, I
need to separate them so I can work on them independently. I can do this by
defining a separate field and adjusting the constructor and accessors to use it.
Rename Field

class Organization {
  get name() {...}
}

Mechanics
If the record has limited scope, rename all accesses to the field and test; no
need to do the rest of the mechanics.

Chapter 7
Encapsulation
Here inputValue is used both to supply an input to the function and to hold
the result for the caller. (Since JavaScript has call-by-value parameters, any
modification of inputValue isn’t seen by the caller.)
In this situation, I would split that variable.

function discount (originalInputValue, quantity) {
  let inputValue = originalInputValue;
  if (inputValue > 50) inputValue = inputValue - 2;
  if (quantity > 100) inputValue = inputValue - 1;
  return inputValue;
}

Encapsulate Record
formerly: Replace Record with Data Class
Motivation
This is why I often favor objects over records for mutable data. With objects, I
can hide what is stored and provide methods for all three values. The user of
the object doesn’t need to know or care which is stored and which is calculated.
This encapsulation also helps with renaming: I can rename the field while
providing methods for both the new and the old names, gradually updating callers
until they are all done.
I just said I favor objects for mutable data. If I have an immutable value, I can
just have all three values in my record, using an enrichment step if necessary.
Similarly, it’s easy to copy the field when renaming.
I can have two kinds of record structures: those where I declare the legal field
names and those that allow me to use whatever I like. The latter are often
implemented through a library class called something like hash, map, hashmap,
dictionary, or associative array. Many languages provide convenient syntax for creating
hashmaps, which makes them useful in many programming situations. The
downside of using them is they aren’t explicit about their fields. The only
way I can tell if they use start/end or start/length is by looking at where they
are created and used. This isn’t a problem if they are only used in a small section
of a program, but the wider their scope of usage, the greater the problem I get from
their implicit structure. I could refactor such implicit records into explicit ones, but
if I need to do that, I’d rather make them classes instead.
It’s common to pass nested structures of lists and hashmaps which are often
serialized into formats like JSON or XML. Such structures can be encapsulated
too, which helps if their formats change later on or if I’m concerned about updates
to the data that are hard to keep track of.

Mechanics
Use Encapsulate Variable (132) on the variable holding the record.
Give the functions that encapsulate the record names that are easily searchable.
Replace the content of the variable with a simple class that wraps the record.
Define an accessor inside this class that returns the raw record. Modify the
functions that encapsulate the variable to use this accessor.
Test.
Provide new functions that return the object rather than the raw record.
For each user of the record, replace its use of a function that returns the
record with a function that returns the object. Use an accessor on the object
to get at the field data, creating that accessor if needed. Test after each
change.
If it’s a complex record, such as one with a nested structure, focus on clients that
update the data first. Consider returning a copy or read-only proxy of the data
for clients that only read the data.
Remove the class’s raw data accessor and the easily searchable functions
that returned the raw record.
Test.
If the fields of the record are themselves structures, consider using
Encapsulate Record and Encapsulate Collection (170) recursively.

Example
I’ll start with a constant that is widely used across a program.

const organization = {name: "Acme Gooseberries", country: "GB"};

This is a JavaScript object which is being used as a record structure by various
parts of the program, with accesses like this:

function distanceTravelled (scenario, time) {
  let result;
  const primaryAcceleration = scenario.primaryForce / scenario.mass;
  let primaryTime = Math.min(time, scenario.delay);
  result = 0.5 * primaryAcceleration * primaryTime * primaryTime;
  let secondaryTime = time - scenario.delay;
  if (secondaryTime > 0) {
    let primaryVelocity = primaryAcceleration * scenario.delay;
    let acc = (scenario.primaryForce + scenario.secondaryForce) / scenario.mass;
    result += primaryVelocity * secondaryTime + 0.5 * acc * secondaryTime * secondaryTime;
  }
  return result;
}

I choose the new name to represent only the first use of the variable. I make
it const to ensure it is only assigned once. I can then declare the original variable
at its second assignment. Now I can compile and test, and all should work.
I continue on the second assignment of the variable. This removes the original
variable name completely, replacing it with a new variable named for the
second use.

function distanceTravelled (scenario, time) {
  let result;
  const primaryAcceleration = scenario.primaryForce / scenario.mass;
  let primaryTime = Math.min(time, scenario.delay);
  result = 0.5 * primaryAcceleration * primaryTime * primaryTime;
  let secondaryTime = time - scenario.delay;
  if (secondaryTime > 0) {
    let primaryVelocity = primaryAcceleration * scenario.delay;
    const secondaryAcceleration = (scenario.primaryForce + scenario.secondaryForce) / scenario.mass;
    result += primaryVelocity * secondaryTime +
      0.5 * secondaryAcceleration * secondaryTime * secondaryTime;
  }
  return result;
}

I’m sure you can think of a lot more refactoring to be done here. Enjoy it. (I’m
sure it’s better than eating the haggis. Do you know what they put in those
things?)

Example: Assigning to an Input Parameter
Another case of splitting a variable is where the variable is declared as an input
parameter. Consider something like

function discount (inputValue, quantity) {
  if (inputValue > 50) inputValue = inputValue - 2;
  if (quantity > 100) inputValue = inputValue - 1;
  return inputValue;
}
I start at the beginning by changing the name of the variable and declaring
the new name as const. Then, I change all references to the variable from that
point up to the next assignment. At the next assignment, I declare it:

client…
  getOrganization().name = newName;
Similarly, I replace any readers with the appropriate getter.
class Organization…
get name() {return this._data.name;}
client…
result += `<h1>${getOrganization().name}</h1>`;
After I’ve done that, I can follow through on my threat to give the ugly
sounding function a short life.
This has the advantage of breaking the link to the input data record. This might
be useful if a reference to it runs around, which would break encapsulation.
Should I not fold the data into individual fields, I would be wise to copy _data
when I assign it.

Example: Encapsulating a Nested Record
The above example looks at a shallow record, but what do I do with data that
is deeply nested, e.g., coming from a JSON document? The core refactoring steps
still apply, and I have to be equally careful with updates, but I do get some options
around reads.
As an example, here is some slightly more nested data: a collection of
customers, kept in a hashmap indexed by their customer ID.

Split Variable

const perimeter = 2 * (height + width);
console.log(perimeter);
const area = height * width;
console.log(area);

Motivation
Variables have various uses. Some of these uses naturally lead to the variable
being assigned to several times. Loop variables change for each run of a loop
(such as the i in for (let i=0; i<10; i++)). Collecting variables store a value that is
built up during the method.
Many other variables are used to hold the result of a long-winded bit of code
for easy reference later. These kinds of variables should be set only once. If they
are set more than once, it is a sign that they have more than one responsibility
within the method. Any variable with more than one responsibility should be
replaced with multiple variables, one for each responsibility. Using a variable for
two different things is very confusing for the reader.
Mechanics
Change the name of the variable at its declaration and first assignment.
If the later assignments are of the form i = i + something, that is a collecting variable,
so don’t split it. A collecting variable is often used for calculating sums, string
concatenation, writing to a stream, or adding to a collection.
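To make the distinction in these Split Variable mechanics concrete, here is a small sketch contrasting a collecting variable, which is left alone, with a variable that carries two responsibilities and is therefore split. The function names and sample values are hypothetical and chosen only for illustration.

```javascript
// A collecting variable: every assignment adds to the same running total,
// so it has a single responsibility and should not be split.
function totalAmount(amounts) {
  let total = 0;
  for (const a of amounts) total = total + a;
  return total;
}

// Before: temp holds a perimeter first and an area later (two responsibilities).
function describeBefore(height, width) {
  let temp = 2 * (height + width);
  const lines = [`perimeter: ${temp}`];
  temp = height * width;
  lines.push(`area: ${temp}`);
  return lines;
}

// After Split Variable: one const per responsibility, each assigned once.
function describeAfter(height, width) {
  const perimeter = 2 * (height + width);
  const area = height * width;
  return [`perimeter: ${perimeter}`, `area: ${area}`];
}

console.log(describeBefore(2, 3));
console.log(describeAfter(2, 3));
```

Both describe functions return the same lines; the second simply makes each value's purpose visible in its name.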
Chapter 9
Organizing Data

Data structures play an important role in our programs, so it’s no great shock
that I have a clutch of refactorings that focus on them. A value that’s used for
different purposes is a breeding ground for confusion and bugs, so when I see
one, I use Split Variable (240) to separate the usages. As with any program element,
getting a variable’s name right is tricky and important, so Rename Variable (137)
is often my friend. But sometimes the best thing I can do with a variable is to
get rid of it completely, with Replace Derived Variable with Query (248).
I often find problems in a code base due to a confusion between references
and values, so I use Change Reference to Value (252) and Change Value to Reference
(256) to change between these styles.

"1920": {
  name: "martin",
  id: "1920",
  usages: {
    "2016": {
      "1": 50,
      "2": 55,
      // remaining months of the year
    },
    "2015": {
      "1": 70,
      "2": 63,
      // remaining months of the year
    }
  }
},
"38673": {
  name: "neal",
  id: "38673",
  // more customers in a similar form

With more nested data, reads and writes can be digging into the data structure.

sample update…
  customerData[customerID].usages[year][month] = amount;
sample read…
function compareUsage (customerID, laterYear, month) {
const later = customerData[customerID].usages[laterYear][month];
const earlier = customerData[customerID].usages[laterYear - 1][month];
return {laterAmount: later, change: later - earlier};
}
sample update…
getRawDataOfCustomers()[customerID].usages[year][month] = amount;
sample read…
function compareUsage (customerID, laterYear, month) {
const later = getRawDataOfCustomers()[customerID].usages[laterYear][month];
const earlier = getRawDataOfCustomers()[customerID].usages[laterYear - 1][month];
return {laterAmount: later, change: later - earlier};
}
class CustomerData {
constructor(data) {
this._data = data;
}
}
top level…
function getCustomerData() {return customerData;}
function getRawDataOfCustomers() {return customerData._data;}
function setRawDataOfCustomers(arg) {customerData = new CustomerData(arg);}
The most important area to deal with is the updates. So, while I look at all the
callers of getRawDataOfCustomers, I’m focused on those where the data is changed. To
remind you, here’s the update again:
sample update…
getRawDataOfCustomers()[customerID].usages[year][month] = amount;
The general mechanics now say to return the full customer and use an accessor,
creating one if needed. I don’t have a setter on the customer for this update, and
this one digs into the structure. So, to make one, I begin by using Extract Function
(106) on the code that digs into the data structure.
sample update…
setUsage(customerID, year, month, amount);
top level…
function setUsage(customerID, year, month, amount) {
getRawDataOfCustomers()[customerID].usages[year][month] = amount;
}
I then use Move Function (198) to move it into the new customer data class.
sample update…
getCustomerData().setUsage(customerID, year, month, amount);
class CustomerData…
setUsage(customerID, year, month, amount) {
this._data[customerID].usages[year][month] = amount;
}
When working with a big data structure, I like to concentrate on the updates.
Getting them visible and gathered in a single place is the most important part
of the encapsulation.
At some point, I will think I’ve got them all—but how can I be sure? There’s a
couple of ways to check. One is to modify getRawDataOfCustomers to return a deep
copy of the data; if my test coverage is good, one of the tests should break if I
missed a modification.
top level…
  function getCustomerData() {return customerData;}
  function getRawDataOfCustomers() {return customerData.rawData;}
  function setRawDataOfCustomers(arg) {customerData = new CustomerData(arg);}
class CustomerData…
  get rawData() {
    return _.cloneDeep(this._data);
  }

I’m using the lodash library to make a deep copy.
Another approach is to return a read-only proxy for the data structure. Such a
proxy could raise an exception if the client code tries to modify the underlying
object. Some languages make this easy, but it’s a pain in JavaScript, so I’ll leave
it as an exercise for the reader. I could also take a copy and recursively freeze it
to detect any modifications.

Remove Dead Code

if(false) {
  doSomethingThatUsedToMatter();
}
Motivation
When we put code into production, even on people’s devices, we aren’t charged
by weight. A few unused lines of code don’t slow down our systems nor take up
significant memory; indeed, decent compilers will instinctively remove them. But
unused code is still a significant burden when trying to understand how the
software works. It doesn’t carry any warning signs telling programmers that they
can ignore this function as it’s never called any more, so they still have to spend
time understanding what it’s doing and why changing it doesn’t seem to alter
the output as they expected.
Once code isn’t used any more, we should delete it. I don’t worry that I may
need it sometime in the future; should that happen, I have my version control
system so I can always dig it out again. If it’s something I really think I may need
one day, I might put a comment into the code that mentions the lost code and
which revision it was removed in, but honestly, I can’t remember the last time
I did that, or regretted that I hadn’t done it.
Commenting out dead code was once a common habit. This was useful in the
days before version control systems were widely used, or when they were
inconvenient. Now, when I can put even the smallest code base under version control,
that’s no longer needed.

Mechanics
If the dead code can be referenced from outside, e.g., when it’s a full function,
do a search to check for callers.
Remove the dead code.
Test.

Dealing with the updates is valuable, but what about the readers? Here there
are a few options.
The first option is to do the same thing as I did for the setters. Extract all the
reads into their own functions and move them into the customer data class.

class CustomerData…
  usage(customerID, year, month) {
    return this._data[customerID].usages[year][month];
  }
top level…
  function compareUsage (customerID, laterYear, month) {
    const later = getCustomerData().usage(customerID, laterYear, month);
    const earlier = getCustomerData().usage(customerID, laterYear - 1, month);
    return {laterAmount: later, change: later - earlier};
  }

The great thing about this approach is that it gives customerData an explicit API
that captures all the uses made of it. I can look at the class and see all their uses
of the data. But this can be a lot of code for lots of special cases. Modern
languages provide good affordances for digging into a list-and-hash [mf-lh] data
structure, so it’s useful to give clients just such a data structure to work with.
If the client wants a data structure, I can just hand out the actual data. But the
problem with this is that there’s no way to prevent clients from modifying the data
directly, which breaks the whole point of encapsulating all the updates inside
functions. Consequently, the simplest thing to do is to provide a copy of the
underlying data, using the rawData method I wrote earlier.
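The idea mentioned under Encapsulate Record of taking a copy and recursively freezing it can be sketched with the standard Object.freeze. This is a minimal illustration, not the book's implementation: deepFreeze and tryStrayUpdate are hypothetical helpers, the JSON round-trip copy is an assumption that only suits plain JSON-like data, and the sample customer is abbreviated. A write to a frozen object throws a TypeError only in strict mode, which is why the stray update sits inside a strict function.

```javascript
// Hypothetical helper: freeze an object and everything reachable from it.
function deepFreeze(value) {
  if (value !== null && typeof value === "object" && !Object.isFrozen(value)) {
    Object.freeze(value);
    for (const key of Object.keys(value)) deepFreeze(value[key]);
  }
  return value;
}

class CustomerData {
  constructor(data) {
    this._data = data;
  }
  setUsage(customerID, year, month, amount) {
    this._data[customerID].usages[year][month] = amount;
  }
  // Hand out a frozen deep copy so clients cannot update it by accident.
  // The JSON round trip is an assumption that fits plain JSON-like data only.
  get rawData() {
    return deepFreeze(JSON.parse(JSON.stringify(this._data)));
  }
}

// Hypothetical client that tries to bypass the class.
function tryStrayUpdate(data) {
  "use strict"; // writes to frozen objects throw only in strict mode
  try {
    data["1920"].usages["2016"]["1"] = 99;
    return null;
  } catch (e) {
    return e;
  }
}

const customerData = new CustomerData({
  "1920": {name: "martin", id: "1920", usages: {"2016": {"1": 50, "2": 55}}},
});

const caught = tryStrayUpdate(customerData.rawData);
console.log(caught instanceof TypeError);        // the stray write is detected
customerData.setUsage("1920", "2016", "1", 60);  // the sanctioned route still works
console.log(customerData.rawData["1920"].usages["2016"]["1"]);
```

Compared with a plain deep copy, the frozen copy fails at the point of the stray write rather than silently discarding it, which tends to make a missed update easier to locate.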
Further Reading
For more examples on turning loops into pipelines, see my essay “Refactoring
with Loops and Collection Pipelines” [mf-ref-pipe].
function acquireData(input) {
  const lines = input.split("\n");
  const result = [];
  const loopItems = lines
        .slice(1)
        .filter(line => line.trim() !== "")
        .map(line => line.split(","))
        .filter(record => record[1].trim() === "India")
        ;
  for (const line of loopItems) {
    const record = line;
    result.push({city: record[0].trim(), phone: record[2].trim()});
  }
  return result;
}

Encapsulate Collection

class Person {
  get courses() {return this._courses;}
  set courses(aList) {this._courses = aList;}
in useful ways such as Collection Pipelines [mf-cp]. Putting in special methods
to handle this kind of functionality adds a lot of extra code and cripples the easy
composability of collection operations.
Another way is to allow some form of read-only access to a collection. Java,
for example, makes it easy to return a read-only proxy to the collection. Such a
proxy forwards all reads to the underlying collection, but blocks all writes (in
Java’s case, throwing an exception). A similar route is used by libraries that base
their collection composition on some kind of iterator or enumerable object,
providing that iterator cannot modify the underlying collection.
Probably the most common approach is to provide a getting method for the
collection, but make it return a copy of the underlying collection. That way, any
modifications to the copy don’t affect the encapsulated collection. This might
cause some confusion if programmers expect the returned collection to modify the
source field, but in many code bases, programmers are used to collection getters
providing copies. If the collection is huge, this may be a performance issue, but
most lists aren’t all that big, so the general rules for performance should apply
(Refactoring and Performance (64)).
Another difference between using a proxy and a copy is that a modification of
the source data will be visible in the proxy but not in a copy. This isn’t an issue
most of the time, because lists accessed in this way are usually only held for a
short time.
What’s important here is consistency within a code base. Use only one
mechanism so everyone can get used to how it behaves and expect it when calling any
collection accessor function.

Mechanics
Apply Encapsulate Variable (132) if the reference to the collection isn’t already
encapsulated.
Add functions to add and remove elements from the collection.
If there is a setter for the collection, use Remove Setting Method (331) if possible.
If not, make it take a copy of the provided collection.
Run static checks.
Find all references to the collection. If anyone calls modifiers on the
collection, change them to use the new add/remove functions. Test after each
change.
Modify the getter for the collection to return a protected view on it, using
a read-only proxy or a copy.
Test.

The next bit of behavior removes any blank lines. I can replace this with a filter
operation.

function acquireData(input) {
  const lines = input.split("\n");
  const result = [];
  const loopItems = lines
        .slice(1)
        .filter(line => line.trim() !== "")
        ;
  for (const line of loopItems) {
    const record = line.split(",");
    if (record[1].trim() === "India") {
      result.push({city: record[0].trim(), phone: record[2].trim()});
    }
  }
  return result;
}

When writing a pipeline, I find it best to put the terminal semicolon on its own line.
I use the map operation to turn lines into an array of strings (misleadingly
called record in the original function, but it’s safer to keep the name for now and
rename later).

function acquireData(input) {
  const lines = input.split("\n");
  const result = [];
  const loopItems = lines
        .slice(1)
        .filter(line => line.trim() !== "")
        .map(line => line.split(","))
        ;
  for (const line of loopItems) {
    const record = line;
    if (record[1].trim() === "India") {
      result.push({city: record[0].trim(), phone: record[2].trim()});
    }
  }
  return result;
}

Filter again to just get the India records:
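Taking the Replace Loop with Pipeline steps on acquireData to their conclusion, the loop can disappear altogether and the pipeline result can be assigned straight to the accumulator. The sketch below follows the stated mechanics; the final map that builds the result objects is inferred from the loop body, and the sample input is abbreviated from the CSV in the example.

```javascript
function acquireData(input) {
  const lines = input.split("\n");
  const result = lines
        .slice(1)                                        // skip the header line
        .filter(line => line.trim() !== "")              // drop blank lines
        .map(line => line.split(","))                    // split each line into fields
        .filter(record => record[1].trim() === "India")  // keep only India offices
        .map(record => ({city: record[0].trim(), phone: record[2].trim()}))
        ;
  return result;
}

// Abbreviated sample input, including a blank line.
const input = [
  "office, country, telephone",
  "Chicago, USA, +1 312 373 1000",
  "Bangalore, India, +91 80 4064 9570",
  "",
  "Chennai, India, +91 44 660 44766",
].join("\n");

console.log(acquireData(input));
```

Each pipeline stage corresponds to one bit of behavior that used to live in the loop, so the stages can be read from top to bottom in the same order as the original control flow.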
172 Chapter 7 Encapsulation Replace Loop with Pipeline 233
Starting at the top, take each bit of behavior in the loop and replace it with class Person…
a collection pipeline operation in the derivation of the loop collection variable. Test after each change.
Once all behavior is removed from the loop, remove it.
If it assigns to an accumulator, assign the pipeline result to the accumulator.

Example

I’ll begin with some data: a CSV file of data about our offices.

  office, country, telephone
  Chicago, USA, +1 312 373 1000
  Beijing, China, +86 4008 900 505
  Bangalore, India, +91 80 4064 9570
  Porto Alegre, Brazil, +55 51 3079 3550
  Chennai, India, +91 44 660 44766

  ... (more data follows)

The following function picks out the offices in India and returns their cities and telephone numbers:

  function acquireData(input) {
    const lines = input.split("\n");
    let firstLine = true;
    const result = [];
    for (const line of lines) {
      if (firstLine) {
        firstLine = false;
        continue;
      }
      if (line.trim() === "") continue;
      const record = line.split(",");
      if (record[1].trim() === "India") {
        result.push({city: record[0].trim(), phone: record[2].trim()});
      }
    }
    return result;
  }

I want to replace that loop with a collection pipeline.

My first step is to create a separate variable for the loop to work over.

  addCourse(aCourse) {
    this._courses.push(aCourse);
  }
  removeCourse(aCourse, fnIfAbsent = () => {throw new RangeError();}) {
    const index = this._courses.indexOf(aCourse);
    if (index === -1) fnIfAbsent();
    else this._courses.splice(index, 1);
  }

With a removal, I have to decide what to do if a client asks to remove an element that isn’t in the collection. I can either shrug, or raise an error. With this code, I default to raising an error, but give the callers an opportunity to do something else if they wish.

I then change any code that calls modifiers directly on the collection to use new methods.

  client code…
    for(const name of readBasicCourseNames(filename)) {
      aPerson.addCourse(new Course(name, false));
    }

With individual add and remove methods, there is usually no need for setCourses, in which case I’ll use Remove Setting Method (331) on it. Should the API need a setting method for some reason, I ensure it puts a copy of the collection in the field.

  class Person…
    set courses(aList) {this._courses = aList.slice();}

All this enables the clients to use the right kind of modifier methods, but I prefer to ensure nobody modifies the list without using them. I can do this by providing a copy.

  class Person…
    get courses() {return this._courses.slice();}

In general, I find it wise to be moderately paranoid about collections and I’d rather copy them unnecessarily than debug errors due to unexpected modifications. Modifications aren’t always obvious; for example, sorting an array in JavaScript modifies the original, while many languages default to making a copy for an operation that changes a collection. Any class that’s responsible for managing a collection should always give out copies, but I also get into the habit of making a copy if I do something that’s liable to change a collection.
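The encapsulated collection can be exercised end to end with a small runnable sketch. The `Course` class, the `Person` constructor, and the sample names below are assumptions added for illustration; only `addCourse`, `removeCourse`, and the copying `courses` getter come from the text.

```javascript
// Hypothetical minimal Course; only object identity matters for this sketch.
class Course {
  constructor(name, isAdvanced) {
    this._name = name;
    this._isAdvanced = isAdvanced;
  }
  get name() {return this._name;}
}

class Person {
  constructor(name) {
    this._name = name;
    this._courses = [];
  }
  // Hand out a copy so callers cannot change the internal list.
  get courses() {return this._courses.slice();}
  addCourse(aCourse) {this._courses.push(aCourse);}
  removeCourse(aCourse, fnIfAbsent = () => {throw new RangeError();}) {
    const index = this._courses.indexOf(aCourse);
    if (index === -1) fnIfAbsent();
    else this._courses.splice(index, 1);
  }
}

const aPerson = new Person("Pat");
aPerson.addCourse(new Course("Maths", false));

// Mutating the returned array only touches the copy.
aPerson.courses.push(new Course("Physics", true));
console.log(aPerson.courses.length);

// A caller can opt out of the default RangeError for absent elements.
let missed = 0;
aPerson.removeCourse(new Course("Art", false), () => {missed += 1;});
console.log(missed);
```

Because the getter returns a copy, the stray `push` has no effect on the person, and the only way to change the list is through the add and remove methods.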
174 Chapter 7 Encapsulation Replace Loop with Pipeline 231
client…
highPriorityCount = orders.filter(o => "high" === o.priority
|| "rush" === o.priority)
.length;
Whenever I’m fiddling with a data value, the first thing I do is use Encapsulate
Variable (132) on it.
class Order…
get priority() {return this._priority;}
set priority(aString) {this._priority = aString;}
The constructor line that initializes the priority will now use the setter I define here.
This self-encapsulates the field so I can preserve its current use while I manipulate the data itself.

I create a simple value class for the priority. It has a constructor for the value and a conversion function to return a string.

  class Priority {
    constructor(value) {this._value = value;}
    toString() {return this._value;}
  }

I prefer using a conversion function (toString) rather than a getter (value) here. For clients of the class, asking for the string representation should feel more like a conversion than getting a property.

I then modify the accessors to use this new class.

  class Order…
    get priority() {return this._priority.toString();}
    set priority(aString) {this._priority = new Priority(aString);}

Now that I have a priority class, I find the current getter on the order to be misleading. It doesn’t return the priority, but a string that describes the priority. My immediate move is to use Rename Function (124).

  class Order…
    get priorityString() {return this._priority.toString();}
    set priority(aString) {this._priority = new Priority(aString);}

  client…
    highPriorityCount = orders.filter(o => "high" === o.priorityString
                                        || "rush" === o.priorityString)
                              .length;

In this case, I’m happy to retain the name of the setter. The name of the argument communicates what it expects.

Now I’m done with the formal refactoring. But as I look at who uses the priority, I consider whether they should use the priority class themselves. As a result, I provide a getter on order that provides the new priority object directly.

  class Order…
    get priority() {return this._priority;}
    get priorityString() {return this._priority.toString();}
    set priority(aString) {this._priority = new Priority(aString);}

  client…
    highPriorityCount = orders.filter(o => "high" === o.priority.toString()
                                        || "rush" === o.priority.toString())
                              .length;

  for (const p of people) {
    if (p.age < youngest) youngest = p.age;
    totalSalary += p.salary;
  }

  return `youngestAge: ${youngest}, totalSalary: ${totalSalary}`;

With the loop copied, I need to remove the duplication that would otherwise produce wrong results. If something in the loop has no side effects, I can leave it there for now, but it’s not the case with this example.

  let youngest = people[0] ? people[0].age : Infinity;
  let totalSalary = 0;
  for (const p of people) {
    totalSalary += p.salary;
  }

  for (const p of people) {
    if (p.age < youngest) youngest = p.age;
  }

  return `youngestAge: ${youngest}, totalSalary: ${totalSalary}`;

Officially, that’s the end of the Split Loop refactoring. But the point of Split Loop isn’t what it does on its own but what it sets up for the next move, and I’m usually looking to extract the loops into their own functions. I’ll use Slide Statements (223) to reorganize the code a bit first.

  let totalSalary = 0;
  for (const p of people) {
    totalSalary += p.salary;
  }

  let youngest = people[0] ? people[0].age : Infinity;
  for (const p of people) {
    if (p.age < youngest) youngest = p.age;
  }

  return `youngestAge: ${youngest}, totalSalary: ${totalSalary}`;

Then I do a couple of Extract Function (106).
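For the Split Loop example (total salary and youngest age), one plausible result of extracting each loop into its own function is sketched below. The function names `totalSalary`, `youngestAge`, and `summary`, and the sample `people` array, are assumptions made for illustration rather than names fixed by the text.

```javascript
// Each loop now lives in a function that returns a single value.
function totalSalary(people) {
  let result = 0;
  for (const p of people) {
    result += p.salary;
  }
  return result;
}

function youngestAge(people) {
  let youngest = people[0] ? people[0].age : Infinity;
  for (const p of people) {
    if (p.age < youngest) youngest = p.age;
  }
  return youngest;
}

// Hypothetical wrapper that builds the same report string as the original code.
function summary(people) {
  return `youngestAge: ${youngestAge(people)}, totalSalary: ${totalSalary(people)}`;
}

// Hypothetical sample data.
const people = [
  {age: 42, salary: 5000},
  {age: 27, salary: 4000},
  {age: 35, salary: 6000},
];
console.log(summary(people)); // prints "youngestAge: 27, totalSalary: 15000"
```

Once each loop is isolated like this, it becomes a natural candidate for further simplification, since each function has only one thing to compute.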
228 Chapter 8 Moving Features Replace Primitive with Object 177
to understand both things. By splitting the loop, you ensure you only need to understand the behavior you need to modify.

Splitting a loop can also make it easier to use. A loop that calculates a single value can just return that value. Loops that do many things need to return structures or populate local variables. I frequently follow a sequence of Split Loop followed by Extract Function (106).

Many programmers are uncomfortable with this refactoring, as it forces you to execute the loop twice. My reminder, as usual, is to separate refactoring from optimization (Refactoring and Performance (64)). Once I have my code clear, I’ll optimize it, and if the loop traversal is a bottleneck, it’s easy to slam the loops back together. But the actual iteration through even a large list is rarely a bottleneck, and splitting the loops often enables other, more powerful, optimizations.

Mechanics

Copy the loop.
Identify and eliminate duplicate side effects.
Test.
When done, consider Extract Function (106) on each loop.

Example

I’ll start with a little bit of code that calculates the total salary and youngest age.

  let youngest = people[0] ? people[0].age : Infinity;
  let totalSalary = 0;
  for (const p of people) {
    if (p.age < youngest) youngest = p.age;
    totalSalary += p.salary;
  }

  return `youngestAge: ${youngest}, totalSalary: ${totalSalary}`;

It’s a very simple loop, but it’s doing two different calculations. To split them, I begin with just copying the loop.

  let youngest = people[0] ? people[0].age : Infinity;
  let totalSalary = 0;
  for (const p of people) {
    if (p.age < youngest) youngest = p.age;
    totalSalary += p.salary;
  }

As the priority class becomes useful elsewhere, I would allow clients of the order to use the setter with a priority instance, which I do by adjusting the priority constructor.

  class Priority…
    constructor(value) {
      if (value instanceof Priority) return value;
      this._value = value;
    }

The point of all this is that now, my new priority class can be useful as a place for new behavior, either new to the code or moved from elsewhere. Here’s some simple code to add validation of priority values and comparison logic:

  class Priority…
    constructor(value) {
      if (value instanceof Priority) return value;
      if (Priority.legalValues().includes(value))
        this._value = value;
      else
        throw new Error(`<${value}> is invalid for Priority`);
    }
    toString() {return this._value;}
    get _index() {return Priority.legalValues().findIndex(s => s === this._value);}
    static legalValues() {return ['low', 'normal', 'high', 'rush'];}
    equals(other) {return this._index === other._index;}
    higherThan(other) {return this._index > other._index;}
    lowerThan(other) {return this._index < other._index;}

As I do this, I decide that a priority should be a value object, so I provide an equals method and ensure that it is immutable.

Now I’ve added that behavior, I can make the client code more meaningful:

  client…
    highPriorityCount = orders.filter(o => o.priority.higherThan(new Priority("normal")))
                              .length;
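The Priority value object from the Replace Primitive with Object example can be tried out in isolation with the sketch below. The `Object.freeze` call is an assumption about one simple way to keep the object immutable, and the `orders` array is hypothetical sample data; the methods themselves follow the text.

```javascript
class Priority {
  constructor(value) {
    // Returning an object from a constructor makes `new` yield that object,
    // so an existing Priority is passed through unchanged.
    if (value instanceof Priority) return value;
    if (!Priority.legalValues().includes(value)) {
      throw new Error(`<${value}> is invalid for Priority`);
    }
    this._value = value;
    Object.freeze(this); // assumption: one simple way to enforce immutability
  }
  toString() {return this._value;}
  get _index() {return Priority.legalValues().indexOf(this._value);}
  static legalValues() {return ["low", "normal", "high", "rush"];}
  equals(other) {return this._index === other._index;}
  higherThan(other) {return this._index > other._index;}
  lowerThan(other) {return this._index < other._index;}
}

// Hypothetical sample data; only the `priority` property matters here.
const orders = [
  {priority: new Priority("low")},
  {priority: new Priority("high")},
  {priority: new Priority("rush")},
  {priority: new Priority("normal")},
];

const highPriorityCount = orders
  .filter(o => o.priority.higherThan(new Priority("normal")))
  .length;

console.log(highPriorityCount); // prints 2
```

Comparison by index means the ordering lives in one place, `legalValues`, so adding a new level only requires changing that list.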
  get basePrice() {return this._quantity * this._itemPrice;}

  ...

Motivation

One use of temporary variables is to capture the value of some code in order to refer to it later in a function. Using a temp allows me to refer to the value while explaining its meaning and avoiding repeating the code that calculates it. But while using a variable is handy, it can often be worthwhile to go a step further and use a function instead.

If I’m working on breaking up a large function, turning variables into their own functions makes it easier to extract parts of the function, since I no longer need

  let averageAge = 0;
  let totalSalary = 0;
  for (const p of people) {
    averageAge += p.age;
    totalSalary += p.salary;
  }
  averageAge = averageAge / people.length;

  let totalSalary = 0;
  for (const p of people) {
    totalSalary += p.salary;
  }

  let averageAge = 0;
  for (const p of people) {
    averageAge += p.age;
  }
  averageAge = averageAge / people.length;

Motivation

You often see loops that are doing two different things at once just because they can do that with one pass through a loop. But if you’re doing two different things in the same loop, then whenever you need to modify the loop you have
The most important consequence of a test failure after a slide is to use smaller slides: Instead of sliding over ten lines, I’ll just pick five, or slide up to what I reckon is a dangerous line. It may also mean that the slide isn’t worth it, and I need to work on something else first.

Example: Sliding with Conditionals

I can also do slides with conditionals. This will either involve removing duplicate logic when I slide out of a conditional, or adding duplicate logic when I slide in.

Here’s a case where I have the same statements in both legs of a conditional:

  let result;
  if (availableResources.length === 0) {
    result = createResource();
    allocatedResources.push(result);
  } else {
    result = availableResources.pop();
    allocatedResources.push(result);
  }
  return result;

I can slide these out of the conditional, in which case they turn into a single statement outside of the conditional block.

  let result;
  if (availableResources.length === 0) {
    result = createResource();
  } else {
    result = availableResources.pop();
  }
  allocatedResources.push(result);
  return result;

In the reverse case, sliding a fragment into a conditional means repeating it in every leg of the conditional.

Further Reading

I’ve seen an almost identical refactoring under the name of Swap Statement [wake-swap]. Swap Statement moves adjacent fragments, but it only works with single-statement fragments. You can think of it as Slide Statements where both the sliding fragment and the slid-over fragment are single statements. This refactoring appeals to me; after all, I’m always going on about taking small steps (steps that may seem ridiculously small to those new to refactoring).

But I ended up writing this refactoring with larger fragments because that is what I do. I only move one statement at a time if I’m having difficulty with a larger slide, and I rarely run into problems with larger slides. With more messy code, however, smaller slides end up being easier.

to pass in variables into the extracted functions. Putting this logic into functions often also sets up a stronger boundary between the extracted logic and the original function, which helps me spot and avoid awkward dependencies and side effects.

Using functions instead of variables also allows me to avoid duplicating the calculation logic in similar functions. Whenever I see variables calculated in the same way in different places, I look to turn them into a single function.

This refactoring works best if I’m inside a class, since the class provides a shared context for the methods I’m extracting. Outside of a class, I’m liable to have too many parameters in a top-level function which negates much of the benefit of using a function. Nested functions can avoid this, but they limit my ability to share the logic between related functions.

Only some temporary variables are suitable for Replace Temp with Query. The variable needs to be calculated once and then only be read afterwards. In the simplest case, this means the variable is assigned to once, but it’s also possible to have several assignments in a more complicated lump of code, all of which has to be extracted into the query. Furthermore, the logic used to calculate the variable must yield the same result when the variable is used later, which rules out variables used as snapshots with names like oldAddress.

Mechanics

Check that the variable is determined entirely before it’s used, and the code that calculates it does not yield a different value whenever it is used.
If the variable isn’t read-only, and can be made read-only, do so.
Test.
Extract the assignment of the variable into a function.
If the variable and the function cannot share a name, use a temporary name for the function.
Ensure the extracted function is free of side effects. If not, use Separate Query from Modifier (306).
Test.
Use Inline Variable (123) to remove the temp.
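The Replace Temp with Query restriction on snapshot variables (names like oldAddress) is easier to see in code. The following is a minimal sketch with a hypothetical `Customer` class and `moveTo` method, neither of which appears in the text.

```javascript
// Hypothetical Customer class, used only to illustrate the snapshot case.
class Customer {
  constructor(address) {this._address = address;}
  get address() {return this._address;}

  moveTo(newAddress) {
    // A snapshot temp: it must keep the value from before the update.
    const oldAddress = this._address;
    this._address = newAddress;
    // Replacing oldAddress with the query `this.address` here would
    // report the new address twice, because the query's result has changed.
    return `moved from ${oldAddress} to ${this.address}`;
  }
}

const aCustomer = new Customer("12 Elm St");
const message = aCustomer.moveTo("7 Oak Ave");
console.log(message); // prints "moved from 12 Elm St to 7 Oak Ave"
```

The temp here is doing real work by freezing a past value, so it fails the first check in the mechanics and should stay a variable.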
Example

Here is a simple class:

  class Order…
    constructor(quantity, item) {
      this._quantity = quantity;
      this._item = item;
    }

    get price() {
      var basePrice = this._quantity * this._item.price;
      var discountFactor = 0.98;
      if (basePrice > 1000) discountFactor -= 0.03;
      return basePrice * discountFactor;
    }
  }

I want to replace the temps basePrice and discountFactor with methods.

Starting with basePrice, I make it const and run tests. This is a good way of checking that I haven’t missed a reassignment (unlikely in such a short function but common when I’m dealing with something larger).

  class Order…
    constructor(quantity, item) {
      this._quantity = quantity;
      this._item = item;
    }

    get price() {
      const basePrice = this._quantity * this._item.price;
      var discountFactor = 0.98;
      if (basePrice > 1000) discountFactor -= 0.03;
      return basePrice * discountFactor;
    }
  }

I then extract the right-hand side of the assignment to a getting method.

  class Order…
    get price() {
      const basePrice = this.basePrice;
      var discountFactor = 0.98;
      if (basePrice > 1000) discountFactor -= 0.03;
      return basePrice * discountFactor;
    }

    get basePrice() {
      return this._quantity * this._item.price;
    }

I do similar analysis with any code that doesn’t have side effects. So I can take line 2 (`const order = ...`) and move it down to above line 6 (`const units = ...`) without trouble.

In this case, I’m also helped by the fact that the code I’m moving over doesn’t have side effects either. Indeed, I can freely rearrange code that lacks side effects to my heart’s content, which is one of the reasons why wise programmers prefer to use side-effect-free code as much as possible.

There is a wrinkle here, however. How do I know that line 2 is side-effect-free? To be sure, I’d need to look inside retrieveOrder() to ensure there are no side effects there (and inside any functions it calls, and inside any functions its functions call, and so on). In practice, when working on my own code, I know that I generally follow the Command-Query Separation [mf-cqs] principle, so any function that returns a value is free of side effects. But I can only be confident of that because I know the code base; if I were working in an unknown code base, I’d have to be more cautious. But I do try to follow the Command-Query Separation in my own code because it’s so valuable to know that code is free of side effects.

When sliding code that has a side effect, or sliding over code with side effects, I have to be much more careful. What I’m looking for is interference between the two code fragments. So, let’s say I want to slide line 11 (`if (order.isRepeat) ...`) down to the end. I’m prevented from doing that by line 12 because it references the variable whose state I’m changing in line 11. Similarly, I can’t take line 13 (`chargeOrder(charge)`) and move it up because line 12 modifies some state that line 13 references. However, I can slide line 8 (`charge = baseCharge + ...`) over lines 9–11 because they don’t modify any common state.

The most straightforward rule to follow is that I can’t slide one fragment of code over another if any data that both fragments refer to is modified by either one. But that’s not a comprehensive rule; I can happily slide either of the following two lines over the other:

  a = a + 10;
  a = a + 5;

But judging whether a slide is safe means I have to really understand the operations involved and how they compose.

Since I need to worry so much about updating state, I look to remove as much of it as I can. So with this code, I’d be looking to apply Split Variable (240) on charge before I indulge in any sliding around of that code.

Here, the analysis is relatively simple because I’m mostly just modifying local variables. With more complex data structures, it’s much harder to be sure when I get interference. So tests play an important role: Slide the fragment, run tests, see if things break. If my test coverage is good, I can feel happy with the refactoring. But if tests aren’t reliable, I need to be more wary or, more likely, to improve the tests for the code I’m working on.
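As a rough sketch of what applying Split Variable to `charge` in the Slide Statements pricing fragment might produce, the code below gives each assignment its own `const`. The stub functions, their return values, and the name `undiscountedCharge` are all assumptions for illustration; the text does not show this step.

```javascript
// Hypothetical stand-ins for the helpers the fragment calls.
const charged = [];
function retrievePricingPlan() {
  return {base: 10, unit: 2, discountThreshold: 100, discountFactor: 0.5};
}
function retrieveOrder() {return {units: 120, isRepeat: true};}
function chargeOrder(amount) {charged.push(amount);}

const pricingPlan = retrievePricingPlan();
const order = retrieveOrder();
const baseCharge = pricingPlan.base;
const chargePerUnit = pricingPlan.unit;
const units = order.units;

// First use of the old `charge` variable, now a const with its own name.
const undiscountedCharge = baseCharge + units * chargePerUnit;

// The discount logic sits together and only touches `discount`.
const discountableUnits = Math.max(units - pricingPlan.discountThreshold, 0);
let discount = discountableUnits * pricingPlan.discountFactor;
if (order.isRepeat) discount += 20;

// Second use of the old `charge` variable: assigned exactly once.
const charge = undiscountedCharge - discount;
chargeOrder(charge);

console.log(charged); // prints [ 220 ]
```

With no reassignment of `charge`, the discount block no longer interferes with the base calculation, so it can slide freely or be extracted.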
A fragment cannot slide forwards beyond any element that references it.
A fragment cannot slide over any statement that modifies an element it references.
A fragment that modifies an element cannot slide over any other element that references the modified element.
Cut the fragment from the source and paste into the target position.
Test.
If the test fails, try breaking down the slide into smaller steps. Either slide over less code or reduce the amount of code in the fragment you’re moving.

Example

When sliding code fragments, there are two decisions involved: what slide I’d like to do and whether I can do it. The first decision is very context-specific. On the simplest level, I like to declare elements close to where I use them, so I’ll often slide a declaration down to its usage. But almost always I slide some code because I want to do another refactoring, perhaps to get a clump of code together to Extract Function (106).

Once I have a sense of where I’d like to move some code, the next part is deciding if I can do it. This involves looking at the code I’m sliding and the code I’m sliding over: Do they interfere with each other in a way that would change the observable behavior of the program?

Consider the following fragment of code:

   1 const pricingPlan = retrievePricingPlan();
   2 const order = retrieveOrder();
   3 const baseCharge = pricingPlan.base;
   4 let charge;
   5 const chargePerUnit = pricingPlan.unit;
   6 const units = order.units;
   7 let discount;
   8 charge = baseCharge + units * chargePerUnit;
   9 let discountableUnits = Math.max(units - pricingPlan.discountThreshold, 0);
  10 discount = discountableUnits * pricingPlan.discountFactor;
  11 if (order.isRepeat) discount += 20;
  12 charge = charge - discount;
  13 chargeOrder(charge);

The first seven lines are declarations, and it’s relatively easy to move these. For example, I may want to move all the code dealing with discounts together, which would involve moving line 7 (`let discount`) to above line 10 (`discount = ...`). Since a declaration has no side effects and refers to no other variable, I can safely move this forwards as far as the first line that references discount itself. This is also a common move: if I want to use Extract Function (106) on the discount logic, I’ll need to move the declaration down first.

I test, and apply Inline Variable (123).

  class Order…
    get price() {
      var discountFactor = 0.98;
      if (this.basePrice > 1000) discountFactor -= 0.03;
      return this.basePrice * discountFactor;
    }

I then repeat the steps with discountFactor, first using Extract Function (106).

  class Order…
    get price() {
      const discountFactor = this.discountFactor;
      return this.basePrice * discountFactor;
    }

    get discountFactor() {
      var discountFactor = 0.98;
      if (this.basePrice > 1000) discountFactor -= 0.03;
      return discountFactor;
    }

In this case I need my extracted function to contain both assignments to discountFactor. I can also set the original variable to be const.

Then, I inline:

  get price() {
    return this.basePrice * this.discountFactor;
  }
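Putting the Replace Temp with Query steps for the Order class together gives a version that can be run directly. The sample items and quantities below are hypothetical; the class body follows the getters described in that example, with `let` used in place of `var`.

```javascript
class Order {
  constructor(quantity, item) {
    this._quantity = quantity;
    this._item = item;
  }

  // Both former temps are now queries on the object.
  get price() {
    return this.basePrice * this.discountFactor;
  }

  get basePrice() {
    return this._quantity * this._item.price;
  }

  get discountFactor() {
    let discountFactor = 0.98;
    if (this.basePrice > 1000) discountFactor -= 0.03;
    return discountFactor;
  }
}

// Hypothetical items: any object with a numeric `price` will do.
const smallOrder = new Order(2, {price: 100});  // base price 200
const largeOrder = new Order(20, {price: 100}); // base price 2000
console.log(smallOrder.price, largeOrder.price);
```

Because `basePrice` and `discountFactor` are now methods, other methods on the class can reuse them without recomputing or passing values around.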
Replace Inline Code with Function Call

  let appliesToMass = false;
  for(const s of states) {
    if (s === "MA") appliesToMass = true;
  }

  appliesToMass = states.includes("MA");

Motivation

Functions allow me to package up bits of behavior. This is useful for understanding: a named function can explain the purpose of the code rather than its mechanics. It’s also valuable to remove duplication: Instead of writing the same code twice, I just call the function. Then, should I need to change the function’s implementation, I don’t have to track down similar-looking code to update all the changes. (I may have to look at the callers, to see if they should all use the new code, but that’s both less common and much easier.)

If I see inline code that’s doing the same thing that I have in an existing function, I’ll usually want to replace that inline code with a function call. The exception is if I consider the similarity to be coincidental, so that, if I change the function body, I don’t expect the behavior in this inline code to change. A guide to this is the name of the function. A good name should make sense in place of inline code I have. If the name doesn’t make sense, that may be because it’s a poor name (in which case I use Rename Function (124) to fix it) or because the function’s purpose is different to what I want in this case, so I shouldn’t call it.

I find it particularly satisfying to do this with calls to library functions; that way, I don’t even have to write the function body.

Mechanics

Replace the inline code with a call to the existing function.
Test.

Mechanics

Decide how to split the responsibilities of the class.
Create a new child class to express the split-off responsibilities.
If the responsibilities of the original parent class no longer match its name, rename the parent.
Create an instance of the child class when constructing the parent and add a link from parent to child.
Use Move Field (207) on each field you wish to move. Test after each move.
Use Move Function (198) to move methods to the new child. Start with lower-level methods (those being called rather than calling). Test after each move.
Review the interfaces of both classes, remove unneeded methods, change names to better fit the new circumstances.
Decide whether to expose the new child. If so, consider applying Change Reference to Value (252) to the child class.

Example

I start with a simple person class:

  class Person…
    get name() {return this._name;}
    set name(arg) {this._name = arg;}
    get telephoneNumber() {return `(${this.officeAreaCode}) ${this.officeNumber}`;}
    get officeAreaCode() {return this._officeAreaCode;}
    set officeAreaCode(arg) {this._officeAreaCode = arg;}
    get officeNumber() {return this._officeNumber;}
    set officeNumber(arg) {this._officeNumber = arg;}

Here, I can separate the telephone number behavior into its own class. I start by defining an empty telephone number class:

  class TelephoneNumber {
  }

That was easy! Next, I create an instance of telephone number when constructing the person:

  class Person…
    constructor() {
      this._telephoneNumber = new TelephoneNumber();
    }
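For the Extract Class example with Person and TelephoneNumber, a sketch of where the refactoring could end up, once the fields and the formatting method have been moved, looks like the following. The member names `areaCode`, `number`, and `toString` on `TelephoneNumber`, and the sample values, are assumptions; only the starting Person interface is given in the text.

```javascript
class TelephoneNumber {
  get areaCode() {return this._areaCode;}
  set areaCode(arg) {this._areaCode = arg;}
  get number() {return this._number;}
  set number(arg) {this._number = arg;}
  toString() {return `(${this.areaCode}) ${this.number}`;}
}

class Person {
  constructor() {
    this._telephoneNumber = new TelephoneNumber();
  }
  get name() {return this._name;}
  set name(arg) {this._name = arg;}
  // Person keeps its public interface and delegates to the new class.
  get officeAreaCode() {return this._telephoneNumber.areaCode;}
  set officeAreaCode(arg) {this._telephoneNumber.areaCode = arg;}
  get officeNumber() {return this._telephoneNumber.number;}
  set officeNumber(arg) {this._telephoneNumber.number = arg;}
  get telephoneNumber() {return this._telephoneNumber.toString();}
}

const aPerson = new Person();
aPerson.officeAreaCode = "312";
aPerson.officeNumber = "555-0142";
console.log(aPerson.telephoneNumber); // prints "(312) 555-0142"
```

Existing clients of Person see no change, while the telephone logic now has a home of its own that can be exposed or reused later.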
I test again to ensure this call is working properly, then move on to the next.
function renderPerson(outStream, person) {
outStream.write(`<p>${person.name}</p>\n`);
renderPhoto(outStream, person.photo);
zztmp(outStream, person.photo);
outStream.write(`<p>location: ${person.photo.location}</p>\n`);
}
Then I can delete the outer function, completing Inline Function (115).
Move Statements to Callers works well for small changes, but sometimes the boundaries between caller and callee need complete reworking. In that case, my best move is to use Inline Function (115) and then slide and extract new functions to form better boundaries.

Mechanics

In simple circumstances, where you have only one or two callers and a simple function to call from, just cut the first line from the called function and paste (and perhaps fit) it into the callers. Test and you’re done.
Otherwise, apply Extract Function (106) to all the statements that you don’t wish to move; give it a temporary but easily searchable name.
If the function is a method that is overridden by subclasses, do the extraction on all of them so that the remaining method is identical in all classes. Then remove the subclass methods.
Use Inline Function (115) on the original function.
Apply Change Function Declaration (124) on the extracted function to rename it to the original name.
Or to a better name, if you can think of one.

Example

Here’s a simple case: a function with two callers.

  function renderPerson(outStream, person) {
    outStream.write(`<p>${person.name}</p>\n`);
    renderPhoto(outStream, person.photo);
    emitPhotoData(outStream, person.photo);
  }

  function listRecentPhotos(outStream, photos) {
    photos
      .filter(p => p.date > recentDateCutoff())
      .forEach(p => {
        outStream.write("<div>\n");
        emitPhotoData(outStream, p);
        outStream.write("</div>\n");
      });
  }

  function emitPhotoData(outStream, photo) {
    outStream.write(`<p>title: ${photo.title}</p>\n`);
    outStream.write(`<p>date: ${photo.date.toDateString()}</p>\n`);
    outStream.write(`<p>location: ${photo.location}</p>\n`);
  }

I need to modify the software so that listRecentPhotos renders the location information differently while renderPerson stays the same. To make this change easier, I’ll use Move Statements to Callers on the final line.

Usually, when faced with something this simple, I’ll just cut the last line from renderPerson and paste it below the two calls. But since I’m explaining what to do in more tricky cases, I’ll go through the more elaborate but safer procedure.

My first step is to use Extract Function (106) on the code that will remain in emitPhotoData.

Inline Class

inverse of: Extract Class (182)

Mechanics

In the target class, create functions for all the public functions of the source class. These functions should just delegate to the source class.
Change all references to source class methods so they use the target class’s delegators instead. Test after each change.
Move all the functions and data from the source class into the target, testing after each move, until the source class is empty.
Delete the source class and hold a short, simple funeral service.

Example

Here’s a class that holds a couple of pieces of tracking information for a shipment.

  class TrackingInformation {
    get shippingCompany() {return this._shippingCompany;}
    set shippingCompany(arg) {this._shippingCompany = arg;}
    get trackingNumber() {return this._trackingNumber;}
    set trackingNumber(arg) {this._trackingNumber = arg;}
    get display() {
      return `${this.shippingCompany}: ${this.trackingNumber}`;
    }
  }

It’s used as part of a shipment class.

  class Shipment…
    get trackingInfo() {
      return this._trackingInformation.display;
    }
    get trackingInformation() {return this._trackingInformation;}
    set trackingInformation(aTrackingInformation) {
      this._trackingInformation = aTrackingInformation;
    }

While this class may have been worthwhile in the past, I no longer feel it’s pulling its weight, so I want to inline it into Shipment.

I start by looking at places that are invoking the methods of TrackingInformation.

  caller…
    aShipment.trackingInformation.shippingCompany = request.vendor;

I’m going to move all such functions to Shipment, but I do it slightly differently to how I usually do Move Function (198). In this case, I start by putting a delegating method into the shipment, and adjusting the client to call that.
  class Shipment…
    set shippingCompany(arg) {this._trackingInformation.shippingCompany = arg;}

  caller…
    aShipment.shippingCompany = request.vendor;

I do this for all the elements of tracking information that are used by clients. Once I’ve done that, I can move all the elements of the tracking information over into the shipment class.

I start by applying Inline Function (115) to the display method.

  class Shipment…
    get trackingInfo() {
      return `${this.shippingCompany}: ${this.trackingNumber}`;
    }

I move the shipping company field.

    get shippingCompany() {return this._trackingInformation._shippingCompany;}
    set shippingCompany(arg) {this._trackingInformation._shippingCompany = arg;}

I don’t use the full mechanics for Move Field (207) since in this case I only reference shippingCompany from Shipment which is the target of the move. I thus don’t need the steps that put a reference from the source to the target.

I continue until everything is moved over. Once I’ve done that, I can delete the tracking information class.

  class Shipment…
    get trackingInfo() {
      return `${this.shippingCompany}: ${this.trackingNumber}`;
    }
    get shippingCompany() {return this._shippingCompany;}
    set shippingCompany(arg) {this._shippingCompany = arg;}
    get trackingNumber() {return this._trackingNumber;}
    set trackingNumber(arg) {this._trackingNumber = arg;}

Move Statements to Callers

inverse of: Move Statements into Function (213)

  emitPhotoData(outStream, person.photo);

  function emitPhotoData(outStream, photo) {
    outStream.write(`<p>title: ${photo.title}</p>\n`);
    outStream.write(`<p>location: ${photo.location}</p>\n`);
  }

  emitPhotoData(outStream, person.photo);
  outStream.write(`<p>location: ${person.photo.location}</p>\n`);

  function emitPhotoData(outStream, photo) {
    outStream.write(`<p>title: ${photo.title}</p>\n`);
  }

Motivation

Functions are the basic building block of the abstractions we build as programmers. And, as with any abstraction, we don’t always get the boundaries right. As a code base changes its capabilities (as most useful software does) we often find our abstraction boundaries shift. For functions, that means that what might once have been a cohesive, atomic unit of behavior becomes a mix of two or more different things.

One trigger for this is when common behavior used in several places needs to vary in some of its calls. Now, we need to move the varying behavior out of the function to its callers. In this case, I’ll use Slide Statements (223) to get the varying behavior to the beginning or end of the function and then Move Statements to Callers. Once the varying code is in the caller, I can change it when necessary.
Now that I’ve done all the callers, I use Inline Function (115) on emitPhotoData:

  function zznew(p) {
    return [
      `<p>title: ${p.title}</p>`,
      `<p>location: ${p.location}</p>`,
      `<p>date: ${p.date.toDateString()}</p>`,
    ].join("\n");
  }

  function photoDiv(aPhoto) {
    return [
      "<div>",
      emitPhotoData(aPhoto),
      "</div>",
    ].join("\n");
  }

  function emitPhotoData(aPhoto) {
    return [
      `<p>title: ${aPhoto.title}</p>`,
      `<p>location: ${aPhoto.location}</p>`,
      `<p>date: ${aPhoto.date.toDateString()}</p>`,
    ].join("\n");
  }

I also make the parameter names fit my convention while I’m at it.

Hide Delegate

inverse of: Remove Middle Man (192)

  manager = aPerson.manager;

  class Person {
    get manager() {return this.department.manager;}

Motivation

One of the keys (if not the key) to good modular design is encapsulation. Encapsulation means that modules need to know less about other parts of the system. Then, when things change, fewer modules need to be told about the change, which makes the change easier to make.

When we are first taught about object orientation, we are told that encapsulation means hiding our fields. As we become more sophisticated, we realize there is more that we can encapsulate.

If I have some client code that calls a method defined on an object in a field of a server object, the client needs to know about this delegate object. If the delegate changes its interface, changes propagate to all the clients of the server that use the delegate. I can remove this dependency by placing a simple delegating method on the server that hides the delegate. Then any changes I make to the delegate propagate only to the server and not to the clients.
function photoDiv(p) {
return [
"<div>",
`<p>title: ${p.title}</p>`,
emitPhotoData(p),
"</div>",
].join("\n");
}
function emitPhotoData(aPhoto) {
const result = [];
result.push(`<p>location: ${aPhoto.location}</p>`);
Mechanics result.push(`<p>date: ${aPhoto.date.toDateString()}</p>`);
return result.join("\n");
For each method on the delegate, create a simple delegating method on the }
server. This code shows two calls to emitPhotoData, each preceded by a line of code that
Adjust the client to call the server. Test after each change. is semantically equivalent. I’d like to remove this duplication by moving the
title printing into emitPhotoData. If I had just the one caller, I would just cut and
If no client needs to access the delegate anymore, remove the server’s paste the code, but the more callers I have, the more I’m inclined to use a safer
accessor for the delegate. procedure.
Test. I begin by using Extract Function (106) on one of the callers. I’m extracting the
statements I want to move into emitPhotoData, together with the call to emitPhotoData
itself.
Example
function photoDiv(p) {
I start with a person and a department. return [
"<div>",
class Person… zznew(p),
constructor(name) { "</div>",
this._name = name; ].join("\n");
} }
get name() {return this._name;}
function zznew(p) {
get department() {return this._department;}
return [
set department(arg) {this._department = arg;}
`<p>title: ${p.title}</p>`,
emitPhotoData(p),
class Department…
].join("\n");
get chargeCode() {return this._chargeCode;} }
set chargeCode(arg) {this._chargeCode = arg;}
get manager() {return this._manager;} I can now look at the other callers of emitPhotoData and, one by one, replace the
set manager(arg) {this._manager = arg;} calls and the preceding statements with calls to the new function.
Some client code wants to know the manager of a person. To do this, it needs function renderPerson(outStream, person) {
to get the department first. const result = [];
result.push(`<p>${person.name}</p>`);
client code… result.push(renderPhoto(person.photo));
manager = aPerson.department.manager; result.push(zznew(person.photo));
return result.join("\n");
This reveals to the client how the department class works and that the depart- }
ment is responsible for tracking the manager. I can reduce this coupling by
214 Chapter 8 Moving Features Hide Delegate 191
I move statements into a function when I can best understand these statements hiding the department class from the client. I do this by creating a simple
as part of the called function. If they don’t make sense as part of the called delegating method on person:
function, but still should be called with it, I’ll simply use Extract Function (106)
on the statements and the called function. That’s essentially the same process as class Person…
I describe below, but without the inline and rename steps. It’s not unusual to do get manager() {return this._department.manager;}
that and then, after later reflection, carry out those final steps. I now need to change all clients of person to use this new method:
client code…
Mechanics
manager = aPerson.department.manager;
If the repetitive code isn’t adjacent to the call of the target function, use Slide
Once I’ve made the change for all methods of department and for all the clients
Statements (223) to get it adjacent.
of person, I can remove the department accessor on person.
If the target function is only called by the source function, just cut the code
from the source, paste it into the target, test, and ignore the rest of these
mechanics.
If you have more callers, use Extract Function (106) on one of the call sites
to extract both the call to the target function and the statements you wish to
move into it. Give it a name that’s transient, but easy to grep.
Convert every other call to use the new function. Test after each conversion.
When all the original calls use the new function, use Inline Function (115) to
inline the original function completely into the new function, removing the
original function.
Rename Function (124) to change the name of the new function to the same
name as the original function.
Or to a better name, if there is one.
Example
I’ll start with this code to emit HTML for data about a photo:
function renderPerson(outStream, person) {
const result = [];
result.push(`<p>${person.name}</p>`);
result.push(renderPhoto(person.photo));
result.push(`<p>title: ${person.photo.title}</p>`);
result.push(emitPhotoData(person.photo));
return result.join("\n");
}
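
As a rough check on this photo example, the sketch below runs the starting code and the end state of Move Statements into Function side by side and compares their output. The `Before` and `After` suffixes and the sample photo are assumptions made for illustration; renderPerson is left out because the body of renderPhoto is not shown in the text.

```javascript
// Starting point: each caller emits the title itself.
function emitPhotoDataBefore(aPhoto) {
  const result = [];
  result.push(`<p>location: ${aPhoto.location}</p>`);
  result.push(`<p>date: ${aPhoto.date.toDateString()}</p>`);
  return result.join("\n");
}

function photoDivBefore(p) {
  return [
    "<div>",
    `<p>title: ${p.title}</p>`,
    emitPhotoDataBefore(p),
    "</div>",
  ].join("\n");
}

// End state: the title statement has moved into the called function.
function emitPhotoDataAfter(aPhoto) {
  return [
    `<p>title: ${aPhoto.title}</p>`,
    `<p>location: ${aPhoto.location}</p>`,
    `<p>date: ${aPhoto.date.toDateString()}</p>`,
  ].join("\n");
}

function photoDivAfter(aPhoto) {
  return [
    "<div>",
    emitPhotoDataAfter(aPhoto),
    "</div>",
  ].join("\n");
}

// Hypothetical sample data.
const photo = {title: "Harbor", location: "Boston", date: new Date(2018, 0, 15)};
const sameOutput = photoDivBefore(photo) === photoDivAfter(photo);
console.log(sameOutput);
```

A comparison like this is no substitute for the tests run after each step, but it shows the property the refactoring has to preserve: callers see the same HTML.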
Move Statements into Function

result.push(`<p>title: ${person.photo.title}</p>`);
result.concat(photoData(person.photo));

result.concat(photoData(person.photo));

function photoData(aPhoto) {
  return [
    `<p>title: ${aPhoto.title}</p>`,
    `<p>location: ${aPhoto.location}</p>`,
    `<p>date: ${aPhoto.date.toDateString()}</p>`,
  ];
}

Motivation

Removing duplication is one of the best rules of thumb of healthy code. If I see
the same code executed every time I call a particular function, I look to combine
that repeating code into the function itself. That way, any future modifications
to the repeating code can be done in one place and used by all the callers. Should
the code vary in the future, I can easily move it (or some of it) out again with
Move Statements to Callers (217).

Remove Middle Man

manager = aPerson.manager;

manager = aPerson.department.manager;

Motivation

In the motivation for Hide Delegate (189), I talked about the advantages of
encapsulating the use of a delegated object. There is a price for this. Every time the
client wants to use a new feature of the delegate, I have to add a simple delegating
method to the server. After adding features for a while, I get irritated with all
this forwarding. The server class is just a middle man (Middle Man (81)), and
perhaps it’s time for the client to call the delegate directly. (This smell often pops
up when people get overenthusiastic about following the Law of Demeter, which
I’d like a lot more if it were called the Occasionally Useful Suggestion of Demeter.)

It’s hard to figure out what the right amount of hiding is. Fortunately, with
Hide Delegate (189) and Remove Middle Man, it doesn’t matter so much. I can
adjust my code as time goes on. As the system changes, the basis for how much
I hide also changes. A good encapsulation six months ago may be awkward now.
Refactoring means I never have to say I’m sorry; I just fix it.
Mechanics

Create a getter for the delegate.

For each client use of a delegating method, replace the call to the delegating
method by chaining through the accessor. Test after each replacement.

If all calls to a delegating method are replaced, you can delete the delegating
method.
  With automated refactorings, you can use Encapsulate Variable (132) on the delegate
  field and then Inline Function (115) on all the methods that use it.

Example

I begin with a person class that uses a linked department object to determine a
manager. (If you’re reading this book sequentially, this example may look eerily
familiar.)

client code…
  manager = aPerson.manager;

class Person…
  get manager() {return this._department.manager;}

class Department…
  get manager() {return this._manager;}

First, I make an accessor for the delegate:

class Person…
  get department() {return this._department;}

Now I go to the clients one at a time and modify each to use the department
directly.

client code…
  manager = aPerson.department.manager;

Once I’ve done this with all the clients, I can remove the manager method
from Person. I can repeat this process for any other simple delegations on Person.

I can do a mixture here. Some delegations may be so common that I’d like to
keep them to make client code easier to work with. There is no absolute reason
why I should either hide a delegate or remove a middle man; particular
circumstances suggest which approach to take, and reasonable people can differ on
what works best.

If I have automated refactorings, then there’s a useful variation on these steps.
First, I use Encapsulate Variable (132) on department. This changes the manager getter
to use the public department getter:

class Person…
  get manager() {return this.department.manager;}

The change is rather too subtle in JavaScript, but by removing the underscore from department
I’m using the new getter rather than accessing the field directly.

Then I apply Inline Function (115) on the manager method to replace all the
callers at once.

Example: Moving to a Shared Object

Now, let’s consider a different case. Here’s an account with an interest rate:

class Account…
  constructor(number, type, interestRate) {
    this._number = number;
    this._type = type;
    this._interestRate = interestRate;
  }
  get interestRate() {return this._interestRate;}

class AccountType…
  constructor(nameString) {
    this._name = nameString;
  }

I want to change things so that an account’s interest rate is determined from
its account type.

The access to the interest rate is already nicely encapsulated, so I’ll just create
the field and an appropriate accessor on the account type.

class AccountType…
  constructor(nameString, interestRate) {
    this._name = nameString;
    this._interestRate = interestRate;
  }
  get interestRate() {return this._interestRate;}

But there is a potential problem when I update the accesses from account. Before
this refactoring, each account had its own interest rate. Now, I want all accounts to
share the interest rates of their account type. If all the accounts of the same type
already have the same interest rate, then there’s no change in observable behavior,
so I’m fine with the refactoring. But if there’s an account with a different interest
rate, it’s no longer a refactoring. If my account data is held in a database, I should
check the database to ensure that all my accounts have the rate matching their
type. I can also Introduce Assertion (302) in the account class.

class Account…
  constructor(number, type, interestRate) {
    this._number = number;
    this._type = type;
    assert(interestRate === this._type.interestRate);
    this._interestRate = interestRate;
  }
  get interestRate() {return this._interestRate;}

I might run the system for a while with this assertion in place to see if I get
an error. Or, instead of adding an assertion, I might log the problem. Once I’m
confident that I’m not introducing an observable change, I can change the access,
removing the update from the account completely.

class Account…
  constructor(number, type) {
    this._number = number;
    this._type = type;
  }
  get interestRate() {return this._type.interestRate;}
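
The check described for this interest rate example, that every account already carries the rate of its type, might be sketched as follows. The `mismatchedAccounts` helper and the in-memory sample records are hypothetical stand-ins for whatever query the real account store supports.

```javascript
class AccountType {
  constructor(nameString, interestRate) {
    this._name = nameString;
    this._interestRate = interestRate;
  }
  get interestRate() {return this._interestRate;}
}

// Hypothetical helper: list the records whose own rate differs from the
// rate of their account type. An empty result means the move preserves behavior.
function mismatchedAccounts(records) {
  return records.filter(r => r.interestRate !== r.type.interestRate);
}

// Made-up sample data standing in for rows read from a database.
const savings = new AccountType("savings", 0.02);
const records = [
  {number: 1, type: savings, interestRate: 0.02},
  {number: 2, type: savings, interestRate: 0.02},
  {number: 3, type: savings, interestRate: 0.025}, // moving the field would change this one
];

const offenders = mismatchedAccounts(records).map(r => r.number);
console.log(offenders);
```

If the list is not empty, moving the field changes observable behavior for those accounts, so the differences need a decision before the move can count as a refactoring.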
Remove the source field.

Test.

Example

I’m starting here with this customer and contract:

class CustomerContract…
  constructor(startDate) {
    this._startDate = startDate;
  }

I want to move the discount rate field from the customer to the customer
contract.

The first thing I need to do is use Encapsulate Variable (132) to encapsulate access
to the discount rate field.

class Customer…
  constructor(name, discountRate) {
    this._name = name;
    this._setDiscountRate(discountRate);
    this._contract = new CustomerContract(dateToday());
  }
  get discountRate() {return this._discountRate;}
  _setDiscountRate(aNumber) {this._discountRate = aNumber;}
  becomePreferred() {
    this._setDiscountRate(this.discountRate + 0.03);
    // other nice things
  }
  applyDiscount(amount) {
    return amount.subtract(amount.multiply(this.discountRate));
  }

I use a method to update the discount rate, rather than a property setter, as I
don’t want to make a public setter for the discount rate.

I add a field and accessors to the customer contract.

class CustomerContract…
  constructor(startDate, discountRate) {
    this._startDate = startDate;
    this._discountRate = discountRate;
  }
  get discountRate() {return this._discountRate;}
  set discountRate(arg) {this._discountRate = arg;}

I now modify the accessors on customer to use the new field. When I did that,
I got an error: “Cannot set property ’discountRate’ of undefined”. This was because
_setDiscountRate was called before I created the contract object in the constructor.
To fix that, I first reverted to the previous state, then used Slide Statements (223) to
move the _setDiscountRate after creating the contract.

class Customer…
  constructor(name, discountRate) {
    this._name = name;
    this._contract = new CustomerContract(dateToday());
    this._setDiscountRate(discountRate);
  }

I tested that, then changed the accessors again to use the contract.

class Customer…
  get discountRate() {return this._contract.discountRate;}
  _setDiscountRate(aNumber) {this._contract.discountRate = aNumber;}

Since I’m using JavaScript, there is no declared source field, so I don’t need to
remove anything further.

Changing a Bare Record

Substitute Algorithm

function foundPerson(people) {
  for(let i = 0; i < people.length; i++) {
    if (people[i] === "Don") {
      return "Don";
    }
    if (people[i] === "John") {
      return "John";
    }
    if (people[i] === "Kent") {
      return "Kent";
    }
  }
  return "";
}

function foundPerson(people) {
  const candidates = ["Don", "John", "Kent"];
  return people.find(p => candidates.includes(p)) || '';
}

Sometimes, when I want to change the algorithm to work slightly differently,
it’s easier to start by replacing it with something that would make my change
more straightforward to make.

When I have to take this step, I have to be sure I’ve decomposed the method
as much as I can. Replacing a large, complex algorithm is very difficult; only by
making it simple can I make the substitution tractable.
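
Before substituting an algorithm, it helps to confirm that the old and new versions agree. Here is a small sketch for the foundPerson example above; the two versions are renamed (hypothetically) to foundPersonLoop and foundPersonFind so they can coexist, and the inputs are made up.

```javascript
// Original algorithm: explicit loop with one test per name.
function foundPersonLoop(people) {
  for (let i = 0; i < people.length; i++) {
    if (people[i] === "Don") return "Don";
    if (people[i] === "John") return "John";
    if (people[i] === "Kent") return "Kent";
  }
  return "";
}

// Replacement algorithm: first element that is one of the candidates.
function foundPersonFind(people) {
  const candidates = ["Don", "John", "Kent"];
  return people.find(p => candidates.includes(p)) || '';
}

// Made-up inputs: empty list, no match, and matches in different positions.
const cases = [
  [],
  ["Alice"],
  ["Alice", "Kent", "Don"],
  ["John"],
  ["Bob", "Don", "John"],
];

// Collect every input on which the two versions disagree.
const disagreements = cases.filter(
  people => foundPersonLoop(people) !== foundPersonFind(people)
);
console.log(disagreements.length);
```

Running both versions over the same inputs is only a spot check, and it works best when the method has already been decomposed into something this small.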
Move Field

class Customer {
  get plan() {return this._plan;}
  get discountRate() {return this._discountRate;}

Motivation

Programming involves writing a lot of code that implements behavior, but the
strength of a program is really founded on its data structures. If I have a good
set of data structures that match the problem, then my behavior code is simple
and straightforward. But poor data structures lead to lots of code whose job is
merely dealing with the poor data. And it’s not just messier code that’s harder
to understand; it also means the data structures obscure what the program is
doing.

So, data structures are important, but like most aspects of programming they
are hard to get right. I do make an initial analysis to figure out the best data
structures, and I’ve found that experience and techniques like domain-driven
design have improved my ability to do that. But despite all my skill and experience,
I still find that I frequently make mistakes in that initial design. In the
process of programming, I learn more about the problem domain and my data
structures. A design decision that is reasonable and correct one week can become
wrong in another.

As soon as I realize that a data structure isn’t right, it’s vital to change it. If I
leave my data structures with their blemishes, those blemishes will confuse my
thinking and complicate my code far into the future.

I may seek to move data because I find I always need to pass a field from one
record whenever I pass another record to a function. Pieces of data that are always
passed to functions together are usually best put in a single record in order to
clarify their relationship. Change is also a factor; if a change in one record causes
a field in another record to change too, that’s a sign of a field in the wrong place.
If I have to update the same field in multiple structures, that’s a sign that it should
move to another place where it only needs to be updated once.

I usually do Move Field in the context of a broader set of changes. Once I’ve
moved a field, I find that many of the users of the field are better off accessing
that data through the target object rather than the original source. I then
change these with later refactorings. Similarly, I may find that I can’t do Move
Field at the moment due to the way the data is used. I need to refactor some
usage patterns first, then do the move.

In my description so far, I’m saying “record,” but all this is true of classes and
objects too. A class is a record type with attached functions, and these need to
be kept healthy just as much as any other data. The attached functions do make
it easier to move data around, since the data is encapsulated behind accessor
methods. I can move the data, change the accessors, and clients of the accessors
will still work. So, this is a refactoring that’s easier to do if you have classes, and
my description below makes that assumption. If I’m using bare records that don’t
support encapsulation, I can still make a change like this, but it is trickier.

Mechanics

Ensure the source field is encapsulated.

Test.

Create a field (and accessors) in the target.

Run static checks.

Ensure there is a reference from the source object to the target object.
  An existing field or method may give you the target. If not, see if you can easily
  create a method that will do so. Failing that, you may need to create a new field
  in the source object that can store the target. This may be a permanent change,
  but you can also do it temporarily until you have done enough refactoring in the
  broader context.

Test.

Chapter 8
Moving Features

So far, the refactorings have been about creating, removing, and renaming program
elements. Another important part of refactoring is moving elements between
contexts. I use Move Function (198) to move functions between classes and other
modules. Fields can move too, with Move Field (207).

I also move individual statements around. I use Move Statements into Function
(213) and Move Statements to Callers (217) to move them in or out of functions,
as well as Slide Statements (223) to move them within a function. Sometimes, I
can take some statements that match an existing function and use Replace Inline
Code with Function Call (222) to remove the duplication.

Two refactorings I often do with loops are Split Loop (227), to ensure a loop does
only one thing, and Replace Loop with Pipeline (231) to get rid of a loop entirely.

And then there’s the favorite refactoring of many a fine programmer: Remove
Dead Code (237). Nothing is as satisfying as applying the digital flamethrower to
superfluous statements.

Move Function

class Account {
  get overdraftCharge() {...}

Motivation

The heart of a good software design is its modularity, which is my ability to
make most modifications to a program while only having to understand a small
part of it. To get this modularity, I need to ensure that related software elements
are grouped together and the links between them are easy to find and understand.
But my understanding of how to do this isn’t static: as I better understand what
I’m doing, I learn how to best group together software elements. To reflect that
growing understanding, I need to move elements around.

All functions live in some context; it may be global, but usually it’s some form
of a module. In an object-oriented program, the core modular context is a class.
Nesting a function within another creates another common context. Different
languages provide varied forms of modularity, each creating a context for a
function to live in.

One of the most straightforward reasons to move a function is when it references
elements in other contexts more than the one it currently resides in.
Moving it together with those elements often improves encapsulation, allowing
other parts of the software to be less dependent on the details of this module.

Similarly, I may move a function because of where its callers live, or where I
need to call it from in my next enhancement. A function defined as a helper inside
another function may have value on its own, so it’s worth moving it to somewhere
more accessible. A method on a class may be easier for me to use if shifted to
another.
Deciding to move a function is rarely an easy decision. To help me decide, I
examine the current and candidate contexts for that function. I need to look at
what functions call this one, what functions are called by the moving function,
and what data that function uses. Often, I see that I need a new context for a
group of functions and create one with Combine Functions into Class (144) or Extract
Class (182). Although it can be difficult to decide where the best place for a
function is, the more difficult this choice, often the less it matters. I find it valuable
to try working with functions in one context, knowing I’ll learn how well they
fit, and if they don’t fit I can always move them later.

Mechanics

Examine all the program elements used by the chosen function in its current
context. Consider whether they should move too.
  If I find a called function that should also move, I usually move it first. That way,
  moving a cluster of functions begins with the one that has the least dependency
  on the others in the group.
  If a high-level function is the only caller of subfunctions, then you can inline those
  functions into the high-level method, move, and reextract at the destination.

Check if the chosen function is a polymorphic method.
  If I’m in an object-oriented language, I have to take account of super- and subclass
  declarations.

Copy the function to the target context. Adjust it to fit in its new home.
  If the body uses elements in the source context, I need to either pass those
  elements as parameters or pass a reference to that source context.
  Moving a function often means I need to come up with a different name that
  works better in the new context.

Example: Moving a Nested Function to Top Level

I’ll begin with a function that calculates the total distance for a GPS track record.

function trackSummary(points) {
  const totalTime = calculateTime();
  const totalDistance = calculateDistance();
  const pace = totalTime / 60 / totalDistance ;
  return {
    time: totalTime,
    distance: totalDistance,
    pace: pace
  };

  function calculateDistance() {
    let result = 0;
    for (let i = 1; i < points.length; i++) {
      result += distance(points[i-1], points[i]);
    }
    return result;
  }

  function distance(p1,p2) { ... }
  function radians(degrees) { ... }
  function calculateTime() { ... }

}

I’d like to move calculateDistance to the top level so I can calculate distances for
tracks without all the other parts of the summary.

I begin by copying the function to the top level.

function trackSummary(points) {
  const totalTime = calculateTime();
  const totalDistance = calculateDistance();
  const pace = totalTime / 60 / totalDistance ;
  return {
    time: totalTime,
    distance: totalDistance,
    pace: pace
  };

  function calculateDistance() {
    let result = 0;
    for (let i = 1; i < points.length; i++) {
      result += distance(points[i-1], points[i]);
    }
    return result;
  }
  ...

The first step is to look at the features that the overdraftCharge method uses and
consider whether it is worth moving a batch of methods together. In this case I
need the daysOverdrawn method to remain on the account class, because that will
vary with individual accounts.

Next, I copy the method body over to the account type and get it to fit.

class AccountType…
  overdraftCharge(daysOverdrawn) {
    if (this.isPremium) {
      const baseCharge = 10;
      if (daysOverdrawn <= 7)
        return baseCharge;
      else
        return baseCharge + (daysOverdrawn - 7) * 0.85;
    }
    else
      return daysOverdrawn * 1.75;
  }

In order to get the method to fit in its new location, I need to deal with two
call targets that change their scope. isPremium is now a simple call on this. With
daysOverdrawn I have to decide: do I pass the value or do I pass the account? For
the moment, I just pass the simple value but I may well change this in the future
if I require more than just the days overdrawn from the account, especially if
what I want from the account varies with the account type.

Next, I replace the original method body with a delegating call.

class Account…
  get bankCharge() {
    let result = 4.5;
    if (this._daysOverdrawn > 0) result += this.overdraftCharge;
    return result;
  }

  get overdraftCharge() {
    return this.type.overdraftCharge(this.daysOverdrawn);
  }

Then comes the decision of whether to leave the delegation in place or to inline
overdraftCharge. Inlining results in:

class Account…
  get bankCharge() {
    let result = 4.5;
    if (this._daysOverdrawn > 0)
      result += this.type.overdraftCharge(this.daysOverdrawn);
    return result;
  }

In the earlier steps, I passed daysOverdrawn as a parameter, but if there’s a lot of
data from the account to pass, I might prefer to pass the account itself.

class Account…
  get bankCharge() {
    let result = 4.5;
    if (this._daysOverdrawn > 0) result += this.overdraftCharge;
    return result;
  }

  get overdraftCharge() {
    return this.type.overdraftCharge(this);
  }

class AccountType…
  overdraftCharge(account) {
    if (this.isPremium) {
      const baseCharge = 10;
      if (account.daysOverdrawn <= 7)
        return baseCharge;
      else
        return baseCharge + (account.daysOverdrawn - 7) * 0.85;
    }
    else
      return account.daysOverdrawn * 1.75;
  }
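
To see the moved overdraftCharge method running, here is a self-contained sketch of the variant that passes the days overdrawn as a simple value and inlines the delegation. The constructors, the plain isPremium flag, and the sample numbers are assumptions for illustration; the text shows only the relevant fragments of each class.

```javascript
class AccountType {
  // Assumed constructor: the text only shows that isPremium exists.
  constructor(isPremium) {
    this._isPremium = isPremium;
  }
  get isPremium() {return this._isPremium;}

  // The method after the move: it now lives on the account type.
  overdraftCharge(daysOverdrawn) {
    if (this.isPremium) {
      const baseCharge = 10;
      if (daysOverdrawn <= 7)
        return baseCharge;
      else
        return baseCharge + (daysOverdrawn - 7) * 0.85;
    }
    else
      return daysOverdrawn * 1.75;
  }
}

class Account {
  // Assumed constructor, for illustration only.
  constructor(type, daysOverdrawn) {
    this._type = type;
    this._daysOverdrawn = daysOverdrawn;
  }
  get type() {return this._type;}
  get daysOverdrawn() {return this._daysOverdrawn;}

  // Inlined form: the account asks its type for the charge.
  get bankCharge() {
    let result = 4.5;
    if (this._daysOverdrawn > 0)
      result += this.type.overdraftCharge(this.daysOverdrawn);
    return result;
  }
}

// Made-up accounts: premium and 9 days overdrawn, standard and 2 days overdrawn.
const premium = new Account(new AccountType(true), 9);
const standard = new Account(new AccountType(false), 2);
console.log(premium.bankCharge, standard.bankCharge);
```

Switching to the pass-the-account variant would change only the overdraftCharge signature and its call site; the charges themselves stay the same.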
When I copy a function like this, I like to change the name so I can distinguish
them both in the code and in my head. I don’t want to think about what the
right name should be right now, so I create a temporary name.

The program still works, but my static analysis is rightly rather upset. The new
function has two undefined symbols: distance and points. The natural way to deal
with points is to pass it in as a parameter.

function top_calculateDistance(points) {
  let result = 0;
  for (let i = 1; i < points.length; i++) {
    result += distance(points[i-1], points[i]);
  }
  return result;
}

I could do the same with distance, but perhaps it makes sense to move it together
with calculateDistance. Here’s the relevant code:

function trackSummary…
  function distance(p1,p2) {
    // haversine formula see http://www.movable-type.co.uk/scripts/latlong.html
    const EARTH_RADIUS = 3959; // in miles
    const dLat = radians(p2.lat) - radians(p1.lat);
    const dLon = radians(p2.lon) - radians(p1.lon);
    const a = Math.pow(Math.sin(dLat / 2),2)
            + Math.cos(radians(p2.lat))
            * Math.cos(radians(p1.lat))
            * Math.pow(Math.sin(dLon / 2), 2);
    const c = 2 * Math.atan2(Math.sqrt(a), Math.sqrt(1-a));
    return EARTH_RADIUS * c;
  }
  function radians(degrees) {
    return degrees * Math.PI / 180;
  }

I can see that distance only uses radians and radians doesn’t use anything inside
its current context. So rather than pass the functions, I might as well move them
too. I can make a small step in this direction by moving them from their current
context to nest them inside the nested calculateDistance.

function trackSummary(points) {
  const totalTime = calculateTime();
  const totalDistance = calculateDistance();
  const pace = totalTime / 60 / totalDistance ;
  return {
    time: totalTime,
    distance: totalDistance,
    pace: pace
  };

  function calculateDistance() {
    let result = 0;
    for (let i = 1; i < points.length; i++) {
      result += distance(points[i-1], points[i]);
    }
    return result;

    function distance(p1,p2) { ... }
    function radians(degrees) { ... }

  }

By doing this, I can use both static analysis and testing to tell me if there
are any complications. In this case all is well, so I can copy them over to
top_calculateDistance.

function top_calculateDistance(points) {
  let result = 0;
  for (let i = 1; i < points.length; i++) {
    result += distance(points[i-1], points[i]);
  }
  return result;

  function distance(p1,p2) { ... }
  function radians(degrees) { ... }

}

Again, the copy doesn’t change how the program runs, but does give me an
opportunity for more static analysis. Had I not spotted that distance calls radians,
the linter would have caught it at this step.

Now that I have prepared the table, it’s time for the major change: the body
of the original calculateDistance will now call top_calculateDistance:

function trackSummary(points) {
  const totalTime = calculateTime();
  const totalDistance = calculateDistance();
  const pace = totalTime / 60 / totalDistance ;
  return {
    time: totalTime,
    distance: totalDistance,
    pace: pace
  };

  function calculateDistance() {
    return top_calculateDistance(points);
  }

This is the crucial time to run tests to fully test that the moved function has
bedded down in its new home.

With that done, it’s like unpacking the boxes after moving house. The first
thing is to decide whether to keep the original function that’s just delegating or
not. In this case, there are few callers and, as usual with nested functions, they
are highly localized. So I’m happy to get rid of it.

function trackSummary(points) {
  const totalTime = calculateTime();
  const totalDistance = top_calculateDistance(points);
  const pace = totalTime / 60 / totalDistance ;
  return {
    time: totalTime,
    distance: totalDistance,
    pace: pace
  };

Now is also a good time to think about what I want the name to be. Since the
top-level function has the highest visibility, I’d like it to have the best name.
totalDistance seems like a good choice. I can’t use that immediately since it will be
shadowed by the variable inside trackSummary, but I don’t see any reason to keep
that anyway, so I use Inline Variable (123) on it.

function trackSummary(points) {
  const totalTime = calculateTime();
  const pace = totalTime / 60 / totalDistance(points) ;
  return {
    time: totalTime,
    distance: totalDistance(points),
    pace: pace
  };

function trackSummary(points) { ... }
function totalDistance(points) { ... }
function distance(p1,p2) { ... }
function radians(degrees) { ... }

Some people would prefer to keep distance and radians inside totalDistance in order
to restrict their visibility. In some languages that may be a consideration, but
with ES 2015, JavaScript has an excellent module mechanism that’s the best tool
for controlling function visibility. In general, I’m wary of nested functions: they
too easily set up hidden data interrelationships that can get hard to follow.

Example: Moving between Classes

To illustrate this variety of Move Function, I’ll start here:

class Account…
  get bankCharge() {
    let result = 4.5;
    if (this._daysOverdrawn > 0) result += this.overdraftCharge;
    return result;
  }

  get overdraftCharge() {
    if (this.type.isPremium) {
      const baseCharge = 10;
      if (this.daysOverdrawn <= 7)
        return baseCharge;
      else
        return baseCharge + (this.daysOverdrawn - 7) * 0.85;
    }
    else
      return this.daysOverdrawn * 1.75;
  }

Coming up are changes that lead to different types of account having different
algorithms for determining the charge. Thus it seems natural to move overdraftCharge
to the account type class.
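
For the GPS track example above, the end state of the nested-function move can be exercised with a small runnable sketch. It uses the totalDistance name together with the distance and radians bodies from the earlier listing, all at top level; the sample track is made up, and trackSummary is omitted because the text elides the body of calculateTime.

```javascript
// Top-level function: usable without the rest of the track summary.
function totalDistance(points) {
  let result = 0;
  for (let i = 1; i < points.length; i++) {
    result += distance(points[i-1], points[i]);
  }
  return result;
}

function distance(p1, p2) {
  // haversine formula, as in the listing above
  const EARTH_RADIUS = 3959; // in miles
  const dLat = radians(p2.lat) - radians(p1.lat);
  const dLon = radians(p2.lon) - radians(p1.lon);
  const a = Math.pow(Math.sin(dLat / 2), 2)
          + Math.cos(radians(p2.lat))
          * Math.cos(radians(p1.lat))
          * Math.pow(Math.sin(dLon / 2), 2);
  const c = 2 * Math.atan2(Math.sqrt(a), Math.sqrt(1-a));
  return EARTH_RADIUS * c;
}

function radians(degrees) {
  return degrees * Math.PI / 180;
}

// Hypothetical track: three points, each one degree of latitude apart.
const track = [
  {lat: 0, lon: 0},
  {lat: 1, lon: 0},
  {lat: 2, lon: 0},
];
const miles = totalDistance(track);
console.log(miles.toFixed(1));
```

Because the points lie on one meridian, the result should match two degrees of arc on a sphere of the assumed radius, which gives a simple sanity check on the moved code.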