Notes taken from watching “Clean Code: Writing Code for Humans” by Cory House on Pluralsight
- “Programming is the art of telling another human what one wants the computer to do” – Donald Knuth
- “Any fool can write code that a computer can understand. Good programmers write code that humans can understand” – Martin Fowler
- We are authors, by definition. And each line of code that we write will be read on average 10+ times
- “Measuring programming progress by lines of code is like measuring aircraft building progress by weight.” – Bill Gates
- “Understanding the original programmer’s intent is the most difficult problem” – Fjelstad & Hamlen 1979
Principles for Writing Clean Code
1) Use the right tool
- Don’t give in to wanting to use one tool for everything, don’t be a fanatic
- Don’t be proud of your creativity or “hackery” when a better tool exists. All too often when we feel like this, it’s because we’ve selected the wrong tool.
- The boundaries between what each technology was designed to do are important.
- Stay Native! Keep JavaScript in .js files and HTML in .html files. Keep HTML out of the SQL database. Don’t use inline CSS styles for the same reason; it confuses semantic markup with style. Don’t use a server-side language for dynamic JavaScript.
- Avoid using one language to write another language/format via strings
- The benefits of “staying native” to the language are…
- any web file is cached by the browser, so it’s only requested once
- code coloring
- syntax checking
- separation of concerns
- reusable code
- avoids string parsing
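One way to picture “avoid using one language to write another via strings” is JSON built by hand versus JSON produced by a serializer. The sketch below is only an illustration: the `SpeakerDto` type, its data, and the `JsonChecker` helper are hypothetical, and it assumes .NET 5 or later (top-level statements, records, `System.Text.Json`).

```csharp
using System;
using System.Text.Json;

// Hypothetical data, not from the course.
var speaker = new SpeakerDto("Ada \"The Countess\" Lovelace", 3);

// Writing JSON via strings: the quotes inside the name break the document.
string handBuilt = "{\"Name\":\"" + speaker.Name + "\",\"Talks\":" + speaker.Talks + "}";

// Staying native: the serializer takes care of quoting and escaping.
string serialized = JsonSerializer.Serialize(speaker);

Console.WriteLine(JsonChecker.IsValidJson(handBuilt));   // False
Console.WriteLine(JsonChecker.IsValidJson(serialized));  // True

public record SpeakerDto(string Name, int Talks);

public static class JsonChecker
{
    // Returns true when the text parses as JSON.
    public static bool IsValidJson(string text)
    {
        try
        {
            using var document = JsonDocument.Parse(text);
            return true;
        }
        catch (JsonException)
        {
            return false;
        }
    }
}
```

The hand-built string looks fine until the data contains a quote, which is the kind of bug that string parsing and string building tend to hide.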
2) High Signal to Noise Ratio
- “Signal” is any logic that follows these points: 1) To the point, 2) Expressive, 3) Do one thing (have a clear responsibility and do it well)
- “Noise” is any logic that contradicts those points.
- When we read code our brain is the compiler. And our brain can only hold 5-9 items in short-term memory at a time
- Strive to keep about 7 chunks of information in scope at any point: “The Rule of Seven”
- Noise builds over time. The mess in a kitchen builds quietly; it does not appear all at once.
- DRY Principle – “Don’t repeat yourself”, which is the same principle as relational database normalization! Any piece of data or code should be in only one place. Copy-and-pasting code is often a sign of a design problem. Strive for the fewest lines of code possible in this regard, because it’s all code we’ll need to read and maintain over time. Studies have shown that fewer lines of code also have fewer bugs.
3) Self-Documenting Code
- Clear intent
- Layers of Abstraction so the problem can be looked over in various levels of detail
- Format for Readability
- Favor code over comments; don’t use comments to explain what is otherwise unnecessary ambiguity
Naming
- Imagine reading a book where every noun was a single letter. Or where every verb is one character. Could you follow along?
- You shouldn’t rely on context to be your syntax
- If you’re having a hard time coming up with a name, explain what the class or method does out loud, even if no one is there. This is also known as “Rubber Ducking”. By verbalizing, you can discover your solution more easily.
Classes
- A well-defined class should have a single responsibility, and its name should help reflect that
- Regularly ask yourself if the class is doing too much
- Guidelines:
- Noun
- Be Specific (specific names lead to smaller and more cohesive classes)
- Single Responsibility
- Avoid generic suffixes
Methods
- Poor method names require the reader to read the entire class before understanding what the method does
- You need a method name so descriptive that the reader need only read the name to know what it does
- Watch out for the words AND, IF, and OR in a name, because they usually imply that the method is doing too much. This is a sign you need to create two methods, not one. The same is true for naming things besides methods.
- Methods should not have any side effects! Each method should do exactly one thing, and nothing extra. Don’t lie about what the method really does to your readers.
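A small sketch of the naming and side-effect points above, using a hypothetical `Cart` class that is not from the course. A single `GetTotalAndApplyDiscount()` would need “And” in its name, so it is split into a method that only reads and a method that only changes state. It assumes C# 9 or later for the top-level statements.

```csharp
using System;
using System.Collections.Generic;

var cart = new Cart();
cart.AddItem(20m);
cart.AddItem(5m);

// GetTotal only reads, so calling it twice gives the same answer.
Console.WriteLine(cart.GetTotal()); // 25
Console.WriteLine(cart.GetTotal()); // 25

// The change of state is a separate, honestly named method.
cart.ApplyDiscount(5m);
Console.WriteLine(cart.GetTotal()); // 20

public class Cart
{
    private readonly List<decimal> prices = new List<decimal>();
    private decimal discount;

    public void AddItem(decimal price) => prices.Add(price);

    // Does one thing: records the discount.
    public void ApplyDiscount(decimal amount) => discount = amount;

    // Does one thing: computes the total, with no hidden changes.
    public decimal GetTotal()
    {
        decimal total = 0m;
        foreach (decimal price in prices)
        {
            total += price;
        }
        return total - discount;
    }
}
```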
Abbreviations
- It’s not the 80s, we have the space and the computing power!
- Almost all abbreviations are inconsistent and can be used for more things than one.
- Because we talk about our code, having abbreviations just gets in the way and the code doesn’t flow off the tongue
Booleans
- Well-named booleans should seem like they’re asking a true-false question. It should read like spoken word.
- Good question-format booleans: isOpen, isActive, loggedIn, and done all imply either true or false.
Symmetry
- When selecting variable names, use clear opposites for matching pairs. Ex. on/off, fast/slow, lock/unlock, min/max
Conditionals
- Basic Principles:
- Clearly convey intent
- It’s not enough to understand that we have to go right vs left at a fork; you have to understand why there’s a fork to begin with!
- Use the right tool
- Think about the reader when selecting the approach to a conditional
- Bite-size logic
- Sometimes code isn’t the answer
- Clearly convey intent
- Compare booleans implicitly. There are fewer lines, no separate initialization, no repetition, and it reads like speech. Dirty code spoken aloud often sounds wordy and strange.
    if (loggedIn == true) {}
    // vs
    if (loggedIn) {}
    // ----------------------------------
    bool goingToChipotleForLunch;
    if (cashInWallet > 6.00)
    {
        goingToChipotleForLunch = true;
    }
    else
    {
        goingToChipotleForLunch = false;
    }
    // vs
    bool goingToChipotleForLunch = cashInWallet > 6.00;
- Positive Conditionals
- Don’t use double negatives like !isNotLoggedIn; they reduce readability
- Negative conditionals are often a sign of “programming by accident”
- Ternary is elegant.
- Far fewer lines, no repeating of the variable (repetition allows for more mistakes), and no dedicated line for initialization.
- Follows the DRY principle (Don’t Repeat Yourself)
- You may end up needing a true if/else in the future, but don’t pre-plan for that, rather, just put it in when you truly need it. Follow the YAGNI principle. (You Ain’t Gonna Need It)
- Don’t chain Ternary operators or the point of readability is lost
    int registrationFee;
    if (isSpeaker)
    {
        registrationFee = 0;
    }
    else
    {
        registrationFee = 50;
    }
    // vs
    int registrationFee = isSpeaker ? 0 : 50;
- Don’t use string values as types. Use an enum instead
- This can guarantee a single string is only used once
- Without this, a typo will still compile, but the code will have a bug in production. With an enum, your compiler would warn you immediately.
- Allows IntelliSense support
- Helps document potential state
- It’s now searchable. Searching for the plain word manager turns up tons of unrelated results; searching for the enum member finds only its uses.
    if (employeeType == "manager")
    // vs
    if (employeeType == Employee.Manager)
- Don’t use magic numbers
- Make the meaning obvious to the reader
- Don’t require the meaning to be left up to the reader
- Use either a constant variable or enum to make it clear what the number is. Then don’t use that number again.
    if (age > 21) {}
    // vs
    const int legalDrinkingAge = 21;
    if (age > legalDrinkingAge) {}
    //----------------------------------
    if (status == 2) {}
    // vs
    if (status == Status.Active) {}
- Complex Conditionals
- What question is this conditional trying to answer?
- Techniques to eliminate complex conditionals:
- Intermediate Variables
- Displays why complexity is there and why it’s needed
- Displays clearly what the complexity is trying to answer
- Containerizes the complexity into something simpler
- Encapsulate complexity in a function
- Export the complexity to a scope that is detached so it is easier to comprehend.
- The name of the function works to clarify the intent of the code
- Adds layers of abstraction that will help readers speed through the code and determine what is important to their task
    if (employee.Age > 55 && employee.YearsEmployed > 10 && employee.IsRetired == true) {}
    // vs
    bool eligibleForPension = employee.Age > MinRetirementAge
        && employee.YearsEmployed > MinPensionEmploymentYears
        && employee.IsRetired;
    if (eligibleForPension) {}
    // ----------------------------------
    if ((fileExtension == "mp4" || fileExtension == "mpg" || fileExtension == "avi") && (isAdmin || isActiveFile)) {}
    // vs
    private bool ValidFileRequest(string fileExtension, bool isActiveFile, bool isAdmin)
    {
        bool validFileType = fileExtension == "mp4" || fileExtension == "mpg" || fileExtension == "avi";
        bool userIsAllowedToViewFile = isAdmin || isActiveFile;
        return validFileType && userIsAllowedToViewFile;
    }
    if (ValidFileRequest(fileExtension, isActiveFile, isAdmin)) {}
    // the above code still has two logic ideas that could further be abstracted: 1) what a valid file extension is, 2) if a file can be viewed. These could become their own functions to further clarify.
- Favor Polymorphism over Enums for Behavior
- If certain things are constantly being checked and repeated, it may be a sign to put that code elsewhere – in the object of the thing that is being checked! Then, the conditionals will run on their own without the need to be constantly checked by a different class.
- Reduce complexity by
    private void LoginUser(User user)
    {
        switch (user.Status)
        {
            case Status.Active:
                // logic
                break;
            case Status.Inactive:
                // logic
                break;
            case Status.Locked:
                // logic
                break;
        }
    }
    // vs
    private void LoginUser(User user)
    {
        user.Login();
    }
    // delegate this login function to the user class, which has three child classes. The correct Login() function is called based off what type of user it is.
    private abstract class User
    {
        public string FirstName;
        public string LastName;
        public Status Status;
        public int AccountBalance;
        public abstract void Login();
    }
    // each child class must mark Login() as override to implement the abstract method
    private class ActiveUser : User { public override void Login() {} }
    private class InactiveUser : User { public override void Login() {} }
    private class LockedUser : User { public override void Login() {} }
- Be declarative if possible
- Don’t be ad hoc with how you’re using code. Use the right tool for the job. Use a foreach() instead of a for(), or a .Where() instead of a foreach()
- Write what you want instead of writing what you’ll do to get what you want.
- This is more possible with tools like LINQ to Objects in C#, Lambdaj in Java, jLinq in JavaScript, and Pynq in Python.
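The difference between spelling out the steps and stating the result can be sketched with LINQ to Objects. The `User` record and `UserQueries` class are hypothetical names chosen for illustration; the sketch assumes C# 9 or later.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var users = new List<User> { new User("ann", true), new User("bob", false), new User("cy", true) };

Console.WriteLine(string.Join(", ", UserQueries.ActiveNamesImperative(users)));  // ann, cy
Console.WriteLine(string.Join(", ", UserQueries.ActiveNamesDeclarative(users))); // ann, cy

public record User(string Name, bool IsActive);

public static class UserQueries
{
    // Imperative: spells out how to build the result, step by step.
    public static List<string> ActiveNamesImperative(List<User> users)
    {
        var names = new List<string>();
        for (int i = 0; i < users.Count; i++)
        {
            if (users[i].IsActive)
            {
                names.Add(users[i].Name);
            }
        }
        return names;
    }

    // Declarative: states what is wanted and leaves the looping to LINQ.
    public static List<string> ActiveNamesDeclarative(List<User> users) =>
        users.Where(user => user.IsActive).Select(user => user.Name).ToList();
}
```

Both methods return the same list; the declarative one has no index, no temporary list, and no nesting for the reader to hold in mind.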
- Table Driven Methods
- Sometimes code isn’t the answer
- Sometimes things should be done in the database instead. Don’t write a huge conditional to calculate insurance rates. Just put the set rates into a database and query it. The calculation for the rates may be out of scope for your code. If the rate were to change, do you want to have to update everyone’s copy of the code? Or do a one-line change to a database?
- A table (database) driven approach is necessary in situations like…
- Insurance rates
- Pricing structures
- Complex and dynamic business rules.
- Allows your code to become dynamic
- Avoids hard coding
- Write less code
- Easily changeable without a code change/app deployment
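As a rough sketch of the table-driven idea, the lookup below replaces a chain of if/else branches with rows of data. The in-memory array is only a stand-in for a real database table, and the age bands and rates are made up for the example. It assumes C# 9 or later.

```csharp
using System;

Console.WriteLine(InsuranceRates.MonthlyRate(30)); // 90

public static class InsuranceRates
{
    // Stand-in for rows that would live in a database table (values are made up).
    // Each row: the lowest age the rate applies to, and the monthly rate.
    // Rows are ordered from the highest MinAge down.
    private static readonly (int MinAge, decimal Rate)[] RateTable =
    {
        (65, 150m),
        (40, 110m),
        (25, 90m),
        (16, 140m),
    };

    // Returns the rate of the first row whose MinAge the age reaches.
    public static decimal MonthlyRate(int age)
    {
        foreach (var row in RateTable)
        {
            if (age >= row.MinAge)
            {
                return row.Rate;
            }
        }
        throw new ArgumentOutOfRangeException(nameof(age), "No rate defined for this age.");
    }
}
```

Changing a rate or adding an age band means editing a row, not the lookup logic. With the rows in an actual database, that change would not need a deployment.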
Functions
When to create a function
- To avoid duplication
- any bug fix would otherwise need to be fixed in many places
- DRY is one of the most repeated principles in software engineering
- Look for patterns and redundant shapes in code
- Indentation is a sign of complexity
- Arrow code (code with so many nested if statements that the indentation is pushed outward into an arrow shape) is a big sign of high cyclomatic complexity, meaning that there are many paths through the same code.
- This hinders testing, unit testing, and bug-finding
- Studies show that comprehension decreases beyond three levels of nested ‘if’ blocks
- To fix arrow code, you can…
- Extract a Method – take the most deeply nested code and create a method with it. This also lets you give the code a descriptive method name. Book authors do the same thing when they move text into a footnote or appendix. Imagine the noisy mess a Wikipedia article would be if all the footnotes remained in the main text instead of being linked at the bottom of the page.
- Return Early – instead of using a bunch of nested if statements, write them inline without nesting them, then use return false to route what would otherwise be the false paths. “Use a return when it enhances readability…in certain routines, once you know the answer…not returning immediately means that you have to write more code” – Steve McConnell
- Fail Fast – throw an exception as soon as an unexpected situation occurs. Write “guard clauses” at the beginning of each function to tell early on whether or not the input is good. This creates a contract at the top of the method that the input must abide by to continue. This makes debugging VERY easy since you get an exception thrown under strange conditions.
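The return-early and fail-fast techniques above can be sketched side by side with the arrow code they replace. The `Registration` class and its minimum-age rule are hypothetical; the sketch assumes C# 9 or later.

```csharp
using System;

Console.WriteLine(Registration.CanRegister("ann@example.com", 30, false)); // True
Console.WriteLine(Registration.CanRegister("ann@example.com", 15, false)); // False

public static class Registration
{
    private const int MinimumAge = 18; // hypothetical rule

    // Arrow code: every condition pushes the real work one level deeper.
    public static bool CanRegisterNested(string email, int age, bool isBanned)
    {
        if (email != null)
        {
            if (age >= MinimumAge)
            {
                if (!isBanned)
                {
                    return true;
                }
            }
        }
        return false;
    }

    // Guard clause plus early returns: same answers for non-null input, no nesting.
    public static bool CanRegister(string email, int age, bool isBanned)
    {
        // Fail fast: a missing email is treated as a caller bug, not as a "no".
        if (email == null) throw new ArgumentNullException(nameof(email));

        if (age < MinimumAge) return false;
        if (isBanned) return false;

        return true;
    }
}
```

The two versions differ on purpose for a null email: the nested one quietly returns false, while the guarded one throws so the bad input is noticed where it happens.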
- Convey intent
- Comments aren’t necessary if you have clear variable names, well-named functions that logically separate the code, and explanatory intermediate variables.
- To do one thing, and to do it well
- Could you read a book with no paragraphs?
- A concise function will
- Aid the reader
- Promote reuse – smaller functions that do less can be used more in the future than large functions that do many things.
- Ease naming and testing
- Avoid side-effects
Mayfly Variables
- If all the variables are declared at the top of the code, then it violates the aforementioned “rule of seven”
- Well-structured functions should only contain “mayfly” variables: like the insect that lives only a few hours, these variables live for only a few lines.
- Initialize the variable just in time
- Do one thing. Short functions make local variables come and go in a flash, becoming a mayfly variable
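A short sketch of what mayfly variables can look like in practice. The `OrderReport` class is a hypothetical example; each variable is declared just before its first use, and the helper’s local goes away as soon as the helper returns. It assumes C# 9 or later.

```csharp
using System;
using System.Collections.Generic;

Console.WriteLine(OrderReport.Summarize(new List<decimal> { 10m, 20m, 30m })); // 3 orders, total 60

public static class OrderReport
{
    public static string Summarize(List<decimal> orderTotals)
    {
        // Declared just in time, used immediately, gone two lines later.
        int count = orderTotals.Count;
        decimal total = Sum(orderTotals);
        return $"{count} orders, total {total}";
    }

    private static decimal Sum(List<decimal> values)
    {
        decimal sum = 0m; // lives only for the few lines below
        foreach (decimal value in values)
        {
            sum += value;
        }
        return sum;
    }
}
```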
Parameters
- Strive for 0-2 parameters.
- Too many parameters is a sign that a function needs to be split up
- A small quantity of parameters makes the code easier to understand and test.
- Flag arguments (boolean parameters) are often a sign that the function does two things and should be split into two functions.
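To illustrate the point about flag arguments, here is a minimal sketch with a hypothetical `Greeter` class. The boolean version is kept only for contrast; the two named functions say at the call site what the flag used to hide. It assumes C# 9 or later.

```csharp
using System;

Console.WriteLine(Greeter.FormalGreeting("Ada")); // Good day, Ada.
Console.WriteLine(Greeter.CasualGreeting("Ada")); // Hi Ada!

public static class Greeter
{
    // Flag argument: a call like Greeting("Ada", true) does not say what true means.
    public static string Greeting(string name, bool formal)
    {
        if (formal)
        {
            return FormalGreeting(name);
        }
        return CasualGreeting(name);
    }

    // Split into two functions whose names carry the meaning of the flag.
    public static string FormalGreeting(string name) => $"Good day, {name}.";

    public static string CasualGreeting(string name) => $"Hi {name}!";
}
```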
Signs a Function is Too Long
- When whitespace and comments are used to logically divide a function
- When scrolling is required
- When you can’t come up with a perfect name for the function to describe what it does exactly
- When there are multiple conditionals. This is a sign the function could be divided into at least two functions
- When it is hard to digest. When it is hard to read or hard to debug, it is often a sign that there aren’t as many layers of abstraction as necessary.
- Some of Robert Martin’s Guidelines:
- Rarely be over 20 lines
- Hardly ever be over 100 lines
- No more than 3 parameters
- Simple functions can be longer, complex functions should be short.
Exceptions
- Kinds of exceptions
- Unrecoverable – null reference, file not found, access denied
- Recoverable – retry connection, try different file, wait and try again. There is a point, though, when your app should give up on retries so there’s not an infinite loop.
- Ignorable – for example, a failure while logging a click. These are rare; you’re ignoring, or “swallowing”, the exception for a reason.
- You should never catch an exception that you can’t handle intelligently. Let it “bubble up”
- Thus, the correct behavior for a broken application is to crash immediately. An application that limps along with buggy or unresponsive parts is a danger to its data and platform.
- Logging an error alone is not enough. In the below example, a speaker could receive an email that they’re registered when they’re actually not!
    try
    {
        RegisterSpeaker();
    }
    catch (Exception e)
    {
        LogError(e);
    }
    EmailSpeaker();
    // vs
    RegisterSpeaker();
    EmailSpeaker();
- Use a function for the code inside the try block to make it more readable and concise
    try
    {
        //many
        //lines
        //of
        //complicated
        //and
        //verbose
        //logic
        //here
    }
    catch (ArgumentOutOfRangeException)
    {
        //do something here
    }
    // vs
    try
    {
        SaveThePlanet();
    }
    catch (ArgumentOutOfRangeException)
    {
        //do something here
    }
    private void SaveThePlanet()
    {
        //many
        //lines
        //of
        //complicated
        //and
        //verbose
        //logic
        //here
    }
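For the recoverable kind of exception described above, a bounded retry is one possible shape. This is a sketch under assumptions: the `Retry` helper is hypothetical, `TimeoutException` is treated as the recoverable case purely for illustration, and the sketch assumes C# 9 or later.

```csharp
using System;

int calls = 0;
string result = Retry.Execute(() =>
{
    calls++;
    if (calls < 3) throw new TimeoutException("simulated transient failure");
    return "connected";
}, maxAttempts: 5);

Console.WriteLine($"{result} after {calls} attempts"); // connected after 3 attempts

public static class Retry
{
    // Retries only TimeoutException (treated here as recoverable) and gives up
    // after maxAttempts, letting the last exception bubble up to the caller.
    public static T Execute<T>(Func<T> action, int maxAttempts)
    {
        if (maxAttempts < 1) throw new ArgumentOutOfRangeException(nameof(maxAttempts));

        for (int attempt = 1; ; attempt++)
        {
            try
            {
                return action();
            }
            catch (TimeoutException) when (attempt < maxAttempts)
            {
                // Recoverable: fall through to the next attempt.
                // A real application would usually wait before retrying.
            }
        }
    }
}
```

Any other exception type is not caught at all, which matches the advice to let exceptions you can’t handle intelligently bubble up.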
Classes
- Classes are like headings in a book
- Show the high-level intent of what’s inside
When to create a class
- To model an object – abstract or real world
- To increase cohesion – low cohesion is a sign that a class needs to be divided to become more targeted
- To promote reuse
- To reduce complexity. Solve once, hide away
- To clarify parameters – identify a group of data. If you need to pass the same parameter into many functions, it’s a sign that those functions and that parameter should all be in a class together.
Cohesion
- Class responsibilities should be strongly-related.
- Cohesion is the measure of how related a class’s functions and responsibilities are to one another.
- This enhances readability when the class name truly describes everything it does.
- This also increases likelihood of reuse, and reduces the chance that a future developer will reinvent the wheel because your class wasn’t clear enough
- Avoids attracting the lazy developers who don’t want to put their code where it should be
- Specific names lead to smaller more cohesive classes
- To avoid creating low-cohesion classes, watch out for…
- Methods that don’t interact with the rest of the class
- Fields that are only used by one method
- Classes that change often. If there are many more commits than average to a single class, that’s a sign it’s doing too much.
When is a class too small?
- Inappropriate intimacy between two classes
- Feature envy from one class to another
- Too many pieces
Primitive Obsession
- When a large number of primitives are used instead of an object that will encapsulate them all
- When you use objects instead of primitives, you…
- Help readers conceptualize
- Define explicit business properties and rules instead of having them be implicit within the primitives
- Encapsulate data within a single point
- Aid maintenance, especially while searching code for a single string
    private void SaveUser(string firstName, string lastName, string state, string zip, string eyeColor, string phone, string fax, string maidenName)
    // vs
    private void SaveUser(User user)
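Taking one of those string parameters as an example, the sketch below shows how an object can make a business rule explicit. The `ZipCode` class and its five-digit rule are a simplified, hypothetical assumption (US-only, no ZIP+4), not something stated in the course. It assumes C# 9 or later.

```csharp
using System;

var zip = new ZipCode("90210");
Console.WriteLine(zip.Value); // 90210

// The rule "a zip code is exactly five digits" lives in one place
// instead of in every method that accepts a plain string.
public sealed class ZipCode
{
    public string Value { get; }

    public ZipCode(string value)
    {
        if (value == null) throw new ArgumentNullException(nameof(value));
        if (value.Length != 5 || !IsAllDigits(value))
        {
            throw new ArgumentException("A zip code must be exactly five digits.", nameof(value));
        }
        Value = value;
    }

    private static bool IsAllDigits(string text)
    {
        foreach (char c in text)
        {
            if (c < '0' || c > '9') return false;
        }
        return true;
    }
}
```

A method that takes a `ZipCode` can no longer be handed a phone number or an eye color by mistake, which a list of string parameters allows.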
Principle of Proximity
- Strive to make code read top to bottom when possible
- Keep related actions together
The Outline Rule
- Collapsed code should read like an outline
- Strive for multiple layers of abstraction
- With this method, you may have methods that are nothing more than a list of lower-level function calls, and that’s okay
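A minimal sketch of the outline rule, using a hypothetical `Signup` class: the public method is nothing but a list of lower-level calls, so collapsing the helpers leaves something that reads like an outline. The email check is deliberately naive and only there for illustration. It assumes C# 9 or later.

```csharp
using System;

Console.WriteLine(Signup.NormalizeEmail("  Ada@Example.COM ")); // ada@example.com

public static class Signup
{
    // High level: reads like an outline of the steps.
    public static string NormalizeEmail(string rawEmail)
    {
        string trimmed = RemoveSurroundingWhitespace(rawEmail);
        EnsureLooksLikeEmail(trimmed);
        return Lowercase(trimmed);
    }

    // Lower level: each helper holds one detail.
    private static string RemoveSurroundingWhitespace(string text) => text.Trim();

    private static void EnsureLooksLikeEmail(string text)
    {
        // Naive check, for illustration only.
        if (!text.Contains("@"))
        {
            throw new ArgumentException("Not an email address.", nameof(text));
        }
    }

    private static string Lowercase(string text) => text.ToLowerInvariant();
}
```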
Comments
- Comments are either signal or noise.
- It’s a bad sign when all someone can say about clean code is to use comments
- There must be a justifiable reason for a comment to exist, and it should only be written after weighing the alternatives
- Prefer expressive code over comments. Code is kept up to date and is obviously the reference for what code is doing – not comments.
- Use comments when the code alone can’t be sufficient
- Avoid redundant comments; they break the DRY principle. Assume that your reader can read code, so obvious things don’t need comments.
- Avoid comments that show intent if that intent can be expressed in code, for example through a well-named constant, an enum, a function name, an intermediate variable, or a conditional extracted into its own function.
- Apology comments are literal malpractice in the software industry. Don’t apologize, just fix it before commit/merge or add a TODO marker comment
- Warning comments are similar, but the developer isn’t even going to apologize.
- Zombie code is code that has been commented out and is just taking up space. It gets in the way of maintenance and searching the code for a string. Kill Zombie Code
- There are two root causes of zombie code: 1) Risk Aversion, keeping old code in case it’s needed, even though source control can always get that code back, and 2) Hoarding Mentality, hoarding code “just in case it’s needed”
- Zombie code is directly opposed to comprehension and slows the reader down. It’s just visual noise. Just imagine if the New York Times final product had scribbles, comments, and lines all over it. We would not stand for it!
- Commented out code hinders debugging because it is ambiguous! Maybe the next developer will think that re-adding the code will fix a bug in that area! Probably not. The next developer might think “What did this section do?”, “Was this accidentally commented out?”, “Who did this?”. Then after a change to the uncommented code, the next developer might think “Do I need to refactor this too?”, “How does my change impact this code?”, “What if someone uncomments it later?”
- Divider Comments show a need to refactor the code
- Brace Tracker Comments (comments that follow the braces in order to show the flow of the code) are easily avoidable by refactoring to a function so that the entirety of the inner-brace statement can be fit on the screen at once and the brace tracker comment is not needed.
- Bloated Header – a huge header at the top of the file that may be needed. These can be cleaned up by avoiding line-ending characters, not repeating yourself, and following the language’s style conventions.
- Defect Log comments (or comments pointing out where a bug used to be) aren’t useful when source control is used. Change metadata belongs in source control, not code. Imagine reading a book where the author left in their notes on how they fixed logical fallacies and typos. It’d be super annoying and hard to follow.
- Clean Comments:
- TODO, HACK, and UNDONE comments let the developer come back to a certain line or project or to point out to other devs what still needs to be done. These are supported by many IDEs to provide a list of to-do items for you.
- Summary Comments
- Describe intent at a level higher than the code
- Often useful to provide a high level overview of classes
- Risk: Don’t use them simply to make up for poor naming or unclear code-level intent
- Documentation comments like // see www.facebook.com/api for more documentation can be useful to the next developer
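The advice to express intent in code rather than in comments can be sketched as a before and after. The `Accounts` class, the `Status` values, and the failure limit are hypothetical and chosen only to illustrate the technique. It assumes C# 9 or later.

```csharp
using System;

Console.WriteLine(Accounts.CanLogIn(Status.Active, 0)); // True
Console.WriteLine(Accounts.CanLogIn(Status.Locked, 0)); // False

public enum Status { Inactive = 1, Active = 2, Locked = 3 }

public static class Accounts
{
    private const int MaxFailedLogins = 3; // hypothetical limit

    // Before: the comment carries the intent that the code hides.
    public static bool CanLogInCommented(int status, int failedLogins)
    {
        // status 2 means active; lock out after 3 failures
        return status == 2 && failedLogins < 3;
    }

    // After: an enum, a constant, and intermediate variables say the same thing in code.
    public static bool CanLogIn(Status status, int failedLogins)
    {
        bool isActive = status == Status.Active;
        bool underFailureLimit = failedLogins < MaxFailedLogins;
        return isActive && underFailureLimit;
    }
}
```

In the second version there is no comment left to go stale when the limit or the status values change.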
Stay Clean
When to Refactor
- If it isn’t broke, don’t fix it.
- You should be working with the code already. If the code’s been working reliably for years, don’t risk changing it merely in a desire for cleanliness.
- Refactoring is useful when you find the code difficult to comprehend or change. If you don’t understand it, others may not understand it either.
- Refactor when you have sufficient test coverage to protect from regression.
Broken Windows
- Accept no broken windows.
- A building with only a couple broken windows will attract vandals to break in and break more windows. This also will attract squatters.
Code Reviews and Pair Programming
- These are highly effective ways to avoid the “broken windows” mentality and promote proactive cleanliness.
- Setting clear guidelines ensures that everyone understands beforehand what is on- and off-limits for a code review.
- Pair Programming provides
- real-time code review
- increased quality
- easier naming and refactoring
Boy Scout Rule
- “Always leave the code you’re editing a little better than you found it” – Robert C Martin