opinion

Conducting a Good Code Review

As developers, we perform a lot of code reviews (or at least we should be). They are a great way to not only ensure that the code we intend to deploy to production won't have a negative impact on the system, but also a time to learn and better ourselves.

Unfortunately, I have noticed a trend, especially at work. We tend spend the majority of our cycles verifying whether or not the code conforms to standards. In other words, our first instinct is to scan the code for incorrectly placed braces, improperly named variables, etc. Rather than review the actual quality of the code.

There's a lot of great resources available out there on this subject, so I don't want to regurgitate what's already been published, but I did want to highlight a couple points that this article speaks to.

What a code review is:

  • An opportunity to highlight what is being done well. Reaffirm that the developer is taking the correct approach to solving the problem.
  • An opportunity to highlight what can be done better. Offer different solutions and approaches and use it as a chance to improve one's skill set.

What a code review is not:

  • A witch hunt, a time to point out every little fault. Code reviews are a great opportunity to learn. Developers should be eager to have their code reviewed, not shy away from it for fear of belittlement (even if unintended).

How to conduct a code review:

First and foremost, understand the subject matter. Really familiarize yourself with the problem that the reviewee is trying to solve. Go to the developer's office and immerse yourself in the problem. You can't offer solution suggestions if you don't even know what the problem is.

Focus on how the problem was solved. Not how it's formatted. Think big picture. Is the change the right approach to solve the problem in the grand scheme of things?

Code reviews that just point out every little standard violation aren't beneficial to anyone. Use code reviews as an opportunity to knowledge share, and encourage learning. Not as a platform to demonstrate how well you can enforce formatting guidelines.

I would also encourage you to seek out reviews from other developers, especially if you're even the slightest bit unsure if your solution makes sense. Just because you have the deploy permission, doesn't mean you're stuck in a silo. Go forth and get a second opinion.

If you're looking for additional information, here's a blog series that I completely agree with:


Five Suggestions for Cleaner Code

Being involved in open source exposes you to a lot of different projects, individuals, coding styles... the works. It also means lots of code reviews. More reviews than I'd typically perform at work. In code reviews, I can be pretty opinionated. But I believe they stem from a strong desire to ensure that the code base is as clean and readable as possible.

With that said, these are all approaches that I would like to see all developers adopt. Nothing is hard to reason about and should be easy to start trying to implement from day one.

1. Avoid Flag Arguments

What exactly do I mean by a flag argument? I'll demonstrate with an example.

public void DoWork(bool log)  
{
  Execute();

  if(log)
  {
    // write to some log
  }
}

In the code above, we have a method that does something. When that something is done, it checks to see if we should log the event or not. If the passed in value is true, we go ahead and log the event. Otherwise, we bail out of the method.

This approach to writing methods is ultimately very confusing for the consumer. The implementation details may seem clear enough, but it's hard to reason about when used in the wild.

DoWork(true); // what could true possibly mean? 

I feel a much better approach is to actually split this behavior up into two individual methods. One method for each condition.

public void DoWork()  
{
  Execute();
}
public void DoWorkWithLogging()  
{
  DoWork();
  // log stuff
}

Here, we have a method DoWork that does just that. It only does the work and does not perform any logging. We've also introduced a method called DoWorkWithLogging that just calls DoWork, but additionally performs a logging operation. Hopefully, it's obvious given the name of this method, that it will not only do work, but it will log that it did it.

I feel this approach expresses a lot more intent throughout your codebase and reduces the cyclomatic complexity. We don't have any conditionals!

2. Do One Thing

Now, we hear this a lot, right? Do one thing and do it well. The single responsibility principle. However, it can be hard to actually reason about what does one thing actually mean? A lot of it is subjective, but I generally tend to reason about it in the form of abstractions.

An abstraction isn't a responsibility -- the caller doesn't know anything about the implementation details. The caller simply calls the abstraction and nothing more. Let's see an example.

public int CalculateCostForSize(int size)  
{
  if(size >= LARGE)
  {
    return size * LARGE_MODIFIER;
  }

  return size * DEFAULT_MODIFIER;
}

At first muster, this method is may seem completely acceptable. It's doing one thing, right? It's calculating the cost for a given size. However, there are a lot of implementation details that this method is concerning itself with.

The CalculateCostForSize method has to know what it means to be large. In this case, size is greater than or equal to LARGE. The method also has to know how to actually do the calculation. So one could argue that this method is actually doing three things.

  1. Figure out if the size is large.
  2. Calculate the cost when the size is large.
  3. Calculate the cost when the size is not large.

I feel a better approach would be:

public int CalculateCostForSize(int size)  
{
  if(SizeIsLarge(size))
  {
    return HandleLargeSize(size);
  }

  return HandleDefaultSize(size);
}

private bool SizeIsLarge(int size)  
{
  return size >= LARGE;
}

private bool HandleLargeSize(int size)  
{
  return size * LARGE_MODIFIER;
}

private bool HandleSmallSize(int size)  
{
  return size * DEFAULT_MODIFIER;
}

Now the method really only knows about the algorithm in which to calculate the size. It's not aware of any implementation details. Those are hidden behind abstractions.

3. Leverage Intention-Revealing Names

This is one that I see a lot and is an incredibly easy fix. Consider the following code snippet:

public void DoWork()  
{
  if(input % 2 == 0)
  {
    // when do I do this?
  }
}

Now as programmers, we probably all know what this code means. Using the modulus operator to figure out if a given number is even or not. But what if it's not clear? That's an easy fix then!

public void DoWork()  
{
  // check to see if the input is an even number
  if(input % 2 == 0)
  {
    // oh, why didn't you just say so?
  }
}

This is probably one of the things that grinds my gears the most when it comes to comments. I've ranted on comments before. But this just hurts me. Physically. Please avoid using comments to explain potentially confusing code. I humbly offer an alternative:

public void DoWork()  
{
  var inputIsEven = input % 2 == 0;
  if(inputIsEven)
  {
    // easy enough!
  }
}

Put another way, think instead if the example was some complicated business rule. We've all seen them..

if(input > 5 && input < 10 || (input >= 100))  

That really needs to use an intention-revealing name..

var isWithinAcceptableParameters = (input > 5 && input < 10 || (input >= 100))  
if(isWithinAcceptableParameters)  
{
}

No comments required!

4. Use Source Control.. Not Comments

I am a firm believer that code that is not required in any way to run the application, should not exist in the code base. While this does include dead code, unreachable code, etc (which may be hard to detect at times and sneak in). I mostly refer to code that has been commented out.

public void DoWork()  
{
  if(input > 0)
  {
  }
  /*
  * why didnt this work?
  * if(input > -1)
  * {
  * }
  */
}

It's an extreme example, but not uncommon in my travels. A more common example might look a little something like

var config = new Config();  
config.Title = "title";  
//config.MaxSize = 100;
config.MinimumSize = 1; 

Source control is an invaluable tool. Most say that regardless of what you're coding, you should always use source control. Even for the smallest of projects. There is no reason to muddle up your codebase with these type of comments when source control can retrieve it for you at a moments notice.

5. Avoid Long Parameter Lists

In the world of refactoring, there is a code smell called a data clump. I see this pretty frequently in code reviews. The developer introduced a slew of new methods that all have the same data set, but are all individual parameters.

Data clumps reveal themselves in a couple of ways. Through class members:

private int max;  
private int min;  
private DateTime date;

private string result; 

and through methods..

public string GetResult(int max, int min, int age, DateTime date)  
{
}

We even seem to instinctively clump these related data sets together without even noticing. Doesn't putting that newline between date and result just feel better?

The big problem is when we start copy-pasting these parameters into multiple locations. In order for the dataset to make sense, it requires a max, min, and date so why not encapsulate all three pieces of data into an object?

public class ResultContext  
{
  public ResultContext(int max, int min, DateTime date)
  {
    Max = max;
    Min = min;
    Date = date;
  }

  public int Max { get; }
  public int Min { get; }
  public DateTime Date { get; }
}

Now we can simply pass around the object, rather than a long list of parameters.

In the end, the true takeaway in all of this is.. please step away from // and /* */

Coding Standards Should Not Matter

Yeah that's right. That behemoth of a document that you call your standards document just does not matter. Now that's a pretty bold statement, so let me explain.

I believe that a standards document should be about how to style code. It should be about where to place curly braces, it should answer the great spaces vs. tabs debate (even though there is obviously only ever one clear choice), and similar concerns. What a standards document should not contain is substance, best practices. So what is a standard? A best practice?

It's very likely that the definition of a standard or a best practice vary from individual to individual, so to get everyone on the same page, these are how I define standards and best practices.

  • Standard: A rule put in place by an authority for the purpose of consistency.
  • Best Practice: A proven approach, generally the best way to solve a problem.

The key would in the standards definition is consistency. It should not matter at all how the code is formatted as long as the team agrees to an approach and follows it. Ensuring that the coding style of the repository remains consistent.

Consistency is extremely important. Code should look like a team wrote it, and not any one individual. Everyone on the team is going to be sharing the same code base and the style should not vary from developer to developer. You definitely do not want developers as their first order of business when starting a new task to be to reformat the code to their own personal preference.

A best practice is more along the lines of do not wrap your entire class in a try-catch and swallow the exception to avoid generating errors or keep your collection setters private. While as important as a standard, if not more so, these types of statements should not exist within a standards document.

So why shouldn't a standards document contain both?

First and foremost, they are different concerns. We generally like to split up concerns into easily digestible bits and not combine them. It is not going to matter all that much if your team puts curly braces on the same lines as your conditionals. It is going to matter if you start piling all of your concerns into a single class.

However, from my experience, a standards document is rarely just standards. A standards document will generally combine standards and best practices.

Though I do not believe the solution is to split up these two concerns into different documents. No, I believe a standards document should not exist. Emphasis on document. We should have standards. Unfortunately the medium we choose to present our standards is generally the incorrect one.

So whats the better approach? Let your code be your standards document.

I believe that coding standards are better enforced through tooling and the code base itself. Tools such as ReSharper and StyleCop allow teams to define their standards and then have them be automatically enforced through the development environment. No need to spend time writing out paragraphs in English, comparing and contrasting different styling approaches (Do vs. Don't). No need to worry about document formatting. Simply define your rules and push it to the teams who you intend to consume the styling that was agreed upon.

Like the old saying goes, the best standards document is no standards document at all.

A Comment on Comments

Full disclosure, I really don't like comments.

Like any feature of a language, I believe they have their purpose. After all, they were put in the language in the first place. I've just seen a lot of incorrect uses of comments, and the more I run into them, the more I feel like they don't add a whole lot of value.

Lets dive into some of the reasons why.

Comments bloat classes

I see auto generated comments everywhere. They plague a large portion of our code base. They have good intentions, but often end up making the file larger and harder to understand. Especially when the developer pays no mind to the content that was generated.

public class Cube  
{
    /**
    * The height of the cube.
    */
    public Height { get; set; }

    /**
    * The width of the cube.
    */
    public Width { get; set; }

    /**
    * The depth of the cube.
    */
    public Depth { get; set; }
}

We've turned a simple model of a cube which could be a few lines long, into a bloated model with more lines of comments than of actual useful code. Everything that the comments are describing are self evident. The comment headers just aren't needed. In all honesty, if I see something similar to the above, I just go ahead and delete the comments.

Comments are code duplication

One of my favorite light bulb moments. We as developers like to avoid code duplication, reuse is king. If you take a step back and think about comments in this regard, that's exactly what they are.

Code explains what is going to be done. Comments explain what is going to be done. They just do so in different languages. Just like with code duplication, if you change a line of code, you best remember to also change the comment lest you want to be left with:

/**
* ...
<summary>Returns true when the test succeeds; false otherwise.</summary>  
*/
public TestResult RunTest()  
{

You can probably guess that this method used to be a bool, and it was decided that the returned result needed to be a little more verbose. So the developer went ahead and created a TestResult object to store some more information about the result of the test. However, they forgot to update the comment, and now it's out of sync

Comments can be a sign of a bigger problem

Even if you believe you are using comments as intended, to explain potentially confusing code, a better solution is to just make the code less confusing.

if(!(x > 10 && y < 5 && z == -1)) // ensure the object is not out of bounds 
bool isOutOfBounds = (x > 10 && y < 5 && z == 0)  
if(isOutOfBounds) {

You could even go a step further and assign each int its own const variable so the user isn't guessing what exactly 10, 5, and 0 even mean.

Finally.. I've seen my fair share of auto generated comments, and you know what?

Auto generated comments make me cry
/**
* Processes the async
*/
public async Task<Result> ProcessAsync()

Oh.. I had no idea.

Anyway, after all this comment bashing, lets talk about a couple cases that I believe to be a strong candidates for comment use.

Public APIs are a great use of comments. You are going to have a large body of people consuming your code, calling your methods. It can be greatly helpful if everything is well documented, explaining exactly how everything is supposed to work.

Another valid use for comments is explaining why code is present. I like to think that code explains how it works, where comments explain why it works. Code can only make so much sense, regardless of how you name your variables, how well you format it, etc. Sometimes, there exists a scenario that just needs a little more explanation.

/* Reflection may seem like an odd choice here, but after testing the use case
* scenarios it proved to be much faster than splitting everything up into database
* calls */

Like I said in the beginning. Comments were put into languages for a reason, they have their place. Please, just use them sparingly.