Understanding Azure Deployment Slots

Azure deployment slots are a fantastic feature of Azure App Services. They allow developers to have multiple versions of their application running at the same time with the added bonus of being able to re-route traffic between each instance at the press of a button. They can, however, generate a lot of confusion if you don't fully understand how they work.

So what exactly are Azure Deployment Slots?

Let's assume we have a web app running on Azure App Services. We'll call it http://site.com. When you create a new web application, Azure creates a deployment slot for you, typically called production. However, it's possible to add additional deployment slots.

Put simply, a deployment slot is another web application. It has its own URL, and it can have its own database, connection strings, and so on. It can be configured any way you see fit. But why would you want two web applications? The most common reason is to have a place to deploy new features to, rather than going straight to production, which can be a little risky.

To accomplish this, we would create a deployment slot called staging. The staging slot is where you would deploy all new changes to your application to validate that everything is working before the changes actually go live to your users. Think of it like a test environment. A test environment that's really easy to spin up and manage. Let's create a deployment slot called staging and have it be accessible via http://site-staging.com

Creating a Deployment Slot

Creating a deployment slot is pretty simple. Open your Azure portal and navigate to your Web App resource. Once there, you should be able to see a menu item labeled Deployment slots.

Clicking on the Add Slot button opens a pane where you can add a new deployment slot. Here you can specify the name of the slot (I used staging) and whether you want to copy any pre-existing configuration (such as from your current production deployment slot). Press OK and you're all set!

When the deployment slot is first created, it is empty. So you'll want to deploy your latest and greatest changes to your staging environment (or just re-deploy your current production version to get something up and running). Deploying to your new slot is really no different than deploying to your production slot. Using the same tooling, just select the staging slot, rather than the production slot.
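
If you'd rather script this than click through the portal, the Azure CLI can do the same job. A rough sketch (resource names and the zip path are placeholders):

# create the staging slot, cloning configuration from production
az webapp deployment slot create --name my-site --resource-group my-rg \
  --slot staging --configuration-source my-site

# push a zipped build to the staging slot instead of production
az webapp deployment source config-zip --name my-site --resource-group my-rg \
  --slot staging --src ./app.zip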

At this point, we have two instances of our web application running. One is our production instance, supporting all of our production traffic, and the other is a staging environment that we are using to test the latest and greatest features. When you are satisfied with your tests, you will need to swap the staging and production slots so that your users can benefit from your new features.

Swapping Deployment Slots

Swapping deployment slots exchanges which app answers traffic for the source and target slots. In our case, we want to swap the staging and production slots. This will route our users to the staging app (where our newest changes are) when they navigate to http://site.com.

While that is the easiest way to describe what is happening, there is a lot that is going on behind the scenes that is useful to know.

When Swapping... Source and Target Matter

When performing a swap, you are presented with a source and a target. This may be a little confusing at first. Why would it matter? A swap is just flipping two things! While the end result will be the same, the key takeaway is that up-time is not guaranteed for the source slot.

This is because when you perform a swap, this is what is really happening:

  • First, the staging slot needs to go through some setting changes. This causes the staging site to restart, which is fine.
  • Next, the staging site gets warmed up by having a request sent to its root path (i.e. '/') and waiting for it to complete.
  • Now that the staging site is warm, it gets swapped into production. There is no downtime, since traffic goes straight from one warm site to another.
  • Finally, the site that used to be production (and is now staging) also needs to get some settings applied, causing it to restart. Again, this is fine since it happens to the staging site.

This process guarantees that your destination slot will always be warm and your users won't experience any downtime when the swap happens. Users may experience performance issues when navigating to the staging environment, but that's acceptable since it isn't really a production environment.
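
As an aside, the swap doesn't have to be triggered from the portal. With the Azure CLI it's roughly the following (names are placeholders):

# swap the staging slot into production
az webapp deployment slot swap --name my-site --resource-group my-rg \
  --slot staging --target-slot production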

When Swapping... Settings Are Swapped Too

Spoiler alert: not all settings are swapped. It is important to remember that when performing a swap, the settings of a deployment slot are swapped along with it... but not all of them.

Some settings make sense to keep specific to the slot; these are called slot settings and can be configured in the Azure portal.

When a setting has been flagged as a slot setting, it stays with the slot instead of following the app to the target slot. This is useful for settings such as connection strings. Maybe you want a dedicated database for your staging environment, so you create a slot setting holding a connection string that points to a database set up specifically for staging.
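
For what it's worth, flagging a setting as a slot setting can also be scripted; the Azure CLI exposes a --slot-settings flag for this. A sketch with placeholder names and values:

# add a staging-only setting that sticks to the slot during swaps
az webapp config appsettings set --name my-site --resource-group my-rg \
  --slot staging --slot-settings StagingDb="<staging connection string>"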

Other settings will be swapped during the swap process. This obviously includes any setting that is not marked as a "slot setting" under the Application Settings section, but it also includes settings such as CORS and diagnostic settings. The Azure portal even tells you which settings will be applied before you perform the swap operation, as shown below.

Unfortunately the preview does not list all changes that will be applied to the deployment slot. I learned this the hard way.

When Swapping... The Code Does Not Move

This was something I wasn't always quite sure about until I dug into it a little more and ran some of my own experiments. When you deploy changes to a deployment slot, that is where the changes will forever reside until you deploy over them. Consider the following scenario:

Version 1 of your application is deployed to your production deployment slot.

Version 2 of your application is deployed to your staging deployment slot.

As we learned previously, each deployment slot is its own web application. When a swap is performed, Azure swaps the virtual IP address of the source and destination and applies the appropriate settings. The code stays with their respective web applications, so the staging web app effectively becomes the production web app, and the production web app becomes the staging web app.

Put another way, imagine having two boxes. One box, labeled "production", has black licorice in it, and the other box, labeled "staging", has KitKats inside of it.

Note: To get this analogy right, you just need to agree that KitKats are the superior candy.

Your customers are currently going to the black licorice box, but you realize it's time to give them an upgrade. So you swap the location of the boxes. You also swap the labels on the boxes. This puts the "production" label on the KitKat box and the "staging" label on the black licorice box, directing your customers to the box of delicious KitKats. They obviously rejoice.

Admittedly, it's sort of a silly example, but I hope it clears up the fact that when you perform a swap, we aren't picking up what's inside the boxes and moving it to a different box. We're simply relabeling the boxes themselves.

Rolling Back Changes

If the ability to test your changes before they go live isn't enough of an incentive to start leveraging deployment slots, the ability to roll back your changes at the press of a button should be enough to convince you.

After performing a swap, our users are now hitting the latest version of our application. If for some reason we missed something and we start noticing errors, all we have to do is swap again to put the system back into its previous state.

There's no need to open up Git, revert the commit, and re-deploy the change. We don't need to deploy anything at all! It's just a matter of routing our users back to the site that was working for them previously.

Testing in Production

There's also this nifty little feature that we can leverage called Testing in Production. Testing in Production is essentially Azure's implementation of a canary test. If you're unfamiliar with the term, it stems from the mining days where miners would bring a canary down with them into the mine. If the canary died, they'd know something was wrong with the air quality, warning them to leave the mine as soon as possible.

We do canary testing by routing a small subset of users to our new feature. Continuing with the production and staging examples, what we do is take 5% of all traffic to our website and actually have them go to our staging environment with the remaining 95% continuing to hit our production environment. If we don't notice anything wrong, we can bump the 5% up to 10% or even 20%, until we've reached 100%. This way if anything were to go wrong, we've mitigated the number of users impacted by a bad change.

If you're interested in trying this feature out, it is relatively simple to get going. Simply click on the Testing in Production menu item from within your App Service.

This will allow you to set the percentage of traffic that you want to see going to your staging slot (5% as shown in the figure) and production slot. That's all there is to it!
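
If you'd rather script the traffic split, the Azure CLI has an equivalent command. A sketch (names are placeholders, 5% routed to staging):

# send 5% of production traffic to the staging slot
az webapp traffic-routing set --name my-site --resource-group my-rg \
  --distribution staging=5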

Wrapping Up

Deployment slots are incredibly easy to use and offer a wide range of features that make them hard to pass up. If you're hosting your application in Azure, definitely consider them for your current and/or next project!

Code Coverage is Useless

Not too long ago there were talks around the office regarding a new testing initiative. Now, by itself, this is fantastic news. Who wouldn't want to actually spend some time and get our testing story up to par?

The problem lies within the approach that was proposed, going so far as to say: "We need to ensure that we have at least 80% test coverage."

While the intention is a good one, code coverage is unfortunately useless.

Now, that is a pretty bold statement, so let me clarify a little bit. Code coverage goals are useless. You shouldn't strive for X% coverage on a given codebase. There are a few reasons for this, so let me explain.

It is possible to test enough

Not all code bases are created equal. One could be for an application that sees millions of hits in a day and is grossly complicated. Another could be for a tiny application that services a couple users a day, if that. I always like to envision these different kinds of applications on a risk plane.

.. yeah I still know my way around MS Paint

Imagine if you will that each dot is an application in our system. The further top-right we go, the more likely that if something were to go wrong it'd be some bad news bears. Whereas the further bottom-left.. eh? Maybe someone would notice.

Now, it would be a little silly to say that every application should have at least 80% code coverage. Why? Opportunity cost. While I am a huge proponent of testing, I don't like to test just because. We should aim to test enough. Test enough so that we have enough confidence that our application will function as we expect it to.

In reality, maybe for the applications in the top right, 80% isn't enough. Maybe that actually should be higher and we should not stop at 80%. On the flip side, our smaller applications in the bottom left probably don't need such a high coverage percentage. The cycles spent adding tests would potentially bring us little to no value and end up just being a waste of time.

Note: I feel like at this point some individuals may be a little confused as to how adding tests could bring little value. There's a whole development methodology called TDD that creates a high level of coverage just by following the red, green, refactor cycle. The points I make here generally refer to going back and adding tests because someone dictated that the codebase's coverage percentage was too low. If you're doing TDD to begin with, then setting a target really won't help; the coverage is just a byproduct.

It's all about context. We can't generalize a percentage of coverage in our code base, because each code base is different.

Fun Fact: Did you know this sort of risk plane chart can be applicable to many different scenarios? Ever wondered what the risk plane for the security guy looks like?

Anyway...

In the same vein, not everything needs a test around it. Let's say we wanted to introduce a new public member into our codebase, something simple:

public string FirstName { get; set; }

Introducing this line of code, if it isn't exercised by any of our tests, will drop code coverage. Maybe even below our beloved 80%. The fix?

[Fact]
public void FirstName_ByDefault_CanBeSet()
{
  var myClass = new MyClass();
  myClass.FirstName = "testname";
  Assert.Equal("testname", myClass.FirstName);
}

At this point, we're just testing .NET -- something we definitely want to avoid. I tend to only put tests around code that I know could actually have the potential to change in a way that I do not want it to. Logical code.
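
To make that contrast concrete, here's a hypothetical bit of logical code I would put a test around; the class, the business rule, and the numbers are all made up for illustration:

public static class ShippingCalculator
{
  // business rule worth protecting: orders of 100 or more ship for free
  public static decimal CalculateShipping(decimal orderTotal)
  {
    return orderTotal >= 100m ? 0m : 4.99m;
  }
}

[Fact]
public void CalculateShipping_OrderOverThreshold_ShipsForFree()
{
  var cost = ShippingCalculator.CalculateShipping(150m);

  Assert.Equal(0m, cost);
}

If that threshold or fee ever changes by accident, this test catches it -- unlike the FirstName test above, which can only ever catch a bug in .NET itself.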

Code coverage is easy

Having a lot of code coverage does not necessarily mean that we can have a lot of confidence that our application works as we expect it to. Everything is always clearer with examples, so let's consider the following:

public class Flawless  
{
  public bool IsGuarenteedToWork()
  {
    // some code
    return true;
  }
}

Now, methods usually have logic that we would normally want to test, right? Conditionals, mathematical operations, you name it. Though, for our example, it doesn't matter! We just want to increase code coverage. That's our goal.

[Fact]
public void IsGuarenteedToWork_ByDefault_Works()  
{
  var flawless = new Flawless();

  var actual = flawless.IsGuarenteedToWork();
}

And there you have it! 100% code coverage. By default, tests that do not have an Assert will be considered passing. Now you're probably thinking.. oh come on, who would actually do this?

People do silly things when incentivized. My go-to example is a scenario in which a company tells QA that, at the end of the quarter, they will be given a bonus for every bug they found. Seems pretty reasonable, right? The flip side is that the same company tells development that they will receive a bonus based on how few bugs they introduce into the system.

This scenario incentivizes the failure of opposing groups. The development organization doesn't really want to write any code for fear of introducing a bug and wants QA to miss bugs in their analysis. Whereas the QA group wants development to introduce bugs into the system so that they can find them and be rewarded for doing so.

The other thing that we need to keep in mind is that...

Code coverage context matters

Let's consider that our developer wasn't just trying to game the system and actually put forth an honest effort toward obtaining the code coverage goal. Our implementation could be something like the following:

public class Flawless  
{
  public bool IsGuarenteedToWork()
  {
    for(var x = 0; x < int.MaxValue; x++) 
    {
      // Man, this is gonna work. I'll find that solution.. eventually.
    }

    return true;
  }
}

.. and let's not forget the test.

[Fact]
public void IsGuarenteedToWork_ByDefault_Works()  
{
  var flawless = new Flawless();

  var actual = flawless.IsGuarenteedToWork();

  Assert.True(actual);
}

I hope it was obvious that the example above is far from performant. But in this case, we've reached 100% code coverage and we're actually asserting that the code is working as we intend it to. The implementation works. The test is correct. Everyone is happy. Almost...

When it comes to testing, there are different stakeholders.

Stakeholders are people whose lives you touch - Mark McNeil

This can be broken down further into the types of stakeholders.

  1. Primary Stakeholder (who I'm doing it for) Example: The customer who requested the feature.
  2. Secondary Stakeholder (others who are directly involved) Example: Your boss and/or other developers on the project.
  3. Indirect Stakeholder (those who are impacted otherwise) Example: The customers of your customer.

As programmers, we are writing code to solve problems for other people (sometimes ourselves if we can find the time). The same section of code matters differently to different people. Person A only cares that the answer is correct. Maybe they're notified when it's ready, but they're pretty indifferent to when they receive it. Person B needs the answer soon after requesting it. Our test only completely satisfies Person A.

There can be a lot of stakeholders when it comes to writing code. Unfortunately, we can't say with confidence, even at 100% code coverage, that our code is going to be compatible with everyone's needs.

After all of the harping on why code coverage is useless as a target, I need to wrap up by saying...

Code coverage can actually be useful

I prefer to leverage code coverage as a metric. Coverage is something that we're aware of, something that we can use to make informed decisions about each codebase.

If we notice that one codebase is consistently dropping in coverage, we can take that as a sign to look a little deeper into what's going on. Is the codebase incredibly hard to test? Are the developers just not putting forth the effort to test, even when it makes sense? Maybe it's actually what we would expect from that code base, so everything is gravy.

Coverage can also just let us know if we're doing an adequate amount of testing. If a mission-critical application only has 10% coverage, we should investigate the reasons for that and potentially start a quality initiative to get some tests strapped on. It allows us to prioritize our testing initiatives without just randomly picking a codebase and throwing tests at it.

The entire point of all of this is that setting coverage targets will just be counterproductive to your goals. We should be aware of coverage so that we can make informed decisions, but not let it impact the quality of our code just for the sake of coverage attainment.

Best of GOTO Chicago 2018

Speaking at GOTO Chicago's Open Source panel

Wow, what a week! I'll spare you from any May the fourth be with you puns (unless that in itself counts) and dive right into the best of GOTO Chicago 2018.

This past week I had the opportunity to attend my first GOTO conference, and overall I would say it was a good experience. I honestly was not blown away by the sponsor booths, and some of the talks were a little rough around the edges... but what conference doesn't have its fair share of flops? We're not here to talk about those though, I want to highlight the talks and experiences that really stuck out.

Let's start with...

Testing legacy code

My first exposure to GOTO was an all-day workshop hosted by Michael C. Feathers, the author of Working Effectively with Legacy Code. While the workshop consisted mostly of techniques for getting legacy code into a testable state, the content presented would take up an entire blog post in itself (which I will assuredly write at some point in the near future), so I am just going to focus on two key elements that really resonated with me.

public versus published
We've all heard of access modifiers (private, public, etc), but what about published? The easiest way to describe it is to give an example, comparing it to public.

Public should be familiar territory. If something is marked as public, it can be accessed by other classes, subclasses, and even assemblies. Within an organization, it should be relatively easy to change public signatures. Sure it might be a lot of work to do so, depending on how many projects reference the public interface, but it's clear to see what needs to change and you can have a high level of confidence that you won't break anything. You may even have the ability to update all of the references yourself (which I would encourage you to do so).

Published, on the other hand, you don't have any control over. A prime example would be an API that other companies leverage. You can't change the interface willy-nilly because you don't have the ability to update the references. Published is an abstract concept. You don't really know if something is marked as published, because there isn't an access modifier for it. It just isn't supported in the language.

While this makes complete sense, I was just happy to finally put some context around the concept. Furthermore, I've been officially converted to the camp of: If you're changing a public interface, do your due diligence and update the references yourself. No more of this obsoleting a method that will never be updated.
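
To put a bit of code around the distinction (a hypothetical sketch, not something from the workshop): a purely public member can simply be renamed, with every caller updated in the same change, while a published member has callers you can't touch, so the old signature has to stay behind and forward to the new one.

public class PriceCalculator
{
  // public: only referenced inside our own solution, so rename it and
  // update every call site yourself in the same change
  public decimal CalculateTotal(Order order)
  {
    // ... actual calculation ...
    return 0m;
  }

  // published: shipped to other companies in a package, so the old name
  // has to stick around and forward to the new implementation
  [Obsolete("Use CalculateTotal instead.")]
  public decimal CalcTotal(Order order) => CalculateTotal(order);
}

(Order here is just a stand-in type for the sake of the example.)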

Best is the enemy of better

In reality, after some google-fu, the official quote is:

Perfect is the enemy of good - Voltaire

Regardless, to me this was a simple yet powerful quote that I had not heard before. It stems from the idea that so many times in our careers we spin our wheels indefinitely while trying to seek out the perfect solution. Maybe we never implement the perfect solution, because it's just not possible right now. However, it may be possible to implement a good solution. We just fail to realize it sometimes.

A big part of the workshop was refactoring code to make it testable. To do so, we made a lot of new classes that the old classes just called into. Our old code essentially just became wrappers around the new code. Obviously, this is not a perfect solution, but it's a step in the right direction to get the code into a clean and testable state.
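
A minimal sketch of what that looked like, with hypothetical names: the legacy class keeps its original signature but now just delegates to a new class that is easy to construct and test in isolation.

// new class: no awkward dependencies, trivial to unit test
public class InvoiceTotaler
{
  public decimal Total(IEnumerable<LineItem> items)
  {
    return items.Sum(item => item.Price * item.Quantity);
  }
}

// old class: keeps its public surface, but is now just a wrapper
public class LegacyInvoiceService
{
  private readonly InvoiceTotaler _totaler = new InvoiceTotaler();

  public decimal CalculateInvoiceTotal(Invoice invoice)
  {
    return _totaler.Total(invoice.Items);
  }
}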

After wrapping up the testing workshop, it was time to head on over to the first keynote of the conference.

Drifting into failure

Presentation: https://www.youtube.com/watch?v=mFQRn_m2mP4

The first keynote of the conference was given by Adrian Cockcroft, the VP of Cloud Architecture Strategy for Amazon. It was a little surprising, as he wasn't on the original agenda. Apparently the original speaker came down with a cold, so he filled in. Honestly, I'm glad he did. It was a very enlightening talk. Adrian focused on a central theme, drifting into failure.

Drifting into failure stems from the idea that people in the workplace may not report issues to upper management for fear of inadequacy or even worse, losing their job. These issues, albeit small at the time, start to pile up eventually leading to a catastrophic event.

One example of this was the fact that airlines with more reported incidents actually had fewer fatalities. At first, this seems pretty counterintuitive. How can you have more incidents but fewer catastrophic events? Does this mean that an airline with fewer reported incidents actually had more fatalities? Yes.

The takeaway here is that incidents are going to occur no matter what; it's just a matter of whether they're actually reported. Just because an incident isn't reported doesn't mean it didn't happen. We can't make our processes better if we are unaware of the problems that are occurring. We need to embrace blameless postmortem cultures so that we become aware of these incidents and do not eventually drift into failure. So how can we prevent this from happening? Create non-events.

Adrian spoke of dynamic non-events. It wasn't clear to me what the actual definition was, so I looked up the meaning in its entirety.

Reliability is a dynamic non-event. It is dynamic because the processes remain within acceptable limits due to moment-to-moment adjustments and compensations by the human operators. It is a non-event because safe outcomes claim little or no attention. - Karl Weick

Creating a non-event essentially means to report abnormal behavior, or at least be aware of it. It doesn't have to be catastrophic, and in all honesty it shouldn't be. If a metric is slightly incorrect or just under the minimum, it should be reported and discussed before it balloons into a much larger problem.

The example that he gave comes from the book Drifting into Failure by Sidney Dekker. A component of an aircraft passed all safety measures, but barely. It was never reported, because everything was technically fine. The employees did everything they were supposed to do as laid out by the system, but in this case, the system was wrong. It led to a disastrous crash, killing everyone on board.

Non-events are a time for people to report and learn. No one should be afraid to report bad news, or the possibility of bad news. We should strive to actively create safety. His recommendation to create safety?

Break things on purpose!

GOTO had quite a bit of content on a concept that I wasn't completely aware of, chaos engineering. I knew it existed, but never really dove into the specifics of how it worked or how to practice it. This is another topic that I could easily dedicate a whole post to, so we'll just touch on some key takeaways.

Chaos engineering exists because production hates us. Leveraging DevOps is a step in the right direction, but unfortunately, production is a war zone. Nothing can guarantee that when you push the deploy button, everything is going to work exactly as you expect it to. In all honesty, production doesn't want your code to work. Chaos engineering is really about engineering ourselves out of chaos, not introducing it into our systems; that would just be silly.

It was noted that people actually cause the most chaos. One example that was given was a command line argument. If a certain command was passed with the letter g, everything was a-OK. However, if you forgot that letter g, all hell would break loose. One may think this is just another case of user error, but the speaker presented it as a system error. Why is the line between success and failure so thin?

So how do we implement chaos engineering? Typically through learning loops. Chaos engineering is all about "you know what you don't know", so you should actively prod those possible failure cases. The implementation steps were broken down as follows:

  1. Prod the system
  2. Form a hypothesis about why it happened
  3. Communicate to your team
  4. Run experiments
  5. Analyze the results
  6. Increase the scope
  7. Automate the experiments

In the end, the goal with chaos engineering isn't to break the system intentionally. You would never want to intentionally take down production and impact your customers. If you know of a failure case and how it behaves, there really isn't much value in running a chaos experiment. Everything is already known!

Old is the NEW new

Presentation: https://www.youtube.com/watch?v=AbgsfeGvg3E

I love Kevlin Henney as a speaker; he always does a fantastic job. I highly recommend giving the presentation itself a watch, but the CliffsNotes version is essentially that everything in computer science has already been done. Most of the new stuff we see today appeared in literature as early as the 50s. Software design principles, testing approaches, you name it.

The vast majority of the presentation was comparing new ideas to old ideas. Highlighting the fact that we've been thinking about and solving the same problems for quite some time now.

To me this is very similar to Uncle Bob's post, The Churn, which states that we as developers are so infatuated with learning all of the new and shiny frameworks and libraries being released that we drift away from the basics. We forget core principles and never really advance, because we're stuck in the churn of learning every new thing that comes out.

I agree with most of it. I couldn't imagine trying to pick up every little new thing that pops up, and frankly I just don't think it's possible.

In the end, it was a worthwhile conference. I may just have to add it to my list of yearly must GO-TO conferences...

Until next time!

Five Suggestions for Cleaner Code

Being involved in open source exposes you to a lot of different projects, individuals, coding styles... the works. It also means lots of code reviews -- more reviews than I'd typically perform at work. In code reviews I can be pretty opinionated, but I believe my opinions stem from a strong desire to ensure that the codebase is as clean and readable as possible.

With that said, these are all approaches that I would like to see all developers adopt. None of them are hard to reason about, and they should be easy to start implementing from day one.

1. Avoid Flag Arguments

What exactly do I mean by a flag argument? I'll demonstrate with an example.

public void DoWork(bool log)  
{
  Execute();

  if(log)
  {
    // write to some log
  }
}

In the code above, we have a method that does something. When that something is done, it checks to see if we should log the event or not. If the passed in value is true, we go ahead and log the event. Otherwise, we bail out of the method.

This approach to writing methods is ultimately very confusing for the consumer. The implementation details may seem clear enough, but it's hard to reason about when used in the wild.

DoWork(true); // what could true possibly mean? 

I feel a much better approach is to actually split this behavior up into two individual methods. One method for each condition.

public void DoWork()  
{
  Execute();
}
public void DoWorkWithLogging()  
{
  DoWork();
  // log stuff
}

Here, we have a method DoWork that does just that. It only does the work and does not perform any logging. We've also introduced a method called DoWorkWithLogging that calls DoWork but additionally performs a logging operation. Hopefully it's obvious, given the name of this method, that it will not only do the work but also log that it did it.

I feel this approach expresses a lot more intent throughout your codebase and reduces the cyclomatic complexity. We don't have any conditionals!

2. Do One Thing

Now, we hear this a lot, right? Do one thing and do it well. The single responsibility principle. However, it can be hard to reason about what "one thing" actually means. A lot of it is subjective, but I generally tend to reason about it in terms of abstractions.

An abstraction isn't a responsibility -- the caller doesn't know anything about the implementation details. The caller simply calls the abstraction and nothing more. Let's see an example.

public int CalculateCostForSize(int size)  
{
  if(size >= LARGE)
  {
    return size * LARGE_MODIFIER;
  }

  return size * DEFAULT_MODIFIER;
}

At first glance, this method may seem completely acceptable. It's doing one thing, right? It's calculating the cost for a given size. However, there are a lot of implementation details that this method is concerning itself with.

The CalculateCostForSize method has to know what it means to be large. In this case, size is greater than or equal to LARGE. The method also has to know how to actually do the calculation. So one could argue that this method is actually doing three things.

  1. Figure out if the size is large.
  2. Calculate the cost when the size is large.
  3. Calculate the cost when the size is not large.

I feel a better approach would be:

public int CalculateCostForSize(int size)  
{
  if(SizeIsLarge(size))
  {
    return HandleLargeSize(size);
  }

  return HandleDefaultSize(size);
}

private bool SizeIsLarge(int size)  
{
  return size >= LARGE;
}

private int HandleLargeSize(int size)  
{
  return size * LARGE_MODIFIER;
}

private int HandleDefaultSize(int size)  
{
  return size * DEFAULT_MODIFIER;
}

Now the public method only knows about the algorithm for calculating the cost. It's not aware of any implementation details; those are hidden behind abstractions.

3. Leverage Intention-Revealing Names

This is one that I see a lot and is an incredibly easy fix. Consider the following code snippet:

public void DoWork(int input)  
{
  if(input % 2 == 0)
  {
    // when do I do this?
  }
}

Now as programmers, we probably all know what this code means. Using the modulus operator to figure out if a given number is even or not. But what if it's not clear? That's an easy fix then!

public void DoWork(int input)  
{
  // check to see if the input is an even number
  if(input % 2 == 0)
  {
    // oh, why didn't you just say so?
  }
}

This is probably one of the things that grinds my gears the most when it comes to comments. I've ranted on comments before. But this just hurts me. Physically. Please avoid using comments to explain potentially confusing code. I humbly offer an alternative:

public void DoWork(int input)  
{
  var inputIsEven = input % 2 == 0;
  if(inputIsEven)
  {
    // easy enough!
  }
}

Put another way, imagine instead that the example was some complicated business rule. We've all seen them..

if(input > 5 && input < 10 || (input >= 100))  

That really needs to use an intention-revealing name..

var isWithinAcceptableParameters = (input > 5 && input < 10 || (input >= 100));  
if(isWithinAcceptableParameters)  
{
}

No comments required!

4. Use Source Control.. Not Comments

I am a firm believer that code that is not required in any way to run the application should not exist in the codebase. While this does include dead code, unreachable code, etc. (which may be hard to detect at times and can sneak in), I am mostly referring to code that has been commented out.

public void DoWork()  
{
  if(input > 0)
  {
  }
  /*
  * why didnt this work?
  * if(input > -1)
  * {
  * }
  */
}

It's an extreme example, but not uncommon in my travels. A more common example might look a little something like this:

var config = new Config();  
config.Title = "title";  
//config.MaxSize = 100;
config.MinimumSize = 1; 

Source control is an invaluable tool. Most would say that regardless of what you're coding, you should always use source control, even for the smallest of projects. There is no reason to muddle up your codebase with these types of comments when source control can retrieve the old code for you at a moment's notice.

5. Avoid Long Parameter Lists

In the world of refactoring, there is a code smell called a data clump. I see this pretty frequently in code reviews: a developer introduces a slew of new methods that all take the same set of data, but as individual parameters.

Data clumps reveal themselves in a couple of ways. Through class members:

private int max;  
private int min;  
private DateTime date;

private string result; 

and through methods..

public string GetResult(int max, int min, int age, DateTime date)  
{
}

We even seem to instinctively clump these related data sets together without even noticing. Doesn't putting that newline between date and result just feel better?

The big problem is when we start copy-pasting these parameters into multiple locations. In order for the data set to make sense, it requires a max, a min, and a date, so why not encapsulate all three pieces of data into an object?

public class ResultContext  
{
  public ResultContext(int max, int min, DateTime date)
  {
    Max = max;
    Min = min;
    Date = date;
  }

  public int Max { get; }
  public int Min { get; }
  public DateTime Date { get; }
}

Now we can simply pass around the object, rather than a long list of parameters.
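
For instance, the earlier method might now look like this, keeping age as its own parameter since it isn't part of the clump:

public string GetResult(ResultContext context, int age)  
{
}

// at the call site
var result = GetResult(new ResultContext(max, min, date), age);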

In the end, the true takeaway in all of this is.. please step away from // and /* */

A Look at Functional Refactoring

Refactoring is a common task for developers.. hopefully anyway. It keeps code cleaner, more maintainable, and reduces code duplication. It also allows for code to be extended more easily, saving us time in the long run.

xkcd - The General Problem (https://xkcd.com/974/)

There are many approaches we can take to refactoring, as outlined by Martin Fowler's refactoring catalog. I'd wager that one of the most common refactoring approaches is taking code that exists in multiple spots, wrapping it up in a method, and then calling that new method. This is otherwise known as Extract Method.

To give an example, we're probably all familiar with something like the following..

public void MyMethod()  
{
  using (var file = new System.IO.StreamWriter(@"C:\log.txt"))
  {
    file.WriteLine($"Called { nameof(MyMethod) }");
  }
}
public void AnotherMethod()  
{
  using (var file = new System.IO.StreamWriter(@"C:\log.txt"))
  {
    file.WriteLine($"Called { nameof(AnotherMethod) }");
  }
}

Now, there are a couple of problems with this implementation. Where the log is stored is explicitly defined in both of the methods. This could be problematic in the future if we decide to change the location and/or the file name of the log file.

The other problem is that how we log is duplicated in both methods. If we ever decided to create distributed logs, we would have to go into each method that uses our logging functionality and update the call to StreamWriter to a network call.

Fear not though, we can refactor this approach and make our code a lot better. Applying the Extract Method refactoring gives us:

public void MyMethod()  
{
  WriteToDefaultLog($"Called { nameof(MyMethod) }");
}

public void AnotherMethod()  
{
  WriteToDefaultLog($"Called { nameof(AnotherMethod) }");
}

public void WriteToDefaultLog(string message)  
{
  using (var file = new System.IO.StreamWriter(@"C:\log.txt"))
  {
    file.WriteLine(message);
  }
}

Now that all of the implementation details are put into a central location, and the dependent code snippets are calling the abstractions, it's a lot easier to introduce change into the system. This approach to refactoring is called Imperative Refactoring or "Inside out" refactoring.

Visualized, we take the code in orange (the code to refactor), drop it all into a new method, somewhere in isolation. We then have the code call that new method, that abstraction we just created, instead of being intimately familiar with all of the implementation details. Pretty neat.

However, there is another approach to refactoring called Functional Refactoring or "Outside in". It is useful when you want to refactor the surrounding code. Think of try/catch blocks or using blocks.

If that is not entirely clear, let's go ahead and look at some code.

Consider the following try/catch block that calls our default logger when a method throws an exception.

public void MyMethod()  
{
  try
  {
    SomeScaryStuff();
  }
  catch (Exception ex)
  {
    WriteToDefaultLog(ex);
  }
}

public void AnotherMethod()  
{
  try
  {
    LessScaryButStillPrettyScary();
  } 
  catch (Exception ex)
  {
    WriteToDefaultLog(ex);
  }
} 

So while the above may seem elegant and straightforward, we have some code duplication. We have explicitly written two try/catch blocks as well as called the same WriteToDefaultLog method. If we wanted to change that logging call, we'd have to update it in two spots.

This may not seem like such a big deal at first, but this could rapidly grow out of control if we wanted to call some logging functionality every time an error occurred in our application.

So how can we remove this duplication?

public static void Try(Action operation)  
{
  try
  {
    operation();
  }
  catch (Exception ex)
  {
    WriteToDefaultLog(ex);
  }
}

By refactoring out the surrounding try/catch block, and passing in the inside code via an Action, we're able to create a new method, Try.

In order to leverage this method, the caller would need to pass in the code that they would want executed within the try block through a lambda expression.

public void MyMethod()  
{
  Try(() =>
  {
    SomeScaryStuff();
  });
}

public void AnotherMethod()  
{
  Try(() =>
  {
    LessScaryButStillPrettyScary();
  });
}

With this approach we're able to encapsulate all of the logic that our try/catch blocks contain into a single location. Need to change the logging approach when an exception is thrown? No problem, we just need to update our Try method and we're good to go.
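
The same idea extends to operations that need to return a value. A sketch of a Func-based overload (my own addition, not from the examples above):

public static T Try<T>(Func<T> operation)  
{
  try
  {
    return operation();
  }
  catch (Exception ex)
  {
    // same logging call as the Action-based Try above
    WriteToDefaultLog(ex);
    throw;
  }
}

// usage -- LoadCustomer is just a stand-in for any risky call
var customer = Try(() => LoadCustomer(customerId));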

But wait, there's more! Consider the need to connect to a database through ADO. First you need to establish a connection to your database, and then you need to create a command to execute against that connection.

using(var connection = new SqlConnection("myConnectionString"))  
{
  using(var command = new SqlCommand("dbo.myCommand", connection))
  {
    // do some work
  }
}

This is a pretty common setup, but has a couple drawbacks. First and foremost, it is verbose. The ceremony to call a procedure is multiple lines long. Secondly, if you ever needed to change how you retrieved your connection string... it'd be a sizable effort to update every procedure call. But there's hope! By applying some functional refactoring, we can abstract these implementation details away from the callers and make our procedure calls more succinct.

...
ExecuteCommand("dbo.myCommand", () => {  
  // do work
});
...

public void ExecuteCommand(string commandName, Action operation)  
{
  using(var connection = new SqlConnection("myConnectionString"))
  {
    using(var command = new SqlCommand(commandName, connection))
    {
      operation();
    }
  }
}

As we can see, this approach can be applied to a lot of different scenarios. Logic and implementation details are hiding everywhere, so don't always think Inside out... try Outside in!