skip to Main Content

Understanding Begins with Expressive Names

In 2018, I joined a large project halfway through its development. The original engineers had moved on leaving behind convoluted and undocumented code. Working with this type of code is challenging because you can’t differentiate the plumbing from the business domain. This makes debugging difficult and changes unpredictable because you don’t know the impact. It’s like trying to edit a book without understanding the words.

Many engineers believe the measure of success is when the code compiles. I believe it’s when another engineer (or you in six months) understands the ”why” of your code. The original engineers handicapped the future engineers by not documenting and using obtuse names. The names are sometimes the only window into the previous engineer’s thought process.

Donald Knuth famously said:

Programs are meant to be read by humans and only incidentally for computers to execute. – Donald Knuth

Naming

Naming is hard because it requires labeling and defining where and how a piece fits in an application.

Phil Karlton, while at Netscape observed:

There are only two hard things in Computer Science: cache invalidation and naming things.
— Phil Karlton

We see our code through the lens of words and names we use. Names create a language for the next engineer to comprehend. This language paints a picture of how the author bridged the business domain and the programming language.

Lugwid Wittgenstein, a philosopher in the first half of the 20th century, said:

The limits of my language mean the limits of my world. – Ludwig Wittgenstein

The language of our software is only as descriptive as the names we use and using vague names blur the software’s purpose; using descriptive names bring clarity and understanding.

Imagine visiting a country where you don’t speak the language. A simple request such as asking to use the bathroom brings bewildered looks. The inability to communicate is frustrating maybe even scary. An engineer feels the same when confronted with confusing, unclear, or even worse, misleading names.

This feeling is best experienced.

Experience

Examine the first snippet of code, what does this code do? What’s the why?

Take your time.

public class StringHelper
{
    public string Get(string input1, string input2)
    {
        var result = string.Emtpy;
        if(!string.IsNullOrEmtpy(input1) && !string.IsNullOrEmtpy(input2))
        {
            result = $"{input1} {input2}";
        }
        return result;
    }
}

The above code is a simple concatenation of two strings. What the code doesn’t tell you is the “why.” The “why” is so important, without it, it’s difficult to change behavior without understanding the impact. Of course, investigating the code’s usage will likely reveal it’s “why,” but that’s the point. You shouldn’t have to discover the code’s purpose, instead, the author should have left clues, it’s their responsiblity to do so.

Let’s revisit the code, but with a little “why” sprinkled in.

Again, take your time, observe the difference you feel when reading this code.

    public class FirstAndLastNameFormatter
    {
        public string Concatenate(string firstName, string lastName)
        {
            var fullName = string.Emtpy;
            if(!string.IsNullOrEmtpy(firstName) && !string.IsNullOrEmtpy(lastName))
            {
                fullName = $"{firstName} {lastName}";
            }
            return fullName;
        }
    }

The “why” brings the code to life, there’s a story to read.

Communicate

Communicating the intent and the design to the next engineer allows software to live and to grow because if engineers can’t modify the software, it dies. This is a tragedy, even more so when it’s a result of poor design and lack of expressiveness — each is preventable with knowledge.

Do the next engineer a favor and be expressive in your code. Use descriptive names and capture the “why” because who knows, the next engineer might be you.

4 Practices to Lowering Your Defect Rate

Writing software is a battle between complexity and simplicity. Striking balance between the two is difficult. The trade-off is between long unmaintainable methods and too much abstraction. Tilting too far in either direction impairs code readability and increases the likelihood of defects.

Are defects avoidable? NASA tries, but they also do truckloads of testing. Their software is literally mission critical – a one shot deal. For most organizations, this isn’t the case and large amounts of testing are costly and impractical. While there is no substitute for testing, it’s possible to write defect resistant code, without testing.

In 20 years of coding and architecting applications, I’ve identified four practices to reduce defects. The first two practices limit the introduction of defects and the last two practices expose defects. Each practice is a vast topic on its own in which many books have been written. I’ve distilled each practice into a couple paragraphs and I’ve provided links to additional information when possible.

1. Write Simple Code

Simple should be easy, but it’s not. Writing simple code is hard.

Some will read this and think this means using simple language features, but this isn’t the case — simple code is not dumb code.

To keep it objective, I’m using cyclomatic complexity as a measure. There are other ways to measure complexity and other types of complexity, I hope to explore these topics in later articles.

Microsoft defines cyclomatic complexity as:

Cyclomatic complexity measures the number of linearly-independent
paths through the method, which is determined by the number and
complexity of conditional branches. A low cyclomatic complexity
generally indicates a method that is easy to understand, test, and
maintain.

What is a low cyclomatic complexity? Microsoft recommends keeping cyclomatic complexity below 25.

To be honest, I’ve found Microsoft’s recommendation of cyclomatic complexity of 25 to be too high. For maintainability and complexity, I’ve found the ideal method size is between 1 to 10 lines with a cyclomatic complexity between 1 and 5.

Bill Wagner in Effective C#, Second Edition wrote on method size:

Remember that translating your C# code into machine-executable code is a two-step process. The C# compiler generates IL that gets delivered in assemblies. The JIT compiler generates machine code for each method (or group of methods, when inlining is involved), as needed. Small functions make it much easier for the JIT compiler to amortize that cost. Small functions are also more likely to be candidates for inlining. It’s not just smallness: Simpler control flow matters just as much. Fewer control branches inside functions make it easier for the JIT compiler to enregister variables. It’s not just good practice to write clearer code; it’s how you create more efficient code at runtime.

To put cyclomatic complexity in perspective, the following method has a cyclomatic complexity of 12.

public string ComplexityOf12(int status)
{
    var isTrue = true;
    var myString = "Chuck";

    if (isTrue)
    {
        if (isTrue)
        {
            myString = string.Empty;
            isTrue = false;

            for (var index = 0; index < 10; index++)
            {
                isTrue |= Convert.ToBoolean(new Random().Next());
            }

            if (status == 1 || status == 3)
            {
                switch (status)
                {
                    case 3:
                        return "Bye";
                    case 1:
                        if (status % 1 == 0)
                        {
                            myString = "Super";
                        }
                        break;
                }

                return myString;
            }
        }
    }

    if (!isTrue)
    {
        myString = "New";
    }

    switch (status)
    {
        case 300:
            myString = "3001";
            break;
        case 400:
            myString = "4003";
            break;

    }

    return myString;
}

A generally accepted complexity hypothesis postulates a positive correlation exists between complexity and defects.

The previous line is a bit convoluted. In the simplest terms — keeping code simple reduces your defect rate.

2. Write Testable Code

Studies have shown that writing testable code, without writing the actual tests lowers the incidents of defects. This is so important and profound it needs repeating: Writing testable code, without writing the actual tests, lowers the incidents of defects.

This begs the question, what is testable code?

I define testable code as code that can be tested in isolation. This means all the dependencies can be mocked from a test. An example of a dependency is a database query. In a test, the data is mocked (faked) and an assertion of the expected behavior is made. If the assertion is true, the test passes, if not it fails.

Writing testable code might sound hard, but, in fact, it is easy when following the Inversion of Control (Dependency Injection) and the S.O.L.I.D principles. You’ll be surprised at the ease and will wonder why it took so long to start writing in this way.

3. Code Reviews

One of the most impactful practice a development team can adopt is code reviews.

Code Reviews facilitates knowledge sharing between developers. Speaking from experience, openly discussing code with other developers has had the greatest impact on my code writing skills.

In the book Code Complete, by Steve McConnell, Steve provides numerous case studies on the benefits code reviews:

  • A study of an organization at AT&T with more than 200 people reported a 14 percent increase in productivity and a 90 percent decrease in defects after the organization introduced reviews.
    • The Aetna Insurance Company found 82 percent of the errors in a program by using inspections and was able to decrease its development resources by 20 percent.
    • In a group of 11 programs developed by the same group of people, the first 5 were developed without reviews. The remaining 6 were developed with reviews. After all the programs were released to production, the first 5 had an average of 4.5 errors per 100 lines of code. The 6 that had been inspected had an average of only 0.82 errors per 100. Reviews cut the errors by over 80 percent.

If those numbers don’t sway you to adopt code reviews, then you are destined to drift into a black hole while singing Johnny Paycheck’s Take This Job and Shove It.

4. Unit Testing

I’ll admit, when I am up against a deadline testing is the first thing to go. But the benefits of testing can’t be denied as the following studies illustrate.

Microsoft performed a study on the Effectiveness on Unit Testing. They found that coding version 2 (version 1 had no testing) with automated testing immediately reduced defects by 20%, but at a cost of an additional 30%.

Another study looked at Test Driven Development (TDD). They observed an increase in code quality, more than two times, compared to similar projects not using TDD. TDD projects took on average 15% longer to develop. A side-effect of TDD was the tests served as documentation for the libraries and API’s.

Lastly, in a study on Test Coverage and Post-Verification Defects:

… We find that in both projects the increase in test coverage is
associated with decrease in field reported problems when adjusted for
the number of prerelease changes…

An Example

The following code has a cyclomatic complexity of 4.

    public void SendUserHadJoinedEmailToAdministrator(DataAccess.Database.Schema.dbo.Agency agency, User savedUser)
    {
        AgencySettingsRepository agencySettingsRepository = new AgencySettingsRepository();
        var agencySettings = agencySettingsRepository.GetById(agency.Id);

        if (agencySettings != null)
        {
            var newAuthAdmin = agencySettings.NewUserAuthorizationContact;

            if (newAuthAdmin.IsNotNull())
            {
                EmailService emailService = new EmailService();

                emailService.SendTemplate(new[] { newAuthAdmin.Email }, GroverConstants.EmailTemplate.NewUserAdminNotification, s =>
                {
                    s.Add(new EmailToken { Token = "Domain", Value = _settings.Domain });
                    s.Add(new EmailToken
                    {
                        Token = "Subject",
                        Value =
                    string.Format("New User {0} has joined {1} on myGrover.", savedUser.FullName(), agency.Name)
                    });
                    s.Add(new EmailToken { Token = "Name", Value = savedUser.FullName() });

                    return s;
                });
            }
        }
    }

Let’s examine the testability of the above code.

Is this simple code?

Yes, it is, the cyclomatic complexity is below 5.

Are there any dependencies?

Yes. There are 2 services AgencySettingsRepository and EmailService.

Are the services mockable?

No, their creation is hidden within the method.

Is the code testable?

No, this code isn’t testable because we can’t mock AgencySettingsRepository and EmailService.

Example of Refactored Code

How can we make this code testable?

We inject (using constuctor injection) AgencySettingsRepository and EmailService as dependencies. This allows us to mock them from a test and test in isolation.

Below is the refactored version.

Notice how the services are injected into the constructor. This allows us to control which implementation is passed into the SendMail constructor. It’s then easy to pass dummy data and intercept the service method calls.

public class SendEmail
{
    private IAgencySettingsRepository _agencySettingsRepository;
    private IEmailService _emailService;


    public SendEmail(IAgencySettingsRepository agencySettingsRepository, IEmailService emailService)
    {
        _agencySettingsRepository = agencySettingsRepository;
        _emailService = emailService;
    }

    public void SendUserHadJoinedEmailToAdministrator(DataAccess.Database.Schema.dbo.Agency agency, User savedUser)
    {
        var agencySettings = _agencySettingsRepository.GetById(agency.Id);

        if (agencySettings != null)
        {
            var newAuthAdmin = agencySettings.NewUserAuthorizationContact;

            if (newAuthAdmin.IsNotNull())
            {
                _emailService.SendTemplate(new[] { newAuthAdmin.Email },
                GroverConstants.EmailTemplate.NewUserAdminNotification, s =>
                {
                    s.Add(new EmailToken { Token = "Domain", Value = _settings.Domain });
                    s.Add(new EmailToken
                    {
                        Token = "Subject",
                        Value = string.Format("New User {0} has joined {1} on myGrover.", savedUser.FullName(), agency.Name)
                    });
                    s.Add(new EmailToken { Token = "Name", Value = savedUser.FullName() });

                    return s;
                });
            }
        }
    }
}

Testing Exmaple

Below is an example of testing in isolation. We are using the mocking framework FakeItEasy.

    [Test]
    public void TestEmailService()
    {
        //Given

        //Using FakeItEasy mocking framework
        var repository = A<IAgencySettingsRepository>.Fake();
        var service = A<IEmailService>.Fake();

        var agency = new Agency { Name = "Acme Inc." };
        var user = new User { FirstName = "Chuck", LastName = "Conway", Email = "chuck.conway@fakedomain.com" }

        //When

        var sendEmail = new SendEmail(repository, service);
        sendEmail.SendUserHadJoinedEmailToAdministrator(agency, user);


        //Then
        //An exception is thrown when this is not called.
        A.CallTo(() => service.SendTemplate(A<Agency>.Ignore, A<User>.Ignore)).MustHaveHappened();

    }

Closing

Writing defect resistance code is surprisingly easy. Don’t get me wrong, you’ll never write defect-free code (if you figure out how, let me know!), but by following the 4 practices outlined in this article you’ll see a decrease in defects found in your code.

To recap, Writing Simple Code is keeping the cyclomatic complexity around 5 and the method size small. Writing Testable Code is easily achieved when following the Inversion of Control and the S.O.L.I.D Principles. Code Reviews help you and the team understand the domain and the code you’ve written — just having to explain your code will reveal issues. And lastly, Unit Testing can drastically improve your code quality and provide documentation for future developers.

All UTC times are not necessarily the same

A friend pointed out that all UTC Time is not the same. When he told me, I responded with “What!?! What are you talking about? It’s the same.” “No it’s not” he said. He explained, that yes using UTC will allot you an agreed upon time format but that does not guarantee that both server’s clocks are synchronized.

For example, Sever A calls server B for updates. Both Servers use UTC Time. Server A sends over a timestamp, how do we know that the two servers clocks are synchronized and that the two times match, we don’t. The odds are they are not. How can they be? Absolute time does not exist. It’s all relative. By using a timestamp to retrieve data from another server you are making an assumption that both servers have the same time.

*photo reference

2 minutes on Migrating Data

I had a great talk with my friend Dave today. He’s a Data Scientist. He knows his stuff, for sure.

We talked about a number of things, but one that really stuck out was data migration. He says never to migrate via code, use a tool. You are reinventing the wheel. You are locked into your solution. All the risk is in your court. And the solution is not flexible. With that said. He went on to say the most efficient way to move data is with a primary key and a hash.

The destination side will request all the primary key and row hash. Taking the primary key it will check if the row exists. If it does exist it will compare the hash of the source to the hash of the destination row. If they match then the process is repeated for the next row. If they don’t match, then the primary key is added to a list of rows to request from the source. If the primary key does not exist then the primary key is added to the list of rows to be retrieved from the source. When the row comparison is completed all the rows that are stale or do not exist are requested from the source and persisted to the destination.

Grunt

If you enjoy grunt work you’ll do the above. If you are a developer who enjoys building robust applications you’ll leave the grunt work to the tools.

Going Faster Tip #3

In Windows, type the first part of the previous command and then hit F8. The shell will perform a backward search through previous commands that match the first part of what you’ve just typed.

Back To Top