Static Code Analysis: Reducing Your Team’s Cognitive Burden
February 22, 2022
Have you ever run into a pull request that seemed impossible to merge? One with hundreds of comments from a dozen people, with two folks passionately arguing about choosing variable names, which language features to use, or whether to delete that unused method that might get used someday. Nobody can seem to agree on a set of standards, and with no ultimate authority to turn to, the code review devolves into a contest of wills.
Those pull requests from hell result in a lot of wasted time for a software engineering team. Don't you wish you could harness that extra time and funnel it back into building a quality product?
That’s where static code analysis comes to save the day!
Static code analysis is the process of analyzing source code against a standard set of rules. These rules vary based on programming language, business domain, and team preferences, but practically every major programming language has a decent static analysis tool that can be added into your team’s regular workflow.
Static code analysis can be accomplished with a variety of tools and methods. This article is going to talk about just two of them: types and linting. If you don't have either added to your team's workflow, those two are a great place to start.
Types
Programming languages can generally be separated into two camps: those that are statically typed and those that are dynamically typed.
Statically typed languages include C++, C#, and Rust. Dynamically typed languages include Python and JavaScript.
In general, types are a way of structuring the data in your code. In a statically typed language they are checked at compile time, which means bugs related to the type of data you're manipulating are caught up front, as part of the development process. In a dynamically typed language those same bugs only surface at runtime, which can lead to a bad user experience or errors in production environments.
Some dynamically typed languages have ways of adding in static types, so don't despair if your team is already using one. TypeScript is a great example: it extends JavaScript with type annotations that are checked before the code runs. If your tech stack has a way of using types, you should absolutely be using them!
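To make the compile-time check concrete, here is a minimal sketch. The function name and the sample input are made up for illustration; they do not come from any real codebase.

```typescript
// Hypothetical helper: tidies up a postal code entered by a user.
function normalizeZipCode(zipCode: string): string {
  return zipCode.trim().toUpperCase();
}

const cleaned = normalizeZipCode("  sw1a 1aa ");
console.log(cleaned);

// Uncommenting the next line makes the TypeScript compiler reject the program,
// because a number cannot be passed where a string is expected:
// normalizeZipCode(90210);
```

In plain JavaScript the commented-out call would be accepted and would only fail once it ran, because numbers have no trim method.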
Some programmers, especially those who have never used types, can be hesitant to add them to their codebases. It's one extra thing to learn, and when you switch from being able to run your code immediately to having a compiler yell at you before you can even run it, the experience can be a bit jarring.
But it's totally worth the upfront cost.
Let's look at a simple example of fetching data from an API in JavaScript:
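async function fetchData(id) {
  const response = await fetch(`https://my-api.com/data/${id}`);
  // Parse the JSON body; nothing here says what shape it has
  return response.json();
}
async function doSomething(id) {
  const data = await fetchData(id);
  // what can we do with data?
}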
Do you have any idea what sort of data you'll be getting from the server? Even if you remember right now, will you be able to answer correctly a year after writing the code? Our brains are not perfect records of everything we've done, so at some point you'd have to look at the documentation (if there even is any) or hit some breakpoints while running the code to figure it out.
But sprinkle some TypeScript in there and life gets so much better:
interface MyApiResult {
  id: number;
  name: string;
  address: string;
  city: string;
  zipCode: string;
}
async function fetchData(id: number): Promise<MyApiResult> {
  const response = await fetch(`https://my-api.com/data/${id}`);
  // The cast tells the compiler which shape to expect from the server
  return (await response.json()) as MyApiResult;
}
async function doSomething(id: number) {
  const data = await fetchData(id);
  // We can easily use anything listed in the MyApiResult interface!
  console.log(`Hello ${data.name}. How is ${data.city} these days?`);
}
Now we can immediately see that fetchData will return some basic user information. While this example is a bit contrived, imagine a whole team working on a codebase where nobody can immediately see what fetchData returns. The result is a bunch of wasted time looking at documentation or manually running the project and triggering the workflow that runs the code.
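One caveat: a TypeScript interface only exists at compile time, so it describes what the code expects from the server without checking what actually arrives. A minimal sketch of a runtime check follows, assuming the MyApiResult shape from the example above. The isMyApiResult helper and the sample payload are hypothetical.

```typescript
// Same shape as the interface in the example above.
interface MyApiResult {
  id: number;
  name: string;
  address: string;
  city: string;
  zipCode: string;
}

// Hypothetical helper: returns true only when `value` has every field
// of MyApiResult with the expected primitive type.
function isMyApiResult(value: unknown): value is MyApiResult {
  if (typeof value !== "object" || value === null) {
    return false;
  }
  const record = value as Record<string, unknown>;
  return (
    typeof record["id"] === "number" &&
    typeof record["name"] === "string" &&
    typeof record["address"] === "string" &&
    typeof record["city"] === "string" &&
    typeof record["zipCode"] === "string"
  );
}

// Stand-in for a parsed response body; a real one would come from response.json().
const payload: unknown = JSON.parse(
  '{"id": 1, "name": "Ada", "address": "1 Main St", "city": "London", "zipCode": "N1 9GU"}'
);

if (isMyApiResult(payload)) {
  // Inside this branch the compiler treats payload as MyApiResult.
  console.log(`Hello ${payload.name}. How is ${payload.city} these days?`);
} else {
  console.error("Unexpected response shape");
}
```

Whether a check like this is worth the extra code depends on how much you trust the API. Many teams reach for a schema validation library instead of writing guards by hand.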
Types are the most important form of static analysis, especially as team size grows. Programming is all about manipulating data in a computer, so why shoot yourself in the foot by writing code that ignores what that data looks like?
Save your team brainpower for problems more important than the shape of your data and get yourself a language with a type system!
Linting
The other major piece of static code analysis worth adding to your team's workflow is a linter. Linting is the process of analyzing code for bugs, performance issues, proper use of language features, and stylistic choices to ensure code consistency.
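As a small illustration of what a linter catches, the sketch below puts a flagged function next to its fixed version. Both functions are made up for this example. The rule names in the comments (no-var and eqeqeq) are ESLint core rules, and whether they are switched on depends on your configuration.

```typescript
// What a linter would flag.
function isSameIdLoose(a: number | string, b: number | string): boolean {
  var result = a == b; // no-var and eqeqeq would both report this line
  return result;
}

// The version that passes those rules.
function isSameIdStrict(a: number | string, b: number | string): boolean {
  const result = a === b;
  return result;
}

console.log(isSameIdLoose(42, "42"));  // true: loose equality coerces the string
console.log(isSameIdStrict(42, "42")); // false: strict equality also compares types
```

The two functions disagree on the same input, which is exactly the kind of quiet bug that is tedious to spot in a code review and trivial for a tool to report.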
Most modern languages have some sort of linting system. Some ship with the language's official tooling, like Rust's cargo clippy command, while others arise from community efforts, like JavaScript's ESLint.
However, initially setting up a linter can be difficult to do on a team. Remember those arguments about code style or the proper language features to use in PRs? A linter codifies that into a standard set of rules that everyone's code can be checked against. So the team will have to agree on what those rules should be and then the computer can enforce compliance with every new addition to the codebase.
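Once the team has agreed on its rules, they live in a configuration file that is checked in with the code. The fragment below is a sketch in ESLint's flat config format. The three rules are real ESLint core rules, but picking these three is only an assumption for the example, not a recommendation.

```typescript
// Sketch of an ESLint flat config file (for example eslint.config.js).
// Each rule is set to "error" so a violation fails the lint run.
export default [
  {
    rules: {
      eqeqeq: "error",           // require === and !==
      "no-unused-vars": "error", // report variables that are never read
      "prefer-const": "error",   // report let bindings that are never reassigned
    },
  },
];
```

Keeping the file short and leaning on a widely used shared configuration for the rest usually means fewer rules for the team to debate.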
The biggest gain from a linter is consistency. Even if you don't like particular linter rules, your team doesn't have to argue about what the code looks like during every pull request. A good team is full of people who will value consistency over the "perfect" linter configuration, so you should strive to pick sensible defaults that everyone can live with. Using a popular configuration is one way of quieting even the noisiest developer, since a configuration that's good enough for hundreds of thousands of other people will be good enough for your team.
Once a linter is installed, make sure it runs automatically and that you have gates in place so that no new code is merged until the linter is happy. Without a hard blocker, linter errors can and will seep into your code over time, eventually leaving you with thousands of errors or warnings that end up getting ignored by the team instead of addressed. This leads to code rot, performance issues, and a generally unpleasant developer experience when you're faced with a wall of doom anytime you see the linter run.
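The logic of such a gate is simple enough to sketch. The field names below follow the per-file entries of ESLint's JSON formatter output. The shouldBlockMerge function and the sample report are hypothetical, and other linters report their results differently.

```typescript
// Minimal shape of one entry in a lint report.
interface LintFileResult {
  filePath: string;
  errorCount: number;
  warningCount: number;
}

// Hypothetical gate: block the merge on any error, or when warnings
// exceed the allowed budget (zero by default).
function shouldBlockMerge(results: LintFileResult[], maxWarnings = 0): boolean {
  let errors = 0;
  let warnings = 0;
  for (const result of results) {
    errors += result.errorCount;
    warnings += result.warningCount;
  }
  return errors > 0 || warnings > maxWarnings;
}

const report: LintFileResult[] = [
  { filePath: "src/api.ts", errorCount: 0, warningCount: 1 },
  { filePath: "src/app.ts", errorCount: 0, warningCount: 0 },
];

console.log(shouldBlockMerge(report));    // true: one warning exceeds the budget of zero
console.log(shouldBlockMerge(report, 5)); // false: within a budget of five warnings
```

In practice you rarely need to write this yourself. ESLint, for instance, exits with a failure code when it finds errors, and its --max-warnings option does the same for warnings, so a CI job can simply run the linter and fail when it fails.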
Conclusion
Programming is a creative endeavor, and human brains only have so much capacity each day. By taking entire classes of issues off your team's plate, you free everyone to focus on the things that truly matter: solving problems that users of your system face.
A strong type system and sensible linting rules are two great ways to reduce your team's cognitive burden, allowing you to get more done in less time. Automation is the name of the game in software engineering, and having a computer check code against a set of rules is the perfect use of CPU cycles.
Don't spend your precious time arguing over pointless semantics. Use static code analysis tools.
This is the fifth of nine articles delving into the processes that every effective development team should use. Stay tuned for more!