Git Hygiene
August 18, 2020 • 1,987 words
Git is an extremely powerful tool used to track changes in files, particularly source code. It has helped software development advance substantially over the last couple of decades, and a majority of software developers use it on a daily basis.
That said, many developers don't have a good grasp of the tool. They understand the basics, like branching, merging, and committing, but when you start to reach deeper into the toolbox, git trips up many people. This leads to a sub-standard use of the tool, and it completely locks out some of the most useful features that we should be using to take care of our code bases.
This article is meant to provide some ideas on how to improve your git workflow, listed in an order that you can slowly introduce to your team without too much change at any one point:
- Standardize your commit messages
- Rebase locally instead of merging
- Version your software with tags
- Use a defined branch management methodology
- Squash broken or insignificant commits
- Use git hooks
I'm sure there are other ideas that aren't listed here, but I've found these to be extremely useful in both my personal and professional coding practices.
Let's get started!
Standardize Your Commit Messages
Let's face it. We've all written a poor commit message. I personally have a ton of "fixed bug", "arrrgh", "it's working!", "kill me now", and other useless commit messages in my past. When working on personal projects, this usually isn't a problem. But as soon as you start collaborating, all of those vague, frustrated messages do nothing to help your teammates understand the types of changes being added to the code base.
Since I don't like to reinvent the wheel, I have been using Conventional Commits as a personal standard for all my commits, and I've been working on introducing it to my teams at work. If you don't have a good sense of how you want to structure your team's messages, I'd recommend starting with Conventional Commits. It's used by a wide variety of teams and I have never felt at a loss for words in my commit messages based on the conventions it provides.
That said, the exact convention doesn't really matter. What's important is that you have a standard. You can keep it simple, or even get fancy and use emojis. Once you pick one, encourage your team to follow it!
A good commit message, combined with other steps in this article, will allow developers to look at the git history and grasp what types of changes have occurred. It provides useful historical data that can be extremely helpful when tracking down bugs, on-boarding new teammates, or automatically generating release notes.
Committing to a standard is an easy change to make on your team. It doesn't require any fancy technologies, just a conversation and commitment from your team!
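To make a standard like this concrete, here is a minimal Python sketch that checks whether a commit subject line follows the basic Conventional Commits shape of type(scope): description. The list of allowed types and the name is_conventional are assumptions for illustration; the specification itself only gives feat and fix a defined meaning, and teams usually agree on their own list.

```python
import re

# Assumed, team-specific list of allowed types; adjust to your own standard.
ALLOWED_TYPES = ("feat", "fix", "docs", "refactor", "test", "chore")

# type, optional (scope), optional "!" for breaking changes, then ": description"
SUBJECT_PATTERN = re.compile(
    r"^(?P<type>[a-z]+)(\((?P<scope>[^()\s]+)\))?(?P<breaking>!)?: (?P<description>\S.*)$"
)


def is_conventional(subject: str) -> bool:
    """Return True if the subject line matches the assumed convention."""
    match = SUBJECT_PATTERN.match(subject)
    return match is not None and match.group("type") in ALLOWED_TYPES
```

With this sketch, a subject such as "feat(parser): add support for arrays" passes, while "fixed bug" and "arrrgh" are rejected.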
Rebase Locally Instead of Merging
Once I pulled up the git history of a repository and 15 of the top 30 commits were all merges of the dev branch into various feature branches. This was a repository used by a large team, so every day we'd ship new features to the dev branch. All the developers would then merge those changes into their branches so they had the latest code.
Keeping your local repository up to date with the remote one is an excellent practice! But when your git tree becomes nothing but merge commits (which generally carry poor commit messages themselves), it becomes impossible for developers on the team to read those excellent commit messages your team recently started writing!
Instead, encourage your teammates to pull new changes from the remote repository by doing a rebase.
rebase can be scary, especially for newer developers. I didn't dare touch it for the first few years of my career, because you can really screw things up compared to the simplicity of a merge. That said, rebase is a powerful tool for keeping your git history clean. If you need a primer on rebase, check out the official git documentation to get familiar.
Once everyone gets into the practice of using rebase, your commit history will be much cleaner, and the only merge commits you'll have are those involved in promoting one branch to another (depending on what type of branch management methodology you use), making it even easier to understand how code is moving through your branches or environments!
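As a rough illustration of the rebase-based update described above, the Python sketch below builds the two git commands involved: fetch the remote, then replay local commits on top of the remote branch. The function names and the default remote and branch (origin and dev) are assumptions for this example, not part of git.

```python
import subprocess


def rebase_update_commands(remote: str = "origin", branch: str = "dev") -> list[list[str]]:
    """Return the git commands that replay local commits on top of remote/branch."""
    return [
        ["git", "fetch", remote],
        ["git", "rebase", f"{remote}/{branch}"],
    ]


def run_commands(commands: list[list[str]]) -> None:
    """Run each command in order, stopping at the first one that fails."""
    for command in commands:
        subprocess.run(command, check=True)
```

Calling run_commands(rebase_update_commands()) from inside a feature branch would perform the update. Running git pull --rebase origin dev achieves the same thing in a single step.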
Version Your Software with Tags
Another powerful tool that I often see underused is tags. Tags are a simple way to describe a specific commit. You can use them to label your software versions, keep track of a problematic commit, automate your releases, and anything else you can come up with. They're extremely flexible and, most importantly, unchanging (unlike branches).
My favorite use is tagging software releases. SemVer, CalVer, build numbers, or even an incremental count will help you deploy and keep track of your releases. Versioning your software is a great way to reduce confusion on what versions your developers, testers, and users are using. You can even get tags set up as part of your build pipeline to make the release process less of a headache!
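If you go with SemVer, the next tag can be computed rather than typed by hand. The following Python sketch assumes tags of the form vMAJOR.MINOR.PATCH (the leading v is a common convention, not a requirement), and the helper names parse_tag, bump, and tag_command are made up for this example.

```python
import re

# Assumed tag format: "v" followed by MAJOR.MINOR.PATCH, e.g. "v1.2.3".
SEMVER_TAG = re.compile(r"^v(\d+)\.(\d+)\.(\d+)$")


def parse_tag(tag: str) -> tuple[int, int, int]:
    """Split a tag like "v1.2.3" into its three numeric parts."""
    match = SEMVER_TAG.match(tag)
    if match is None:
        raise ValueError(f"not a vMAJOR.MINOR.PATCH tag: {tag!r}")
    major, minor, patch = (int(part) for part in match.groups())
    return major, minor, patch


def bump(tag: str, part: str) -> str:
    """Return the next tag after incrementing "major", "minor", or "patch"."""
    major, minor, patch = parse_tag(tag)
    if part == "major":
        return f"v{major + 1}.0.0"
    if part == "minor":
        return f"v{major}.{minor + 1}.0"
    if part == "patch":
        return f"v{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown part: {part!r}")


def tag_command(tag: str) -> list[str]:
    """Build the git command that creates an annotated tag on the current commit."""
    return ["git", "tag", "-a", tag, "-m", f"Release {tag}"]
```

A build pipeline could call bump on the latest tag, run the resulting tag_command, and then publish the tag with git push origin followed by the tag name.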
Use a Defined Branch Management Methodology
Everyone has a branch management methodology, whether you've put thought into it or not. Sometimes, these methodologies have carried over from previous version control software. Other times they've evolved over time to serve the needs of your team. Regardless of what methodology you use, you should make sure that it's written down and that your team understands the process. If you're having a tough time with merge conflicts, a proliferation of branches, or other issues with your git workflow, I'd recommend checking out at least the following methodologies:
- GitFlow
- GitHub Flow
- GitLab Flow
Simply reading about other ways of organizing your branches will help you craft the best one for your team. Personally, I've been happy with both GitFlow and GitHub Flow on past teams, but GitLab Flow is intriguing and I'll likely give it a shot someday!
Once you've got your methodology in place and your team is actively working with it, you can pair it with your CI/CD pipeline (and tags) to further automate your release process!
Squash Broken or Insignificant Commits
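A fantastic way to keep your git history clean is to squash commits. Squashing allows you to mush multiple commits into a single one. Git has no standalone squash command; you squash with an interactive rebase (git rebase -i) or with git merge --squash.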
Good developers are in the habit of committing early and often. Those frequent commits save you from losing work if the crazy refactoring idea you had doesn't turn out to work so well. If you had made a commit to save your work before the refactor, it's easy to jump back to a working state without losing anything!
However, by the time you've finished writing a new feature or squashing that bug, your git history likely has a variety of commits, some of which may break the build or your tests. Those commits are not very useful to you, and your teammates won't care about them either.
That's where squash comes in! The big caveat when using squash is that it rewrites history. All those commits become one shiny new commit, complete with all the messages of each commit in the message body. The new history can really trip up your teammates if they happen to be working off the same feature branch as you. Ideally, sharing branches doesn't happen often, but sometimes you gotta do what you gotta do to ship that feature or bug fix.
If you know that nobody is using your branch, it's safe to push that new history and overwrite your long chain of commits. But be careful about it, otherwise you'll end up with some upset teammates.
My favorite way is to leave the squashing until the very end. Your CI/CD tools should be able to squash your PR on merge, which results in a single commit full of all that tasty goodness you finished being delivered to the main working branch for your teammates to enjoy!
squash and rebase together lead to a more readable history, as every remote commit has a deep level of meaning. Every commit either ends up delivering new functionality or is an easily identifiable merge to help you track code as it moves through your environments.
Next time you've got a new feature you're developing, give squash a try! It really is a life-changer.
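For a sense of what a squash on merge amounts to, here is a small Python sketch. It builds a single commit message from several smaller ones and lists the git commands for a squash merge. The function names, branch names, and bullet format are assumptions for illustration; hosted CI/CD tools format their squash messages in their own ways.

```python
def squash_message(summary: str, messages: list[str]) -> str:
    """Build one commit message: a summary line, then each original message as a bullet."""
    body = "\n".join(f"* {message}" for message in messages)
    return f"{summary}\n\n{body}"


def squash_merge_commands(feature: str, target: str, message: str) -> list[list[str]]:
    """Return the git commands that land a feature branch on target as one commit."""
    return [
        ["git", "checkout", target],
        # --squash stages the combined changes without creating a commit
        ["git", "merge", "--squash", feature],
        ["git", "commit", "-m", message],
    ]
```

Because git merge --squash creates the single commit on the target branch and leaves the feature branch untouched, this route avoids force-pushing rewritten history to a branch someone else may be using.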
Use Git Hooks
Finally, we arrive at hooks.
Hooks allow you to step into the git life-cycle and run arbitrary scripts. Those scripts live in the .git/hooks folder of your repository, and it's likely that there are some example ones sitting there for you to check out right now! The great part is, you can use whatever scripting language or tooling you want. While many of the hooks I've seen are bash scripts, you are free to choose whatever language fits the workflow and skill-set of your team.
One caution with hooks: these run locally, so your teammates could disable them easily if they get annoyed with the process. It takes strong buy-in from the team to successfully use hooks, but once everybody is on board they are an excellent tool.
There are a lot of great tutorials out there on how and why to use hooks, so I'd recommend you check them out, but the most useful ones I've seen or used are the following:
- pre-commit
- prepare-commit-msg
- commit-msg
- post-commit
pre-commit
The pre-commit hook is an excellent spot to run your tests, linters, and code analysis tools. If any of those fail, you won't be able to commit! This hook helps ensure that everything being pushed to the remote server at least doesn't break things or degrade the readability or quality of the code.
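A pre-commit hook can be as small as a script that runs a list of commands and reports failure if any of them fails; git aborts the commit when the hook exits with a non-zero status. The Python sketch below shows one way to do that. The CHECKS list is a placeholder assumption and should be replaced with your team's real test and lint commands.

```python
import subprocess
import sys

# Hypothetical checks; swap in your own test runner and linters.
CHECKS = [
    [sys.executable, "-m", "unittest", "discover", "-q"],
]


def run_checks(checks: list[list[str]]) -> int:
    """Run each check in order; return 0 if all pass, 1 at the first failure."""
    for command in checks:
        result = subprocess.run(command)
        if result.returncode != 0:
            print(f"pre-commit: check failed: {' '.join(command)}", file=sys.stderr)
            return 1
    return 0
```

To use it as a hook, save the script as .git/hooks/pre-commit, make it executable, and end it with sys.exit(run_checks(CHECKS)). Keep in mind that anyone can skip the hook with git commit --no-verify, which is why team buy-in matters.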
prepare-commit-msg
The prepare-commit-msg hook lets you modify the commit message before it is shown to the author, allowing you to pre-fill a standardized message format. We've already covered why good messages are important. This hook encourages teammates to stick to those standards.
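One common use is pre-filling the message with a ticket number taken from the branch name. The Python sketch below shows only the text transformation; the branch naming convention (kind/TICKET-123-description), the bracketed prefix, and the name prefill_message are assumptions for this example.

```python
import re

# Assumed branch naming convention: "<kind>/<TICKET-123>-short-description".
TICKET_IN_BRANCH = re.compile(r"^[a-z]+/([A-Z]+-\d+)")


def prefill_message(message: str, branch: str) -> str:
    """Prepend "[TICKET-123] " to the message if the branch name contains a ticket."""
    match = TICKET_IN_BRANCH.match(branch)
    if match is None:
        return message  # no ticket in the branch name, leave the message alone
    prefix = f"[{match.group(1)}] "
    if message.startswith(prefix):
        return message  # already prefixed, for example when amending
    return prefix + message
```

In a real hook, git passes the path of the message file as the first argument, so the script would read that file, look up the branch with git rev-parse --abbrev-ref HEAD, and write the result of prefill_message back to the same file.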
commit-msg
The commit-msg hook runs after the commit message is provided. This is a great place to check that the story or bug number is included in the message, or that your teammates actually adhered to the standard template provided with the prepare-commit-msg hook.
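Here is a minimal Python sketch of such a check. It rejects a message that does not mention a ticket, assuming references look like APP-123. That format and the function names are illustrative assumptions, and the simple pattern would also accept look-alikes such as UTF-8.

```python
import re
import sys

# Assumed ticket reference format: uppercase project key, dash, number (e.g. APP-123).
TICKET_REFERENCE = re.compile(r"\b[A-Z]+-\d+\b")


def has_ticket(message: str) -> bool:
    """Return True if any non-comment line mentions a ticket."""
    # Lines starting with "#" are comments that git drops from the final message by default.
    lines = [line for line in message.splitlines() if not line.startswith("#")]
    return any(TICKET_REFERENCE.search(line) for line in lines)


def check_message_file(path: str) -> int:
    """Return 0 if the message file references a ticket, 1 otherwise."""
    with open(path, encoding="utf-8") as handle:
        message = handle.read()
    if has_ticket(message):
        return 0
    print("commit-msg: no ticket reference such as APP-123 found", file=sys.stderr)
    return 1
```

Git calls the commit-msg hook with the path to the message file as its only argument, so a hook script would finish with sys.exit(check_message_file(sys.argv[1])). A non-zero exit status aborts the commit.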
post-commit
The post-commit hook lets you decide what to do after a successful commit. You could automatically deploy the changes if the message contains release:, send a notification to the team that new changes are available, or anything else you come up with.
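As a sketch of the release idea, the Python below reads the most recent commit message and decides whether it marks a release. It uses a slightly stricter rule than "contains": the subject line must start with release:. That rule and the function names are assumptions for this example, and the deployment step itself is left out.

```python
import subprocess

RELEASE_PREFIX = "release:"


def is_release(message: str) -> bool:
    """Return True if the first line of the message starts with the release prefix."""
    stripped = message.strip()
    subject = stripped.splitlines()[0] if stripped else ""
    return subject.startswith(RELEASE_PREFIX)


def last_commit_message() -> str:
    """Return the full message of the most recent commit in the current repository."""
    result = subprocess.run(
        ["git", "log", "-1", "--pretty=%B"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

A post-commit script would call is_release(last_commit_message()) and start whatever deploy or notification step your team uses. The hook receives no arguments, and its exit status does not affect the commit, which has already been made.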
There are even more hooks to explore! Go wild, and automate all the things!
Conclusion
I've thrown a lot at you in one article, especially if any of these git concepts are new to you. For a quick review, I've suggested that you consider introducing these practices into your team's workflow in the following order:
- Standardize your commit messages
- Rebase locally instead of merging
- Version your software with tags
- Use a defined branch management methodology
- Squash broken or insignificant commits
- Use git hooks
Your team may be more comfortable introducing some concepts sooner than others. Use this article as a guide, and choose the workflow that works best for your team! No team is the same, so no workflow will fit all use cases.
I'm sure there are some amazing ways to use git that I don't know about or completely ignored, so I encourage you to do your own learning and research when deciding how to set up your version control system. And if you find better ways of doing things, please let me know! I love learning about new features or ways of doing things, which is why I share them in the first place.
Good luck with wrangling your team's git workflow! It's well worth the effort.