Merge vs Rebase: The War on Proper Git Usage Continues
I am split between two groups on my team at work, each with their own git repositories and policies on how you should commit. I describe the differences in this article, along with how I have adapted to them when working with these groups.
Anyone who looks at my LinkedIn profile will notice I work for Disney, more specifically for the Unified Messaging team within Disney. This is the team that is responsible for handling any messages sent to you from a Walt Disney Company (or subsidiary, such as ESPN) service or application. The team has been growing recently, both in the complexity of the system and in the number of people being hired or contracted to work with us. These people, of course, have no knowledge of our system and have to be taught how we prefer to do things, and many make the same mistakes again and again.
One of the mistakes I have seen is in relation to how people commit to our git repositories, and the personal preferences that this entails. On one side is the team lead of the API and backend team, who prefers a clean history above all else, and requires people to rebase and squash their branches when putting them into our develop branch. On the other side is the team lead of the web frontend for the API, who prefers an easy development process over a clean history, and hates seeing information lost in history. As the only developer on the team at this moment who works on both sides, I get to have a unique view of this dichotomy and the arguments that happen between these two leads whenever a unified process is proposed.
On the backend, the team lead champions a workflow in which you can make as many commits as you want while you are working on a local branch or a private branch on the server. Before you merge your changes into develop, you first use interactive rebase to restructure or (more commonly) squash your commits into only one or two commits, and then you merge those into our develop branch. Merge commits should never appear in private branches; they belong only in the develop branch, where they are created by pull requests. If your pull request has merge conflicts, rebase on top of develop to pick up the latest commits while keeping all your commits together.
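As a rough sketch of the backend flow described above, here is how it might look on the command line. This assumes a throwaway repository, and the branch name, file names, and commit messages are all made up for illustration. Interactive rebase opens an editor, so this sketch swaps in `git reset --soft` followed by a fresh commit as a non-interactive way to get the same squashed result; it is a stand-in, not what the team lead actually prescribes.

```shell
set -e
repo=$(mktemp -d)                      # throwaway repository for the sketch
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/develop   # name the initial branch "develop"
git config user.name "Example Dev"         # hypothetical identity
git config user.email "dev@example.com"

echo base > base.txt
git add base.txt
git commit -q -m "Initial commit on develop"

# Private branch (hypothetical name) with several work-in-progress commits.
git checkout -q -b feature/backend-item
for n in 1 2 3; do
  echo "step $n" >> feature.txt
  git add feature.txt
  git commit -q -m "WIP step $n"
done

# Meanwhile, develop moves forward.
git checkout -q develop
echo other > other.txt
git add other.txt
git commit -q -m "Unrelated change on develop"

# Replay the private branch on top of the latest develop.
git checkout -q feature/backend-item
git rebase -q develop

# Squash everything since develop into a single commit
# (non-interactive stand-in for squashing in "git rebase -i develop").
git reset -q --soft develop
git commit -q -m "Add backend work item"

# Merge into develop with an explicit merge commit, as a pull request would.
git checkout -q develop
git merge -q --no-ff -m "Merge feature/backend-item into develop" feature/backend-item

git --no-pager log --oneline --graph
```

The resulting graph should show one squashed commit sitting on top of the latest develop commit, joined by a single merge commit, with no trace of the three WIP commits.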
On the frontend, the workflow used by the team lead (and thus by everyone else who commits to it) is that you can make as many commits as you want, as long as those commits do not break the build. Someone should be able to check out your branch while you are working on it, to run some tests or play around with it before you have merged it into develop. In addition, they like to see your private branches in git so that they can monitor progress on different features. This leads to the policy that you should not rebase or otherwise alter your commit history, since doing so creates problems for others who may have checked out your branch. Merging your branch should likewise be done without altering any commits, and they favor GUI tools (such as SourceTree) for observing all the churn in the repository.
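For comparison, a minimal sketch of the frontend flow might look like the following. Again this assumes a throwaway repository, and the branch name, files, and messages are hypothetical. The point it tries to show is that every commit keeps its original hash, so anyone who checked out the branch earlier is never left holding rewritten history.

```shell
set -e
repo=$(mktemp -d)                      # throwaway repository for the sketch
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/develop   # name the initial branch "develop"
git config user.name "Example Dev"         # hypothetical identity
git config user.email "dev@example.com"

echo base > base.txt
git add base.txt
git commit -q -m "Initial commit on develop"

# Private branch (hypothetical name); each commit is expected to build.
git checkout -q -b feature/frontend-item
echo "part 1" > widget.txt
git add widget.txt
git commit -q -m "Add widget, part 1"
first=$(git rev-parse HEAD)            # remember this hash to check it survives

# A teammate's change lands on develop in the meantime.
git checkout -q develop
echo other > other.txt
git add other.txt
git commit -q -m "Teammate change on develop"

# Bring that change in with a merge rather than a rebase.
git checkout -q feature/frontend-item
git merge -q -m "Merge develop into feature/frontend-item" develop

echo "part 2" >> widget.txt
git add widget.txt
git commit -q -m "Add widget, part 2"

# Merge into develop without touching any existing commit.
git checkout -q develop
git merge -q --no-ff -m "Merge feature/frontend-item into develop" feature/frontend-item

git --no-pager log --oneline --graph
```

Here the graph keeps both merge commits and both feature commits, which is exactly the "churn" the frontend lead wants to be able to see.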
My workflow has to change depending on what I am working on, but it mostly goes like this:
- On the backend, I keep only one commit around that I am working on. I amend this commit whenever I am at a convenient point to do so, and it should always build, which may mean a few days pass between updates to it. I use rebase to bring in new changes from develop, and with one commit I do not have to worry about the complex conflicts that pile up with each additional commit. Force pushing is the norm for me, and it is the only action I have to drop to the command line for, since SourceTree does not allow it. Whenever I am not finished with a work item but need to switch tasks, I use stash to keep my changes around without committing them.
- On the frontend, I commit more frequently, and I do not use rebase on my private branch. Most of my recent features have been small, so they need at most a few commits, but if I were to do another large feature, I would do it in stages. When it comes time to merge, the branch is merged in without any tricks or changes to the commits. If I need to bring in someone else's changes, I use merge instead of rebasing onto them.
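The single-commit habit from the backend bullet above can be sketched roughly as follows. This assumes a local bare repository standing in for the server, and the branch and file names are made up. I use `--force-with-lease` here as a somewhat safer variant of a plain force push; that choice is mine for this sketch and not something either team mandates.

```shell
set -e
work=$(mktemp -d)                           # scratch area for the sketch
git init -q --bare "$work/origin.git"       # stand-in for the shared server
git init -q "$work/clone"
cd "$work/clone"
git symbolic-ref HEAD refs/heads/feature/one-commit   # hypothetical branch name
git config user.name "Example Dev"                    # hypothetical identity
git config user.email "dev@example.com"
git remote add origin "$work/origin.git"

# First version of the one commit, pushed to the private branch.
echo "first draft" > item.txt
git add item.txt
git commit -q -m "Add work item"
git push -q -u origin feature/one-commit

# Later: fold more work into the same commit instead of adding a new one.
echo "second draft" >> item.txt
git add item.txt
git commit -q --amend --no-edit

# The amended commit has a new hash, so a plain push would be rejected.
git push -q --force-with-lease origin feature/one-commit

# Need to switch tasks mid-change: park the edits without committing them.
echo "half-finished" >> item.txt
git stash -q
# ... work on something else, then come back ...
git stash pop -q

git --no-pager log --oneline
```

After this runs, the branch still holds exactly one commit, the server copy matches it, and the half-finished edit is back in the working tree, uncommitted.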
By switching how I do things depending on what I am working on, I manage to stay sane, but my personal preference is for a workflow similar to the frontend's. I view a clean history as less important, since SourceTree provides excellent tools for investigating the commit history in our repositories. To me, rebase is equivalent to lying: it may be justified in extreme circumstances, but it should not be the tool for every situation. Merge provides a cleaner way to handle conflicts all at once rather than one commit at a time, and it does not pretend that you started your work later in the timeline. Grouping commits is fine, but that is what branches are for, and any good git tool should allow you to limit your scope to a single branch.
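On the command line, one way to limit your scope like this is the `--first-parent` option to `git log`, which follows only the first parent of each merge commit. The sketch below assumes a throwaway repository with made-up branch and file names, and is only meant to show the difference between the two views.

```shell
set -e
repo=$(mktemp -d)                      # throwaway repository for the sketch
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/develop   # name the initial branch "develop"
git config user.name "Example Dev"         # hypothetical identity
git config user.email "dev@example.com"

echo base > base.txt
git add base.txt
git commit -q -m "Initial commit on develop"

# A feature branch (hypothetical name) with two commits, merged unaltered.
git checkout -q -b feature/example
for n in 1 2; do
  echo "change $n" >> feature.txt
  git add feature.txt
  git commit -q -m "Feature commit $n"
done
git checkout -q develop
git merge -q --no-ff -m "Merge feature/example into develop" feature/example

# Full history: the merge, both feature commits, and the initial commit.
git --no-pager log --oneline develop

# Scoped history: only the commits made directly on develop's own line.
git --no-pager log --oneline --first-parent develop
```

The second log hides the individual feature commits behind their merge commit, so the detailed history stays available without cluttering the view of develop.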
So, that about sums up my thoughts on merge vs rebase. I am sure that if I had more experience with other tools, or different priorities, I would follow more of what the backend team lead wants to see done. As it is, my workflow stays out of my way and lets me write code without a lot of headaches when it comes time to merge into develop. That is all I care about as a developer, and that is what git was supposed to provide for me.