Friday, September 28, 2012

Merge on check-in, is there more to this?

In case you find yourself merging code very frequently when trying to commit/check in/push your changes to the version control system (VCS like GIT or SVN), you should start asking yourself - 
"Why am I merging so many times? What is the reason for this?".

If you are making lots of merges to your code - it could indicate a design problem in your project.
(I am trying to be very careful here, as each project is unique in its characteristics - the amount of code and modules, the amount of people working on it, and so on, so it does not necessarily point that out).

"How come?" you might ask...
Basically merging happens when more than one party checks out a piece of code makes changes to it and then checks it in. Only the first party that checks in is avoiding a merge, but all others that made changes (to a now stale version of a file) will have to merge their changes with the latest commits.

I heard many times the good argument that claims that to check-in small pieces of code at a time reduces the amount of merges and helps avoiding big sudden changes to the code which could render it buggy. 
While a "small" size of code is something subjective the idea is clear and makes sense.

It is easy to keep track of small changes. A single bug fix or small enhancement and so on.

Although this is a good practice and will help you reduce the amount of merges, there are sometimes still situations where merges happen more than they should.

There could be another reason why you're merging more than you should, and this reason could be - a design "problem" in your code.
Such "problem" could be for example :

1. Classes that represent two (or more) logical units or well defined purpose.

Then it's clear that if 2 people (or more) would need to make some changes regarding different functionalities of this class, because this is the way work was distributed in their team, those people are very likely to face merging issues when they will try to check-in their code.
While if the class would have been splitted into 2 (or more) distinguished classes with clear purpose, then the 2 parties wouldn't need to face a merging issue.
Unless... (here comes #2)

2. Dependencies - sure almost everyone has dependencies and that is how things work. 
But what happens if you are starting to have a lot of dependencies? Then you're also likely to face a merge. 
Only this time you'll have to change your dependants' code (code who uses your class/interface as API) in order not to break compilation, and in those dependants is exactly where the merge lies. 
(Someone who just worked on that class earlier using your now 'old' API... and now you have to adapt their new code! - I'm not talking about changing your interface or API as part of a general change in the infrastructure of the system, that is obvious, and intended to affect client code. I'm talking about the times where it was not meant to be or shouldn't be due to bad design, wrong API usage and so on.)
Although adapting client code might sound less problematic, reducing those incidents as much as possible by reducing the amount of your dependencies, will help you avoid more merges. (Changer point of view).
Also as a client, It would be less likely that someone would come and change your client code, and adapt it in a way that you did not mean.

So if you're having lots of merges, try to understand why you have them before trying to resolve them. Sometimes it is very tempting to go on with the merge, as today there are good automatic tools to perform a merge. But, maybe you'll find something interesting, or come up with a way to refactor code that was taken for granted as a "law of nature". 

You'll eventually earn less merges, a more modular well defined code, and become less error prone, 
After all, who hasn't made a mistake once of accidentally dropping some code on a merge?

So not every merge is a reason to start wondering why this is happening, but it should make you at least aware of the situation and think whether it could be different.
Hope  this post wasn't stating the obvious.

1 comment: