With every commit you make, Git not only records the change to your code but also marks you as the author of that change. It uses the settings user.name and user.email to create an entry like this one in the log:
    commit aeaeb9e927592e907ae1a7f5c381876e05109a80
    Author: John Doe <john@doe.org>
    Date:   Mon Jun 10 20:11:11 2019 +0200

        Your commit message
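To see where the Author line comes from, here is a minimal sketch that runs in a throwaway repository so your real settings stay untouched. The temporary directory and the empty commit are assumptions made purely for illustration:

```shell
# Create a scratch repository (hypothetical location chosen by mktemp).
repo=$(mktemp -d)
git init -q "$repo"

# Set the identity for this repository only.
git -C "$repo" config user.name "John Doe"
git -C "$repo" config user.email "john@doe.org"

# An empty commit is enough to record an author.
git -C "$repo" commit -q --allow-empty -m "Your commit message"

# %an and %ae print the author name and email exactly as stored in the commit.
author=$(git -C "$repo" log -1 --format='%an <%ae>')
echo "$author"
```

The printed value is simply the two configured strings put together. Git does not validate or normalise them.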
As long as you use Git only to manage your source code, this behaviour does not require any attention. You write your code, commit and push as you like, and everything works. However, if you intend to do any form of data mining on your Git repository, you need to look more closely at how the authors of Git commits are tracked.
Is there a problem?
The settings for your name and email are just strings. You can set them to whatever you want, and for Git those values are case-sensitive. A user.name of John and one of john are therefore two different users, and Git will show them as two different authors. It gets even worse, because the email address is case-sensitive too:
    git shortlog -se
         1  John Doe <John@Doe.org>
         1  John Doe <John@doe.org>
         2  John Doe <john@doe.org>
         1  Max <max@test.co.uk>
         3  Max Example <hi@test.com>
         1  john doe <john@doe.org>
The problem starts when you need to know how many commits were made by John. He may use different computers and have different settings on each of them. And even when he changes them all to an identical value, the commits he has already made will still show the old values.
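The last point is easy to check for yourself. The following sketch, again in a throwaway repository with empty commits (both assumptions for illustration), commits once with a lowercase name, corrects the setting and commits again:

```shell
repo=$(mktemp -d)
git init -q "$repo"

# First commit with a lowercase name, as on a misconfigured machine.
git -C "$repo" config user.name "john doe"
git -C "$repo" config user.email "john@doe.org"
git -C "$repo" commit -q --allow-empty -m "First commit"

# Correct the setting, then commit again.
git -C "$repo" config user.name "John Doe"
git -C "$repo" commit -q --allow-empty -m "Second commit"

# List author names, oldest commit first.
authors=$(git -C "$repo" log --reverse --format='%an')
echo "$authors"
```

The first commit keeps the old spelling, because the author is stored in the commit itself and changing the configuration only affects new commits.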
Therefore, with every command you run to find out who has changed a file, you will need to manually add up the counts of the different author identities. That is a lot of work. Luckily, you do not need to do it if you use a .mailmap file.
Use .mailmap to merge authors
Git applies the mappings in the .mailmap file, which lives in the top-level directory of your repository, before it shows you the output of commands such as git shortlog. The content of .mailmap follows a simple template:
Name you want to keep <email> Name you no longer want <email>
For every user in your project that has multiple author identities, you can create an entry in the .mailmap file. You first write the author information you want to keep, then a space and then the author info you no longer want. In my example the .mailmap file looks like this:
    John Doe <john@doe.org> John Doe <John@Doe.org>
    John Doe <john@doe.org> John Doe <John@doe.org>
    John Doe <john@doe.org> john doe <john@doe.org>
    Max Example <hi@test.com> Max <max@test.co.uk>
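If you want to check a mapping before you rely on it, git check-mailmap prints the identity that Git resolves for a given name and email. The sketch below writes the example file into a throwaway repository (the temporary directory is an assumption for illustration) and looks up two of the old identities:

```shell
repo=$(mktemp -d)
git init -q "$repo"

# Write the example .mailmap into the top-level directory of the repository.
cat > "$repo/.mailmap" <<'EOF'
John Doe <john@doe.org> John Doe <John@Doe.org>
John Doe <john@doe.org> John Doe <John@doe.org>
John Doe <john@doe.org> john doe <john@doe.org>
Max Example <hi@test.com> Max <max@test.co.uk>
EOF

# Ask Git which identity each old entry resolves to.
mapped_max=$(git -C "$repo" check-mailmap "Max <max@test.co.uk>")
mapped_john=$(git -C "$repo" check-mailmap "john doe <john@doe.org>")
echo "$mapped_max"
echo "$mapped_john"
```

Both lookups should print the identity from the left-hand side of the matching line. The file does not need to be committed for this check, because Git reads it from the working tree.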
If I now run the shortlog command again, I only get one entry per “real” author:
    git shortlog -se
         5  John Doe <john@doe.org>
         4  Max Example <hi@test.com>
With this little trick you do not need to count or guess. Git just shows you what you need to know.
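The whole effect can be reproduced from start to finish with a short sketch. It assumes a throwaway repository, empty commits and a made-up committer identity, and it uses the --author option only to simulate commits from differently configured machines:

```shell
repo=$(mktemp -d)
git init -q "$repo"

# Committer identity for the scratch repository (hypothetical values).
git -C "$repo" config user.name "Committer"
git -C "$repo" config user.email "committer@example.com"

# Simulate commits made under four different author identities.
for author in "John Doe <john@doe.org>" "john doe <john@doe.org>" \
              "Max <max@test.co.uk>" "Max Example <hi@test.com>"; do
    git -C "$repo" commit -q --allow-empty --author="$author" -m "Commit by $author"
done

# Count the identities shortlog reports before any mapping exists.
# HEAD is passed explicitly so shortlog does not try to read from standard input.
before=$(git -C "$repo" shortlog -se HEAD | wc -l | tr -d ' ')

# Merge the duplicates.
cat > "$repo/.mailmap" <<'EOF'
John Doe <john@doe.org> john doe <john@doe.org>
Max Example <hi@test.com> Max <max@test.co.uk>
EOF

# Count again with the mapping in place.
after=$(git -C "$repo" shortlog -se HEAD | wc -l | tr -d ' ')
echo "identities before: $before, after: $after"
```

The commits themselves are not rewritten. Only the way Git reports the authors changes, so the file can be added or adjusted at any time.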
Conclusion
Merging authors using .mailmap is especially helpful when your team members use different machines or different Git clients. All you need to do is update the file whenever a new combination turns up, and everyone can focus on their work instead of trying to catch every Git misconfiguration. Try it when you need to use Git for more than just source control.