I was reviewing some pull requests at work today. One of the PRs had an updated `composer.lock` file. We usually check whether the `reference` matches the `version` for this update, to see if that commit is actually released on the module’s master branch:
Usually, this `reference` matches the hash of the commit we’ve tagged as this `version`. In this particular case, however, the hash mentioned in `reference` was nowhere to be found in the commit log. So what’s going on here?
To investigate, I tried checking out that particular hash:
That’s funny... I check out `2c53986`, but I end up with `155ffc0`.
If we check the `rev-list` for that tag, you’ll notice that the latest commit is indeed `155ffc0`:
If we check `show-ref`, however, the tag points to a different hash instead:
If we do the same for tag `0.11.0`, you’ll notice that both commands report the same commit:
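The original command output is not reproduced here, so the following is only a minimal sketch of the same comparison in a throwaway repository. The identity and the `0.12.0` tag are made up for the demo; `0.11.0` simply reuses the tag name mentioned above as the lightweight case.

```shell
# Throwaway repository; the identity and the 0.12.0 tag are hypothetical.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q
git config user.name "Demo Author"
git config user.email "author@example.com"
git commit -q --allow-empty -m "Initial commit"

git tag 0.11.0                          # lightweight tag
git tag -a 0.12.0 -m "Release 0.12.0"   # annotated tag

# rev-list resolves a tag to the commit it ends up at;
# show-ref prints the hash stored in the tag ref itself.
lightweight_commit=$(git rev-list -n 1 0.11.0)
lightweight_ref=$(git show-ref --tags --hash 0.11.0)
annotated_commit=$(git rev-list -n 1 0.12.0)
annotated_ref=$(git show-ref --tags --hash 0.12.0)

echo "0.11.0: commit=$lightweight_commit ref=$lightweight_ref"
echo "0.12.0: commit=$annotated_commit ref=$annotated_ref"

# --dereference adds a "^{}" line with the commit behind each annotated tag.
git show-ref --tags --dereference
```

For the lightweight tag both hashes are identical; for the annotated tag the ref holds the hash of the tag object, and the `^{}` line shows the commit it points to.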
So, what’s the difference then? The answer is that there are two distinct types of tags.
Lightweight and Annotated Tags
This is all due to the fact that there are two distinct types of tags in git:
- Lightweight tags: A tag that is attached to an existing commit. This merely functions as a pointer to a specific commit, and as such it ‘piggybacks’ on that commit’s hash as identification. This type of tag does not allow you to store any information that is specific to the tag.
- Annotated tags: A tag that has its own object hash and is, as such, stored as a separate object in git. This tag allows you to store information that is related to this specific tag. You can add a tag message and GPG sign it, and the tagger’s name, e-mail address and date are recorded.
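To find out which kind of tag you are looking at, you can ask git for the type of the object the tag ref points to. The sketch below wraps that in a small helper; `tag_kind` and the tag names are hypothetical, and the repository is a throwaway one.

```shell
# Hypothetical helper: prints "annotated", "lightweight" or "unknown" for tag $1.
tag_kind() {
    case "$(git cat-file -t "refs/tags/$1" 2>/dev/null)" in
        tag)    echo "annotated" ;;    # the ref points to a tag object
        commit) echo "lightweight" ;;  # the ref points straight at a commit
        *)      echo "unknown" ;;      # missing tag, or a tag on a tree/blob
    esac
}

# Throwaway repository to try the helper on (all names are made up).
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q
git config user.name "Demo Author"
git config user.email "author@example.com"
git commit -q --allow-empty -m "Initial commit"
git tag light-demo
git tag -a annotated-demo -m "Annotated demo tag"

kind_light=$(tag_kind light-demo)
kind_annotated=$(tag_kind annotated-demo)
kind_missing=$(tag_kind does-not-exist)
echo "$kind_light / $kind_annotated / $kind_missing"
```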
So, which one should I use? The short answer is annotated tags. Read this StackOverflow answer to see why, because I can’t explain it any clearer than that :)
Creating an annotated tag is easy:
- `-s` will GPG sign your tag. More about this further down.
- `-a` will create an annotated tag.
- `-m "<message>"` will add a tag message.
- If you need to amend / fix / replace an existing tag, you can use the `-f` parameter to overwrite the current tag.
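As a rough illustration of these flags, the sketch below creates an annotated tag in a throwaway repository and then replaces it with `-f`. The version number and messages are invented, and `-s` is only shown as a comment because it needs a configured GPG key.

```shell
# Throwaway repository; identity, version and messages are hypothetical.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q
git config user.name "Demo Owner"
git config user.email "owner@example.com"
git commit -q --allow-empty -m "Initial commit"

# Annotated tag with a message.
git tag -a 1.0.0 -m "Release 1.0.0"
first_object=$(git rev-parse refs/tags/1.0.0)

# Signed variant (requires a GPG key, so it is not run here):
#   git tag -s 1.0.0 -m "Release 1.0.0"

# Replace the existing tag; without -f git refuses because 1.0.0 already exists.
git tag -f -a 1.0.0 -m "Release 1.0.0 (corrected notes)"
second_object=$(git rev-parse refs/tags/1.0.0)

subject=$(git for-each-ref --format='%(contents:subject)' refs/tags/1.0.0)
echo "old tag object: $first_object"
echo "new tag object: $second_object"
echo "message:        $subject"
```

Replacing the tag creates a new tag object with a new hash. If the old tag was already pushed, everyone who fetched it still has the old one, so overwriting published tags deserves some care.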
Why should I care?
You should care about the advantages of annotated tags. To elaborate, here are three viable use cases:
Whodunnit
Take my `composer.lock` example. We write modules in separate repositories, and include those modules using `composer` in actual projects. If I review a project, I only see the `composer.lock` file, but don’t immediately see the code for the actual module. I need some way to make sure that the code that’s being rolled out in the current project is approved and stable. This usually means that I have to dive into the module’s code and review that as well. But I’m no expert on every single module my company has created, so I’m probably not the best reviewer for a (large) number of modules. How do I know that the code has been reviewed and it’s all good? I check the `tag`. If it has been tagged by the module’s owner, I can rest assured that a proper and thorough review has been done before this release was tagged. Annotated tags make this easy:
As you can see, `git show` has information about the tag, and specifically who tagged it. Further down, you see the actual commit that this tag is connected to. That commit has a different author, but the tag was done by me, so it should be all good, right? No need to delve into that code any further, so we can move on with our original PR.
In short: Annotated tags give you separation between tagger (reviewer and/or releaser) and author/committer.
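If you only want the tagger and the author rather than the full `git show` output, the two can be read separately. This is a minimal sketch in a throwaway repository; the names, addresses and the `2.0.0` tag are all invented for the demo.

```shell
# Throwaway repository; names, e-mail addresses and the tag are hypothetical.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q

# The developer writes the commit...
GIT_AUTHOR_NAME="Module Developer" GIT_AUTHOR_EMAIL="dev@example.com" \
GIT_COMMITTER_NAME="Module Developer" GIT_COMMITTER_EMAIL="dev@example.com" \
    git commit -q --allow-empty -m "Add feature"

# ...and the module owner tags the release (the tagger is the committer identity).
GIT_COMMITTER_NAME="Module Owner" GIT_COMMITTER_EMAIL="owner@example.com" \
    git tag -a 2.0.0 -m "Release 2.0.0"

# Tagger comes from the tag object, author from the commit behind it.
tagger=$(git for-each-ref --format='%(taggername)' refs/tags/2.0.0)
author=$(git log -1 --format='%an' '2.0.0^{commit}')

echo "tagged by:  $tagger"
echo "written by: $author"
```

On a lightweight tag `%(taggername)` is empty, because there is no tag object to store a tagger in.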
What is contained in this tag?
Another vital piece of information that is usually contained in the pull request and/or other review docs that may exist is the context of the change set:
- Which changes are included in this change set?
- To which JIRA tickets are these changes related?
- Why have we decided to release these changes as a new tag?
- Are any additional actions required to make this release work? (Play-books)
As said, we usually document these changes in the PR, which gives a good overview of the entire context. But if you’re on the ‘receiving end’ of the repository, and you don’t have access to the review software for instance, that context might not be as clear.
That’s why it’s a good idea to copy all (or some) details from the PR into the annotated tag message, so a permanent piece of documentation about the how/what/when and why exists in the repository. See the example above for a plausible tag message.
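One way to do that, sketched under assumptions: write the release context to a file and pass it to `git tag` with `-F`, then read it back from the repository later. The notes, ticket key and version below are placeholders, not taken from a real release.

```shell
# Throwaway repository; the notes, ticket key and version are hypothetical.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q
git config user.name "Demo Owner"
git config user.email "owner@example.com"
git commit -q --allow-empty -m "Initial commit"

# Release context copied from the (imaginary) pull request.
notes=$(mktemp)
cat > "$notes" <<'EOF'
Release 3.0.0

Changes:
- Example change (placeholder)
Tickets: EXAMPLE-123
Additional actions: none
EOF

# -F reads the tag message from a file instead of -m.
git tag -a 3.0.0 -F "$notes"

# Anyone with a clone can read the context back, no review tool needed.
subject=$(git for-each-ref --format='%(contents:subject)' refs/tags/3.0.0)
body=$(git for-each-ref --format='%(contents:body)' refs/tags/3.0.0)
echo "$subject"
echo "$body"
```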
Whodunnit for paranoid people
Because, let’s face it, everyone should be a bit paranoid these days. How do I know that the tag and/or the repository haven’t been tampered with? Anyone can impersonate you in git, as long as they have write access to the `origin` repository. So let’s say that your private key has leaked, a hacker has configured `git config user.email` with your e-mail address, committed some malware to your repository and moved your tag to a version that includes the malware. That’s not good. That’s not good at all.
Granted, if the above scenario should happen to you, you probably have bigger issues than verifying a tag. But what if you pull in some external changes that you don’t quite trust? Or what if one of your enterprise customers wants to ensure that they pull in a verified set of changes onto their platforms?
That’s why you GPG sign your tags. Remember that `-s` parameter we’ve added?
During creation of the tag you’ll notice this:
and in the `git show` example above you’ll notice that a PGP signature was included. We can use this information to verify that the `tag` was intact (not tampered with), and that it was actually signed by someone we know and trust:
Yes, that tag has a good signature, and it was signed by me, whom I explicitly added to my GPG web of trust.
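The verification itself can be scripted, for example as a gate in a deployment script. The sketch below is an assumption about how you might wire that up: `tag_is_verified` is a hypothetical helper around `git verify-tag`, and the demo only covers the failing case (an unsigned tag), because creating a signed tag requires a private key.

```shell
# Hypothetical helper: succeeds only if GnuPG accepts the signature on tag $1.
# Whether you trust the signing key is a separate decision in your own keyring.
tag_is_verified() {
    git verify-tag "$1" >/dev/null 2>&1
}

# Throwaway repository with an unsigned annotated tag (names are made up).
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q
git config user.name "Demo Owner"
git config user.email "owner@example.com"
git commit -q --allow-empty -m "Initial commit"
git tag -a 4.0.0 -m "Release 4.0.0 (unsigned)"

# A signed tag would be created like this (needs a GPG key, so not run here):
#   git tag -s 4.0.0 -m "Release 4.0.0"

if tag_is_verified 4.0.0; then
    verified="yes"
else
    verified="no"   # unsigned or tampered tags end up here
fi
echo "4.0.0 verified: $verified"
```

To see the full GnuPG output, including who signed the tag, run `git tag -v <tag>` by hand.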
So, now you know how and why. Start using annotated tags!
Certain details, messages, addresses and hashes in this article have been altered to protect company specific details.