2019-12-20

Reviewing Merge Requests in Your Editor Instead of GitLab/GitHub

This post presents a small tool that I built for myself to help me during code review. It lets you view the diff of merge requests (or “pull requests” if you use GitHub) using the diff viewer of your editor instead of having to browse GitLab/GitHub. After a short overview of the tool (its usage and the git commands it uses), I give a more detailed explanation of its inner functioning, I list the corner cases I took into account, and I give some details on the implementation using Python.

When using GitLab or equivalent products such as GitHub [1], you spend quite a lot of time reviewing “merge requests”, and a good part of this time consists of looking at the “changes” introduced by the branch that is going to be merged: this is called the “diff” of the branch.

A merge request diff in GitLab

It’s great that GitLab lets you view this diff directly in your browser, but for complex merge requests you will quickly miss the speed and tooling of your editor.

It turns out that most modern editors include a diff viewing tool as well, but it is usually only made to see the diff between your last commit and the current state of your code. In technical terms, it shows you the diff from your current Git HEAD to your current working tree (the “working tree” in Git denotes the files on your disk).

A diff being viewed in VS Code

This post presents a small tool that I wrote to see the same diff that you would see in GitLab, but in your editor. This means you can enjoy powerful search, code peeking and navigation, static analysis, etc. It also lets you edit files without mixing your changes with those introduced by the merge request, so you can put the extra changes in a separate commit and append it to the branch that is about to be merged.

a merge request diff viewed in VS Code thanks to my little tool
adding modifications on top of a merge request while in “merge view”

Important remark: this tool is only made for “rebase-style” workflows where the feature branch being merged is a descendant of the master branch it is being merged into, because that’s how we do it at my workplace. There is no guarantee it works for “merge-style” workflows where the feature branch is missing some commits from the master branch. One day I’ll improve my tool to cover this situation as well.

Illustration of the use case this tool is for, and the one where you should not use it
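If you want to check that a branch fits this “rebase-style” situation before going further, a minimal sketch could look like the following. It is not taken from my tool: the function name is_ancestor and the injectable run parameter are made up for the example. It relies on git merge-base --is-ancestor, which exits with 0 when the first commit is an ancestor of the second and with 1 when it is not.

```python
import subprocess


def is_ancestor(ancestor, descendant, run=subprocess.run):
    """Return True if `ancestor` is an ancestor of `descendant`.

    Hypothetical helper: `run` defaults to subprocess.run and can be
    replaced by a fake in tests.
    """
    result = run(["git", "merge-base", "--is-ancestor", ancestor, descendant])
    # 0 means "is an ancestor", 1 means "is not"; anything else is an error.
    if result.returncode not in (0, 1):
        raise RuntimeError(
            f"git merge-base failed with exit code {result.returncode}"
        )
    return result.returncode == 0
```

Calling is_ancestor("master", "HEAD") from the feature branch would then tell you whether you are in the supported situation.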

The core of the tool is just two Git commands, but I turned them into a Python script to make the experience a bit smoother, and now every day it is making my code reviewing work so much faster and more comfortable. Here is a glimpse of how you use it (I invoke it as merge-view):

~/misakey (feat/209-server-relief=)$ merge-view
~/misakey (merge-target-feat/209-server-relief +)$ # reviewing, maybe still editing a bit
~/misakey (merge-target-feat/209-server-relief *+)$ # new modifications are not staged (thus the "*")
~/misakey (merge-target-feat/209-server-relief *+)$ # ok let's go back to previous state
~/misakey (merge-target-feat/209-server-relief *+)$ merge-view
~/misakey (feat/209-server-relief *=)$ # new modifications are still here ("*")

And here are the “two git commands” in question:

git checkout --detach
git reset --soft master

You may also want to know how to go back to your previous state:

git checkout <the branch of the merge request>

However, there are many corner cases to keep in mind if you don’t want to mess up your repository, so it’s better to make a small tool that includes a few checks. Plus, having just one command (merge-view) is faster to type and easier to remember.
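To give an idea of what wrapping the two commands in Python can look like, here is a minimal sketch. It is only an illustration, not the actual code of the tool: the names enter_merge_view_commands and enter_merge_view, as well as the target and run parameters, are assumptions made for the example.

```python
import subprocess


def enter_merge_view_commands(target="master"):
    """Return the two Git commands, in order, as argument lists."""
    return [
        ["git", "checkout", "--detach"],     # HEAD now points directly to a commit
        ["git", "reset", "--soft", target],  # move HEAD only, keep index and files
    ]


def enter_merge_view(target="master", run=subprocess.run):
    """Run the two commands, stopping at the first failure (check=True)."""
    for cmd in enter_merge_view_commands(target):
        run(cmd, check=True)
```

Keeping the command list separate from the function that runs it makes it easy to print the commands, or to test the logic without touching a real repository.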

The source code of the tool is available in a repository at GitLab.com.

How the Git Commands Work

We assume that we start from the branch of the merge request, call it feature. So our Git HEAD is pointing to the feature branch which in turn is pointing to some commit. Recall that we also assume that master is an ancestor of our current location.

The situation we start from

We assume that our working tree (the files on our disk) is at the state corresponding to the feature branch [2]. The state we want to reach is where HEAD is pointing to the same commit as master (C in our example) while the working tree still corresponds to feature.

Why isn’t it sufficient to just do git checkout master? Because it would update the working tree as well. The purpose of git checkout master is to “prepare for working on master” (see documentation).

The git reset command has a --soft option to prevent updating the working tree, but the purpose of git reset <some branch> is to move the current branch so that it points to the same commit as <some branch> (see documentation). In our case this would move feature, and this is not what we want, we only want to move HEAD.

This is why we do git checkout --detach. This command puts us in “detached HEAD mode” meaning that HEAD points directly to a commit instead of pointing to a branch (see documentation). Once we are in detached mode, git reset --soft will not move any branch because we don’t have a “current branch” any more. We then end up in the following situation where our working tree is still in the version corresponding to the feature branch (thanks to the --soft option):

The situation that we reach

It is then easy to see that the diff tool of our editor will show the diff corresponding to the merge request of feature into master.

What’s really nice with these two commands is that the difference between master and feature ends up being already in the Git index, also called “staging area”, which is the place where you add modifications to be included in your next commit (see documentation). This means that if you edit the files, your new modifications will not be mixed with the modifications from the merge request because they are not in the index. This lets you keep working on top of the merge request branch. You can then group your additional modifications in a commit by first checking out feature and then building your commit the usual way. I am not entirely sure why the merge request modifications end up in the index, but I am very glad they do (presumably because git reset --soft touches neither the index nor the working tree, so the index still holds the content of feature while HEAD has moved to master).

~/misakey (feat/209-server-relief=)$ git checkout --detach
HEAD is now at dd5289b docs: doc for "update password" endpoint
~/misakey ((dd5289bd…))$ git reset --soft master
~/misakey ((ab051350…) +)$ git status
HEAD detached from dd5289b
Changes to be committed:
  (use "git reset HEAD <file>..." to unstage)

    modified:   auth-backend/go.mod
    modified:   auth-backend/go.sum
    modified:   auth-backend/src/cmd/auth.go
    modified:   auth-backend/src/controller/login_flow_echo.go
    modified:   auth-backend/src/controller/service.go
    modified:   auth-backend/src/controller/user_account_echo.go
    modified:   auth-backend/src/model/authentication.go
    modified:   auth-backend/src/model/password.go
    new file:   auth-backend/src/model/secret/confirmation_code.go
    new file:   auth-backend/src/model/secret/password/password.go
    new file:   auth-backend/src/model/secret/password/prehashed/encoding.go
    new file:   auth-backend/src/model/secret/password/prehashed/encoding_test.go
    new file:   auth-backend/src/model/secret/password/prehashed/hmacsha256.go
    new file:   auth-backend/src/model/secret/password/prehashed/prehashed.go
    new file:   auth-backend/src/model/secret/password/unhashed/argon2.go
    new file:   auth-backend/src/model/secret/password/unhashed/plain.go
    new file:   auth-backend/src/model/secret/secret.go
    modified:   auth-backend/src/model/user_account.go
    modified:   auth-backend/src/model/user_account_extended.go
    modified:   auth-backend/src/service/authn/confirmation_code.go
    modified:   auth-backend/src/service/authn/methods.go
    modified:   auth-backend/src/service/authn/password.go
    modified:   auth-backend/src/service/login_flow.go
    modified:   auth-backend/src/service/security/hash_comparator.go
    modified:   auth-backend/src/service/token.go
    modified:   auth-backend/src/service/user_account.go
    modified:   auth-backend/src/service/user_account_password.go
    modified:   auth-backend/src/service/user_account_password_test.go
    modified:   docs-host/README.md
    modified:   docs-host/postman/Auth.postman_collection.json
    modified:   docs-host/www/backend/swagger/auth-backend/auth/auth.yaml
    modified:   docs-host/www/backend/swagger/auth-backend/common_responses.yaml
    modified:   docs-host/www/backend/swagger/auth-backend/users/users.yaml
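
To double-check the separation between the merge request changes (staged) and your own edits (unstaged), you can list both sets of files. The sketch below is only an illustration with a made-up helper name, changed_files. It relies on git diff --cached comparing the index against HEAD, and on plain git diff comparing the working tree against the index. Note that brand new untracked files do not show up in git diff.

```python
import subprocess


def changed_files(staged, run=subprocess.run):
    """List changed paths: index vs HEAD if staged, else working tree vs index."""
    cmd = ["git", "diff", "--name-only"]
    if staged:
        cmd.append("--cached")  # in merge view: the files of the merge request
    result = run(cmd, check=True, capture_output=True, text=True)
    return [line for line in result.stdout.splitlines() if line]
```

In merge view, changed_files(staged=True) should give the files touched by the merge request, and changed_files(staged=False) the tracked files you edited on top of it.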

Creating an ad-hoc Branch for the Merge Target

Note that I didn’t come up with these commands all by myself. I searched “how to change git branch without changing files” and I found this answer on Stack Overflow. You will find many other solutions for this problem, but I went for this one and I’m quite happy with it, mainly because it uses the standard “porcelain” Git commands I am used to and not low-level “plumbing” commands. And it works great. Maybe one day I will consider using a different approach.

Also note that the solution I found has three commands, not two, the third command being git checkout master. This is because the problem the answer solves includes switching to some branch, and thus ending up with HEAD pointing at a branch, which is not really what we want to do. At first I kept this third command, but it was confusing to see Git telling me that “we are currently at branch master”. I was afraid that if I left the repository like this and came back to it later, I would not immediately realize that I was in this very specific “merge view” state.

So the third command is not necessary for our use case, but instead of removing it I replaced it with the following:

git checkout -b merge-target-feature

Where feature is just the name of the branch you are coming from. git checkout -b <some branch> creates a branch pointing to the current commit and makes HEAD point to this new branch. Now it is much more explicit and understandable which state you are in when you are in the repository.

VS Code displaying the name of our ad-hoc branch

Note that this command involves some computation. Namely, you have to compute the name of the new branch from the name of your starting branch. This cannot be done with, say, a simple Git alias, and this is one of the reasons I used a Python program instead (see below for a description of the Python program). Also note that the ad-hoc branch should probably be deleted when you exit merge view.
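Here is a rough sketch of that computation, along with the creation and deletion of the ad-hoc branch. The function names and the run parameter are assumptions made for this example, not the actual code of the tool. Only the merge-target- prefix comes from the shell session shown earlier.

```python
import subprocess

PREFIX = "merge-target-"  # same prefix as in the shell session above


def merge_target_name(branch):
    """Compute the name of the ad-hoc branch from the starting branch."""
    return PREFIX + branch


def current_branch(run=subprocess.run):
    """Name of the current branch; Git prints "HEAD" when HEAD is detached."""
    result = run(
        ["git", "rev-parse", "--abbrev-ref", "HEAD"],
        check=True, capture_output=True, text=True,
    )
    return result.stdout.strip()


def create_merge_target_branch(branch, run=subprocess.run):
    """Create the ad-hoc branch at the current commit and switch to it."""
    run(["git", "checkout", "-b", merge_target_name(branch)], check=True)


def delete_merge_target_branch(branch, run=subprocess.run):
    """Delete the ad-hoc branch (-d refuses if it holds unmerged commits)."""
    run(["git", "branch", "-d", merge_target_name(branch)], check=True)
```

The starting branch name has to be read before running git checkout --detach, since there is no current branch afterwards. Likewise, the ad-hoc branch can only be deleted after checking out the original branch, because Git refuses to delete the branch you are on.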

Handling Corner Cases and Preventing Accidents

The main corner case I had to deal with is when the local master branch is not up to date with the remote master branch. In this situation you would not have a correct view of the effects of merging. For now the only thing my tool does is check whether master and origin/master are in sync (it does a git fetch before that to make sure origin/master is in sync with its remote counterpart).
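Such a check can be sketched as follows. This is only one possible way to do it, with hypothetical names (rev, target_in_sync) and parameters. It compares the commit hashes that git rev-parse --verify prints for the local and the remote-tracking branch.

```python
import subprocess


def rev(ref, run=subprocess.run):
    """Return the commit hash that `ref` points to."""
    result = run(
        ["git", "rev-parse", "--verify", ref],
        check=True, capture_output=True, text=True,
    )
    return result.stdout.strip()


def target_in_sync(target="master", remote="origin", run=subprocess.run):
    """Fetch, then tell whether `target` and `remote/target` are the same commit."""
    run(["git", "fetch", remote], check=True)
    return rev(target, run) == rev(f"{remote}/{target}", run)
```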

There are other simple corner cases that I take into account, like being on the master branch, not being on a branch at all (detached HEAD mode), or the repository being “dirty”.

Something I want to stress is that, for now, my tool just prints an error and exits when one of these corner cases is detected. In the case of master not being up to date with origin/master you may want to just move master to origin/master, but what if there was a good reason that the two were at different locations? The safest option is just to give a nice error message to the user (me) and let them decide what to do. Everyone who has used Git knows how easy it is to mess up your repository when trying to be “smart”.
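The “print an error and exit” policy for the simple corner cases can be sketched like this. The function names and error messages are invented for the example. The inputs could come from git rev-parse --abbrev-ref HEAD (which prints HEAD when detached) and from git status --porcelain (which prints nothing when the repository is clean). These checks only make sense when entering merge view, since the repository is “dirty” by design while you are in it.

```python
import sys


def precondition_error(branch, dirty, target="master"):
    """Return an error message if merge view must not be entered, else None."""
    if branch == "HEAD":
        return "not on a branch (detached HEAD)"
    if branch == target:
        return f"already on {target}, there is nothing to review"
    if dirty:
        return "the repository has uncommitted changes"
    return None


def abort_on_error(message):
    """Print the message to stderr and exit with status 1, or do nothing."""
    if message is not None:
        sys.exit(f"merge-view: {message}")
```

The tool never tries to repair the situation by itself: it only reports the problem and leaves the decision to the user.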

Creating a Handy Python Module

For some time my “tool” was just one Python file, that is, what we usually call a “Python script”. I have now turned it into an actual Python package to make the code cleaner and to make it easier to install on my machine (recall, the source code is available here).

It only consists of a git_tools directory with two files, utils.py and merge_view.py. The former contains all the “helper functions” used by the latter, such as run_cmd, get_current_branch, etc. merge_view.py contains three functions: view_merge_request, quit_merge_view and toggle_merge_view. The last one is the function that is executed when I call merge-view in my shell, and it applies either of the two other functions depending on whether we are currently in merge view or not. Detection of the merge view state is done by applying a regular expression to the name of the current branch.
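The detection and dispatch logic can be sketched as follows. This is a simplified illustration, not the actual content of merge_view.py: the regular expression, the names original_branch and toggle, and the enter and leave callbacks are assumptions made for the example.

```python
import re

# Branch names created by merge view look like "merge-target-<original branch>".
MERGE_VIEW_BRANCH = re.compile(r"^merge-target-(?P<branch>.+)$")


def original_branch(current):
    """Return the branch we came from if `current` is an ad-hoc branch, else None."""
    match = MERGE_VIEW_BRANCH.match(current)
    return match.group("branch") if match else None


def toggle(current, enter, leave):
    """Call `leave(original)` when in merge view, `enter(current)` otherwise."""
    origin = original_branch(current)
    if origin is None:
        return enter(current)
    return leave(origin)
```

One nice side effect of encoding the state in the branch name is that the tool does not need to store anything outside the repository to remember where it came from.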

Then there is a setup.py file that is next to the git_tools module and only contains the following code:

from setuptools import setup
setup(
    name="git_tools",
    entry_points={
        "console_scripts": [
            "merge-view = git_tools.merge_view:toggle_merge_view",
        ],
    }
)

The console_scripts part creates a merge-view shell command when I install my package with Pip, and it tells Pip that the command should call function toggle_merge_view in module git_tools.merge_view.
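To make the "module:function" notation more concrete, here is a simplified sketch of how such a specification can be resolved and called. It only illustrates the idea and is not the wrapper that Pip actually generates. The function names are made up, and dotted attribute paths after the colon are not handled.

```python
import importlib
import sys


def resolve_entry_point(spec):
    """Turn "package.module:function" into the function object itself."""
    module_name, _, function_name = spec.partition(":")
    module = importlib.import_module(module_name)
    return getattr(module, function_name)


def run_entry_point(spec):
    """Call the function and use its return value as the exit status."""
    sys.exit(resolve_entry_point(spec)())
```

With my package installed, run_entry_point("git_tools.merge_view:toggle_merge_view") would behave roughly like typing merge-view in the shell.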

To install my tool as a Python package I run the following command while being next to the setup.py script:

pip install -e ./

The -e option stands for “editable”: Pip does not copy the code to a location in my system, and instead links the installed package to the original source directory. This way, when I change the code of my tool, the behavior of my merge-view command changes without my having to re-run pip install.

Note that there are no __init__.py or __main__.py files. __init__.py has been optional in Python for quite some time (since Python 3.3), and __main__.py is only required if you want to execute the whole module. Here the module I created is meant to contain all the Git tools I build for myself (hence the name git_tools) and I still haven’t decided what should happen when trying to “execute” the module. For now all it does is complain that it is not executable:

$ python -m git_tools
/home/cedricvr/.pyenv/versions/3.8.2/bin/python: No module named git_tools.__main__; 'git_tools' is a package and cannot be directly executed

You can still import it, though:

$ python
Python 3.8.2 (default, Mar 21 2020, 10:02:27)
[GCC 9.2.1 20191008] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from git_tools import merge_view
>>> merge_view
<module 'git_tools.merge_view' from '/home/cedricvr/repos/perso/git-tools/git_tools/merge_view.py'>

  1. In this post we will consider GitLab because this is the product I use, but everything should work as well with GitHub, Bitbucket, etc.

  2. This should be the case if we moved to this branch with a git checkout.