Using Git to effectively manage your code

Hi everyone

This time I thought I would give a quick insight into the ways which I have found to be best when working with Git. Especially if you’re in the position where you’ve never had to manage or decide the direction of how to use source control.

I’m sure you all know the benefits of source control, whether Git or other, but I thought it would be worth highlighting them anyway:

  1. Structured, integrated development (that is, you can have multiple people working on the same project/release at any given time)
  2. Simplified deployments
  3. Easy way to track where your code is in the SDLC (Software Development Life Cycle)
  4. Easy way to rollback if terrible bugs have somehow gotten through onto a live environment

Introduction

If you’re faced with the somewhat daunting task of introducing source control to your digital team there are a few things you’re going to need to consider.

  1. What are you hoping to achieve by source control? Are you just hoping to see who has done what, or are you looking to version control your code, etc?
  2. What is the skill level with Git on the command line like in your team, do you need to set aside some time for training on this?
  3. What kind of software are you managing, are there lots of different systems, or a single all encompassing one?

The answers to these questions will largely dictate how you need to go about doing this. I will also strong recommend, if you have dependencies, using something like Packagist and composer to manage this as, trust me from experience, it will greatly lessen your pain of staging and production deployments.

Setting up the repositories

This is where there is some debate. Take WordPress, for example, do you source control the whole installation? Or just specific parts?

For me, personally, I feel the best thing to do is source control the specific part you can install, in WordPress this might be a theme or a plugin, in Moodle it’ll be a module or something like that. The reason for this, in my mind, is that there is already version control on everything else within there. WordPress and Moodle both support version numbers, as does Joomla and I’m sure Drupal does too.

So why source control someone else’s code? They’ve already done this for you.

Woah there, Nelly! There’s a huge “gotcha”…

The worst thing you can do in any kind of version control, in my personal opinion, especially if the repository is publicly accessible: do not store passwords, encryption keys, API or access keys in your repository. It’s a terrible idea, and there is a vulnerability with Git that, unless explicitly disallowed, someone can access your .git directory (where all of your source code, references and everything else are stored) and could access your passwords.

The easy way around this is to add a .gitignore to hide your configuration files. Laravel does this itself, as the configurations and keys are stored in a .env file, WordPress will not do this, you’ll need to add /wp-config.php to your .gitignore file, then it will be automatically ignored from your repository forever.

Another word to the wise, if you forget to do this in the first instance, and then decide to delete the file or anything like that; this is version control – change your passwords – the previous commit, including the configuration with all of your passwords and keys and such will still be accessible.

We have repositories, what do we do now?

Now you need to have a working and uniformed way of managing those repositories and the commits. A bunch of non-descript commits is not actually going to do anything to help you manage your source code, I mean, it’ll be there, but it’s not going to be of any use to anyone; it’ll just be handy to say “we have Git”.

I’m going to break the management of your Git repositories into three major parts: commits, branches and tags; which I’ll cover below.

#1 – Commits

In my team I have a rule. Whenever you have completed a job, microtask, fixed an issue, etc. then you need to commit. Anything that you’d be gutted if you lost. We also always push to the remote repository straight away. Git is no use if it only exists on your machine, which you then drop and smash into a quadrillion pieces, is it?

So, for me, push little and often; and explain your thinking in your commits. Here are some terrible commit messages I’ve seen in the past, which will make you realise why, when you’re trying to investigate a bug, you should make sure your messages are good.

Example One: Fixed bug

Fixed what bug?! What was the issue, how did you rectify it, what were you thinking?

Example Two: Various code updates

Well yes, but why?

You can see why this would be frustrating when you’re trying to understand what the code changes were for, especially if you’re investigating an issue on staging or live.

A good commit message usually entails something like this:

1.3 Specification Boilerplating

– Added framework functionality to do something really important, it was important because we had to consider this terrible design flaw

– Improved performance with password hashing; comparison method is now leaner, and is handling the string in a different way to previously

– Resolved Jira issue #GIT-219

So, in summary, commit little and often – with a good message as to why you’re committing, what you’ve just changed/added and what the purpose of it is.

In case you’re unfamiliar with the Git syntax:

git add -A // Add all of the files that have been changed
git add MyClass.php // Specifically add this file to your commit
git add Namespace/AnotherClass.php // Specifically add a file in that directory to your commit
git commit -m "My commit message"

// If you want to get really snazzy

git commit -m "My Commit Subject

A detailed description of everything I did in this commit"

This can be handy if you want to break the files down into specifically what you did to each one.

#2 – Branches

Personally I think any major or minor version should be in its own branch. It’s usually a good idea to name this branches based on the version number and the fact that they’re development (Composer also recognises this)

I also then like to branch for each developer, this makes it easy to keep committing and not worry about conflicts until you explicitly want to merge your stable code – just because it has been committed does not mean that it is, at this point, stable.

My personal preference looks something like this:

| master            - representative of live
|- dev-1-3          - 1.3 version development "hub"
|-- dev-1-3-johno   - my specific development branch

What happens is I branch from master to create my development branch for this version, then I branch from my development branch to do my work, I can commit and push to this branch without any issues. Let’s say Gary joins the party and is going to be developing alongside me; he would then branch from dev-1-3 and create a dev-1-3-gary branch.

When I’m happy my code is stable and ready for integration, I merge my branch into the dev hub branch. But there is a specific way of doing this to avoid conflicts – you do not, under any circumstance, want to be breaking the dev 1.3 branch; what if you have 5 developers merging down from it? You do not want to be that unpopular guy!

Merging looks something like this, in English:

Commit to dev-1-3-johno > Switch to dev-1-3 > Pull > Switch to my branch & merge > Switch to dev hub, merge my branch in

The reason for doing this is any breakages and merging issues you want to handle on your branch; without breaking the dev hub. Then you can merge back without issue.

Assuming no merge conflicts your Git commands are going to look something like this:

// For completeness, this is how you would create your branches
git checkout master
git branch dev-1-3
git checkout dev-1-3
git branch dev-1-3-johno
git checkout dev-1-3-johno

// I'm on my development branch
git commit "My commit message" // My code is committed
git push // My code is now on the remote repository
git checkout dev-1-3 // Jump to the 1.3 development branch
git pull // Get the latest changes, who knows if Gary has pushed something to it
git checkout dev-1-3-johno // Jump back onto my branch
git merge dev-1-3 // Merge Gary's work back in
git push // Push my branch, which now also has Gary's code
git checkout dev-1-3 // Jump back onto the development branch
git merge dev-1-3-johno // Merge my code onto the development branch
git push // Push the development branch, which now has both mine and Gary's code on it

I’ve seen various different ways of managing branches, using versions, everyone having their own development branch, etc. but it is important, I think, to remember the core purpose of the branches – to separate out your commits and code until it’s ready to go onto live.

So, from what I’ve seen, the best way to do your branching is by version. This means you can jump between versions with ease. Maybe you actually start the 1.4 development, which is a long task, before finishing the 1.3 development.

Once you’ve finished the development of 1.3 and it has been released, merge your development branch into master. We’ll cover using tags to help with deployment shortly.

Another Gotcha, this time with branches

One massively important thing to remember with branches, let’s say we’re working on version 1.3 and committing away. A security patch (let’s say version 1.2.12) could be pushed onto master, which you could well lose with your new changes – thus losing the security fix and reintroducing the vulnerability.

Make sure you are merging master back into your main development branch frequently, otherwise you run the very real risk of overwriting bug fixes and patches.

#3 – Tags

The key difference between a branch and a tag is simple: a branch changes over time, a tag does not.

I personally like to use tags to represent a stage in the development process. Following Composer conventions (that’s what international conventions are for) I like to represent the three major checkpoints of any version release:

  1. Ready for staging deployment
  2. Stage passed – release candidate
  3. Deployed onto live  and successfully tested – release

This is quite an easy one really. Composer uses the prefix “v” (for version) when tagging releases and various other conventions which can be found on their website.

So tagging up your code would look something like this

// Depending on your tagging convention, you might tag by version
git tag staging-1-3-0

// Or perhaps staged by month and increment
git tag staging-2017-08-001

// For completeness, on the staging server you would go to the root of the repository
git checkout staging-2017-08-001

// If you are already checked out onto a branch (on live, for example)
git pull origin master --tags

// It will now say you've pulled a new tag, let's say the release candidate for 1.3.0
git checkout rc-1-3-0

// Successful test pass - still checked out onto your RC, which remember, can't change
git tag release-1-3-0
- or - 
git tag v1-3-0

// Failed test - quick rollback to last release, let's say 1.2.12
git checkout release-1-2-12
- or -
git checkout v1-2-12

// At this point it would be a good idea to merge your development branch into master, no separation needed, this code is on live
// Somewhere other than your live environment:
git checkout master
git merge dev-1-3

Be prepared, and have a convention ready, for when something fails staging or fails live deployment. And don’t forget, once your live deployment is finished to tag up the code, because this is now a successful release.

Other gotchas, and things are broken

People who haven’t used Git much before in the past will almost always find a way (and I’m often dumbfounded) to break their repositories. This can be for various reasons; but here are a couple of quick fixes (not for the faint hearted! You will break things)

You can nuke all of your changes by doing this:

Don’t do any of this unless you have a backup of any code you do not want to lose otherwise you will lose it

// Let's just pretend you weren't even at work since your last successful push
git reset --hard; git clean fd;

Alternatively, if it really is FUBAR and you’ve no way of figuring out how to fix it (or don’t have the time) this old chestnut will never fail you

// Let's just pretend this was never a repository
rm -rf .git
git init
git remote add origin {your repository URL}
git pull

The last one really is a last resort and can sometimes get a bit messy. Probably a good idea to have a backup of any code you don’t want to lose if using any of the commands in this section.

In summary, Git will enrich your life

Ultimately, all of this stuff will enrich your development team. It will help you manage your source code effectively, have an easier time with deployments, rollbacks, and seeing what is going on. It also help you to really change the way that you’re working; knowing you always have a nice way of rolling back to any given point.

What the branching structure does is help keep master clean, it also allows lots of developers (I think 8 is the busiest I’ve seen) work on the same version, without constant merge conflicts, which can be a real pain; slow down development, and cause a lot of lost time and frustration.

A word of caution, with this tale: When done badly, Git can be your worst enemy. You have to make sure everybody in your team is singing from the same hymn sheet, so to speak. Otherwise you’re going to end up with all kinds of problems. A single developer not committing their code until the end of a version (when it’s ready for staging, theoretically) can really screw everything up if the other developer has been using Git properly. Education is key on this one. And a few broken repos and salvages of code are lessons hard learned.

Thanks for reading 🙂

Johno

I'd love to hear your opinion on what I've written. Anecdotes are always welcome.

This site uses Akismet to reduce spam. Learn how your comment data is processed.