Git and configuration files

Posted on Wed 17 August 2016 in Effective git usage

A question that pops up all the time in the #git IRC channel is "How do I ignore changes to a tracked file?" Invariably, this is an XY problem: someone is trying to shoehorn a configuration file into a git repository.

There are various reasons why people want their configuration files with their code in a repository, but most of them boil down to laziness. The non-lazy ones are almost all trying to store git metadata inside the repository, which is also ill-advised.

So what do we do instead? We use git, but the way it's supposed to be used!

Git metadata

Most software build systems support the storing of software version information in build products. If you properly tag your versions, you can use commands like git describe to generate this version information during build.

So don't hardcode branch names or revision info. And if you depend on your version number being different in each branch, don't store it in git either. Use git commands at runtime or build time to determine all this:

  • git tag --annotate to create properly tagged releases
  • git describe to generate a version number based on those tags and new commits
  • git rev-parse --symbolic-full-name HEAD to find the branch you are on
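As a minimal sketch of how a build script might use these commands: the function names, the version.h file name and the APP_VERSION and APP_BRANCH macros are made up for illustration, and the script assumes it runs inside a git work tree with at least one commit.

```shell
#!/bin/sh
# Sketch: derive version metadata from git at build time instead of
# hardcoding it. All function, file and macro names are hypothetical.

# Print a version string based on the nearest annotated tag, e.g. v1.0 or
# v1.0-3-ga7188ff. --always falls back to the abbreviated commit hash when
# no tag is reachable; --dirty appends "-dirty" when tracked files have
# uncommitted changes.
app_version() {
    git describe --always --dirty
}

# Print the full ref name of the current branch, e.g. refs/heads/master.
# On a detached HEAD this prints just "HEAD".
app_branch() {
    git rev-parse --symbolic-full-name HEAD
}

# write_version_header FILE: generate a C header with the values above.
# The generated file is a build product and should not be committed.
write_version_header() {
    printf '#define APP_VERSION "%s"\n#define APP_BRANCH "%s"\n' \
        "$(app_version)" "$(app_branch)" > "$1"
}
```

A build would call write_version_header version.h before compiling, and version.h would be listed in .gitignore.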

If you wish to ship tarballs with version information, you can either make your build scripts query and embed this data (example), or use git archive to create the tarball. Using attributes defined in .gitattributes, you can instruct git archive to replace certain placeholders with git versioning data in the exported files, as demonstrated in the following example:

$ cat config.h
#define GIT_COMMIT "$Format:%H$"
#define GIT_REFS "$Format:%D$"
$ cat .gitattributes 
config.h export-subst
$ git --no-pager log --oneline
a7188ff (HEAD -> master, tag: v1.0) Initial version
$ git archive v1.0 | tar -O -x config.h
#define GIT_COMMIT "a7188ff379e9fc6a211b74cfd6450e7487aeecea"
#define GIT_REFS "HEAD -> master, tag: v1.0"

Configuration files

Now for the matter of configuration files. At first sight it seems nice and easy to store everything in the same repository, but you will soon find that this is a false economy. Here are some of the problems you will encounter:

  • It is no longer possible to have a customized local configuration file that is not overwritten by git.
  • You cannot deploy changes to your configuration without deploying your application.
  • Any secrets in the configuration (such as database or API passwords) can be read by everybody who has access to your source code.

So how do you manage your configuration files? The same way as your other configuration files which are not in your repository: use a configuration management system.

There are many good configuration management systems out there, including some popular open source ones such as Puppet and Chef. And of course you put your configuration management recipes in a git repository, just not the same one as your code.
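From the application's side, this mostly means reading settings from a path that the configuration management system owns, rather than from a file in the work tree. A rough sketch for a shell-based start script follows; the MYAPP_CONFIG variable and the /etc/myapp/myapp.conf path are assumptions made up for this example, not a convention of any particular tool.

```shell
#!/bin/sh
# Sketch: load settings from a file that lives outside the repository.
# MYAPP_CONFIG and /etc/myapp/myapp.conf are hypothetical names.

load_config() {
    # Allow an override (handy for local development), otherwise use the
    # system-wide file deployed by configuration management.
    config_file=${MYAPP_CONFIG:-/etc/myapp/myapp.conf}

    # Fail loudly rather than silently starting with no configuration.
    if [ ! -r "$config_file" ]; then
        echo "missing configuration: $config_file" >&2
        return 1
    fi

    # The file is expected to contain plain VAR=value assignments.
    . "$config_file"
}
```

Because the file is never tracked, git has nothing to overwrite, and a configuration change can be rolled out without touching the code.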

That solves the local overrides and separate deployments, but how about secrets? How you solve that depends on your configuration management system. For Puppet, for example, you can use Hiera with eyaml, which lets you use public key encryption for your secrets, so they are stored in git in encrypted form. Alternatively, you can use a tool like Vault to store and distribute secrets.
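If you go the Vault route, a start script could fetch a secret at runtime along these lines. This is only a sketch: it assumes the vault command line client is installed and already authenticated (for example through VAULT_ADDR and a token), that the secret sits in a key/value secrets engine, and the read_secret helper and the secret/myapp/db path are invented for the example.

```shell
#!/bin/sh
# Sketch: read a secret from Vault at runtime instead of storing it in git.
# read_secret and the example path below are hypothetical.

# read_secret SECRET_PATH FIELD: print a single field of a key/value secret.
read_secret() {
    vault kv get -field="$2" "$1"
}

# Example use (path and field name are made up):
#   DB_PASSWORD=$(read_secret secret/myapp/db password)
```

The secret never touches the repository or the deployed configuration file; only the path to it does.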

Deploying together

If your deployment system is "push all your code and configs to this remote", none of the solutions above will work for you. However, a deployment process like this is quite suboptimal and should be redesigned anyway. If you can't do that (e.g. because your PaaS provider doesn't support sane deployments), there is still a way forward.

The best way to do this is to keep two repositories, or three if you have a central repository that you push to. Your normal work is done in one of the repositories; the other is used only for configuration and deployments. In that second repository, you have a branch with all your configuration. When you wish to do a deployment, you fetch from the main repository and merge the code into the branch with the configuration. That branch can then be pushed to deploy.
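In commands, a deployment from that second repository might look roughly like this. The helper name, the remote names origin and paas, and the branch names master and deploy are all assumptions; substitute whatever your setup uses.

```shell
#!/bin/sh
# Sketch: merge fresh code into the branch that carries the configuration.
# Run inside the deployment repository. All names are hypothetical.

# merge_code_into_deploy CODE_REMOTE CODE_BRANCH DEPLOY_BRANCH
merge_code_into_deploy() {
    code_remote=$1
    code_branch=$2
    deploy_branch=$3

    # Fetch the latest code; the fetched commit is recorded as FETCH_HEAD.
    git fetch "$code_remote" "$code_branch" &&
    # Switch to the branch that holds the configuration commits.
    git checkout "$deploy_branch" &&
    # Merge the code in; --no-edit accepts the default merge message.
    git merge --no-edit FETCH_HEAD
}

# Example use, followed by the actual deployment push:
#   merge_code_into_deploy origin master deploy
#   git push paas deploy:master
```

The configuration commits only ever exist on the deploy branch, so nothing flows back into the main repository as long as you never merge in the other direction.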