Deployment from GitHub


As part of our ongoing migration to GitHub, one of the issues we had to resolve was how to properly deploy to production from GitHub. The straightforward solution is to create a machine user, which is equivalent to the approach we had in place for deploying from our self-hosted git repository. But a machine user can quickly become a big security hole, especially since such a user tends to accumulate broad access rights over time. So, while the machine user approach was good enough for a self-hosted git repository protected by our VPN, it feels inadequate for deploying from GitHub.

One option that GitHub offers and a self-hosted git repository doesn't (at least not out of the box) is deploy keys. As the name implies, a deploy key is an SSH key that grants access to a single repository on GitHub. That constraint makes the approach much more secure, but it also requires more complex SSH key management for projects that depend on other private projects hosted on GitHub.

We are mostly a Python shop and we use pip to install our dependencies. Each of our projects contains a special file called requirements.txt, which lists the project's dependencies. Under the hood, pip uses the standard git client to access GitHub, but pip itself does not provide an option to specify which SSH key should be used for a particular GitHub repository. At first blush, the inability to specify a repository-specific SSH key seems to rule out deploy keys when installing dependencies hosted as private repositories on GitHub. But thanks to some cool people on the web and some experimentation by Idioterna and myself, we have found a very elegant solution to this problem.
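To make the setup concrete, here is a minimal sketch of how private dependencies might appear in requirements.txt and how the repository names can be pulled out of them. The `git+ssh://` URL form is pip's VCS syntax; the repository names (`common-lib`, `billing-client`) and the helper `private_repos` are hypothetical and used only for illustration.

```python
import re

# Hypothetical requirements.txt content; the repository names are made up.
REQUIREMENTS = """\
requests
git+ssh://git@github.com/Zemanta/common-lib.git#egg=common-lib
git+ssh://git@github.com/Zemanta/billing-client.git@v1.2#egg=billing-client
"""


def private_repos(requirements_text, org_name):
    """Return names of the org's GitHub repositories referenced over SSH."""
    # Capture the path segment after the organization, stopping at an
    # optional @revision, #egg fragment, or whitespace.
    pattern = re.compile(
        r"git\+ssh://git@github\.com/%s/([^@#\s]+)" % re.escape(org_name)
    )
    repos = []
    for line in requirements_text.splitlines():
        match = pattern.search(line)
        if match:
            name = match.group(1)
            if name.endswith(".git"):
                name = name[:-4]
            repos.append(name)
    return repos


if __name__ == "__main__":
    print(private_repos(REQUIREMENTS, "Zemanta"))
```

Each name returned this way corresponds to one private repository, and therefore to one deploy key that the deployment machine would need.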

As said before, pip uses the standard git client, which in turn uses the default ssh client to access private repositories on GitHub. Consequently, we can use the GIT_SSH environment variable and a custom script to instruct ssh to use the right deploy key depending on the repository being accessed. Here's the script we use:

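[code language="python"]
#!/usr/bin/env python
import sys, os, re

org_name = 'Zemanta'  # Change this to your organization

# git passes the remote command (e.g. git-upload-pack '/Zemanta/repo.git')
# as the second argument; extract the repository name from it.
repo = re.findall(r"'.%s/([^']+)'" % org_name, sys.argv[2])

ssh_command = ["ssh"]
if repo:
    repo_name = repo[0]
    if repo_name.endswith('.git'):
        repo_name = repo_name[:-4]

    # Use the deploy key named after the repository, if one exists.
    key_dir = os.path.join(os.environ['HOME'], '.ssh/keys/')
    key_file = os.path.join(key_dir, repo_name)
    if os.path.isfile(key_file):
        ssh_command = ["ssh", "-i", key_file]

# Replace this process with ssh, forwarding the original arguments.
os.execvp("ssh", ssh_command + sys.argv[1:])
[/code]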

This script assumes that deploy keys are stored in the .ssh/keys folder within the user's home directory, each named after its repository (without the .git suffix). If no matching key file exists, the script falls back to plain ssh. This makes deploy key management simple and completely transparent to pip and developers alike.
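Putting it together, a deployment step could look roughly like the sketch below. It assumes the wrapper script above is saved as an executable file at a hypothetical path, `~/bin/git_ssh_wrapper.py`; that path and the helper names `deploy_env` and `install_requirements` are illustrative and not part of our actual setup.

```python
import os
import subprocess
import sys

# Hypothetical location of the wrapper script shown above; it must be executable.
WRAPPER = os.path.expanduser("~/bin/git_ssh_wrapper.py")


def deploy_env(wrapper_path, base_env=None):
    """Return a copy of the environment with GIT_SSH pointing at the wrapper."""
    env = dict(os.environ if base_env is None else base_env)
    env["GIT_SSH"] = wrapper_path
    return env


def install_requirements(requirements_file, wrapper_path=WRAPPER):
    """Run pip so that git invokes the wrapper instead of plain ssh."""
    return subprocess.call(
        [sys.executable, "-m", "pip", "install", "-r", requirements_file],
        env=deploy_env(wrapper_path),
    )
```

Calling `install_requirements("requirements.txt")` would then install the project's dependencies and return pip's exit code. Setting GIT_SSH only in the child process environment keeps the change scoped to the pip run, so other git usage on the machine is unaffected.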
