Running Git in AWS Lambda Functions
Posted by Miguel Lopez on Thu 07 March 2019 in tutorials
Technical Stack: AWS Lambda, Python 3.7, Serverless
Read: 5 minutes
Prerequisites
- Serverless Framework installed (see the Serverless Quick Start)
- AWS Free Tier account (see AWS Sign Up)
- Check out the AWS Lambda Documentation if you're new to AWS Lambda and serverless technology.
- Check out GitPython, a Python package used to interact with git repositories.
Introduction
This post builds on my previous post on building AWS Lambda functions with the Serverless Framework.
I've been tinkering around with AWS Lambda Functions a lot at work. My latest project has me exploring the possibility of running Terraform, Terragrunt and git in an AWS Lambda function.
My goals for this project were simple:
- Download GitHub projects from Python
- Store project code in the /tmp/ folder of Lambda containers
- Create PRs, commits, etc. from a Lambda function
In this tutorial, I'm mainly going to focus on the problems I encountered while getting GitPython to work in AWS Lambda with Python runtimes.
GitPython
GitPython is a library built on git commands, so it requires the git binary to be installed.
Install it in your Python package by running the following pip command or by including it in your requirements file.
pip install GitPython
First Issues
Here was my initial Lambda function as defined in my serverless.yml:
run-git:
  handler: src/handler/run_git.lambda_handler
  name: ${self:provider.stage}-${self:service}-git
  description: run git commands from lambda
  memorySize: 256
  timeout: 30
The Lambda handler at run_git.lambda_handler ran the following Python code:
from git import Repo


def lambda_handler(event, context):
    # The event names the GitHub organization and project to clone.
    project_name = event['github_project']
    org = event['github_org']
    git_url = "https://github.com/%s/%s" % (org, project_name)
    print("Downloading repo from %s............" % git_url)
    # /tmp/ is the writable directory inside the Lambda container.
    repo = Repo.clone_from(git_url, '/tmp/%s' % project_name)
The code was simple: it would download a GitHub project and store it in /tmp/.
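As a rough sketch of what the handler does with its input, the helpers below derive the clone URL and the /tmp/ destination from an event. The names `build_clone_target` and `reset_destination` and the example event values are hypothetical and are not part of the original handler. The cleanup step is an assumption about a sensible default: Lambda may reuse a container between invocations, and git refuses to clone into a non-empty directory.

```python
import os
import shutil


def build_clone_target(event, base_dir="/tmp"):
    """Return (git_url, destination) for the repo named in the event."""
    org = event["github_org"]
    project_name = event["github_project"]
    git_url = "https://github.com/%s/%s" % (org, project_name)
    destination = os.path.join(base_dir, project_name)
    return git_url, destination


def reset_destination(destination):
    """Delete a leftover checkout so a fresh clone can be written there."""
    if os.path.isdir(destination):
        shutil.rmtree(destination)
    return destination


if __name__ == "__main__":
    # Hypothetical event payload; the values are placeholders.
    event = {"github_org": "my-org", "github_project": "my-project"}
    git_url, destination = build_clone_target(event)
    reset_destination(destination)
    print("Would clone %s into %s" % (git_url, destination))
```

In the handler, `Repo.clone_from(git_url, destination)` would follow the `reset_destination` call.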
It should have been easy, until I ran into this error:
Unable to import module 'src/handler/run_git': Failed to initialize: Bad git executable.
The git executable must be specified in one of the following ways:
- be included in your $PATH
- be set via $GIT_PYTHON_GIT_EXECUTABLE
- explicitly set via git.refresh()
All git commands will error until this is rectified.
This initial warning can be silenced or aggravated in the future by setting the
$GIT_PYTHON_REFRESH environment variable. Use one of the following values:
- quiet|q|silence|s|none|n|0: for no warning or exception
- warn|w|warning|1: for a printed warning
- error|e|raise|r|2: for a raised exception
Example:
export GIT_PYTHON_REFRESH=quiet
Looking for a quick solution, I immediately put GIT_PYTHON_REFRESH=quiet into the environment variables section of my Lambda function.
That resulted in:
Cmd('git') not found due to: FileNotFoundError('[Errno 2] No such file or directory: 'git': 'git'')
cmdline: git clone -v https://github.com/hearsaycorp/hearsay-messages /tmp/hearsay-messages: GitCommandNotFound
Traceback (most recent call last):
File "/var/task/src/handler/run_terraform.py", line 13, in lambda_handler
download_hearsay_repo(repo_name, git_hash)
File "/var/task/src/utils/github_utils.py", line 16, in download_hearsay_repo
repo = Repo.clone_from(git_url, '/tmp/%s' % project_name)
File "/var/task/git/repo/base.py", line 988, in clone_from
return cls._clone(git, url, to_path, GitCmdObjectDB, progress, **kwargs)
File "/var/task/git/repo/base.py", line 933, in _clone
v=True, universal_newlines=True, **add_progress(kwargs, git, progress))
File "/var/task/git/cmd.py", line 548, in <lambda>
return lambda *args, **kwargs: self._call_process(name, *args, **kwargs)
File "/var/task/git/cmd.py", line 1014, in _call_process
return self.execute(call, **exec_kwargs)
File "/var/task/git/cmd.py", line 738, in execute
raise GitCommandNotFound(command, err)
git.exc.GitCommandNotFound: Cmd('git') not found due to: FileNotFoundError('[Errno 2] No such file or directory: 'git': 'git'')
That error made it clear to me: AWS Lambda functions did not come bundled with a git executable in the runtime container. Therefore, GitPython could not run git commands, because there was no git on the container's $PATH.
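One way to confirm this from inside a function is to ask Python where git lives before GitPython is imported. The sketch below uses only the standard library. The helper names `find_git` and `point_gitpython_at` are hypothetical. The `GIT_PYTHON_GIT_EXECUTABLE` variable is the one named in the error message above, and setting it before `import git` is an assumption about when GitPython reads it.

```python
import os
import shutil


def find_git(path=None):
    """Return the full path to a git executable on PATH, or None."""
    return shutil.which("git", path=path)


def point_gitpython_at(git_path, env=os.environ):
    """Export the variable GitPython checks for an explicit git binary."""
    env["GIT_PYTHON_GIT_EXECUTABLE"] = git_path
    return env


if __name__ == "__main__":
    git_path = find_git()
    if git_path is None:
        # This is the situation the traceback above describes.
        print("git is not on PATH: %s" % os.environ.get("PATH", ""))
    else:
        point_gitpython_at(git_path)
        print("Using git at %s" % git_path)
```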
Next Issues
I thought: all right, simple, how hard could it be to install git on the $PATH?
It turned out to be pretty tough. I looked all over the internet and stumbled onto an article by cloudbriefly.com. At first glance, it seemed way too complicated. However, the more I read it, the more it made sense.
I figured I'd give it a shot and run their Python code before running my GitPython code.
I downloaded the git binary from the Amazon repositories and added the /tmp/ path to my $PATH.
Still, this resulted in:
git.exc.GitCommandNotFound: Cmd('git') not found due to: FileNotFoundError('[Errno 2] No such file or directory: 'git': 'git'')
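The PATH change described above can be sketched in Python as follows. This is not the code from the cloudbriefly.com article: `prepend_to_path`, `is_executable`, and the `/tmp/git/usr/bin` directory are hypothetical names chosen for illustration. The sketch does not explain why the attempt failed. It only shows the kind of check that helps narrow the problem down, namely whether the binary exists at the expected location and is executable.

```python
import os


def prepend_to_path(directory, env=os.environ):
    """Put `directory` at the front of PATH unless it is already listed."""
    entries = [p for p in env.get("PATH", "").split(os.pathsep) if p]
    if directory not in entries:
        entries.insert(0, directory)
    env["PATH"] = os.pathsep.join(entries)
    return env["PATH"]


def is_executable(path):
    """True if `path` is an existing file this process is allowed to execute."""
    return os.path.isfile(path) and os.access(path, os.X_OK)


if __name__ == "__main__":
    # Hypothetical location of an unpacked git binary under /tmp/.
    git_dir = "/tmp/git/usr/bin"
    prepend_to_path(git_dir)
    print("git present and executable:", is_executable(os.path.join(git_dir, "git")))
```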
Solution
I was about to give up when I stumbled upon a Lambda Layer that promised to include the binaries for ssh and git regardless of the Lambda runtime.
I had never used Lambda Layers before but remember attending a session about them at AWS re:Invent.
(Read about Lambda Layers here)
It seemed too simple. I had searched the internet for hours for a way to install git on AWS Lambda functions.
I included the layer in my serverless.yml file like so:
run-git:
  handler: src/handler/run_git.lambda_handler
  name: ${self:provider.stage}-${self:service}-git
  description: run git commands from lambda
  memorySize: 256
  timeout: 30
  layers:
    - arn:aws:lambda:us-west-2:553035198032:layer:git:5
I deployed my function and BOOM! It worked. Just like that. My Lambda function now included a layer that allowed me to use the git binary. So simple. Now I could play around with git in my Lambda functions.
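A quick way to confirm that a layer like this is doing its job is a smoke test that shells out to git. The sketch below is a minimal example and not part of the original project. It assumes the layer puts git somewhere on the function's PATH (Lambda unpacks layers under /opt, and /opt/bin is on the default PATH), and the handler shown is a hypothetical stand-in for the real one.

```python
import subprocess


def git_version(git="git"):
    """Return the output of `git --version`, or None if git cannot be run."""
    try:
        result = subprocess.run(
            [git, "--version"],
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            universal_newlines=True,
            check=True,
        )
    except (OSError, subprocess.CalledProcessError):
        # OSError covers the FileNotFoundError raised when git is missing.
        return None
    return result.stdout.strip()


def lambda_handler(event, context):
    # Hypothetical smoke-test handler: report whether git is usable.
    return {"git_version": git_version()}
```

If the returned value is None, the binary is still not reachable and the GitPython calls will fail the same way as before.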
Conclusion
This should help you install git on AWS Lambda functions and use the GitPython Python package.
PM me on LinkedIn if you have any questions! Info should be located on the left.
-- Miguel Lopez