With the rapid pace of development in DevOps, it is sometimes valuable to take a step back and review the decisions made along the way. It is easy to make a mistake early in the development process that prevents your code from being reproducible as a byte-for-byte copy down the line, which can impact your customers, your ability to create hotfixes for them, and your developers, who may get inconsistent results while working on new features. Hopefully most of this is review. One note before we begin: not all of this is as critical when working on a SaaS application that you host yourself, though following these practices can still help your development process. With all that said, let's dive into dependency management.
Dependency management is not a new problem; it has existed for decades. If you're not familiar with it, it is essentially how you manage the libraries your code depends on. There are some obvious issues, as well as more subtle problems, that can prevent you from getting the same output when rebuilding the same code. Take a look at this Python requirements.txt and we'll go over what kinds of issues you might run into:
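The file itself didn't survive formatting here, but based on the discussion that follows it would have looked something like this (the `>=` pin on Jinja2 is the point of interest):

```
jinja2>=2.8
cherrypy==6.0.1
```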
In this requirements.txt file, Jinja2 and CherryPy will be installed. The problem is that the version of Jinja2 is not locked down; all we are specifying is that the version must be greater than or equal to 2.8. If you need to make a hotfix for a customer in the future, it's possible you'll get a different version of Jinja2. This is easily solved by changing the >= to ==. Let's go over a slightly less trivial example of a similar issue.
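The updated file, with both direct dependencies pinned to exact versions, would look like:

```
jinja2==2.8
cherrypy==6.0.1
```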
With this updated version of the requirements.txt from the previous example, we are one step closer to solving the problem. We should always get the same versions of our dependencies every time, right? Not quite. While we will get the correct versions of Jinja2 and CherryPy, those aren't all the dependencies we will actually get. If you look at the setup.py file for CherryPy 6.0.1, you'll find this little section:
```python
install_requires = [
    'six',
]
```
CherryPy depends on the `six` library, but we haven't specified a version for it in the requirements.txt, and the setup.py states that any version will work: it only says that `six` has to be installed, without pinning a version. To make your build truly reproducible, you need to specify your own requirements as well as the requirements of the libraries you use. Note that leaving versions somewhat open ended is actually good practice for a library, either by giving a range or, as CherryPy has done here, by not specifying a version at all. The reason is that very few projects have only one dependency, so some dependencies may overlap. For example, if CherryPy required exactly `six==1.9.0` but another one of your libraries needed a newer version, you would run into a conflict and need to do some rework to make everything fit together. For an application, though, the final requirements.txt should look like this to maximize your ability to reproduce a byte-for-byte replica of the artifact down the line:
```
jinja2==2.8
cherrypy==6.0.1
six==1.9.0
MarkupSafe==1.1.0  # Comes from the jinja2 requirements
```
This makes sure that every library, all the way down the dependency tree, is pinned to an exact version. This is considered a best practice; while it means a little more work up front, it prevents hard-to-find bugs that appear when different clients, or even different developers, are using different versions of the same library. It does not lock down the OS you are running on, but that will be discussed further on.
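One low-effort way to produce this kind of fully pinned file is to install your direct dependencies into a clean virtual environment and then capture everything pip resolved, transitive dependencies included. A minimal sketch:

```shell
# Inside a fresh virtualenv where you've installed only your direct
# dependencies (e.g. pip install jinja2==2.8 cherrypy==6.0.1),
# capture every installed package at an exact version:
pip freeze > requirements.txt
```

The output includes transitive dependencies like `six` and `MarkupSafe` automatically, so nothing lower in the tree is left floating.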
While the above is the most common dependency issue, what happens when a dependency disappears entirely? This isn't a common scenario, but one example is the left-pad chaos of 2016, when a disgruntled developer pulled their library from NPM, a library that some of the most popular packages (like Babel) depended on. It was fixed relatively quickly, but for a short time the internet was essentially broken. Obviously, if this happens to a library you depend on, you will be unable to compile or run your code. Thankfully, there are tools that mirror upstream package managers like NPM, PyPI, Maven, etc. The options vary in which languages they support and what they cost, but two examples of type-agnostic artifact management solutions are Sonatype Nexus and JFrog Artifactory. Beyond those, there are plenty of artifact management tools that mirror upstream repos of a single type, like Verdaccio (NPM), devpi (PyPI), or the official Docker Registry. Actually setting these up is outside the scope of this article, but the idea is that anything you depend on will still exist for you, even if it gets deleted upstream. On top of that, the cache gives you the added benefit of not having to reach out to the internet every time you need to pull in a library; you can pull it over your own local network via your cache.
One word of warning, though: if you host your own artifact management system and there is an issue with a dependency that gets it blacklisted, like what recently happened to firstname.lastname@example.org, then it is on you to remove it from your own caches. If you weren't running your own mirror, this wouldn't be an issue, since the upstream repo has already blacklisted and purged it; but because your mirror already knows about the artifact, it won't reach back out to the upstream repository and learn of the purge. For more information on that breach, check out this blog post by snyk.io.
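As a sketch of what routing through a mirror looks like on the client side, here is a pip configuration pointing at an internal proxy; the URL is hypothetical and depends entirely on your Nexus/Artifactory/devpi setup:

```
# pip.conf (e.g. /etc/pip.conf or ~/.pip/pip.conf)
# Route all installs through an internal PyPI mirror so deleted
# upstream packages remain available from your own cache.
[global]
index-url = https://mirror.internal.example.com/repository/pypi-proxy/simple
```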
When it comes to building, testing, and deploying applications in a reproducible fashion, container technology like Docker or rkt has made life easier for developers. Design a container and it can run anywhere, which is great; however, there are some considerations to keep in mind when designing your container. The first consideration is tags, and here is an example of an issue where you might get different results than you expect:
```dockerfile
FROM python:3
WORKDIR /opt
COPY requirements.txt .
RUN pip install -r requirements.txt
COPY app/ ./app
```
Let's look at the example above and think about what could go wrong. For reference, I suggest looking at the tags available for the official python image. You'll notice that `python:3` is a shared tag; more specifically, it points to whatever the latest GA release of Python 3 happens to be. So if you build your application using `FROM python:3`, you aren't able to guarantee you'll get the same version of Python, which could be a huge deal. And if you don't specify a tag at all, you will simply get the `latest` tag, which generally changes with every build and definitely isn't what you want when aiming for reproducibility. You could make your tag more specific, for example `FROM python:3.7.1`, which makes it far less likely that the environment will change out from under you and is probably sufficient for an official image like this one. But what if the image came from someone you don't necessarily trust not to reuse tags? There's still a solution for you: the image hash. Let's take the `FROM python:3` example and make it as reproducible as possible:
```dockerfile
FROM python@sha256:3870d35b962a943df72d948580fc66ceaaee1c4fbd205930f32e0f0760eb1077
WORKDIR /opt
COPY requirements.txt .
RUN pip install -r requirements.txt
COPY app/ ./app
```
By specifying the sha, you are guaranteed to get the exact same image even if the tag (3.7.1) is reused down the line. It also means that if you certify an image as safe (for example, `random_user/python:3.7.1`), its owner cannot later point 3.7.1 at an image that includes a bitcoin miner, because you are not using the tag to pull the image at all. That all being said, this route is not for everyone or every case: the hash is not user friendly, and while you can leave a comment next to it, comments go stale fast. This method is for the super paranoid who want to validate that they are getting exactly the content they expect.
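You don't have to dig the digest out by hand. As a sketch, assuming a local Docker daemon, you can pull the tag once and ask Docker which digest it resolved to:

```shell
# Pull the tag once, then look up the digest it resolved to
docker pull python:3.7.1
docker inspect --format '{{index .RepoDigests 0}}' python:3.7.1
# The python@sha256:... value it prints can then go in your FROM line
```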
Finally, Docker is great because its layered filesystem allows for caching; however, where there's caching there can be inconsistencies. Let's say you decide to use the image `python:3` even though we went over why you shouldn't, and you have multiple servers building Docker images in your CI pipeline, load balanced by something like Jenkins or GitLab CI. The question is: whose copy of `python:3` is the correct one? This can be resolved by using a more exact tag or the sha method mentioned above, but another option that can help is the `--no-cache` flag, which ignores the local layer cache when building (pair it with `--pull` to also force a fresh pull of the base image). While our `python:3` example still wouldn't produce a reproducible image in the long run, in the short term you will get more consistent images: different CI servers might have different versions of the `python:3` image downloaded, and by bypassing the cache each build starts from the same freshly pulled base. I still suggest using a more explicit tag or the container sha method, but if you want to take advantage of semver and point at a shorter tag like `python:3`, it is imperative that you understand these potential pitfalls.
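As a sketch, a CI build step that bypasses stale local state (the image name `myapp` is just a placeholder) would look like:

```shell
# Rebuild from scratch: ignore cached layers (--no-cache) and
# re-check the registry for a newer base image (--pull)
docker build --no-cache --pull -t myapp:latest .
```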
I've spent a lot of time discussing ways to ensure byte-for-byte reproducibility, but it's possible that isn't something you need. If you run a SaaS application that gets multiple updates per day, being able to recreate the exact same build is less of a concern: you can simply roll back a change if there's an adverse effect, and there's no real need to create hotfixes for customers, since every customer runs on your managed instances. These practices will still help you, but they are not as critical as they are for someone who ships code for customers to run themselves. If you are creating a product that customers operate on their own, being able to produce a hotfix that you know contains no changes besides the ones you made is critical, as every change brings the potential for a new bug. Either way, the practice of making builds reproducible, by explicitly pinning your dependencies and their dependencies, proxying them through something like Artifactory or Nexus, and using the most specific tag possible on Docker images, will make life easier for both your developers and operations in the long run, and should increase the speed at which products go out the door.