Building an IoT CI/CD pipeline (AWS, Ansible, Jenkins, Docker and git)
Creating a CI/CD pipeline suitable for an IoT/ Machine Learning project
The rise of the Internet of Things has forced people to re-think how software for embedded systems gets developed. As devices become more connected, traditional project management frameworks are no longer viable. With these small devices with less computational ability and memory a continuous integration and deployment pipeline that allows the team to follow Agile/Scrum methodologies is needed. In this article, I am going to go over how I built a CI/CD pipeline for an IoT project I was working on. The stack I am using includes Ansible, Jenkins, AWS IoT, Docker and git. The entire pipeline will be explained a bit more in this article.
There are a couple of requirements I had for the IoT project I was working on.
I needed a way to update and deploy new code changes to the devices at scale.
The tests needed to be run and the code built for the right environment.
The devices needed to be configured with security certificates and config files automatically.
Git was used for the version control for my project. A simple repository was created on Github to contain the code. For the package control, Docker was used along with Dockerhub for the repository. Docker allows for applications to run with memory virtualization, similarly to how Virtual Machines work. Unlike VMs, Docker containers do not require a guest OS for the application layer or a hypervisor and allow for so-called “microservices” architecture.
To build a regular python package as a Docker image we need to include a Dockerfile in the package. This is essentially just a script that includes a list of dependencies, an environment to build it for (Python ARM32) and a script in the package to run. The Dockerfile can be seen below.
FROM arm32v7/python:3 WORKDIR /usr/src/app COPY requirements.txt ./ RUN pip install - no-cache-dir -r requirements.txt COPY . . ENTRYPOINT [ "python", "./main.py" ]
In order to run the unit tests and build the docker image for the correct environment (ARM32 for the Raspberry Pi), Jenkins was used. The Jenkins server polls the Git repo for commits then builds the image using the Docker plugin. To build the image for the correct environment I needed to run it on the same host so I set up a Jenkins server on a raspberry pi. This could be implemented on a micro-instance running Raspbian but it was easier for me to set it up on a local device. More information on the integration of the two services can be found here.
In order to onboard the new devices, I needed a configuration management tool. Ansible allows me to write scripts on a master host, define a list of host IP addresses to connect to and run the scripts on those hosts. Ansible is extremely easy to set up because it uses SSH to connect to hosts. This means that an agent doesn’t need to be installed on the slave host and all that is required is a user with root permissions (depending on what you’re executing in the script). This isn’t as scalable as other configuration management tools like Puppet or Chef but for onboarding IoT devices, it’s a more than a suitable tool. There are three main scripts that I wrote on the Ansible master host were to set up a user for SSH, create the config file containing hostname/ MQTT topics and a script to set up the AWS certificate files. The IoT project I was working on needed to connect and send data to the AWS IoT cloud service. AWS IoT has a feature called Just In Time Provisioning (JITP) which allows you to create your own CA certificate and automatically provision resources when a device first connects. It makes onboarding devices much more scalable and doesn’t require the engineer to manually SSH into the device to set it up. The YAML script I wrote to do this includes various OpenSSL CLI commands to create the CA certificate and the AWS CLI to attach a IAM policy and credential to the certificate. In order to run the commands on the hosts, I created an IAM policy that has very few abilities (mainly get registration code and attach policy). The access key and secret access key is used when executing the commands. I also create an IAM Role for the JITP itself. By using environment variables the slave hosts do not have access to these keys meaning there is less security risk than hard-coding them. The script mentioned can be found in this Git repo here.
With the above-combined tools and services the IoT devices can onboard automatically and update the software they are running while minimising the memory necessary. There are a couple of things that could be added to make this pipeline even better such as using Kubernetes to update the docker image automatically.