As a solution architect, one of the reasons I enjoy this kind of consulting work is the chance to come up with different design approaches and pick the one that best meets the client’s brief. What happens, then, when there is no brief and you are both the architect and the client? Well, this post gives you an insight into my thinking, what influenced the decisions I made, and the lessons I learned along the way.
Going from knowing little about pipelines, or AWS in general, to wanting to build a fully automated deployment pipeline was no small feat to take on at the time. This was the first project I worked on after joining The Scale Factory. I was looking forward to it, as the view of pipelines I had as a developer was different from the view you get when building one or two (more on this later). This meant that I could break it, mend it, and uncover the inner workings. To add to that, I was completely in charge of every decision (this doesn’t happen on client projects), which gave me an opportunity to explore, try different architectures, and see what worked and what didn’t. What better way to experiment? As some of you may know, projects like this rarely come to an end: they are either abandoned or kept as a side gig for constant tweaking.
The next bit was deciding which AWS compute resources to pick, keeping in mind that the source code had some frontend state. The first option I considered was deploying on EC2 (Elastic Compute Cloud), which would mean I was responsible for configuring, securing (in the cloud), and patching the instance. I also wanted to examine the alternative of using Amazon ECS with Fargate: a managed, serverless compute engine for containers, including some logging capability via CloudWatch Logs.
Choosing AWS Fargate was an easy decision as I had the container images ready to go. I could let AWS worry about the underlying infrastructure (although I still spent some time trying to understand the different NetworkMode settings for Fargate, and had to get my head around ECS port mappings). To me this was a small price to pay in exchange for the other benefits it came with. The next question I wanted to tackle was: how do I build and host these container images? With a little digging, I found Amazon ECR (Elastic Container Registry), for uploading Docker-compatible images. I also picked AWS CodeBuild for image builds and found that the two services integrated seamlessly. To make this work, I added a buildspec.yaml file for CodeBuild, which had instructions on how to build the image and where to push it, i.e. to the ECR repository I’d created.
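To give a feel for what those build instructions boil down to, here is a minimal Python sketch that assembles the image URI and the usual login, build, and push commands. It is not the buildspec from this project: the function names, account ID, region, repository name, and tag are all hypothetical placeholders, and a real buildspec would normally read these values from environment variables.

```python
from typing import List


def ecr_registry(account_id: str, region: str) -> str:
    """Return the hostname of a private ECR registry."""
    return f"{account_id}.dkr.ecr.{region}.amazonaws.com"


def ecr_image_uri(account_id: str, region: str, repository: str, tag: str) -> str:
    """Return the full URI of a tagged image in a private ECR repository."""
    return f"{ecr_registry(account_id, region)}/{repository}:{tag}"


def build_and_push_commands(
    account_id: str, region: str, repository: str, tag: str
) -> List[str]:
    """List the shell commands a build step would run, in order."""
    registry = ecr_registry(account_id, region)
    image = ecr_image_uri(account_id, region, repository, tag)
    return [
        # Authenticate the Docker client against the ECR registry.
        f"aws ecr get-login-password --region {region} "
        f"| docker login --username AWS --password-stdin {registry}",
        # Build the image from the Dockerfile in the current directory.
        f"docker build -t {image} .",
        # Push the tagged image to the repository.
        f"docker push {image}",
    ]


if __name__ == "__main__":
    # Placeholder values, for illustration only.
    for command in build_and_push_commands(
        "123456789012", "eu-west-2", "my-app", "latest"
    ):
        print(command)
```

In a buildspec these commands would typically sit under the build phases, with the tag derived from the commit being built rather than hard-coded.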
It was now time for the build, and I had a couple of options here: I could click through the AWS web console to deploy the app, or write infrastructure as code (IaC). Choosing the latter meant spending a bit more time on this project in the short term, but it had long-term benefits, as I could reuse both the actual code fragments and the skills gained on future projects. I could also see that the IaC approach promised faster redeployment to a different account or region, as well as the option of deploying at scale. Deciding which tool to use wasn’t difficult: Terraform is platform agnostic and has beginner-friendly documentation, which made it the more attractive choice.
At this point, I had an infrastructure pipeline in Terraform running locally from my computer and, whilst that was fine, it wasn’t a fully automated process. What if this was a much bigger team project? To understand how to implement that for AWS, I needed a way to centrally trigger Terraform runs. I spoke to my colleagues about the problem and GitHub Actions came highly recommended. I spent some time learning how to use it, then went on to create a pipeline that worked really well.
So, I had GitHub Actions provisioning the infrastructure on every push to the GitHub repository. Next on the list was getting the deploy pipeline to build and deploy the application itself. I already had CodeBuild to build Docker images, ECR to store them, and ECS Fargate for compute. At this point I realised something was missing: how do I tie everything together and have a sequential flow from one stage to the next?
The workflow I had in mind needed an entry point: a way to start the build, something that could integrate with GitHub to fetch the right source commit on push, as well as integrate with the other AWS services. This is where AWS CodePipeline came into the picture. CodePipeline is a fully managed continuous delivery service and it integrates well with CodeBuild, ECR, and ECS Fargate. This was absolutely perfect: not only did it fulfill all of the requirements above, I could also see which steps were failing directly from the console.
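The sequential flow I was after can be pictured with a toy model in Python. This is only a sketch of the idea, not the CodePipeline API: the `run_pipeline` function and the Source, Build, and Deploy stage names are assumptions made for illustration. Each stage runs only if the one before it succeeded, and the per-stage status is what makes a failing step easy to spot.

```python
from typing import Callable, List, Tuple

# A stage is a name plus an action that returns True on success.
Stage = Tuple[str, Callable[[], bool]]


def run_pipeline(stages: List[Stage]) -> List[Tuple[str, str]]:
    """Run stages in order; once one fails, mark the rest as skipped."""
    results: List[Tuple[str, str]] = []
    failed = False
    for name, action in stages:
        if failed:
            # An earlier stage failed, so this one never starts.
            results.append((name, "skipped"))
            continue
        succeeded = action()
        results.append((name, "succeeded" if succeeded else "failed"))
        failed = not succeeded
    return results


if __name__ == "__main__":
    # Hypothetical stages: fetch the commit, build the image, deploy it.
    demo_stages: List[Stage] = [
        ("Source", lambda: True),
        ("Build", lambda: False),  # simulate a failing image build
        ("Deploy", lambda: True),
    ]
    for stage_name, status in run_pipeline(demo_stages):
        print(f"{stage_name}: {status}")
```

In this simulated run the build fails, so the deploy stage is never attempted, which is the behaviour you want from a delivery pipeline.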
So far I’ve made no mention of an application load balancer and auto scaling groups - this was because at the time I was mainly focused on getting a pipeline going and knew that these changes would be much easier to add to the Terraform pipeline if / when the need arose.
A lot of things became apparent as I was in the process of building this pipeline. I set off wanting to build one pipeline but ended up with two, as I could see the need for separation of concerns. Pipeline 1 was triggered when a developer pushed code changes: a new image was built and deployed. Pipeline 2 was triggered when there was an infrastructure code change, and kicked off a Terraform run.
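One way to picture that separation is as a routing rule on the files that changed. The Python sketch below assumes, purely for illustration, a single repository where the Terraform code lives under a `terraform/` directory; the directory name, the pipeline labels, and the function are all hypothetical and not taken from the actual project.

```python
from typing import Iterable, Set

# Hypothetical layout: infrastructure code lives under this prefix.
INFRA_PREFIXES = ("terraform/",)


def pipelines_for_change(changed_paths: Iterable[str]) -> Set[str]:
    """Return which pipelines a set of changed files should trigger."""
    triggered: Set[str] = set()
    for path in changed_paths:
        if path.startswith(INFRA_PREFIXES):
            # Infrastructure change: run Terraform.
            triggered.add("infrastructure")
        else:
            # Anything else is treated as application code:
            # build a new image and deploy it.
            triggered.add("application")
    return triggered


if __name__ == "__main__":
    print(sorted(pipelines_for_change(["src/index.js", "terraform/main.tf"])))
```

A push that touches both kinds of file triggers both pipelines, while a Terraform-only change leaves the application image alone.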
Being in charge of this project made it easy to pivot if something didn’t go as planned. Most of my decisions were based on what tools would get the job done quickly; there wasn’t a cloud of legacy systems and dependencies hovering over my head as I made them (real-world problems). As for the cost implications, it was a tiny fraction of the cost of buying, configuring, and managing physical hardware (one of the many perks of cloud engineering).
Documenting this process made me realise that the path most pipeline automation projects take is not linear; being able to regroup and pivot based on new insights or findings is a valuable option to have. Picking up new skills along the way is inevitable. Every path I took solved a problem I was facing at the time.
Looking back, maybe I could have looked at using Terraform Cloud for storing state instead of an S3 bucket with a DynamoDB table for locking. Maybe I would have refactored the application to remove the state from the frontend before deployment.
Did I always make the best decisions? Maybe not… Did I learn in the process? Definitely yes! Would I change the decisions I made? No, because those hurdles gave me an opportunity to learn and grow.
This blog is written exclusively by The Scale Factory team. We do not accept external contributions.