The cost of not going cloud native

IT transformation initiatives cost you money. The most successful ones often cost less than you think.

What if you don’t change, and you’re staying with a legacy architecture? I’m going to talk about what that can cost you.

Martin Vorel

Before I jump in, let’s make sure we agree on what I mean by cloud native. Luckily for me, some smart people at the CNCF came up with an online definition, so I’m essentially going to repeat the key points from that.

Cloud native is about letting your organization run workloads that can scale. This doesn’t mean you need to be the size of Facebook or Google; what’s important is that you’re able to operate at a scale that’s right for you, and to adapt as your needs change. You can do this using utility computing services such as AWS, Azure or Google Cloud, if you want to.

Cloud-native architectures are loosely coupled. There’s no enterprise-wide, mandatory systems component, such as a service bus. You might use the same tool to deploy every workload, but if you wanted to switch away, you could. Cloud native also believes strongly in automating away toil as a means to enable small, frequent iterations.

Cloud optional

As a concept, cloud native doesn’t mean “it runs on someone else’s computer”. It might well—but these are different ideas. Let’s be clear, though: if your architectural choices or your existing set of policies mean it can’t run on someone else’s computer, that’s going to be limiting.

If you can’t run on the cloud, you (usually) can’t use managed services. In the world of on-premises hosting, you can run an internal market that provides defined, managed offerings such as NFS NAS, or managed virtualisation, or a PostgreSQL cluster per project. You might even have an online service catalog for each of these.
The good news is that you don’t have to pay the cost of having a cloud provider manage this for you; the bad news is you do get the cost of running that in house. Given the scale of the firms who make this their core business, delivering even a much lesser service at the same or lower cost is a big ask.

Shared resources

Does your deployment pipeline involve shared resources? Deployment pipelines are all about standardising the routes for taking code from starting its life in a colleague’s brain into a place where it’s delivering value. There’s almost always a shared element at some point in that process. I’m proposing that cloud-native architectures let you pick the parts that are shared, and that choice lets you avoid having people spend loads of time waiting on in-use resources.

I want to be clear: building code in the cloud doesn’t automatically mean you can sit back and count this as done. However, there are plenty of places where a cloud native architecture helps.

If you’re bought into the cloud native mindset, you define your infrastructure and your build process using declarative code or APIs. This means that when you’re feeling the pain of waiting a long time to run integration tests on a single shared environment, you can choose an alternative. Maybe you’re going to use that declarative code to make two test infrastructures, cutting the wait time in half if they’re constantly in use. Maybe you’re going to autoscale at the infrastructure level, so you have 0 test systems at the weekend when your colleagues are away from their computers, and n of them during the working week.
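To make the weekend example concrete, here’s a minimal sketch of the scaling decision on its own. The function name, the `queued_runs` input and the `max_envs` cap are all my own assumptions for illustration; in practice you’d feed a number like this into whatever scaling mechanism your platform gives you.

```python
from datetime import datetime


def desired_test_environments(now: datetime, queued_runs: int, max_envs: int = 4) -> int:
    """Return how many test environments to keep running (hypothetical policy).

    Weekends: scale to zero, because nobody is waiting on test results.
    Working week: one environment per queued test run, with at least one
    kept warm and never more than max_envs.
    """
    if now.weekday() >= 5:  # Monday is 0, so 5 and 6 are Saturday and Sunday
        return 0
    return max(1, min(queued_runs, max_envs))


if __name__ == "__main__":
    # A Wednesday morning with ten test runs waiting.
    print(desired_test_environments(datetime(2024, 1, 3, 9, 0), queued_runs=10))
```

The point isn’t the arithmetic: once the infrastructure is defined declaratively, the number of environments becomes something you can compute rather than something you negotiate for.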

Big ticket changes

How does the team you’re in feel with a release imminent (or, worse, overdue)? Does the mood switch to risk management and careful testing? Do you freeze the code for changes (if you do: just one branch, or the whole thing)?

Probably lots of you are already sold on continuous delivery. Great! A few years back, quite a few of our engagements at The Scale Factory involved an element of winning over hearts and minds about bringing CI/CD thinking into development and release cycles. These days, we’re seeing the opposite: developers take these processes for granted. It’s those expectations that you now have to meet.

Small batches help you deliver more output overall. Sometimes, though, the business needs to make a major change: unveiling a new product, for example, or adding a new payment method. Can you release that kind of update with the same confidence as a regular update?

Clearly, going cloud native doesn’t take away all your risks. The new product might not sell, no matter how smoothly you deploy. Still, there are techniques in the cloud-native toolkit that are worth drawing on: feature flags, canary deployments, and distributed tracing are all part of the picture here. The more you can take away the IT side of worrying about a major release, the more time you free up to focus on the business outcomes and the value.
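If feature flags and canary releases are new to you, this is roughly what a percentage rollout looks like. It’s a sketch, not a recommendation of any particular flagging product: the function name, the flag name and the user ID format are made up for the example.

```python
import hashlib


def in_rollout(flag: str, user_id: str, percent: int) -> bool:
    """Decide whether a user sees a flagged feature (hypothetical helper).

    Hashing the flag name together with the user ID puts each user in a
    stable bucket from 0 to 99. The same user always gets the same answer
    for a given flag, and raising percent only ever adds users.
    """
    digest = hashlib.sha256(f"{flag}:{user_id}".encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:4], "big") % 100
    return bucket < percent


if __name__ == "__main__":
    # Start the new payment method with 5% of users, then widen it later.
    users = [f"user-{n}" for n in range(1000)]
    exposed = sum(in_rollout("new-payment-method", u, 5) for u in users)
    print(f"{exposed} of {len(users)} users see the new payment method")
```

Shipping the code dark and then turning the dial is what lets a big-ticket change go out with the same routine as a small one.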

Skills and experience

If you’re not already running cloud native, and you’re sold on the benefits so far, you’re going to want a team with the skills and experience to deliver those changes.

What happens next is going to have a big impact on any transformation. More and more IT professionals are used to working on and with automated infrastructure. They’ve got used to cloud-native ways of working, taking it for granted they can call an API to define a database or add a compute task.

Can your team choose the tools they’d like to use? Again, this isn’t just about being in the cloud. I mean, sure, it’s frustrating to put in a form Q7 to ask the database team to define a new tablespace in the shared DBMS, but it’s also a pain to have to learn a bespoke technology to submit a pull request to run some code to set that database up in the cloud.

Ideally, you get to pick something widely known; Docker as a container image format is a great example. Even if you’re not using Docker itself at any point in the process, you still get the benefits of a well-known technology where a new hire can get up to speed really quickly.

Compliance beyond the paperwork

You might be wondering how all of this fits into regulated environments. The short answer is “very well” and, in fact, cloud native IT architectures can support high-assurance workloads across a range of business and public sector organisations. Let’s look at a few details.

Maybe your firm or your industry regulators want a detailed list of changes when you update infrastructure? Tools such as Terraform can capture evidence both ahead of time (a Terraform plan) and as you make the changes. You can digitally sign these, send them for approval and keep an archival copy. We set this up for a customer and, although it’s added extra steps when they update their production systems, it hasn’t held up their cloud adoption journey.
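As an illustration of the evidence-capture idea, here’s a small sketch that turns the JSON form of a Terraform plan (what you get from `terraform show -json` on a saved plan file) into a change list plus a digest you could archive. It isn’t the setup we built for that customer; the function names and the record layout are assumptions of mine, and the SHA-256 digest is only a stand-in for a proper digital signature.

```python
import hashlib
import json


def summarize_plan(plan: dict) -> list[str]:
    """List the planned changes, one line per resource that will change."""
    lines = []
    for rc in plan.get("resource_changes", []):
        actions = rc["change"]["actions"]
        if actions == ["no-op"]:
            continue  # unchanged resources are not evidence of a change
        lines.append(f"{'/'.join(actions)} {rc['address']}")
    return sorted(lines)


def evidence_record(plan: dict) -> dict:
    """Build an archivable record (hypothetical layout) for one plan."""
    # Canonical JSON, so the same plan always produces the same digest.
    canonical = json.dumps(plan, sort_keys=True, separators=(",", ":"))
    return {
        "sha256": hashlib.sha256(canonical.encode("utf-8")).hexdigest(),
        "changes": summarize_plan(plan),
    }


if __name__ == "__main__":
    # A hand-written fragment in the shape of Terraform's plan JSON.
    sample = {
        "resource_changes": [
            {"address": "aws_instance.web", "change": {"actions": ["create"]}},
            {"address": "aws_s3_bucket.logs", "change": {"actions": ["no-op"]}},
        ]
    }
    print(json.dumps(evidence_record(sample), indent=2))
```

A reviewer can read the change list, and the digest ties their approval to one exact plan.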

What if you need to enforce two-person access control? Whether you use GitHub, your own hosted source control system, or something else from the market, there are loads of options to enforce code review rules. Many of them (like GitLab and GitHub) can run on-premises or in the cloud, whichever is the right fit for you. You can add automation to recommend or enforce additional reviews and processes, and you don’t have to teach a bunch of new skills to people who already understand source code management.

OK, but what if changes are denied without management approval? To be honest, that’s the same problem written differently. The challenge is to make the process of reviewing a code change also become the process of approving the update that deploys it. Once you get that in place, the rest is pretty straightforward.
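Here’s a sketch of what “the review is the approval” can mean as a rule a pipeline could check before deploying. Hosted tools give you this through their own review settings, so treat the code purely as an illustration: the function, its parameters and the example names are all hypothetical.

```python
def may_deploy(author: str, approvers: set[str], managers: set[str],
               require_manager: bool = False) -> bool:
    """Apply a two-person rule, optionally requiring a manager's approval.

    The author's own approval never counts, so at least one other person
    has always looked at the change before it ships.
    """
    independent = approvers - {author}
    if not independent:
        return False
    if require_manager and not (independent & managers):
        return False
    return True


if __name__ == "__main__":
    managers = {"priya"}
    print(may_deploy("sam", {"sam"}, managers))                           # self-approval only
    print(may_deploy("sam", {"alex"}, managers))                          # one independent reviewer
    print(may_deploy("sam", {"alex"}, managers, require_manager=True))    # no manager yet
    print(may_deploy("sam", {"alex", "priya"}, managers, require_manager=True))
```

Management sign-off is the same check with a narrower set of acceptable reviewers, which is why I call it the same problem written differently.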

There’s more than one way to do it

The cloud native definition avoids naming specific implementations, but does call out some technology families and architectural patterns: immutable infrastructure, sure, but also service meshes, microservices, and containers. Definitely don’t get seduced by the idea that “cloud native” means setting up a Kubernetes cluster and using that to run all your workloads—it doesn’t.

If you want to be able to hire and keep a team that can keep your systems current, it’s important to pick enough of these ideas so that your colleagues can automate themselves out of having to pick up the toil. This isn’t about freeing up their time to deliver more value: it’s about providing interesting work that’s aligned with the outcomes you want. Your sharpest staff know what tasks they enjoy, and repeated rework isn’t it.

I’ve been there and I know what it’s like to nurse systems back to health for the umpteenth time, whether these are single servers or a cluster of containers. I know we say that boring is powerful – but sometimes, it’s just boring. The next time you’re hiring for an IT engineering role, make sure that the work you’re offering isn’t.

Adopting cloud infrastructure and services lets you try out ideas without big capital commitments. You can go beyond that to buy in to cloud-native ideas and architectures; when you do, you and your colleagues get a great opportunity to experiment, learn, and adapt.


I’m a consultant at The Scale Factory, where we empower technology teams to deliver more on the AWS cloud, through consultancy, engineering, support, and training. If you’d like to find out how we can support you and your team to level up on cloud adoption, get in touch.