Most of what you read about infrastructure as code (IaC) focuses on how it works or on how to make sure it actually builds what you want it to build.
These are critical areas. But are we thinking enough about how we use this approach in our organization?
As Melinda Marks from ESG states in a company report, “83% of organizations experienced an increase in IaC template misconfigurations” as they continue to adopt the technology.
We know from work done by the Cloud Security Alliance (“Top Threats to Cloud Computing: Egregious Eleven”) and others that misconfigurations continue to be a top risk in the cloud.
IaC is supposed to reduce misconfigurations by systematizing the creation of infrastructure, adding a level of rigor and process that ensures teams are building what they want and only what they want. If ~83% of teams aren’t seeing that, there’s a deeper issue at play.
On smaller teams, ones where the Dev and Ops bits of the DevOps philosophy are together, that makes sense. IaC allows these small teams to use the same language, code, to describe everything they’re doing.
This is why we’re seeing even higher-level abstractions than tools like Terraform or AWS CloudFormation in the AWS CDK and projects like cdk8s. Those high-level abstractions are more comfortable for developers.
An ops/SRE/platform perspective of a cloud service will be wildly different from a developer perspective of the same service. A developer will look at a queuing service and dive into its interface — a simple endpoint to add and one to read? Sold. That’s an easy integration.
The operational perspective aims to find the edges. When does this queue reach its limit? Is the performance constant or does it change radically under load?
Yes, there are overlapping concerns. And yes, this is a simplified view. But the idea holds. IaC solves a lot of problems, but it can also create and amplify the disconnect between teams. More importantly, it can highlight the gap between the intention of what you’re trying to build and the reality of what you have built.
As a result, this is where the security concerns often escalate.
Most tooling, commercial or open source, is focused on identifying things that are wrong with the infrastructure templates: this is a good construct, making this would be bad. These tools aim to generate these results as part of the continuous integration/continuous delivery (CI/CD) pipeline.
That’s a great start. But it echoes the same language issue.
Who’s Talking and Who’s Listening?
When an IaC tool highlights an issue, who will address it? If it’s the development team, does it have enough information to know why this was flagged as an issue? If it’s the operations team, are the consequences of the issue laid out in the report?
For developers, what often happens is that they will simply adjust the configuration to make the IaC testing pass.
For operations, it’s typically a matter of whether the tests are passing. If they are, then on to the next task. That’s not a knock on either team; rather, it highlights the gap of expectations versus reality.
What’s needed is context. IaC security tooling provides visibility into what is about to (hopefully) be built. The goal is to stop issues before they get into production.
Today’s IaC security tooling is highlighting real issues that need to be addressed. Taking the output of these tools and enriching it with additional context that is specific to the team responsible for the code is a perfect opportunity for some custom automation.
This will also help bridge the language gap. The output from your tooling is essentially in a third language — just to make things more complicated — and needs to be communicated in a manner that makes sense to either a development or an operations audience. Often both.
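As a rough sketch of what that enrichment could look like, the snippet below takes a single scanner finding and appends context written for a specific audience. Everything here is an assumption for illustration: the Finding shape, the rule identifier SG_RULE_NO_DESCRIPTION, and the CONTEXT table are hypothetical and do not match the output format of any particular tool.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    rule_id: str   # identifier emitted by the scanner (hypothetical format)
    resource: str  # resource the rule fired on
    message: str   # raw scanner message


# Hypothetical per-audience context, written and maintained by your own teams.
CONTEXT = {
    "SG_RULE_NO_DESCRIPTION": {
        "dev": "Describe what this rule is for so reviewers can confirm "
               "it matches what you meant to build.",
        "ops": "Without a stated intent, nobody can verify that this rule "
               "is still needed once it is in production.",
    },
}


def render(finding: Finding, audience: str) -> str:
    """Return the raw scanner message plus audience-specific context, if any."""
    base = f"[{finding.rule_id}] {finding.resource}: {finding.message}"
    extra = CONTEXT.get(finding.rule_id, {}).get(audience)
    if extra is None:
        # No context written yet for this rule or audience: pass through as-is.
        return base
    return f"{base}\nWhy it matters: {extra}"


if __name__ == "__main__":
    finding = Finding("SG_RULE_NO_DESCRIPTION", "sg-web",
                      "Add a description for context")
    print(render(finding, "dev"))
    print(render(finding, "ops"))
```

The lookup table is deliberately plain data, so the people who understand each audience can edit the wording without touching the pipeline code.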
For example, when a scan flags that a security group rule doesn’t have a description, why does that matter? Just getting an alert that says “Add a description for context” doesn’t help anyone build better.
This type of flag is a great opportunity to educate the teams that are building in the cloud. Add an explanation that security group rules should be as specific as possible, which reduces the opportunity for malicious attacks. Provide references to examples of strong rules. Call out that without knowing the intention, other teams can’t test the validity of the security configuration.
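To make the security group example concrete, here is a minimal sketch of a check that explains itself rather than only flagging. It assumes a simplified, hypothetical rule shape (a dict with "cidr", "port", and "description" keys), not the actual Terraform or AWS CloudFormation schema, and the wording of the notes is only a suggestion.

```python
def explain_rule(rule: dict) -> list[str]:
    """Return human-readable notes for one security group rule.

    The rule shape is a simplified assumption for illustration.
    """
    notes = []

    # A missing or blank description hides the intent behind the rule.
    if not (rule.get("description") or "").strip():
        notes.append(
            "No description: state what this rule is meant to allow, so "
            "other teams can check the configuration against that intent."
        )

    # 0.0.0.0/0 matches every IPv4 address, the opposite of a specific rule.
    if rule.get("cidr") == "0.0.0.0/0":
        notes.append(
            f"Port {rule.get('port')} is open to any address: narrow the "
            "source range to reduce the opportunity for malicious attacks."
        )

    return notes


if __name__ == "__main__":
    vague = {"cidr": "0.0.0.0/0", "port": 22, "description": ""}
    for note in explain_rule(vague):
        print(note)
```

A rule that is both specific and described returns an empty list, so the same function can double as a pass/fail gate in the pipeline.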
Security is everyone’s responsibility, so acknowledging the language gap between developers and operations will highlight opportunities like this to add simple automations that provide insights to your teams. This will help improve what they are building and, as a result, will drive better security outcomes.
About the Author
I’m a forensic scientist, speaker, and technology analyst trying to help you make sense of the digital world and its impact on us. For everyday users, my work helps to explain the challenges of the digital world. Just how big of an impact does using social media have on your privacy? What does it mean when technologies like facial recognition are starting to be used in our communities? I help answer questions like these and more. For people building technology, I help them to apply a security and privacy lens to their work, so that they can enable users to make clearer decisions about their information and behavior. There is a mountain of confusion when it comes to privacy and security. There shouldn’t be. I make security and privacy easier to understand.