Security Observability has become an important concept recently as companies have started building software with a cloud-native mindset, embracing distributed, immutable, and ephemeral systems. As infrastructure has shifted from traditional deployment methods, older monitoring systems are no longer effective, and a new set of practices — called “observability” — has emerged.
In this post, we explain what observability is and why security observability is important, and we outline six principles that will help you design and monitor your systems for security observability. (For an in-depth discussion, download our new whitepaper: Cloud Security Observability: A Guide to Reducing Your Cloud Native Infrastructure Risk.)
What is Security Observability?
In their book Cloud Native DevOps with Kubernetes, Justin Domingus and John Arundel offer a plain-language definition of observability: “The observability of your system is a measure of how well-instrumented it is, and how easily you can find out what’s going on inside it.” Practitioners broadly agree that teams should design and build observable systems to help their companies achieve their particular business goals.
Operations teams aren’t the only ones facing challenges monitoring cloud-native infrastructures. New infrastructure layers increase attack surfaces, and highly automated, ephemeral infrastructure has forced security teams to adjust their strategies. As this diagram shows, each gain in efficiency brings new risks and security challenges:
Because potential exploitation points aren’t located inside a single perimeter, attackers can leverage new surfaces and processes throughout the software development lifecycle. As such, security observability is essential so Security and DevOps teams can obtain a better understanding of the overall health of their systems, detect abnormalities, and investigate incidents quickly and effectively.
The Six Principles of Security Observability
The rest of this post outlines six principles you can use to help design and monitor your systems for security observability.
1. Embrace change and adaptability
Security has often been treated as a binary: you’re breached or you’re not; you’re secure or insecure. This framing casts security teams as enforcers, reducing risk at all costs, even when that impedes the business.
However, the reality is that security is a business function that should contribute to an organization’s overall financial success, and this means it should be viewed as a spectrum, not a binary. Security professionals should make ongoing decisions about risk to protect and enhance the business’s success, and this requires security observability. This is especially important as companies embrace new technology, whether it be Kubernetes or new deployment and configuration mechanisms.
2. Reduce reliance on perimeter-based controls
The traditional approach to securing a data center focuses on fortifying the perimeter — tracking and controlling traffic that’s entering and exiting. In cloud-native environments, everything is “outside the perimeter,” and this requires a shift to a zero-trust model that concentrates on understanding and controlling who is doing what and where they are doing it. The perimeter is now the authentication layer for any API you care about, from your cloud provider’s to your microservices.
Detecting anomalies regardless of an attacker’s point of entry means expanding your system boundary and instrumenting the different components that make up your system.
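In code, a zero-trust posture means every request carries a verifiable identity, even between internal services. The following is a minimal sketch; the `Authorization` header and the `verify_token` callback are illustrative assumptions, not a specific product’s API:

```python
# Zero-trust request handling: deny by default, trust identity rather than
# network position. verify_token is any function that returns a principal
# name for a valid token, or None for an invalid one.
def handle_request(headers: dict, verify_token) -> str:
    token = headers.get("Authorization", "")
    identity = verify_token(token)  # None means the identity can't be verified
    if identity is None:
        return "403 Forbidden"      # no implicit trust for "internal" callers
    return f"200 OK ({identity})"
```

The same check applies at every API layer you care about, from your cloud provider’s control plane down to calls between your own microservices.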
3. Use monitoring to validate controls
As explained earlier, traditional security typically involves putting controls in place to block attacks or detect known issues, like vulnerabilities. This isn’t wrong, but it’s only part of the picture. Security controls are only effective when they are actually working — and to know they are working, you need to verify. That’s why security controls should have a symbiotic relationship with monitoring. The data collected during monitoring should uncover areas of risk, which should then inform controls. Inversely, the controls need to be verified by the monitoring.
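One way to give controls and monitoring that symbiotic relationship is to probe the control itself. Here is a minimal sketch, assuming an HTTP-reachable resource that a firewall rule or network policy is supposed to deny; the function name and approach are illustrative, not a standard tool:

```python
import urllib.request
import urllib.error

def control_is_enforced(blocked_url: str, timeout: float = 3.0) -> bool:
    """Probe a URL that a blocking control should make unreachable.

    A successful response means the control is NOT working; a connection
    failure means it is holding.
    """
    try:
        urllib.request.urlopen(blocked_url, timeout=timeout)
        return False  # request got through: the control has failed
    except (urllib.error.URLError, OSError):
        return True   # request was blocked: the control is enforced
```

Run checks like this on a schedule and alert when they fail, so a silently disabled control surfaces as a monitoring signal instead of going unnoticed.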
4. Map risk signals to business context with threat modeling
Data is only as valuable as the way it’s leveraged — and too many companies pull tons of data only to find they can’t make sense of it. Security observability is about making sense of the data, so to cut through noise, it’s critical to understand which signals actually matter. This process starts with the question “What are your company’s business goals?” and continues with “What could put these at risk?” This practice is a key component of Threat Modeling, and although Threat Modeling can be complex at times, it can also be as simple as spending a few minutes thinking through the ways that something can be exploited.
Once you’ve begun to understand the threats your business may face, it’s time to decide what the key indicators of risk are for your company — keeping in mind that these can include anything from an alarm-worthy sign that you’ve been breached to a sign that your infrastructure could be misconfigured. You can apply this process to different parts of your infrastructure and every time you make a change. If your company is thinking of adopting containers or moving a development experiment to production, for example, you’ll need to think through how they can be exploited.
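The output of this exercise can be as simple as a worksheet mapping each asset to the threats you identified and the indicators worth monitoring. A minimal sketch, with all assets, threats, and indicators as hypothetical examples:

```python
# A threat-model worksheet expressed as data: asset -> threat -> indicators.
threat_model = {
    "customer-api": {
        "threat": "credential stuffing against the login endpoint",
        "risk_indicators": [
            "failed-login rate far above baseline",
            "logins from previously unseen networks",
        ],
    },
    "build-pipeline": {
        "threat": "malicious dependency injected at build time",
        "risk_indicators": [
            "new dependency added outside the review process",
            "build artifact hash mismatch",
        ],
    },
}

def indicators_for(asset: str) -> list:
    """Look up the risk indicators to monitor for a given asset."""
    return threat_model.get(asset, {}).get("risk_indicators", [])
```

Revisiting this structure every time your infrastructure changes keeps the exercise lightweight while ensuring new components get the same scrutiny.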
5. Use telemetry analytics to proactively predict risk
Different kinds of risk indicators warrant different responses, so that you get actionable information while avoiding alert fatigue. Behaviors can be broken down as follows:
- Risky: These behaviors are usually performed by internal team members; they can expose your business to risk or obscure the signs of a real attack.
- Suspicious: These behaviors suggest that a bad actor may be at work.
- Malicious: These behaviors are clear indicators that a bad actor is at work.
Using these categories, you can devise a system that lets your business strategically review these signals and use them to inform changes you’re making to policies and configurations. This could be as simple as blocking time each week to review alerts, or it could be something more sophisticated, like funneling them into dashboards on your SIEM or other analytics tool.
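The three categories map naturally to response channels. A minimal sketch, where the destination names are illustrative placeholders for whatever your team uses (a SIEM dashboard, a ticket queue, a paging service):

```python
from enum import Enum

class Severity(Enum):
    RISKY = "risky"            # batch for periodic review
    SUSPICIOUS = "suspicious"  # triage promptly
    MALICIOUS = "malicious"    # alert on-call immediately

# Hypothetical destinations; substitute your own tooling.
ROUTES = {
    Severity.RISKY: "weekly-review-queue",
    Severity.SUSPICIOUS: "triage-dashboard",
    Severity.MALICIOUS: "pager",
}

def route(severity: Severity) -> str:
    """Pick the response channel for a signal based on its category."""
    return ROUTES[severity]
```

Keeping the mapping explicit makes it easy to revisit as your policies and configurations evolve.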
6. Optimize workflows and systems for forensic investigation
Regardless of urgency, all alerts requiring action should be included in an automated workflow to ensure that your team is aware and can take action in a timely manner.
In addition to building response workflows, it’s important to consider forensic investigation when you’re designing systems. Investigating incidents is always going to be part of Security Operations — and it’s important to make sure it can be done efficiently to ensure that you’re stopping malicious attackers as early as possible. To do this, think through the data you will need to pull from your system and how you plan to store and consume it.
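One low-cost way to design for investigation is to emit structured, append-only audit records from day one. A minimal sketch, assuming JSON-lines storage; the field names are illustrative:

```python
import json
import time

def audit_record(actor: str, action: str, resource: str, outcome: str) -> str:
    """Serialize one audit event as a JSON line.

    Append-only JSON lines are cheap to store and easy to filter during an
    investigation: who touched what, when, and with what result.
    """
    event = {
        "ts": time.time(),
        "actor": actor,
        "action": action,
        "resource": resource,
        "outcome": outcome,
    }
    return json.dumps(event, sort_keys=True)
```

Deciding on fields like these up front means that, when an incident happens, investigators spend their time querying data rather than wishing it had been collected.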
In true DevOps form, you should also think through what processes can be automated away. This is the best way to cut out the noise, so whatever your team is looking at is actionable — whether it’s trends or a catastrophic alert.
Wrapping up . . .
These principles will help you design a Security Observability roadmap that fits your specific business requirements and objectives. But keep in mind that there is no one-time, one-size-fits-all process that will get you there. Security observability requires a dynamic, continuous mindset that will help you shift from a reactive to a proactive security posture by informing the way you design, monitor, and continually improve your cloud-native systems to reduce risk.
For an in-depth discussion of the six Cloud Security Observability principles, download a copy of our new whitepaper: Cloud Security Observability: A Guide to Reducing Your Cloud Native Infrastructure Risk.