Agentless vs. agent-based cloud architectures: Why does it matter?


In the world of security, monitoring and analytics solutions for IaaS cloud, there’s a lot of discussion and debate about agent-based vs. agentless service architectures. Your choice of agentless vs. agent-based cloud security can have a big impact on the efficiency of your day-to-day operations for security and compliance as well as your ability to protect your cloud environment in the future. It’s important to understand what the difference is, and be fully aware of the tradeoffs involved in this decision.

So what’s the difference?

Agent-based and agentless services basically differ from each other in how they collect information and provide control across the entities in your cloud environment (network/security groups, server instances, load balancers, database services, etc.). With an agent-based workload security product, you install a small software agent in each of your server instances. The agent is responsible for collecting relevant information from the server it is installed on, sending the information back to a central control system, and giving you the ability to control security at an instance/virtual machine level. For example, agent-based security products gather information about host firewall setup and network traffic flowing between servers, and provide file integrity monitoring (FIM) and the ability to configure the firewall on each host. 

Figure 1: Data aggregation and control with an agent-based approach to cloud security

Agentless services, on the other hand, talk directly to the underlying cloud platform (e.g., AWS, Azure) through the service provider’s API to get information about instances, services and the network, and to control security. Because the agentless service talks directly to the platform, no modifications are required in the resources that are part of your environment. As a result, agentless services, also called cloud-native or API-based services, are completely transparent to the applications and workloads.

Figure 2: Data aggregation and control with an agentless approach
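To make the agentless pattern concrete, here is a minimal sketch in Python. The `FakeCloudAPI` class, the `Instance` type and the instance and group IDs are all hypothetical stand-ins for a provider's control-plane API; they are not part of any real AWS or Azure SDK. The point is only that the inventory is built from API responses, with nothing installed on the instances themselves.

```python
from dataclasses import dataclass, field


@dataclass
class Instance:
    """Hypothetical record describing one server instance."""
    instance_id: str
    security_groups: list = field(default_factory=list)


class FakeCloudAPI:
    """Hypothetical stand-in for a provider's control-plane API."""

    def __init__(self, instances):
        self._instances = list(instances)

    def list_instances(self):
        # A real service would call the provider's API here.
        return list(self._instances)


def collect_inventory(api):
    """Build an inventory purely from API responses; nothing runs on the instances."""
    return {i.instance_id: sorted(i.security_groups) for i in api.list_instances()}


api = FakeCloudAPI([
    Instance("i-web-1", ["sg-web"]),
    Instance("i-db-1", ["sg-db", "sg-admin"]),
])
inventory = collect_inventory(api)
print(inventory)
```

An agent-based product would gather similar data, but only from hosts where its agent is installed and able to reach the central control system.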

Why it matters

The first workload security platforms for public cloud environments were agent-based. This was in part because the security mechanisms exposed by cloud service providers were still evolving and weren’t as feature-rich in the early years as they are today. Agent-based tools also gave businesses that migrated workloads to the public cloud a way to bring tools that were developed for datacenter environments (and which they were familiar with) into public cloud environments. This made the transition to the public cloud easier for some of them.

But agent-based solutions designed for more static and predictable datacenter environments are a poor fit for the dynamic needs of a public cloud environment. Let’s look at five reasons why.

1. Operational Overhead of Agent Management: In agent-based solutions, you are responsible for installing agents on every instance in your cloud environment, keeping each agent up to date, and troubleshooting any connectivity issues. This is of course something that has been commonplace in the world of enterprise IT. Customers of endpoint security solutions such as Sophos and antivirus products such as Intel McAfee are used to dealing with agents in Windows and Linux hosts.

However, in cloud environments where you have hundreds or thousands of instances or virtual machines across dozens of VPCs around the world with hundreds of accounts accessing them, and where instances are added to and deleted from your environment frequently, the complexity of managing even something as simple as agents becomes a significant burden. Agent management opens up another small window of vulnerability in a cloud environment. Who can guarantee that an agent is installed on a newly launched rogue instance? In environments with multiple business units and stakeholders with their own configuration management systems and base images, ensuring that an agent is installed in each instance is particularly hard.
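The coverage gap described above can be sketched as a simple set comparison. This is an illustration only: the function name and the instance IDs are made up, and a real check would pull both lists from live systems. Note that the list of what actually exists has to come from the cloud platform's API, since an instance without an agent cannot report itself.

```python
def find_unprotected(cloud_instance_ids, agent_checkins):
    """Return instances the cloud reports that no agent has checked in from."""
    return sorted(set(cloud_instance_ids) - set(agent_checkins))


# Hypothetical data: what the provider API lists vs. which agents reported in.
cloud_instance_ids = ["i-001", "i-002", "i-003", "i-rogue"]
agent_checkins = ["i-001", "i-002", "i-003"]

unprotected = find_unprotected(cloud_instance_ids, agent_checkins)
print(unprotected)  # ['i-rogue']
```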

2. No Place to Install Agents in Function-as-a-Service (FaaS) and Built-in services: Even as late as five years ago, AWS for most enterprises meant just three or four services, such as EC2, S3 and EBS. You could install agents in EC2 instances and fully protect your environment.

But as of 2016, the public cloud has exploded in terms of services offered, and now looks more like this:

Figure 4: Broad range of services on AWS today

Many businesses now actively use the built-in database services (DynamoDB, RDS), load balancers (ELB) and big data services (EMR, Elasticsearch). FaaS offerings such as AWS Lambda (aka serverless computing) are also becoming popular. These services either don’t let you install agents in them, or have nowhere to put an agent (where does the agent go in a Lambda function?). Agent-based solutions completely fail to monitor or protect these services. You may not be using these services actively today, but do you want your cloud security solution limiting what you can monitor and protect in the future?

3. No Awareness of Cloud-Native Services: Not only do agent-based security products fail to protect cloud-native services such as ELB and RDS, but they do not even allow these services to be modeled in instance security policies. For example, there is no way to specify in a security policy that an instance can get incoming connections only from an ELB or can send outbound traffic only to RDS. You have to resort to an overly permissive “open to all” approach because of these limitations.
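As a rough illustration of the difference, the sketch below models a rule whose source is either an address range or a reference to a cloud-side group (AWS security groups, for instance, can name another security group as a traffic source). The `Rule` class, the `allows` function, the group ID `sg-elb` and the IP addresses are all hypothetical and chosen only for this example.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network
from typing import Optional


@dataclass(frozen=True)
class Rule:
    port: int
    source_cidr: Optional[str] = None    # address-based source
    source_group: Optional[str] = None   # reference to a cloud-side group, e.g. the ELB's


def allows(rule, port, source_ip, source_groups):
    """Decide whether a connection matches the rule."""
    if port != rule.port:
        return False
    if rule.source_group is not None:
        return rule.source_group in source_groups
    if rule.source_cidr is not None:
        return ip_address(source_ip) in ip_network(rule.source_cidr)
    return False


open_to_all = Rule(port=443, source_cidr="0.0.0.0/0")
only_from_elb = Rule(port=443, source_group="sg-elb")  # hypothetical group ID

# A connection from the load balancer and one from an arbitrary internet host.
from_elb = ("10.0.1.15", {"sg-elb"})
from_internet = ("203.0.113.9", set())

print(allows(open_to_all, 443, *from_internet))    # True
print(allows(only_from_elb, 443, *from_internet))  # False
print(allows(only_from_elb, 443, *from_elb))       # True
```

A policy engine that cannot express the `source_group` form is left with the address-based form, which is how the "open to all" rule ends up being used.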

4. Cloud-native security with cloud-agnostic policy automation wins: Managing security and compliance in a cloud-agnostic way is important not only for hybrid cloud scenarios but also for multi-cloud deployments where you may have workloads running on Azure and AWS and don’t want to be tied exclusively to one particular platform. You can achieve cloud-agnostic security management without having to install and manage agents. Cloud security platforms allow you to manage the security posture of multiple public cloud environments by specifying policies and rules (the “what” of cloud security) in a cloud-agnostic way while using the native capabilities of each cloud to implement and enforce the security policies (the “how”). You get the best of both worlds, because you are able to specify security policies once across multiple clouds, and then use the powerful controls provided by each cloud to implement them.

Figure 5: Combining cloud-native implementation with cloud-agnostic policies
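The split between the "what" and the "how" might look something like the sketch below. The `Policy` class and both translator functions are hypothetical. The dictionaries only approximate the shape of an EC2 security group ingress permission and an Azure network security group rule, so treat the field names as assumptions to check against each provider's documentation rather than as a definitive mapping.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Policy:
    """Cloud-agnostic intent: the 'what'."""
    name: str
    port: int
    source_cidr: str


def to_aws_ingress(policy):
    """The 'how' on AWS: approximate shape of a security group ingress permission."""
    return {
        "IpProtocol": "tcp",
        "FromPort": policy.port,
        "ToPort": policy.port,
        "IpRanges": [{"CidrIp": policy.source_cidr}],
    }


def to_azure_nsg_rule(policy, priority=100):
    """The 'how' on Azure: approximate shape of a network security group rule."""
    return {
        "name": policy.name,
        "properties": {
            "protocol": "Tcp",
            "direction": "Inbound",
            "access": "Allow",
            "priority": priority,
            "sourceAddressPrefix": policy.source_cidr,
            "sourcePortRange": "*",
            "destinationAddressPrefix": "*",
            "destinationPortRange": str(policy.port),
        },
    }


# One policy, written once, rendered for two clouds.
policy = Policy(name="allow-https-from-office", port=443, source_cidr="198.51.100.0/24")
aws_rule = to_aws_ingress(policy)
azure_rule = to_azure_nsg_rule(policy)
print(aws_rule)
print(azure_rule)
```

Adding a further cloud in this design would mean writing another translator function, while the policy definitions themselves stay unchanged.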

5. Unnecessary tax on your AWS environment with agent-based approach: With agent-based solutions, not only do you have agents running in each instance and taking a bite out of CPU utilization, you also have these agents talking to a service controller and consuming bandwidth in your cloud environment. The overhead is, of course, very small, but it is still a tax that you are paying on your cloud bill. In an agentless solution, the security platform talks directly to AWS through the cloud’s control plane API, without impacting performance or resource utilization in your cloud environment.