AWS vs Azure: AWS Security Groups and Microsoft Azure Network Security Groups
One of the major challenges in adopting cloud is getting used to doing things differently.
I work with a lot of IT and security engineers who have been tasked with leading their company into the cloud promised land, and one of the most common mistakes they make is applying old paradigms to new technology.
While cloud is not by any means “new”, there are still a lot of questions around how things work and which tools should be used for which jobs. In this series, we’ll focus on how Security Groups (or firewall rules) work across the major public cloud platforms, and the most prominent private cloud platform - henceforth referred to as the Big Four.
AWS vs Azure: In this post, we’ll focus on AWS Security Groups and Azure Network Security Groups.
Cloud Security Doesn’t Work Like Traditional Firewalls (mostly)
In a traditional network, you usually have firewalls that filter traffic as it moves from one network to another. At the most basic level, when a packet enters the firewall it gets inspected and compared to a set of rules. Those rules decide whether the packet lives or dies.
With most cloud platforms, the enforcement point is a bit different. Instead of having a dedicated network entity that enforces rules on incoming/outgoing traffic, each individual server is associated with a security policy. More specifically, the (virtual) network interface card on each server has firewall rules applied to it.
A generic 3-tier web app secured on a cloud
Keep in mind that the diagram above is just an example, and security policies behave a bit differently on the various cloud vendors. The general principle of segmentation at the server level, though, is similar on most clouds.
Note that your cloud security solution doesn’t have to end with security groups. There’s a breadth of solutions out there that introduce different layers of security controls to cloud deployments, such as Trend Micro’s Deep Security or Palo Alto Networks’ VM-Series Next Generation Firewall. Security groups, however, in their different variations, are the built-in security control for most clouds and provide the baseline for server security.
AWS Security Groups
In AWS, Security Groups are sets of permissive (‘Allow’ only) inbound and outbound rules that are associated with instances. Whenever an instance is created within a VPC, it has to be associated with a Security Group. By default all VPC instances are associated with the “default” Security Group, which exists in each VPC.
A few things about AWS Security Groups:
- They are not an EC2 service. They fall under the VPC service and can secure other entities such as RDS or ELB.
- By default, some limitations apply to Security Groups but extensions can be requested:
> Up to 500 AWS security groups per VPC
> Up to 50 rules per Security Group
> Up to 5 Security Groups can be applied to a network interface
- When a Security Group is associated with an instance, it’s associated with the primary network interface. Additional interfaces can be added as ENIs (Elastic Network Interfaces).
- The ‘default’ Security Group is automatically associated with all instances unless specified otherwise. The initial settings for the ‘default’ Security Group are:
> Allow inbound traffic only from other instances associated with the same ‘default’ Security Group
> Allow all outbound traffic
- Each Security Group has two sets of rules, inbound and outbound. Inbound rules dictate how traffic enters the instance, and outbound rules inspect traffic leaving the instance.
- Security Group rules are stateful - for example, if we configure an inbound rule that says “Allow HTTP from 22.214.171.124”, we don’t need to create an outbound rule that allows HTTP responses from our instance.
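To make “stateful” a bit more concrete, here is a minimal sketch of the idea in Python. It is a toy model, not how AWS implements connection tracking: the `Rule` and `StatefulGroup` names are made up for illustration, rules match a single port and a single peer address, and 203.0.113.10 and 198.51.100.7 are documentation placeholder addresses.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Rule:
    protocol: str  # "tcp", "udp", ...
    port: int      # a single port, to keep the sketch small
    peer: str      # the one remote IP address this rule allows


@dataclass
class StatefulGroup:
    inbound: list
    outbound: list
    # Connections admitted so far: (remote_ip, remote_port, local_port, protocol)
    tracked: set = field(default_factory=set)

    def allow_inbound(self, remote_ip, remote_port, local_port, protocol):
        """Admit a new inbound connection if any inbound rule matches it."""
        for rule in self.inbound:
            if (rule.protocol, rule.port, rule.peer) == (protocol, local_port, remote_ip):
                # Remember the connection so its replies can leave later.
                self.tracked.add((remote_ip, remote_port, local_port, protocol))
                return True
        return False

    def allow_outbound(self, remote_ip, remote_port, local_port, protocol):
        """Let replies to tracked connections out; otherwise check outbound rules."""
        if (remote_ip, remote_port, local_port, protocol) in self.tracked:
            return True
        return any(
            (rule.protocol, rule.port, rule.peer) == (protocol, remote_port, remote_ip)
            for rule in self.outbound
        )


# One inbound HTTP rule and no outbound rules at all.
web = StatefulGroup(inbound=[Rule("tcp", 80, "203.0.113.10")], outbound=[])
request_ok = web.allow_inbound("203.0.113.10", 51000, 80, "tcp")
reply_ok = web.allow_outbound("203.0.113.10", 51000, 80, "tcp")
new_connection_ok = web.allow_outbound("198.51.100.7", 443, 40000, "tcp")
```

The reply to the admitted request is allowed even though the outbound list is empty, while a brand new outbound connection is not.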
We’ve already mentioned that Security Groups only contain “Allow” rules, but that fact has another interesting consequence - rule order doesn’t matter. As an ex-networking/firewall guy myself, order of operation used to be a big deal for me. When configuring security policies on a Cisco ASA or Juniper SRX, you have to make sure you don’t have rules that negate each other.
Oftentimes, the best practice would be to create a whitelist: a list of permissive rules that allow necessary traffic, with an implicit or explicit “deny all” rule taking care of the rest. This was usually the reason you would ultimately wind up with massive and unwieldy policies containing thousands of ultra-specific “126.96.36.199 to 188.8.131.52” rules.
I’m mentioning this because if you, like me, at some point in your career had to trade in physical firewalls and routers for Security Groups and Virtual Gateways, you’ve noticed that the change is not just in implementation, but in planning as well.
Each Security Group rule has 4 fields: Type, Protocol, Port Range, and Source (for inbound rules) or Destination (for outbound rules).
Type, protocol and port range are pretty straightforward. Source/Destination can specify an IP address, a CIDR range or another Security Group. Specifying a Security Group as a source or destination means every instance associated with that Security Group, which makes for more legible network topologies.
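A rough sketch of how such rules could be evaluated, assuming a simplified model: `InboundRule`, `is_allowed` and the `membership` map are hypothetical names for this example, the group IDs and addresses are made up, and group membership is reduced to a set of private IPs. Because every rule is an “Allow”, the result is the same whatever order the rules are in.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class InboundRule:
    rule_type: str     # e.g. "SSH" or "Custom TCP" (a label only in this sketch)
    protocol: str      # "tcp", "udp", "icmp"
    port_range: tuple  # (first_port, last_port), inclusive
    source: str        # a CIDR block, or a group ID such as "sg-web"


def is_allowed(rules, membership, src_ip, protocol, port):
    """Return True if any rule allows the packet.

    membership maps a group ID to the private IPs of the instances in it.
    """
    for rule in rules:
        first, last = rule.port_range
        if rule.protocol != protocol or not first <= port <= last:
            continue
        if rule.source.startswith("sg-"):
            # Group reference: matches every instance associated with that group.
            if src_ip in membership.get(rule.source, set()):
                return True
        elif ipaddress.ip_address(src_ip) in ipaddress.ip_network(rule.source):
            return True
    return False  # nothing matched, so the traffic is implicitly denied


membership = {"sg-web": {"10.0.1.10", "10.0.1.11"}}
db_rules = [
    InboundRule("Custom TCP", "tcp", (3306, 3306), "sg-web"),
    InboundRule("SSH", "tcp", (22, 22), "10.0.9.0/24"),
]
web_to_db = is_allowed(db_rules, membership, "10.0.1.10", "tcp", 3306)
stranger_to_db = is_allowed(db_rules, membership, "10.0.2.50", "tcp", 3306)
```

Note that the database rule never mentions a web server’s IP address. Adding a web server to the (hypothetical) `sg-web` group is enough to grant it access.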
EC2 Classic Security Groups
If your AWS account is old enough, it supports EC2 Classic. EC2 Classic treats compute as one big pool of resources in each region, as opposed to VPC which creates isolated cloud deployments. Generally, EC2 Classic Security Groups behave like their VPC counterparts, with a few exceptions:
You can only configure an instance’s Security Groups on instance creation: once an instance is up and running you can only add and remove rules from its associated groups, but not add or remove any Security Groups from it (Note: if you modify rules on a Security Group, it will affect all instances associated with it).
EC2 Classic Security Groups are tied to regions. For an instance to be associated with a Security Group they must be in the same region.
In EC2-Classic, you can associate an instance with up to 500 security groups and add up to 100 rules to a security group (Note: if you need 500 security groups for an instance, you’re doing it wrong).
Security Group Best Practices
Most best practices around AWS Security Groups, or Security Groups in general, have to do with limiting sprawl. The point is to get as far away as you can from these extremely bad practices:
Every new instance uses a new Security Group
Every newly required access gets a new Security Group
All instances use the same Security Group
The main point will be creating relevant Security Groups in advance and assigning them correctly upon instance creation to eliminate the need for ad-hoc security.
Using the ‘default’ Security Group for all instances is essentially using the old “hard on the outside, soft on the inside” security model. Using the ‘default’ Security Group also opens your infrastructure to all kinds of risks: it’s very easy to just add another rule when you need a new type of access on an instance, but when you add that rule to the ‘default’ Security Group it makes all its associated machines vulnerable.
Here’s an example of an AWS security paradigm for you to consider.
Broad-to-narrow Security Groups
Create a few “general purpose” Security Groups as the baseline of your VPC’s security. For example, separate Security Groups for Windows and Linux instances, that allow RDP and SSH respectively, along with required ports for management tools. These groups will replace the ‘default’ Security Group. Since these groups are going to be applied to most instances in the VPC, regardless of their function, consider whether you want all members of these groups to be able to talk to each other.
Once the security baseline is in place, create role-based Security Groups for web servers, databases, ELBs, test environments or anything else relevant to your use-case.
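As an illustration of the broad-to-narrow approach, the sketch below composes an instance’s effective inbound ports from one baseline group plus its role groups. The group names and port choices are assumptions made up for this example, and the limit of 5 groups per network interface is the default mentioned earlier.

```python
# Baseline groups replace the 'default' group (names are hypothetical).
BASELINE = {
    "linux-base": {("tcp", 22)},      # SSH
    "windows-base": {("tcp", 3389)},  # RDP
}

# Role-based groups layered on top of the baseline.
ROLES = {
    "web": {("tcp", 80), ("tcp", 443)},
    "db": {("tcp", 3306)},
}

MAX_GROUPS_PER_INTERFACE = 5  # default limit, can be raised on request


def effective_ports(group_names):
    """Union of the allowed (protocol, port) pairs of all attached groups."""
    if len(group_names) > MAX_GROUPS_PER_INTERFACE:
        raise ValueError("too many Security Groups for one network interface")
    catalog = {**BASELINE, **ROLES}
    allowed = set()
    for name in group_names:
        allowed |= catalog[name]
    return allowed


linux_web_server = effective_ports(["linux-base", "web"])
```

Since all rules are permissive, attaching more groups can only widen access, which is one more reason to keep the baseline groups small.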
Azure Network Security Groups
Many of the same principles that apply to AWS can also apply to Azure, but Azure Network Security Groups (NSG) have a few important differences:
NSGs can be applied to individual VMs, subnets, or both
NSGs have both ‘Deny’ and ‘Allow’ rules - This means that rule order (or priority) matters!
Like EC2 Classic Security Groups, Azure NSGs can only be applied to resources in the same region they were created in
Azure has a security feature called Endpoint ACLs, but you can’t have both an NSG and an endpoint ACL applied to the same VM
All NSGs include a set of default rules that cannot be changed or deleted, but can be overridden
Like AWS Security Groups, Azure NSGs have two sets of rules, inbound and outbound.
Each rule has the following properties:
Priority - A best practice will be to use large increments (100,200) so you won’t have to edit the priorities of existing rules when adding new ones
Source - Any/CIDR block/Tag (Tags are explained below)
Protocol - TCP/UDP/Any
Source Port - Range/Single Port/Any
Destination - Any/CIDR block/Tag (Tags are explained below)
Destination Port - Range/Single Port/Any
Action - Allow/Deny
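Here is a small sketch of why priority matters once “Deny” rules are in the picture. It is a simplified model rather than Azure’s implementation: `NsgRule` and `evaluate` are hypothetical names, tags, source ports and destination addresses are left out, and the final fallback stands in for the default “deny all” rule. Rules are checked from the lowest priority number upward and the first match wins.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class NsgRule:
    priority: int          # lower number = evaluated first
    source: str            # "*" for Any, or a CIDR block
    protocol: str          # "tcp", "udp", or "*" for Any
    destination_port: str  # "*", a single port ("443") or a range ("8000-8080")
    action: str            # "Allow" or "Deny"


def _port_matches(spec, port):
    if spec == "*":
        return True
    first, _, last = spec.partition("-")
    return int(first) <= port <= int(last or first)


def evaluate(rules, src_ip, protocol, dst_port):
    """Return the action of the first matching rule, in priority order."""
    for rule in sorted(rules, key=lambda r: r.priority):
        source_ok = rule.source == "*" or (
            ipaddress.ip_address(src_ip) in ipaddress.ip_network(rule.source)
        )
        protocol_ok = rule.protocol in ("*", protocol)
        if source_ok and protocol_ok and _port_matches(rule.destination_port, dst_port):
            return rule.action
    return "Deny"  # assumption: stands in for the default deny-all rule


rules = [
    NsgRule(200, "10.0.0.0/16", "tcp", "22", "Allow"),
    NsgRule(100, "10.0.5.0/24", "tcp", "22", "Deny"),
]
from_blocked_subnet = evaluate(rules, "10.0.5.7", "tcp", 22)
from_rest_of_vnet = evaluate(rules, "10.0.1.7", "tcp", 22)
```

Both rules match traffic from 10.0.5.7, but the priority 100 “Deny” is evaluated first. Swap the two priorities and the same traffic would be allowed.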
The default rules in each NSG are:
Note the default Azure tags, VirtualNetwork, AzureLoadBalancer and Internet.
From the Azure documentation:
VIRTUAL_NETWORK: This default tag denotes all of your network address space. It includes the virtual network address space (CIDR ranges defined in Azure) as well as all connected on-premises address spaces and connected Azure VNets (local networks).
AZURE_LOADBALANCER: This default tag denotes Azure’s Infrastructure load balancer. This will translate to an Azure datacenter IP from which Azure’s health probes originate.
INTERNET: This default tag denotes the IP address space that is outside the virtual network and reachable by public Internet. This range includes Azure owned public IP space as well.
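To show how the default rules can be overridden without being changed, here is a sketch that mixes custom rules with the defaults. The rule names and priorities below (65000, 65001, 65500) are what I recall from the Azure documentation for Resource Manager NSGs, so treat them as assumptions and verify them against the current docs. `decide` is a hypothetical helper that only matches on the source tag.

```python
# (priority, name, source, destination, action)
# Assumed default rule set; check the Azure documentation before relying on it.
DEFAULT_INBOUND = [
    (65000, "AllowVnetInBound", "VirtualNetwork", "VirtualNetwork", "Allow"),
    (65001, "AllowAzureLoadBalancerInBound", "AzureLoadBalancer", "*", "Allow"),
    (65500, "DenyAllInBound", "*", "*", "Deny"),
]
DEFAULT_OUTBOUND = [
    (65000, "AllowVnetOutBound", "VirtualNetwork", "VirtualNetwork", "Allow"),
    (65001, "AllowInternetOutBound", "*", "Internet", "Allow"),
    (65500, "DenyAllOutBound", "*", "*", "Deny"),
]


def decide(custom_rules, source_tag):
    """Return (rule name, action) of the first inbound rule matching the source tag."""
    rules = sorted(custom_rules + DEFAULT_INBOUND, key=lambda rule: rule[0])
    for _priority, name, source, _destination, action in rules:
        if source in ("*", source_tag):
            return name, action
    return "no-match", "Deny"  # unreachable while DenyAllInBound is present


# Custom rules get lower priority numbers, so they are evaluated before the defaults.
custom = [(4000, "DenyVnetInBound", "VirtualNetwork", "*", "Deny")]
default_vnet = decide([], "VirtualNetwork")
overridden_vnet = decide(custom, "VirtualNetwork")
default_internet = decide([], "Internet")
```

The default rule is still there after the override. It simply never gets a chance to match, because the custom rule with the lower priority number matches first.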
Unlike AWS Security Groups, Azure NSGs have a hierarchy between them. NSGs can be applied to VMs, subnets, or both. An NSG that’s been associated with a subnet will apply to all VMs in that subnet.
In the case where NSGs are applied both to the subnet and to individual VMs within that subnet, inbound traffic will hit the subnet NSG first, and then the VM NSG. It’s important to remember that traffic must be allowed by both the subnet NSG and the VM NSG in order to pass through.
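The “allowed by both” behavior can be sketched in a few lines. This is a deliberately reduced model: rules are `(priority, port, action)` tuples, `first_match` and `inbound_allowed` are hypothetical helpers, and anything unmatched is treated as denied.

```python
def first_match(rules, port):
    """rules: list of (priority, port, action); a port of None means any port."""
    for _priority, rule_port, action in sorted(rules, key=lambda rule: rule[0]):
        if rule_port is None or rule_port == port:
            return action
    return "Deny"  # assumption: unmatched traffic is denied


def inbound_allowed(subnet_rules, vm_rules, port):
    """Inbound traffic hits the subnet NSG first, then the VM NSG."""
    if first_match(subnet_rules, port) != "Allow":
        return False  # dropped at the subnet, the VM NSG is never consulted
    return first_match(vm_rules, port) == "Allow"


subnet_nsg = [(100, 443, "Allow"), (200, 22, "Allow")]
vm_nsg = [(100, 443, "Allow"), (200, 80, "Allow")]

https_ok = inbound_allowed(subnet_nsg, vm_nsg, 443)  # allowed by both
ssh_ok = inbound_allowed(subnet_nsg, vm_nsg, 22)     # subnet allows, VM does not
http_ok = inbound_allowed(subnet_nsg, vm_nsg, 80)    # VM allows, subnet does not
```

Only port 443 gets through, because it is the only port that both NSGs allow.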
But it’s Actually a Bit More Complicated Than That
Microsoft Azure has two deployment models, Classic and Resource Manager. Simply put, old and new. The two deployment models are different approaches for using the Azure cloud platform, and they handle resource provisioning differently. I highly recommend reading more about the differences between Resource Manager and Classic.
In Classic Deployments - NSGs are applied to VMs. This means that the NSG rules will apply to all traffic coming to and going from the VM.
In Resource Manager Deployments - NSGs are applied to NICs. This means that the NSG rules will only apply to the relevant NIC. In a multi-NIC machine, the NSG will not process traffic from other NICs unless configured on them.
In both deployments - NSGs can be applied to subnets. This means that the NSG rules will be applied to all NICs that belong to that subnet.
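The difference between the two models is easier to see with a multi-NIC machine. The sketch below only lists which NSGs would process traffic on a given NIC, following the three statements above. `Nic`, `Vm` and `applicable_nsgs` are hypothetical names, and the model ignores everything else about the two deployment types.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Nic:
    name: str
    subnet_nsg: Optional[str] = None  # NSG applied to the NIC's subnet
    nic_nsg: Optional[str] = None     # NSG applied to the NIC (Resource Manager)


@dataclass
class Vm:
    nics: list
    vm_nsg: Optional[str] = None      # NSG applied to the whole VM (Classic)


def applicable_nsgs(vm, nic_name, model):
    """NSGs that process inbound traffic on one NIC, subnet NSG first."""
    nic = next(n for n in vm.nics if n.name == nic_name)
    direct = vm.vm_nsg if model == "classic" else nic.nic_nsg
    return [nsg for nsg in (nic.subnet_nsg, direct) if nsg is not None]


vm = Vm(
    nics=[
        Nic("nic0", subnet_nsg="frontend-subnet-nsg", nic_nsg="web-nic-nsg"),
        Nic("nic1", subnet_nsg="backend-subnet-nsg"),
    ],
    vm_nsg="classic-vm-nsg",
)

rm_nic0 = applicable_nsgs(vm, "nic0", "resource-manager")
rm_nic1 = applicable_nsgs(vm, "nic1", "resource-manager")
classic_nic1 = applicable_nsgs(vm, "nic1", "classic")
```

In the Resource Manager case the second NIC is covered only by its subnet NSG, which is easy to miss when a VM gains an extra interface.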
I know, this is all a bit of a mess. Microsoft’s recommendation is to use the Resource Manager for new resources, and re-deploy any existing Classic resources you might have to the Resource Manager as well. Here are a few important and useful references:
- What is a Network Security Group? Provides great examples for using Azure NSGs, and the important differences between Classic and Resource Manager deployments.
- Create a VM with multiple NICs: Understand how multi-NIC VMs work in Azure.
- Understanding Resource Manager and Classic deployments: The same article linked above, understand the key differences in deployments and how they might affect your use-case.
Network Security Group Best Practices
Most of the best practices we discussed above will apply here as well. If you’re using multi-cloud infrastructure, it’s best to try to enforce security in a similar fashion across the different cloud platforms.
Most of the Azure-specific best practices will have to do with making your life easier when it comes to managing NSGs.
Use the Resource Manager - If possible, use a Resource Manager deployment over a Classic one. That’s the direction Azure is heading and more granular controls are being added. Furthermore, the new Azure portal makes it easier to create and modify NSGs. If you use a cloud management platform, make sure it uses these new APIs.
Use whitelists - Since you can add both “Allow” and “Deny” rules, you have the possible pitfall of contradicting rules. To make your life easier, make the Azure NSGs behave like Security Groups on other clouds. Create a list of “Allow” rules and let the “Deny All” rule catch the rest.
Don’t use blacklists - Alternatively, you can go the other way around with “Deny” rules and an “Allow All” cleanup rule - but that might result in an overly permissive policy and a large number of “Deny” rules. Further, some compliance regulations don’t permit “Allow All” rules when securing sensitive information. Since this is such a bad practice, we can only speculate that Microsoft added this for backwards-compatibility.
There are a few general takeaways from all of this:
- The cloud presents a new security model, and traditional firewall concepts will not always apply.
- When planning cloud security, consider the possibility of your cloud strategy expanding and introducing more platforms into the mix. Managing your policies in a unified manner, and having your AWS and Azure policies behave the same, will make management easier.
If you have any insights from your experience with cloud security, we’d love to hear them! The next post in the series will deal with security policies on OpenStack and Google Compute Engine.