Episode 106 — NACL vs NSG: stateless/stateful thinking and inbound/outbound logic
In Episode One Hundred Six, titled “NACL vs NSG: stateless/stateful thinking and inbound/outbound logic,” we treat these controls as different layers of traffic filtering, because the exam often tests whether you can reason about where a policy should live and how statefulness changes what you must explicitly allow. Network access control lists, or NACLs, and network security groups, or NSGs, are both ways to filter traffic, but they operate at different scopes and with different assumptions. If you confuse the layers, you end up with policies that look correct on paper but fail in practice, especially when return traffic is blocked or when duplicate rules drift out of sync. The simplest exam-ready mindset is to think of NACLs as subnet guardrails and NSGs as resource-level policy, and then to remember that stateless filtering requires you to think about both directions explicitly. When you can narrate traffic in and traffic out with clarity, you can choose the correct control point and avoid the common failure modes.
Network access control lists are stateless rules applied at subnet boundaries, meaning they evaluate packets without remembering whether a connection was previously established. Stateless filtering is powerful as a coarse boundary control because it can restrict what enters or leaves a subnet regardless of what individual instances or interfaces do, but it also requires more careful rule design because every permitted flow must be allowed in both directions explicitly. At a subnet boundary, the intent is often to enforce broad guardrails, such as denying clearly disallowed ports, preventing certain inbound exposures, or limiting certain outbound destinations, while keeping the policy small enough to remain understandable. Because NACLs operate on packets, not on session state, they do not automatically allow return traffic, so the return path must be permitted explicitly or connections will fail in confusing ways. The exam tends to emphasize this stateless characteristic because it drives the most common misconfiguration, where teams allow an inbound request but forget the outbound response rule or forget the ephemeral port behavior of client connections. When you treat NACLs as coarse, stateless boundaries, you naturally keep them simple and you design them with explicit directionality.
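To make the stateless idea concrete, here is a minimal Python sketch of a packet filter that judges every packet on its own. It is a teaching model, not any vendor's API: the Packet and Rule classes, the port numbers, and the implicit deny at the end are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    direction: str  # "inbound" or "outbound", relative to the subnet
    src_port: int
    dst_port: int


@dataclass(frozen=True)
class Rule:
    direction: str
    port_lo: int  # destination port range, inclusive
    port_hi: int
    action: str  # "allow" or "deny"


def stateless_allows(rules, packet):
    """Evaluate one packet in isolation; nothing is remembered between calls."""
    for rule in rules:
        same_direction = rule.direction == packet.direction
        port_matches = rule.port_lo <= packet.dst_port <= rule.port_hi
        if same_direction and port_matches:
            return rule.action == "allow"
    return False  # assumed implicit deny when no rule matches


# A client with ephemeral port 50123 (hypothetical) calls a server on 443.
request = Packet("inbound", src_port=50123, dst_port=443)
response = Packet("outbound", src_port=443, dst_port=50123)

inbound_only = [Rule("inbound", 443, 443, "allow")]
both_directions = inbound_only + [Rule("outbound", 1024, 65535, "allow")]


def conversation_works(rules):
    """A conversation needs the request AND the response to pass."""
    return stateless_allows(rules, request) and stateless_allows(rules, response)


print(conversation_works(inbound_only))     # response is dropped
print(conversation_works(both_directions))  # both directions pass
```

With only the inbound rule, the request gets in but the response never leaves the subnet, which is the classic symptom described above.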
Network security groups are stateful rules applied to interfaces or resources, meaning they track connection state and automatically allow return traffic for established sessions when a flow is permitted. Stateful filtering aligns well to targeted control because it can be attached to specific resources, such as virtual machine interfaces, service endpoints, or other resource-level constructs, and it can enforce least privilege access for those resources without requiring subnet-wide broad rules. Because NSGs are stateful, you generally allow the initiating direction and rely on state tracking to permit the response, which reduces the complexity of writing symmetrical rules. This stateful behavior also makes NSGs more forgiving in many common application flows, because you do not have to remember to open ephemeral return ports explicitly for each allowed session. The exam often expects you to recognize that NSGs are where you express precise intent like “only this application tier can reach this database on this port,” while the subnet remains protected by broader guardrails. When you understand that NSGs are interface-level and stateful, you can explain why they are the preferred layer for fine-grained segmentation inside a cloud network.
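A stateful filter can be sketched the same way. The StatefulFilter class below is a hypothetical, heavily simplified model: it only covers inbound-initiated flows, and it tracks them by client address, client port, and server port. Real platforms track more detail and also expire idle entries.

```python
class StatefulFilter:
    """Toy model of a stateful, resource-level filter (illustrative only)."""

    def __init__(self, allowed_inbound_ports):
        self.allowed_inbound_ports = set(allowed_inbound_ports)
        # Tracked flows: (client_ip, client_port, server_port)
        self.connections = set()

    def inbound(self, client_ip, client_port, server_port):
        """Admit a request if the server port is allowed, and remember the flow."""
        if server_port in self.allowed_inbound_ports:
            self.connections.add((client_ip, client_port, server_port))
            return True
        return False

    def outbound_response(self, client_ip, client_port, server_port):
        """Permit a response only when it belongs to a tracked flow."""
        return (client_ip, client_port, server_port) in self.connections


# Hypothetical database interface that accepts connections on 5432.
nsg = StatefulFilter({5432})

print(nsg.inbound("10.0.1.5", 50123, 5432))            # request admitted
print(nsg.outbound_response("10.0.1.5", 50123, 5432))  # reply allowed by state
print(nsg.outbound_response("10.0.1.5", 50999, 5432))  # no matching flow
```

Notice that no rule mentions the ephemeral port 50123. The reply is allowed because the filter remembers the request, which is the behavior that makes stateful rules shorter to write.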
Using NACLs for coarse guardrails and NSGs for targeted control is a practical rule of thumb because it minimizes policy sprawl and makes troubleshooting easier. Coarse guardrails at the subnet level can block obviously unwanted exposure, such as denying inbound access to sensitive subnets from untrusted sources, while leaving detailed service-to-service allowances to the resource-level NSGs. Targeted control at the NSG level supports least privilege because you can scope rules to the exact sources, destinations, and ports required for a workload without affecting unrelated systems that share the subnet. This division of responsibility also reduces the risk of accidental broad permissions, because you do not have to write overly permissive subnet rules just to satisfy one workload’s need. The exam tends to reward this layered approach because it reflects how cloud security is typically engineered: broad boundaries first, then precise control at the resource. When you adopt this model, policy becomes easier to audit because each layer has a clear purpose.
Rule evaluation order matters because both types of controls evaluate rules using defined precedence, and misunderstanding precedence creates outcomes that look random to operators. Many implementations evaluate rules from lowest number to highest number or from highest priority to lowest priority, and the first match can determine the outcome, which means ordering can shadow later rules. In stateless NACLs, the need for return traffic allowances is also tied to evaluation order, because an intended allow rule can be overridden by an earlier deny rule that matches the same traffic. In stateful NSGs, evaluation order still matters for which rule grants access, but the stateful nature reduces the need for explicitly symmetric return rules, simplifying the rule set. The exam expects you to recognize that precedence and ordering are operational realities, not minor details, because a misordered deny can break a service and a misordered allow can quietly weaken segmentation. Understanding order also supports troubleshooting because you can explain which rule is actually matching when traffic is unexpectedly blocked or allowed. When you design with order in mind, you reduce both security surprises and availability incidents.
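The shadowing effect is easy to see in a small sketch. This one assumes rules are evaluated from the lowest rule number upward and that the first match wins, with an implicit deny when nothing matches. Platforms differ on numbering and defaults, so treat the rule numbers and tuple layout here as illustrative.

```python
def first_match(rules, port):
    """Return (rule_number, action) for the first rule that matches the port.

    Each rule is a tuple: (number, port_lo, port_hi, action).
    Rules are checked in ascending rule-number order (an assumption).
    """
    for number, port_lo, port_hi, action in sorted(rules):
        if port_lo <= port <= port_hi:
            return number, action
    return None, "deny"  # assumed implicit deny


# The broad deny has the lower number, so the later allow never fires.
rules_shadowed = [(100, 0, 65535, "deny"), (200, 443, 443, "allow")]

# Giving the specific allow a lower number fixes the ordering.
rules_ordered = [(90, 443, 443, "allow"), (100, 0, 65535, "deny")]

print(first_match(rules_shadowed, 443))
print(first_match(rules_ordered, 443))
print(first_match(rules_ordered, 22))
```

Returning the matching rule number alongside the action mirrors what you want during troubleshooting: not just whether traffic was blocked, but which rule did it.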
Inbound and outbound directionality is where most confusion happens, especially with stateless filtering, because people tend to think in terms of “open port” rather than “allow a conversation.” Inbound rules govern what enters a subnet or interface, and outbound rules govern what leaves, but a single client-server interaction involves both directions, including response traffic that may use ephemeral ports on the client side. Common misconfigurations include allowing inbound server ports but blocking outbound responses, or allowing outbound initiation but blocking inbound return traffic, creating failures that appear intermittent because retries and timeouts behave inconsistently. Another misconfiguration is assuming that “outbound allow all” is harmless, which can enable exfiltration and command-and-control channels, particularly from sensitive segments. The exam tends to focus on directionality because it is fundamental to understanding stateless and stateful behavior, and because many scenario questions depend on knowing which side is initiating and which side is responding. When you narrate a flow carefully, including who initiates and where the return goes, directionality becomes manageable rather than confusing.
A scenario that clarifies layered behavior is blocking a port at the subnet level while allowing more specific internal traffic with an NSG, which shows how coarse guardrails can override targeted allowances. Imagine a subnet that hosts sensitive databases, and the organization wants to prevent any direct inbound access on a particular port from untrusted networks, so a NACL denies inbound traffic on that port at the subnet boundary. Within the environment, an NSG attached to the database interface allows the application tier to connect on that port, but that allowance only matters if the traffic can reach the interface in the first place. If the NACL deny applies broadly, it will block the traffic before it ever reaches the NSG, which means the targeted allow never takes effect, and the application fails. The correct design would ensure the NACL guardrail is scoped appropriately, such as denying untrusted sources while allowing trusted internal sources, or keeping the NACL simpler and relying on NSGs for the precise allow. The exam often uses scenarios like this to test whether you understand that multiple layers can apply and that the more upstream boundary can override downstream permissions.
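The database scenario can be sketched as two checks applied in sequence, with the subnet check first. Everything specific here is an assumption for illustration: the address ranges, port 5432, and the function names. Only the inbound direction is modeled.

```python
from ipaddress import ip_address, ip_network

APP_TIER = ip_network("10.0.1.0/24")  # hypothetical application subnet
INTERNAL = ip_network("10.0.0.0/8")   # hypothetical trusted internal range
DB_PORT = 5432                        # hypothetical database port


def nacl_broad(src, port):
    """Subnet guardrail that denies the database port for every source."""
    return port != DB_PORT


def nacl_scoped(src, port):
    """Subnet guardrail that denies the database port only for untrusted sources."""
    if port == DB_PORT:
        return ip_address(src) in INTERNAL
    return True


def nsg_db(src, port):
    """Resource-level rule: only the application tier may reach the database port."""
    return port == DB_PORT and ip_address(src) in APP_TIER


def reaches_database(nacl, src, port):
    """Traffic must pass the subnet boundary before the NSG is ever consulted."""
    return nacl(src, port) and nsg_db(src, port)


print(reaches_database(nacl_broad, "10.0.1.10", DB_PORT))     # blocked upstream
print(reaches_database(nacl_scoped, "10.0.1.10", DB_PORT))    # app tier gets through
print(reaches_database(nacl_scoped, "203.0.113.7", DB_PORT))  # untrusted source blocked
```

The NSG rule is identical in every call. Only the scope of the upstream guardrail changes, and that alone decides whether the application tier can connect.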
Forgetting ephemeral return ports is a classic pitfall with stateless filtering because client-side ports are often dynamically selected and therefore not obvious when you read a rule set. When a client initiates a connection to a server port, the client uses an ephemeral source port, and the server’s response returns to that ephemeral port, meaning the return traffic must be allowed by stateless rules. If you only allow inbound server ports and do not allow outbound response traffic, or if you block inbound responses to ephemeral ports, the connection fails even though the server port appears open. This can create confusing symptoms, such as partial connectivity, slow timeouts, or failures that appear only under certain traffic patterns, because different clients and retries use different ephemeral ports. The exam expects you to recognize this because it is one of the most common reasons stateless policies break applications unexpectedly. Remembering ephemeral ports is really remembering the return path, and stateless filtering forces you to express that return path explicitly.
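The "works for some clients, fails for others" symptom can be reproduced with a short sketch. It assumes a stateless outbound rule that permits responses only to a range of destination ports. The two ranges and the sample client ports are hypothetical, and real ephemeral ranges depend on the client operating system and the platform's guidance.

```python
def outbound_allowed(dst_port, allowed_range):
    """Stateless check: is the response's destination port inside the allowed range?"""
    port_lo, port_hi = allowed_range
    return port_lo <= dst_port <= port_hi


NARROW = (32768, 40000)  # hypothetical rule that covers only some ephemeral ports
WIDE = (1024, 65535)     # hypothetical rule that covers the broad range

# Ephemeral source ports picked by four different clients (illustrative values).
client_ports = [33000, 39999, 49152, 61000]


def failed_clients(allowed_range):
    """List the client ports whose responses would be dropped on the way out."""
    return [p for p in client_ports if not outbound_allowed(p, allowed_range)]


print(failed_clients(NARROW))
print(failed_clients(WIDE))
```

With the narrow range, two of the four clients never receive a reply even though the server port is open for all of them, which is why the failure looks intermittent from the outside.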
Duplicate rules across layers create confusion and outages because when the same policy is replicated in NACLs and NSGs, drift becomes inevitable and troubleshooting becomes a guessing game. If a port is allowed in one layer but denied in another, responders may spend time adjusting the wrong control, believing they have fixed the issue when the upstream layer still blocks traffic. Duplicated policies also increase maintenance load because every change must be applied consistently in multiple places, and under incident pressure, consistency is often the first thing to slip. The exam typically expects you to avoid unnecessary duplication by assigning responsibilities to layers, such as keeping NACLs minimal as guardrails and using NSGs for precise workload policy. When duplication is unavoidable, documentation and validation become critical, but the safer default is to minimize overlap and keep each layer’s role distinct. Clarity is a defensive control in itself because it reduces the chance of accidental exposure and reduces recovery time during outages.
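One lightweight way to catch drift between layers is to compare what each layer allows. The sketch below reduces each layer to a set of allowed inbound ports, which is a strong simplification, and the function name and port values are hypothetical.

```python
def layer_mismatches(nacl_ports, nsg_ports):
    """Compare allowed inbound ports across the two layers.

    blocked_upstream: allowed by the NSG but not by the NACL, so the NSG
                      allow can never take effect.
    unused_upstream:  allowed by the NACL but not by any NSG rule, which
                      may be intentional or may be a leftover.
    """
    return {
        "blocked_upstream": sorted(nsg_ports - nacl_ports),
        "unused_upstream": sorted(nacl_ports - nsg_ports),
    }


# Someone added 8080 to the NSG during an incident and forgot the NACL.
nacl_allowed = {443, 5432}
nsg_allowed = {443, 5432, 8080}

print(layer_mismatches(nacl_allowed, nsg_allowed))
```

A check like this does not replace assigning each layer a clear role, but it gives responders a quick answer to the question of which layer is actually blocking the traffic.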
Quick wins include keeping NACLs simple and documenting NSG intent, because simplicity at the subnet boundary reduces the chance of return-path mistakes and makes it easier to reason about what is blocked globally. A simple NACL often focuses on a small number of broad denies or allows that reflect architectural boundaries, rather than trying to encode every application flow. NSG intent documentation should state which workloads the NSG protects, what flows are required, and why certain ports and sources are permitted, because that context prevents legacy rules from accumulating and supports troubleshooting. Documentation also helps reviewers identify when a rule can be narrowed or removed, especially after migrations and architecture changes. The exam tends to reward the idea that maintainability and clarity are part of security, because complex, undocumented policies become unreliable over time. When you keep the NACL small and the NSG purposeful, you preserve both security and operability.
Operationally, testing changes in non-production first is essential because small mistakes in traffic filtering can create large outages, especially when policies apply to shared subnets or critical resources. Non-production testing allows you to validate both the intended flow and the return path behavior, and it helps reveal hidden dependencies that were not documented. Testing also supports phased rollout, where you apply a change to a limited scope, observe results, and then expand, reducing the chance of widespread disruption. The exam expects you to recognize that network policy changes are high-impact and should be treated with change discipline, including rollback planning and validation checks. In hybrid environments, testing should include cross-environment flows because peering and shared services can behave differently than purely internal traffic. When changes are tested and verified, confidence increases and emergency exceptions become less necessary.
A memory anchor that fits this episode is “subnet stateless, interface stateful, mind return path,” because it captures the practical distinction that drives most exam questions and real-world errors. “Subnet stateless” reminds you that NACLs operate at subnet boundaries and do not track sessions, so both directions must be considered explicitly. “Interface stateful” reminds you that NSGs apply to resources or interfaces and track connection state, so return traffic is handled automatically once a session is permitted. “Mind return path” reminds you that the success or failure of a flow often hinges on responses and ephemeral ports, not on the obvious server port alone. This anchor helps you reason quickly about where to apply a policy and what to check when troubleshooting connectivity. When you can apply it under time pressure, you are demonstrating the exact kind of practical understanding the exam is designed to measure.
A prompt-style exercise is deciding where to enforce a stated policy, because the correct answer depends on whether the policy is a broad boundary control or a precise workload control. If the policy is “no inbound access to this subnet from untrusted networks,” a subnet-level NACL guardrail can make sense, because it enforces a broad boundary regardless of workload. If the policy is “only the application tier can reach the database on this port,” an NSG is the better place because it is targeted, stateful, and easier to scope to the specific resources involved. If the policy is “restrict outbound internet access for sensitive systems,” a combination is often used, with NSGs enforcing precise egress for the sensitive resources and a simple NACL guardrail preventing unexpected outbound paths at the subnet boundary. The exam expects you to justify placement by scope and by statefulness, not by personal preference. Practicing this selection builds the ability to place controls where they are effective and maintainable.
Episode One Hundred Six concludes with the key differences: NACLs are stateless subnet boundary guardrails that require explicit inbound and outbound allowances for conversations, while NSGs are stateful resource-level controls that support precise least privilege policy with automatic return traffic handling. Understanding rule evaluation order and directionality prevents the most common misconfigurations, especially the ephemeral return port mistakes that break applications under stateless filtering. Avoiding duplicated rules across layers reduces drift and troubleshooting confusion, and keeping NACLs simple while documenting NSG intent improves long-term maintainability. The return-path rehearsal assignment is to take a simple client-server flow and narrate which side initiates, what ports are used, and how the return traffic travels, and then to decide whether the policy belongs at the subnet NACL or at the resource NSG based on scope and statefulness. When you can narrate that return path clearly, you demonstrate exam-ready understanding and the operational mindset needed to avoid self-inflicted outages. With that mindset, NACLs and NSGs become complementary tools rather than competing sources of confusion.