Troubleshoot ACI Security Policies - Contracts

Introduction

This document describes steps to understand and troubleshoot ACI Security Policies, known as Contracts.

Background Information

The material from this document was extracted from the Troubleshooting Cisco Application Centric Infrastructure, Second Edition book, specifically the Security Policies - Overview, Security Policies - Tools, Security Policies - EPG to EPG, Security Policies - Preferred group and Security Policies - vzAny to EPG chapters.

Overview

The fundamental security architecture of the ACI solution follows a permitlist model. Unless a VRF is configured in unenforced mode, all endpoint group (EPG) to EPG traffic flows are implicitly dropped. As implied by the out-of-the-box permitlist model, the default VRF setting is enforced mode. Traffic flows can be allowed or explicitly denied by implementing zoning-rules on the switch nodes. These zoning-rules can be programmed in a variety of configurations depending on the desired communication flow between EPGs and the method used to define them. Note that zoning-rule entries are not stateful: once a rule has been programmed, it typically allows or denies traffic between two EPGs based on port/socket alone.
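
The permitlist behavior described above can be illustrated with a minimal Python sketch. It is a toy model only, not how the switch evaluates zoning-rules: the `ZoningRule` class, the `evaluate` function, and the EPG names and ports are hypothetical and chosen purely for illustration.

```python
from typing import NamedTuple, Optional


class ZoningRule(NamedTuple):
    """Hypothetical, simplified stand-in for one zoning-rule entry."""
    src_epg: str
    dst_epg: str
    sport: Optional[int]  # None means "unspecified" (any source port)
    dport: Optional[int]  # None means "unspecified" (any destination port)
    action: str           # "permit" or "deny"


def evaluate(rules, src_epg, dst_epg, sport, dport, vrf_enforced=True):
    """Decide one direction of a flow; no connection state is kept."""
    if not vrf_enforced:
        return "permit"  # unenforced VRF: EPG-to-EPG traffic is not filtered
    for rule in rules:
        if (rule.src_epg == src_epg and rule.dst_epg == dst_epg
                and rule.sport in (None, sport)
                and rule.dport in (None, dport)):
            return rule.action
    return "deny"  # permitlist model: anything not matched is dropped


# Only the Web -> App direction on destination port 8080 is permitted.
rules = [ZoningRule("Web", "App", None, 8080, "permit")]

forward = evaluate(rules, "Web", "App", 40000, 8080)
reverse = evaluate(rules, "App", "Web", 8080, 40000)
unenforced = evaluate(rules, "App", "Web", 8080, 40000, vrf_enforced=False)
print(forward, reverse, unenforced)
```

Because the model keeps no state, the return traffic from App to Web is dropped until a second rule matching source port 8080 is added, which mirrors why a separate entry is needed for each direction of a flow.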

Methods to program zoning-rules

The main methods to program zoning-rules within ACI are as follows:

The following diagram shows the granularity of zoning-rule control that each of the above methods allows:

Comparison between zoning-rule methodologies

Differences between Security Constructs

While utilizing the contract method of programming zoning-rules, there is an option for defining the contract scope. This option must be given careful consideration if any route leaking/shared service design is required. If traffic needs to get from one VRF to another within the ACI fabric, contracts are the method to do so.

The scope values can be the following:

Reading a zoning-rule entry

Once the zoning-rule is programmed, it will appear as the following on a leaf:

+---------+--------+--------+----------+----------------+---------+---------+-----------------+----------+----------------------+
| Rule ID | SrcEPG | DstEPG | FilterID | Dir | operSt | Scope | Name | Action | Priority |
+---------+--------+--------+----------+----------------+---------+---------+-----------------+----------+----------------------+

Policy Content-Addressable Memory (CAM)

As each zoning-rule gets programmed, a matrix of the zoning-rule entry mapped against filter entries will begin to consume Policy CAM on the switches. While designing allowed flows through an ACI fabric, special care should be taken when re-using contracts, as opposed to creating new ones, depending on the end design. Haphazardly re-using the same contract across multiple EPGs without understanding the resulting zoning-rules can quickly cascade into multiple flows being allowed unexpectedly. At the same time, these unintentional flows will continue to consume Policy CAM. When Policy CAM becomes full, zoning-rule programming will begin to fail, which can result in unexpected and intermittent packet loss depending on configuration and endpoint behaviors.
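
To get a feel for how quickly a re-used contract can grow, the following Python sketch applies a commonly used rule of thumb: entries are roughly consumers x providers x filter entries, doubled when the contract is applied in both directions. This is an assumption for rough sizing only; actual Policy CAM consumption depends on hardware, software release, and features such as policy compression. The function name `estimate_policy_cam_entries` is hypothetical.

```python
def estimate_policy_cam_entries(consumers, providers, filter_entries,
                                bidirectional=True):
    """Rough estimate only: one entry per consumer/provider pair and
    filter entry, doubled when rules are programmed in both directions."""
    directions = 2 if bidirectional else 1
    return consumers * providers * filter_entries * directions


# One contract between a single consumer and a single provider EPG.
small = estimate_policy_cam_entries(consumers=1, providers=1, filter_entries=5)

# The same contract re-used across 10 consumer and 10 provider EPGs.
reused = estimate_policy_cam_entries(consumers=10, providers=10, filter_entries=5)

print(small, reused)
```

The estimate grows with the product of consumers and providers, which is why attaching one broad contract to many EPGs consumes far more Policy CAM than the contract definition alone suggests.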

VRF leaking, global pcTags and policy enforcement directionality of shared L3Outs

This is a special callout for the shared services use case, which requires contracts to be configured. Shared services typically imply inter-VRF traffic within an ACI fabric, which relies on the usage of either a 'tenant' or 'global' scoped contract. To fully understand this, first reinforce the idea that the typical pcTag values assigned to EPGs are not globally unique. pcTags are scoped to a VRF, and the same pcTag could potentially be re-used within another VRF. Once route leaking comes into the picture, additional requirements are placed on the ACI fabric, including the need for globally unique values such as subnets and pcTags.

What makes this a special consideration is the directionality aspect tied to an EPG being a consumer vs a provider. In a shared services scenario, the provider is typically expected to drive a global pcTag to get a fabric unique value. At the same time, the consumer will retain its VRF-scoped pcTag which puts it in a special position to be able to now program and understand the usage of the global pcTag value to enforce policy.

For reference, the pcTag allocation range is as follows:

VRF policy control enforcement direction

In each VRF it is possible to define the enforcement direction setting.

Understanding where the policy is enforced depends on several different variables.

The table below helps to understand where the security policy is enforced at leaf level.

Where is policy enforced?

VRF enforcement mode

Policy enforced on

· If destination endpoint is learned: ingress leaf*

· If destination endpoint is not learned: egress leaf

Consumer leaf (non-border leaf)

Provider leaf (non-border leaf)

Border leaf -> non-border leaf traffic

· If destination endpoint is learned: border leaf

· If destination endpoint is not learned: non-border leaf

Non-border leaf -> border leaf traffic

Consumer leaf (Non-border leaf)

*Policy enforcement is applied on the first leaf hit by the packet.

The figure below illustrates an example of contract enforcement where EPG-Web as consumer and the L3Out EPG as provider have an intra-VRF contract. If the VRF is set to Ingress enforcement mode, policy is enforced by the leaf nodes where EPG-Web resides. If the VRF is set to Egress enforcement mode, policy is enforced by the border leaf nodes where the L3Out resides, provided the VM-Web endpoint is learned on the border leaf.

Ingress enforcement and egress enforcement

Contract Topology

Tools

There are a variety of tools and commands that can be used to help in the identification of a policy drop. A policy drop can be defined as a packet drop due to a contract configuration or lack thereof.

Zoning-rule validation

The following tools and commands can be used to explicitly validate the zoning-rules that are programmed on leaf switches as a result of completed contract consumer/provider relationships.

'show zoning-rule'

A switch level command showing all zoning rules in place.

leaf# show zoning-rule
+---------+--------+--------+----------+----------------+---------+----------+-----------------+----------+----------------------------+
| Rule ID | SrcEPG | DstEPG | FilterID | Dir | operSt | Scope | Name | Action | Priority |
+---------+--------+--------+----------+----------------+---------+----------+-----------------+----------+----------------------------+
| 4156 | 25 | 16410 | 425 | uni-dir-ignore | enabled | 2818048 | external_to_ntp | permit | fully_qual(7) |
| 4131 | 16410 | 25 | 424 | bi-dir | enabled | 2818048 | external_to_ntp | permit | fully_qual(7) |
+---------+--------+--------+----------+----------------+---------+----------+-----------------+----------+----------------------------+
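
When the output has been saved off-box, a short Python sketch such as the following can turn it into dictionaries for filtering. The `parse_cli_table` helper is hypothetical (it is not part of any Cisco tooling) and assumes the pipe-delimited layout shown above, with the first pipe-delimited row being the header.

```python
SAMPLE = """\
leaf# show zoning-rule
+---------+--------+--------+----------+
| Rule ID | SrcEPG | DstEPG | FilterID | Dir | operSt | Scope | Name | Action | Priority |
+---------+--------+--------+----------+
| 4156 | 25 | 16410 | 425 | uni-dir-ignore | enabled | 2818048 | external_to_ntp | permit | fully_qual(7) |
| 4131 | 16410 | 25 | 424 | bi-dir | enabled | 2818048 | external_to_ntp | permit | fully_qual(7) |
+---------+--------+--------+----------+
"""


def parse_cli_table(text):
    """Parse pipe-delimited CLI table rows into a list of dicts."""
    header = None
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip the prompt and the +----+ border lines
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if header is None:
            header = cells  # first data line holds the column names
        else:
            rows.append(dict(zip(header, cells)))
    return rows


zoning_rules = parse_cli_table(SAMPLE)

# Rules programmed between pcTag 25 and pcTag 16410, in either direction.
between = [r for r in zoning_rules
           if {r["SrcEPG"], r["DstEPG"]} == {"25", "16410"}]
for rule in between:
    print(rule["Rule ID"], rule["SrcEPG"], "->", rule["DstEPG"],
          "filter", rule["FilterID"], rule["Action"])
```

All values are kept as strings because columns such as SrcEPG can hold non-numeric values on a real leaf.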

'show zoning-filter'

Each zoning-rule references a filter that contains the sport/dport information the rule acts on. The filter programming can be verified with this command.

leaf# show zoning-filter
+----------+----------+-------------+-------------+-------------+----------+-------------+-------------+-------------+-------------+----------+
| FilterId | Name | EtherT | Prot | ApplyToFrag | Stateful | SFromPort | SToPort | DFromPort | DToPort | Prio |
+----------+----------+-------------+-------------+-------------+----------+-------------+-------------+-------------+-------------+----------+
| implarp | implarp | arp | unspecified | no | no | unspecified | unspecified | unspecified | unspecified | dport |
| implicit | implicit | unspecified | unspecified | no | no | unspecified | unspecified | unspecified | unspecified | implicit |
| 425 | 425_0 | ip | tcp | no | no | 123 | 123 | unspecified | unspecified | sport |
| 424 | 424_0 | ip | tcp | no | no | unspecified | unspecified | 123 | 123 | dport |
+----------+----------+-------------+-------------+-------------+----------+-------------+-------------+-------------+-------------+----------+
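
To make the port columns easier to read, the following Python sketch renders a filter row as a one-line summary. The rows are copied from the output above and are assumed to have already been parsed into dictionaries keyed by column name; `port_range` and `describe_filter` are hypothetical helper names.

```python
def port_range(from_port, to_port):
    """Collapse a From/To port pair into 'any', a single port, or a range."""
    if from_port == "unspecified" and to_port == "unspecified":
        return "any"
    if from_port == to_port:
        return from_port
    return f"{from_port}-{to_port}"


def describe_filter(row):
    """Summarize one 'show zoning-filter' row in a single line."""
    sport = port_range(row["SFromPort"], row["SToPort"])
    dport = port_range(row["DFromPort"], row["DToPort"])
    return f"{row['EtherT']} {row['Prot']} sport={sport} dport={dport}"


filter_425 = {"FilterId": "425", "EtherT": "ip", "Prot": "tcp",
              "SFromPort": "123", "SToPort": "123",
              "DFromPort": "unspecified", "DToPort": "unspecified"}
filter_424 = {"FilterId": "424", "EtherT": "ip", "Prot": "tcp",
              "SFromPort": "unspecified", "SToPort": "unspecified",
              "DFromPort": "123", "DToPort": "123"}

summary_425 = describe_filter(filter_425)
summary_424 = describe_filter(filter_424)
print(summary_425)
print(summary_424)
```

Read together, filter 424 matches traffic toward destination port 123 and filter 425 matches the return traffic sourced from port 123, which is how a non-stateful rule pair covers both directions of a flow.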

'show system internal policy-mgr stats'

This command can be run to verify the number of hits per zoning-rule. This is useful to determine whether an expected rule is being hit as opposed to another, such as an implicit drop rule that may have a higher priority.

leaf# show system internal policy-mgr stats
Requested Rule Statistics
Rule (4131) DN (sys/actrl/scope-2818048/rule-2818048-s-16410-d-25-f-424) Ingress: 0, Egress: 0, Pkts: 0 RevPkts: 0
Rule (4156) DN (sys/actrl/scope-2818048/rule-2818048-s-25-d-16410-f-425) Ingress: 0, Egress: 0, Pkts: 0 RevPkts: 0
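
On a busy leaf this output can be long, so a small parser helps to spot rules that are or are not being hit. The following Python sketch assumes the line layout shown above; the regular expressions and the `parse_rule_stats` function are illustrative and not part of any Cisco tooling.

```python
import re

SAMPLE = """\
Requested Rule Statistics
Rule (4131) DN (sys/actrl/scope-2818048/rule-2818048-s-16410-d-25-f-424) Ingress: 0, Egress: 0, Pkts: 0 RevPkts: 0
Rule (4156) DN (sys/actrl/scope-2818048/rule-2818048-s-25-d-16410-f-425) Ingress: 0, Egress: 0, Pkts: 0 RevPkts: 0
"""

STATS_RE = re.compile(
    r"Rule \((?P<rule_id>\d+)\) DN \((?P<dn>[^)]+)\) "
    r"Ingress: (?P<ingress>\d+), Egress: (?P<egress>\d+), "
    r"Pkts: (?P<pkts>\d+)\s+RevPkts: (?P<rev_pkts>\d+)"
)
# The DN encodes the scope, source pcTag, destination pcTag and filter ID.
DN_RE = re.compile(
    r"scope-(?P<scope>\d+)/rule-\d+-s-(?P<src>\w+)-d-(?P<dst>\w+)-f-(?P<flt>\w+)"
)


def parse_rule_stats(text):
    """Return one dict per 'Rule (...)' line found in the text."""
    results = []
    for match in STATS_RE.finditer(text):
        entry = {"rule_id": int(match["rule_id"]), "dn": match["dn"]}
        for counter in ("ingress", "egress", "pkts", "rev_pkts"):
            entry[counter] = int(match[counter])
        dn_match = DN_RE.search(match["dn"])
        if dn_match:
            entry.update(dn_match.groupdict())
        results.append(entry)
    return results


stats = parse_rule_stats(SAMPLE)
never_hit = [s["rule_id"] for s in stats if s["pkts"] == 0 and s["rev_pkts"] == 0]
print(never_hit)
```

Comparing two snapshots taken while test traffic is running is usually more informative than a single reading, since it shows which counter actually increments.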

'show logging ip access-list internal packet-log deny'

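A switch level command that can be run at iBash level which reports ACL (contract) related drops and flow-related information including the VRF (CName), VLAN ID, source and destination MAC, source and destination IP, source and destination port, source interface, protocol, and packet length: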

leaf# show logging ip access-list internal packet-log deny 
[ Tue Oct 1 10:34:37 2019 377572 usecs]: CName: Prod1:VRF1(VXLAN: 2654209), VlanType: Unknown, Vlan-Id: 0, SMac: 0x000c0c0c0c0c, DMac:0x000c0c0c0c0c, SIP: 192.168.21.11, DIP: 192.168.22.11, SPort: 0, DPort: 0, Src Intf: Tunnel7, Proto: 1, PktLen: 98
[ Tue Oct 1 10:34:36 2019 377731 usecs]: CName: Prod1:VRF1(VXLAN: 2654209), VlanType: Unknown, Vlan-Id: 0, SMac: 0x000c0c0c0c0c, DMac:0x000c0c0c0c0c, SIP: 192.168.21.11, DIP: 192.168.22.11, SPort: 0, DPort: 0, Src Intf: Tunnel7, Proto: 1, PktLen: 98
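
Repeated deny entries for the same flow are easier to review once grouped. The following Python sketch assumes the comma-separated `key: value` layout shown above; `parse_deny_line` is a hypothetical helper, and the protocol lookup table covers only three well-known IP protocol numbers.

```python
from collections import Counter

SAMPLE = """\
[ Tue Oct 1 10:34:37 2019 377572 usecs]: CName: Prod1:VRF1(VXLAN: 2654209), VlanType: Unknown, Vlan-Id: 0, SMac: 0x000c0c0c0c0c, DMac:0x000c0c0c0c0c, SIP: 192.168.21.11, DIP: 192.168.22.11, SPort: 0, DPort: 0, Src Intf: Tunnel7, Proto: 1, PktLen: 98
[ Tue Oct 1 10:34:36 2019 377731 usecs]: CName: Prod1:VRF1(VXLAN: 2654209), VlanType: Unknown, Vlan-Id: 0, SMac: 0x000c0c0c0c0c, DMac:0x000c0c0c0c0c, SIP: 192.168.21.11, DIP: 192.168.22.11, SPort: 0, DPort: 0, Src Intf: Tunnel7, Proto: 1, PktLen: 98
"""

# IANA-assigned IP protocol numbers for the most common cases.
IP_PROTOCOLS = {"1": "icmp", "6": "tcp", "17": "udp"}


def parse_deny_line(line):
    """Split one packet-log line into a dict of its fields."""
    timestamp, _, body = line.partition("]: ")
    entry = {"timestamp": timestamp.lstrip("[ ").strip()}
    for field in body.split(", "):
        # Split on the first colon only, so values such as
        # 'Prod1:VRF1(VXLAN: 2654209)' stay intact.
        key, _, value = field.partition(":")
        entry[key.strip()] = value.strip()
    return entry


entries = [parse_deny_line(line) for line in SAMPLE.splitlines()
           if line.startswith("[")]

# Count drops per (source IP, destination IP, protocol, destination port).
flows = Counter(
    (e["SIP"], e["DIP"], IP_PROTOCOLS.get(e["Proto"], e["Proto"]), e["DPort"])
    for e in entries
)
for flow, count in flows.items():
    print(flow, count)
```

In this sample both entries collapse into a single ICMP flow between 192.168.21.11 and 192.168.22.11, which narrows the search to the contract between the two EPGs that own those addresses.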

contract_parser

An on-device Python script which produces an output that correlates the zoning-rules, filters and hit statistics while performing name lookups from IDs. This script is extremely useful in that it takes a multi-step process and turns it into a single command which can be filtered to specific EPGs/VRFs or on other contract related values.

leaf# contract_parser.py
Key:
[prio:RuleId] [vrf:] action protocol src-epg [src-l4] dst-epg [dst-l4] [flags][contract:] [hit=count]

[7:4131] [vrf:common:default] permit ip tcp tn-Prod1/ap-Services/epg-NTP(16410) tn-Prod1/l3out-L3Out1/instP-extEpg(25) eq 123 [contract:uni/tn-Prod1/brc-external_to_ntp] [hit=0]
[7:4156] [vrf:common:default] permit ip tcp tn-Prod1/l3out-L3Out1/instP-extEpg(25) eq 123 tn-Prod1/ap-Services/epg-NTP(16410) [contract:uni/tn-Prod1/brc-external_to_ntp] [hit=0]
[12:4169] [vrf:common:default] deny,log any tn-Prod1/l3out-L3Out1/instP-extEpg(25) epg:any [contract:implicit] [hit=0]
[16:4167] [vrf:common:default] permit any epg:any tn-Prod1/bd-Services(32789) [contract:implicit] [hit=0]
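
If the script output is captured to a file, it can be post-processed further. The following Python sketch assumes the line format given in the key above and extracts the priority, rule ID, action, contract and hit count; the regular expression is an illustration written against this sample only and may need adjusting for other output variants.

```python
import re

SAMPLE = """\
[7:4131] [vrf:common:default] permit ip tcp tn-Prod1/ap-Services/epg-NTP(16410) tn-Prod1/l3out-L3Out1/instP-extEpg(25) eq 123 [contract:uni/tn-Prod1/brc-external_to_ntp] [hit=0]
[7:4156] [vrf:common:default] permit ip tcp tn-Prod1/l3out-L3Out1/instP-extEpg(25) eq 123 tn-Prod1/ap-Services/epg-NTP(16410) [contract:uni/tn-Prod1/brc-external_to_ntp] [hit=0]
[12:4169] [vrf:common:default] deny,log any tn-Prod1/l3out-L3Out1/instP-extEpg(25) epg:any [contract:implicit] [hit=0]
[16:4167] [vrf:common:default] permit any epg:any tn-Prod1/bd-Services(32789) [contract:implicit] [hit=0]
"""

LINE_RE = re.compile(
    r"^\[(?P<prio>\d+):(?P<rule_id>\d+)\] \[vrf:(?P<vrf>[^\]]+)\] "
    r"(?P<action>\S+) (?P<match>.*?) \[contract:(?P<contract>[^\]]+)\] "
    r"\[hit=(?P<hits>\d+)",
    re.MULTILINE,
)

parsed = []
for m in LINE_RE.finditer(SAMPLE):
    entry = m.groupdict()
    for numeric in ("prio", "rule_id", "hits"):
        entry[numeric] = int(entry[numeric])
    parsed.append(entry)

# Deny rules are usually the first thing to check for a policy drop.
denies = [e for e in parsed if e["action"].startswith("deny")]
for e in denies:
    print(e["rule_id"], e["prio"], e["contract"], e["match"], e["hits"])
```

The key line itself is skipped because its priority field is not numeric, so only real rule entries are returned.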

Packet classification validation

ELAM

An ASIC level report used to check forwarding details which indicates, in the case of a dropped packet, the drop reason. Relevant to this section, the reason can be a SECURITY_GROUP_DENY (contract policy drop).

fTriage

A Python-based utility on the APIC which can track end-to-end packet flow with ELAM.

ELAM Assistant App

An APIC App that abstracts the complexity of various ASICs to make forwarding decision inspection much more convenient and user friendly.

Please refer to the "Intra-Fabric Forwarding" section for additional details on the ELAM, fTriage and ELAM Assistant tools.

Policy CAM usage

Policy CAM usage on a per-leaf basis is an important parameter to monitor to ensure the fabric remains healthy. The quickest way to monitor it is to use the 'Capacity Dashboard' within the GUI and explicitly check the 'Policy Cam' column.

The 'Leaf Capacity' view of Capacity Dashboard

Capacity Dashboard

'show platform internal hal health-stats'

This command is useful for validating a variety of resource limits and usage, including Policy CAM. Note that this command can only be run in vsh_lc, so pass it in using the '-c' flag if being run from iBash.

leaf8# vsh_lc -c "show platform internal hal health-stats"
|Sandbox_ID: 0 Asic Bitmap: 0x0
|-------------------------------------
.
Policy stats:
=============
policy_count : 96
max_policy_count : 65536
policy_otcam_count : 175
max_policy_otcam_count : 8192
policy_label_count : 0
max_policy_label_count : 0
=============
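
The counters above can be turned into utilization percentages with a few lines of Python. This sketch assumes each counter has a matching `max_` counter, as in the "Policy stats" block shown; `policy_cam_usage` is a hypothetical helper name.

```python
import re

SAMPLE = """\
Policy stats:
=============
policy_count : 96
max_policy_count : 65536
policy_otcam_count : 175
max_policy_otcam_count : 8192
policy_label_count : 0
max_policy_label_count : 0
=============
"""


def policy_cam_usage(text):
    """Return {name: percent used}, or None where the maximum is 0."""
    counters = {
        key: int(value)
        for key, value in re.findall(
            r"^(\w+)[ \t]*:[ \t]*(\d+)[ \t]*$", text, re.MULTILINE)
    }
    usage = {}
    for key, maximum in counters.items():
        if not key.startswith("max_"):
            continue
        current_key = key[len("max_"):]           # e.g. policy_count
        name = current_key
        if name.endswith("_count"):
            name = name[:-len("_count")]          # e.g. policy
        current = counters.get(current_key, 0)
        usage[name] = (100.0 * current / maximum) if maximum else None
    return usage


usage = policy_cam_usage(SAMPLE)
for name, percent in usage.items():
    print(name, "n/a" if percent is None else f"{percent:.2f}%")
```

For the sample output this gives roughly 0.15% for policy_count and 2.14% for policy_otcam_count, while policy_label is reported as not applicable because its maximum is 0.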

EPG to EPG

Generic policy drop considerations

There are numerous ways to troubleshoot a connectivity issue between two endpoints. The following methodology provides a good starting point to quickly and effectively isolate whether the connectivity issue is the result of a policy drop (contract induced).

Some high-level questions worth asking before diving in: