Framework and Requirements for Ethernet VPN (EVPN) Operations, Administration, and Maintenance (OAM)
Cisco, The Atrium Building, Floor 3, Weygand St., Beirut, Lebanon, ssalam@cisco.com
Cisco, 170 West Tasman Drive, San Jose, CA 95134, United States of America, sajassi@cisco.com
Google, Inc., 1600 Amphitheatre Parkway, Mountain View, CA 94043, United States of America, aldrin.ietf@gmail.com
Juniper Networks, 1194 N. Mathilda Ave., Sunnyvale, CA 94089, United States of America, jdrake@juniper.net
Futurewei Technologies, 2386 Panoramic Circle, Apopka, FL 32703, United States of America, +1-508-333-2270, d3e3e3@gmail.com
Keywords: PBB-EVPN, fault management, performance management
This document specifies the requirements and reference framework for
Ethernet VPN (EVPN) Operations, Administration, and Maintenance (OAM).
The requirements cover the OAM aspects of EVPN and Provider Backbone Bridge EVPN (PBB-EVPN). The framework defines the layered OAM model
encompassing the EVPN service layer, network layer, underlying Packet
Switched Network (PSN) transport layer, and link layer but focuses on
the service and network layers.
Introduction
This document specifies the requirements and defines a reference
framework for Ethernet VPN (EVPN) Operations, Administration, and
Maintenance (OAM). In this context, we use the term "EVPN OAM" to loosely refer to the OAM functions required for and/or
applicable to EVPN and PBB-EVPN.
EVPN is a Layer 2 VPN (L2VPN) solution for multipoint Ethernet
services with advanced multihoming capabilities that uses BGP for
distributing Customer/Client Media Access Control (C-MAC) address reachability information
over the core MPLS/IP network.
PBB-EVPN combines Provider Backbone Bridging (PBB) with EVPN in
order to reduce the number of BGP MAC advertisement routes; provide client
MAC address mobility using C-MAC aggregation and
Backbone MAC (B-MAC) sub-netting; confine the scope of C-MAC
learning to only active flows; offer per-site policies; and avoid C-MAC
address flushing on topology changes.
This document focuses on the fault management and performance
management aspects of EVPN OAM. It defines the layered OAM model
encompassing the EVPN service layer, network layer, underlying Packet
Switched Network (PSN) transport layer, and link layer but focuses on
the service and network layers.
Relationship to Other OAM Work
This document leverages concepts and draws upon elements defined
and/or used in the following documents:
[RFC6136] specifies the requirements and a reference model for OAM as
it relates to L2VPN services, pseudowires, and associated Packet
Switched Network (PSN) tunnels. That document focuses on Virtual Private LAN Service (VPLS) and Virtual Private Wire Service (VPWS) solutions and services.
[RFC8029] defines mechanisms for detecting data plane failures in
MPLS Label Switched Paths (LSPs), including procedures to check the correct operation of the
data plane as well as mechanisms to verify the data plane against
the control plane.
[802.1Q] specifies the Ethernet Connectivity Fault Management (CFM)
protocol, which defines the concepts of Maintenance Domains,
Maintenance Associations, Maintenance End Points, and Maintenance
Intermediate Points.
[Y.1731] extends Connectivity Fault Management in the following
areas: it defines fault notification and alarm suppression functions
for Ethernet and specifies mechanisms for Ethernet performance
management, including loss, delay, jitter, and throughput
measurement.
Specification of Requirements
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL
NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED",
"MAY", and "OPTIONAL" in this document are to be interpreted as
described in BCP 14
when, and only when, they appear in all capitals, as shown here.
Terminology
This document uses the following terminology, much of which is defined
in [RFC6136]:
CE
Customer Edge device; for example, a host, router, or switch.
CFM
Connectivity Fault Management
DF
Designated Forwarder
Down MEP
A MEP that originates traffic away from and terminates
traffic towards the core of the device in whose port it is logically located.
EVI
An EVPN instance spanning the Provider Edge (PE)
devices participating in that EVPN.
L2VPN
Layer 2 VPN
LOC
Loss of continuity
MA
Maintenance Association; a set of MEPs belonging
to the same Maintenance Domain (MD) established to verify the
integrity of a single service instance.
MD
Maintenance Domain; an OAM Domain that represents a
region over which OAM frames can operate unobstructed.
MEP
Maintenance End Point; it is responsible for
origination and termination of OAM frames for a given MA. A MEP is
logically located in a device's port.
MIP
Maintenance Intermediate Point; it is located between
peer MEPs and can process and respond to certain OAM frames but does
not initiate them. A MIP is logically located in a device's port.
MP2P
Multipoint to Point
NMS
Network Management Station
P
Provider network interior (non-edge) node
P2MP
Point to Multipoint
PBB
Provider Backbone Bridge
PE
Provider Edge network device
Up MEP
A MEP that originates traffic towards and
terminates traffic from the core of the device in whose port it is
logically located.
VPN
Virtual Private Network
EVPN OAM Framework
OAM Layering
Multiple layers come into play for implementing an L2VPN service
using the EVPN family of solutions as listed below. The focus of this
document is the service and network layers.
The service layer runs end to end between the sites or Ethernet
segments that are being interconnected by the EVPN solution.
The network layer extends between the EVPN PE (Provider Edge) nodes
and is mostly transparent to the P (provider network interior)
nodes (except where flow entropy comes into play). It leverages
MPLS for service (i.e., EVI) multiplexing and split-horizon
functions.
The transport layer is dictated by the networking technology of the
PSN. It may be based on either MPLS LSPs or IP.
The link layer is dependent upon the physical technology used.
Ethernet is a popular choice for this layer, but other alternatives
are deployed (e.g., Packet over SONET (POS), Dense Wavelength Division Multiplexing (DWDM), etc.).
This layering extends to the set of OAM protocols that are involved
in the ongoing maintenance and diagnostics of EVPN networks.
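As a rough companion to the layering description, the following Python sketch records which node types take part in each OAM layer. The table is drawn from the visibility statements made elsewhere in this document; the names `Node`, `OAM_VISIBILITY`, and `layers_visible_to` are illustrative only, and the Link OAM row is an assumption based on the statement that Link OAM runs between PE and P nodes.

```python
from enum import Enum


class Node(Enum):
    CE = "CE"
    PE = "PE"
    P = "P"


# Node types that participate in each OAM layer. The "link" row is an
# assumption: the text only says Link OAM runs between PE and P nodes.
OAM_VISIBILITY = {
    "service": {Node.CE, Node.PE},
    "network": {Node.PE},
    "transport": {Node.PE, Node.P},
    "link": {Node.PE, Node.P},
}


def layers_visible_to(node: Node) -> list[str]:
    """Return the OAM layers a node type participates in, top layer first."""
    order = ["service", "network", "transport", "link"]
    return [layer for layer in order if node in OAM_VISIBILITY[layer]]
```

Under these assumptions, a P node takes part only in the transport and link layers, which is why Service and Network OAM cannot localize a fault to a specific P node.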
Figure 1 below depicts the OAM layering and shows which devices have
visibility into what OAM layer(s).
Service OAM and Network OAM mechanisms only have visibility to the PE
nodes but not the P nodes. As
such, they can be used to deduce whether the fault is in the customer's own network, the local CE-PE segment, the PE-PE segment, or
the remote CE-PE segment(s). EVPN Transport OAM mechanisms can be
used for fault isolation between the PEs and P nodes.
Figure 2 below shows an example network where Ethernet domains
are interconnected via EVPN using MPLS, and it shows the OAM mechanisms
that are applicable at each layer. The details of the layers are described in
the sections below.
EVPN Service OAM
The EVPN Service OAM protocol depends on what service-layer
technology is being interconnected by the EVPN solution. In the case of
EVPN and PBB-EVPN, the service layer is Ethernet; hence, the
corresponding Service OAM protocol is Ethernet CFM.
EVPN Service OAM is visible to the CEs and EVPN PEs but not to the P
nodes. This is because the PEs operate at the Ethernet MAC layer in
EVPN and PBB-EVPN, whereas the P nodes do not.
The EVPN PE MUST support MIP functions in the applicable Service OAM
protocol (for example, Ethernet CFM). The EVPN PE SHOULD support MEP
functions in the applicable Service OAM protocol. This includes both
Up and Down MEP functions.
As shown in Figure 2, the MIP and MEP functions being referred to are
logically located within the device's port operating at the customer
level. (There could be MEPs/MIPs within PE ports facing the provider
network, but they would not be relevant to EVPN Service OAM as the
traffic passing through them will be encapsulated/tunneled, so any
customer-level OAM messages will just be treated as data.) Down MEP
functions are away from the core of the device while Up MEP functions
are towards the core of the device (towards the PE forwarding
mechanism in the case of a PE). OAM messages between the PE Up MEPs
shown are a type of EVPN Network OAM, while such messages between the
CEs or from a PE to its local CE or to the remote CE are Service OAMs.
The EVPN PE MUST, by default, learn the MAC addresses of locally
attached CE MEPs by snooping on CFM frames and advertise them to
remote PEs in MAC/IP Advertisement routes. Some means to limit the
number of MAC addresses that a PE will learn SHOULD be implemented.
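A minimal sketch of such a learning limit, assuming the PE sees the EtherType that follows any VLAN tags and uses a fixed per-PE cap, might look as follows. The class name `CfmMacSnooper` and its interface are hypothetical; only the CFM EtherType value (0x8902) comes from IEEE 802.1Q.

```python
CFM_ETHERTYPE = 0x8902  # EtherType used by IEEE 802.1Q CFM frames


class CfmMacSnooper:
    """Learns source MACs of local CE MEPs from CFM frames, up to a cap."""

    def __init__(self, max_macs: int):
        self.max_macs = max_macs
        self.learned: set[str] = set()

    def observe(self, src_mac: str, ethertype: int) -> bool:
        """Return True if src_mac is newly learned and should be advertised."""
        if ethertype != CFM_ETHERTYPE:
            return False  # not a CFM frame, nothing to snoop
        if src_mac in self.learned:
            return False  # already learned and advertised
        if len(self.learned) >= self.max_macs:
            return False  # learning limit reached, ignore new MACs
        self.learned.add(src_mac)
        return True
```

A real implementation would also need aging and a policy for what happens once the cap is hit; both are left out of this sketch.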
The EVPN PE SHOULD advertise any MEP/MIP local to the PE as a MAC/IP
Advertisement route. Since these are not subject to mobility, they
SHOULD be advertised with the static (sticky) bit set.
EVPN Network OAM
EVPN Network OAM is visible to the PE nodes only. This OAM layer is
analogous to Virtual Circuit Connectivity Verification (VCCV) in the case of VPLS/VPWS. It provides
mechanisms to check the correct operation of the data plane as well
as a mechanism to verify the data plane against the control plane.
This includes the ability to perform fault detection and diagnostics
on:
the MP2P tunnels used for the transport of unicast traffic between
PEs. EVPN allows for three different models of unicast label
assignment: label per EVI, label per <ESI, Ethernet Tag>, and label
per MAC address. In all three models, the label is bound to an EVPN
Unicast Forwarding Equivalence Class (FEC). EVPN Network OAM MUST provide mechanisms to check the
operation of the data plane and verify that operation against the
control plane view.
the MP2P tunnels used for aliasing unicast traffic destined to a
multihomed Ethernet segment. The three label assignment models,
discussed above, apply here as well. In all three models, the label
is bound to an EVPN Aliasing FEC. EVPN Network OAM MUST provide
mechanisms to check the operation of the data plane and verify that
operation against the control plane view.
the multicast tunnels (either MP2P or P2MP) used for the transport
of broadcast, unknown unicast, and multicast traffic between PEs. In
the case of ingress replication, a label is allocated per EVI or
per <EVI, Ethernet Tag> and is bound to an EVPN Multicast FEC. In
the case of Label Switched Multicast (LSM) and, more specifically,
aggregate inclusive trees, again, a label may be allocated per EVI
or per <EVI, Ethernet Tag> and is bound to the tunnel FEC.
the correct operation of the Ethernet Segment Identifier (ESI) split-horizon filtering function.
In EVPN, a label is allocated per multihomed Ethernet segment for
the purpose of performing the access split-horizon enforcement. The
label is bound to an EVPN Ethernet segment.
the correct operation of the Designated Forwarder (DF)
filtering function. EVPN Network OAM MUST provide mechanisms to
check the operation of the data plane and verify that operation
against the control plane view for the DF filtering function.
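The recurring requirement above, verifying the data plane against the control plane view, can be pictured as a comparison of label bindings. The sketch below assumes that both views can be reduced to a mapping from a FEC name to an MPLS label; the function name and the FEC strings in the usage note are hypothetical, and a real mechanism would obtain the data plane view by sending OAM probes rather than by reading a table.

```python
def verify_against_control_plane(control: dict[str, int],
                                 data: dict[str, int]) -> dict[str, list[str]]:
    """Compare label bindings known to the control plane with those
    observed in the data plane. Keys are FEC names, values are labels."""
    missing = sorted(fec for fec in control if fec not in data)
    stale = sorted(fec for fec in data if fec not in control)
    mismatched = sorted(fec for fec in control
                        if fec in data and control[fec] != data[fec])
    return {"missing": missing, "stale": stale, "mismatched": mismatched}
```

For example, calling it with `{"unicast:evi10": 100, "aliasing:es1": 200}` as the control plane view and `{"unicast:evi10": 101}` as the data plane view reports `aliasing:es1` as missing and `unicast:evi10` as mismatched.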
EVPN Network OAM mechanisms MUST provide in-band monitoring
capabilities. It is desirable, to the extent practical, for OAM test
messages to share fate with data messages. Details of how to achieve
this are beyond the scope of this document.
EVPN Network OAM SHOULD provide both proactive and on-demand
mechanisms of monitoring the data plane operation and data plane
conformance to the state of the control plane.
Transport OAM for EVPN
The Transport OAM protocol depends on the nature of the underlying
transport technology in the PSN. MPLS OAM mechanisms as well as ICMP and ICMPv6 are applicable,
depending on whether the PSN employs MPLS or IP transport,
respectively. Furthermore, Bidirectional Forwarding Detection (BFD) mechanisms apply,
as do the BFD mechanisms pertaining to MPLS-TP LSPs.
Link OAM
Link OAM depends on the data-link technology being used between the
PE and P nodes. For example, if Ethernet links are employed, then
Ethernet Link OAM ([802.3], Clause 57) may be used.
OAM Interworking
When interworking two networking domains, such as actual Ethernet
and EVPN to provide an end-to-end emulated service, there is a need
to identify the failure domain and location, even when a PE supports
both the Service OAM mechanisms and the EVPN Network OAM mechanisms.
In addition, scalability constraints may not allow the running of proactive
monitoring, such as Ethernet Continuity Check Messages (CCMs),
at a PE to detect the failure of an EVI across the EVPN
domain. Thus, the mapping of alarms generated upon failure detection
in one domain (e.g., actual Ethernet or EVPN network domain) to the
other domain is needed. There are also cases where a PE may not be
able to process Service OAM messages received from a remote PE over
the PSN even when such messages are defined, as in the Ethernet case,
thereby necessitating support for fault notification message mapping
between the EVPN Network domain and the Service domain.
OAM interworking is not limited, though, to scenarios involving disparate
network domains. It is possible to perform OAM interworking across
different layers in the same network domain. In general, alarms generated
within an OAM layer, as a result of proactive fault detection mechanisms, may be injected into its client-layer OAM mechanisms. This allows the
client-layer OAM to trigger event-driven (i.e., asynchronous) fault
notifications. For example, alarms generated by the Link OAM mechanisms may
be injected into the Transport OAM layer, and alarms generated by the
Transport OAM mechanism may be injected into the Network OAM mechanism, and
so on.
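The layer-to-layer alarm injection described above can be sketched as a walk up the layer stack. This is an illustration under simple assumptions: layers are strictly ordered from server to client, and interworking is enabled or disabled per adjacent pair. The names `LAYERS` and `propagate_alarm` are hypothetical.

```python
# OAM layers ordered from server (bottom) to client (top).
LAYERS = ["link", "transport", "network", "service"]


def propagate_alarm(origin: str, interworking: set[tuple[str, str]]) -> list[str]:
    """Return the layers that see an alarm raised at `origin`.

    `interworking` holds (server, client) pairs for which alarm injection
    is enabled. Propagation stops at the first adjacent pair that is not
    enabled.
    """
    reached = [origin]
    idx = LAYERS.index(origin)
    while idx + 1 < len(LAYERS):
        pair = (LAYERS[idx], LAYERS[idx + 1])
        if pair not in interworking:
            break
        reached.append(LAYERS[idx + 1])
        idx += 1
    return reached
```

With only the ("network", "service") pair enabled, a Network OAM alarm reaches Service OAM, while a Link OAM alarm stays within the link layer.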
EVPN OAM MUST support interworking between the Network OAM and
Service OAM mechanisms. EVPN OAM MAY support interworking among
other OAM layers.
EVPN OAM Requirements
This section discusses the EVPN OAM requirements pertaining to fault
management and performance management.
Fault Management Requirements
Proactive Fault Management Functions
The network operator configures proactive fault management functions
to run periodically. Certain actions (for
example, protection switchover or alarm indication signaling) can be
associated with specific events, such as entering or clearing fault
states.
Fault Detection (Continuity Check)
Proactive fault detection is performed by periodically monitoring the
reachability between service end points, i.e., MEPs in a given MA,
through the exchange of CCMs. The
reachability between any two arbitrary MEPs may be monitored for:
in-band, per-flow monitoring. This enables per-flow monitoring
between MEPs. EVPN Network OAM MUST support fault detection with
per-user flow granularity. EVPN Service OAM MAY support fault
detection with per-user flow granularity.
a representative path. This enables a liveness check of the nodes
hosting the MEPs, assuming that the loss of continuity (LOC) to the MEP is
interpreted as a failure of the hosting node. This, however, does
not conclusively indicate liveness of the path(s) taken by user
data traffic. This enables node failure detection but not path
failure detection through the use of a test flow. EVPN Network OAM
and Service OAM MUST support fault detection using test flows.
all paths. For MPLS/IP networks with ECMP, the monitoring of all unicast
paths between MEPs (on non-adjacent nodes) may not be possible since the
per-hop ECMP hashing behavior may yield situations where it is impossible
for a MEP to pick flow entropy characteristics that result in exercising
the exhaustive set of ECMP paths. The monitoring of all ECMP paths between
MEPs (on non-adjacent nodes) is not a requirement for EVPN OAM.
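The continuity check that underlies the monitoring modes listed above can be sketched as a per-MEP timer. The sketch assumes the loss threshold commonly used with CFM, 3.5 times the CCM transmission interval; that multiplier and the class `CcmMonitor` are assumptions of this example and are not taken from this document.

```python
LOC_MULTIPLIER = 3.5  # assumption: usual CFM loss threshold, in CCM intervals


class CcmMonitor:
    """Tracks CCM arrivals from remote MEPs in one MA and flags LOC."""

    def __init__(self, ccm_interval_s: float, remote_mep_ids: list[int], now: float):
        self.timeout = LOC_MULTIPLIER * ccm_interval_s
        # Start every timer at creation time so that a MEP that never
        # sends a CCM is eventually reported as well.
        self.last_seen = {mep_id: now for mep_id in remote_mep_ids}

    def ccm_received(self, mep_id: int, now: float) -> None:
        """Record a CCM from a known remote MEP; unknown MEPs are ignored."""
        if mep_id in self.last_seen:
            self.last_seen[mep_id] = now

    def loc_defects(self, now: float) -> list[int]:
        """Return the remote MEP IDs whose CCMs have been missing too long."""
        return sorted(m for m, t in self.last_seen.items()
                      if now - t > self.timeout)
```

Whether the monitored CCMs follow a user flow or a test flow is determined by how the frames are addressed and encapsulated, which this sketch does not model.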
The fact that MPLS/IP networks do not enforce congruency between
unicast and multicast paths means that the proactive fault detection
mechanisms for EVPN networks MUST provide procedures to monitor the
unicast paths independently of the multicast paths. This applies to
EVPN Service OAM and Network OAM.
Defect Indication
Defect indications can be categorized into two types: forward and
reverse, as described below. EVPN Service OAM MUST
support at least one of these types of event-driven defect indications
upon the detection of a connectivity defect.
Forward Defect Indication (FDI)
FDI is used to signal a failure that is detected by a lower-layer
OAM mechanism. A server MEP (i.e., an actual or virtual MEP)
transmits a forward defect indication in a direction away
from the direction of the failure (refer to Figure 3 below).
Forward defect indication may be used for alarm suppression and/or
for the purpose of interworking with other layer OAM protocols. Alarm
suppression is useful when a transport-level or network-level fault translates
to multiple service- or flow-level faults. In such a scenario, it is
enough to alert a network management station (NMS) of the single
transport-level or network-level fault in lieu of flooding that NMS with a
multitude of Service or Flow granularity alarms. EVPN PEs SHOULD
support forward defect indication in the Service OAM mechanisms.
Reverse Defect Indication (RDI)
RDI is used to signal that the advertising MEP has detected a LOC defect. RDI is transmitted in the direction of the
failure (refer to Figure 3).
RDI allows single-sided management, where the network operator can
examine the state of a single MEP and deduce the overall health of a
monitored service. EVPN PEs SHOULD support reverse defect indication
in the Service OAM mechanisms. This includes both the ability to
signal a LOC defect to a remote MEP as well as the ability to
recognize RDI from a remote MEP. Note that, in a multipoint MA, RDI
is not a useful indicator of unidirectional fault. This is because
RDI carries no indication of the affected MEP(s) with which the
sender had detected a LOC defect.
On-Demand Fault Management Functions
On-demand fault management functions are initiated manually by the
network operator and continue for a bounded time period. These
functions enable the operator to run diagnostics to investigate a
defect condition.
Connectivity Verification
EVPN Network OAM MUST support on-demand connectivity verification
mechanisms for unicast and multicast destinations. The connectivity
verification mechanisms SHOULD provide a means for specifying and
carrying the following in the messages:
variable-length payload/padding to test connectivity problems related to the Maximum Transmission Unit (MTU).
test frame formats as defined in Appendix C of [RFC2544] to detect
potential packet corruption.
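The two message properties listed above can be illustrated with a small probe builder. This is only a sketch: the CRC-32 trailer is a stand-in for a real test frame format, the `send` callback is a hypothetical hook that returns True when a probe is delivered intact, and the binary search assumes that any size smaller than a passing size also passes.

```python
import zlib


def build_probe(payload_len: int) -> bytes:
    """Build a test payload of the requested length with a CRC-32 trailer.
    The 4-byte trailer lets the receiver detect corruption of the padding."""
    if payload_len < 4:
        raise ValueError("payload must have room for the 4-byte CRC")
    body = bytes(i % 256 for i in range(payload_len - 4))
    return body + zlib.crc32(body).to_bytes(4, "big")


def probe_is_intact(probe: bytes) -> bool:
    """Check the CRC-32 trailer of a received probe."""
    body, trailer = probe[:-4], probe[-4:]
    return zlib.crc32(body).to_bytes(4, "big") == trailer


def largest_passing_size(send, low: int, high: int) -> int:
    """Binary-search the largest payload size in [low, high] for which
    send(probe) returns True. Returns low - 1 if no size passes."""
    best = low - 1
    while low <= high:
        mid = (low + high) // 2
        if send(build_probe(mid)):
            best, low = mid, mid + 1
        else:
            high = mid - 1
    return best
```

Sweeping the payload length in this way narrows an MTU-related connectivity problem down to a size boundary, while the trailer check separates corruption from outright loss.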
EVPN Network OAM MUST support connectivity verification at per-flow
granularity. This includes both user flows (to test a specific path
between PEs) as well as test flows (to test a representative path
between PEs).
EVPN Service OAM MUST support connectivity verification on test flows
and MAY support connectivity verification on user flows.
For multicast connectivity verification, EVPN Network OAM MUST
support reporting on:
the DF filtering status of a specific port(s) or all the ports in a
given bridge domain.
the split-horizon filtering status of a specific port(s) or all the
ports in a given bridge domain.
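One possible shape for such a report, covering either selected ports or every port in a bridge domain, is sketched below. The record fields and the function `filtering_report` are hypothetical and only illustrate the "specific port(s) or all the ports" selection.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PortFilterStatus:
    port: str
    df_blocked: bool             # True if DF filtering blocks the probe on this port
    split_horizon_blocked: bool  # True if split-horizon filtering blocks the probe


def filtering_report(bridge_domain: dict[str, PortFilterStatus],
                     ports: Optional[list[str]] = None) -> list[PortFilterStatus]:
    """Report filtering status for the given ports, or for every port in
    the bridge domain (in name order) when `ports` is None."""
    selected = sorted(bridge_domain) if ports is None else ports
    return [bridge_domain[p] for p in selected]
```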
Fault Isolation
EVPN OAM MUST support an on-demand fault localization function. This
involves the capability to narrow down the locality of a fault to a
particular port, link, or node. The characteristic of forward/reverse path
asymmetry in MPLS/IP makes fault isolation a direction-sensitive
operation. That is, given two PEs A and B, localization of continuity
failures between them requires running fault-isolation procedures from PE A
to PE B as well as from PE B to PE A.
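The direction sensitivity described above can be sketched as running the same hop-by-hop check once per direction. The paths and the `responds` callbacks are hypothetical inputs; in practice they would come from a traceroute-like OAM procedure initiated at each PE.

```python
from typing import Optional


def first_failing_hop(path: list[str], responds) -> Optional[str]:
    """Walk `path` hop by hop and return the first hop for which
    responds(hop) is False, or None if every hop answers."""
    for hop in path:
        if not responds(hop):
            return hop
    return None


def isolate_fault(path_a_to_b: list[str], path_b_to_a: list[str],
                  responds_from_a, responds_from_b) -> dict[str, Optional[str]]:
    """Run the localization once per direction, because the forward and
    reverse paths between two PEs may traverse different nodes."""
    return {
        "A->B": first_failing_hop(path_a_to_b, responds_from_a),
        "B->A": first_failing_hop(path_b_to_a, responds_from_b),
    }
```

A result such as `{"A->B": "P2", "B->A": None}` shows why a check from one side alone is not sufficient: the reverse path is healthy even though the forward path is broken at P2.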
EVPN Service OAM mechanisms only have visibility to the PEs but not
the MPLS or IP P nodes. As such, they can be used to deduce whether
the fault is in the customer's own network, the local CE-PE segment,
or a remote CE-PE segment(s). EVPN Network and Transport OAM mechanisms
can be used for fault isolation between the PEs and P nodes.
Performance Management
Performance management functions can be performed both proactively
and on demand. Proactive management involves a recurring function,
where the performance management probes are run continuously without
a trigger. We cover both proactive and on-demand functions in this
section.
Packet Loss
EVPN Network OAM SHOULD provide mechanisms for measuring packet loss
for a given service.
Given that EVPN provides inherent support for multipoint-to-multipoint
connectivity, packet loss cannot be accurately measured by means of
counting user data packets. This is because user packets can be delivered
to more PEs or more ports than are necessary (e.g., due to broadcast,
unpruned multicast, or unknown unicast flooding). As such, a statistical
means of approximating the packet loss rate is required. This can be achieved
by sending "synthetic" OAM packets that are counted only by those ports
(MEPs) that are required to receive them. This provides a statistical
approximation of the number of data frames lost, even with
multipoint-to-multipoint connectivity.
Packet Delay and Jitter
EVPN Service OAM SHOULD support measurement of one-way and two-way
packet delay and delay variation (jitter) across the EVPN network.
Measurement of one-way delay requires clock synchronization between
the probe source and target devices. Mechanisms for clock
synchronization are outside the scope of this document. Note that
Service OAM performance management mechanisms defined in [Y.1731] can
be used.
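The role of clock synchronization can be seen in the arithmetic itself. The sketch below uses the common four-timestamp exchange (an assumption of this example, not a procedure defined in this document) and treats delay variation as the difference between consecutive delay samples.

```python
def one_way_delay(tx_time: float, rx_time: float) -> float:
    """One-way delay. tx_time and rx_time come from different clocks, so
    the result is only meaningful if those clocks are synchronized."""
    return rx_time - tx_time


def two_way_delay(t1: float, t2: float, t3: float, t4: float) -> float:
    """Two-way delay from four timestamps: t1 = request sent, t2 = request
    received, t3 = reply sent, t4 = reply received. t1 and t4 are read
    from the initiator's clock and t2 and t3 from the responder's clock,
    so any clock offset between the two devices cancels out."""
    return (t4 - t1) - (t3 - t2)


def delay_variation(delays: list[float]) -> list[float]:
    """Delay variation (jitter) as the difference between consecutive delays."""
    return [b - a for a, b in zip(delays, delays[1:])]
```

In `two_way_delay(0, 105, 107, 12)` the responder's clock is far ahead of the initiator's, yet the result is still 10 time units, because the responder's processing time (107 - 105) is subtracted using its own clock.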
EVPN Network OAM MAY support measurement of one-way and two-way
packet delay and delay variation (jitter) across the EVPN network.
Security Considerations
EVPN OAM MUST prevent OAM packets from leaking outside of the EVPN
network or outside their corresponding Maintenance Domain. This can
be done for CFM, for example, by having MEPs implement a filtering
function based on the Maintenance Level associated with received OAM
packets.
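A minimal sketch of such a level-based filter follows. It assumes the usual CFM handling of Maintenance Domain levels 0 through 7, where a MEP drops lower-level frames, processes frames at its own level, and passes higher-level frames through; that behavior and the function name `mep_action` are assumptions of this example.

```python
def mep_action(frame_md_level: int, mep_md_level: int) -> str:
    """Decide what a MEP does with a received CFM frame, based on MD level.

    Assumed convention: lower-level frames are dropped (they must not
    leak out of their domain), same-level frames are processed, and
    higher-level frames are forwarded transparently.
    """
    if not (0 <= frame_md_level <= 7 and 0 <= mep_md_level <= 7):
        raise ValueError("MD levels range from 0 to 7")
    if frame_md_level < mep_md_level:
        return "drop"
    if frame_md_level == mep_md_level:
        return "process"
    return "forward"
```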
EVPN OAM SHOULD provide mechanisms for implementation and optional
use to:
prevent denial-of-service attacks caused by exploitation of the OAM
message channel (for example, by forging messages to exceed a
Maintenance End Point's capacity to maintain state).
authenticate communicating end points (for example, MEPs and MIPs).
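One common way to bound the state and processing that forged OAM messages can consume is to rate-limit them before they reach the MEP. The token bucket below is a generic sketch of that idea, not a mechanism specified by this document; the class name and parameters are hypothetical.

```python
class OamRateLimiter:
    """Token bucket limiting how many OAM messages a MEP accepts."""

    def __init__(self, rate_per_s: float, burst: int, now: float = 0.0):
        self.rate = rate_per_s      # tokens added per second
        self.burst = burst          # maximum number of stored tokens
        self.tokens = float(burst)  # start with a full bucket
        self.last = now

    def allow(self, now: float) -> bool:
        """Return True if a message arriving at time `now` may be processed."""
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False
```

Rate limiting addresses only the first item above; authenticating end points requires a separate mechanism.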
IANA Considerations
This document has no IANA actions.
References
[802.1Q] IEEE, "IEEE Standard for Local and metropolitan area networks--Bridges and Bridged Networks".
[802.3] IEEE, "IEEE Standard for Ethernet".
[Y.1731] ITU-T, "Operation, administration and maintenance (OAM) functions and mechanisms for Ethernet-based networks".
Acknowledgements
The authors would like to thank the following for their review of
this work and their valuable comments:
, , , , , , , , , , , and .