EVPN – All-active multihoming

So this is the fourth blog on EVPN; the previous posts covered the following topics:

  • EVPN basics, route-types and basic L2 forwarding
  • EVPN IRB and Inter-VLAN routing
  • EVPN single-active multi-homing

This post covers EVPN's ability to provide all-active multi-homing for layer-2 traffic, where the topology contains two active PE routers connecting to a switch via a LAG. The setup is similar to the previous labs. Due to some restrictions, and in the interests of simplicity, this lab covers all-active multi-homing for a single VLAN only (VLAN 100 in this case). Consider the network topology:

Capture5

The topology and general connectivity are the same as in the previous examples; the two big differences are that only VLAN 100 is present here, and that MX-1 and MX-2 now connect down to the switch using MC-LAG.

The first consideration when running EVPN in all-active mode is that the multihomed access device must connect to the PEs using some sort of LAG or MC-LAG – consider the wording from RFC 7432:


https://tools.ietf.org/html/rfc7432#section-14.1.2

“If a bridged network is multihomed to more than one PE in an EVPN network via switches, then the support of All-Active redundancy mode requires the bridged network to be connected to two or more PEs using a LAG.”

Essentially, this boils down to some basic facts about how switches work – you can't have two different PE routers with active access interfaces using the same MAC address, spanning two different control-planes, for the simple reason that you'll create a duplicate MAC address in the layer-2 network, which causes a nightmare.

Consider the below scenario:

Capture6

I tried this in a lab before I read the RFC and discovered that EX4200-1 floods egress traffic to MX-1 and MX-2, resulting in lots of traffic duplication and flooding, simply because every time a frame with MAC address "X" lands on ge-0/0/0 or ge-0/0/1 from MX-1 or MX-2, the switch has to update its CAM table. Essentially the whole thing is broken – which explains the wording of the RFC in relation to all-active mode.
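
To picture why it breaks, here's a rough Python sketch (purely illustrative – the MAC address and frame sequence are made up) of what EX4200-1's CAM table goes through when the same source MAC keeps arriving on both member ports:

# Toy model of a switch CAM (MAC learning) table. Every frame re-learns the
# source MAC against its ingress port, so a MAC that keeps appearing on two
# different ports flaps back and forth and traffic gets flooded/duplicated.

cam_table = {}          # mac -> port
flap_count = 0

# Hypothetical frames: the same host MAC arrives via MX-1 (ge-0/0/0)
# and via MX-2 (ge-0/0/1) because both PEs are active for the same VLAN.
frames = [
    ("00:00:5e:00:53:01", "ge-0/0/0"),
    ("00:00:5e:00:53:01", "ge-0/0/1"),
    ("00:00:5e:00:53:01", "ge-0/0/0"),
    ("00:00:5e:00:53:01", "ge-0/0/1"),
]

for mac, port in frames:
    previous = cam_table.get(mac)
    if previous is not None and previous != port:
        flap_count += 1
        print(f"MAC {mac} moved {previous} -> {port} (flap #{flap_count})")
    cam_table[mac] = port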

With Juniper, the way around this problem is simply to convert the Ethernet interfaces connecting to EX4200-1 to a basic MC-LAG configuration. We don't need to configure ICCP or any serious multi-chassis configuration – we just need to make sure the LACP system-id is identical on MX-1 and MX-2, so that the EX4200 thinks it's connected to a single downstream device.
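
As a rough illustration of why the shared system-id matters – this is a sketch of the check from the switch's point of view, not how the EX4200 actually implements LACP – the switch will only bundle member links whose partner advertises the same system-id:

# Minimal sketch of the aggregation check from the switch's point of view:
# member links can only join the same LAG if every partner advertises an
# identical LACP system-id (the real protocol also checks keys, speeds, etc.).
# The values below mirror the lab: both MX routers spoof 00:00:00:00:00:01.

from dataclasses import dataclass

@dataclass
class LacpPartner:
    link: str
    system_id: str

partners = [
    LacpPartner("ge-0/0/0", "00:00:00:00:00:01"),   # towards MX-1
    LacpPartner("ge-0/0/1", "00:00:00:00:00:01"),   # towards MX-2
]

def can_bundle(partners):
    """True if all member links appear to terminate on a single LACP system."""
    return len({p.system_id for p in partners}) == 1

print("ae0 forms with both members:", can_bundle(partners))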

Let's check the LAG configuration on MX-1 and MX-2:

MX-1

tim@MX5-1> show configuration interfaces ae0
description "MCLAG to EX4500-1";
flexible-vlan-tagging;
encapsulation flexible-ethernet-services;
esi {
    00:11:22:33:44:55:66:77:88:99;
    all-active;
}
aggregated-ether-options {
    lacp {
        system-id 00:00:00:00:00:01;
    }
}
unit 100 {
    encapsulation vlan-bridge;
    vlan-id 100;
    family bridge;
}

 

MX-2

tim@MX5-2> show configuration interfaces ae0
description "MCLAG to EX4500-1";
flexible-vlan-tagging;
encapsulation flexible-ethernet-services;
esi {
    00:11:22:33:44:55:66:77:88:99;
    all-active;
}
aggregated-ether-options {
    lacp {
        system-id 00:00:00:00:00:01;
    }
}
unit 100 {
    encapsulation vlan-bridge;
    vlan-id 100;
    family bridge;
}

 

And finally on EX4200-1 we have a basic standard LAG configuration, with nothing fancy or sexy going on 🙂

EX4200-1

 

imtech@ex4200-1> show configuration interfaces ae0
aggregated-ether-options {
    lacp {
        active;
    }
}
unit 0 {
    family ethernet-switching {
        port-mode trunk;
        vlan {
            members vlan-100;
        }
    }
}

{master:0}
imtech@ex4200-1>

 

 

From the perspective of the EX4200 it's just a totally standard LAG with two interfaces running LACP; as long as we have EVPN all-active configured correctly on MX-1 and MX-2, everything else is taken care of.

EX4200-1 verification:

imtech@ex4200-1> show lacp interfaces
Aggregated interface: ae0
    LACP state:       Role   Exp   Def  Dist  Col  Syn  Aggr  Timeout  Activity
      ge-0/0/0       Actor    No    No   Yes  Yes  Yes   Yes     Fast    Active
      ge-0/0/0     Partner    No    No   Yes  Yes  Yes   Yes     Fast   Passive
      ge-0/0/1       Actor    No    No   Yes  Yes  Yes   Yes     Fast    Active
      ge-0/0/1     Partner    No    No   Yes  Yes  Yes   Yes     Fast   Passive
    LACP protocol:        Receive State  Transmit State          Mux State
      ge-0/0/0                  Current   Fast periodic Collecting distributing
      ge-0/0/1                  Current   Fast periodic Collecting distributing

{master:0}
imtech@ex4200-1>

 

Aside from the fact that we've converted the access Ethernet interfaces to MC-LAG on MX-1 and MX-2, let's see what's changed in the EVPN configuration in order to get all-active EVPN working. First, MX-1:

tim@MX5-1> show configuration routing-instances
EVPN-100 {
    instance-type virtual-switch;
    route-distinguisher 1.1.1.1:100;
    vrf-target target:100:100;
    protocols {
        evpn {
            extended-vlan-list 100;
            default-gateway do-not-advertise;
        }
    }
    bridge-domains {
        VL-100 {
            vlan-id 100;
            interface ae0.100;
            routing-interface irb.100;
        }
    }
}
VPN-100 {
    instance-type vrf;
    interface irb.100;
    route-distinguisher 100.100.100.1:100;
    vrf-target target:1:100;
    vrf-table-label;
}

tim@MX5-1>

 

The configuration is absolutely identical on MX-2. You'll notice that the only thing that has changed on MX-1 is that the physical interface ge-1/1/5 has been replaced by the new LAG interface ae0.100 for VLAN 100; everything else is exactly the same as the previous single-active example from last week. Let's take a closer look at the interface on MX-1:

tim@MX5-1> show configuration interfaces ae0
description "MCLAG to EX4500-1";
flexible-vlan-tagging;
encapsulation flexible-ethernet-services;
esi {
    00:11:22:33:44:55:66:77:88:99;
    all-active;
}
aggregated-ether-options {
    lacp {
        system-id 00:00:00:00:00:01;
    }
}
unit 100 {
    encapsulation vlan-bridge;
    vlan-id 100;
    family bridge;
}

 

It's clear to see that under the interface ESI configuration we've changed the ESI mode from single-active to "all-active", which should be self-explanatory to most readers 🙂 and again, note that this configuration is 100% identical on both MX-1 and MX-2.

Let's check the EVPN instance and see what's changed since the single-active example:

tim@MX5-1> show evpn instance extensive
Instance: EVPN-100
  Route Distinguisher: 1.1.1.1:100
  Per-instance MAC route label: 299776
  MAC database status                Local  Remote
    Total MAC addresses:                13      96
    Default gateway MAC addresses:       1       0
  Number of local interfaces: 1 (1 up)
    Interface name  ESI                            Mode             Status
    ae0.100         00:11:22:33:44:55:66:77:88:99  all-active       Up
  Number of IRB interfaces: 1 (1 up)
    Interface name  VLAN ID  Status  L3 context
    irb.100         100      Up      VPN-100
  Number of bridge domains: 1
    VLAN ID  Intfs / up    Mode             MAC sync  IM route label
    100          1   1     Extended         Enabled   300432
  Number of neighbors: 2
    10.10.10.2
      Received routes
        MAC address advertisement:             49
        MAC+IP address advertisement:           0
        Inclusive multicast:                    1
        Ethernet auto-discovery:                2
    10.10.10.3
      Received routes
        MAC address advertisement:             60
        MAC+IP address advertisement:           0
        Inclusive multicast:                    1
        Ethernet auto-discovery:                0
  Number of ethernet segments: 1
    ESI: 00:11:22:33:44:55:66:77:88:99
      Status: Resolved by IFL ae0.100
      Local interface: ae0.100, Status: Up/Forwarding
      Number of remote PEs connected: 1
        Remote PE        MAC label  Aliasing label  Mode
        10.10.10.2       300416     300416          all-active
      Designated forwarder: 10.10.10.1
      Backup forwarder: 10.10.10.2
      Advertised MAC label: 300400
      Advertised aliasing label: 300400
      Advertised split horizon label: 300416
Instance: __default_evpn__
  Route Distinguisher: 10.10.10.1:0
  Number of bridge domains: 0
  Number of neighbors: 1
    10.10.10.2
      Received routes
        Ethernet Segment:                       1
tim@MX5-1>

 

So we can see that MX-1 has changed from single-active to all-active, and its local interface is in the Up/Forwarding state.

Let's check MX-2 to see what it looks like:

tim@MX5-2> show evpn instance extensive
Instance: EVPN-100
  Route Distinguisher: 1.1.1.2:100
  Per-instance MAC route label: 299776
  MAC database status                Local  Remote
    Total MAC addresses:                47      64
    Default gateway MAC addresses:       1       0
  Number of local interfaces: 1 (1 up)
    Interface name  ESI                            Mode             Status
    ae0.100         00:11:22:33:44:55:66:77:88:99  all-active       Up
  Number of IRB interfaces: 1 (1 up)
    Interface name  VLAN ID  Status  L3 context
    irb.100         100      Up      VPN-100
  Number of bridge domains: 1
    VLAN ID  Intfs / up    Mode             MAC sync  IM route label
    100          1   1     Extended         Enabled   300528
  Number of neighbors: 2
    10.10.10.1
      Received routes
        MAC address advertisement:             14
        MAC+IP address advertisement:           1
        Inclusive multicast:                    1
        Ethernet auto-discovery:                2
    10.10.10.3
      Received routes
        MAC address advertisement:             60
        MAC+IP address advertisement:           0
        Inclusive multicast:                    1
        Ethernet auto-discovery:                0
  Number of ethernet segments: 1
    ESI: 00:11:22:33:44:55:66:77:88:99
      Status: Resolved by IFL ae0.100
      Local interface: ae0.100, Status: Up/Forwarding
      Number of remote PEs connected: 1
        Remote PE        MAC label  Aliasing label  Mode
        10.10.10.1       300400     300400          all-active
      Designated forwarder: 10.10.10.1
      Backup forwarder: 10.10.10.2
      Advertised MAC label: 300416
      Advertised aliasing label: 300416
      Advertised split horizon label: 300432
Instance: __default_evpn__
  Route Distinguisher: 10.10.10.2:0
  Number of bridge domains: 0
  Number of neighbors: 1
    10.10.10.1
      Received routes
        Ethernet Segment:                       1
tim@MX5-2>

 

Excellent! Both MX-1 and MX-2 are in the Up/Forwarding state for VLAN 100, meaning that in theory they can both send and receive traffic on their access LAG interface and on the MPLS side – you'll also notice how simple it is to get working.

I currently have 50x IXIA hosts sat behind MX-1 and MX-2, and a further 50x hosts sat behind MX-3, with 50Mbps of traffic being sent bidirectionally between the IXIA hosts. Let's recap the diagram:

Capture7

With an all-active configuration, traffic from the hosts at the top of the network should be sent towards MX-1 and MX-2 by EX4200-1 according to its standard LAG hashing algorithm (source/destination MAC). Because I have 100 hosts in total, there should be enough granularity at layer-2 to roughly distribute some traffic onto MX-1 and some onto MX-2.
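
To get a feel for how much spread 100 hosts give you, here's a quick sketch of a source/destination-MAC hash splitting flows across the two LAG members – this is not the EX4200's actual hashing algorithm, just an illustrative stand-in with made-up MAC ranges:

# Illustrative src/dst-MAC hash over a two-member LAG. A CRC over the MAC
# pair stands in for the switch's real hash; with ~50 flows the split is
# reasonably even, but never perfectly 50/50.

import zlib
from collections import Counter

members = ["ge-0/0/0 (to MX-1)", "ge-0/0/1 (to MX-2)"]

def pick_member(src_mac, dst_mac):
    key = (src_mac + dst_mac).encode()
    return members[zlib.crc32(key) % len(members)]

# 50 hypothetical hosts on the LAN side talking to 50 hosts behind MX-3.
local_hosts  = [f"00:00:66:cf:82:{i:02x}" for i in range(50)]
remote_hosts = [f"00:00:2e:18:6d:{i:02x}" for i in range(50)]

counts = Counter(pick_member(s, d) for s, d in zip(local_hosts, remote_hosts))
for member, flows in counts.items():
    print(f"{member}: {flows} flows")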

Let's send the IXIA traffic:

IXIA

Now let's look at the physical access interfaces on MX-1 and MX-2 to see how the traffic is being handled:

MX-1


tim@MX5-1> show configuration interfaces ge-1/1/5 
gigether-options {
 802.3ad ae0;
}

tim@MX5-1> show interfaces ae0 | match pps 
 Input rate : 5404040 bps (484 pps)
 Output rate : 10384856 bps (929 pps)

So roughly 5Mbps in and 10Mbps out on MX-1.

Let's check MX-2:


tim@MX5-2> show configuration interfaces ge-1/0/5 
gigether-options {
 802.3ad ae0;
}

tim@MX5-2> show interfaces ae0 | match pps 
 Input rate : 19535296 bps (1750 pps)
 Output rate : 14546816 bps (1302 pps)

So it seems to be working – MX-1 and MX-2 are both sending and receiving traffic in the same layer-2 broadcast domain.

Let's check their MPLS-facing interfaces:

MX-1


tim@MX5-1> show isis adjacency 
Interface System L State Hold (secs) SNPA
ge-1/1/0.0 m10i-1 2 Up 19

tim@MX5-1> show interfaces ge-1/1/0 | match pps 
 Input rate : 10415216 bps (930 pps)
 Output rate : 5404040 bps (484 pps)

tim@MX5-1>

MX-2


tim@MX5-2> show isis adjacency 
Interface System L State Hold (secs) SNPA
ge-1/1/0.0 m10i-2 2 Up 24

tim@MX5-2> show interfaces ge-1/1/0 | match pps 
 Input rate : 14583752 bps (1303 pps)
 Output rate : 19535576 bps (1751 pps)

tim@MX5-2>

 

And so all seems right with the world – traffic from the MPLS network is being sent from MX-3 to both MX-1 and MX-2. Let's look at the EVPN BGP control-plane on MX-3 to see what's going on with all-active; for brevity we'll take a look at just a slice of the BGP table:

 

2:1.1.1.1:100::100::00:00:66:cf:82:df/304
                   *[BGP/170] 01:28:27, localpref 100, from 10.10.10.1
                      AS path: I, validation-state: unverified
                    > to 192.169.100.15 via ge-1/1/0.0, Push 300944
2:1.1.1.1:100::100::00:00:66:cf:82:e1/304
                   *[BGP/170] 01:28:27, localpref 100, from 10.10.10.1
                      AS path: I, validation-state: unverified
                    > to 192.169.100.15 via ge-1/1/0.0, Push 300944
2:1.1.1.1:100::100::00:00:66:cf:82:e3/304
                   *[BGP/170] 01:28:27, localpref 100, from 10.10.10.1
                      AS path: I, validation-state: unverified
                    > to 192.169.100.15 via ge-1/1/0.0, Push 300944
2:1.1.1.1:100::100::00:00:66:d0:5d:f3/304
                   *[BGP/170] 01:28:27, localpref 100, from 10.10.10.1
                      AS path: I, validation-state: unverified
                    > to 192.169.100.15 via ge-1/1/0.0, Push 300944
2:1.1.1.2:100::100::00:00:2e:18:6d:e1/304
                   *[BGP/170] 01:28:27, localpref 100, from 10.10.10.2
                      AS path: I, validation-state: unverified
                    > to 192.169.100.15 via ge-1/1/0.0, Push 300960
2:1.1.1.2:100::100::00:00:2e:18:f3:c4/304
                   *[BGP/170] 01:28:27, localpref 100, from 10.10.10.2
                      AS path: I, validation-state: unverified
                    > to 192.169.100.15 via ge-1/1/0.0, Push 300960
2:1.1.1.2:100::100::00:00:66:cf:82:d1/304
                   *[BGP/170] 01:28:27, localpref 100, from 10.10.10.2
                      AS path: I, validation-state: unverified
                    > to 192.169.100.15 via ge-1/1/0.0, Push 300960
2:1.1.1.2:100::100::00:00:66:cf:82:d3/304
                   *[BGP/170] 01:28:27, localpref 100, from 10.10.10.2
                      AS path: I, validation-state: unverified
                    > to 192.169.100.15 via ge-1/1/0.0, Push 300960

 

 

You'll notice that in MX-3's BGP EVPN table it's receiving those good old type-2 MAC routes, with some learnt from MX-1 and some from MX-2 – which is exactly what we want, and exactly what MX-3 needs in order for egress traffic to be sent towards both MX-1 and MX-2 in the all-active fashion we desire.

Remember that because EVPN maintains a MAC-learning-based layer-2 control plane, whether MX-3 sends traffic for a given host towards MX-1 or MX-2 depends on which PE learnt that host's MAC address – which in turn depends on how EX4200-1 hashed the egress traffic in the first place. See the diagram below for an attempt at a better explanation:

Capture8

 

But what happens if the EX4200 switch has a really rubbish hashing algorithm, or there's no granularity, to the point where nearly all of the traffic arrives via MX-1 and hardly any via MX-2? You'd end up with traffic polarisation and really bad load-balancing. EVPN solves this problem with an aliasing label.

MX-3, for example, has a full table of EVPN MAC routes, so it can load-balance traffic on a per-flow basis back to MX-1 and MX-2 by making use of the aliasing label. The MAC routes for the IXIA hosts at the top of the network are all advertised with an ESI of 00:11:22:33:44:55:66:77:88:99, which tells MX-3 they all sit behind the same Ethernet segment – so even if a particular MAC was only advertised by one PE, MX-3 can use the aliasing label advertised by the other PE for that ESI and send some of the traffic there anyway.

If there's a failure on either MX-1 or MX-2, that PE's aliasing routes are withdrawn and you're left with MAC routes pointing at the remaining PE only – preventing the black-holing of traffic.
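
Here's a rough sketch of the decision MX-3 makes (heavily simplified – real Junos resolves this in the forwarding table): a MAC route carries the ESI it was learned against, and any other PE that has advertised an Ethernet A-D per-EVI (aliasing) route for that ESI is added as an extra next hop, so flows can be balanced across both PEs even when only one of them advertised the MAC. Withdraw the A-D routes and that PE simply drops out of the set:

# Simplified view of EVPN aliasing from the remote PE's (MX-3's) perspective.
# mac_routes: MAC -> (advertising PE, ESI), learned via type-2 routes.
# aliasing:   ESI -> set of PEs that advertised an Ethernet A-D per-EVI route.

ESI = "00:11:22:33:44:55:66:77:88:99"

mac_routes = {
    "00:00:66:cf:82:df": ("10.10.10.1", ESI),   # MAC only advertised by MX-1
}

aliasing = {
    ESI: {"10.10.10.1", "10.10.10.2"},          # both PEs advertised A-D routes
}

def next_hops(mac):
    pe, esi = mac_routes[mac]
    # Start with the PE that advertised the MAC, then add any PE that
    # advertised an aliasing route for the same Ethernet segment.
    return {pe} | aliasing.get(esi, set())

print(next_hops("00:00:66:cf:82:df"))   # -> both 10.10.10.1 and 10.10.10.2

# On a failure, MX-2 withdraws its Ethernet A-D routes (mass withdrawal):
aliasing[ESI].discard("10.10.10.2")
print(next_hops("00:00:66:cf:82:df"))   # -> only 10.10.10.1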

 

The last thing to consider is the concept of the "designated forwarder". Let's re-check the EVPN instance output from earlier on:

tim@MX5-1> show evpn instance extensive
Instance: EVPN-100
  Route Distinguisher: 1.1.1.1:100
  Per-instance MAC route label: 299776
  MAC database status                Local  Remote
    Total MAC addresses:                13      96
    Default gateway MAC addresses:       1       0
  Number of local interfaces: 1 (1 up)
    Interface name  ESI                            Mode             Status
    ae0.100         00:11:22:33:44:55:66:77:88:99  all-active       Up
  Number of IRB interfaces: 1 (1 up)
    Interface name  VLAN ID  Status  L3 context
    irb.100         100      Up      VPN-100
  Number of bridge domains: 1
    VLAN ID  Intfs / up    Mode             MAC sync  IM route label
    100          1   1     Extended         Enabled   300432
  Number of neighbors: 2
    10.10.10.2
      Received routes
        MAC address advertisement:             49
        MAC+IP address advertisement:           0
        Inclusive multicast:                    1
        Ethernet auto-discovery:                2
    10.10.10.3
      Received routes
        MAC address advertisement:             60
        MAC+IP address advertisement:           0
        Inclusive multicast:                    1
        Ethernet auto-discovery:                0
  Number of ethernet segments: 1
    ESI: 00:11:22:33:44:55:66:77:88:99
      Status: Resolved by IFL ae0.100
      Local interface: ae0.100, Status: Up/Forwarding
      Number of remote PEs connected: 1
        Remote PE        MAC label  Aliasing label  Mode
        10.10.10.2       300416     300416          all-active
      Designated forwarder: 10.10.10.1
      Backup forwarder: 10.10.10.2
      Advertised MAC label: 300400
      Advertised aliasing label: 300400
      Advertised split horizon label: 300416
Instance: __default_evpn__
  Route Distinguisher: 10.10.10.1:0
  Number of bridge domains: 0
  Number of neighbors: 1
    10.10.10.2
      Received routes
        Ethernet Segment:                       1
tim@MX5-1>

 

When running in all-active mode it's obvious that both PE routers are forwarding traffic, but it's important to know that the PEs can only forward known unicast traffic in an all-active fashion. When two PE routers discover, via the BGP Ethernet Segment routes exchanged across the MPLS network, that they're attached to the same Ethernet segment, they elect a "designated forwarder".

The primary role of the designated forwarder is to forward BUM (broadcast, unknown unicast and multicast) traffic onto the segment. It would be highly undesirable for both PEs to forward broadcasts, so only one of them is responsible for this, in order to prevent traffic duplication.
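
The default election procedure in RFC 7432 (section 8.5) is just a modulo over the ordered list of PE addresses attached to the segment; a small sketch of it reproduces the result shown in the output above, with 10.10.10.1 winning for VLAN 100:

# Default DF election per RFC 7432 section 8.5: order the PEs attached to the
# Ethernet segment by IP address, then pick ordinal (vlan_id mod N) as the DF
# for that VLAN. The non-DF PEs drop BUM traffic towards the segment.

import ipaddress

def elect_df(pe_addresses, vlan_id):
    ordered = sorted(pe_addresses, key=ipaddress.ip_address)
    return ordered[vlan_id % len(ordered)]

pes = ["10.10.10.1", "10.10.10.2"]
print("DF for VLAN 100:", elect_df(pes, 100))   # 100 % 2 = 0 -> 10.10.10.1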

Anyways, that’s about all I have time for tonight – I hope you found this useful!

EVPN Inter-VLAN routing + mobility

So in the last blog I looked at one of the most basic aspects of EVPN – a multi-site layer-2 network with nothing fancy going on, with traffic forwarded between multiple sites in the same VLAN. The fact of the matter is that there was nothing there you couldn't do with a traditional VPLS configuration; the general idea was to demonstrate the basics and take a look at the basic control-plane first.

In this update we'll be looking at some of the more exclusive and highly useful aspects of EVPN, which make it a very attractive technology for things such as data-centre interconnect – there are a few things that are possible with EVPN which simply cannot be done with VPLS.

Consider the revised topology:

Capture

It's the same topology from the first blog post; I've simply added an additional VLAN (VLAN 101) to ge-0/0/22 of each EX4200 LAN switch, plus an additional IXIA host.

For this post we're going to look at a rather cool way of performing inter-VLAN forwarding between hosts in VLAN 100 and VLAN 101. Not that I want to spend time teaching people how to suck eggs, but generally in a simple network with multiple VLANs you have two common ways of performing inter-VLAN forwarding:

  • Use a good ole’ fashioned router on a stick topology
  • Bolt some additional layer-3 functionality onto your layer-2 switch

As everyone knows, the latter method is by far the most common – the vast majority of switches support layer-3 routing functionality, usually in the form of IRB/BVI/SVI depending on the vendor in question.

In a service provider network, where we generally have a number of PE routers acting together as a large distributed switch providing layer-2 connectivity, the old-fashioned way of doing this would be with VPLS. To enable inter-VLAN forwarding we'd add a BVI interface to the VPLS instance, which lets a PE do standard layer-2 switching and also route between VLANs at layer-3 – very important for data-centre interconnect applications.

EVPN has a number of enhancements which make it more suitable for modern day data-centre interconnect designs, especially where things such as VM mobility are concerned. A company or organisation with a traditional MPLS based network, might require the ability to move hosts around between data centres seamlessly, without causing any real downtime.

Let's take a look at the basic interface configuration and routing-instance configuration:

  1. interfaces {
  2.     irb {
  3.         unit 100 {
  4.             family inet {
  5.                 address 192.168.100.1/24;
  6.             }
  7.             mac 00:00:19:21:68:10;
  8.         }
  9.         unit 101 {
  10.             family inet {
  11.                 address 192.168.101.1/24;
  12.             }
  13.             mac 00:00:19:21:68:11;
  14.         }
  15.     }
  16. routing-instances {
  17. EVPN-100 {
  18.     instance-type virtual-switch;
  19.     route-distinguisher 1.1.1.1:100;
  20.     vrf-target target:100:100;
  21.     protocols {
  22.         evpn {
  23.             extended-vlan-list 100-101;
  24.             default-gateway do-not-advertise;
  25.         }
  26.     }
  27.     bridge-domains {
  28.         VL-100 {
  29.             vlan-id 100;
  30.             interface ge-1/1/5.100;
  31.             routing-interface irb.100;
  32.         }
  33.         VL-101 {
  34.             vlan-id 101;
  35.             interface ge-1/1/5.101;
  36.             routing-interface irb.101;
  37.         }
  38.     }
  39. }
  40. VPN-100 {
  41.     instance-type vrf;
  42.     interface irb.100;
  43.     interface irb.101;
  44.     route-distinguisher 100.100.100.1:100;
  45.     vrf-target target:1:100;
  46.     vrf-table-label;
  47. }

 

First things first – lines 1 – 15 take care of the IRB interfaces for VLAN 100 and VLAN 101; more on that shortly.

Lines 16 – 39 form the configuration for the EVPN routing instance; you'll note a couple of differences from the first EVPN blog post:

  • The extended-vlan-list has been increased to include both VLANs within the routing instance
  • A new command “default-gateway do-not-advertise” is present under the EVPN protocol configuration
  • An additional bridge-domain has been configured for VLAN 101 under the routing-instance, along with an IRB routing-interface for each VLAN
  • What looks like a totally standard L3VPN has been configured, albeit with different RTs and RDs – and it contains the IRB interfaces from the EVPN routing instance.

The command "default-gateway do-not-advertise" controls the generation of the default-gateway extended community. If your PE routers have different IRB MAC and IPv4 addresses, each PE will generate a "default-gateway" route, which tells the other PEs in the EVPN that this MAC/IP is a default gateway somewhere. In this example, however – and as best practice – it's simpler and easier to configure the same IRB MAC/IP on all your PEs, so the command here is "do-not-advertise", as we don't need that behaviour.

But perhaps the coolest feature, and one of the biggest advantages EVPN has over VPLS, is the way the IRB interfaces are configured. In this topology the 3x PE routers (MX5-1, MX5-2 and MX5-3) all have an identical IRB interface configuration for VLAN 100 and VLAN 101 – each PE has the exact same IP address and MAC address:

MX5-1:

imtech@MX5-1# run show configuration interfaces irb
unit 100 {
    family inet {
        address 192.168.100.1/24;
    }
    mac 00:00:19:21:68:10;
}
unit 101 {
    family inet {
        address 192.168.101.1/24;
    }
    mac 00:00:19:21:68:11;
}

MX5-2

imtech@MX5-2# run show configuration interfaces irb
unit 100 {
    family inet {
        address 192.168.100.1/24;
    }
    mac 00:00:19:21:68:10;
}
unit 101 {
    family inet {
        address 192.168.101.1/24;
    }
    mac 00:00:19:21:68:11;
}

MX5-3

imtech@MX5-3# run show configuration interfaces irb
unit 100 {
    family inet {
        address 192.168.100.1/24;
    }
    mac 00:00:19:21:68:10;
}
unit 101 {
    family inet {
        address 192.168.101.1/24;
    }
    mac 00:00:19:21:68:11;
}

The first time you see it, you think: surely that can't work?

But it's true! All the PEs in the network have the exact same IP address and MAC address on their IRB interfaces. Why would we do that, and how does it work?

Consider the following scenario:

Capture2

Imagine a basic data-centre environment running something like VMware or OpenStack – we can provision servers and move them around all over the place using things like vMotion. Picture the active server in the left-hand portion of the data-centre with business as usual from a network perspective: ARP is resolved between the host and the left-hand PE, and the default gateway is 192.168.100.1.

Now imagine that the DC admin flicks the switch, and that active VM on the left is immediately torn down and spun up inside the right-hand DC (which could be many miles away). You'll notice that the gateway MAC address and IP address are exactly the same there. This gives us the ability to move hosts around our data centres without having to worry about different default gateways, and without incurring much downtime while we wait for things to re-ARP – because everything is identical at each DC site, there's no problem moving things around between one site and the next.

Capture3

You cannot do this with VPLS, as the implementation demands that you use unique MAC addresses – which moves us deeper into the technology: how does EVPN achieve this?

It essentially boils down to the way EVPN has been engineered to integrate more closely with the layer-3 world – the software has a number of hooks between EVPN and L3VPN that work in a much more elegant fashion than VPLS. The first blog post showed how MAC addresses are learnt and inserted into the BGP control-plane; in this inter-VLAN forwarding example, a few extra things happen (sketched in code after the list):

  • Firstly, we have the BGP MAC advertisement route from the L2 world
  • Secondly, we get a new MAC/IP advertisement containing the host's MAC and IP address – this is linked to the PE's ARP table
  • Thirdly, we get a totally standard /32 IPv4 L3VPN route for the host's address, advertised to all remote PEs
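
To make that concrete, here's a rough sketch of the three advertisements a PE originates once it learns a local host's MAC and ARP entry. The field names are simplified illustrations rather than real BGP NLRI encodings, and the MX-2 VRF route-distinguisher shown here is assumed (the other values mirror the lab output further down):

# Simplified sketch of the three routes a PE originates when a local host
# (MAC + IP learned via ARP) appears behind an EVPN IRB interface.

def advertise_host(evi_rd, vlan, host_mac, host_ip, vrf_rd):
    return [
        # 1. EVPN type-2 MAC route: plain layer-2 reachability.
        {"type": 2, "rd": evi_rd, "vlan": vlan, "mac": host_mac},
        # 2. EVPN type-2 MAC+IP route: carries the ARP binding so every
        #    other PE can pre-populate its own ARP table.
        {"type": 2, "rd": evi_rd, "vlan": vlan, "mac": host_mac, "ip": host_ip},
        # 3. Ordinary L3VPN /32 host route from the VRF holding the IRB,
        #    so plain L3VPN PEs can reach the host with no EVPN awareness.
        {"afi": "inet-vpn", "rd": vrf_rd, "prefix": f"{host_ip}/32"},
    ]

for route in advertise_host("1.1.1.2:100", 101,
                            "00:00:2e:e6:77:97", "192.168.101.11",
                            "100.100.100.2:100"):
    print(route)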

Let’s recap a more basic version of the lab diagram and see what the control-plane looks like when we send some traffic between hosts in different VLANs:

Capture4

Now let's look at the BGP control-plane on MX-1 and see what's going on:

  1. imtech@MX5-1> show route protocol bgp table EVPN-100.evpn.0
  2. EVPN-100.evpn.0: 8 destinations, 8 routes (8 active, 0 holddown, 0 hidden)
  3. + = Active Route, – = Last Active, * = Both
  4. 2:1.1.1.2:100::101::00:00:2e:e6:77:97/304
  5.                    *[BGP/170] 00:04:38, localpref 100, from 10.10.10.2
  6.                       AS path: I, validation-state: unverified
  7.                     > to 192.169.100.11 via ge-1/1/0.0, Push 299968
  8. 2:1.1.1.2:100::101::00:00:2e:e6:77:97::192.168.101.11/304  
  9.                    *[BGP/170] 00:04:38, localpref 100, from 10.10.10.2
  10.                       AS path: I, validation-state: unverified
  11.                     > to 192.169.100.11 via ge-1/1/0.0, Push 299968
  12. 3:1.1.1.2:100::100::10.10.10.2/304
  13.                    *[BGP/170] 00:04:38, localpref 100, from 10.10.10.2
  14.                       AS path: I, validation-state: unverified
  15.                     > to 192.169.100.11 via ge-1/1/0.0, Push 299968
  16. 3:1.1.1.2:100::101::10.10.10.2/304
  17.                    *[BGP/170] 00:04:38, localpref 100, from 10.10.10.2
  18.                       AS path: I, validation-state: unverified
  19.                     > to 192.169.100.11 via ge-1/1/0.0, Push 299968
  20. imtech@MX5-1> show route protocol bgp table VPN-100.inet.0
  21. VPN-100.inet.0: 6 destinations, 9 routes (6 active, 0 holddown, 0 hidden)
  22. + = Active Route, – = Last Active, * = Both
  23. 192.168.100.0/24    [BGP/170] 00:04:44, localpref 100, from 10.10.10.2
  24.                       AS path: I, validation-state: unverified
  25.                     > to 192.169.100.11 via ge-1/1/0.0, Push 16, Push 299968(top)
  26. 192.168.101.0/24    [BGP/170] 00:04:44, localpref 100, from 10.10.10.2
  27.                       AS path: I, validation-state: unverified
  28.                     > to 192.169.100.11 via ge-1/1/0.0, Push 16, Push 299968(top)
  29. 192.168.101.11/32   [BGP/170] 00:04:44, localpref 100, from 10.10.10.2
  30.                       AS path: I, validation-state: unverified
  31.                     > to 192.169.100.11 via ge-1/1/0.0, Push 16, Push 299968(top)

You'll immediately notice that, compared to the vanilla L2-only implementation, there's a lot more going on – let's break it down:

  • Line 4 is the standard MAC advertisement route – the same sort of advertisement we went over with the vanilla L2-only version of EVPN; this is for layer-2 connectivity only.
  • Line 8 is an EVPN MAC/IP route, which is basically the ARP mapping learnt directly from MX-2 – this route makes it possible for all PEs in the network to synchronise their ARP tables with each other!
  • Line 29 is a standard L3VPN route, containing the /32 of the host behind MX-2

The MAC/IP route (line 8) essentially means that as soon as you move a host from one place to another, the moment a packet lands on the ingress PE interface it generates a new MAC/IP route, and all the other PEs synchronise their ARP tables accordingly. Meanwhile the host that's moved doesn't need to do anything other than keep sending packets to the exact same gateway IP/MAC as it did before it was moved – essentially we have layer-2 and layer-3 working together in harmony.

Line 29 is a standard L3VPN /32 host route for the host behind MX-2. This means that if you have EVPN running across numerous data-centres connected to a wider layer-3 network – such as traditional residential/business PE routers – those other routers don't need any awareness of EVPN whatsoever; as long as they can participate in regular L3VPN, packets will always be delivered to the right place when things get moved around, because these host routes are dynamically generated and advertised accordingly. This is a massive advantage over VPLS: you don't need to configure EVPN in every corner of the network for it to be useful, it simply lives at your DC edge – the rest is left to vanilla L3VPN.

There are a few more enhancements due at some point soon, including quite an interesting one – the "MAC mobility extended community" – which is essentially a safeguard to prevent a few rather nasty situations from arising:

  • A layer-2 loop, where two PEs constantly re-advertise the same MAC addresses – which could overwhelm the BGP control-plane
  • A situation where a pair of hosts, each in a different DC, are mis-configured with the same MAC address – if they're both sending data, each PE will keep generating route advertisements

The MAC mobility extended community defined in RFC 7432 introduces a sequence number: if the same MAC is re-advertised more than a certain number of times within a specific period, it's assumed that something is broken, and the routers should perform some sort of damping and alerting procedure to prevent a network meltdown.
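
A sketch of the idea: each time a MAC legitimately moves, the new advertisement carries an incremented sequence number, and a PE that sees too many moves in a short window flags the MAC as a duplicate and damps it. The 5-moves-in-180-seconds values below follow the defaults suggested in the RFC; real implementations expose these as configurable thresholds:

# Sketch of MAC-mobility handling per RFC 7432 section 15: a moved MAC is
# re-advertised with an incremented sequence number, and a PE that sees too
# many moves within the window treats the MAC as a duplicate and damps it.

import time

MOVE_LIMIT, WINDOW = 5, 180.0    # N moves within M seconds
move_history = {}                # mac -> list of recent move timestamps
sequence = {}                    # mac -> current sequence number

def mac_moved(mac, now=None):
    now = time.time() if now is None else now
    sequence[mac] = sequence.get(mac, 0) + 1
    history = [t for t in move_history.get(mac, []) if now - t < WINDOW]
    history.append(now)
    move_history[mac] = history
    if len(history) >= MOVE_LIMIT:
        return f"{mac}: seq {sequence[mac]}, DUPLICATE detected - damp and alert"
    return f"{mac}: seq {sequence[mac]}, advertise with higher sequence number"

for _ in range(6):
    print(mac_moved("00:00:5e:00:53:01"))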

I hope you found this useful! In the next one I'll be looking at some of the redundancy designs, including single-active and all-active multi-homing.