Segment Routing + OpenDaylight SDN + Pathman-SR + PCEP


This is my second technical post on segment routing; I did a basic introduction to the technology on Juniper MX here:

https://tgregory.org/2016/08/13/segment-routing-on-junos-the-basics/

For this post I’m looking at something a bit more advanced and fun – performing Segment-routing traffic-engineering using an SDN controller, in this case OpenDaylight Beryllium – an open source SDN controller with some very powerful functionality.

This post will use Cisco ASR9kv virtual routers running on a Cisco UCS chassis, mostly because Cisco currently have the leading-edge support for segment routing – Juniper seem to be lagging behind a bit on that front!

Let's check out the topology:

odl1

It’s a pretty simple scenario – all of the routers in the topology are configured in the following way;

  • XRV-1 and XRV-8: PE routers (BGP IPv4)
  • XRV-2 to XRV-7: P routers (ISIS segment routing)
  • XRV-4 is an in-path RR connecting to the ODL controller

odl2

The first thing to look at here is BGP-LS ("BGP Link-State"), an extension of BGP that allows IGP information (OSPF/ISIS) to be injected into BGP. This falls conveniently into the world of centralised path computation, where a controller of some sort can look at the network's link-state information and compute a path through the network. The controller can then communicate that path back down to a device within the network using a different method, ultimately resulting in an action of some sort – for example, signalling an LSP.

Older platforms such as HP Route Analytics enabled you to discover the live IGP topology by running ISIS or OSPF directly with a network device; however, IGPs tend to be very intense protocols and require additional effort to support within an application, compared with a traditional router. IGPs are also usually limited to the domain within which they operate – for example, in a large network with many different IGP domains or inter-domain MPLS, the IGP's view becomes much more limited. BGP on the other hand can bridge many of these gaps and, when extended with the ability to carry IGP information, can be quite useful.

The next element is PCE, or Path Computation Element – an architecture which generally contains two core roles:

  • PCC – Path computation client – In the case of this lab network, a PCC would be a PE router
  • PCE – Path computation element – In the case of this lab network, the PCE would be the ODL controller

These elements communicate using PCEP (Path Computation Element Protocol), which allows a central controller (in this case ODL) to essentially program the PCC with a path – for example, by signalling the actual LSP.

Basic components;

yeee

Basic components plus an application (in this case Pathman-SR) which can compute and signal an LSP from ODL to the PCC (XRV-1);

pathman

In the above example, an open-source application (in this case Pathman-SR) uses the network topology information obtained via BGP-LS and PCE, stored inside ODL, to compute and signal a segment-routing LSP from XRV-1 to XRV-8, via XRV-3, XRV-5 and XRV-7.

Before we look at the routers, let's take a quick look at OpenDaylight; general information can be found at https://www.opendaylight.org. I'm running Beryllium 0.4.3, which is the same version used in Cisco's dCloud demo – it's a relatively straightforward install process, and I'm running my copy on top of a standard Ubuntu install.

yang

From inside ODL you can use the YANG UI to query information held inside the controller, which is essentially a much easier, preset-driven way of querying the data – for example, I can view the link-state topology learnt via BGP-LS pretty easily:

topology
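The YANG UI is just a front-end onto ODL's RESTCONF API, so the same data can also be pulled programmatically – handy for scripting against the topology. Here's a minimal Python sketch; it assumes the Beryllium defaults of RESTCONF on port 8181, admin/admin credentials and a BGP-LS topology instance named example-linkstate-topology, all of which may differ depending on how your controller is configured:

import requests

ODL = "http://192.168.3.250:8181"
URL = (ODL + "/restconf/operational/network-topology:network-topology"
             "/topology/example-linkstate-topology")

# Default ODL credentials are admin/admin - adjust to suit your install
resp = requests.get(URL, auth=("admin", "admin"),
                    headers={"Accept": "application/json"})
resp.raise_for_status()

topology = resp.json()["topology"][0]
for node in topology.get("node", []):
    print("node:", node["node-id"])
for link in topology.get("link", []):
    print("link:", link["source"]["source-node"], "->",
          link["destination"]["dest-node"])

This is the same document the YANG UI renders, just in raw JSON – the nodes are the ISIS routers learnt via BGP-LS and the links are the adjacencies between them.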

There’s a whole load of functionality possible with ODL, from BGP-Flowspec to OpenFlow to LSP provisioning; for now we’re just going to keep it basic – all of this is open source and requires quite a bit of “playing” to get working.

Let's take a look at provisioning some segment-routing TE tunnels; first, a reminder of the diagram:

odl1

And an example of some configuration – XRv-1

ISIS;

  1. router isis CORE-SR
  2.  is-type level-2-only
  3.  net 49.0001.0001.0001.00
  4.  address-family ipv4 unicast
  5.   metric-style wide
  6.   mpls traffic-eng level-2-only
  7.   mpls traffic-eng router-id Loopback0
  8.   redistribute static
  9.   segment-routing mpls
  10.  !
  11.  interface Loopback0
  12.   address-family ipv4 unicast
  13.    prefix-sid index 10
  14.   !
  15.  !
  16.  interface GigabitEthernet0/0/0/0.12
  17.   point-to-point
  18.   address-family ipv4 unicast
  19.   !
  20.  !
  21.  interface GigabitEthernet0/0/0/1.13
  22.   point-to-point
  23.   address-family ipv4 unicast
  24.   !
  25.  !
  26. !

 

A relatively simple ISIS configuration, with nothing remarkable going on:

  • Line 9 enables segment routing for ISIS
  • Line 13 injects a SID (Segment Identifier) of 10 into ISIS for Loopback0

The other aspect of the configuration which generates a bit of interest is the PCE and MPLS traffic-eng configuration:

  1. mpls traffic-eng
  2.  pce
  3.   peer source ipv4 49.1.1.1
  4.   peer ipv4 192.168.3.250
  5.   !
  6.   segment-routing
  7.   logging events peer-status
  8.   stateful-client
  9.    instantiation
  10.   !
  11.  !
  12.  logging events all
  13.  auto-tunnel pcc
  14.   tunnel-id min 1 max 99
  15.  !
  16.  reoptimize timers delay installation 0
  17. !

 

  • Line 1 enables basic traffic-engineering. An important point to note – to do MPLS-TE for segment routing, you don’t need to turn on TE on every single interface like you would if you were using RSVP; so long as TE is enabled under ISIS and the TE router-ID is set (as in the ISIS configuration above), that’s enough
  • Lines 2, 3 and 4 connect the router from its loopback address to the OpenDaylight controller and enable PCE
  • Lines 6 through 9 specify the segment-routing parameters for TE
  • Line 14 specifies the tunnel-ID range for automatically generated tunnels – tunnels spawned by the controller

Going back to the diagram, XRv-4 was also configured for BGP-LS;

  1. router bgp 65535
  2.  bgp router-id 49.1.1.4
  3.  bgp cluster-id 49.1.1.4
  4.  address-family ipv4 unicast
  5.  !
  6.  address-family link-state link-state
  7.  !
  8.  neighbor 49.1.1.1
  9.   remote-as 65535
  10.   update-source Loopback0
  11.   address-family ipv4 unicast
  12.    route-reflector-client
  13.   !
  14.  !
  15.  neighbor 49.1.1.8
  16.   remote-as 65535
  17.   update-source Loopback0
  18.   address-family ipv4 unicast
  19.    route-reflector-client
  20.   !
  21.  !
  22.  neighbor 192.168.3.250
  23.   remote-as 65535
  24.   update-source GigabitEthernet0/0/0/5
  25.   address-family ipv4 unicast
  26.    route-reflector-client
  27.   !
  28.   address-family link-state link-state
  29.    route-reflector-client
  30.   !
  31.  !
  32. !

 

  • Line 6 enables the BGP Link-state AFI/SAFI
  • Lines 8 through 19 are standard BGP RR config for IPv4
  • Line 22 is the BGP peer for the Opendaylight controller
  • Line 28 turns on the link-state AFI/SAFI for Opendaylight

Also of interest on XRv-4 is the ISIS configuration:

  1. router isis CORE-SR
  2.  is-type level-2-only
  3.  net 49.0001.0001.0004.00
  4.  distribute bgp-ls
  5.  address-family ipv4 unicast
  6.   metric-style wide
  7.   mpls traffic-eng level-2-only
  8.   mpls traffic-eng router-id Loopback0
  9.   redistribute static
  10.   segment-routing mpls
  11.  !
  12.  interface Loopback0
  13.   address-family ipv4 unicast
  14.    prefix-sid index 40
  15.   !
  16.  !

 

  • Line 4 copies the ISIS link-state information into BGP-link state

If we do a “show bgp link-state link-state” we can see the information taken from ISIS, injected into BGP – and subsequently advertised to Opendaylight;

  1. RP/0/RP0/CPU0:XRV9k-4#show bgp link-state link-state
  2. Thu Dec  1 21:40:44.032 UTC
  3. BGP router identifier 49.1.1.4, local AS number 65535
  4. BGP generic scan interval 60 secs
  5. Non-stop routing is enabled
  6. BGP table state: Active
  7. Table ID: 0x0   RD version: 78
  8. BGP main routing table version 78
  9. BGP NSR Initial initsync version 78 (Reached)
  10. BGP NSR/ISSU Sync-Group versions 0/0
  11. BGP scan interval 60 secs
  12. Status codes: s suppressed, d damped, h history, * valid, > best
  13.               i – internal, r RIB-failure, S stale, N Nexthop-discard
  14. Origin codes: i – IGP, e – EGP, ? – incomplete
  15. Prefix codes: E link, V node, T IP reacheable route, u/U unknown
  16.               I Identifier, N local node, R remote node, L link, P prefix
  17.               L1/L2 ISIS level-1/level-2, O OSPF, D direct, S static/peer-node
  18.               a area-ID, l link-ID, t topology-ID, s ISO-ID,
  19.               c confed-ID/ASN, b bgp-identifier, r router-ID,
  20.               i if-address, n nbr-address, o OSPF Route-type, p IP-prefix
  21.               d designated router address
  22.    Network            Next Hop            Metric LocPrf Weight Path
  23. *> [V][L2][I0x0][N[c65535][b0.0.0.0][s0001.0001.0001.00]]/328
  24.                       0.0.0.0                                0 i
  25. *> [V][L2][I0x0][N[c65535][b0.0.0.0][s0001.0001.0002.00]]/328
  26.                       0.0.0.0                                0 i
  27. *> [V][L2][I0x0][N[c65535][b0.0.0.0][s0001.0001.0003.00]]/328
  28.                       0.0.0.0                                0 i
  29. *> [V][L2][I0x0][N[c65535][b0.0.0.0][s0001.0001.0004.00]]/328
  30.                       0.0.0.0                                0 i
  31. *> [V][L2][I0x0][N[c65535][b0.0.0.0][s0001.0001.0005.00]]/328
  32.                       0.0.0.0                                0 i
  33. *> [V][L2][I0x0][N[c65535][b0.0.0.0][s0001.0001.0006.00]]/328
  34.                       0.0.0.0                                0 i
  35. *> [V][L2][I0x0][N[c65535][b0.0.0.0][s0001.0001.0007.00]]/328
  36.                       0.0.0.0                                0 i
  37. *> [V][L2][I0x0][N[c65535][b0.0.0.0][s0001.0001.0008.00]]/328
  38.                       0.0.0.0                                0 i
  39. *> [E][L2][I0x0][N[c65535][b0.0.0.0][s0001.0001.0001.00]][R[c65535][b0.0.0.0][s0001.0001.0002.00]][L[i10.10.12.0][n10.10.12.1]]/696
  40.                       0.0.0.0                                0 i
  41. *> [E][L2][I0x0][N[c65535][b0.0.0.0][s0001.0001.0001.00]][R[c65535][b0.0.0.0][s0001.0001.0003.00]][L[i10.10.13.0][n10.10.13.1]]/696
  42.                       0.0.0.0                                0 i
  43. *> [E][L2][I0x0][N[c65535][b0.0.0.0][s0001.0001.0002.00]][R[c65535][b0.0.0.0][s0001.0001.0001.00]][L[i10.10.12.1][n10.10.12.0]]/696

 

With this information we can use an additional app on top of OpenDaylight to provision some segment-routing LSPs; in this case I’m going to use something from Cisco DevNet called Pathman-SR. It essentially connects to ODL using REST to program the network, and can be found here: https://github.com/CiscoDevNet/pathman-sr

Once it’s installed and running, simply browse to its URL (http://192.168.3.250:8020/cisco-ctao/apps/pathman_sr/index.html) and you’re presented with a nice view of the network:

pathman

From here, it’s possible to compute a path from one point to another, then signal that LSP on the network using PCEP – in this case from XRv9k-1 to XRv9k-8.

Let's program a path via XRv9k-2, then 4, then 7, to 8:

pathman3

Once Pathman has calculated the path, hit deploy; Pathman sends the path to ODL, which then connects via PCEP to XRv9k-1 and provisions the LSP:

pathman2
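Under the hood Pathman is simply driving ODL's REST APIs – the PCEP plugin exposes programming RPCs (add-lsp, update-lsp and remove-lsp under network-topology-pcep) and reports the state of every PCC session under the pcep-topology. As a rough Python sketch (again assuming the default RESTCONF port and admin/admin credentials; the JSON key names can vary slightly between releases), we can confirm the newly delegated LSP from the controller's side as well:

import requests

ODL = "http://192.168.3.250:8181"
URL = (ODL + "/restconf/operational/network-topology:network-topology"
             "/topology/pcep-topology")

resp = requests.get(URL, auth=("admin", "admin"),
                    headers={"Accept": "application/json"})
resp.raise_for_status()

# Each node is a PCC session; reported-lsp lists the LSPs that PCC has
# reported/delegated to the controller
for node in resp.json()["topology"][0].get("node", []):
    print("PCC:", node["node-id"])
    pcc = node.get("network-topology-pcep:path-computation-client", {})
    for lsp in pcc.get("reported-lsp", []):
        print("  reported LSP:", lsp["name"])

For the tunnel provisioned above, XRv9k-1 should show up as a PCC with an LSP named "XRV9k-1 -> XRV9k-8".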

Once this is done, let's check XRv9k-1 to see the SR-TE tunnel:

  1. RP/0/RP0/CPU0:XRV9k-1#sh ip int bri
  2. Thu Dec  1 22:05:38.799 UTC
  3. Interface                      IP-Address      Status          Protocol Vrf-Name
  4. Loopback0                      49.1.1.1        Up              Up       default
  5. tunnel-te1                     49.1.1.1        Up              Up       default
  6. GigabitEthernet0/0/0/0         unassigned      Up              Up       default
  7. GigabitEthernet0/0/0/0.12      10.10.12.0      Up              Up       default
  8. GigabitEthernet0/0/0/1         unassigned      Up              Up       default
  9. GigabitEthernet0/0/0/1.13      10.10.13.0      Up              Up       default
  10. GigabitEthernet0/0/0/2         100.1.0.1       Up              Up       default
  11. GigabitEthernet0/0/0/3         192.168.3.248   Up              Up       default
  12. GigabitEthernet0/0/0/4         unassigned      Shutdown        Down     default
  13. GigabitEthernet0/0/0/5         unassigned      Shutdown        Down     default
  14. GigabitEthernet0/0/0/6         unassigned      Shutdown        Down     default
  15. MgmtEth0/RP0/CPU0/0            unassigned      Shutdown        Down     default

 

We can see from the output of “show ip int brief” on line 5 that interface tunnel-te1 has been created, but it’s nowhere in the config:

  1. RP/0/RP0/CPU0:XRV9k-1#sh run interface tunnel-te1
  2. Thu Dec  1 22:07:41.409 UTC
  3. % No such configuration item(s)
  4. RP/0/RP0/CPU0:XRV9k-1#

 

PCE-signalled LSPs never appear in the configuration; they’re created, managed and deleted by the controller. It is possible to manually add an LSP and then delegate it to the controller, but that’s beyond the scope here (that’s technical speak for “I couldn’t make it work” 🙂).

Let's check out the details of the SR-TE tunnel:

  1. RP/0/RP0/CPU0:XRV9k-1#show mpls traffic-eng tunnels
  2. Thu Dec  1 22:09:56.983 UTC
  3. Name: tunnel-te1  Destination: 49.1.1.8  Ifhandle:0x8000064 (auto-tunnel pcc)
  4.   Signalled-Name: XRV9k-1 -> XRV9k-8
  5.   Status:
  6.     Admin:    up Oper:   up   Path:  valid   Signalling: connected
  7.     path option 10, (Segment-Routing) type explicit (autopcc_te1) (Basis for Setup)
  8.     G-PID: 0x0800 (derived from egress interface properties)
  9.     Bandwidth Requested: 0 kbps  CT0
  10.     Creation Time: Thu Dec  1 22:01:21 2016 (00:08:37 ago)
  11.   Config Parameters:
  12.     Bandwidth:        0 kbps (CT0) Priority:  7  7 Affinity: 0x0/0xffff
  13.     Metric Type: TE (global)
  14.     Path Selection:
  15.       Tiebreaker: Min-fill (default)
  16.       Protection: any (default)
  17.     Hop-limit: disabled
  18.     Cost-limit: disabled
  19.     Path-invalidation timeout: 10000 msec (default), Action: Tear (default)
  20.     AutoRoute: disabled  LockDown: disabled   Policy class: not set
  21.     Forward class: 0 (default)
  22.     Forwarding-Adjacency: disabled
  23.     Autoroute Destinations: 0
  24.     Loadshare:          0 equal loadshares
  25.     Auto-bw: disabled
  26.     Path Protection: Not Enabled
  27.     BFD Fast Detection: Disabled
  28.     Reoptimization after affinity failure: Enabled
  29.     SRLG discovery: Disabled
  30.   Auto PCC:
  31.     Symbolic name: XRV9k-1 -> XRV9k-8
  32.     PCEP ID: 2
  33.     Delegated to: 192.168.3.250
  34.     Created by: 192.168.3.250
  35.   History:
  36.     Tunnel has been up for: 00:08:37 (since Thu Dec 01 22:01:21 UTC 2016)
  37.     Current LSP:
  38.       Uptime: 00:08:37 (since Thu Dec 01 22:01:21 UTC 2016)
  39.   Segment-Routing Path Info (PCE controlled)
  40.     Segment0[Node]: 49.1.1.2, Label: 16020
  41.     Segment1[Node]: 49.1.1.4, Label: 16040
  42.     Segment2[Node]: 49.1.1.7, Label: 16070
  43.     Segment3[Node]: 49.1.1.8, Label: 16080
  44. Displayed 1 (of 1) heads, 0 (of 0) midpoints, 0 (of 0) tails
  45. Displayed 1 up, 0 down, 0 recovering, 0 recovered heads
  46. RP/0/RP0/CPU0:XRV9k-1#

 

Points of interest;

  • Line 4 shows the name of the LSP as configured by Pathman
  • Line 7 shows that the signalling is Segment-routing via autoPCC
  • Lines 33 and 34 show the tunnel was generated by the Opendaylight controller
  • Line 39 shows the LSP is PCE controlled
  • Lines 40 through 43 show the programmed path
  • Line 44 basically shows XRV9k-1 being the SR-TE head-end

Lines 40-43 show one of the main benefits of segment routing: we have a programmed traffic-engineered path through the network, but with far less control-plane overhead than if we’d done this with RSVP-TE. For example, let's look at the routers in the path (XRv-2, XRv-4 and XRv-7):

  1. RP/0/RP0/CPU0:XRV9k-2#show mpls traffic-eng tunnels
  2. Thu Dec  1 22:14:38.855 UTC
  3. RP/0/RP0/CPU0:XRV9k-2#
  4. RP/0/RP0/CPU0:XRV9k-4#show mpls traffic-eng tunnels
  5. Thu Dec  1 22:14:45.915 UTC
  6. RP/0/RP0/CPU0:XRV9k-4#
  7. RP/0/RP0/CPU0:XRV9k-7#show mpls traffic-eng tunnels
  8. Thu Dec  1 22:15:17.873 UTC
  9. RP/0/RP0/CPU0:XRV9k-7#

 

Essentially, the routers along the path of the SR-TE tunnel hold no real control-plane state for it – a real advantage for large networks, as the whole thing is much more efficient.

The only pitfall here is that whilst we’ve generated a segment-routed LSP, like all MPLS-TE tunnels we need to tell the router to put traffic into it – normally with autoroute-announce or a static route. At this time OpenDaylight doesn’t support the extensions to actually configure a static route from the controller, so we still need to manually put traffic into the tunnel – this is addressed in Cisco’s Open SDN Controller and WAE (WAN Automation Engine):

  1. router static
  2.  address-family ipv4 unicast
  3.   49.1.1.8/32 tunnel-te1
  4.  !
  5. !

 

I regularly do testing and development work with some of the largest ISPs in the UK, and something that regularly comes up is customers running a traditional full mesh of RSVP LSPs. If you have 500 edge routers, that’s roughly 250k LSPs (500 × 499 ≈ 250,000) being signalled end to end, and the “P” routers in the network need to signal and maintain all of that state. When I do testing in these sorts of environments, it’s not uncommon to see nasty problems with routing-engine CPUs when links fail, as those 250k LSPs end up having to be re-signalled – indeed this very subject came up in a conversation at LINX95 last week.

With segment routing, the traffic-engineered path is basically encoded into the packet with MPLS labels – the only real difficulty is that it requires more labels in the packet, but once the hardware can deal with the label depth, I think it’s a much better solution than RSVP: more efficient and far simpler.

From my perspective, all I’ve really shown here is a basic LSP provisioning tool, but it’s nice to be able to get the basics working. In the future I hope to get my hands on a segment-routing-enabled version of NorthStar, or Cisco’s Open SDN Controller (Cisco’s productised version of ODL) 🙂

 

iBGP for PE-CE

I’ve worked on many large-scale MPLS VPN solutions, some with as many as 20k-30k managed CPEs, and as everybody knows, where you run BGP in this sort of setup it’s almost always eBGP – either a single AS across all sites with AS-override, or a different AS number per site – to get around the age-old BGP loop-prevention mechanisms which tend to get in the way when we use L3VPNs.

Recently I came across RFC 6368, which describes how iBGP can actually be used as a PE-CE protocol, in order to make the provider network more transparent from a BGP perspective. Usually there’s no problem running eBGP, and 99% of networks seem to operate perfectly fine with it; however, if the customer’s CE routers have a large BGP setup behind them, the provider’s AS numbers and interactions with the BGP updates can in some cases cause problems.

Recently Cisco added support to run iBGP for PE-CE with the addition of a new command placed under the VRF – “neighbor <x.x.x.x> internal-vpn-client”. In Junos the equivalent command is “independent-domain”, which goes under routing-options for the routing-instance.
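The PEs in this lab are Junipers, so only the Junos form appears below. Purely for reference, on a Cisco IOS PE the command would sit under the VRF address-family and look roughly like this – a sketch rather than a tested config from this lab, with the VRF name and neighbor address purely illustrative, and depending on the design further settings may be needed:

router bgp 100
 address-family ipv4 vrf as100
  ! illustrative CE neighbor - internal-vpn-client enables iBGP PE-CE (RFC 6368)
  neighbor 10.10.11.0 remote-as 100
  neighbor 10.10.11.0 activate
  neighbor 10.10.11.0 internal-vpn-client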

For this configuration, consider the following basic topology:

Untitled-2

CE-1 and CE-2 are both Cisco routers; MX-1 and MX-2 are Juniper MXs running inet-vpn unicast between loopbacks, with ISIS level 2 and LDP configured in the simplest way possible, and all devices inside BGP AS 100.

The routing instances on MX-1 and MX-2 are identical, apart from the peering IP address and the route-distinguishers.

routing-instances {
    as100 {
        instance-type vrf;
        interface ge-0/0/4.0;
        route-distinguisher 100:100;
        vrf-target target:100:100;
        routing-options {
            autonomous-system 100 independent-domain;
        }
        protocols {
            bgp {
                group iBGP-CE {
                    type internal;
                    neighbor 10.10.11.0 {
                        family inet {
                            unicast;
                        }
                    }
                }
            }
        }
    }
}
Notice the command “independent-domain” present under the autonomous-system configuration within the routing-instance on each MX; this essentially allows the device to run iBGP for PE-CE.

The Cisco routers are running a simple configuration, again they’re both identical except for the peering address and LAN interface range:

router bgp 100
bgp log-neighbor-changes
network 10.10.100.0 mask 255.255.255.0
neighbor 10.10.11.1 remote-as 100

BGP comes up as expected on both devices, and the LAN range is reachable from each CE:

CE-1#sh ip route
Codes: L - local, C - connected, S - static, R - RIP, M - mobile, B - BGP
D - EIGRP, EX - EIGRP external, O - OSPF, IA - OSPF inter area
N1 - OSPF NSSA external type 1, N2 - OSPF NSSA external type 2
E1 - OSPF external type 1, E2 - OSPF external type 2
i - IS-IS, su - IS-IS summary, L1 - IS-IS level-1, L2 - IS-IS level-2
ia - IS-IS inter area, * - candidate default, U - per-user static route
o - ODR, P - periodic downloaded static route, H - NHRP, l - LISP
a - application route
+ - replicated route, % - next hop override
Gateway of last resort is not set
      10.0.0.0/8 is variably subnetted, 6 subnets, 3 masks
B        10.10.10.0/31 [200/0] via 10.10.11.1, 00:06:08
C        10.10.11.0/31 is directly connected, Ethernet0/0
L        10.10.11.0/32 is directly connected, Ethernet0/0
C        10.10.100.0/24 is directly connected, Ethernet0/1
L        10.10.100.1/32 is directly connected, Ethernet0/1
B        10.10.200.0/24 [200/0] via 10.10.11.1, 00:06:08
CE-1#ping 10.10.200.1 source 10.10.100.1
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 10.10.200.1, timeout is 2 seconds:
Packet sent with a source address of 10.10.100.1
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 14/21/32 ms
CE-1#

From the perspective of CE-1, CE-2, MX-1 and MX-2, all sessions are iBGP sessions, so no AS numbers are appended to the AS-Sequence – for CE-1 and CE-2 it’s business as usual as far as iBGP is concerned. It must be noted that MX-1 and MX-2 change the next hops when BGP routes are loaded into the RIB, but this is normal VRF L3VPN behaviour anyway.

One of the interesting aspects of this particular feature, apart from the ability to run iBGP for PE-CE sessions, is that it uses a new attribute, ATTR_SET (optional transitive, type 128), to effectively hide specific customer BGP attributes and tunnel them through the provider core between PE routers. This means that internal BGP settings set by customers, such as local-preference, can be carried through the provider core without interfering with the provider’s best-path selection process. This is analogous to running OSPF as the PE-CE protocol, where route-types are encoded into VPNv4 and transported across the core as the OSPF domain-tag.

To demonstrate this, if we modify the local-preference on CE-1 so that outgoing routes are set with a local-preference of 250, MX-1 should hide the local-pref value in its L3VPN advertisement to MX-2; it’s only on the route’s subsequent advertisement from MX-2 to CE-2 that the local-preference value is unmasked.

Set the local-preference to 250 on CE1:

router bgp 100
bgp log-neighbor-changes
network 10.10.100.0 mask 255.255.255.0
neighbor 10.10.11.1 remote-as 100
neighbor 10.10.11.1 route-map lpref out
!
route-map lpref permit 10
set local-preference 250
!

We also see the route (10.10.100.0/24), complete with a local-preference of 250 received intact on CE-2:

CE-2#sh ip bgp
BGP table version is 31, local router ID is 10.10.10.0
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
r RIB-failure, S Stale, m multipath, b backup-path, f RT-Filter,
x best-external, a additional-path, c RIB-compressed,
Origin codes: i - IGP, e - EGP, ? - incomplete
RPKI validation codes: V valid, I invalid, N Not found

Network          Next Hop            Metric LocPrf Weight Path
*>i 10.10.11.0/31    10.10.10.1                    100      0 i
*>i 10.10.100.0/24   10.10.10.1               0    250      0 i
*>  10.10.200.0/24   0.0.0.0                  0         32768 i
CE-2#

When we take a look at the RIB-IN on MX-1, we clearly see the route coming in with a local-pref of 250:

root@PE1> show route 10.10.100.0/24

as100.inet.0: 5 destinations, 5 routes (5 active, 0 holddown, 0 hidden)
+ = Active Route, – = Last Active, * = Both

10.10.100.0/24     *[BGP/170] 00:21:48, MED 0, localpref 250
AS path: I, validation-state: unverified
> to 10.10.11.0 via ge-0/0/4.0

root@PE1>

However, the big difference is in the L3VPN advertisement from MX-1 to MX-2: the local-preference of 250 is tunnelled inside the new attribute set, and the route carries the local-preference of whatever the provider’s core is using (in this case the default of 100):

Output taken on the 10.10.100.0/24 route, received on MX-2 from MX-1

 

100:100:10.10.100.0/24 (1 entry, 0 announced)
*BGP Preference: 170/-101
Route Distinguisher: 100:100
Next hop type: Indirect
Address: 0x940db4c
Next-hop reference count: 8
Source: 4.4.4.4
Next hop type: Router, Next hop index: 531
Next hop: 192.169.1.3 via ge-0/0/2.0, selected
Label operation: Push 299840
Label TTL action: prop-ttl
Load balance label: Label 299840: None;
Session Id: 0x3
Protocol next hop: 4.4.4.4
Label operation: Push 299840
Label TTL action: prop-ttl
Load balance label: Label 299840: None;
Indirect next hop: 0x972c220 1048576 INH Session ID: 0x10
State:
Local AS: 100 Peer AS: 100
Age: 20:11 Metric2: 1
Validation State: unverified
Task: BGP_100.4.4.4.4+55077
AS path: I
Communities: target:100:100
Import Accepted
VPN Label: 299840
Localpref: 100
Router ID: 4.4.4.4
Secondary Tables: as100.inet.0
Indirect next hops: 1
Protocol next hop: 4.4.4.4 Metric: 1
Label operation: Push 299840
Label TTL action: prop-ttl
Load balance label: Label 299840: None;
Indirect next hop: 0x972c220 1048576 INH Session ID: 0x10
Indirect path forwarding next hops: 1
Next hop type: Router
Next hop: 192.169.1.3 via ge-0/0/2.0
Session Id: 0x3
4.4.4.4/32 Originating RIB: inet.3
Metric: 1 Node path count: 1
Forwarding nexthops: 1
Nexthop: 192.169.1.3 via ge-0/0/2.0

What’s interesting is what happens if we disable the “independent-domain” feature on MX-1 and re-check the output from MX-2:

100:100:10.10.100.0/24 (1 entry, 0 announced)
*BGP Preference: 170/-251
Route Distinguisher: 100:100
Next hop type: Indirect
Address: 0x940fcc4
Next-hop reference count: 4
Source: 4.4.4.4
Next hop type: Router, Next hop index: 531
Next hop: 192.169.1.3 via ge-0/0/2.0, selected
Label operation: Push 299856
Label TTL action: prop-ttl
Load balance label: Label 299856: None;
Session Id: 0x3
Protocol next hop: 4.4.4.4
Label operation: Push 299856
Label TTL action: prop-ttl
Load balance label: Label 299856: None;
Indirect next hop: 0x972c220 1048576 INH Session ID: 0x11
State:
Local AS: 100 Peer AS: 100
Age: 25 Metric: 0 Metric2: 1
Validation State: unverified
Task: BGP_100.4.4.4.4+54013
AS path: I
Communities: target:100:100
Import Accepted
VPN Label: 299856
Localpref: 250
Router ID: 4.4.4.4
Secondary Tables: as100.inet.0
Indirect next hops: 1
Protocol next hop: 4.4.4.4 Metric: 1
Label operation: Push 299856
Label TTL action: prop-ttl
Load balance label: Label 299856: None;
Indirect next hop: 0x972c220 1048576 INH Session ID: 0x11
Indirect path forwarding next hops: 1
Next hop type: Router
Next hop: 192.169.1.3 via ge-0/0/2.0
Session Id: 0x3
4.4.4.4/32 Originating RIB: inet.3
Metric: 1 Node path count: 1
Forwarding nexthops: 1
Nexthop: 192.169.1.3 via ge-0/0/2.0

Essentially this breaks the feature and the customer’s iBGP attributes are no longer tunnelled; if the provider has any sort of common policy for modifying the best-path selection process using local-preference for L3VPNs, it would obviously conflict with the customer’s settings.

The RFC states that customer-specific iBGP attributes are encoded by the receiving PE router using the “ATTR_SET” attribute; these are applied using the attribute flags in the same way as vanilla L3VPNs.

I haven’t used this feature in any designs yet; whilst the RFC was finished in 2011, Cisco only released support for it last year (it has been present in Junos for a while). It could be very effective for simplifying some requirements where customers have significant BGP setups behind PE routers.

Happy Easter!