EVPN-VXLAN: Symmetrical IRB versus Asymmetrical IRB

Photo by Kasturi Roy / Unsplash

Now that we've covered the two flavours of IRB in depth, I want to share more of a discussion piece. Technical details are interesting, sometimes even fun, but what about real-world operational considerations?

"Everyone has a plan..."

Viewing the intimidating assortment of pikes, swords, and other sharp or bashy objects in the Tower of London armoury during a recent visit, I was reminded of that Tyson quote: "Everyone has a plan until they get punched in the mouth."
My train of thought was, "being a knight riding around on horseback would be fun and all until an encounter with a big stick with a pointy metal end."
Similarly (or maybe not similarly at all, but I hope you get where I'm going here), playing around with the various types of IRB for EVPN has been enlightening and, at times, fun. But what about a real-world network, with its everyday concerns and risk of outages (the pokey, hurty things in my analogy)?
What works on paper might not be feasible on a live network, when the focus is primarily on reliability and deploying networks that the NetOps team can realistically support.

Symmetrical IRB - it scales, but at what cost?

If you've read over my post about symmetrical IRB, it should be apparent that the whole deal with this approach is scalability.
All the RT-5s, the EVPN Router's MAC, the extra BGP address-family config: it is all so that the VTEPs do not have to carry information about the destination L2 networks. That keeps the table sizes down and allows for greater scale.
Now, that does sound like a good thing, and, for some, the ability to scale will be a major consideration when deploying EVPN.
However, this might not be the deciding factor for others.
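
To put a rough shape on that scale argument, here is a minimal Python sketch that counts how many VNIs each VTEP would need to carry under each model. It is a simplification under stated assumptions: the `Fabric` class, the leaf names and the VLAN numbers are all made up for illustration, asymmetrical IRB is assumed to need every L2VNI on every VTEP, and symmetrical IRB is assumed to need only the locally attached L2VNIs plus one L3VNI per VRF. Real table sizes also depend on MAC, ARP and host-route counts, which this does not model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fabric:
    """Hypothetical fabric description; every field is illustrative."""
    local_vlans: dict[str, set[int]]  # VTEP name -> VLANs with local hosts
    vrfs: int = 1                     # tenant VRFs (assumed one L3VNI each)


def vnis_per_vtep(fabric: Fabric, mode: str) -> dict[str, int]:
    """Return the number of VNIs each VTEP must be configured with."""
    all_vlans = set().union(*fabric.local_vlans.values())
    result = {}
    for vtep, vlans in fabric.local_vlans.items():
        if mode == "asymmetric":
            # Assumption: every VTEP carries every L2VNI in the fabric.
            result[vtep] = len(all_vlans)
        elif mode == "symmetric":
            # Assumption: local L2VNIs only, plus one L3VNI per VRF.
            result[vtep] = len(vlans) + fabric.vrfs
        else:
            raise ValueError(f"unknown IRB mode: {mode!r}")
    return result


example = Fabric(
    local_vlans={"leaf1": {10, 20}, "leaf2": {30}, "leaf3": {40, 50, 60}},
    vrfs=1,
)
for mode in ("asymmetric", "symmetric"):
    print(mode, vnis_per_vtep(example, mode))
```

With only three leaves the difference is small, which is rather the point: the gap only becomes a deciding factor as the number of VLANs and VTEPs grows.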

But it isn't all good news.
Looking over symmetrical IRB, there are, as mentioned, even more details to learn and lookups to be aware of.
To unlock scale, there is a cost in the form of a greater burden on the NetOps team. Their processes and quick-recall knowledge about the network (for 'middle of the night' troubleshooting) now need to encompass a deep understanding of an EVPN table of RT-2s, RT-3s and RT-5s. They need to understand what it means to see, say, just RT-5s when users complain of loss of comms to a remote site, and to understand what on earth that L3VNI is doing.
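
As a small illustration of the kind of quick-recall knowledge involved, the sketch below tallies EVPN route types received from a peer and returns a first-pass hint. The route type names come from RFC 7432 (types 1 to 4) and RFC 9136 (type 5). Everything else is an assumption of mine: the input is a plain list of route-type numbers that you would have to extract from your own platform's output, and the hint wording is a conversation starter, not a diagnosis.

```python
from collections import Counter

# Route type names per RFC 7432 (types 1-4) and RFC 9136 (type 5).
EVPN_ROUTE_TYPES = {
    1: "Ethernet Auto-Discovery",
    2: "MAC/IP Advertisement",
    3: "Inclusive Multicast Ethernet Tag",
    4: "Ethernet Segment",
    5: "IP Prefix",
}


def count_route_types(route_types):
    """Count routes per type, keyed as 'RT-n', in ascending type order."""
    counts = Counter(route_types)
    return {f"RT-{t}": n for t, n in sorted(counts.items())}


def triage_hint(route_types):
    """Return a hedged first-pass hint; hypothetical wording, not a diagnosis."""
    seen = set(route_types)
    if not seen:
        return "No EVPN routes from this peer: check the BGP session first."
    if seen == {5}:
        return ("Only RT-5 (IP Prefix) routes: prefixes are advertised but no "
                "host MAC/IP routes. Check whether that is expected here.")
    if 2 in seen and 5 not in seen:
        return "RT-2 host routes but no RT-5 prefixes from this peer."
    return "Mixed route types present: inspect the specific host or prefix."


# Hypothetical route-type numbers, as if parsed from a peer's EVPN table.
from_remote_site = [5, 5, 5]
print(count_route_types(from_remote_site))
print(triage_hint(from_remote_site))
```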

Asymmetrical IRB - not just the poor cousin.

Thus, with the considerations about scale in mind, asymmetrical IRB might actually 'do the job' for some deployments.
With this design you avoid piling more technical details onto an already complex technology. No need to consider the somewhat opaque concepts of L2VNI versus L3VNI; just configure all VTEPs with all networks.
It certainly means that the NetOps troubleshooting processes can be a little more concise.

That comment about L2VNI and L3VNI is not just a fatuous, throw-away line; I know from experience that this concept, in particular, has some experienced networkers quite confused when configuring EVPN.

The right tool for the job

It might seem like an obvious thing to say, but it is worth reiterating: sometimes the most 'bright and shiny' new technology might not be the best fit for the customer.
I experienced this first-hand during one of my very first EVPN technical presentations. Brimming with enthusiasm for the elegance of symmetrical IRB and its various knobs to relieve the network engineer of the burden of a high config file line count, and to free up precious cache resources, I was initially surprised when the lead engineer in the audience declared a preference for asymmetrical IRB.

Why?

The line of argument for asymmetrical IRB was that it would enable uniform configuration files across their networking estate. Each VTEP would have all the necessary VLANs from the start, with no need to worry about which VLAN to configure on which device.
The engineer, with their eye for process rather than just new tech, was viewing the choice with a different set of priorities to my own, and the following discussion was all the better for it.

The possibility of automation

Moreover, with a network automation hat on, we often espouse standardized configs and a need to reduce corner-case sites. Uniform configs make templating that much easier because there's more in the base template, and less that needs to be added in as conditional to a specific site.
When automating, the more corner cases there are in the network, the more logic gets pushed into code to handle them, resulting in more potential for error in amongst all those if/else statements.
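
The templating point can be sketched in a few lines of Python. Note the assumptions: the config syntax is made up and vendor-neutral (it is not any real NOS CLI), and the VLAN-to-VNI map, the `tenant-a` VRF name and both function names are hypothetical. The uniform renderer takes only a hostname and loopback, while the per-site renderer needs extra inputs and a validation branch.

```python
# Hypothetical fabric-wide VLAN -> L2VNI map; all values are made up.
VLAN_TO_VNI = {10: 10010, 20: 10020, 30: 10030}


def render_uniform(hostname, loopback):
    """Asymmetrical-style template: every VTEP gets every VLAN."""
    lines = [f"hostname {hostname}", f"vtep source-ip {loopback}"]
    for vlan, vni in sorted(VLAN_TO_VNI.items()):
        lines.append(f"vlan {vlan} vni {vni}")
    return "\n".join(lines)


def render_per_site(hostname, loopback, local_vlans, l3vni):
    """Symmetrical-style template: only local VLANs, plus an L3VNI."""
    lines = [f"hostname {hostname}", f"vtep source-ip {loopback}"]
    for vlan in sorted(local_vlans):
        # Per-site data brings per-site failure modes, hence the check.
        if vlan not in VLAN_TO_VNI:
            raise ValueError(f"{hostname}: VLAN {vlan} has no VNI mapping")
        lines.append(f"vlan {vlan} vni {VLAN_TO_VNI[vlan]}")
    lines.append(f"vrf tenant-a l3vni {l3vni}")
    return "\n".join(lines)


print(render_uniform("leaf1", "10.0.0.1"))
print()
print(render_per_site("leaf2", "10.0.0.2", {20, 30}, 50001))
```

Neither function is hard to write, but only the second one needs a per-device inventory of VLANs to be kept accurate, and that inventory is where the corner cases tend to creep in.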

'...practicality beats purity'

While it might seem like anathema to the over-caffeinated network engineers amongst us to consider a "less technical" approach, it should be self-evident that no real-world network design choices are made based solely on technical merit. Or, at least, they really shouldn't be.
Indeed, the first question that any of us should be asking is, why do we need this technology at all?
Do we really need to stretch our L2? Can't we just keep those broadcast domains neat, tidy and compact?
If not, then you've got to evaluate EVPN versus static VXLAN.
Do you really need to dive into the world of the BGP state machine, route-targets and route types?
If you are pressing on with EVPN you then have this decision to make, asymmetrical versus symmetrical.
As always, and as the subtitle suggests, these decisions are finalized as a series of trade-offs; scale versus complexity being the most obvious one that comes to my mind.
Finally, with all these decisions, special consideration must be given to the most important resource of all, your people.
Can your NetOps team support such a network? Inherent to that question is the training and enablement required to take such a step.

Closing words

This piece is very much written from my experiences within Enterprise networking, and, as such, I'm attempting to be as much of a realist as possible.
There are no absolute truths in this environment (except do not run VTP), and calling it for one side or the other in the face-off between symmetrical and asymmetrical is beyond my intellect.
Personally, I lean towards symmetrical because that seems the more elegant overall solution, but writing the Explainer post for asymmetrical gave me an appreciation for that approach which I lacked before.
Finally, the most important piece of advice I can give to anyone working the field, and discussing such technology with potential customers, is to listen to the customer.
In my case, I'm certainly glad I did. Their fresh take, coming, as it was, from those with daily experience of running live networks, gave me pause to reflect and evaluate - hence this post.

Thanks for reading.

Note: The "...practicality beats purity" quote is from The Zen of Python.

๐Ÿฆ@joeneville_