Border Gateway Protocol (BGP) is the Internet's routing protocol. It was first specified in 1989, and the version in use today, BGP-4, dates from the mid-1990s. It is used to exchange route reachability between one party and another. For example, let's say an enterprise has reachability in Bangalore through a local ISP. Its traffic is destined for somewhere in Delhi, so the IT team needs a peering with the main controlling unit, the national backbone, or the traffic has to be handed off through, let's say, Madhya Pradesh. In such cases BGP helps enterprise networking teams work out route propagation and distribute the subnets (typically the Internet subnets). BGP's main strengths are that it is highly scalable and efficient when you have to deal with a very large number of routes across Autonomous Systems (AS). BGP does a great job for enterprise networking teams in keeping the Internet running. However, the present demands and challenges of enterprise networks are different. As enterprise networks expand, BGP's job is becoming increasingly hard because the number of ASes continues to grow.
Is BGP getting obsolete?
No! That is a misconception.
Enterprise underlay networks still require BGP. For example, enterprises that use broadband to interconnect sites need to create an overlay path, in which an overlay tunnel is responsible for sending traffic from one location to another. When we create a tunnel (let's say one branch is using Airtel broadband, another branch is using Vodafone broadband, and the DC is using Reliance or Tata), what matters to the overlay is the endpoints. In the underlay, however, one local ISP still has to hand the traffic to another ISP, so ISP-level routing needs to be in place, and for everything exterior that routing is done with BGP.
The routing world works with two terms: interior routing and exterior routing. In our view, interior routing works well when its scope is limited. Interior protocols such as OSPF use link state, whereas BGP is a path-vector protocol, a variant of distance vector. Link state doesn't scale well to very large networks, but distance-vector-style protocols do. Because of that, anyone doing auto-discovery of subnets within an enterprise network or with small local ISPs predominantly uses interior protocols like OSPF. When networks peer with each other, let's say Hathway hands traffic to Airtel, Airtel hands it to NEC Japan, and the traffic finally reaches California, that peering happens predominantly via eBGP.
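As a rough illustration of the path-vector idea behind eBGP peering, the sketch below shows how each AS along a chain adds its own number to a route's AS path and refuses any route that already contains its own number. The function name `receive_route` and the AS numbers are hypothetical (the numbers are taken from the private-use range 64512 to 65534); real BGP speakers exchange far more attributes than this.

```python
from typing import List, Optional


def receive_route(local_as: int, as_path: List[int]) -> Optional[List[int]]:
    """Model one AS receiving a route over eBGP.

    Returns the AS path this AS would advertise onward (its own number
    prepended), or None if the route is rejected because the local AS
    number already appears in the path (loop prevention).
    """
    if local_as in as_path:
        return None
    return [local_as] + as_path


# Hypothetical private-use AS numbers, standing in for a chain of peers
# such as destination network -> transit ISP -> transit ISP -> local ISP.
ORIGIN_AS = 64512
TRANSIT_B = 64513
TRANSIT_A = 64514
LOCAL_ISP = 64515

path: Optional[List[int]] = [ORIGIN_AS]
for asn in (TRANSIT_B, TRANSIT_A, LOCAL_ISP):
    path = receive_route(asn, path)

print(path)  # the AS path as seen at the local ISP

# If the same route came back to TRANSIT_A, it would be dropped as a loop.
looped = receive_route(TRANSIT_A, path)
print(looped)
```

Each AS only learns the list of ASes a route has crossed, not the internal topology of those ASes, which is a large part of why this approach scales across the Internet.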
Unfortunately, relying only on classical workhorse protocols like BGP makes it hard to make WAN traffic responsive to the vagaries of Internet congestion. What matters today is being congestion-aware while routing traffic.
The Internet and MPLS networks today face the same problem that rapid urbanisation poses to highway traffic in any of the new economies: India, China, Brazil, Africa. Which is better when you navigate these days: Google Maps, which tries to help you avoid traffic and find the fastest time to your destination, or your older on-board navigation maps, which just try to find the shortest distance to your destination?
Limitations of BGP today
Networking today is more than just calculating the best available path and key performance indicators. To deal with the evolving traffic patterns of the cloud era, enterprises can no longer rely on legacy network architectures. Relying on BGP makes it hard for enterprise networking teams to make WAN traffic responsive to unexpected and inexplicable changes in Internet congestion.
But that's just one part of it.
The second part is the difficulty of BGP configuration when networks need to scale. BGP's configuration makes it complex for enterprise networking teams to make any changes to their WAN. Ask any networking team that uses more than one legacy solution provider (e.g. MPLS), and they will surely be able to tell you about the ordeal of adding routers and devices and how it hampers network performance.
Thirdly, BGP doesn't take network performance into account while making routing decisions. It counts the autonomous systems along a path but has no idea how large each AS is, i.e. whether the path is actually long or short. It can route traffic via paths with a higher RTT (Round Trip Time) even when better paths are available. In addition, BGP cannot detect packet loss, network saturation, traffic bursts or any other performance-related issue on the network. This can easily lead to drastic surges in cost.
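To make the point about RTT concrete, here is a minimal sketch that compares two ways of choosing between candidate paths: by AS-path length, which is one of the tie-breakers BGP actually uses, and by measured round-trip time, which BGP does not see. It is deliberately simplified: real BGP best-path selection evaluates other attributes, such as local preference, before AS-path length. The class, the function names, the AS numbers and the RTT figures are all made up for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Candidate:
    name: str
    as_path: Tuple[int, ...]  # ASes the route has crossed
    rtt_ms: float             # measured out of band; BGP does not carry this


def pick_by_as_path(candidates: List[Candidate]) -> Candidate:
    """Prefer the route that crosses the fewest ASes (BGP-style tie-breaker)."""
    return min(candidates, key=lambda c: len(c.as_path))


def pick_by_rtt(candidates: List[Candidate]) -> Candidate:
    """Prefer the route with the lowest measured round-trip time."""
    return min(candidates, key=lambda c: c.rtt_ms)


# Illustrative values only: the shorter AS path happens to be congested.
candidates = [
    Candidate("path_a", as_path=(64513, 64512), rtt_ms=180.0),
    Candidate("path_b", as_path=(64515, 64514, 64512), rtt_ms=45.0),
]

print(pick_by_as_path(candidates).name)  # fewest AS hops
print(pick_by_rtt(candidates).name)      # lowest latency
```

With these numbers the two policies disagree: the hop-count view picks the path with fewer ASes even though its RTT is four times higher.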
Why use a sword when you need a needle?
BGP is a very powerful protocol when you're dealing with many autonomous systems, route aggregation and so on. But enterprises do not need to route traffic at the scale ISPs do. If you understand how the regular Internet works, you know that it is run by various ISPs. When we open Google or Yahoo, the routing that takes place is mind-boggling, and it would be next to impossible without BGP.
On the contrary, suppose your enterprise has a thousand branches, each using a /28 routing segment. That is a known number of segments. In this case, why would you need to run a BGP instance that exchanges those routes over an overlay? Let us consider an example. An enterprise has three sites: one DC and two branches, each branch with a /24 network. These are private networks that only need to communicate with each other. At the DC site you have 10 subnets, at branch 1 you have one subnet, and at branch 2 you have another, so in total you have twelve subnets. You want to reach from any one subnet to the others. That is all your routing requirement is. Running BGP for this is much like using a sword when you need a needle.
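The three-site example is small enough to write down by hand. The sketch below builds the twelve-subnet table and looks up which site owns a destination address, using Python's standard `ipaddress` module. The address ranges, the site labels and the assumption that the DC subnets are /24s are all hypothetical; the point is only that a known, fixed set of prefixes can be described as a plain table.

```python
import ipaddress
from typing import Dict, Optional

# Hypothetical addressing plan: ten /24 subnets at the DC, one /24 per branch.
dc_block = ipaddress.ip_network("10.0.0.0/16")
dc_subnets = list(dc_block.subnets(new_prefix=24))[:10]

routes: Dict[ipaddress.IPv4Network, str] = {net: "dc" for net in dc_subnets}
routes[ipaddress.ip_network("10.1.0.0/24")] = "branch1"
routes[ipaddress.ip_network("10.2.0.0/24")] = "branch2"


def lookup(destination: str) -> Optional[str]:
    """Return the site that owns the destination, using longest-prefix match."""
    addr = ipaddress.ip_address(destination)
    matches = [net for net in routes if addr in net]
    if not matches:
        return None
    best = max(matches, key=lambda net: net.prefixlen)
    return routes[best]


print(len(routes))           # total number of subnets in the table
print(lookup("10.0.3.7"))    # an address inside one of the DC subnets
print(lookup("10.2.0.20"))   # an address inside branch 2
```

Nothing in this table has to be discovered dynamically, which is the argument the example is making.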
“In a recent conversation with a service provider, we learnt how using BGP is getting harder even with their (the ISPs') resources. It's the right time for enterprises to learn how fragile BGP is and how moving to an SDN architecture can eliminate the rising ordeal with BGP.”
Not only enterprises but ISPs too are raising concerns over BGP, from YouTube's BGP blunder 11 years back to recent ones like the Google BGP hijack case. Regular outages have compelled ISPs to stop regarding BGP as beneficial in modern Internet architecture. The lack of traffic encryption, the lack of automatic measures to prevent threats and attacks, and a rigid architecture have all challenged BGP's trusted status.
Take a quick look at the recent BGP hijacking events listed on Wikipedia and you will notice a surge of attacks in the last two years. If you are wondering why, here is the answer. Managing ASes with BGP is getting harder and more complex as enterprises and ISPs grow their digital footprint. BGP's limitations (automation, encryption and threat prevention) make it difficult for enterprises and ISPs to manage ASes in highly demanding situations. This leads to route leaks, outages and hijacks, which can easily jolt the business bottom line. Whether for an ISP or an enterprise, frequent outages result in increased latency, packet loss and possible MITM attacks, all of which can disrupt business operations.
Put simply, configuring BGP becomes more complex with every node you add. SD-WAN features like Zero Peering Protocol allow enterprise networking teams to scale easily when configuring the WAN: with Zero Peering Protocol, you don't need to configure BGP. Check out Lavelle Networks ScaleAOn, an alternative to classical protocol-driven WAN solutions.