A few weeks back, the great folks at Intel DPDK invited us to be part of their panel discussion on Packet Acceleration. As someone who has built software network products for most of the last 15 years, I headed out with Randhir (our Chief Product Architect) to the Summit at the Intel office in Bangalore, India.
DPDK has grown in leaps and bounds ever since its first release, when most of us were just thrilled at the prospect of user-land packet processing, zero-copy buffers, and gigabit line-rate packet I/O. At the 2019 Summit, there were amazing sessions on the path-breaking progress made with DPDK. Teams the world over, powered by the open source community of DPDK (now part of the Linux Foundation), are breaking new ground in terms of speed. 40Gbps processing? No problem. 30+ million packets per second? No problem. It's all being done, as we speak. Brilliant stuff.
One of the most thoughtful sessions was by MJ, who spoke eloquently about how to map DPDK concepts to Kubernetes terminology, and how DPDK plugins are now going to power Kubernetes networking performance.
5G was a big focus, and from QoS to GTP tunnel processing, every session was a deep dive with source code references, GitHub previews, and demonstrations. In short, a brief day of heavenly bliss if you are a network software engineer. Everyone was talking packets, header sizes, processor cache lines. Wow!
The panel was awesome, and I am sure you can read about it when the Intel DPDK team publishes the videos and session transcripts. We will, however, point out the big question that came our way in the panel.
What are the challenges we face in the brave, new world of NFV?
At Lavelle Networks, we see three broad challenges, and we outlined each one of them.
Buying IT compute gear instead of Network ports, line cards, chassis
All network teams know how to operate in the world of network boxes: line cards, chassis, interfaces, cables. They are still somewhat bewildered when translating all of this know-how into operating what is, in the case of NFV, fundamentally a virtualised server farm. It's not uncommon for the NFV vendor to ship servers when doing a POC, because the Telco network team at times does not even have purchase processes in place for server compute gear. Field trials are done in Telco IT data centres in VMware environments. This challenge is slowly being resolved, but it is an important one.
Troubleshooting NFV deployments is hard
When you have physical equipment, the procedures and tools have been standardised, and there is clear ownership of who should do what. If a packet is not exiting a Cisco or a Juniper box, you know what to do. In NFV, it is not that straightforward. A packet enters a physical server network interface; you might have a NIC or a fancy NIC (offloads et al.), then a NIC driver, then a hypervisor switching layer (and there are different flavours: accelerated, custom, OVS, OVS on steroids, or a commercial overlay like NSX or Contrail), and only then does it reach the VNF. Now brace yourself. The VNF is a complete stack in itself, often a virtual replica of the VNF vendor's physical appliance logic. So the VNF has vNIC I/O, another driver or user-land function, and a bunch of its own packet processing. And did we hear you say service chain? That, of course, means all of this has to happen multiple times, because you have more than one VNF. This space requires a lot of attention, and a lot of useful work has begun, but we need to reach the maturity level of physical network troubleshooting.
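To make the contrast concrete, here is a deliberately toy sketch of the layered path described above. The hop names are ours, purely illustrative, and the model is not any real NFV stack: the point is that a drop can happen in any one of many independently owned layers, and you need per-layer visibility to find out where the packet vanished.

```python
# A toy model of the layered NFV packet path. Hop names are illustrative
# only; in a real deployment these counters live in separate tools owned by
# the NIC vendor, the hypervisor team, and the VNF vendor.
from dataclasses import dataclass

@dataclass
class Hop:
    name: str
    passed: int = 0
    dropped: int = 0

    def process(self, pkt, drop=False):
        if drop:
            self.dropped += 1
            return None
        self.passed += 1
        return pkt

def trace_path(pkt, hops, drop_at=None):
    """Walk a packet through each layer; report where it vanished."""
    for hop in hops:
        pkt = hop.process(pkt, drop=(hop.name == drop_at))
        if pkt is None:
            return hop.name          # the layer to go troubleshoot
    return "delivered"

path = [Hop("physical NIC"), Hop("NIC driver"), Hop("hypervisor vswitch"),
        Hop("VNF vNIC"), Hop("VNF packet engine")]

print(trace_path({"id": 1}, path))                                # delivered
print(trace_path({"id": 2}, path, drop_at="hypervisor vswitch"))  # hypervisor vswitch
```

On a physical box, all of these counters sit behind one vendor's CLI; in NFV, every hop belongs to a different product, and with a service chain the whole path repeats once per VNF. That is the visibility gap the paragraph above describes.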
High speed, stateful, distributed packet processing is the need of the hour
High-speed packet I/O is great, but we now need to focus as a community on high-speed network services, especially stateful, intelligent network services. As an industry, we need to stop being obsessed with speeds and feeds alone, and solve the difficult problems of distributed stateful packet processing. How can we get a VNF to share blobs of network state with another VNF? How can we get multiple VNFs to combine their intelligence to improve the security, performance or user experience of a single session? Today each VNF does its packet dissection all on its own, and extremely fast user-land I/O slows to a choking crawl at an application detection engine or a content filtering service. To be smart, the network has to move beyond L2/L3/L4 header processing, and beyond vanilla service chains of firewall, NAT and load balancer, to real service chains of application and content processing. Intelligence can only be applied when you know what application you are transporting, and, within the application, what specific content is being exchanged. Building high-performance, service-chained, distributed stateful network engines will change the course of our future. If, on the other hand, we remain obsessed with speeds and feeds, transport networking will become a commodity and utility business, like our home water, gas or electricity connections.
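One way to picture the state sharing we are asking for is sketched below. Every name and the classification logic are entirely hypothetical, a thought experiment rather than any shipping API: the first VNF in the chain classifies the session once, stashes the verdict in a flow table keyed by the 5-tuple, and downstream VNFs consume that shared verdict instead of re-dissecting the same packets.

```python
# Hypothetical sketch: chained VNFs sharing per-flow state, so the expensive
# application-detection work is done once per session, not once per VNF.
flow_table = {}  # 5-tuple -> shared state blob

def flow_key(pkt):
    return (pkt["src"], pkt["dst"], pkt["sport"], pkt["dport"], pkt["proto"])

def dpi_vnf(pkt):
    """First VNF in the chain: classifies the application (the costly step)."""
    state = flow_table.setdefault(flow_key(pkt), {})
    if "app" not in state:
        # Stand-in for real DPI: a port check instead of payload inspection.
        state["app"] = "https" if pkt["dport"] == 443 else "unknown"
    return pkt

def firewall_vnf(pkt):
    """Downstream VNF: reuses the shared verdict instead of re-parsing."""
    app = flow_table[flow_key(pkt)].get("app")
    pkt["verdict"] = "allow" if app == "https" else "inspect"
    return pkt

pkt = {"src": "10.0.0.1", "dst": "10.0.0.2",
       "sport": 40000, "dport": 443, "proto": "tcp"}
for vnf in (dpi_vnf, firewall_vnf):  # the service chain
    pkt = vnf(pkt)
print(pkt["verdict"])  # allow
```

In a real system the flow table would have to be a high-speed shared memory or distributed store rather than a Python dict, which is exactly the hard distributed-state problem the paragraph above argues we should be solving.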
The discussion gravitated to 5G, and I was unable to stop myself from pointing out what I have made a habit of discussing in any industry panel: Telcos and network vendors need to innovate rapidly to stay relevant, otherwise the deep disappointment awaiting the industry is that 5G will become yet another super-fast connectivity pipe, with even more action moving to OTT. I was at a Telco conference in Europe a year back, and there were a lot of questions on what exactly will happen to revenue lines post 5G. On the one hand, operators are deploying 5G so they don't become obsolete or lose the technology advantage. On the other hand, customers, whether enterprises or consumers, expect that with 5G bandwidth prices will be even lower, and that faster networks will be available at lower costs. A double whammy for the Telco world. We need to work together as an industry to deliver advanced services that we cannot imagine without a 5G network. Intel reported that an autonomous car might generate upwards of 1000GB of data every day from all the IoT chatter; can we create smart use cases in the network for this data? I remembered the exhilarating example of a fireman wearing a VR helmet connected via 5G, where the 3D floor plan is embossed in real time on his vision, using the high-speed, low-latency 5G network.
Shyamal Kumar, Founder, Lavelle Networks, live at DPDK Summit 2019, discussing Edge Computing, Packet Acceleration and more!
Towards the end of the panel, the most anticipated question came up:
Telcos and SIs are using open source code repos. But then the community does not always fix things in the timelines expected. What do we do?
- Well, at Lavelle Networks we have been very clear about this. Use open source software only when you can also build the right R&D team, one that can debug, patch and maintain that code for your project or product. Simply using open source software with the in-house capability to only integrate, compile or tweak the code is risky turf to be on when you are providing mission-critical services. Innovation cannot be free in a business life cycle. Service providers the world over are enamoured of white-label hardware and open source software, and while this is all good and will drive down TCO, it will also drastically reduce the innovation from network product vendors, as there is no money available to fund that innovation. This means Telcos will have to create innovative products themselves, like the web-scale folks (Amazon, Google, Facebook) who, by using open source software and commodity hardware, pretty much drove companies like IBM, Dell and HP out of the IT innovation ecosystem by refusing to pay them for anything and building everything on their own. I sometimes read the incumbent network equipment vendors and their executives ranting about how they are great backers of open source and commodity hardware, and I wonder how they plan to be innovative if no one is going to pay for anything except support.
- There is the business of technology innovation, and there is the business of open-source-based technology commoditisation. As a software product company, we embrace both worlds, but I don't know if we all have the answers for both. What is also troubling is that open source has become a way for competitors to fight battles they could not fight otherwise: I can jeopardise the commercial success of your technology by providing commercial support for the open source community version of your technology. There you go. Open source was the holy grail of software engineering; folks like me grew up admiring the brave, new world of open source, and I have read "The Cathedral and the Bazaar" by Eric S. Raymond so many times with acute excitement. We now stand at a crossroads, where we need to preserve the sheer elevated altitude of open source without it becoming a tool for underground warfare in commercial software.
But it's all good news at DPDK Summit 2019. Packet processing software is advancing in leaps and bounds, and the vision of a 100% software network is near, so very near. Since 2015, I have been building Lavelle Networks as a network software powerhouse, and rubbing shoulders with the best DPDK community stalwarts on a warm, sunny day at Intel's beautiful Adarsh Palm campus was awesome. Thank you, Team Intel DPDK. We look forward to the possibilities in 2020.