Some in IT might wince when I say this, but having a modern IT network isn’t about the new and the shiny. It’s about the foundational.
Before an IT organization can pursue Software Defined Networking (SDN), Converged Infrastructure (CI), Digital Transformation, Platform as a Service (PaaS), or any other emerging technology, it needs a network in which its people, technology and processes support the foundational elements of its existing technology and services.
Three and a half years ago, my Global Networking Services team at Dell took on the challenge of modernizing and transforming Dell’s networking infrastructure to improve reliability, simplify the network and lower costs, all of which would set the stage for adopting new IT innovations. While it was an effort that had been attempted many times before, we succeeded in delivering a more stable and flexible network that is now poised to support Digital Transformation as the company integrates Dell’s and EMC’s networks.
Some of the factors that helped us get there may surprise you.
Patience, coffee and company investment
When I was hired by Dell as a network architect three and a half years ago, my initial task was to architect the integration of 16 smaller companies that Dell had acquired in a series of M&As. The acquired companies were functioning disparately from Dell and from each other. This created a lot of frustration and inefficiency for users and the business: separate email and HR systems, and countless workarounds that let users collaborate, but poorly. The companies also had inadequate backup and data recovery systems; security concerns; little in the way of capacity management processes; user and app performance issues; disparate content management solutions; and an inability to comply with Dell Corporate IT policies.
As I began the two-year effort to resolve these issues and integrate these companies’ systems with Dell’s IT network (a program internally named Red/Blue), it became apparent that there was an opportunity to address stability and performance issues in Dell’s global corporate network as well.
Dell IT was experiencing frequent network outages. In fact, IT was having to address several major incidents a week. Initially, there didn’t seem to be a pattern of any one thing causing the network outages. One incident might be Wide Area Network (WAN) or MPLS related; the next might be failed Local Area Network (LAN) switches or cabling. Defective load balancers. Wireless problems. It was just everything.
What was a clear pattern was the excessively long time it took to fix the outages. If a switch went down in our Dublin office, it could take considerable time to find the engineer who knew the password needed to fix it, and then still more time to address the problem if the service records or data center documentation were out of date, wrong or missing. Compounding this lack of basic knowledge management was a lack of basic operational processes. Follow-the-sun support, incident handovers and simple peer reviews for quality control all needed upgrading.
The right people, a plan and a defining moment
To stabilize the network, firefight outages and manage basic programs, I needed to hire people with the right skills. In addition to hiring new team members, I placed the right people and skillsets in the right leadership roles and then had them come up with strategies for their specific areas, such as network design, architecture, deployment and operational excellence. From there, we created a three-year network transformation plan, which received a good response and funding from senior leadership at Dell.
A key part of the transformation focused on knowledge management, documentation and quality control, including seemingly simple things like creating a password repository, documenting support processes, instituting follow-the-sun coverage and setting on-call notice procedures. All these things that some organizations take for granted were missing, and they were clearly important to our ability to provide timely support to the larger Dell organization. While these are not innovative technology strategies per se, they were definitely transformational. As we progressed, we were responding to outages within a few minutes rather than 30. We were becoming operationally slick.
We were gaining ground, but it wasn’t happening fast enough. It was a year and a half into the program and the outage incidents continued. People were getting tired.
Then something occurred that became a defining moment in the transformation. One morning in July 2015, an incident brought down manufacturing in several locations in the Far East and India for several hours. Management was concerned, and I didn’t have answers to their tough questions about why this kept occurring.
At this point, my choices for what to do next were limited. I issued an order to the engineers that no more changes could occur on the network without my express permission. We formed a peer review panel to look at the quality of any changes being proposed on the network. The results were staggering. During the first peer review session, all the proposed network changes presented were rejected due to quality problems. The second session also rejected all proposed changes. And the third session, after reviewing the quality of the five to ten changes proposed, approved only one.
I had made the false assumption that there was not a pattern to our incessant outages. The pattern was human error. Quality control and engineering best practice were our deficiencies.
We introduced proper ITIL (Information Technology Infrastructure Library) change management practices and review processes, mixed with extreme levels of technical rigor and scrutiny. Within two weeks, we had reduced network outages to single-digit occurrences per week. Eight weeks later, network outages were literally a thing of the past. As of today, we have achieved a 75 percent reduction in network incidents year over year.
That for me was when the transformation actually began. With quality standards and best practices in place and the network stabilized, we started to lay the network foundation for Dell’s future. We were able to go into the data center and introduce new equipment and topologies with a high degree of confidence and flawless execution, even on the most complex changes.
Other teams that had lacked confidence in the Network Team’s ability to react quickly began to trust it almost overnight. The Network Team is seen today as an enabler of the business, and a shining light of IT engineering best practice and discipline.
Bringing the network out of the past
With the network stabilized, we could then make the technology changes needed to bring our decade-old infrastructure design up to date. We built more robust connectivity, adding best-of-breed WAN technology, wireless capabilities and mobility technology at our Round Rock, Texas, campus and worldwide branches. We modernized our hardware platforms and introduced new software functionality such as Dynamic MultiPoint VPN to lower costs… We also modernized and simplified our LAN system, deploying only best-in-class Dell switches.
These upgrades were done on a rolling basis over the life of the project and, while the work is ongoing, we have reached critical mass and continue to develop our capabilities in the network to enable a more digital experience for our end users and customers.
We also undertook extensive work in our data centers to stabilize and modernize them, collapsing unnecessary layers and future-proofing our system with the introduction of Dell networking and Open Networking technology. We architected our data centers for SDN capabilities and modern data center architectures, such as spine-leaf topologies, paving the way for a true Software Defined Data Center (SDDC).
Augmented Reality, Virtual Reality, Artificial Intelligence, Machine Learning, Digital Transformation, Hybrid Cloud, CI/HCI and Blockchain all bring massive opportunities and challenges to Dell’s global network and our customers. I am confident in this: we are ready for the new wave of emerging technologies that will hit our network. And, of course, we are providing a stable network that delivers existing services to provide business value to our customers without fail.
Check out Stephen Stack’s Technology Breakout Session at the 2017 Dell World in Las Vegas on Networking Solutions for the Future-Ready Enterprise on May 8th at 3:00 to 4:00 p.m. and May 9th at 12:00 to 1:00 p.m.