Industry analyst firm 451 Research recently published a Business Impact Brief on how monitoring tool sprawl, which has long been a pervasive problem for large enterprises and government institutions, is creating even more challenges in modern IT environments.
The report begins:
“The industry is trending toward the concept of observability as organizations aim to aggregate an array of operational data from their IT environments to generate deeper insights to monitor and optimize applications and infrastructure. However, achieving holistic observability is possible when the strategy is underpinned by effective monitoring and management practices, which is being complicated by the ongoing challenge of tool sprawl.”
Organizations of all sizes are plagued by monitoring tool sprawl. Over time, often decades, IT teams accumulate many monitoring tools. Some are designed to monitor specific technologies, such as servers, network gear, or storage systems. Even worse, some monitor only a specific vendor’s gear, such as Cisco switches or Dell servers. As new technologies have taken hold, tools specific to those environments have followed: cloud, ephemeral systems, hyperconverged infrastructure, and software-defined whatever. Sometimes business units or DevOps teams adopt their own monitoring tools. Over long periods, it’s typical for a company to amass 30 or more monitoring tools for its IT infrastructure.
The key problem here is that all of these monitoring tools are effectively silos. They’re silos of data that have the potential to provide rich context for resolving and even preventing IT outages. But that potential can only be fully realized when the data can be analyzed in aggregate. For example, when there is an issue with a disk in a storage array, the storage monitoring software may recognize that. But when that disk is one of hundreds of components making up the IT service that supports a given application, it is nearly impossible to ascertain that the failing disk is the root cause of a performance problem with that application. The only way to reliably pinpoint and prevent IT disruptions is to have service-level visibility, aggregating all monitoring data in combined context.
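As a rough illustration of what "combined context" could mean in practice, the sketch below groups alerts from separate monitoring tools under the IT service each affected component belongs to. Everything in it is hypothetical: the service map, the tool names, the component names, and the `alerts_by_service` helper are invented for this example and do not come from the report or from any particular product.

```python
from collections import defaultdict

# Hypothetical service map: the components that make up each IT service.
SERVICE_COMPONENTS = {
    "order-app": {"web-01", "db-01", "array-7/disk-12", "switch-3"},
    "hr-portal": {"web-02", "db-02", "switch-3"},
}

# Hypothetical alerts from siloed tools: (source tool, component, message).
ALERTS = [
    ("storage-monitor", "array-7/disk-12", "disk latency high"),
    ("apm", "web-01", "slow response time"),
    ("server-monitor", "web-02", "high cpu"),
]


def alerts_by_service(service_components, alerts):
    """Group alerts from separate tools under every service they touch."""
    grouped = defaultdict(list)
    for tool, component, message in alerts:
        for service, components in service_components.items():
            if component in components:
                grouped[service].append((tool, component, message))
    return dict(grouped)


if __name__ == "__main__":
    for service, items in alerts_by_service(SERVICE_COMPONENTS, ALERTS).items():
        print(service)
        for tool, component, message in items:
            print(f"  [{tool}] {component}: {message}")
```

Viewed per service, the storage alert and the application slowdown for the hypothetical "order-app" appear side by side, which is the kind of connection that is hard to make when each tool is consulted on its own. A real environment would also need a reliable, current service map, which this sketch simply assumes.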
A lesser issue, although probably a more widely understood one, is that monitoring tool sprawl inevitably drives up both cost and operational overhead, because every additional tool takes more time and expertise to manage.
Even when organizations recognize they have a problem, reinventing a monitoring strategy is a nontrivial task. However, things are reaching a breaking point. For most companies, IT is no longer there simply to support the employees who do the company’s business; IT organizations are there to drive the business. As companies across all industries digitize, the ability to troubleshoot and prevent IT outages has become a matter of business survival, and the inability to do so has become a death knell.
According to the report:
“One way that organizations are trying to counteract tool sprawl is to leverage existing vendors to help consolidate future tool choices. In data from the same VotE study, 83% of organizations indicated that they prefer to buy as many monitoring tools from a single vendor as possible. For many organizations, this preference is fueled by a desire to reduce complexity in their IT environment, meet emerging needs, take a monitoring approach focused on applications or services, and ultimately gain monitoring and incident response capabilities that are stronger than the sum of their parts or of equivalent siloed tooling.”
There is also the matter of more diverse personas using monitoring tools: IT Ops teams, DevOps engineers, site reliability engineers, and business executives, who need to understand the positive or negative impact that IT performance is having on customers. These personas can often use the same types of operational data for their own needs, but the data must be made useful to each persona beyond the silos, so that it provides a unified, contextual view of the IT landscape. A key inhibitor to gaining this context is monitoring tool sprawl.
Download the full 451 Research Business Impact Brief to get an in-depth look at how silos are more prevalent than ever, cloud native technologies are amplifying complexity, and tool consolidation is becoming a priority.