Service Impact Update

In our new hybrid IT world, the word “service” may be the most overburdened term in use. It can be a combination of several applications, or it can be one of several workflows that make up an application. A vice president in charge of line of business “X” will probably define a service differently than a database administrator (DBA) would. For the most part, it’s a matter of perspective.

Regardless of how you define your services/applications, you have a pressing need to monitor them and come up with strategies to make sure they’re running smoothly. And if you’re that VP, you do want to keep tabs on the performance of the services you rely on, but you don’t want to be alerted that “Oracle Database R25” is down. All you care about is whether your service is down or is suffering from performance degradation. And if that service is critical to running your business, you want the root cause found and the issue fixed as quickly as possible.

Zenoss Service Impact, part of Zenoss Service Dynamics (ZSD), makes service-centric monitoring a reality. Instead of having to piece together different domains of your hybrid IT environment to keep abreast of problems or potential issues, you can view your services as a whole. It doesn’t matter whether the building blocks of your service are contained within your datacenter or spread across different geographies. Service Impact lets you organize these different pieces of infrastructure and classify them in ways that make sense to you, so that you can respond to problems before they impact your customers.

Get to the Root Cause

In a nutshell, Service Impact lets you define and track services from different rollups. It sits on top of ZSD Resource Manager, takes the data collected by Resource Manager and organizes it by service context.

I spoke with Chuck Priddy, senior product manager at Zenoss and a longtime software architect, about Service Impact. He said the value of Service Impact lies in the fact that different people within an organization need different models of information on an aggregated conceptual level.

Explained Chuck:

  • Think of it like a hierarchy. I’ve got a team that reports to a manager who reports to a director who reports to a vice president and so forth. Each of these people has objectives by which they’re being measured. If I’m the vice president, I’m not interested in how that team reporting to the manager is doing individually. I want to make sure this manager is holding true to his commitments [and] is on schedule for everything he is responsible for. But I want to know if anything at a lower level is causing a cascading effect. If one person on the team is late completing a project, that means the manager isn’t meeting his objectives. What is the root cause of this? One person down below slipped on this particular task.

Service Impact not only automates root-cause analysis (RCA) to help you move toward a resolution; it also helps you prioritize, or “triage.”
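To make the cascading idea concrete, here is a minimal sketch of root-cause lookup over a dependency graph. It is not how Service Impact is implemented and does not use any Zenoss API; the `DEPENDS_ON` map, the node names, and the `root_causes` function are all assumptions made up for illustration.

```python
from typing import Dict, List, Set

# Hypothetical dependency model: each node lists the nodes it depends on.
# The names are illustrative only and do not come from Zenoss.
DEPENDS_ON: Dict[str, List[str]] = {
    "e-commerce site": ["web tier", "customer purchasing service"],
    "customer purchasing service": ["database R43"],
    "web tier": [],
    "database R43": [],
}


def root_causes(node: str, down: Set[str]) -> Set[str]:
    """Return the lowest-level failed nodes that explain an impact on `node`."""
    causes: Set[str] = set()
    for dependency in DEPENDS_ON.get(node, []):
        causes |= root_causes(dependency, down)
    # A node counts as a root cause only when it is down itself
    # and nothing beneath it is already to blame.
    if not causes and node in down:
        causes.add(node)
    return causes


if __name__ == "__main__":
    print(root_causes("e-commerce site", {"database R43"}))
```

The vice president only looks at the top node; the walk down the graph is what surfaces the one component that slipped.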

Perform Effective Triage

Once you’ve isolated a problem, Service Impact helps you perform triage based on the impact that root cause is having on your business. For example, if you’re that DBA and you get events from Oracle databases R42 and R43, Service Impact can tell you which one supports a departmental print server and which one is attached to your e-commerce site, where an outage will cost the business $100,000 per minute of downtime. If R43 is attached to the latter, you’re going to fix that problem before you fix the other.

Said Chuck:

  • Database administrators are often not aware of which workflows or applications any one database is involved with, so if they get two different ones that are in trouble, they don’t know how to prioritize without having to call around or [use] a cheat sheet. With Service Impact, they can see, “Look, database R43 runs the customer purchasing service, so we’re going to work on that one first,” because they have that service-level perspective.
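The prioritization Chuck describes can be sketched as a simple sort by business exposure. This is an illustrative sketch, not Zenoss code: the `Service` class, the `SERVICES_BY_DATABASE` map, and the `triage` function are hypothetical, and the print server's cost figure is an invented placeholder (only the $100,000 per minute figure comes from the example above).

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Service:
    name: str
    cost_per_minute: int  # assumed cost of downtime, in dollars


# Hypothetical "cheat sheet" mapping each database to the services it supports.
SERVICES_BY_DATABASE: Dict[str, List[Service]] = {
    "R42": [Service("departmental print server", 10)],  # placeholder cost
    "R43": [Service("customer purchasing service", 100_000)],
}


def triage(databases_with_events: List[str]) -> List[str]:
    """Order troubled databases by the downtime cost of the services they support."""

    def exposure(database: str) -> int:
        services = SERVICES_BY_DATABASE.get(database, [])
        return sum(service.cost_per_minute for service in services)

    return sorted(databases_with_events, key=exposure, reverse=True)


if __name__ == "__main__":
    print(triage(["R42", "R43"]))
```

With the service mapping in hand, the DBA no longer has to call around: the database behind the purchasing service sorts to the top.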

Furthermore, Service Impact helps you determine whether your service is being affected, even if it isn’t yet experiencing an outage. ZSD integrates with IT service management offerings like ServiceNow so that you can follow your incident throughout the entire incident management lifecycle. Rather than constantly checking the status of every Oracle database, our DBA wants to focus on the service-level ramifications of problems with these databases.

Added Chuck:

  • There may be high availability/redundancy for this service, so it’s still up, but if it has two servers, and one of them is down, that service is vulnerable. You want to know that [the service] has been partially impacted and then be able to track it at all levels when you get into your incident and problem management [solution], so that you can figure out who’s going to work on it and estimate how long it’ll take to get repaired.
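The “still up, but vulnerable” condition can be expressed as a third state between up and down. The sketch below is a simplified assumption about how such a rollup might work, not the rule set Service Impact uses; `ServiceState` and `service_state` are hypothetical names.

```python
from enum import Enum
from typing import List


class ServiceState(Enum):
    UP = "up"
    AT_RISK = "at risk"  # still serving, but redundancy has been lost
    DOWN = "down"


def service_state(servers_up: List[bool]) -> ServiceState:
    """Roll the health of redundant servers up into a single service state."""
    if not servers_up:
        raise ValueError("a service needs at least one server")
    if all(servers_up):
        return ServiceState.UP
    if any(servers_up):
        return ServiceState.AT_RISK
    return ServiceState.DOWN


if __name__ == "__main__":
    # Two redundant servers, one of them down: the service is partially impacted.
    print(service_state([True, False]))
```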

Improved Change Management

Service Impact also helps you carry out change management processes by letting you see ahead of time which services rely on which hardware, so that those services aren’t inadvertently disrupted and the appropriate people sign off before the change occurs.

Imagine you’re the DBA, and it’s time to upgrade your Oracle databases from 11g to 12c. You can first check all the service contexts that each database is involved in.

Said Chuck:

  • You have this one mission-critical e-commerce site, so first of all, you’re going to submit a change management request to take down each of those services. If your vice president gets this change request for taking down the e-commerce site tonight, that will have a much different impact on his attention than telling him you want to take down database R43. As you plan your IT change, you need to know that you have the appropriate authorizations because that one database may underpin three services and may require three different sign-offs.
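The sign-off lookup in that scenario amounts to walking from a database to the services it underpins and then to each service's approver. Here is a rough sketch under invented data; the service names, approver titles, and the `required_signoffs` function are hypothetical and are not part of any Zenoss interface.

```python
from typing import Dict, List

# Hypothetical service contexts for each database (illustrative names only).
SERVICES_BY_DATABASE: Dict[str, List[str]] = {
    "R42": ["departmental print server"],
    "R43": ["e-commerce site", "order reporting", "customer support portal"],
}

# Hypothetical approver for changes that affect each service.
APPROVER_BY_SERVICE: Dict[str, str] = {
    "departmental print server": "office manager",
    "e-commerce site": "VP of e-commerce",
    "order reporting": "finance director",
    "customer support portal": "support director",
}


def required_signoffs(database: str) -> Dict[str, str]:
    """Map each service that relies on `database` to the person who must approve."""
    services = SERVICES_BY_DATABASE.get(database, [])
    return {service: APPROVER_BY_SERVICE[service] for service in services}


if __name__ == "__main__":
    for service, approver in required_signoffs("R43").items():
        print(f"{service}: needs sign-off from {approver}")
```

Phrasing the change request in terms of the affected services, rather than “database R43,” is what gets it in front of the right approvers.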

Latest Enhancements

You may be wondering why I’ve devoted a post specifically to Service Impact. After all, it has been around for as long as I’ve been blogging for Zenoss, and I’ve always been impressed by its functionality, so why didn’t I think to describe its benefits sooner?

Well, Zenoss has made some great architectural enhancements to Service Impact that allow it to deliver capabilities in a new, more flexible manner. They include:

  • Deployment flexibility – Service Impact can now be deployed on a different machine than Resource Manager, accommodating the need for remote deployments.
  • Better performance – Architectural enhancements enable 300% faster event processing and significantly reduce network overhead.
  • Improved transactional integrity – Databases within the solution are in constant sync to ensure transactional integrity.
  • Enhanced scalability – Service Impact benefits from architectural improvements that significantly increase scalability and maintain consistent performance at scale.
  • Training resources – Zenoss is expanding their self-help training resources, including video tutorials for customers.

If you haven’t gotten around to trying Service Impact, now is the time to do so. The impact it will have on your ability to deliver services could transform your business!


