IT Operations Are Still Key to Success.
Excitement | Today was an exciting day! You received your shiny new Converged Infrastructure package from your leading technology partner. There was a buzz surrounding the delivery, and everything was magical - the unboxing of the huge crate, the wheeling of this beautiful package into your datacenter, the plugging in and having all the servers, storage arrays and network switches instantly
You know your CI package is leading-edge technology that will bring unprecedented agility and efficiency to your users. Authorized users will be able to request and receive IT resources in minutes, not days, and business leaders will be thrilled by your ability to push out exciting new applications quickly. It was only four weeks ago that you approved the quote, and here it is in front of you.
And yet no champagne corks are hitting the ceiling.
Reality | As an IT professional, you know what the addition of this technology will mean to you and your newly formed Virtualization team. IT colleagues from various departments will be knocking down your door to have their applications running on your new toy. You will find yourself serving more customers and supporting more applications than you ever did before. But there will be more scrutiny. Your executives, who signed off on your million-dollar purchase, will be rooting for your success but will want reassuring justification for your decision. You have just acquired technology that will define your career path.
You cannot afford failures.
Challenge | You have been managing a server farm for years and know it inside and out. You have built and nurtured a team of top performers, and that is what got you the leadership role. But this is different. For starters, this is not just servers; it is the storage, the networking and the software all working together to deliver a workload-ready infrastructure. Suddenly the skills matrix has changed on you.
Unlike a traditional environment, where the relationships and dependencies are static (at least for long periods of time), a CI is designed so that the relationships between your workloads and your infrastructure are constantly changing. Additional VMs are spun up to support a temporary surge on a workload, additional storage is provisioned to store the data generated by this increased demand, and load balancers work tirelessly to keep the network traffic moving smoothly. But what if, in the middle of all this, a component goes awry?
Very quickly, the simplicity of the Converged Infrastructure can become the root of its complexity. This mix of blades, SAN storage and switches gives you an integrated system and delivers immense power in a compact form factor, but it does not provide you with the “nicely laid out wiring diagram” of the traditional infrastructure. (Below you will find a great video to convey this point to your non-technical brethren.)
How would you go about identifying the rogue component that is causing incidents all over your environment? To be sure, you have element-level management tools. In fact, you already had a plethora of them, and now you have just received another set from your CI provider. But they don’t interoperate, and they do not provide you with a holistic view of your environment. How will you determine which component might be causing a disruption in service? Even worse, how will you know if it is safe to take a component offline? Will taking out TOR Switch1 kill another mission-critical service? Unfortunately for you, there are no tell-tale cables that would set you on the right path. It is just you and your dashboard.
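To make the "Will taking out TOR Switch1 kill another service?" question concrete, here is a minimal sketch of the kind of dependency lookup a holistic view would give you. Everything in it is hypothetical: the component names, the service names and the dependency map are invented for illustration, and it assumes every dependency is a hard one (no redundant paths), which a real deployment would need to model.

```python
from collections import deque

# Hypothetical dependency map: each key depends directly on the items in its list.
# All names are made up for illustration; none come from a real environment.
DEPENDS_ON = {
    "order-entry": ["vm-app-01", "vm-db-01"],
    "payroll": ["vm-app-02"],
    "reporting": ["vm-app-03"],
    "vm-app-01": ["blade-1"],
    "vm-db-01": ["blade-1", "san-lun-7"],
    "vm-app-02": ["blade-2"],
    "vm-app-03": ["blade-3"],
    "blade-1": ["tor-switch-1"],
    "blade-2": ["tor-switch-1"],
    "blade-3": ["tor-switch-2"],
    "san-lun-7": ["tor-switch-2"],
}

# The business-facing services we care about (also hypothetical).
SERVICES = {"order-entry", "payroll", "reporting"}


def impacted_services(component, depends_on=DEPENDS_ON, services=SERVICES):
    """Return the services that directly or indirectly depend on `component`.

    Assumes every dependency is hard: losing a component takes down
    everything above it. Redundancy is deliberately not modeled here.
    """
    # Invert the map so we can walk upward from a component to its dependents.
    dependents = {}
    for item, deps in depends_on.items():
        for dep in deps:
            dependents.setdefault(dep, set()).add(item)

    # Breadth-first walk from the failing component up to the services.
    seen = {component}
    queue = deque([component])
    while queue:
        current = queue.popleft()
        for parent in dependents.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)

    return sorted(seen & services)


if __name__ == "__main__":
    print(impacted_services("tor-switch-1"))  # ['order-entry', 'payroll']
    print(impacted_services("blade-3"))       # ['reporting']
```

In this toy map, pulling `tor-switch-1` puts two services at risk while pulling `blade-3` touches only one. The hard part in a CI is that the map itself keeps changing as VMs and storage are provisioned, so a hand-maintained dictionary like this one would go stale quickly.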
As you continue to toil through the logs from your management tools, your customers will wait, patiently, at least at first. And the cost of downtime will continue to mount.
Without effective IT operations your CI dream can rapidly turn into a nightmare.
Downtime in a CI environment! | While a well-engineered Converged Infrastructure can provide a highly reliable foundation, the architecture of your specific deployment (high availability, disaster recovery, mirroring, etc.) will ultimately determine the amount of downtime you experience. The hourly impact of the downtime, however, remains unchanged. A recent study by Forrester Research indicated that downtime costs for most companies can run anywhere from $10,000 to $1 million per hour. In an upcoming post I will discuss ways to determine the downtime costs for your organization.
Given that a typical Converged stack will host multiple services, the impact of downtime will be effectively magnified.
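As a rough illustration of that magnification, the sketch below adds up the hourly downtime cost of every service hosted on one stack. The service names and dollar figures are invented (chosen only to sit within the $10,000 to $1 million per hour range quoted above), and the model assumes that costs are simply additive and that every hosted service is down for the whole outage.

```python
def stack_downtime_cost(hourly_costs, outage_hours):
    """Estimate the cost of an outage that takes down every hosted service.

    hourly_costs: mapping of service name -> downtime cost per hour (dollars)
    outage_hours: length of the outage in hours

    Assumption: costs add up linearly and all services are down for the
    full outage. Real impact depends on your architecture and workloads.
    """
    if outage_hours < 0:
        raise ValueError("outage_hours must not be negative")
    return sum(hourly_costs.values()) * outage_hours


# Hypothetical services sharing one converged stack (illustrative figures).
hosted = {"order-entry": 50_000, "payroll": 10_000, "reporting": 10_000}

if __name__ == "__main__":
    print(stack_downtime_cost(hosted, 2))  # 140000
```

A two-hour outage that would cost $100,000 for the order-entry service alone costs $140,000 once the other two services sharing the stack are counted.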
Solution | So what is needed to effectively manage your CI environment?
Well, with CI, for the most part things don’t change. All the core principles of IT still apply. What does change is the span of responsibility. Your team is no longer responsible for just servers, or networking, or storage; you are responsible for all of it. You can no longer set up Operating Level Agreements (OLAs) with your storage group and consider the matter addressed. This is all yours. And that means you need something that will give you a comprehensive view of your environment, not just pieces of it. This is no longer a challenge of finding yet another tool; it is about finding the right tool. As a recent study conducted by Forrester revealed, having too many tools within your environment can be a major impediment to productivity. You need to be mindful of tool proliferation.
Here are some best practices to manage your newly updated datacenter:
Look for an integrated monitoring solution, such as Zenoss Service Dynamics, so that you:
  Have a complete view of your environment to expedite problem resolution and reduce downtime
  Can hand monitoring to generalists, freeing up specialists to engage in high-value projects
  Leverage a real-time asset-service dependency model so you know which services are at risk
Check the uncontrolled proliferation of tools, because “Too Many Tools” can encumber productivity
Seek tools that can manage your disparate environment: the introduction of Converged Infrastructure may be the first step in that direction, but your legacy environment is not going away anytime soon. Make sure your solution can manage legacy, cloud and hybrid environments.
Look for tools that don’t require local agent deployment, so that you:
  Expedite deployment and scale more easily
  Simplify maintenance, such as applying patches and upgrading the solution
Identify solutions that are robust and flexible: as your environment grows, new workloads will be introduced, so look for a solution that is flexible and extensible
If IT management is not your core competence, look for cloud-based solutions: for organizations that seek to benefit from IT without taking on the overhead of managing it, look for solutions that are hosted in the cloud, such as Zenoss as a Service (ZaaS)
Converged Infrastructure is the exciting new consumption model for IT infrastructure: it is pre-tested and pre-built, and it arrives ready to run. It is not, however, immune to the laws of physics. It can only deliver value when it is up and running. Issues will surface, and without efficient IT operations those issues will linger. They will cause downtime, blown SLAs, irate customers and, finally, infuriated bosses. To see how Zenoss Service Dynamics or Zenoss as a Service can help you get control of your CI environment, please click here.