Service model definition process

The key to defining accurate service models is thorough dependency discovery. Environmental factors also influence the process, including the level of
  • organizational maturity (processes and tooling)
  • service life cycle automation (provisioning and management)
  • application and infrastructure standardization

Despite these factors, the following steps outline a repeatable process for defining accurate service models.

  1. Define the service to model.

    In Service Impact graphs, a service and its members are represented as service nodes. A service is defined by its boundaries and type.

    • A web service (for example, a human resources portal) or an IaaS platform (for example, an Amazon EC2 instance) depends on specific internal or external resources.
    • Software as a service (for example, ServiceNow) is defined by its deployment architecture.
  2. Define service models for subservices.

    In Service Impact graphs, subservices are represented as service nodes. (Service node roles are contextual.) A subservice has a direct relationship with one or more services. If the subservice fails or degrades, the service fails or degrades based on the propagation rules for that service context.

    Often, a subservice represents an infrastructure tier, such as a gateway or database service. At a lower level, a tier might be configured with redundant members for high availability. However, Zenoss recommends modeling subservices as single points of failure.

  3. Add device nodes, component nodes, logical nodes, and organizing groups to service and subservice nodes.

    After you define subservices, you can add the infrastructure resources that make up the subservices.

  4. Define global policies on subservices.

    Policies capture domain knowledge about services and enable the automated dependency tracking and root cause analysis (RCA) computation that make Service Impact so valuable. To facilitate reuse of subservices, use global policies instead of contextual policies wherever possible.

    Contextual policies are valuable, but their use cases are relatively rare because most deployment scenarios use shared resources. For example, a hypervisor hosts virtual machines that belong to separate services. One service member with a global policy can apply to all virtual machines, across all service models. The service relationship, not service ownership, is the key to determining the relevance of events and sending state data to parent members.

  5. Repeat steps 2 through 4 as needed.
  6. To identify gaps in monitoring processes, analyze failure scenarios.

    Are key measurement points of each member in the service model properly monitored? For example, are synthetic transactions in place for web servers? Are ping checks being performed against host operating systems? Service Impact can act only on events that flow through Resource Manager.

  7. Test failure scenarios.

    The zensendevent command generates synthetic events, which are invaluable for validating service relationships, policies, and even monitoring functions, before real events or event storms occur.

    For more information about zensendevent, refer to the Zenoss Resource Manager Administration Guide.
  8. Refine the service model.

    If a test or gap analysis reveals missing policies or service members, add them and test again.
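
Before building a model in Service Impact, it can help to draft it on paper or in a few lines of code. The following Python sketch is illustrative only and does not use any Zenoss API. The Node class, the monitored flag, and all node names are hypothetical, and the rule "a node takes the worst state among itself and its children" is a simplified stand-in for a global policy, not the actual Service Impact policy engine. The sketch walks through defining a service with subservices (steps 1 through 3), finding monitoring gaps (step 6), and rehearsing a failure scenario (step 7).

```python
from __future__ import annotations

from dataclasses import dataclass, field
from enum import IntEnum


class State(IntEnum):
    """Availability states, ordered so that a larger value is worse."""
    UP = 0
    DEGRADED = 1
    DOWN = 2


@dataclass
class Node:
    """Hypothetical service, subservice, or resource node."""
    name: str
    monitored: bool = True  # assumption: True if at least one check covers the node
    children: list[Node] = field(default_factory=list)
    own_state: State = State.UP

    def state(self) -> State:
        # Simplified stand-in for a global policy: report the worst state
        # found among this node and everything it depends on.
        return max([self.own_state] + [child.state() for child in self.children])


def unmonitored(node: Node) -> list[str]:
    """Gap analysis: list the names of nodes that no check covers."""
    found = [] if node.monitored else [node.name]
    for child in node.children:
        found.extend(unmonitored(child))
    return found


# Steps 1-3: define the service, its subservices, and their resources.
hypervisor = Node("hypervisor-01")
database = Node("database tier", children=[hypervisor])
gateway = Node("gateway tier", monitored=False)
portal = Node("HR portal", children=[gateway, database])

# Step 6: identify monitoring gaps.
gaps = unmonitored(portal)

# Step 7: rehearse a failure scenario and observe the service-level state.
hypervisor.own_state = State.DOWN
portal_state = portal.state()

print("Unmonitored nodes:", gaps)
print("Portal state after hypervisor failure:", portal_state.name)
```

In this draft, the gateway tier surfaces as a monitoring gap, and the hypervisor failure propagates up to the portal. A real model would be validated the same way in Service Impact, with synthetic events in place of the direct state assignment.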