Monitoring of OSI Layer 2 networking infrastructure.
This ZenPack provides support for modeling OSI Layer 2 (data link layer) topology. That topology information is used to suppress events from devices that became unreachable because they are connected to failed devices. Data collection is performed using SNMP.
The features added by this ZenPack can be summarized as follows. They are each detailed further below.
Assigning the zenoss.snmp.CDPLLDPDiscover modeler plugin to device(s) will result in SNMP discovery of neighbor switches using a combination of CISCO-CDP-MIB and LLDP-MIB. The discovered neighbor switch information will be shown as Neighbor Switches in the device's component list.
Assigning the zenoss.snmp.ClientMACs modeler plugin to device(s) will result in SNMP discovery of the device's forwarding tables using BRIDGE-MIB. This information will be stored on existing Network Interfaces, and won't result in any new components being created.
This ZenPack performs no monitoring.
This ZenPack supports two types of event suppression.
We will use the term symptomatic event to refer to events that are a symptom of a problem, but not the root cause.
Suppression of ping events can be enabled on a per-device or device class basis by enabling the zL2SuppressIfPathsDown configuration property. This mode of suppression requires that the zenoss.snmp.ClientMACs modeler plugin be enabled and successfully functioning on all network devices such as switches and routers that you believe could be a root cause of other monitored devices becoming unreachable.
There are two ways symptomatic ping events can be suppressed: by manually configuring the ultimate gateway(s) of the device(s) using the zL2Gateways property, or by leaving zL2Gateways empty and setting the zL2PotentialRootCause property appropriately so that the gateway(s) can be discovered automatically.
The diagram above depicts a common data center network topology in which each rack has a redundant pair of access switches, sometimes referred to as top-of-rack switches. Each of those top-of-rack switches connects to a redundant pair of end-of-row switches, and each of those end-of-row switches connects to a redundant pair of core switches for the data center. The pair of core switches might then connect to a pair of gateway routers that connect the data center to the Internet, or to other data centers over private links. In this kind of topology, the layer 3 gateway for hosts is often the core switches.
In this type of topology the gateways for host-1-1-1 can be automatically discovered to be rack-1-1a and rack-1-1b if zL2PotentialRootCause is enabled for the switches, and disabled for the hosts. zL2PotentialRootCause should be enabled for devices that could potentially be a root cause for other devices becoming unreachable, and disabled for devices that cannot be a root cause. This property is important to prevent root caused events from incorrectly being suppressed.
By relying on this automatic discovery of gateways, we can only achieve suppression of events from the hosts. In the case of an entire data center outage, all of the host events would be suppressed, but all of the rack, row, core, and gateway events would remain unsuppressed, and identifying the gateways as the root cause would be left as a manual task.
To achieve multi-hop suppression the zL2Gateways property must be configured. Despite the name of the property containing "L2", the configured gateways need not be restricted to the layer 2 gateways. In the example topology above, the best value for zL2Gateways would likely be gw-a and gw-b (one per line). It's important to use the Zenoss device id(s) for the gateways, and to enter one per line in zL2Gateways. There's no limit to the number of gateways, but more than two probably isn't realistic.
With zL2Gateways set to gw-a and gw-b in the above topology, a complete failure of the data center would result in all events being suppressed except for two: a ping failure on each of gw-a and gw-b. This assumes that zL2SuppressIfDeviceDown is enabled. See ''Non-Ping Event Suppression'' below for more information on zL2SuppressIfDeviceDown.
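The suppression decision described above can be sketched as a reachability check over the discovered topology. This is an illustrative approximation only, not the ZenPack's actual implementation; the function name and topology representation are hypothetical:

```python
from collections import deque

def suppress_ping_event(device, gateways, links, down):
    """Return True if a ping failure event for `device` should be
    suppressed: every path from the device to the configured
    zL2Gateways devices is blocked by a device that is already down,
    so the failure is a symptom rather than the root cause.

    `links` is an undirected adjacency dict of device ids,
    `gateways` and `down` are sets of device ids.
    """
    if device in gateways:
        return False  # a gateway failure is never suppressed
    seen = {device}
    queue = deque([device])
    while queue:
        node = queue.popleft()
        for neighbor in links.get(node, ()):
            if neighbor in down or neighbor in seen:
                continue
            if neighbor in gateways:
                return False  # a gateway is still reachable
            seen.add(neighbor)
            queue.append(neighbor)
    return True  # no gateway reachable: suppress the symptomatic event
```

In the topology above, with both rack-1-1a and rack-1-1b down, the ping event for host-1-1-1 would be suppressed, while the events for the rack switches themselves would not, because a gateway remains reachable from them.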
Suppression of non-ping events can be enabled on a per-device or device class basis by enabling the zL2SuppressIfDeviceDown configuration property. No other configuration or modeling is necessary. Events will only be suppressed for a device with this property enabled when they have a new, acknowledged, or suppressed critical event in the /Status/Ping event class. This suppression is effective at reducing the potential clutter of symptomatic events when a device is no longer reachable on the network either because it has failed, or because the Zenoss collector is no longer able to reach it.
This suppression can be used together with ping event suppression for the most complete reduction of symptomatic event clutter.
All forms of event suppression as described above have a cost in terms of event processing performance. When zL2SuppressIfDeviceDown is enabled, there is a small additional overhead for processing all events. When zL2SuppressIfPathsDown is enabled and first-hop suppression is performed using either automatic gateway discovery or manual gateway configuration, there is another small overhead for processing ping failure events.
In worst-case scenario testing, the effective processing rate for non-ping events with zL2SuppressIfDeviceDown enabled is approximately 80% of the normal rate, 75% for processing ping failure events in the case of a first-hop switch failure, and 70% in the case of a third-hop gateway failure.
All suppression is performed by an event plugin executed within zeneventd processes. Given that zeneventd can be scaled by adding more workers/instances, this additional event processing overhead can be offset by running more zeneventd instances as event processing throughput needs require.
In order to achieve acceptable event processing performance, a variety of caches are used within zeneventd processes. These caches can lead to events not being suppressed in some cases when the configuration, model, or status of devices is coming from stale cache information. The following types of caches are used with different timeouts.
The network map can be used to see connections between devices. It can be found in two places. The first is under Infrastructure -> Network Map, where you can manually select the device from which to draw the network map. The second is on individual devices, by clicking ''Network Map'' in the device's left navigation pane; this presents a network map centered on the current device.
There are several controls that can be used to filter and otherwise control what you see on the network map. You must click the "Apply" button after adjusting any of these controls to see the resulting network map.
The network map must start with a node from which connections can be followed. Setting the "Root device or component" is what allows that starting node to be chosen.
The maximum number of hops controls how many hops outward from the root node will be followed. This is the primary mechanism to reduce the size of the resulting network map.
The "Show MAC addresses" option allows more detail to be seen about layer2 connections at the expense of a much busier map. When "Show MAC addresses" is not selected, the map will attempt to consolidate bridge domains into a single cloud node that connects all nodes in the bridge domain. This emulates what you see with layer3 networks. When "Show MAC addresses" isn't selected, individual MAC address nodes used to make connections from node to node will be shown. These MAC addresses can often be clicked to link directly to the network interface associated with the MAC address.
The "Show dangling connections" option allows connector-type nodes such as MAC addresses and IP networks that don't connect other nodes to be displayed. By default these are filtered out to prevent the network map from being cluttered by MAC addresses and IP networks that are only connected to a single device.
Note: The network map will only display a maximum of 1,000 nodes to avoid performance issues both on the Zenoss server and in the web browser. If you attempt to view a network map with more than 1,000 nodes, an error message will appear to inform you that the map contains too many nodes and that you should adjust the filters.
The network map can be filtered by layers. Layers are tags that Zenoss automatically adds to each link between devices and components. For example, when Zenoss identifies that a host is connected to a switch, it will create nodes and links such as the following.
(host) -> (host MAC address) -> (switch MAC address) -> (switch)
Each of the arrows above represents a link, and in this case each of those links will have the "layer2" tag.
In the same way, if Zenoss identifies that a host is on the same IP network as a router that's its default gateway, it will create nodes and links such as the following.
(host) -> (192.0.2.0/24) -> (router)
Each of the arrows above represents a link, and in this case each of those links will have the "layer3" tag.
These layers can be used to filter the network map to just the kind of links you're interested in.
The VLAN and VXLAN layers have special handling. If any VLAN or VXLAN layer is selected, the layer2 layer will automatically be included. This is done because you likely wouldn't see the VLAN or VXLAN layer(s) chosen without also following layer2 links.
The selected layers operate as an "OR" filter on the map. Choosing the layer2 and layer3 layers will cause all nodes to be displayed that have at least one of the selected filters. There is currently no support for "AND" filters, or negations.
Different colors and shapes are used on the network map to convey information about the nodes and links on the map.
The fill color of each node's circle depends on the highest severity event currently open on the node. The colors only differ from Zenoss' normal event colors for info, debug, and clear severity events for higher clarity on the map.
The map's current root node will be circled with a purple band.
The links between nodes each have a color and a shape.
You can interact with the map using your pointer in a number of ways.
The map can be panned by clicking and dragging on the map's background. Each node can be moved by clicking and dragging the node. Panning the map won't cause nodes to reorganize, but moving nodes will.
Scrolling, pinching, or mouse-wheeling can all be used to zoom in and out.
Left-clicking on a node will navigate to that node's default page in Zenoss. This only works for nodes that have a page in Zenoss such as devices, components, IP networks, and some MAC addresses. Nothing will happen if a node with no default page is left-clicked.
Right-clicking a node will open its context menu. See below for node context menu details.
Each node on the network map can be right-clicked to open its context menu. Some of the following options may be available depending on the node.
The "Pin Down" option freezes the selected node in place on the network map. It will stay wherever you place it, and any unpinned nodes will reorganize around it.
Choosing "Put Map Root Here" is equivalent to changing the "Root device or component" option, but saves typing when you see the node you want to be the center on the map. Some types of nodes such as MAC addresses can't be the root.
The "Device Info" option opens a small pop-up over the network map with more information about the selected node. This option is only available for device and component nodes.
The "Open Node in a New Tab" option will open another tab in your browser to the default Zenoss page for the selected device, component, or IP network. Some types of nodes such as MAC addresses can't be opened in a new tab.
The zenmapper daemon updates the catalog with the connections used by the network map. It runs every 5 minutes by default; this can be changed by passing the desired number of seconds to the --cycletime argument.
By default, zenmapper is configured to start 2 workers. This can be changed by setting the "workers" option in its configuration file. Consider using more than 2 workers if you have more than 1,000 devices monitored in your Zenoss system. In a small or test environment, workers can be disabled by setting the value to 0. This setting affects both the memory used by zenmapper and the speed of indexing L2 connections.
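As an illustration, a zenmapper configuration might look like the following. The values are examples only, and the option names are assumed to match the --cycletime and workers options described above:

```
# /opt/zenoss/etc/zenmapper.conf (illustrative values)

# Rebuild the connections catalog every 10 minutes instead of the
# default 5 minutes (value is in seconds).
cycletime 600

# Use more workers for environments with more than 1,000 devices;
# set to 0 to disable workers in a small or test environment.
workers 4
```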
zenmapper connects to the ZODB and indexes all the connections provided from providers in ZODB catalog. On 4.2.x RM, running zenmapper on the remote collectors will do nothing because zenmapper runs against the hub. If desired, the additional zenmapper can be disabled by updating /opt/zenoss/etc/daemon.txt on the remote collector.
Imagine, for example, that we want to display connections of VMware NSX components (modeled by the NSX ZenPack) on the network map.
We need to create a new class, called for example NSXConnectionsProvider, which inherits from BaseConnectionsProvider, like this:
# Our provider will inherit from this:
from ZenPacks.zenoss.Layer2.connections_provider import BaseConnectionsProvider
# and will yield instances of this:
from ZenPacks.zenoss.Layer2.connections_provider import Connection


class NSXConnectionsProvider(BaseConnectionsProvider):
    def get_connections(self):
        # self.context is the entity for which we provide connections.
        # Our device is called NSXManager, and it has switches.
        for switch in self.context.nsxvirtualSwitchs():
            # Yield connections from the manager to its switches.
            yield Connection(self.context, (switch, ), ('layer3', 'nsx'))

            # Each switch has interfaces:
            for i in switch.nsxinterfaces():
                # Yield a connection from the switch to the interface.
                yield Connection(switch, (i, ), ('layer3', 'nsx'))

                # And each interface has a many-to-one connection to edges:
                yield Connection(i, (i.nsxedge(), ), ('layer3', 'nsx'))
So, we have described how to get connections; now we need to tell Zenoss that this is the connections provider for NSXManager devices. We do that by registering an adapter in our ZenPack's configure.zcml:
<configure zcml:condition="installed ZenPacks.zenoss.Layer2.connections_provider">
    <!-- Register these adapters only when the connections_provider module
         can be imported (which means a recent enough version of Layer2
         is installed). -->
Another way to include the adapters is to put them in a separate file, called for example layer2.zcml:
<?xml version="1.0" encoding="utf-8"?>
and then include that file conditionally:
<include file="layer2.zcml"
         zcml:condition="installed ZenPacks.zenoss.Layer2.connections_provider" />
To test the connections that your provider yields, you can run
zenmapper run -v10 -d <name or id of your modeled device>
and then inspect the result on the network map.
This ZenPack has the following special circumstances that affect its installation.
If you are re-installing or updating this ZenPack on Zenoss 5.0, first check in Control Center that the zenmapper daemon is stopped, and stop it if it is not. It should be stopped automatically, but until this issue is fixed, you should do so by hand.
Open vSwitch ZenPack versions prior to 1.1.1 should be updated or removed before installing the Layer2 ZenPack.
This ZenPack has two separate capabilities. The first is to collect the clients connected to switch ports so that event suppression can be performed when the switch fails; the second is to discover neighbor relationships between network devices using CDP (Cisco Discovery Protocol) and LLDP (Link Layer Discovery Protocol).
To enable discovery of clients connected to switch ports, you must enable the zenoss.snmp.ClientMACs modeler plugin for the switch devices. There is no need to enable this plugin for hosts, servers, or other endpoint devices. It is recommended to only assign the modeler plugin to the access switches to which monitored servers are connected.
The discovery is done using BRIDGE-MIB forwarding tables, so it's a prerequisite that the switch supports BRIDGE-MIB.
To collect neighbor information from network devices that support CDP or LLDP, you must enable the zenoss.snmp.CDPLLDPDiscover modeler plugin for the devices.
When combined with the Zenoss Service Dynamics product, this ZenPack adds built-in service impact capability based on Layer 2 data. The following service impact relationships are automatically added. These will be included in any services that contain one or more of the explicitly mentioned entities.
If the index for a certain device is broken, you can force zenmapper to reindex that specific device by running the daemon with the --force option.
Let's discuss Layer2 connections in particular.
The essential mechanism that distinguishes network switches from network hubs is the MAC forwarding table. Instead of broadcasting incoming link-layer frames to all of its interfaces, as a hub does, a switch looks in the forwarding table to find out which particular interface is connected to the destination device. The switch learns which devices are connected to which interface by looking at the source MAC address of incoming frames. Those MAC addresses are called "client MAC addresses".
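The learning behavior described above can be sketched in a few lines of Python. This is a conceptual illustration only, not ZenPack code:

```python
class ForwardingTable:
    """Minimal sketch of how a switch learns client MAC addresses.

    A real switch also ages entries out of the table (see
    dot1dTpAgingTime below, 300 seconds by default); aging is
    omitted here for brevity.
    """

    def __init__(self):
        self.table = {}  # client MAC address -> interface name

    def learn(self, src_mac, interface):
        # The source MAC of an incoming frame tells the switch which
        # interface that client is reachable through.
        self.table[src_mac] = interface

    def forward(self, dst_mac):
        # Known destination: send out of exactly one interface.
        # Unknown destination: flood to all interfaces, which is what
        # a hub does for every frame.
        return self.table.get(dst_mac, "flood")
```
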
For Zenoss to discover a Layer 2 connection between two devices, the MAC address of an interface on one device must appear as a client MAC address on an interface of the other device. You can check whether client MAC addresses are modeled for an interface by looking at its ''Clients MAC addresses'' display. If there are none, check that the zenoss.snmp.ClientMACs modeler plugin is bound to the device, and remodel the device.
It is also possible that the MAC address required to discover the connection is missing from the forwarding table. To check this, you can run the debug utility bridge_snmp.py:
python bridge_snmp.py clientmacs -c <community_string> <host>
and see whether your client MAC address is visible on the switch at all.
Records in the forwarding table age out quickly: by default, in 5 minutes. So when there has been no network activity on a connection for more than 5 minutes, its entry will be removed from the switch's forwarding table. You can check the dot1dTpAgingTime object to learn the exact timeout period in seconds:
$ snmpget -v2c -c <community_string> <host> 1.3.6.1.2.1.17.4.2.0
SNMPv2-SMI::mib-2.17.4.2.0 = INTEGER: 300
This ZenPack also adds impact relations for layer2 connections: switches impact the devices connected to them. This works only when the connection is present on the network map (see the two previous sections for guidance on troubleshooting that).
If the connection appears on the network map but there is still no impact relation, the impact relations were probably not rebuilt. You can trigger a rebuild by reindexing the device, for example by changing some field on its overview screen and saving it, or by remodeling the device.
There is no client MACs data on interfaces modeled for the first time. This happens because the zenoss.snmp.ClientMACs plugin runs before the interfaces are modeled by another network modeler plugin (for example cisco.snmp.Interfaces or zenoss.snmp.InterfaceMap), so there are no entities on which to save this attribute. It is currently not possible to define the order of modeler execution, so this remains a limitation.
A possible workaround is to wait for the next model cycle, or simply remodel the device again manually.
Cisco UCS infrastructure will only add layer 2 (Ethernet or MAC address) connections to the network map. Layer 3 (IP) connections will not exist. This is scheduled to be fixed (ZPS-2465) in version 2.6.3 of the Cisco UCS ZenPack.
If you cannot find the answer in the documentation, then Resource Manager (Service Dynamics) users should contact Zenoss Customer Support. Core users can use the #zenoss IRC channel or the community.zenoss.org forums.
Installing this ZenPack will add the following items to your Zenoss system.
This ZenPack is developed and supported by Zenoss Inc. Contact Zenoss to request more information regarding this or any other ZenPacks.