Using Entuity for root cause analysis

ICMP availability monitoring
Network outages
Application availability
If a managed object becomes unavailable

Further information on specific circumstances

Introduction:

Entuity root cause analysis monitors the end-to-end delivery of IT as a service, whilst at the same time monitoring each of the infrastructure components that together make up that service. By integrating these monitoring capabilities, IT operations can isolate infrastructure problems and, at the same time, understand their impact on business activity.

Monitoring, incidents and events:

Root cause analysis extends the network monitoring capabilities of Entuity by alarming on both component and service failures. Entuity raises stateful alarms to the operator, which automatically track ongoing problems through to resolution. Root cause analysis focuses on the availability and latency (round-trip response time) of devices and applications:

ICMP availability monitoring:

Entuity ICMP availability monitoring pings IP addresses and maps these addresses to managed devices and ports so events and incidents are raised against devices and ports rather than IP addresses. Where Entuity does not manage the IP address, Entuity associates it with the first managed port that is downstream of that IP address and indicates that the actual cause of the failure is upstream of the port.
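
The mapping described above can be pictured with a short Python sketch. It is illustrative only and does not reflect how Entuity is implemented: the `MANAGED` inventory, the `attribute_failure` function and the sample addresses are all hypothetical, and the hop list is assumed to be ordered from the monitoring server outward.

```python
from typing import Optional

# Hypothetical inventory: managed IP address -> (device, port).
MANAGED = {
    "10.0.0.1": ("core-rtr", "Gi0/1"),
    "10.0.1.1": ("dist-sw", "Gi1/0"),
}


def attribute_failure(failed_ip: str, path: list[str]) -> Optional[dict]:
    """Attribute a failed IP address to a managed device and port.

    `path` is the hop list ordered from the monitoring server outward.
    """
    if failed_ip in MANAGED:
        device, port = MANAGED[failed_ip]
        return {"device": device, "port": port, "cause_upstream": False}
    if failed_ip not in path:
        return None
    # Unmanaged address: use the first managed hop downstream of it and
    # flag that the actual cause lies upstream of that port.
    for hop in path[path.index(failed_ip) + 1:]:
        if hop in MANAGED:
            device, port = MANAGED[hop]
            return {"device": device, "port": port, "cause_upstream": True}
    return None
```

In this sketch, a failure on an unmanaged address is still reported against a managed port, with a flag showing that the real fault is further upstream.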

Network outages:

For every network outage that is identified, Entuity uses data derived from its ICMP availability monitoring (traceroute) to identify the layer 3 network object closest to the Entuity server involved in the outage. Entuity can then raise Network Outage incidents and events on the object.
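
As a rough illustration of locating the object closest to the server, the following Python sketch walks a traceroute path from the server outward and returns the first hop that is not responding. The function name and data shapes are assumptions made for this example, not Entuity's internal logic.

```python
from typing import Optional


def closest_failed_hop(path: list[str], responding: set[str]) -> Optional[str]:
    """Return the non-responding hop nearest the monitoring server.

    `path` is the traceroute hop list ordered from the server outward;
    `responding` holds the hops that currently answer pings.
    """
    for hop in path:
        if hop not in responding:
            return hop
    return None  # every hop on the path responds, so there is no outage
```

Hops beyond the returned address may also be unreachable, but in this sketch they are treated as impacted by the outage rather than as its root cause.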

From Entuity v21.0 P03 upwards, the Network Outage incident and event details provide information on how many devices were affected by the outage, and the IP address of the root cause device. Note that this number does not include ports. The Impacted Objects field of the Event Details form displays the impacted devices and their IP addresses.

From Entuity v21.0 P03 upwards, you can also specify a parameter in entuity.cfg that lets you view the corresponding Network Outage incidents and events from the Incidents dashboard of impacted devices, and not just from the device on which the incident or event originated. Please see the includeImpactedRootCause parameter of the [events] section under entuity.cfg for further information.

You can also access Impacted Device Details via the right-click context menu of a Network Outage incident.

Application availability:

Entuity monitors application availability by testing the response of defined applications to its TCP connect request. Entuity considers an application as available if it can connect to the application’s open socket. By default, Entuity attempts to connect to monitored applications every two minutes.
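
A TCP connect test of this kind can be sketched in Python with the standard `socket` module. This is a minimal example of the general technique, not Entuity's implementation; the `check_application` name and the five-second timeout are assumptions chosen for illustration.

```python
import socket
import time


def check_application(host: str, port: int, timeout: float = 5.0):
    """Try a TCP connect; return (available, connect_time_in_seconds)."""
    start = time.monotonic()
    try:
        # The application counts as available if the connection succeeds.
        with socket.create_connection((host, port), timeout=timeout):
            return True, time.monotonic() - start
    except OSError:
        # Refused, timed out, unreachable or unresolvable: unavailable.
        return False, None
```

A scheduler would call a check like this once per polling interval (every two minutes by default, as stated above) and compare the result with the previous state.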

If a managed object becomes unavailable:

If a managed object becomes unavailable, Entuity can use the discovered route to determine at what point the network failed or degraded and then raise the appropriate events and incidents. Entuity can potentially raise these events and incidents:

After pinging the IP address, which occurs every two minutes:
- AvailMonitor High Latency and AvailMonitor Normal Latency.
- Network Outage. Entuity raises Network Outage events against three different network objects:
  - devices, when none of the IP addresses on the device are responding (node down).
  - ports, when Entuity determines that the outage is on a managed port.
  - IP addresses, when Entuity determines that the outage is at a point in the traceroute path not managed by the Entuity server.

  When Entuity raises a Network Outage event, Impacted displays a breakdown of how many devices, servers and applications are impacted by the root cause of the outage.

After the TCP connect to an application, which occurs every two minutes:
- AvailMonitor Application Unavailable and AvailMonitor Application Available.
- AvailMonitor High Latency Reaching Application and AvailMonitor High Latency Reaching Application Cleared.

Based on hourly rolled-up data, and so raised at most once an hour. These events also require thresholds to be set:
- AvailMonitor Falling Average Latency.
- AvailMonitor Low View Device Reachability and AvailMonitor Normal View Device Reachability.
- AvailMonitor Rising Average Latency.
- AvailMonitor Rising Trend in Average Latency.
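
Several of the events listed above come in raise and clear pairs, for example AvailMonitor High Latency and AvailMonitor Normal Latency. The Python sketch below shows one way such a stateful pair could behave: an event is emitted only when the state changes, not on every poll. The threshold value and the `latency_events` function are hypothetical, and Entuity's actual event logic may differ.

```python
from typing import Iterable, Iterator

RAISE_EVENT = "AvailMonitor High Latency"
CLEAR_EVENT = "AvailMonitor Normal Latency"


def latency_events(
    samples_ms: Iterable[float], threshold_ms: float
) -> Iterator[tuple[int, str]]:
    """Yield (sample_index, event_name) on each state transition."""
    high = False  # current alarm state
    for index, latency in enumerate(samples_ms):
        if latency > threshold_ms and not high:
            high = True
            yield index, RAISE_EVENT
        elif latency <= threshold_ms and high:
            high = False
            yield index, CLEAR_EVENT
```

With an assumed threshold of 100 ms, consecutive high samples produce a single raise event, followed by a single clear event once latency returns to normal.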

Further information on specific circumstances:

Please see the following articles for information regarding specific circumstances:
