Event management
Events, and the graphs generated from performance monitoring, are the primary operational tools for understanding the state of your environment.
Event fields
To enter the event management system, an event must contain values for the device, severity, and summary fields. Collection Zone rejects events that are missing any of these fields.
Basic event fields are as follows:
- Summary
- Device
- Component
- Severity
- Event Class Key
- Event Class
- Collector
Events include numerous other standard fields. Some control how an event is mapped and correlated; others provide information about the event.
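The required-field rule above can be illustrated with a small check. This is a minimal sketch, not the system's actual validation code: the function name and the dictionary representation of an event are assumptions made for illustration.

```python
# Fields that an event must carry to enter the event management system,
# as stated in the text above.
REQUIRED_FIELDS = ("device", "severity", "summary")


def missing_required_fields(event: dict) -> list:
    """Return the required fields that are absent or empty in the event.

    A severity of 0 (Clear) is a valid value, so only None and the empty
    string are treated as missing.
    """
    return [name for name in REQUIRED_FIELDS if event.get(name) in (None, "")]


# An event without a severity or summary would be rejected.
print(missing_required_fields({"device": "www.example.com"}))
```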
Device field
The device field is a free-form text field that allows up to 255 characters. Collection Zone accepts any value for this field. If the device field contains an IP address or a hostname, then the system will automatically identify and add the event to the corresponding device.
Collection Zone automatically adds information to incoming events that match a device. Fields added are:
- prodState - Specifies the device's current production state.
- Location - Specifies the location (if any) to which the device is assigned.
- DeviceClass - Classifies the device.
- DeviceGroups - Specifies the groups (if any) to which the device is assigned.
- Systems - Specifies the systems (if any) to which the device is assigned.
- DevicePriority - Specifies the priority assigned to the device.
Status field
The Status field defines the current state of an event. This field is often updated after an event has been created. Values for this numeric field are 0-6, defined as follows:
Number | Name | Description |
---|---|---|
0 | New | Initial state upon creation |
1 | Acknowledged | A user has seen and marked the event |
2 | Suppressed | A transform has suppressed the event |
3 | Closed | A user action has closed the event |
4 | Cleared | A corresponding clear event has cleared the event |
5 | Dropped | A transform has dropped the event, so the event is not persisted |
6 | Aged | Automatically closed because of the severity and last seen time values |
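When working with event data in scripts, the status values in the table can be represented as an enumeration. The sketch below is illustrative only; the class and helper names are assumptions, and the grouping of Closed, Cleared, and Aged as "closed" states follows the description in the auto-clear correlation section.

```python
from enum import IntEnum


class EventStatus(IntEnum):
    """Numeric status values from the table above."""
    NEW = 0
    ACKNOWLEDGED = 1
    SUPPRESSED = 2
    CLOSED = 3
    CLEARED = 4
    DROPPED = 5
    AGED = 6


# States that this document describes as closed (eligible for archiving).
CLOSED_STATES = frozenset(
    {EventStatus.CLOSED, EventStatus.CLEARED, EventStatus.AGED}
)


def is_closed(status: int) -> bool:
    """Return True if the numeric status is one of the closed states."""
    return EventStatus(status) in CLOSED_STATES
```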
Severity field
The following table maps event severity levels to their labels and colors.
Level | Label | Color |
---|---|---|
5 | Critical | Red |
4 | Error | Orange |
3 | Warning | Yellow |
2 | Info | Blue |
1 | Debug | Grey |
0 | Clear | Green |
Summary and message fields
The summary and message fields are free-form text fields. The summary field allows up to 255 characters. The message field allows up to 4096 characters. These fields usually contain similar data.
The system handles these fields differently, depending on whether one or both are present on an incoming event:
- If only summary is present, then the system copies its contents into message and truncates summary contents to 128 characters.
- If only message is present, then the system copies its contents into summary and truncates summary contents to 128 characters.
- If summary and message are both present, then the system truncates summary contents to 128 characters.
As a result, data loss is possible only if the message or summary content exceeds 65535 characters, or if both fields are present and the summary content exceeds 128 characters.
To ensure that enough detail can be contained within the 128-character summary field limit, avoid reproducing information in the summary that exists on other fields (such as device, component, or severity).
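The three cases above can be sketched as a single function. This is an illustration of the described behavior, not the system's implementation; the function name is hypothetical, and the 128-character limit is taken from the list above.

```python
SUMMARY_LIMIT = 128  # truncation length stated in the list above


def normalize_text_fields(event: dict, summary_limit: int = SUMMARY_LIMIT) -> dict:
    """Return a copy of the event with summary and message filled in.

    - Only summary present: copy it into message.
    - Only message present: copy it into summary.
    - In every case, truncate summary to summary_limit characters.
    """
    result = dict(event)
    summary = result.get("summary") or ""
    message = result.get("message") or ""
    if summary and not message:
        message = summary
    elif message and not summary:
        summary = message
    result["message"] = message
    result["summary"] = summary[:summary_limit]
    return result
```

Because the full text is copied into message before summary is truncated, a long summary that arrives alone is preserved in the message field.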
Other fields
The following table lists additional event fields.
Field | Description |
---|---|
dedupid | Dynamically generated fingerprint that allows the system to perform de-duplication on repeating events that share similar characteristics. |
component | Free-form text field (maximum 255 characters) that allows additional context to be given to events (for example, the interface name for an interface threshold event). |
eventClass | Name of the event class into which this event has been created or mapped. |
eventKey | Free-form text field (maximum 128 characters) that allows another specificity key to be used to drive the de-duplication and auto-clearing correlation process. |
eventClassKey | Free-form text field (maximum 128 characters) that is used as the first step in mapping an unknown event into an event class. |
eventGroup | Free-form text field (maximum 64 characters) that can be used to group similar types of events. This is primarily an extension point for customization. Currently not used in a standard system. |
stateChange | Last time that any information about the event changed. |
firstTime | First time that the event occurred. |
lastTime | Most recent time that the event occurred. |
count | Number of occurrences of the event between the firstTime and lastTime. |
prodState | Production state of the device, updated when an event occurs. This value is not changed when a device's production state is changed; it always reflects the state when the event was received by the system. |
agent | Typically the name of the daemon that generated the event. For example, an SNMP threshold event will have zenperfsnmp as its agent. |
DeviceClass | Device class of the device that the event is related to. |
Location | Location of the device that the event is related to. |
Systems | Pipe-delimited list of systems that the device is contained within. |
DeviceGroups | Pipe-delimited list of device groups that the device is contained within. |
facility | Only present on events coming from syslog. The syslog facility. |
priority | Only present on events coming from syslog. The syslog priority. |
ntevid | Only present on events coming from Windows event log. The NT Event ID. |
ownerid | Name of the user who acknowledged this event. |
clearid | Only present on events in the archive that were auto-cleared. The evid of the event that cleared this one. |
DevicePriority | Priority of the device that the event is related to. |
eventClassMapping | If this event was matched by one of the configured event class mappings, contains the name of that mapping rule. |
monitor | In a distributed setup, contains the name of the collector from which the event originated. |
In addition to the standard fields, events can carry an arbitrary number of additional name/value pairs that give them more context.
Deduplication
Collection Zone de-duplicates events based on each event's fingerprint, which the system calls the "dedupid." All of the standard events that the system creates as a result of its polling activities are de-duplicated, with no setup required. You can also apply de-duplication to events that arrive from other sources, such as syslog, SNMP traps, or a Windows event log.
The most important de-duplication concept is the fingerprint. An event's fingerprint (or dedupid) is composed of a pipe-delimited string that contains these event fields:
- device
- component (can be blank)
- eventClass
- eventKey (can be blank)
- severity
- summary (omitted from the dedupid if eventKey is non-blank)
When the component and eventKey fields are blank, a dedupid appears similar to:
www.example.com||/Status/Web||4|WebTx check failed
When the component and eventKey fields are present, a dedupid appears similar to:
router1.example.com|FastEthernet0/1|/Perf/Interface|threshName|4
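The construction of the fingerprint can be sketched as follows. This is a minimal illustration based only on the field list and the two examples above; the function name and signature are assumptions, not part of the product.

```python
def build_dedupid(device, event_class, severity,
                  summary="", component="", event_key=""):
    """Build a pipe-delimited fingerprint from the listed event fields.

    The summary is appended only when eventKey is blank, matching the
    field list above.
    """
    fields = [device, component, event_class, event_key, str(severity)]
    if not event_key:
        fields.append(summary)
    return "|".join(fields)


# Blank component and eventKey: the summary is part of the fingerprint.
print(build_dedupid("www.example.com", "/Status/Web", 4,
                    summary="WebTx check failed"))

# Component and eventKey present: the summary is omitted.
print(build_dedupid("router1.example.com", "/Perf/Interface", 4,
                    summary="ignored", component="FastEthernet0/1",
                    event_key="threshName"))
```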
When the system receives a new event, it constructs the dedupid. If the dedupid matches that of an active event, the system updates the existing event with the properties of the new occurrence, increments its count by one, and sets its lastTime field to the created time of the new occurrence. If the dedupid does not match any active event, the system inserts the new event into the active event table with a count of 1 and sets its firstTime and lastTime fields to the created time of the new event.
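The update-or-insert behavior can be sketched with an in-memory table keyed by dedupid. This is an illustration under simplifying assumptions: events are plain dictionaries, the key named `created` holds the occurrence time, and the function name is hypothetical.

```python
def process_event(active: dict, event: dict) -> dict:
    """Apply de-duplication to one incoming event.

    `active` maps dedupid -> stored event and stands in for the active
    event table.
    """
    dedupid = event["dedupid"]
    created = event["created"]
    existing = active.get(dedupid)
    if existing is None:
        # No match: insert with a count of 1.
        stored = dict(event, count=1, firstTime=created, lastTime=created)
        active[dedupid] = stored
        return stored
    # Match: take the properties of the new occurrence, but keep the
    # original firstTime and the running count.
    existing.update(
        {k: v for k, v in event.items() if k not in ("count", "firstTime")}
    )
    existing["count"] += 1
    existing["lastTime"] = created
    return existing


# Three identical events followed by one with a different fingerprint.
table = {}
for when in (100, 160, 220):
    process_event(table, {"dedupid": "web1||/Status/Web||4|down", "created": when})
process_event(table, {"dedupid": "web1||/Status/Web||5|down", "created": 280})
print(len(table))
```

In this run the table ends up with two entries: the first has a count of 3, and the second has a count of 1.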
The following illustration depicts a de-duplication scenario in which an identical event occurs three times, followed by one that is different in a single aspect of the dedupid fingerprint.
If you want to change the way de-duplication behaves, you can use an event transform to alter one of the fields used to build the dedupid. You also can use a transform to directly modify the dedupid field, for more powerful cross-device event de-duplication.
Auto-clear correlation
The auto-clearing feature is similar to the de-duplication feature. It also is based on the event's fingerprint. The difference is which event fields make up the fingerprint, and what happens when a new event matches an existing event's fingerprint.
All of the standard events created as a result of polling activities are auto-cleared, with no setup required. As with de-duplication, you would invoke auto-clearing manually only to handle events that come from other sources, such as syslog, a Windows event log, or SNMP traps.
The fields that make up the auto-clear fingerprint depend on whether a component UUID has been identified for the event:
- If component UUID exists:
  - component UUID
  - eventClass (including zEventClearClasses from event class configuration properties)
  - eventKey (can be blank)
- If component UUID does not exist:
  - device
  - component (can be blank)
  - eventKey (can be blank)
  - eventClass (including zEventClearClasses from event class configuration properties)
When a new event comes into the system with a special 0 (Clear) severity, Collection Zone checks all active events to see if they match the auto-clear fingerprint of the new event. All active events that match the auto-clear fingerprint are updated with a Cleared state, and the clearid field is set to the UUID of the clear event. After a configurable period of time, all events in a closed state (Closed, Cleared, and Aged) are moved from the active events table to the event archive.
If the clear event clears at least one active event, the clear event itself is also inserted into the active events table with a status of Closed; otherwise, it is dropped. This prevents extraneous clear messages from filling your events database.
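The matching process can be sketched as follows. This is a simplified illustration, not the system's implementation: events are plain dictionaries, the key names `component_uuid` and `evid` are assumptions, the zEventClearClasses values are passed in as `clear_classes`, and skipping events that are already closed is also an assumption.

```python
STATUS_CLOSED, STATUS_CLEARED, STATUS_AGED = 3, 4, 6  # from the Status table
SEVERITY_CLEAR = 0


def clear_key(event: dict) -> tuple:
    """Return the non-class part of the auto-clear fingerprint."""
    event_key = event.get("eventKey", "")
    if event.get("component_uuid"):
        return ("uuid", event["component_uuid"], event_key)
    return ("name", event["device"], event.get("component", ""), event_key)


def apply_clear(active_events: list, clear_event: dict, clear_classes=()) -> list:
    """Mark matching events as Cleared and return them.

    The clear event is stored with a Closed status only if it cleared
    something; otherwise it is dropped (not added to the list).
    """
    if clear_event["severity"] != SEVERITY_CLEAR:
        raise ValueError("not a clear event")
    classes = {clear_event["eventClass"], *clear_classes}
    key = clear_key(clear_event)
    closed = (STATUS_CLOSED, STATUS_CLEARED, STATUS_AGED)
    cleared = []
    for event in active_events:
        if event.get("status", 0) in closed:
            continue  # assumption: already-closed events are left alone
        if event["eventClass"] in classes and clear_key(event) == key:
            event["status"] = STATUS_CLEARED
            event["clearid"] = clear_event["evid"]
            cleared.append(event)
    if cleared:
        clear_event["status"] = STATUS_CLOSED
        active_events.append(clear_event)
    return cleared
```

Keeping the event class check separate from the rest of the key makes it straightforward to widen the match to the additional classes named in zEventClearClasses.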
The following illustration depicts a standard ping down event and its associated clear event.
If you need to manually invoke the auto-clearing correlation system, you can use an event transform to make sure that the clear event has the 0 (Clear) severity set. You also need to ensure that the device, component, and eventClass fields match the events you intend to clear.
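As a rough sketch of such a transform, the function below sets the Clear severity on an event whose summary indicates recovery and leaves the device, component, and eventClass fields untouched so that they continue to match the original problem event. The assumptions here are that the event is exposed as an object named `evt` with attribute access, that the recovery message begins with "service restored" (a hypothetical message format), and that the logic is wrapped in a function only so it can be run outside the system.

```python
from types import SimpleNamespace


def to_clear_event(evt):
    """Hypothetical transform logic: turn a recovery message into a clear event."""
    if evt.summary.lower().startswith("service restored"):
        evt.severity = 0  # 0 (Clear) triggers auto-clear correlation
    return evt


# Stand-in event object for local experimentation.
evt = SimpleNamespace(
    device="web1.example.com",
    component="httpd",
    eventClass="/App",
    summary="Service restored: httpd",
    severity=3,
)
to_clear_event(evt)
print(evt.severity)
```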
Tip
To prevent inadvertently clearing a wider range of events than intended, avoid making clear events too generic.
Creating events manually
Manually-created events are useful for testing event mappings, event transforms, and triggers/notifications.
Follow these steps:
- Navigate to EVENTS > EVENT CONSOLE.
- Click the Add an event icon.
- In the Create Event dialog box, add event information.

  Field | Required? | Description |
  ---|---|---|
  Summary | Yes | A text message describing the event. |
  Device | Yes | The IP address or hostname of a device; the subject of the event. |
  Component | No | A device component contained in the subject of the event. |
  Severity | Yes | The event severity level. |
  Event Class Key | No | The event class key. |
  Event Class | No | The event class. |
  Collector | Yes | The Zenoss Cloud collector that contains the subject of the event (the device). |
- Click Submit.
Event class mappings are applied only for events that do not already have an event class.
Event sources and classes
Events enter the system as follows:
- Generated events are created as a result of active polling.
- Captured events are transmitted by external actions into the system.
Generated events
The following standard services generate events. They automatically perform appropriate de-duplication and auto-clearing.
- zenping - Ping up/down events
- zenstatus - TCP port up/down events
- zenperfsnmp - SNMP agent up/down events, threshold events
- zencommand - Generic status events, threshold events
- zenprocess - Process up/down events, threshold events
- zenwin - Windows service up/down events
Captured events
Captured events are those events that the system does not specifically know will occur in advance. De-duplication is performed on these events, but might require tuning. By default, no auto-clearing is done on captured events. Event transforms must be used to create the auto-clear correlations.
The following services collect captured events:
- zensyslog - Events created from syslog messages.
- zentrap - Events created from SNMP traps and informs.
ZenPacks that you install might include their own services.
Event classes
Event classes are a simple organizational structure for the different types of events that the system generates and receives. This organization is useful for driving alerting and reporting. For example, by filtering on the /Status/Web event class, you can create an alerting rule that sends you an email or pages you when the availability of a Web site or page is affected.
Following is a subset of the default event classes. You can create additional event classes as needed.
- /App - Application-related events.
- /Change - Events created when the system finds changes in your environment.
- /Perf - Used for performance threshold events.
  - /Perf/CPU - CPU utilization events
  - /Perf/Memory - Memory utilization or paging events
  - /Perf/Interface - Network interface utilization events
  - /Perf/Filesystem - File system usage events
- /Status - Used for events affecting availability.
  - /Status/Ping - Ping up/down events
  - /Status/Snmp - SNMP up/down events
  - /Status/Web - Web site or page up/down events
- /ZenossRM - Collection Zone system events, including key metrics for nodes in the metric collection, event generation, and modeling processes
For more information, see Configuration properties.