Testing alerts and alarms
About this task
After alarms and alert handlers are configured, verify that the server takes the appropriate action when an alarm state changes by manually increasing the severity of a gauge. Alarms and alerts can be verified with the status tool.
Steps
- Configure a gauge with dsconfig and set the override-severity property to critical. The following example uses the CPU Usage (Percent) gauge.

    $ dsconfig set-gauge-prop \
      --gauge-name "CPU Usage (Percent)" \
      --set override-severity:critical
- Run the status tool to verify that an alarm was generated with corresponding alerts. The status tool provides a summary of the server’s current state with key metrics and a list of recent alerts and alarms. The sample output has been shortened to show just the alarms and alerts information.

    $ bin/status

                  --- Administrative Alerts ---
    Severity : Time           : Message
    ---------:----------------:-----------------------------------------------
    Error    : 11/Aug/2016    : Alarm [CPU Usage (Percent). Gauge CPU Usage (Percent)
             : 15:41:00 -0500 : for Host System has
             :                : a current value of '18.583333333333332'.
             :                : The severity is currently OVERRIDDEN in the
             :                : Gauge's configuration to 'CRITICAL'.
             :                : The actual severity is: The severity is
             :                : currently 'NORMAL', having assumed this severity
             :                : Mon Aug 11 15:41:00 CDT 2016. If CPU use is high,
             :                : check the server's current workload and make any
             :                : needed adjustments. Reducing the load on the system
             :                : will lead to better response times.
             :                : Resource='Host System']
             :                : raised with critical severity
    Shown are alerts of severity [Info,Warning,Error,Fatal] from the past 48 hours
    Use the --maxAlerts and/or --alertSeverity options to filter this list

                              --- Alarms ---
    Severity : Severity   : Condition : Resource    : Details
             : Start Time :           :             :
    ---------:------------:-----------:-------------:-------------------------------
    Critical : 11/Aug/2016: CPU Usage : Host System : Gauge CPU Usage (Percent) for
             : 15:41:00   : (Percent) :             : Host System
             : -0500      :           :             : has a current value of
             :            :           :             : '18.785714285714285'.
             :            :           :             : The severity is currently
             :            :           :             : 'CRITICAL', having assumed
             :            :           :             : this severity Mon Aug 11
             :            :           :             : 15:49:00 CDT 2016. If CPU use
             :            :           :             : is high, check the server's
             :            :           :             : current workload and make any
             :            :           :             : needed adjustments. Reducing
             :            :           :             : the load on the system will
             :            :           :             : lead to better response times
    Shown are alarms of severity [Warning,Minor,Major,Critical]
    Use the --alarmSeverity option to filter this list
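
The check in the last step can also be scripted. The following is a minimal sketch, not part of the product: has_critical_alarm is a hypothetical helper name, and it assumes the status output keeps the layout shown in the sample above (an "--- Alarms ---" banner followed by rows whose first column is the severity). It reads text from standard input and succeeds only if a row beginning with "Critical" appears after the Alarms banner.

```shell
#!/bin/sh
# Hypothetical helper (not a product tool): exits 0 if the text on stdin
# contains a row starting with "Critical" after the "--- Alarms ---" banner,
# and exits 1 otherwise. Assumes the layout shown in the sample output.
has_critical_alarm() {
  awk '
    # Start inspecting rows only once the Alarms section begins.
    /--- Alarms ---/ { in_alarms = 1; next }
    # A data row whose first column is "Critical" marks a critical alarm.
    in_alarms && /^[[:space:]]*Critical[[:space:]]*:/ { found = 1 }
    END { exit(found ? 0 : 1) }
  '
}

# Example use against a server (assumes bin/status runs without prompting):
#   bin/status | has_critical_alarm && echo "critical alarm present"
```

Because the helper only reads standard input, it can also be run against saved status output, which makes it easy to check the pattern before pointing it at a server.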