Troubleshooting Guide
Chapter 7 - Maintenance Troubleshooting

Table Of Contents

Maintenance Troubleshooting

Introduction

Maintenance Events and Alarms

MAINTENANCE (1)

MAINTENANCE (2)

MAINTENANCE (3)

MAINTENANCE (4)

MAINTENANCE (5)

MAINTENANCE (6)

MAINTENANCE (7)

MAINTENANCE (8)

MAINTENANCE (9)

MAINTENANCE (10)

MAINTENANCE (11)

MAINTENANCE (12)

MAINTENANCE (13)

MAINTENANCE (14)

MAINTENANCE (15)

MAINTENANCE (16)

MAINTENANCE (17)

MAINTENANCE (18)

MAINTENANCE (19)

MAINTENANCE (20)

MAINTENANCE (21)

MAINTENANCE (22)

MAINTENANCE (23)

MAINTENANCE (24)

MAINTENANCE (25)

MAINTENANCE (26)

MAINTENANCE (27)

MAINTENANCE (28)

MAINTENANCE (29)

MAINTENANCE (30)

MAINTENANCE (31)

MAINTENANCE (32)

MAINTENANCE (33)

MAINTENANCE (34)

MAINTENANCE (35)

MAINTENANCE (36)

MAINTENANCE (37)

MAINTENANCE (38)

MAINTENANCE (39)

MAINTENANCE (40)

MAINTENANCE (41)

MAINTENANCE (42)

MAINTENANCE (43)

MAINTENANCE (44)

MAINTENANCE (45)

MAINTENANCE (46)

MAINTENANCE (47)

MAINTENANCE (48)

MAINTENANCE (49)

MAINTENANCE (50)

MAINTENANCE (51)

MAINTENANCE (52)

MAINTENANCE (53)

MAINTENANCE (54)

MAINTENANCE (55)

MAINTENANCE (56)

MAINTENANCE (57)

MAINTENANCE (58)

MAINTENANCE (59)

MAINTENANCE (60)

MAINTENANCE (61)

MAINTENANCE (62)

MAINTENANCE (63)

MAINTENANCE (64)

MAINTENANCE (65)

MAINTENANCE (66)

MAINTENANCE (67)

MAINTENANCE (68)

MAINTENANCE (69)

MAINTENANCE (70)

MAINTENANCE (71)

MAINTENANCE (72)

MAINTENANCE (73)

MAINTENANCE (74)

MAINTENANCE (75)

MAINTENANCE (76)

MAINTENANCE (77)

MAINTENANCE (78)

MAINTENANCE (79)

MAINTENANCE (80)

MAINTENANCE (81)

MAINTENANCE (82)

MAINTENANCE (83)

MAINTENANCE (84)

MAINTENANCE (85)

MAINTENANCE (86)

MAINTENANCE (87)

MAINTENANCE (88)

MAINTENANCE (89)

MAINTENANCE (90)

MAINTENANCE (91)

MAINTENANCE (92)

MAINTENANCE (93)

MAINTENANCE (94)

MAINTENANCE (95)

MAINTENANCE (96)

MAINTENANCE (97)

MAINTENANCE (98)

MAINTENANCE (99)

MAINTENANCE (100)

MAINTENANCE (101)

MAINTENANCE (102)

MAINTENANCE (103)

MAINTENANCE (104)

MAINTENANCE (105)

MAINTENANCE (106)

MAINTENANCE (107)

MAINTENANCE (108)

MAINTENANCE (109)

MAINTENANCE (110)

MAINTENANCE (111)

MAINTENANCE (112)

MAINTENANCE (113)

MAINTENANCE (114)

MAINTENANCE (115)

MAINTENANCE (116)

MAINTENANCE (117)

MAINTENANCE (118)

MAINTENANCE (119)

MAINTENANCE (120)

MAINTENANCE (121)

MAINTENANCE (122)

MAINTENANCE (123)

Monitoring Maintenance Events

Test Report—Maintenance (1)

Report Threshold Exceeded—Maintenance (2)

Local Side has Become Faulty—Maintenance (3)

Mate Side has Become Faulty—Maintenance (4)

Changeover Failure—Maintenance (5)

Changeover Timeout—Maintenance (6)

Mate Rejected Changeover—Maintenance (7)

Mate Changeover Timeout—Maintenance (8)

Local Initialization Failure—Maintenance (9)

Local Initialization Timeout—Maintenance (10)

Switchover Complete—Maintenance (11)

Initialization Successful—Maintenance (12)

Administrative State Change—Maintenance (13)

Call Agent Administrative State Change—Maintenance (14)

Feature Server Administrative State Change—Maintenance (15)

Process Manager: Starting Process—Maintenance (16)

Invalid Event Report Received—Maintenance (17)

Process Manager: Process has Died—Maintenance (18)

Process Manager: Process Exceeded Restart Rate—Maintenance (19)

Lost Connection to Mate—Maintenance (20)

Network Interface Down—Maintenance (21)

Mate is Alive—Maintenance (22)

Process Manager: Process Failed to Complete Initialization—Maintenance (23)

Process Manager: Restarting Process—Maintenance (24)

Process Manager: Changing State—Maintenance (25)

Process Manager: Going Faulty—Maintenance (26)

Process Manager: Changing Over to Active—Maintenance (27)

Process Manager: Changing Over to Standby—Maintenance (28)

Administrative State Change Failure—Maintenance (29)

Element Manager State Change—Maintenance (30)

Process Manager: Sending Go Active to Process—Maintenance (32)

Process Manager: Sending Go Standby to Process—Maintenance (33)

Process Manager: Sending End Process to Process—Maintenance (34)

Process Manager: All Processes Completed Initialization—Maintenance (35)

Process Manager: Sending All Processes Initialization Complete to Process—Maintenance (36)

Process Manager: Killing Process—Maintenance (37)

Process Manager: Clearing the Database—Maintenance (38)

Process Manager: Cleared the Database—Maintenance (39)

Process Manager: Binary Does not Exist for Process—Maintenance (40)

Administrative State Change Successful with Warning—Maintenance (41)

Number of Heartbeat Messages Received is Less Than 50% of Expected—Maintenance (42)

Process Manager: Process Failed to Come Up in Active Mode—Maintenance (43)

Process Manager: Process Failed to Come Up in Standby Mode—Maintenance (44)

Application Instance State Change Failure—Maintenance (45)

Network Interface Restored—Maintenance (46)

Thread Watchdog Counter Expired for a Thread—Maintenance (47)

Index Table Usage Exceeded Minor Usage Threshold Level—Maintenance (48)

Index Table Usage Exceeded Major Usage Threshold Level—Maintenance (49)

Index Table Usage Exceeded Critical Usage Threshold Level—Maintenance (50)

A Process Exceeds 70% of Central Processing Unit Usage—Maintenance (51)

Central Processing Unit Usage is Now Below the 50% Level—Maintenance (52)

The Central Processing Unit Usage is Over 90% Busy—Maintenance (53)

The Central Processing Unit has Returned to Normal Levels of Operation—Maintenance (54)

The Five Minute Load Average is Abnormally High—Maintenance (55)

The Load Average has Returned to Normal Levels—Maintenance (56)

Memory and Swap are Consumed at Critical Levels—Maintenance (57)

Memory and Swap are Consumed at Abnormal Levels—Maintenance (58)

No Heartbeat Messages Received Through the Interface—Maintenance (61)

Link Monitor: Interface Lost Communication—Maintenance (62)

Outgoing Heartbeat Period Exceeded Limit—Maintenance (63)

Average Outgoing Heartbeat Period Exceeds Major Alarm Limit—Maintenance (64)

Disk Partition Critically Consumed—Maintenance (65)

Disk Partition Significantly Consumed—Maintenance (66)

The Free Inter-Process Communication Pool Buffers Below Minor Threshold—Maintenance (67)

The Free Inter-Process Communication Pool Buffers Below Major Threshold—Maintenance (68)

The Free Inter-Process Communication Pool Buffers Below Critical Threshold—Maintenance (69)

The Free Inter-Process Communication Pool Buffer Count Below Minimum Required—Maintenance (70)

Local Domain Name System Server Response Too Slow—Maintenance (71)

External Domain Name System Server Response Too Slow—Maintenance (72)

External Domain Name System Server not Responsive—Maintenance (73)

Local Domain Name System Service not Responsive—Maintenance (74)

Mismatch of Internet Protocol Address Local Server and Domain Name System—Maintenance (75)

Mate Time Differs Beyond Tolerance—Maintenance (77)

Bulk Data Management System Admin State Change—Maintenance (78)

Resource Reset—Maintenance (79)

Resource Reset Warning—Maintenance (80)

Resource Reset Failure—Maintenance (81)

Average Outgoing Heartbeat Period Exceeds Critical Limit—Maintenance (82)

Swap Space Below Minor Threshold—Maintenance (83)

Swap Space Below Major Threshold—Maintenance (84)

Swap Space Below Critical Threshold—Maintenance (85)

System Health Report Collection Error—Maintenance (86)

Status Update Process Request Failed—Maintenance (87)

Status Update Process Database List Retrieval Error—Maintenance (88)

Status Update Process Database Update Error—Maintenance (89)

Disk Partition Moderately Consumed—Maintenance (90)

Internet Protocol Manager Configuration File Error—Maintenance (91)

Internet Protocol Manager Initialization Error—Maintenance (92)

Internet Protocol Manager Interface Failure—Maintenance (93)

Internet Protocol Manager Interface State Change—Maintenance (94)

Internet Protocol Manager Interface Created—Maintenance (95)

Internet Protocol Manager Interface Removed—Maintenance (96)

Inter-Process Communication Input Queue Entered Throttle State—Maintenance (97)

Inter-Process Communication Input Queue Depth at 25% of its Hi-Watermark—Maintenance (98)

Inter-Process Communication Input Queue Depth at 50% of its Hi-Watermark—Maintenance (99)

Inter-Process Communication Input Queue Depth at 75% of its Hi-Watermark—Maintenance (100)

Switchover in Progress—Maintenance (101)

Thread Watchdog Counter Close to Expiry for a Thread—Maintenance (102)

Central Processing Unit is Offline—Maintenance (103)

Aggregation Device Address Successfully Resolved—Maintenance (104)

Unprovisioned Aggregation Device Detected—Maintenance (105)

Aggregation Device Address Resolution Failure—Maintenance (106)

No Heartbeat Messages Received Through Interface From Router—Maintenance (107)

A Log File Cannot be Transferred—Maintenance (108)

Five Successive Log Files Cannot be Transferred—Maintenance (109)

Access to Log Archive Facility Configuration File Failed or File Corrupted—Maintenance (110)

Cannot Login to External Archive Server—Maintenance (111)

Congestion Status—Maintenance (112)

Central Processing Unit Load of Critical Processes—Maintenance (113)

Queue Length of Critical Processes—Maintenance (114)

Inter-Process Communication Buffer Usage Level—Maintenance (115)

Call Agent Reports the Congestion Level of Feature Server—Maintenance (116)

Side Automatically Restarting Due to Fault—Maintenance (117)

Domain Name Server Zone Database does not Match Between the Primary Domain Name Server and the Internal Secondary Authoritative Domain Name Server—Maintenance (118)

Periodic Shared Memory Database Backup Failure—Maintenance (119)

Periodic Shared Memory Database Backup Success—Maintenance (120)

Invalid SOAP Request—Maintenance (121)

Northbound Provisioning Message is Retransmitted—Maintenance (122)

Northbound Provisioning Message Dropped Due To Full Index Table—Maintenance (123)

Troubleshooting Maintenance Alarms

Local Side has Become Faulty—Maintenance (3)

Mate Side has Become Faulty—Maintenance (4)

Changeover Failure—Maintenance (5)

Changeover Timeout—Maintenance (6)

Mate Rejected Changeover—Maintenance (7)

Mate Changeover Timeout—Maintenance (8)

Local Initialization Failure—Maintenance (9)

Local Initialization Timeout—Maintenance (10)

Process Manager: Process has Died—Maintenance (18)

Process Manager: Process Exceeded Restart Rate—Maintenance (19)

Lost Connection to Mate—Maintenance (20)

Network Interface Down—Maintenance (21)

Process Manager: Process Failed to Complete Initialization—Maintenance (23)

Process Manager: Restarting Process—Maintenance (24)

Process Manager: Going Faulty—Maintenance (26)

Process Manager: Binary Does not Exist for Process—Maintenance (40)

Number of Heartbeat Messages Received is Less Than 50% of Expected—Maintenance (42)

Process Manager: Process Failed to Come Up in Active Mode—Maintenance (43)

Process Manager: Process Failed to Come Up in Standby Mode—Maintenance (44)

Application Instance State Change Failure—Maintenance (45)

Thread Watchdog Counter Expired for a Thread—Maintenance (47)

Index Table Usage Exceeded Minor Usage Threshold Level—Maintenance (48)

Index Table Usage Exceeded Major Usage Threshold Level—Maintenance (49)

Index Table Usage Exceeded Critical Usage Threshold Level—Maintenance (50)

A Process Exceeds 70% of Central Processing Unit Usage—Maintenance (51)

The Central Processing Unit Usage is Over 90% Busy—Maintenance (53)

The Five Minute Load Average is Abnormally High—Maintenance (55)

Memory and Swap are Consumed at Critical Levels—Maintenance (57)

No Heartbeat Messages Received Through the Interface—Maintenance (61)

Link Monitor: Interface Lost Communication—Maintenance (62)

Outgoing Heartbeat Period Exceeded Limit—Maintenance (63)

Average Outgoing Heartbeat Period Exceeds Major Alarm Limit—Maintenance (64)

Disk Partition Critically Consumed—Maintenance (65)

Disk Partition Significantly Consumed—Maintenance (66)

The Free Inter-Process Communication Pool Buffers Below Minor Threshold—Maintenance (67)

The Free Inter-Process Communication Pool Buffers Below Major Threshold—Maintenance (68)

The Free Inter-Process Communication Pool Buffers Below Critical Threshold—Maintenance (69)

The Free Inter-Process Communication Pool Buffer Count Below Minimum Required—Maintenance (70)

Local Domain Name System Server Response Too Slow—Maintenance (71)

External Domain Name System Server Response Too Slow—Maintenance (72)

External Domain Name System Server not Responsive—Maintenance (73)

Local Domain Name System Service not Responsive—Maintenance (74)

Mate Time Differs Beyond Tolerance—Maintenance (77)

Average Outgoing Heartbeat Period Exceeds Critical Limit—Maintenance (82)

Swap Space Below Minor Threshold—Maintenance (83)

Swap Space Below Major Threshold—Maintenance (84)

Swap Space Below Critical Threshold—Maintenance (85)

System Health Report Collection Error—Maintenance (86)

Status Update Process Request Failed—Maintenance (87)

Status Update Process Database List Retrieval Error—Maintenance (88)

Status Update Process Database Update Error—Maintenance (89)

Disk Partition Moderately Consumed—Maintenance (90)

Internet Protocol Manager Configuration File Error—Maintenance (91)

Internet Protocol Manager Initialization Error—Maintenance (92)

Internet Protocol Manager Interface Failure—Maintenance (93)

Inter-Process Communication Input Queue Entered Throttle State—Maintenance (97)

Inter-Process Communication Input Queue Depth at 25% of Its Hi-Watermark—Maintenance (98)

Inter-Process Communication Input Queue Depth at 50% of Its Hi-Watermark—Maintenance (99)

Inter-Process Communication Input Queue Depth at 75% of Its Hi-Watermark—Maintenance (100)

Switchover in Progress—Maintenance (101)

Thread Watchdog Counter Close to Expiry for a Thread—Maintenance (102)

Central Processing Unit is Offline—Maintenance (103)

No Heartbeat Messages Received Through Interface From Router—Maintenance (107)

Five Successive Log Files Cannot be Transferred—Maintenance (109)

Access to Log Archive Facility Configuration File Failed or File Corrupted—Maintenance (110)

Cannot Login to External Archive Server—Maintenance (111)

Congestion Status—Maintenance (112)

Side Automatically Restarting Due to Fault—Maintenance (117)

Domain Name Server Zone Database does not Match Between the Primary Domain Name Server and the Internal Secondary Authoritative Domain Name Server—Maintenance (118)

Periodic Shared Memory Database Backup Failure—Maintenance (119)


Maintenance Troubleshooting


Revised: December 11, 2008, OL-8723-17

Introduction

This chapter provides the information needed to monitor and troubleshoot maintenance events and alarms. This chapter is divided into the following sections:

Maintenance Events and Alarms—Provides a brief overview of each maintenance event and alarm.

Monitoring Maintenance Events—Provides the information needed to monitor and correct the maintenance events.

Troubleshooting Maintenance Alarms—Provides the information needed to troubleshoot and correct the maintenance alarms.

Maintenance Events and Alarms

This section provides a brief overview of the maintenance events and alarms for the Cisco BTS 10200 Softswitch in numerical order. Table 7-1 lists all of the maintenance events and alarms by severity.


Note Click the maintenance message number in Table 7-1 to display information about the event.


Table 7-1 Maintenance Events and Alarms by Severity

Critical           Major              Minor              Warning            Info               Not Used

MAINTENANCE (40)   MAINTENANCE (3)    MAINTENANCE (18)   MAINTENANCE (29)   MAINTENANCE (1)    MAINTENANCE (31)
MAINTENANCE (43)   MAINTENANCE (4)    MAINTENANCE (24)   MAINTENANCE (41)   MAINTENANCE (2)    MAINTENANCE (59)
MAINTENANCE (44)   MAINTENANCE (5)    MAINTENANCE (48)   MAINTENANCE (75)   MAINTENANCE (11)   MAINTENANCE (60)
MAINTENANCE (47)   MAINTENANCE (6)    MAINTENANCE (67)   MAINTENANCE (105)  MAINTENANCE (12)   MAINTENANCE (76)
MAINTENANCE (50)   MAINTENANCE (7)    MAINTENANCE (83)   MAINTENANCE (106)  MAINTENANCE (13)
MAINTENANCE (53)   MAINTENANCE (8)    MAINTENANCE (86)   MAINTENANCE (108)  MAINTENANCE (14)
MAINTENANCE (57)   MAINTENANCE (9)    MAINTENANCE (90)   MAINTENANCE (123)  MAINTENANCE (15)
MAINTENANCE (61)   MAINTENANCE (10)   MAINTENANCE (98)                      MAINTENANCE (16)
MAINTENANCE (65)   MAINTENANCE (19)                                         MAINTENANCE (17)
MAINTENANCE (69)   MAINTENANCE (20)                                         MAINTENANCE (22)
MAINTENANCE (70)   MAINTENANCE (21)                                         MAINTENANCE (25)
MAINTENANCE (73)   MAINTENANCE (23)                                         MAINTENANCE (27)
MAINTENANCE (74)   MAINTENANCE (26)                                         MAINTENANCE (28)
MAINTENANCE (82)   MAINTENANCE (42)                                         MAINTENANCE (30)
MAINTENANCE (85)   MAINTENANCE (45)                                         MAINTENANCE (32)
MAINTENANCE (91)   MAINTENANCE (49)                                         MAINTENANCE (33)
MAINTENANCE (97)   MAINTENANCE (51)                                         MAINTENANCE (34)
MAINTENANCE (100)  MAINTENANCE (55)                                         MAINTENANCE (35)
MAINTENANCE (101)  MAINTENANCE (62)                                         MAINTENANCE (36)
MAINTENANCE (102)  MAINTENANCE (63)                                         MAINTENANCE (37)
MAINTENANCE (103)  MAINTENANCE (64)                                         MAINTENANCE (38)
MAINTENANCE (107)  MAINTENANCE (66)                                         MAINTENANCE (39)
MAINTENANCE (111)  MAINTENANCE (68)                                         MAINTENANCE (46)
MAINTENANCE (117)  MAINTENANCE (71)                                         MAINTENANCE (52)
MAINTENANCE (118)  MAINTENANCE (72)                                         MAINTENANCE (54)
MAINTENANCE (119)  MAINTENANCE (77)                                         MAINTENANCE (56)
                   MAINTENANCE (84)                                         MAINTENANCE (58)
                   MAINTENANCE (87)                                         MAINTENANCE (78)
                   MAINTENANCE (88)                                         MAINTENANCE (79)
                   MAINTENANCE (89)                                         MAINTENANCE (80)
                   MAINTENANCE (92)                                         MAINTENANCE (81)
                   MAINTENANCE (93)                                         MAINTENANCE (94)
                   MAINTENANCE (99)                                         MAINTENANCE (95)
                   MAINTENANCE (109)                                        MAINTENANCE (96)
                   MAINTENANCE (110)                                        MAINTENANCE (104)
                   MAINTENANCE (112)                                        MAINTENANCE (113)
                                                                            MAINTENANCE (114)
                                                                            MAINTENANCE (115)
                                                                            MAINTENANCE (116)
                                                                            MAINTENANCE (120)
                                                                            MAINTENANCE (121)
                                                                            MAINTENANCE (122)
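For scripted lookups, the severity grouping in Table 7-1 can be held in a small mapping. The following Python sketch is illustrative only: the names TABLE_7_1 and severity_of are hypothetical and are not part of the BTS 10200 software. The numbers are copied from Table 7-1.

```python
# Hypothetical lookup helper; the groups below are copied from Table 7-1.
TABLE_7_1 = {
    "CRITICAL": [40, 43, 44, 47, 50, 53, 57, 61, 65, 69, 70, 73, 74, 82, 85,
                 91, 97, 100, 101, 102, 103, 107, 111, 117, 118, 119],
    "MAJOR": [3, 4, 5, 6, 7, 8, 9, 10, 19, 20, 21, 23, 26, 42, 45, 49, 51, 55,
              62, 63, 64, 66, 68, 71, 72, 77, 84, 87, 88, 89, 92, 93, 99, 109,
              110, 112],
    "MINOR": [18, 24, 48, 67, 83, 86, 90, 98],
    "WARNING": [29, 41, 75, 105, 106, 108, 123],
    "INFO": [1, 2, 11, 12, 13, 14, 15, 16, 17, 22, 25, 27, 28, 30, 32, 33, 34,
             35, 36, 37, 38, 39, 46, 52, 54, 56, 58, 78, 79, 80, 81, 94, 95,
             96, 104, 113, 114, 115, 116, 120, 121, 122],
    "NOT USED": [31, 59, 60, 76],
}


def severity_of(number: int) -> str:
    """Return the Table 7-1 column that lists MAINTENANCE (number)."""
    for severity, numbers in TABLE_7_1.items():
        if number in numbers:
            return severity
    raise KeyError(f"MAINTENANCE ({number}) is not listed in Table 7-1")
```

For example, severity_of(40) returns "CRITICAL" and severity_of(18) returns "MINOR".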

 

MAINTENANCE (1)

For additional information, refer to the "Test Report—Maintenance (1)" section.

DESCRIPTION

Test Report

SEVERITY

Information (INFO)

THRESHOLD

10000

THROTTLE

0


MAINTENANCE (2)

For additional information, refer to the "Report Threshold Exceeded—Maintenance (2)" section.

DESCRIPTION

Report Threshold Exceeded

SEVERITY

INFO

THRESHOLD

0

THROTTLE

0

DATAWORDS

Report Type-TWO_BYTES
Report Number-TWO_BYTES
Threshold Level-TWO_BYTES

PRIMARY CAUSE

Issued when the threshold for a given report type and number is exceeded.

PRIMARY ACTION

No immediate action is required because this is an informational report. However, investigate the root-cause event report and the threshold setting to determine whether a service-affecting condition exists.
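As a rough illustration of the primary cause, the Python sketch below counts reports per (report type, report number) pair and produces a MAINTENANCE (2)-style record once a configured threshold is exceeded. It is a simplified model and not the BTS 10200 implementation: the class name, the lack of any counting interval, and the dictionary layout of the record are assumptions. Only the three dataword names come from the DATAWORDS list above.

```python
from collections import Counter
from typing import Dict, Optional, Tuple


class ThresholdMonitor:
    """Hypothetical model of per-report threshold checking."""

    def __init__(self, thresholds: Dict[Tuple[int, int], int]) -> None:
        # Maps (report type, report number) to a threshold level (assumed layout).
        self.thresholds = thresholds
        self.counts: Counter = Counter()

    def record(self, report_type: int, report_number: int) -> Optional[dict]:
        """Count one report. Return a MAINTENANCE (2)-style record the first
        time the count exceeds the threshold; otherwise return None."""
        key = (report_type, report_number)
        self.counts[key] += 1
        level = self.thresholds.get(key)
        if level is not None and self.counts[key] == level + 1:
            return {
                "Report Type": report_type,
                "Report Number": report_number,
                "Threshold Level": level,
            }
        return None
```

With a threshold of 2 for a given pair, the first two calls to record return None and the third returns the record.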


MAINTENANCE (3)

To troubleshoot and correct the cause of the alarm, refer to the "Local Side has Become Faulty—Maintenance (3)" section.

DESCRIPTION

Keep Alive Module: Local Side has Become Faulty (KAM: Local Side Has Become Faulty)

SEVERITY

MAJOR

THRESHOLD

100

THROTTLE

0

DATAWORDS

Local State-STRING [30]
Mate State-STRING [30]
Reason-STRING [80]
Probable Cause-STRING [80]

PRIMARY CAUSE

Can result from maintenance reports 5, 6, 9, 10, 19, or 20.

PRIMARY ACTION

Review the information from the command line interface (CLI) log report. This is usually a software problem; restart the software using the installation and startup procedure.

SECONDARY CAUSE

The system was manually shut down using the platform stop command.

SECONDARY ACTION

Reboot the host machine, then reinstall and restart all applications. If the faulty state occurs frequently, the operating system (OS) or a hardware failure may be the problem.
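Because MAINTENANCE (3) can result from reports 5, 6, 9, 10, 19, or 20, it can help to scan the reports that preceded the alarm for those numbers. The Python sketch below assumes the maintenance report numbers have already been extracted from the log into a chronological list; the function name is hypothetical, and how the numbers are extracted is not shown.

```python
# Report numbers listed under PRIMARY CAUSE for MAINTENANCE (3).
PRECURSORS = {5, 6, 9, 10, 19, 20}


def precursors_before_fault(report_numbers):
    """Return the precursor reports seen before the first MAINTENANCE (3),
    in the order they occurred. If no MAINTENANCE (3) is present, every
    precursor in the list is returned."""
    found = []
    for number in report_numbers:
        if number == 3:
            break
        if number in PRECURSORS:
            found.append(number)
    return found
```

For the sequence 16, 20, 19, 3, 5 the function returns [20, 19], which points to MAINTENANCE (20) and MAINTENANCE (19) as the reports to review first.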


MAINTENANCE (4)

To troubleshoot and correct the cause of the alarm, refer to the "Mate Side has Become Faulty—Maintenance (4)" section.

DESCRIPTION

Keep Alive Module: Mate Side Has Become Faulty (KAM: Mate Side Has Become Faulty)

SEVERITY

MAJOR

THRESHOLD

100

THROTTLE

0

DATAWORDS

Local State-STRING [30]
Mate State-STRING [30]
Reason-STRING [80]
Probable Cause-STRING [80]
Mate Ping-STRING [50]

PRIMARY CAUSE

The local side has detected the mate side going to the faulty state.

PRIMARY ACTION

Display the event summary on the faulty mate side, using the report event-summary command (see the CLI Guide for command details).

SECONDARY ACTION

Review the information in the event summary. This is usually a software problem.

TERNARY ACTION

After confirming the active side is processing traffic, restart software on the mate side. Log in to the mate platform as root user. Enter the platform stop command and then the platform start command.

SUBSEQUENT ACTION

If the software restart does not resolve the problem, or if the platform immediately goes to the faulty state again or does not start, contact Cisco Technical Assistance Center (TAC). It may be necessary to reinstall the software. If the problem occurs frequently, the OS or a hardware failure may be the problem. Reboot the host machine, then reinstall and restart all applications. Note that rebooting brings down any other applications running on this machine. Contact Cisco TAC for assistance.



Note Refer to the "Obtaining Documentation and Submitting a Service Request" section on page lvi for detailed instructions on contacting Cisco TAC and opening a service request.
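The ternary action for MAINTENANCE (4) has an ordering constraint: confirm that the active side is processing traffic before stopping the mate. The Python sketch below encodes that guard. It is an assumption-laden illustration intended to run on the mate platform as root: the platform stop and platform start commands come from the text, but any arguments they need at your site are not shown; active_side_ok is a hypothetical flag the operator sets after checking the active side; and runner is an injectable callable so the sequence can be dry-run.

```python
import subprocess
from typing import Callable, List, Optional, Sequence

# Commands named in the TERNARY ACTION; site-specific arguments are not shown.
RESTART_SEQUENCE: List[List[str]] = [["platform", "stop"], ["platform", "start"]]


def restart_mate(
    active_side_ok: bool,
    runner: Optional[Callable[[Sequence[str]], object]] = None,
) -> List[List[str]]:
    """Run the stop/start sequence only if the active side was confirmed.
    Returns the commands that were passed to the runner, in order."""
    if not active_side_ok:
        raise RuntimeError("Confirm the active side is processing traffic first.")
    if runner is None:
        # Default: execute each command and raise if it exits with an error.
        runner = lambda cmd: subprocess.run(cmd, check=True)
    executed = []
    for command in RESTART_SEQUENCE:
        runner(command)
        executed.append(list(command))
    return executed
```

Passing runner=print gives a dry run that only prints the two commands in order.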


MAINTENANCE (5)

To troubleshoot and correct the cause of the alarm, refer to the "Changeover Failure—Maintenance (5)" section.

DESCRIPTION

Keep Alive Module: Changeover Failure (KAM: Changeover Failure)

SEVERITY

MAJOR

THRESHOLD

100

THROTTLE

0

DATAWORDS

Local State-STRING [30]
Mate State-STRING [30]

PRIMARY CAUSE

Issued when changing from an active processor to a standby and the changeover fails.

PRIMARY ACTION

Review the information from the CLI log report.

SECONDARY CAUSE

This alarm is usually caused by a software problem on the specific platform identified in the alarm report.

SECONDARY ACTION

Restart the platform identified in the alarm report.

TERNARY ACTION

If the platform restart is not successful, reinstall the application for this platform, and then restart the platform again.

SUBSEQUENT ACTION

If necessary, reboot the host machine on which this platform is located. Then reinstall and restart all applications on this machine. If the faulty state occurs frequently, the OS or a hardware failure may be the problem. Contact Cisco TAC for assistance. It may also be helpful to gather the event and alarm reports that were issued before and after this alarm report.



Note Refer to the "Obtaining Documentation and Submitting a Service Request" section on page lvi for detailed instructions on contacting Cisco TAC and opening a service request.
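The corrective actions for MAINTENANCE (5) form an escalation ladder: each step is tried only if the previous one did not clear the alarm. The Python sketch below models that ordering. The step texts paraphrase the actions above; the function name and the try_step callback (which stands for an operator performing a step and reporting whether it worked) are hypothetical.

```python
# Corrective steps for MAINTENANCE (5), in the order given in this section.
ESCALATION = [
    "Restart the platform identified in the alarm report",
    "Reinstall the application for this platform, then restart the platform",
    "Reboot the host machine, then reinstall and restart all applications",
    "Contact Cisco TAC",
]


def escalate(try_step):
    """Apply the steps in order until try_step(step) returns True.
    Returns the list of steps that were attempted."""
    attempted = []
    for step in ESCALATION:
        attempted.append(step)
        if try_step(step):
            break
    return attempted
```

If the reinstall step succeeds, only the first two steps are attempted; if nothing succeeds, the ladder ends with contacting Cisco TAC.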


MAINTENANCE (6)

To troubleshoot and correct the cause of the alarm, refer to the "Changeover Timeout—Maintenance (6)" section.

DESCRIPTION

Keep Alive Module: Changeover Timeout (KAM: Changeover Timeout)

SEVERITY

MAJOR

THRESHOLD

100

THROTTLE

0

DATAWORDS

Local State-STRING [30]
Mate State-STRING [30]

PRIMARY CAUSE

The system failed to change over within the required time period. Soon after this event is issued, one platform will go to the faulty state.

PRIMARY ACTION

Review the information from the CLI log report.

SECONDARY CAUSE

This alarm is usually caused by a software problem on the specific platform identified in the alarm report.

SECONDARY ACTION

Restart the platform identified in the alarm report.

TERNARY ACTION

If the platform restart is not successful, reinstall the application for this platform, and then restart the platform again.

SUBSEQUENT ACTION

If necessary, reboot the host machine on which this platform is located. Then reinstall and restart all applications on this machine. If the faulty state occurs frequently, the operating system (OS) or a hardware failure may be the problem. Contact Cisco TAC for assistance. It may also be helpful to gather the event and alarm reports that were issued before and after this alarm report.



Note Refer to the "Obtaining Documentation and Submitting a Service Request" section on page lvi for detailed instructions on contacting Cisco TAC and opening a service request.


MAINTENANCE (7)

To troubleshoot and correct the cause of the alarm, refer to the "Mate Rejected Changeover—Maintenance (7)" section.

DESCRIPTION

Keep Alive Module: Mate Rejected Changeover (KAM: Mate Rejected Changeover)

SEVERITY

MAJOR

THRESHOLD

100

THROTTLE

0

DATAWORDS

Local State-STRING [30]
Mate State-STRING [30]

PRIMARY CAUSE

The mate is not yet in a stable state.

PRIMARY ACTION

Enter the status command to get information on the two systems in the pair (primary and secondary Element Management System (EMS), Call Agent (CA) or Feature Server (FS)).

SECONDARY CAUSE

The mate detects that it is faulty during changeover and then rejects the changeover.

Note This attempted changeover could be caused by a forced (operator) switch, or by the secondary instance rejecting changeover while the primary is being brought up.

SECONDARY ACTION

If the mate is faulty (not running), then perform the corrective action steps listed for the MAINTENANCE (4) event.

TERNARY ACTION

If both systems (local and mate) are still running, diagnose whether both instances are operating in a stable state (one in active and the other in standby). If both are in a stable state, wait 10 minutes and try the control command again.

SUBSEQUENT ACTION

If the standby side is not in a stable state, bring down the standby side and restart the software using the platform stop and platform start commands. If the software restart does not resolve the problem, or if the problem occurs frequently, contact Cisco TAC. It may be necessary to reinstall the software. Additional OS or hardware problems may also need to be resolved.



Note Refer to the "Obtaining Documentation and Submitting a Service Request" section on page lvi for detailed instructions on contacting Cisco TAC and opening a service request.
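The actions for MAINTENANCE (7) amount to a small decision procedure based on the state of the local and mate sides. The Python sketch below expresses it as a function. The state strings "ACTIVE", "STANDBY", and "FAULTY" are assumptions made for illustration; the status command may report states with different names, and the function name is hypothetical.

```python
def next_action(local_state: str, mate_state: str) -> str:
    """Suggest the next corrective action for MAINTENANCE (7).
    State names are illustrative assumptions, not actual CLI output."""
    if mate_state == "FAULTY":
        # Secondary action: the mate is faulty (not running).
        return "Perform the corrective actions for MAINTENANCE (4)"
    if {local_state, mate_state} == {"ACTIVE", "STANDBY"}:
        # Ternary action: both sides are running in a stable state.
        return "Wait 10 minutes and try the control command again"
    # Subsequent action: the standby side is not in a stable state.
    return "Restart the standby side with platform stop and platform start"
```

Any state combination other than a faulty mate or a stable active/standby pair falls through to the restart of the standby side.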


MAINTENANCE (8)

To troubleshoot and correct the cause of the alarm, refer to the "Mate Changeover Timeout—Maintenance (8)" section.

DESCRIPTION

Keep Alive Module: Mate Changeover Timeout (KAM: Mate Changeover Timeout)

SEVERITY

MAJOR

THRESHOLD

100

THROTTLE

0

DATAWORDS

Local State-STRING [30]
Mate State-STRING [30]

PRIMARY CAUSE

The mate is faulty.

PRIMARY ACTION

Review the information from the CLI log report concerning the faulty mate.

SECONDARY ACTION

This alarm is usually caused by a software problem on the specific mate platform identified in the alarm report.

TERNARY ACTION

Restart the mate platform identified in the alarm report.

SUBSEQUENT ACTION

If the mate platform restart is not successful, reinstall the application for this mate platform, and then restart the mate platform again. If necessary, reboot the host machine on which this mate platform is located. Then reinstall and restart all applications on that machine.


MAINTENANCE (9)

To troubleshoot and correct the cause of the alarm, refer to the "Local Initialization Failure—Maintenance (9)" section.

DESCRIPTION

Keep Alive Module: Local Initialization Failure (KAM: Local Initialization Failure)

SEVERITY

MAJOR

THRESHOLD

100

THROTTLE

0

DATAWORDS

Local State-STRING [30]
Mate State-STRING [30]

PRIMARY CAUSE

The local initialization has failed.

PRIMARY ACTION

When this event report is issued, the system has failed and the re-initialization process has failed.

SECONDARY ACTION

Check that the binary files are present for the unit (Call Agent, Feature Server, or Element Manager).

TERNARY ACTION

If the files are not present, then re-install the files from the initial or backup media. Then restart the failed device.
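The check for missing binary files can be scripted once the expected file locations are known. The Python sketch below reports which paths from a given list are not present as regular files. This chapter does not list the binary locations, so the caller must supply them; the function name is hypothetical.

```python
from pathlib import Path
from typing import Iterable, List


def missing_binaries(expected: Iterable[str]) -> List[str]:
    """Return the expected paths that do not exist as regular files,
    in the order they were given."""
    return [path for path in expected if not Path(path).is_file()]
```

An empty result means every listed file is present. Any path in the result is a candidate for re-installation from the initial or backup media.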


MAINTENANCE (10)

To troubleshoot and correct the cause of the alarm, refer to the "Local Initialization Timeout—Maintenance (10)" section.

DESCRIPTION

Keep Alive Module: Local Initialization Timeout (KAM: Local Initialization Timeout)

SEVERITY

MAJOR

THRESHOLD

100

THROTTLE

0

DATAWORDS

Local State-STRING [30]
Mate State-STRING [30]

PRIMARY CAUSE

The local initialization has timed out.

PRIMARY ACTION

Check that the binary files are present for the unit (Call Agent, Feature Server, or Element Manager).

SECONDARY CAUSE

When the event report is issued, the system has failed and the re-initialization process has failed.

SECONDARY ACTION

If the files are not present, then re-install the files from the initial or backup media. Then restart the failed device.


MAINTENANCE (11)

For additional information, refer to the "Switchover Complete—Maintenance (11)" section.

DESCRIPTION

Switchover Complete

SEVERITY

INFO

THRESHOLD

100

THROTTLE

0

DATAWORDS

Local State-STRING [30]
Mate State-STRING [30]

PRIMARY CAUSE

Acknowledges that the changeover has successfully completed.

PRIMARY ACTION

This is an informational event report and no further action is required.


MAINTENANCE (12)

For additional information, refer to the "Initialization Successful—Maintenance (12)" section.

DESCRIPTION

Keep Alive Module: Initialization Successful (KAM: Initialization Successful)

SEVERITY

INFO

THRESHOLD

100

THROTTLE

0

DATAWORDS

Local State-STRING [30]
Mate State-STRING [30]

PRIMARY CAUSE

The local initialization has been successfully completed.

PRIMARY ACTION

This is an informational event report and no further action is required.


MAINTENANCE (13)

For additional information, refer to the "Administrative State Change—Maintenance (13)" section.

DESCRIPTION

Administrative State Change (Admin State Change)

SEVERITY

INFO

THRESHOLD

100

THROTTLE

0

DATAWORDS

Facility Type-STRING [40]
Facility ID-STRING [40]
Initial Admin State-STRING [20]
Target Admin State-STRING [20]
Current Admin State-STRING [20]

PRIMARY CAUSE

The administrative state of a managed resource has changed.

PRIMARY ACTION

No action is required, since this informational event report is given after a user has manually changed the administrative state of a managed resource.
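The DATAWORDS list for MAINTENANCE (13) gives each field a fixed string width. The Python sketch below checks a parsed report against those widths. The field names and widths come from the list above; the function name and the dictionary representation of a report are hypothetical, and the check assumes the bracketed number is the maximum number of characters, which this chapter does not state explicitly.

```python
# Field widths from the DATAWORDS list for MAINTENANCE (13).
FIELD_WIDTHS = {
    "Facility Type": 40,
    "Facility ID": 40,
    "Initial Admin State": 20,
    "Target Admin State": 20,
    "Current Admin State": 20,
}


def invalid_fields(datawords: dict) -> list:
    """Return the names of fields that are missing or exceed their width."""
    problems = []
    for name, width in FIELD_WIDTHS.items():
        value = datawords.get(name)
        if value is None or len(value) > width:
            problems.append(name)
    return problems
```

An empty list means the report has all five fields and each fits within its declared width.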


MAINTENANCE (14)

For additional information, refer to the "Call Agent Administrative State Change—Maintenance (14)" section.

DESCRIPTION

Call Agent Administrative State Change

SEVERITY

INFO

THRESHOLD

100

THROTTLE

0

DATAWORDS

Call Agent ID-STRING [40]
Current Local State-STRING [40]
Current Mate State-STRING [20]

PRIMARY CAUSE

Indicates that the call agent has changed operational state as a result of a manual switchover (control command in CLI).

PRIMARY ACTION

No action is required.


MAINTENANCE (15)

For additional information, refer to the "Feature Server Administrative State Change—Maintenance (15)" section.

DESCRIPTION

Feature Server Administrative State Change

SEVERITY

INFO

THRESHOLD

100

THROTTLE

0

DATAWORDS

Feature Server ID-STRING [40]
Feature Server Type-STRING [40]
Current Local State-STRING [20]
Current Mate State-STRING [20]

PRIMARY CAUSE

Indicates that the feature server has changed operational state as a result of a manual switchover (control command in CLI).

PRIMARY ACTION

No action is required.


MAINTENANCE (16)

For additional information, refer to the "Process Manager: Starting Process—Maintenance (16)" section.

DESCRIPTION

Process Manager: Starting Process (PMG: Starting Process)

SEVERITY

INFO

THRESHOLD

100

THROTTLE

0

DATAWORDS

Process Name-STRING [40]
Restart Type-STRING [40]
Restart Mode-STRING [32]
Process Group-ONE_BYTE

PRIMARY CAUSE

A process is being started as the system is being brought up.

PRIMARY ACTION

No action is required.


MAINTENANCE (17)

For additional information, refer to the "Invalid Event Report Received—Maintenance (17)" section.

DESCRIPTION

Invalid Event Report Received

SEVERITY

INFO

THRESHOLD

100

THROTTLE

0

DATAWORDS

Report Type-TWO_BYTES
Report Number-TWO_BYTES
Validation Failure-STRING [30]

PRIMARY
CAUSE

Indicates that a process has sent an event report that cannot be found in the database.

PRIMARY
ACTION

If a short burst of these event reports is issued during system initialization, before the database is initialized, the reports are informational and can be ignored.

SECONDARY
ACTION

Otherwise, contact Cisco TAC for more information.



Note: Refer to the "Obtaining Documentation and Submitting a Service Request" section on page lvi for detailed instructions on contacting Cisco TAC and opening a service request.
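
The Maintenance (17) guidance separates reports issued before database initialization, which can be ignored, from later reports, which call for contacting Cisco TAC. The following Python is a minimal sketch of that timing check. It assumes the operator already knows when database initialization completed and has collected the report timestamps from the event log; the function name needs_tac_followup is hypothetical. The source does not quantify a "short burst", so the sketch checks timing only.

```python
from datetime import datetime
from typing import Iterable


def needs_tac_followup(report_times: Iterable[datetime],
                       db_initialized_at: datetime) -> bool:
    """Return True if any report arrived at or after database initialization.

    Reports that arrived strictly before initialization are treated as the
    ignorable startup burst described for Maintenance (17).
    """
    return any(t >= db_initialized_at for t in report_times)
```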


MAINTENANCE (18)

To troubleshoot and correct the cause of the alarm, refer to the "Process Manager: Process has Died—Maintenance (18)" section.

DESCRIPTION

Process Manager: Process has Died (PMG: Process has Died)

SEVERITY

MINOR

THRESHOLD

100

THROTTLE

0

DATAWORDS

Process Name-STRING [40]
Process Group-FOUR_BYTES

PRIMARY
CAUSE

This alarm is caused by a software problem.

PRIMARY
ACTION

If the problem persists, contact Cisco TAC.



Note: Refer to the "Obtaining Documentation and Submitting a Service Request" section on page lvi for detailed instructions on contacting Cisco TAC and opening a service request.


MAINTENANCE (19)

To troubleshoot and correct the cause of the alarm, refer to the "Process Manager: Process Exceeded Restart Rate—Maintenance (19)" section.

DESCRIPTION

Process Manager: Process Exceeded Restart Rate (PMG: Process Exceeded Restart Rate)

SEVERITY

MAJOR

THRESHOLD

100

THROTTLE

0

DATAWORDS

Process Name-STRING [40]
Restart Rate-FOUR_BYTES
Process Group-ONE_BYTE

PRIMARY
CAUSE

This alarm is usually caused by a software problem on the specific platform identified in the alarm report. Soon after this event is issued, one platform will go to the faulty state.

PRIMARY
ACTION

Review the information from the CLI log report.

SECONDARY
ACTION

Restart the platform identified in the alarm report.

TERNARY
ACTION

If the platform restart is not successful, reinstall the application for this platform, and then restart the platform again.

SUBSEQUENT
ACTION

If necessary, reboot the host machine this platform is located on. Then reinstall and restart all applications on this machine.
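
The Maintenance (19) actions form an ordered escalation: each step is tried only if the previous one did not resolve the alarm. The following Python is a minimal sketch of that ordering. The step callables and the alarm_cleared check are hypothetical placeholders supplied by the caller; the sketch does not invoke any real CLI command.

```python
from typing import Callable, Optional, Sequence, Tuple

Step = Tuple[str, Callable[[], None]]


def run_escalation(steps: Sequence[Step],
                   alarm_cleared: Callable[[], bool]) -> Optional[str]:
    """Run recovery steps in order and stop as soon as the alarm clears.

    Returns the label of the step after which the alarm cleared, or None
    if every step ran and the alarm is still present.
    """
    for label, action in steps:
        action()
        if alarm_cleared():
            return label
    return None
```

For Maintenance (19), the caller would pass four steps in the listed order: review the log, restart the platform, reinstall and restart, then reboot the host and reinstall all applications.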


MAINTENANCE (20)

To troubleshoot and correct the cause of the alarm, refer to the "Lost Connection to Mate—Maintenance (20)" section.

DESCRIPTION

Keep Alive Module: Lost Connection to Mate (KAM: Lost KAM Connection to Mate)

SEVERITY

MAJOR

THRESHOLD

100

THROTTLE

0

DATAWORDS

Mate Ping-STRING [50]

PRIMARY
CAUSE

The alarm is caused by a network interface hardware problem.

PRIMARY
ACTION

Check whether or not the network interface is down. If it is down, restore the network interface and restart the software.

SECONDARY
CAUSE

The alarm can be caused by a router problem.

SECONDARY
ACTION

If the alarm is caused by a router problem, repair the router and reinstall.

TERNARY
CAUSE

Soon after this event is issued, one platform may go to the faulty state.
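
When investigating a lost connection to the mate, a basic reachability probe can help separate a network path problem from a software problem. The following Python is a minimal sketch of a generic TCP probe. It is not the keep-alive mechanism the product uses, and the host and port passed to it are assumptions the operator must supply for a service known to be listening on the mate.

```python
import socket


def can_reach(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to (host, port) succeeds in time.

    Any OSError (refused, unreachable, timed out, name resolution failure)
    is reported as unreachable.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

A False result only shows that this one TCP path failed; the interface and the router still need to be checked as described above.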


MAINTENANCE (21)

To troubleshoot and correct the cause of the alarm, refer to the "Network Interface Down—Maintenance (21)" section.

DESCRIPTION

Keep Alive Module: Network Interface Down (KAM: Network Interface Down)

SEVERITY

MAJOR

THRESHOLD

100

THROTTLE

0

DATAWORDS

IP Address-STRING [50]

PRIMARY
CAUSE

The alarm is caused by a network interface hardware problem.

PRIMARY
ACTION

Subsequently, the system goes to the faulty state.

SECONDARY
CAUSE

Soon after this event is issued, one platform may go to the faulty state.

SECONDARY
ACTION

Check whether or not the network interface is down. If the interface is down, restore the network interface and restart the software.


MAINTENANCE (22)

For additional information, refer to the "Mate is Alive—Maintenance (22)" section.

DESCRIPTION

Keep Alive Module: Mate is Alive (KAM: Mate is Alive)

SEVERITY

INFO

THRESHOLD

100

THROTTLE

0

DATAWORDS

Local State-STRING [30]
Mate State-STRING [30]


MAINTENANCE (23)

To troubleshoot and correct the cause of the alarm, refer to the "Process Manager: Process Failed to Complete Initialization—Maintenance (23)" section.

DESCRIPTION

Process Manager: Process Failed to Complete Initialization (PMG: Process Failed to Complete Initialization)

SEVERITY

MAJOR

THRESHOLD

100

THROTTLE

0

DATAWORDS

Process Name-STRING [40]
Process Group-ONE_BYTE

PRIMARY
CAUSE

The specified process failed to complete the initialization during the restoral process.

PRIMARY
ACTION

Verify that the specified process's binary image is installed. If not, install it and restart the platform.
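
Verifying that a process's binary image is installed can be as simple as confirming that the file exists and is executable. The following Python is a minimal sketch of that check. The function name binary_is_installed is hypothetical, and the path to the image is an assumption: this section does not state where process binaries are installed, so the caller must supply it.

```python
import os


def binary_is_installed(path: str) -> bool:
    """Return True if path is a regular file the current user can execute."""
    return os.path.isfile(path) and os.access(path, os.X_OK)
```

This confirms only presence and permissions; it does not verify that the image is the correct version or is undamaged.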


MAINTENANCE (24)

To troubleshoot and correct the cause of the alarm, refer to the <