Prerequisites for Online Diagnostics

Restrictions for Online Diagnostics

None.

Information About Online Diagnostics

With online diagnostics, you can test and verify the hardware functionality of the switch while the switch is connected to a live network.

The online diagnostics contain packet switching tests that check different hardware components and verify the data path and control signals. Disruptive online diagnostic tests, such as the built-in self-test (BIST) and the disruptive loopback test, and nondisruptive online diagnostic tests, such as packet switching, run during bootup, module online insertion and removal (OIR), and system reset. The nondisruptive online diagnostic tests run as part of background health monitoring. Either disruptive or nondisruptive tests can be run at the user's request (on-demand).

The online diagnostics detect problems in the following areas:

•Hardware components

•Interfaces (GBICs, Ethernet ports, and so forth)

•Connectors (loose connectors, bent pins, and so forth)

•Solder joints

•Memory (failure over time)

Online diagnostics is one of the requirements for the high availability feature. High availability is a set of quality standards that seek to limit the impact of equipment failures on the network. A key part of high availability is detecting hardware failures and taking corrective action while the switch runs in a live network. Online diagnostics in high availability detect hardware failures and provide feedback to high availability software components to make switchover decisions.

Online diagnostics are categorized as bootup, on-demand, schedule, or health-monitoring diagnostics. Bootup diagnostics run during bootup; on-demand diagnostics run from the CLI; schedule diagnostics run at user-designated intervals or specified times when the switch is connected to a live network; and health-monitoring runs in the background.

How to Configure Online Diagnostics

Setting Bootup Online Diagnostics Level

You can set the bootup diagnostics level to minimal or complete, or you can bypass the bootup diagnostics entirely. Enter the complete keyword to run all diagnostic tests; enter the minimal keyword to run only EARL tests and loopback tests for all ports in the switch. Enter the no form of the command to bypass all diagnostic tests. The default bootup diagnostics level is minimal.

To set the bootup diagnostic level, perform this task:

Command                                                        Purpose

Router(config)# diagnostic bootup level {minimal | complete}   Sets the bootup diagnostic level.

This example shows how to set the bootup online diagnostic level:

Router(config)# diagnostic bootup level complete

Router(config)#

This example shows how to display the bootup online diagnostic level:

Router(config)# show diagnostic bootup level

Current bootup diagnostic level: complete

Router(config)#
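
If the level is set or verified from a script, the two commands above can be wrapped in small helpers. The following is a minimal sketch in Python, not part of any Cisco tool: the function names are hypothetical, and the exact string used for the no form is an assumption derived from the command shown in this section.

```python
import re
from typing import Optional

# Hypothetical helpers for illustration only.
LEVEL_RE = re.compile(r"Current bootup diagnostic level:\s*(\S+)")

def bootup_level_command(level: Optional[str]) -> str:
    """Build the configuration command for a bootup level.

    Passing None selects the no form, which bypasses bootup diagnostics
    (assumed here to be "no diagnostic bootup level").
    """
    if level is None:
        return "no diagnostic bootup level"
    if level not in ("minimal", "complete"):
        raise ValueError("level must be 'minimal', 'complete', or None")
    return f"diagnostic bootup level {level}"

def parse_bootup_level(output: str) -> Optional[str]:
    """Extract the level from 'show diagnostic bootup level' output, or None."""
    match = LEVEL_RE.search(output)
    return match.group(1).lower() if match else None

print(bootup_level_command("complete"))
print(parse_bootup_level("Current bootup diagnostic level: complete"))
```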

Configuring On-Demand Online Diagnostics

You can run the on-demand online diagnostic tests from the CLI. You can set the execution action to either stop or continue the test when a failure is detected or to stop the test after a specific number of failures occur by using the failure count setting. You can configure a test to run multiple times using the iteration setting.

You should run packet-switching tests before memory tests.

Note Do not use the diagnostic start all command until all of the following steps are completed.

Because some on-demand online diagnostic tests can affect the outcome of other tests, you should perform the tests in the following order:

1. Run the nondisruptive tests.

2. Run all tests in the relevant functional area.

3. Run the TestTrafficStress test.

4. Run the TestEobcStressPing test.

5. Run the exhaustive-memory tests.
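
When test runs are planned from a script, the order above can be enforced before anything is sent to the switch. The following is a minimal sketch in Python; the phase labels are names invented for this illustration and are not CLI keywords.

```python
# Phase order taken from the numbered list above. The labels are hypothetical
# names for this sketch, not commands or keywords accepted by the switch.
PHASE_ORDER = [
    "nondisruptive",
    "functional-area",
    "TestTrafficStress",
    "TestEobcStressPing",
    "exhaustive-memory",
]

def order_phases(requested):
    """Return the requested phases, deduplicated, in the recommended order."""
    unknown = [phase for phase in requested if phase not in PHASE_ORDER]
    if unknown:
        raise ValueError(f"unknown phase(s): {unknown}")
    return sorted(set(requested), key=PHASE_ORDER.index)

print(order_phases(["exhaustive-memory", "TestTrafficStress", "nondisruptive"]))
```

Sorting against a fixed list keeps the exhaustive-memory tests last, which matters because no other tests can be run after them.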

To run on-demand online diagnostic tests, perform this task:

Step 1 Run the nondisruptive tests.

To display the available tests and their attributes, and to determine which tests are in the nondisruptive category, enter the show diagnostic content command.

Step 2 Run all tests in the relevant functional area.

Packet-switching tests fall into specific functional areas. When a problem is suspected in a particular functional area, run all tests in that functional area. If you are unsure about which functional area you need to test, or if you want to run all available tests, enter the complete keyword.

Step 3 Run the TestTrafficStress test.

This is a disruptive packet-switching test. This test switches packets between pairs of ports at line rate for the purpose of stress testing. During this test all of the ports are shut down, and you may see link flaps. The link flaps will recover after the test is complete. The test takes several minutes to complete.

Disable all health-monitoring tests before running this test by using the no diagnostic monitor module number test all command.

Step 4 Run the TestEobcStressPing test.

This is a disruptive test and tests the Ethernet over backplane channel (EOBC) connection for the module. The test takes several minutes to complete. You cannot run any of the packet-switching tests described in previous steps after running this test. However, you can run tests described in subsequent steps after running this test.

Disable all health-monitoring tests before running this test by using the no diagnostic monitor module number test all command. The EOBC connection is disrupted during this test and will cause the health-monitoring tests to fail and take recovery action.

Step 5 Run the exhaustive-memory tests.

Before running the exhaustive-memory tests, all health-monitoring tests should be disabled because the tests will fail with health monitoring enabled and the switch will take recovery action. Disable the health-monitoring diagnostic tests by using the no diagnostic monitor module number test all command.

Perform the exhaustive-memory tests in the following order:

1. TestFibTcamSSRAM

2. TestAclQosTcam

3. TestNetFlowTcam

4. TestAsicMemory

You must reboot the switch after running the exhaustive-memory tests before it is operational again. You cannot run any other tests on the switch after running the exhaustive-memory tests. Do not save the configuration when rebooting because it will have changed during the tests. After the reboot, reenable the health-monitoring tests by using the diagnostic monitor module number test all command.
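
The sequence in Step 5 (disable health monitoring, run the memory tests in order, reload without saving, reenable health monitoring) can be written down as a checklist generator. The following is a minimal sketch in Python: the function names and step wording are hypothetical, and only the diagnostic monitor command strings and the test names come from this section.

```python
# Distinct exhaustive-memory test names, in the order listed above.
MEMORY_TEST_ORDER = [
    "TestFibTcamSSRAM",
    "TestAclQosTcam",
    "TestNetFlowTcam",
    "TestAsicMemory",
]

def monitor_commands(module: int):
    """Return (disable, reenable) health-monitoring commands for one module."""
    if module < 1:
        raise ValueError("module number must be positive")
    enable = f"diagnostic monitor module {module} test all"
    return f"no {enable}", enable

def memory_test_plan(module: int):
    """Return the Step 5 procedure as an ordered list of plain-text steps."""
    disable, reenable = monitor_commands(module)
    steps = [f"enter: {disable}"]
    steps += [f"run test: {name}" for name in MEMORY_TEST_ORDER]
    steps.append("reload the switch without saving the configuration")
    steps.append(f"enter: {reenable}")
    return steps

for step in memory_test_plan(5):
    print(step)
```

The module number 5 is only an example value; substitute the module under test.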

The diagnostic ondemand command configures how many times the on-demand diagnostic tests run (iterations) and what action to take when errors are found.

This example shows how to set the on-demand testing iteration count:

Router# diagnostic ondemand iteration 3

Router#

This example shows how to set the execution action when an error is detected:

Router# diagnostic ondemand action-on-error continue 2

Router#
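
Both settings follow the same pattern, so a script that prepares on-demand runs can build them from parameters. The following is a minimal sketch in Python. The function name is hypothetical, the keywords are the ones shown in the examples above, and the value ranges that the switch accepts are not checked here.

```python
from typing import List, Optional

def ondemand_commands(iterations: Optional[int] = None,
                      action: Optional[str] = None,
                      error_count: Optional[int] = None) -> List[str]:
    """Build 'diagnostic ondemand' commands for the given settings."""
    commands = []
    if iterations is not None:
        if iterations < 1:
            raise ValueError("iterations must be at least 1")
        commands.append(f"diagnostic ondemand iteration {iterations}")
    if action is not None:
        if action not in ("continue", "stop"):
            raise ValueError("action must be 'continue' or 'stop'")
        command = f"diagnostic ondemand action-on-error {action}"
        # The trailing number is the failure count, as in the example above.
        if error_count is not None:
            command += f" {error_count}"
        commands.append(command)
    return commands

for line in ondemand_commands(iterations=3, action="continue", error_count=2):
    print(line)
```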

Scheduling Online Diagnostics

You can schedule online diagnostics to run at a designated time of day or on a daily, weekly, or monthly basis. You can schedule tests to run only once or to repeat at an interval. Use the no form of this command to remove the scheduling.

Configuring Health-Monitoring Diagnostics

You can configure health-monitoring diagnostic testing while the switch is connected to a live network. For each health-monitoring test, you can configure the execution interval, whether a system message is generated upon test failure, and whether the test is enabled or disabled. Use the no form of this command to disable testing.

Overview of Diagnostic Test Operation

After you configure online diagnostics, you can start or stop diagnostic tests or display the test results. You can also see which tests are configured and what diagnostic tests have already run.

•Enable the logging console/monitor to see all warning messages before you enable any online diagnostics tests.

•When you are running disruptive tests, run the tests when connected through the console. When disruptive tests are complete, a warning message on the console recommends that you reload the system to return to normal operation. Strictly follow this warning.

•While tests are running, all ports are shut down because a stress test is being performed with ports configured to loop internally; external traffic might alter the test results. The switch must be rebooted to bring the switch to normal operation. When you issue the command to reload the switch, the system will ask you if the configuration should be saved. Do not save the configuration.

•If you are running the tests on a supervisor engine, after the test is initiated and complete, you must reload or power down and then power up the entire system.

•If you are running the tests on a switching module, rather than the supervisor engine, after the test is initiated and complete, you must reset the switching module.

Starting and Stopping Online Diagnostic Tests

After you configure diagnostic tests to run, you can use the diagnostic start and diagnostic stop commands to begin or end a diagnostic test. To start or stop an online diagnostic test, perform one of these tasks:

This example shows how to display the output for the health checks performed:

Router# show diagnostic health

Non-zero port counters for 6/4 -

13. linkChange = 8530

Non-zero port counters for 6/5 -

13. linkChange = 8530

Router#
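
Output in this shape is easy to turn into structured data for trending or alerting. The following is a minimal sketch in Python that assumes the layout shown in the example above (a "Non-zero port counters for" line followed by numbered counter lines); the function name is hypothetical, and other sections that the command may print are ignored.

```python
import re

PORT_RE = re.compile(r"^Non-zero port counters for (\S+)")
COUNTER_RE = re.compile(r"^\d+\.\s*(\w+)\s*=\s*(\d+)")

def parse_health(output: str) -> dict:
    """Map each port to its non-zero counters, for example {'6/4': {'linkChange': 8530}}."""
    counters = {}
    port = None
    for raw in output.splitlines():
        line = raw.strip()
        port_match = PORT_RE.match(line)
        if port_match:
            port = port_match.group(1)
            counters[port] = {}
            continue
        counter_match = COUNTER_RE.match(line)
        if counter_match and port is not None:
            counters[port][counter_match.group(1)] = int(counter_match.group(2))
    return counters

sample = """Non-zero port counters for 6/4 -
13. linkChange = 8530
Non-zero port counters for 6/5 -
13. linkChange = 8530
"""
print(parse_health(sample))
```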

How to Perform Memory Tests

Most online diagnostic tests do not need any special setup or configuration. However, the memory tests, which include the TestFibTcamSSRAM and TestLinecardMemory tests, have some required tasks and some recommended tasks that you should complete before running them.

Before you run any of the online diagnostic memory tests, perform the following tasks:

•Required tasks

–Isolate network traffic by disabling all connected ports.

–Do not send test packets during a memory test.

–Reset the system before returning the system to normal operating mode.

•Recommended task

–Turn off all background health-monitoring tests using the no diagnostic monitor module number test all command.

How to Perform a Diagnostic Sanity Check

You can run the diagnostic sanity check in order to see potential problem areas in your network. The sanity check runs a set of predetermined checks on the configuration with a possible combination of certain system states to compile a list of warning conditions. The checks are designed to look for anything that seems out of place and are intended to serve as an aid for maintaining the system sanity.

To run the diagnostic sanity check, perform this task:

Command                  Purpose

show diagnostic sanity   Runs a set of tests on the configuration and certain system states.

This example displays samples of the messages that could be displayed with the show diagnostic sanity command:

Router# show diagnostic sanity

Pinging default gateway 10.6.141.1 ....

Type escape sequence to abort.

Sending 5, 100-byte ICMP Echos to 10.6.141.1, timeout is 2 seconds:

..!!.

Success rate is 0 percent (0/5)

IGMP snooping disabled please enable it for optimum config.

IGMP snooping disabled but RGMP enabled on the following interfaces,

please enable IGMP for proper config :

Vlan1, Vlan2, GigabitEthernet1/1

Multicast routing is enabled globally but not enabled on the following

interfaces:

GigabitEthernet1/1, GigabitEthernet1/2

A programming algorithm mismatch was found on the device bootflash:

Formatting the device is recommended.

The bootflash: does not have enough free space to accomodate the crashinfo file.

Please check your confreg value : 0x0.

Please check your confreg value on standby: 0x0.

The boot string is empty. Please enter a valid boot string .

Could not verify boot image "disk0:" specified in the boot string on the

slave.

Invalid boot image "bootflash:asdasd" specified in the boot string on the

slave.

Please check your boot string on the slave.

UDLD has been disabled globally - port-level UDLD sanity checks are

being bypassed.

OR

[

The following ports have UDLD disabled. Please enable UDLD for optimum

config:

Gi1/22

The following ports have an unknown UDLD link state. Please enable UDLD

on both sides of the link:

Gi1/22

]

The following ports have portfast enabled:

Gi1/20, Gi1/22

The following ports have trunk mode set to on:

Gi1/1, Gi1/13

The following trunks have mode set to auto:

Gi1/2, Gi1/3

The following ports with mode set to desirable are not trunking:

Gi1/3, Gi1/4

The following trunk ports have negotiated to half-duplex:

Gi1/3, Gi1/4

The following ports are configured for channel mode on:

Gi1/1, Gi1/2, Gi1/3, Gi1/4

The following ports, not channeling are configured for channel mode

desirable:

Gi1/14

The following vlan(s) have a spanning tree root of 32768:

1

The following vlan(s) have max age on the spanning tree root different from

the default:

1-2

The following vlan(s) have forward delay on the spanning tree root different

from the default:

1-2

The following vlan(s) have hello time on the spanning tree root different

from the default:

1-2

The following vlan(s) have max age on the bridge different from the

default:

1-2

The following vlan(s) have fwd delay on the bridge different from the

default:

1-2

The following vlan(s) have hello time on the bridge different from the

default:

1-2

The following vlan(s) have a different port priority than the default

on the port gigabitEthernet1/1

1-2

The following ports have recieve flow control disabled:

Gi1/20, Gi1/22

The following inline power ports have power-deny/faulty status:

Gi1/1, Gi1/2

The following ports have negotiated to half-duplex:

Gi1/22

The following vlans have a duplex mismatch:

Gig 1/22

The following interafaces have a native vlan mismatch:

interface (native vlan - neighbor vlan)

Gig 1/22 (1 - 64)

The value for Community-Access on read-only operations for SNMP is the same

as default. Please verify that this is the best value from a security point

of view.

The value for Community-Access on write-only operations for SNMP is the same

as default. Please verify that this is the best value from a security point

of view.

The value for Community-Access on read-write operations for SNMP is the same

as default. Please verify that this is the best value from a security point

of view.
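
Many of the messages above share one shape: a header that starts with "The following" and ends with a colon, followed by a line of ports or VLANs. The following is a minimal sketch in Python that collects those pairs. The function name is hypothetical and the parsing rules are assumptions based only on this sample: headers without a closing colon are skipped, and a caption line that follows a header (as in the native VLAN mismatch message) is stored as if it were the item list.

```python
def parse_sanity_lists(output: str) -> dict:
    """Map each 'The following ...:' header to the list found on the next line."""
    findings = {}
    header = None      # header text collected so far
    complete = False   # True once the header has ended with ":"
    for raw in output.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("The following"):
            # A new header always restarts collection.
            header, complete = line, line.endswith(":")
        elif header is not None and not complete:
            # Continuation of a header that wrapped onto another line.
            header += " " + line
            complete = line.endswith(":")
        elif header is not None:
            items = [item.strip() for item in line.split(",")]
            findings[header.rstrip(":").strip()] = items
            header, complete = None, False
    return findings

sample = """The following ports have portfast enabled:
Gi1/20, Gi1/22
The following ports, not channeling are configured for channel mode
desirable:
Gi1/14
Please check your confreg value : 0x0.
"""
for warning, items in parse_sanity_lists(sample).items():
    print(warning, "->", items)
```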
