Networking Troubleshooting Guidelines



Raytan
08-18-2010, 09:20 AM
This article provides guidelines on what to do when someone suspects that there is a problem with the
network. The guidelines are written from a system manager's viewpoint and cover the following items:
- Problem identification and notification
- Problem verification and information gathering
- Theory development
- Theory testing
<p align="left"><img src="/attachment/pic/Troubleshooting Flowchart.jpg" alt="Troubleshooting Flowchart.jpg"></p>
<p align="left"><img src="/attachment/pic/Troubleshooting Flowchart 2.jpg" alt="Troubleshooting Flowchart 2.jpg"></p>
Problem Identification and Notification
How did someone notice that there is a problem, and how far-reaching is it? Gather all the information relevant
to the perceived problem. Check on the following:
- Is it the whole network, a server, or just one user that is down?
- Can the users get to the services that they are supposed to?
- Is there a slow transfer of information?
- Does the problem involve a new installation or one that has been operating for some time?
- Was the equipment designed to do the particular task in process when the problem occurred?
- Does the problem occur during a certain time of day?
NOTE: The network manager(s) must be notified that there is a problem. This can be done by
telephone, pager, or other means. Notification can be done automatically by pager if the
system has a system management application installed such as the Cabletron Systems
SPECTRUM or the SPECTRUM Element Manager (SPEL). Refer to the appropriate
application documentation for an explanation of this feature.
Problem Verification and Information Gathering
As the network manager or technician called to the site, proceed as follows to evaluate the situation and
determine the scope of the problem.
1. If you have a network management software application (such as SPECTRUM, SPECTRUM Element
Manager for Windows or Lanalyzer), use it to check for any indications of a problem and the extent of
impact on the network. If possible, get a trace of the network traffic and the port statistics.
Was the problem narrowed down to one of the categories in Chapters 3 through 9?
If YES, refer to the appropriate chapter for instructions on how to remedy the most common problems
in that category. If NO, go to step 2.
2. See if the network has one main hub or other area where you can perform a visual inspection of the
equipment and check the power indications, LEDs, and switch settings. Refer to the associated
manuals for details about the LEDs and switch settings.
Was the problem narrowed down to one of the categories in Chapters 3 through 9?
If YES, refer to the appropriate chapter for instructions on how to remedy the most common problems
in that category. If NO, go to step 3.
3. Check the server(s) to see what is happening and if the server(s) is operating properly.
Was the problem narrowed down to one of the categories in Chapters 3 through 9?
If YES, refer to the appropriate chapter for instructions on how to remedy the most common problems
in that category. If NO, go to step 4.
4. Check whether the network has been changed recently in any way by adding, removing, upgrading, or
reconfiguring any of its elements.
Was the problem narrowed down to one of the categories in Chapters 3 through 9?
If YES, refer to the appropriate chapter for instructions on how to remedy the most common problems
in that category. If NO, go to step 5.
5. Look at the release notes to see if there are any incompatibilities or known issues. Check the revision
levels for clues about what could be happening between devices.
Was the problem narrowed down to one of the categories in Chapters 3 through 9?
If YES, refer to the appropriate chapter for instructions on how to remedy the most common problems
in that category.
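The five steps above share one pattern: run a check, and stop as soon as it narrows the problem down to a category. A minimal Python sketch of that flow follows. The step names, the category strings, and the function `first_finding` are hypothetical and exist only for illustration. Each check is assumed to return a category name when it finds something, or `None` when it does not.

```python
from typing import Callable, List, Optional, Tuple

# A check is a (step name, function) pair. The function returns a problem
# category when the step narrows the problem down, or None when it does not.
Check = Tuple[str, Callable[[], Optional[str]]]


def first_finding(checks: List[Check]) -> Optional[Tuple[str, str]]:
    """Run the checks in order and stop at the first one that finds a category."""
    for step, run in checks:
        category = run()
        if category is not None:
            return step, category
    # No step narrowed the problem down; move on to developing theories.
    return None


# Hypothetical checks mirroring steps 1 through 5 (results are made up).
checks: List[Check] = [
    ("management application", lambda: None),
    ("visual inspection", lambda: None),
    ("server check", lambda: "server fault"),
    ("recent changes", lambda: None),
    ("release notes", lambda: None),
]

print(first_finding(checks))
```

Keeping the checks in a fixed order means the cheaper, broader checks run first, which matches the order of the steps in this section.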
Develop Theories
Use the following guidelines to help determine the location and identity of the problem.
1. Develop plans to test the problem areas of the network by systematically separating the areas that are
trouble free from those that are not. Use one or more of the following troubleshooting methods:
a. Isolate the problem to one part of the network.
b. Go to the specific modules. Refer to the associated manuals for operating and troubleshooting
information.
2. Remember the basics.
a. Check the power and cable connections. Be aware of building electrical problems such as noise,
grounding problems, spikes, brownouts, and outages.
b. Also, check any uninterruptible power supplies (UPS) connected to the devices. The UPS can fail too.
Is there any construction going on? Things could be happening to fragile data links.
c. Make sure the network is within specifications.
3. Narrow down the problems by considering only the information relevant to the problem, but don't
completely discount the possibility of a remote problem affecting the network.
Test the Theories
To test the theories and make sure the problem has been corrected, proceed as follows:
1. Identify the course of action and its specific steps, then perform each step.
2. Try things one at a time; do not change all variables at once.
3. Record your progress by documenting what has actually been tried, the parts swapped, etc.
4. Observe the network and/or check with the users. Did your corrective action fix the problem? If not, did it
change any of the symptoms, and if so, how?
5. If the problem has not been remedied, proceed as follows:
a. Repeat steps in Sections 2.2, 2.3, and 2.4 until a solution is found within a reasonable period of time.
The possibilities should narrow down until either a solution is found or the problem is found to be
unsolvable by the persons applying this method.
b. Depending on the severity or urgency of the problem, call in other experts such as Cabletron Systems
Technical Support.
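Steps 2 through 4 above (change one thing at a time, record it, observe the result) can be sketched as a small loop. This is only an illustration under stated assumptions: the names `Attempt` and `try_fixes` are hypothetical, each candidate fix is assumed to be a callable that changes exactly one thing, and `problem_fixed` is assumed to be some way of observing the network or asking the users.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Attempt:
    action: str   # what was actually tried, e.g. a part that was swapped
    fixed: bool   # whether the problem was gone after this single change


def try_fixes(
    fixes: List[Tuple[str, Callable[[], None]]],
    problem_fixed: Callable[[], bool],
) -> List[Attempt]:
    """Apply candidate fixes one at a time, recording each attempt and its outcome."""
    log: List[Attempt] = []
    for description, apply_fix in fixes:
        apply_fix()                  # change exactly one variable
        fixed = problem_fixed()      # observe the network or check with users
        log.append(Attempt(description, fixed))
        if fixed:
            break                    # stop before changing anything else
    return log


# Hypothetical example: a bad cable is the real cause.
state = {"cable": "bad"}
fixes = [
    ("reseat module", lambda: None),
    ("swap cable", lambda: state.update(cable="good")),
    ("replace hub", lambda: state.update(hub="new")),
]
log = try_fixes(fixes, lambda: state["cable"] == "good")
for attempt in log:
    print(attempt)
```

Because the loop stops at the first change that clears the problem, the log shows which single change made the difference, and the remaining fixes are never applied.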
Implement and Document Fixes
Document everything! Whether or not your effort fixed the problem, document your efforts to help prevent
repeating your steps or for future reference in case you run into the problem again. The documentation is
especially helpful for troubleshooting intermittent problems.
Isolate the Problem
The following steps provide a guide to help isolate the problem to a particular part of the network:
1. To help determine if there is a problem, when it started, and if it's a critical network situation, consider the
following.
a. Does the problem affect the entire network?
b. Is it critical that the network remain up and running without interruption?
If so, refer to the Service Contract for spares coverage and other service support information.
c. What is the network bounded by?
- Is it local?
- Is it bridged?
- What distance does it cover?
- Is the network a WAN?
d. Is the problem confined to one portion of the network?
- You can remove components such as MIMs to see if the problem goes away.
e. Does that portion have any well-defined boundaries, such as bridges, routers, ATM switches or types of
workgroups?
f. Is the problem confined to one hub or workgroup, software application, or protocol?
g. Have there been any additions, changes, or upgrades made to the network within the last month or so?
If so, what, and what kind of consequences could they have on the network, and how could they affect
the current situation?
2. What tools do you have for troubleshooting? Have you used the tools in this situation, and what were the
results? For example:
a. Using a Lanalyzer or sniffer (refer to its manual for operating instructions)
1.) Did you get a trace of the failure?
2.) Do you have a trace of normal network activity?
3.) What is the traffic rate, normal and at the time(s) of the failure(s)?
b. Using SPECTRUM Element Manager for Windows or SPECTRUM management applications (refer to
the respective manuals for operating instructions), check for the following:
1.) What information does the management application provide about the problem? Does it show you
what is happening?
2.) Can you see if the whole network is down? Or, how widespread is the problem?
c. Using some other management software or packet analyzer
d. Using a cable tester
e. Using a LAN-MD or LAN-Specialist (refer to the respective manual for operating instructions)
f. Using a layout of the network (Faxable format is probably the most frequently needed.)
3. Where do you start? The following provides suggestions on what to do:
a. Check whether there have been any similar problems in the last 30-60 days. If so, go straight to the
trouble spots and do not waste any time. Check the following items:
1.) Swap out (shotgun) any suspected defective devices with known good working spare devices.
(This shotgun method can be quick and effective.)
2.) Refer to system records and service contract information about replacements and on-site spares.
Good records on what happened before and how it was solved can save hours of troubleshooting.
b. If a known trouble spot is not the problem, use the cut-by-half algorithm as follows:
1.) Separate the network or problem area into two halves in order to get part of it working. Use
bridges or routers of the appropriate technology as needed.
2.) Keep narrowing the problem down as far as you can until you get to the troubled spot(s).
For example, suppose two hubs are connected together. Isolate one from the other so that both can
operate independently. See if one or both can come up and operate properly. Chances are that one
will come up. If not, refer to the respective manuals for operating and troubleshooting
information.
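The cut-by-half algorithm described in step 3.b is essentially a binary search over the network. The following Python sketch illustrates the idea under simplifying assumptions that the guide does not state: exactly one segment is faulty, and any group of segments can be isolated and tested on its own. The function `isolate_fault`, the `works` callback, and the hub names are hypothetical.

```python
from typing import Callable, List, Sequence


def isolate_fault(
    segments: Sequence[str],
    works: Callable[[Sequence[str]], bool],
) -> str:
    """Halve the candidate segments repeatedly until one faulty segment remains.

    Assumes exactly one segment is faulty and that works(group) reports
    whether an isolated group of segments operates properly.
    """
    if not segments:
        raise ValueError("no segments to test")
    candidates: List[str] = list(segments)
    while len(candidates) > 1:
        mid = len(candidates) // 2
        first_half = candidates[:mid]
        if works(first_half):
            # The first half came up, so the fault is in the other half.
            candidates = candidates[mid:]
        else:
            candidates = first_half
    return candidates[0]


# Hypothetical example: five hubs, one of which is faulty.
hubs = ["hub-A", "hub-B", "hub-C", "hub-D", "hub-E"]
faulty = "hub-D"
found = isolate_fault(hubs, lambda group: faulty not in group)
print(found)
```

Each test removes about half of the remaining candidates, so even a large network needs only a handful of isolation steps. With more than one fault, or with a fault that only appears when segments interact, the single-fault assumption no longer holds and the result needs to be verified by hand.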