Performance Diagnosis: Overview (Linux)


Questions to Ask During Performance Issue Diagnosis

Here are some questions to help direct performance issue diagnosis along with hints on how to answer them. This information applies to all versions of UNIX, but some of the tools may only be available in Linux.
   
Is CPU Usage a Problem? How about Length of Time to Complete?
Use these to determine the amount of CPU an application is using.
top
ps

If the application is a heavy CPU user, it may mean that the process is CPU bound and that performance is not slowed by waiting for other resources.

Is the System CPU bound?
If the entire system is spending less than 5% of the total time in idle and wait modes, the system is CPU bound.
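The 5% rule above can be checked directly from two samples of /proc/stat. The following is a minimal sketch, assuming a Linux kernel whose aggregate "cpu" line lists user, nice, system, idle and iowait in that order; the function names and the threshold parameter are illustrative and do not come from any standard tool.

```python
def cpu_times(stat_text):
    """Return the counters from the aggregate 'cpu' line of /proc/stat text."""
    for line in stat_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "cpu":
            # Assumed order: user, nice, system, idle, iowait, irq, softirq, steal
            return [int(v) for v in fields[1:9]]
    raise ValueError("no aggregate 'cpu' line found")


def idle_wait_percent(before_text, after_text):
    """Percentage of the sampled interval spent in idle plus iowait."""
    before = cpu_times(before_text)
    after = cpu_times(after_text)
    deltas = [b - a for a, b in zip(before, after)]
    total = sum(deltas)
    if total == 0:
        raise ValueError("the two samples are identical")
    return 100.0 * (deltas[3] + deltas[4]) / total


def is_cpu_bound(before_text, after_text, threshold=5.0):
    """Apply the rule of thumb: less than `threshold` percent idle plus wait."""
    return idle_wait_percent(before_text, after_text) < threshold
```

To use it, read /proc/stat, sleep for a few seconds, read it again, and pass both texts to is_cpu_bound.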

Use these tools to determine if the system is CPU bound


Are One or More Processes Using Most of the System CPU?
Start with top
Use the time command to see whether the application is spending its time in kernel or user mode
Also, try oprofile to see where the application is spending its time.
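The user/kernel split that the time command reports can also be collected programmatically. This is a rough sketch using Python's standard resource module (UNIX only); user_and_system_time is a hypothetical helper name, and the child command in the example is just a placeholder workload.

```python
import resource
import subprocess
import sys


def user_and_system_time(argv):
    """Run argv to completion; return (user_seconds, system_seconds) it consumed."""
    before = resource.getrusage(resource.RUSAGE_CHILDREN)
    subprocess.run(argv, check=False)
    after = resource.getrusage(resource.RUSAGE_CHILDREN)
    # RUSAGE_CHILDREN accumulates the usage of children that have been waited for.
    return (after.ru_utime - before.ru_utime,
            after.ru_stime - before.ru_stime)


if __name__ == "__main__":
    # Placeholder workload: a short CPU-bound loop in a child Python process.
    user, system = user_and_system_time(
        [sys.executable, "-c", "sum(i * i for i in range(10**6))"])
    print(f"user {user:.2f}s  system {system:.2f}s")
```

A large system share suggests the application spends its time in system calls; a large user share points at its own code.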

Are One or More Processes Using Most of an Individual CPU?
Use top. Turn on Irix mode (press capital I while top is running) so that top reports each process's CPU usage as a percentage of a single processor rather than of the total system.

Is the Kernel Servicing Many Interrupts?
Run procinfo or cat /proc/interrupts to determine how many interrupts have been fired and what device is causing them.
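As an illustration of reading /proc/interrupts, the sketch below totals the per-CPU counters for each interrupt line and ranks the results. It assumes the usual layout (a header row of CPU names, then one "label: counts... description" row per interrupt); parse_interrupts and busiest are hypothetical helper names.

```python
def parse_interrupts(text):
    """Map each interrupt label to (total count across CPUs, description)."""
    lines = text.splitlines()
    cpu_count = len(lines[0].split())  # header row: CPU0 CPU1 ...
    result = {}
    for line in lines[1:]:
        label, sep, rest = line.partition(":")
        if not sep:
            continue
        tokens = rest.split()
        counts = []
        # Per-CPU counters come first; rows such as ERR carry a single counter.
        while tokens and len(counts) < cpu_count and tokens[0].isdigit():
            counts.append(int(tokens.pop(0)))
        result[label.strip()] = (sum(counts), " ".join(tokens))
    return result


def busiest(interrupts, n=3):
    """Return the n entries with the highest total counts."""
    return sorted(interrupts.items(), key=lambda kv: kv[1][0], reverse=True)[:n]
```

The counters are totals since boot, so comparing two snapshots taken a few seconds apart shows the current interrupt rate per device.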

Where is Time Spent in the Kernel?
Run oprofile and record which kernel functions consume a significant amount of time. Determining which subsystems are being used (for example, memory, network, scheduling, or disk) may provide an important clue.

Is the Amount of Swap Space Being Used Increasing?
Use top, vmstat, or procinfo
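Whether swap usage is increasing can also be tracked by sampling /proc/meminfo over time. The sketch below assumes the SwapTotal and SwapFree lines that Linux reports in kB; swap_used_kb and is_growing are illustrative names, and the definition of "growing" used here is only one reasonable choice.

```python
def swap_used_kb(meminfo_text):
    """Return swap in use (kB), computed as SwapTotal - SwapFree."""
    values = {}
    for line in meminfo_text.splitlines():
        key, sep, rest = line.partition(":")
        if sep and key in ("SwapTotal", "SwapFree"):
            values[key] = int(rest.split()[0])  # the value is reported in kB
    return values["SwapTotal"] - values["SwapFree"]


def is_growing(samples):
    """True if the samples never decrease and the last one exceeds the first."""
    return (len(samples) >= 2
            and all(b >= a for a, b in zip(samples, samples[1:]))
            and samples[-1] > samples[0])
```

Collect swap_used_kb once a minute or so and pass the list of readings to is_growing.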

Is the Application's Disk Usage a Problem?
Unfortunately, it is not easy with Linux to determine which processes are causing a lot of I/O.

Try this:
Use top to determine which processes are active
Use strace to trace all system calls the process is making
Look at /proc/<pid>/fd (where <pid> is the process ID) to see the symbolic links to the actual files being accessed
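The last step can be scripted. This is a small sketch that reads the symbolic links in a process's fd directory under /proc; open_file_links is a hypothetical helper, and the directory is passed in so that the caller supplies the process ID found with top.

```python
import os


def open_file_links(fd_dir):
    """Map each descriptor number in fd_dir to the target of its symbolic link."""
    links = {}
    for name in os.listdir(fd_dir):
        path = os.path.join(fd_dir, name)
        try:
            links[int(name)] = os.readlink(path)
        except (OSError, ValueError):
            # The descriptor was closed meanwhile, or the entry is not a numbered link.
            continue
    return links
```

For example, open_file_links("/proc/self/fd") lists the files the calling Python process itself has open; reading another user's process usually requires root.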

What is the amount of each type of disk I/O (read/write) on the system?
Use vmstat
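In vmstat output, the bi column counts blocks read from block devices and bo counts blocks written to them. The sketch below turns captured vmstat text into dictionaries keyed by column name, assuming the default Linux layout without the timestamp option; parse_vmstat is an illustrative name.

```python
def parse_vmstat(output):
    """Turn vmstat text output into one dict per sample row, keyed by column name."""
    rows, header = [], None
    for line in output.splitlines():
        tokens = line.split()
        if not tokens:
            continue
        if tokens[0] == "r":
            # Column-name row: r b swpd free buff cache si so bi bo ...
            header = tokens
        elif header and len(tokens) == len(header) and tokens[0].isdigit():
            rows.append(dict(zip(header, map(int, tokens))))
    return rows
```

Summing row["bi"] and row["bo"] over the samples gives the read and write volume for the observed period. The first row vmstat prints is an average since boot, so it is usually skipped.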

Which devices are servicing most of the disk I/O?
Use vmstat, iostat, or sar
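The per-device counters these tools report come from /proc/diskstats on Linux. As a hedged sketch, the code below ranks devices by sectors transferred, assuming the standard field order (major, minor, name, reads, reads merged, sectors read, ms reading, writes, writes merged, sectors written, ...); both function names are illustrative.

```python
def sectors_by_device(diskstats_text):
    """Map device name to (sectors read, sectors written) from /proc/diskstats text."""
    totals = {}
    for line in diskstats_text.splitlines():
        fields = line.split()
        if len(fields) < 14:
            continue
        # Index 2 is the device name, 5 is sectors read, 9 is sectors written.
        totals[fields[2]] = (int(fields[5]), int(fields[9]))
    return totals


def busiest_devices(totals):
    """Device names ordered from most to least sectors transferred."""
    return sorted(totals, key=lambda name: sum(totals[name]), reverse=True)
```

The file lists partitions as well as whole disks, and the counters are totals since boot, so compare two snapshots and look at whole-disk entries when ranking.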

How effectively is each of the disks fielding I/O requests?
Use iostat

Which processes are using a specific set of files?
Use lsof


Is the Application's Network Usage a Problem?
Use strace to trace all I/O system calls
Make note of the file descriptors
Look at /proc/<pid>/fd (where <pid> is the process ID) with ls -la to see the symbolic links
The links with socket in the name are links to network sockets
The developer can use this information to investigate how the sockets are used
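Picking out the socket links can be automated. The following sketch assumes the link targets have already been collected into a dictionary of descriptor number to target string, and that socket targets have the form socket:[inode] as Linux shows them; socket_inodes is a hypothetical helper.

```python
import re

# Linux shows socket descriptors as links of the form "socket:[<inode>]".
SOCKET_LINK = re.compile(r"^socket:\[(\d+)\]$")


def socket_inodes(links):
    """Given {fd: link target}, return {fd: socket inode} for the socket entries."""
    found = {}
    for fd, target in links.items():
        match = SOCKET_LINK.match(target)
        if match:
            found[fd] = int(match.group(1))
    return found
```

The descriptor numbers returned here are the ones to look for in the strace output. The inode can be matched against the inode column of /proc/net/tcp to find the connection's addresses.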



Further Considerations

Load Average
The load average is based on the number of processes that are running on or waiting for a CPU; each such process adds 1 to the load. (On Linux, processes in uninterruptible sleep, typically waiting on disk I/O, are counted as well.) A load average of 0 indicates that the computer is completely idle. A load average of 8 means that, on average, eight processes were using or waiting for CPU. Load average is displayed instead of current load because current load can vary greatly from moment to moment. The easiest way to display the load average is with the uptime command, but it is also reported by sar. Both display the load average of the system over the last 1, 5, and 15 minutes.
$ uptime
12:48  up  5:37, 6 users, load averages: 8.34 6.77 5.44
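For scripting, the three figures can be pulled out of the uptime line. This is a minimal sketch; parse_load_averages is an illustrative name, and the pattern is written to accept both the "load averages: a b c" form shown above and the comma-separated "load average: a, b, c" form that Linux prints.

```python
import re


def parse_load_averages(uptime_line):
    """Extract the 1, 5 and 15 minute load averages from a line of uptime output."""
    match = re.search(r"load averages?:\s*(.+)$", uptime_line)
    if not match:
        raise ValueError("no load averages found")
    values = [float(v) for v in re.findall(r"\d+\.\d+", match.group(1))]
    if len(values) != 3:
        raise ValueError("expected three load averages")
    return tuple(values)
```

Within Python itself, os.getloadavg() returns the same three numbers without running uptime.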

Notes on Diagnosing Memory Shortages
UNIX memory shortages affect performance once the system has to fall back on virtual memory. Therefore, investigating a possible memory shortage also means investigating virtual memory.

Notes on Diagnosing CPU Usage
At any moment a CPU is in one of seven states:
  • idle: waiting for something to do
  • user time: running user code
  • system time: executing code in the kernel on behalf of application
  • nice: running lower priority tasks that have been "niced"
  • iowait: waiting for I/O
  • irq state: in high-priority code handling an interrupt; use procinfo for statistics
  • softirq mode: executing kernel code that was triggered by an interrupt, but running at lower priority; use mpstat for statistics

You can get statistics on most of these with procinfo, mpstat, sar, and vmstat.
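On Linux these tools take their numbers from /proc/stat. The sketch below maps the first seven counters of the aggregate "cpu" line onto the seven states listed above and converts them to percentages. It assumes the order user, nice, system, idle, iowait, irq, softirq; cpu_state_percentages is a hypothetical helper.

```python
# Assumed order of the first seven counters on the aggregate "cpu" line.
STATES = ("user", "nice", "system", "idle", "iowait", "irq", "softirq")


def cpu_state_percentages(stat_text):
    """Return {state: percent of the total across the seven states}."""
    for line in stat_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "cpu":
            ticks = [int(v) for v in fields[1:1 + len(STATES)]]
            total = sum(ticks)
            return {state: 100.0 * t / total for state, t in zip(STATES, ticks)}
    raise ValueError("no aggregate 'cpu' line found")
```

A single snapshot gives averages since boot. For current behavior, subtract the counters of an earlier snapshot before computing the percentages.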


    

Suggestions for Future Learning
This information will be in the second edition of UNIX For Application Support Staff. The ETA for the second edition is December 1, 2016.


