News

April 30, 2010

ATOP version 1.25

March 12, 2010

Patches kernel version 2.6.33

March 12, 2010

Complete redesign of ATOP website

January 30, 2010

Patches kernel version 2.6.31.12

Screenshots

The full-screen output of atop consists of a top half with system level statistics and a bottom half with process level statistics. The type of process level statistics can be changed by pressing certain keys, as shown by these screenshots. This page does not describe the meaning of every single counter; a full description can be found in the atop manual page.
Notice that for some screenshots an additional example is provided showing the dynamic extension of columns whenever the window has been widened (more than 80 positions).


Generic information — default

The generic screen gives an overview of the consumption on system and process level of the four major hardware resources, i.e. cpu, memory, disk and network. Since the kernel does not maintain per-process network statistics, network consumption on process level is only shown when you have installed the kernel patches.

Some details: In the line with the label PRC the counter '#exit' shows that one process has finished during the last interval. The bottom half shows which process: process 'find' with process-id 31085. Before it died, it consumed .45 seconds cpu-time in system mode and .10 seconds cpu-time in user mode, so .55 cpu-seconds in total (2.75% of one cpu during the interval of 20 seconds). The column ST (state) shows 'E' (exit) and the column EXCODE shows the exit code of the process, which is 1 (an exit code of 0 would indicate a normal run).
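
The percentage quoted above follows from simple arithmetic. The sketch below is only an illustration of that calculation, not code taken from atop; the function name and its parameters are hypothetical.

```python
def cpu_percentage(sys_seconds, user_seconds, interval_seconds):
    """Return the cpu time consumed as a percentage of one cpu.

    Hypothetical helper for illustration; atop does not expose this function.
    """
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    total_cpu_seconds = sys_seconds + user_seconds
    return 100.0 * total_cpu_seconds / interval_seconds


# Values quoted for process 'find': .45s system, .10s user, 20s interval.
print(round(cpu_percentage(0.45, 0.10, 20), 2))  # 2.75
```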

Generic information — default (wider window)

Compared with the previous screenshot, this wider one also shows the real and effective uid, the number of threads and the current cpu-number of the main thread.


Scheduling information — key 's'

This screen shows specific scheduling information about the main thread of each process, like scheduling policy, nice value, priority, realtime priority and cpu-number (current or last used) and state.
Furthermore it shows how many threads within this process are in state 'running' (busy on cpu or waiting in the runqueue), 'interruptible sleeping' or 'non-interruptible sleeping'. The total number of threads can be determined by accumulating these three values (columns TRUN, TSLPI and TSLPU).

Some details: The process 'chrome' with process-id 30549 runs with 4 threads in total; one of these threads is 'running' and three are interruptible sleeping. The running thread appears to be the main thread of the process, because the state of the main thread (column S) is 'R'.
The process 'firefox' with process-id 4680 runs with 8 threads in total, of which one is 'running' (but not the main thread).
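
The thread total for 'chrome' can be reproduced by adding up the three thread-state columns. This is a minimal sketch with a hypothetical function name; the value 0 for TSLPU is an assumption derived from the text (one running plus three interruptible sleeping threads already make up the total of 4).

```python
def total_threads(trun, tslpi, tslpu):
    """Sum the thread-state columns TRUN, TSLPI and TSLPU.

    Hypothetical helper for illustration only.
    """
    for name, value in (("trun", trun), ("tslpi", tslpi), ("tslpu", tslpu)):
        if value < 0:
            raise ValueError(f"{name} must not be negative")
    return trun + tslpi + tslpu


# 'chrome': one running thread, three interruptible sleeping threads,
# and (assumed) no non-interruptible sleeping threads.
print(total_threads(trun=1, tslpi=3, tslpu=0))  # 4
```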


Memory consumption — key 'm'

This screen shows specific memory-related information per process like total virtual and resident size (column VSIZE and RSIZE) and the virtual and resident growth during the last interval (column VGROW and RGROW). The memory percentage (column MEM) shows the resident memory occupation by this process, because that is what matters when your system starts swapping.

Some details: In the line with the label PAG the counters 'swin' (swapins) and 'swout' (swapouts) show that this system suffers from memory overload. In the line with label LVM for logical volume 'vg00-lvswap' the 'read' and 'write' counters exactly match the 'swin' and 'swout' counters. This logical volume is also the one most responsible for the heavy load on the underlying disk 'sda'.
On process level a lot of negative resident growth (column RGROW) can be seen because processes lose their resident pages by swapout. It appears that process 'lekker' with process-id 31048 grows heavily due to a memory leak; its resident size is currently 1.5 Gbytes (total memory 3.8 Gbytes).
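
To put the sizes quoted for 'lekker' in proportion, the resident size can be expressed as a share of total memory, which is what the MEM column represents. The sketch below is an assumption about how such a percentage is derived, with a hypothetical function name; the result is computed from the two sizes in the text and is not read from the screenshot.

```python
def resident_percentage(resident_gbytes, total_gbytes):
    """Return resident memory as a percentage of total physical memory.

    Hypothetical helper for illustration only.
    """
    if total_gbytes <= 0:
        raise ValueError("total_gbytes must be positive")
    return 100.0 * resident_gbytes / total_gbytes


# 'lekker': 1.5 Gbytes resident on a system with 3.8 Gbytes of memory.
print(round(resident_percentage(1.5, 3.8), 1))  # 39.5
```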

Memory consumption — key 'm' (wider window)

Compared with the previous screenshot, this wider one also shows the real and effective uid.


Disk utilization — key 'd'

The lines with label LVM (logical volumes) and DSK (underlying physical disks) show the disk activity on system level.
On process level the disk activity is shown as the amount of data transferred by reads (column RDDSK) and writes (column WRDSK). Usually the written data is stored in the in-memory page cache before it is physically written to disk. When data is written to the page cache but destroyed before it is physically written to disk, that amount is reported as cancelled (column WCANCL).

Some details: The line with the label DSK shows that disk 'sda' was busy for 47% of the last interval, issuing 3096 read requests and 40 write requests. So most of the disk utilization is caused by reads issued by processes.
The process that has read most data is 'bash' with process-id 21091. That process transferred 150MB (which is 97% of all accounted disk transfer). Since other processes did not transfer much data, 'bash' seems to have made the disk 47% busy during an interval of 20 seconds, which cannot be true. And it is not true! The process 'find' with process-id 31035 has issued most disk transfers, but it finished during the interval. In that case atop obtains its information from the process accounting record of the exited process; unfortunately, the number of disk transfers is not registered there.

Disk utilization — key 'd' (wider window)

When this wider screenshot is compared with the previous one, columns are added for the system level statistics, like the number of Kbytes transferred per read and write request, the total throughput per second for reading and writing, and the average number of requests in the request-queue of the disk driver.
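
The per-request size and per-second throughput columns are averages over the interval. The sketch below shows one plausible way to derive them; it is an assumption, not atop's own code. The function name is hypothetical, and the figure of 154800 Kbytes is invented for the example (only the 3096 read requests and the 20 second interval come from the text above).

```python
def disk_columns(requests, kbytes, interval_seconds):
    """Return (Kbytes per request, Kbytes per second) for one direction.

    Hypothetical helper for illustration only.
    """
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    # Avoid dividing by zero when no requests were issued in the interval.
    kb_per_request = kbytes / requests if requests else 0.0
    kb_per_second = kbytes / interval_seconds
    return kb_per_request, kb_per_second


# 3096 read requests in 20 seconds; 154800 Kbytes is a made-up total.
print(disk_columns(3096, 154800, 20))  # (50.0, 7740.0)
```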


Variable information — key 'v'

This screen shows miscellaneous information about processes, like credentials (real uid and real gid), parent process-id, start date and start time, etcetera.

Variable information — key 'v' (wider window)

Compared with the previous screenshot, this wider one shows all flavors of uid and gid, and the exact end date and end time are shown for processes that finished during the interval.


Command line — key 'c'

This screen shows the command line of the processes. If the window is widened, more command line arguments are shown.


Accumulated per program — key 'p'

This screen shows in the rightmost column which programs are active (or have been active during the last interval) and in the leftmost column how many processes (incarnations) of each program exist. The columns in between show the accumulated cpu consumption, the accumulated virtual and resident memory consumption (notice that the shared parts are counted for every process, so this figure is far too high), the accumulated data transferred from/to disk and (only in case of a patched kernel) the accumulated network transfers.


Accumulated per user — key 'u'

This screen shows in the rightmost column which users are active (or have been active during the last interval) and in the leftmost column how many processes each user runs/ran. The columns in between show the accumulated cpu consumption, the accumulated virtual and resident memory consumption (notice that the shared parts are counted for every process, so this figure is far too high), the accumulated data transferred from/to disk and (only in case of a patched kernel) the accumulated network transfers.


Disk utilization — key 'd' (patched kernel)

This screen shows the disk activity per process when the kernel patches related to atop have been installed. With these patches disk transfers are accounted to the process concerned at the moment of the physical access. The number of read and write transfers per process is shown, as well as the average block size per read and write, and the total transfer rate for read and write. These figures are shown even for finished processes (like 'grep' in the example) thanks to the extended process accounting record.


Network utilization — key 'n' (patched kernel)

This screen shows the network activity per process when the kernel patches related to atop have been installed. With these patches network transfers are accounted to the process concerned at the system call level, i.e. at the moment that data is sent to or received from the socket by the process. The number of receives and sends is shown for TCP, UDP and raw traffic, even for finished processes. If the window is widened, the average size per transfer is also shown (in this example only for TCP-sends).