The “atop” Explained

atop is a program for viewing the load on a Linux system. It shows which processes are responsible for the system load and how each process uses CPU, memory, disk and network.

By default, atop refreshes its output every 10 seconds, showing the system load over that interval. The upper part of the screen shows the load on the system as a whole; the lower part shows statistics for each process and how much it affected the system.

When atop is started, it switches on the ‘process accounting mechanism’ in the kernel. This tells the kernel to write a record with information to a file whenever a process ends. atop can also interpret these records from disk. When atop is closed or a keyboard interrupt signal is sent, the ‘process accounting mechanism’ stops and the file with records gets stored.

Colors

The atop utility uses colors to indicate that a critical percentage has been reached or almost reached. A critical percentage means that the load is likely to degrade the performance of processes.

CPU: Busy percentage of 90% or higher is considered critical.
Disk: Busy percentage of 70% or higher is considered critical.
Network: Busy percentage of 90% or higher is considered critical (per interface).
Memory: Busy percentage of 90% or higher is considered critical.
Swap: Busy percentage of 80% or higher is considered critical.
NOTE: These thresholds can be modified in the configuration file.

What happens when a threshold is reached? The entire screen line turns red. When a resource exceeds 80% of its critical percentage (almost critical), the entire screen line turns cyan. Coloring can be switched off by pressing the ‘x’ key.
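The coloring rule can be sketched in a few lines of Python. This is only an illustration, not atop's own code: the function name `line_color` and the dictionary are made up for the example, and it assumes that "almost critical" means 80% of the critical percentage of each resource.

```python
# Default critical percentages as listed above (configurable in atop).
CRITICAL = {"cpu": 90, "disk": 70, "network": 90, "memory": 90, "swap": 80}

# Assumption: "almost critical" is 80% of the critical percentage.
ALMOST_FACTOR = 0.80


def line_color(resource: str, busy_pct: float) -> str:
    """Return the color a resource line would get under the rule above."""
    critical = CRITICAL[resource]
    if busy_pct >= critical:
        return "red"
    if busy_pct >= critical * ALMOST_FACTOR:
        return "cyan"
    return "default"


if __name__ == "__main__":
    for resource, pct in [("cpu", 95), ("disk", 60), ("memory", 50)]:
        print(resource, pct, line_color(resource, pct))
```

With these defaults, a disk that is 60% busy is already "almost critical", because 80% of the 70% disk threshold is 56%.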

Generic output (default screen) explained

PID: Process ID
SYSCPU: CPU consumption in system mode
USRCPU: CPU consumption in user mode
VGROW: Virtual memory growth of process
RGROW: Resident (real) memory growth of process
ST: Status code for process
EXC: Exit code for process
THR: Number of threads in thread group
S: State of process (S for sleeping, I for
CPU: CPU occupation percentage
CMD: Process name

Memory related output with m flag

PID: Process ID
MINFLT: Minor memory faults
MAJFLT: Major memory faults
VSTEXT: Size of virtual shared text
VSIZE: Total virtual process size
RSIZE: Total resident process size
VGROW: Total virtual growth during last interval
RGROW: Total resident growth during last interval
MEM: Memory occupation percentage
CMD: Process name

Disk related output with d flag

PID: Process ID
RDDSK: Amount of data read from disk
WRDSK: Amount of data that was written to disk
WCANCL: Amount of data that was written but has been withdrawn again
DSK: Disk occupation percentage
CMD: Process name

Network related output with n flag

PID: Process ID
Number of received TCP packets
Number of sent TCP packets
Number of received UDP packets
Number of sent UDP packets
Number of received and sent raw packets in one column
Network occupation percentage
Process name

Command Line of process with c flag

PID: Process ID
CPU: CPU occupation
COMMAND-LINE: Command of process including arguments

Process activity per user with u flag

NPROCS: Number of processes active or terminated
SYSCPU: CPU consumption in system mode
USRCPU: CPU consumption in user mode
VSIZE: Virtual memory space consumed by active processes
RSIZE: Resident memory space consumed by active processes
CPU: Occupation percentage for CPU
RUID: User name

Sorting lower screen output

C: Process with highest CPU consumption is first (CPU)
M: Process with highest memory consumption is first (MEM)
D: Process with highest disk consumption is first (DSK)
N: Process with highest number of packets received/transmitted is first
A: Show most resource consuming process first

Other flags
z: Pause the screen
i: Modify the refresh interval (for example, enter 5 for 5 seconds)

System Level Information:

PRC line: Process level totals
sys: Total CPU time consumed in system mode
user: Total CPU time consumed in user mode
#proc: Total number of processes present at this moment
#trun: Total number of running threads present at this moment
#tslpi: Total number of sleeping interruptible threads present at this moment
#tslpu: Total number of sleeping uninterruptible threads present at this moment
#zombie: Total number of zombie processes
clones: Number of clone system calls
#exit: Number of processes that ended during this interval

CPU: shows total occupation of all CPUs together

cpu: shows occupation of individual processor
sys: percentage of CPU time spent in kernel mode by all active processes
user: percentage of CPU time consumed in user mode for all active processes (including processes running with nice value larger than zero)
irq: percentage of CPU time spent for interrupt handling
idle: percentage of unused CPU time while no processes were waiting for disk I/O
wait: percentage of unused CPU time while at least one process was waiting for disk I/O
steal: for virtual machines only, shows CPU steal percentage which tells percentage of CPU stolen by other virtual machines running on same hardware
guest: if this machine hosts virtual machine, this tells percentage of CPU time used by virtual machines
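To make the percentages above concrete, here is a minimal Python sketch of how such values can be derived from two snapshots of cumulative CPU time counters (for example, the tick counters in /proc/stat). The function name `cpu_percentages` and the dictionary keys are chosen for this example only, and the result is normalized to 100% in total, which is a simplification.

```python
def cpu_percentages(before: dict, after: dict) -> dict:
    """Turn two snapshots of cumulative CPU tick counters into percentages.

    Both dictionaries map a category name (sys, user, irq, idle, wait, ...)
    to the number of ticks spent in that category since boot.
    """
    deltas = {name: after[name] - before[name] for name in before}
    total = sum(deltas.values())
    if total == 0:
        # No time passed between the snapshots.
        return {name: 0.0 for name in deltas}
    return {name: 100.0 * delta / total for name, delta in deltas.items()}


if __name__ == "__main__":
    t0 = {"sys": 100, "user": 400, "irq": 5, "idle": 900, "wait": 20}
    t1 = {"sys": 110, "user": 430, "irq": 5, "idle": 950, "wait": 30}
    for name, pct in cpu_percentages(t0, t1).items():
        print(f"{name}: {pct:.1f}%")
```

In this example 100 ticks passed between the snapshots, so the tick deltas (10 sys, 30 user, 0 irq, 50 idle, 10 wait) read directly as percentages.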

CPL: shows CPU load information

avg: average number of threads that are available to run on CPU or that are waiting for disk I/O (load average, shown for the last 1, 5 and 15 minutes)
csw: number of context switches
intr: number of serviced interrupts
numcpu: number of CPUs (cores) in the system

MEM: shows memory occupation

tot: total amount of physical memory
free: amount of memory which is currently free
cache: amount of memory used as cache
dirty: amount of memory within page cache that has to be written (currently sits in memory)
buff: amount of memory used for filesystem metadata
slab: amount of memory used for kernel memory allocations (mallocs)

SWP: shows swap occupation and overcommit info

tot: total amount of swap space on disk
free: amount of free swap space
vmcom: committed virtual memory space. This is the virtual space reserved for all allocations of private memory by processes. The kernel only verifies whether the committed space exceeds the limit if strict overcommit handling is configured (vm.overcommit_memory = 2)
vmlim: swap size plus 50% of memory size
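The vmlim arithmetic can be written out as a small Python sketch. The function names are invented for this example, and the 50% figure is taken as a parameter on the assumption that it corresponds to the kernel's vm.overcommit_ratio setting (default 50); huge pages and other details are ignored.

```python
def commit_limit_kib(swap_kib: int, mem_kib: int, overcommit_ratio: int = 50) -> int:
    """Approximate vmlim: swap size plus a percentage of physical memory."""
    return swap_kib + mem_kib * overcommit_ratio // 100


def exceeds_limit(vmcom_kib: int, vmlim_kib: int) -> bool:
    """True when committed space is above the limit (only enforced in strict mode)."""
    return vmcom_kib > vmlim_kib


if __name__ == "__main__":
    GIB = 1024 * 1024  # KiB per GiB
    vmlim = commit_limit_kib(swap_kib=2 * GIB, mem_kib=8 * GIB)
    print("vmlim:", vmlim // GIB, "GiB")
    print("over limit:", exceeds_limit(7 * GIB, vmlim))
```

For a machine with 8 GiB of memory and 2 GiB of swap this gives a limit of 6 GiB, so a vmcom of 7 GiB would be over the limit.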

LVM / MDD / DSK: shows logical volume, multiple device (RAID) and disk utilization

busy: shows busy percentage. In other words, shows portion of time that unit was busy handling requests
read: number of read requests issued
write: number of write requests issued
KiB/r: number of KiBytes per read
KiB/w: number of KiBytes per write
MBr/s: number of MiBytes per second throughput for reads
MBw/s: number of MiBytes per second throughput for writes
avq: average queue depth
avio: average number of milliseconds needed by a request for data transfer
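The relationship between these disk columns is easier to see with numbers. The following Python sketch derives them from raw counters collected over one interval; the function `disk_stats` and its parameter names are hypothetical, and the formulas are a plausible reading of the column descriptions rather than atop's exact implementation.

```python
def disk_stats(reads: int, writes: int, kib_read: float, kib_written: float,
               busy_ms: float, interval_s: float) -> dict:
    """Derive per-interval disk figures from raw counters.

    busy_ms is the time the unit spent handling requests during the interval.
    """
    requests = reads + writes
    return {
        "busy_pct": 100.0 * busy_ms / (interval_s * 1000),
        "KiB/r": kib_read / reads if reads else 0.0,
        "KiB/w": kib_written / writes if writes else 0.0,
        "MBr/s": kib_read / 1024 / interval_s,
        "MBw/s": kib_written / 1024 / interval_s,
        "avio_ms": busy_ms / requests if requests else 0.0,
    }


if __name__ == "__main__":
    stats = disk_stats(reads=100, writes=50, kib_read=6400, kib_written=1600,
                       busy_ms=3000, interval_s=10)
    for name, value in stats.items():
        print(f"{name}: {value}")
```

Here a disk that was busy for 3 of the 10 seconds shows 30% busy, and 150 requests in 3000 ms of busy time give an avio of 20 ms.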

NET: shows network utilization

First line shows activity for transport layer (TCP/UDP)
Second line shows activity for IP layer
Other NET lines are showing activity per active interface

Line for TCP/UDP:

tcpi: number of received TCP packets
tcpo: number of sent TCP packets
udpi: number of received UDP packets
udpo: number of sent UDP packets
tcpao: number of active TCP opens
tcppo: number of passive TCP opens
tcprs: number of TCP output retransmissions
tcpie: number of TCP input errors
tcpor: number of TCP output resets
udpnp: number of UDP no ports
udpie: number of UDP input errors

Line for IP layer:

ipi: number of IP packets received from interface
ipo: number of IP packets destined for transmission
ipfrw: number of received IP packets which were forwarded to other interface
deliv: number of IP packets which were delivered to higher-layer protocols
icmpi: number of received ICMP datagrams
icmpo: number of sent ICMP datagrams

Line for individual active interface (sorted by interface activity):

lo: Name of interface and its own busy percentage (near name, same column)
pcki: number of received packets
pcko: number of transmitted packets
si: amount of received bits per second
so: amount of sent bits per second
coll: number of collisions
mlti: number of received multicast packets
erri: number of errors while receiving packet
erro: number of errors while sending packet
drpi: number of received packets dropped
drpo: number of sent packets dropped
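As an illustration of the si/so values and the interface busy percentage, the Python sketch below converts byte counters into bits per second and relates them to the link speed. The function names are made up for this example, and the busy rule (the larger direction on a full-duplex link, the sum of both on a half-duplex link) is an assumption about how such a percentage can be computed.

```python
def bits_per_second(bytes_before: int, bytes_after: int, interval_s: float) -> float:
    """Convert a growth in a byte counter into bits per second."""
    return (bytes_after - bytes_before) * 8 / interval_s


def interface_busy_pct(si_bps: float, so_bps: float, speed_bps: float,
                       full_duplex: bool = True) -> float:
    """Estimate how busy an interface is relative to its link speed."""
    # Assumption: full duplex can carry both directions at full speed,
    # so only the busier direction counts; half duplex shares the link.
    used = max(si_bps, so_bps) if full_duplex else si_bps + so_bps
    return 100.0 * used / speed_bps


if __name__ == "__main__":
    si = bits_per_second(0, 12_500_000, 10)   # received bytes in 10 seconds
    so = bits_per_second(0, 2_500_000, 10)    # sent bytes in 10 seconds
    print("si:", si, "so:", so)
    print("busy:", interface_busy_pct(si, so, speed_bps=100_000_000), "%")
```

Receiving 12.5 MB in 10 seconds is 10 Mbit/s, which is 10% of a 100 Mbit/s link.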
