How to identify processes that are waiting for disk I/O
With the top command you can easily see how much CPU each job uses. However, on Linux, processes blocked waiting for I/O also count toward the system load average. So in some cases the CPU is not actually busy, yet you still see a high load on the system because some processes are simply blocked on I/O.
How to identify these processes?
The approach I’m going to explain today is to find the processes that are in ‘D’ state. You may wonder, what’s that?
The process state codes
Before we get started, let’s review the process state codes from the ps man page:
    PROCESS STATE CODES
    Here are the different values that the s, stat and state output specifiers
    (header "STAT" or "S") will display to describe the state of a process.

    D    Uninterruptible sleep (usually IO)
    R    Running or runnable (on run queue)
    S    Interruptible sleep (waiting for an event to complete)
    T    Stopped, either by a job control signal or because it is being traced.
    W    paging (not valid since the 2.6.xx kernel)
    X    dead (should never be seen)
    Z    Defunct ("zombie") process, terminated but not reaped by its parent.

    For BSD formats and when the stat keyword is used, additional characters
    may be displayed:
    <    high-priority (not nice to other users)
    N    low-priority (nice to other users)
    L    has pages locked into memory (for real-time and custom IO)
    s    is a session leader
    l    is multi-threaded (using CLONE_THREAD, like NPTL pthreads do)
    +    is in the foreground process group
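To make the table concrete, here is a minimal sketch that expands a STAT value such as `D<` into the descriptions listed above. The function name `describe_stat` is made up for illustration, and it only knows the codes from this excerpt of the man page; newer versions of ps may show additional codes.

```shell
#!/bin/sh
# describe_stat: hypothetical helper (not part of ps) that expands a STAT
# value into the descriptions from the man page excerpt above.
describe_stat() {
    stat=$1
    # The first character is the main state code.
    case $stat in
        D*) echo "uninterruptible sleep (usually IO)" ;;
        R*) echo "running or runnable" ;;
        S*) echo "interruptible sleep" ;;
        T*) echo "stopped" ;;
        W*) echo "paging" ;;
        X*) echo "dead" ;;
        Z*) echo "defunct (zombie)" ;;
        *)  echo "unknown state" ;;
    esac
    # Any remaining characters are the BSD-format modifiers.
    rest=${stat#?}
    while [ -n "$rest" ]; do
        flag=${rest%"${rest#?}"}    # first character of $rest
        case $flag in
            '<') echo "high-priority" ;;
            N)   echo "low-priority" ;;
            L)   echo "has pages locked into memory" ;;
            s)   echo "session leader" ;;
            l)   echo "multi-threaded" ;;
            +)   echo "foreground process group" ;;
        esac
        rest=${rest#?}
    done
}

describe_stat 'D<'
```

For `D<` this prints one line for the state (uninterruptible sleep) and one for the modifier (high-priority).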
Identify processes in ‘D’ state with the ps command
Processes in ‘D’ state are in uninterruptible sleep. The Linux ps aux command shows the process state in the 8th column (STAT), so here is one way to quickly find the processes that are in ‘D’ state:
    # ps aux | awk '$8 ~ /D/ { print $0 }'
    root      9324  0.0  0.0   8316   436 ?  D<  Apr22   0:00 /sbin/blkid -o udev -p /dev/dm-0
    root     11917  0.0  0.0  15016   832 ?  D<  May06   0:00 /sbin/kpartx -a -p p /dev/dm-0
    root     12864  0.0  0.0  15016   696 ?  D<  Apr22   0:00 /sbin/kpartx -a -p p /dev/dm-0
    root     13210  0.0  0.0  15016   696 ?  D<  Apr22   0:00 /sbin/kpartx -a -p p /dev/dm-0
    root     15799  0.0  0.0   8316   508 ?  D<  May06  14:29 /sbin/blkid -o udev -p /dev/dm-0
    root     25825  0.0  0.0   8316   504 ?  D<  May06  14:16 /sbin/blkid -o udev -p /dev/dm-0
In the output above, you can see that several processes are blocked on device dm-0, so the next step is to look into that device and see what is going on there.
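The awk filter above depends on STAT being the 8th column of ps aux. A variant, sketched here as an assumption rather than a required change, is to ask ps for the columns explicitly with `-o`, so the state is always the first field, and to anchor the match with `^D` so it only tests the leading state code. The function name `filter_d_state` is made up for illustration.

```shell
#!/bin/sh
# filter_d_state: hypothetical helper. Reads "STAT PID COMMAND..." lines on
# stdin and prints only those whose state code starts with D.
filter_d_state() {
    awk '$1 ~ /^D/ { print }'
}

# On a live system the input would come from ps, for example:
#   ps -eo stat,pid,args | filter_d_state
# Here a canned sample is used instead: two rows taken from the output above
# plus one made-up row for a process that is not blocked.
filter_d_state <<'EOF'
STAT   PID COMMAND
D<    9324 /sbin/blkid -o udev -p /dev/dm-0
Ss       1 /sbin/init
D<   11917 /sbin/kpartx -a -p p /dev/dm-0
EOF
```

Only the two `D<` rows are printed; the header and the `Ss` row are dropped.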
To repeat the check and see whether the processes stay in the blocked I/O state, run it under watch:
    watch -d -n 1 "(ps aux | awk '\$8 ~ /D/ { print \$0 }')"
Or
    while true; do date; ps aux | awk '$8 ~ /D/ { print $0 }'; sleep 1; done
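If you only want to know whether the number of blocked processes is growing or shrinking, a bounded loop that prints one count per sample can be easier to read than the full listing. The following is a sketch under that assumption; `count_d_state` and `sample_d_state` are made-up names, and the column position assumes ps aux style input.

```shell
#!/bin/sh
# count_d_state: hypothetical helper. Reads `ps aux` style lines on stdin and
# prints how many have a STAT field (column 8) starting with D.
count_d_state() {
    awk '$8 ~ /^D/ { n++ } END { print n + 0 }'
}

# sample_d_state: hypothetical helper. Takes $1 samples, $2 seconds apart,
# and prints a timestamp plus the current D-state count for each sample.
sample_d_state() {
    samples=$1
    interval=$2
    i=0
    while [ "$i" -lt "$samples" ]; do
        printf '%s %s\n' "$(date '+%H:%M:%S')" "$(ps aux | count_d_state)"
        i=$((i + 1))
        # Sleep only between samples, not after the last one.
        if [ "$i" -lt "$samples" ]; then
            sleep "$interval"
        fi
    done
}

# Example on a live system (5 samples, 1 second apart):
#   sample_d_state 5 1
```

A count that stays above zero across all samples suggests the processes are stuck rather than briefly waiting.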
In the meantime, the iostat command should show a high %iowait:
    avg-cpu:  %user   %nice %system %iowait  %steal   %idle
               0.05    0.00    0.05   74.91    0.00   24.99
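If you want just the %iowait figure, for example to log it next to the D-state listing, the value can be pulled out of the avg-cpu report with awk. This is a minimal sketch that assumes the two-line header/value layout shown above; `extract_iowait` is a made-up name.

```shell
#!/bin/sh
# extract_iowait: hypothetical helper. Reads iostat output on stdin and prints
# the %iowait value from each avg-cpu report.
extract_iowait() {
    awk '
        /^avg-cpu:/ {
            col = 0
            # Find the %iowait header. The header line has the extra leading
            # "avg-cpu:" field, so the value sits one field to the left.
            for (i = 2; i <= NF; i++)
                if ($i == "%iowait") col = i - 1
            # The values are on the line after the header.
            if (col && (getline) > 0) print $col
        }
    '
}

# On a live system (assuming the sysstat package is installed):
#   iostat -c 1 3 | extract_iowait
# Here the sample report from above is used as input.
extract_iowait <<'EOF'
avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.05    0.00    0.05   74.91    0.00   24.99
EOF
```

Looking the column up by its header name rather than hard-coding a position keeps the sketch working if the set of columns differs between iostat versions.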