How to monitor Linux operations?

All of our customers run Linux, and across many flavors: Ubuntu, CentOS, Red Hat Linux, Oracle Linux, SUSE Linux and others. Although we are a full-service MySQL shop, our consulting, support and managed services are never restricted to MySQL operations alone. We are also experts in Linux, DevOps and Site Reliability Engineering (SRE), and we have proven methods for delivering Linux performance audits, health checks, diagnostics and recommendations. This post covers the tools we use regularly at MinervaDB for monitoring Linux operations:

How long has the Linux server been up and running?

[root@localhost ~]# uptime 
 12:32:56 up 1800 min,  83 users,  load average: 88.01, 88.52, 88.64
[root@localhost ~]#
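
A load average is easiest to read relative to the number of CPUs: a value well above the CPU count usually means more work is queued than the host can serve. The sketch below makes that comparison explicit. It assumes a Linux host that exposes /proc/uptime, /proc/loadavg and /proc/cpuinfo, and the helper names load_status and format_uptime are hypothetical, introduced only for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: compare a 1-minute load average with a CPU count.
# Arguments: load average, number of CPUs (both supplied by the caller).
load_status() {
    awk -v avg="$1" -v cpus="$2" 'BEGIN {
        if (avg > cpus) print "overloaded"; else print "ok"
    }'
}

# Hypothetical helper: convert seconds of uptime into "Nd Nh Nm".
format_uptime() {
    awk -v s="$1" 'BEGIN {
        s = int(s)
        printf "%dd %dh %dm\n", s / 86400, (s % 86400) / 3600, (s % 3600) / 60
    }'
}

# /proc/uptime and /proc/loadavg hold the raw values that uptime reports.
if [ -r /proc/uptime ] && [ -r /proc/loadavg ] && [ -r /proc/cpuinfo ]; then
    read -r up_seconds _ < /proc/uptime
    read -r load1 _ < /proc/loadavg
    cpus=$(grep -c '^processor' /proc/cpuinfo)
    echo "up: $(format_uptime "$up_seconds")"
    echo "load1: $load1 on $cpus CPU(s): $(load_status "$load1" "$cpus")"
fi
```

The threshold of one runnable task per CPU is only a rule of thumb; pick a limit that matches the workload.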

Print all the processes running as root

[root@localhost ~]# ps -U root -u root 
  PID TTY          TIME CMD
    1 ?        00:00:00 systemd
    2 ?        00:00:00 kthreadd
    3 ?        00:00:00 ksoftirqd/0
    4 ?        00:00:00 kworker/0:0
    5 ?        00:00:00 kworker/0:0H
    6 ?        00:00:00 kworker/u2:0
    7 ?        00:00:00 migration/0
    8 ?        00:00:00 rcu_bh
    9 ?        00:00:00 rcu_sched
   10 ?        00:00:00 watchdog/0
   12 ?        00:00:00 kdevtmpfs
   13 ?        00:00:00 netns
   14 ?        00:00:00 khungtaskd
   15 ?        00:00:00 writeback
   16 ?        00:00:00 kintegrityd
   17 ?        00:00:00 bioset
   18 ?        00:00:00 kblockd

 

Print all the processes on the system (ps -A and ps -e are equivalent)

[root@localhost ~]# ps -A
  PID TTY          TIME CMD
    1 ?        00:00:00 systemd
    2 ?        00:00:00 kthreadd
    3 ?        00:00:00 ksoftirqd/0
    4 ?        00:00:00 kworker/0:0

[root@localhost ~]# ps -e 
  PID TTY          TIME CMD
    1 ?        00:00:00 systemd
    2 ?        00:00:00 kthreadd
    3 ?        00:00:00 ksoftirqd/0
    4 ?        00:00:00 kworker/0:0
    5 ?        00:00:00 kworker/0:0H

Print all the processes running under the mysql group (-G selects by real group name or ID)

[root@localhost ~]# ps -fG mysql 
UID        PID  PPID  C STIME TTY          TIME CMD
mysql     1165     1  0 10:54 ?        00:00:00 /usr/sbin/mysqld --daemonize --pid-file=/var/run/mysqld/mysqld.pid
[root@localhost ~]# 

top

Displays all actively running processes in real time. The metrics reported by the Linux “top” command include CPU usage, memory usage, swap memory, cache size, buffer size, process PID, user and command.

[root@localhost ~]# top

top - 11:37:31 up 43 min,  3 users,  load average: 0.50, 0.15, 0.08
Tasks:  90 total,   1 running,  89 sleeping,   0 stopped,   0 zombie
%Cpu(s):100.0 us,  0.0 sy,  0.0 ni,  0.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
KiB Mem :  1016156 total,   541664 free,   307024 used,   167468 buff/cache
KiB Swap:  2097148 total,  2097148 free,        0 used.   553344 avail Mem 

  PID USER      PR  NI    VIRT    RES    SHR S %CPU %MEM     TIME+ COMMAND                                            
 1412 root      20   0   31068   1720   1324 S 99.7  0.2   0:43.38 sysbench                                           
    1 root      20   0  125572   4096   2496 S  0.0  0.4   0:00.76 systemd                                            
    2 root      20   0       0      0      0 S  0.0  0.0   0:00.00 kthreadd                                           
    3 root      20   0       0      0      0 S  0.0  0.0   0:00.04 ksoftirqd/0                                        
    5 root       0 -20       0      0      0 S  0.0  0.0   0:00.00 kworker/0:0H                                       
    6 root      20   0       0      0      0 S  0.0  0.0   0:00.00 kworker/u2:0                                       
    7 root      rt   0       0      0      0 S  0.0  0.0   0:00.00 migration/0                                        
    8 root      20   0       0      0      0 S  0.0  0.0   0:00.00 rcu_bh                                             
    9 root      20   0       0      0      0 S  0.0  0.0   0:00.15 rcu_sched                                          

Press “c” while top is running to toggle between the program name and the full command line (path and arguments) of each process

top - 11:43:20 up 49 min,  3 users,  load average: 1.27, 0.88, 0.43
Tasks:  90 total,   1 running,  88 sleeping,   1 stopped,   0 zombie
%Cpu(s): 99.2 us,  0.0 sy,  0.0 ni,  0.0 id,  0.0 wa,  0.0 hi,  0.8 si,  0.0 st
KiB Mem :  1016156 total,   540768 free,   307752 used,   167636 buff/cache
KiB Swap:  2097148 total,  2097148 free,        0 used.   552484 avail Mem 

  PID USER      PR  NI    VIRT    RES    SHR S %CPU %MEM     TIME+ COMMAND                                            
 1412 root      20   0   31068   1720   1324 S 99.7  0.2   6:31.61 sysbench --test=cpu --cpu-max-prime=300000000 run  
 1415 root      20   0       0      0      0 S  0.3  0.0   0:00.14 [kworker/0:3]                                      
    1 root      20   0  125572   4096   2496 S  0.0  0.4   0:00.77 /usr/lib/systemd/systemd --switched-root --system +
    2 root      20   0       0      0      0 S  0.0  0.0   0:00.00 [kthreadd]                                         
    3 root      20   0       0      0      0 S  0.0  0.0   0:00.04 [ksoftirqd/0]                                      
    5 root       0 -20       0      0      0 S  0.0  0.0   0:00.00 [kworker/0:0H]                                     
    6 root      20   0       0      0      0 S  0.0  0.0   0:00.00 [kworker/u2:0]                                     
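
For scripts and scheduled checks, top can also run non-interactively: -b selects batch mode and -n limits the number of iterations. The sketch below extracts the idle figure from the %Cpu(s) summary line and prints the busy percentage. The helper name cpu_busy_from_top is hypothetical, and the parsing assumes the summary layout shown above with "." as the decimal separator.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read top's batch output on stdin and print the
# busy CPU percentage, computed as 100 minus the number before " id,".
cpu_busy_from_top() {
    awk '/^%Cpu/ {
        if (match($0, /[0-9.]+ id,/)) {
            idle = substr($0, RSTART, RLENGTH)
            sub(/ id,/, "", idle)          # keep only the number
            printf "%.1f\n", 100 - idle
            exit
        }
    }'
}

# Usage (one snapshot, then exit):
#   top -b -n 1 | cpu_busy_from_top
```

Matching on the text before " id," rather than on a fixed field number keeps the helper working when top prints a value such as 100.0 with no space after the preceding comma or colon.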

Monitor CPU statistics using iostat

[root@localhost ~]# iostat -c
Linux 3.10.0-693.21.1.el7.x86_64 (localhost.localdomain) 	04/21/2018 	_x86_64_	(1 CPU)

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           4.73    0.00    0.28    0.15    0.00   94.84
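
The first report iostat prints is an average since boot, based on the cumulative counters on the "cpu" line of /proc/stat. As a rough sketch of where such percentages come from, the hypothetical helper cpu_since_boot below computes them from the raw counters. It assumes the standard field order (user, nice, system, idle, iowait, irq, softirq, steal) and reports the raw system counter on its own, so its system figure may differ slightly from the %system column iostat prints.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print since-boot CPU percentages from a file in
# /proc/stat format (defaults to /proc/stat itself).
cpu_since_boot() {
    awk '$1 == "cpu" {
        total = 0
        for (i = 2; i <= 9; i++) total += $i   # user .. steal
        printf "user=%.2f system=%.2f iowait=%.2f idle=%.2f\n",
               100 * $2 / total, 100 * $4 / total,
               100 * $6 / total, 100 * $5 / total
    }' "${1:-/proc/stat}"
}

if [ -r /proc/stat ]; then
    cpu_since_boot
fi
```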

Top five CPU consuming processes

[root@localhost ~]# watch "ps aux | sort -nrk 3,3 | head -n 5"

Every 2.0s: ps aux | sort -nrk 3,3 | head -n 5                                                          Sat Apr 21 15:44:49 2018

root	  2658 99.8  0.2  39160  2816 pts/0    Sl+  15:38   6:41 sysbench --test=cpu --cpu-max-prime=30000000 --num-threads=120
mysql     1165  1.8 24.4 1152160 248112 ?      Sl   10:54   5:20 /usr/sbin/mysqld --daemonize --pid-file=/var/run/mysqld/mysqld.
USER	   PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root	   932  0.0  0.0 113372   940 ?        S    10:54   0:00 /sbin/dhclient -d -q -sf /usr/libexec/nm-dhcp-helper -pf /var/r
root        92  0.0  0.0      0     0 ?        S    10:54   0:00 [kauditd]
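
Because the whole ps aux output passes through sort, the header row is sorted along with the data and lands in the middle of the listing, as seen above. One way to keep it on top is to peel it off before sorting. The helper name top_cpu is hypothetical; this is a sketch that assumes the ps aux column order, where %CPU is the third column.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read `ps aux` output on stdin, print the header
# first, then the N rows with the highest %CPU (column 3).
top_cpu() {
    local n="${1:-5}" header
    IFS= read -r header            # consume only the first line
    printf '%s\n' "$header"
    sort -nrk 3,3 | head -n "$n"   # sort the remaining rows numerically
}

# Usage:
#   ps aux | top_cpu 5
#   watch "ps aux --sort=-%cpu | head -n 6"   # procps alternative: ps does the sorting
```

With head -n 6 in the second form, the output is the header plus five processes.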

Monitor the percentage of CPU and memory consumed by Linux processes

[root@localhost ~]# ps aux k-pcpu | head -6
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root      2658 99.8  0.2  39160  2816 pts/0    Sl+  15:38  10:05 sysbench --test=cpu --cpu-max-prime=30000000 --num-threads=120 run
mysql     1165  1.8 24.4 1152160 248112 ?      Sl   10:54   5:20 /usr/sbin/mysqld --daemonize --pid-file=/var/run/mysqld/mysqld.pid
root         1  0.0  0.2 125572  2148 ?        Ss   10:54   0:00 /usr/lib/systemd/systemd --switched-root --system --deserialize 21
root         2  0.0  0.0      0     0 ?        S    10:54   0:00 [kthreadd]
root         3  0.0  0.0      0     0 ?        S    10:54   0:00 [ksoftirqd/0]

Monitor active and inactive memory

[root@localhost ~]# vmstat -a 
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b   swpd   free  inact active   si   so    bi    bo   in   cs us sy id wa st
 2  0      0 540052 114584 294208    0    0    52    11  131   76  7  0 93  0  0
[root@localhost ~]# 
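
On Linux the active and inactive totals are also exposed as the Active and Inactive lines of /proc/meminfo, which is convenient when a script needs just those two numbers. The helper name mem_active_inactive below is hypothetical, and the sketch assumes the usual "Name: value kB" layout of that file.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print active and inactive memory (in kB) from a
# file in /proc/meminfo format (defaults to /proc/meminfo itself).
mem_active_inactive() {
    awk '$1 == "Active:"   { active = $2 }
         $1 == "Inactive:" { inactive = $2 }
         END { printf "active=%.0f inactive=%.0f\n", active, inactive }' \
        "${1:-/proc/meminfo}"
}

if [ -r /proc/meminfo ]; then
    mem_active_inactive
fi
```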

Monitoring memory usage with timestamps

[root@localhost ~]# vmstat -t 1 50
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu----- -----timestamp-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st                 IST
 2  0      0 540508   2108 165580    0    0    22     5  146   65  9  0 91  0  0 2018-04-21 13:01:38
 2  0      0 540508   2108 165580    0    0     0     0  300   76 100  0  0  0  0 2018-04-21 13:01:39
 2  0      0 540508   2108 165580    0    0     0     0  283   76 100  0  0  0  0 2018-04-21 13:01:40
 1  0      0 540508   2108 165580    0    0     0     0  246   73 100  0  0  0  0 2018-04-21 13:01:41
 1  0      0 540508   2108 165580    0    0     0     0  280   84 96  4  0  0  0 2018-04-21 13:01:42
 2  0      0 540508   2108 165576    0    0     0    24  337   93 100  0  0  0  0 2018-04-21 13:01:43
 1  0      0 540508   2108 165576    0    0     0     0  283   79 100  0  0  0  0 2018-04-21 13:01:44
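
With a timestamp on every sample, it becomes easy to pull out the moments when the CPU had no idle time at all. The sketch below does that for vmstat -t output; the helper name saturated_samples is hypothetical, and it assumes the column layout shown above, where "id" is the 15th column and the date and time are the 18th and 19th.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read `vmstat -t` output on stdin and print the
# timestamp of every sample whose idle CPU percentage (column 15) is 0.
saturated_samples() {
    # Data rows start with a number; the two header rows do not.
    awk '$1 ~ /^[0-9]+$/ && $15 == 0 { print $18, $19 }'
}

# Usage:
#   vmstat -t 1 50 | saturated_samples
```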

Monitoring the top memory consuming processes (head -n 5 prints the header plus the top four rows; use head -n 6 for five)

[root@localhost ~]# ps -eo pid,comm,%cpu,%mem --sort=-%mem | head -n 5
  PID COMMAND         %CPU %MEM
 1165 mysqld           1.8 24.4
 2405 dhclient         0.0  1.5
  774 NetworkManager   0.0  0.4
 1137 tuned            0.0  0.2

Monitoring disk I/O statistics

[root@localhost ~]# iostat -d 
Linux 3.10.0-693.21.1.el7.x86_64 (localhost.localdomain) 	04/21/2018 	_x86_64_	(1 CPU)

Device:            tps    kB_read/s    kB_wrtn/s    kB_read    kB_wrtn
sda              27.68      1191.80      3866.67   13191808   42799263
dm-0             22.33      1185.06      3838.77   13117158   42490503
dm-1              8.38         5.99        27.71      66324     306712
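
The cumulative kB_read and kB_wrtn columns can be traced back to the per-device sector counters in /proc/diskstats, where sectors are counted in 512-byte units, so kilobytes are simply sectors divided by two. The helper name disk_totals below is hypothetical, and the sketch assumes the standard field order of that file (device name in field 3, sectors read in field 6, sectors written in field 10).

```shell
#!/usr/bin/env bash
# Hypothetical helper: print cumulative kB read and written per device
# from a file in /proc/diskstats format (defaults to /proc/diskstats).
disk_totals() {
    awk '{ printf "%s read_kB=%.0f written_kB=%.0f\n", $3, $6 / 2, $10 / 2 }' \
        "${1:-/proc/diskstats}"
}

if [ -r /proc/diskstats ]; then
    disk_totals
fi
```

Sampling these counters twice and dividing the difference by the interval gives per-second rates comparable to kB_read/s and kB_wrtn/s.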

More detailed disk I/O monitoring at process level using ‘iotop’

To monitor disk I/O in more detail at the process level, in near real time, we use ‘iotop’:

Total DISK READ :     269.52 M/s | Total DISK WRITE :      50.85 M/s
Actual DISK READ:     269.52 M/s | Actual DISK WRITE:      51.96 M/s
  TID  PRIO  USER     DISK READ  DISK WRITE  SWAPIN     IO>    COMMAND                                  
 1821 be/4 root     1564.46 K/s  204.70 K/s  0.00 % 50.39 % sysbench fileio --thr~le-test-mode=rndrw run
 1893 be/4 root      720.09 K/s  190.07 K/s  0.00 % 42.68 % sysbench fileio --thr~le-test-mode=rndrw run
 2080 be/4 root      789.54 K/s  190.07 K/s  0.00 % 42.23 % sysbench fileio --thr~le-test-mode=rndrw run
 1867 be/4 root     1286.66 K/s  190.07 K/s  0.00 % 41.35 % sysbench fileio --thr~le-test-mode=rndrw run
 1870 be/4 root      321.66 K/s  160.83 K/s  0.00 % 41.16 % sysbench fileio --thr~le-test-mode=rndrw run
 1833 be/4 root     1257.41 K/s  204.70 K/s  0.00 % 40.66 % sysbench fileio --thr~le-test-mode=rndrw run
 1934 be/4 root      789.54 K/s  190.07 K/s  0.00 % 40.15 % sysbench fileio --thr~le-test-mode=rndrw run
 2035 be/4 root     1286.66 K/s  204.70 K/s  0.00 % 39.73 % sysbench fileio --thr~le-test-mode=rndrw run
 1975 be/4 root      745.68 K/s  190.07 K/s  0.00 % 39.49 % sysbench fileio --thr~le-test-mode=rndrw run
 1851 be/4 root     1579.08 K/s  190.07 K/s  0.00 % 39.36 % sysbench fileio --thr~le-test-mode=rndrw run
 1836 be/4 root      877.27 K/s  190.07 K/s  0.00 % 39.35 % sysbench fileio --thr~le-test-mode=rndrw run
 2001 be/4 root      160.83 K/s  131.59 K/s  0.00 % 39.34 % sysbench fileio --thr~le-test-mode=rndrw run
 1879 be/4 root     1842.26 K/s  190.07 K/s  0.00 % 39.22 % sysbench fileio --thr~le-test-mode=rndrw run
 1872 be/4 root      263.18 K/s  204.70 K/s  0.00 % 38.48 % sysbench fileio --thr~le-test-mode=rndrw run
 1953 be/4 root        2.81 M/s  146.21 K/s  0.00 % 38.35 % sysbench fileio --thr~le-test-mode=rndrw run
 1941 be/4 root      292.42 K/s  277.80 K/s  0.00 % 38.31 % sysbench fileio --thr~le-test-mode=rndrw run
 1913 be/4 root     1345.14 K/s  204.70 K/s  0.00 % 38.02 % sysbench fileio --thr~le-test-mode=rndrw run
 2017 be/4 root      628.71 K/s  160.83 K/s  0.00 % 37.93 % sysbench fileio --thr~le-test-mode=rndrw run
 2040 be/4 root      745.68 K/s  190.07 K/s  0.00 % 37.61 % sysbench fileio --thr~le-test-mode=rndrw run
 1942 be/4 root      555.60 K/s  160.83 K/s  0.00 % 37.49 % sysbench fileio --thr~le-test-mode=rndrw run
 1980 be/4 root      233.94 K/s  219.32 K/s  0.00 % 37.49 % sysbench fileio --thr~le-test-mode=rndrw run
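
When iotop is not installed, per-process I/O counters can still be read from /proc/<pid>/io on kernels built with task I/O accounting. The sketch below extracts the cumulative read_bytes and write_bytes for one process; the helper name proc_io is hypothetical, and reading the file for another user's process typically requires root.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print cumulative bytes read from and written to
# storage by one process, given a file in /proc/<pid>/io format.
proc_io() {
    awk -F': ' '$1 == "read_bytes"  { r = $2 }
                $1 == "write_bytes" { w = $2 }
                END { printf "read_bytes=%.0f write_bytes=%.0f\n", r, w }' "$1"
}

# Example: counters for the current shell, if the kernel exposes them.
if [ -r "/proc/$$/io" ]; then
    proc_io "/proc/$$/io"
fi
```

These are totals since the process started, so two samples are needed to derive a rate like the one iotop displays.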

This is how we usually begin a MySQL performance benchmarking and audit project. We first understand the load on the Linux host, and from there our metrics are entirely MySQL focused. We deliver very detailed performance optimization recommendations, which our customers can use for performance tuning and capacity planning / sizing.

About MinervaDB Corporation
Independent and vendor-neutral consulting, support, remote DBA services and training for MySQL, MariaDB, Percona Server, PostgreSQL and ClickHouse, with core expertise in performance, scalability and high availability. We are a virtual corporation: all of us work from home across multiple time zones and stay connected via email, Skype, Google Hangouts, phone and IRC, supporting over 250 customers worldwide.