All our customers run Linux, in several flavors: Ubuntu, CentOS, Red Hat Linux, Oracle Linux, SUSE Linux, etc. Although we are a full-service MySQL shop, our consulting, support and managed services are never restricted to MySQL operations alone. We are also experts in Linux, DevOps and Site Reliability Engineering (SRE), and we have proven methods for delivering Linux performance audits, health checks, diagnostics and recommendations. Which tools do we use for monitoring Linux operations? This post covers the ones we use regularly at MinervaDB:
How long has the Linux server been up and running?
[root@localhost ~]# uptime
 12:32:56 up 1800 min, 83 users, load average: 88.01, 88.52, 88.64
[root@localhost ~]#
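A load average only means something relative to the number of CPUs on the box. The sketch below divides a load figure by a CPU count; the function name `load_per_cpu` is hypothetical and introduced here purely for illustration.

```shell
#!/bin/sh
# load_per_cpu LOAD NCPU
# Prints LOAD divided by NCPU with two decimals.
# (Hypothetical helper, not part of any standard tool.)
load_per_cpu() {
    awk -v load="$1" -v ncpu="$2" 'BEGIN { printf "%.2f\n", load / ncpu }'
}

# Possible use on a live Linux host (assumes /proc/loadavg and nproc exist):
#   read -r one five fifteen rest < /proc/loadavg
#   load_per_cpu "$one" "$(nproc)"
```

A result well above 1.00 suggests more runnable (or uninterruptibly waiting) tasks than the CPUs can serve at once.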
Print all the processes running as root
[root@localhost ~]# ps -U root -u root
  PID TTY          TIME CMD
    1 ?        00:00:00 systemd
    2 ?        00:00:00 kthreadd
    3 ?        00:00:00 ksoftirqd/0
    4 ?        00:00:00 kworker/0:0
    5 ?        00:00:00 kworker/0:0H
    6 ?        00:00:00 kworker/u2:0
    7 ?        00:00:00 migration/0
    8 ?        00:00:00 rcu_bh
    9 ?        00:00:00 rcu_sched
   10 ?        00:00:00 watchdog/0
   12 ?        00:00:00 kdevtmpfs
   13 ?        00:00:00 netns
   14 ?        00:00:00 khungtaskd
   15 ?        00:00:00 writeback
   16 ?        00:00:00 kintegrityd
   17 ?        00:00:00 bioset
   18 ?        00:00:00 kblockd
Print all the processes on the system (“ps -A” and “ps -e” are equivalent)
[root@localhost ~]# ps -A
  PID TTY          TIME CMD
    1 ?        00:00:00 systemd
    2 ?        00:00:00 kthreadd
    3 ?        00:00:00 ksoftirqd/0
    4 ?        00:00:00 kworker/0:0
[root@localhost ~]# ps -e
  PID TTY          TIME CMD
    1 ?        00:00:00 systemd
    2 ?        00:00:00 kthreadd
    3 ?        00:00:00 ksoftirqd/0
    4 ?        00:00:00 kworker/0:0
    5 ?        00:00:00 kworker/0:0H
Print all the processes owned by MySQL
[root@localhost ~]# ps -fG mysql
UID        PID  PPID  C STIME TTY          TIME CMD
mysql     1165     1  0 10:54 ?        00:00:00 /usr/sbin/mysqld --daemonize --pid-file=/var/run/mysqld/mysqld.pid
[root@localhost ~]#
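Note that “ps -G” selects by real group name, not by user. If you would rather filter an existing process listing by its user column, a minimal sketch follows; `filter_by_user` is a hypothetical helper name, and it assumes the user is the first column of the input, as in “ps -eo user,pid,args” or “ps aux”.

```shell
#!/bin/sh
# filter_by_user USER
# Reads a ps-style listing on stdin (user in column 1) and prints
# the header line plus every row owned by USER.
# (Hypothetical helper, for illustration only.)
filter_by_user() {
    awk -v u="$1" 'NR == 1 || $1 == u'
}

# Possible use on a live host:
#   ps -eo user,pid,args | filter_by_user mysql
```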
Top
Displays all actively running processes in real time. The metrics reported by the Linux “top” command include CPU usage, memory usage, swap memory, cache size, buffer size, process PID, user, command, etc.
[root@localhost ~]# top
top - 11:37:31 up 43 min,  3 users,  load average: 0.50, 0.15, 0.08
Tasks:  90 total,   1 running,  89 sleeping,   0 stopped,   0 zombie
%Cpu(s):100.0 us,  0.0 sy,  0.0 ni,  0.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
KiB Mem :  1016156 total,   541664 free,   307024 used,   167468 buff/cache
KiB Swap:  2097148 total,  2097148 free,        0 used.   553344 avail Mem

  PID USER      PR  NI    VIRT    RES    SHR S %CPU %MEM     TIME+ COMMAND
 1412 root      20   0   31068   1720   1324 S 99.7  0.2   0:43.38 sysbench
    1 root      20   0  125572   4096   2496 S  0.0  0.4   0:00.76 systemd
    2 root      20   0       0      0      0 S  0.0  0.0   0:00.00 kthreadd
    3 root      20   0       0      0      0 S  0.0  0.0   0:00.04 ksoftirqd/0
    5 root       0 -20       0      0      0 S  0.0  0.0   0:00.00 kworker/0:0H
    6 root      20   0       0      0      0 S  0.0  0.0   0:00.00 kworker/u2:0
    7 root      rt   0       0      0      0 S  0.0  0.0   0:00.00 migration/0
    8 root      20   0       0      0      0 S  0.0  0.0   0:00.00 rcu_bh
    9 root      20   0       0      0      0 S  0.0  0.0   0:00.15 rcu_sched
Press “c” while top is running to show the full command line (including the absolute path) of each process
top - 11:43:20 up 49 min,  3 users,  load average: 1.27, 0.88, 0.43
Tasks:  90 total,   1 running,  88 sleeping,   1 stopped,   0 zombie
%Cpu(s): 99.2 us,  0.0 sy,  0.0 ni,  0.0 id,  0.0 wa,  0.0 hi,  0.8 si,  0.0 st
KiB Mem :  1016156 total,   540768 free,   307752 used,   167636 buff/cache
KiB Swap:  2097148 total,  2097148 free,        0 used.   552484 avail Mem

  PID USER      PR  NI    VIRT    RES    SHR S %CPU %MEM     TIME+ COMMAND
 1412 root      20   0   31068   1720   1324 S 99.7  0.2   6:31.61 sysbench --test=cpu --cpu-max-prime=300000000 run
 1415 root      20   0       0      0      0 S  0.3  0.0   0:00.14 [kworker/0:3]
    1 root      20   0  125572   4096   2496 S  0.0  0.4   0:00.77 /usr/lib/systemd/systemd --switched-root --system +
    2 root      20   0       0      0      0 S  0.0  0.0   0:00.00 [kthreadd]
    3 root      20   0       0      0      0 S  0.0  0.0   0:00.04 [ksoftirqd/0]
    5 root       0 -20       0      0      0 S  0.0  0.0   0:00.00 [kworker/0:0H]
    6 root      20   0       0      0      0 S  0.0  0.0   0:00.00 [kworker/u2:0]
Monitor CPU statistics using iostat
[root@localhost ~]# iostat -c
Linux 3.10.0-693.21.1.el7.x86_64 (localhost.localdomain)  04/21/2018  _x86_64_  (1 CPU)

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           4.73    0.00    0.28    0.15    0.00   94.84
Top five CPU consuming processes
[root@localhost ~]# watch "ps aux | sort -nrk 3,3 | head -n 5"
Every 2.0s: ps aux | sort -nrk 3,3 | head -n 5                Sat Apr 21 15:44:49 2018

root   2658 99.8  0.2   39160   2816 pts/0 Sl+  15:38 6:41 sysbench --test=cpu --cpu-max-prime=30000000 --num-threads=120
mysql  1165  1.8 24.4 1152160 248112 ?     Sl   10:54 5:20 /usr/sbin/mysqld --daemonize --pid-file=/var/run/mysqld/mysqld.
USER    PID %CPU %MEM     VSZ    RSS TTY   STAT START TIME COMMAND
root    932  0.0  0.0  113372    940 ?     S    10:54 0:00 /sbin/dhclient -d -q -sf /usr/libexec/nm-dhcp-helper -pf /var/r
root     92  0.0  0.0       0      0 ?     S    10:54 0:00 [kauditd]
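In the output above, the “USER PID %CPU …” header is sorted along with the process rows, so it lands mid-list and takes one of the five slots. One way around this, sketched below, is to print the header first and sort only the remaining rows. The function name `top_by_column` is hypothetical.

```shell
#!/bin/sh
# top_by_column COLUMN COUNT
# Reads a listing with one header line on stdin, prints the header,
# then the COUNT rows with the largest numeric value in COLUMN (1-based).
# (Hypothetical helper, for illustration only.)
top_by_column() {
    input=$(cat)
    printf '%s\n' "$input" | head -n 1
    printf '%s\n' "$input" | tail -n +2 | sort -nr -k "$1,$1" | head -n "$2"
}

# Possible use on a live host (column 3 of "ps aux" is %CPU):
#   ps aux | top_by_column 3 5
```

The same helper works for memory by pointing it at column 4 of “ps aux” (%MEM).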
Monitor the percentage of CPU and memory consumed by Linux processes
[root@localhost ~]# ps aux k-pcpu | head -6
USER   PID %CPU %MEM     VSZ    RSS TTY   STAT START  TIME COMMAND
root  2658 99.8  0.2   39160   2816 pts/0 Sl+  15:38 10:05 sysbench --test=cpu --cpu-max-prime=30000000 --num-threads=120 run
mysql 1165  1.8 24.4 1152160 248112 ?     Sl   10:54  5:20 /usr/sbin/mysqld --daemonize --pid-file=/var/run/mysqld/mysqld.pid
root     1  0.0  0.2  125572   2148 ?     Ss   10:54  0:00 /usr/lib/systemd/systemd --switched-root --system --deserialize 21
root     2  0.0  0.0       0      0 ?     S    10:54  0:00 [kthreadd]
root     3  0.0  0.0       0      0 ?     S    10:54  0:00 [ksoftirqd/0]
Monitor active and inactive memory
[root@localhost ~]# vmstat -a
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b   swpd   free  inact active   si   so    bi    bo   in   cs us sy id wa st
 2  0      0 540052 114584 294208    0    0    52    11  131   76  7  0 93  0  0
[root@localhost ~]#
Monitoring memory usage with timestamp
[root@localhost ~]# vmstat -t 1 50
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu----- -----timestamp-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs  us sy id wa st                 IST
 2  0      0 540508   2108 165580    0    0    22     5  146   65   9  0 91  0  0 2018-04-21 13:01:38
 2  0      0 540508   2108 165580    0    0     0     0  300   76 100  0  0  0  0 2018-04-21 13:01:39
 2  0      0 540508   2108 165580    0    0     0     0  283   76 100  0  0  0  0 2018-04-21 13:01:40
 1  0      0 540508   2108 165580    0    0     0     0  246   73 100  0  0  0  0 2018-04-21 13:01:41
 1  0      0 540508   2108 165580    0    0     0     0  280   84  96  4  0  0  0 2018-04-21 13:01:42
 2  0      0 540508   2108 165576    0    0     0    24  337   93 100  0  0  0  0 2018-04-21 13:01:43
 1  0      0 540508   2108 165576    0    0     0     0  283   79 100  0  0  0  0 2018-04-21 13:01:44
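The first vmstat sample reports averages since boot, which is why it reads 9% user CPU while every later one-second sample is at or near 100%. When summarizing a run, it usually makes sense to drop that first row. The sketch below averages the “us” column (field 13 in the layout shown above) over the remaining samples; `avg_user_cpu` is a hypothetical helper name.

```shell
#!/bin/sh
# avg_user_cpu
# Reads "vmstat [-t] INTERVAL COUNT" output on stdin and prints the mean
# of the "us" column, skipping header lines and the first sample.
# (Hypothetical helper; assumes "us" is the 13th field, as in the output above.)
avg_user_cpu() {
    awk '
        $1 ~ /^[0-9]+$/ {
            samples++
            if (samples == 1) next   # first row is the since-boot average
            sum += $13               # 13th field is "us" (user CPU %)
            counted++
        }
        END { if (counted) printf "%.2f\n", sum / counted }
    '
}

# Possible use on a live host:
#   vmstat 1 10 | avg_user_cpu
```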
Monitoring top five memory consuming processes
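[root@localhost ~]# ps -eo pid,comm,%cpu,%mem --sort=-%mem | head -n 5
  PID COMMAND         %CPU %MEM
 1165 mysqld           1.8 24.4
 2405 dhclient         0.0  1.5
  774 NetworkManager   0.0  0.4
 1137 tuned            0.0  0.2
The header counts as one of the five lines, so this prints the top four processes; use “head -n 6” to see five.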
Monitoring disk I/O statistics
[root@localhost ~]# iostat -d
Linux 3.10.0-693.21.1.el7.x86_64 (localhost.localdomain)  04/21/2018  _x86_64_  (1 CPU)

Device:            tps    kB_read/s    kB_wrtn/s    kB_read    kB_wrtn
sda              27.68      1191.80      3866.67   13191808   42799263
dm-0             22.33      1185.06      3838.77   13117158   42490503
dm-1              8.38         5.99        27.71      66324     306712
More detailed disk I/O monitoring at process level using ‘iotop’
For more detailed, near-real-time disk I/O monitoring at the process level, we use ‘iotop’:
Total DISK READ :     269.52 M/s | Total DISK WRITE :      50.85 M/s
Actual DISK READ:     269.52 M/s | Actual DISK WRITE:      51.96 M/s
  TID  PRIO  USER     DISK READ   DISK WRITE  SWAPIN     IO>    COMMAND
 1821 be/4 root     1564.46 K/s  204.70 K/s  0.00 % 50.39 % sysbench fileio --thr~le-test-mode=rndrw run
 1893 be/4 root      720.09 K/s  190.07 K/s  0.00 % 42.68 % sysbench fileio --thr~le-test-mode=rndrw run
 2080 be/4 root      789.54 K/s  190.07 K/s  0.00 % 42.23 % sysbench fileio --thr~le-test-mode=rndrw run
 1867 be/4 root     1286.66 K/s  190.07 K/s  0.00 % 41.35 % sysbench fileio --thr~le-test-mode=rndrw run
 1870 be/4 root      321.66 K/s  160.83 K/s  0.00 % 41.16 % sysbench fileio --thr~le-test-mode=rndrw run
 1833 be/4 root     1257.41 K/s  204.70 K/s  0.00 % 40.66 % sysbench fileio --thr~le-test-mode=rndrw run
 1934 be/4 root      789.54 K/s  190.07 K/s  0.00 % 40.15 % sysbench fileio --thr~le-test-mode=rndrw run
 2035 be/4 root     1286.66 K/s  204.70 K/s  0.00 % 39.73 % sysbench fileio --thr~le-test-mode=rndrw run
 1975 be/4 root      745.68 K/s  190.07 K/s  0.00 % 39.49 % sysbench fileio --thr~le-test-mode=rndrw run
 1851 be/4 root     1579.08 K/s  190.07 K/s  0.00 % 39.36 % sysbench fileio --thr~le-test-mode=rndrw run
 1836 be/4 root      877.27 K/s  190.07 K/s  0.00 % 39.35 % sysbench fileio --thr~le-test-mode=rndrw run
 2001 be/4 root      160.83 K/s  131.59 K/s  0.00 % 39.34 % sysbench fileio --thr~le-test-mode=rndrw run
 1879 be/4 root     1842.26 K/s  190.07 K/s  0.00 % 39.22 % sysbench fileio --thr~le-test-mode=rndrw run
 1872 be/4 root      263.18 K/s  204.70 K/s  0.00 % 38.48 % sysbench fileio --thr~le-test-mode=rndrw run
 1953 be/4 root        2.81 M/s  146.21 K/s  0.00 % 38.35 % sysbench fileio --thr~le-test-mode=rndrw run
 1941 be/4 root      292.42 K/s  277.80 K/s  0.00 % 38.31 % sysbench fileio --thr~le-test-mode=rndrw run
 1913 be/4 root     1345.14 K/s  204.70 K/s  0.00 % 38.02 % sysbench fileio --thr~le-test-mode=rndrw run
 2017 be/4 root      628.71 K/s  160.83 K/s  0.00 % 37.93 % sysbench fileio --thr~le-test-mode=rndrw run
 2040 be/4 root      745.68 K/s  190.07 K/s  0.00 % 37.61 % sysbench fileio --thr~le-test-mode=rndrw run
 1942 be/4 root      555.60 K/s  160.83 K/s  0.00 % 37.49 % sysbench fileio --thr~le-test-mode=rndrw run
 1980 be/4 root      233.94 K/s  219.32 K/s  0.00 % 37.49 % sysbench fileio --thr~le-test-mode=rndrw run
This is how we usually begin a MySQL performance benchmarking and audit project. We first understand the load on the Linux server; from there, our metrics are entirely MySQL-focused. We deliver detailed performance optimization recommendations, which our customers can use for performance tuning and capacity planning / sizing.