Enterprise-class Consulting, 24*7 Support and Remote DBA Services for MySQL, MariaDB, PostgreSQL and ClickHouse

We have already covered the basics of Sysbench, including installation and configuration, in an earlier post (https://minervadb.com/index.php/2018/03/13/benchmarking-mysql-using-sysbench-1-1/), so we do not repeat them here. This post focuses on benchmarking CPU, thread, mutex, memory and file I/O performance:

Benchmarking CPU using Sysbench

The CPU benchmark takes two parameters: the number of simultaneous threads (--num-threads) and the upper limit (--cpu-max-prime) up to which each event checks numbers for primality.
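To give a feel for the work behind a single CPU event, here is a minimal Python sketch that counts primes up to a limit by trial division. It is only an illustration of the idea described above, not sysbench's actual code, and the function name count_primes_up_to is made up for this example.

```python
import math


def count_primes_up_to(max_prime: int) -> int:
    """Count the primes in [3, max_prime] by trial division.

    Hypothetical helper: it mimics the kind of work one CPU-test
    event performs, it is not taken from the sysbench sources.
    """
    count = 0
    for candidate in range(3, max_prime + 1):
        limit = math.isqrt(candidate)  # no divisor above sqrt(candidate) is needed
        is_prime = True
        for divisor in range(2, limit + 1):
            if candidate % divisor == 0:
                is_prime = False
                break
        if is_prime:
            count += 1
    return count


if __name__ == "__main__":
    # A small limit keeps the demo fast; the run in this post uses 2000000.
    print(count_primes_up_to(100))
```

The cost grows quickly with the limit, which is why a large --cpu-max-prime value makes each event take minutes rather than milliseconds.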

[root@localhost shiv]# sysbench --test=cpu --cpu-max-prime=2000000 --num-threads=120 run

Running the test with following options:
Number of threads: 120
Initializing random number generator from current time


Prime numbers limit: 2000000

Initializing worker threads...

Threads started!

CPU speed:
    events per second:     0.69

Throughput:
    events/s (eps):                      0.6891
    time elapsed:                        174.1418s
    total number of events:              120

Latency (ms):
         min:                               169807.71
         avg:                               172640.02
         max:                               174120.65
         95th percentile:                   100000.00
         sum:                             20716802.25

Threads fairness:
    events (avg/stddev):           1.0000/0.00
    execution time (avg/stddev):   172.6400/0.83

“Time elapsed” is the figure we watch most closely when measuring CPU performance. In this case it is 174.1418 seconds.
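The other figures in the report follow from the elapsed time and the event count. The short Python sketch below recomputes events per second and average latency from the numbers printed above; the helper names are ours and exist only for this illustration.

```python
def events_per_second(total_events: int, elapsed_seconds: float) -> float:
    """Throughput as reported under 'events/s (eps)'."""
    return total_events / elapsed_seconds


def average_latency_ms(latency_sum_ms: float, total_events: int) -> float:
    """Average latency: the latency sum spread over all events."""
    return latency_sum_ms / total_events


if __name__ == "__main__":
    # Values copied from the CPU run above.
    print(round(events_per_second(120, 174.1418), 4))
    print(round(average_latency_ms(20716802.25, 120), 2))
```

Both results agree with the report (0.6891 events per second and an average latency of 172640.02 ms).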

Benchmarking threads performance using sysbench

In the threads benchmark, each worker thread repeatedly takes a mutex (a kind of lock), yields (that is, asks the scheduler to stop running it and put it back at the end of the run queue) and then, once it is scheduled again, releases the lock. Each execution repeats this loop a set number of times (documented as the number of yields, --thread-yields), and the number of mutexes is set with --thread-locks.
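The lock, yield, unlock loop can be sketched in Python as follows. This is an assumption-laden illustration rather than sysbench's implementation: the function names and default values are invented, time.sleep(0) stands in for a scheduler yield, and Python's global interpreter lock means the timings say nothing about real scheduler performance.

```python
import threading
import time


def run_threads_event(locks, yields_per_event):
    """One event: take a lock, yield the CPU, release the lock, repeatedly."""
    for i in range(yields_per_event):
        lock = locks[i % len(locks)]
        lock.acquire()
        time.sleep(0)  # stand-in for asking the scheduler to run something else
        lock.release()


def run_benchmark(num_threads=4, num_locks=10, yields_per_event=100,
                  events_per_thread=5):
    """Run the loop on several threads; return (total events, elapsed seconds)."""
    locks = [threading.Lock() for _ in range(num_locks)]
    completed = [0] * num_threads  # one slot per thread, so no sharing

    def worker(index):
        for _ in range(events_per_thread):
            run_threads_event(locks, yields_per_event)
            completed[index] += 1

    threads = [threading.Thread(target=worker, args=(i,))
               for i in range(num_threads)]
    start = time.perf_counter()
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return sum(completed), time.perf_counter() - start


if __name__ == "__main__":
    events, elapsed = run_benchmark()
    print(f"{events} events in {elapsed:.4f}s")
```

Each thread holds at most one lock at a time, so the sketch cannot deadlock however many threads or locks are configured.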

[root@localhost shiv]# sysbench --test=threads --thread-locks=10 --max-time=60 run

sysbench 1.1.0-651e7fd (using bundled LuaJIT 2.1.0-beta3)

Running the test with following options:
Number of threads: 1
Initializing random number generator from current time


Initializing worker threads...

Threads started!


Throughput:
    events/s (eps):                      2366.0725
    time elapsed:                        60.0003s
    total number of events:              141965

Latency (ms):
         min:                                    0.38
         avg:                                    0.42
         max:                                    8.86
         95th percentile:                        0.53
         sum:                                59942.51

Threads fairness:
    events (avg/stddev):           141965.0000/0.00
    execution time (avg/stddev):   59.9425/0.00

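To interpret the thread benchmark, we note the time elapsed (the actual time taken to complete the activity), which in this case is 60.0003 seconds. Because this run is bounded by --max-time=60, the elapsed time mainly reflects that limit, so the number of events completed in that time (141965, or 2366.07 events per second) is the figure to compare between runs.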

Benchmarking mutex workload

When benchmarking the mutex workload, sysbench runs a single request per thread. The request first puts load on the CPU with a simple incremental loop (controlled by the --mutex-loops parameter), then takes a random mutex, increments a global variable and releases the lock again. This is repeated until the configured number of locks (--mutex-locks) has been taken. The mutex is picked at random from a pool whose size is set by the --mutex-num parameter.
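The request described above can be sketched in Python like this. It is a simplified illustration under our own assumptions: the function names, default values and per-thread random generators are invented for the example and are not sysbench's code.

```python
import random
import threading


def run_mutex_request(mutexes, shared, mutex_locks, mutex_loops, rng):
    """One request: spin, take a random mutex, bump a shared value, release."""
    acquired = 0
    for _ in range(mutex_locks):
        spin = 0
        for _ in range(mutex_loops):  # simple incremental loop to load the CPU
            spin += 1
        with rng.choice(mutexes):     # pick one mutex at random from the pool
            # Threads holding different mutexes may update this at the same
            # time, so treat it as load rather than as an exact counter.
            shared[0] += 1
            acquired += 1
    return acquired


def run_mutex_benchmark(num_threads=4, mutex_num=16, mutex_locks=1000,
                        mutex_loops=100):
    """Run one request per thread; return (lock acquisitions, shared value)."""
    mutexes = [threading.Lock() for _ in range(mutex_num)]
    shared = [0]
    acquisitions = [0] * num_threads

    def worker(index):
        rng = random.Random(index)  # private generator per thread
        acquisitions[index] = run_mutex_request(
            mutexes, shared, mutex_locks, mutex_loops, rng)

    threads = [threading.Thread(target=worker, args=(i,))
               for i in range(num_threads)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return sum(acquisitions), shared[0]


if __name__ == "__main__":
    print(run_mutex_benchmark())
```

With one request per thread, the total number of events equals the thread count, which matches the 130 events reported for 130 threads in the run below.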

[root@localhost shiv]# sysbench --test=mutex --num-threads=130 run
WARNING: the --test option is deprecated. You can pass a script name or path on the command line without any options.
WARNING: --num-threads is deprecated, use --threads instead
sysbench 1.1.0-651e7fd (using bundled LuaJIT 2.1.0-beta3)

Running the test with following options:
Number of threads: 130
Initializing random number generator from current time


Initializing worker threads...

Threads started!


Throughput:
    events/s (eps):                      5.8047
    time elapsed:                        22.3956s
    total number of events:              130

Latency (ms):
         min:                                17566.82
         avg:                                20789.93
         max:                                22230.90
         95th percentile:                    21641.55
         sum:                              2702690.46

Threads fairness:
    events (avg/stddev):           1.0000/0.00
    execution time (avg/stddev):   20.7899/0.82

Throughput and average latency are the two metrics we use to interpret mutex workload performance:

Throughput:
    events/s (eps):                      5.8047
    time elapsed:                        22.3956s

Latency (ms):
         min:                                17566.82
         avg:                                20789.93
         max:                                22230.90
         95th percentile:                    21641.55
         sum:                              2702690.46

Benchmarking the memory workload

When benchmarking memory, sysbench allocates a memory buffer and then reads from or writes to it, one pointer-sized word (32 bit or 64 bit) at a time, until the whole buffer has been read or written. This is repeated until the requested volume (--memory-total-size) is reached. The load can be raised or lowered by changing the number of threads (--num-threads), the buffer size (--memory-block-size) and the access type (read or write, sequential or random).
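A single-threaded Python sketch of the write case is shown below. It only illustrates the loop structure described above; the function name write_memory and the fixed 8-byte word size are assumptions for this example, and Python is far too slow to produce meaningful memory bandwidth figures.

```python
WORD_SIZE = 8  # assumed pointer size in bytes (64 bit)


def write_memory(total_size: int, block_size: int = 1024) -> int:
    """Fill a block-sized buffer word by word until total_size bytes are written.

    Returns the number of block operations performed.
    """
    if block_size % WORD_SIZE != 0:
        raise ValueError("block_size must be a multiple of the word size")
    buffer = bytearray(block_size)
    word = b"\xff" * WORD_SIZE
    operations = 0
    written = 0
    while written + block_size <= total_size:
        # One operation: write the whole buffer, one word at a time.
        for offset in range(0, block_size, WORD_SIZE):
            buffer[offset:offset + WORD_SIZE] = word
        written += block_size
        operations += 1
    return operations


if __name__ == "__main__":
    # 1 MiB in 1 KiB blocks; the run in this post writes 10 GiB.
    print(write_memory(1024 * 1024))
```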

[root@localhost shiv]# sysbench --test=memory --num-threads=140 --memory-total-size=10G run

sysbench 1.1.0-651e7fd (using bundled LuaJIT 2.1.0-beta3)

Running the test with following options:
Number of threads: 140
Initializing random number generator from current time


Running memory speed test with the following options:
  block size: 1KiB
  total size: 10240MiB
  operation: write
  scope: global

Initializing worker threads...

Threads started!

Total operations: 10485720 (3351958.44 per second)

10239.96 MiB transferred (3273.40 MiB/sec)


Throughput:
    events/s (eps):                      3351958.4393
    time elapsed:                        3.1282s
    total number of events:              10485720

Latency (ms):
         min:                                    0.00
         avg:                                    0.01
         max:                                 2931.98
         95th percentile:                        0.00
         sum:                               123371.54

Threads fairness:
    events (avg/stddev):           74898.0000/0.00
    execution time (avg/stddev):   0.8812/0.93

Throughput and operations per second are the important metrics for memory workload benchmarking:

Total operations: 10485720 (3351958.44 per second)

10239.96 MiB transferred (3273.40 MiB/sec)
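These numbers can be cross-checked with a little arithmetic. The Python sketch below assumes, based only on the output above (74898 events per thread with zero deviation), that the requested total is split evenly across threads and rounded down; that is our reading of the report rather than a statement about sysbench internals, and the helper names are hypothetical.

```python
KIB = 1024
MIB = 1024 * KIB
GIB = 1024 * MIB


def total_operations(total_size: int, block_size: int, threads: int) -> int:
    """Operations performed if the work is split evenly and rounded down."""
    per_thread = total_size // block_size // threads
    return per_thread * threads


def mib_transferred(operations: int, block_size: int) -> float:
    """Data volume moved by the given number of block operations."""
    return operations * block_size / MIB


def mib_per_second(ops_per_second: float, block_size: int) -> float:
    """Bandwidth implied by an operation rate and a block size."""
    return ops_per_second * block_size / MIB


if __name__ == "__main__":
    ops = total_operations(10 * GIB, 1 * KIB, 140)
    print(ops)
    print(round(mib_transferred(ops, 1 * KIB), 2))
    print(round(mib_per_second(3351958.44, 1 * KIB), 2))
```

Under that assumption the sketch reproduces the reported 10485720 operations, 10239.96 MiB transferred and 3273.40 MiB/sec, which also explains why slightly less than the requested 10240 MiB was written.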

Benchmarking file system I/O with Sysbench

Sysbench offers several test modes for benchmarking file system I/O. Here we use rndrw (combined random read/write) because it is more complex and closer to production I/O. The test runs in three steps:

Prepare – Creates the files for testing
Run – Performs the benchmarking and reporting
Cleanup – Cleans up the system by deleting the test files
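Because the three steps must share the same options, it can help to generate the commands from one place. The Python sketch below builds the three command lines using the newer sysbench syntax (test name as an argument and --threads, as suggested by the deprecation warnings shown in this post); the function name fileio_commands is ours, and the sketch only prints the commands rather than running them.

```python
import shlex


def fileio_commands(total_size="10G", mode="rndrw", threads=16):
    """Return the prepare, run and cleanup command lines as argument lists."""
    base = [
        "sysbench",
        "fileio",
        f"--threads={threads}",
        f"--file-total-size={total_size}",
        f"--file-test-mode={mode}",
    ]
    # All three phases reuse the same options and differ only in the last word.
    return {phase: base + [phase] for phase in ("prepare", "run", "cleanup")}


if __name__ == "__main__":
    for phase, argv in fileio_commands().items():
        print(shlex.join(argv))
```

To execute a phase, each argument list can be passed to subprocess.run(argv, check=True) on a machine where sysbench is installed.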

Prepare

[root@localhost shiv]# sysbench --num-threads=16 --test=fileio --file-total-size=10G --file-test-mode=rndrw prepare

sysbench 1.1.0-651e7fd (using bundled LuaJIT 2.1.0-beta3)

128 files, 81920Kb each, 10240Mb total
Creating files for the test...
Extra file open flags: (none)
Reusing existing file test_file.0
Reusing existing file test_file.1
Reusing existing file test_file.2
Reusing existing file test_file.3
..................................
..................................

Reusing existing file test_file.122
Reusing existing file test_file.123
Reusing existing file test_file.124
Reusing existing file test_file.125
Reusing existing file test_file.126
Reusing existing file test_file.127

Run

[root@localhost shiv]# sysbench --num-threads=16 --test=fileio --file-total-size=10G --file-test-mode=rndrw run

sysbench 1.1.0-651e7fd (using bundled LuaJIT 2.1.0-beta3)

Running the test with following options:
Number of threads: 16
Initializing random number generator from current time


Extra file open flags: (none)
128 files, 80MiB each
10GiB total file size
Block size 16KiB
Number of IO requests: 0
Read/Write ratio for combined random IO test: 1.50
Periodic FSYNC enabled, calling fsync() each 100 requests.
Calling fsync() at the end of test, Enabled.
Using synchronous I/O mode
Doing random r/w test
Initializing worker threads...

Threads started!


Throughput:
         read:  IOPS=2495.85 39.00 MiB/s (40.89 MB/s)
         write: IOPS=1663.70 26.00 MiB/s (27.26 MB/s)
         fsync: IOPS=5311.68

Latency (ms):
         min:                                  0.00
         avg:                                  1.69
         max:                                631.90
         95th percentile:                      5.00
         sum:                             159794.48

Cleanup

[root@localhost shiv]# sysbench --num-threads=16 --test=fileio --file-total-size=10G --file-test-mode=rndrw cleanup 
WARNING: the --test option is deprecated. You can pass a script name or path on the command line without any options.
WARNING: --num-threads is deprecated, use --threads instead
sysbench 1.1.0-651e7fd (using bundled LuaJIT 2.1.0-beta3)

Removing test files...

In file system I/O benchmarking, we record and interpret only throughput (both reads and writes) under varying loads. In the test above, read throughput is 40.89 MB/s and write throughput is 27.26 MB/s.
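The throughput figures follow directly from the reported IOPS and the 16KiB block size. The Python sketch below redoes that conversion with the values from the run above; the helper name throughput_from_iops is made up for this example.

```python
def throughput_from_iops(iops: float, block_size_bytes: int = 16 * 1024):
    """Convert an IOPS figure into (MiB/s, MB/s) for a given block size."""
    bytes_per_second = iops * block_size_bytes
    return bytes_per_second / 2**20, bytes_per_second / 10**6


if __name__ == "__main__":
    read_iops, write_iops = 2495.85, 1663.70  # copied from the run above
    for label, iops in (("read", read_iops), ("write", write_iops)):
        mib_s, mb_s = throughput_from_iops(iops)
        print(f"{label}: {mib_s:.2f} MiB/s ({mb_s:.2f} MB/s)")
    print(f"read/write ratio: {read_iops / write_iops:.2f}")
```

The results agree with the report: about 39.00 MiB/s (40.89 MB/s) for reads, 26.00 MiB/s (27.26 MB/s) for writes, and a read/write ratio of 1.50.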
