We have already written a blog post on Sysbench (https://minervadb.com/index.php/2018/03/13/benchmarking-mysql-using-sysbench-1-1/), so this post does not cover basics such as installing and configuring Sysbench. Here we focus specifically on benchmarking CPU, memory, file I/O, thread and mutex performance:
Benchmarking CPU using Sysbench
This benchmark is configured with the number of simultaneous threads (--num-threads) and the upper limit for prime-number checking (--cpu-max-prime): each event tests the integers up to that limit for primality.
[root@localhost shiv]# sysbench --test=cpu --cpu-max-prime=2000000 --num-threads=120 run
Running the test with following options:
Number of threads: 120
Initializing random number generator from current time
Prime numbers limit: 2000000
Initializing worker threads...
Threads started!
CPU speed:
    events per second: 0.69
Throughput:
    events/s (eps): 0.6891
    time elapsed: 174.1418s
    total number of events: 120
Latency (ms):
    min: 169807.71
    avg: 172640.02
    max: 174120.65
    95th percentile: 100000.00
    sum: 20716802.25
Threads fairness:
    events (avg/stddev): 1.0000/0.00
    execution time (avg/stddev): 172.6400/0.83
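For intuition about what each event costs, the work of one CPU event can be sketched roughly as follows. This is a minimal Python approximation, assuming the CPU test counts primes by trial division up to the square root of each candidate; the function name count_primes_below is ours and is not part of sysbench.

```python
import math


def count_primes_below(max_prime: int) -> int:
    """Count primes c with 3 <= c < max_prime using trial division.

    Assumption: this mirrors the general shape of one sysbench CPU
    event. It is an illustration, not sysbench's actual source code.
    """
    count = 0
    for candidate in range(3, max_prime):
        limit = math.isqrt(candidate)
        divisor = 2
        # Stop at the first divisor that divides the candidate evenly.
        while divisor <= limit and candidate % divisor != 0:
            divisor += 1
        # No divisor found up to sqrt(candidate): it is prime.
        if divisor > limit:
            count += 1
    return count


if __name__ == "__main__":
    print(count_primes_below(100))
```

With --cpu-max-prime=2000000 every event repeats this loop over two million candidates, which is consistent with the run above completing only one event per thread (120 events for 120 threads).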
[root@localhost shiv]# sysbench --test=threads --thread-locks=10 --max-time=60 run
sysbench 1.1.0-651e7fd (using bundled LuaJIT 2.1.0-beta3)
Running the test with following options:
Number of threads: 1
Initializing random number generator from current time
Initializing worker threads...
Threads started!
Throughput:
    events/s (eps): 2366.0725
    time elapsed: 60.0003s
    total number of events: 141965
Latency (ms):
    min: 0.38
    avg: 0.42
    max: 8.86
    95th percentile: 0.53
    sum: 59942.51
Threads fairness:
    events (avg/stddev): 141965.0000/0.00
    execution time (avg/stddev): 59.9425/0.00
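To interpret the thread benchmark, note the elapsed time (the actual time taken to complete the run), which in this case is 60.0003 seconds. Because --max-time=60 caps the run, the figure to compare between runs is the number of events completed in that time: 141965, or about 2366 events per second.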
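The relationship between the reported figures can be checked with a few lines of arithmetic. The sketch below is only a sanity check on the numbers printed by the threads run; the helper names are ours, and small differences are expected because the report rounds the elapsed time.

```python
def events_per_second(total_events: int, elapsed_seconds: float) -> float:
    """Throughput is the number of completed events divided by wall time."""
    return total_events / elapsed_seconds


def average_latency_ms(latency_sum_ms: float, total_events: int) -> float:
    """Average latency is the summed latency divided by the event count."""
    return latency_sum_ms / total_events


# Figures copied from the threads run.
threads_eps = events_per_second(141965, 60.0003)
threads_avg_ms = average_latency_ms(59942.51, 141965)

if __name__ == "__main__":
    print(f"throughput: {threads_eps:.2f} events/s")
    print(f"average latency: {threads_avg_ms:.2f} ms")
```

Both values agree with the report (2366.0725 events/s and 0.42 ms) to within rounding.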
[root@localhost shiv]# sysbench --test=mutex --num-threads=130 run
WARNING: the --test option is deprecated. You can pass a script name or path on the command line without any options.
WARNING: --num-threads is deprecated, use --threads instead
sysbench 1.1.0-651e7fd (using bundled LuaJIT 2.1.0-beta3)
Running the test with following options:
Number of threads: 130
Initializing random number generator from current time
Initializing worker threads...
Threads started!
Throughput:
    events/s (eps): 5.8047
    time elapsed: 22.3956s
    total number of events: 130
Latency (ms):
    min: 17566.82
    avg: 20789.93
    max: 22230.90
    95th percentile: 21641.55
    sum: 2702690.46
Threads fairness:
    events (avg/stddev): 1.0000/0.00
    execution time (avg/stddev): 20.7899/0.82
Throughput and average latency are the two metrics we use to interpret mutex workload performance:
Throughput:
    events/s (eps): 5.8047
    time elapsed: 22.3956s
Latency (ms):
    min: 17566.82
    avg: 20789.93
    max: 22230.90
    95th percentile: 21641.55
    sum: 2702690.46
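When comparing many runs it helps to pull these two numbers out of the report automatically. The following is a minimal sketch, assuming the report text has the same labels as the excerpt from the mutex run; extract_metrics and the dictionary keys are hypothetical names chosen for this example.

```python
import re

# Excerpt of the mutex report, copied from the run in this post.
SAMPLE_REPORT = """Throughput:
    events/s (eps): 5.8047
    time elapsed: 22.3956s
Latency (ms):
    min: 17566.82
    avg: 20789.93
    max: 22230.90
    95th percentile: 21641.55
    sum: 2702690.46
"""

# One regular expression per metric we care about.
PATTERNS = {
    "events_per_second": r"events/s \(eps\):\s*([\d.]+)",
    "avg_latency_ms": r"\bavg:\s*([\d.]+)",
}


def extract_metrics(report: str) -> dict:
    """Return the metrics found in a sysbench report as floats."""
    metrics = {}
    for name, pattern in PATTERNS.items():
        match = re.search(pattern, report)
        if match:
            metrics[name] = float(match.group(1))
    return metrics


if __name__ == "__main__":
    print(extract_metrics(SAMPLE_REPORT))
```

The "avg:" pattern deliberately requires a colon directly after "avg", so the "(avg/stddev)" lines in the thread fairness section are not picked up by mistake.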
Benchmarking the memory workload
When benchmarking memory, sysbench allocates a memory buffer and then reads from or writes to it, one pointer-sized word (32-bit or 64-bit) at a time, until the whole buffer has been read or written. This is repeated until the requested volume (--memory-total-size) is reached. The load can be raised or lowered by changing the number of threads (--num-threads), the buffer size (--memory-block-size) and the request type (read or write, sequential or random).
[root@localhost shiv]# sysbench --test=memory --num-threads=140 --memory-total-size=10G run
sysbench 1.1.0-651e7fd (using bundled LuaJIT 2.1.0-beta3)
Running the test with following options:
Number of threads: 140
Initializing random number generator from current time
Running memory speed test with the following options:
    block size: 1KiB
    total size: 10240MiB
    operation: write
    scope: global
Initializing worker threads...
Threads started!
Total operations: 10485720 (3351958.44 per second)
10239.96 MiB transferred (3273.40 MiB/sec)
Throughput:
    events/s (eps): 3351958.4393
    time elapsed: 3.1282s
    total number of events: 10485720
Latency (ms):
    min: 0.00
    avg: 0.01
    max: 2931.98
    95th percentile: 0.00
    sum: 123371.54
Threads fairness:
    events (avg/stddev): 74898.0000/0.00
    execution time (avg/stddev): 0.8812/0.93
Throughput and operations per second are the important metrics for memory workload benchmarking:
Total operations: 10485720 (3351958.44 per second)
10239.96 MiB transferred (3273.40 MiB/sec)
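These two lines are tied together by the block size. As a rough check, the sketch below recomputes the transferred volume and the transfer rate from the operation count, the 1 KiB block size and the elapsed time reported for the memory run. The variable and function names are ours, and the explanation of the operation count (an even split across threads) is an inference from the thread fairness line, not something the report states.

```python
KIB_PER_MIB = 1024


def mib_transferred(total_operations: int, block_size_kib: int = 1) -> float:
    """Each operation moves one block, so volume = operations * block size."""
    return total_operations * block_size_kib / KIB_PER_MIB


# Figures copied from the memory run.
THREADS = 140
EVENTS_PER_THREAD = 74898      # from "Threads fairness: events (avg/stddev)"
ELAPSED_SECONDS = 3.1282

total_operations = THREADS * EVENTS_PER_THREAD
volume_mib = mib_transferred(total_operations)
rate_mib_per_second = volume_mib / ELAPSED_SECONDS

if __name__ == "__main__":
    print(f"operations:  {total_operations}")
    print(f"transferred: {volume_mib:.2f} MiB")
    print(f"rate:        {rate_mib_per_second:.1f} MiB/s")
```

The recomputed rate differs from the reported 3273.40 MiB/sec only in the last digits, because the elapsed time in the report is rounded to four decimals.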
Benchmarking file system I/O with Sysbench
Several scenarios are available for benchmarking file system I/O; here we use rndrw (combined random read/write) because it produces more complex I/O that resembles production workloads. The benchmark runs in three steps:
- Prepare – Creates the files for testing
- Run – Performs the benchmarking and reporting
- Cleanup – Clean the system by deleting the files
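The three steps share every option and differ only in the final command word, so they are easy to script. The sketch below only builds and prints the command lines; it assumes the non-deprecated syntax suggested by the warnings sysbench prints (test name as the first argument, --threads instead of --num-threads), and fileio_command is a hypothetical helper, not part of sysbench.

```python
import shlex

PHASES = ("prepare", "run", "cleanup")


def fileio_command(phase: str, threads: int = 16, total_size: str = "10G",
                   test_mode: str = "rndrw") -> list:
    """Build the argument list for one phase of the fileio benchmark."""
    if phase not in PHASES:
        raise ValueError(f"unknown phase: {phase!r}")
    return [
        "sysbench",
        "fileio",
        f"--threads={threads}",
        f"--file-total-size={total_size}",
        f"--file-test-mode={test_mode}",
        phase,
    ]


if __name__ == "__main__":
    # Print the commands in the order they should be executed.
    for phase in PHASES:
        print(shlex.join(fileio_command(phase)))
```

To execute a phase instead of printing it, the returned list can be passed to subprocess.run(command, check=True) on a machine where sysbench is installed.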
Prepare
[root@localhost shiv]# sysbench --num-threads=16 --test=fileio --file-total-size=10G --file-test-mode=rndrw prepare
sysbench 1.1.0-651e7fd (using bundled LuaJIT 2.1.0-beta3)
128 files, 81920Kb each, 10240Mb total
Creating files for the test...
Extra file open flags: (none)
Reusing existing file test_file.0
Reusing existing file test_file.1
Reusing existing file test_file.2
Reusing existing file test_file.3
..................................
..................................
Reusing existing file test_file.122
Reusing existing file test_file.123
Reusing existing file test_file.124
Reusing existing file test_file.125
Reusing existing file test_file.126
Reusing existing file test_file.127
Run
[root@localhost shiv]# sysbench --num-threads=16 --test=fileio --file-total-size=10G --file-test-mode=rndrw run
sysbench 1.1.0-651e7fd (using bundled LuaJIT 2.1.0-beta3)
Running the test with following options:
Number of threads: 16
Initializing random number generator from current time
Extra file open flags: (none)
128 files, 80MiB each
10GiB total file size
Block size 16KiB
Number of IO requests: 0
Read/Write ratio for combined random IO test: 1.50
Periodic FSYNC enabled, calling fsync() each 100 requests.
Calling fsync() at the end of test, Enabled.
Using synchronous I/O mode
Doing random r/w test
Initializing worker threads...
Threads started!
Throughput:
    read: IOPS=2495.85 39.00 MiB/s (40.89 MB/s)
    write: IOPS=1663.70 26.00 MiB/s (27.26 MB/s)
    fsync: IOPS=5311.68
Latency (ms):
    min: 0.00
    avg: 1.69
    max: 631.90
    95th percentile: 5.00
    sum: 159794.48
Cleanup
[root@localhost shiv]# sysbench --num-threads=16 --test=fileio --file-total-size=10G --file-test-mode=rndrw cleanup
WARNING: the --test option is deprecated. You can pass a script name or path on the command line without any options.
WARNING: --num-threads is deprecated, use --threads instead
sysbench 1.1.0-651e7fd (using bundled LuaJIT 2.1.0-beta3)
Removing test files...
For file system I/O benchmarking, we record and interpret only the throughput (both reads and writes) under varying loads. In the test above, read throughput is 40.89 MB/s and write throughput is 27.26 MB/s.
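The throughput figures follow directly from the IOPS values and the 16 KiB block size shown in the run output. The sketch below redoes that arithmetic in both units the report prints (MiB/s, base 2, and MB/s, base 10); the function name throughput_from_iops is ours.

```python
BLOCK_SIZE_BYTES = 16 * 1024   # "Block size 16KiB" in the run output


def throughput_from_iops(iops: float, block_size_bytes: int = BLOCK_SIZE_BYTES):
    """Return (MiB/s, MB/s) for a given number of I/O operations per second."""
    bytes_per_second = iops * block_size_bytes
    return bytes_per_second / 2**20, bytes_per_second / 10**6


# IOPS values copied from the fileio run.
READ_IOPS = 2495.85
WRITE_IOPS = 1663.70

read_mib, read_mb = throughput_from_iops(READ_IOPS)
write_mib, write_mb = throughput_from_iops(WRITE_IOPS)
read_write_ratio = READ_IOPS / WRITE_IOPS

if __name__ == "__main__":
    print(f"read:  {read_mib:.2f} MiB/s ({read_mb:.2f} MB/s)")
    print(f"write: {write_mib:.2f} MiB/s ({write_mb:.2f} MB/s)")
    print(f"read/write ratio: {read_write_ratio:.2f}")
```

The measured read/write ratio also comes out at about 1.50, matching the "Read/Write ratio for combined random IO test" line in the output.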