MySQL 8 Group Replication Limitations

Understanding MySQL 8.0 Group Replication Limitations to build an highly reliable MySQL Infrastructure Operations 


Introduction – Redundant and fault-tolerant database systems

Most of our customers are 24*7 internet properties and their applications ( i.e. entire business operations) rely heavily on availability of respective database systems nodes, These systems going unavailable cause serious impact on reputation, customer experience and revenue. To ensure business continuity under foreseeable and unforeseeable man-made or natural disasters, the database system has to be designed and built with fault tolerance and disaster recovery capabilities, in other words the component can be removed and the system should continue to operate as expected. High availability refers to the characteristics of a system that allow the system to sustain continuous operation in the event of hardware and software failures due to natural or man made causes (Webb, 2008).

The availability of system is normally defined as (Marcus, 2003): –  A= MTBF / MTBF + MTTR

Where A is system availability, MTBF is the mean time between failures, and MTTR is the mean time to recover the system. From the formula for system availability, it can be derived that when MTTR approaches zero (i.e., system down time is substantially short), availability (A) increases toward 100 percent. On the other hand, when MTBF gets larger (i.e., system down time occurs very rarely), MTTR has less impact on A. Therefore, the goal of system high availability is to make MTTR as small as possible and MTBF as large as possible.

How MySQL 8.0 Group Replication works ?

  • MySQL Group Replication is a plugin (first released with MySQL 5.7 Server) which can provide virtually synchronous replication, with built-in conflict detection / handling and consistency guarantee, So the individual machines and/or MySQL instances can fail or be taken offline for maintenance while the distributed MySQL service continues to operate and handle application traffic.
  • MySQL Group Replication is based on a Paxos implementation, which ensures that all servers receive the same set of data in the same order. There is a built-in group membership service that keeps the view of the group consistent and available for all servers at any given point in time. Servers can leave and join the group and the view is updated accordingly. Sometimes servers can leave the group unexpectedly, in which case the failure detection mechanism detects this and notifies the group that the view has changed. This is all automatic.
  • In MySQL Group Replication, For a transaction to commit, the majority of the group have to agree on the order of a given transaction in the global sequence of transactions. Deciding to commit or abort a transaction is done by each server individually, but all servers make the same decision. If there is a network partition, resulting in a split where members are unable to reach agreement, then the system does not progress until this issue is resolved. Hence there is also a built-in, automatic, split-brain protection mechanism.

MySQL 8.0 Group Replication Limitations

  • Maximum number of MySQL servers that can be the members of a single replication group is 9, Any further attempt made by adding more members will be refused. Functionally this appears very rational to us, Why add several MySQL instances to make the Group Replication infrastructure operationally complex ?
  • If you are operating Group Replication in multi-primary mode, The SELECT … FOR UPDATE statements can cause a deadlock, This is because the lock is not shared across the members of the group, therefore the expectation for such a statement might not be reached.
  • We cannot use  replication filters on Group Replication. Because, Filtering the transactions  on selected servers would make the group enabled  to reach agreement on consistent state. But, You can implement  replication filters on the replication channels that are not directly involved with  Group Replication

Handling Transaction Limit in MySQL Group Replication

  • Configure group_replication_transaction_size_limit system variable to control maximum transaction size the group will tolerate, The default value is approximately 143 MB (150000000 bytes). The time taken to process a transaction is proportional to its size.
  • Group Replication use compression algorithm LZ4 and it is enabled by default with threshold of 1000000 bytes (1 MB), You can tune this value by configuring the system variable group_replication_compression_threshold . We have copied below the steps to set a compression threshold:
STOP GROUP_REPLICATION;
SET GLOBAL group_replication_compression_threshold= 5242880;
START GROUP_REPLICATION; 

In the above script, We have set the compression threshold to 5MB. Now if a transaction generates a replication  message  more  than  5MB, i.e., the binary log  transaction entry is larger than 5MB, then it is compressed.

Note: You can disable the compression by setting threshold to “0

  • In Group Replication large messages are automatically fragmented. Because, When an abnormally large message is sent between Group Replication group members, it can result in some group members being reported as failed and expelled from the group. So large messages are automatically split into fragments that are sent separately and reassembled by the recipients. You can configure system variable  group_replication_communication_max_message_size ( default value is 10485760 bytes / 10 MiB ) to specify a message size above which messages are fragmented. This value cannot be larger than the system variable slave_max_allowed_packet , which is 1073741824 bytes (1 GB). You can switch off fragmentation by setting value “0” for group_replication_communication_max_message_size .

Note: In order for a replication group to use fragmentation, all group members must be at MySQL 8.0.16 or above

Conclusion

Group Replication definitely  is an off-the-shelf solution for building highly available, fault tolerant  and self-healing MySQL infrastructure operations. Even if some of the servers subsequently fail, as long it is not all or a majority, the system is still available. Depending on the number of servers which fail the group might have degraded performance or scalability, but it is still available. To conclude, MySQL Group Replication guarantees that the database service is continuously available.  But, When you are using such compelling solution for MySQL Reliability, It’s alway better to also discount limitations and tunable parameter for better performance. Thanks  for taking  time in reading this  blog and we welcome your comments.

References 

https://www.percona.com/blog/2017/08/01/group-replication-sweet-sour/ 

https://mysqlhighavailability.com/mysql-group-replication-a-quick-start-guide/ 

https://dev.mysql.com/doc/refman/8.0/en/group-replication-limitations.html

About MinervaDB Corporation 88 Articles
Independent and vendor neutral consulting, support, remote DBA services and training for MySQL, MariaDB, Percona Server, PostgreSQL and ClickHouse with core expertize in performance, scalability and high availability . We are an virtual corporation, all of us work from home on multiple timezones and stay connected via Email, Skype, Google Hangouts, Phone and IRC supporting over 250 customers worldwide
UA-155183614-1