Patroni: Cascading Replication with Standby Cluster – PostgreSQL High Availability Guide

PostgreSQL has become the database of choice for organizations that demand reliability, and Patroni sits at the center of most production-grade PostgreSQL high availability deployments. In this in-depth guide we walk through one of Patroni's most powerful and least understood capabilities: the Patroni standby cluster. A standby cluster implements cascading PostgreSQL Replication across an entirely separate Patroni cluster, giving you cross-datacenter disaster recovery, controlled migrations, and read scaling without ever touching your primary control plane.

Patroni is a widely used solution for managing PostgreSQL high availability. It provides a robust framework for automatic failover, cluster management, and operational simplicity in PostgreSQL environments. Throughout this article we focus on how a Patroni standby cluster works, the prerequisites you must satisfy before bootstrapping it, how to configure it end to end, how it behaves during switchovers, and finally how to promote it into a fully independent primary cluster.

What Is a Patroni Standby Cluster?

Fundamentally, a Patroni standby cluster implements cascading PostgreSQL Replication. Within the standby cluster there is a special node that acts as the standby leader. Although this node behaves like a leader inside the standby cluster, it is actually a replica of the primary Patroni cluster. Every other node in the standby cluster then replicates from this standby leader rather than reaching back to the source cluster directly. This cascading topology is what makes the design so elegant. The primary Patroni cluster only ever sees a single replication consumer, the standby leader, while the standby cluster maintains its own isolated high availability control plane. If the standby leader fails, Patroni promotes another standby node to become the new standby leader and replication continues transparently. Because each cluster runs its own Distributed Configuration Store (DCS), the two clusters never fight over leadership and remain operationally independent.
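
As a rough illustration of this topology, the sketch below models each standby node as a pointer to the node it streams from. It is a toy model rather than anything Patroni exposes: the helper names build_upstreams and consumers_of are made up for this example, the node names are the ones used later in this guide, and the primary cluster's own replicas are left out.

```python
# Toy model of the cascading topology: each node maps to the node it streams from.
# Helper names are illustrative only; they are not part of Patroni.

def build_upstreams(primary_leader, standby_leader, standby_members):
    """Return {node: upstream} for the members of the standby cluster."""
    upstreams = {standby_leader: primary_leader}
    for member in standby_members:
        if member != standby_leader:
            upstreams[member] = standby_leader
    return upstreams


def consumers_of(node, upstreams):
    """List the standby cluster nodes that stream directly from `node`."""
    return sorted(n for n, up in upstreams.items() if up == node)


members = ["ta1", "ta2", "ta3"]

# ta2 is the standby leader: it is the only node attached to the primary.
topology = build_upstreams("so3", "ta2", members)
print(consumers_of("so3", topology))
print(consumers_of("ta2", topology))

# If the standby leader role moves to ta1, only the upstream edges change;
# the primary still serves a single consumer from the standby cluster.
topology = build_upstreams("so3", "ta1", members)
print(consumers_of("so3", topology))
```

Whichever node holds the standby leader role, the primary side of the model only ever has one entry from the standby cluster, which is the property the rest of this guide relies on.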

Why Do We Use a Patroni Standby Cluster?

There are several reasons why teams deploy a standby cluster in production PostgreSQL environments. The most common use cases can be grouped into four broad categories, and understanding them helps you decide whether a standby cluster is the right tool for your PostgreSQL high availability strategy.

Geographic Redundancy and Disaster Recovery. If the entire primary cluster becomes unavailable because of a data center failure, an infrastructure outage, or another catastrophic event, the standby cluster can be promoted to become the new primary cluster. Because the standby cluster lives in a different failure domain, it survives events that would take down every node of the primary at once. This is the single most popular reason for adopting cascading PostgreSQL Replication with Patroni.

Controlled Migrations. A standby cluster lets teams migrate an existing PostgreSQL cluster to new hardware, a new data center, or another cloud provider with minimal downtime. You bring the standby cluster fully in sync, schedule a short maintenance window, and then promote it. The application simply repoints to the new endpoint, and the migration is complete without a lengthy dump-and-restore.

Isolated Load Balancing and Read Scaling. Applications running in another region can perform read operations against the standby cluster, reducing network latency compared to reaching across regions to the primary cluster, an approach often paired with broader PostgreSQL Replication strategies. Because the standby cluster is isolated, heavy analytical reads there never destabilize the write path on the primary Patroni cluster.

Testing. The standby cluster can be used for disaster recovery testing, failover simulations, and backup validation without impacting the primary production cluster. You can safely rehearse promotions, measure recovery time objectives, and validate that your runbooks actually work before a real incident forces the issue.

Prerequisites for the Standby Cluster

Before setting up a standby cluster, several requirements must be satisfied. Skipping any of these is the most common cause of a failed bootstrap, so it is worth reviewing each one carefully before you begin building your PostgreSQL Replication topology.

First, you need a healthy primary Patroni cluster. For demonstration purposes, the source cluster in this guide runs three nodes on the addresses 192.168.122.237, 192.168.122.93, and 192.168.122.128.

Second, Patroni expects a postgresql.conf or postgresql.conf.backup file to be present in the PGDATA directory on the remote primary. This matters when the basebackup method is used as one of the create_replica_methods. If neither file is present, the standby leader bootstrap fails, and the bootstrap of every other node in the standby cluster fails as a consequence. If postgresql.conf is kept outside PGDATA, as it is on Debian based distributions, it is your responsibility to copy it into PGDATA.

Third, if a replication slot will be used for the standby cluster by specifying primary_slot_name, then a corresponding replication slot must be created by the user with a name matching primary_slot_name. Patroni does not automatically create a replication slot on the primary cluster for the standby cluster. If use_slots is enabled, the permanent replication slots feature can be used on the primary cluster so that, even when a switchover or failover occurs, the replication slot is preserved and maintained by Patroni.

Fourth, either a single endpoint such as a VIP, or a list containing all the hosts of the primary Patroni cluster, must be defined in standby_cluster.host. When you use a list, hostnames or IP addresses should be separated by commas.

Fifth, when basebackup and streaming replication will be used, new pg_hba rules must be added to the primary Patroni cluster for the standby cluster nodes. It is strongly recommended to add rules for all standby cluster nodes, not just the designated standby leader, so that a switchover or failover inside the standby cluster does not break replication.

The commands below apply the required changes on the primary Patroni cluster. We list the current topology and then edit the configuration to add the new pg_hba rules and a permanent replication slot:
# patronictl -c /etc/patroni/16-pgcluster.yml list
+ Cluster: pgcluster (7610779391636123875) ------+----+-------------+-----+------------+-----+
| Member | Host            | Role    | State     | TL | Receive LSN | Lag | Replay LSN | Lag |
+--------+-----------------+---------+-----------+----+-------------+-----+------------+-----+
| so1    | 192.168.122.237 | Replica | streaming | 12 | 0/F0001B8   | 0   | 0/F0001B8  | 0   |
| so2    | 192.168.122.93  | Replica | streaming | 12 | 0/F0001B8   | 0   | 0/F0001B8  | 0   |
| so3    | 192.168.122.128 | Leader  | running   | 12 |             |     |            |     |
+--------+-----------------+---------+-----------+----+-------------+-----+------------+-----+

# patronictl -c /etc/patroni/16-pgcluster.yml edit-config
---
+++
@@ -42,6 +42,12 @@
   - host postgres    rewinder   192.168.122.93/32  scram-sha-256
   - host replication replicator 192.168.122.128/32 scram-sha-256
   - host postgres    rewinder   192.168.122.128/32 scram-sha-256
+  - host replication replicator 192.168.122.97/32  scram-sha-256
+  - host replication replicator 192.168.122.198/32 scram-sha-256
+  - host replication replicator 192.168.122.172/32 scram-sha-256
+  - host postgres    rewinder   192.168.122.97/32  scram-sha-256
+  - host postgres    rewinder   192.168.122.198/32 scram-sha-256
+  - host postgres    rewinder   192.168.122.172/32 scram-sha-256
   pg_ident:
   - rewindmap postgres rewinder
@@ -49,4 +55,7 @@
     use_pg_rewind: true
     use_slots: true
   retry_timeout: 10
+  slots:
+    my_standby_cluster_slot:
+      type: physical
   ttl: 30

Apply these changes? [y/N]:
As you can see, the primary Patroni cluster is running and healthy, a replication slot named my_standby_cluster_slot has been created, and pg_hba rules have been added for every standby cluster node. Because this example uses a RHEL based distribution, our postgresql.conf already lives inside PGDATA and we did not need to move it. For standby_cluster.host we will use the value '192.168.122.237,192.168.122.93,192.168.122.128'. After applying the changes, we can verify the replication slot on the leader node of the primary Patroni cluster:
[postgres@so3 ~]$ psql -c "select slot_name, slot_type, active, wal_status from pg_replication_slots;"
        slot_name        | slot_type | active | wal_status
-------------------------+-----------+--------+------------
 so1                     | physical  | t      | reserved
 so2                     | physical  | t      | reserved
 my_standby_cluster_slot | physical  | f      | reserved
(3 rows)
Since the standby cluster has not been bootstrapped yet, the replication slot my_standby_cluster_slot is currently inactive, indicated by the active column showing f. Once the standby leader connects, this slot will become active and begin retaining WAL for the standby cluster.
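
The pg_hba entries added to the primary follow a regular pattern: one replication rule and one rewind rule per standby node. As a minimal sketch, assuming the replicator and rewinder user names and the scram-sha-256 method used in this guide, the hypothetical helper below generates those lines from a list of addresses so that no standby node is forgotten. It is not part of Patroni; the output is meant to be pasted into edit-config by hand.

```python
import ipaddress


def standby_pg_hba_rules(addresses, repl_user="replicator",
                         rewind_user="rewinder", auth="scram-sha-256"):
    """Build pg_hba lines for every standby node (hypothetical helper)."""
    # ip_network() validates each address and appends the host prefix (/32 for IPv4).
    networks = [str(ipaddress.ip_network(addr)) for addr in addresses]
    rules = [f"host replication {repl_user} {net} {auth}" for net in networks]
    rules += [f"host postgres {rewind_user} {net} {auth}" for net in networks]
    return rules


standby_nodes = ["192.168.122.97", "192.168.122.198", "192.168.122.172"]
for rule in standby_pg_hba_rules(standby_nodes):
    print("- " + rule)
```

Generating the rules from the full node list, rather than typing them for the current standby leader only, matches the recommendation to cover every standby node up front.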

Bootstrapping the Standby Cluster

With the prerequisites in place, we can now define the standby cluster. The configuration below is the Patroni YAML file for the first node of the standby cluster. Pay special attention to the bootstrap.dcs.standby_cluster block, which is where cascading PostgreSQL Replication is actually declared:
# cat /etc/patroni/16-pgcluster.yml
scope: pgcluster
namespace: /service/patroni/
name: ta1

restapi:
  listen: 192.168.122.97:8008
  connect_address: 192.168.122.97:8008

etcd3:
  hosts:
  - 192.168.122.97:2379
  - 192.168.122.198:2379
  - 192.168.122.172:2379
  protocol: http

bootstrap:
  dcs:
    ttl: 30
    loop_wait: 10
    retry_timeout: 10
    maximum_lag_on_failover: 1048576
    standby_cluster:
      host: '192.168.122.237,192.168.122.93,192.168.122.128'
      port: 5432
      primary_slot_name: my_standby_cluster_slot
      create_replica_methods:
      - basebackup
    postgresql:
      use_pg_rewind: true
      use_slots: true
      parameters:
        archive_command: /bin/true
        archive_mode: 'on'
        work_mem: 64 MB

  pg_hba:
  - local all postgres peer
  - local replication replicator peer map=replicamap
  - local postgres rewinder peer map=rewindmap
  - host replication replicator 192.168.122.97/32  scram-sha-256
  - host postgres    rewinder   192.168.122.97/32  scram-sha-256
  - host replication replicator 192.168.122.198/32 scram-sha-256
  - host postgres    rewinder   192.168.122.198/32 scram-sha-256
  - host replication replicator 192.168.122.172/32 scram-sha-256
  - host postgres    rewinder   192.168.122.172/32 scram-sha-256
  - host all postgres all reject

  pg_ident:
  - rewindmap postgres rewinder
  - replicamap postgres replicator
  initdb:
  - encoding: UTF8
  - data-checksums
  - auth-host: scram-sha-256

postgresql:
  listen: 192.168.122.97:5432
  connect_address: 192.168.122.97:5432
  data_dir: /var/lib/pgsql/16/pgcluster
  bin_dir: /usr/pgsql-16/bin
  pgpass: /etc/patroni/pgcluster.pgpass
  use_unix_socket: true
  use_unix_socket_repl: true
  parameters:
    password_encryption: scram-sha-256
    data_directory: /var/lib/pgsql/16/pgcluster
  authentication:
    replication:
      username: replicator
      password: $strong_replicator_password
    superuser:
      username: postgres
    rewind:
      username: rewinder
      password: $strong_rewinder_password

tags:
  noloadbalance: false
  clonefrom: false
  nosync: false
There are a few important points worth highlighting about this configuration.

In the previous section, the patronictl ... list output showed that the cluster name of the primary Patroni cluster is pgcluster. In the standby cluster configuration we reuse the same name, which is not a problem here because we use an independent etcd cluster for the standby cluster. Their HA control planes are completely isolated from each other. You must be very careful if a single etcd is shared by the primary and standby clusters; in that architecture the cluster scope must be different, or a different namespace must be used, otherwise the two clusters will collide.

In the postgresql.authentication section, the username and password pairs for replication and rewind must be exactly the same as the ones used in the primary Patroni cluster. Otherwise either the cluster bootstrap or a later pg_rewind operation will fail.

The pg_hba rules here are fairly straightforward: the cluster nodes allow incoming connections from one another. However, if you plan to reconfigure the current primary Patroni cluster as a new standby cluster after promoting this one, the pg_hba rules must be adjusted accordingly ahead of time.

Once every standby node is configured, start the Patroni service on each standby node:
# systemctl start patroni@16-pgcluster.service
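
The scope and namespace warning above can be expressed as a small check. The sketch below is only an illustration: it assumes that a cluster's keys live under a prefix built from namespace and scope, and that two clusters are at risk only when they share at least one etcd endpoint. The helper names are hypothetical, and the etcd addresses of the primary cluster are assumed for the example because this guide does not list them.

```python
def dcs_prefix(namespace, scope):
    """Key prefix for a cluster, e.g. /service/patroni/pgcluster."""
    parts = [p for p in (namespace.strip("/"), scope.strip("/")) if p]
    return "/" + "/".join(parts)


def clusters_collide(a, b):
    """True when both clusters share an etcd endpoint and the same key prefix."""
    shared_dcs = bool(set(a["etcd_hosts"]) & set(b["etcd_hosts"]))
    same_prefix = (dcs_prefix(a["namespace"], a["scope"])
                   == dcs_prefix(b["namespace"], b["scope"]))
    return shared_dcs and same_prefix


primary = {
    "scope": "pgcluster",
    "namespace": "/service/patroni/",
    # Assumed addresses: the primary's etcd endpoints are not given in this guide.
    "etcd_hosts": ["192.168.122.237:2379", "192.168.122.93:2379", "192.168.122.128:2379"],
}
standby = {
    "scope": "pgcluster",
    "namespace": "/service/patroni/",
    "etcd_hosts": ["192.168.122.97:2379", "192.168.122.198:2379", "192.168.122.172:2379"],
}

print(clusters_collide(primary, standby))  # separate etcd clusters

# The same scope and namespace on a shared etcd would be a problem.
shared = dict(standby, etcd_hosts=primary["etcd_hosts"])
print(clusters_collide(primary, shared))
```

With a shared etcd, changing either the scope or the namespace of one cluster is enough to make the prefixes differ.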

Checking the Standby Cluster

After the services start, we can check the state of the standby cluster. Notice how one node reports the role Standby Leader while the others are ordinary Replica nodes streaming from it. This is cascading PostgreSQL Replication in action:
[root@ta1 pgcluster]# patronictl -c /etc/patroni/16-pgcluster.yml list
+ Cluster: pgcluster (7610779391636123875) -+----------------+-----------+----+-------------+-----+------------+-----+
| Member | Host            | Role           | State     | TL | Receive LSN | Lag | Replay LSN | Lag |
+--------+-----------------+----------------+-----------+----+-------------+-----+------------+-----+
| ta1    | 192.168.122.97  | Replica        | streaming | 12 | 0/F0001B8   | 0   | 0/F0001B8  | 0   |
| ta2    | 192.168.122.198 | Standby Leader | streaming | 12 |             |     |            |     |
| ta3    | 192.168.122.172 | Replica        | streaming | 12 | 0/F0001B8   | 0   | 0/F0001B8  | 0   |
+--------+-----------------+----------------+-----------+----+-------------+-----+------------+-----+
To understand exactly what happened under the hood, it helps to inspect the Patroni logs. The standby leader's log shows that it bootstrapped itself directly from a remote member of the primary cluster using basebackup:
Mar 10 09:30:46 ta2 patroni@16-pgcluster[2294]: INFO: Selected new etcd server http://192.168.122.198:2379
Mar 10 09:30:46 ta2 patroni@16-pgcluster[2294]: INFO: No PostgreSQL configuration items changed, nothing to reload.
Mar 10 09:30:46 ta2 patroni@16-pgcluster[2294]: INFO: Lock owner: None; I am ta2
Mar 10 09:30:46 ta2 patroni@16-pgcluster[2294]: INFO: trying to bootstrap a new standby leader
Mar 10 09:30:47 ta2 patroni@16-pgcluster[2294]: INFO: replica has been created using basebackup
Mar 10 09:30:47 ta2 patroni@16-pgcluster[2294]: INFO: bootstrapped clone from remote member postgresql://192.168.122.237,192.168.122.93,192.168.122.128:5432
Mar 10 09:30:47 ta2 patroni@16-pgcluster[2294]: INFO: postmaster pid=2354
Mar 10 09:30:48 ta2 patroni@16-pgcluster[2361]: /run/postgresql:5432 - accepting connections
Mar 10 09:30:48 ta2 patroni@16-pgcluster[2294]: INFO: establishing a new patroni heartbeat connection to postgres
Mar 10 09:30:48 ta2 patroni@16-pgcluster[2294]: INFO: initialized a new cluster
Mar 10 09:30:48 ta2 patroni@16-pgcluster[2294]: INFO: no action. I am (ta2), the standby leader with the lock
Meanwhile, the log on a replica node in the standby cluster tells a different story. That node does not reach back to the primary cluster at all; instead it bootstraps from the standby leader ta2, confirming the cascading topology:
Mar 10 09:30:46 ta1 patroni@16-pgcluster[1615]: INFO: Selected new etcd server http://192.168.122.97:2379
Mar 10 09:30:46 ta1 patroni@16-pgcluster[1615]: INFO: Lock owner: None; I am ta1
Mar 10 09:30:46 ta1 patroni@16-pgcluster[1615]: INFO: failed to acquire initialize lock
Mar 10 09:30:48 ta1 patroni@16-pgcluster[1615]: INFO: Lock owner: ta2; I am ta1
Mar 10 09:30:48 ta1 patroni@16-pgcluster[1615]: INFO: trying to bootstrap from leader 'ta2'
Mar 10 09:30:48 ta1 patroni@16-pgcluster[1615]: INFO: bootstrap from leader 'ta2' in progress
Mar 10 09:30:49 ta1 patroni@16-pgcluster[1615]: INFO: replica has been created using basebackup
Mar 10 09:30:49 ta1 patroni@16-pgcluster[1615]: INFO: bootstrapped from leader 'ta2'
Mar 10 09:30:49 ta1 patroni@16-pgcluster[1615]: INFO: postmaster pid=1658
Mar 10 09:30:50 ta1 patroni@16-pgcluster[1665]: /run/postgresql:5432 - accepting connections
Mar 10 09:30:50 ta1 patroni@16-pgcluster[1615]: INFO: establishing a new patroni heartbeat connection to postgres
Mar 10 09:30:50 ta1 patroni@16-pgcluster[1615]: INFO: no action. I am (ta1), a secondary, and following a standby leader (ta2)
The standby leader is bootstrapped using a remote member of the primary cluster, while the replica is bootstrapped from the standby leader. We can now re-check the replication slots on the leader of the primary Patroni cluster:
[postgres@so3 ~]$ psql -c "select slot_name, slot_type, active, wal_status from pg_replication_slots;"
        slot_name        | slot_type | active | wal_status
-------------------------+-----------+--------+------------
 so1                     | physical  | t      | reserved
 so2                     | physical  | t      | reserved
 my_standby_cluster_slot | physical  | t      | reserved
(3 rows)
The replication slot my_standby_cluster_slot is now active, confirming that the standby leader has connected and the entire standby cluster is streaming through a single, protected slot on the primary.

Playing With the Architecture

A standby cluster is only useful if it survives the events that happen in real production life, so let us stress the topology. First, we perform a switchover inside the primary Patroni cluster and observe how both clusters react. In a well-behaved cascading setup, a leadership change on the primary should not disrupt the standby cluster at all:
[root@so1 ~]# patronictl -c /etc/patroni/16-pgcluster.yml switchover --leader so3 --candidate so1 --force
Current cluster topology
+ Cluster: pgcluster (7610779391636123875) ------+----+-------------+-----+------------+-----+
| Member | Host            | Role    | State     | TL | Receive LSN | Lag | Replay LSN | Lag |
+--------+-----------------+---------+-----------+----+-------------+-----+------------+-----+
| so1    | 192.168.122.237 | Replica | streaming | 12 | 0/F0001B8   | 0   | 0/F0001B8  | 0   |
| so2    | 192.168.122.93  | Replica | streaming | 12 | 0/F0001B8   | 0   | 0/F0001B8  | 0   |
| so3    | 192.168.122.128 | Leader  | running   | 12 |             |     |            |     |
+--------+-----------------+---------+-----------+----+-------------+-----+------------+-----+
2026-03-10 13:19:41.74548 Successfully switched over to "so1"

[root@so1 ~]# patronictl -c /etc/patroni/16-pgcluster.yml list
+ Cluster: pgcluster (7610779391636123875) ------+----+-------------+-----+------------+-----+
| Member | Host            | Role    | State     | TL | Receive LSN | Lag | Replay LSN | Lag |
+--------+-----------------+---------+-----------+----+-------------+-----+------------+-----+
| so1    | 192.168.122.237 | Leader  | running   | 13 |             |     |            |     |
| so2    | 192.168.122.93  | Replica | streaming | 13 | 0/100001B8  | 0   | 0/100001B8 | 0   |
| so3    | 192.168.122.128 | Replica | streaming | 13 | 0/100001B8  | 0   | 0/100001B8 | 0   |
+--------+-----------------+---------+-----------+----+-------------+-----+------------+-----+
The primary cluster switched over cleanly and so1 is now the leader on a new timeline. But what about the replication slot that feeds the standby cluster? Let us check:
[root@so1 ~]# sudo -u postgres psql -c "select slot_name, slot_type, active, wal_status from pg_replication_slots;"
        slot_name        | slot_type | active | wal_status
-------------------------+-----------+--------+------------
 so3                     | physical  | t      | reserved
 so2                     | physical  | t      | reserved
 my_standby_cluster_slot | physical  | t      | reserved
(3 rows)
Because use_slots is enabled and my_standby_cluster_slot is declared under slots in the cluster configuration, Patroni treats it as a permanent slot and automatically maintains it on the new primary across the switchover. As a result the standby cluster keeps streaming without any interruption. We can confirm that the standby cluster is still perfectly healthy and now following the new timeline:
[root@ta3 pgcluster]# patronictl -c /etc/patroni/16-pgcluster.yml list
+ Cluster: pgcluster (7610779391636123875) -+----------------+-----------+----+-------------+-----+------------+-----+
| Member | Host            | Role           | State     | TL | Receive LSN | Lag | Replay LSN | Lag |
+--------+-----------------+----------------+-----------+----+-------------+-----+------------+-----+
| ta1    | 192.168.122.97  | Replica        | streaming | 13 | 0/100001B8  | 0   | 0/100001B8 | 0   |
| ta2    | 192.168.122.198 | Standby Leader | streaming | 13 |             |     |            |     |
| ta3    | 192.168.122.172 | Replica        | streaming | 13 | 0/100001B8  | 0   | 0/100001B8 | 0   |
+--------+-----------------+----------------+-----------+----+-------------+-----+------------+-----+
It is equally important to test a switchover inside the standby cluster itself. This proves that the standby cluster has its own fully functional high availability, independent of the primary. Here we hand the standby leader role from ta2 to ta1:
[root@ta3 pgcluster]# patronictl -c /etc/patroni/16-pgcluster.yml switchover --leader ta2 --candidate ta1 --force
2026-03-10 13:36:35.68724 Successfully switched over to "ta1"

[root@ta3 pgcluster]# patronictl -c /etc/patroni/16-pgcluster.yml list
+ Cluster: pgcluster (7610779391636123875) -+----------------+-----------+----+-------------+-----+------------+-----+
| Member | Host            | Role           | State     | TL | Receive LSN | Lag | Replay LSN | Lag |
+--------+-----------------+----------------+-----------+----+-------------+-----+------------+-----+
| ta1    | 192.168.122.97  | Standby Leader | streaming | 13 |             |     |            |     |
| ta2    | 192.168.122.198 | Replica        | streaming | 13 | 0/100001B8  | 0   | 0/100001B8 | 0   |
| ta3    | 192.168.122.172 | Replica        | streaming | 13 | 0/100001B8  | 0   | 0/100001B8 | 0   |
+--------+-----------------+----------------+-----------+----+-------------+-----+------------+-----+
The standby cluster promoted ta1 to standby leader while the other nodes reattached and continued streaming. This confirms that the standby cluster provides real, self-contained PostgreSQL high availability on top of the cascading replication link to the primary.
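
The checks performed by eye in this section can also be scripted. The sketch below parses the plain table printed by patronictl list and verifies three things: exactly one Standby Leader, a single timeline across all members, and every member in the streaming state. It assumes the table layout shown in this guide, which may differ between Patroni versions, and the function names are made up for this example.

```python
SAMPLE = """\
+ Cluster: pgcluster (7610779391636123875) -+----------------+-----------+----+
| Member | Host            | Role           | State     | TL | Receive LSN | Lag | Replay LSN | Lag |
+--------+-----------------+----------------+-----------+----+-------------+-----+------------+-----+
| ta1    | 192.168.122.97  | Standby Leader | streaming | 13 |             |     |            |     |
| ta2    | 192.168.122.198 | Replica        | streaming | 13 | 0/100001B8  | 0   | 0/100001B8 | 0   |
| ta3    | 192.168.122.172 | Replica        | streaming | 13 | 0/100001B8  | 0   | 0/100001B8 | 0   |
+--------+-----------------+----------------+-----------+----+-------------+-----+------------+-----+
"""


def parse_patronictl_table(text):
    """Turn the pipe-delimited rows into dicts keyed by the header names."""
    header, rows = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip the title line and the +---+ borders
        cells = [c.strip() for c in line.strip("|").split("|")]
        if header is None:
            header = cells
        else:
            # The two "Lag" columns share a name, so only the last one is kept.
            rows.append(dict(zip(header, cells)))
    return rows


def standby_cluster_healthy(rows):
    """One standby leader, one timeline, everybody streaming."""
    leaders = [r for r in rows if r["Role"] == "Standby Leader"]
    timelines = {r["TL"] for r in rows}
    all_streaming = all(r["State"] == "streaming" for r in rows)
    return len(leaders) == 1 and len(timelines) == 1 and all_streaming


rows = parse_patronictl_table(SAMPLE)
print(standby_cluster_healthy(rows))
```

Running the same check before and after each switchover gives a quick regression test for the topology.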

Promoting the Standby Cluster

The moment a standby cluster proves its worth is during a promotion, whether that is a planned migration or a real disaster recovery event. Before promoting, you must make sure the clusters are fully synchronized. The easiest way to do that is by comparing the Receive LSN and Replay LSN in the patronictl list output on both clusters. This comparison only makes sense if there are no concurrent writes on the primary cluster, so it is best to take a short downtime and stop the application if possible during this operation.

The recommended promotion procedure is as follows:

1. Schedule a short maintenance window.
2. Stop application traffic and remove the VIP from the primary cluster.
3. Compare the Receive and Replay LSN values for both clusters to confirm they match.
4. Once the standby cluster is fully caught up, run the promotion command on one of the standby cluster nodes.
5. After confirming the promotion, assign the VIP to the new primary Patroni cluster.
6. Restart the application against the new endpoint.

The command that detaches the standby cluster and promotes it into an independent primary is a single edit-config call that sets standby_cluster to null:
[root@ta3 pgcluster]# patronictl -c /etc/patroni/16-pgcluster.yml edit-config --set standby_cluster=null --force
---
+++
@@ -49,10 +49,4 @@
     use_pg_rewind: true
     use_slots: true
   retry_timeout: 10
-  standby_cluster:
-    create_replica_methods:
-    - basebackup
-    host: 192.168.122.237,192.168.122.93,192.168.122.128
-    port: 5432
-    primary_slot_name: my_standby_cluster_slot
   ttl: 30

Configuration changed

[root@ta3 pgcluster]# patronictl -c /etc/patroni/16-pgcluster.yml list
+ Cluster: pgcluster (7610779391636123875) ------+----+-------------+-----+------------+-----+
| Member | Host            | Role    | State     | TL | Receive LSN | Lag | Replay LSN | Lag |
+--------+-----------------+---------+-----------+----+-------------+-----+------------+-----+
| ta1    | 192.168.122.97  | Leader  | running   | 14 |             |     |            |     |
| ta2    | 192.168.122.198 | Replica | streaming | 14 | 0/10000298  | 0   | 0/10000298 | 0   |
| ta3    | 192.168.122.172 | Replica | streaming | 14 | 0/10000298  | 0   | 0/10000298 | 0   |
+--------+-----------------+---------+-----------+----+-------------+-----+------------+-----+
Once standby_cluster is removed from the configuration, the former standby leader ta1 becomes a genuine Leader on a new timeline, and the cluster no longer replicates from the original primary. It is now a fully independent PostgreSQL cluster ready to accept writes.
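
The LSN comparison recommended before promotion is simple arithmetic. An LSN such as 0/100001B8 is two hexadecimal numbers, the high and low 32 bits of a 64-bit byte position in the WAL. The sketch below converts LSNs to integers and checks that every standby node has replayed up to the primary's position. The function names are illustrative; the primary position is assumed to have been read separately, for example with pg_current_wal_lsn() after writes have stopped.

```python
def lsn_to_int(lsn):
    """Convert an LSN like '0/100001B8' to its 64-bit byte position."""
    high, low = lsn.split("/")
    return (int(high, 16) << 32) + int(low, 16)


def byte_lag(primary_lsn, standby_lsn):
    """Bytes of WAL the standby still has to receive or replay."""
    return lsn_to_int(primary_lsn) - lsn_to_int(standby_lsn)


def ready_to_promote(primary_lsn, standby_replay_lsns):
    """True when every standby node has replayed up to the primary's LSN."""
    return all(byte_lag(primary_lsn, lsn) == 0 for lsn in standby_replay_lsns)


# Values taken from the patronictl outputs in this guide.
print(byte_lag("0/100001B8", "0/F0001B8"))   # 16777216 bytes
print(ready_to_promote("0/100001B8", ["0/100001B8", "0/100001B8"]))  # True
```

The comparison is only meaningful while the primary is not accepting writes, which is why the procedure stops application traffic first.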

Best Practices for Patroni Standby Clusters

Running a standby cluster in production goes beyond the initial bootstrap. A few operational habits will keep your PostgreSQL Replication topology healthy over the long term.

Always use a dedicated permanent replication slot with use_slots enabled so that a switchover on the primary never strands the standby cluster by discarding WAL it still needs. Add pg_hba rules for every standby node up front, not just the current standby leader, because an internal switchover can promote any node to standby leader at any time.

Keep the PostgreSQL major version identical between the primary and standby clusters. A standby cluster relies on physical, block-level replication, which is not compatible across major versions. When you plan a migration to a newer version, use logical replication or an in-place upgrade path instead of a standby cluster.

Monitor replication lag continuously by scraping pg_stat_replication on the primary and the standby leader, and alert on both byte lag and time lag so that a silently falling-behind standby cluster does not surprise you during a real disaster recovery event.

Finally, rehearse promotions regularly. The value of a standby cluster is only realized if your team can promote it quickly and confidently under pressure. Schedule periodic game days where you promote the standby cluster in a staging environment, measure the actual recovery time, and refine the runbook. Document the exact VIP reassignment steps, because in most real incidents the database promotion is the easy part and the network and application cutover is where teams lose the most time.
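
To make the lag-monitoring advice concrete, the sketch below evaluates samples against a byte threshold and a time threshold. It assumes the samples were already collected from pg_stat_replication, for instance with pg_wal_lsn_diff(pg_current_wal_lsn(), replay_lsn) for the byte lag and the replay_lag column converted to seconds. The LagSample type, the lag_alerts function, and the threshold values are placeholders for this example, not recommendations.

```python
from dataclasses import dataclass


@dataclass
class LagSample:
    application_name: str      # e.g. the standby leader's member name
    byte_lag: int              # bytes of WAL not yet replayed
    replay_lag_seconds: float  # time lag reported for replay


def lag_alerts(samples, max_bytes=64 * 1024 * 1024, max_seconds=30.0):
    """Return one message per exceeded threshold (thresholds are placeholders)."""
    alerts = []
    for s in samples:
        if s.byte_lag > max_bytes:
            alerts.append(f"{s.application_name}: byte lag {s.byte_lag} > {max_bytes}")
        if s.replay_lag_seconds > max_seconds:
            alerts.append(f"{s.application_name}: replay lag {s.replay_lag_seconds}s > {max_seconds}s")
    return alerts


samples = [
    LagSample("ta1", byte_lag=0, replay_lag_seconds=0.0),
    LagSample("ta1", byte_lag=128 * 1024 * 1024, replay_lag_seconds=45.0),
]
for message in lag_alerts(samples):
    print(message)
```

Checking both dimensions matters because an idle primary can show a large time lag with almost no byte lag, while a bulk load can produce the opposite.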

Conclusion

Patroni standby clusters provide a powerful mechanism for building robust disaster recovery architectures in PostgreSQL environments. By maintaining a fully synchronized secondary cluster that replicates through cascading PostgreSQL Replication, you can recover quickly from major infrastructure failures while minimizing both downtime and data loss. The standby leader shields the primary from having to serve many downstream replicas, and each cluster keeps its own isolated high availability control plane so that leadership changes on one side never destabilize the other. In addition to disaster recovery, standby clusters can also be used for controlled migrations across data centers or cloud providers, cross-region read scaling, and safe operational testing without ever impacting the primary production cluster. If PostgreSQL high availability is a priority for your organization, this standby cluster design is one of the most valuable tools in your architecture toolbox, and with the prerequisites, configuration, and promotion workflow covered in this guide you now have everything you need to deploy one with confidence.

Frequently Asked Questions

What is a Patroni standby cluster?

A Patroni standby cluster is a separate PostgreSQL cluster, managed by its own Patroni and its own DCS, whose standby leader replicates from a primary Patroni cluster. Internally it provides full high availability, while externally it acts as a single cascading PostgreSQL Replication consumer of the primary.

How is a standby cluster different from a normal Patroni replica?

A normal replica is a member of the same Patroni cluster and shares the same control plane and DCS. A standby cluster is an entirely independent Patroni cluster with its own leadership election; only its standby leader replicates from the primary, and the other nodes cascade from that standby leader.

Does a switchover on the primary break the standby cluster?

No. As long as use_slots is enabled and a permanent replication slot is configured, Patroni maintains that slot across a primary switchover or failover, so the standby cluster continues streaming without interruption.

How do I promote a Patroni standby cluster?

After confirming both clusters are synchronized by comparing their Receive and Replay LSN values, run patronictl edit-config --set standby_cluster=null on a standby node. This detaches the standby cluster and promotes its leader into an independent primary on a new timeline.
