Summary
This runbook describes how to fail over to the faraway replica in EDB Cloud Service by promoting it to be the new primary.
Automatic failover in EDB Cloud Service with Primary/Standby High Availability only switches to the regular standby replicas deployed in the same region. In a disaster recovery (DR) scenario, manual intervention is needed to switch to the faraway replica, as shown in the steps below.
Failover to faraway replica
We can promote a faraway replica to a full-fledged cluster, which makes it capable of accepting writes.
1. Go to the Clusters page. A list of previously created clusters appears.
2. Select the cluster with the replica you want to promote. The cluster's replicas are under Faraway Replicas in the Overview tab.
3. Select the Promote Replica icon next to the replica you want to promote. The Promote Faraway Replica page appears.
4. Verify that the replication lag is minimal, then select Promote Replica.
5. Review and, if needed, edit the PostgreSQL settings for the new primary during the promotion.
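As a rough aid for the lag check in step 4, the sketch below converts two PostgreSQL LSN values into a byte lag and expresses it in WAL segments. It assumes you can obtain the two LSNs yourself, for example from pg_current_wal_lsn() on the primary and pg_last_wal_replay_lsn() on the replica. Whether those queries can be run in your environment is an assumption, and the function names here are illustrative only, not part of any EDB tooling.

```python
# Assumption: default PostgreSQL WAL segment size of 16 MB.
WAL_SEGMENT_BYTES = 16 * 1024 * 1024


def lsn_to_bytes(lsn: str) -> int:
    """Convert a pg_lsn string such as '0/3000060' to an absolute byte position."""
    high, low = lsn.split("/")
    # An LSN is a 64-bit position printed as two hexadecimal 32-bit halves.
    return (int(high, 16) << 32) | int(low, 16)


def replication_lag_bytes(primary_lsn: str, replica_lsn: str) -> int:
    """Bytes of WAL the replica is behind the primary (never negative)."""
    return max(0, lsn_to_bytes(primary_lsn) - lsn_to_bytes(replica_lsn))


def lag_in_segments(lag_bytes: int) -> float:
    """Express a byte lag as a number of WAL segments."""
    return lag_bytes / WAL_SEGMENT_BYTES


if __name__ == "__main__":
    # Hypothetical LSN values, for illustration only.
    lag = replication_lag_bytes("0/5000000", "0/4000000")
    print(f"lag: {lag} bytes ({lag_in_segments(lag):.1f} WAL segments)")
```

A lag close to one segment is the expected floor for a faraway replica, since the WAL file still being written has not been archived yet.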
IMPORTANT NOTES:
- archive_mode will be changed from always to on. This parameter change requires a PostgreSQL restart after the promotion. See "More details about archive_mode" below.
- There will be a minimum lag of 16 MB (one WAL segment) on the faraway replica, because a WAL file is only shipped to the faraway replica once it is ready to be archived. The WAL file currently being written will still be on the primary.
- The maximum lag depends on the application workload and on the replication lag of the faraway replica at the time of promotion. Because of this lag, promoting a faraway replica to an EDB Cloud cluster can result in some data loss. For the limitations and advantages of faraway replicas, refer to EDB Docs - Faraway replicas.
- Ensure the private connection for the new primary cluster is established.
- After promotion, the new primary endpoint name must be updated in the application connection strings.
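To illustrate the connection string update, here is a minimal sketch that swaps the host in a PostgreSQL connection URI while keeping the credentials, port, database, and query parameters. It assumes the application uses URI-style connection strings. The host names and the repoint_dsn helper are hypothetical, and the real endpoint name must be taken from the promoted cluster.

```python
from urllib.parse import urlsplit, urlunsplit


def repoint_dsn(dsn: str, new_host: str) -> str:
    """Return the connection URI with its host replaced by new_host."""
    parts = urlsplit(dsn)
    # Split off "user:password" (if any) from "host:port".
    userinfo, _, _hostport = parts.netloc.rpartition("@")
    prefix = f"{userinfo}@" if userinfo else ""
    port = f":{parts.port}" if parts.port is not None else ""
    return urlunsplit(parts._replace(netloc=f"{prefix}{new_host}{port}"))


if __name__ == "__main__":
    # Hypothetical endpoints, for illustration only.
    old = "postgresql://app:secret@old-primary.example.com:5432/appdb?sslmode=require"
    print(repoint_dsn(old, "new-primary.example.com"))
```

Keeping the endpoint in one configuration value rather than scattered across the application makes this switch a single change during a DR event.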
Add a Faraway Replica to the new cluster
- Once you have promoted the faraway replica to primary, you will lose connectivity to the old primary nodes, as they will go out of sync.
- To keep the setup highly available, add a faraway replica to the promoted cluster so that it can serve as a disaster recovery solution in the future.
- To create the faraway replica, select the Create Replica option under the Quick Actions menu on the Clusters page for the new primary cluster.
- This opens a new window where you can select the new region to be used for the faraway replica.
- The faraway replica will take some time to be ready; the overall time depends on the size of the cluster.
More details about archive_mode
In addition to off, which disables WAL archiving, archive_mode has two modes:
- on: Default setting. The WAL archiver is not enabled during archive recovery or standby mode.
- always: The setting on the faraway replica. The WAL archiver is also enabled during archive recovery or standby mode. In always mode, all files restored from the archive or streamed with streaming replication are archived (again).
So after a faraway replica is promoted, archive_mode needs to be changed from always to on, and this change requires a PostgreSQL restart.
- When the cluster is promoted with a single-node configuration, it will restart immediately.
- When the cluster is promoted with a high availability (HA) configuration (in a request), it will remain pending a restart until all standby nodes have joined.
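One possible way to confirm that the archive_mode change has taken effect is to read the setting and its pending_restart flag from the standard pg_settings view. The sketch below only interprets a row you have already fetched. The query text relies on standard PostgreSQL columns, while the archive_mode_status helper and its messages are hypothetical and not part of EDB Cloud Service.

```python
# Query to run on the promoted cluster with any PostgreSQL client.
ARCHIVE_MODE_QUERY = (
    "SELECT setting, pending_restart FROM pg_settings WHERE name = 'archive_mode'"
)


def archive_mode_status(setting: str, pending_restart: bool) -> str:
    """Interpret one row returned by ARCHIVE_MODE_QUERY after a promotion."""
    if setting == "on" and not pending_restart:
        return "ok: archive_mode is on"
    if setting == "always" and pending_restart:
        # The new value is configured but the running server still uses the old one.
        return "waiting: restart pending to apply the archive_mode change"
    if setting == "always":
        return "unchanged: archive_mode is still always"
    return f"unexpected: archive_mode = {setting}, pending_restart = {pending_restart}"


if __name__ == "__main__":
    # Example row for an HA cluster whose standbys have not all joined yet.
    print(archive_mode_status("always", True))
```

For an HA cluster, the "waiting" result would be expected until all standby nodes have joined and the restart has happened.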