Introducing the pgEdge Control Plane: High-Availability Postgres, Simplified
Making Postgres highly available adds significant operational complexity. Deployments require careful coordination of replication, failover, backups, configuration distribution, node lifecycle management, and operational safety. If you deploy your databases to Kubernetes, you have excellent tools—such as CloudNativePG—that automate much of this. But Kubernetes introduces its own operational overhead and isn’t the right fit for every organization. When deploying directly to bare metal or virtual machines, there are fewer tools, and they are often significantly more manual.
That's why we've built the pgEdge Control Plane: a new way to deploy and operate highly available Postgres across a fleet of machines. Whether on-prem, in the cloud, or hybrid, the pgEdge Control Plane provides a unified, consistent, and reliable framework for managing highly available Postgres clusters.
What is the Control Plane?
The Control Plane is a lightweight service deployed on each machine where you want Postgres to run. When possible, we try to “do the right thing” by default, so per-instance configuration is typically quite minimal. Once started, instances can be joined together to form a Control Plane cluster through a set of simple HTTP API calls. Any instance in the cluster can accept API requests to coordinate global changes—both to the Control Plane itself and to the databases it orchestrates.
At the heart of the system is a declarative HTTP API: you describe the state you want (such as replicas, backup configuration, or the roles that should exist within each database), and the Control Plane converges the real system to match it.
The goal is simple: make high-availability Postgres predictable, safe, and easy to operate, even in the presence of failures.
A Distributed Architecture Designed for Reliability
Deployment architecture diagram: This diagram shows how data flows within a Control Plane cluster. Control Plane instances are installed on each host running Postgres, where they are responsible for the Postgres instances on that host. Control Plane instances use embedded Etcd to share configuration and state data and to coordinate tasks across the cluster.
Single Process, Multi-Instance Coordination
By default, every Control Plane instance maintains the current configuration and state of your databases, ensuring multiple points of redundancy. Any instance can respond to cluster-level requests, distribute work, or pick up in-progress tasks. The cluster automatically handles coordination behind the scenes.
Embedded Etcd for a Durable State
The Control Plane embeds an Etcd server. Depending on cluster size, multiple instances will serve as Etcd peers, forming a strongly consistent datastore that holds:
Cluster configuration
Database specifications
Durable workflow state
Membership and coordination metadata
This architecture provides both strong consistency (operations execute in the correct order) and durability: your cluster state persists through restarts, crashes, and even machine failures.
If an instance goes down mid-operation (for example, during replica creation), the task can resume when the instance restarts or, for non-machine-specific steps, be safely picked up by another instance.
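Etcd replicates its data with the Raft consensus protocol, so how many machine failures the datastore can ride out depends on how many instances act as Etcd peers. The sketch below is general Raft majority arithmetic, not a description of how the Control Plane chooses its peer count; the function names are illustrative only.

```python
def quorum(voting_members: int) -> int:
    """Smallest majority of a Raft group with the given number of voters."""
    if voting_members < 1:
        raise ValueError("a cluster needs at least one voting member")
    return voting_members // 2 + 1


def tolerated_failures(voting_members: int) -> int:
    """How many voters can be lost while a majority is still reachable."""
    return voting_members - quorum(voting_members)


if __name__ == "__main__":
    for n in (1, 3, 4, 5):
        print(f"peers={n} quorum={quorum(n)} tolerated_failures={tolerated_failures(n)}")
```

Note that four peers tolerate no more failures than three, which is why odd peer counts are the usual choice for Raft-based stores.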
A Durable, Re-Entrant Workflow Engine
All database operations are executed as durable tasks stored in Etcd. Because every step is idempotent and the task state is persistent, operations can resume cleanly after interruptions. If an operation fails, you can address the underlying issue and retry.
This eliminates a large category of operational risk: partial or incomplete work never leaves the cluster in an undefined state. The system determines what has already been completed and finishes whatever remains.
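To make the resume behavior concrete, here is a minimal sketch of a re-entrant task runner. It is an illustration of the general pattern, not the Control Plane's actual engine: the `run_workflow` function, the step names, and the plain dictionary standing in for Etcd are all assumptions made for the example.

```python
from typing import Callable, Dict, List, Tuple

Step = Tuple[str, Callable[[], None]]


def run_workflow(task_id: str, steps: List[Step], store: Dict[str, List[str]]) -> None:
    """Run steps in order, skipping any already recorded as done in `store`.

    `store` stands in for a durable key-value store (hypothetical layout):
    it maps a task ID to the names of the steps that have completed.
    """
    done = store.setdefault(task_id, [])
    for name, action in steps:
        if name in done:
            continue  # completed during an earlier attempt
        # If this raises, nothing is recorded and a retry starts here again.
        # A crash between action() and the append below would re-run the
        # step on retry, which is why each step must be idempotent.
        action()
        done.append(name)


if __name__ == "__main__":
    log: List[str] = []
    attempts = {"configure": 0}

    def configure() -> None:
        attempts["configure"] += 1
        if attempts["configure"] == 1:
            raise RuntimeError("simulated interruption")
        log.append("configure")

    steps: List[Step] = [
        ("create_data_dir", lambda: log.append("create_data_dir")),
        ("configure", configure),
        ("start_postgres", lambda: log.append("start_postgres")),
    ]
    store: Dict[str, List[str]] = {}
    try:
        run_workflow("create-replica", steps, store)
    except RuntimeError:
        pass  # first attempt stops at "configure"
    run_workflow("create-replica", steps, store)  # retry resumes at "configure"
    print(log)  # ['create_data_dir', 'configure', 'start_postgres']
```

The retry does not repeat `create_data_dir`, because its completion was already recorded before the interruption.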
High-Availability Postgres, Two Ways
The Control Plane supports two methods for building highly available Postgres clusters: physical replication with Patroni and logical replication with Spock. These can be used independently or simultaneously and are fully integrated into the declarative model.
Patroni: Physical Replication and Automated Failover
For traditional HA with a single primary and read replicas, the Control Plane integrates with Patroni. The Control Plane handles lifecycle management and configuration, while Patroni manages failover and replication coordination. The Control Plane’s embedded Etcd also serves as Patroni’s Distributed Configuration Store (DCS).
Adding or removing replicas is as simple as updating the database specification and submitting an “update database” API request. The Control Plane handles the rest without downtime.
Spock: Multi-Active Logical Replication
For multi-region, multi-active architectures where every node can accept writes, we integrate with Spock, our logical replication extension. The Control Plane orchestrates configuration, subscriptions, and replication topology, making complex deployments straightforward.
Just like with Patroni, adding or removing Spock nodes is as simple as updating your database specification and submitting an update request. The Control Plane automates Spock’s Zero-Down-Time Add Node process, so these changes can be performed while the database is actively in use.
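Both kinds of change described above come down to editing the database specification and resubmitting it. The sketch below assumes the spec shape used in the example request later in this post (`nodes`, `name`, `host_ids`) and the same URL pattern; the helper names, the new host IDs, and the `Content-Type` header are assumptions for illustration. It only builds the request and does not send it.

```python
import copy
import json
import urllib.request


def add_host(spec: dict, node_name: str, host_id: str) -> dict:
    """Return a copy of `spec` with `host_id` added to the named node."""
    updated = copy.deepcopy(spec)
    for node in updated["nodes"]:
        if node["name"] == node_name:
            if host_id not in node["host_ids"]:
                node["host_ids"].append(host_id)
            return updated
    raise KeyError(f"no node named {node_name!r} in spec")


def add_node(spec: dict, node_name: str, host_ids: list) -> dict:
    """Return a copy of `spec` with a new node appended."""
    updated = copy.deepcopy(spec)
    if any(node["name"] == node_name for node in updated["nodes"]):
        raise ValueError(f"node {node_name!r} already exists")
    updated["nodes"].append({"name": node_name, "host_ids": list(host_ids)})
    return updated


def build_update_request(base_url: str, database_id: str, spec: dict) -> urllib.request.Request:
    """Build (but do not send) an update request carrying the full spec."""
    body = json.dumps({"spec": spec}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/v1/databases/{database_id}",
        data=body,
        headers={"Content-Type": "application/json"},  # assumed header
        method="POST",  # curl --data, as used in this post, also sends a POST
    )


if __name__ == "__main__":
    spec = {
        "database_name": "northwind",
        "nodes": [{"name": "n1", "host_ids": ["us-east-1a"]}],
    }
    spec = add_host(spec, "n1", "us-east-1b")    # add a replica host (hypothetical ID)
    spec = add_node(spec, "n2", ["us-west-1c"])  # add a Spock node
    request = build_update_request("http://localhost:3000", "northwind", spec)
    print(request.get_method(), request.full_url)
    print(request.data.decode("utf-8"))
```

The helpers return modified copies and leave the input spec untouched, so the previous spec remains available for comparison or rollback.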
Declarative API: Safety Through Idempotence and Dynamic Planning
curl http://localhost:3000/v1/databases/northwind --data '{
  "spec": {
    "database_name": "northwind",
    "port": 5432,
    "database_users": [
      {
        "username": "admin",
        "db_owner": true,
        "attributes": ["SUPERUSER", "LOGIN"]
      }
    ],
    "nodes": [
      {
        "name": "n1",
        "host_ids": ["us-east-1a", "us-east-1b"],
        "backup_config": {
          "repositories": [
            {
              "type": "s3",
              "s3_bucket": "backups-51153c7d-26f1-4afc-9096-4f64f17296f3",
              "s3_region": "us-east-1"
            }
          ]
        }
      },
      {
        "name": "n2",
        "host_ids": ["us-west-1c", "us-west-1d"]
      },
      {
        "name": "n3",
        "host_ids": ["eu-west-2b", "eu-west-2c"]
      }
    ]
  }
}'
An example update request: This is an example request to the Control Plane’s “update database” API endpoint. It describes a database that uses both physical and logical replication across six Postgres instances: three Spock nodes, each deployed to two different hosts in the cluster, resulting in one primary instance and one read replica per node.
The Control Plane’s API lets you define what you want, not how to achieve it. You specify where Postgres should run and any custom configuration (such as backup destinations and schedules), and the Control Plane determines the steps required to reach that state.
Whenever you modify a database, the Control Plane assesses the current system state and dynamically plans the operations needed. Because plans are generated on demand and operations are idempotent, it is safe to reapply a configuration, retry failed work, or adapt to changes that occur outside the system.
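The idea of planning from observed state can be illustrated with a small diff-based sketch. This is a deliberate simplification under stated assumptions, not the Control Plane's planner: it treats an instance as a (node name, host ID) pair taken from the spec's `nodes` and `host_ids` fields, and the `plan` function and operation names are hypothetical. A real planner would also have to order operations and account for primaries, replication, and configuration changes.

```python
from typing import List, Set, Tuple

Instance = Tuple[str, str]  # (node name, host ID)
Operation = Tuple[str, Instance]


def desired_instances(spec: dict) -> Set[Instance]:
    """Expand a spec into the set of instances it asks for."""
    return {(node["name"], host) for node in spec["nodes"] for host in node["host_ids"]}


def plan(spec: dict, current: Set[Instance]) -> List[Operation]:
    """Compare desired and observed instances and list the work that remains."""
    desired = desired_instances(spec)
    creates = [("create", inst) for inst in sorted(desired - current)]
    removes = [("remove", inst) for inst in sorted(current - desired)]
    return creates + removes


if __name__ == "__main__":
    spec = {"nodes": [{"name": "n1", "host_ids": ["us-east-1a", "us-east-1b"]}]}
    observed = {("n1", "us-east-1a")}
    print(plan(spec, observed))                 # one create for us-east-1b
    print(plan(spec, desired_instances(spec)))  # [] because nothing is left to do
```

Because the plan is derived from what currently exists rather than from a record of past requests, reapplying the same spec once the system has converged produces an empty plan.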
Backup and Restore with pgBackRest
The Control Plane integrates deeply with pgBackRest to provide scheduled full, differential, and incremental backups, as well as continuous WAL archiving. When you perform a point-in-time recovery through the Control Plane API, recovery benefits from the same durable, idempotent architecture as all other operations. You can also create a new database from a backup, making it easy to clone databases for testing or development.
Conclusion
The pgEdge Control Plane represents a new approach to Postgres operations—one built on declarative configuration, durable distributed systems concepts, and seamless integration with the best tools in the ecosystem.
Its goal is simple: make high-availability Postgres easy, safe, and scalable.
We’re excited to share more, including upcoming features, detailed architecture documents, and guides for deploying your first Control Plane cluster.
If you'd like to try it out, explore the docs, or join early testing, we'd love to hear from you.



