What is Multi-Master (Active-Active) Distributed Postgres, and Do You Need It?
New approaches bolster effectiveness of multi-master database architecture
As a Postgres developer, you know the importance of having a robust, scalable, and highly available database solution that can handle a large volume of read/write traffic. That's where multi-master distributed Postgres comes into play. However, until recently some users were concerned about overall scalability across multiple locations, instability caused by conflicts, and the cost of non-standard solutions.
At its core, multi-master distributed Postgres lets you run multiple master databases spread across different locations (multiple nodes), each capable of handling read and write traffic simultaneously, which improves the performance and availability of your applications. It uses bi-directional replication and conflict resolution to keep data consistent across nodes (that is, eventually consistent) and to improve data access times for your applications.
But what does that all mean, and how does it work?
Traditional bi-directional replication
One of the key features of multi-master (active-active) distributed Postgres is bi-directional replication. Any change made on one master database is replicated to all other masters in near real time, which keeps the masters in sync with each other. Because every master accepts writes, bi-directional replication is paired with mechanisms that avoid or resolve conflicts so that your data remains consistent.
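To make the idea concrete, here is a minimal in-memory sketch of bi-directional replication between two nodes. It is an illustration only, not how Postgres or any extension implements replication: the `Node`, `Change`, and `replicate` names are hypothetical, and the sketch assumes that each change carries the name of the node it originated on so that it is never sent back in a loop.

```python
from dataclasses import dataclass, field


@dataclass
class Change:
    origin: str   # name of the node where the write was first made
    key: str
    value: str


@dataclass
class Node:
    name: str
    rows: dict = field(default_factory=dict)
    outbox: list = field(default_factory=list)  # local changes not yet shipped

    def write(self, key, value):
        """Accept a local write and queue it for replication."""
        self.rows[key] = value
        self.outbox.append(Change(self.name, key, value))

    def apply(self, change):
        """Apply a change received from a peer.

        Replicated changes are not re-queued, so a change never loops
        back to the node it originated from.
        """
        if change.origin != self.name:
            self.rows[change.key] = change.value


def replicate(nodes):
    """Ship every queued change to every other node, then clear the queues."""
    for source in nodes:
        for change in source.outbox:
            for target in nodes:
                if target is not source:
                    target.apply(change)
        source.outbox.clear()


east, west = Node("east"), Node("west")
east.write("user:1", "Ada")      # write accepted on one node
west.write("user:2", "Grace")    # write accepted on the other node
replicate([east, west])
print(east.rows == west.rows)    # True
```

Both nodes accept writes, and after one replication pass they hold the same rows. The sketch deliberately writes different keys on each node; what happens when both nodes change the same row is the subject of conflict resolution.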
Traditional conflict resolution and avoidance
What happens when two or more users try to update the same row or record simultaneously on different masters? That's where built-in conflict resolution is important. Multi-master (active-active) distributed Postgres provides a conflict resolution mechanism that ensures that all masters end up with the same data. With timestamp-based conflict resolution, every master keeps the same, most recent version of a conflicting row, so the database converges to a consistent state.
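A timestamp-based rule can be sketched in a few lines. The example below is a simplified model, not the code of any particular extension: the `Version` type, the `resolve` function, and the use of the origin node name as a tie-breaker are all assumptions made for illustration. The point it shows is that two nodes that see the same pair of writes in opposite roles still pick the same winner.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Version:
    value: str
    committed_at: float  # commit timestamp of the write, in seconds
    origin: str          # node that made the write; used only as a tie-breaker


def resolve(local, incoming, conflict_log):
    """Return the version to keep, recording the decision in conflict_log."""
    # Compare (timestamp, origin) so that every node picks the same winner
    # even if two writes carry identical timestamps.
    winner = max(local, incoming, key=lambda v: (v.committed_at, v.origin))
    superseded = incoming if winner is local else local
    conflict_log.append({"kept": winner, "superseded": superseded})
    return winner


log_a, log_b = [], []
from_a = Version("alice@old.example", 100.0, "node_a")
from_b = Version("alice@new.example", 100.5, "node_b")

# Node A treats its own write as local; node B sees the reverse.
on_a = resolve(from_a, from_b, log_a)
on_b = resolve(from_b, from_a, log_b)
print(on_a == on_b, on_a.value)  # True alice@new.example
```

Each node also appends the decision to its own log, which mirrors the idea of recording conflict resolutions so they can be reviewed later.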
How pgEdge delivers multi-master distributed Postgres across locations
pgEdge addresses previous concerns by providing a flexible and scalable high-availability solution for PostgreSQL databases by synchronizing multi-master bi-directional replication across multiple databases and locations. This is done through pgEdge's Spock extension that allows you to create a multi-master (active-active) replication solution for your PostgreSQL database.
Spock uniquely provides a true multi-master (active-active) replication architecture that lets your users run read and write transactions on any master node simultaneously. The extension captures all conflict resolution events and stores them in a Postgres table, allowing you to see all conflict resolutions in a central location. To learn more about Spock, read the blog.
pgEdge conflict avoidance with conflict free delta-apply
Logical multi-master replication can encounter conflicts when maintaining a running sum (such as a YTD balance). Suppose that a bank account contains a balance of $1,000, and two withdrawals for the account arrive on different nodes: transaction A withdraws $1,000, and transaction B also withdraws $1,000. The two transactions conflict, because each node updates the same balance without seeing the other's change. If both transactions are honored, the resulting account balance will be -$1,000. pgEdge’s delta-apply algorithm resolves this kind of collision and avoids conflicts in workloads that follow the same pattern, such as the Transaction Processing Performance Council Benchmark C (TPC-C), so that transactions are processed correctly and quickly.
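The bank account example can be worked through in code. The sketch below assumes that delta-apply means adding the difference between the new and old column values to whatever the receiving node currently holds, rather than overwriting it; that reading, along with the `BalanceUpdate`, `apply_overwrite`, and `apply_delta` names, is an assumption for illustration and not pgEdge's actual implementation.

```python
from dataclasses import dataclass


@dataclass
class BalanceUpdate:
    old_value: int  # balance the origin node read before its update
    new_value: int  # balance the origin node wrote


def apply_overwrite(current, update):
    """Plain overwrite: the incoming value replaces the local one."""
    return update.new_value


def apply_delta(current, update):
    """Delta-apply: add the change (new - old) to the local value."""
    return current + (update.new_value - update.old_value)


start = 1000
withdrawal_a = BalanceUpdate(old_value=start, new_value=start - 1000)  # made on node A
withdrawal_b = BalanceUpdate(old_value=start, new_value=start - 1000)  # made on node B

# Each node has already applied its own withdrawal and now receives the other's.
node_a_overwrite = apply_overwrite(withdrawal_a.new_value, withdrawal_b)
node_a_delta = apply_delta(withdrawal_a.new_value, withdrawal_b)
node_b_delta = apply_delta(withdrawal_b.new_value, withdrawal_a)

print(node_a_overwrite)            # 0: one withdrawal is silently lost
print(node_a_delta, node_b_delta)  # -1000 -1000: both withdrawals are reflected
```

Because addition is commutative, the order in which the two deltas arrive does not matter, and both nodes converge on the same balance without having to choose a winner.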
pgEdge distributed server model
The pgEdge platform provides asynchronous multi-master (active-active) replication with conflict resolution that allows your users access to up-to-date data hosted on multiple servers in multiple regions safely and efficiently. pgEdge improves database performance and cost by allowing you to host information on servers that are geographically closer to the user, while maintaining data consistency across a distributed database.
If required, personally identifiable information (PII) can be restricted or managed according to the rules of the country in which the data resides to conform to General Data Protection Regulation (GDPR) or Health Insurance Portability and Accountability Act (HIPAA) requirements. This capability can be used in conjunction with granular replication sets (see below) to address data residency requirements.
pgEdge enables blue-green upgrades with little to no downtime
You can also use the pgEdge platform to perform a Blue-Green upgrade between major versions of Postgres with little to no downtime. During a Blue-Green upgrade, the 'blue' nodes of the database remain available to your users, while you perform an in-place server upgrade of the 'green' node. When the green node is ready, database traffic transitions from the blue nodes to the green node, leaving the blue nodes ready for upgrade at your convenience.
pgEdge granular replication sets meet data residency requirements
The pgEdge platform also allows you to refine the replication set to include only the data that you need to replicate. This is critical in addressing data residency needs such as international regulations that require data created by citizens of a specific region or country to be stored within that respective region or country.
You can replicate global data globally while keeping the required local data local. Doing so can also save considerable time and money that would otherwise be spent storing extra information. Instead of replicating an entire database, you can selectively replicate individual columns, rows, or entire tables on either the publisher or subscriber node. pgEdge also supports robust and granular replication for partitioned tables.
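As a rough model of how table, row, and column filtering combine, consider the sketch below. It is a plain Python illustration, not the pgEdge or Spock API: `TableRule`, `filter_change`, the table names, and the residency rule that keeps German customers local are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class TableRule:
    columns: Optional[list] = None          # None means replicate all columns
    row_filter: Optional[Callable] = None   # None means replicate all rows


def filter_change(replication_set, table, row):
    """Return the row as it should be replicated, or None to keep it local."""
    rule = replication_set.get(table)
    if rule is None:
        return None  # table is not part of the replication set
    if rule.row_filter is not None and not rule.row_filter(row):
        return None  # row is excluded by the row filter
    if rule.columns is None:
        return dict(row)
    return {col: row[col] for col in rule.columns}


global_set = {
    "products": TableRule(),  # the whole table is replicated
    "customers": TableRule(
        columns=["id", "country"],                      # omit name and email
        row_filter=lambda row: row["country"] != "DE",  # hypothetical residency rule
    ),
}

us_row = {"id": 1, "name": "Ana", "email": "ana@example.com", "country": "US"}
de_row = {"id": 2, "name": "Jan", "email": "jan@example.com", "country": "DE"}

print(filter_change(global_set, "customers", us_row))       # {'id': 1, 'country': 'US'}
print(filter_change(global_set, "customers", de_row))       # None
print(filter_change(global_set, "audit_log", {"id": 9}))    # None
```

In this model, a row leaves its node only if its table is in the set and it passes the row filter, and only the listed columns travel with it.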
In summary, multi-master (active-active) distributed Postgres lets you scale your database horizontally across multiple locations while ensuring high availability and data consistency. With the pgEdge platform (which includes the Spock extension), you can achieve true multi-master (active-active) replication and capture all conflict resolution events, providing you with a robust and scalable database replication solution. With pgEdge, you can achieve faster read and write performance, improved scalability, and better data consistency.