A Brief History of Logical Replication in Postgres — and Looking Ahead at its Likely Future Evolution
A brief history of logical replication in Postgres — and looking ahead at its likely future evolution
This blog is divided into two parts. In this section, we walk through how the logical replication feature has evolved over the years, what the recent improvements for Postgres logical replication are, and how the feature will likely change in the future. The second blog of the series will discuss the multi-master (active-active), multi-region, and highly available PostgreSQL cluster created by pgEdge that is built on top of logical replication and pglogical.
Postgres replication is the process of copying data between systems. PostgreSQL supports two main methods of replication: logical replication and physical replication. Physical replication copies the data exactly as it appears on the disk to each node in the cluster. Physical replication requires all nodes to use the same major version to accommodate on-disk changes between the major versions of PostgreSQL.
Logical replication on the other hand is the method of replicating data based on data changes. The building blocks of the logical replication feature were introduced in PostgreSQL 9.4, however the feature was completed in PostgreSQL 10. Logical replication provides fine grained control over the replication set via a publisher/subscriber model where multiple subscribers can subscribe to one or more publishers. Logical replication uses logical decoding plugins that format the data so it can be interpreted by other systems. This makes replication possible among heterogeneous systems and across major PostgreSQL releases; this means it requires zero downtime for major version upgrades. Logical replication also provides fine grained control over the replication set so you can decide whether to replicate an entire table, only certain columns from a table, or all of the tables within a schema.
Postgres logical replication evolution in Chronological order
As mentioned above, the community began developing the underlying technology that made logical replication possible in PostgreSQL 9.4. These features are the core building blocks for the logical replication feature.
This section describes the main features for logical replication that were added in each release. To review a complete list of logical replication features for each release, please refer to the
Replication and Recovery section of each version of the release notes.
This blog provides some context to the life cycle involved when building a major feature for PostgreSQL, and allows you to see how a feature matures over time. The basic logical replication feature was committed to PostgreSQL 10 however it required important patches in subsequent releases to make the feature performance feasible and usable. Logical replication is not finished yet; please read my thoughts in the final section on what might be on the roadmap for replication in the next set of releases.
PostgreSQL 9.4 - 2014
Logical Decoding - The basic idea behind logical decoding is to stream database changes out in a format that is understandable by other systems. Logical decoding is implemented by decoding the contents of WAL logs and streaming the changes in a customizable format. This decoding enables the logical replication to replicate changes to other heterogeneous systems.
Replica Identity - Replica identity is a new table level parameter added to PostgreSQL 9.4 that can be used to control the information that is written to the WAL files. It can be used to identify if the tuple data that is being added or deleted. The replica identity is set to
default, and can only be changed with an
ALTER TABLEstatement. The other options are
FULL(i.e write maximum data, which is an expensive option) or
NOTHING(which means write nothing).
Replication Slot - In the context of logical replication, the replication slot represents the persistent stream of changes that can be replayed on a client in the order they were made on the original server. We can have multiple replication slots for a single database; each slot has its own state allowing different consumers to receive changes from different points in the database change stream. You can configure multiple receivers for a replication slot, since the slots are unaware of the state of the receiver. Multiple receivers can be streaming data from a single slot at different points in time, receiving the changes after the last receiver stopped consuming them. Only one receiver may consume changes from a slot at any given time.
Logical replication uses a publisher/subscriber model where multiple subscribers can receive data from a single publisher. The functionality of replication slots comes into play when providing for this functionality.
PostgreSQL 9.5 - 2016 Jan
Track commit timestamp -
track_commit_timestampis a server level parameter that is configured in the
postgresql.conffile. It can be set on any master node that sends replication data to one or more stand-by servers. The role and meaning of the parameter doesn’t change if the stand-by node becomes the master (the server that is sending the replication data).
This is a boolean parameter that is set to
offby default. The purpose of this parameter is to record the commit time of the transactions.
Track Replication Progress - Replication origins are added in PostgreSQL 9.5 to allow implementation of a logical replication solution on top of logical decoding. Recording replication origin helps solve two problems with logical replication:
How to safely keep track of replication progress.
How to change replication behavior based on the origin of the row. This is particularly crucial for bi-directional replication to avoid getting stuck in a loop. PostgreSQL 9.6 - 2016 Sept
PostgreSQL 9.6 - 2016 Sep
Generic messages for logical decoding - This feature allows an extension to insert data into WAL streams that can be read by the logical decoding plugin. These messages are either transactional (decoded on commit) or non-transactional (decoded immediately). For a standard WAL reply, these messages are NOOP, created with arbitrary data (user-specified). The messages are decoded in a logical decoding plugin with special callbacks of output plugins called.
Three main use cases for this feature are:
Reliable communication between nodes in a multi-node replication setup.
Out of order messaging in a logical replication scenario (allows sending a message immediately to a node).
Support for queue tables. This is kind-of the opposite of unlogged tables. A queue creates tables with the data itself, and all inserts go into the WAL without having to store the data.
PostgreSQL 10 - 2017
Logical replication - The logical replication feature was added to PostgreSQL in version 10. Logical replication is a method of replicating data objects and changes to those objects based on replication identity. The logical replication feature provides security and fine grained access control over a replication set. The term logical replication is used in contrast to physical replication which performs replication using exact block addresses and byte-by-byte replication.Logical replication uses a published/subscriber model where multiple subscribers can subscribe to a publisher. Subscribers pull data from the publications they are subscribed to, and may do cascading replication by re-publishing data to other subscribers.
Logical replication uses a published/subscriber model where multiple subscribers can subscribe to a publisher. Subscribers pull data from the publications they are subscribed to, and may do cascading replication by re-publishing data to other subscribers.
Logical replication provides control over the replication set so you can decide if you want to replicate a particular table or columns from a table or all the tables in a schema.
PostgreSQL 11 - 2018
Truncate support - Replicating
TRUNCATEstatements to subscribers with logical replication. Previously if an application issues a truncate statement, it wasn’t replicated to the subscriber nodes. With this feature in version 11, truncate functionality is replicated with logical replication.
Prepared transaction information - Passing prepared transaction information to logical replication subscribers.
Efficient advancement of replication slots - This feature allows replication slots to be advanced programmatically instead of being consumed by the subscribers. This is particularly useful in efficient advancement of replication slots when the contents don’t need to be consumed by the subscribers, this action is performed by the
Reduce memory usage
PostgreSQL 12 - 2019
Copying replication slots - The features allow the replication slots to be copied using
pg_copy_logical_replication_slot()functions. The logical slot starts from the same LSN as the source logical slot.
PostgreSQL 13 - 2020
logical_decoding_work_mem - PostgreSQL 13 adds the
logical_decoding_work_memparameter to specify the amount of memory allocated to the WAL sender for saving changes in memory before spilling it to the disk. You can increase the parameter value to keep more changes in memory and reduce disk writes, or decrease the value to reduce the memory usage of WAL sender. The default value of this parameter is 64 MB, and it doesn’t require a database server restart, but the configuration file needs to be reloaded if it is modified. Each subscription spawns a WAL sender process on the publisher node to process the changes from publisher to subscriber. The size of the WAL sender process determines the amount of changes to keep in memory before spilling it to the disk. The logical_decoding_work_mem parameter will control this memory usage for logical replication.
Partitioned table support - Prior to this feature, the partitions on a partitioned table needed to be replicated individually. PostgreSQL 13 allows you to logically replicate partitioned tables; you can publish the partition table explicitly and all its partitions will be automatically replicated. The addition or removal of partitions from the table likewise need to be added or removed from the publication.
publish_via_partition_rootoption (used when creating a publication) controls whether the changes to a partition contained in the publication will be published using the identity and schema of the parent table, rather than that of the partitions that are actually changed.
PostgreSQL 13 also supports logical replication of partitioned tables on the subscribers; previously this was only supported for non-partitioned tables.
max_slot_wal_keep_size (integer) - The
max_slot_wal_keep_sizeparameter can be set in the
postgresql.conffile or on the command line. It determines the number of WAL files required by
replication_slotsto be kept in the
pg_logdirectory; any replication slots exceeding the specified value are marked invalid.
PostgreSQL 14 - 2021
Streaming large in-progress transactions - This feature in version 14 allows streaming large in-progress transactions to the subscribers. Previously, all large in-progress transactions exceeding the
logical_decoding_work_memvalue would be written to disk until the transaction is completed. This improves the performance of logical replication for large transactions.
Data transfer in binary mode - PostgreSQL 14 provides the ability to create subscriptions with binary transfer mode instead of text mode (the default). The binary transfer mode is faster than text.
Process large DDL efficiently - Allow logical decoding to more efficiently process cache invalidation messages. This improves logical replication efficiency when processing large amounts of DDL.
Support two phase commit with logical decoding - The logical decoding API is enhanced to support two phase commits. Two phase commit is controlled by the
pg_create_logical_replication_slot()function. The optional parameter,
twophase, when set to
true, specifies that the decoding of prepared transactions is enabled for this slot.
PostgreSQL 15 - 2022
Publish all tables in schema - Version 15 supports the syntax for including all tables in a schema for publication:
CREATE PUBLICATION pub1 FOR TABLES IN SCHEMA foo. This command includes all the tables in schema
fooin the publication; any table added to the schema at a later time is automatically included in the publication.
Row filtering - This feature allows publication content to be filtered using a
WHEREclause. Any rows that don't satisfy the
WHEREclause are omitted from the publication.
Column filtering - This feature allows publication content to be filtered for specific columns. Only the specified columns are included in the publication.
Support for prepared transactions - Logical replication of prepared transactions is supported in the version15 release. Also, the new
create_replication_slotoption supports a two-phase option for slot creation.
SKIP transaction - Ability to skip transactions on the subscriber is supported with the
ALTER SUBSCRIPTION… SKIPcommand. This command skips conflicting transactions on the subscriber; you can specify the last LSN by using the
skip_optionto indicate the failed transaction.
Disable replication on error - The
disable_on_erroroption is supported while creating a subscription. This option allows replication to be stopped when an error is raised at the subscriber. This prevents infinite loops if an error is caused by the replicating transaction.
PostgreSQL 16 - 2023
Filtering based on publication Origin - This feature in PostgreSQL 16 supports the
origin = NONEclause while creating a subscription; the other value supported for
NONEmeans that the subscriber is only requesting replication changes that don’t have a origin. Setting
ANYmeans send changes regardless of the origin. This prevents loops in bi-directional replication.
Allow logical decoding from stand-by - Allowing logical decoding from stand-by means that subscribers can subscribe from the stand-by, reducing the load on the primary server. This wasn’t possible prior to PostgreSQL 16. It requires that
wal_levelbe set to
logicalon both the primary and stand-by server.
Parallel Apply - The parallel apply feature is a significant performance improvement for logical replication in PostgreSQL 16. It supports large in-progress transactions by allowing multiple parallel workers to be used on the subscriber to apply changes. The user can specify the parallel streaming option while creating the subscriber. The
max_parallel_apply_workers_per_subscriptionparameter controls the maximum number of parallel workers per subscription.
Support binary mode for COPY - Prior to PostgreSQL 16,
textmode was only supported for the initial table copy, and
binarytransfer mode was only supported for the replication of changes. With PostgreSQL 16, you can set
binary=truewhile creating the subscription to perform the initial data copy in binary mode (much faster then text mode). This option is only supported if both publisher and subscriber are on PostgreSQL 16.
Logical Replication - Looking ahead
The building blocks for logical replication were added in PostgreSQL 9.4, but the logical replication feature was added in PostgreSQL 10. Since that release, there have been a number of important improvements to logical replication. The last two major releases of PostgreSQL have contributed to the performance and usability of logical replication with parallel application on the subscriber, allowing binary mode initial copy, supporting row/column based filtering, and more.
Looking ahead at PostgreSQL 17 (and beyond) for logical replication, there is definitely a requirement for more performance improvement by increasing the replication rate and reducing the replication lag. I believe this can be achieved with parallelism support and worker optimization. There is also a need for better integration of logical replication with external tools for high availability and upgrades. The possibility of active-active (multi-master) replication is also approachable as part of the PostgreSQL core, but it is missing major features like conflict detection and resolution.
Some of the missing but important features are provided by pgEdge's Spock extension. pgEdge provides a fully distributed PostgreSQL cluster that supports active-active replication with low latency, high availability, and data residency. Multi-master replication and the pgEdge clustering solution will be discussed in the next post of this series.