In our previous blog we shared how the combination of a powerful distributed database like pgEdge that leverages Equinix Metal's regionalized infrastructure to bring data closer to your users, can result in lightning-fast application responsiveness regardless of where users are located.  This blog outlines how to deploy the pgEdge distributed PostgreSQL in an Equinix environment.  

A typical pgEdge replication cluster consists of a combination of Read/Write and Read-only nodes.  With the Read/Write nodes created as Multi-master replication nodes, transactions are stored quickly and replicated efficiently, while the Read-only nodes handle queries from your application.  Transactions are automatically replicated to the Read-only nodes after being committed on the Read/Write nodes, ensuring that queries have the most current data.

Picture-optimized-1 You can use the Equinix console to quickly deploy on-demand Bare Metal Servers for your clusters.  For this post, we'll be using two servers; one deployed in New York , and one deployed in the Dallas Metro.  From the console, select New Server, and then select a deployment region and server type.Picture-optimized-2Then, on the Operating System dialog, select Rocky 9.

Picture-optimized-3

Once deployed, your servers are displayed in the Equinix console's Manage Servers list:

New Equinix server screenshot with Dallas and NYIn our example, our IP addresses are:

  • 145.40.80.77 

  • 145.40.64.253

Before installing pgEdge Platform, you need to ensure that security rules allow communication between the two nodes in our cluster.

Installing pgEdge Platform on an Equinix Server

You can deploy pgEdge Platform on an Equinix server as a robust production system or as a test environment for your application.  You can also explore pgEdge test environments with a point-and-click pgEdge Cloud deployment.

Creating a Replication User

During cluster creation, pgEdge Platform creates a PostgreSQL replication user with the same identity as the OS user that invokes the installer.  That user has a special set of permissions that allow that user to invoke pgEdge commands to create and manage replication clusters. For this post, we're going to create a non-root, administrative OS role on each node of the cluster, and use that role when creating the cluster.

To create a user on an Equinix host, ssh in to the server, and invoke the commands:

$ adduser rockyuser
$ passwd rockyuser
$ sudo usermod -aG wheel rockyuser

After creating the user and adding the user to the wheel group, use your choice of editor to update the sudoers file, adding the user to allow passwordless sudo access to the new user.  Locate the section of the file that grants passwordless sudo access, and add an entry for rockyuser:

## Read drop-in files from /etc/sudoers.d (the # here does not mean a comment)
#includedir /etc/sudoers.d
rocky   	ALL=(ALL)		NOPASSWD: ALL
rockyuser	ALL=(ALL)		NOPASSWD: ALL

We also need our user to have passwordless ssh access.  Use the following steps to generate a key file and configure passwordless ssh:

ssh-keygen -t rsa
cd ~/.ssh
cat id_rsa.pub >> authorized_keys
chmod 700 ~/.ssh && chmod 600 ~/.ssh/authorized_keys

Installing pgEdge Platform

You'll need to install pgEdge Platform on each node in your Equinix deployment.  Connect to each Equinix host, and assume the identity of the OS user we just created: rockyuser.  Then, use python to download and run the pgEdge Platform installer:

$ su rockyuser
$ python3 -c "$(curl -fsSL https://pgedge-download.s3.amazonaws.com/REPO/install.py)

The installer will create a directory named pgedge; navigate into the pgedge directory:

$ cd pgedge

After installing pgEdge Platform, you'll use the pgEdge Command Line Interface (CLI) setup command to create your replication cluster on each Equinix host:

$ ./pgedge setup -U admin -P 1safe_password -d acctg

The command shown above installs PostgreSQL 16 with the Spock and Snowflake extensions, and creates a database superuser named admin with an associated password of 1safe_password.  It also creates a database named acctg.

You can use the database superuser and password to connect with the psql client, pgbench, or other PostgreSQL tools and utilities.  To simplify accessing the database tools, you can source the PostgreSQL environment variable file on each node of the cluster with the command:

$ source pg16/pg16.env

Then, to start psql, you can use the command:

$ psql acctg

After connecting, you can use psql commands and meta-commands to review your database; for example, if you invoke the \dt meta-command, you'll find that there are no tables currently defined in the acctg database:

Equinix pgEdge install blog example1

After installing pgEdge Platform and the PostgreSQL database, you're ready to start configuring your replication cluster.  The first step is to create the replication nodes.

Creating Replication Nodes

After installing pgEdge Platform and running setup, we'll use spock node-create to create a replication node on each host. A replication node is a named collection of databases, tables, and other artifacts that are replicated via a pgEdge subscription.

When creating the node, provide three arguments:

● The node name.

● The node address.

● The database name.

For our example, we'll use the CLI on node 1 and the following command to create n1 :

$ ./pgedge spock node-create n1 'host=145.40.80.77 user=admin password=1safe_password dbname=acctg' acctg

Equinix pgEdge install blog example2

Then, we'll use the CLI on node 2 and the following command to create n2:

$ ./pgedge spock node-create n2 'host=145.40.64.253 user=admin password=1safe_password dbname=acctg' acctg

Equinix pgEdge install blog example3
Creating Subscriptions

Next, we'll use the spock sub-create command to create the subscriptions that connect the nodes to each other. Since this is a multi-master replication system, each node acts as both a subscriber and a publisher node.

We'll provide three arguments:

  • A unique subscription name.

  • A single-quoted connection string that specifies the IP address/connection properties of the node you're subscribing to; the connection is verified when you invoke the command.

  • The database name.

Since we have a two node cluster, we'll create two subscriptions; one that subscribes node 1 to node 2 and one that subscribes node 2 to node 1. On node 1 (n1) we'll use the following command to subscribe to Node 2 (n2):

$ ./pgedge spock sub-create sub_n1n2 'host=145.40.64.253 port=5432 user=admin password=1safe_password dbname=acctg' acctg

On node 2 (n2) we'll use the following command to create a subscription to Node 1 (n1):

$ ./pgedge spock sub-create sub_n2n1 'host=145.40.80.77 port=5432 user=admin password=1safe_password dbname=acctg' acctg

Adding Some Sample Data

The pgbench utility is a benchmarking tool distributed with PostgreSQL; pgbench provides an easy way to add sample tables to your cluster.  On each node, we'll use the following pgbench command to set up a very simple four-table database:

$ pgbench -i acctg

Then, connect to each node with the psql client:

$ psql acctg

Alter the numeric columns, setting LOG_OLD_VALUE equal to true.  This will make these numeric fields conflict-free delta-apply columns, ensuring that the value replicated is the delta of the committed changes (the old value plus or minus any new value) for a given record:

acctg=# ALTER TABLE pgbench_accounts ALTER COLUMN abalance SET (LOG_OLD_VALUE=true);
acctg=# ALTER TABLE pgbench_branches ALTER COLUMN bbalance SET (LOG_OLD_VALUE=true);
acctg=# ALTER TABLE pgbench_tellers ALTER COLUMN tbalance SET (LOG_OLD_VALUE=true);

Then, exit psql:

acctg=# exit

Adding Tables to the Replication Set

The last step is adding the tables to the replication set; this will start replication between the nodes. On the OS command line for each node, use the spock repset-add-table command to add the tables to the system-created replication set (default):

$ ./pgedge spock repset-add-table default 'pgbench_*' acctg

You'll notice that the fourth table, pgbench_history, is excluded from the replication set because it does not have a primary key.

Checking the Configuration and Testing Replication

You can check the configuration on the psql command line after connecting with the command:

$ psql acctg

Querying spock.node from n1 returns a list of the nodes in the cluster:

acctg=# SELECT * FROM spock.node;
node_id | node_name
---------+----------
673694252 | n1
560818415 | n2
(2 rows)

Querying spock.subscription from n1 returns information about the subscription from n1 to n2:

acctg=# SELECT sub_id, sub_name, sub_slot_name, sub_replication_sets  FROM spock.subscription;
  sub_id   | sub_name |   sub_slot_name  |   sub_replication_sets            
-----------+----------+------------------+------------------------------------
3293941396 | sub_n1n2 | spk_db_name_n2_sub_n1n2 | {default,default_insert_only,ddl_sql}
(1 row)

You can exercise replication either at the psql command line or with pgbench.  For example, If you use psql to update a row on n1, you should see the update to the same row on n2.  Invoke the following command on n1 to add a value (test) to the filler column of the pgbench_tellers column:

acctg=# UPDATE pgbench_tellers SET filler = 'test' WHERE tid = 1;
UPDATE 1

When you query the table on n2, the value will be displayed in the filler column:

acctg=# SELECT * FROM pgbench_tellers WHERE tid = 1;
tid | bid | tbalance | filler   
----+-----+----------+------------------------------------
  1 |   1 |      0   | test                              
(1 row)

You can also use pgbench to exercise replication; exit psql, and run the following command on both nodes at the same time to exercise pgbench for one minute.

$ pgbench -R 100 -T 60 -n db_name

When you connect with psql and check the sum of the tbalance column on both nodes, you'll see that the sum of the columns match. Without the conflict-free delta-apply columns, each conflict would have resulted in accepting the first in, potentially leading to sums that do not match between nodes.

acctg=# SELECT SUM(tbalance) FROM pgbench_tellers;
 sum  |
------+
-84803
(1 row)

Conclusion

The combination of pgEdge and Equinix Metal offers a brilliant solution for those seeking to maximize their application and database performance while providing ultra high availability. With your data at the edge and support from a robust network infrastructure, your potential for innovation and growth is infinite. Watch this live demo video to learn more.

Explore pgEdge Cloud Developer Edition or engage with our solution architecture team today. For detailed information about installing self-hosted pgEdge Platform, see the documentation at: https://docs.pgedge.com/platform/installing_pgedge/manual.