In this post, we’ll investigate using Cloudflare Workers and pgEdge distributed PostgreSQL to build a web application with data access times kept under 100 ms for users anywhere in the United States and Europe. Building web applications to be fast matters, and it generally doesn’t happen by accident. Let’s see if we can make it happen.

We’ll use a free distributed PostgreSQL database from pgEdge, a Cloudflare account with the Workers Paid plan, and Cloudflare Hyperdrive. We’ll introduce each of these briefly as we go, so don’t worry if they’re totally new to you.

This is written so that you can follow along and create your own database and Cloudflare Worker to experiment with. Give it a try! You can also reference our companion open source repository on GitHub: https://github.com/pgEdge/cloudflare-pgedge-latency-tests

How much do request latencies matter? 

Slow interactions and load times in web applications translate to a poor user experience and lost business. Pingdom found that a page’s load time directly impacts its bounce rate. In their tests, when page load times increased from 2 seconds to 5 seconds, bounce rates increased from 6% to 38%.

While many factors contribute to website performance, any site that shows dynamic data needs to query that data, and with pgEdge, we can help make sure those queries are as fast as possible. Even a savings of 100–200 ms of latency on an average request can make a difference on a page that may trigger dozens of requests, some of them dependent on prior ones.

What makes this a challenge?

Traditionally, and still in many cases today, application data is stored in one geographic region. Within that region, the data may be replicated and available in multiple availability zones for redundancy.

That means that for any application with a globally distributed user base, users located far from the application servers and the database will commonly experience significantly slower load times than users nearby. And because the engineers building the application are often close to where the database is deployed, they aren’t likely to notice location-dependent performance problems on their own.

A single request from the user’s web browser to the application, and then on to the database behind it, will commonly involve multiple network round trips: the TCP handshake, the TLS negotiation, and PostgreSQL authentication each require at least one before the query itself is even sent. As a reference point, each round trip across the US incurs around a 70 ms penalty. So if our goal is < 100 ms data access times for all our users, this may be quite a challenge!
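To make that concrete, here’s a rough back-of-the-envelope tally for a cold connection, assuming roughly 70 ms per cross-country round trip:

  • TCP handshake: 1 round trip, ~70 ms

  • TLS negotiation: 1–2 round trips, ~70–140 ms

  • PostgreSQL startup and authentication: at least 1 round trip, ~70 ms

  • The query itself: 1 round trip, ~70 ms

That adds up to roughly 280–350 ms before the database has done any real work, which is why both connection reuse and proximity matter so much in what follows.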

Our only hope is to have copies of both application servers and the data deployed close to our users, keeping all round trips as short as possible. Of course, we have to do this without the solution being incredibly complicated or costly. Ideally, the solution would be compatible with many existing applications.

It has been tough to find a database that fits the bill without having other significant shortcomings. While there are a growing number of distributed databases available, pgEdge is the only distributed PostgreSQL database that is pure PostgreSQL, is open source on GitHub, and supports multi-master (multi-active) replication across the database cluster.

Distributed PostgreSQL, Connection Pooling, and Compute

Here are the key components of our test application and database:

  • 3-node pgEdge PostgreSQL cluster: Our data will be stored on 3 nodes, one on the US east coast, one on the US west coast, and one in Germany. Using pgEdge replication, we can easily keep all tables in sync, even while allowing writes on all 3 nodes.

  • pgEdge latency-based DNS routing: With this feature, each database has a unique domain name, and every connection is automatically routed to the node nearest the client. This happens transparently, with no client-side configuration.

  • Cloudflare Workers: we’ll leverage Workers’ built-in capability to handle requests at a data center close to the end-user. According to Cloudflare, 95% of the world’s population is within 50 ms of one of their data centers.

  • Cloudflare Hyperdrive (in beta): transparently pools PostgreSQL connections at a regional level. By reusing a pool of warm TCP connections across multiple requests, we eliminate connection-setup latency from most requests.

  • The node-postgres library (`npm install pg`): we’ll use the standard JavaScript library to connect to and query the database. Because pgEdge is pure Postgres, no custom clients are needed, and any PostgreSQL tooling you already have is compatible with pgEdge, as the short sketch after this list shows.
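To illustrate that last point, here’s a minimal sketch that queries a pgEdge database from plain Node.js using node-postgres. The connection string is a placeholder; you’d substitute the one from your pgEdge dashboard:

import { Client } from 'pg';

// Placeholder connection string; replace with the one from your pgEdge dashboard.
const connectionString = 'postgres://app:password@your-db.pgedge.io/your-db';

async function main() {
  const client = new Client({ connectionString });
  await client.connect();
  // Any standard PostgreSQL query works, since pgEdge is pure Postgres.
  const result = await client.query('SELECT version()');
  console.log(result.rows[0]);
  await client.end();
}

main().catch(console.error);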

Create a Cloudflare Worker 

Here’s how to quickly create a Cloudflare Worker for testing. You can sign up for a free Cloudflare account, or spring for the modest $5/month Workers paid plan if you want to use Hyperdrive.

Next, make sure you have Node.js and npm installed on your laptop. You’ll need npm and npx to develop and deploy Cloudflare Workers; npx ships with npm 5.2 and later, so you can confirm both are available with:

npm --version

npx --version

Use npx to run the Cloudflare Wrangler CLI to log in to your Cloudflare account:

npx wrangler login

In a directory where you keep your projects, run this command to generate a worker project: 

npx wrangler init pgedge-worker

Follow the prompts, choosing the "Hello World" Worker with TypeScript enabled, and answer “Yes” to the question “Do you want to deploy your application?”. Once you’re done with the prompts, it will deploy the simple Worker and show you the Worker URL, which should look something like:

https://pgedge-worker.your-name-423.workers.dev

In the next section, we’ll edit the Worker source code, now located at ./src/index.ts. As a convenience, you can also install the Cloudflare Wrangler CLI locally in this project:

npm install wrangler --save-dev

Create a pgEdge Database

Go ahead and sign up for a pgEdge Cloud account. You’ll then have a Developer Edition account that allows you to create one free 3-node database. Pick “US - Europe” and then click “Create Database”.

Your database should be up in about a minute. Once its dashboard is showing, copy the psql connection command displayed there.

In a terminal window, run the command you just copied to connect to the database. Then run the two SQL statements below to create a products table and insert a handful of records.

create table products (id varchar, name varchar, primary key (id));

insert into products (id, name) values
  ('1', 'Whisper Quiet Vacuum'),
  ('2', 'Everlast Lightbulbs'),
  ('3', 'Magic Sponge Erasers'),
  ('4', 'EcoFresh Laundry Detergent'),
  ('5', 'Sunrise Alarm Clock'),
  ('6', 'Infinity Batteries'),
  ('7', 'Gleam Window Cleaner'),
  ('8', 'IronGlide Steam Iron'),
  ('9', 'Breeze Air Purifier'),
  ('10', 'QuickFix Super Glue');

Having done that, go back to the pgEdge UI; you should see the products table listed, along with a Start Replication button. Go ahead and click that now.

In a few seconds, the table will be replicated to all nodes in the cluster. Now copy the PostgreSQL connection string for the app user from the “Connect to your database” section of the dashboard.

Create a Hyperdrive

We’ll create a Hyperdrive and then update our Cloudflare Worker to use it to connect to pgEdge. If you have any questions about this section, you can refer to the Hyperdrive Get Started docs. Run the following command, substituting the connection string you just copied from pgEdge:

npx wrangler hyperdrive create pgedge --connection-string="postgresql-string-from-pgedge"

That will output some information including the Hyperdrive object UUID. Copy that ID and update the wrangler.toml file in the project to include the following:

node_compat = true # required for the postgres connection

[[hyperdrive]]
binding = "HYPERDRIVE"
id = "the-hyperdrive-uuid-goes-here"

Update the Worker to Query pgEdge

Replace the code in src/index.ts with the following:

import { Client } from 'pg';

export interface Env {
  HYPERDRIVE: Hyperdrive;
}

export default {
  async fetch(request: Request, env: Env, ctx: ExecutionContext): Promise<Response> {
    // Hyperdrive supplies a connection string that routes through its regional pool
    const client = new Client(env.HYPERDRIVE.connectionString);
    try {
      await client.connect();
      const result = await client.query('SELECT * FROM products');
      // Close the connection without delaying the response
      ctx.waitUntil(client.end());
      const items = result.rows.map((row: any) => {
        return { id: row.id, name: row.name };
      });
      return new Response(JSON.stringify({ items }), {
        headers: { 'Content-Type': 'application/json' },
      });
    } catch (e) {
      console.log(e);
      // JSON.stringify on an Error yields '{}', so extract a useful message instead
      const message = e instanceof Error ? e.message : String(e);
      return Response.json({ error: message }, { status: 500 });
    }
  },
};

Now deploy all those updates using the command “npx wrangler deploy”. If you hit the Worker URL in your browser or with curl, you should see the product JSON as queried from pgEdge, along the lines of the example below.
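The exact output depends on your data, but with the products inserted earlier the response body should look roughly like this (truncated for brevity):

{"items":[{"id":"1","name":"Whisper Quiet Vacuum"},{"id":"2","name":"Everlast Lightbulbs"}, …]}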

We’re almost there. Now let’s see how this performs when tested from multiple client locations!

Testing Latencies Across the US and Europe

We aim to test latencies from the point of view of end-users located across the US and Europe, being careful not to pick only locations near our database nodes. My approach was to use Terraform to deploy a virtual machine in each of 5 Google Cloud regions: Los Angeles, Salt Lake City, Dallas, Montreal, and London. I then wrote a simple Go program to issue multiple HTTPS requests to our Worker and record the average request latencies; a sketch of the idea follows.
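The actual tester in the companion repository is written in Go; purely as an illustration of the measurement loop, here’s a minimal TypeScript equivalent for Node.js (18 or later, for the global fetch). The Worker URL is a placeholder:

// Illustrative latency tester; the companion repo's real tester is written in Go.
const url = 'https://pgedge-worker.your-name-423.workers.dev'; // placeholder URL
const runs = 20;

async function main() {
  // One warm-up request so one-time setup costs don't skew the samples
  await fetch(url);
  const samples: number[] = [];
  for (let i = 0; i < runs; i++) {
    const start = performance.now();
    await fetch(url);
    samples.push(performance.now() - start);
  }
  const avg = samples.reduce((a, b) => a + b, 0) / samples.length;
  console.log(`average latency over ${runs} requests: ${avg.toFixed(1)} ms`);
}

main().catch(console.error);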

You can find all the scripts, Terraform code, and the rest of the test setup in this open source repository on GitHub.

As you’ll see in the results section below, I tested with four different configurations (a sketch of how a Worker can switch between them follows the list):

  • Hyperdrive off, pgEdge us-east node (no-hd-pgedge-us-east)

  • Hyperdrive off, pgEdge nearest node (no-hd-pgedge-nearest-node)

  • Hyperdrive on, pgEdge us-east node (hd-pgedge-us-east)

  • Hyperdrive on, pgEdge nearest node (hd-pgedge-nearest-node)
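The repository wires these configurations up in its own way; purely as an illustration, a Worker could choose its connection string based on environment settings. The names USE_HYPERDRIVE and DIRECT_DATABASE_URL here are hypothetical, not taken from the repo:

// Hypothetical sketch: pick the connection string for a given test configuration.
// USE_HYPERDRIVE and DIRECT_DATABASE_URL are illustrative names, not from the repo.
export interface TestEnv {
  HYPERDRIVE: Hyperdrive;
  USE_HYPERDRIVE: string;       // 'true' to route through the Hyperdrive pool
  DIRECT_DATABASE_URL: string;  // a us-east or nearest-node pgEdge connection string
}

export function connectionStringFor(env: TestEnv): string {
  return env.USE_HYPERDRIVE === 'true'
    ? env.HYPERDRIVE.connectionString // pooled, regional connection
    : env.DIRECT_DATABASE_URL;        // direct connection to pgEdge
}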

The results for these different configurations will help us understand how much each factor contributes to reducing the overall latency from the end-user’s perspective.

The Results

In short, the fastest configuration - Hyperdrive with pgEdge “nearest node” - gets really, really close to our target of <100 ms request latencies for end-users across the US and Europe. See the numbers for the green dataset in the chart below.

[Chart: average request latency by client location for the four configurations]

Hyperdrive is certainly a big win. The multiple round trips needed to establish the TCP connection and then authenticate with PostgreSQL have a huge latency impact when they span long geographic distances. Enabling Hyperdrive eliminated 295 ms of latency for the user in Los Angeles.

Similarly, being able to access the nearest PostgreSQL node in a distributed cluster also helps significantly, depending on where the end-user is located. For example, the end-user in London saw latencies cut in half (139 ms to 72 ms) when using pgEdge’s nearest-node capability.

In Summary 

It is definitely possible to create high-performance web apps leveraging Cloudflare Workers and pgEdge distributed PostgreSQL. The test setup evaluated here successfully combined regional TCP connection pooling (via Hyperdrive) with the latency-based DNS routing that’s baked into every pgEdge database. Many end-users can retrieve dynamic data from queries in around 70 ms.

Achieving sub-100 ms request latencies for geographically distributed end-users is generally feasible.

Don’t forget to browse the open source code for these tests on GitHub.

Sign up for pgEdge Cloud and give our distributed PostgreSQL a try!