Integrating pgEdge Distributed PostgreSQL with Cloudflare Workers for low latency database access
With Cloudflare Workers, developers can effortlessly deploy code around the world, executing it in close proximity to users for reduced latency and improved performance. The problem is that a relational database typically resides at a single location on the network – negating many of the latency and performance advantages Cloudflare Workers can bring. Enter pgEdge.
In this post, we’ll investigate how to combine Cloudflare Workers with pgEdge’s distributed PostgreSQL database, allowing us to also locate the application’s data close to the end users. With this integration, we can create extremely responsive applications supported by a flexible and efficient data architecture based on standard Postgres.
Rather than connecting directly to the Postgres database on port 5432, we’ll leverage the excellent PostgREST project to provide a secure HTTPS API to our database. We’ll use Cloudflare KV to cache nearest-node lookups and Cloudflare environment variables to manage our API credentials. The pgEdge Cloud API will be used to determine the closest database node for each request.
For our target database, we'll use a three-node pgEdge cluster with PostgREST running on each node. Thanks to pgEdge multi-master (multi-active) replication, we can run these nodes in different regions around the world, and use PostgREST to create an API at each location. For a walkthrough on how to configure and use PostgREST, see this tutorial.
The general architecture we're using for this demo is illustrated below, followed by a text description of the flow of data.
In our example, a client makes a request, which is handled by the Cloudflare worker. The worker then queries pgEdge Node 1, which it has determined is the closest database node. The overall sequence of events looks like this:
A client request is sent to the Cloudflare worker, which is running our code.
When handling the request, our code looks up the geographic location information in the request header. If the nearest node information was already cached (in the Cloudflare key value store) we use that.
Otherwise, we query the pgEdge Cloud API with the request’s location data to identify the URL corresponding to the nearest database node.
The database query is then sent to that pgEdge node as a GET, POST, or PATCH request to the PostgREST API. The Cloudflare worker responds to the request with the query results.
The caching strategy for the “nearest node” information is important to get right, since we don’t want to slow down each request with the work of figuring out which node is closest. To avoid repeating this work on every request, we use Cloudflare KV to store the URL of the nearest database node for a given longitude/latitude pair (with the rounded coordinate pair serving as the key).
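To make this concrete, here is a minimal sketch of how such a cache key might be built. The cacheKey helper is hypothetical (it is not part of the worker code shown later in this post), but it uses the same 5-degree rounding that the complete worker applies to the coordinates:

```javascript
// Hypothetical helper: snap coordinates to a 5-degree grid so nearby
// users share one cache entry, then build a string key from the pair.
function cacheKey(lat, lon) {
  const snap = (v) => 5 * Math.round(v / 5);
  return `${snap(lat)},${snap(lon)}`;
}

// Users in the same 5-degree cell map to the same KV key:
console.log(cacheKey(40.7128, -74.006)); // prints 40,-75
```

The grid size is a tuning knob: a coarser grid means fewer KV entries and more cache hits, at the cost of occasionally picking a slightly farther node near cell boundaries.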
To get started, sign up for Cloudflare here if you don’t already have an account.
To recreate this demo, you can use the Cloudflare dashboard to:
1. Create the Cloudflare KV and worker.
2. Bind the KV to your Cloudflare worker.
3. Provide code for the Cloudflare worker.
Then, you're ready to begin using pgEdge with Cloudflare workers.
Creating a Cloudflare worker
Under the Workers section of the Cloudflare dashboard, click on the Overview section; then select the Create a Service button.
Enter a new service name (or use the default service name), and choose the HTTP handler option. Once created, the worker will be listed along with the Last 24 hrs requests, Error counts, and Median CPU Time.
Your Cloudflare Website
Here we’ll assume you already have a website hosted with Cloudflare. If that’s not the case yet, you might find it’s really quick to get started with Cloudflare Pages. Read about this here.
We need to modify the Page Rules for our Cloudflare site. In the pull down menu, select IP Geolocation Header and enable the slider to the right. This instructs Cloudflare to pass the cf-iplongitude and cf-iplatitude values in the header to the Cloudflare worker:
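As a quick sketch of what the worker will see, the snippet below reads those two headers from a request. The getCoords helper is just for illustration; the complete worker later inlines this logic:

```javascript
// Read the visitor's coordinates that Cloudflare adds to the request
// once the IP Geolocation Header page rule is enabled.
function getCoords(request) {
  const lat = parseFloat(request.headers.get("cf-iplatitude"));
  const lon = parseFloat(request.headers.get("cf-iplongitude"));
  return { lat, lon };
}

// Simulated request for local testing outside of Cloudflare:
const req = new Request("https://example.com", {
  headers: { "cf-iplatitude": "51.5", "cf-iplongitude": "-0.12" },
});
console.log(getCoords(req)); // { lat: 51.5, lon: -0.12 }
```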
Creating a Cloudflare KV
Under the Workers section of the Cloudflare dashboard, click on the Workers KV section. Then, select the Create a namespace button and provide a name for your KV namespace (we're using locations). The image below shows the created KV namespace locations.
Binding the Cloudflare KV namespace to a worker
Next, we’ll provide the Worker with the information about the KV namespace we just created. Bind the KV to a worker by clicking on the worker name in the Cloudflare dashboard. Then, open the Settings tab in the second table on the page, click on Variables, and then select the Add binding button under the KV Namespace Bindings section. Provide the name “node_locations” in the Variable name field and select your KV Namespace from the drop down list, then click Save and Deploy.
Authenticating with pgEdge Cloud
As mentioned earlier, we’ll use the pgEdge Cloud API to find the closest database node when handling a request. Before we can query for the database node, we’ll need to generate an access token for the pgEdge Cloud API. As you can see in the Cloudflare worker code below, we issue a POST request to the pgEdge oauth/token API endpoint. This request must contain a client_id and client_secret, which are values you are given in the pgEdge Cloud UI. The response to this POST request contains an access token which will be used to query for the nearest database nodes.
We’ll save the client_id and client_secret in Cloudflare to make them available to our worker code. In your Cloudflare worker, go to Settings, and then Variables to add these as environment variables. Note we’ll also add our pgEdge cluster ID here as well, since that is also used in our queries.
Cloudflare Worker Code
At this point, we’ll assume you already have some content populated in your pgEdge database. In the example below, we’ll be using tables created by the pgbench utility. You can use a curl command like the following to return the contents of the pgbench_branches table from a pgEdge cluster with PostgREST installed and configured:
curl http://node_host:3000/pgbench_branches
The following code uses a GET method to perform the same query:
const url = `http://${node_host}:3000/pgbench_branches`;
const init = {
  method: "GET",
  headers: {
    "content-type": "application/json;charset=UTF-8",
  },
};
const response = await fetch(url, init);
Using KV
I have defined node_host as the variable that holds the IP address of a node. This address is saved as the value in the KV store, with the rounded longitude/latitude pair serving as the key. Since Workers KV keys are strings, we combine the coordinates into a single string. To pull that information from KV, use the following code:
let node_host = await env.node_locations.get(`${my_lat},${my_lon}`);
If this call returns NULL, then the closest node to that longitude/latitude key combination has yet to be stored in the KV. The pgEdge Cloud API offers a cluster information call that will take the longitude and latitude and return information about all nodes of the cluster, ordered from nearest to furthest. The code for this API call is below; in the code, node_host is being set to the first public_ip_address returned.
const nodes_url = `https://api.pgedge.com/clusters/${env.clusterID}/nodes?order_by=distance&near_lat=${my_lat}&near_lon=${my_lon}`;
const conn_info = {
  method: "GET",
  headers: {
    "Authorization": token,
    "content-type": "application/json;charset=UTF-8",
  },
};
const nodes_response = await fetch(nodes_url, conn_info);
const nodes_results = await nodes_response.json();
node_host = nodes_results[0]["public_ip_address"];
To ensure this value is available the next time the code executes with the same longitude and latitude, we use the following snippet to add the key/value pair to the KV store (where the latitude/longitude string is the key, and node_host, the closest node’s IP address, is the value):
await env.node_locations.put(`${my_lat},${my_lon}`, node_host);
Complete Cloudflare Worker Code
Now, we'll piece together all of the concepts we've discussed. The following code is everything you need to integrate Cloudflare Workers with pgEdge Cloud. This code will:
● get and round the user’s longitude and latitude
● check the KV store for an existing record for those coordinates
● if no record is found:
- use the pgEdge Cloud API to list all nodes in the cluster in distance order
- add the public IP address of the closest node to the KV store
● query the pgbench_branches table from the nearest node
export default {
  async fetch(request, env, ctx) {
    const my_lat = 5 * Math.round(parseFloat(request.headers.get("cf-iplatitude")) / 5);
    const my_lon = 5 * Math.round(parseFloat(request.headers.get("cf-iplongitude")) / 5);
    /* Check KV for record */
    let node_host = await env.node_locations.get(`${my_lat},${my_lon}`);
    /* If not found, get the closest node */
    if (node_host == null) {
      /* Get Token */
      const token_url = "https://api.pgedge.com/oauth/token";
      const get_token = {
        method: "POST",
        headers: {
          "content-type": "application/json;charset=UTF-8",
        },
        body: JSON.stringify({
          "client_id": env.client_id,
          "client_secret": env.client_secret
        })
      };
      const token_response = await fetch(token_url, get_token);
      const token_results = await token_response.json();
      const token = `Bearer ${token_results["access_token"]}`;
      /* Find Closest Node */
      const nodes_url = `https://api.pgedge.com/clusters/${env.clusterID}/nodes?order_by=distance&near_lat=${my_lat}&near_lon=${my_lon}`;
      const conn_info = {
        method: "GET",
        headers: {
          "Authorization": token,
          "content-type": "application/json;charset=UTF-8",
        }
      };
      const nodes_response = await fetch(nodes_url, conn_info);
      const nodes_results = await nodes_response.json();
      node_host = nodes_results[0]["public_ip_address"];
      /* Add the new address to KV */
      await env.node_locations.put(`${my_lat},${my_lon}`, node_host);
    }
    /* Query the pgEdge node via PostgREST */
    const url = `http://${node_host}:3000/pgbench_branches?bid=eq.1`;
    const init = {
      method: "GET",
      headers: {
        "content-type": "application/json;charset=UTF-8",
      },
    };
    const response = await fetch(url, init);
    const results = await response.text();
    return new Response(results, { headers: init.headers });
  }
};
This code sample does not work in the Quick Edit dashboard; to exercise this code, you need to deploy the Cloudflare worker and a website so that the longitude and latitude arrive in the request header. If you want to test the pgEdge Cloud API and database calls in the Quick Edit dashboard, you can hard-code the variables my_lon and my_lat.
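One way to do that without scattering edits through the worker is a small fallback helper like the sketch below; the coordsOrDefault helper and its default coordinates are illustrative, not part of the original worker:

```javascript
// Fall back to fixed coordinates when the geolocation headers are
// absent (they only appear on deployed traffic, not in Quick Edit).
function coordsOrDefault(request, defaults = { lat: 40, lon: -75 }) {
  const lat = parseFloat(request.headers.get("cf-iplatitude"));
  const lon = parseFloat(request.headers.get("cf-iplongitude"));
  return Number.isNaN(lat) || Number.isNaN(lon) ? defaults : { lat, lon };
}

const bare = new Request("https://example.com"); // no geo headers
console.log(coordsOrDefault(bare)); // { lat: 40, lon: -75 }
```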
Next Steps
You can use information returned by the pgEdge Cloud API to add an extra layer of robustness to your client application. The following simple additions are beyond the scope of this blog, but can help ensure that your code handles unexpected cloud and database outages gracefully.
The pgEdge Cloud API returns a list of all the nodes in the cluster, ordered from closest to furthest. You can use that information to implement retry logic that supports the above code. Rather than saving the first value returned as the node_host, you can make node_host an array of all returned public IP addresses. Then, if your client encounters a failure when you run your business logic against the first node, it can retry with the next node in the array.
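A minimal sketch of that failover loop might look like this; queryWithFailover and the injected fetchFn parameter are illustrative assumptions, not code from this post:

```javascript
// Try each node in order (the pgEdge Cloud API already returns them
// nearest-first) and fall through to the next one on failure.
async function queryWithFailover(hosts, path, fetchFn = fetch) {
  for (const host of hosts) {
    try {
      const res = await fetchFn(`http://${host}:3000${path}`);
      if (res.ok) return res;
    } catch (_) {
      // Node unreachable; try the next-closest one.
    }
  }
  throw new Error("All pgEdge nodes failed");
}

// Demo with a stubbed fetch: the first node is "down", the second answers.
const fakeFetch = async (url) =>
  url.includes("10.0.0.1")
    ? Promise.reject(new Error("down"))
    : { ok: true, text: async () => "[]" };

queryWithFailover(["10.0.0.1", "10.0.0.2"], "/pgbench_branches", fakeFetch)
  .then((res) => res.text())
  .then(console.log); // prints []
```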
You might also want to add a script that automatically clears stale node addresses from the KV store when you reconfigure your pgEdge cluster. This will ensure that a Cloudflare worker will go back to the pgEdge Cloud API to look up the new nearest node, rather than potentially trying to reach a node that is no longer in the cluster.
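Such a cleanup script could be as simple as the sketch below. The clearNodeCache function is hypothetical, and kv stands for the bound namespace (env.node_locations in this post):

```javascript
// Delete every cached node address so workers re-query the pgEdge
// Cloud API for the reconfigured cluster's nearest nodes.
async function clearNodeCache(kv) {
  let cursor;
  do {
    const page = await kv.list({ cursor });
    await Promise.all(page.keys.map((k) => kv.delete(k.name)));
    cursor = page.list_complete ? undefined : page.cursor;
  } while (cursor);
}

// In-memory stand-in for a KV namespace, for testing outside Cloudflare:
const fakeKV = {
  store: new Map([["40,-75", "1.2.3.4"]]),
  async list() {
    return {
      keys: [...this.store.keys()].map((name) => ({ name })),
      list_complete: true,
    };
  },
  async delete(name) {
    this.store.delete(name);
  },
};
clearNodeCache(fakeKV).then(() => console.log(fakeKV.store.size)); // prints 0
```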
Wrapping Up
In this post we explored how to integrate Cloudflare Workers with a pgEdge distributed database to achieve global, low-latency data access for our application. We covered a lot of ground and know that some aspects could use further elaboration; look for follow-up posts that dive into more detail. Overall, we are thrilled to present this distributed Postgres solution featuring PostgREST for web-friendly, location-aware data access. Stay tuned for more exciting developments and insights in the world of distributed databases and edge computing!