Skip to main content

Source Database

Bemi tracks changes made in a primary PostgreSQL database (source database) by implementing a design pattern called Change Data Capture (CDC), a process of identifying and capturing changes made to data in a database in real-time. More specifically, Bemi workers connect and use built-in PostgreSQL replication mechanisms with Write-Ahead Log (WAL).

Connection

Specify the following source database connection details:

  • Host
  • Database
  • Port
  • User
  • Password

dashboard

After that, you can enable selective tracking and pick which database tables you want to track.

WAL Level

Bemi relies on logical replication that allows ingesting changes row-by-row, unlike physical replication that sends disk block changes. You can check the wal_level to make sure logical replication is enabled:

SHOW wal_level;
+-------------+
| wal_level |
|-------------|
| logical |
+-------------+

If your current WAL level is set to replica, you need to update it to logical and restart your PostgreSQL instance. Updating this value won't break replication, it will just slightly increase the WAL volume (disk space and network traffic if there are replicas).

PostgreSQL Hosting Platforms

Supabase

Connection

To connect a Supabase database, you need to go to your Supabase project settings, untoggle "Use connection pooling", and use these connection details on Bemi to connect to the replication log

Note that you can't create new credentials with REPLICATION permissions in Supabase, see this discussion.

WAL level

Supabase provisions PostgreSQL with the WAL level already set to logical. So, it is ready to be used.

Render

Connection

To connect a Render database, specify your database credentials that can be found on the Render dashboard:

  • Please use the full Host name that ends with .render.com from the External Database URL section

Note that you can't create new credentials with REPLICATION permissions in Render.

WAL level

Please submit a Render support request, and they'll run a special runbook to set up Bemi:

In a few words, what can we help you with?

Configure database for Bemi

Describe the issue in more detail.

- Set "wal_level" to "logical"
- Add "REPLICATION" permission to the database user
- Create "bemi" publication

Neon

Connection

To connect a Neon database, specify your database credentials that can be found on the project's dashboard:

  • Please use the Host name without enabled "Pooled connection"

And that's it, everything should just work!

Read-only credentials

Alternatively, you can manually create read-only PostgreSQL database credentials to connect to the primary instance's WAL. At a high level, you need to run these commands that are safe to execute without any downtime or performance issues:

  • CREATE ROLE creates a new read-only user for Bemi to read database changes.
  • CREATE PUBLICATION creates a "channel" that we'll subscribe to and track changes in real-time.
  • REPLICA IDENTITY FULL enhances records stored in WAL to record the previous state (“before”) in addition to the tracked by default new state (“after”).
-- Create read-only user with REPLICATION permission
CREATE ROLE [username] WITH LOGIN NOSUPERUSER NOCREATEDB NOCREATEROLE REPLICATION PASSWORD '[password]';
-- Grant SELECT access to tables for selective tracking
GRANT SELECT ON ALL TABLES IN SCHEMA public TO [username];
-- Grant SELECT access to new tables created in the future for selective tracking
ALTER DEFAULT PRIVILEGES IN SCHEMA public GRANT SELECT ON TABLES TO [username];

-- Create "bemi" PUBLICATION to enable logical replication
CREATE PUBLICATION bemi FOR ALL TABLES;

-- Create a procedure to set REPLICA IDENTITY FULL for tables to track the "before" state on DB row changes
CREATE OR REPLACE PROCEDURE _bemi_set_replica_identity() AS $$ DECLARE current_tablename TEXT;
BEGIN
FOR current_tablename IN SELECT tablename FROM pg_tables LEFT JOIN pg_class ON relname = tablename WHERE schemaname = 'public' AND relreplident != 'f' LOOP
EXECUTE format('ALTER TABLE %I REPLICA IDENTITY FULL', current_tablename);
END LOOP;
END $$ LANGUAGE plpgsql;
-- Call the created procedure
CALL _bemi_set_replica_identity();

WAL level

If you have access to Neon's Logical Replication Beta program, you can set WAL level to logical by enabling the feature in your Project settings.

If the Enrollment is paused for you, just send a support request email:

To: [email protected]
Subject: Enable logical replication

I need to integrate with Bemi.io, which requires logical replication.
Can you please enable it for my [YOUR-DATABASE-HOST.neon.tech] database?

AWS RDS

Connection

You can specify the same regular database credentials you use to connect to PostgreSQL from your code. And that's it, everything should just work!

Read-only credentials

Alternatively, you can manually create read-only PostgreSQL database credentials to connect to the primary instance's WAL. At a high level, you need to run these commands that are safe to execute without any downtime or performance issues:

  • CREATE ROLE creates a new read-only user for Bemi to read database changes.
  • CREATE PUBLICATION creates a "channel" that we'll subscribe to and track changes in real-time.
  • REPLICA IDENTITY FULL enhances records stored in WAL to record the previous state (“before”) in addition to the tracked by default new state (“after”).
-- Create read-only user
CREATE ROLE [username] WITH LOGIN NOSUPERUSER NOCREATEDB NOCREATEROLE NOREPLICATION PASSWORD '[password]';
-- Grant RDS replication permission
GRANT rds_replication TO [username];
-- Grant SELECT access to existing tables for selective tracking
GRANT SELECT ON ALL TABLES IN SCHEMA public TO [username];
-- Grant SELECT access to new tables created in the future for selective tracking
ALTER DEFAULT PRIVILEGES IN SCHEMA public GRANT SELECT ON TABLES TO [username];

-- Create "bemi" PUBLICATION to enable logical replication
CREATE PUBLICATION bemi FOR ALL TABLES;

-- Create a procedure to set REPLICA IDENTITY FULL for tables to track the "before" state on DB row changes
CREATE OR REPLACE PROCEDURE _bemi_set_replica_identity() AS $$ DECLARE current_tablename TEXT;
BEGIN
FOR current_tablename IN SELECT tablename FROM pg_tables LEFT JOIN pg_class ON relname = tablename WHERE schemaname = 'public' AND relreplident != 'f' LOOP
EXECUTE format('ALTER TABLE %I REPLICA IDENTITY FULL', current_tablename);
END LOOP;
END $$ LANGUAGE plpgsql;
-- Call the created procedure
CALL _bemi_set_replica_identity();
-- Create a trigger function that calls the created procedure
CREATE OR REPLACE FUNCTION _bemi_set_replica_identity_func() RETURNS event_trigger AS $$
BEGIN CALL _bemi_set_replica_identity(); END $$ LANGUAGE plpgsql;
-- Create a trigger to set REPLICA IDENTITY FULL for all new created tables
CREATE EVENT TRIGGER _bemi_set_replica_identity_trigger ON ddl_command_end WHEN TAG IN ('CREATE TABLE')
EXECUTE FUNCTION _bemi_set_replica_identity_func();

Read-only credentials with manually managed permissions for each table

Run the following queries if you want to isolate read access only to logical replication for certain tables and manage permissions manually instead of relying on our robust built-in selective tracking manageable through our UI.

-- Create read-only user
CREATE ROLE [username] WITH LOGIN NOSUPERUSER NOCREATEDB NOCREATEROLE NOREPLICATION PASSWORD '[password]';
-- Grant replication permission to allow using replication slots
GRANT rds_replication TO [username];

-- Create "bemi" PUBLICATION to enable logical replication for selected tables
CREATE PUBLICATION bemi FOR TABLE [table1], [table2];

-- Set REPLICA IDENTITY FULL for tables to track the "before" state on DB row changes
ALTER TABLE [table1] REPLICA IDENTITY FULL;
ALTER TABLE [table2] REPLICA IDENTITY FULL;

To enable data change tracking for a new table:

ALTER PUBLICATION bemi ADD TABLE [table3];
ALTER TABLE [table3] REPLICA IDENTITY FULL;

To stop data change tracking for a table:

ALTER PUBLICATION bemi DROP TABLE [table3];
ALTER TABLE [table3] REPLICA IDENTITY DEFAULT;

WAL level

At a high level, these are the steps necessary to update the WAL level from replica to logical

  1. Create an RDS parameter group if it doesn’t exist
  2. Update rds.logical_replication parameter from 0 to 1
  3. Apply the parameter group to your RDS instance and restart it

Now let's break down these steps.

Create an RDS parameter group if it doesn’t exist by choose the group family depending on your PostgreSQL version and specifying any name and description:

Edit the created parameter group:

Find and change the rds.logical_replication parameter from 0 to 1:

Find and modify your RDS instance by using the parameter group:

Apply the modification by restarting your RDS instance:

If you have a Multi-AZ database cluster and you used a cluster parameter group, you will need to explicitly Reboot the Writer instance (it may take ~ 2 seconds if there is not a lot of activity). The Reader endpoint will continue to be available without downtime.

See the AWS RDS user guides to learn more about parameter groups.

GCP Cloud SQL

Connection

Run the below command and then you can connect with the same credentials on the Bemi dashboard!

-- Grant replication permission to allow using replication slots
ALTER USER [user] WITH REPLICATION;

WAL level

Logical replication is turned off by default. To turn it on, you have to update the cloud flag: cloudsql.logical_decoding = on. This will need a restart of your instance before SHOW wal_level; returns logical.

Self-managed PostgreSQL

Connection

You can specify the same regular database credentials you use to connect to PostgreSQL from your code. And that's it, everything should just work!

Read-only credentials

Alternatively, you can manually create read-only PostgreSQL database credentials to connect to the primary instance's WAL. At a high level, you need to run these commands that are safe to execute without any downtime or performance issues:

  • CREATE ROLE creates a new read-only user for Bemi to read database changes.
  • CREATE PUBLICATION creates a "channel" that we'll subscribe to and track changes in real-time.
  • REPLICA IDENTITY FULL enhances records stored in WAL to record the previous state (“before”) in addition to the tracked by default new state (“after”).
-- Create read-only user with REPLICATION permission
CREATE ROLE [username] WITH LOGIN NOSUPERUSER NOCREATEDB NOCREATEROLE REPLICATION PASSWORD '[password]';
-- Grant SELECT access to tables for selective tracking
GRANT SELECT ON ALL TABLES IN SCHEMA public TO [username];
-- Grant SELECT access to new tables created in the future for selective tracking
ALTER DEFAULT PRIVILEGES IN SCHEMA public GRANT SELECT ON TABLES TO [username];

-- Create "bemi" PUBLICATION to enable logical replication
CREATE PUBLICATION bemi FOR ALL TABLES;

-- Create a procedure to set REPLICA IDENTITY FULL for tables to track the "before" state on DB row changes
CREATE OR REPLACE PROCEDURE _bemi_set_replica_identity() AS $$ DECLARE current_tablename TEXT;
BEGIN
FOR current_tablename IN SELECT tablename FROM pg_tables LEFT JOIN pg_class ON relname = tablename WHERE schemaname = 'public' AND relreplident != 'f' LOOP
EXECUTE format('ALTER TABLE %I REPLICA IDENTITY FULL', current_tablename);
END LOOP;
END $$ LANGUAGE plpgsql;
-- Call the created procedure
CALL _bemi_set_replica_identity();
-- Create a trigger function that calls the created procedure
CREATE OR REPLACE FUNCTION _bemi_set_replica_identity_func() RETURNS event_trigger AS $$
BEGIN CALL _bemi_set_replica_identity(); END $$ LANGUAGE plpgsql;
-- Create a trigger to set REPLICA IDENTITY FULL for all new created tables
CREATE EVENT TRIGGER _bemi_set_replica_identity_trigger ON ddl_command_end WHEN TAG IN ('CREATE TABLE')
EXECUTE FUNCTION _bemi_set_replica_identity_func();

Read-only credentials with manually managed permissions for each table

Run the following queries if you want to isolate read access only to logical replication for certain tables and manage permissions manually instead of relying on our robust built-in selective tracking manageable through our UI.

-- Create read-only user with REPLICATION permission
CREATE ROLE [username] WITH LOGIN NOSUPERUSER NOCREATEDB NOCREATEROLE REPLICATION PASSWORD '[password]';

-- Create "bemi" PUBLICATION to enable logical replication for selected tables
CREATE PUBLICATION bemi FOR TABLE [table1], [table2];

-- Set REPLICA IDENTITY FULL for tables to track the "before" state on DB row changes
ALTER TABLE [table1] REPLICA IDENTITY FULL;
ALTER TABLE [table2] REPLICA IDENTITY FULL;

To enable data change tracking for a new table:

ALTER PUBLICATION bemi ADD TABLE [table3];
ALTER TABLE [table3] REPLICA IDENTITY FULL;

To stop data change tracking for a table:

ALTER PUBLICATION bemi DROP TABLE [table3];
ALTER TABLE [table3] REPLICA IDENTITY DEFAULT;

WAL level

Run the following SQL command to change the WAL level from replica to logical and restart your database:

ALTER SYSTEM SET wal_level = logical;

If you have issues in other PostgreSQL hosting environments, please reach out to us and we will send you detailed instructions on how to set it up.

Selective tracking

Tracking by Tables

During the Source Database connection setup or any time after, you can configure what tables you want to track:

Bemi automatically tracks changes in the default public schema. If you would like to enable tracking for other schemas in your Bemi organization, please contact us.

Ignoring by Columns

Bemi allows to configure ignore-change columns, such as myTable.updatedAt, to ignore meaningless data changes. This prevents the creation of a new audit trail entry (called "change") for a record in myTable if updatedAt was the only column value that was updated.

In other words, myTable.updatedAt is used to determine whether an audit trail entry should be recorded or not. Note that this column will always be recorded if there were updated values in other columns.

SSH Tunnel

If your PostgreSQL source database is not accessible over the internet, you can specify SSH credentials to enable an SSH tunnel via a jump host.

dashboard

Once the source database connection settings are submitted, we'll generate a public SSH key. Add this public SSH key to your SSH host to allow Bemi workers to connect and SSH-tunnel to the PostgreSQL database:

touch ~/.ssh/authorized_keys && chmod 600 ~/.ssh/authorized_keys
echo 'ssh-ed25519 AAAAC3Nz...' >> ~/.ssh/authorized_keys

If you need a public SSH Key before you know the SSH host address, just specify any address and later reach out to us to update it.

Bemi Static IPs

If you restrict access to your databases by IP addresses, contact us. We will share our static IP addresses, which you can add to an allowlist, so we can connect to your Source PostgreSQL database.

Disconnecting

To disconnect from Bemi, you can to execute the following queries to remove the triggers that set REPLICA IDENTITY FULL for tracking the previous state:

DROP EVENT TRIGGER _bemi_set_replica_identity_trigger;
DROP FUNCTION _bemi_set_replica_identity_func;
DROP PROCEDURE _bemi_set_replica_identity;

If you used Bemi packages for the Supported ORMs, you can execute the following queries to remove the lightweight triggers used for passing application context:

DROP EVENT TRIGGER _bemi_create_table_trigger;
DROP FUNCTION _bemi_create_table_trigger_func;
DROP PROCEDURE _bemi_create_triggers;
DROP FUNCTION _bemi_row_trigger_func CASCADE;

To completely disable logical replication, run the following queries:

(!!!) If you later decide to resume Bemi, we won't be able to recover and ingest data changes after this point.

SELECT pg_drop_replication_slot('bemi');
DROP PUBLICATION bemi;