Source Database
Bemi tracks changes made in a primary PostgreSQL database (source database) by implementing a design pattern called Change Data Capture (CDC), a process of identifying and capturing changes made to data in a database in real-time. More specifically, Bemi workers connect and use built-in PostgreSQL replication mechanisms with Write-Ahead Log (WAL).
Connection
Specify the following source database connection details:
- Host
- Database
- Port
- User
- Password
After that, you can enable selective tracking and pick which database tables you want to track.
WAL Level
Bemi relies on logical replication that allows ingesting changes row-by-row, unlike physical replication that sends disk block changes.
You can check the wal_level
to make sure logical replication is enabled:
SHOW wal_level;
+-------------+
| wal_level |
|-------------|
| logical |
+-------------+
If your current WAL level is set to replica
, you need to update it to logical
and restart your PostgreSQL instance.
Updating this value won't break replication, it will just slightly increase the WAL volume (disk space and network traffic if there are replicas).
You can find more information in the following guides on how to change the WAL level and connect depending on your hosting platform:
Selective tracking
Tracking by Tables
During the Source Database connection setup or any time after, you can configure what tables you want to track:
Bemi automatically tracks changes in the default public
schema. If you would like to enable tracking for other schemas in your Bemi organization, please contact us.
Ignoring by Columns
Bemi allows to configure ignore-change columns, such as public.tableName.updatedAt
, to ignore meaningless data changes.
This prevents the creation of a new audit trail entry (called "change") for a record in tableName
if updatedAt
was the only column value that was updated.
In other words, public.tableName.updatedAt
is used to determine whether an audit trail entry should be recorded or not.
Note that this column will always be recorded if there were updated values in other columns.
SSH Tunnel
If your PostgreSQL source database is not accessible over the internet, you can specify SSH credentials to enable an SSH tunnel via a jump host.
Once the source database connection settings are submitted, we'll generate a public SSH key. Add this public SSH key to your SSH host to allow Bemi workers to connect and SSH-tunnel to the PostgreSQL database:
touch ~/.ssh/authorized_keys && chmod 600 ~/.ssh/authorized_keys
echo 'ssh-ed25519 AAAAC3Nz...' >> ~/.ssh/authorized_keys
If you need a public SSH Key before you know the SSH host address, just specify any address and later reach out to us to update it.
Bemi Static IPs
If you restrict access to your databases by IP addresses, contact us. We will share our static IP addresses, which you can add to an allowlist, so we can connect to your Source PostgreSQL database.
Disconnecting
To disconnect from Bemi, you can to execute the following queries to remove the triggers that set REPLICA IDENTITY FULL
for tracking the previous state:
DROP EVENT TRIGGER _bemi_set_replica_identity_trigger;
DROP FUNCTION _bemi_set_replica_identity_func;
DROP PROCEDURE _bemi_set_replica_identity;
If you used Bemi packages for the Supported ORMs, you can execute the following queries to remove the lightweight triggers used for passing application context:
DROP EVENT TRIGGER _bemi_create_table_trigger;
DROP FUNCTION _bemi_create_table_trigger_func;
DROP PROCEDURE _bemi_create_triggers;
DROP FUNCTION _bemi_row_trigger_func CASCADE;
To completely disable logical replication, run the following queries:
(!!!) If you later decide to resume Bemi, we won't be able to recover and ingest data changes after this point.
SELECT pg_drop_replication_slot('bemi');
DROP PUBLICATION bemi;