Skip to main content

Destination Database

Bemi automatically provisions and manages a serverless PostgreSQL destination database cluster of three nodes to store all changes made in a source database. You have full control over this database which comes with additional features:

  • Autoscaling, managed table partitioning and index optimization to improve performance
  • Automatic failover, read-replicas, and backups for high availability and read scalability
  • Automatic data retention: 15 days, 30 days, or unlimited
  • Automatic PostgreSQL version upgrades
  • Standard cloud support
  • Control plane and monitoring through Bemi Dashboard (coming soon)

The infrastructure is automatically provisioned in the us-west-1 region. If you have specific data residency requirements, please contact us.

Data Structure

Changes performed by creating, updating, or deleting each row are stored in a table called changes and have the following structure:

ColumnTypeDescription
iduuidA unique identifier of the change record
databasevarchar(255)Database name where the changed record was stored
schemavarchar(255)Schema name where the changed record was stored
tablevarchar(255)Table name where the changed record was stored
primary_keyvarchar(255)A unique identifier of the changed record (optional)
operationtextEnum that can be either CREATE, UPDATE, or DELETE
beforejsonbRecord's values before the change
afterjsonbRecord's values after the change
contextjsonbApp context passed by using our recommended ORM packages
committed_attimestamptz(0)When the record was changed
queued_attimestamptz(0)When the changed record was ingested from WAL
created_attimestamptz(0)When the change record was stored in the database
transaction_idbigintPostgreSQL transaction ID
positionbigintPostgreSQL WAL position

Querying Changes

You can query changes by using our ORM packages or by directly connecting and executing SQL queries. For example, if you need to find when and how a user record with ID b7267340-5011-40f4-ab9a-902b68fc5b25 had its email updated to [email protected] in the last 3 months:

SELECT *
FROM "changes"
WHERE
"database" = 'postgres' AND "schema" = 'public' AND "table" = 'users' AND
"primary_key" = 'b7267340-5011-40f4-ab9a-902b68fc5b25' AND "operation" = 'UPDATE' AND
"after" @> '{"email": "[email protected]"}' AND NOT ("before" @> '{"email": "[email protected]"}') AND
"committed_at" BETWEEN (NOW() - INTERVAL '3 months') AND NOW()
LIMIT 1;`

The JSONB columns are indexed with GIN Index with jsonb_path_ops operator class that is perfomance-optimized for operators like:

  • jsonb @> '{"key": value}' to check if a key/value pair matches JSONB
  • jsonb @? '$.key' to check if a key exists in JSONB

Connection pooling

PostgreSQL databases have a fixed maximum number of connections. Once that limit is hit, additional clients can’t connect. The reason is that PostgreSQL has to fork a separate process to handle each client's connection concurrently, which is very resource-intensive.

Modern frameworks and database drivers use a connection pool on an application level to reduce the number of consumed database connections. They maintain a "pool" of open connections that can be passed from one session to another session as needed.

However, if you run multiple instances of your application, then the number of direct database connections will still keep growing. That is why a destination database comes with an automatically provisioned highly scalable connection pooler that can handle hundreds of connections.

IP-Based Access Control

It is possible to restrict access to a provisioned database by IP addresses in CIDR notation. This feature allows you to limit access to the database to specific IP addresses, such as those associated with your applications' servers, VPN network, office or home network, etc.

By default, a database has a single 0.0.0.0/0 CIDR notation and is accessible from any IP address.

TLS/SSL certificates

A destination database uses TLS/SSL certificates for both encryption in transit and authentication on the server and client sides. The database comes with a certification authority (CA) and uses it for both client and server certificates, which are managed and renewed automatically.