PostgreSQL to CrateDB loader research
Full-load
This subsystem is already covered by ingestr/dlt.
CDC
You can replicate data from PostgreSQL using different methods.
The following replication implementations are relevant:
- DMS (logical, plugin: pglogical) 
- DMS (logical, plugin: wal2json) 
- asyncpg (logical, native) 
- pgbelt (logical, plugin: pglogical, uses asyncpg) 
- pg_replicate (logical, native) 
- psycopg2 (protocol, native) 
- pypg-cdc (uses psycopg2, modern pgoutput) 
- pypgoutput (uses psycopg2) 
- python-postgres-cdc (uses psycopg2, even more modern pgoutput) 
- wal2json (logical, plugin) 
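To make the plugin-based variants above more concrete, here is a minimal sketch of how a consumer might flatten a wal2json message into per-row change events. It assumes wal2json's format version 1 (a top-level `change` array with `kind`, `columnnames`, `columnvalues`, and `oldkeys`). The function name `decode_wal2json` and the shape of the resulting event dictionaries are illustrative assumptions, not part of any of the listed libraries.

```python
import json
from typing import Any, Dict, List


def decode_wal2json(payload: str) -> List[Dict[str, Any]]:
    """Flatten one wal2json (format version 1) message into change events.

    The event layout (op/schema/table/row/key) is a hypothetical shape
    chosen for this sketch, not a standard.
    """
    message = json.loads(payload)
    events: List[Dict[str, Any]] = []
    for change in message.get("change", []):
        event: Dict[str, Any] = {
            "op": change["kind"],  # "insert", "update", or "delete"
            "schema": change["schema"],
            "table": change["table"],
            "row": None,
            "key": None,
        }
        # Inserts and updates carry the new row as parallel name/value lists.
        if "columnnames" in change:
            event["row"] = dict(zip(change["columnnames"], change["columnvalues"]))
        # Updates and deletes carry the replica identity of the old row.
        if "oldkeys" in change:
            old = change["oldkeys"]
            event["key"] = dict(zip(old["keynames"], old["keyvalues"]))
        events.append(event)
    return events


# Sample message, hand-written to mirror the version 1 layout.
sample = json.dumps(
    {
        "change": [
            {
                "kind": "insert",
                "schema": "public",
                "table": "items",
                "columnnames": ["id", "name"],
                "columntypes": ["integer", "text"],
                "columnvalues": [1, "foo"],
            },
            {
                "kind": "delete",
                "schema": "public",
                "table": "items",
                "oldkeys": {"keynames": ["id"], "keytypes": ["integer"], "keyvalues": [1]},
            },
        ]
    }
)

events = decode_wal2json(sample)
for event in events:
    print(event["op"], event["table"], event["row"], event["key"])
```

With psycopg2, such payloads would typically arrive through a replication cursor on a `psycopg2.extras.LogicalReplicationConnection`, via `start_replication(..., decode=True)` and `consume_stream()`. The slot name and plugin options depend on the setup and are not shown here.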
Evaluation
For protocol replication, it is better to use Apache Flink, via flink-cdc and its Flink Postgres CDC Connector.
Alternatives
Python to the rescue?
Use Kinesis as a data sink.
- AWS Blog: Stream changes from Amazon RDS for PostgreSQL using Kinesis & Lambda 
- Medium: Stream changes from Amazon RDS for PostgreSQL with Kinesis & Lambda 
- Commerce Architects: Strategies for replicating data from RDS to Kinesis 
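As a rough sketch of the Kinesis-as-sink idea, the snippet below turns one change event into the `Data` and `PartitionKey` arguments that the Kinesis PutRecord API expects. The event layout and the helper name `to_kinesis_record` are assumptions made for illustration. Using `schema.table` as the partition key is one possible choice: it sends all changes of a table to the same shard, where Kinesis preserves their order.

```python
import json
from typing import Any, Dict


def to_kinesis_record(event: Dict[str, Any]) -> Dict[str, Any]:
    """Build the `Data` and `PartitionKey` arguments for Kinesis PutRecord.

    Assumes a hypothetical event dict that has "schema" and "table" keys.
    """
    # Records sharing a partition key land on the same shard.
    partition_key = f'{event["schema"]}.{event["table"]}'
    # Kinesis carries opaque bytes; JSON is used here for readability.
    data = json.dumps(event, sort_keys=True).encode("utf-8")
    return {"Data": data, "PartitionKey": partition_key}


# Hypothetical change event, for illustration only.
event = {"op": "insert", "schema": "public", "table": "items", "row": {"id": 1}}
record = to_kinesis_record(event)
print(record["PartitionKey"], len(record["Data"]))
```

With boto3, the result could be passed on as `boto3.client("kinesis").put_record(StreamName=..., **record)`, where the stream name is deployment-specific.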
More resources.
supabase/etl
It looks like pg_replicate has evolved into a full-blown ETL framework,
and has been renamed to supabase/etl.
- That’s a sweet introduction to pg_replicate by its author:
  "For the past few months, as part of my job at Supabase, I have been working on pg_replicate. pg_replicate lets you easily build applications which can copy data (full table copies and cdc) from Postgres to any other data system."
  – pg_replicate is a Rust crate to build Postgres logical replication applications