Feature Request: Allow automatic purge of target table on snapshot "first" event
Hi Confluent team,
I’m currently using the Kafka Connect JDBC Sink Connector together with Debezium as the source.
In some production setups, particularly when using PostgreSQL as the source with Debezium and snapshot.mode=when_needed, a failover or promotion of a new primary node may result in a different WAL position or replication slot on the new instance. In such cases, Debezium may not be able to resume from the previous WAL offset and instead triggers a snapshot of the affected tables.
Since this snapshot does not include delete events, any records that were deleted at the source (but whose delete events were not replicated before the failover) may never be removed from the target database by the sink connector, leaving stale or inconsistent data.
Proposed Solution
Introduce a new connector configuration option:
"delete.before.snapshot": true
When enabled:
The connector inspects incoming records.
If the source.snapshot field equals "first", it issues a
DELETE FROM <table_name>
statement on the target before applying the incoming snapshot records.
This purge would occur only once per table per snapshot cycle.
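The once-per-table tracking described above could be sketched roughly as follows. This is an illustrative, self-contained sketch only; the class and method names (SnapshotPurgeTracker, shouldPurge) are hypothetical and not part of the connector's codebase, and the record's Debezium source block is represented as a plain Map:

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of the proposed once-per-table purge decision.
public class SnapshotPurgeTracker {
    // Tables already purged during the current snapshot cycle.
    private final Set<String> purgedTables = new HashSet<>();

    // Returns true exactly once per table per snapshot cycle: the first
    // time a record whose source.snapshot field equals "first" arrives
    // for that table. The caller would then issue DELETE FROM <table>.
    public boolean shouldPurge(String table, Map<String, Object> source) {
        Object snapshot = source.get("snapshot");
        if ("first".equals(snapshot)) {
            // Set.add returns true only on the first sighting.
            return purgedTables.add(table);
        }
        // A non-snapshot (streaming) record ends the cycle for that
        // table, so a later snapshot may purge it again.
        purgedTables.remove(table);
        return false;
    }

    public static void main(String[] args) {
        SnapshotPurgeTracker tracker = new SnapshotPurgeTracker();
        Map<String, Object> first = new HashMap<>();
        first.put("snapshot", "first");
        Map<String, Object> streaming = new HashMap<>();
        streaming.put("snapshot", "false");

        System.out.println(tracker.shouldPurge("inventory.orders", first));     // purge now
        System.out.println(tracker.shouldPurge("inventory.orders", first));     // already purged
        System.out.println(tracker.shouldPurge("inventory.orders", streaming)); // streaming record
    }
}
```

In a real implementation this state would live in the sink task alongside the write path, so the DELETE and the subsequent snapshot inserts share a transaction where the target database allows it.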
Use cases
PostgreSQL clusters with asynchronous failover or promotion
Environments where WAL retention is limited and deletes might not reach Debezium
Situations where the sink connector must be able to handle incomplete CDC continuity gracefully
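For context, a sink configuration enabling the proposed option might look like the snippet below. The connector class and the existing delete.enabled, insert.mode, and pk.mode options are real JDBC Sink settings; the connection URL, topic, and connector name are placeholders, and delete.before.snapshot is the new option proposed here:

```json
{
  "name": "jdbc-sink",
  "config": {
    "connector.class": "io.confluent.connect.jdbc.JdbcSinkConnector",
    "connection.url": "jdbc:postgresql://target-db:5432/appdb",
    "topics": "inventory.orders",
    "insert.mode": "upsert",
    "pk.mode": "record_key",
    "delete.enabled": "true",
    "delete.before.snapshot": "true"
  }
}
```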
Willing to Contribute
If this enhancement aligns with the goals of the JDBC Sink Connector project, I would be happy to implement and contribute a pull request for this feature.
Thanks for your consideration!