
Commit 1a60b45
fix: Use tuples for pkey check (#446)
https://github.com/MeltanoLabs/target-postgres/blob/d07b41583e8ff77ee770a0d40779ea9485772461/target_postgres/sinks.py#L161-L169

The code linked above uses string concatenation to check for duplicate primary key values. This is problematic because the two records below concatenate to the same string ("ABC"), so they are treated as the same record and the first one is silently dropped:

Record 1:
- Primary Key 1: AB
- Primary Key 2: C

Record 2:
- Primary Key 1: A
- Primary Key 2: BC

Keying on a tuple of the primary key values instead preserves the field boundaries and mitigates this.

Co-authored-by: Edgar Ramírez Mondragón <16805946+edgarrmondragon@users.noreply.github.com>
1 parent bbc2a63 commit 1a60b45
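
A minimal, self-contained sketch of the collision described in the commit message. The field names pk1/pk2 and the record values are hypothetical, but the two key-building expressions mirror the before and after lines in the diff below:

# Hypothetical records illustrating the duplicate-key collision.
record_1 = {"pk1": "AB", "pk2": "C"}
record_2 = {"pk1": "A", "pk2": "BC"}
primary_keys = ["pk1", "pk2"]

# Old approach: string concatenation collapses distinct key combinations.
old_key_1 = "".join(str(record_1[key]) for key in primary_keys)
old_key_2 = "".join(str(record_2[key]) for key in primary_keys)
assert old_key_1 == old_key_2 == "ABC"  # collision: record_1 would be dropped

# New approach: a tuple keeps each key property as a separate element.
new_key_1 = tuple(record_1[key] for key in primary_keys)
new_key_2 = tuple(record_2[key] for key in primary_keys)
assert new_key_1 != new_key_2  # ("AB", "C") != ("A", "BC")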

1 file changed: +3 -3 lines changed

target_postgres/sinks.py

Lines changed: 3 additions & 3 deletions
@@ -157,15 +157,15 @@ def bulk_insert_records(  # type: ignore[override]
         data_to_insert: list[dict[str, t.Any]] = []

         if self.append_only is False:
-            insert_records: dict[str, dict] = {}  # pk : record
+            insert_records: dict[tuple, dict] = {}  # pk tuple: record
             for record in records:
                 insert_record = {
                     column.name: record.get(column.name) for column in columns
                 }
                 # No need to check for a KeyError here because the SDK already
                 # guarantees that all key properties exist in the record.
-                primary_key_value = "".join([str(record[key]) for key in primary_keys])
-                insert_records[primary_key_value] = insert_record
+                primary_key_tuple = tuple(record[key] for key in primary_keys)
+                insert_records[primary_key_tuple] = insert_record
             data_to_insert = list(insert_records.values())
         else:
             for record in records:
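
For illustration, a runnable sketch of the deduplication loop after the change, using hypothetical records and primary_keys (not code from the repository). The dict keyed by the primary-key tuple keeps last-write-wins semantics per composite key while no longer conflating distinct keys:

records = [
    {"pk1": "AB", "pk2": "C", "value": 1},
    {"pk1": "A", "pk2": "BC", "value": 2},
    {"pk1": "A", "pk2": "BC", "value": 3},  # true duplicate: overwrites value 2
]
primary_keys = ["pk1", "pk2"]

insert_records: dict[tuple, dict] = {}  # pk tuple: record
for record in records:
    primary_key_tuple = tuple(record[key] for key in primary_keys)
    insert_records[primary_key_tuple] = record

data_to_insert = list(insert_records.values())
assert len(data_to_insert) == 2  # both distinct composite keys survive
assert insert_records[("A", "BC")]["value"] == 3  # last occurrence wins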

0 commit comments