-
-
Notifications
You must be signed in to change notification settings - Fork 61
feat(postgres): add partitioned table support #297
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
feat(postgres): add partitioned table support #297
Conversation
e65c749
to
230f109
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few nits:
- We generally use only lowercase queries (this is a standard we adopt at Supabase).
- Lets first merge the PR related to PG 14 support and then rebase it on this one to have a clean diff.
// Verify no data is there. | ||
let table_rows = destination.get_table_rows().await; | ||
assert!(table_rows.is_empty()); | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was likely deleted by mistake.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It wasn't deleted by mistake (at the time), I originally had a fix in this branch that fixed a race condition I found. I solved this through a pretty hacky mean, so this test failed unless I removed this statement. #291 contains to the correct fix, so I reverted the change.
if let Some(server_version) = self.server_version | ||
&& server_version.get() >= 150000 | ||
{ | ||
// PostgreSQL 15+ supports FOR ALL TABLES IN SCHEMA syntax |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// PostgreSQL 15+ supports FOR ALL TABLES IN SCHEMA syntax | |
// PostgreSQL 15+ supports FOR TABLES IN SCHEMA syntax |
etl/src/replication/client.rs
Outdated
"select schemaname, tablename from pg_publication_tables where pubname = {};", | ||
"select pt.schemaname, pt.tablename from pg_publication_tables pt | ||
join pg_class c on c.relname = pt.tablename | ||
join pg_namespace n on n.oid = c.relnamespace AND n.nspname = pt.schemaname |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there the need to perform this JOIN
? Seems like we don't really need to validate namespaces.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There wasn't, no. I think this was leftover code that I included attempting to fix the race issue I encountered. I cannot replicate it anymore.
4670481
to
c983ec9
Compare
This MR has mostly been rewritten, as I discovered there wasn't an easy way to capture new partitions being created. I did discover |
Tests are broken again on the partial branch. Can be solved by this patch: uni-intelligence@c0ec81f |
96cc8de
to
6522d79
Compare
67cc277
to
4893b04
Compare
4893b04
to
6d5132b
Compare
Signed-off-by: Abhi Agarwal <abhi@airspace-intelligence.com>
What kind of change does this PR introduce?
Adds support for partitioned tables. It does this through a very convoluted sql query that I basically guess-and-checked until it worked. It basically uses a bunch of heuristics to confirm that if a table is a child table of a partitioned table, it's allowed to treat their PKs as its own.
What is the current behavior?
Fixes #296
What is the new behavior?
Partitioned tables are treated as a single table, and replicate as one unit.
Additional context
Note that this is stacked on top of my other MR because I found this bug after fixing that bug. I will also note that I happened to find a race condition in this PR where a replicate worker tries to push events while a sync worker is waiting for a schema. I solved this by simply blocking all workers until all schemas are fixed, but this is a suboptimal solution, to say the least.