Skip to content

Conversation

morganda
Copy link

Summary

This attempts to fix #19

Currently, if you try to fetch the columns of the table, it will select * from collection limit 1 which assumes a fixed schema.
This PR adds a version of this implementation, but instead uses describe collection option(max_field_depth=1) where max_field_depth is configurable. There are two minor caveats

  • Rhe describe method, does not return the _meta field whereas the select method would.
  • Rollup collections do not currently support describe so if a RocksetException is thrown, it will fall back to the select method. The format of the response is the same so there shouldn't be too many surprises.

Misc

  • Pulled some executions out to separate functions to make things easier to unittest

Tests

  • Added unittests for RocksetDialect.get_columns

@morganda morganda requested review from pmenglund, sbaldwin-rs and a team May 29, 2024 22:57
@sbaldwin-rs
Copy link

Will review this shortly

"sqlalchemy.dialects": [
"rockset_sqlalchemy = rockset_sqlalchemy.sqlalchemy:RocksetDialect",
"rockset = rockset_sqlalchemy.sqlalchemy:RocksetDialect"
"rockset = rockset_sqlalchemy.sqlalchemy:RocksetDialect",

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The version needs to be bumped, right?

return columns

def _validate_query(self, connection: Engine, query: str):
import rockset.models

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any reason not to have this import at the top of the file?

if not isinstance(max_field_depth, int):
raise ValueError("Query option max_field_depth, must be of type 'int'")

q = f"DESCRIBE {table_name} OPTION(max_field_depth={max_field_depth})"

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The schema needs to be in the DESCRIBE query, no?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Getting column definitions assumes fixed schema for all rows
2 participants