
Convert WASBS to ABFSS experimental workflow #4031

Open
FastLee wants to merge 16 commits into main from fix/support-for-wasbs

Conversation

@FastLee (Contributor) commented May 16, 2025

Added support for converting Azure Blob Storage (WASBS) URLs to Azure Data Lake Storage Gen2 (ABFSS) URLs through a new experimental workflow. The workflow identifies tables whose location uses WASBS and rewrites each URL by changing the protocol and the hostname suffix, leaving the rest of the table metadata intact. This gives users a straightforward migration path to the more performant ABFSS format and the improved features of Azure Data Lake Storage Gen2.
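For illustration, the URL transformation described above could look roughly like the sketch below. It is not the code from this PR: the function name `wasbs_to_abfss`, the constants, and the choice to return `None` for anything that is not a recognizable WASBS location are all assumptions made for the example. The only facts relied on are the two URL shapes, `wasbs://<container>@<account>.blob.core.windows.net/<path>` and `abfss://<container>@<account>.dfs.core.windows.net/<path>`.

```python
from typing import Optional

WASBS_SCHEME = "wasbs://"
ABFSS_SCHEME = "abfss://"
BLOB_SUFFIX = ".blob.core.windows.net"
DFS_SUFFIX = ".dfs.core.windows.net"


def wasbs_to_abfss(location: Optional[str]) -> Optional[str]:
    """Return the ABFSS form of a WASBS location, or None if it cannot be converted.

    Hypothetical helper: the name and the None-on-mismatch behavior are assumptions.
    """
    # A single guard covers both the empty case and the non-WASBS case.
    if not location or not location.startswith(WASBS_SCHEME):
        return None
    # Split "<container>@<account>.blob.core.windows.net" from the object path.
    authority, slash, path = location[len(WASBS_SCHEME):].partition("/")
    if not authority.endswith(BLOB_SUFFIX):
        return None
    # Swap the hostname suffix, then swap the protocol; the path is left untouched.
    new_authority = authority[: -len(BLOB_SUFFIX)] + DFS_SUFFIX
    return f"{ABFSS_SCHEME}{new_authority}{slash}{path}"
```

With this sketch, `wasbs_to_abfss("wasbs://data@myaccount.blob.core.windows.net/tables/t1")` yields `abfss://data@myaccount.dfs.core.windows.net/tables/t1`, while a location that already uses `abfss://` yields `None`.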

@FastLee requested a review from a team as a code owner May 16, 2025 14:30

github-actions bot commented May 16, 2025

❌ 59 failed, 1 skipped, 5h44m42s total

❌ test_all_grants_for_other_objects: TimeoutError: Timed out after 0:05:00 (5m1.696s)
TimeoutError: Timed out after 0:05:00
[gw3] linux -- Python 3.10.18 /home/runner/work/ucx/ucx/.venv/bin/python
[gw3] linux -- Python 3.10.18 /home/runner/work/ucx/ucx/.venv/bin/python
❌ test_migrate_managed_tables: TimeoutError: Timed out after 0:05:00 (5m2.199s)
TimeoutError: Timed out after 0:05:00
[gw4] linux -- Python 3.10.18 /home/runner/work/ucx/ucx/.venv/bin/python
[gw4] linux -- Python 3.10.18 /home/runner/work/ucx/ucx/.venv/bin/python
❌ test_save_external_location_mapping_missing_location: TimeoutError: Timed out after 0:05:00 (5m2.517s)
TimeoutError: Timed out after 0:05:00
[gw8] linux -- Python 3.10.18 /home/runner/work/ucx/ucx/.venv/bin/python
❌ test_all_grant_types: TimeoutError: Timed out after 0:05:00 (5m2.521s)
TimeoutError: Timed out after 0:05:00
[gw5] linux -- Python 3.10.18 /home/runner/work/ucx/ucx/.venv/bin/python
❌ test_migrate_tables_with_cache_should_not_create_table: TimeoutError: Timed out after 0:05:00 (5m3.286s)
TimeoutError: Timed out after 0:05:00
[gw6] linux -- Python 3.10.18 /home/runner/work/ucx/ucx/.venv/bin/python
[gw6] linux -- Python 3.10.18 /home/runner/work/ucx/ucx/.venv/bin/python
❌ test_create_catalog_schema_with_legacy_hive_metastore_privileges: TimeoutError: Timed out after 0:05:00 (5m3.769s)
TimeoutError: Timed out after 0:05:00
[gw7] linux -- Python 3.10.18 /home/runner/work/ucx/ucx/.venv/bin/python
[gw7] linux -- Python 3.10.18 /home/runner/work/ucx/ucx/.venv/bin/python
❌ test_create_catalog_schema_with_principal_acl_CLOUD_ENV: TimeoutError: Timed out after 0:05:00 (5m4.218s)
TimeoutError: Timed out after 0:05:00
[gw2] linux -- Python 3.10.18 /home/runner/work/ucx/ucx/.venv/bin/python
❌ test_all_grants_in_databases: TimeoutError: Timed out after 0:03:00 (5m4.771s)
TimeoutError: Timed out after 0:03:00
[gw9] linux -- Python 3.10.18 /home/runner/work/ucx/ucx/.venv/bin/python
[gw9] linux -- Python 3.10.18 /home/runner/work/ucx/ucx/.venv/bin/python
❌ test_create_ucx_catalog_creates_catalog: TimeoutError: Timed out after 0:05:00 (5m5.272s)
TimeoutError: Timed out after 0:05:00
[gw0] linux -- Python 3.10.18 /home/runner/work/ucx/ucx/.venv/bin/python
[gw0] linux -- Python 3.10.18 /home/runner/work/ucx/ucx/.venv/bin/python
❌ test_grant_findings: TimeoutError: Timed out after 0:05:00 (5m1.662s)
TimeoutError: Timed out after 0:05:00
[gw5] linux -- Python 3.10.18 /home/runner/work/ucx/ucx/.venv/bin/python
❌ test_migrate_dbfs_non_delta_tables: TimeoutError: Timed out after 0:05:00 (5m2.547s)
TimeoutError: Timed out after 0:05:00
[gw4] linux -- Python 3.10.18 /home/runner/work/ucx/ucx/.venv/bin/python
[gw4] linux -- Python 3.10.18 /home/runner/work/ucx/ucx/.venv/bin/python
❌ test_migrate_external_table: TimeoutError: Timed out after 0:05:00 (5m1.75s)
TimeoutError: Timed out after 0:05:00
[gw6] linux -- Python 3.10.18 /home/runner/work/ucx/ucx/.venv/bin/python
❌ test_migrate_external_table_failed_sync: TimeoutError: Timed out after 0:05:00 (5m2.319s)
TimeoutError: Timed out after 0:05:00
[gw8] linux -- Python 3.10.18 /home/runner/work/ucx/ucx/.venv/bin/python
[gw8] linux -- Python 3.10.18 /home/runner/work/ucx/ucx/.venv/bin/python
❌ test_migration_job_ext_hms: TimeoutError: Timed out after 0:05:00 (10m5.739s)
TimeoutError: Timed out after 0:05:00
TimeoutError: Timed out after 0:05:00
❌ test_all_grants_for_udfs_in_databases: TimeoutError: Timed out after 0:05:00 (5m2.086s)
TimeoutError: Timed out after 0:05:00
[gw9] linux -- Python 3.10.18 /home/runner/work/ucx/ucx/.venv/bin/python
[gw9] linux -- Python 3.10.18 /home/runner/work/ucx/ucx/.venv/bin/python
❌ test_create_all_catalogs_schemas: TimeoutError: Timed out after 0:05:00 (5m2.134s)
TimeoutError: Timed out after 0:05:00
[gw0] linux -- Python 3.10.18 /home/runner/work/ucx/ucx/.venv/bin/python
[gw0] linux -- Python 3.10.18 /home/runner/work/ucx/ucx/.venv/bin/python
❌ test_create_catalog_schema_when_users_group_in_warehouse_acl: TimeoutError: Timed out after 0:05:00 (5m3.785s)
TimeoutError: Timed out after 0:05:00
[gw7] linux -- Python 3.10.18 /home/runner/work/ucx/ucx/.venv/bin/python
[gw7] linux -- Python 3.10.18 /home/runner/work/ucx/ucx/.venv/bin/python
❌ test_grant_ownership: TimeoutError: Timed out after 0:05:00 (5m11.228s)
TimeoutError: Timed out after 0:05:00
[gw3] linux -- Python 3.10.18 /home/runner/work/ucx/ucx/.venv/bin/python
❌ test_create_catalog_schema_with_principal_acl_aws: TimeoutError: Timed out after 0:05:00 (5m10.553s)
TimeoutError: Timed out after 0:05:00
[gw2] linux -- Python 3.10.18 /home/runner/work/ucx/ucx/.venv/bin/python
❌ test_migrate_managed_table_to_external_table_with_clone: TimeoutError: Timed out after 0:05:00 (5m1.817s)
TimeoutError: Timed out after 0:05:00
[gw4] linux -- Python 3.10.18 /home/runner/work/ucx/ucx/.venv/bin/python
❌ test_migrate_external_table_hiveserde_ctas: TimeoutError: Timed out after 0:05:00 (5m3.084s)
TimeoutError: Timed out after 0:05:00
[gw5] linux -- Python 3.10.18 /home/runner/work/ucx/ucx/.venv/bin/python
[gw5] linux -- Python 3.10.18 /home/runner/work/ucx/ucx/.venv/bin/python
❌ test_external_locations: TimeoutError: Timed out after 0:05:00 (5m2.416s)
TimeoutError: Timed out after 0:05:00
[gw1] linux -- Python 3.10.18 /home/runner/work/ucx/ucx/.venv/bin/python
❌ test_mapping_skips_tables_databases: TimeoutError: Timed out after 0:05:00 (5m2.506s)
TimeoutError: Timed out after 0:05:00
[gw9] linux -- Python 3.10.18 /home/runner/work/ucx/ucx/.venv/bin/python
[gw9] linux -- Python 3.10.18 /home/runner/work/ucx/ucx/.venv/bin/python
❌ test_migrate_view: TimeoutError: Timed out after 0:05:00 (5m4.098s)
TimeoutError: Timed out after 0:05:00
[gw6] linux -- Python 3.10.18 /home/runner/work/ucx/ucx/.venv/bin/python
[gw6] linux -- Python 3.10.18 /home/runner/work/ucx/ucx/.venv/bin/python
❌ test_migrate_view_alias_test: TimeoutError: Timed out after 0:05:00 (5m2.046s)
TimeoutError: Timed out after 0:05:00
[gw7] linux -- Python 3.10.18 /home/runner/work/ucx/ucx/.venv/bin/python
[gw7] linux -- Python 3.10.18 /home/runner/work/ucx/ucx/.venv/bin/python
❌ test_migrate_external_table_hiveserde_in_place: TimeoutError: Timed out after 0:05:00 (5m9.629s)
TimeoutError: Timed out after 0:05:00
[gw8] linux -- Python 3.10.18 /home/runner/work/ucx/ucx/.venv/bin/python
[gw8] linux -- Python 3.10.18 /home/runner/work/ucx/ucx/.venv/bin/python
❌ test_revert_migrated_table: TimeoutError: Timed out after 0:05:00 (5m1.344s)
TimeoutError: Timed out after 0:05:00
[gw2] linux -- Python 3.10.18 /home/runner/work/ucx/ucx/.venv/bin/python
[gw2] linux -- Python 3.10.18 /home/runner/work/ucx/ucx/.venv/bin/python
❌ test_mapping_reverts_table: TimeoutError: Timed out after 0:05:00 (5m9.061s)
TimeoutError: Timed out after 0:05:00
[gw0] linux -- Python 3.10.18 /home/runner/work/ucx/ucx/.venv/bin/python
[gw0] linux -- Python 3.10.18 /home/runner/work/ucx/ucx/.venv/bin/python
❌ test_migrate_managed_table_to_external_table_without_conversion: TimeoutError: Timed out after 0:05:00 (5m11.019s)
TimeoutError: Timed out after 0:05:00
[gw3] linux -- Python 3.10.18 /home/runner/work/ucx/ucx/.venv/bin/python
❌ test_migrate_external_tables_with_principal_acl_CLOUD_ENV: TimeoutError: Timed out after 0:05:00 (5m2.785s)
TimeoutError: Timed out after 0:05:00
[gw4] linux -- Python 3.10.18 /home/runner/work/ucx/ucx/.venv/bin/python
❌ test_migrate_external_tables_with_principal_acl_aws: TimeoutError: Timed out after 0:05:00 (5m1.597s)
TimeoutError: Timed out after 0:05:00
[gw6] linux -- Python 3.10.18 /home/runner/work/ucx/ucx/.venv/bin/python
❌ test_migrate_managed_tables_with_acl: TimeoutError: Timed out after 0:05:00 (5m4.592s)
TimeoutError: Timed out after 0:05:00
[gw5] linux -- Python 3.10.18 /home/runner/work/ucx/ucx/.venv/bin/python
[gw5] linux -- Python 3.10.18 /home/runner/work/ucx/ucx/.venv/bin/python
❌ test_migrate_external_tables_with_spn_CLOUD_ENV: TimeoutError: Timed out after 0:05:00 (5m3.354s)
TimeoutError: Timed out after 0:05:00
[gw9] linux -- Python 3.10.18 /home/runner/work/ucx/ucx/.venv/bin/python
❌ test_migrate_external_tables_with_principal_acl_aws_warehouse: TimeoutError: Timed out after 0:05:00 (5m3.125s)
TimeoutError: Timed out after 0:05:00
[gw8] linux -- Python 3.10.18 /home/runner/work/ucx/ucx/.venv/bin/python
❌ test_migrate_table_in_mount: TimeoutError: Timed out after 0:05:00 (5m9.734s)
TimeoutError: Timed out after 0:05:00
[gw1] linux -- Python 3.10.18 /home/runner/work/ucx/ucx/.venv/bin/python
❌ test_move_tables_no_from_schema: TimeoutError: Timed out after 0:05:00 (5m2.493s)
TimeoutError: Timed out after 0:05:00
[gw2] linux -- Python 3.10.18 /home/runner/work/ucx/ucx/.venv/bin/python
[gw2] linux -- Python 3.10.18 /home/runner/work/ucx/ucx/.venv/bin/python
❌ test_pipeline_migrate: TimeoutError: Timed out after 0:05:00 (5m10.647s)
TimeoutError: Timed out after 0:05:00
[gw7] linux -- Python 3.10.18 /home/runner/work/ucx/ucx/.venv/bin/python
❌ test_migration_index_deleted_source: TimeoutError: Timed out after 0:05:00 (5m4.347s)
TimeoutError: Timed out after 0:05:00
[gw0] linux -- Python 3.10.18 /home/runner/work/ucx/ucx/.venv/bin/python
[gw0] linux -- Python 3.10.18 /home/runner/work/ucx/ucx/.venv/bin/python
❌ test_table_migration_ownership: TimeoutError: Timed out after 0:05:00 (5m5.224s)
TimeoutError: Timed out after 0:05:00
[gw3] linux -- Python 3.10.18 /home/runner/work/ucx/ucx/.venv/bin/python
❌ test_move_tables_table_properties_mismatch_preserves_original: TimeoutError: Timed out after 0:05:00 (5m1.887s)
TimeoutError: Timed out after 0:05:00
[gw6] linux -- Python 3.10.18 /home/runner/work/ucx/ucx/.venv/bin/python
[gw6] linux -- Python 3.10.18 /home/runner/work/ucx/ucx/.venv/bin/python
❌ test_move_tables_no_to_schema: TimeoutError: Timed out after 0:05:00 (5m3.135s)
TimeoutError: Timed out after 0:05:00
[gw5] linux -- Python 3.10.18 /home/runner/work/ucx/ucx/.venv/bin/python
[gw5] linux -- Python 3.10.18 /home/runner/work/ucx/ucx/.venv/bin/python
❌ test_move_tables: TimeoutError: Timed out after 0:05:00 (5m7.256s)
TimeoutError: Timed out after 0:05:00
[gw4] linux -- Python 3.10.18 /home/runner/work/ucx/ucx/.venv/bin/python
[gw4] linux -- Python 3.10.18 /home/runner/work/ucx/ucx/.venv/bin/python
❌ test_partitioned_tables: TimeoutError: Timed out after 0:05:00 (5m1.3s)
TimeoutError: Timed out after 0:05:00
[gw8] linux -- Python 3.10.18 /home/runner/work/ucx/ucx/.venv/bin/python
[gw8] linux -- Python 3.10.18 /home/runner/work/ucx/ucx/.venv/bin/python
❌ test_move_views: TimeoutError: Timed out after 0:05:00 (5m1.127s)
TimeoutError: Timed out after 0:05:00
[gw1] linux -- Python 3.10.18 /home/runner/work/ucx/ucx/.venv/bin/python
[gw1] linux -- Python 3.10.18 /home/runner/work/ucx/ucx/.venv/bin/python
❌ test_alias_tables: TimeoutError: Timed out after 0:05:00 (5m9.659s)
TimeoutError: Timed out after 0:05:00
[gw9] linux -- Python 3.10.18 /home/runner/work/ucx/ucx/.venv/bin/python
[gw9] linux -- Python 3.10.18 /home/runner/work/ucx/ucx/.venv/bin/python
❌ test_table_ownership: TimeoutError: Timed out after 0:05:00 (5m4.124s)
TimeoutError: Timed out after 0:05:00
[gw2] linux -- Python 3.10.18 /home/runner/work/ucx/ucx/.venv/bin/python
❌ test_describe_all_udfs_in_databases: TimeoutError: Timed out after 0:05:00 (5m1.699s)
TimeoutError: Timed out after 0:05:00
[gw0] linux -- Python 3.10.18 /home/runner/work/ucx/ucx/.venv/bin/python
❌ test_udf_ownership: TimeoutError: Timed out after 0:05:00 (5m0.87s)
TimeoutError: Timed out after 0:05:00
[gw3] linux -- Python 3.10.18 /home/runner/work/ucx/ucx/.venv/bin/python
❌ test_describe_all_tables_in_databases: TimeoutError: Timed out after 0:05:00 (5m10.039s)
TimeoutError: Timed out after 0:05:00
[gw7] linux -- Python 3.10.18 /home/runner/work/ucx/ucx/.venv/bin/python
❌ test_make_ucx_group: TimeoutError: Timed out after 0:05:00 (5m1.425s)
TimeoutError: Timed out after 0:05:00
[gw0] linux -- Python 3.10.18 /home/runner/work/ucx/ucx/.venv/bin/python
[gw0] linux -- Python 3.10.18 /home/runner/work/ucx/ucx/.venv/bin/python
❌ test_make_ucx_group_with_names: TimeoutError: Timed out after 0:05:00 (5m11.701s)
TimeoutError: Timed out after 0:05:00
[gw3] linux -- Python 3.10.18 /home/runner/work/ucx/ucx/.venv/bin/python
[gw3] linux -- Python 3.10.18 /home/runner/work/ucx/ucx/.venv/bin/python
❌ test_table_migration_job_refreshes_migration_status[regular-migrate-tables]: TimeoutError: Timed out after 0:05:00 (10m2.875s)
TimeoutError: Timed out after 0:05:00
TimeoutError: Timed out after 0:05:00
❌ test_table_migration_job_refreshes_migration_status[hiveserde-migrate-external-tables-ctas]: TimeoutError: Timed out after 0:05:00 (10m6.113s)
TimeoutError: Timed out after 0:05:00
TimeoutError: Timed out after 0:05:00
❌ test_hiveserde_table_in_place_migration_job[migrate-external-tables-ctas]: TimeoutError: Timed out after 0:05:00 (10m3.904s)
TimeoutError: Timed out after 0:05:00
TimeoutError: Timed out after 0:05:00
❌ test_table_migration_job_refreshes_migration_status[hiveserde-migrate-external-hiveserde-tables-in-place-experimental]: TimeoutError: Timed out after 0:05:00 (10m12.351s)
TimeoutError: Timed out after 0:05:00
TimeoutError: Timed out after 0:05:00
❌ test_hiveserde_table_in_place_migration_job[migrate-external-hiveserde-tables-in-place-experimental]: TimeoutError: Timed out after 0:05:00 (10m7.375s)
TimeoutError: Timed out after 0:05:00
TimeoutError: Timed out after 0:05:00
❌ test_table_migration_convert_manged_to_external: TimeoutError: Timed out after 0:05:00 (10m6.169s)
TimeoutError: Timed out after 0:05:00
TimeoutError: Timed out after 0:05:00
❌ test_hiveserde_table_ctas_migration_job: TimeoutError: Timed out after 0:05:00 (10m12.902s)
TimeoutError: Timed out after 0:05:00
TimeoutError: Timed out after 0:05:00
❌ test_table_migration_job_publishes_remaining_tables: TimeoutError: Timed out after 0:05:00 (10m7.093s)
TimeoutError: Timed out after 0:05:00
TimeoutError: Timed out after 0:05:00

Running from acceptance #8722

@FastLee force-pushed the fix/support-for-wasbs branch from d63a95e to 1772d84 on May 16, 2025 19:14
@FastLee force-pushed the fix/support-for-wasbs branch from 1772d84 to de8cb1c on July 21, 2025 15:02
"""
if not location:
    return None
if not location.startswith("wasbs://"):
Contributor


This `if` and the next one both check for `wasbs`. They can be consolidated into a single check.

        *([entity_storage_locations] if entity_storage_locations is not None else []),
    )
    self._catalog.alterTable(new_table)
except Exception as e:  # pylint: disable=broad-exception-caught
Contributor


Unless the broad catch is intentional, we should narrow it to the exceptions that can actually occur here, such as Py4JJavaError and ValueError.
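As a rough sketch of what the narrowed handler might look like (not the PR's code): the helper `try_alter` and the class `FakeJavaError` are hypothetical. `FakeJavaError` only stands in for `py4j.protocol.Py4JJavaError` so that the example runs without py4j installed; real code would import and catch the py4j class instead.

```python
import logging
from typing import Callable

logger = logging.getLogger(__name__)


class FakeJavaError(Exception):
    """Hypothetical stand-in for py4j.protocol.Py4JJavaError (assumption for this sketch)."""


def try_alter(alter_table: Callable[[], None], table_name: str) -> bool:
    """Run alter_table() and report whether it succeeded.

    Only the failures we expect are swallowed; anything else propagates.
    """
    try:
        alter_table()
    except (FakeJavaError, ValueError) as e:
        # Expected failure modes: log and move on to the next table.
        logger.warning("Failed to update location of %s: %s", table_name, e)
        return False
    return True
```

The trade-off is that an unexpected error (for example a `KeyError` from a programming mistake) now surfaces immediately rather than being logged as a failed table update, which also removes the need for the pylint suppression.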


2 participants