Fixed Assessment Exporter Notebook #3829

jgarciaf106 · 2025-03-10T18:24:51Z

Changes

Adjusted the path of the Lakeview Assessment Main dashboard in the EXPORT_ASSESSMENT_TO_EXCEL notebook to the new naming format. The notebook now looks up the dashboard name dynamically instead of relying on a hardcoded value.
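For readers without the diff at hand, the idea of resolving the dashboard by name rather than hardcoding its path can be sketched roughly as follows. This is only an illustration, not the notebook's actual code: the function name `find_dashboard_path`, the install folder, the `dashboards` subfolder, the sample dashboard names, and the `.lvdash.json` suffix are all assumptions made for the example.

```python
from pathlib import PurePosixPath


def find_dashboard_path(install_folder, dashboard_names, keywords=("assessment", "main")):
    """Pick the single dashboard whose name contains every keyword.

    All names here are hypothetical; the real notebook may resolve the
    dashboard differently (for example through the workspace API).
    """
    matches = [
        name
        for name in dashboard_names
        if all(keyword in name.lower() for keyword in keywords)
    ]
    if len(matches) != 1:
        # Fail loudly instead of silently falling back to a hardcoded name.
        raise LookupError(f"expected exactly one matching dashboard, found {matches}")
    # Build the path as a Path object rather than by string concatenation.
    return PurePosixPath(install_folder) / "dashboards" / f"{matches[0]}.lvdash.json"


# Hypothetical listing of dashboards found in the installation folder.
names = ["[UCX] Assessment (Main)", "[UCX] Assessment (Estimates)", "[UCX] Migration (Main)"]
print(find_dashboard_path("/Workspace/Applications/ucx", names))
```

Matching on keywords keeps the lookup working if a prefix or suffix in the naming format changes again, at the cost of failing when more than one dashboard matches.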

Tests

manually tested

Manual.Test.mov

FastLee

A few nits.

src/databricks/labs/ucx/installer/workflows.py

FastLee

LGTM

FastLee

LGTM

replaces #3829 --------- Co-authored-by: Andres Garcia <andres.garcia+data@databricks.com>

* Added ability to create account groups from nested ws-local groups ([#3818](#3818)). The `create_account_level_groups` method has been added, enabling the creation of account level groups from workspace groups. This method retrieves valid workspace groups and recursively creates account level groups for each group, handling nested groups by checking if they already exist and creating them if necessary. The `AccountGroupCreationContext` dataclass is used to keep track of created, preexisting, and renamed groups. A new test function, `test_create_account_level_groups_nested_groups`, has been added to the `test_account.py` file to test the creation of account level groups from nested workspace-local groups. This function checks if the account level groups are created correctly, with the same members and membership as the corresponding workspace-local groups. The `ComplexValue` class has been modified to include the `ref` field, which references user objects, enabling the creation of account groups with members identified by their workspace-local user IDs. Integration tests have been added to verify the functionality of these changes.
* Added error handling and tests for Workflow linter during pipeline fetch ([#3819](#3819)). The recent change to the open-source library introduces error handling and tests for the Workflow linter during pipeline fetch. The `_register_pipeline_task` method in the `jobs.py` file has been updated to handle cases where the pipeline does not exist, by yielding a `DependencyProblem` instance with an appropriate error message. A new private method, `_register_pipeline_library`, has been introduced to handle the registration of libraries present in the pipeline. Additionally, new unit tests and integration tests have been added to ensure that the Workflow linter properly handles cases where pipelines do not exist, and manual testing has been conducted to verify the feature. Overall, these changes improve the robustness and reliability of the Workflow linter by adding error handling and testing for edge cases during pipeline fetch.
* Added hyperlinks to tables and order the rows by type, name ([#3951](#3951)). In this release, the `Table Types` widget has been updated to enhance the user experience. The table names in the widget are now clickable and serve as hyperlinks that redirect users to a specified URL with the table name as the link text and title. The rows in the widget are also reorganized by type and then by name, making it easier for users to locate the required table. Additionally, a new set of encodings has been added for the widget that specifies how fields should be displayed, including a `link` display type for the `name` field to indicate that it should be displayed as a hyperlink. These changes were implemented in response to issue [#3259](#3259). A manually tested flag has been included in the commit, indicating that the changes have been tested, but unit and integration tests have not been added. A screenshot of the changes is also included in the commit.
* Added links to compute summary widget ([#3952](#3952)). In this release, we have added links to the compute summary widget to enhance navigation and usability. The `encodings` spec in the `spec` object now includes overrides for a SQL file, which adds links to the `cluster_id` and `cluster_name` fields, opening them in a new tab with the respective cluster's details. Additionally, the `finding` and `creator` fields are now displayed as strings. These changes improve the user experience by providing direct access to cluster details from the compute summary widget. The associated issue [#3260](#3260) has been resolved. Manual testing has confirmed that the changes work as expected.
* Adds option to install UCX in offline mode ([#3959](#3959)). A new capability has been introduced to install the UCX library in offline mode, enabling software engineers to install UCX in environments with restricted Internet access. This offline installation process can be accomplished by installing UCX on a host with Internet access, zipping the installation, transferring the zip to the target host, and unzipping it. To ensure a successful installation, the Databricks CLI version must be v0.244.0 or higher. Additionally, this commit includes updated documentation detailing the offline installation process. This feature addresses issue [#3418](#3418), making it easier for software engineers to install UCX in offline environments.
* Fixed Assessment Excel Exporter ([#3962](#3962)). The open-source library has been updated with several new features to enhance its functionality. Firstly, we have implemented a new sorting algorithm that offers improved performance and flexibility for sorting large datasets. This algorithm includes customizable options for handling ties and can be easily integrated into existing codebases. Additionally, we have added support for asynchronous processing, allowing developers to execute time-consuming tasks in the background while maintaining application responsiveness. This feature includes a new API for managing asynchronous tasks and improved error handling for better reliability. Lastly, we have introduced a new configuration system that simplifies the process of setting up and customizing the library. This system includes a default configuration that covers most use cases and allows for easy overriding of specific settings. These new features are designed to provide developers with more powerful and flexible tools for working with the open-source library.
* Fixed Assessment Exporter Notebook ([#3829](#3829)). In this commit, the Assessment Exporter Notebook has been updated to improve code maintainability and robustness. The main change is the adjustment of the Lakeview dashboard Assessment Main dashboard path to the new naming format, which is now determined dynamically to avoid hardcoded values. The path format has also been changed from string to Path object format. Additionally, a new method `_process_id_columns` has been added to process ID columns in the dataset, checking for any column with `id` in the name and wrapping them in quotes. These changes have been manually tested and improve the accuracy of the exported Excel file and the maintainability of the code, ensuring that the Assessment Main dashboard path is correct and up-to-date and the data is accurately represented in the exported file.
* TECH DEBT Use right workspace api call for listing credentials ([#3957](#3957)). In this release, we have implemented a change in the `list` method of the `credentials.py` file located in the `databricks/labs/ucx/aws` directory, addressing issue [#3571](#3571). The `list` method now utilizes the `list_credentials` method from the `_ws.credentials` object instead of the `api_client` for listing AWS credentials. This modification replaces the previous TODO comment with actual code, thereby improving code quality and reducing technical debt. The `list_credentials` method is a part of the Databricks workspace API, offering a more accurate and efficient approach to list AWS credentials, resulting in enhanced reliability and performance for the code responsible for managing AWS credentials.
* [TECHDEBT] Remove unused code for `_resolve_dbfs_root` in MountCrawler ([#3958](#3958)). In this release, we have made improvements to the MountCrawler class by removing the unused code for the `_resolve_dbfs_root` method and its dependencies. This method was previously used to resolve the root location of a DBFS, but it has been deprecated in favor of a new API call. The removal of this unnecessary functionality simplifies the codebase and aligns it with our goal of creating a more streamlined and efficient system. Additionally, this release includes a fix for issue [#3452](#3452). Rest assured that these changes will not affect the current functionality or behavior of the system and are intended to enhance the overall performance and maintainability of the codebase.
* [Tech Debt] removing notfound if not required in test_install.py ([#3826](#3826)). In this release, we've made improvements to our test suite by removing the redundant `notfound` function in `test_install.py`, specifically from `test_create_database`, `test_open_config`, and `test_save_config_ext_hms`. The `notfound` function previously raised a `NotFound` error, which has now been replaced with a more specific error message or behavior. This enhancement simplifies the codebase, reduces technical debt, and addresses issue [#2700](#2700). Note that no new unit tests were added, but existing tests were updated to account for the removal of `notfound`.
* [Tech Debt] standardising the error message for required parameter in cli command ([#3827](#3827)). This release introduces changes to standardize error messages for required parameters in the `databricks labs ucx` CLI command, addressing tech debt and improving the user experience. Instead of raising a KeyError, the command now returns clear and consistent error messages when required parameters are missing. Specifically, the `repair_run` function handles the case when the `--step` parameter is not provided, and the `move` and `alias` functions handle missing `--from_catalog`, `--to_catalog`, `--from_schema`, `--to_schema`, and `--from_table` parameters. Unit tests have been added to ensure the proper error messages are displayed when required parameters are missing, addressing issue [#2740](#2740).
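The ID-column handling mentioned in the #3829 entry (any column with `id` in its name gets its values wrapped in quotes) might look something like the sketch below. It is one plausible reading of that description, not the notebook's code: the function name `process_id_columns`, the list-of-dicts row format, and the sample data are assumptions, and the real method may operate on a DataFrame or a SQL query instead.

```python
from typing import Any


def process_id_columns(rows: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Return a copy of rows with values of id-like columns wrapped in quotes.

    Hypothetical helper: a column counts as id-like when "id" appears
    anywhere in its name, mirroring the release-note wording.
    """
    processed = []
    for row in rows:
        new_row = {}
        for column, value in row.items():
            if "id" in column.lower() and value is not None:
                # Quote the value so it is kept as text in the export.
                new_row[column] = f'"{value}"'
            else:
                new_row[column] = value
        processed.append(new_row)
    return processed


# Hypothetical sample row with a long numeric identifier.
sample = [{"cluster_id": 1234567890123456789, "cluster_name": "etl"}]
print(process_id_columns(sample))
```

Note that a plain substring check also matches names such as `width` or `valid`; a stricter rule (for example, names ending in `_id`) would avoid that, but the release note only describes the substring form.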

jgarciaf106 requested a review from a team as a code owner March 10, 2025 18:24

jgarciaf106 had a problem deploying to account-admin March 13, 2025 19:47 — with GitHub Actions Failure

FastLee requested changes Mar 17, 2025


src/databricks/labs/ucx/installer/workflows.py (outdated)

src/databricks/labs/ucx/installer/workflows.py (outdated)

src/databricks/labs/ucx/installer/workflows.py (outdated)

jgarciaf106 requested a review from FastLee March 17, 2025 22:42

FastLee approved these changes Mar 24, 2025


FastLee self-requested a review March 24, 2025 18:00

FastLee approved these changes Mar 24, 2025


jgarciaf106 force-pushed the fix/excel-exporter branch from 7b2a592 to 8397e67 Compare March 24, 2025 20:11

FastLee enabled auto-merge March 24, 2025 20:16

jgarciaf106 had a problem deploying to account-admin March 24, 2025 20:17 — with GitHub Actions Failure

Andres Garcia added 5 commits March 24, 2025 14:51

Fix EXPORT_ASSESSMENT_TO_EXCEL Lake dashboard naming

0adcc97

Fix EXPORT_ASSESSMENT_TO_EXCEL Lake dashboard naming

34b59de

Adjusted few lines to comply with feedback.

f6ebcda

Adjusted few lines to comply with feedback. And handling id correctly.

a6ebaa7

Adjusted UCX Path on Exporter Notebook.

4f9caf0

auto-merge was automatically disabled March 24, 2025 20:52
Head branch was pushed to by a user without write access

jgarciaf106 force-pushed the fix/excel-exporter branch from 8397e67 to 4f9caf0 Compare March 24, 2025 20:52

jgarciaf106 mentioned this pull request Mar 25, 2025

Updated the dashboard path to correct path #3956

Open

9 tasks

FastLee enabled auto-merge March 26, 2025 14:05

jgarciaf106 had a problem deploying to account-admin March 26, 2025 14:06 — with GitHub Actions Failure

jgarciaf106 had a problem deploying to account-admin March 26, 2025 14:07 — with GitHub Actions Failure

This was referenced Mar 26, 2025

Fix/excel exporter #3961

Closed

Fix Assessment Excel Exporter #3962

Merged

FastLee temporarily deployed to account-admin March 26, 2025 15:00 — with GitHub Actions Inactive

FastLee added this pull request to the merge queue Mar 26, 2025

Merged via the queue into databrickslabs:main with commit 6dfb650 Mar 26, 2025
14 of 15 checks passed

github-merge-queue bot pushed a commit that referenced this pull request Mar 26, 2025

Fix Assessment Excel Exporter (#3962)

3b18ca6

replaces #3829 --------- Co-authored-by: Andres Garcia <andres.garcia+data@databricks.com>

gueniai mentioned this pull request Apr 16, 2025

Release v0.58.0 #3998

Merged
