[ZEPPELIN-6232] Prevent duplicate interpreter repositories on server restart #4981

Open
wants to merge 1 commit into
base: master

@shmruin (Contributor) commented Jul 16, 2025

What is this PR for?

OVERVIEW
This PR fixes a bug where modifying a default interpreter repository (e.g., 'central') and restarting the Zeppelin server would result in a duplicate repository entry instead of an overwrite.

For example, if a user adds a new central repository in the interpreter settings, it replaces the default central repository, as expected.

After a restart, however, the UI incorrectly displays two central repositories: the user-modified one and the default one.

[Screenshot: duplicated_central_repo]

This happens because the interpreter.json file ends up with two central entries in its interpreterRepositories field.

ROOT CAUSE
The issue was in the repository loading logic within InterpreterSettingManager. When loading settings from interpreter.json on startup, the code checked for the existence of a repository using List.contains(repo).

This check failed because the default repository object and the user-modified repository object were not equal (e.g., one had authentication details, the other did not), even though their IDs were the same ('central'). As a result, the manager treated the user's repository as a new entry and added it, causing the duplication.
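The failure mode can be reproduced in isolation. The following is a minimal sketch, not the Zeppelin code: `Repo` is a hypothetical stand-in for the real repository class, assuming only that its `equals` compares every field, and the URLs and user name are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

public class ContainsVsIdSketch {

  // Hypothetical, simplified repository entry (not the real Zeppelin/Aether class).
  // Equality covers every field, so same id + different credentials means "not equal".
  static final class Repo {
    final String id;
    final String url;
    final String username; // null when no authentication is configured

    Repo(String id, String url, String username) {
      this.id = id;
      this.url = url;
      this.username = username;
    }

    @Override
    public boolean equals(Object o) {
      if (this == o) return true;
      if (!(o instanceof Repo)) return false;
      Repo other = (Repo) o;
      return Objects.equals(id, other.id)
          && Objects.equals(url, other.url)
          && Objects.equals(username, other.username);
    }

    @Override
    public int hashCode() {
      return Objects.hash(id, url, username);
    }
  }

  // ID-only lookup: ignores every field except the repository id.
  static boolean containsId(List<Repo> repos, String id) {
    for (Repo r : repos) {
      if (r.id.equals(id)) return true;
    }
    return false;
  }

  public static void main(String[] args) {
    List<Repo> repos = new ArrayList<>();
    repos.add(new Repo("central", "https://repo1.maven.org/maven2/", null));

    // Same id as the default entry, but with a different URL and credentials.
    Repo userCentral = new Repo("central", "https://repo.example.com/maven2/", "alice");

    System.out.println(repos.contains(userCentral));        // false: full equality fails
    System.out.println(containsId(repos, userCentral.id));  // true: the ids match
  }
}
```

Because `contains` reports `false` here, code that adds the entry whenever `contains` is false ends up with two 'central' entries.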

MODIFICATION
The repository loading logic in InterpreterSettingManager's loadFromFile() has been modified to correctly handle user-defined repositories.

Instead of using List.contains(), the new logic iterates through the existing list of repositories and explicitly checks for matching IDs.

  • If a repository with the same ID as the user-saved one is found, the existing repository is replaced with the user's version.
  • If no repository with a matching ID is found, the user-saved repository is added as a new entry.

This ensures that user modifications to default repositories are correctly persisted across server restarts and prevents duplicate entries.

  • In addition, instead of calling dependencyResolver.getRepos() as before, the loop now uses the interpreterRepositories field of InterpreterSettingManager directly. interpreterRepositories is already initialized from dependencyResolver.getRepos() during initialization, and so far I have found no side effects from this change. It is also more intuitive to iterate over interpreterRepositories when the goal is to update interpreterRepositories.
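The replace-or-add rule described above can be sketched as follows. This is an illustration under assumptions, not the patch itself: `Repo` and `replaceOrAdd` are hypothetical names, the class carries only an id and a URL, and the URLs are placeholders.

```java
import java.util.ArrayList;
import java.util.List;

public class RepoMergeSketch {

  // Hypothetical, simplified repository entry (not the real Zeppelin/Aether class).
  static final class Repo {
    final String id;
    final String url;

    Repo(String id, String url) {
      this.id = id;
      this.url = url;
    }
  }

  // Replaces the entry whose id matches saved.id in place,
  // or appends saved when no entry has that id.
  static void replaceOrAdd(List<Repo> repos, Repo saved) {
    for (int i = 0; i < repos.size(); i++) {
      if (repos.get(i).id.equals(saved.id)) {
        repos.set(i, saved);
        return;
      }
    }
    repos.add(saved);
  }

  public static void main(String[] args) {
    // Defaults as they might look after initialization (placeholder URLs).
    List<Repo> repos = new ArrayList<>();
    repos.add(new Repo("central", "https://repo1.maven.org/maven2/"));
    repos.add(new Repo("local", "file:///tmp/local-repo"));

    // Entries as they might be read back from the saved settings.
    replaceOrAdd(repos, new Repo("central", "https://repo.example.com/maven2/"));
    replaceOrAdd(repos, new Repo("internal", "https://repo.example.com/internal/"));

    for (Repo r : repos) {
      System.out.println(r.id + " -> " + r.url);
    }
    // central -> https://repo.example.com/maven2/
    // local -> file:///tmp/local-repo
    // internal -> https://repo.example.com/internal/
  }
}
```

Replacing in place with `List.set` keeps the original ordering of the defaults, so an overridden 'central' stays in the position the default occupied.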

SIDE EFFECT
When a user overwrites a default repository such as central or local, it remains a default repository, just with different settings.
(This holds only while interpreter.json contains the modified entry. If the file is removed, the original default repository reappears.)
So I expect this change to affect little else.

What type of PR is it?

Bug Fix

Todos

What is the Jira issue?

[ZEPPELIN-6232]

How should this be tested?

A new unit test, testShouldNotDuplicateRepoOnReloadWhenDefaultRepoIsModified, has been added to InterpreterSettingManagerTest.java.

This test simulates the exact scenario of the bug:

  • It programmatically modifies the 'central' repository and saves the configuration.
  • It reloads the InterpreterSettingManager to mimic a server restart.
  • It asserts that only one 'central' repository exists and that it correctly retains the user's modifications.
  • Finally, it checks that the unmodified default repository ('local', in this case) still exists as a default.
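The uniqueness assertion in the steps above boils down to "no repository ID appears twice". A minimal, self-contained sketch of such a check is shown below. It is not the code of the actual unit test, and `duplicateIds` is a hypothetical helper that works on plain ID strings.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class DuplicateIdCheckSketch {

  // Returns the ids that occur more than once, in order of first repetition.
  static Set<String> duplicateIds(List<String> ids) {
    Set<String> seen = new HashSet<>();
    Set<String> duplicates = new LinkedHashSet<>();
    for (String id : ids) {
      // Set.add returns false when the id was already present.
      if (!seen.add(id)) {
        duplicates.add(id);
      }
    }
    return duplicates;
  }

  public static void main(String[] args) {
    // Repository ids as they would look with the bug present.
    System.out.println(duplicateIds(Arrays.asList("central", "local", "central"))); // [central]
    // Repository ids as they would look after the fix.
    System.out.println(duplicateIds(Arrays.asList("central", "local")));            // []
  }
}
```

The same helper could be pointed at the IDs listed in interpreter.json during a manual check.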

If you want to test manually:

  • In the Zeppelin notebook UI, open the interpreter menu and add a new central repo. It replaces the default one immediately. (This works in both the current and the fixed version.)
  • After a server restart, two central repos appear. (current version)
  • After a server restart, only one central repo appears, the user-defined one. (fixed version)
  • You can also check interpreter.json in /conf.

Screenshots (if appropriate)

Questions:

  • Do the license files need to be updated? N
  • Are there breaking changes for older versions? N
  • Does this need documentation? N

@tbonelee (Contributor) left a comment

LGTM.
I agree with the root cause analysis and the proposed solution.
The updated logic looks correct and the test case covers the scenario.
I also confirmed that the duplication issue is reproducible before the patch and no longer occurs after applying the changes.
