Update to `imara-diff` 0.2 in diff slider test #2288

cruessler · 2025-12-08T17:17:38Z

This is a companion to #2287. It updates the diff slider test to use imara-diff 0.2. In my non-representative sample of ~3000 diffs from gitoxide, it matches the git baseline in ~95 % of cases.

❯ cargo test --package gix-diff-tests slider -- --nocapture
[…]

thread 'blob::slider::baseline' (23361) panicked at gix-diff/tests/diff/blob/slider.rs:83:5:
assertion `left == right` failed: matching diffs 2930 == total diffs 3094 [94.70 %]

Byron · 2025-12-09T03:34:37Z

gix-diff/tests/diff/blob/slider.rs

-use gix_diff::blob::intern::TokenSource;
-use gix_diff::blob::unified_diff::ContextSize;
-use gix_diff::blob::{Algorithm, UnifiedDiff};
+use gix_diff::blob::v2::{Algorithm, BasicLineDiffPrinter, Diff, InternedInput, UnifiedDiffConfig};


It's interesting that `v2` is available at all, as it shouldn't be available by default. Let me fix that.

Hmm, seems like this code should not compile; it's fine on main.

Sorry about that, I didn't realise that this is a test in `gix-diff-tests`, which enables the feature. It was a long day yesterday 😅.

cruessler · 2025-12-09

No worries, I also could have provided that context. :-)

cruessler · 2025-12-14T12:40:06Z

I’ve re-added the v1 slider test that I had updated in this PR’s first commit. Now there are both a `baseline_v1` and a `baseline_v2` test, so the two versions can be easily compared.

- deduplicate a little, and do away with blanket `unreachable!()`.

Byron

That's great to have! I think we should also think about having a minimal default set of slider tests that we keep, at some point, so we have some CI protections for this. But it's a note for the future, when it settles a bit more.

Besides that, I asked Copilot to take a look, but would also love it if you could try it once more as I didn't have a slider setup to actually run the code.

But if you give the green light, I will happily merge this.

Copilot

Pull request overview

This PR updates the diff slider test to support imara-diff 0.2 by adding a new v2 API test alongside the existing v1 test. The changes refactor common logic into helper functions to reduce duplication and support both API versions.

Key changes:

- Renames the existing `baseline()` test to `baseline_v1()` and adds a new `baseline_v2()` test for the v2 API
- Extracts common parsing logic into a `parse_dir_entry()` helper function
- Moves diff assertion logic to an `assert_diffs()` helper function
- Adds a `fold_to_unidiff()` method to the `Baseline` type for cleaner code reuse

Copilot · 2025-12-16T08:39:12Z

gix-diff/tests/diff/blob/slider.rs

+    let old_data = std::fs::read(asset_dir.join(format!("{old_blob_id}.blob"))).unwrap();
+    let new_data = std::fs::read(asset_dir.join(format!("{new_blob_id}.blob"))).unwrap();


The .unwrap() calls violate the project's error handling guidelines. According to the coding guidelines, tests should avoid .unwrap() in favor of .expect() with meaningful context or ?. These file reads should either be propagated with ? or use .expect("meaningful context") to explain why the files are expected to exist.
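As an illustration of this suggestion, here is a minimal sketch of reading a blob so that a failure names the offending path and can be propagated with `?`. The helpers `blob_path` and `read_blob` are hypothetical and not part of the PR; only the `{blob_id}.blob` naming is taken from the snippet above.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical helper (not in the PR): path of a blob asset,
/// following the `{blob_id}.blob` naming used in the test.
fn blob_path(asset_dir: &Path, blob_id: &str) -> PathBuf {
    asset_dir.join(format!("{blob_id}.blob"))
}

/// Hypothetical helper: reads a blob and attaches the path to the error
/// instead of panicking with a context-free `.unwrap()`.
fn read_blob(asset_dir: &Path, blob_id: &str) -> io::Result<Vec<u8>> {
    let path = blob_path(asset_dir, blob_id);
    fs::read(&path).map_err(|err| {
        io::Error::new(
            err.kind(),
            format!("failed to read blob at '{}': {err}", path.display()),
        )
    })
}
```

In a test that returns a `Result`, the two reads could then be written as `read_blob(&asset_dir, old_blob_id)?` and `read_blob(&asset_dir, new_blob_id)?`.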

Copilot · 2025-12-16T08:39:12Z

gix-diff/tests/diff/blob/slider.rs

+            continue;
+        };
+
+        let input = InternedInput::new(old_data.to_str().unwrap(), new_data.to_str().unwrap());


The .unwrap() calls on .to_str() are problematic. The project guidelines prohibit .unwrap() in favor of .expect() with context or proper error handling with ?. Additionally, this assumes the blob data is valid UTF-8, which may not always be true for binary files or files with different encodings. This could cause the test to panic instead of handling non-UTF-8 data gracefully.
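One way to act on this, sketched with the standard library only: validate both sides up front and let the caller skip pairs that are not valid UTF-8. The helper `as_utf8_pair` is hypothetical; the PR itself calls `.to_str()` on the blob data, and whether skipping is the right policy for the slider test is an assumption.

```rust
use std::str;

/// Hypothetical helper (not in the PR): returns both sides as `&str`,
/// or `None` if either side is not valid UTF-8.
fn as_utf8_pair<'a>(old: &'a [u8], new: &'a [u8]) -> Option<(&'a str, &'a str)> {
    Some((str::from_utf8(old).ok()?, str::from_utf8(new).ok()?))
}
```

Inside the test loop, a `let Some((old, new)) = as_utf8_pair(&old_data, &new_data) else { continue; };` would then skip non-UTF-8 pairs instead of panicking.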

Copilot · 2025-12-16T08:39:12Z

gix-diff/tests/diff/blob/slider.rs

+        let algorithm = match algorithm {
+            gix_diff::blob::Algorithm::Myers => Algorithm::Myers,
+            gix_diff::blob::Algorithm::Histogram => Algorithm::Histogram,
+            gix_diff::blob::Algorithm::MyersMinimal => Algorithm::MyersMinimal,
+        };


The algorithm mapping is duplicated between parse_dir_entry (lines 138-142) and this section. Since parse_dir_entry already returns the Algorithm enum from the v1 API, this remapping to v2 Algorithm could be extracted into a helper function to avoid duplication and make the relationship between v1 and v2 algorithms clearer.
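A rough sketch of such a helper follows. The modules `v1` and `v2` are stand-ins defined here so the example compiles on its own; in the PR the corresponding types are `gix_diff::blob::Algorithm` and `gix_diff::blob::v2::Algorithm`. The function name `to_v2_algorithm` is hypothetical.

```rust
/// Stand-in for the v1 enum (`gix_diff::blob::Algorithm` in the PR).
mod v1 {
    #[derive(Debug, Clone, Copy, PartialEq, Eq)]
    pub enum Algorithm {
        Myers,
        Histogram,
        MyersMinimal,
    }
}

/// Stand-in for the v2 enum (`gix_diff::blob::v2::Algorithm` in the PR).
mod v2 {
    #[derive(Debug, Clone, Copy, PartialEq, Eq)]
    pub enum Algorithm {
        Myers,
        Histogram,
        MyersMinimal,
    }
}

/// Hypothetical helper: the single place where v1 variants map to v2 variants.
fn to_v2_algorithm(algorithm: v1::Algorithm) -> v2::Algorithm {
    match algorithm {
        v1::Algorithm::Myers => v2::Algorithm::Myers,
        v1::Algorithm::Histogram => v2::Algorithm::Histogram,
        v1::Algorithm::MyersMinimal => v2::Algorithm::MyersMinimal,
    }
}
```

Keeping the `match` exhaustive, without a wildcard arm, means a variant added to the v1 enum later would fail to compile here rather than being mapped silently.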

Copilot · 2025-12-16T08:39:13Z

gix-diff/tests/diff/blob/slider.rs

+        let mut diff = Diff::compute(algorithm, &input);
+        diff.postprocess_lines(&input);


The variable name diff is shadowed after mutation (line 98 creates it, line 99 mutates it). Consider renaming to clarify the mutation, or combine the operations if the intermediate state isn't needed.

Copilot · 2025-12-16T08:39:13Z

gix-diff/tests/diff/blob/slider.rs

    }

+    impl Baseline<'_> {
+        /// Fold all [`DiffHunk`]s we produce into a unified_diff string


The doc comment uses backticks around DiffHunk and refers to "we produce" but doesn't fully explain what this method does or when it should be used. Consider clarifying that this method converts the baseline hunks into a unified diff format string for comparison purposes.

Suggested change

-        /// Fold all [`DiffHunk`]s we produce into a unified_diff string
+        /// Converts all baseline [`DiffHunk`]s into a single unified diff format string.
+        ///
+        /// This method is typically used to produce a unified diff representation of the baseline hunks,
+        /// suitable for comparison with other diff outputs in tests or analysis.

Copilot · 2025-12-16T08:39:13Z

gix-diff/tests/diff/blob/slider.rs

+        unreachable!("BUG: Need file named '<name>.<algorithm>'")
+    };
+    let algorithm = match algorithm {
+        "myers" => Algorithm::Myers,
+        "histogram" => Algorithm::Histogram,
+        other => unreachable!("'{other}' is not a supported algorithm"),
+    };
+
+    let parts: Vec<_> = name.split('-').collect();
+    let [old_blob_id, new_blob_id] = parts[..] else {
+        unreachable!("BUG: name part of filename must be <old_blob_id>-<new_blob_id>");


The error message format is inconsistent. Line 136 uses "BUG: Need file named", line 141 uses single quotes around the variable without "BUG:", and line 146 uses "BUG: name part of filename must be". Consider using a consistent format for all unreachable error messages, such as starting all with "BUG:" prefix.
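To make the suggestion concrete, here is a sketch of the filename parsing with every failure carrying the same `BUG:` prefix, returned as an error rather than raised through `unreachable!()`. The function `parse_file_name`, the local `Algorithm` enum, and the use of `rsplit_once('.')` to separate the extension are assumptions for illustration; the quoted snippet does not show how the PR splits name and algorithm.

```rust
/// Stand-in for the algorithm enum used by the test.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Algorithm {
    Myers,
    Histogram,
}

/// Hypothetical helper: parses `<old_blob_id>-<new_blob_id>.<algorithm>`.
/// All error messages share the `BUG:` prefix.
fn parse_file_name(file_name: &str) -> Result<(&str, &str, Algorithm), String> {
    // Assumption: the algorithm is whatever follows the last '.'.
    let (name, algorithm) = file_name
        .rsplit_once('.')
        .ok_or_else(|| format!("BUG: need file named '<name>.<algorithm>', got '{file_name}'"))?;

    let algorithm = match algorithm {
        "myers" => Algorithm::Myers,
        "histogram" => Algorithm::Histogram,
        other => return Err(format!("BUG: '{other}' is not a supported algorithm")),
    };

    // Exactly two dash-separated parts are expected in the name.
    let parts: Vec<_> = name.split('-').collect();
    let [old_blob_id, new_blob_id] = parts[..] else {
        return Err(format!(
            "BUG: name part of filename must be '<old_blob_id>-<new_blob_id>', got '{name}'"
        ));
    };

    Ok((old_blob_id, new_blob_id, algorithm))
}
```

For example, `parse_file_name("abc-def.myers")` yields the two blob ids together with `Algorithm::Myers`, while a name without a dash or with an unknown extension produces a `BUG:`-prefixed message.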

cruessler · 2025-12-16T08:46:28Z

> That's great to have! I think we should also think about having a minimal default set of slider tests that we keep, at some point, so we have some CI protections for this. But it's a note for the future, when it settles a bit more.

Yep, having a set of default tests definitely is on my todo list as well.

> Besides that, I asked Copilot to take a look, but would also love it if you could try it once more as I didn't have a slider setup to actually run the code.

I just re-ran the tests with your updates on my machine and did not observe any differences to the previous state.

Just for context: in case you were wondering, I intentionally did not de-duplicate the tests right away, in order to facilitate review.

Update to imara-diff 0.2 in diff slider test

618c37a

Byron reviewed Dec 9, 2025

Re-add slider test v1 to compare both versions

c0be918

cruessler marked this pull request as ready for review December 14, 2025 12:38

refactor

1e3efab

- deduplicate a little, and do away with blanket `unreachable!()`.

Byron requested a review from Copilot December 16, 2025 08:27

Byron approved these changes Dec 16, 2025

Copilot started reviewing on behalf of Byron December 16, 2025 08:37

Copilot AI reviewed Dec 16, 2025
