-
Notifications
You must be signed in to change notification settings - Fork 193
Description
Is your feature request related to a problem? Please describe.
I'm currently using differential-dataflow (0.15.2) as part of a Datalog-based query engine. In our logic-oriented context, we frequently need to explicitly set difference values for records independently of their original differences.
Specifically, two scenarios arise frequently:
- Set-difference queries (e.g.,
A - B
): explicitly setting differences to+1
or-1
for tuples in the computation. - Exact counting (cardinality): where original differences must be ignored, and each tuple must explicitly be assigned a difference of exactly
1
.
Currently, the built-in method explode
multiplies provided differences by the original tuple's differences, which does not match our intended semantics when original differences vary.
Describe the solution you'd like
I'd like to propose adding a method named expand
, which explicitly sets new differences, independent of original tuple differences:
/// Replaces each record explicitly with another, with a specified difference type.
///
/// Unlike `explode`, the new difference is not multiplied by the original difference.
pub fn expand<D2, R2, I, L>(&self, mut logic: L) -> Collection<G, D2, R2>
where
D2: Data,
R2: Semigroup + 'static,
I: IntoIterator<Item = (D2, R2)>,
L: FnMut(D) -> I + 'static,
{
self.inner
.flat_map(move |(x, t, _)| logic(x).into_iter().map(move |(x, d2)| (x, t.clone(), d2)))
.as_collection()
}
Describe alternatives you've considered
- Using the built-in explode method: Not suitable, as it multiplies differences.
- Using custom patches: Currently, we've patched our local version, but we'd prefer an official method in differential-dataflow for clarity, correctness, and ease of maintenance.
Additional context
A concrete Datalog-based example illustrating the need for expand:
// Represents Datalog's negation operation: Out(x) :- A(x), not B(x)
let difference = A.expand(|x| Some((x, 1)))
.concat(&B.expand(|x| Some((x, -1))))
.threshold_semigroup(|_, _, old| old.is_none().then_some(semiring_one()));
This clearly expresses "tuples in A but not in B," assigning explicit differences independent of original values, and can't be directly and correctly implemented with explode
.
I believe this method is simple, intuitive, and complementary to the existing methods (map
, flat_map
, explode
). I'd appreciate your consideration of this addition.
Thank you very much!