Allow containers on half-join's stream input #621

@antiguru

The half-join operator consumes a stream of updates and joins it with an arrangement. It currently requires the updates to arrive as vectors of data and cannot handle any other format.

However, it should support containers on its input to avoid forcing data into an owned representation. This involves:

  • The data needs to be sorted and consolidated. Some options:
    • We could arrange the input data and drop the trace, forming batches of ready proposals.
    • We could work on chunks of sorted and consolidated data. This amortizes the work of traversing the lookup arrangement, but does not guarantee that the input as a whole is consolidated.
    • We could use a merge batcher (without an arrangement) to force the stream inputs to be sorted and consolidated.
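
To make "sorted and consolidated" concrete, here is a minimal sketch of what the chunk-based option would do to a single chunk. It assumes updates are `(data, time, diff)` triples with an `i64` diff, and `consolidate_chunk` is a hypothetical helper name used for illustration only; it is not the operator's actual code or an existing API.

```rust
/// Hypothetical helper: sorts one chunk of `(data, time, diff)` updates by
/// `(data, time)` and sums the diffs of equal `(data, time)` pairs, dropping
/// entries whose accumulated diff is zero.
pub fn consolidate_chunk<D: Ord, T: Ord>(mut updates: Vec<(D, T, i64)>) -> Vec<(D, T, i64)> {
    // Sort so that updates to the same (data, time) become adjacent.
    updates.sort_by(|a, b| (&a.0, &a.1).cmp(&(&b.0, &b.1)));

    let mut out: Vec<(D, T, i64)> = Vec::with_capacity(updates.len());
    for (d, t, r) in updates {
        // Try to fold this update into the previous entry.
        let merged = match out.last_mut() {
            Some(last) if last.0 == d && last.1 == t => {
                last.2 += r;
                true
            }
            _ => false,
        };
        if !merged {
            // The previous entry is complete; drop it if its diffs cancelled.
            if out.last().map_or(false, |last| last.2 == 0) {
                out.pop();
            }
            out.push((d, t, r));
        }
    }
    // The final entry may also have cancelled to zero.
    if out.last().map_or(false, |last| last.2 == 0) {
        out.pop();
    }
    out
}
```

Applied per chunk, this only removes duplicates within that chunk: the same `(data, time)` pair can still appear in two different chunks, which is the trade-off noted for the chunk-based option above.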

Since #619, we can push outputs into a container builder, which allows arbitrary containers (and transformations).
