Skip to content

Harmonize GC handling #10

Open
Open
@jblachly

Description

@jblachly

Source of really hard to track down bugs.

Currently, all three tree implementations manage their own memory. Being containers, they copy in whatever object is passed to them. If this has, say, a string, with a pointer to the GC heap it may be reaped when its original reference goes out of scope, even though the tree holds on to it. Potential disaster. Of course this can be avoided by passing only value types.

Because I first discovered this when dealing with IITree (indeed at that time I think the splaytree and avltree were using the GC to alloc anyway so it wasn't manifest there) I did implement in the IITree insert options
(a) specify whether the passed in pointer (IITree takes a pointer instead of an object) was a GC managed pointer (GCptr=true), and
(b) indicate whether it may contain other pointers to the GC heap (trackGC=true)

See discussion here:

/// Insert interval for contig
///
/// Note this differs from the other trees in the package in inclusion of "contig" parameter,
/// because underlying cgranges has built-in hashmap and essentially stores multiple trees.
/// Passing a \0-terminated C string contig will be faster than passing a D string or char[],
/// due to the need to call toStringz before calling the C API.
///
/// last param "label" of cr_add not used by cgranges as of 2019-05-04
///
/// Params:
/// S = string-like contig name (if absent will pass "default" to cr_add)
/// i = IntervalType or IntervalType*
/// trackGC = (Default true) add i to the GC scanned ranges. see discussion
/// GCptr = (Default true) when i is IntervalType*, is the pointer to GC-mgd memory?
///
/// Non-payload variant: params contig, start, end
///
/// Discussion:
/// This container structure may store an IntervalType in one of two ways.
/// First, you can pass a pointer which will be stored directly without additional
/// memory allocations. Note however that if this pointer is to garbage collected
/// memory and no other reference exists in the D code, it could be swept away.
/// Even if it is not to GC memory, if it encapsulates a pointer (e.g., a string )
/// you can still be bitten. Thus trackGC defaults to true. If unneeded, you might
/// obtain a marginal speedup by setting trackGC = false.
///
/// The interval to be contained may also be passed by value. In this case,
/// memory will be allocated with malloc and the pointer will be stored. In this case
/// the same caveats regarding contained GC-pointing objects like D-arrays and strings
/// still apply. For example, passing a struct by value may not necessitate tracking
/// the memory in the GC if it contains only value types, but if the struct contains
/// a D-style array or string, it must be tracked
///
/// Finally, there is a variant missing the first contig parameter. Unlike other trees,
/// cgranges requires a string contig parameter, so a default value of "default"
/// will be supplied.
cr_intv_t* insert(S)(S contig, IntervalType* i, bool trackGC = true, bool GCptr = true)

Need to add trackGC option to the other tree implementations. Since they don't take pointer, probably skip adding GCptr.

Also TODO: document the difference in the APIs, and document that non-value-types shouldn't be added to splaytree and avltree for now.

Metadata

Metadata

Assignees

No one assigned

    Labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions