v0.7.0 - API overhaul, introduced ParserBuilder

@matthewtolman matthewtolman released this 07 Dec 22:15

Major Change - Upgrade Notes

This release makes major changes to the public interface: parsers have been renamed and moved into new namespaces, slow parsers have been retired, and the type conversion methods have moved.

Parser Name Changes

The old parser names map to the new names as follows:

  • zcsv.map_ck.Parser -> zcsv.allocs.map.Parser
  • zcsv.map_ck.init -> zcsv.allocs.map.init
  • zcsv.map_sk.Parser -> zcsv.allocs.map_temporary.Parser
  • zcsv.map_sk.init -> zcsv.allocs.map_temporary.init
  • zcsv.column.Parser -> zcsv.allocs.column.Parser
  • zcsv.column.init -> zcsv.allocs.column.init
  • zcsv.slice.rows.Parser -> zcsv.zero_allocs.slice.Parser
  • zcsv.slice.rows.init -> zcsv.zero_allocs.slice.init
  • zcsv.slice.fields.Parser -> zcsv.zero_allocs.slice.FieldsParser
  • zcsv.slice.fields.init -> zcsv.zero_allocs.slice.fieldsInit
  • zcsv.stream_fast.Parser -> zcsv.zero_allocs.stream.Parser
  • zcsv.stream_fast.init -> zcsv.zero_allocs.stream.init
  • zcsv.raw.Parser - REMOVED (use zcsv.zero_allocs.slice.Parser)
  • zcsv.raw.init - REMOVED (use zcsv.zero_allocs.slice.init)
  • zcsv.stream.Parser - REMOVED (use zcsv.zero_allocs.stream.Parser)
  • zcsv.stream.init - REMOVED (use zcsv.zero_allocs.stream.init)
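For codebases with many call sites, the renames above can be applied mechanically. The following is a minimal sketch of such a helper and is not part of zcsv. The `migrate` function and the `RENAMED` and `REMOVED` tables are hypothetical names introduced here; only the old and new parser names come from the list above. It assumes parsers are referenced through a `zcsv.` prefix, and it reports every use of a removed parser, because the suggested replacement may not be a drop-in substitute.

```python
import re

# Old name -> new name, copied from the rename list above.
RENAMED = {
    "zcsv.map_ck.Parser": "zcsv.allocs.map.Parser",
    "zcsv.map_ck.init": "zcsv.allocs.map.init",
    "zcsv.map_sk.Parser": "zcsv.allocs.map_temporary.Parser",
    "zcsv.map_sk.init": "zcsv.allocs.map_temporary.init",
    "zcsv.column.Parser": "zcsv.allocs.column.Parser",
    "zcsv.column.init": "zcsv.allocs.column.init",
    "zcsv.slice.rows.Parser": "zcsv.zero_allocs.slice.Parser",
    "zcsv.slice.rows.init": "zcsv.zero_allocs.slice.init",
    "zcsv.slice.fields.Parser": "zcsv.zero_allocs.slice.FieldsParser",
    "zcsv.slice.fields.init": "zcsv.zero_allocs.slice.fieldsInit",
    "zcsv.stream_fast.Parser": "zcsv.zero_allocs.stream.Parser",
    "zcsv.stream_fast.init": "zcsv.zero_allocs.stream.init",
}

# Removed parsers -> the replacement suggested in the list above.
REMOVED = {
    "zcsv.raw.Parser": "zcsv.zero_allocs.slice.Parser",
    "zcsv.raw.init": "zcsv.zero_allocs.slice.init",
    "zcsv.stream.Parser": "zcsv.zero_allocs.stream.Parser",
    "zcsv.stream.init": "zcsv.zero_allocs.stream.init",
}

_ALL = {**RENAMED, **REMOVED}

# One pass over the text, so a new name is never rewritten a second time.
# The lookarounds stop a match from starting or ending inside a longer identifier.
_PATTERN = re.compile(
    r"(?<![\w.])("
    + "|".join(re.escape(name) for name in sorted(_ALL, key=len, reverse=True))
    + r")(?!\w)"
)


def migrate(source: str) -> tuple[str, list[str]]:
    """Return the rewritten source and the removed-parser names it used."""
    removed_hits: list[str] = []

    def swap(match: re.Match) -> str:
        old = match.group(1)
        if old in REMOVED:
            removed_hits.append(old)  # flag for manual review
        return _ALL[old]

    return _PATTERN.sub(swap, source), removed_hits
```

For example, `migrate("const P = zcsv.map_ck.Parser;")` returns `("const P = zcsv.allocs.map.Parser;", [])`. Running the helper on already migrated code leaves it unchanged, since none of the new names contain an old name.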

Field struct changes

All of the field type conversion methods (e.g. asInt, asFloat, asBool, isNull, asSlice) have been moved off of the Field structs and into a shared decode namespace. This namespace provides fieldToInt, fieldToFloat, fieldToBool, fieldIsNull, and writeFieldStrTo as replacements for the old methods. The replacements work on any field type, whether it is allocating (and already decoded) or raw (and still encoded). Similar methods are provided for slices, though the slice methods assume the input is already decoded. Additional methods cover more niche use cases.

The non-allocating raw Field no longer has the data property. Instead, it has the raw and opts methods: raw returns the field's data exactly as it appears in the original CSV, and opts returns the CSV options, which can be used for decoding.

Other Changes

The decode_writer (documented to be internal) has been moved to a non-exported namespace called internal. Several methods have been renamed, removed, or added to each Row struct to make the interface more uniform. This does not impact map parsers as those parsers work on maps and not sequential ranges.

A new ParserBuilder has been added.

What's Changed

  • Separated field parsing (e.g. asInt) from the Field structs and put parsing logic into a standalone module
  • Removed parsers that were obsolete (slow raw, slow stream)
  • Split parsers into namespaces allocs and zero_allocs based on allocating properties
  • Renamed map_ck to map and map_sk to map_temporary
  • Reworked the Field structs to share a clone API while exposing different accessors (data vs raw and opts)
    • Allocating fields have data
    • Zero-allocating fields have raw and opts
  • Reworked the Row structs to have an interchangeable interface (though they have different performance characteristics)
    • Both now have iter(), len(), field(usize), fieldOrNull(usize)
  • Unified the Field structs for map and column parsers
  • Unified the Row and Field structs for both map and map_temporary parsers
  • Moved internal decode_writer namespace to the internal namespace
  • Updated the library to only export the zero_allocs, allocs, writer, decode namespaces. It also exports the CsvWriteError, CsvReadError, and ParseBoolError error sets
  • Renamed stream_fast to stream
  • Renamed slice.rows to slice
  • Made slice.rows the default slice-based parser for zero-allocation in-memory parsing
    • Done by making zcsv.zero_allocs.slice.init create a slice.rows parser
  • Made slice.fields a sub-parser under the slice namespace
    • Done by making zcsv.zero_allocs.slice.fieldsInit create a slice.fields parser
  • Added new builder for parsers in zcsv.ParserBuilder
    • Builder allows creating new parsers based on more usage focused needs (e.g. with reader input and headers)
    • Builder also provides methods for abstracting away clean up of created parser and rows
      • Automatically becomes a no-op if cleanup isn't needed
      • Makes switching between parsers without introducing memory leaks easier
  • Updated examples, removed obsolete examples, and added new examples as needed
  • Updated README

Full Changelog: v0.6.0...v0.7.0