Skip to content

Latest commit

 

History

History
160 lines (135 loc) · 4.98 KB

CHANGELOG.md

File metadata and controls

160 lines (135 loc) · 4.98 KB

Changelog

All notable changes to this project will be documented in this file.

The format is based on Keep a Changelog.

Unreleased

Changed

  • Pyarrow >=19 required
  • Python >=3.10 required

1.8 - 2024-11-01

Changed

  • Pyarrow >=18 required
  • isodate dependency for durations
  • Acero engine used for scanning
  • Grouping defaults to parallelized but unordered
  • Partitioning supports arbitrary functions
  • group optimized for dictionary arrays
  • rank optimized for out-of-core

1.7 - 2024-07-19

Changed

  • Pyarrow >=17 required
  • Partitioning supports original indices
  • Acero engine declaration
  • Duration format improvements

Fixed

  • Strawberry >=0.236 compatible

1.6 - 2024-04-30

Changed

  • Pyarrow >=16 required
  • group optimized for datasets
  • Duration scalar

Removed

  • Interval type

1.5 - 2024-01-24

Changed

  • Pyarrow >=15 required

Fixed

  • Strawberry >=0.212 compatible
  • Starlette >=0.36 compatible

1.4 - 2023-11-05

Changed

  • Pyarrow >=14 required
  • Python >=3.9 required
  • group optimized for memory

Removed

  • fragments replaced by group
  • min and max replaced by rank
  • partition replaced by runs
  • list aggregation must be explicit
  • group list functions are in apply

1.3 - 2023-08-25

Changed

  • Pyarrow >=13 required
  • List filtering and sorting moved to functions and optimized
  • Dataset filtering, grouping, and sorting on fragments optimized
  • group can aggregate entire table

Added

  • flatten field for list columns
  • rank field for min and max filtering
  • Schema extensions for metrics and deprecations
  • optional field for partial query results
  • dropNull, fillNull, and size fields
  • Command-line utilities
  • Allow datasets with invalid field names

Deprecated

  • fragments field deprecated and functionality moved to group field
  • Implicit list aggregation on group deprecated
  • partition field deprecated and renamed to runs

1.2 - 2023-05-07

Changed

  • Pyarrow >=12 required
  • Grouping fragments optimized
  • Group by empty columns
  • Batch sorting and grouping into lists

1.1 - 2023-01-29

  • Pyarrow >=11 required
  • Python >=3.8 required
  • Scannable functions added
  • List aggregations deprecated
  • Group by fragments
  • Month day nano interval array
  • min and max fields memory optimized

1.0 - 2022-10-28

  • Pyarrow >=10 required
  • Dataset schema introspection
  • Dataset scanning with selection and projection
  • Binary search on sorted columns
  • List aggregation, filtering, and sorting optimizations
  • Compute functions generalized
  • Multiple datasets and federation
  • Provisional dataset join and take

0.9 - 2022-08-04

  • Pyarrow >=9 required
  • Multi-directional sorting
  • Removed unnecessary interfaces
  • Filtering has stricter typing

0.8 - 2022-05-08

  • Pyarrow >=8 required
  • Grouping and aggregation integrated
  • AbstractTable interface renamed to Dataset
  • Binary scalar renamed to Base64

0.7 - 2022-02-04

  • Pyarrow >=7 required
  • FILTERS use query syntax and trigger reading the dataset
  • FEDERATED field configuration
  • List columns support sorting and filtering
  • Group by and aggregate optimizations
  • Dataset scanning

0.6 - 2021-10-28

  • Pyarrow >=6 required
  • Group by optimized and replaced unique field
  • Dictionary related optimizations
  • Null consistency with arrow count functions

0.5 - 2021-08-06

  • Pyarrow >=5 required
  • Stricter validation of inputs
  • Columns can be cast to another arrow data type
  • Grouping uses large list arrays with 64-bit counts
  • Datasets are read on-demand or optionally at startup

0.4 - 2021-05-16

  • Pyarrow >=4 required
  • sort updated to use new native routines
  • partition tables by adjacent values and differences
  • filter supports unknown column types using tagged union pattern
  • Groups replaced with Table.tables and Table.aggregate fields
  • Tagged unions used for filter, apply, and partition functions

0.3 - 2021-01-31

  • Pyarrow >=3 required
  • any and all fields
  • String column split field

0.2 - 2020-11-26

  • Pyarrow >= 2 required
  • ListColumn and StructColumn types
  • Groups type with aggregate field
  • group and unique optimized
  • Statistical fields: mode, stddev, variance
  • is_in, min, and max optimized