feat: Add sqlite offline store #5319

Draft · wants to merge 7 commits into base: master
58 changes: 58 additions & 0 deletions docs/reference/offline-stores/sqlite.md
@@ -0,0 +1,58 @@
# SQLite offline store

## Description

The SQLite offline store provides support for reading and writing feature data using SQLite databases. It's a lightweight, file-based offline store option that's ideal for local development, testing, and small-scale deployments.

* Entity dataframes can be provided as a Pandas dataframe or as a SQL query.
* The SQLite offline store supports all core Feast functionality including point-in-time joins.
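As an illustrative sketch (not code from this PR), the point-in-time correct join mentioned above can be expressed directly in SQLite SQL: for each row of the entity dataframe, pick the feature row with the greatest event timestamp that is not after the entity row's timestamp. The table and column names below (`driver_stats`, `entity_df`, `event_ts`, `conv_rate`) are hypothetical, and Python's built-in `sqlite3` module stands in for the store:

```python
import sqlite3

# In-memory database standing in for the offline store file (e.g. data/offline.db).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE driver_stats (driver_id INTEGER, event_ts TEXT, conv_rate REAL);
INSERT INTO driver_stats VALUES
  (1001, '2024-01-01 00:00:00', 0.10),
  (1001, '2024-01-02 00:00:00', 0.20),
  (1002, '2024-01-01 12:00:00', 0.50);

CREATE TABLE entity_df (driver_id INTEGER, event_ts TEXT);
INSERT INTO entity_df VALUES
  (1001, '2024-01-01 12:00:00'),
  (1002, '2024-01-02 00:00:00');
""")

# Point-in-time join: for each entity row, take the feature row with the
# greatest event_ts that does not lie in the entity row's future.
rows = conn.execute("""
SELECT e.driver_id, e.event_ts, f.conv_rate
FROM entity_df e
LEFT JOIN driver_stats f
  ON f.driver_id = e.driver_id
 AND f.event_ts = (SELECT MAX(f2.event_ts) FROM driver_stats f2
                   WHERE f2.driver_id = e.driver_id
                     AND f2.event_ts <= e.event_ts)
ORDER BY e.driver_id
""").fetchall()
print(rows)
# Driver 1001 gets the 0.10 value (the 0.20 row is after its timestamp);
# driver 1002 gets the 0.50 value.
```

The correlated subquery is the key idea: it prevents feature leakage from the future, which is what "point-in-time correct" means.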

## Getting started

In order to use this offline store, you'll need to run `pip install 'feast[sqlite]'`.

## Example

{% code title="feature_store.yaml" %}
```yaml
project: my_project
registry: data/registry.db
provider: local
offline_store:
type: sqlite
path: data/offline.db
online_store:
path: data/online_store.db
```
{% endcode %}

## Functionality Matrix

The set of functionality supported by offline stores is described in detail [here](overview.md#functionality).
Below is a matrix indicating which functionality is supported by the SQLite offline store.

| | SQLite |
| :----------------------------------------------------------------- | :----- |
| `get_historical_features` (point-in-time correct join) | yes |
| `pull_latest_from_table_or_query` (retrieve latest feature values) | yes |
| `pull_all_from_table_or_query` (retrieve a saved dataset) | yes |
| `offline_write_batch` (persist dataframes to offline store) | yes |
| `write_logged_features` (persist logged features to offline store) | yes |

Below is a matrix indicating which functionality is supported by `SqliteRetrievalJob`.

| | SQLite |
| ----------------------------------------------------- | ------ |
| export to dataframe | yes |
| export to arrow table | yes |
| export to arrow batches | no |
| export to SQL | no |
| export to data lake (S3, GCS, etc.) | no |
| export to data warehouse | no |
| export as Spark dataframe | no |
| local execution of Python-based on-demand transforms | yes |
| remote execution of Python-based on-demand transforms | no |
| persist results in the offline store | yes |
| preview the query plan before execution | no |
| read partitioned data | no |

To compare this set of functionality against other offline stores, please see the full [functionality matrix](overview.md#functionality-matrix).
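To illustrate the "retrieve latest feature values" row of the matrix (again a hedged sketch with hypothetical table names, not the PR's implementation), `pull_latest_from_table_or_query`-style retrieval reduces to one row per entity carrying its most recent values. SQLite makes this concise: when a query uses a bare `MAX()` aggregate, non-aggregated columns are taken from the row that supplied the maximum:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE driver_stats (driver_id INTEGER, event_ts TEXT, trips INTEGER);
INSERT INTO driver_stats VALUES
  (1001, '2024-01-01', 3),
  (1001, '2024-01-03', 7),
  (1002, '2024-01-02', 5);
""")

# One row per entity with its most recent feature values. The bare "trips"
# column comes from the row that produced MAX(event_ts), per SQLite's
# documented bare-column behavior for min()/max() queries.
latest = conn.execute("""
SELECT driver_id, MAX(event_ts) AS event_ts, trips
FROM driver_stats
GROUP BY driver_id
ORDER BY driver_id
""").fetchall()
print(latest)
```

Note that this bare-column shortcut is SQLite-specific; a portable implementation would use a correlated subquery or window function instead.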
1 change: 1 addition & 0 deletions infra/feast-operator/api/v1alpha1/featurestore_types.go
@@ -309,6 +309,7 @@ var ValidOfflineStoreFilePersistenceTypes = []string{
"dask",
"duckdb",
"file",
"sqlite",
}

// OfflineStoreDBStorePersistence configures the DB store persistence for the offline store service