Fixing some bugs in example feature repo for spark #5407

Felix-neko · 2025-05-31T18:13:20Z

Hi folks!

I've just tried running feast init -t spark and then the generated test_workflow.py (Python 3.11, PySpark 3.5, Ubuntu 24.04).

There were some errors that I was able to fix; maybe these fixes will be helpful:

- creating parquet files with timestamps in us units instead of ns (pyspark has problems reading ns timestamps)
- removing on-the-fly features that were never declared
- setting join keys for entities

Signed-off-by: Felix-neko <felix-neko@list.ru>

franciscojavierarceo · 2025-05-31T18:42:22Z

sdk/python/feast/templates/spark/feature_repo/test_workflow.py

@@ -109,8 +107,6 @@ def fetch_online_features(store, use_feature_service: bool):
        features_to_fetch = [
            "driver_hourly_stats:acc_rate",
            "driver_hourly_stats:avg_daily_trips",
-            "transformed_conv_rate:conv_rate_plus_val1",


Oh these didn't work?

This on-the-fly feature was not declared in the example feature repo for Spark, so I have just removed it from this example.

Looking at the code, the example under aws/feature_repo/example_repo.py has this ODFV:

    # Define an on demand feature view which can generate new features based on
    # existing feature views and RequestSource features
    @on_demand_feature_view(
        sources=[driver_stats_fv, input_request],
        schema=[
            Field(name="conv_rate_plus_val1", dtype=Float64),
            Field(name="conv_rate_plus_val2", dtype=Float64),
        ],
    )
    def transformed_conv_rate(inputs: pd.DataFrame) -> pd.DataFrame:
        df = pd.DataFrame()
        df["conv_rate_plus_val1"] = inputs["conv_rate"] + inputs["val_to_add"]
        df["conv_rate_plus_val2"] = inputs["conv_rate"] + inputs["val_to_add_2"]
        return df
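For readers unfamiliar with what this ODFV computes, the transformation body can be tried in plain pandas. This is only a sketch: the decorator is left out so it runs without Feast, and the input values are made up for illustration.

```python
import pandas as pd


def transformed_conv_rate(inputs: pd.DataFrame) -> pd.DataFrame:
    # Same body as the ODFV above, without the Feast decorator.
    df = pd.DataFrame()
    df["conv_rate_plus_val1"] = inputs["conv_rate"] + inputs["val_to_add"]
    df["conv_rate_plus_val2"] = inputs["conv_rate"] + inputs["val_to_add_2"]
    return df


# Hypothetical inputs: conv_rate would come from the feature view,
# val_to_add and val_to_add_2 from the request data.
inputs = pd.DataFrame(
    {
        "conv_rate": [0.5, 0.25],
        "val_to_add": [1, 2],
        "val_to_add_2": [10, 20],
    }
)

result = transformed_conv_rate(inputs)
print(result)
```

Each output column is the stored conv_rate plus one of the request-time values, which is why the feature can only be fetched once both the feature view and the request source are declared in the repo.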

Great, it works. I've restored it.


Felix-neko requested a review from a team as a code owner May 31, 2025 18:13

Felix-neko force-pushed the master branch from 7ba59b7 to 0ce35f4 Compare May 31, 2025 18:32

franciscojavierarceo reviewed May 31, 2025


Felix-neko added 2 commits June 1, 2025 00:10

Fixing online features part of this example (adding key columns, removing unimplemented part with stream data pushing). Signed-off-by: Felix-neko <felix-neko@list.ru>

0b6ed9a

Restoring ODFV for Spark example repo

62f7313

Signed-off-by: Felix-neko <felix-neko@list.ru>


Felix-neko commented May 31, 2025

franciscojavierarceo May 31, 2025

Felix-neko May 31, 2025 (edited)

franciscojavierarceo Jun 1, 2025

Felix-neko Jun 1, 2025
