feat: add JSON support to file_utils module #364

AlbertHuangKSFO

📝 Summary

  • Add get_examples_from_json function for loading test examples from JSON files
  • Add comprehensive module documentation with usage examples
  • Add json import for JSON file parsing
  • Fix formatting issues in YAML documentation
  • Enhance file_utils to support both YAML and JSON file formats

✅ Checklist

  • Tagged Linear ticket in PR title, e.g. PR Title (JUD-XXXX)
  • Video demo of changes
  • Reviewers assigned

@gemini-code-assist gemini-code-assist bot left a comment

Summary of Changes

Hello @AlbertHuangKSFO, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request extends the file_utils module to load test examples from JSON files, alongside the existing YAML functionality, so users can define and manage their test datasets in either format. I've also improved the module documentation for clarity.

Highlights

  • New Feature: JSON Support: I've added a new function, get_examples_from_json, to the file_utils module. This function allows loading test examples from JSON files, mirroring the existing functionality for YAML files. It expects the JSON file to contain a top-level 'examples' key, whose value is a list of dictionaries representing Example objects.
  • Enhanced Module Documentation: I've introduced a comprehensive module-level docstring to file_utils.py. This docstring provides clear usage examples for both get_examples_from_yaml and the newly added get_examples_from_json functions, demonstrating how to load examples and use them in an evaluation context.
  • Error Handling for JSON Parsing: The new get_examples_from_json function includes robust error handling for common issues such as FileNotFoundError if the specified JSON file does not exist, and json.JSONDecodeError if the file content is not valid JSON. It also checks for an empty JSON file.
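
As a rough sketch of the behavior described in the highlights above (not the code from this PR), a loader along these lines would read the file, reject empty content, surface `FileNotFoundError` and `json.JSONDecodeError`, and require a top-level `examples` list. The function name `load_examples_from_json`, the use of plain dictionaries in place of `Example` objects, and the choice of `ValueError` for empty or malformed content are assumptions made for illustration.

```python
import json
from pathlib import Path
from typing import Any, Dict, List


def load_examples_from_json(path: str) -> List[Dict[str, Any]]:
    """Hypothetical stand-in for get_examples_from_json; returns raw dicts."""
    file_path = Path(path)
    if not file_path.is_file():
        raise FileNotFoundError(f"JSON file not found: {path}")

    text = file_path.read_text(encoding="utf-8")
    if not text.strip():
        # Assumption: an empty file is reported as a ValueError.
        raise ValueError(f"JSON file is empty: {path}")

    # json.loads raises json.JSONDecodeError when the content is not valid JSON.
    data = json.loads(text)

    if not isinstance(data, dict) or "examples" not in data:
        raise ValueError("expected a top-level 'examples' key")
    examples = data["examples"]
    if not isinstance(examples, list):
        raise ValueError("'examples' must be a list of objects")
    return examples
```

In the real module each dictionary would presumably be passed to the `Example` constructor; that step is left out here because the model definition is not shown in this thread.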

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request adds support for loading test examples from JSON files. The review focuses on correctness and documentation clarity, specifically addressing the function's return type hint and the example JSON in the docstring.

Comment on lines +86 to +115
{
    "examples": [
        {
            "input": "test input",
            "actual_output": "test output",
            "expected_output": "expected output",
            "context": [
                "context1",
                "context2"
            ],
            "retrieval_context": [
                "retrieval1"
            ],
            "additional_metadata": {
                "key": "value"
            },
            "tools_called": [
                "tool1"
            ],
            "expected_tools": [
                {"tool_name": "tool1", "parameters": {"query": "test query 1"}},
                {"tool_name": "tool2", "parameters": {"query": "test query 2"}}
            ],
            "name": "test example",
            "example_id": null,
            "timestamp": "20241230_160117",
            "trace_id": "123"
        }
    ]
}

critical

The example JSON provided in the docstring has critical issues:

  1. Invalid Syntax: It is missing commas between most of the key-value pairs, making it invalid JSON.
  2. Inconsistent Fields: It uses "example_id": null and a timestamp field. These are inconsistent with the Example Pydantic model, where example_id is a non-optional str and the time-related field is named created_at.

Providing a syntactically invalid and factually incorrect example is highly misleading. The example should be corrected to be valid JSON and accurately reflect the Example data model.

        {
            "input": "test input",
            "actual_output": "test output",
            "expected_output": "expected output",
            "context": [
                "context1",
                "context2"
            ],
            "retrieval_context": [
                "retrieval1"
            ],
            "additional_metadata": {
                "key": "value"
            },
            "tools_called": [
                "tool1"
            ],
            "expected_tools": [
                {"tool_name": "tool1", "parameters": {"query": "test query 1"}},
                {"tool_name": "tool2", "parameters": {"query": "test query 2"}}
            ],
            "name": "test example",
            "example_id": "a-valid-uuid-string",
            "timestamp": "20241230_160117",
            "trace_id": "123"
        }
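
Both problems named in this comment can be caught mechanically before a docstring example ships. The sketch below parses a snippet with `json.loads` and then compares its keys against a field list. `ASSUMED_FIELDS` is a hypothetical stand-in built only from the field names mentioned in this thread (with `created_at` in place of `timestamp`); it is not the real `Example` model, and `check_example` is an illustrative helper, not part of `file_utils`.

```python
import json
from typing import List

# Assumption: simplified field list taken from names that appear in this thread.
ASSUMED_FIELDS = {
    "input", "actual_output", "expected_output", "context",
    "retrieval_context", "additional_metadata", "tools_called",
    "expected_tools", "name", "example_id", "created_at", "trace_id",
}


def check_example(snippet: str) -> List[str]:
    """Return a list of problems found in one JSON example object."""
    try:
        example = json.loads(snippet)
    except json.JSONDecodeError as err:
        return [f"invalid JSON: {err.msg} (line {err.lineno})"]
    if not isinstance(example, dict):
        return ["example must be a JSON object"]

    problems = []
    for key in sorted(set(example) - ASSUMED_FIELDS):
        problems.append(f"unknown field: {key}")
    # The review states that example_id is a non-optional str.
    if not isinstance(example.get("example_id"), str):
        problems.append("example_id must be a string")
    return problems


missing_comma = '{"input": "test input" "name": "test example"}'
null_id = '{"input": "test input", "example_id": null, "timestamp": "20241230_160117"}'

print(check_example(missing_comma))  # a single "invalid JSON: ..." entry
print(check_example(null_id))
# → ['unknown field: timestamp', 'example_id must be a string']
```

Under the assumed field list, the suggested replacement above would still be flagged for its `timestamp` key, since the comment itself says the model's time-related field is named `created_at`.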

Remove unnecessary type

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>