examples: minor tweak on llm_as_a_judge example #1284

Merged
merged 2 commits into openai:main on Jul 29, 2025

Conversation

mshsheikh (Contributor) commented Jul 28, 2025

  • Added trailing spaces between adjacent string literals for clearer agent prompts
  • Inserted missing “the” in evaluator instructions for grammatical accuracy
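
The spacing fix matters because Python joins adjacent string literals with nothing in between. A minimal sketch of the effect, using the first two instruction strings from the diff (the variable names `run_together` and `spaced` are illustrative, not from the example file):

```python
# Adjacent string literals inside parentheses are concatenated at compile
# time, with no separator inserted between them.
run_together = (
    "You evaluate a story outline and decide if it's good enough."
    "If it's not good enough, you provide feedback on what needs to be improved."
)

# A trailing space on the first literal keeps the two sentences apart.
spaced = (
    "You evaluate a story outline and decide if it's good enough. "
    "If it's not good enough, you provide feedback on what needs to be improved."
)

print("enough.If" in run_together)  # True: the sentences run together
print("enough. If" in spaced)       # True: the sentences are separated
```

The model receiving the prompt would likely cope with either form, but the spaced version reads as intended when the prompt is logged or inspected.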

* Introduced `max_attempts` counter to prevent infinite judging loops (defaults to 5)
* Added trailing spaces between adjacent string literals for clearer agent prompts
* Inserted missing “the” in evaluator instructions for grammatical accuracy
@@ -30,9 +30,9 @@ class EvaluationFeedback:
evaluator = Agent[None](
name="evaluator",
instructions=(
"You evaluate a story outline and decide if it's good enough."
"If it's not good enough, you provide feedback on what needs to be improved."
"Never give it a pass on the first try. After 5 attempts, you can give it a pass if story outline is good enough - do not go for perfection"
Member

The instructions allow running 5+ times, so the max_attempts changes in this PR are inconsistent with them. If you remove the max_attempts logic etc., we are happy to merge the other changes.

Contributor Author

Thanks for pointing that out. The max_attempts logic has been removed, and only the instruction formatting fixes are kept.

@seratch seratch added the documentation Improvements or additions to documentation label Jul 28, 2025
@mshsheikh mshsheikh requested a review from seratch July 29, 2025 04:07
@@ -46,6 +46,8 @@ async def main() -> None:

# We'll run the entire workflow in a single trace
with trace("LLM as a judge"):
max_attempts = 5
Member

As I mentioned above, please remove this additional logic, which is not necessary.

Contributor Author

Apologies, I clicked the review request by mistake before reverting the changes.

* Reverted the `max_attempts` safeguard logic to restore the intended behavior of the LLM-as-a-judge pattern. The evaluator instructions use the phrase “After 5 attempts, you *can* give it a pass,” indicating discretion rather than a strict limit.
* Retained fixes to instruction string formatting to avoid run-together text (e.g., “input.If there” → “input. If there”). 
* Corrected a minor grammatical issue by adding “the” before “story outline” for proper English usage.
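
To illustrate the difference between evaluator discretion and a strict limit, here is a rough sketch that does not use the Agents SDK at all. Everything in it is hypothetical: `Verdict` stands in for the example's `EvaluationFeedback` (whose fields are not shown in this thread), and `judge_loop`, `fake_generate`, and `fake_evaluate` are stubs invented for the illustration. It is not the code that was proposed or reverted in this PR.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Verdict:
    # Hypothetical stand-in for the example's EvaluationFeedback.
    score: str  # "pass" or "needs_improvement"
    feedback: str


def judge_loop(
    generate: Callable[[str], str],
    evaluate: Callable[[str, int], Verdict],
    max_attempts: Optional[int] = None,
) -> tuple[str, int]:
    """Regenerate until the judge passes; optionally stop at a hard cap."""
    attempts = 0
    feedback = ""
    while True:
        attempts += 1
        outline = generate(feedback)
        verdict = evaluate(outline, attempts)
        if verdict.score == "pass":
            return outline, attempts
        # Hard cap: the kind of safeguard this PR ended up not adding.
        if max_attempts is not None and attempts >= max_attempts:
            return outline, attempts
        feedback = verdict.feedback


def fake_generate(feedback: str) -> str:
    # Stub generator: just records the feedback it was given.
    return f"outline (revised after: {feedback or 'nothing'})"


def fake_evaluate(outline: str, attempt: int) -> Verdict:
    # Stub judge that uses its discretion and only passes on attempt 7.
    score = "pass" if attempt >= 7 else "needs_improvement"
    return Verdict(score, f"round {attempt} notes")


_, n_discretion = judge_loop(fake_generate, fake_evaluate)
_, n_capped = judge_loop(fake_generate, fake_evaluate, max_attempts=5)
print(n_discretion, n_capped)  # 7 5
```

With a hard cap of 5, the loop would cut off a judge that is still entitled to ask for more revisions, which is the inconsistency the reviewer pointed out.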
@seratch seratch changed the title from "feat: add loop safeguard and fix instruction spacing/grammar" to "examples: minor tweak on llm_as_a_judge example" Jul 29, 2025
@seratch seratch merged commit 4cb07d5 into openai:main Jul 29, 2025
5 checks passed
Labels
documentation Improvements or additions to documentation
2 participants