Skip to content

WIP ragnar_chat #55

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft
wants to merge 9 commits into
base: main
Choose a base branch
from
Draft

WIP ragnar_chat #55

wants to merge 9 commits into from

Conversation

dfalbel
Copy link
Collaborator

@dfalbel dfalbel commented Jun 27, 2025

Implements an initial version of ragnar chat, adding support for eg:

  • Forced tool calling
  • Query rewriting
  • Context pruning

This is pretty much work in progress for discussion. Relies on a few ellmer internals, etc

@t-kalinowski
Copy link
Collaborator

Thanks Daniel, this is a great start!

Some questions that came to mind while I was reading through, in no particular order:

  • What do you think about the main interface taking a reified Chat object instead of a function and lazy ...?

  • Pruning all tool calls seems too broad. Should we constrain it to just retrieval results? Would it make sense to extend and define a custom class like S7::new_class("ContentRagnarRetrievalResult", parent = ContentToolResult)?

  • Would it make sense to leave the chunks as dataframes, instead of converting to JSON?

Out of curiosity, I generated a turns list with tool calls and results and used it to seed a chat history with a model that does not natively support tool calls (ollama serving gemma3n). The conversation works, but when asked to perform additional tool calls, the model stated it performed a search when it did not, then proceeded to hallucinate details.

  • I think we can tweak this approach lightly to better support RAG chat with models that do not support tool calling. Perhaps we can maintain a single rolling RAG context at the head of the chat that is updated every turn?

  • Any thoughts about how the forced tool calling could allow for simple query rewriting by the model?

  • What do you think about including a unique id exclusion check in the tool definition, so that the retrieve tool call cannot return chunks that are already in context (similar to what we did in quartohelp)?

@dfalbel
Copy link
Collaborator Author

dfalbel commented Jun 30, 2025

What do you think about the main interface taking a reified Chat object instead of a function and lazy ...?

IIUC you are suggesting that we do ragnar_chat(chat_openai(...), store). I think we can do that. The reasoning for the current API is making sure that a new chat is created. Ie:

chat <- chat_openai()
out <- ragnar_chat(chat)

Could cause the impression that out and chat share history, etc. But really, I don't have any data on what's best.

Pruning all tool calls seems too broad. Should we constrain it to just retrieval results? Would it make sense to extend and define a custom class like S7::new_class("ContentRagnarRetrievalResult", parent = ContentToolResult)?

We don't prune all tool calls, only those added by ragnar, by using the tool definition name:

ragnar/R/ragnar-chat.R

Lines 156 to 159 in 993c042

if (content@name != self$ragnar_tool_def@name) {
next
}

Would it make sense to leave the chunks as dataframes, instead of converting to JSON?

The LLM will see just text. What kind of text representation of data.frames do you think it should see?
The justification for using json is that the tool call result is already a json string, so it could make sense for results to keep using this format.


I think we can tweak this approach lightly to better support RAG chat with models that do not support tool calling. Perhaps we can maintain a single rolling RAG context at the head of the chat that is updated every turn?

Yes, we can have a method for doing so. Eg managing tool calls at the top of the chat.

Any thoughts about how the forced tool calling could allow for simple query rewriting by the model?

I think we can provide helpers for this. Not hard currently though, just write a function that takes a user chat and returns
a few queries...

What do you think about including a unique id exclusion check in the tool definition, so that the retrieve tool call cannot return chunks that are already in context (similar to what we did in quartohelp)?

Yes, we can do this. I think this is part of deoverlapping, so I think we could add this once we know the interface for ragnar_deoverlap.

I think we can discuss this in follow up PR's though, once we agreee on the initial ihnterface.

@t-kalinowski t-kalinowski marked this pull request as draft June 30, 2025 18:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants