WIP ragnar_chat #55

dfalbel · 2025-06-27T21:25:22Z

Implements an initial version of ragnar chat, adding support for eg:

Forced tool calling
Query rewriting
Context pruning

This is pretty much work in progress for discussion. Relies on a few ellmer internals, etc

t-kalinowski · 2025-06-30T16:34:35Z

Thanks Daniel, this is a great start!

Some questions that came to mind while I was reading through, in no particular order:

What do you think about the main interface taking a reified Chat object instead of a function and lazy ...?
Pruning all tool calls seems too broad. Should we constrain it to just retrieval results? Would it make sense to extend and define a custom class like S7::new_class("ContentRagnarRetrievalResult", parent = ContentToolResult)?
Would it make sense to leave the chunks as dataframes, instead of converting to JSON?

Out of curiosity, I generated a turns list with tool calls and results and used it to seed a chat history with a model that does not natively support tool calls (ollama serving gemma3n). The conversation works, but when asked to perform additional tool calls, the model stated it performed a search when it did not, then proceeded to hallucinate details.

I think we can tweak this approach lightly to better support RAG chat with models that do not support tool calling. Perhaps we can maintain a single rolling RAG context at the head of the chat that is updated every turn?
Any thoughts about how the forced tool calling could allow for simple query rewriting by the model?
What do you think about including a unique id exclusion check in the tool definition, so that the retrieve tool call cannot return chunks that are already in context (similar to what we did in quartohelp)?

dfalbel · 2025-06-30T16:48:53Z

What do you think about the main interface taking a reified Chat object instead of a function and lazy ...?

IIUC you are suggesting that we do ragnar_chat(chat_openai(...), store). I think we can do that. The reasoning for the current API is making sure that a new chat is created. Ie:

chat <- chat_openai()
out <- ragnar_chat(chat)

Could cause the impression that out and chat share history, etc. But really, I don't have any data on what's best.

Pruning all tool calls seems too broad. Should we constrain it to just retrieval results? Would it make sense to extend and define a custom class like S7::new_class("ContentRagnarRetrievalResult", parent = ContentToolResult)?

We don't prune all tool calls, only those added by ragnar, by using the tool definition name:

ragnar/R/ragnar-chat.R

Lines 156 to 159 in 993c042

    
           if (content@name != self$ragnar_tool_def@name) { 
        
             next 
        
           }

Would it make sense to leave the chunks as dataframes, instead of converting to JSON?

The LLM will see just text. What kind of text representation of data.frames do you think it should see?
The justification for using json is that the tool call result is already a json string, so it could make sense for results to keep using this format.

I think we can tweak this approach lightly to better support RAG chat with models that do not support tool calling. Perhaps we can maintain a single rolling RAG context at the head of the chat that is updated every turn?

Yes, we can have a method for doing so. Eg managing tool calls at the top of the chat.

Any thoughts about how the forced tool calling could allow for simple query rewriting by the model?

I think we can provide helpers for this. Not hard currently though, just write a function that takes a user chat and returns
a few queries...

What do you think about including a unique id exclusion check in the tool definition, so that the retrieve tool call cannot return chunks that are already in context (similar to what we did in quartohelp)?

Yes, we can do this. I think this is part of deoverlapping, so I think we could add this once we know the interface for ragnar_deoverlap.

I think we can discuss this in follow up PR's though, once we agreee on the initial ihnterface.

t-kalinowski marked this pull request as draft June 30, 2025 18:00

dfalbel mentioned this pull request Jul 4, 2025

Refactor app: t-kalinowski/quartohelp#8

Draft

dfalbel added 9 commits July 7, 2025 16:28

WIP ragnar_chat

5633921

Add support for removing chunks by id

480a49e

Add uniqueness checks

884f686

Add support for inserting chunks at the top of the turns list

d1e665e

fix insert document

d2f8686

Update signature

ffa94bc

Summarize tool calls

c429ffe

Make ellmer required

67a84f3

Fixes for rebasing

0db8110

dfalbel force-pushed the ragnar-chat branch from f503848 to 0db8110 Compare July 7, 2025 19:34

t-kalinowski mentioned this pull request Jul 18, 2025

Issue recognizing tool for local LLM #84

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

WIP ragnar_chat #55

WIP ragnar_chat #55

Uh oh!

dfalbel commented Jun 27, 2025 •

edited

Loading

Uh oh!

t-kalinowski commented Jun 30, 2025

Uh oh!

dfalbel commented Jun 30, 2025

Uh oh!

Uh oh!

WIP ragnar_chat #55

Are you sure you want to change the base?

WIP ragnar_chat #55

Uh oh!

Conversation

dfalbel commented Jun 27, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

t-kalinowski commented Jun 30, 2025

Uh oh!

dfalbel commented Jun 30, 2025

Uh oh!

Uh oh!

dfalbel commented Jun 27, 2025 •

edited

Loading