Red Candle Provider #404
Conversation
I do see 4 rubocop offenses. Can you clean those up?
Love the test helper.
```ruby
# Red Candle doesn't provide token counts, but we can estimate them
content = result[:content]
# Rough estimation: ~4 characters per token
estimated_output_tokens = (content.length / 4.0).round
```
Is this just for funsies?
I noticed a few things like this (infinity tokens per dollar) that I don't see in the ollama provider. While adding these lines of code may have value, I'm not really seeing it.
You're definitely right about the infinity tokens per dollar, that was a little too cute. I removed the whole pricing bit (I don't think it's necessary).
As for the `estimated_output_tokens`, the specs require this in these two places:
- https://github.com/crmne/ruby_llm/blob/main/spec/ruby_llm/chat_spec.rb#L18-L19
- https://github.com/crmne/ruby_llm/blob/main/spec/ruby_llm/chat_streaming_spec.rb#L43-L44
We can get real token counts from `red-candle`, but we'd need to retokenize, which seems wasteful (and I couldn't figure out how to reasonably get access to the underlying `Candle::LLM` right here), so we decided to estimate. I'm open to other methods of estimating; we could split on a regex or something, this just seemed simple and efficient.
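For comparison, here is a minimal sketch of the two estimation approaches discussed above. The 4-characters-per-token ratio is the heuristic from this PR; `estimate_by_words` is a hypothetical regex-split alternative, not code from the branch:

```ruby
# Heuristic from this PR: roughly 4 characters per token.
def estimate_by_chars(content)
  (content.length / 4.0).round
end

# Hypothetical alternative: count word-ish chunks and punctuation with a regex.
def estimate_by_words(content)
  content.scan(/\w+|[^\w\s]/).length
end

text = "Red Candle runs GGUF models locally, with no API calls."
estimate_by_chars(text) # => 14
estimate_by_words(text) # => 12
```

Both are rough; which is closer to the real tokenizer depends on the model's vocabulary.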
```ruby
end

def render_payload(messages, tools:, temperature:, model:, stream:, schema:) # rubocop:disable Metrics/ParameterLists
  # Red Candle doesn't support tools
```
Sad. At least it has structured generation.
This is a planned red-candle feature, just not there yet.
spec/spec_helper.rb
```ruby
require_relative 'support/streaming_error_helpers'
require_relative 'support/provider_capabilities_helper'

# Handle Red Candle provider based on availability and environment
```
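The guard in `spec_helper.rb` presumably follows the standard optional-dependency pattern. A minimal sketch, assuming the require name matches the gem name (the actual helper code in the branch may differ):

```ruby
# Attempt to load the optional gem; fall back gracefully if it isn't bundled.
red_candle_available =
  begin
    require 'red-candle'
    true
  rescue LoadError
    false
  end

# Specs can then branch on availability instead of failing at load time.
puts red_candle_available ? 'Red Candle specs enabled' : 'Red Candle specs skipped'
```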
May consider putting this in a separate file to follow the pattern set.
Done.
Looks like this won't work with Ruby 3.1 (which is currently part of the CI for RubyLLM). Probably need to figure this out. I tried it out though, and love it!

@tpaulshippy Thank you for the review and the feedback! We've made some changes and I think this is ready for another look.

This is unfortunately a blocker. I'm not gonna drop Ruby 3.1 support soon, as I know many users of RubyLLM are still running that. I think this patch may need to wait.

9ab992d resolved this.

@crmne With the 3.1 blocker removed (it turned out to be already working and just needed testing), is there anything else keeping this from moving forward, in your opinion?
What this does
This PR adds support for the Red Candle provider, enabling local LLM execution using quantized GGUF models directly in Ruby without requiring external API calls.
Key Implementation Details
Red Candle is fundamentally different from other providers: while all other RubyLLM providers communicate via HTTP APIs, Red Candle runs models locally using the Candle Rust crate. This brings true local inference to Ruby, with no network latency or API costs.
Dependency Management
Since Red Candle requires a Rust toolchain at build time, we've made it optional at two levels:

- `red-candle` is NOT a gemspec dependency. Users must explicitly add `gem 'red-candle'` to their Gemfile to use this provider.
- For development, enable the optional bundler group with `bundle config set --local with red_candle`.
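A sketch of those two opt-in levels as Gemfile fragments. The optional group name `red_candle` matches the bundle config commands in this PR; the exact Gemfile layout is an assumption:

```ruby
# Gemfile of an application using the provider: an ordinary explicit dependency.
gem 'red-candle'

# Gemfile of the RubyLLM repo itself: an optional Bundler group, enabled via
#   bundle config set --local with red_candle
group :red_candle, optional: true do
  gem 'red-candle'
end
```

Bundler skips optional groups unless they are listed in the `with` config, so CI and contributors without a Rust toolchain are unaffected by default.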
Testing Strategy
We implemented a comprehensive mocking system to keep tests fast:

- Uses `MockCandleModel` to simulate responses without actual inference
- Set `RED_CANDLE_REAL_INFERENCE=true` to run actual model inference (downloads models on first run, ~4.5 GB)

Changes Made
- Added `RubyLLM::Providers::RedCandle` with full chat support including streaming
- Added `red_candle_test_helper.rb`
- Updated `ruby_llm.rb` and `spec_helper.rb` to handle the optional dependency
- Updated `models_to_test.rb` to conditionally include Red Candle models
- Updated `CONTRIBUTING.md` for managing the optional dependency

How to Test
Once `red-candle` is enabled, turn it back off with:

```shell
bundle config unset with
```

And turn it BACK on with:

```shell
bundle config set --local with red_candle
```

Try it out:

```shell
bundle exec irb
```
Type of change

Scope check

Quality check

- Ran `overcommit --install` and all hooks pass
- Tested with `bundle exec rake vcr:record[provider_name]` and `bundle exec rspec`
- No manual edits to auto-generated files (`models.json`, `aliases.json`)

API changes
Related issues
Fixes #394