"I was at SF Ruby, in San Francisco, a few weeks ago. Most of the tracks were, of course, heavily focused on AI"
It may be the current "Zeitgeist", but I find the addiction to AI annoying. I am not denying that there are net-positive use cases, but there are also numerous bad examples of AI use, and those, IMO, outnumber the positive ones.
What does the end user do with the AI chat? It sounds like they can just use it to do searches of client information… which the existing site would already do.
For what it’s worth: yes, it’s not technically true, but the reason it’s sticking around is that it conveys a deeply felt (and actually true) sentiment that many, many people share: the output of generative AI isn’t worth the input.
Was there any concern about giving the LLM access to this return data? Reading your article I wondered if there could be an approach that limits the LLM to running the function calls without ever seeing the output itself fully, e.g., only seeing the start of a JSON string with a status like “success” or “not found”. But I guess it would be complicated to have a continuous conversation that way.
> No model should ever know Jon Snow’s phone number from a SaaS service, but this approach allows this sort of retrieval.
This reads to me like they think that the response from the tool doesn’t go back to the LLM.
I’ve not worked with tools but my understanding is that they’re a way to allow the LLM to request additional data from the client. Once the client executes the requested function, that response data then goes to the LLM to be further processed into a final response.
That would be the normal pattern. But you could certainly stop after the LLM picks the tool and provides the arguments, and not present the result back to the model.
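To make that concrete, here's a minimal sketch of the "stop after tool selection" pattern in Ruby, going straight at the OpenAI chat completions endpoint rather than through a library. The `lookup_contact` tool, the `Contact` model, and the model name are hypothetical; the request/response shapes follow OpenAI's documented tools API. The trade-off noted above is real: since no result is ever sent back, there's no second model turn, so no conversational follow-up.

```ruby
require "net/http"
require "json"
require "uri"

# Advertise one callable function to the model (hypothetical tool).
TOOLS = [{
  type: "function",
  function: {
    name: "lookup_contact",
    description: "Find a contact by name",
    parameters: {
      type: "object",
      properties: { name: { type: "string" } },
      required: ["name"]
    }
  }
}]

uri = URI("https://api.openai.com/v1/chat/completions")
req = Net::HTTP::Post.new(uri,
  "Content-Type"  => "application/json",
  "Authorization" => "Bearer #{ENV.fetch('OPENAI_API_KEY')}")
req.body = {
  model: "gpt-4o-mini",
  messages: [{ role: "user", content: "Pull up Jon Snow's contact details" }],
  tools: TOOLS
}.to_json

res  = Net::HTTP.start(uri.hostname, uri.port, use_ssl: true) { |h| h.request(req) }
call = JSON.parse(res.body).dig("choices", 0, "message", "tool_calls", 0)

if call
  args   = JSON.parse(call.dig("function", "arguments"))
  record = Contact.find_by(name: args["name"]) # our code sees the data...
  puts record.to_json                          # ...the model never does: we
  # simply never send a role: "tool" message back with the result.
end
```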
The use of RubyLLM here is interesting. I'm trying to contrast it with my own use of DSPy.rb, which so far I've been quite happy with for small experiments.
Does anyone have a comparison of the two, or any other libraries?
Maintainer of DSPy.rb here. The key difference is the level of abstraction:
RubyLLM gives you a clean API for LLM calls and tool definitions. You're still writing prompts and managing conversations directly.
DSPy.rb treats prompts as functions with typed signatures. You define inputs/outputs and the framework handles prompt construction, JSON parsing, and structured extraction. Two articles that might help:
1. "Building Your First ReAct Agent" shows how to build tool-using agents with type-safe tool definitions [0].
2. "Building Chat Agents with Ephemeral Memory" demonstrates context engineering patterns (what the LLM sees vs. what you store), cost-based routing between models, and memory management [1].
The article's approach (RubyLLM + single tool) works great for simple cases. DSPy.rb shines when you need to decompose into multiple specialized modules with different concerns: for example, separate signatures for classification vs. response generation, each optimized independently, each with its own context window and memory.
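Roughly, a signature looks like this (adapted from the README's sentiment example; exact details may vary by version, and LM setup via `DSPy.configure` is omitted):

```ruby
require 'dspy'

class Classify < DSPy::Signature
  description "Classify the sentiment of a sentence."

  class Sentiment < T::Enum
    enums do
      Positive = new('positive')
      Negative = new('negative')
      Neutral  = new('neutral')
    end
  end

  input do
    const :sentence, String
  end

  output do
    const :sentiment, Sentiment
    const :confidence, Float
  end
end

# The framework builds the prompt, parses the JSON, and hands back
# typed values -- no hand-rolled extraction.
classify = DSPy::Predict.new(Classify)
result   = classify.call(sentence: "RubyLLM made this demo painless.")
result.sentiment  # => Sentiment::Positive (a typed enum, not a string)
```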
Would love to learn how dspy.rb is working for you!
Note that RubyLLM and DSPy.rb aren't mutually exclusive: the `dspy-ruby_llm` adapter gem gives us access to a TON of providers.
[0] https://oss.vicente.services/dspy.rb/blog/articles/react-age... [1] https://oss.vicente.services/dspy.rb/blog/articles/ephemeral...
Thanks for sharing your experience! I know there are many of us out there dabbling with LLMs, and some solid businesses built on Ruby, lurking in the background without publishing much.
Your single-tool approach is a solid starting point. As it grows, you might hit context window limits and find the prompt getting unwieldy. You end up asking questions like: why is this prompt choking on 1.5 MB of JSON coming from some other API/tool?
When you look at systems like Codex CLI, they run at least four separate LLM subsystems: (1) the main agent prompt, (2) a summarizer model that watches the reasoning trace and produces user-facing updates like "Searching for test files...", (3) compaction, and (4) a reviewer agent. Each one only sees the context it needs, like a function with its own inputs and outputs. Total tokens stay similar, but signal density per prompt goes up.
DSPy.rb[0] enables this pattern in Ruby: define typed Signatures for each concern, compose them as Modules/Prompting Techniques (simple predictor, CoT, ReAct, CodeAct, your own, ...), and let each maintain its own memory scope. Three articles that show this:
- "Ephemeral Memory Chat"[1] — the Two-Struct pattern (rich storage vs. lean prompt context) plus cost-based routing between cheap and expensive models.
- "Evaluator Loops"[2] — decompose generation from evaluation: a cheap model drafts, a smarter model critiques, each with its own focused signature.
- "Workflow Router"[3] — route requests to the right model based on complexity, only escalate to expensive LLMs when needed.
And since you're already using RubyLLM, the dspy-ruby_llm adapter lets you keep your provider setup while gaining the decomposition benefits.
Thanks for coming to my TED talk. Let me know if you need someone to bounce ideas off.
[0] https://github.com/vicentereig/dspy.rb
[1] https://oss.vicente.services/dspy.rb/blog/articles/ephemeral...
[2] https://oss.vicente.services/dspy.rb/blog/articles/evaluator...
[3] https://oss.vicente.services/dspy.rb/blog/articles/workflow-...
If all this does is give you the data from a contact API, why not just let the users directly interact with the API? The LLM is just extra bloat in this case.
Surely a fuzzy search by name or some other field is a much better UI for this.
- Made a RAG in ~50 lines of Ruby (practical and efficient)
- Perform authorization on chunks in 2 lines of code (!!)
- Offload retrieval to Algolia. Since a RAG is essentially LLM + retriever, the retriever typically ends up being most of the work. So using an existing search tool (rather than setting up a dedicated vector db) could save a lot of time/hassle when building a RAG.
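Schematically, the whole thing fits in one method. This is a sketch, not the article's code: the Algolia index call and the `can_see?` check stand in for whatever retriever setup and authorization layer you actually have.

```ruby
require "ruby_llm" # assuming the article's RubyLLM setup

# index = an Algolia index of chunks, credentials configured elsewhere
def answer(question, current_user, index:)
  hits = index.search(question, hitsPerPage: 8)["hits"]

  # The "2 lines" of authorization: drop chunks this user can't see.
  allowed = hits.select { |h| current_user.can_see?(h["resource_id"]) }

  context = allowed.map { |h| h["text"] }.join("\n---\n")
  RubyLLM.chat.ask("Answer from this context only:\n#{context}\n\nQ: #{question}")
end
```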
I built a similar system in PHP, and I can tell you what the smart thing here is: accessing data using tools.
Of course tool calling and MCP are not new. But the smart thing is that by defining the tools in the context of an authenticated request, one can easily enforce the security policy of the monolith.
In my case (we may write a blog post one day), it's even neater: the agent is coded in Python, so the PHP app talks to Python over local HTTP (we are thinking about building a central microservice), and the tool calls are encoded as JSON-RPC. And yet it works.
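In Ruby terms, the same trick looks roughly like this: instantiate the tool inside the authenticated request, closing over the current user, so every lookup runs through the monolith's own policy. (Modeled on RubyLLM's Tool DSL, so check its docs for exact signatures; `accessible_contacts` is whatever your authorization layer provides.)

```ruby
class LookupContact < RubyLLM::Tool
  description "Find a contact by name"
  param :name, desc: "Full or partial contact name"

  def initialize(user)
    @user = user
  end

  def execute(name:)
    # Scoped through the app's own authorization: the model can only
    # ever retrieve rows this user was already allowed to see.
    @user.accessible_contacts
         .where("name ILIKE ?", "%#{name}%")
         .limit(5)
         .to_json
  end
end

# In the controller, inside the authenticated request:
# RubyLLM.chat.with_tool(LookupContact.new(current_user)).ask(params[:question])
```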
I had to do something similar. Ruby is awful and very immature compared to Python, so I "outsourced" the machine learning / LLM interaction to Python. The Rails service talks to it through gRPC/protobuf and it works wonderfully.
> Surely a fuzzy search by name or some other field is a much better UI for this.
We build front ends for the API to make our applications easier to use. This is just another type of front end.