15 comments

  • d4rkp4ttern 2 hours ago
    A workflow I find useful is to have multiple CLI agents running in different Tmux panes and have one consult/delegate to another using my Tmux-CLI [1] tool + skill. The advantage is that the agents' work is fully visible and I can intervene as needed (a rough sketch of the pane-to-pane mechanics is below).

    [1] https://github.com/pchalasani/claude-code-tools?tab=readme-o...
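
    Under the hood the consult step is basically tmux send-keys / capture-pane. A minimal sketch of the pane-to-pane idea (the pane ID, file path, and prompt are placeholders, and this is not the tmux-cli tool's actual interface):

      // Sketch only: send a prompt to another agent's tmux pane and
      // read back whatever is currently visible there.
      import { execFileSync } from "node:child_process";

      // Placeholder pane ID; `tmux list-panes -a` shows the real ones.
      const reviewerPane = "%2";

      // Type the prompt into the reviewer pane and press Enter.
      execFileSync("tmux", [
        "send-keys", "-t", reviewerPane,
        "Please review the diff in src/parser.ts and list any bugs.",
        "Enter",
      ]);

      // Later, capture the pane's visible contents to read the reply.
      const reply = execFileSync(
        "tmux",
        ["capture-pane", "-p", "-t", reviewerPane],
        { encoding: "utf8" },
      );
      console.log(reply);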

    • throwaway12345t 4 minutes ago
      This is cool. If Codex or Gemini CLI are supported, it would be good to have a section in the readme noting any shortcomings, etc. (I may have missed it.)
      • bahaAbunojaim 2 minutes ago
        Claude Code, Gemini, and Codex are all supported but need more testing, so I would really value feedback, bug reports, and contributions as well :D

        Contributions will be highly appreciated and credited

    • bahaAbunojaim 1 hour ago
      I will definitely look into it.
    • sharifabdel 20 minutes ago
      What prompted you to build this?
      • bahaAbunojaim 14 minutes ago
        I used to get stuck with Claude sometimes and need a different agent to take a look. Switching back and forth between those agents is a headache, and you also can't port all the context, so I thought this might help solve real blockers for many devs on larger projects.
    • petesergeant 12 minutes ago
      I've had good success with a similar workflow, most recently using it to help me build out a captive-wifi debugger[0]. In short, it worked _pretty_ well, but it was quite time intensive. That said, I think removing the human from the loop would have been insanity on this: lots of situations where there were some very poor ideas suggested that the other LLMs went along with, and others where one LLM was the sole voice of reason against the other two.

      I think my only real takeaway from all of it was that Claude is probably the best at prototyping code, while Codex makes a very strong (but pedantic) code reviewer. Gemini was all over the place, sometimes inspired, sometimes idiotic.

      0: https://github.com/pjlsergeant/captive-wifi-tool/tree/main

      • bahaAbunojaim 0 minutes ago
        This is exactly why I built Mysti: I used that flow very often and it worked well. I also added personas and skills so that it's easy to customize the agents' behavior. If you have any ideas to make the behavior better, please don't hesitate to share! Happy to jump on a call and discuss it as well.
  • mlrtime 3 hours ago
    Why make it a VSCode extension if the point of these 3 tools is a CLI interface? Most of the people I know use these tools without VSCode. Is VSCode required?
    • davidmurdoch 11 minutes ago
      Huh. I know hundreds that use LLMs in a VSCode based IDE, and 3 that use the CLI.
    • KronisLV 2 hours ago
      > Meaning most of the people I know use these tools without VSCode.

      I guess it depends?

      You can usually count on Claude Code, Codex, or Gemini CLI to support the model features best, but sometimes having a consistent UI across all of them is also nice - be it another CLI tool like OpenCode (which was a bit buggy for me when it came to copying text), or Cline/RooCode/KiloCode inside of VSC, so you don't have to install a custom editor like Cursor and can keep using your pre-existing VSC setup.

      Okay, that was a bit of a run-on sentence, but it's nice to be able to work on some context and then switch between different models inline: "Hey Sonnet, please look at the work of the previous model up until this point and validate its findings about the cause of this bug."

      I'd also love it if I could hook up some of those models (especially what Cerebras Code offers) with autocomplete so I wouldn't need Copilot either, but most of the plugins that try to do that are pretty buggy or broken (e.g. Continue.dev). KiloCode also added autocomplete, but it doesn't seem to work with BYOK.

      • bahaAbunojaim 1 hour ago
        Very true. I like the fact that I can now use them with a consistent UI, shared context, and the ability to brainstorm.

        Will definitely try to add those features in a future release as well

    • bahaAbunojaim 2 hours ago
      That’s a great idea! I can make it a CLI too
  • tiku 3 hours ago
    Anyone know of something similar but for the terminal?

    Update:

    I've already found a solution based on a comment, and modified it a bit.

    Inside Claude Code I've made a new agent that uses Gemini over MCP through https://github.com/raine/consult-llm-mcp. This seems to work! (The agent file is roughly sketched below.)

    Claude Code:

    Now let me launch the Gemini MCP specialist to build the backend monitoring server:

    gemini-mcp-specialist(Build monitoring backend server) ⎿ Running PreToolUse hook…
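
    For reference, the agent itself is just a small file under .claude/agents/. Roughly what mine looks like - the MCP server and tool names here are from memory and may be wrong, so check the consult-llm-mcp README for the exact ones:

      ---
      name: gemini-mcp-specialist
      description: Delegates design questions and second opinions to Gemini via consult-llm-mcp.
      # Tool name is a guess; Claude Code exposes MCP tools as mcp__<server>__<tool>.
      tools: mcp__consult-llm-mcp__consult_llm
      ---
      You are a specialist that forwards hard problems to Gemini.
      Summarize the task, list the relevant file paths, call the consult
      tool, then report Gemini's answer back with your own assessment.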

    • tikimcfee 2 minutes ago
      Here's a portable binary you drop in a directory to let agentic CLIs cross-communicate with other agents, store and read state, or act as the driver of arbitrary tmux sessions in parallel: https://github.com/tikimcfee/gomuxai
    • pella 2 hours ago
      https://github.com/just-every/code - "Every Code - push frontier AI to its limits. A fork of the Codex CLI with validation, automation, browser integration, multi-agents, theming, and much more. Orchestrate agents from OpenAI, Claude, Gemini or any provider." Apache 2.0; community fork.
      • bahaAbunojaim 1 hour ago
        When you say orchestrate agents, what would that do? Would it allow the same context across agents, and can I make the agents brainstorm?
    • rane 2 hours ago
      My similar workflow within Claude Code when it gets stuck is to have it consult Gemini. Works either through Gemini CLI or the API. Surprisingly powerful pattern because I've just found that Gemini is still ahead of Opus in architectural reasoning and figuring out difficult bugs. https://github.com/raine/consult-llm-mcp
      • bahaAbunojaim 2 hours ago
        This is one of the reasons I actually built it, but I wanted to make it more generalized so it works with any agent, on the same context, without switching.
      • tiku 1 hour ago
        I like this solution where you can ask Gemini.
        • bahaAbunojaim 1 hour ago
          Any other ideas that you think would make it more powerful?
          • tiku 1 hour ago
            Perhaps you could tell it to "use Gemini for task x, Claude for task y" as sub-agents.
            • bahaAbunojaim 1 hour ago
              How about adding the ability to tag an agent? For example:

              @gemini could you review the code and then provide a summary to @claude?

              @claude can you write the classes based on an architectural review by @codex

              What do you think? Does that make sense?
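
              A rough sketch of how that tag routing could work (the agent names and dispatcher here are placeholders, not Mysti's actual API):

                // Sketch only: find @agent tags in a message and pass the
                // growing context from one tagged agent to the next.
                const KNOWN = ["gemini", "claude", "codex"];

                // Stub dispatcher; the real thing would talk to the CLI agent.
                async function send(agent: string, prompt: string) {
                  return `(${agent}'s reply to ${prompt.length} chars of context)`;
                }

                async function route(message: string) {
                  // Keep the agents in the order they are mentioned.
                  const tagged = [...message.matchAll(/@(\w+)/g)]
                    .map((m) => m[1])
                    .filter((a) => KNOWN.includes(a));
                  let context = message;
                  for (const agent of tagged) {
                    const reply = await send(agent, context);
                    context += `\n[${agent}] ${reply}`;
                  }
                  return context;
                }

                route("@gemini review the code and summarize for @claude").then(console.log);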

    • esafak 1 hour ago
      • bahaAbunojaim 1 hour ago
        Interesting indeed, but would it behave the same as Claude Code or have its own behavior? I think the system prompt is one of the key things that differentiates each agent.
        • esafak 1 hour ago
          I do not understand your question. Even in Claude Code you have access to multiple models. You can have one critique the other.
    • bahaAbunojaim 2 hours ago
      I can make it work in the terminal if that would be helpful. What do you think?
    • vulture916 2 hours ago
      Pal MCP (formerly Zen) is pretty awesome.

      https://github.com/BeehiveInnovations/pal-mcp-server

      • bahaAbunojaim 1 hour ago
        Will give it a look indeed. I think one of the challenges with the MCP approach is that the context needs to be passed, which would add to the overhead of the main agent. Is that right?
        • vulture916 1 hour ago
          The CLINK command will spawn a separate CLI.

          Don’t quote me, but I think the other methods rely on passing general detail/commands and file paths to Gemini to avoid the context overhead you’re thinking about.

  • DenisM 46 minutes ago
    Multi-agent collaboration is quite likely the future. All agents have blind spots; collaboration is how they are offset.

    You may want to study [1] - this is the latest thinking on agent collaboration from Google.

    [1] https://www.linkedin.com/posts/shubhamsaboo_we-just-ran-the-...

    • bahaAbunojaim 34 minutes ago
      Thank you so much for sharing, Denis! I definitely believe in that as the world starts switching from single agents to agentic teams where each agent has specific capabilities. Do you know of any benchmarks that cover collaborative agents?
  • sorokod 32 minutes ago
    Have you tried executing multiple agents on a single model with modified prompts and having them try to reach consensus?

    That may solve the original problem of paying for three different models.

    • bahaAbunojaim 28 minutes ago
      I think you would still pay for 3x the tokens, just against a single model rather than three, but it would consolidate payment.

      I was thinking of making the model choice more dynamic per agent, so that you can use any model with any agent and have one single payment for all, rather than paying for 3 or more different tools. Is that in line with what you are saying?

      • sorokod 25 minutes ago
        Neither the original issue (having three models) nor this one (unconsolidated payments) has anything to do with the end result / quality of the output.

        Can you comment on that?

        • bahaAbunojaim 16 minutes ago
          Executing multiple agents on the same model also works.

            I find it helpful to even change the persona of the same agent (the prompt) or the model the agent is using. These variations always help, but I found that having multiple different agents with different LLMs in the backend works better.
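
            A minimal sketch of that single-model, multi-persona consensus idea (callModel is a stand-in, not a real SDK call):

              // Same model, three personas; callModel is a stub standing in
              // for whatever model API the agent actually uses.
              async function callModel(system: string, task: string) {
                return `(answer to "${task}" shaped by: ${system})`;
              }

              const personas = [
                "You are a cautious reviewer. Look for bugs.",
                "You are a pragmatic senior dev. Prefer simple fixes.",
                "You are a performance specialist. Watch hot paths.",
              ];

              async function consensus(task: string) {
                const answers = await Promise.all(
                  personas.map((p) => callModel(p, task)),
                );
                // Final pass: the same model reconciles the three answers.
                return callModel(
                  "Merge these answers and flag any disagreements.",
                  answers.join("\n---\n"),
                );
              }

              consensus("Why does the cache test flake?").then(console.log);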

          • markab21 10 minutes ago
            I love where you're going with this. In my experience it's not about a different persona; it's about the context the model is considering - different activations produce a different outcome. You can achieve the same thing by switching to an agent with a separate persona, of course, but you can also get it simply by injecting new context or forcing the agent to consider something new. I feel like this concept gets cargo-culted a little bit.

            I personally have moved to a pattern where I use mastra-agents in my project to achieve this. I've slowly shifted the bulk of the code research and web research to my internal tools (built with small TypeScript agents). I can now really easily bounce between different tools such as Claude, Codex, and OpenCode, and my coding tools are spending more time orchestrating work than doing the work themselves.

        • sorokod 14 minutes ago
          (BTW, given token caching, your argument that 3 x 1 = 1 x 3 deserves more scrutiny.)
    • mmaunder 28 minutes ago
      Yeah, having Codex eval its own commits is highly effective, for example.
      • bahaAbunojaim 21 minutes ago
        I agree, I find it very helpful to ask agents to think using a different persona too
  • RobotToaster 31 minutes ago
    That sounds like it could get expensive?
    • bahaAbunojaim 25 minutes ago
      Not if you optimize the tokens used. This is what DeepMyst actually does: one of the things we offer is token optimization, where we can reduce the context by up to 80%, so even if you use twice the optimized context (2 x 20% = 40% of the original) you still end up with 60% fewer tokens.

      Note that this functionality is not yet integrated with Mysti, but we are planning to add it in the near future and are happy to accelerate that.

      I think token optimization will help with larger projects, longer context, and avoiding compaction.

  • Tarrosion 2 hours ago
    > Is multi-agent collaboration actually useful or am I just solving my own niche problem?

    I often write with Claude, and at work we have Gemini code reviews on GitHub; definitely these two catch different things. I'd be excited to have them working together in parallel in a nice interface.

    If our ops team gives this a thumbs-up security wise I'll be excited to try it out when back at work.

    • bahaAbunojaim 2 hours ago
      Would love to hear your feedback! Please let me know if I can make it any better or if there is anything that would make it very useful
  • adiga1005 1 hour ago
    I have been using it for some time and it's getting better and better. In many cases it gives better output than other tools, and the comparison is a great feature too. Keep up the good work!
    • bahaAbunojaim 23 minutes ago
      Thank you so much! Let me know if you face any issues and I'll be happy to address them.
  • prashantsengar 2 hours ago
    This is very useful! I frequently copy the response of one model and ask another to review it and I have seen really good results with that approach.

    Can you also include Cursor CLI for the brainstorming? This would allow someone to unlock brainstorming with just one CLI, since it allows using multiple models.

    • bahaAbunojaim 1 hour ago
      I'm planning to add Cursor and Cline in the next major release; I'll try to get it out in Jan.
      • reachableceo 1 hour ago
        Please also add qwen cli support
        • bahaAbunojaim 1 hour ago
          Will do. I was thinking of also making the LLMs configurable across the agents. I saw a post from the founder of OpenRouter that you can use DeepSeek with Claude Code, and I was thinking of making it possible to use more LLMs across agents.
  • danr4 2 hours ago
    Licensing with BSL when the AI world is changing basically every month is not a smart decision.
    • rynn 1 hour ago
      > Licensing with BSL when the AI world is changing basically every month is not a smart decision

      This turned me off as well. Especially with no published pricing and a link to a site that is not about this product.

      At minimum, publish pricing.

      • bahaAbunojaim 52 minutes ago
        Regarding DeepMyst: in the future it will optionally offer the ability to use smart context, where the context is automatically optimized so that you won't hit the context window limit (basically no need for compact) and you get much higher usage limits, because the number of tokens needed is reduced by up to 80%. So with a 20 USD Claude plan you would be able to achieve the same as with the Pro plan.
      • bahaAbunojaim 1 hour ago
        It is free and open source. Will make it MIT
    • bahaAbunojaim 1 hour ago
      Thinking of switching to MIT. What do you think? Is there any other license you would recommend?
      • RobotToaster 34 minutes ago
        AGPL, it requires anyone who creates a derivative to publish the code of said derivative.
  • altmanaltman 2 hours ago
    > Would love feedback on the brainstorm mode. Is multi-agent collaboration actually useful or am I just solving my own niche problem?

    If it's solving even your own niche problem, it is actually useful though right? Kind of a "yes or yes" question.

    • bahaAbunojaim 2 hours ago
      True, and hearing feedback is always helpful; it helps validate whether or not this is a common problem.
  • dunkmaster 1 hour ago
    Any benchmarks? For example vs a single model?
    • bahaAbunojaim 1 hour ago
      It would be great if the community could run some benchmarks and post them on the repo; I'm planning to do that sometime in Jan.
  • p1esk 1 hour ago
    Why limit to 2 agents? I typically use all 3.
    • bahaAbunojaim 1 hour ago
      Planning to make it work without that limit; I did it that way to avoid complexity, but contributions are welcome.

      I think once I add Cursor and Cline, I will also try to make it work with any number of agents.

  • Alifatisk 2 hours ago
    This reminds me a lot of eye2.ai, but outside of coding
    • bahaAbunojaim 2 hours ago
      I will check it out indeed. What do the two have in common?
      • Alifatisk 1 hour ago
        I guess both consult multiple LLMs and draw conclusions from them to cover blind spots.
        • bahaAbunojaim 55 minutes ago
          I think the main difference is that Mysti consults agents rather than the underlying LLMs, and in the future the agents could potentially switch LLMs as well.