Hello, I've been trying to use Cursor with a locally served model (e.g. Qwen), but once I go offline it fails. It seems there used to be an option to override the API URL, but that's no longer the case.
Is there any development environment or plugin that you're using for local LLM?
For tools...
I find Aider to be the most reliable and least "invasive" (in the sense that it follows instructions and does nothing more), but it's also the most hands-on.
Cline is a nice step up in hand-holding, but it does require a significantly larger context window.
Roo Code is a fork of Cline that's more focused on agentic functionality, but it also has some very poor default prompts and is much more finicky.
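As a concrete example of the Aider route, it can be pointed at a locally served model with a couple of commands. This is a sketch, not a definitive setup: it assumes an Ollama instance on its default port, and the model tag is just an illustration — substitute whatever you have pulled locally.

```shell
# Tell Aider where the local Ollama server lives (Ollama's default port).
export OLLAMA_API_BASE=http://127.0.0.1:11434

# Launch Aider against a local model; the tag below is an example,
# use any model you have pulled with `ollama pull`.
aider --model ollama_chat/qwen2.5-coder:14b
```

Because everything goes through the local endpoint, this keeps working fully offline once the model weights are downloaded.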
For running inference...
Ollama is the easiest, but generally the slowest.
Llama.cpp is a tad more involved, but faster.
vLLM is even more involved, but even faster.
SGLang can be challenging to set up, but tends to be the fastest.
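Whichever server you pick, they can all expose an OpenAI-compatible HTTP API (Ollama at /v1, llama.cpp's llama-server, vLLM, and SGLang likewise), which is what most of the tools above speak. Here's a minimal stdlib sketch of the request shape; the base URL and model name are assumptions you'd swap for your own setup.

```python
import json
import urllib.request

def build_chat_request(base_url: str, model: str, prompt: str):
    """Build an OpenAI-compatible chat completion request.

    All of the servers mentioned above accept this /v1/chat/completions
    shape, so a client written once works against any of them.
    """
    url = base_url.rstrip("/") + "/v1/chat/completions"
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode("utf-8")
    return url, body

# Example values — point these at whatever server/model you are running.
url, body = build_chat_request("http://127.0.0.1:8000", "qwen2.5-coder", "Hello")
print(url)  # the endpoint the request would hit

# Actually sending it requires a running server, e.g.:
# req = urllib.request.Request(url, data=body,
#                              headers={"Content-Type": "application/json"})
# print(urllib.request.urlopen(req).read())
```

The only per-server differences are usually the port and, for some setups, an API key header.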
VS Code with Cline is the closest analogue to Cursor, and I can fully recommend it. If you use Ollama, there are models specifically optimized for agentic tasks in Cline.
The results are pretty good in my opinion, though it probably depends on your use case.
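If you go the Ollama route with Cline, getting a model locally is just a pull. The tag below is illustrative — substitute an agent-tuned model of your choice — and the endpoint shown is Ollama's default, which is what you'd enter for Cline's Ollama provider.

```shell
# Pull an example coding model (the tag is illustrative).
ollama pull qwen2.5-coder:7b

# Confirm it is available locally. Ollama serves on
# http://localhost:11434 by default, which is the base URL
# to configure in Cline's Ollama provider settings.
ollama list
```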