
A 1.5B parameter model for next-edit autocomplete, quantized to Q8_0 GGUF format.
Sweep Next-Edit predicts your next code edit before you make it. It runs locally on your laptop in under 500ms (with speculative decoding) and outperforms models over 4x its size on next-edit benchmarks.
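The speculative-decoding setup that gets latency under 500ms is handled inside run_model.py. For reference only, a rough equivalent with stock llama.cpp might look like the sketch below — the filenames are placeholders, the draft model is an assumption, and exact flag names vary between llama.cpp releases:

```shell
# Hypothetical invocation -- filenames are placeholders and flag names
# depend on your llama.cpp version; run_model.py is the supported path.
llama-server \
  -m sweep-next-edit-1.5b.Q8_0.gguf \
  --model-draft sweep-next-edit-draft.gguf \
  --ctx-size 4096 \
  --port 8080
```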
Download run_model.py and the model file, then:

```shell
uv pip install llama-cpp-python huggingface_hub
python run_model.py
```
The model uses a specific prompt format with file context, recent diffs, and current state to predict the next edit. See run_model.py for a complete example.
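run_model.py is the source of truth for the real template; purely as an illustration of the idea, a next-edit prompt is assembled from the file contents, the recent diff, and the cursor position. The tag names in this sketch are invented placeholders, not the model's actual format:

```python
# Hypothetical sketch of assembling a next-edit prompt.
# The <file>/<recent_diff>/<cursor> tags are invented placeholders --
# consult run_model.py for the actual format the model was trained on.

def build_prompt(file_text: str, recent_diff: str, cursor_line: int) -> str:
    """Combine file context, recent edits, and cursor state into one prompt."""
    return (
        "<file>\n" + file_text + "\n</file>\n"
        "<recent_diff>\n" + recent_diff + "\n</recent_diff>\n"
        f"<cursor line={cursor_line}/>\n"
    )

prompt = build_prompt(
    "def add(a, b):\n    return a + b",
    "-    return a\n+    return a + b",
    2,
)

# With the GGUF file downloaded, the prompt would then be fed to
# llama-cpp-python (commented out here since it needs the model file):
# from llama_cpp import Llama
# llm = Llama(model_path="sweep-next-edit-1.5b.Q8_0.gguf", n_ctx=4096)
# completion = llm(prompt, max_tokens=128)
```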
License: Apache 2.0
Hi, I tried the model and I am super impressed by the performance/quality. Thanks for making this open source!
I am the author of this Neovim plugin for edit completions. I was able to integrate it with the Sweep Edit model.
For anyone who is interested: https://github.com/leonardcser/cursortab.nvim
Hey, this is really interesting. I'll try your nvim plugin.
Hi, not that I know of. Most of the code would not change, so it could easily be ported to different editors. The core is the Go server (`server/`).
It seems it would be possible to use this with minuet.el. I’m not familiar with it, though.
this is awesome, i'm going to try this out
Nice, could this be used to auto complete terminal/cli commands?
I remember using Qwen 2.5 Coder for autocomplete with Continue.dev, and that experience was a mess in both JetBrains IDEs and Visual Studio Code.
People posting stuff like this is really cool, because otherwise it kind of feels like nobody gives a crap. For example, even with Cline/RooCode/KiloCode there's no good way for me to hook up an autocomplete model that runs in Ollama, or maybe a remote Cerebras Code model. KiloCode doesn't have a proper model configuration option for autocomplete even though it has one for chat and the regular agentic stuff. I don't get why autocomplete is such a special case.
I guess what I'm saying is that I'm glad someone's at least trying, so I don't have to keep a Copilot subscription just because I genuinely like its autocomplete while the rest of it is basically wasted: Claude Code, Codex, and others are better for the actual chat/agentic work, and KiloCode and others are really nice IDE plugins.
llama.cpp has an extension for VS Code, but the configuration UX is utter crap.