Good results fine tuning a local LLM like Qwen 3:0.6B to categorize questions
dev-experiments
84 points
17 comments
June 21, 2026
Related Discussions
- Qwen3.5 Fine-Tuning Guide bilsbie · 311 pts · March 04, 2026 · 60% similar
- Local LLMs perform better when you teach them to ask before they answer froh · 31 pts · May 24, 2026 · 58% similar
- Surpassing vLLM with a Generated Inference Stack lukebechtel · 31 pts · March 10, 2026 · 50% similar
- How to run Qwen 3.5 locally Curiositry · 26 pts · March 07, 2026 · 50% similar
- Fine-tuning an LLM to write docs like it's 1995 taubek · 181 pts · June 05, 2026 · 49% similar
Discussion Highlights (11 comments)
jszymborski
I think the Qwen 0.6B is so cool. It is super fast and as illustrated here it has a clear niche, esp. when fine-tuned. I'm also interested in it as a student for distillation.
nl
If you are going to go to the bother of fine-tuning for trivial problems like subject classification, then I think you'll find scikit-learn with an SGDClassifier on 2-grams will probably do just as well and be under 1MB for the trained classifier. You can train it in under a minute, and it will work perfectly well on embedded devices.
Small LLMs are good choices for text classification in two cases:
- If you need to provide in-context examples and classify based on them.
- Your classification goes beyond simple subject-type classifiers. For example, multiple-choice question answering is classification where a small LLM will work but traditional ML methods won't.
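A minimal sketch of the scikit-learn baseline described in this comment, for anyone who wants to try it. The questions and category names are made-up toy data (only the pool pump question comes from the article), and the vectorizer settings are assumptions: `ngram_range=(1, 2)` uses unigrams as well as 2-grams, which is a common default rather than exactly what the comment specifies.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import SGDClassifier
from sklearn.pipeline import make_pipeline

# Toy training data (hypothetical; a real dataset would be far larger).
questions = [
    "When did we replace our pool pump?",
    "How often should the pool filter be cleaned?",
    "What chemicals does the pool need?",
    "When was the roof last inspected?",
    "Who repaired the roof shingles?",
    "Is the roof leak covered by warranty?",
    "When is the car due for an oil change?",
    "What tire pressure does the car need?",
    "Where is the car registration?",
]
labels = ["pool"] * 3 + ["roof"] * 3 + ["car"] * 3

# Word unigrams and bigrams -> TF-IDF features -> linear model trained with SGD.
classifier = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2)),
    SGDClassifier(random_state=0),
)
classifier.fit(questions, labels)

# Classify unseen questions; every prediction is one of the training labels.
predictions = classifier.predict(["Who fixed the pool heater?", "Is the car insured?"])
print(list(predictions))
```

Because the classifier can only emit labels it saw during training, it cannot invent new categories, which is one of the failure modes the article reports for the LLM.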
mickael-kerjean
If you are interested in a small language model to fine-tune, gemma3:270m is quite interesting for its size.
deepsquirrelnet
If you want to go deeper on language models, try these project ideas:
- Zero-shot encoders like tasksource or GLiNER
- Natural language inference: https://huggingface.co/blog/dleemiller/nli-xenc-ways-to-use
- GRPO training
- GEPA prompt tuning Qwen 0.6B (or GEPA, then GRPO)
- Use an embedding model and train a classifier (MLP, logistic, SVM)
- Use a larger LLM to generate a synthetic dataset (beware of lack of diversity, mine "seed text" from real sources first)
- Synthetically generate "hard examples" where more than one category may be valid and DPO tune your preferred responses
nextaccountic
> The model invents new categories (e.g. apartments) and doesn’t stick to the provided list of allowed categories
Can this specific failure mode be solved by providing a grammar that the output must adhere to? (Not sure if Qwen has this feature; it's used, e.g., to ensure the output is parseable JSON.)
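On the grammar question: constrained decoding is usually a feature of the inference runtime rather than of the model itself. llama.cpp, for example, accepts grammars in its GBNF format. Below is a rough sketch of generating such a grammar from a list of allowed categories. The helper names and the category list are hypothetical, and how the grammar string gets passed to the runtime is left out because it depends on the tool used.

```python
def gbnf_literal(text: str) -> str:
    """Quote a string as a GBNF literal, escaping backslashes and double quotes."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def build_category_grammar(categories: list[str]) -> str:
    """Build a one-rule grammar that only permits one of the given categories."""
    if not categories:
        raise ValueError("at least one category is required")
    alternatives = " | ".join(gbnf_literal(c) for c in categories)
    return f"root ::= {alternatives}"


def is_allowed(output: str, categories: list[str]) -> bool:
    """Post-hoc check, useful when the runtime has no grammar support."""
    return output.strip() in categories


allowed = ["pool", "roof", "car"]  # hypothetical category list
grammar = build_category_grammar(allowed)
print(grammar)  # prints: root ::= "pool" | "roof" | "car"
```

With a grammar like this, the sampler can only produce one of the listed strings, so an invented category such as "apartments" cannot appear. The `is_allowed` check is a fallback for setups without grammar support: reject the output and retry.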
electroglyph
Existing embedding models like Alibaba's ModernBERT tune or one of the Jina v5s would probably map query to category automatically (i.e. store an embedding of each category, calculate the cosine similarity of each incoming query against the categories, and pick the closest). Also, you could stick a classifier head on a BERT model as another option.
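The nearest-category idea in this comment can be sketched in a few lines. The three-dimensional vectors below are hand-made stand-ins so the example runs without a model; in practice both the category vectors and the query vector would come from the same embedding model.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def nearest_category(query_vec: list[float], category_vecs: dict[str, list[float]]) -> str:
    """Return the category whose stored embedding is most similar to the query."""
    return max(category_vecs, key=lambda name: cosine_similarity(query_vec, category_vecs[name]))


# Toy embeddings (hypothetical values, not produced by any real model).
category_vecs = {
    "pool": [0.9, 0.1, 0.0],
    "roof": [0.1, 0.9, 0.1],
    "car": [0.0, 0.1, 0.9],
}
query_vec = [0.8, 0.2, 0.1]  # stand-in for the embedding of a pool-related question

print(nearest_category(query_vec, category_vecs))  # prints "pool"
```

This needs no training at all. Adding a category only means storing one more embedding, though accuracy depends on how well the category names or descriptions embed.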
abhashanand1501
Do small language models run on CPUs, or do you still need a GPU to run them?
zwaps
Has anyone recently compared something like ModernBERT plus a classifier vs. full or LoRA fine-tuning of a small LM like Qwen?
doubtfuluser
But why use a decoder model instead of a BERT-based model? For pure classification, that should be easier to train and work quite well.
pj_mukh
“As an example, the question “When did we replace our pool pump?” will be mapped to a category called “pool” before querying the Index database.”
Cool write-up! Really appreciate it. Incidentally, how does this categorization help you get better retrieval results?
throwa356262
Are 0.6B models useful without fine-tuning? Half the time I ask Qwen 0.6B "what is 1 + 2?" it ends up in a thinking loop of "but wait, the user is asking me to ..."