My Journey to a reliable and enjoyable locally hosted voice assistant (2025)

Vaslo 353 points 103 comments March 16, 2026
community.home-assistant.io

Discussion Highlights (14 comments)

dewey

Their first version is most likely already 10x better than Siri.
> Understands when it is in a particular area and does not ask “which light?” when there is only one light in the area, but does correctly ask when there are multiple of the device type in the given area.

yanis_t

I'm still waiting till the promise of voice AI that was shown during the OpenAI demo in 2024 turns real somehow. It's not clear to me why there has been zero progress since then.

voidUpdate

Do people like talking to voice assistants? I've used one occasionally (mostly for timers when I'm cooking), but most of the time it would be faster for me to just do it myself, and that feels much less awkward than talking to empty air, asking it to do things for me. It might be because I just really don't like making more noise than I have to. (Yes, I appreciate that some people may be disabled in such a way that it makes sense to use voice assistants, e.g. motor problems.)

gausswho

This is five months old now. Any substantial changes to the recommended setup?

hamdingers

If you're less concerned about privacy, I use Gemini 2.5 Flash for this and it's exceptionally good and fast as a HA assistant while being much cheaper than the electricity that would be needed to keep a 3090 awake. The thing that kills this for me (and they even mentioned it) is wake word detection. I have both the HA voice preview and FPH Satellite1 devices, plus have experimented with a few other options like a Raspberry Pi with a conference mic. Somehow nothing is even 50% as good as my Echo devices at picking up the wake word. The assistant itself is far better, but that doesn't matter if it takes 2-3 tries to get it to listen to you. If someone solves this problem with open hardware I'll be immediately buying several.

daveoc64

I've recently purchased a couple of the Home Assistant Voice Preview Edition devices, and they leave a lot to be desired. The wake word detection isn't great, and the audio quality is abysmal (for voice responses, not music). Amazon has ruined their Alexa and Echo devices with ads and annoying nag messages. I'd really like an open alternative, but the basics are lacking right now.

tkems

One that I have been experimenting with is using analog phones (including rotary ones!) to act as the satellites. I live in an older home and have phone jacks in most of the rooms already so I only had to use a single analog telephone adapter. [0] The downside is I don't have wake word support, but it makes it more private and I don't find myself missing my smart speakers that much. At some point I would like to also support other types of calls on the phones, but for now I need to get an LLM hooked up to it.
[0] https://www.home-assistant.io/voice_control/worlds-most-priv...

ljclifford

actually the hardest part of a locally hosted voice assistant isn't the llm. it's making the tts tolerable to actually talk to every day. the core issue is prosody: kokoro and piper are trained on read speech, but conversational responses have shorter breath groups and different stress patterns on function words. that's why numbers, addresses, and hedged phrases sound off even when everything else works. the fix is training data composition. conversational and read speech have different prosody distributions and models don't generalize across them. for self-hosted, coqui xtts-v2 [1] is worth trying if you want more natural english output than kokoro. btw i'm lily, cofounder of rime [2]. we're solving this for business voice agents at scale, not really the personal home assistant use case, but the underlying problem is the same.
[1] https://github.com/coqui-ai/TTS
[2] https://rime.ai

xrd

I've been having a lot of fun using my old Mycroft AI device. Neon is the new software package. It didn't solve the issues highlighted in this thread, but it is a fun open device to hack on. I wrote a little web app that will speak in the standard voice and say things like "hey kids, I'm AI and know everything, and your dad is really cool." They love to yell at me when I do that.

kbuck

I bought a Home Assistant Voice Preview Edition to try out. It's surprisingly good, but still falls short when compared to Google Home speakers:
- Wake word detection isn't as good as the Google Homes (more false positives, more false negatives - so I can't just tune sensitivity).
- Mic and speakers are both of poor quality in comparison to Google Home devices.
- Flow is awkward. On a Google Home device, you can say "Okay Google, turn on the lights" with no pause. On the Voice PE, you have to say "Hey Mycroft [awkward pause while you wait for the acknowledgement noise] turn on the lights" - it seems like the Google Home devices start buffering immediately after the wake word, but the Voice PE doesn't.
- Voice fingerprints don't exist, so this prevents the device from figuring out that two separate people are talking, or who is talking to it.
- The device has poor identification of background noise, so if you talk to it while there is a TV playing speech in the background, it will continue to listen to the speech from the TV. It will eventually transcribe everything you said + everything from the TV and get confused. (This probably folds into the voice print thing as well.)
On the upside, though:
- Setting it up was really easy.
- All of the entities I want to control with it are already available, without needing to export them or set them up separately in Google Home.
- Despite all of the above complaints, the device is probably 80-90% of what I realistically need to use it day-to-day. If they throw a better speaker and mic array in, I'd likely be comfortable replacing all of my Google Homes.
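
The buffering guess in kbuck's comment can be illustrated with a small ring buffer. This is only a sketch of the general technique, not how Google Home or the Voice PE actually works: `collect_command`, `detect_index`, and `pre_roll_frames` are hypothetical names, and the "frames" are plain strings standing in for audio chunks. The assumption is that a wake word detector fires a few frames late, so anything spoken in that gap is lost unless recent frames are kept.

```python
from collections import deque


def collect_command(frames, detect_index, pre_roll_frames):
    """Return the frames handed to speech-to-text.

    frames: iterable of audio frames (strings here, for illustration).
    detect_index: index of the frame during which the hypothetical
        wake word detector fires.
    pre_roll_frames: how many of the most recent frames (including the
        trigger frame) to keep and replay once the detector fires.
    """
    ring = deque(maxlen=pre_roll_frames)  # oldest frames fall off automatically
    command = []
    triggered = False
    for index, frame in enumerate(frames):
        if triggered:
            command.append(frame)  # normal capture after the trigger
            continue
        ring.append(frame)  # keep buffering while waiting for the trigger
        if index == detect_index:
            triggered = True
            command.extend(ring)  # replay what was said before the trigger
    return command


frames = ["hey", "mycroft", "turn", "on", "the", "lights"]
print(collect_command(frames, detect_index=3, pre_roll_frames=2))
print(collect_command(frames, detect_index=3, pre_roll_frames=0))
```

With a two-frame pre-roll the first call prints `['turn', 'on', 'the', 'lights']`; with no pre-roll the second prints `['the', 'lights']`, which is the "awkward pause" problem in miniature.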

quirk

The best fix I've made to any voice-mode AI is giving it a "done" word. So it has to listen for "pineapple" before it's allowed to process what I said. Just like radio comms (over and out).
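
A minimal sketch of the "done" word idea above, assuming the assistant already turns speech into text fragments. `DoneWordGate` and `feed` are hypothetical names made up for illustration; they are not part of Home Assistant or any voice library.

```python
import re


class DoneWordGate:
    """Hold transcript fragments until the done word is heard."""

    def __init__(self, done_word="pineapple"):
        # Match the done word as a whole word, ignoring case.
        self._pattern = re.compile(rf"\b{re.escape(done_word)}\b", re.IGNORECASE)
        self._fragments = []

    def feed(self, fragment):
        """Add one transcribed fragment.

        Returns None while still listening, or the full command text
        (without the done word) once the done word appears.
        """
        match = self._pattern.search(fragment)
        if match is None:
            self._fragments.append(fragment.strip())
            return None
        # Keep only the text before the done word, then release the command.
        self._fragments.append(fragment[:match.start()].strip())
        command = " ".join(part for part in self._fragments if part)
        self._fragments = []
        return command.rstrip(" ,.")


gate = DoneWordGate()
print(gate.feed("turn on the"))                # still listening
print(gate.feed("kitchen lights, pineapple"))  # command released
```

The first call prints `None` and the second prints `turn on the kitchen lights`. The trade-off is the same as with "over" on a radio: pauses no longer cut the command short, but forgetting the done word leaves the assistant waiting.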

leeeeeep1012

nice, i run one, dictatorflow.com, that i open sourced: lee101/voicetype

jimmcslim

I’m keen to see if Nabu Casa release an update to the Voice Assist hardware sometime soon. Something with the same fidelity and finish as the Amazon and Google options but open would be fantastic.

Animats

Is there a locally hosted voice assistant for Android phones? One available through F-Droid, if possible.
