I also ended up writing a classifier using a Python library that seems to outperform Home Assistant's implementation. Not sure what the issue is there; I just followed instructions from an LLM and the internet.
1. Define the intents, each with a handful of keyword phrases that signal it.
2. Tokenize the utterance, drop stopwords, substitute synonyms, and run a spell-check pass (take the best match from a fuzzy comparison).
3. Extract the intent, process it, and resolve the best-matching entity.
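The three steps read like a compact pipeline. Here's a minimal sketch of them, with stdlib difflib standing in for fuzzywuzzy's fuzzy matching — the intents, stopwords, synonyms, and the 0.5 cutoff below are made-up examples, not from the actual setup:

```python
import difflib

# 1. Intents defined by a few keyword phrases each (hypothetical examples).
INTENTS = {
    "turn_on_light": ["turn on the light", "lights on", "switch on lamp"],
    "turn_off_light": ["turn off the light", "lights off", "switch off lamp"],
    "play_music": ["play some music", "start music", "play a song"],
}

STOPWORDS = {"the", "a", "an", "please", "some"}
SYNONYMS = {"lamp": "light", "illumination": "light", "track": "song"}

def normalize(text: str) -> str:
    # 2. Tokenize, drop stopwords, replace synonyms.
    tokens = [SYNONYMS.get(t, t) for t in text.lower().split() if t not in STOPWORDS]
    return " ".join(tokens)

def classify(utterance: str, cutoff: float = 0.5):
    # 3. Fuzzy-match the normalized utterance against every intent phrase
    # and return the intent owning the best match (None if below the cutoff).
    cleaned = normalize(utterance)
    best_intent, best_score = None, cutoff
    for intent, phrases in INTENTS.items():
        for phrase in phrases:
            score = difflib.SequenceMatcher(None, cleaned, normalize(phrase)).ratio()
            if score > best_score:
                best_intent, best_score = intent, score
    return best_intent

print(classify("please turn on the lamp"))  # -> turn_on_light
```

Swapping `difflib.SequenceMatcher(...).ratio()` for `fuzzywuzzy.fuzz.ratio()` (scores 0–100 instead of 0–1) gives the same shape with better typo tolerance.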
Some of the magic numbers had to be hand-tuned against a suite of tests I wrote to derive them, but other than that, it feels pretty straightforward.
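That hand-tuning can be as simple as sweeping candidate values against labeled fixtures and keeping the best performer. A sketch for tuning a fuzzy-match cutoff — the fixtures, candidate grid, and difflib scorer here are all my own illustration, not the post's actual suite:

```python
import difflib

# (utterance, intent phrase, should these match?) -- made-up fixtures.
FIXTURES = [
    ("turn on light", "turn on the light", True),
    ("turn off light", "turn on the light", False),  # near-miss false positive
    ("xyzzy", "turn on the light", False),
]

def accuracy(cutoff: float) -> int:
    # Count fixtures the cutoff classifies correctly.
    hits = 0
    for utterance, phrase, should_match in FIXTURES:
        score = difflib.SequenceMatcher(None, utterance, phrase).ratio()
        hits += (score >= cutoff) == should_match
    return hits

def best_cutoff(candidates=(0.5, 0.6, 0.7, 0.8, 0.9)) -> float:
    # max() keeps the first candidate on ties, so order the grid by preference.
    return max(candidates, key=accuracy)

print(best_cutoff())  # -> 0.8: high enough to reject "turn off light"
```

The same sweep works for any other magic number (token-distance limits, minimum phrase lengths) as long as each has a pass/fail test behind it.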
I don't know anything about ML, classifiers, or intents; I'm just a software engineer who got the rough outline from GPT-4 and executed the task.
I also wrote a machine-learning classifier, but I didn't like the results. I ended up going with nltk/fuzzywuzzy because it performed better on my dataset. Perhaps this is where HA goes wrong.
Anyways, I use Porcupine for wake-word detection, VAD to decide when to actively listen, and local Whisper on a 24-core server to transcribe.
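For the VAD stage: if it's webrtcvad (a common pick — the post doesn't say which), it only accepts 16-bit mono PCM at 8/16/32/48 kHz in frames of exactly 10, 20, or 30 ms, so some chunking glue usually sits between the mic stream and the detector. A sketch — the helper name and defaults are my own:

```python
def frame_generator(pcm: bytes, sample_rate: int = 16000, frame_ms: int = 30):
    """Yield fixed-size 16-bit mono PCM frames; any trailing remainder is dropped."""
    frame_bytes = sample_rate * frame_ms // 1000 * 2  # 2 bytes per sample
    for offset in range(0, len(pcm) - frame_bytes + 1, frame_bytes):
        yield pcm[offset:offset + frame_bytes]

# 1 second of silence at 16 kHz -> 33 full 30 ms frames of 960 bytes each.
frames = list(frame_generator(b"\x00" * 32000))
```

Each yielded frame can then go straight to `vad.is_speech(frame, sample_rate)`, buffering speech frames until the VAD goes quiet and handing the buffer to Whisper.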