I've mentioned before that I use Archy (as I call him for short, due to IRC
username limits) through a text interface, but I'm excited to now be able to
talk with him out loud. I had to build each piece separately, but it was well
worth it. Being
able to simply pull my phone out and interact with him has made some things a
lot easier, and a lot more fun. I need to work on the latency a bit, but
overall I'm happy with the results.
Here's the breakdown of each part:

- Whisper as the speech transcription (giving Archimedes ears).
- Mimic3 as the text-to-speech system (giving Archimedes a voice).
- Mistral-7b-instruct as the underlying AI model (giving Archimedes a brain).
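
To make the flow between those three parts concrete, here's a rough sketch in Python of how they chain together, with per-stage timing since latency is the thing I want to improve. This is not the actual code: `VoicePipeline`, the stage names, and the `fake_*` functions are all made up for illustration, and the stand-in stages just pass text through so the sketch runs without any models loaded. In a real setup, each callable would wrap Whisper, Mistral-7b-instruct, and Mimic3 respectively.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class VoicePipeline:
    """Chains speech-to-text, a language model, and text-to-speech."""
    transcribe: Callable[[bytes], str]   # ears: would wrap Whisper
    generate: Callable[[str], str]       # brain: would wrap Mistral-7b-instruct
    synthesize: Callable[[str], bytes]   # voice: would wrap Mimic3
    timings: Dict[str, float] = field(default_factory=dict)

    def _timed(self, name, fn, arg):
        # Run one stage and record how long it took, in seconds.
        start = time.perf_counter()
        result = fn(arg)
        self.timings[name] = time.perf_counter() - start
        return result

    def respond(self, audio_in: bytes) -> bytes:
        # audio in -> text -> reply text -> audio out
        text = self._timed("ears", self.transcribe, audio_in)
        reply = self._timed("brain", self.generate, text)
        return self._timed("voice", self.synthesize, reply)


# Hypothetical stand-in stages so the sketch runs without any models.
def fake_transcribe(audio: bytes) -> str:
    return audio.decode("utf-8")


def fake_generate(prompt: str) -> str:
    return f"You said: {prompt}"


def fake_synthesize(text: str) -> bytes:
    return text.encode("utf-8")


pipeline = VoicePipeline(fake_transcribe, fake_generate, fake_synthesize)
audio_out = pipeline.respond(b"set a timer")

# Show which stage is eating the time.
for stage, seconds in pipeline.timings.items():
    print(f"{stage}: {seconds:.6f}s")
```

Keeping each stage behind a plain callable means any one of them can be swapped out or timed on its own, which should make it easier to see where the latency actually comes from.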
It was a pain to set up, but that was mostly due to my dislike of JS and
modern web development. Getting the actual services running was easy enough.
It's all in a Docker container. I haven't put it into k8s yet, but it's only a
matter of time, as I need to spend more time on specifying which
hardware it should run on in k8s, and do some tests. It does take a while to
build the container, as moving large AI files around is no fun, but I don't
foresee it going down often, so the reliability of the service should be
pretty good.
Overall I'm pretty excited for this. I've got some ideas to give him simple
tasks (draft emails, change music, set timers, etc.), but that's all going
to need a lot of build-out and testing. I can't wait to get Archimedes doing
a lot of the mundane tasks that I need, so that I don't have to!