Hello world!
This is my first post! Glad to be a part of the community.
I'll get right to it: I'm trying to build a robot that does quite a bit of AI using image manipulation and object recognition. I'm starting with a Raspberry Pi 3 and have, so far, implemented a Node server that does facial detection and recognition using a webcam and OpenCV.
I would like to experiment with voice commands, conversational speech, and audible cues. I'd like to do it in Node, but this is where I'm asking for help. I don't know where to start: all the tutorials I come across have primitive keyboard interfaces (Ctrl+C to stop listening, and so on).
Is it possible to set up an interface where I can simply say, "Robot, what was the score of last night's game?" using Google (or something similar) and not have to wait for a listening cue? Or, better yet, not have to say a trigger word like "robot" at all, and instead have the robot detect that I'm talking to it using audible proximity, with some sort of Doppler range/direction detection?
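To illustrate the trigger-word part of what I'm imagining: assuming some speech-to-text API hands me transcripts as plain strings, I'd want a little filter like this (the function name and regex are just my own sketch, not from any particular library):

```javascript
// Hypothetical helper: given a raw transcript string from a
// speech-to-text API, respond only when it starts with the trigger
// word and return the command that follows it.
const TRIGGER = "robot";
const pattern = new RegExp("^" + TRIGGER + ",?\\s+(.+)$");

function extractCommand(transcript) {
  // Normalize case and whitespace so "Robot," and "robot" both match.
  const normalized = transcript.trim().toLowerCase();
  const match = normalized.match(pattern);
  return match ? match[1] : null;
}

console.log(extractCommand("Robot, what was the score of last night's game?"));
// → "what was the score of last night's game?"
console.log(extractCommand("just talking to myself"));
// → null
```

The ideal would be not needing the trigger word at all, which is the part I really don't know how to approach.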
Ultimately, I would like to use a combination of these technologies to trigger image detection and processing, and the associated AI, in conjunction with a cloud-based JavaScript API.
I know this is a lot for a first post, but any help would be greatly appreciated.
I've scoured Pi forums and OpenCV forums - "You're our only Hope!"
Thanks
Keith