Author Topic: Text to speech synthesizer  (Read 3148 times)

Offline Webbot (Topic starter)

  • Expert Roboticist
  • Supreme Robot
  • *****
  • Posts: 2,150
  • Helpful? 109
Text to speech synthesizer
« on: October 23, 2008, 12:12:24 PM »
I've spent quite a while creating a program that will accept English text and then speak it.

The program takes up 12018 bytes of code (i.e. 73% of an ATMega168) and 516 bytes of SRAM (i.e. 50% of an ATMega168) - but this may increase - and it accepts the English text via the USART.
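For anyone checking the arithmetic, here is a quick sketch. It assumes the datasheet capacities of 16 KiB flash and 1 KiB SRAM for the ATmega168; the constant and function names are made up for illustration.

```python
# Capacities assumed from the ATmega168 datasheet: 16 KiB flash, 1 KiB SRAM.
FLASH_BYTES = 16 * 1024
SRAM_BYTES = 1 * 1024


def percent_used(used, total):
    """Return usage as a percentage of the total capacity."""
    return 100.0 * used / total


flash_pct = percent_used(12018, FLASH_BYTES)
sram_pct = percent_used(516, SRAM_BYTES)
print(f"flash: {flash_pct:.1f}%  SRAM: {sram_pct:.1f}%")
print(f"flash bytes left: {FLASH_BYTES - 12018}")
```

Note that a bootloader, if one is installed, would reduce the flash actually available.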

Since it's so big, my intention is to have a dedicated ATMega168 just for doing it. This could be useful for debugging as an alternative to plugging in an LCD display?

Currently it outputs the speech via port B4 as a square wave, or via B2 as a PWM output.

The processor must be running at 8MHz, so you need to disable the clock-divide fuse bit (CKDIV8) which, by default, makes the 168 run at 1MHz.
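As a rough illustration of the fuse change, the sketch below assumes the chip runs from its internal 8MHz RC oscillator, that the factory low fuse byte is 0x62, and that CKDIV8 is bit 7 of that byte. On AVRs a fuse bit of 0 means "programmed" (active), so turning the divider off means setting the bit to 1. Treat all values as assumptions to check against the datasheet or a fuse calculator.

```python
CKDIV8_BIT = 7         # assumed position of CKDIV8 in the low fuse byte
DEFAULT_LFUSE = 0x62   # assumed factory default low fuse for the ATmega168


def disable_ckdiv8(lfuse):
    """Return the low fuse with CKDIV8 unprogrammed (bit set to 1)."""
    return lfuse | (1 << CKDIV8_BIT)


def cpu_hz(lfuse, osc_hz=8_000_000):
    """CPU clock after the optional divide-by-8 prescaler."""
    ckdiv8_programmed = ((lfuse >> CKDIV8_BIT) & 1) == 0
    return osc_hz // 8 if ckdiv8_programmed else osc_hz


new_lfuse = disable_ckdiv8(DEFAULT_LFUSE)
print(hex(DEFAULT_LFUSE), cpu_hz(DEFAULT_LFUSE))
print(hex(new_lfuse), cpu_hz(new_lfuse))
```

With avrdude the resulting byte would typically be written with something like `-U lfuse:w:0xe2:m`, alongside whatever programmer and part options your own setup needs.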

The only additional hardware required is a couple of resistors, one transistor, and an 8 ohm 0.1 watt speaker - which together make up the speaker driver. But I'm looking into what benefits I can get by adding a low pass RC filter. So it's all breadboarded at the moment - no Eagle schematic/board.
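To get a feel for what a low pass RC filter would do, here is a small sketch using the standard first-order formula f_c = 1 / (2*pi*R*C). The component values and the 31.25kHz PWM carrier (8MHz / 256) are assumptions picked for illustration, not values from the breadboard.

```python
import math


def rc_cutoff_hz(r_ohms, c_farads):
    """Cutoff (-3 dB) frequency of a first-order RC low-pass filter."""
    return 1.0 / (2.0 * math.pi * r_ohms * c_farads)


def rc_gain(f_hz, r_ohms, c_farads):
    """Magnitude response of the same filter at frequency f_hz."""
    fc = rc_cutoff_hz(r_ohms, c_farads)
    return 1.0 / math.sqrt(1.0 + (f_hz / fc) ** 2)


R = 1_000          # ohms (assumed)
C = 47e-9          # farads (assumed)
PWM_HZ = 31_250    # assumed carrier: 8 MHz / 256 for 8-bit PWM, no prescaler

fc = rc_cutoff_hz(R, C)
print(f"cutoff: {fc:.0f} Hz")
print(f"gain at 1 kHz: {rc_gain(1_000, R, C):.2f}")
print(f"gain at PWM carrier: {rc_gain(PWM_HZ, R, C):.2f}")
```

The idea is to pick R and C so that the speech band passes mostly untouched while the PWM carrier is attenuated. A single RC stage only rolls off gently, so the carrier is reduced rather than removed.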

Since it's driven by the USART, you could connect it to your existing controller to speak the text you send it from your UART. Alternatively you could add a MAX232 chip to convert to RS232 levels and drive it via Hyperterminal etc from your PC.

If anyone is interested I could release the 8MHz ATMega168 .hex file but not the source code at the moment !!

Since this is a 'home-made' system, the output quality is not as good as you may get from dedicated text-to-speech ICs - but my goal was to do it all in software without any extra hardware.

Anyone interested?

Webbot Home: http://webbot.org.uk/
WebbotLib online docs: http://webbot.org.uk/WebbotLibDocs
If you're in the neighbourhood: http://www.hovinghamspa.co.uk

Offline airman00

  • Contest Winner
  • Supreme Robot
  • ****
  • Posts: 3,653
  • Helpful? 21
  • narobo.com
Re: Text to speech synthesizer
« Reply #1 on: October 23, 2008, 12:21:29 PM »
interesting indeed

can't wait till you release the source code

Do you have any sample audio that it spoke ?
Check out the Roboduino, Arduino-compatible board!


Link: http://curiousinventor.com/kits/roboduino

www.Narobo.com

Offline Webbot (Topic starter)

Re: Text to speech synthesizer
« Reply #2 on: October 23, 2008, 12:25:58 PM »
Hmm - knew someone would ask that  ::) So it's a good question!

So I need to dig out an old microphone and start recording, along with any background hiss, the dog barking, the cockerels crowing etc. etc., and upload them as mp3s.

Will need some time to do this and will post back once done.

Offline Webbot (Topic starter)

Re: Text to speech synthesizer
« Reply #3 on: October 23, 2008, 05:08:49 PM »
Ok - so to start with here's the picture of my hardware layout.

This uses my board from the tutorial at http://www.societyofrobots.com/member_tutorials/node/190

This is similar to a standard $50 robot tutorial board but allows each I/O pin to decide if it uses the regulated 5v supply or the unregulated motor supply. Since there are no motors involved here, there is only the one 9v battery, which connects to the regulator to give the 5v regulated supply that powers the MAX232 RS232 interface and the loudspeaker.

It also differs in that it uses the plug-in-compatible AVR ATMega168 rather than the ATMega8, as the ATMega8 doesn't have enough on-board memory for the job.

In summary: this is a standard $50 board and the only extra hardware is 2xresistors + 1xtransistor + 1xSpeaker. All the rest of the 'magic' is in the software. My next post will have the sound clips.

Offline Webbot (Topic starter)

Re: Text to speech synthesizer
« Reply #4 on: October 23, 2008, 05:54:50 PM »
Here are some sound clips. Since they are mp3 files and this site doesn't support them, I have had to compress them into a zip file.


Obviously including a few sound clips is no testament to how good/bad the system is. But judge for yourselves.

I've included  4 sound clips, at random, that say the following:-
1. 'Admin has gone crazy'
2. 'What is your name?'
3. 'My name is Airman00.'   (He asked for the clips so he is now immortally digitized!)
4. 'Where is Admin? He is in Thailand.'

These examples just accept the plain vanilla English text and come out with the recorded results - no fudging by me.

The software does the following:-
1. Change English text to phonemes and pass to stage 2
2. Change phonemes into list of 'sound' commands and pass to stage 3
3. Play sound commands.

So interacting with the program at steps 2 or 3 allows you to 'tweak' the spoken result - which is an option if your program only needs to say a few things. If it needs to say 'anything on earth' then step 1 is the answer - as per these attached examples.
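The three stages, and the idea of entering the pipeline part-way through, can be sketched roughly as below. This is only an illustration: the phoneme symbols, the tiny dictionary, the durations and all function names are made up, and the real program's tables and rules are not described in this thread.

```python
# Toy lookup table; the real dictionary and rules are not shown in the thread.
WORD_TO_PHONEMES = {
    "what": ["W", "AH", "T"],
    "is": ["IH", "Z"],
    "your": ["Y", "AO", "R"],
    "name": ["N", "EY", "M"],
}
VOWELS = {"AH", "IH", "AO", "EY"}


def text_to_phonemes(text):
    """Stage 1: look each word up; unknown words fall back to their letters."""
    phonemes = []
    for word in text.lower().split():
        word = word.strip(".,?!'\"")
        phonemes.extend(WORD_TO_PHONEMES.get(word, list(word.upper())))
    return phonemes


def phonemes_to_sounds(phonemes):
    """Stage 2: turn each phoneme into a (name, duration_ms) sound command."""
    return [(p, 120 if p in VOWELS else 60) for p in phonemes]


def play(sounds):
    """Stage 3: stand-in for audio output; returns the total duration in ms."""
    for name, ms in sounds:
        print(f"{name:>3} for {ms} ms")
    return sum(ms for _, ms in sounds)


def speak_text(text):
    """Enter at stage 1: plain English in."""
    return play(phonemes_to_sounds(text_to_phonemes(text)))


def speak_phonemes(phonemes):
    """Enter at stage 2: hand-tuned phonemes in, skipping the dictionary."""
    return play(phonemes_to_sounds(phonemes))


speak_text("What is your name?")
```

Entering at `speak_phonemes` (or passing a hand-built list straight to `play`) corresponds to tweaking at steps 2 or 3, which suits a program with a small fixed set of phrases.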

To try and keep memory small, for now, there are a few things that it isn't very good at:
1. A decimal number such as '12.34' - the '.' comes out as a pause just like the end of a sentence
2. Acronyms such as 'AVR' - which it would try to say as if it were a proper word and may come out as something like 'avver'. Equally 'IBM' would be similarly trashed.
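One possible workaround for both limitations is to tidy the text on the sending side before it goes down the serial line. The sketch below is hypothetical and not part of the synthesizer: it reads decimals digit by digit with the word 'point', and splits runs of capitals into separate letters, assuming the synthesizer pronounces a lone letter by its name.

```python
import re

DIGITS = "zero one two three four five six seven eight nine".split()


def spell_number(match):
    """'12.34' -> 'one two point three four' (digit by digit, kept simple)."""
    whole, frac = match.group(1), match.group(2)
    words = [DIGITS[int(d)] for d in whole]
    words.append("point")
    words.extend(DIGITS[int(d)] for d in frac)
    return " ".join(words)


def spell_acronym(match):
    """'AVR' -> 'A V R' so that each letter is spoken separately."""
    return " ".join(match.group(0))


def preprocess(text):
    """Rewrite decimals and all-capital words before sending the text."""
    text = re.sub(r"\b(\d+)\.(\d+)\b", spell_number, text)
    text = re.sub(r"\b[A-Z]{2,}\b", spell_acronym, text)
    return text


print(preprocess("The AVR costs 12.34 and IBM made it."))
```

A sentence typed entirely in capitals would also be spelled out letter by letter, so this only suits ordinary mixed-case text.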


Anyone want the hex file to program their own 168?

« Last Edit: October 23, 2008, 06:05:30 PM by Webbot »

Offline airman00

Re: Text to speech synthesizer
« Reply #5 on: October 23, 2008, 06:20:13 PM »
dude that is awesome !

Quote
3. 'My name is Airman00.'   (He asked for the clips so he is now immortally digitized!)
thanks!  ;D

when are you releasing the source code???

Offline Webbot (Topic starter)

Re: Text to speech synthesizer
« Reply #6 on: October 23, 2008, 06:43:05 PM »
Source code? Well the problem is that SoR isn't the best place for software version control - as it's mainly just a forum. Equally I don't want the hassle of creating a project on 'sourceforge' or similar.

So my reasoning for holding back is that once it's out, it's out - so I'd like to make sure it does 99% of what people want before the source comes out. By keeping it as a separate module (hardware and software), it is usable by any mcu (PIC, AVR, STAMP etc) - so less for me to support!!

You can currently control it via your microcontroller's UART or from a PC via a terminal program. What else could I add in? And how useful would it be:
1 - Have a jumper to select baud rate 9600/19200. That 168 has loads of unused I/O pins!!!!
2 - Have a jumper to select between square wave output and PWM output
3 - Have a jumper to select between Serial or I2C comms
4 - Have some other commands to control software stuff like the default voice pitch etc
oh and I am still experimenting with audio drivers to get better outputs.
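On the baud rate jumper: with an 8MHz clock both 9600 and 19200 come out with a small timing error, which the sketch below works through. It uses the datasheet formula for normal-speed asynchronous mode, UBRR = f_cpu / (16 * baud) - 1, rounded to the nearest integer; the function names are made up for illustration.

```python
F_CPU = 8_000_000  # Hz, the clock this project runs at


def ubrr_for(baud, f_cpu=F_CPU):
    """Nearest UBRR register value for normal-speed (U2X = 0) async mode."""
    return round(f_cpu / (16 * baud)) - 1


def actual_baud(ubrr, f_cpu=F_CPU):
    """Baud rate the USART really produces for a given UBRR value."""
    return f_cpu / (16 * (ubrr + 1))


def error_pct(baud, f_cpu=F_CPU):
    """Percentage difference between the actual and the requested baud rate."""
    return 100.0 * (actual_baud(ubrr_for(baud, f_cpu), f_cpu) / baud - 1.0)


for baud in (9600, 19200):
    print(baud, ubrr_for(baud), f"{error_pct(baud):+.2f}%")
```

An error of a fraction of a percent is normally well within what a UART tolerates, so a jumper choosing between these two rates should be safe at 8MHz.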

Once this is covered it would be nice to allow people to 'plug-in' vocabularies/phonemes for languages other than English. This could be achieved by allowing you to do a 'make' that links in my compiled 'core' with a user supplied dictionary.

So: as you can see there is a lot that could be done.

I know WHY you want the source code. Like me you are a techie and want to know how it all works.

If there is sufficient interest in the 'end result', and I have made it as flexible as possible, then I would release the source code so that, like AVRlib, it could just be linked in. But for all of the above reasons I don't want to do it now and then have to release a new version every week, as I don't know how to do this via this forum..... unless someone knows better?

Quote
dude that is awesome !
Thanks man.

Offline dunk

  • Expert Roboticist
  • Supreme Robot
  • *****
  • Posts: 1,086
  • Helpful? 21
Re: Text to speech synthesizer
« Reply #7 on: October 24, 2008, 03:30:07 AM »
i'm curious how it is you are creating the correct sounds in software.
have you digitised speech, saved the common sounds and are replaying them?

as for supporting projects on the net if you ever release the firmware,
i always put a disclaimer that i have no interest in providing support on any of my documentation and get very few time wasting emails.
most i do get are making genuinely helpful suggestions on ways to improve the project.
i appreciate it is a lot of work getting your source code up to a human readable standard though.


dunk.

Offline Webbot (Topic starter)

Re: Text to speech synthesizer
« Reply #8 on: October 24, 2008, 09:40:09 AM »
Quote
i'm curious how it is you are creating the correct sounds in software.
have you digitised speech, saved the common sounds and are replaying them?

Yes - in very simplistic terms. In practice it is more difficult than that as the pitch of individual sounds can be influenced by what comes before/after. Also: rather than playing one sound, then the next, then the next etc you get a better effect if you fade one sound out whilst fading the next one in.
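The fade-out/fade-in idea can be sketched as a simple linear crossfade between two lists of samples. This is a generic illustration under my own assumptions (float samples, a linear ramp, made-up names), not the code the program actually uses.

```python
def crossfade(a, b, overlap):
    """Join sample lists a and b, fading a out while b fades in.

    The last `overlap` samples of a are mixed with the first `overlap`
    samples of b using linear weights that always sum to 1.
    """
    if overlap <= 0:
        return list(a) + list(b)  # plain back-to-back playback
    overlap = min(overlap, len(a), len(b))
    head = list(a[:len(a) - overlap])
    tail = list(b[overlap:])
    mixed = []
    for i in range(overlap):
        w = (i + 1) / (overlap + 1)  # weight of the incoming sound, 0 < w < 1
        mixed.append((1 - w) * a[len(a) - overlap + i] + w * b[i])
    return head + mixed + tail


loud = [1.0] * 4
quiet = [0.0] * 4
print(crossfade(loud, quiet, 2))
```

The joined result is shorter than simple concatenation by `overlap` samples, which is also why crossfaded speech tends to sound less choppy: there is no gap or hard edge between sounds.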

Thanks for your comments re releasing. I'll get there eventually but as you say there is still some work to do to tidy up and document some of the code. Refactoring code that has no user interface is also a bit more tricky than normal as it's hard to detect any small errors that have been introduced.

Offline Admin

  • Administrator
  • Supreme Robot
  • *****
  • Posts: 11,659
  • Helpful? 169
    • Society of Robots
Re: Text to speech synthesizer
« Reply #9 on: October 27, 2008, 08:06:42 AM »
Quote
Admin has gone crazy
lol

Nice work! The files were a bit hard to understand, but I definitely want to add this to my robot!

As for version control . . . I didn't want to go through the sourceforge effort either. For my Axon, I just post all version releases, and update a text file that summarizes any changes I made since the last version.

I hope you are making a members tutorial on how to add this to the $50 robot when you're done? (hint hint :P)

Offline Webbot (Topic starter)

Re: Text to speech synthesizer
« Reply #10 on: October 28, 2008, 03:16:09 PM »
Have done a bit of debugging and found a few problems. So the quality is a bit better. But synthetic speech will always be a bit awkward to understand. Have also now allowed you to speak a line of phonemes - as this gives you better control on the output compared with the mcu trying to convert English to phonemes.

Am now going to work on the output electronics - some low pass filters + amplifier etc which should also help a bit.

Tutorial in the pipeline !!  ::)

Offline Webbot (Topic starter)

Re: Text to speech synthesizer
« Reply #11 on: October 30, 2008, 11:24:32 PM »
So here is the tutorial, source code etc: http://www.societyofrobots.com/member_tutorials/node/211

Note that I have provided 3 different I/O methods to output the sound. Each method requires either 1 or 4 I/O pins, and varying amounts of electronics as an audio driver. In its simplest form it just needs one IO pin plus: 2xresistors, 1xtransistor, and 1xspeaker (of course!).

Setting which method is used can be done via the makefile (note that you can choose any number of the above IO methods), along with setting port addresses etc.

For the more advanced folk out there I would be keen to hear back about your thoughts about the different I/O methods - especially with regard to sound quality!!  I don't profess to be an electronic engineer and so if you come up with any better audio drivers I would love to hear about them.

I purposely haven't covered other IO stuff - like how to create a UART to PC serial link board using, say, the MAX232 chip - as there is already a host of info out there in the Google ether.

This project has taken up LOADS of my time - but I'm quite pleased by it (although my wife may beg to differ !) So get to work and give your 'bot a voice - and let me know how you get on...... In the meantime I really need to get some sleep!

Webbot