Using a digital device such as a processor makes it quite difficult to generate audio signals in an analogue domain. Digital devices talk about '1's and '0's whereas audio devices talk about a scalar volume as an analogue voltage. So how do we convert from one world to another?
In the end, the speech program tries to produce a sound wave from a volume in the range 0 to 15, where 0 is quiet and 15 is full volume. Quite how it makes this 0 to 15 value available, or audible, to the outside world is one of the most difficult aspects.
The easy answer is that we need a DAC (digital to analogue convertor). But our processor doesn't have one!
So we will have to build one!
There are many, many ways to do this:-
1. One wire PWM
Look back at FM radios. They used a constant high-frequency carrier as the transport. The audio was then imposed by making small changes to that frequency. At the receiver end, you could cancel out the much higher carrier frequency to be left with just the small 'audio' frequency. So if we use PWM with a very high frequency, we can encode the audio on top (by changing the duty cycle) and then filter out the PWM base frequency at the receiver end to get back to the audio information. Why all this encode/decode stuff? Well, it allows us to transmit a signal over a single wire that can be turned back into an audio signal. So we only need one I/O pin from our controller but can still try to achieve the 16 volume levels to make the speech as clear as possible. But this is the most complex option!
2. One wire Digital.
This is by far the easiest to understand and the cheapest to implement. Given that the volume can be in the range 0 to 15, we can turn it into a digital signal by saying: if the input is 0 to 7 then output=0, else if the input is 8 to 15 then output=1. So we have changed the volume from 0...15 to 0...1. Of course, the output signal is somewhat of a distortion (or simplification) of the input signal, ie signal in = 0...15, but signal out = 0...1. So something has been lost, and it is the nuances of the sounds. But we can turn the resultant 0 or 1 into 'speaker off' or 'speaker on' commands very easily to make our sound. So we only need one I/O pin from our controller, and our resultant circuitry is simple, if less effective.
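The thresholding rule described above can be sketched in a few lines of C. This is only an illustration: the function name volume_to_bit is made up for this example and is not taken from the speech program's source.

```c
#include <stdint.h>

/* Collapse a 4-bit volume (0..15) into a single on/off bit.
   0..7  -> 0 (speaker off)
   8..15 -> 1 (speaker on)
   The name volume_to_bit is hypothetical, chosen for this sketch. */
uint8_t volume_to_bit(uint8_t volume)
{
    return (volume >= 8) ? 1 : 0;
}
```

Every volume in the lower half of the range becomes 'off' and every volume in the upper half becomes 'on', which is exactly where the nuances get lost.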
3. 4 wire Digital
The previous example is 'digital' and so can only say 'sound on' or 'sound off' - ie it 'shouts' or 'is quiet'. But since the speech core can generate sound envelope volumes from 0 to 15 then how do we implement them? This option gives us 2 to the power of 4 (ie 16) possible values. If we have 4 output pins available on our controller then we can output all of the values from 0 to 15. How we convert this back to the analogue world is done by the magic of R/2R ladders. See http://en.wikipedia.org/wiki/Resistor_Ladder. This allows us to continue to use the spectrum of volumes but requires 4 I/O pins to do so.
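To get a feel for what the ladder produces, here is a rough sketch of the ideal output voltage for each 4-bit value. It assumes a standard terminated R/2R ladder with nothing loading its output and pins that swing fully between 0v and the supply; the function name ladder_output_mv is hypothetical.

```c
#include <stdint.h>

/* Ideal, unloaded output of a 4-bit R/2R ladder, in millivolts.
   code    : the 4-bit value on the pins (0..15)
   vref_mv : logic-high voltage of the pins in mV (assumed, eg 5000)
   Each step is vref/16, so the top code gives 15/16 of vref. */
uint16_t ladder_output_mv(uint8_t code, uint16_t vref_mv)
{
    code &= 0x0F;  /* only the low 4 bits drive the ladder */
    return (uint16_t)(((uint32_t)vref_mv * code) / 16u);
}
```

With a 5v supply this gives steps of about 0.31v, so a value of 8 lands at 2.5v and a value of 15 at roughly 4.69v. A real circuit will differ a little because of resistor tolerance and whatever load is attached.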
The additional hardware required for each of these options is given in the following sections:-
The one wire PWM mode uses a single PWM output pin (default is port B2). The PWM is set up to oscillate at an inaudible frequency. The volume levels 0 to 15 are then used to change the duty cycle between 0 percent and 50 percent.
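The mapping from volume to duty cycle might look something like the sketch below. This is an assumption about one way to do it, not the actual code from the speech program: the function name pwm_compare_for_volume is hypothetical, and it assumes a timer that counts from 0 up to some 'top' value with the output held high while the counter is below the returned compare value.

```c
#include <stdint.h>

/* Map a volume (0..15) onto a PWM compare value so that the duty
   cycle runs from 0% up to roughly 50% of the timer period.
   top : the timer's maximum count (eg 255 for an 8-bit timer).
   Both the function and the linear mapping are assumptions made
   for this sketch. */
uint16_t pwm_compare_for_volume(uint8_t volume, uint16_t top)
{
    if (volume > 15) {
        volume = 15;  /* clamp out-of-range input */
    }
    /* volume 15 maps to half the period, ie about 50% duty */
    return (uint16_t)(((uint32_t)volume * (top / 2u)) / 15u);
}
```

On an 8-bit timer (top = 255) this gives compare values from 0 at volume 0 up to 127 at volume 15. On a real AVR the result would be written to the timer's output compare register for the chosen pin.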
The electronics to decode this signal are fairly complex. First we use a low pass filter to remove the high-pitched carrier frequency. We then use an audio amplifier to amplify the resultant signal. I have included a potentiometer/trimmer as a volume control. If you set the volume too high then you may well get all sorts of squeal! You could use a breadboard to find the best setting of the trimmer and then replace it with fixed resistors.
To activate this mode you need to edit global.h and make sure that the line:-
is not commented out. You cannot change the port without digging around in the code a bit, which requires you to understand how to change the AVR registers for the different PWM modes.
The attached Eagle schematic shows you how to create the circuit:-
The one wire digital mode uses a single output pin to drive the speaker. To activate this mode edit the 'global.h' file and make sure the line
is not commented out.
You can select which IO port and pin to use in global.h by changing these lines which currently use PORTB pin 4 :-
#define SOUND_PORT PORTB
#define SOUND_DDR DDRB
#define SOUND_BIT BV(PB4)
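As an illustration of how these three defines could be used, here is a sketch that switches the speaker pin on and off. To let it compile on a PC, the AVR registers are replaced by plain variables (PORTB_sim and DDRB_sim), and BV and PB4 are defined by hand; on a real build those would come from the AVR headers and the project's own code. The functions sound_init and sound_set are hypothetical names.

```c
#include <stdint.h>

/* Stand-ins for the AVR registers so this sketch builds on a PC.
   These are assumptions for illustration only. */
uint8_t PORTB_sim, DDRB_sim;
#define BV(bit) (1u << (bit))
#define PB4 4

#define SOUND_PORT PORTB_sim
#define SOUND_DDR  DDRB_sim
#define SOUND_BIT  BV(PB4)

/* Make the sound pin an output and start with the speaker off. */
void sound_init(void)
{
    SOUND_DDR  |= SOUND_BIT;
    SOUND_PORT &= (uint8_t)~SOUND_BIT;
}

/* Drive the pin high (speaker on) or low (speaker off) without
   disturbing the other pins on the same port. */
void sound_set(uint8_t on)
{
    if (on) {
        SOUND_PORT |= SOUND_BIT;
    } else {
        SOUND_PORT &= (uint8_t)~SOUND_BIT;
    }
}
```

Because only SOUND_BIT is touched, moving the speaker to another pin should only need the three defines to change.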
This mode only requires the following additional hardware:-
1 x 10k ohm resistor
1 x 47 ohm resistor
1 x 2N2222 or similar NPN transistor
1 x 8 ohm 0.1 watt speaker
The following schematic shows how to assemble the electronics. Note that we use 0v, +5v, and the I/O port pin so you can use a standard 3 way cable (like the ones you use to connect to servos or sensors).
The 4 wire digital mode allows you to use 4 contiguous output pins from the same port to output the 16 different volume levels.
To activate this mode, edit global.h and make sure that the line:-
is not commented out.
You can change the port and pins that are used by editing the following lines in global.h
#define QUAD_PORT PORTC
#define QUAD_DDR DDRC
#define QUAD_BIT PC0
#define QUAD_MASK (BV(QUAD_BIT) | BV(QUAD_BIT+1) | BV(QUAD_BIT+2) | BV(QUAD_BIT+3))
The above values will use Port C pins 0, 1, 2 and 3.
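To show what the mask is for, here is a sketch of writing a volume to the four pins while leaving the other four pins of the port alone. As before, the registers are simulated with plain variables (PORTC_sim and DDRC_sim) and BV and PC0 are defined by hand so that it compiles on a PC; quad_init and quad_write are hypothetical names, not functions from the speech program.

```c
#include <stdint.h>

/* Stand-ins for the AVR registers so this sketch builds on a PC.
   These are assumptions for illustration only. */
uint8_t PORTC_sim, DDRC_sim;
#define BV(bit) (1u << (bit))
#define PC0 0

#define QUAD_PORT PORTC_sim
#define QUAD_DDR  DDRC_sim
#define QUAD_BIT  PC0
#define QUAD_MASK (BV(QUAD_BIT) | BV(QUAD_BIT+1) | BV(QUAD_BIT+2) | BV(QUAD_BIT+3))

/* Make the four ladder pins outputs. */
void quad_init(void)
{
    QUAD_DDR |= QUAD_MASK;
}

/* Put a volume (0..15) on the four pins, keeping the other pins
   of the port unchanged. */
void quad_write(uint8_t volume)
{
    uint8_t bits = (uint8_t)((volume & 0x0Fu) << QUAD_BIT);
    QUAD_PORT = (uint8_t)((QUAD_PORT & ~QUAD_MASK) | bits);
}
```

The shift by QUAD_BIT is why the four pins have to be contiguous: the 4-bit volume is simply slid along the port to where the first pin sits.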
This mode may be preferable to PWM if you are using the PWM ports already, say for controlling motors, but it does need 4 I/O pins. The electronics of the audio driver board are also fairly complex. First of all we take the outputs from the pins and feed them into a resistor ladder, which acts like a digital-to-analogue convertor and generates a signal between 0v and 5v in 16 steps. Since the 'ladder' requires lots of resistors with one value, and lots more with twice that value, I have opted (in my schematic) to use several SIL resistor arrays of 10k resistors. These things are quite cool in this scenario. A single package has 8 pins in a single line (hence SIL). Each adjacent pair of pins has one 10k resistor between them, so an 8 pin package has 4 resistors. Note how I put some in parallel to generate 5k resistors in order to satisfy the requirement of the ladder. These devices are normally rated at ±2%, which is better than your average resistor.
The output of the ladder (ie the analogue signal) is then fed into a unity gain operational amplifier just to give it some extra oomph. Note how the op-amp has power supplied via an RC network to try to protect it from power supply noise.
The output of the op amp is used to drive a transistor that drives the speaker.
Here's the schematic:-