The SPCE061A, the newest product from Ling Yang Technology, combines microcontroller functions with DSP operations, so it can be used for digital speech (and music) signal processing. Taking advantage of this, we designed a voice remote control. It recognizes spoken control commands and the names of frequently watched channels: remote control commands such as “power on”, “power off”, “channel up” and “channel down”, and channel commands such as “CCTV-1” and “Beijing 2”. For these channels the user can simply say the name to switch directly to the corresponding channel, without the trouble of searching every time. This also makes the remote convenient to use in dim light at night, and it is a boon for users with poor eyesight and for blind users.
We have also added temperature acquisition and a voice temperature report: under key control, the remote measures the current temperature and announces in a pleasant voice, “The current temperature is xx degrees Celsius”.
Without increasing the cost, we also added perpetual calendar computation and a voice time report: at the press of a key, the remote announces the date as “year xxxx, month xx, day xx” or the time as “morning (afternoon, evening) x:x”.
1 System composition
The voice remote control as currently designed consists mainly of the keypad input, MIC input, temperature acquisition, speech output and infrared transmission circuits. The keys retain the conventional key-operated remote control functions and also trigger the voice temperature and time reports. The SPCE061A has one A/D conversion channel dedicated to voice signal acquisition (the MIC input), which can be used for speech recognition, sound recording and other voice input; this remote control uses it to acquire the voice signal for speech recognition. The SPCE061A also has 7 A/D channels, one of which is used for temperature acquisition; the temperature is announced under key control. The SPCE061A offers a rich set of time-base signals; the 2 Hz signal is used for counting to drive the perpetual calendar, and the time is announced under key control. The system composition is shown in Figure 1.
2 Hardware composition
The hardware circuit is shown in Figure 2:
A total of 13 keys are provided, arranged as a 4×4 matrix keypad: IOA0 to IOA3 are configured as inputs and IOA8 to IOA11 as outputs, which leaves 3 key positions in reserve.
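The matrix scan can be sketched in C as follows. This is a minimal illustration, not the article's firmware: port access is hidden behind a hypothetical callback (`read_inputs`), the key numbering (`output line * 4 + input line`) is an assumption, and `simulated_reader` exists only so the logic can be exercised on a PC. On real hardware the callback would drive one of IOA8 to IOA11 and read back IOA0 to IOA3.

```c
#include <assert.h>
#include <stdint.h>

#define KEY_NONE (-1)

/* Hypothetical hardware hook: activate one of the 4 output lines and
   return the state of the 4 input lines as a bit mask (bit set = key
   pressed on that input line). */
typedef uint8_t (*row_reader_fn)(uint8_t active_output_line);

/* Scan a 4x4 matrix and return the index (0..15) of the first pressed
   key found, or KEY_NONE when no key is pressed. */
int scan_keypad(row_reader_fn read_inputs)
{
    assert(read_inputs != 0);
    for (uint8_t out = 0; out < 4; ++out) {
        uint8_t inputs = read_inputs(out) & 0x0F;
        for (uint8_t in = 0; in < 4; ++in) {
            if (inputs & (1u << in)) {
                return out * 4 + in;
            }
        }
    }
    return KEY_NONE;
}

/* PC-side stand-in for the ports: pretends that exactly one key
   (simulated_pressed_key) is held down. */
int simulated_pressed_key = KEY_NONE;

uint8_t simulated_reader(uint8_t active_output_line)
{
    if (simulated_pressed_key == KEY_NONE) {
        return 0;
    }
    if (simulated_pressed_key / 4 != active_output_line) {
        return 0;
    }
    return (uint8_t)(1u << (simulated_pressed_key % 4));
}
```

With 13 keys in use, three of the sixteen indices would simply be left unassigned in the key-handling table.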
The speech recognition hardware is quite simple. The MIC is an electret microphone, which offers a simple structure, light weight, small size, omnidirectional pickup, wide frequency response and good fidelity. The microphone bias is supplied by the VMIC pin of the SPCE061A.
The SPCE061A provides two 10-bit D/A output channels, AUD1 and AUD2, for voice signal output; each DAC channel can supply 3 mA. To limit power consumption, only the single channel AUD1 is used.
The PWM signal output on IOB8 of the SPCE061A (a special function of ports IOB8 and IOB9) can serve as the infrared remote control signal; its carrier frequency is determined by the overflow frequency of the programmable timer TimerA (or TimerB). This remote control generates its infrared signal on IOB8.
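Since the carrier depends on the timer overflow frequency, choosing a carrier comes down to choosing a timer preload value. The sketch below shows that arithmetic under stated assumptions: a 16-bit up-counting timer that overflows after 0xFFFF and reloads from the preload value, so that overflow frequency = timer clock / (65536 - preload). The function names are hypothetical, and the exact relation between overflow frequency and the PWM carrier should be taken from the chip's datasheet.

```c
#include <assert.h>
#include <stdint.h>

/* Preload for a 16-bit up-counting timer (assumed behaviour) so that it
   overflows at approximately overflow_hz. The count is rounded to the
   nearest integer. */
uint16_t timer_preload_for(uint32_t timer_clock_hz, uint32_t overflow_hz)
{
    assert(overflow_hz > 0 && timer_clock_hz >= overflow_hz);
    uint32_t counts = (timer_clock_hz + overflow_hz / 2) / overflow_hz;
    assert(counts >= 1 && counts <= 65536u);
    return (uint16_t)(65536u - counts);
}

/* Overflow frequency actually obtained with a given preload (integer
   division, so the result is truncated). */
uint32_t timer_overflow_hz(uint32_t timer_clock_hz, uint16_t preload)
{
    return timer_clock_hz / (65536u - preload);
}
```

Because the count is an integer, the achieved frequency generally differs slightly from the target; `timer_overflow_hz` lets the caller check how far off it is.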
The temperature sensor is an ordinary negative temperature coefficient (NTC) thermistor, which has high sensitivity and a low price.
The µ’nSP™ core is a general-purpose core structure. The other functional modules are optional structures, that is, they can be made larger or smaller, or left out altogether. Attaching optional structures to this common core in building-block fashion yields different series of derivative products suited to different applications. This undoubtedly gives each derivative product stronger functions at a lower cost.
3 Software design
The software has a modular structure. The program modules include initialization, keyboard scanning, temperature acquisition, temperature announcement, perpetual calendar timekeeping, calendar announcement, infrared transmission, speech recognition and voice playback. The program flow is shown in Figure 3:
Figure 3 flow chart
System initialization covers the system clock, the I/O ports, the initial calendar values and the interrupts (key wake-up and the 2 Hz interrupt are enabled). The program then scans the keys: if a key is pressed it performs the corresponding processing, and if not it enters sleep. The perpetual calendar computation is carried out in the 2 Hz interrupt service routine.
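The calendar update performed in the 2 Hz interrupt can be sketched as below. This is an illustrative C version, not the article's code: the `calendar_t` layout and function names are hypothetical, and the Gregorian leap-year rule is assumed. The function is meant to be called once per 2 Hz interrupt, so two calls advance the clock by one second.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    uint16_t year;
    uint8_t month;      /* 1..12 */
    uint8_t day;        /* 1..31 */
    uint8_t hour;       /* 0..23 */
    uint8_t minute;     /* 0..59 */
    uint8_t second;     /* 0..59 */
    uint8_t half_ticks; /* 2 Hz interrupts seen within the current second */
} calendar_t;

static int is_leap_year(uint16_t y)
{
    return (y % 4 == 0 && y % 100 != 0) || (y % 400 == 0);
}

static uint8_t days_in_month(uint16_t y, uint8_t m)
{
    static const uint8_t days[12] =
        {31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31};
    assert(m >= 1 && m <= 12);
    if (m == 2 && is_leap_year(y)) {
        return 29;
    }
    return days[m - 1];
}

/* Call once per 2 Hz interrupt; carries ripple upward only when a
   field wraps, so most calls return after one or two comparisons. */
void calendar_tick_2hz(calendar_t *c)
{
    if (++c->half_ticks < 2) return;
    c->half_ticks = 0;
    if (++c->second < 60) return;
    c->second = 0;
    if (++c->minute < 60) return;
    c->minute = 0;
    if (++c->hour < 24) return;
    c->hour = 0;
    if (++c->day <= days_in_month(c->year, c->month)) return;
    c->day = 1;
    if (++c->month <= 12) return;
    c->month = 1;
    ++c->year;
}
```

Keeping the handler this short suits the sleep-and-wake structure described above, since the processor only needs to wake briefly for each tick.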
Voice playback uses the audio coding algorithms provided by Ling Yang Technology. The prompt speech is first recorded on a PC as WAV files, then compressed with the compression tool supplied by Ling Yang into binary files that are added to the user program; after compiling and linking, they are stored in the microcontroller's flash. At playback the data are decompressed and sent to the D/A converter to restore the speech. For SPCE series chips built on the µ’nSP™ core, Ling Yang provides three algorithms with different compression ratios; the following table lists each algorithm name and its encoding bit rates.
| Compression algorithm | Speech compression bit rates |
| --- | --- |
| SACM_A2000 | 16 kbit/s, 20 kbit/s, 24 kbit/s |
| SACM_S480 | 4.8 kbit/s, 7.2 kbit/s |
| SACM_S240 | 2.4 kbit/s |
The three algorithms differ in compression ratio and therefore in sound fidelity. SACM_A2000 has a relatively low compression ratio and good fidelity, but uses correspondingly more resources. SACM_S240 has the highest compression ratio and relatively poor fidelity. SACM_S480 lies between the two. Each algorithm comes with a complete set of library functions that the program can call, which makes the software easy to write. This voice remote control uses the SACM_S480 algorithm.
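The trade-off between bit rate and storage can be estimated with simple arithmetic. The helper below is a rough sketch with a hypothetical name; it assumes 1 kbit/s means 1000 bits per second and ignores any file header or framing overhead added by the compression tool.

```c
#include <assert.h>
#include <stdint.h>

/* Approximate storage, in bytes, for `seconds` of compressed speech at
   `bits_per_second`, rounded up to a whole byte. Header and framing
   overhead are not included (assumption). */
uint32_t speech_storage_bytes(uint32_t bits_per_second, uint32_t seconds)
{
    uint64_t bits = (uint64_t)bits_per_second * seconds;
    return (uint32_t)((bits + 7) / 8);
}
```

Under these assumptions, 10 seconds of prompts at 4.8 kbit/s need about 6000 bytes, against about 20000 bytes at 16 kbit/s, which illustrates why a mid-range algorithm can be attractive when prompts must fit in on-chip flash.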
Speech recognition comes in two forms: speaker-dependent recognition and speaker-independent recognition.
In speaker-dependent recognition, the speech models are trained by a single person; recognition accuracy is high for that person's commands, but commands spoken by other people are recognized poorly or not at all. In speaker-independent recognition, the models are trained by people of different ages, sexes and voices, so the commands of a whole group of people can be recognized. Extracting good speech models is important. This voice remote control uses speaker-independent recognition.
Speech recognition consists of two processes: model training and recognition. We call the storage space for the reference patterns the “word stock”, and each reference pattern a “model”. Model training performs spectral analysis on each command to be recognized and extracts characteristic parameters that serve as the reference pattern. Recognition extracts the characteristic parameters of the spoken command, compares them with the models in the word stock, and takes the index of the command whose model is most similar as the recognition result. Ling Yang Technology provides a model training tool and speech recognition library functions. Up to 30 voice commands can be recognized at a time; if there are more commands, they can be divided into groups. The speech recognition flow is shown in Figure 4.
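The comparison step, choosing the model in the word stock most similar to the spoken command, can be illustrated with a generic nearest-template search. This is only a sketch of the idea and not the Ling Yang library: the function name, the fixed-length integer feature vectors and the squared Euclidean distance used as the similarity measure are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Return the index of the model closest to `features`, or -1 when the
   word stock is empty. `models` holds model_count vectors of `dim`
   values each, stored one after another. Smaller squared Euclidean
   distance is treated as higher similarity (assumption). */
int best_matching_command(const int16_t *models, size_t model_count,
                          size_t dim, const int16_t *features)
{
    int best_index = -1;
    uint64_t best_distance = UINT64_MAX;

    for (size_t i = 0; i < model_count; ++i) {
        uint64_t distance = 0;
        for (size_t d = 0; d < dim; ++d) {
            int64_t diff = (int64_t)models[i * dim + d] - features[d];
            distance += (uint64_t)(diff * diff);
        }
        if (distance < best_distance) {
            best_distance = distance;
            best_index = (int)i;
        }
    }
    return best_index;
}
```

A practical recognizer would normally also reject inputs whose best distance is still too large, so that unrelated sounds do not trigger a command; that threshold is omitted here.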
This article has described a voice remote control built on the SPCE061A that combines infrared remote control, speech recognition, voice temperature reporting and voice time reporting. The system uses a single chip for both speech processing and control, so compared with designs based on a dedicated speech processing chip it is simple in structure, low in cost and easy to implement. Ling Yang Technology also provides a rich C function library and speech processing library for users to call, which shortens the development cycle. With slight modification the design can control appliances such as air conditioners and video recorders. The speech processing strengths of the SPCE061A can also be used to build voice answering systems, speech synthesis systems, interactive toys and so on, giving it broad market prospects.