CN101419795B - Audio signal detection method and device, and auxiliary oral language examination system

Info

Publication number: CN101419795B
Application number: CN2008102392007A
Authority: CN
Inventors: 李伟; 徐波
Original assignee: Beijing Zhichengzhuosheng Technology Dev Co ltd
Current assignee: Tianjin Xunfei Information Technology Co ltd
Priority date: 2008-12-03
Filing date: 2008-12-03
Publication date: 2011-04-06
Anticipated expiration: 2028-12-03
Also published as: CN101419795A

Abstract

The invention discloses an audio signal detection method and device, and an auxiliary oral examination system. The method comprises: recording an audio signal; when it is determined that a voice signal exists in the recorded audio signal, obtaining amplitude information and signal-to-noise ratio information of the audio signal; and, when the volume of the voice signal in the audio signal is judged to be abnormal according to the obtained amplitude information and signal-to-noise ratio information, issuing a prompt indicating that the volume of the voice signal is abnormal. This technical solution addresses the low reliability and flexibility of computer-assisted oral examinations in the prior art.

Description

Audio signal detection method and device and auxiliary oral language examination system

Technical field

The present invention relates to the field of signal detection technology, and in particular to an audio signal detection method and device and an auxiliary oral examination system.

Background technology

At present, oral language examinations are increasingly conducted with the assistance of computer systems. The examination is usually held in a computer room, in which one computer serves as the proctor's computer (hereinafter the monitoring server) and the remaining computers are used by the examinees (hereinafter auxiliary oral examination clients). The proctor monitors the whole examination through the monitoring server. After logging on to the oral examination system, an examinee converses directly with a character in a virtual examination scene to complete the examination tasks; the auxiliary oral examination client records the examinee's speech, and the recordings are scored together after the examination ends. With this interactive mode, examinees are not thrown off their normal performance by the nervousness of facing an examiner, and their waiting time is greatly shortened. In addition, because scoring is carried out over a network, an examinee's performance can be re-examined and the performance of the scoring teachers can be tracked dynamically, which reduces scoring errors.

However, computer-assisted oral examination places high demands on the computer's recording equipment. Because of differences in hardware and software, the recorded speech volume may be too high or too low, in which case the examinee's recording is a faulty recording. If the fault is discovered only during scoring after the examination, the examinee's score is difficult to determine. This lowers the reliability and flexibility of computer-assisted oral examination.

Summary of the invention

The embodiments of the present invention provide an audio signal detection method and device, to solve the problem of the low reliability and flexibility of computer-assisted oral examination in the prior art.

Correspondingly, the embodiments of the present invention also provide an auxiliary oral examination system.

The technical solution of the present invention is as follows:

An audio signal detection method, comprising the steps of: recording an audio signal; when it is determined that a voice signal exists in the audio signal, obtaining amplitude information and signal-to-noise ratio information of the audio signal; and, when the volume of the voice signal in the audio signal is judged to be abnormal according to the obtained amplitude information and signal-to-noise ratio information, issuing a prompt indicating that the volume of the voice signal is abnormal.

An audio signal detection device, comprising: a recording unit, configured to record an audio signal; a determining unit, configured to determine whether a voice signal exists in the audio signal recorded by the recording unit; a first acquiring unit, configured to obtain amplitude information and signal-to-noise ratio information of the audio signal when the determining unit determines that a voice signal exists in the audio signal; a first judging unit, configured to judge, according to the amplitude information and signal-to-noise ratio information obtained by the first acquiring unit, whether the volume of the voice signal in the audio signal is normal; and a prompting unit, configured to issue a prompt indicating that the volume of the voice signal is abnormal when the judgment result of the first judging unit is negative.

An auxiliary oral examination system, comprising an auxiliary oral examination client and a monitoring server. The auxiliary oral examination client is configured to record an audio signal; when it is determined that a voice signal exists in the audio signal, to obtain amplitude information and signal-to-noise ratio information of the audio signal; and, when the volume of the voice signal in the audio signal is judged to be abnormal according to the amplitude information and signal-to-noise ratio information, to issue a prompt indicating that the volume of the voice signal is abnormal. The client is further configured to obtain second characteristic information of the audio signal recorded within a specified time length and, when the audio signal within the specified time length is judged to be abnormal according to the obtained second characteristic information, to send the audio signal within the specified time length, together with information indicating that this audio signal is abnormal, to the monitoring server. The monitoring server is configured to display the audio signal within the specified time length sent by the auxiliary oral examination client and the information indicating that this audio signal is abnormal.

In the technical solution of the present invention, during a computer-assisted oral examination the auxiliary oral examination client records the examinee's audio signal. When it is determined that a voice signal exists in the recorded audio signal, amplitude information and signal-to-noise ratio information of the audio signal are obtained, and whether the volume of the voice signal is normal is judged according to them. When the judgment result is negative, a prompt indicating that the volume of the voice signal is abnormal is issued. This avoids the situation in which a faulty recording is discovered only during scoring after the examination, making the examinee's score difficult to determine. The examinee's recording is checked during the examination itself and detected faulty recordings are flagged, which effectively improves the reliability and flexibility of computer-assisted oral examination.

Description of drawings

Fig. 1 is a schematic flowchart of the audio signal detection method in an embodiment of the present invention;

Fig. 2 is a schematic flowchart of periodic detection of the audio signal in an embodiment of the present invention;

Fig. 3 is a schematic structural diagram of the audio signal detection device in an embodiment of the present invention.

Embodiment

The embodiments of the present invention propose that, in a computer-assisted oral examination, the auxiliary oral examination client issues a prompt when it judges that the volume of the voice signal in the recorded audio signal of the examinee is abnormal. The examinee's recording is thereby checked during the examination and detected faulty recordings are flagged, which improves the reliability and flexibility of computer-assisted oral examination.

The embodiments of the present invention are described in detail below with reference to the accompanying drawings.

Fig. 1 shows the flowchart of the audio signal detection method in an embodiment of the present invention. The processing is as follows:

Step 101: record an audio signal.

Step 102: determine whether a voice signal exists in the recorded audio signal. This may be implemented as follows:

First, first characteristic information of the audio signal is obtained, and whether a voice signal exists in the audio signal is determined according to it. The first characteristic information may be, but is not limited to, fundamental frequency information or Mel-frequency cepstral coefficient (MFCC) information; preferably, it may comprise both fundamental frequency information and MFCC information.

Step 103: if it is determined in step 102 that a voice signal exists in the audio signal, obtain amplitude information and signal-to-noise ratio information of the audio signal.

Step 104: according to the amplitude information and signal-to-noise ratio information obtained in step 103, judge whether the volume of the voice signal in the audio signal is normal. This may be implemented as follows:

First, the volume value of the voice signal in the audio signal is determined according to the obtained amplitude information and signal-to-noise ratio information. It is then judged whether the determined volume value lies between a first specified threshold and a second specified threshold, where the first specified threshold is less than the second specified threshold. If so, the volume of the voice signal is determined to be normal; if not, the volume of the voice signal is determined to be abnormal.

An abnormal voice signal volume further covers two cases: the volume is below the normal level, or the volume is above the normal level. They are distinguished as follows: if the volume value is judged to be less than the first specified threshold, the volume of the voice signal is determined to be below the normal level; if the volume value is judged to be greater than the second specified threshold, the volume of the voice signal is determined to be above the normal level.
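The two-threshold decision described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the names `VolumeStatus` and `classify_volume` are hypothetical, the thresholds are passed in by the caller because the text gives no numeric values, and a volume value exactly equal to either threshold is treated as normal, which the text does not specify.

```python
from enum import Enum


class VolumeStatus(Enum):
    TOO_LOW = "too_low"
    NORMAL = "normal"
    TOO_HIGH = "too_high"


def classify_volume(volume, first_threshold, second_threshold):
    """Classify a volume value against the first and second thresholds.

    Assumption: values equal to a threshold count as normal.
    """
    if not first_threshold < second_threshold:
        raise ValueError("first threshold must be less than second threshold")
    if volume < first_threshold:
        return VolumeStatus.TOO_LOW   # below the normal level
    if volume > second_threshold:
        return VolumeStatus.TOO_HIGH  # above the normal level
    return VolumeStatus.NORMAL
```

For example, with hypothetical thresholds of -30 and -6 (in any consistent unit), `classify_volume(-40, -30, -6)` yields `VolumeStatus.TOO_LOW`.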

Step 105: if the judgment result in step 104 is negative, issue a prompt indicating that the volume of the voice signal is abnormal. The prompt may be implemented as follows, without being limited thereto:

If the volume of the voice signal is determined to be below the normal level, judge whether the ratio of the intensity of the airflow component in the audio signal to the intensity of the voice signal is less than a third specified threshold. If the ratio is less than the third specified threshold, issue a prompt to reduce the distance to the recording input device; if the ratio is not less than the third specified threshold, issue a prompt to increase the speaking volume.

If the volume of the voice signal is determined to be above the normal level, judge whether the ratio of the intensity of the airflow component in the audio signal to the intensity of the voice signal is greater than a fourth specified threshold. If the ratio is greater than the fourth specified threshold, issue a prompt to increase the distance to the recording input device; if the ratio is not greater than the fourth specified threshold, issue a prompt to reduce the speaking volume.
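The prompt selection in step 105 amounts to a small decision table, sketched below in Python. The function name `choose_prompt`, the status strings and the prompt wording are assumptions for illustration; the third and fourth thresholds are caller-supplied because the text does not give values.

```python
def choose_prompt(status, airflow_to_voice_ratio, third_threshold, fourth_threshold):
    """Pick the prompt for an abnormal volume.

    status: "too_low", "too_high" or "normal" (hypothetical labels).
    airflow_to_voice_ratio: intensity of the airflow component divided by
    the intensity of the voice signal.
    """
    if status == "too_low":
        if airflow_to_voice_ratio < third_threshold:
            # Little breath noise reaches the microphone: speaker is likely far away.
            return "Move closer to the microphone."
        return "Speak louder."
    if status == "too_high":
        if airflow_to_voice_ratio > fourth_threshold:
            # Strong breath noise: speaker is likely too close.
            return "Move farther from the microphone."
        return "Speak more quietly."
    return None  # normal volume, no prompt
```

The comparisons mirror the text exactly: strict "less than" for the third threshold and strict "greater than" for the fourth, so a ratio equal to a threshold leads to the speaking-volume prompt.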

When the audio signal detection method described above is implemented in the auxiliary oral examination client, the client can detect whether the examinee's audio signal recorded through the microphone is faulty and, if so, promptly prompt the examinee to make a corresponding adjustment. For example, when the examinee's volume is detected to be too low, the examinee is prompted to speak louder or to move closer to the microphone; when the volume is detected to be too high, the examinee is prompted to speak more quietly or to move farther from the microphone.

In addition, after the prompt indicating an abnormal voice signal volume has been issued, second characteristic information of the audio signal recorded within a specified time length may further be obtained. Whether the audio signal within the specified time length is normal is judged according to the obtained second characteristic information and, when the judgment result is negative, the audio signal within the specified time length, together with information indicating that this audio signal is abnormal, is sent to the monitoring server. The second characteristic information may be at least one of energy information, fundamental frequency information, pulse energy information, energy fluctuation information and overflow energy information. In this way the auxiliary oral examination client can check, for each specified time length (for example, the time taken to record the answer to one examination question), whether the audio signal within that time length is faulty. If a fault is detected, the audio signal of that period and the fault information are promptly sent to the monitoring server, so that the proctor at the monitoring server learns in time which auxiliary oral examination client has a faulty audio signal and can act accordingly, which better ensures the stability of the whole oral examination.

A more specific embodiment is given below.

In the embodiments of the present invention, the recorded audio signal may be detected periodically. For example, with a detection period of 1 second, each time 1 second of audio signal has been recorded, the audio signal in that second is detected. Alternatively, the audio signal within specified time periods may be detected, for example the audio signal from 2 seconds to 5 seconds and then the audio signal from 10 seconds to 12 seconds. In either case, whether the recorded audio signal is a faulty recording can be detected during the oral examination, and a prompt is issued when a faulty recording is detected.
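The two scheduling options described above, periodic detection and detection over specified time periods, can be sketched in Python as follows. This is a minimal illustration, not the patented implementation: the function names `one_second_chunks` and `time_range`, and the representation of the recording as a list of samples, are assumptions made for the example.

```python
def one_second_chunks(samples, sample_rate, period_s=1.0):
    """Yield (n, chunk) for each complete detection period, n starting at 1."""
    step = int(period_s * sample_rate)
    starts = range(0, len(samples) - step + 1, step)
    for n, start in enumerate(starts, start=1):
        yield n, samples[start:start + step]


def time_range(samples, sample_rate, start_s, end_s):
    """Return the samples between start_s and end_s (in seconds)."""
    return samples[int(start_s * sample_rate):int(end_s * sample_rate)]
```

A trailing partial period is ignored by `one_second_chunks`; a real client would presumably wait until that period has been fully recorded before checking it.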

Fig. 2 shows the specific flowchart in which, each time 1 second of audio signal has been recorded, the audio signal in that second is detected. The processing is as follows:

Step 201: using a rectangular window 0.02 seconds long, with an overlap of 0.01 seconds between adjacent windows, divide the audio signal recorded in the N-th second into 99 segments, each segment serving as one frame, where N is a natural number;
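The frame count in step 201 follows from the window arithmetic: a 0.02-second window advanced by 0.01 seconds fits (1.00 - 0.02) / 0.01 + 1 = 99 times into one second. A minimal Python sketch of this framing is given below; the function name `split_into_frames` and the 16 kHz sample rate used in the usage note are assumptions, since the text does not state a sample rate.

```python
def split_into_frames(samples, sample_rate, window_s=0.02, hop_s=0.01):
    """Split samples into overlapping rectangular-window frames.

    A rectangular window applies no weighting, so each frame is a plain slice.
    """
    window = round(window_s * sample_rate)  # samples per frame
    hop = round(hop_s * sample_rate)        # advance between frames
    frames = []
    start = 0
    while start + window <= len(samples):
        frames.append(samples[start:start + window])
        start += hop
    return frames
```

At an assumed 16 kHz, one second is 16000 samples, each frame holds 320 samples, and `len(split_into_frames([0.0] * 16000, 16000))` is 99.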

Step 202: obtain the fundamental frequency information of each frame;

Step 203: judge whether fundamental frequency information can be obtained from more than 20 frames in this second. If so, determine that a voice signal exists in the audio signal in this second and go to step 204; if not, determine that no voice signal exists in the audio signal in this second and go to step 216. Fundamental frequency information can be obtained from vowels and some consonants.
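The decision in step 203 can be sketched in Python as follows. The text does not say how the fundamental frequency is extracted, so the autocorrelation estimator `estimate_pitch` below is a swapped-in, commonly used technique, and its search range (80 to 400 Hz) and correlation threshold are assumptions; only the "more than 20 frames" rule in `has_voice` comes from the text.

```python
def estimate_pitch(frame, sample_rate, fmin=80.0, fmax=400.0, min_corr=0.5):
    """Return an F0 estimate in Hz, or None if the frame looks unvoiced.

    Simple normalized autocorrelation peak picking (an assumed method).
    """
    n = len(frame)
    mean = sum(frame) / n
    x = [s - mean for s in frame]
    energy = sum(v * v for v in x)
    if energy == 0.0:
        return None  # silent frame, no pitch
    lag_min = max(1, int(sample_rate / fmax))
    lag_max = min(n - 1, int(sample_rate / fmin))
    best_lag, best_corr = None, 0.0
    for lag in range(lag_min, lag_max + 1):
        corr = sum(x[i] * x[i + lag] for i in range(n - lag)) / energy
        if corr > best_corr:
            best_lag, best_corr = lag, corr
    if best_lag is None or best_corr < min_corr:
        return None
    return sample_rate / best_lag


def has_voice(frames, sample_rate, min_voiced_frames=20):
    """True if more than min_voiced_frames frames yield a pitch estimate."""
    voiced = sum(1 for f in frames if estimate_pitch(f, sample_rate) is not None)
    return voiced > min_voiced_frames
```

A production system would more likely use an established pitch tracker; the sketch only shows where the frame-count rule sits.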

Step 204: obtain the amplitude information and signal-to-noise ratio information of the audio signal in this second;

Step 205: according to the amplitude information and signal-to-noise ratio information obtained in step 204, determine the volume value of the voice signal in the audio signal in this second;
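The text does not give a formula for deriving the volume value from the amplitude and the signal-to-noise ratio. One plausible reading, shown here purely as an assumption, is to remove the noise share from the measured power: if the total power is P = S + N and SNR = S / N, then the voice power is S = P * SNR / (1 + SNR). The function name `speech_volume_db` and the use of samples normalized to [-1, 1] are hypothetical.

```python
import math


def speech_volume_db(samples, snr_db):
    """Estimate voice-signal level in dB relative to full scale.

    Assumed model: total power = voice power + noise power,
    with snr_db = 10 * log10(voice power / noise power).
    """
    power = sum(s * s for s in samples) / len(samples)
    if power == 0.0:
        return float("-inf")
    snr = 10.0 ** (snr_db / 10.0)
    speech_power = power * snr / (1.0 + snr)
    return 10.0 * math.log10(speech_power)
```

With a high signal-to-noise ratio the result approaches the plain level of the recording; with a low one, the noise contribution is discounted so that a noisy but quiet speaker is not mistaken for a loud one.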

Step 206: judge whether the determined volume value lies between the first specified threshold and the second specified threshold. If not, determine that the volume of the voice signal is abnormal and go to step 207; if so, determine that the volume of the voice signal is normal and go to step 216;

Step 207: judge whether the determined volume value is less than the first specified threshold. If so, determine that the volume of the voice signal is below the normal level and go to step 208; if not, determine that the volume of the voice signal is above the normal level and go to step 211;

Step 208: judge whether the ratio of the intensity of the airflow component in the audio signal in this second to the intensity of the voice signal is less than the third specified threshold. If so, go to step 209; if not, go to step 210;

If the signal-to-noise ratio of the audio signal in this second is within a predefined range, the recording gain can be increased.

The means and variances of the MFCC information of different speakers' voices, of pure background noise and of various airflow components are collected in advance, and corresponding Gaussian mixture models are built from them. Based on these models, the MFCC information of the recorded audio signal is classified, and the ratio of the intensity of the airflow component to the intensity of the voice signal is determined.
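The classification described above can be sketched in pure Python as follows. This is a simplified illustration, not the method as actually implemented: it assumes diagonal-covariance Gaussian mixture models that have already been trained, per-frame feature vectors (MFCCs in the text) and per-frame energies supplied by the caller, and class labels `"voice"`, `"noise"` and `"airflow"`. All function names are hypothetical, and summing frame energies per class is one assumed way of obtaining the two intensities.

```python
import math


def gmm_log_likelihood(x, components):
    """Log-likelihood of vector x under a diagonal-covariance GMM.

    components: list of (weight, means, variances) tuples.
    """
    logs = []
    for weight, means, variances in components:
        ll = math.log(weight)
        for xi, m, v in zip(x, means, variances):
            ll += -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        logs.append(ll)
    top = max(logs)  # log-sum-exp for numerical stability
    return top + math.log(sum(math.exp(l - top) for l in logs))


def classify_frame(x, models):
    """Return the label whose model gives x the highest likelihood."""
    return max(models, key=lambda label: gmm_log_likelihood(x, models[label]))


def airflow_to_voice_ratio(features, energies, models):
    """Ratio of summed airflow-frame energy to summed voice-frame energy."""
    totals = {label: 0.0 for label in models}
    for x, e in zip(features, energies):
        totals[classify_frame(x, models)] += e
    if totals.get("voice", 0.0) == 0.0:
        return None  # no voice frames, ratio undefined
    return totals.get("airflow", 0.0) / totals["voice"]
```

The resulting ratio is the quantity compared against the third and fourth thresholds in steps 208 and 211.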

Step 209: issue a prompt to reduce the distance to the recording input device;

The recording input device is typically a microphone. If the ratio determined in step 208 is less than the third specified threshold, the examinee's mouth is considered to be too far from the microphone, and a prompt is issued asking the examinee to reduce the distance between the mouth and the microphone.

Step 210: issue a prompt to increase the speaking volume;

If the ratio determined in step 208 is not less than the third specified threshold, the distance between the examinee's mouth and the microphone is considered suitable but the examinee's speaking volume too low, and a prompt is issued asking the examinee to speak louder.

Step 211: judge whether the ratio of the intensity of the airflow component in the audio signal in this second to the intensity of the voice signal is greater than the fourth specified threshold. If so, go to step 212; if not, go to step 213;

Likewise, if the signal-to-noise ratio of the audio signal in this second is within a predefined range, the recording gain can be increased.

Step 212: issue a prompt to increase the distance to the recording input device;

If the ratio is greater than the fourth specified threshold, the examinee's mouth is considered to be too close to the microphone, and a prompt is issued asking the examinee to increase the distance between the mouth and the microphone.

Step 213: issue a prompt to reduce the speaking volume;

If the ratio is not greater than the fourth specified threshold, the distance between the examinee's mouth and the microphone is considered suitable but the examinee's speaking volume too high, and a prompt is issued asking the examinee to speak more quietly.

Step 214: obtain the energy information, fundamental frequency information, pulse energy information, energy fluctuation information and overflow energy information of the audio signal recorded within the specified time length;

A faulty recording may also be caused by a failure of the recording equipment or the computer system. Such faulty recordings include: a recording whose waveform is essentially a straight line, that is, one in which no voice signal was recorded; a recording whose waveform is impact noise; and a recording whose waveform is dense overflow noise.
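The three fault types listed above could be screened with simple energy-based checks, as in the Python sketch below. The text treats the actual judgment as common knowledge and gives no rules, so everything here is an assumption made for illustration: the function name `detect_recording_fault`, samples normalized to [-1, 1], the reading of "overflow" as clipping, and every threshold value.

```python
def detect_recording_fault(samples, frame_len=160, flat_energy=1e-6,
                           clip_level=0.99, clip_fraction=0.05,
                           pulse_factor=50.0):
    """Return "flat", "overflow", "impulse" or None (all thresholds assumed)."""
    n = len(samples)
    if n == 0:
        return "flat"
    # Energy information: an essentially straight-line waveform has ~zero energy.
    mean_energy = sum(s * s for s in samples) / n
    if mean_energy < flat_energy:
        return "flat"
    # Overflow energy: a large share of samples pinned at full scale.
    clipped = sum(1 for s in samples if abs(s) >= clip_level)
    if clipped / n > clip_fraction:
        return "overflow"
    # Pulse energy / energy fluctuation: one frame far above the typical frame.
    frame_energies = [
        sum(s * s for s in samples[i:i + frame_len]) / frame_len
        for i in range(0, n - frame_len + 1, frame_len)
    ]
    if frame_energies:
        median = sorted(frame_energies)[len(frame_energies) // 2]
        if max(frame_energies) > pulse_factor * max(median, flat_energy):
            return "impulse"
    return None
```

The checks run in a fixed order, so a recording that is both clipped and impulsive is reported as "overflow"; a real system would need thresholds tuned on actual faulty recordings.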

For example, if an examinee must answer 10 questions in an oral examination and the specified time length for each question is 5 minutes, then each time the examinee finishes answering a question, the audio signal corresponding to that question is further checked to determine whether it is a faulty recording caused by the recording equipment or the computer system.

Step 215: if, according to the information obtained in step 214, the audio signal within the specified time length is determined to be abnormal, send the audio signal within the specified time length, together with information indicating that this audio signal is abnormal, to the monitoring server.

After receiving this information through the monitoring server, the proctor confirms whether the received audio signal is really abnormal. If so, the proctor can deal with it promptly, or let the examinee complete the examination with standby recording equipment.

Step 216: set N to N+1 and return to step 201.

In the embodiments of the present invention, judging whether a voice signal exists in the audio signal according to fundamental frequency information, and judging whether the audio signal is normal according to energy information, fundamental frequency information, pulse energy information, energy fluctuation information and overflow energy information, are both common technical means for those skilled in the art, so the specific judgment processes are not repeated here.

As the above processing shows, in the technical solution of the present invention, during a computer-assisted oral examination the auxiliary oral examination client, on determining that a voice signal exists in the recorded audio signal of the examinee, obtains amplitude information and signal-to-noise ratio information of the audio signal and judges according to them whether the volume of the voice signal is normal. When the judgment result is negative, a prompt indicating that the volume of the voice signal is abnormal is issued. This avoids the situation in which a faulty recording is discovered only during scoring after the examination, making the examinee's score difficult to determine. The examinee's recording is checked during the examination itself and detected faulty recordings are flagged, which effectively improves the reliability and flexibility of computer-assisted oral examination.

Correspondingly, the present invention also provides an audio signal detection device which, as shown in Fig. 3, comprises a recording unit 301, a determining unit 302, a first acquiring unit 303, a first judging unit 304 and a prompting unit 305.

The recording unit 301 is configured to record an audio signal;

The determining unit 302 is configured to determine whether a voice signal exists in the audio signal recorded by the recording unit 301;

The first acquiring unit 303 is configured to obtain amplitude information and signal-to-noise ratio information of the audio signal when the determining unit 302 determines that a voice signal exists in the audio signal;

The first judging unit 304 is configured to judge, according to the amplitude information and signal-to-noise ratio information obtained by the first acquiring unit 303, whether the volume of the voice signal in the audio signal is normal;

The prompting unit 305 is configured to issue a prompt indicating that the volume of the voice signal is abnormal when the judgment result of the first judging unit 304 is negative.

The determining unit 302 specifically comprises an acquiring subunit and a first determining subunit. The acquiring subunit is configured to obtain first characteristic information of the audio signal recorded by the recording unit 301; the first determining subunit is configured to determine, according to the first characteristic information obtained by the acquiring subunit, whether a voice signal exists in the audio signal recorded by the recording unit 301.

The first judging unit 304 specifically comprises a second determining subunit, a first judging subunit and a third determining subunit. The second determining subunit is configured to determine the volume value of the voice signal in the audio signal according to the amplitude information and signal-to-noise ratio information obtained by the first acquiring unit 303. The first judging subunit is configured to judge whether the volume value determined by the second determining subunit lies between a first specified threshold and a second specified threshold, where the first specified threshold is less than the second specified threshold. The third determining subunit is configured to determine that the volume of the voice signal is normal when the judgment result of the first judging subunit is affirmative, and that the volume of the voice signal is abnormal when the judgment result is negative.

The third determining subunit determines that the volume of the voice signal is abnormal in one of two cases. First case: if the first judging subunit judges that the volume value of the voice signal is less than the first specified threshold, the third determining subunit determines that the volume of the voice signal is below the normal level. Second case: if the first judging subunit judges that the volume value of the voice signal is greater than the second specified threshold, the third determining subunit determines that the volume of the voice signal is above the normal level.

For the first case, in which the third determining subunit determines that the volume of the voice signal is below the normal level, the prompting unit 305 specifically comprises a second judging subunit and a first prompting subunit. The second judging subunit is configured to judge whether the ratio of the intensity of the airflow component in the audio signal to the intensity of the voice signal is less than a third specified threshold. The first prompting subunit is configured to issue a prompt to reduce the distance to the recording input device when the second judging subunit judges that the ratio is less than the third specified threshold, and to issue a prompt to increase the speaking volume when the second judging subunit judges that the ratio is not less than the third specified threshold.

For the second case, in which the third determining subunit determines that the volume of the voice signal is above the normal level, the prompting unit 305 specifically comprises a third judging subunit and a second prompting subunit. The third judging subunit is configured to judge whether the ratio of the intensity of the airflow component in the audio signal to the intensity of the voice signal is greater than a fourth specified threshold. The second prompting subunit is configured to issue a prompt to increase the distance to the recording input device when the third judging subunit judges that the ratio is greater than the fourth specified threshold, and to issue a prompt to reduce the speaking volume when the third judging subunit judges that the ratio is not greater than the fourth specified threshold.

Further, the audio signal detection device in the embodiments of the present invention may also comprise a second acquiring unit, a second judging unit and a sending unit. The second acquiring unit is configured to obtain second characteristic information of the audio signal recorded within a specified time length. The second judging unit is configured to judge, according to the second characteristic information obtained by the second acquiring unit, whether the audio signal within the specified time length is normal. The sending unit is configured to send the audio signal within the specified time length, together with information indicating that this audio signal is abnormal, to the monitoring server when the judgment result of the second judging unit is negative.

The first characteristic information may be, but is not limited to, fundamental frequency information and/or Mel-frequency cepstral coefficient information; the second characteristic information is at least one of energy information, fundamental frequency information, pulse energy information, energy fluctuation information and overflow energy information.

The embodiments of the present invention also provide an auxiliary oral examination system, comprising an auxiliary oral examination client and a monitoring server. The auxiliary oral examination client is configured to record an audio signal; when it is determined that a voice signal exists in the audio signal, to obtain amplitude information and signal-to-noise ratio information of the audio signal; to judge, according to the amplitude information and signal-to-noise ratio information, whether the volume of the voice signal in the audio signal is normal; and, when the judgment result is negative, to issue a prompt indicating that the volume of the voice signal is abnormal. The client is further configured to obtain second characteristic information of the audio signal recorded within a specified time length; to judge, according to the obtained second characteristic information, whether the audio signal within the specified time length is normal; and, when the judgment result is negative, to send the audio signal within the specified time length, together with information indicating that this audio signal is abnormal, to the monitoring server;

The monitoring server is configured to display the audio signal within the specified time length sent by the auxiliary oral examination client and the information indicating that this audio signal is abnormal.

Obviously, those skilled in the art can make various changes and modifications to the present invention without departing from its spirit and scope. Thus, if these changes and modifications fall within the scope of the claims of the present invention and their technical equivalents, the present invention is also intended to include them.

Claims

1. An audio signal detection method, characterized by comprising:

recording an audio signal;

when it is determined that a voice signal exists in the audio signal, obtaining amplitude information and signal-to-noise ratio information of the audio signal; and

when the volume of the voice signal in the audio signal is judged to be abnormal according to the obtained amplitude information and signal-to-noise ratio information, issuing a prompt indicating that the volume of the voice signal is abnormal;

obtaining second characteristic information of the audio signal recorded within a specified time length; and

when the audio signal within the specified time length is judged to be abnormal according to the obtained second characteristic information, sending the audio signal within the specified time length, together with information indicating that this audio signal is abnormal, to a monitoring server.

2. The detection method as claimed in claim 1, characterized in that determining that a voice signal exists in the audio signal specifically comprises:

obtaining first characteristic information of the audio signal; and

determining, according to the first characteristic information, that a voice signal exists in the audio signal.

3. The detection method as claimed in claim 1, characterized in that judging, according to the obtained amplitude information and signal-to-noise ratio information, that the volume of the voice signal in the audio signal is abnormal specifically comprises:

determining, according to the obtained amplitude information and signal-to-noise ratio information, the volume value of the voice signal in the audio signal;

if the determined volume value is judged to be less than a first defined threshold, determining that the volume is abnormal and lower than the normal volume; and

if the determined volume value is judged to be greater than a second defined threshold, determining that the volume is abnormal and higher than the normal volume;

wherein the first defined threshold is less than the second defined threshold.
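The two-threshold judgment of claim 3 can be illustrated with a short sketch. It assumes that a scalar volume value has already been derived from the amplitude and signal-to-noise ratio information (the claim does not state how), and the names `VolumeState` and `classify_volume` are hypothetical.

```python
from enum import Enum


class VolumeState(Enum):
    NORMAL = "normal"
    TOO_LOW = "too_low"    # abnormal, lower than the normal volume
    TOO_HIGH = "too_high"  # abnormal, higher than the normal volume


def classify_volume(volume: float, first_threshold: float, second_threshold: float) -> VolumeState:
    """Classify a volume value against the first and second defined thresholds.

    The volume value is taken as given; how it is computed from amplitude
    and signal-to-noise ratio information is outside this sketch.
    """
    if first_threshold >= second_threshold:
        raise ValueError("first threshold must be less than second threshold")
    if volume < first_threshold:
        return VolumeState.TOO_LOW
    if volume > second_threshold:
        return VolumeState.TOO_HIGH
    return VolumeState.NORMAL
```

Values equal to either threshold are treated as normal here, which follows the strict "less than" and "greater than" wording of the claim.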

4. The detection method as claimed in claim 3, characterized in that, when the volume is determined to be lower than the normal volume, issuing a prompt regarding the abnormal voice signal volume specifically comprises:

judging whether the ratio of the intensity level of the airflow component in the audio signal to the intensity level of the voice signal is less than a third defined threshold; and

when the ratio is judged to be less than the third defined threshold, issuing a prompt to reduce the distance to the recording input device;

when the ratio is judged to be not less than the third defined threshold, issuing a prompt to increase the speaking volume.

5. The detection method as claimed in claim 3, characterized in that, when the volume is determined to be higher than the normal volume, issuing a prompt regarding the abnormal voice signal volume specifically comprises:

judging whether the ratio of the intensity level of the airflow component in the audio signal to the intensity level of the voice signal is greater than a fourth defined threshold; and

when the ratio is judged to be greater than the fourth defined threshold, issuing a prompt to increase the distance to the recording input device;

when the ratio is judged to be not greater than the fourth defined threshold, issuing a prompt to reduce the speaking volume.
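The prompt selection of claims 4 and 5 can be sketched as a single function. This is a minimal illustration only: the function name, the string labels for the volume state, and the prompt wording are hypothetical, and the airflow-to-voice intensity ratio is assumed to have been measured elsewhere.

```python
def choose_prompt(
    volume_state: str,
    airflow_to_voice_ratio: float,
    third_threshold: float,
    fourth_threshold: float,
) -> str:
    """Pick the prompt for an abnormal volume.

    volume_state is "too_low" or "too_high" (hypothetical labels).
    airflow_to_voice_ratio is the intensity level of the airflow
    component divided by the intensity level of the voice signal.
    """
    if volume_state == "too_low":
        # Little airflow relative to voice: the speaker is likely too far away.
        if airflow_to_voice_ratio < third_threshold:
            return "Move closer to the recording input device."
        return "Speak more loudly."
    if volume_state == "too_high":
        # Much airflow relative to voice: the speaker is likely too close.
        if airflow_to_voice_ratio > fourth_threshold:
            return "Move further from the recording input device."
        return "Speak more quietly."
    raise ValueError("volume_state must be 'too_low' or 'too_high'")
```

The comments about distance are an interpretation of why the claims use the airflow ratio; the claims themselves state only the comparisons and the resulting prompts.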

6. The detection method as claimed in claim 2, characterized in that the first characteristic information is fundamental frequency information and/or Mel cepstral coefficient information.

7. The detection method as claimed in claim 1, characterized in that the second characteristic information is at least one of energy information, fundamental frequency information, pulse energy information, energy fluctuation information, and overflow energy information.
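Two of the second-characteristic features named in claim 7 can be illustrated with a rough sketch. The definitions below are assumptions, not taken from the claims: energy is computed as the mean squared sample value, and overflow is approximated as the fraction of samples at or beyond full scale (clipping). The function names and thresholds are hypothetical.

```python
from typing import Sequence


def mean_energy(samples: Sequence[float]) -> float:
    """Mean squared amplitude of the window (assumed definition of energy)."""
    if not samples:
        raise ValueError("samples must not be empty")
    return sum(s * s for s in samples) / len(samples)


def overflow_fraction(samples: Sequence[float], full_scale: float = 1.0) -> float:
    """Fraction of samples at or beyond full scale (assumed proxy for overflow)."""
    if not samples:
        raise ValueError("samples must not be empty")
    clipped = sum(1 for s in samples if abs(s) >= full_scale)
    return clipped / len(samples)


def window_is_abnormal(samples: Sequence[float], min_energy: float, max_overflow: float) -> bool:
    """Flag a window that is nearly silent or heavily clipped.

    min_energy and max_overflow are hypothetical thresholds chosen by
    the caller; samples are assumed to be normalized to [-1.0, 1.0].
    """
    return mean_energy(samples) < min_energy or overflow_fraction(samples) > max_overflow
```

A window flagged by `window_is_abnormal` would correspond to the case in claim 1 where the audio within the stipulated time length is judged abnormal and sent to the monitoring server.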

8. An audio signal detection device, characterized in that it comprises:

a recording unit, used for recording an audio signal;

a determining unit, used for determining whether a voice signal exists in the audio signal recorded by the recording unit;

a first acquiring unit, used for obtaining amplitude information and signal-to-noise ratio information of the audio signal when the determining unit determines that a voice signal exists in the audio signal;

a first judging unit, used for judging, according to the amplitude information and signal-to-noise ratio information obtained by the first acquiring unit, whether the volume of the voice signal in the audio signal is normal;

a prompting unit, used for issuing a prompt regarding the abnormal voice signal volume when the judgment result of the first judging unit is no;

a second acquiring unit, used for obtaining second characteristic information of the audio signal recorded within a stipulated time length;

a second judging unit, used for judging, according to the second characteristic information obtained by the second acquiring unit, whether the audio signal within the stipulated time length is normal;

a sending unit, used for sending to a monitoring server, when the judgment result of the second judging unit is no, the audio signal within the stipulated time length together with information indicating that this audio signal is abnormal.

9. The detection device as claimed in claim 8, characterized in that the determining unit specifically comprises:

an acquiring subunit, used for obtaining first characteristic information of the audio signal recorded by the recording unit;

a first determining subunit, used for determining, according to the first characteristic information obtained by the acquiring subunit, whether a voice signal exists in the audio signal recorded by the recording unit.

10. The detection device as claimed in claim 8, characterized in that the first judging unit specifically comprises:

a second determining subunit, used for determining, according to the amplitude information and signal-to-noise ratio information obtained by the first acquiring unit, the volume value of the voice signal in the audio signal;

a first judging subunit, used for judging whether the volume value determined by the second determining subunit lies between a first defined threshold and a second defined threshold, wherein the first defined threshold is less than the second defined threshold;

a third determining subunit, used for determining that the volume is normal when the judgment result of the first judging subunit is yes; determining that the volume is abnormal and lower than the normal volume when the first judging subunit judges that the volume value is less than the first defined threshold; and determining that the volume is abnormal and higher than the normal volume when the first judging subunit judges that the volume value is greater than the second defined threshold.

11. The detection device as claimed in claim 10, characterized in that, when the third determining subunit determines that the volume is lower than the normal volume, the prompting unit specifically comprises:

a second judging subunit, used for judging whether the ratio of the intensity level of the airflow component in the audio signal to the intensity level of the voice signal is less than a third defined threshold;

a first prompting subunit, used for issuing a prompt to reduce the distance to the recording input device when the second judging subunit judges that the ratio is less than the third defined threshold, and

issuing a prompt to increase the speaking volume when the second judging subunit judges that the ratio is not less than the third defined threshold.

12. The detection device as claimed in claim 10, characterized in that, when the third determining subunit determines that the volume is higher than the normal volume, the prompting unit specifically comprises:

a third judging subunit, used for judging whether the ratio of the intensity level of the airflow component in the audio signal to the intensity level of the voice signal is greater than a fourth defined threshold;

a second prompting subunit, used for issuing a prompt to increase the distance to the recording input device when the third judging subunit judges that the ratio is greater than the fourth defined threshold, and

issuing a prompt to reduce the speaking volume when the third judging subunit judges that the ratio is not greater than the fourth defined threshold.

13. An auxiliary oral language examination system comprising an auxiliary oral language examination client and a monitoring server, characterized in that:

the auxiliary oral language examination client is used for recording an audio signal; when it is determined that a voice signal exists in the audio signal, obtaining amplitude information and signal-to-noise ratio information of the audio signal; and, when the volume of the voice signal in the audio signal is judged to be abnormal according to the amplitude information and signal-to-noise ratio information, issuing a prompt regarding the abnormal voice signal volume; and

is further used for obtaining second characteristic information of the audio signal recorded within a stipulated time length, and, when the audio signal within the stipulated time length is judged to be abnormal according to the obtained second characteristic information, sending to the monitoring server the audio signal within the stipulated time length together with information indicating that this audio signal is abnormal;