Voice Recognition
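CLASSIFICATION
Such systems can be classified according to how much training they require, whether they handle continuous speech, and how large a vocabulary they support.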
Speaker-dependent systems that require a short period of training can capture continuous speech with a large vocabulary at a normal pace with an accuracy of about 98% (two words in one hundred wrong) when operated under optimal conditions. Other "limited vocabulary" systems require no training and can recognize a small number of words (for instance, the ten digits) from most speakers. Such systems are popular for routing incoming phone calls to their destinations in large organisations.

USE
Commercial speech recognition systems have been available off the shelf since the 1990s. Despite the apparent success of the technology, few people use such systems on their desktop computers. It appears that most computer users can create and edit documents and interact with their computer more quickly with conventional input devices, a keyboard and mouse, even though most people can speak considerably faster than they can type. Using keyboard and speech recognition together, however, can in some cases be more efficient than using either input alone.

A typical office environment, with a high level of background speech, is one of the most adverse environments for current speech recognition technologies, and large-vocabulary, speaker-independent systems designed to operate in such environments have significantly lower recognition accuracy. As of 2005, the typical achievable recognition rate for large-vocabulary speaker-independent systems is about 80% to 90% in a clear environment, but can be as low as 50% in scenarios such as a cellular phone call with background noise. Additionally, heavy use of the speech organs can result in vocal loading.

Speech recognition systems have found use where text input must be extremely fast.
They are used in legal and medical transcription and in generating subtitles for live sports and current affairs programs on television. In such cases the original speech is not recognized directly; instead, an operator re-speaks the dialog into software trained on the operator's voice. The operator also has special training: first, to speak clearly and consistently to maximize recognition accuracy; second, to indicate punctuation by various techniques; and often domain-specific training as well (especially in medical or legal contexts, where the operator needs to know specialized vocabulary and procedures). In courtrooms and similar situations where the operator's voice would disturb the proceedings, he or she may sit in a soundproofed booth or wear a Stenomask or similar device.

Speech recognition is sometimes a necessity for people who have difficulty interacting with their computers through a keyboard, for example those with serious carpal tunnel syndrome, impaired extremities, or other physical limitations.

Speech recognition technology is increasingly used for telephone applications such as travel booking and information, financial account information, customer service call routing, and directory assistance. Using constrained grammar recognition (described below), such applications can achieve remarkably high accuracy. Research and development in speech recognition technology has continued to grow as the cost of implementing such voice-activated systems has dropped and their usefulness and efficiency have improved. For example, recognition systems optimized for telephone applications can often report the confidence of a particular recognition; if the confidence is low, the application can prompt callers to confirm or repeat their request (for example, "I heard you say 'billing', is that right?").
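The confidence-based confirmation described above can be sketched in Python. This is a minimal illustration under stated assumptions, not any particular vendor's API: the `Recognition` type, the `next_prompt` function, and both threshold values are hypothetical, and a real system would tune its thresholds per application.

```python
from dataclasses import dataclass

# Hypothetical thresholds chosen only for illustration.
ACCEPT_THRESHOLD = 0.80   # at or above this, act on the result directly
CONFIRM_THRESHOLD = 0.40  # at or above this (but below accept), ask to confirm


@dataclass
class Recognition:
    """A recognized phrase plus the recognizer's confidence (assumed in [0, 1])."""
    text: str
    confidence: float


def next_prompt(result: Recognition) -> str:
    """Choose what the telephone application says next."""
    if result.confidence >= ACCEPT_THRESHOLD:
        # High confidence: proceed without bothering the caller.
        return f"Routing you to {result.text}."
    if result.confidence >= CONFIRM_THRESHOLD:
        # Medium confidence: ask the caller to confirm what was heard.
        return f"I heard you say '{result.text}', is that right?"
    # Low confidence: the result is too unreliable even to confirm.
    return "Sorry, I did not catch that. Please repeat your request."
```

With these assumed thresholds, a result of "billing" at confidence 0.55 produces the confirmation question, while the same word at 0.95 is routed immediately.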
Furthermore, speech recognition has enabled the automation of certain applications that cannot be automated with push-button interactive voice response (IVR) systems, such as directory assistance and systems that allow callers to "dial" by speaking names listed in an electronic phone book. Nevertheless, systems based on speech recognition remain the exception, because push-button systems are still much cheaper to implement and operate.

Speech recognition is also used for speech fluency evaluation and language instruction.

KEY PROBLEMS
SOLUTIONS
Modern general-purpose speech recognition systems are generally based on hidden Markov models (HMMs). An HMM is a statistical model that outputs a sequence of symbols or quantities. A model that gives the probability of an observed sequence of acoustic data given one or another word (or word sequence) makes it possible to work out the most likely word sequence by applying Bayes' rule.
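The Bayes' rule step can be written out as follows. The symbols are notation assumed here for illustration, not taken from the original text: W stands for a candidate word sequence, A for the observed acoustic data, and \hat{W} for the most likely word sequence.

```latex
\hat{W} = \arg\max_{W} P(W \mid A)
        = \arg\max_{W} \frac{P(A \mid W)\, P(W)}{P(A)}
        = \arg\max_{W} P(A \mid W)\, P(W)
```

The last step holds because P(A) does not depend on W, so it does not affect which candidate scores highest. P(A | W) is the quantity the acoustic model supplies, and P(W) is the prior probability of the word sequence.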