AnyTranscription upgrades its speech recognition technology to offer improved speech recognition ser

Published: 06th February 2017

AnyTranscription has comprehensively upgraded its speech recognition technology, massively expanded its speech and text corpus resources to meet the demands of the transcription industry, and built its decoder on a weighted finite-state transducer (WFST) decoding network to keep decoding fast. These upgrades allow AnyTranscription to provide more accurate transcription services in less time.
Modern technology makes it possible to collect text and speech corpora through a variety of methods. These corpora provide abundant resources for training the language models and acoustic models used in speech recognition, making it possible to construct large-scale general-purpose language and acoustic models. In speech recognition, how closely the training data matches the target domain, and how rich that data is, are among the most important factors in system performance; however, labeling and analyzing a corpus takes years of accumulated effort.
With over ten years' experience in phonetic transcription services, AnyTranscription has accumulated a large text and speech corpus, and can construct large-scale general-purpose language and acoustic models that focus on the transcription industry. At the same time, AnyTranscription has adopted a WFST decoding network, which integrates the language model, the pronunciation lexicon and the acoustic model into a single large decoding network, so that acoustic and textual knowledge are searched together. This greatly improves the speed of decoding.
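The idea of searching a lexicon and a language model together can be illustrated with a toy example. The sketch below is not a WFST implementation and is not AnyTranscription's decoder; it is a minimal dynamic-programming search, assuming a hypothetical three-entry lexicon (`LEXICON`) and made-up bigram costs (`BIGRAM_COST`, `DEFAULT_COST`), that shows how combining both knowledge sources in one search resolves a homophone such as "two"/"too".

```python
# Hypothetical toy lexicon: phoneme sequence -> candidate words.
# Homophones ("two", "too") share one pronunciation.
LEXICON = {
    ("t", "uw"): ["two", "too"],
    ("k", "ae", "t", "s"): ["cats"],
    ("m", "ah", "ch"): ["much"],
}

# Hypothetical bigram costs (lower is more likely); values are made up.
BIGRAM_COST = {
    ("<s>", "two"): 1.0,
    ("<s>", "too"): 2.0,
    ("two", "cats"): 0.5,
    ("too", "cats"): 4.0,
    ("too", "much"): 0.5,
    ("two", "much"): 4.0,
}
DEFAULT_COST = 5.0  # assumed penalty for word pairs not listed above


def decode(phonemes):
    """Return (cost, words) for the cheapest word sequence that covers
    the phoneme list exactly, or None if no sequence covers it."""
    n = len(phonemes)
    # best[i] maps the last word of a hypothesis ending at position i
    # to the cheapest (cost, word list) found so far.
    best = [dict() for _ in range(n + 1)]
    best[0]["<s>"] = (0.0, [])
    for i in range(n):
        for prev, (cost, words) in best[i].items():
            for pron, candidates in LEXICON.items():
                j = i + len(pron)
                if tuple(phonemes[i:j]) != pron:
                    continue  # this pronunciation does not match here
                for word in candidates:
                    new_cost = cost + BIGRAM_COST.get((prev, word), DEFAULT_COST)
                    if word not in best[j] or new_cost < best[j][word][0]:
                        best[j][word] = (new_cost, words + [word])
    if not best[n]:
        return None
    return min(best[n].values(), key=lambda entry: entry[0])


print(decode(["t", "uw", "k", "ae", "t", "s"]))
print(decode(["t", "uw", "m", "ah", "ch"]))
```

With these made-up costs, the same phonemes "t uw" come out as "two" before "cats" and as "too" before "much". A real WFST decoder reaches the same kind of result by composing the component models into one network ahead of time, which is where the speed advantage described above comes from.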
However, current speech recognition technology still has the following problems:
1. Recognition and understanding of natural language. Continuous speech must first be segmented into units such as words and phonemes, and rules must then be established to interpret its meaning.
2. Variability of voice information. Speech patterns vary not only between speakers but also for a single speaker. For example, a person sounds different when speaking casually than when speaking formally.
3. The ambiguity of speech. Different words may sound similar or identical when spoken.
4. The influence of context. The phonetic characteristics of a single letter or word are affected by the surrounding context, which changes stress, tone, volume, and pronunciation.
5. Environmental noise and interference have a serious impact on speech recognition, resulting in low recognition rates.
These problems have plagued customers who need accurate transcripts, and they are the reason AnyTranscription has always kept human reviewers in its process. AnyTranscription's manual review covers two tasks:
1. Conducting a final transcription and review after speech recognition to verify the accuracy of the transcript.
2. Composing and writing the transcript in a format specified by the customer.
AnyTranscription focuses on improving customer satisfaction and continually building on its strengths. With the dual safeguards of speech recognition technology and manual review, customers receive transcripts with over 98% accuracy, in a very short time, and presented in the format they require.
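The article does not say how the 98% accuracy figure is measured. One common convention, assumed here purely for illustration, is word accuracy defined as 1 minus the word error rate (WER), where WER is the word-level edit distance between a reference transcript and the produced transcript divided by the number of reference words. The function names and the sample sentences below are hypothetical.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance (substitutions, insertions, deletions)
    divided by the number of words in the reference."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must not be empty")
    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            curr.append(min(
                prev[j] + 1,                           # deletion
                curr[j - 1] + 1,                       # insertion
                prev[j - 1] + (ref_word != hyp_word),  # substitution or match
            ))
        prev = curr
    return prev[-1] / len(ref)


def word_accuracy(reference, hypothesis):
    """Assumed definition: accuracy = 1 - WER."""
    return 1.0 - word_error_rate(reference, hypothesis)


reference = "the quick brown fox jumps"
hypothesis = "the quick brown box jumps"
print(f"WER: {word_error_rate(reference, hypothesis):.2f}")
print(f"Accuracy: {word_accuracy(reference, hypothesis):.2f}")
```

Under this definition, a 100-word reference transcript with two word errors would score 98% accuracy.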


