How AI-powered Voice Transcription Is Growing In Today’s World

by | Published on Feb 25, 2020 | Audio Transcription

Many organizations rely on professional transcription services to turn their seminars, conferences or meetings into error-free transcripts. A recent development in the transcription sector is the introduction of voice recognition software, a computer program that can decode the human voice and convert spoken words into text. Today, many highly specialized business areas use voice recognition software to capture accurate customer data that helps them make the right decisions.

Voice recognition software works at high speed and can quickly transcribe any audio or video recording, which helps businesses improve productivity. It also supports other tasks, such as dictating an email to a new customer or dictating a customer’s contact details for data entry. Some voice recognition systems also offer a cloud component that makes them more flexible and reliable for any business setup. Voice recognition software can also help the legal department of a business organization. Overall, this technology helps save time and improve efficiency.

Otter is an AI-powered voice transcription app that is also useful for note takers. The app has received a strategic investment from Japan’s leading mobile operator, NTT DOCOMO Inc., and DOCOMO and Otter are teaming up to expand Otter into the Japanese market. DOCOMO will integrate Otter with its AI-based translation service subsidiary, Mirai Translation, to provide accurate English transcripts that are then translated into Japanese. DOCOMO invested approximately $10 million, and Otter has raised $23 million in funding from NTT DOCOMO Ventures, Fusion Fund, GGV Capital, Draper Dragon Fund, Duke University Innovation Fund, Harris Barton Asset Management, Slow Ventures, Horizons Ventures and others.

Otter’s service was launched in 2018, and it helps users search voice conversations as easily as they search their email or texts today. Otter CEO and founder Sam Liang, along with a team from Google, Facebook, Nuance, Stanford, Duke, MIT and Cambridge, developed a technology specifically designed to capture conversations such as meetings, interviews, presentations and lectures. This technology is different from that of voice assistants such as Google Assistant, Siri and Alexa.

Since its launch, Otter has expanded its product to millions of users and now offers both an Otter for Teams tier and an enterprise tier. The objective of this collaboration is to bring the Otter enterprise collaboration service to the Japanese market. Otter has similar partnerships with US businesses, including Zoom Video Communications and Dropbox. As a result of the new partnership, Otter’s Voice Meeting Notes application is being used on a trial basis in Berlitz Corporation’s English language classes in Japan. Students are also using Otter to transcribe their classes and review their lessons.

DOCOMO featured Otter in demonstrations at DOCOMO Open House 2020, held at the Tokyo Big Sight exhibition complex on January 23 and 24, 2020. There, Otter transcribed an English language presentation in real time, and the transcript was then translated into Japanese using DOCOMO’s machine translation. Otter is hiring new engineers to improve its AI technologies in speech recognition, diarization, speaker identification and automatic summarization. Speaker diarization is a feature that tells the software to identify the different speakers in a recording. It can detect when the speaker changes and label the different voices by number. Each word is tagged with the number assigned to its speaker, so all words spoken by the same speaker carry the same number.
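
The word-level speaker numbering described above can be illustrated with a short Python sketch. The input format (a list of word and speaker-number pairs) and the function name group_into_turns are assumptions made for illustration only; they do not represent Otter’s actual output or API.

```python
from itertools import groupby


def group_into_turns(tagged_words):
    """Merge consecutive words that share a speaker number into one turn.

    tagged_words is a list of (word, speaker_number) pairs, a hypothetical
    stand-in for the word-level tags a diarization step might produce.
    """
    turns = []
    # groupby only groups adjacent items, so a speaker who talks again
    # later in the recording starts a new turn.
    for speaker, group in groupby(tagged_words, key=lambda pair: pair[1]):
        text = " ".join(word for word, _ in group)
        turns.append((speaker, text))
    return turns


# Hypothetical diarized words: every word carries its speaker's number.
tagged_words = [
    ("Good", 1), ("morning", 1),
    ("Hello", 2), ("everyone", 2),
    ("Let's", 1), ("begin", 1),
]

for speaker, text in group_into_turns(tagged_words):
    print(f"Speaker {speaker}: {text}")
```

Because every word carries a speaker number, rebuilding a readable transcript only requires merging neighbouring words with the same number, which is why the same speaker can appear in several separate turns.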

Although this new software saves time and delivers quick voice transcription using AI, it still requires human assistance to ensure accuracy. AI speech recognition has come a long way, but it has a drawback: the software only learns after a mistake is made and corrected. The technology will take time to understand the nuances and slang often found in audio files, and this affects the accuracy of its transcripts. Human transcription is done by real people who listen to the audio and transcribe it into accurate notes. You can expect a much higher level of accuracy in transcripts created by a professional transcription company. Machine-made transcripts need to be edited and verified by an experienced transcriptionist. Advances in technology are always welcome because of the speed, efficiency and convenience they offer. However, human intervention may still be needed to complement these advancements.