The Foibles of Speech Recognition
In this day and age, more and more of what we do is becoming automated. Banking is one example. You no longer have to go to a bank to deposit or transfer money; you can do that from an app or by logging on to the bank’s website. Some banks don’t even have physical branches. The human-interaction component of business is becoming more and more limited.
So, how is Siri doing on your iPhone? Would you happily replace your secretary with her?
Personally, I wouldn’t, because there are just too many ‘misses’ and ‘trouble spots’ that I wouldn’t want in my business.
The case is almost the same when you rely on software, rather than its ‘time-consuming’ human counterparts, to transcribe your audio files. Unfortunately, despite several attempts, science has not yet come up with a software solution that acts like Aladdin’s magic lamp. And from the look of it, the genie isn’t coming out any time soon. Why? The reasons are many.
The English language can be very tricky and hence very difficult to master, especially when the learner in question is transcription software. Homophones pose a problem that most software finds impossible to overcome. For instance, will it be sale or sail, no or know, fair or fare? The list continues. Unlike us humans, who are blessed with critical thinking skills, software cannot comprehend the difference. Plus, making these finer distinctions may be very difficult without context, which might not appear until further into the conversation.
The problem gets worse when the software needs to transcribe an interview or a dialogue involving many speakers. It is easy to guess why. Each of us has a unique style of speaking. This distinction becomes far more complex because our personal style of speaking is heavily shaped by our geographical location, our culture, and our upbringing, to name a few influences. It is impossible to ‘teach’ any software to recognize speech that accurately.
Audio quality is yet another issue, and a very important one. Any speech recognition and transcription software needs a clear piece of audio, and anyone in the transcription business knows that an impeccable audio file is a rare phenomenon.
Talking about the accuracy rate of a human transcriptionist versus a software-driven one, Xuedong Huang, a senior scientist at Microsoft says, “If you have people transcribe conversational speech over the telephone, the error rate is around 4 percent. If you put all the systems together—IBM and Google and Microsoft and all the best combined—amazingly the error rate will be around 8 percent.”
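To make the error rates in that quote concrete, here is a minimal sketch of how such a figure is commonly calculated. It assumes the metric is word error rate (the word-level edit distance between a reference transcript and the transcribed output, divided by the number of reference words); the quote does not name the metric, and the function name and sample sentences below are purely illustrative.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Assumption: this is the usual definition of "error rate" for
    transcripts; the quoted figures may have been measured differently.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from output
            insertion = curr[j - 1] + 1  # extra word in the output
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical example: two homophone mistakes in a ten-word sentence.
spoken = "the sale starts at noon and the fare is fair"
transcribed = "the sail starts at noon and the fair is fair"
print(f"{word_error_rate(spoken, transcribed):.0%}")  # prints 20%
```

On this measure, an 8 percent error rate means about 8 wrong words in every 100 spoken, against about 4 for a human transcriptionist.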
Now the real question is, would you settle for something that makes twice as many errors as a human? We know the answer. That is why we offer a transcription service that is among the best in the industry. Start uploading your files now!
Subtitling or captioning is an important aspect of an audio-visual experience. It helps your content reach a wider audience, especially people who are not familiar with the language or are hard of hearing. If you are an online marketer, transcripts and subtitles also have a very important role to play from an SEO point of view. Needless to say, a subtitle is only useful when it is accurate; otherwise all your effort, however noble your intention, is bound to fail.
Why is accuracy important?
Accurate transcription is often the first step toward impeccable subtitling, because the transcripts are used to create and synchronise the captions. According to the FCC, “Captions must match the spoken words in the dialogue, in their original language (English or Spanish), to the fullest extent possible.” The correctness of a transcript covers not only its grammar, spelling, and punctuation but also non-speech sounds, such as music playing, an off-screen whistle that matters to the plot, or people clapping. In fact, the FCC’s new caption quality standards make it essential to capture non-speech sounds.
Transcription service vs. automatic speech recognition
When you upload a video to YouTube, you can use its automatic captioning function to add subtitles to your video. While this might look like the best solution as far as cost is concerned, sadly, that is not the case. These kinds of programs come with questionable accuracy levels, which, apart from causing you embarrassment, might also be detrimental to the SEO strategy of your video. In fact, Google can even mark your website as ‘pure spam’, a term it uses to define a “site that appears to use aggressive spam techniques such as automatically generated gibberish” among many other things. Since reports say that a transcript is only 60 to 70 percent correct when using an automatic speech recognition technique, your captions will have roughly 1 in every 3 words wrong, and might appear as gibberish to Google.
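The arithmetic behind the ‘1 in every 3 words’ figure can be checked with a short sketch. It assumes that ‘percent correct’ means the share of words transcribed correctly; the function name is hypothetical and used only for illustration.

```python
def words_per_error(accuracy_percent: float) -> float:
    """Return N such that, on average, 1 word in every N is wrong.

    Assumption: accuracy_percent is the share of words transcribed
    correctly, so the remainder is the share of wrong words.
    """
    if not 0 <= accuracy_percent < 100:
        raise ValueError("accuracy must be at least 0 and below 100 percent")
    return 100 / (100 - accuracy_percent)


# The accuracy range reported for automatic speech recognition.
for accuracy in (60, 70):
    print(f"{accuracy}% correct: 1 word in every {words_per_error(accuracy):.1f} is wrong")
# prints:
# 60% correct: 1 word in every 2.5 is wrong
# 70% correct: 1 word in every 3.3 is wrong
```

So ‘1 in every 3’ sits in the middle of the reported range: at 70 percent accuracy it is slightly better than that, and at 60 percent it is noticeably worse.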
Hiring the best transcription service is beneficial on many levels. If you are looking for one, look no further than Scribie. With years of experience and a team of experts, we offer a level of quality that makes us proud and our customers happy. Start uploading your files today!