Why Do We Still Need Humans For Transcribing Speech

So, how is Siri doing on your iPhone? Would you happily replace your secretary with her?

Personally, I wouldn’t, because there are just too many ‘misses’ and ‘trouble spots’ that I wouldn’t want in my business.

The case is almost the same when you count on software to transcribe your audio files instead of its ‘time-consuming’ human counterparts. Unfortunately, despite several attempts, science has not yet come up with a software solution that works like Aladdin’s magic lamp. And from the look of it, the genie isn’t coming out any time soon. Why? The reasons are many.

The English language can be very tricky and hence very difficult to master, especially when the learner in question is transcription software. Homophones pose a problem that most software finds impossible to overcome. For instance, will it be sale or sail, no or know, fair or fare? The list goes on. Unlike us humans, who are blessed with critical thinking skills, software cannot comprehend the difference. Plus, making these finer distinctions may be very difficult without context, which might not appear until further into the conversation.

The problem gets worse when the software needs to transcribe an interview or a dialogue involving many speakers. It is easy to guess why. Each of us has a unique style of speaking. These differences become far more complex because that personal style is shaped heavily by our geographical location, our culture, and our upbringing, to name a few. It is impossible to ‘teach’ such accurate speech recognition to any software.

Audio quality is yet another issue, and a very important one. Any speech recognition and transcription software needs a clear piece of audio. Anyone in the transcription business knows that an impeccable audio file is a rare phenomenon.

Comparing the accuracy of a human transcriptionist with that of software, Xuedong Huang, a senior scientist at Microsoft, says, “If you have people transcribe conversational speech over the telephone, the error rate is around 4 percent. If you put all the systems together—IBM and Google and Microsoft and all the best combined—amazingly the error rate will be around 8 percent.”

Now the real question is, would you settle for something that makes twice as many errors as humans? We know the answer. That is why we offer a transcription service that is among the best in the industry. Start uploading your files now!

2 Comments

  • Rafa says:

    The interesting thing is that “humans” also make mistakes. Check your title dear blogger.

  • Tracy says:

    There is no perfect anything, dude, get real 🙂 I am a transcriptionist and have been for 30+ years. I’m pretty sure I could outwit Siri.
