Mturk For Audio Transcription

We talked a bit about our platform in the previous post. We are a new kind of platform, and the best way to describe us is as a Mechanical Turk-like system designed specifically for audio transcription. If you are not familiar with Mechanical Turk, it is a service by Amazon where you can post tasks that are meant to be done only by humans. Think image captioning, categorization of articles, audio transcription, etc. Mturk has been used for projects ranging from art to search and rescue operations.

Where Mturk is generic, we are specialized. Our transcription system has been designed from the ground up to produce the best possible transcript with the least amount of effort. Each file is worked on by a number of transcribers (at least 20 for an hour of audio) and is checked and re-checked for quality. On average it takes us around 2 hours to complete a 1 hour file, which is half the industry standard of 4 hours.

But proofreading is the step which really differentiates us. It is the penultimate step in our transcription process and it improves the quality of the transcript significantly. Quality is the hardest problem to solve in audio transcription. Audio transcription is a laborious task. It takes a lot of time and effort, and if done by a single person there will be mistakes. And if you ask a random person on the internet to do it for you, you have no idea what to expect. They can, and will, type anything and give it back to you.

To solve this problem, we first split the file into smaller parts and then ensure that each part passes through at least 3 people: a transcriber/reviewer, a proofreader and a QA. Our proofreaders are experienced transcriptionists themselves, handpicked from our pool of certified transcribers. We train them and give them tools so that they can work efficiently. We then do a further QC pass and do not deliver the file until we are satisfied with the quality. We also use automated tools to help us spot mistakes.
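To make the split-and-review flow concrete, here is a minimal sketch in Python. It is an illustration only, not our actual system: the names `Part`, `split_audio` and `run_pipeline`, the worker names, and the 3 minute part length are all assumptions made for the example.

```python
from dataclasses import dataclass, field

# The three roles named in the paragraph above.
STAGES = ("transcribe/review", "proofread", "qa")

@dataclass
class Part:
    start_s: int
    end_s: int
    history: list = field(default_factory=list)  # (stage, worker) pairs

def split_audio(duration_s, part_s=180):
    """Cut a file into consecutive parts of at most part_s seconds.
    The 180 second default is an assumption, not a documented value."""
    return [Part(s, min(s + part_s, duration_s))
            for s in range(0, duration_s, part_s)]

def run_pipeline(part, workers_by_stage):
    """Pass one part through every stage, using a different person each time.
    Raises StopIteration if a stage has no unused worker left."""
    seen = set()
    for stage in STAGES:
        worker = next(w for w in workers_by_stage[stage] if w not in seen)
        seen.add(worker)
        part.history.append((stage, worker))
    return part

# Hypothetical worker pools; "ann" is qualified for two stages.
workers = {
    "transcribe/review": ["ann", "bob"],
    "proofread": ["ann", "cy"],
    "qa": ["dee"],
}
parts = split_audio(3600)  # a 1 hour file
done = [run_pipeline(p, workers) for p in parts]
```

In this sketch nobody handles the same part twice, so a worker who transcribed a part is skipped when a proofreader is picked for it. That is one simple way to guarantee that every part is seen by at least 3 different people.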

For our customers, we provide a simple and easy interface. On Mturk you have to manually create HITs, choose the workers and then pay them once their work is done. On our platform, you just have to upload your files and place the order. Then sit back and watch as parts of your file get transcribed. You can check the work-in-progress transcript anytime from your account. Once the file is done, you are notified and can download the transcript in Word, PDF, ODT and TXT formats. We also provide time-coding and speaker tracking by default, along with a host of other options.

We had a choice when we started work on our transcription system: use Mturk or build our own. We decided to do the latter because we found that the workflow we had in mind was a bit too involved for Mturk. Building our own also gave us the flexibility and control to tweak the system as and when we desired, and over time our system has evolved to a point where it works quite efficiently. We believe that the audio transcription system we have designed produces far better results than an audio file transcribed on Mturk. But don’t take our word for it; try it out for yourself today.

One Comment

  • Joselle Mallada says:

    I’m amazed and very interested in this job. I want to give it a try.
