Posts Tagged ‘typing’

Transcription System: Transcription & Reviews

Wednesday, April 18th, 2012

This is a series of posts on our human-powered audio transcription system. The following are links to the previous parts: Overview, Workflow, Certification.

The Transcription & Review subsystem is where the bulk of the work gets done. In transcription, the file is played back and typed into something called a raw transcript. This is the first-pass transcript, in which the incomprehensible parts are marked with blanks. In review, this raw transcript is checked against the audio and the mistakes are corrected. Timestamps and speaker tracking are also added during review. Together, these two steps produce a fairly accurate textual representation of the audio file.

In our workflow, we first break up the files into smaller parts. Our certified transcribers (the ones who have successfully cleared the Transcription Test) can then log in to their accounts and select these part files. Another innovation of our system is that we don’t actually assign files to them. Instead, they are asked to choose from the files available. They can preview a file and check its quality before choosing. This creates competition, which in turn ensures that files get done quickly.

For performance monitoring we use a five-point grading system, from A+/Excellent to D/Poor. The files are graded after the review. Another small innovation of our system is the Diff Preview, which shows the changes made during the review. It helps the reviewer assess the quality of the raw transcript and grade accordingly. Based on the grades, a Transcriber can be promoted to a Reviewer. There is also a disputes and arbitration system in place to investigate unfair grading.
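To give a feel for what a diff preview of this kind might look like, here is a minimal sketch in Python using the standard `difflib` module. It is an illustration only, not our actual implementation: the function name `diff_preview` and the `[-removed-]` / `{+added+}` markers are assumptions made for this example.

```python
import difflib


def diff_preview(raw, reviewed):
    """Return a word-level diff of two transcripts as one marked-up string.

    Words removed during review are wrapped in [-...-] and words added
    during review are wrapped in {+...+}. Hypothetical helper, for
    illustration only.
    """
    raw_words = raw.split()
    reviewed_words = reviewed.split()
    matcher = difflib.SequenceMatcher(a=raw_words, b=reviewed_words, autojunk=False)

    out = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            # Unchanged words are passed through as they are.
            out.extend(raw_words[i1:i2])
            continue
        if i1 != i2:
            # Words present in the raw transcript but dropped in review.
            out.append("[-" + " ".join(raw_words[i1:i2]) + "-]")
        if j1 != j2:
            # Words introduced by the reviewer.
            out.append("{+" + " ".join(reviewed_words[j1:j2]) + "+}")
    return " ".join(out)


if __name__ == "__main__":
    print(diff_preview("we will ____ again tomorrow", "we will meet again tomorrow"))
```

A raw transcript with few marked changes suggests a good first pass, while one that is mostly markers suggests a poor one.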

Another innovative aspect of our system is that it ensures a file is worked on by multiple transcribers and reviewers. The average for a 1-hour file is 15-20 people. More eyes and more ears on the file do wonders for the transcript quality. During Proofreading, all the inconsistencies caused by this methodology are corrected. We will talk more about Proofreading in the next part of the series.

Till then, if you want a high quality transcript of your audio file that has been checked multiple times by different people, check out our transcription service today.

The next part of the series is available here.

Transcription System: Workflow

Sunday, April 15th, 2012

This is a series on Scribie.com’s audio transcription system. The first part, which provides an overview, is here.

Our workflow consists of five steps.

File Splitting -> Transcription -> Review -> Proofreading -> Delivery

We start by splitting the file into smaller parts. The file is split at 6-minute boundaries, which produces one or more files of duration 6 minutes or shorter. This is the first little innovation of our transcription process. File splitting breaks down the work into smaller, manageable chunks. It helps in many ways. The file can be worked on in parallel by a number of transcribers. A huge amount of effort is not wasted if one part has to be re-done. Additionally, we can track the progress precisely.
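The splitting rule can be sketched in a few lines of Python. This is only an illustration of the boundary arithmetic, not our production code: the function name `split_points` and its parameters are assumptions, and the actual cutting of the audio is left out.

```python
def split_points(duration_seconds, part_seconds=360):
    """Return (start, end) offsets in seconds for each part of a file.

    Every part is part_seconds long (6 minutes by default) except
    possibly the last one, which holds whatever audio remains.
    Hypothetical helper, for illustration only.
    """
    if duration_seconds <= 0:
        return []

    parts = []
    start = 0
    while start < duration_seconds:
        end = min(start + part_seconds, duration_seconds)
        parts.append((start, end))
        start = end
    return parts


if __name__ == "__main__":
    # A one-hour file yields ten 6-minute parts.
    print(len(split_points(3600)))
    # A 400-second file yields one full part and one short part.
    print(split_points(400))
```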

Transcription is the typing part. On average it takes around 15-20 minutes to transcribe a 6-minute file. For a lot of our transcribers, who are mostly home-based freelancers, this is not a huge investment of time. Therefore splitting increases the likelihood that the file will be transcribed quickly. In fact, on average it takes around 1 to 1.5 hours to complete the transcription part of a one-hour file!
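To put those numbers side by side, here is a small back-of-the-envelope calculation in Python. The function name `transcription_effort` is made up for this example, and the figures are simply the averages quoted above.

```python
import math


def transcription_effort(audio_minutes, part_minutes=6, per_part_range=(15, 20)):
    """Return (number of parts, minimum effort, maximum effort) in minutes.

    Effort is the total typing time summed over all parts, using the
    15-20 minute average per 6-minute part quoted in the post.
    """
    parts = math.ceil(audio_minutes / part_minutes)
    low, high = per_part_range
    return parts, parts * low, parts * high


if __name__ == "__main__":
    print(transcription_effort(60))
```

For a one-hour file this gives 10 parts and 150 to 200 minutes of total typing. The 1 to 1.5 hours of elapsed time is shorter than that because the parts are worked on in parallel.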

The accuracy of the transcript is very low at this stage, typically around 50 to 80%. Therefore we do a review. The transcript is checked against the audio and all mistakes are corrected. Time-coding and speaker tracking are also added at this stage. Review usually takes 5 to 8 minutes of effort. But it takes longer for all the parts to get reviewed because we have fewer reviewers than transcribers. This is by design, since we promote only our best transcribers to reviewers. The review drastically improves the accuracy.

Once all parts are transcribed and reviewed, we can combine them and prepare the final transcript. However, one more round of review is required here. Since different parts are worked on by different people, there are bound to be inconsistencies. Proofreading is done by one person who goes through all the parts together and corrects them. The proofreader is an employee of CGBiz LLC (our company). Our proofreaders are the best of the best we have. We train them and pay them a monthly salary rather than an hourly rate.

The transcript is almost done now. However, things might still not be perfect. The proofreader can make mistakes, some more research may be required for certain terms, etc. So before the delivery we do some random checks. We try to gauge whether the quality is indeed at the level we want it to be. We also use keyword analysis (tf-idf to be precise) to identify out-of-context terms and inconsistencies. We review the transcript again if we are not happy with it. Over time we have found that a small percentage of files, around 2%, require re-review. Those are generally the most difficult files.
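For readers unfamiliar with tf-idf, here is a minimal sketch in Python of how it can surface unusual terms. It is an illustration under our own assumptions, not a description of our actual checks: each part of the transcript is treated as one document, the textbook formula (term frequency times the log of inverse document frequency) is used, and the names `tokenize`, `tf_idf` and `top_terms` are invented for this example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lower-case the text and return its words (letters and apostrophes)."""
    return re.findall(r"[a-z']+", text.lower())


def tf_idf(parts):
    """Return one {term: score} dict per part.

    score = (count of term in part / words in part) * log(N / df),
    where N is the number of parts and df is the number of parts
    that contain the term.
    """
    tokenized = [tokenize(p) for p in parts]
    n = len(tokenized)

    # Document frequency: in how many parts does each term appear?
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    scores = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens) or 1
        scores.append(
            {t: (c / total) * math.log(n / df[t]) for t, c in counts.items()}
        )
    return scores


def top_terms(parts, k=3):
    """Return the k highest-scoring terms of each part (ties broken A-Z)."""
    return [sorted(s, key=lambda t: (-s[t], t))[:k] for s in tf_idf(parts)]


if __name__ == "__main__":
    parts = [
        "the budget meeting starts now",
        "the budget review starts now",
        "the budget kubernetes starts now",
    ]
    print(top_terms(parts, k=1))
```

Terms that occur in every part score zero, while a term that appears in only one part stands out. A high-scoring term is not necessarily wrong, but it is a reasonable candidate for a second look.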

Once we are satisfied that the transcript is as good as it can be, we deliver the file. The file is converted into MS Word, Adobe PDF, OpenOffice Text and plain text formats, and we notify the customer that the transcript is available for download.

All of the above happens in 1 day and is managed by our transcription system. We charge only $0.99 per minute of the audio for it. So if you want to get a high quality transcript quickly, please do try out our transcription service today.

The next part of the series talks about the Certification Subsystem.