Search our FAQ for answers to anything you might ask.

Make use of the best online transcription service


Get started with using our service/product to transcribe your files

Customer Guide
Transcription Community Guide


Find the answer to any question related to working for us as a transcriber

Transcriber Guide

Manual Transcription

Manual transcription is done by our certified transcribers and we guarantee 99% accuracy for it.
You have to upload your files and pay by the audio minute. The payment has to be made in advance. The work will be distributed to our transcribers and the transcript will be delivered to your account once all 4 steps of our process are complete. You can check the transcript online with our editor, make any changes if required and download the Word document for your use. We will also make any changes at your request.
For a 1 hour file, the manual transcript will cost $48. Please use the cost estimator to get the exact cost or request a custom quote.
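As a rough illustration of the arithmetic, $48 for a 1 hour file works out to $0.80 per audio minute. The sketch below assumes a flat per-minute rate derived from that single example. The function name and the rate constant are illustrative and are not part of our service, and the cost estimator remains the source of the exact cost.

```python
from decimal import Decimal

# Assumption: a flat rate implied by "$48 for a 1 hour file" (48 / 60 = 0.80).
# The real rate may differ; use the cost estimator for the exact cost.
ASSUMED_RATE_PER_MINUTE = Decimal("0.80")


def estimate_manual_cost(audio_minutes: int) -> Decimal:
    """Return an approximate manual transcription cost in USD."""
    if audio_minutes < 0:
        raise ValueError("audio_minutes must be non-negative")
    return ASSUMED_RATE_PER_MINUTE * audio_minutes


print(estimate_manual_cost(60))  # 48.00
```

Decimal is used instead of float so that currency amounts do not pick up binary rounding noise.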
We support all major credit cards and PayPal.
Yes, we support billing accounts for larger projects. A contract has to be executed to set up a billing account. We only support ACH payments for billing currently.
Yes, we offer subscription plans.
Yes, we provide the transcript in Word format along with Open Document Text, PDF and plain text formats.
Yes. We mark each speaker with their initials (if names are provided/available) or as Speaker 1, Speaker 2, Speaker 3 etc., and provide an option to enable/disable it. However, the speaker tracking is best effort for more than 4 speakers and may be off in places.
Yes. Our blog has many political speeches that we have transcribed over the years. You can also try out our editor from there.
For manual transcripts, we guarantee 99% accuracy, unless the audio quality is very poor or the accents are hard.
We support only English at the moment. Support for other languages is on our roadmap. Please join our newsletter to get notified once we release it.
The transcripts are prepared by our freelance transcribers who correct the automated transcripts by following a formal 4-step process.
We support almost all open audio/video file formats. We do not support any proprietary formats such as WebEx ARF or Olympus DSS. However, you can convert those files into an mp3 or mp4 file using converters provided by your vendor and upload those to Scribie.
We recommend Audacity. It's a free, open-source program with wide support for different file types.
No, since we still have to pay our transcribers for the full duration of the file. However, you can edit out the non-English parts and save on the cost.
Yes, we restrict access strictly on a need-to-know basis. Our transcribers are also bound by the NDA in our terms of service. We also take a number of measures to ensure that your data remains secure. We absolutely do not sell or share your data with any third parties.
Absolutely. Please contact us with your agreement and we will get back to you in 1-2 business days.
Yes, the turnaround time includes weekends and holidays.
Our transcribers work 24x7. Live chat, phone and email support is available on weekdays and only email support is available on weekends.

Additional Charges

High difficulty level files require more QA effort to ensure 99% accuracy. Therefore, we ask for additional payment so that we can pay our transcribers more for such files.
It varies from $0.50/minute to $2.00/minute of audio, depending upon the number of issues identified.
We use a mix of algorithms and human checks to identify the audio issues. We first predict the percentage of corrections required on the automated transcript and then manually check any file with over 20% corrections.
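The screening rule described above can be sketched as a simple threshold check. This is only an illustration: the function name is hypothetical, the predicted correction ratio is assumed to come from some upstream model that is not shown, and the only value taken from the text is the 20% cutoff.

```python
# Threshold taken from the text: files with over 20% predicted corrections
# are checked manually.
MANUAL_CHECK_THRESHOLD = 0.20


def needs_manual_check(predicted_correction_ratio: float) -> bool:
    """Return True if a file should be sent for a manual audio check.

    predicted_correction_ratio is a value in [0, 1], assumed to be produced
    by a separate prediction step that is not shown here.
    """
    if not 0.0 <= predicted_correction_ratio <= 1.0:
        raise ValueError("ratio must be between 0 and 1")
    return predicted_correction_ratio > MANUAL_CHECK_THRESHOLD


# Hypothetical file names and predictions, for illustration only.
predictions = {"interview.mp3": 0.08, "conference_call.mp3": 0.31}
flagged = [name for name, ratio in predictions.items() if needs_manual_check(ratio)]
print(flagged)  # ['conference_call.mp3']
```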
We check for the following issues:
Issue: Example
Ambient Noise: hiss, line noise, static, background music/voices
Noisy Environment: street, bar, restaurant or other loud noises in background
Distant Speakers: faint, distant voices
Accented Speakers: British, Australian, Indian, Hispanic, any other non-American
Audio Breaks: bad phone line, audio gaps
Disturbances: loud typing sounds, rustling, wind howling, breathing sounds
Distortion: volume distortion, shrill voices, clipping, artifacts
Unclear Speakers: muttering, volume variation, frequent overlaps
Echo: reverberation, same voice can be heard twice
Quality: low sampling/bit rate, bad conference line, recorded off speakers
Diction: slurring, rapid speaking, unnatural pronunciation
Muffled: hidden or obstructed microphone, vintage tapes
Blank: only music, only background conversation, only non-English
Yes, of course. Our decision regarding the issues is final and binding as per our terms of service and any disagreement will be resolved with a full refund.
Most of our transcribers are comfortable with American accents. Proficiency in other accents is a highly valuable and sought-after skill amongst transcribers and we have to pay extra for that.
No. Charges have to be explicitly approved for us to continue working on the file. You can also opt to cancel the order automatically in case any additional charges are levied.
The best way to avoid them is to ensure that the recording is clean and high quality. Please check our Recording Guide for our recommendations.
Most probably the file was borderline and the issues were not significant enough to cause concern.
Our transcribers tend to avoid files with audio issues as it may affect their grade. These files are selected only when all the other files have been taken and therefore we have to allow them to stay in the queue longer.
Most services do not charge by the difficulty level. This policy is unique to Scribie as our focus is high accuracy rather than speed or cost.
Since we pay our transcribers more for difficult files, the accuracy will be higher compared to other services. If you do not require a high accuracy transcript, or if speed/cost is more important, then you may get a better service elsewhere.
Our best effort transcript is the draft transcript we provide within 1 day. However, since we grade our transcribers by the mistakes, we have to eventually correct them. Otherwise it is unfair to them and harmful to our system.
No, you have to wait for us to screen the file and identify the issues so that we can determine the rate to be applied.

Automated Transcription

Automated transcription is a transcript generated by our AI. It has an accuracy of 80-95% and may require corrections.
You have to upload your files and pay by the audio minute. The payment has to be made in advance. The file will be transcribed by our app and you can use our editor to make corrections afterwards.
A majority of the automated transcripts are delivered within 30 minutes. It may require around 2 hours if our queue is overloaded, but that is rare.
The accuracy level of automated transcripts is highly dependent on the quality of the audio file. Clean and non-accented files produce the best result with close to 90% accuracy. The accuracy keeps going down with more and more issues with the audio.
We are able to provide an estimate of the accuracy level of the transcripts. The estimate is based on a machine learning algorithm which analyzes the automated transcript to provide the estimate. In our tests, the accuracy estimate is correct to +/- 5% for 99% of the files.
Yes, you can place the manual transcript order anytime. Our transcribers will make the corrections to the automated transcript.
We offer a 15% discount for the manual transcript order if the automated transcript's estimated accuracy is 80% or more. For other files, we can offer a credit.
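The discount rule can be illustrated with a short calculation. This sketch assumes the 15% is taken off the list price of the manual order. The function and constant names are hypothetical, and credits for files below the threshold are not modeled.

```python
from decimal import Decimal

# Values taken from the text above.
DISCOUNT = Decimal("0.15")
MIN_ESTIMATED_ACCURACY_PCT = 80


def discounted_manual_price(list_price: Decimal, estimated_accuracy_pct: int) -> Decimal:
    """Return the manual order price after the discount, if it applies.

    Assumption: the discount is applied to the full list price of the order.
    """
    if estimated_accuracy_pct >= MIN_ESTIMATED_ACCURACY_PCT:
        discounted = list_price * (1 - DISCOUNT)
        return discounted.quantize(Decimal("0.01"))  # round to cents
    return list_price


print(discounted_manual_price(Decimal("48.00"), 85))  # 40.80
print(discounted_manual_price(Decimal("48.00"), 72))  # 48.00
```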
No, our automated service is an as-is service and we do not offer any refunds. However, we can offer a credit which you can use for your subsequent orders.
We have built a deep learning based speech and language model based on our own dataset. We have been in business since 2008 and have a large real world dataset. Therefore our models tend to perform better for real world scenarios.
Yes, currently our speech model's WER is 5% on the LibriSpeech dataset and around 11% on the TED dataset.
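For readers unfamiliar with the metric, WER (word error rate) is commonly defined as the number of word substitutions, deletions and insertions needed to turn the reference transcript into the recognized one, divided by the number of words in the reference. The sketch below is a generic textbook implementation using word-level edit distance. It is not our internal evaluation code, and it does no text normalization such as lowercasing or punctuation removal.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Compute WER = (substitutions + deletions + insertions) / reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the processed reference
    # prefix and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


print(word_error_rate("the quick brown fox", "the quick brown dog"))  # 0.25
```

A WER of 5% therefore means roughly 5 word-level errors per 100 reference words.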

Speech Recognition

If you have audio files and are looking to build your own speech recognition engine, then we can transcribe the files for you and turn them into a dataset. We can also augment your dataset with our own data to add variety to it. As we are an accuracy-focused service, the dataset is guaranteed to be of high quality and will result in a robust model.
We can work with as little as 100 hours of audio files to create a dataset of 1000+ hours by mixing our own data with yours.
The audio files are in WAV format and the transcripts are in plain text format. We also provide a manifest to map the audio files to their corresponding transcripts. The audio files are in 16 kHz, mono format. We also provide the full transcripts of your files.
It depends on the amount of data you have. We have to first transcribe your files and then mix them with our data. Typically, a 100 hour dataset will require around 1 week to fully transcribe and convert into a 1000+ hours dataset.
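If you want to sanity-check delivered audio against the stated format, Python's standard `wave` module can read the sample rate and channel count from a WAV header. This is a minimal sketch: the function name is illustrative, and the manifest layout is not specified here, so the example checks a single file only.

```python
import wave

# Format stated in the text above: 16 kHz, mono.
EXPECTED_SAMPLE_RATE = 16000
EXPECTED_CHANNELS = 1


def is_16khz_mono(wav_file) -> bool:
    """Return True if the WAV file is 16 kHz mono.

    wav_file may be a file name (str) or an open binary file-like object.
    """
    with wave.open(wav_file, "rb") as wav:
        return (
            wav.getframerate() == EXPECTED_SAMPLE_RATE
            and wav.getnchannels() == EXPECTED_CHANNELS
        )
```

For example, `is_16khz_mono("sample.wav")` would check a hypothetical file named `sample.wav` in the current directory.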
It depends on your requirements. Please contact us to get a quote.

Speech & Language Modeling

If you are looking to build a speech recognition engine with your own data, we can mix it with our own dataset and build a speech and language model for you. You can save a lot of engineering time and effort and speed up your project by using this service.
Yes, as the performance will be better for your types of files. E.g., if your data is mostly conference calls, the speech recognition engine will perform better if it is trained on your own dataset.
Our models are based on DeepSpeech 2. However, we can also work with your codebase and your network.
It depends on where you want to train it and your network. For our network and our dataset, it takes around 3 weeks to train the model.
We use KenLM for language modeling. However, we can also build the language model on your codebase.
It depends on the requirements. Typically the cost will be north of $100K. Please contact us for a quote.