Audio/Video Transcription API

REST API Documentation

Introduction

The Scribie.com HTTP REST API provides a programmable interface to our transcription service. The core functionality provided by our web interface is made available through this API. An application can upload files to our servers, pay for transcription, check the transcript progress, fetch the transcripts and delete the files.

This document describes the resources and methods exposed by the API. The target audience is server-side application developers. curl examples are used throughout this document to illustrate the usage of the API.

This API exposes two resources: files and file. The file resource refers to an individual file, whereas the files resource is a collection of file resources. GET requests retrieve a resource and any metadata associated with it. POST requests modify the state of a resource.

The endpoint URL for the API is https://api.scribie.com/v1/. Only HTTPS is supported. HTTP requests will be redirected to HTTPS. The API is currently in version 1. The encoding used is application/json; charset=utf-8, except for the files POST request which is multipart/form-data.

A sandbox environment is available for testing at the endpoint URL https://api-sandbox.scribie.com/v1/. The data is retained for 24 hours only.

An account on Scribie.com with a saved payment method is required in order to use this API. Please sign up if you do not have an account yet.

Only email support is available for the API. Please send your questions to api@scribie.com.

Authentication

This API uses HTTP Basic authentication with the API key as the user. The API key can be generated from the account settings page. All requests must be authenticated.

Example Request

curl -X GET --user 764118cada1d4d7c899999434c00e761ac24b5c3: \
  https://api.scribie.com/v1/files

The API key should be kept secret in order to prevent misuse of the account.
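
For applications that do not shell out to curl, the same header can be built by hand. The following is a minimal Python sketch, assuming the API key is sent as the Basic auth user name with an empty password (as the trailing colon in the curl example suggests). The function name basic_auth_header is illustrative and not part of the API.

```python
import base64


def basic_auth_header(api_key: str) -> dict:
    """Build an Authorization header with the API key as user and an empty password."""
    # HTTP Basic auth encodes "user:password"; the password part is left empty.
    token = base64.b64encode(f"{api_key}:".encode("ascii")).decode("ascii")
    return {"Authorization": f"Basic {token}"}
```

The returned dictionary can be merged into the headers of any HTTP client. Loading the key from an environment variable or a secrets store, rather than hard-coding it, helps keep it secret.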

Errors

This API returns errors via HTTP response codes. In general, codes in the 2xx range indicate success, and codes in the 4xx range indicate an error that resulted from the provided information (e.g. a required parameter was missing, or the file type is unsupported). The 500 error code is returned if an unexpected application error occurs during processing. Additionally, an error string is provided as a hint in the response body.

Example Response

{"error":"incorrect api key"}

Error Reference

Code  Description               API          Resolution
400   Bad request               files POST   The file parameter is missing
                                             Options parameter is incorrectly formatted
401   Unauthorized              All          The API Key is missing or was not found
404   Not found                 file GET     Incorrect file id
                                             File was deleted
405   Method not allowed        All          The error message has a list of allowed methods
409   Conflict                  file DELETE  The transcript progress is >= 60% and file cannot be deleted at this stage, try again after 100%
413   Request entity too large  files POST   File size is > 10 GB
415   Unsupported file type     files POST   The error message has a list of supported file types
                                file GET
500   Internal server error     All          Contact support with the error message
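
A client will usually want to turn a status code and response body into a single readable message. The sketch below is one possible way to do that in Python, assuming the body is either empty or a JSON object with an "error" key as in the example response. The helper name describe_error and the wording of the returned messages are illustrative.

```python
import json


def describe_error(status: int, body: str) -> str:
    """Summarize an API response as a short message; returns "ok" for 2xx codes."""
    if 200 <= status < 300:
        return "ok"
    try:
        # The error hint is expected under the "error" key of a JSON object.
        hint = json.loads(body).get("error", "")
    except (ValueError, AttributeError):
        # Empty, non-JSON, or non-object bodies simply carry no hint.
        hint = ""
    if 400 <= status < 500:
        kind = "client error"
    elif status >= 500:
        kind = "server error"
    else:
        kind = "unexpected status"
    return f"{status} {kind}: {hint}" if hint else f"{status} {kind}"
```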

Files Resource

Methods allowed: GET, POST

This resource represents a collection of the file resource. The GET method returns a list of file identifiers. The identifier can be used to send requests to the individual file resource.

Example Request

curl -X GET --user 764118cada1d4d7c899999434c00e761ac24b5c3: \
  https://api.scribie.com/v1/files

Example Response

[  
   {  
      "id":"5a5290c74787b392e31bfc81124ec11c9abf622c"
   },
   {  
      "id":"f8c0523d6b799b79ea9e40034df294094a7b00cc"
   },
   {...},
   {...}
]

The POST request is used to create a new file. The following pre-conditions must be met for the request to be processed.

  • The request must contain a file parameter in the body.
  • The content type must be multipart/form-data.
  • Optionally an options parameter can be sent. The options parameter must be a URL encoded JSON string, if it is sent.
  • The file size must be less than 10 GB.
  • The file format must be one of the following: mp3, wav, wma, wmv, avi, flv, mpg, mpeg, mp4, m4a, m4v, mov, ogg, webm, aif, aiff, amr, 3gp, 3ga, mts, ogv, aac, mkv, mxf, opus. The format is determined from the MIME type of the file.

The response code is 201 if the file creation was successful. The credit card is charged towards the transcription cost on successful creation.

Example Request

curl -X POST --user 764118cada1d4d7c899999434c00e761ac24b5c3: -F file=@test.wav \
  https://api.scribie.com/v1/files

Example Response

{  
   "id":"5a5290c74787b392e31bfc81124ec11c9abf622c"
}
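
Because a successful upload is charged, it can be worth checking the pre-conditions locally before sending the request. The following Python sketch does this under two assumptions that are not confirmed by the API: the file extension is used as a stand-in for the format (the server itself decides from the MIME type), and 10 GB is interpreted as 10 * 1000^3 bytes. The name preflight_problems is illustrative.

```python
import os

# Formats listed in the pre-conditions above.
SUPPORTED_EXTENSIONS = {
    "mp3", "wav", "wma", "wmv", "avi", "flv", "mpg", "mpeg", "mp4", "m4a",
    "m4v", "mov", "ogg", "webm", "aif", "aiff", "amr", "3gp", "3ga", "mts",
    "ogv", "aac", "mkv", "mxf", "opus",
}
# Assumption: decimal gigabytes; the API does not state which unit it uses.
MAX_BYTES = 10 * 1000 ** 3


def preflight_problems(filename: str, size_bytes: int) -> list:
    """Return a list of reasons the upload is likely to be rejected (empty if none)."""
    problems = []
    extension = os.path.splitext(filename)[1].lstrip(".").lower()
    if extension not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported extension: {extension or '(none)'}")
    if size_bytes >= MAX_BYTES:
        problems.append("file size is not below 10 GB")
    return problems
```

The size can be obtained with os.path.getsize(path). An empty result does not guarantee acceptance, since the server performs its own checks.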

The default transcript order options are applied if the options parameter is not detected or if parsing fails. The options parameter can be encoded as shown in the following example.

$ node
> encodeURIComponent(JSON.stringify({
... "transcript_type" : "budget",
... "strict_verbatim": "off",
... "subtitle_file": "off",
... "time_coding": "on",
... "speaker_tracking": "on",
... "speaker_tracking_format": "initials",
... "spelling_style": "american",
... "transcript_template": "blank single line spaced",
... "speaker_names" : [ "John Doe", "Jane Doe"],
... }));
'%7B%22transcript_type%22%3A%22budget%22%2C%22strict_verbatim%22%3A%22off%22%2C%22subtitle_file%22%3A%22off%22%2C%22time_coding%22%3A%22on%22%2C%22speaker_tracking%22%3A%22on%22%2C%22speaker_tracking_format%22%3A%22initials%22%2C%22spelling_style%22%3A%22american%22%2C%22transcript_template%22%3A%22blank%20single%20line%20spaced%22%2C%22speaker_names%22%3A%5B%22John%20Doe%22%2C%22Jane%20Doe%22%5D%7D'

This URL-encoded options string can then be passed in the POST request.

curl -X POST --user 764118cada1d4d7c899999434c00e761ac24b5c3: -F file=@test.wav -F options=%7B%22transcript_type%22%3A%22budget%22%2C%22strict_verbatim%22%3A%22off%22%2C%22subtitle_file%22%3A%22off%22%2C%22time_coding%22%3A%22on%22%2C%22speaker_tracking%22%3A%22on%22%2C%22speaker_tracking_format%22%3A%22initials%22%2C%22spelling_style%22%3A%22american%22%2C%22transcript_template%22%3A%22blank%20single%20line%20spaced%22%2C%22speaker_names%22%3A%5B%22John%20Doe%22%2C%22Jane%20Doe%22%5D%7D https://api.scribie.com/v1/files
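
The same encoding can be produced outside of node. The Python sketch below aims to mirror encodeURIComponent(JSON.stringify(...)) using only the standard library; the function name encode_options is illustrative.

```python
import json
from urllib.parse import quote


def encode_options(options: dict) -> str:
    """URL-encode an options dictionary as a compact JSON string."""
    # Compact separators mirror JSON.stringify, which emits no extra spaces.
    compact = json.dumps(options, separators=(",", ":"), ensure_ascii=False)
    # quote() already leaves letters, digits and "_.-~" untouched; adding
    # "!*'()" matches the set that encodeURIComponent does not escape.
    return quote(compact, safe="!*'()")
```

For example, encode_options({"transcript_type": "budget"}) yields %7B%22transcript_type%22%3A%22budget%22%7D, which matches the start and end of the node output above.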

Options Reference

Field                    Values                                                       Default
transcript_type          "budget", "regular", "rush"                                  "regular"
strict_verbatim          "on", "off"                                                  "off"
subtitle_file            "on", "off"                                                  "off"
time_coding              "on", "off"                                                  "on"
speaker_tracking         "on", "off"                                                  "on"
speaker_tracking_format  "initials", "full names"                                     "initials"
spelling_style           "american", "british", "canadian", "australian"              "american"
transcript_template      "scribie single line spaced", "scribie double line spaced",  "scribie single line spaced"
                         "blank single line spaced", "blank double line spaced"
speaker_names            Array of names, e.g. ["John Doe", "Jane Doe"]                None

The options are case insensitive. The default options can be overridden from the settings page. Some of these options incur an additional charge. Please check the pricing and service details pages for more information.
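
Since the default options are applied when parsing fails, and how the server treats an unrecognized value is not specified here, a client may want to check its options against the reference before uploading. The Python sketch below is one way to do so. It assumes that case insensitivity applies to the option values (field names are compared as written), and the names ALLOWED_VALUES and invalid_options are illustrative.

```python
# Allowed values per field, copied from the options reference.
ALLOWED_VALUES = {
    "transcript_type": ("budget", "regular", "rush"),
    "strict_verbatim": ("on", "off"),
    "subtitle_file": ("on", "off"),
    "time_coding": ("on", "off"),
    "speaker_tracking": ("on", "off"),
    "speaker_tracking_format": ("initials", "full names"),
    "spelling_style": ("american", "british", "canadian", "australian"),
    "transcript_template": (
        "scribie single line spaced",
        "scribie double line spaced",
        "blank single line spaced",
        "blank double line spaced",
    ),
}


def invalid_options(options: dict) -> list:
    """Return the names of fields that are unknown or carry an unlisted value."""
    bad = []
    for field, value in options.items():
        if field == "speaker_names":
            # speaker_names is a free-form array of name strings.
            ok = isinstance(value, list) and all(isinstance(n, str) for n in value)
        elif field in ALLOWED_VALUES:
            ok = isinstance(value, str) and value.lower() in ALLOWED_VALUES[field]
        else:
            ok = False
        if not ok:
            bad.append(field)
    return bad
```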

File Resource

Methods allowed: HEAD, GET, DELETE

This resource represents an individual file. The HEAD method returns metadata in the X-Scribie-Metadata header as a JSON string. All dates in the metadata are in the ISO 8601 format. It also contains the options which were applied. The scheduled delivery date is returned in the delivery parameter. The progress_percent parameter is a measure of the completeness of the transcript.

Example Request

curl -I -X HEAD --user 764118cada1d4d7c899999434c00e761ac24b5c3: \
  https://api.scribie.com/v1/file/5a5290c74787b392e31bfc81124ec11c9abf622c

Example Response

HTTP/1.1 200 OK
Server: nginx/1.6.2
Date: Tue, 20 Jan 2015 07:17:19 GMT
Content-Type: text/plain; charset=utf-8
Content-Length: 183
Connection: keep-alive
X-Powered-By: Express
X-Scribie-Metadata: {"name":"test","created":"2015-01-20T12:47:06Z","seconds":"2","bytes":"13824","cost":"0.03","discount":"0","total":0.03,"transaction_id":"c684ty","refund_amount":"0","currency":"USD","delivery":"2015-01-27T12:47:11Z","delay_reason":null,"duplicates":null,"options":{"transcript_type":"budget","time_coding":"on","strict_verbatim":"off","subtitle_file":"off","speaker_tracking":"on","speaker_tracking_format":"full names","spelling_style":"american","speaker_names":["John Doe","Jane Doe"],"transcript_template":"blank single line spaced"},"progress_percent":0}
Content-Disposition: attachment; filename="test.txt"
Strict-Transport-Security: max-age=63072000
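
The header value is a JSON string, so it can be decoded directly. The Python sketch below parses it and converts the created and delivery fields to datetime objects, assuming they always use the "Z" suffixed ISO 8601 form shown in the example response. The function name parse_metadata is illustrative.

```python
import json
from datetime import datetime


def parse_metadata(header_value: str) -> dict:
    """Decode the X-Scribie-Metadata header and parse its date fields."""
    meta = json.loads(header_value)
    for key in ("created", "delivery"):
        if meta.get(key):
            # Older Python versions reject a trailing "Z" in fromisoformat,
            # so it is mapped to the equivalent "+00:00" offset first.
            meta[key] = datetime.fromisoformat(meta[key].replace("Z", "+00:00"))
    return meta
```

Note that several numeric fields in the example, such as seconds, bytes and cost, arrive as strings and would need explicit conversion before arithmetic.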

GET requests to this resource return the transcript. We provide transcripts in .doc, .pdf, .odt and .txt formats, and in .sbv and .srt formats if the subtitle_file option is enabled. The extension can be appended to the file identifier to retrieve the desired format. The default format is .txt. If the progress is less than 100% then the Work-In-Progress transcript is returned in .txt format.

Example Request

curl -X GET --user 764118cada1d4d7c899999434c00e761ac24b5c3: \
  https://api.scribie.com/v1/file/5a5290c74787b392e31bfc81124ec11c9abf622c.txt

Example Response

NOTICE
Work-In-Progress Transcript
This transcript is not yet complete. It may have missing sections, higher number of mistakes, blanks and speaker tracking inconsistencies.

----
00:01 Austin Tuan: Hello.

00:05 Rajiv Poddar: Hello. 

00:07 AT: Yes. Rajiv can you hear me. 

00:10 RP: Yeah, I can hear you. 

00:11 AT: Hi. This is Austin Taun from Taipei, Taiwan. 

00:17 RP: Hi Austin. How are you. 

00:19 AT: I'm fine. Okay, so today, Rajiv could you first introduce yourself a little bit.

...
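
The transcript URL for a given format can be assembled from the file identifier and the extension. The following Python sketch does so, rejecting the subtitle formats unless the caller states that the subtitle_file option was enabled. The function name transcript_url and its subtitles_enabled parameter are illustrative and not part of the API.

```python
BASE_URL = "https://api.scribie.com/v1/"
DOCUMENT_FORMATS = ("txt", "doc", "pdf", "odt")
SUBTITLE_FORMATS = ("sbv", "srt")


def transcript_url(file_id: str, fmt: str = "txt", subtitles_enabled: bool = False) -> str:
    """Return the file resource URL for the requested transcript format."""
    fmt = fmt.lstrip(".").lower()
    allowed = DOCUMENT_FORMATS + (SUBTITLE_FORMATS if subtitles_enabled else ())
    if fmt not in allowed:
        raise ValueError(f"format not available: {fmt}")
    return f"{BASE_URL}file/{file_id}.{fmt}"
```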

The DELETE request removes the file and all the metadata associated with it from our server. The file can be deleted only if the progress is below 60%, or when the transcript is complete, i.e., 100%. In the former case, a refund is also issued. The refund amount is proportional to the progress and is issued after 1 business day. The response body is empty if the operation is successful.

Example Request

curl -X DELETE --user 764118cada1d4d7c899999434c00e761ac24b5c3: \
  https://api.scribie.com/v1/file/5a5290c74787b392e31bfc81124ec11c9abf622c

A subsequent GET will return a 404 not found error.
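
The deletion rule can be checked on the client before sending the request, using the progress_percent value from the X-Scribie-Metadata header. The Python sketch below encodes the rule as stated; the function name delete_outcome and the returned strings are illustrative. The refund amount is not computed because the exact formula is not given.

```python
def delete_outcome(progress_percent: int) -> str:
    """Predict the result of a DELETE request from the transcript progress."""
    if progress_percent < 60:
        # Early deletion: allowed, and a refund is issued.
        return "deletable, refund issued"
    if progress_percent >= 100:
        # Completed transcript: allowed, no refund is mentioned for this case.
        return "deletable, no refund"
    # Between 60% and completion the API answers with 409 Conflict.
    return "409 conflict expected"
```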

Webhook

A webhook URL can be configured from the settings page. On any progress update, the file id is POSTed to the URL. The request is JSON encoded. A subsequent HEAD on the id returns the updated progress value in the X-Scribie-Metadata header.

Sample POST Body

{  
   "id":"5a5290c74787b392e31bfc81124ec11c9abf622c"
}
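
On the receiving side, the handler only needs to extract the id from the request body. The Python sketch below shows that step in isolation, assuming the body has exactly the shape of the sample above. The function name file_id_from_webhook is illustrative, and wiring it into a web framework is left out.

```python
import json


def file_id_from_webhook(body: bytes):
    """Return the file id from a webhook POST body, or None if it is malformed."""
    try:
        payload = json.loads(body.decode("utf-8"))
    except ValueError:
        # Covers both invalid UTF-8 and invalid JSON.
        return None
    if isinstance(payload, dict) and isinstance(payload.get("id"), str):
        return payload["id"]
    return None
```

Returning None for malformed input lets the handler ignore unexpected requests instead of failing.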

Usage Example

Assuming that your Scribie.com account is set up (signed up, saved a credit card and generated an API key), the following steps have to be performed in order to get the transcript of the test.wav file.

Step 1: POST the file to the files resource. This returns the id of the file which should be saved.

curl -X POST --user 764118cada1d4d7c899999434c00e761ac24b5c3: -F file=@test.wav \
  https://api.scribie.com/v1/files

Step 2: Send a GET request to the file with the id, parse the X-Scribie-Metadata header in the response and check the progress_percent field. If progress is less than 100 then the Work-In-Progress transcript will be returned, otherwise the final transcript in text format will be returned. If the Webhook is configured, this request should be sent immediately after receiving the Webhook POST.

curl -I -X GET --user 764118cada1d4d7c899999434c00e761ac24b5c3: \
  https://api.scribie.com/v1/file/<id returned in step 1>

Step 3: When progress_percent is 100, you can also send a GET request for the Word document and save it.

curl -X GET --user 764118cada1d4d7c899999434c00e761ac24b5c3: \
  https://api.scribie.com/v1/file/<id returned in step 1>.doc -o test.doc

That's it! These three requests are equivalent to uploading a file, ordering the transcript and then downloading the transcript through our web interface.
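
Steps 2 and 3 can also be sketched in Python with the standard library alone. The code below is an illustration under stated assumptions rather than a reference client: it uses a HEAD request to read the progress, the helper names build_request, progress_of and fetch_progress are made up for this example, and the upload in Step 1 is omitted because it needs a multipart/form-data body.

```python
import base64
import json
import urllib.request

BASE_URL = "https://api.scribie.com/v1/"


def build_request(api_key: str, file_id: str, method: str = "HEAD", extension: str = ""):
    """Create an authenticated request for the file resource without sending it."""
    token = base64.b64encode(f"{api_key}:".encode("ascii")).decode("ascii")
    url = f"{BASE_URL}file/{file_id}{extension}"
    headers = {"Authorization": f"Basic {token}"}
    return urllib.request.Request(url, method=method, headers=headers)


def progress_of(response_headers) -> int:
    """Read progress_percent from the X-Scribie-Metadata response header."""
    meta = json.loads(response_headers["X-Scribie-Metadata"])
    return int(meta["progress_percent"])


def fetch_progress(api_key: str, file_id: str) -> int:
    """Step 2: send the HEAD request and return the current progress."""
    with urllib.request.urlopen(build_request(api_key, file_id)) as response:
        return progress_of(response.headers)
```

Once fetch_progress returns 100, build_request(api_key, file_id, method="GET", extension=".doc") gives the request for Step 3, and its response body can be written to test.doc.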

Support

We provide only email support for the API. Please send an email to api@scribie.com for any questions or assistance.